My homelab runs the usual stack — Sonarr, Radarr, Prowlarr, qBittorrent, Plex. I was getting ntfy alerts at all hours for things like ffprobe metadata reads and HTTP 429s from indexers. Not actionable, just noise.
So I built Cortex: a monitoring layer that sends Docker logs through a local LLM (Ollama) every 30 minutes, filters the noise, and routes only meaningful alerts to my phone.
The problem with threshold-based monitoring
Standard monitoring tools watch numbers. CPU > 80%? Alert. Disk > 90%? Alert. That works for infrastructure — it doesn't work for application logs.
A Sonarr log line like:
[Warn] NzbDrone.Core.Download.TrackedDownloads.TrackedDownloadService:
Couldn't import album track / No files found are eligible for import
Is that a problem? Maybe. Depends on context. Is it a one-off, or has it been happening for 6 hours? Is the download queue healthy? Did the episode actually get imported by another path?
A fixed threshold can't answer that. A language model can.
Architecture
Docker logs → Cortex → Ollama (local LLM) → parsed report → ntfy
                 ↓
        Prometheus metrics
Every 30 minutes, cortex-monitor.py runs via cron:
- Collects recent log lines from each monitored container
- Filters known noise patterns (ffprobe, VideoFileInfoReader, HTTP 429, etc.)
- Sends the filtered logs to a local Ollama endpoint
- Parses the LLM response into structured alerts
- Routes alerts by priority — INFO goes to the daily digest, WARNING/CRITICAL go to ntfy immediately
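The five steps above can be sketched as a single pass per cron tick. This is an illustrative skeleton, not Cortex's actual code: the stage functions are injected as parameters so each one (log collection, noise filtering, the LLM call, routing, the ntfy push) can be swapped or faked independently.

```python
from typing import Callable

def run_once(
    collect: Callable[[str], list[str]],          # container name -> raw log lines
    filter_noise: Callable[[list[str]], list[str]],
    analyse: Callable[[str, list[str]], list[dict]],  # LLM call -> parsed alerts
    route: Callable[[dict], bool],                # cooldown / priority decision
    notify: Callable[[dict], None],               # ntfy push
    containers: list[str],
) -> int:
    """One cron tick: collect, filter, analyse, route. Returns alerts sent."""
    sent = 0
    for name in containers:
        lines = filter_noise(collect(name))
        if not lines:  # everything was known noise; skip the LLM call entirely
            continue
        for alert in analyse(name, lines):
            if route(alert):
                notify(alert)
                sent += 1
    return sent
```

Dependency injection also makes the pipeline trivially testable: pass lambdas for each stage and assert on what reaches `notify`.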
The Ollama Modelfile
The key is giving the LLM enough context to understand what it's reading. The Modelfile bakes in knowledge of the stack:
FROM qwen2.5:14b
PARAMETER temperature 0.2

SYSTEM """
You are an infrastructure monitoring assistant for a self-hosted homelab.
You analyse log output from Docker containers running *arr media services.
NOISE — these are NOT alerts:
- ffprobe metadata reads
- VideoFileInfoReader routine scans
- HTTP 429 rate limiting from indexers (expected, indexers throttle)
- Prowlarr health check on port 9696
SIGNAL — these ARE worth reporting:
- Import failures after successful downloads
- Indexer connectivity issues lasting > 30 minutes
- Download client queue stalls
- Authentication errors
- Database errors
Output format:
ALERT_LEVEL: INFO|WARNING|CRITICAL
SUMMARY: one sentence
DETAIL: what happened and why it matters
RECOMMENDATION: what to check or do
"""
A temperature of 0.2 keeps the output consistent and close to deterministic — you don't want creative variation in monitoring alerts.
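Calling the model and parsing its structured reply is then mechanical. A minimal sketch against Ollama's `/api/generate` endpoint (the endpoint and payload fields are Ollama's real API; `parse_report` and the field handling are my assumptions about how a response in the format above would be consumed):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def query_ollama(model: str, prompt: str) -> str:
    """Single non-streaming generation against a local Ollama instance."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        # temperature can also be set per-request instead of in the Modelfile
        "options": {"temperature": 0.2},
    })
    req = urllib.request.Request(
        OLLAMA_URL, data=payload.encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def parse_report(text: str) -> dict:
    """Parse an ALERT_LEVEL/SUMMARY/DETAIL/RECOMMENDATION block into a dict."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")  # split on the first colon only
        if key.strip() in {"ALERT_LEVEL", "SUMMARY", "DETAIL", "RECOMMENDATION"}:
            fields[key.strip().lower()] = value.strip()
    return fields
```

Keeping the output format rigid is what makes `parse_report` this simple; if the model drifts from the template, the missing keys are themselves a signal that something is off.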
Noise filtering before the LLM
The LLM call costs time (2-4 seconds on a local GPU). Filtering before sending keeps the context window clean and the latency low:
NOISE_PATTERNS = [
    "ffprobe",
    "VideoFileInfoReader",
    "429",
    "invalid torrent",
    "9696/",
]

def filter_noise(log_lines: list[str]) -> list[str]:
    return [
        line for line in log_lines
        if not any(pattern in line for pattern in NOISE_PATTERNS)
    ]
On a normal day, this drops 60-70% of log volume before it ever reaches Ollama.
Alert routing with cooldown
Not every WARNING needs an immediate ntfy push. Cortex uses a cooldown per alert type to avoid notification fatigue:
import time

# Seconds before the same container/level pair may fire again
# (example values — Cortex's actual numbers may differ).
COOLDOWNS = {"INFO": 86400, "WARNING": 3600, "CRITICAL": 900}

def route_alert(alert: dict, state: dict) -> bool:
    key = f"{alert['container']}:{alert['alert_level']}"
    last_sent = state.get(key, 0)
    cooldown = COOLDOWNS.get(alert['alert_level'], 3600)
    if time.time() - last_sent < cooldown:
        return False  # still in cooldown
    state[key] = time.time()
    return True
INFO alerts accumulate and go into the daily digest at 09:00. WARNING and CRITICAL go out via ntfy immediately, subject to the per-type cooldown above.
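Once routing says yes, the push itself is just an HTTP POST to an ntfy topic, with the alert level mapped onto ntfy's 1–5 priority scale. A sketch; the topic URL and the level-to-priority mapping are my assumptions, not Cortex's actual values:

```python
import urllib.request

NTFY_TOPIC_URL = "https://ntfy.sh/my-homelab-alerts"  # hypothetical topic

# ntfy priorities run 1 (min) to 5 (max); map alert levels onto them.
PRIORITY = {"INFO": "2", "WARNING": "4", "CRITICAL": "5"}

def ntfy_headers(alert: dict) -> dict:
    """Build the ntfy request headers for an alert."""
    return {
        "Title": f"{alert['container']}: {alert['summary']}",
        "Priority": PRIORITY.get(alert["alert_level"], "3"),
        "Tags": "rotating_light" if alert["alert_level"] == "CRITICAL" else "warning",
    }

def send_ntfy(alert: dict) -> None:
    """POST the alert detail as the message body; metadata rides in headers."""
    req = urllib.request.Request(
        NTFY_TOPIC_URL,
        data=alert["detail"].encode(),
        headers=ntfy_headers(alert),
        method="POST",
    )
    urllib.request.urlopen(req)
```

Putting the title, priority, and tags in headers and the detail in the body is ntfy's standard publish shape, so the same function works against ntfy.sh or a self-hosted instance.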
The daily digest
Every morning at 09:00, cortex-digest.py sends a summary via ntfy:
📊 Cortex Daily Digest — 2026-04-17
Containers: 5/5 healthy
Alerts last 24h: 2 (1 WARNING, 1 INFO)
Noise filtered: 847 log entries
Top event: prowlarr indexer timeout on NZBgeek (non-critical)
Recommendation: check NZBgeek API key expiry
Imports: 4 episodes, 3 movies — all clean
One message per day with everything that actually happened. No alert fatigue.
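Rendering that digest is a pure formatting step over the counters accumulated during the day. A minimal sketch; the stats-dict field names are illustrative, not Cortex's actual schema:

```python
def build_digest(stats: dict) -> str:
    """Render the daily digest body from accumulated counters."""
    warnings = stats.get("WARNING", 0)
    infos = stats.get("INFO", 0)
    lines = [
        f"📊 Cortex Daily Digest — {stats['date']}",
        f"Containers: {stats['healthy']}/{stats['total']} healthy",
        f"Alerts last 24h: {warnings + infos} ({warnings} WARNING, {infos} INFO)",
        f"Noise filtered: {stats['noise_filtered']} log entries",
    ]
    return "\n".join(lines)
```

Because it is a pure function of the day's counters, the digest is easy to test and easy to extend with extra lines (top event, import summary) without touching the delivery path.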
Prometheus metrics
cortex-exporter.py exposes metrics on port 9192 for Grafana:
cortex_alerts_total
cortex_last_run_timestamp
cortex_containers_monitored
cortex_noise_filtered_total
cortex_digest_last_sent
The last-run timestamp is particularly useful — a Grafana query like `time() - cortex_last_run_timestamp` gives the age of the last run; if Cortex stops running, the age climbs and triggers a Grafana alert.
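The exporter itself can be tiny. In practice you would likely use the `prometheus_client` library; a dependency-free sketch of the same idea renders the Prometheus text exposition format by hand (the stub metric values are placeholders, read from Cortex's state in the real exporter):

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder values; the real exporter reads these from Cortex's state file.
METRICS = {
    "cortex_alerts_total": 2,
    "cortex_last_run_timestamp": int(time.time()),
    "cortex_containers_monitored": 5,
    "cortex_noise_filtered_total": 847,
}

def render_metrics(metrics: dict) -> str:
    """Prometheus text exposition format: one `name value` line per metric."""
    return "".join(f"{name} {value}\n" for name, value in metrics.items())

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = render_metrics(METRICS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 9192) -> None:
    """Block forever serving /metrics for Grafana/Prometheus to scrape."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Prometheus only needs `name value` lines (plus optional `# HELP`/`# TYPE` comments), so a hand-rolled exporter stays well under fifty lines.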
Hardware requirements
- CPU-only: 16GB RAM minimum — runs qwen2.5:7b adequately
- GPU: 8GB VRAM — runs qwen2.5:14b comfortably (recommended)
I run it on a machine with a modest GPU. The 30-minute cron cadence means inference load is negligible — one batch call every half hour, not a continuous service.
Getting started
git clone https://github.com/pdegidio/cortex-homelab.git
cd cortex-homelab
bash install.sh
The installer walks you through Ollama endpoint, ntfy config, container names, and cron setup. Done in ~15 minutes.
Full repo: github.com/pdegidio/cortex-homelab — MIT license.
What's your biggest source of homelab alert noise? I'm curious whether the noise filter patterns generalise beyond my stack or if everyone's list is completely different.











