My homelab runs the usual stack — Sonarr, Radarr, Prowlarr, qBittorrent, Plex. I was getting ntfy alerts at all hours for things like ffprobe metadata reads and HTTP 429s from indexers. Not actionable, just noise.
So I built Cortex: a monitoring layer that sends Docker logs through a local LLM (Ollama) every 30 minutes, filters the noise, and routes only meaningful alerts to my phone.
The problem with threshold-based monitoring
Standard monitoring tools watch numbers. CPU > 80%? Alert. Disk > 90%? Alert. That works for infrastructure — it doesn't work for application logs.
A Sonarr log line like:
[Warn] NzbDrone.Core.Download.TrackedDownloads.TrackedDownloadService:
Couldn't import album track / No files found are eligible for import
Is that a problem? Maybe. Depends on context. Is it a one-off, or has it been happening for 6 hours? Is the download queue healthy? Did the episode actually get imported by another path?
A fixed threshold can't answer that. A language model can.
Architecture
Docker logs → Cortex → Ollama (local LLM) → parsed report → ntfy
                 ↓
        Prometheus metrics
Every 30 minutes, cortex-monitor.py runs via cron:
- Collects recent log lines from each monitored container
- Filters known noise patterns (ffprobe, VideoFileInfoReader, HTTP 429, etc.)
- Sends the filtered logs to a local Ollama endpoint
- Parses the LLM response into structured alerts
- Routes alerts by priority — INFO goes to the daily digest, WARNING/CRITICAL go to ntfy immediately
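The five steps above can be sketched as a single pass per cron tick. This is an illustrative skeleton, not Cortex's actual code: the stage functions are injected as parameters so each one (log collection, noise filtering, the LLM call, routing, the ntfy push) can be swapped or faked independently.

```python
from typing import Callable

def run_once(
    collect: Callable[[str], list[str]],          # container name -> raw log lines
    filter_noise: Callable[[list[str]], list[str]],
    analyse: Callable[[str, list[str]], list[dict]],  # LLM call -> parsed alerts
    route: Callable[[dict], bool],                # cooldown / priority decision
    notify: Callable[[dict], None],               # ntfy push
    containers: list[str],
) -> int:
    """One cron tick: collect, filter, analyse, route. Returns alerts sent."""
    sent = 0
    for name in containers:
        lines = filter_noise(collect(name))
        if not lines:  # everything was known noise; skip the LLM call entirely
            continue
        for alert in analyse(name, lines):
            if route(alert):
                notify(alert)
                sent += 1
    return sent
```

Dependency injection also makes the pipeline trivially testable: pass lambdas for each stage and assert on what reaches `notify`.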
The Ollama Modelfile
The key is giving the LLM enough context to understand what it's reading. The Modelfile bakes in knowledge of the stack:
FROM qwen2.5:14b
PARAMETER temperature 0.2

SYSTEM """
You are an infrastructure monitoring assistant for a self-hosted homelab.
You analyse log output from Docker containers running *arr media services.
NOISE — these are NOT alerts:
- ffprobe metadata reads
- VideoFileInfoReader routine scans
- HTTP 429 rate limiting from indexers (expected, indexers throttle)
- Prowlarr health check on port 9696
SIGNAL — these ARE worth reporting:
- Import failures after successful downloads
- Indexer connectivity issues lasting > 30 minutes
- Download client queue stalls
- Authentication errors
- Database errors
Output format:
ALERT_LEVEL: INFO|WARNING|CRITICAL
SUMMARY: one sentence
DETAIL: what happened and why it matters
RECOMMENDATION: what to check or do
"""
A temperature of 0.2 keeps the output consistent and close to deterministic — you don't want creative variation in monitoring alerts.
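Calling the model and parsing its structured reply is then mechanical. A minimal sketch against Ollama's `/api/generate` endpoint (the endpoint and payload fields are Ollama's real API; `parse_report` and the field handling are my assumptions about how a response in the format above would be consumed):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def query_ollama(model: str, prompt: str) -> str:
    """Single non-streaming generation against a local Ollama instance."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        # temperature can also be set per-request instead of in the Modelfile
        "options": {"temperature": 0.2},
    })
    req = urllib.request.Request(
        OLLAMA_URL, data=payload.encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def parse_report(text: str) -> dict:
    """Parse an ALERT_LEVEL/SUMMARY/DETAIL/RECOMMENDATION block into a dict."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")  # split on the first colon only
        if key.strip() in {"ALERT_LEVEL", "SUMMARY", "DETAIL", "RECOMMENDATION"}:
            fields[key.strip().lower()] = value.strip()
    return fields
```

Keeping the output format rigid is what makes `parse_report` this simple; if the model drifts from the template, the missing keys are themselves a signal that something is off.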
Noise filtering before the LLM
The LLM call costs time (2-4 seconds on a local GPU). Filtering before sending keeps the context window clean and the latency low:
NOISE_PATTERNS = [
    "ffprobe",
    "VideoFileInfoReader",
    "429",
    "invalid torrent",
    "9696/",
]

def filter_noise(log_lines: list[str]) -> list[str]:
    return [
        line for line in log_lines
        if not any(pattern in line for pattern in NOISE_PATTERNS)
    ]
On a normal day, this drops 60-70% of log volume before it ever reaches Ollama.
Alert routing with cooldown
Not every WARNING needs an immediate ntfy push. Cortex uses a cooldown per alert type to avoid notification fatigue:
import time

# Seconds before the same container/level pair may fire again
# (example values — Cortex's actual numbers may differ).
COOLDOWNS = {"INFO": 86400, "WARNING": 3600, "CRITICAL": 900}

def route_alert(alert: dict, state: dict) -> bool:
    key = f"{alert['container']}:{alert['alert_level']}"
    last_sent = state.get(key, 0)
    cooldown = COOLDOWNS.get(alert['alert_level'], 3600)
    if time.time() - last_sent < cooldown:
        return False  # still in cooldown
    state[key] = time.time()
    return True
INFO alerts accumulate and go into the daily digest at 09:00. WARNING and CRITICAL go out via ntfy immediately, subject to the per-type cooldown above.
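Once routing says yes, the push itself is just an HTTP POST to an ntfy topic, with the alert level mapped onto ntfy's 1–5 priority scale. A sketch; the topic URL and the level-to-priority mapping are my assumptions, not Cortex's actual values:

```python
import urllib.request

NTFY_TOPIC_URL = "https://ntfy.sh/my-homelab-alerts"  # hypothetical topic

# ntfy priorities run 1 (min) to 5 (max); map alert levels onto them.
PRIORITY = {"INFO": "2", "WARNING": "4", "CRITICAL": "5"}

def ntfy_headers(alert: dict) -> dict:
    """Build the ntfy request headers for an alert."""
    return {
        "Title": f"{alert['container']}: {alert['summary']}",
        "Priority": PRIORITY.get(alert["alert_level"], "3"),
        "Tags": "rotating_light" if alert["alert_level"] == "CRITICAL" else "warning",
    }

def send_ntfy(alert: dict) -> None:
    """POST the alert detail as the message body; metadata rides in headers."""
    req = urllib.request.Request(
        NTFY_TOPIC_URL,
        data=alert["detail"].encode(),
        headers=ntfy_headers(alert),
        method="POST",
    )
    urllib.request.urlopen(req)
```

Putting the title, priority, and tags in headers and the detail in the body is ntfy's standard publish shape, so the same function works against ntfy.sh or a self-hosted instance.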
The daily digest
Every morning at 09:00, cortex-digest.py sends a summary via ntfy:
📊 Cortex Daily Digest — 2026-04-17
Containers: 5/5 healthy
Alerts last 24h: 2 (1 WARNING, 1 INFO)
Noise filtered: 847 log entries
Top event: prowlarr indexer timeout on NZBgeek (non-critical)
Recommendation: check NZBgeek API key expiry
Imports: 4 episodes, 3 movies — all clean
One message per day with everything that actually happened. No alert fatigue.
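Rendering that digest is a pure formatting step over the counters accumulated during the day. A minimal sketch; the stats-dict field names are illustrative, not Cortex's actual schema:

```python
def build_digest(stats: dict) -> str:
    """Render the daily digest body from accumulated counters."""
    warnings = stats.get("WARNING", 0)
    infos = stats.get("INFO", 0)
    lines = [
        f"📊 Cortex Daily Digest — {stats['date']}",
        f"Containers: {stats['healthy']}/{stats['total']} healthy",
        f"Alerts last 24h: {warnings + infos} ({warnings} WARNING, {infos} INFO)",
        f"Noise filtered: {stats['noise_filtered']} log entries",
    ]
    return "\n".join(lines)
```

Because it is a pure function of the day's counters, the digest is easy to test and easy to extend with extra lines (top event, import summary) without touching the delivery path.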
Prometheus metrics
cortex-exporter.py exposes metrics on port 9192 for Grafana:
cortex_alerts_total
cortex_last_run_timestamp
cortex_containers_monitored
cortex_noise_filtered_total
cortex_digest_last_sent
The last-run timestamp is particularly useful — a Grafana query like `time() - cortex_last_run_timestamp` gives the age of the last run; if Cortex stops running, the age climbs and triggers a Grafana alert.
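The exporter itself can be tiny. In practice you would likely use the `prometheus_client` library; a dependency-free sketch of the same idea renders the Prometheus text exposition format by hand (the stub metric values are placeholders, read from Cortex's state in the real exporter):

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder values; the real exporter reads these from Cortex's state file.
METRICS = {
    "cortex_alerts_total": 2,
    "cortex_last_run_timestamp": int(time.time()),
    "cortex_containers_monitored": 5,
    "cortex_noise_filtered_total": 847,
}

def render_metrics(metrics: dict) -> str:
    """Prometheus text exposition format: one `name value` line per metric."""
    return "".join(f"{name} {value}\n" for name, value in metrics.items())

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = render_metrics(METRICS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 9192) -> None:
    """Block forever serving /metrics for Grafana/Prometheus to scrape."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Prometheus only needs `name value` lines (plus optional `# HELP`/`# TYPE` comments), so a hand-rolled exporter stays well under fifty lines.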
Hardware requirements
- CPU-only: 16GB RAM minimum — runs qwen2.5:7b adequately
- GPU: 8GB VRAM — runs qwen2.5:14b comfortably (recommended)
I run it on a machine with a modest GPU. The 30-minute cron cadence means inference load is negligible — one batch call every half hour, not a continuous service.
Getting started
git clone https://github.com/pdegidio/cortex-homelab.git
cd cortex-homelab
bash install.sh
The installer walks you through Ollama endpoint, ntfy config, container names, and cron setup. Done in ~15 minutes.
Full repo: github.com/pdegidio/cortex-homelab — MIT license.
What's your biggest source of homelab alert noise? I'm curious whether the noise filter patterns generalise beyond my stack or if everyone's list is completely different.











