My Raspberry Pi 5 Kept Vanishing From the Network — Diagnosing With journalctl and Fixing It With a Watchdog

The symptom

I run a headless Raspberry Pi 5 on Wi-Fi and reach it over SSH and Tailscale. Every so often it would stay powered on (LED lit) but completely vanish from the network:

SSH over LAN → timeout
SSH over Tailscale → timeout
RDP → timeout

It kept recurring, and every time I "fixed" it by power-cycling. Here is how I actually diagnosed it and set up auto-recovery.

Step 1: `tailscale status` tells you device-side vs your-side

When SSH hangs, it's tempting to assume "the Pi died." But the problem can be on your side (your PC's Wi-Fi / VPN). Rule that out first.

If you use Tailscale, tailscale status is the fastest check:

100.x.y.z   raspberrypi   user@   linux   active; relay "..."; offline, last seen 1h ago, tx 124956 rx 0

The key is the peer line: offline, last seen 1h ago. This comes from Tailscale's coordination server — it's a third-party fact, independent of your own SSH attempt.

peer offline → the Pi really did drop off the network
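peer not marked offline → it's your side that's broken (your VPN/network). Look for the word offline specifically: active can appear alongside it, as in the line above.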

last seen also tells you when it dropped. tx ... rx 0 means "you're sending, but getting zero bytes back."

Confirm your own side is healthy too: in the output of tailscale status --json, BackendState should be Running and Self.Online should be true.
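The peer-line check above can be scripted. This is a minimal sketch, not an official Tailscale tool: classify_peer is a hypothetical helper name, and it assumes the hostname is the second column of the plain tailscale status output, as in the sample line above.

```shell
#!/usr/bin/env bash
# classify_peer HOSTNAME (hypothetical helper): read `tailscale status`
# output on stdin and report whether the named peer is flagged offline.
classify_peer() {
  local host="$1" line
  # Pick the first row whose second column is exactly the hostname.
  line=$(awk -v h="$host" '$2 == h { print; exit }')
  if [ -z "$line" ]; then
    echo "unknown: no peer named $host"
  elif printf '%s\n' "$line" | grep -q 'offline'; then
    echo "device-side: $host is offline"
  else
    echo "your-side: $host is not marked offline"
  fi
}

# Usage (on your PC, not on the Pi):
#   tailscale status | classify_peer raspberrypi
```

It keys on the word offline rather than active, because the sample line shows both at once.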

Step 2: After recovery, read `journalctl -b -1` (the previous boot)

This is the part people miss. While the Pi is off the network you obviously can't pull its logs — you read them after it's back. But the moment you reboot, the "current boot" is brand new, so journalctl (default) and dmesg won't contain the moment it dropped.

The drop is in the previous boot:

# tail of the previous boot
journalctl -b -1 --no-pager | tail -30

# only wlan0 / network-related lines
journalctl -b -1 --no-pager | grep -iE "wlan0|cfg80211|brcmfmac|LinkChange|network is unreachable" | tail -40
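One prerequisite worth checking: journalctl -b -1 only has something to show if journald keeps logs on disk across reboots. With the default Storage=auto setting that is the case when /var/log/journal exists. The sketch below is a rough heuristic under that assumption; journal_is_persistent is a hypothetical helper name, not a systemd command.

```shell
#!/usr/bin/env bash
# journal_is_persistent [DIR] (hypothetical helper): succeed if DIR
# (default /var/log/journal) exists and is non-empty, i.e. journald has
# somewhere on disk to keep earlier boots.
journal_is_persistent() {
  local dir="${1:-/var/log/journal}"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Usage on the Pi:
#   journal_is_persistent && journalctl --list-boots
# If the check fails, setting Storage=persistent in /etc/systemd/journald.conf
# and restarting systemd-journald makes future boots readable with -b -1.
```

journalctl --list-boots prints one row per recorded boot, so you can also confirm that boot -1 really covers the time the Pi disappeared.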

What I found, in order:

19:57  avahi-daemon: Withdrawing address record ... on wlan0   # IPv6 address flapping
20:08  tailscaled: LinkChange: major, rebinding ... rebind-reason=[time-jumped(13m50s),ips-changed,protocols-changed]
22:06  tailscaled: ... connect: network is unreachable          # fully down

The important part: the logs kept flowing the whole time. So this was not a full OS freeze or OOM — only the wlan0 / network layer dropped while the OS stayed alive (no oom, no panic at the tail of journalctl -b -1).
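Rather than eyeballing the tail for OOM or panic lines, you can filter for them. This is a small sketch: fatal_lines is a hypothetical helper name, and the patterns are only the common kernel phrasings ("Out of memory", "oom-killer", "Kernel panic"), so an empty result is a hint rather than proof.

```shell
#!/usr/bin/env bash
# fatal_lines (hypothetical helper): read journal text on stdin, print lines
# that look like an OOM kill or a kernel panic, and fail if there are none.
fatal_lines() {
  grep -iE 'out of memory|oom-kill|kernel panic'
}

# Usage on the Pi (-k limits the output to kernel messages):
#   journalctl -b -1 -k --no-pager | fatal_lines || echo "no OOM or panic recorded"
```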

I couldn't pin the root trigger (why wlan0 dropped) from the logs — there was no explicit driver crash line, and vcgencmd get_throttled read 0x0 after recovery (no undervoltage flag). So from here it's a symptomatic fix for wlan0 drops in general.
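For reference, the value printed by vcgencmd get_throttled is a bit field. The sketch below decodes it using the bit meanings from the Raspberry Pi documentation; decode_throttled is a hypothetical helper name. One caveat: as far as I understand, the "has occurred" bits are reset at boot, so a 0x0 read after a power cycle describes the current boot only and cannot fully rule out undervoltage during the previous one.

```shell
#!/usr/bin/env bash
# decode_throttled VALUE (hypothetical helper): print one line per flag set
# in a get_throttled value such as 0x50005, or "no flags set" for 0x0.
decode_throttled() {
  local value=$(( $1 )) found=0 entry bit
  for entry in \
    "0:undervoltage now" \
    "1:ARM frequency capped now" \
    "2:throttled now" \
    "3:soft temperature limit active now" \
    "16:undervoltage has occurred since boot" \
    "17:ARM frequency capping has occurred since boot" \
    "18:throttling has occurred since boot" \
    "19:soft temperature limit has occurred since boot"
  do
    bit=${entry%%:*}                      # text before the first colon
    if [ $(( value & (1 << bit) )) -ne 0 ]; then
      echo "${entry#*:}"                  # text after the first colon
      found=1
    fi
  done
  [ "$found" -eq 1 ] || echo "no flags set"
}

# Usage on the Pi (the command prints e.g. throttled=0x0):
#   decode_throttled "$(vcgencmd get_throttled | cut -d= -f2)"
```

Checking it periodically while the Pi is still up, before it drops, would give a more useful reading than a single check after recovery.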

The fix: a NetworkManager watchdog

If "the OS is alive but only the network is down," then a cron job that watches for that state and restarts networking can recover it automatically. Raspberry Pi OS has used NetworkManager by default since Bookworm (on my Pi, /etc/resolv.conf starts with # Generated by NetworkManager), so that is the service to restart.

/home/pi/.local/bin/net-watchdog.sh:

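#!/usr/bin/env bash
# If the gateway is unreachable, restart NetworkManager to bring wlan0 back.
# Must run as root (systemctl restart needs it); the cron entry below does that.
GATEWAY=192.168.1.1   # replace with your router IP
LOG=/home/pi/logs/net-watchdog.log

# Send 3 pings with a 3-second timeout each; any reply means the network is fine.
if ping -c 3 -W 3 "$GATEWAY" >/dev/null 2>&1; then
  exit 0
fi

mkdir -p "$(dirname "$LOG")"   # make sure the log directory exists before writing
echo "$(date -Is) gateway $GATEWAY unreachable, restarting NetworkManager" >> "$LOG"
systemctl restart NetworkManager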

/etc/cron.d/net-watchdog (every 3 minutes, as root):

*/3 * * * * root /home/pi/.local/bin/net-watchdog.sh

Install:

chmod +x /home/pi/.local/bin/net-watchdog.sh
mkdir -p /home/pi/logs
echo "*/3 * * * * root /home/pi/.local/bin/net-watchdog.sh" | sudo tee /etc/cron.d/net-watchdog

Verify it does NOT restart when the network is fine

You don't want spurious restarts. Run it once by hand while the network is healthy:

/home/pi/.local/bin/net-watchdog.sh
echo $?        # → 0
cat /home/pi/logs/net-watchdog.log   # → no such file (= it did not restart anything)

While ping succeeds, nothing is logged and systemctl restart never runs. It only acts when the network is actually down.
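The failure path is worth exercising too, without actually dropping the network. The sketch below restates the watchdog's decision as a dry-run function that takes the probe command as its arguments and only prints what it would do. check_gateway is a hypothetical helper, not part of the installed script, and 192.0.2.1 is an address from the documentation-only TEST-NET-1 range that should normally never answer.

```shell
#!/usr/bin/env bash
# check_gateway PROBE_CMD... (hypothetical dry-run helper): run the probe;
# if it fails, print what the watchdog would do instead of doing it.
check_gateway() {
  if "$@" >/dev/null 2>&1; then
    echo "ok"
    return 0
  fi
  echo "would restart NetworkManager"
  return 1
}

# Healthy path:
#   check_gateway ping -c 3 -W 3 192.168.1.1
# Failure path (no reply expected from a TEST-NET-1 address):
#   check_gateway ping -c 1 -W 1 192.0.2.1
```

Because the probe is passed in, the same function can be tried with true and false as stand-ins before pointing it at a real address.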

Takeaways

"Powered on but gone from the network" is usually a network-layer drop — the OS is still alive.
Fastest triage: tailscale status (third-party offline record) → after recovery, journalctl -b -1 (the previous boot).
A gateway-ping → restart-NetworkManager watchdog in cron auto-recovers within minutes, as long as the OS keeps running.
Caveat: a cron-driven watchdog can't help if the OS itself stops executing.

I still haven't pinned the root cause, but at least I'm out of the "power-cycle it by hand every time" loop.
