The symptom
I run a headless Raspberry Pi 5 (over Wi-Fi) via SSH and Tailscale. Every so often it would stay powered on (LED lit) but completely vanish from the network:
- SSH over LAN → timeout
- SSH over Tailscale → timeout
- RDP → timeout
It kept recurring, and every time my only "fix" was a power cycle. Here is how I actually diagnosed it and set up auto-recovery.
Step 1: tailscale status tells you device-side vs your-side
When SSH hangs, it's tempting to assume "the Pi died." But the problem can be on your side (your PC's Wi-Fi / VPN). Rule that out first.
If you use Tailscale, tailscale status is the fastest check:
100.x.y.z raspberrypi user@ linux active; relay "..."; offline, last seen 1h ago, tx 124956 rx 0
The key is the peer line: offline, last seen 1h ago. This comes from Tailscale's coordination server — it's a third-party fact, independent of your own SSH attempt.
- peer offline → the Pi really did drop off the network
- peer active/online → it's your side that's broken (your VPN/network)
last seen also tells you when it dropped. tx ... rx 0 means "you're sending, but getting zero bytes back."
Confirm your own side is healthy too via tailscale status --json → BackendState=Running / Self.Online=true.
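The device-side vs your-side rule can be sketched as a small shell helper. This is a minimal sketch, not an official Tailscale interface: peer_state is a hypothetical function name, and it assumes the human-readable tailscale status layout shown above (hostname in the second column, the word "offline" somewhere in the status text), which may differ between Tailscale versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify one peer from `tailscale status` text on stdin.
# Assumes the hostname is the second whitespace-separated column and that an
# offline peer has the word "offline" in its status text, as in the sample line.
peer_state() {
    awk -v host="$1" '
        $2 == host {
            found = 1
            if ($0 ~ /offline/) print "offline"; else print "online"
            exit
        }
        END { if (!found) print "unknown" }
    '
}

# Usage (on your own machine, not the Pi):
#   tailscale status | peer_state raspberrypi
```

An "offline" result points at the Pi, "online" points back at your own machine, and "unknown" means the hostname was not in the list at all.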
Step 2: After recovery, read journalctl -b -1 (the previous boot)
This is the part people miss. While the Pi is off the network you obviously can't pull its logs — you read them after it's back. But the moment you reboot, the "current boot" is brand new, so journalctl (default) and dmesg won't contain the moment it dropped.
The drop is in the previous boot:
# tail of the previous boot
journalctl -b -1 --no-pager | tail -30
# only wlan0 / network-related lines
journalctl -b -1 --no-pager | grep -iE "wlan0|cfg80211|brcmfmac|LinkChange|network is unreachable" | tail -40
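One assumption is hidden in -b -1: it only works if journald keeps logs across reboots. If the journal is volatile (kept in RAM only), the previous boot is gone after a power cycle. A rough way to check is to look for a -1 entry in journalctl --list-boots. The helper below is a sketch, and has_previous_boot is a name made up for this example.

```shell
# Hypothetical helper: succeed if `journalctl --list-boots` output on stdin
# contains an entry for the previous boot (index -1 in the first column).
has_previous_boot() {
    awk '$1 == "-1" { found = 1 } END { exit !found }'
}

# Usage on the Pi:
#   if journalctl --list-boots | has_previous_boot; then
#       echo "previous boot is available"
#   else
#       echo "no previous boot: journal is probably not persistent"
#   fi
```

If the check fails, setting Storage=persistent in /etc/systemd/journald.conf and restarting systemd-journald should make later boots survive a reboot. It cannot bring back a boot that was already lost.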
What I found, in order:
19:57 avahi-daemon: Withdrawing address record ... on wlan0 # IPv6 address flapping
20:08 tailscaled: LinkChange: major, rebinding ... rebind-reason=[time-jumped(13m50s),ips-changed,protocols-changed]
22:06 tailscaled: ... connect: network is unreachable # fully down
The important part: the logs kept flowing the whole time. So this was not a full OS freeze or OOM — only the wlan0 / network layer dropped while the OS stayed alive (no oom, no panic at the tail of journalctl -b -1).
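That reasoning (no OOM, no panic, network errors present) can be turned into a quick filter. The sketch below is an illustration only: classify_drop is a hypothetical name, and the patterns are typical OOM-killer and panic wording that I'm assuming, not an exhaustive list.

```shell
# Hypothetical helper: read previous-boot log text on stdin and guess the
# failure class. The patterns are assumptions: common OOM-killer and panic
# wording, plus the "network is unreachable" error quoted above.
classify_drop() {
    log_text=$(cat)
    if printf '%s\n' "$log_text" | grep -qiE 'out of memory|oom-kill|kernel panic'; then
        echo "os-level"
    elif printf '%s\n' "$log_text" | grep -qiE 'network is unreachable'; then
        echo "network-only"
    else
        echo "inconclusive"
    fi
}

# Usage on the Pi:
#   journalctl -b -1 --no-pager | tail -200 | classify_drop
```

A "network-only" result matches what I saw; "os-level" would mean a cron watchdog is the wrong tool.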
I couldn't pin the root trigger (why wlan0 dropped) from the logs: there was no explicit driver crash line, and vcgencmd get_throttled read 0x0 after recovery (no undervoltage flag). So from here it's a symptomatic fix for wlan0 drops in general.
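For reference, the value from vcgencmd get_throttled is a bit field, and a small decoder is easier to read than raw hex. This is a sketch based on the bit meanings in the Raspberry Pi documentation as I understand them (bits 0 to 3 are "happening now", bits 16 to 19 are "has occurred"); decode_throttled and its labels are made up for this example. One caution: as far as I know the "has occurred" bits are tracked since the last boot, so a 0x0 read after a power cycle does not fully rule out under-voltage during the boot that dropped.

```shell
# Hypothetical helper: decode the hex value printed by `vcgencmd get_throttled`.
# Accepts either "throttled=0x50005" or a bare "0x50005".
decode_throttled() {
    value=${1#throttled=}
    value=${value:-0}
    flags=""
    # bit:label pairs (assumed from the Raspberry Pi documentation)
    for pair in 0:under-voltage-now 1:freq-capped-now 2:throttled-now 3:soft-temp-limit-now \
                16:under-voltage-occurred 17:freq-capped-occurred \
                18:throttled-occurred 19:soft-temp-limit-occurred; do
        bit=${pair%%:*}
        label=${pair#*:}
        if [ $(( ($value >> $bit) & 1 )) -eq 1 ]; then
            flags="$flags $label"
        fi
    done
    if [ -z "$flags" ]; then
        echo "ok"
    else
        echo "${flags# }"
    fi
}

# Usage on the Pi:
#   decode_throttled "$(vcgencmd get_throttled)"
```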
The fix: a NetworkManager watchdog
If "the OS is alive but only the network is down," then a cron job that watches for it and restarts networking can auto-recover. Recent Pi OS uses NetworkManager (resolv.conf says # Generated by NetworkManager), so restart that.
/home/pi/.local/bin/net-watchdog.sh:
#!/usr/bin/env bash
# If the gateway is unreachable, restart NetworkManager to bring wlan0 back.
GATEWAY=192.168.1.1 # replace with your router IP
LOG=/home/pi/logs/net-watchdog.log

# Send 3 pings, waiting up to 3 s for each reply; any reply counts as healthy.
if ping -c 3 -W 3 "$GATEWAY" >/dev/null 2>&1; then
    exit 0
fi

echo "$(date -Is) gateway $GATEWAY unreachable, restarting NetworkManager" >> "$LOG"
systemctl restart NetworkManager
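Hard-coding the gateway works, but it breaks if the Pi moves to another network. One option is to read the gateway from the routing table and fall back to the fixed value. This is a sketch, not part of the script above: default_gateway is a hypothetical helper, and it assumes the usual ip route output format (default via <ip> dev <iface> ...).

```shell
# Hypothetical helper: print the first default-route gateway from `ip route`
# output on stdin, or the fallback in $1 if there is no default route.
default_gateway() {
    awk -v fallback="$1" '
        $1 == "default" && $2 == "via" { print $3; found = 1; exit }
        END { if (!found) print fallback }
    '
}

# Usage inside the watchdog:
#   GATEWAY=$(ip route show default | default_gateway 192.168.1.1)
```

The fallback matters: when wlan0 is down the default route is often gone too, which is exactly when the watchdog needs an address to ping.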
/etc/cron.d/net-watchdog (every 3 minutes, as root):
*/3 * * * * root /home/pi/.local/bin/net-watchdog.sh
Install:
chmod +x /home/pi/.local/bin/net-watchdog.sh
mkdir -p /home/pi/logs
echo "*/3 * * * * root /home/pi/.local/bin/net-watchdog.sh" | sudo tee /etc/cron.d/net-watchdog
Verify it does NOT restart when the network is fine
You don't want spurious restarts. Run it once by hand while the network is healthy:
/home/pi/.local/bin/net-watchdog.sh
echo $? # → 0
cat /home/pi/logs/net-watchdog.log # → no such file (= it did not restart anything)
While ping succeeds, nothing is logged and systemctl restart never runs. It only acts when the network is actually down.
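That check only exercises the healthy path. To exercise the failure path without actually taking the network down, the decision logic can be pulled into a function whose probe and recovery commands are passed in. This is a sketch of that idea, not the script above; watchdog_check and its three arguments are hypothetical.

```shell
# Hypothetical refactoring: run the probe command ($1); if it fails, append a
# line to the log file ($3) and run the recovery command ($2). Commands are
# passed as strings and run through `sh -c` so stand-ins like `false` work.
watchdog_check() {
    probe=$1
    recover=$2
    log=$3
    if sh -c "$probe" >/dev/null 2>&1; then
        return 0
    fi
    echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') probe failed, running: $recover" >> "$log"
    sh -c "$recover"
}

# Real use would look like:
#   watchdog_check 'ping -c 3 -W 3 192.168.1.1' 'systemctl restart NetworkManager' /home/pi/logs/net-watchdog.log
# Dry run of the failure path, with harmless stand-ins:
#   watchdog_check false 'echo would-restart' /tmp/net-watchdog-test.log
```

With false as the probe, the recovery command runs and one line lands in the log; with true as the probe, neither happens.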
Takeaways
- "Powered on but gone from the network" is usually a network-layer drop — the OS is still alive.
- Fastest triage: tailscale status (third-party offline record) → after recovery, journalctl -b -1 (the previous boot).
- A gateway-ping → restart-NetworkManager watchdog in cron auto-recovers within minutes, as long as the OS keeps running.
- Caveat: a cron-driven watchdog can't help if the OS itself stops executing.
I still haven't pinned the root cause, but at least I'm out of the "power-cycle it by hand every time" loop.












