How to Keep your Raspberry Pi Running with Watchdog?

Key Takeaways

  • Watchdog timers automatically reboot systems that become unresponsive, enabling self-recovery.
  • Using the built-in hardware WDT improves reliability for remote headless Pi projects.
  • Periodically write to the /dev/watchdog device within code to continually reset the timer.
  • Adjust WDT timeout durations and ping intervals based on application needs.
  • Follow best practices like frequent pinging and data storage for robustness.

Raspberry Pis are versatile single-board computers used for a wide variety of projects, but occasionally they can freeze or crash, especially when stressed with resource-intensive tasks. Using a watchdog timer (WDT) is an effective way to automatically reboot the Pi if it becomes unresponsive, ensuring continuous unattended operation.

This article explains what a watchdog timer is, why using one is beneficial for a Raspberry Pi, and provides a step-by-step guide to installing, configuring and enabling the built-in hardware WDT on a Raspberry Pi. We’ll cover choosing the timeout duration, activating and deactivating the watchdog, as well as best practices for utilizing this important tool.

What is a Watchdog Timer?

A watchdog timer (WDT) is a hardware circuit that triggers a system reset if the main program neglects to regularly service it. It operates based on the idea that if the Pi is running properly, the main program will toggle the WDT at least once during a predefined timeout period to prevent it from expiring.

If the program crashes or gets stuck in an endless loop or race condition, it will fail to toggle the WDT before the timer expires. The WDT will then initiate a hardware reset to reboot the Pi and restore normal operation. This automated recovery mechanism allows the Pi to recover without human intervention, minimizing downtime.
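
To make the mechanism concrete, here is a small software model of the idea. It is only an illustration: the class name SimulatedWatchdog and the 15 second timeout are assumptions chosen for the example, and no real hardware is touched.

```python
class SimulatedWatchdog:
    """Toy model of a watchdog timer driven by an explicit clock value."""

    def __init__(self, timeout_s, now=0.0):
        self.timeout_s = timeout_s
        self.last_ping = now

    def ping(self, now):
        """Service the timer: restart the countdown from `now`."""
        self.last_ping = now

    def expired(self, now):
        """Return True once a full timeout has passed without a ping."""
        return now - self.last_ping >= self.timeout_s


wdt = SimulatedWatchdog(timeout_s=15)
for t in (5, 10, 15, 20):       # a healthy program pings every 5 seconds
    wdt.ping(now=t)
print(wdt.expired(now=30))       # 10 s since the last ping -> False
print(wdt.expired(now=36))       # 16 s since the last ping -> True
```

A real hardware WDT does the same bookkeeping in silicon, which is why it keeps counting even when the software that should ping it has stopped running.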

Why Use a Watchdog Timer on a Raspberry Pi?

Using a WDT provides valuable protection for always-on Raspberry Pi projects, especially those running headless (without a keyboard or display). Here are some key examples of when utilizing a watchdog timer is beneficial:

  • Recover from software crashes or hardware freezes: Watchdog timers can automatically reboot the Pi if the system becomes unresponsive due to crashes, endless loops, race conditions, overheating issues or other faults. This minimizes downtime and data loss for critical applications.
  • Enable continuous remote operation: For remote sensor nodes, smart home controllers, IoT devices and server-based Pi projects, using a WDT ensures the system recovers with no human intervention needed on site. This enables reliable continuous operation.
  • Resume after power interruptions: The watchdog cannot act while the Pi has no power, but the Pi boots on its own once power returns, and a watchdog service enabled at boot resumes protection with no user input. It can also recover a system that a brief brown-out left powered but unresponsive.

Overall, utilizing the built-in hardware WDT provides an extra layer of robustness and reliability to Pi projects. The ability to automatically self-recover from crashes and freezes enables more resilient operation.

Installing and Enabling the Hardware Watchdog

Recent versions of the official Raspberry Pi OS include the hardware WDT driver in the kernel, so no driver needs to be built or downloaded. The watchdog daemon that services the timer is a separate package, however, and the watchdog does nothing until that daemon (or your own code) opens the device.

Here are the steps to install the required components and enable the watchdog timer:

  1. Update and Install Packages: Enter sudo apt update followed by sudo apt full-upgrade to fetch the latest packages, then sudo apt install watchdog to install the watchdog daemon if it is not already present.
  2. Verify the Watchdog Driver: The Pi's watchdog driver is bcm2835_wdt. Check that it has created the device node by entering ls /dev/watchdog*. On current kernels the driver is usually built in, so it may not appear in lsmod output. If the device node is missing, add dtparam=watchdog=on to config.txt on the boot partition and reboot.
  3. Activate Watchdog: Run sudo nano /etc/watchdog.conf to edit the configuration file and uncomment the watchdog-device line so that it reads watchdog-device = /dev/watchdog. Save and exit the file.
  4. Set Timeout Duration: Still within the watchdog.conf file, set the watchdog-timeout parameter to the desired time in seconds before reboot. The Pi's hardware watchdog accepts at most about 15 seconds, and the daemon typically fails to start if a larger value is requested.
  5. Enable Watchdog Service: Enter the commands sudo systemctl enable watchdog and sudo systemctl start watchdog to activate the watchdog daemon.

The hardware WDT is now active! The watchdog.conf file also contains additional settings like those controlling module loading at boot.
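
Before restarting the daemon, it can help to sanity-check the edited file. The sketch below parses watchdog.conf-style text and reports the two settings discussed above. The function names are hypothetical, and the 15 second ceiling is an assumption about the Pi's hardware watchdog that you should adjust if your model behaves differently.

```python
def parse_watchdog_conf(text):
    """Return a dict of the active `key = value` settings in watchdog.conf text."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_settings(settings, max_timeout=15):
    """List human-readable problems; max_timeout is an assumed hardware limit."""
    problems = []
    if settings.get("watchdog-device") != "/dev/watchdog":
        problems.append("watchdog-device is not set to /dev/watchdog")
    timeout = settings.get("watchdog-timeout")
    if timeout is None:
        problems.append("watchdog-timeout is not set")
    elif not timeout.isdigit() or not 1 <= int(timeout) <= max_timeout:
        problems.append(f"watchdog-timeout should be 1..{max_timeout} seconds")
    return problems


# Sample text in which the device line is still commented out.
sample = """
#watchdog-device = /dev/watchdog
watchdog-timeout = 15
"""
print(check_settings(parse_watchdog_conf(sample)))
```

On a real system you would pass in the contents of /etc/watchdog.conf instead of the sample string. An empty list means both settings look plausible.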

Configuring and Testing the Watchdog Timeout

With the watchdog functionality enabled, your code interacts with it through the /dev/watchdog device file. You need to write to, or “ping”, this file at least once during every timeout period to prevent the timer from expiring and triggering a reboot. Only one process can hold /dev/watchdog open at a time, so either let the watchdog daemon service the timer or ping it from your own code, not both.

There are three main approaches to configuring the timeout duration and testing operation of the WDT:

  • Set in Configuration File: You can adjust the value of watchdog-timeout in the /etc/watchdog.conf file to any duration from 1 second up to the hardware maximum. Restart the daemon with sudo systemctl restart watchdog for the change to take effect.
  • Runtime Reconfiguration: Code that holds /dev/watchdog open can change the timeout with the WDIOC_SETTIMEOUT ioctl call. Writing a number to the device does not change the timeout, because a write only pings the timer. A value set this way lasts until it is changed again or the Pi reboots.
  • Ping Watchdog: Write any value to /dev/watchdog at least once per timeout period to ping it, typically from a loop in a long-running process. To validate correct operation, stop the writes and time how long the Pi takes to reboot.

Adjusting the timeout duration and ping interval allows tuning based on the application requirements and expected program response time.
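
The Linux watchdog interface exposes the timeout through ioctl calls rather than through ordinary writes. The following sketch builds the request numbers for WDIOC_SETTIMEOUT and WDIOC_GETTIMEOUT using the generic ioctl encoding used on Arm and x86, which is an assumption about your platform. The helper names set_timeout and get_timeout are made up for this example.

```python
import fcntl
import struct

_IOC_WRITE, _IOC_READ = 1, 2


def _ioc(direction, type_char, number, size):
    """Build a Linux ioctl request number (generic layout, e.g. Arm and x86)."""
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | number


INT_SIZE = struct.calcsize("i")
WDIOC_SETTIMEOUT = _ioc(_IOC_READ | _IOC_WRITE, "W", 6, INT_SIZE)
WDIOC_GETTIMEOUT = _ioc(_IOC_READ, "W", 7, INT_SIZE)


def set_timeout(fd, seconds):
    """Ask the driver for a new timeout; return the value it actually applied."""
    result = fcntl.ioctl(fd, WDIOC_SETTIMEOUT, struct.pack("i", seconds))
    return struct.unpack("i", result)[0]


def get_timeout(fd):
    """Read the timeout currently programmed into the watchdog."""
    result = fcntl.ioctl(fd, WDIOC_GETTIMEOUT, struct.pack("i", 0))
    return struct.unpack("i", result)[0]
```

Both helpers expect a file object or descriptor for /dev/watchdog that your program has already opened as root. Keep in mind that opening the device arms the timer, so only try this in code that also pings it.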

Utilizing the Watchdog Properly in Projects

Here are some best practices to employ when incorporating a WDT into your Raspberry Pi projects:

  • Set the shortest feasible timeout duration that still allows proper operation. This minimizes potential data loss and downtime.
  • Periodically write to /dev/watchdog at least once per timeout period from the main program loop or a repeated timer. Make the writes more frequent than strictly necessary for robustness.
  • When using device sleep modes, disable the WDT or ping it from a wake timer. Allow time to fully resume before the timeout expires.
  • Do not assume that stopping your program stops the watchdog. Unless the device is closed after writing the magic character V (and the driver allows stopping), the timer keeps running and resets the Pi at the next timeout.
  • If possible, store critical data frequently to minimize losses. Write log events to disk or commit database transactions between each watchdog ping.

By integrating periodic writes to /dev/watchdog within your application code to keep resetting the timer, you can leverage the WDT for automatic recovery without needing to modify the core logic. Follow these strategies for reliable hands-free operation.
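
One way to package these practices is a small context manager that opens the device once, pings on demand, and writes the magic close character only on a clean exit. This is a sketch, not a library API: the WatchdogFeeder class and its opener parameter are hypothetical, and the opener exists only so the class can be exercised without real hardware.

```python
class WatchdogFeeder:
    """Keep a watchdog device open, ping it, and disarm it on clean exit."""

    def __init__(self, opener=None):
        # `opener` is a hypothetical hook returning an object with write()/close().
        self._opener = opener or (lambda: open("/dev/watchdog", "wb", buffering=0))
        self._dev = None

    def __enter__(self):
        self._dev = self._opener()      # opening the real device arms the timer
        return self

    def ping(self):
        self._dev.write(b"\0")          # any write resets the countdown

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self._dev.write(b"V")       # magic close: ask the driver to stop
        self._dev.close()               # after an exception the timer stays armed
        return False                    # never swallow the exception
```

Used as with WatchdogFeeder() as wd: followed by wd.ping() inside the main loop, a normal exit disarms the timer, while an unhandled exception leaves it armed so the Pi still reboots.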

Example Code for Interfacing with Watchdog

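Here is a simple Python example that demonstrates the watchdog. It pings the device once per second, well inside the hardware timeout of roughly 15 seconds, so the Pi is not rebooted while the loop keeps running. Run it as root, and stop the watchdog daemon first, because only one process can hold the device open.

python

import time

# Open the device once and keep it open; opening it arms the timer.
with open("/dev/watchdog", "wb", buffering=0) as watchdog:
    try:
        while True:
            watchdog.write(b"0")   # any write resets the countdown
            print("Ping")
            time.sleep(1)
    except KeyboardInterrupt:
        watchdog.write(b"V")       # magic close: disarm before exiting

Save this as watchdog-example.py and run it with sudo python3 watchdog-example.py. It keeps resetting the watchdog until you stop it with Ctrl+C, at which point it writes the magic character V so that the driver disarms the timer (unless the kernel was built with the nowayout option). If the loop stalls or the process is killed instead, the writes stop and the Pi reboots after about 15 seconds.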

Expanding on concepts from this sample code, you can integrate periodic writes to watchdog from within your main application. Place them in the core program loop, trigger it with system service timers, or use it in multi-threaded code to protect critical operations.
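
In multi-threaded code, a common pattern is to ping the watchdog only while every critical task is still reporting progress. The sketch below shows one possible shape for that check. HealthGate, beat and healthy are illustrative names, not part of any library.

```python
import threading
import time


class HealthGate:
    """Track per-task heartbeats and decide whether the watchdog may be pinged."""

    def __init__(self, max_silence_s, clock=time.monotonic):
        self.max_silence_s = max_silence_s
        self._clock = clock
        self._last_seen = {}
        self._lock = threading.Lock()

    def beat(self, task):
        """Called by a worker each time it finishes a unit of work."""
        with self._lock:
            self._last_seen[task] = self._clock()

    def healthy(self):
        """True only if every task that has reported did so recently."""
        now = self._clock()
        with self._lock:
            return all(now - seen <= self.max_silence_s
                       for seen in self._last_seen.values())
```

The thread that owns /dev/watchdog would call gate.healthy() before each write and skip the ping when it returns False, so a single stuck worker is enough to let the timer expire.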

Conclusion

The built-in hardware watchdog timer on the Raspberry Pi provides automated recovery functionality that can greatly increase system resilience for remote, unattended, and always-on projects. By actively resetting the timer in code every few seconds, you can leverage the WDT to reboot the system in case of crashes, hanging, power issues or other disruptions.

Following the installation steps above and adding simple functions that ping /dev/watchdog to your application code gives you robust protection. Tune the timeout duration and write frequency to your needs: a short timeout recovers faster, while a longer one avoids unnecessary reboots when the system is merely busy. With these best practices in place, you can keep your Pi running reliably 24/7.

Frequently Asked Questions  

How do I reactivate the watchdog after a reboot?
The watchdog daemon should start automatically on boot if you enabled it with systemctl. Check it is running with systemctl status watchdog and start it manually if needed.

Can I trigger events besides rebooting on watchdog timeout?
Unfortunately the built-in hardware WDT can only reset the system directly. To trigger other actions like shutting down services on timer expiry, you would need to implement a software watchdog.
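
As a rough illustration of such a software watchdog, the sketch below runs a callback when it has not been pinged in time. It is built on threading.Timer from the standard library, and the SoftwareWatchdog class is a made-up name. Because it runs inside the process it watches, it cannot help if the whole system hangs.

```python
import threading


class SoftwareWatchdog:
    """Run a callback if ping() is not called within timeout_s seconds."""

    def __init__(self, timeout_s, on_timeout):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout
        self._timer = None
        self._lock = threading.Lock()

    def ping(self):
        """Start the countdown, or restart it if it is already running."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.timeout_s, self.on_timeout)
            self._timer.daemon = True
            self._timer.start()

    def stop(self):
        """Cancel the countdown without running the callback."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

The callback could stop a service or write a log entry, while the hardware WDT stays in place as the last line of defense.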

How can I monitor watchdog status and track pings & resets?
Add log messages to your ping code to record each ping and any failed write. If you use the watchdog daemon, you can also follow its service log in real time with the journalctl -u watchdog -f command.
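
A minimal sketch of logged pings, using Python's standard logging module, might look like this. The logger name and the ping_and_log helper are assumptions for illustration, and dev is any already opened watchdog device object.

```python
import logging

log = logging.getLogger("watchdog-ping")


def ping_and_log(dev, count):
    """Ping an open watchdog device and log the outcome; return True on success."""
    try:
        dev.write(b"\0")
    except OSError as err:
        log.error("ping %d failed: %s", count, err)
        return False
    log.debug("ping %d ok", count)
    return True
```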

Is there sample code available for interfacing with the watchdog?
Yes, see the Python code provided in the “Example Code” section of this article for a simple script that pings the watchdog in a loop. The Linux kernel documentation also describes the watchdog API with C examples.

Why does my Pi reboot twice in a row on some watchdog timeouts?
This is likely related to watchdog settings not matching between the boot configuration and the runtime parameters. Try setting the timeout in one place only, either in your code or in watchdog.conf, rather than in both.

Can I use the watchdog with read-only filesystems?
Yes. /dev/watchdog is a device node, so writing to it pings the driver rather than storing data on disk, and it works even when the root filesystem is mounted read-only. A read-only root also reduces the risk of filesystem damage when the watchdog forces a reset.

Is the hardware WDT available on all Raspberry Pi models?
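The watchdog is part of the Broadcom SoC, and models back to the Pi Zero and the original Pi 1 include it. What varies is whether your OS image enables the driver, so check that /dev/watchdog exists. If it is not available on your system, you would need to fall back on a software watchdog implemented in your own code.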

Can I extend the max timeout beyond 16 seconds on a Pi 4?
Unfortunately not. The maximum timeout is limited by the size of the hardware countdown register, so roughly 15 seconds is the ceiling. If you need a longer grace period, keep pinging the hardware on schedule from the watchdog daemon or your own code, and apply the longer timeout in software on top of it.
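
The arithmetic behind a ceiling of this size can be sketched as follows. The two constants reflect my reading of the mainline bcm2835_wdt driver (a 20-bit counter ticking 65536 times per second) and should be treated as assumptions rather than a datasheet quote.

```python
COUNTER_MASK = 0x000FFFFF      # assumed 20-bit countdown field
TICKS_PER_SECOND = 1 << 16     # assumed tick rate: 65536 ticks per second

max_exact = COUNTER_MASK / TICKS_PER_SECOND   # full range of the counter
max_whole = COUNTER_MASK >> 16                # largest whole number of seconds
print(f"{max_exact:.5f} s exact, {max_whole} s in whole seconds")
```

Under these assumptions the counter spans just under 16 seconds, and a driver that works in whole seconds ends up with a maximum of 15.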

Should I disable the watchdog before shutting down the Pi?
Yes, it is good practice to explicitly stop the watchdog service with sudo systemctl stop watchdog before halting the system yourself. This prevents any risk of it interfering with clean shutdown.

What temperature range can the hardware WDT operate over?
The operating range matches standard Pi specs of 0°C to 50°C for most models. Periodically writing to the watchdog requires the SoC and RAM to be functioning, so outside this temperature envelope it may no longer protect the system.

Can I use the watchdog with RetroPie or Kodi?
Yes, you can enable it on RetroPie, Kodi-based systems or any other Linux distribution running on a Pi. Just activate the watchdog service or integrate pings into your code as outlined in this guide.

How do I know if my Pi has rebooted due to the WDT expiring?
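Running dmesg | grep -i watchdog after a reboot shows the driver's messages for the current boot only, which confirms that the watchdog is loaded. To see what led up to a reset, enable persistent journaling and inspect the previous boot with journalctl -b -1. A watchdog reset usually shows up as a log that ends abruptly with no normal shutdown messages.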

Is the hardware WDT different from a software watchdog timer?
Yes, the built-in hardware WDT is separate from software-only implementations. Hardware watchdogs are integrated directly into the Pi’s processor and provide maximum reliability but less configurability.

Can I modify or extend the watchdog driver code itself?
The Linux kernel includes the low-level watchdog driver (bcm2835_wdt on the Pi), but it is complex and not designed for casual editing. You can tune behavior via watchdog.conf or through the /dev/watchdog device interface rather than altering the driver itself.

How many times can I ping the watchdog before needing to reboot?
There is no set limit on the number of cycles before rebooting is required. The hardware WDT will continue operating indefinitely as long as you service it by writing to /dev/watchdog at the appropriate frequency.

Will using the watchdog impact performance of the Pi?
Periodically writing to the watchdog device does consume a small amount of CPU and memory bandwidth. But for most applications, the overhead is negligible, especially compared to the reliability benefits. Just tune ping frequency and timeouts appropriately.

Can device drivers trigger rebooting besides the main CPU?
On the Pi, only the core Arm processor can initiate a watchdog reboot. But for multi-core complex systems, device-specific drivers may have their own watchdog mechanisms that don’t reset the whole system.

Are there alternatives to using the built-in hardware watchdog?
If the built-in WDT is not available on your system, you can implement a software watchdog or add an external watchdog circuit connected via GPIO. Where the built-in hardware WDT is available, using this dedicated timer is the simplest and most foolproof option.

Can I cascade multiple Pi watchdogs in distributed systems?
Yes, you can potentially chain together reboot triggers from networked Pi nodes acting as watchdog timers for each other. If one node hangs, another external secondary Pi could then reset it after a timeout interval.
