watchdog

file: /proc/sys/kernel/watchdog
variable: kernel.watchdog
Official reference

A watchdog timer is provided that implements the Linux-standard watchdog timer interface. It has several module parameters that can be used to control it::

    modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type> preaction=<preaction type> preop=<preop type> start_now=x nowayout=x ifnum_to_use=n panic_wdt_timeout=<t>

ifnum_to_use specifies which interface the watchdog timer should use. The default is -1, which means to pick the first one registered.

The timeout is the number of seconds until the action is taken, and the pretimeout is the number of seconds before the reset at which the pre-timeout panic will occur (if pretimeout is zero, the pretimeout is not enabled). Note that the pretimeout is measured back from the final timeout: if the timeout is 50 seconds and the pretimeout is 10 seconds, the pretimeout occurs after 40 seconds (10 seconds before the timeout). The panic_wdt_timeout is the timeout value that is set on kernel panic, so that actions such as kdump can run during the panic.
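The timing rule can be checked with a small calculation. This is a minimal sketch, not driver code: the function name `pretimeout_fires_at` is hypothetical, and the range checks (a positive timeout, a pretimeout smaller than the timeout) are assumptions made for the illustration rather than limits stated by the driver.

```python
from typing import Optional


def pretimeout_fires_at(timeout: int, pretimeout: int) -> Optional[int]:
    """Return how many seconds after the timer starts the pretimeout fires.

    Returns None when pretimeout is 0, which means the pretimeout is disabled.
    The range checks below are assumptions for this sketch only.
    """
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    if pretimeout < 0 or pretimeout >= timeout:
        raise ValueError("pretimeout must be in the range [0, timeout)")
    if pretimeout == 0:
        return None
    # The pretimeout is counted back from the final timeout.
    return timeout - pretimeout
```

With a timeout of 50 and a pretimeout of 10, the function returns 40, matching the example in the text.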

The action may be “reset”, “power_cycle”, or “power_off”. It specifies what to do when the timer times out, and defaults to “reset”.

The preaction may be “pre_smi” for an indication through the SMI interface, “pre_int” for an indication through the SMI with an interrupt, and “pre_nmi” for an NMI on a preaction. This is how the driver is informed of the pretimeout.

The preop may be set to “preop_none” for no operation on a pretimeout, “preop_panic” to set the preoperation to panic, or “preop_give_data” to provide data to read from the watchdog device when the pretimeout occurs. A “pre_nmi” setting CANNOT be used with “preop_give_data” because you can’t do data operations from an NMI.
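The option values and the one forbidden combination described above can be captured in a small pre-flight check. This is a sketch under stated assumptions: the function name `check_watchdog_options` is hypothetical, and the sets contain only the values named in this document, so the driver may well accept values that this check rejects.

```python
# Values taken from the text above; the driver may accept others.
ACTIONS = {"reset", "power_cycle", "power_off"}
PREACTIONS = {"pre_smi", "pre_int", "pre_nmi"}
PREOPS = {"preop_none", "preop_panic", "preop_give_data"}


def check_watchdog_options(action="reset", preaction=None, preop=None):
    """Return a list of problems found in the chosen options (empty if none)."""
    problems = []
    if action not in ACTIONS:
        problems.append(f"unknown action: {action}")
    if preaction is not None and preaction not in PREACTIONS:
        problems.append(f"unknown preaction: {preaction}")
    if preop is not None and preop not in PREOPS:
        problems.append(f"unknown preop: {preop}")
    # Data operations cannot be done from an NMI, so this pair is invalid.
    if preaction == "pre_nmi" and preop == "preop_give_data":
        problems.append("pre_nmi cannot be used with preop_give_data")
    return problems
```

Running such a check before loading the module catches the invalid pairing early, instead of leaving it to be discovered at load time.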

When preop is set to “preop_give_data”, one byte becomes ready to read on the device when the pretimeout occurs. The select and fasync mechanisms also work on the device.
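A user-space program could wait for that byte with select. The following is a minimal sketch: the helper name `wait_for_pretimeout` is hypothetical, and it works on any readable file descriptor, so nothing in it is specific to the IPMI driver.

```python
import os
import select


def wait_for_pretimeout(fd, wait_seconds):
    """Wait until fd becomes readable or wait_seconds elapse.

    Returns the single byte read from fd, or None if nothing arrived in time.
    """
    readable, _, _ = select.select([fd], [], [], wait_seconds)
    if not readable:
        return None
    return os.read(fd, 1)
```

In practice `fd` would come from opening the watchdog device for reading and writing, for example `os.open("/dev/watchdog", os.O_RDWR)`. That device path is an assumption here, since this document does not name it. Keep in mind that opening the device starts the timer on a standard watchdog interface.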

If start_now is set to 1, the watchdog timer will start running as soon as the driver is loaded.

If nowayout is set to 1, the watchdog timer will not stop when the watchdog device is closed. The default value of nowayout is true if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not.

When compiled into the kernel, the kernel command line is available for configuring the watchdog::

    ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t> ipmi_watchdog.action=<action type> ipmi_watchdog.preaction=<preaction type> ipmi_watchdog.preop=<preop type> ipmi_watchdog.start_now=x ipmi_watchdog.nowayout=x ipmi_watchdog.panic_wdt_timeout=<t>

The options are the same as the module parameter options.
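The two forms differ only in the `ipmi_watchdog.` prefix, which a short helper can make explicit. This is an illustrative sketch: the function name `format_params` and the example values are hypothetical, and the helper only builds strings without loading any module.

```python
def format_params(params, builtin=False):
    """Render watchdog options as modprobe arguments or kernel command-line options.

    With builtin=True each name gets the "ipmi_watchdog." prefix used on the
    kernel command line; otherwise the bare module parameter names are used.
    """
    prefix = "ipmi_watchdog." if builtin else ""
    return " ".join(f"{prefix}{name}={value}" for name, value in params.items())


# Example values chosen for illustration only.
params = {"timeout": 50, "pretimeout": 10, "action": "reset"}
module_args = format_params(params)
cmdline_args = format_params(params, builtin=True)
```

Here `module_args` is the argument string to append to `modprobe ipmi_watchdog`, and `cmdline_args` is the equivalent text for the kernel command line.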

The watchdog will panic and start a 120 second reset timeout if it gets a pre-action. During a panic or a reboot, the watchdog, if it is running, will start a 120 second timer to make sure the reboot occurs.

Note that if you use the NMI preaction for the watchdog, you MUST NOT use the NMI watchdog. There is no reasonable way to tell whether an NMI comes from the IPMI controller, so the driver must assume that any otherwise unhandled NMI is from IPMI, and it will panic immediately.

Once you open the watchdog timer, you must write a ‘V’ character to the device to close it, or the timer will not stop. This is a new semantic for the driver, but makes it consistent with the rest of the watchdog drivers in Linux.
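The close sequence can be wrapped in a context manager so that the ‘V’ character is written only on a clean exit. This is a sketch under stated assumptions: the name `watchdog` is hypothetical, and the caller supplies the device path because this document does not name it (`/dev/watchdog` is the usual location). Skipping the ‘V’ when the body raises is a design choice of this sketch, made so that a crashing program leaves the timer running.

```python
from contextlib import contextmanager


@contextmanager
def watchdog(path):
    """Open the watchdog device at path and stop the timer only on a clean exit."""
    dev = open(path, "wb", buffering=0)
    try:
        yield dev
        # Reached only if the body did not raise: write the magic character
        # so that closing the device stops the timer.
        dev.write(b"V")
    finally:
        # Without the preceding 'V', closing leaves the timer running.
        dev.close()
```

Inside the `with` block, the program keeps the timer alive by writing to `dev` periodically, as with other Linux watchdog drivers. If nowayout is set to 1, the timer keeps running after the close regardless of the ‘V’ character.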