From userspace, how can I tell if Linux's soft watchdog is configured with no way out?

悲哀的现实 2021-02-01 23:21

I am writing a system monitor for Linux and want to include some watchdog functionality. In the kernel, you can configure the watchdog to keep going even if /dev/watchdog is clo

3 Answers
  • 2021-02-02 00:03

    I think the watchdog device drivers are really intended for use on embedded platforms (or at least well-controlled ones), where the developers control which kernel is in use.

    This could be considered an oversight, but I don't think it is.

    One other thing you could try: if the watchdog was built as a loadable module, unloading it will presumably abort the shutdown?

  • 2021-02-02 00:18

    A watchdog guards against the system hard-locking, whether because of a software crash or a hardware failure.

    What you need is a daemon-monitoring daemon (dmd). Check out 'monit'.

  • 2021-02-02 00:23

    AHA! After digging through the kernel's linux/watchdog.h and drivers/watchdog/softdog.c, I was able to determine the capabilities of the softdog ioctl() interface. Looking at the capabilities that it announces in struct watchdog_info:

    static struct watchdog_info ident = {
        .options          = WDIOF_SETTIMEOUT |
                            WDIOF_KEEPALIVEPING |
                            WDIOF_MAGICCLOSE,
        .firmware_version = 0,
        .identity         = "Software Watchdog",
    };
    

    It does support a magic close, which (it seems) overrides CONFIG_WATCHDOG_NOWAYOUT. So, when terminating normally, I have to write the single character 'V' to /dev/watchdog and then close it, and the timer will stop counting.
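
    The write-then-close sequence described above can be sketched as a small helper. The name watchdog_magic_close() is made up for this illustration and is not part of any kernel or libc API. It assumes fd is a descriptor already opened on /dev/watchdog, and it only sends the character and closes; whether the timer really stops still depends on the driver honoring magic close.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel or libc function): write the magic
 * character 'V' to an open watchdog descriptor, then close it.
 *
 * Returns 0 if both the write and the close succeeded, -1 otherwise.
 * If the write fails, the descriptor is left open on purpose so the
 * caller can keep feeding the watchdog instead of abandoning it.
 */
int watchdog_magic_close(int fd)
{
    ssize_t n;

    /* Retry if a signal interrupts the write. */
    do {
        n = write(fd, "V", 1);
    } while (n == -1 && errno == EINTR);

    if (n != 1)
        return -1;

    return close(fd);
}
```

    Leaving the descriptor open when the write fails is a deliberate choice in this sketch: closing without the magic character would leave a running timer that nobody is feeding.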

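    A simple ioctl() with WDIOC_GETSUPPORT on a file descriptor for /dev/watchdog lets you determine whether this flag is set. A sketch in C:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/watchdog.h>

    int main(void)
    {
        struct watchdog_info info;
        int fd;

        fd = open("/dev/watchdog", O_WRONLY);
        if (fd == -1) {
            perror("open");
            // abort, timer did not start - no additional concerns
            return 1;
        }

        // The open succeeded, so the timer is now running.
        if (ioctl(fd, WDIOC_GETSUPPORT, &info) == -1) {
            perror("ioctl");
            // abort, but you probably started the timer! See below.
            return 1;
        }

        if (info.options & WDIOF_MAGICCLOSE) {
            printf("Watchdog supports magic close char\n");
            // You have started the timer here! Handle that appropriately.
        }

        // Returning closes fd without the magic char; the timer keeps running.
        return 0;
    }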
    

    When working with hardware watchdogs, you might want to open the device with O_NONBLOCK so that ioctl(), rather than open(), is the call that blocks (which lets you detect a busy card).

    If WDIOF_MAGICCLOSE is not supported, one should just assume that the soft watchdog is configured with NOWAYOUT. Remember, just opening the device successfully starts the countdown. If all you're doing is probing to see if it supports magic close and it does, then magic close it. Otherwise, be sure to deal with the fact that you now have a running watchdog.

    Unfortunately, there's no real way to know for sure without actually starting it, at least not that I could find.
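
    Putting the probing advice together, here is a minimal sketch of a probe that either magic-closes the device or hands the still-open descriptor back to the caller. The names probe_watchdog(), wd_has_magic_close() and the wd_probe_result enum are invented for this example; only open(), ioctl(), write(), close(), WDIOC_GETSUPPORT and WDIOF_MAGICCLOSE are real. As noted above, calling it on a real watchdog device starts the timer.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/watchdog.h>

/* Hypothetical result codes for this sketch. */
enum wd_probe_result {
    WD_OPEN_FAILED,   /* open() failed, so the timer never started */
    WD_MAGIC_CLOSED,  /* magic close was supported and 'V' was sent */
    WD_RUNNING        /* timer started; assume it cannot be stopped */
};

/* Returns non-zero if the driver announces magic close support. */
int wd_has_magic_close(const struct watchdog_info *info)
{
    return (info->options & WDIOF_MAGICCLOSE) != 0;
}

/*
 * Open the device at path, ask for its capabilities, and magic-close it
 * if possible. On WD_RUNNING, *fd_out holds the open descriptor so the
 * caller can keep feeding the watchdog; otherwise *fd_out is -1.
 */
enum wd_probe_result probe_watchdog(const char *path, int *fd_out)
{
    struct watchdog_info info;
    int fd = open(path, O_WRONLY);

    *fd_out = -1;
    if (fd == -1)
        return WD_OPEN_FAILED;

    if (ioctl(fd, WDIOC_GETSUPPORT, &info) == 0 &&
        wd_has_magic_close(&info) &&
        write(fd, "V", 1) == 1) {
        close(fd);
        return WD_MAGIC_CLOSED;
    }

    /* No magic close, or the probe failed: treat it as NOWAYOUT. */
    *fd_out = fd;
    return WD_RUNNING;
}
```

    A caller would typically treat WD_RUNNING as a commitment: keep the returned descriptor and feed the watchdog for as long as the process lives.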
