Running a persistent program from a udev rule

The RUN key in a udev(7) rule does not allow persistent programs to be started. udevd(8) uses the process group to find any children and kills them after an arbitrary timeout, with kill -9 no less. This leaves a gap in Linux userland: it is not possible to monitor hardware events and start arbitrary applications on behalf of the user. inotify won't help either, since it doesn't handle pseudo-filesystems like /proc.

Searching the Internet yielded some clues and incomplete examples of defeating this mechanism. One solution was using part of systemd, systemctl, to start a service. This is no solution since the CRUX distribution and others run (gasp) without systemd.

The solution is a launcher that destroys the trail of breadcrumbs udevd uses to track down and destroy any long-running children, including its own workers.

 1     sigemptyset(&mask);
 2     sigaddset(&mask, SIGCHLD);
 3     if (sigprocmask(SIG_BLOCK, &mask, &savemask) == -1)
 4             syslog(LOG_ERR,"cannot block SIGCHLD");
 5 
 6     pid = fork();
 7     if(pid) {
 8             do {
 9                 pid_changed = waitpid(pid, &wait_status, 0);
10             } while (pid_changed == -1 && errno == EINTR);
11 
12             if (pid_changed != pid)
13                 syslog(LOG_ERR, "error waiting for child %d", pid);
14 
15             if (WIFEXITED(wait_status)) {
16                 int rc = WEXITSTATUS(wait_status);
17 
18                 if (rc != 0)
19                     syslog(LOG_ERR, "child returned error exit status %d", rc);
20             } else if (WIFSIGNALED(wait_status)) {
21                 int signum = WTERMSIG(wait_status);
22 
23                 syslog(LOG_ERR, "child was killed by signal %d", signum);
24             } else {
25                 syslog(LOG_ERR, "unexpected status %d waiting for child", wait_status);
26             }
27 
28             _exit(0);               /* exit parent after forking child */
29     }
30 
31     /* redirect stdin, stdout, and stderr to /dev/null */
32     devnull_fd = open("/dev/null", O_RDWR);
33     if (devnull_fd < 0) {
34             syslog(LOG_DEBUG, "unable to open '%s'", "/dev/null");
35             _exit(1);
36     }
37 
38     dup2(devnull_fd, 0); /* stdin */
39     dup2(devnull_fd, 1); /* stdout */
40     dup2(devnull_fd, 2); /* stderr */
41 
42     /* Now close all extra fds. */
43     for (i = getdtablesize() - 1; i >= 3; --i)
44             close(i);
45 
46     /* change to a new process group to detach from udevd */
47     if(setpgid(0, 0))
48             syslog(LOG_DEBUG, "Error on setpgid(): %s", strerror(errno));
49 
50     pid = fork();
51     if(pid) {
52             _exit(0);               /* exit parent after forking child */
53     }
54 
55     if (sigprocmask(SIG_SETMASK, &savemask, NULL) == -1)
56             syslog(LOG_ERR, "failed restoring previous signal mask");
57 
58 /* Above is the important bit. Then an environment is built and */
59 /* the actual program is executed */
60 
61     rc = execve(*arglist, arglist, envtable);
62     /* should not return, but if so, something went haywire */
63     syslog(LOG_DEBUG, "execve error of %s: %s", *arglist, strerror(errno));
64     exit(11);

As always, what's going on is not that obvious. Terminology gets tricky when using fork because execution of the child commences just after the fork(2) call in the source, yet parent and child run in completely separate address spaces. In this discussion we have first parent, second parent, third parent, first child, and second child.

First parent is the thread of execution that is running at the start of this example. Second parent is also the first child. Third parent is also the second child and is the thread that will execute the program that must persist.
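
The last step in the listing, building an environment and calling execve(2), is only hinted at in the comment at lines 58-59. Here is a minimal sketch of what it might look like. The helper name run_with_env(), the LAUNCH_CODE variable and the choice of /bin/sh as the target are all assumptions made for illustration, and the exec happens in a forked child only so that the result can be observed. The real launcher builds whatever environment its target needs.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: exec /bin/sh with a hand-built environment.
 * The shell exits with the value of LAUNCH_CODE (an invented variable).
 * Returns the exit status of the shell, or -1 on error. */
static int run_with_env(int code)
{
    char var[32];
    char *arglist[] = { "/bin/sh", "-c", "exit \"$LAUNCH_CODE\"", NULL };
    char *envtable[] = { "PATH=/usr/bin:/bin", var, NULL };
    int status;
    pid_t pid, r;

    snprintf(var, sizeof var, "LAUNCH_CODE=%d", code);

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* the child sees only what is in envtable, nothing inherited */
        execve(*arglist, arglist, envtable);
        _exit(127);             /* only reached if execve failed */
    }

    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);

    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Since execve(2) replaces the environment wholesale, anything the program needs (PATH, and DISPLAY for an X application) has to be placed in the table explicitly.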

The first thing the first parent does is block SIGCHLD, the signal raised when a child exits. On a multiprocessor machine the first child might finish before the first parent gets control back, so the signal is blocked until the first parent can wait on the child. Issuing a wait for a child is known as reaping the child. When the child is reaped, the first parent uses _exit(2), at line #28, instead of exit(3). _exit(2) terminates the process immediately, without running atexit(3) handlers or flushing stdio buffers. Two things happen when the process terminates: a SIGCHLD is sent to its parent, and any of its children get reparented to init(8). This is important because init kindly waits for any children it adopts and reaps them when they exit, lest they become zombies.

At the first fork, the first parent's execution is directed into the if {} block at line #7. For that process, pid contains the process ID of the first child. The if {} block waits for the first child to end, which prevents a zombie.

After the first fork the first child skips over the if {} block. It then:

  • opens a file descriptor to /dev/null
  • redirects stdin, stdout, and stderr to that file descriptor
  • closes all other file descriptors

This step took the longest time to uncover. Without it, the standard file descriptors still point to the same places udevd's do. Not good.

Finally, the first child/second parent destroys the breadcrumb trail with setpgid(2). Now it is free to fork() the second child and exit. The second child goes on to execute the persistent program.
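
The effect of setpgid(0, 0) is easy to check in isolation. The sketch below forks a child, has it move into a new process group, and reports whether it really left the parent's group. The function name child_leaves_parent_group() is made up for this example. It only demonstrates the group change, not the rest of the launcher.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a forked child that calls
 * setpgid(0, 0) ends up leading its own process group, distinct from
 * the parent's; 0 if it does not; -1 on error. */
static int child_leaves_parent_group(void)
{
    pid_t parent_pgid = getpgrp();
    pid_t pid, r;
    int status;

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (setpgid(0, 0) == -1)
            _exit(2);           /* could not change group */
        /* the new group ID is the child's own pid */
        if (getpgrp() == getpid() && getpgrp() != parent_pgid)
            _exit(0);
        _exit(1);
    }

    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);

    if (r != pid || !WIFEXITED(status) || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

A process forked after the setpgid(2) call inherits the new group, which is why the second child in the listing is no longer found through udevd's process group.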

This is a very non-portable, Linux-only solution.

The start-stop-daemon for Debian, put in the public domain by its authors, provided some of the insights needed for this task. The start-stop-daemon, when run from a udev rule, seemed to work fine right up to the point where GTK showed the top level window and took a segfault in libgdk-x11. Never figured that one out. It was faster just to write the C code needed.
