brad's life - wait(2)-equivalent in an event loop?
Brad Fitzpatrick

wait(2)-equivalent in an event loop? [May. 20th, 2004|11:05 am]

Say I have a parent process and a bunch of child processes, and I want an event loop in the parent.

Is there a way to get a filehandle in the parent which becomes readable once a child has died? I can then put that filehandle in my event loop. I don't want to poll waitpid(0, NULL, WNOHANG) every 'n' seconds. That's a lame tradeoff between latency and sucking CPU.

Yeah, I can catch SIGCHLD, but I'd prefer to avoid that.

Some google queries:

Python's Twisted framework catches SIGCHLD, sets a flag, and then reaps children as part of its event loop? (Python's signal handling is like Perl's pre-5.8... unsafe, possibly between opcodes)
http://twistedmatrix.com/pipermail/twisted-commits/2003-March/006222.html

Perl's POE seems to block SIGCHLD at times and poll for dead children at other times:
http://www.dwam.net/docs/perl/site/lib/POE/Kernel.html

OCaml does more polling/SIGCHLD stuff:
http://www.ocaml-programming.de/....

So I'm starting to think there is no clean way, at least for what I consider clean. It's also entirely possible I have no clue what I'm talking about.

I could just have each child connect to a server in the parent, and when that connection dies, I know the child is gone?

Recommendations?

Comments:
From: evan
2004-05-20 11:22 am (UTC)

Rather than one connection per child:
Create a pipe in the parent and add it to your event loop. Catch sigchld and have it write to the pipe. (Dunno about locking issues within the sigchld handler; my understanding of this stuff is sorta fuzzy.) If you need to know which child died, you can have the sigchld handler write the pid onto the pipe.
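A minimal sketch of the pipe-plus-SIGCHLD idea described above, in Python rather than Perl. The function name and the timeout parameter are made up for illustration, and the handler writes a single wakeup byte instead of a pid; the parent finds out which child died by calling waitpid() after the pipe becomes readable.

```python
import os
import select
import signal


def wait_for_child_via_self_pipe(timeout=5.0):
    """Fork one child; return (child_pid, reaped_pid) once the pipe wakes us."""
    read_fd, write_fd = os.pipe()
    # Non-blocking write end: if the pipe ever fills up, the handler drops
    # the wakeup byte instead of blocking the process on a write to itself.
    os.set_blocking(write_fd, False)

    def on_sigchld(signum, frame):
        try:
            os.write(write_fd, b"\0")
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    old_handler = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        child = os.fork()
        if child == 0:
            os._exit(0)  # child: exit right away

        # Parent: in a real event loop read_fd would sit next to other fds.
        ready, _, _ = select.select([read_fd], [], [], timeout)
        if not ready:
            return None
        os.read(read_fd, 4096)  # drain the wakeup bytes
        reaped, _status = os.waitpid(-1, os.WNOHANG)
        return child, reaped
    finally:
        signal.signal(signal.SIGCHLD, old_handler or signal.SIG_DFL)
        os.close(read_fd)
        os.close(write_fd)
```

Making the write end non-blocking is one way to sidestep the full-buffer worry: a dropped byte is harmless because an unread byte already guarantees the loop will wake up.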
From: brad
2004-05-20 11:26 am (UTC)

But then I'd have to require Perl 5.8, since only then can I do anything fancy in signal handlers and not risk Perl segfaulting in 5.6.

Also, when writing/reading a pipe to yourself, there's a limit to the amount of stuff you can write before the buffer (by default 8k) fills up if you don't read from it in time. You can keep track of that, but it gets ugly.
From: ghewgill
2004-05-20 11:33 am (UTC)

You don't even have to worry about catching signals or writing to the pipe. If you open a pipe in the parent before forking, then close the write end in the parent, only the child will have the write end open. If the parent select()s on the pipe, it will become readable when the child goes away, because the write end is now closed and the read end will let you know about that by returning 0 (end of file) from read().
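Here is a rough Python sketch of the one-pipe-per-child approach, with no signal handler involved. The function name and timeout are illustrative only. One thing it shows: on POSIX systems, reading a pipe whose write ends have all been closed yields end of file (an empty read), and that is what the parent checks for. It also assumes the child does not hand the write end to any long-lived grandchildren, since those would keep the pipe open.

```python
import os
import select


def wait_for_child_via_pipe_eof(timeout=5.0):
    """Fork one child; return (child_pid, data_read, reaped_pid)."""
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(read_fd)
        # The child never writes; it just holds write_fd open until it
        # exits, at which point the kernel closes it.
        os._exit(0)

    # The parent must close its own copy, or end of file never arrives.
    os.close(write_fd)
    try:
        ready, _, _ = select.select([read_fd], [], [], timeout)
        if not ready:
            return None
        data = os.read(read_fd, 1)  # b"" signals end of file
        # The descriptor can close slightly before the child is reapable,
        # so this waitpid is a (brief) blocking call.
        reaped, _status = os.waitpid(child, 0)
        return child, data, reaped
    finally:
        os.close(read_fd)
```

Since each pipe belongs to exactly one child, the parent already knows which child died from the descriptor that became readable.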
From: evan
2004-05-20 11:53 am (UTC)

I think that's what he was proposing initially. I was just trying to think of a way to keep the number of pipes down.

It's weird to me that reading closed pipes return -1 and EPIPE, while reading closed sockets just returns 0.
From: taral
2004-05-20 12:02 pm (UTC)

Sockets can't be reopened, (named) pipes can.
From: evan
2004-05-20 11:49 am (UTC)

Dunno how many children you're gonna have, but it seems your event loop should wake you up before n thousand of them all die off.
From: brad
2004-05-20 12:09 pm (UTC)

"Dunno how many children you're gonna have, but it seems your event loop should wake you up before n thousand of them all die off."

Yeah, it's probably not even worth mentioning in this case, but I ran into the problem just the other week (blocking on a pipe write to myself).
From: taral
2004-05-20 11:53 am (UTC)

Use a decent event library. They allow you to receive signals the same way you receive other events.

If you don't care about the child PIDs, you can use SA_NOCLDWAIT.
From: brad
2004-05-20 12:08 pm (UTC)

How does SA_NOCLDWAIT help? I *want* to know when a child is dead. I don't want to ignore it.

How do decent event libraries work? Not their interface, which you described, but what do they actually do? There are only so many primitives to work with, and they don't all play well together.

I've noticed that when I stop strace'ing a program, epoll_wait exits, even though I had it set for -1 (infinity). So perhaps I can get epoll_wait to exit after a signal, then before I reinvoke epoll_wait, check for dead children?
From: taral
2004-05-20 01:16 pm (UTC)

As for SA_NOCLDWAIT, I did say it was only useful if you didn't care about the child PIDs. You still get the SIGCHLD, I believe, but you don't have to wait(). Most people use SA_NOCLDWAIT and then just ignore SIGCHLD, but that clearly isn't useful to you.

epoll doesn't support queueing signals directly, as far as I can tell. kqueue and rtsig do. It _does_, however, exit with EINTR when a signal arrives that you have a handler for.

libevent (a decent event library) provides signal events in epoll by queueing them up in the signal handler and handling them in the dispatch loop.
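For comparison, Python's standard library has a hook in the same spirit: signal.set_wakeup_fd() makes the interpreter's low-level signal handler write the signal number to a descriptor, which the dispatch loop can then watch like any other. The sketch below is not libevent's code, only an analogous pattern; the function name and timeout are invented for the example.

```python
import os
import select
import signal
import socket


def wait_for_signal_event(timeout=5.0):
    """Fork one child; return (signal_numbers_seen, child_pid, reaped_pid)."""
    recv_sock, send_sock = socket.socketpair()
    recv_sock.setblocking(False)
    send_sock.setblocking(False)  # set_wakeup_fd needs a non-blocking fd

    # A Python-level handler must be installed for the wakeup fd to fire,
    # even though this one does nothing.
    old_handler = signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    old_fd = signal.set_wakeup_fd(send_sock.fileno())
    try:
        child = os.fork()
        if child == 0:
            os._exit(0)  # child: exit right away

        ready, _, _ = select.select([recv_sock], [], [], timeout)
        if not ready:
            return None
        # Each byte on the socket is the number of a signal that arrived.
        signums = list(recv_sock.recv(64))
        reaped, _status = os.waitpid(child, 0)
        return signums, child, reaped
    finally:
        signal.set_wakeup_fd(old_fd)
        signal.signal(signal.SIGCHLD, old_handler or signal.SIG_DFL)
        recv_sock.close()
        send_sock.close()
```

Because the byte is written by the low-level handler itself, the wakeup does not depend on the signal interrupting the wait call at just the right moment.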
From: scosol
2004-05-20 12:19 pm (UTC)

do you need to know "child x has died" or do you only need to know "a child has died"?
From: brad
2004-05-20 12:40 pm (UTC)

I need to know which child.

But if I just get informed it's A child, I can call wait() and get the pid.
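A small sketch of that last step, assuming the loop has only been told "some child died". Several exits can collapse into a single notification, so this hypothetical helper keeps calling waitpid() with WNOHANG until nothing more is ready.

```python
import os


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, raw_wait_status) tuples.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left at all
        if pid == 0:
            break  # children remain, but none has exited yet
        reaped.append((pid, status))
    return reaped
```

Calling this each time the event loop is notified means a burst of deaths is handled in one pass, and the returned pids say which children they were.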