So then we're like, "Oh, we need a dumb fallback method for people running on a single host. Let's just do flock or fcntl or lockf or something... that'll be easy..."
Yeah, right.
Either we're all crazy, or flock/fcntl/lockf are a total pain in the ass. Our lock stress tester (10 forked children fighting for a few seconds over a small number of locks) works perfectly with our network lock client/daemon: the synchronized code (an O_CREAT|O_EXCL open followed by an unlink, run while holding the lock) never fails.
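For anyone who doesn't want to read the whole test, here's roughly what that synchronized check looks like. This is a minimal sketch, not the real tester: the function name `critical_section_check` and the `canary_path` argument are made up for illustration. The idea is that if the lock really gives mutual exclusion, the exclusive create can never collide with another child.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Call this while holding the lock under test.
 * Returns 0 if we were alone in the critical section,
 * -1 if someone else was in there too (or on any other error).
 * The name and the canary_path parameter are hypothetical. */
int critical_section_check(const char *canary_path)
{
    assert(canary_path != NULL);

    /* O_CREAT|O_EXCL fails with EEXIST if the canary already exists,
     * which means another process is inside the critical section. */
    int fd = open(canary_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        fprintf(stderr, "lock violated? open(%s): %s\n",
                canary_path, strerror(errno));
        return -1;
    }

    /* ... the work being protected would go here ... */

    close(fd);

    /* Remove the canary before giving up the lock. */
    if (unlink(canary_path) < 0) {
        fprintf(stderr, "unlink(%s): %s\n", canary_path, strerror(errno));
        return -1;
    }
    return 0;
}
```

Each child would grab the lock, call this, and release the lock in a loop; any -1 means two children were inside at once.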
But once we switch to using local-machine locking, reliability goes to hell.
What's wrong here?
http://www.danga.com/temp/flock-test.c
It seems fcntl and flock locks each get released at different times than I expect, and I can't get either one right. It only works if I comment out the line after BIZARRE in the source above, so that the file I flock never gets unlinked.
Where's the race?
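One candidate, as far as I can tell: a flock lock belongs to the open file (the inode), not to the name. If a child is blocked in flock on the old file when the holder unlinks it, that child ends up holding a lock on a file nobody can reach by name anymore, while the next child to open the path creates a brand-new file and locks that one. Two lock holders at once. Here's a sketch of the usual workaround, assuming that's really the problem: after getting the lock, check that the fd and the path still point at the same inode, and retry if they don't. The names `lock_path` and `unlock_path` are made up, and this is not what flock-test.c does.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns an fd holding an exclusive flock on the
 * file that `path` currently names, or -1 on error. */
int lock_path(const char *path)
{
    assert(path != NULL);
    for (;;) {
        int fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
            return -1;
        if (flock(fd, LOCK_EX) < 0) {
            close(fd);
            return -1;
        }

        struct stat by_fd, by_path;
        if (fstat(fd, &by_fd) < 0) {
            close(fd);
            return -1;
        }
        /* If the path still names the inode we locked, we really hold
         * the lock everyone else is competing for. */
        if (stat(path, &by_path) == 0 &&
            by_fd.st_dev == by_path.st_dev &&
            by_fd.st_ino == by_path.st_ino)
            return fd;

        /* The file was unlinked (and maybe recreated) while we waited.
         * Closing drops our lock on the stale inode; try again. */
        close(fd);
    }
}

/* Hypothetical helper: unlink while still holding the lock, then close
 * (closing is what releases the flock). Returns unlink's result. */
int unlock_path(const char *path, int fd)
{
    int rc = unlink(path);
    close(fd);
    return rc;
}
```

The order in `unlock_path` matters: the unlink has to happen before the lock is dropped, otherwise a waiter could pass the inode check and then have the file pulled out from under it.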