Brad Fitzpatrick (brad) wrote,

request for advice (epoll_wait usage problem)

The epoll_wait function takes an epfd (the epoll object), a pointer to an events array along with the maximum number of events you want returned, and a timeout (or -1 to wait forever).
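For reference, here is a minimal sketch of such a call. It uses Python's `select.epoll` wrapper (Linux only) instead of the C API, and the pipe is only a hypothetical descriptor to have something to watch. In Python the timeout is in seconds, and `None` plays the role of -1.

```python
import os
import select

ep = select.epoll()                  # the epoll object (the epfd)
r, w = os.pipe()                     # hypothetical fd to watch
ep.register(r, select.EPOLLIN)       # level-triggered unless EPOLLET is set

os.write(w, b"x")                    # make the read end readable

# maxevents caps how many (fd, eventmask) pairs one call may return.
# A finite timeout is used here so the sketch can never hang;
# timeout=None would block indefinitely, like -1 in C.
events = ep.poll(timeout=1.0, maxevents=20)
for fd, mask in events:
    print(fd, bool(mask & select.EPOLLIN))

ep.close()
os.close(r)
os.close(w)
```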

I arbitrarily picked 20 for maxevents, but I just ran into a bug which highlights a problem with that:

Say I do an epoll_wait with 5 maxevents, and it returns 4 events. It says the following fds are interesting: 1, 2, 3, and 4.

My bug:
So you process 1 and, as a result, you close fd 4.
You process 2 and 3, then get to 4, and 4 is gone (since 1 closed it).

Worse bug:
Now you process 2 and, as a result, you open a new fd, which gets number 4.
You process 3. No prob.
Then you process 4, but it's a different 4. It just shares the same fd number.
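Both failure modes can be reproduced without epoll at all, since they come from how the kernel hands out descriptor numbers (always the lowest free one). This is a sketch under the assumption of a single-threaded process where nothing else opens or closes descriptors in between. The pipes and variable names are made up for illustration.

```python
import errno
import os

r1, w1 = os.pipe()
stale = r1                    # fd number an already-returned event refers to

os.close(r1)                  # an earlier handler in the batch closes it

# "My bug": the descriptor is simply gone.
try:
    os.read(stale, 1)
    gone = False
except OSError as e:
    gone = (e.errno == errno.EBADF)

# "Worse bug": another handler opens something new, and the kernel
# reuses the lowest free number, which is the one that was just closed.
r2, w2 = os.pipe()
os.write(w2, b"new")
same_number = (r2 == stale)
data = os.read(stale, 3)      # succeeds, but reads from the *new* pipe

for fd in (w1, r2, w2):
    os.close(fd)
```

The second case is the nasty one: no error is raised, so the stale event is silently applied to an unrelated object.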

So there are three solutions:

a) only epoll_wait on a single event at a time. (performance loss? I'll need to benchmark)

b) since I'm using level-triggered notifications for portability, I can just ignore events on fds 2, 3, and 4 once event 1 initiates any close. the outer "while(1)" loop will just reinvoke epoll_wait and I'll get the descriptors again.

c) just be ready for readiness notifications on the wrong sockets. since everything is non-blocking, it's not a huge deal, but I need to verify all the code to make sure I never assume a read will work, for instance. currently the TCP listener assumes that if its event_read method is called, the accept will return a new sock. with this broken model, I can't assume that.
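To make the options concrete, here is a rough sketch of (b) and (c), again using Python's `select.epoll` (Linux only) rather than my actual code. The names `run_once`, `safe_accept`, and `on_readable`, and the convention that a handler returns True when it closed something, are assumptions made up for this example.

```python
import os
import select
import socket

def run_once(ep, handlers, maxevents=20, timeout=1.0):
    """Option (b): handle one batch, abandoning the rest once a handler closes an fd.

    Each handler is called as handler(fd, mask) and returns True if it
    closed any descriptor. Returns the fds that were actually handled.
    """
    handled = []
    for fd, mask in ep.poll(timeout, maxevents):
        handled.append(fd)
        if handlers[fd](fd, mask):
            break  # later events may name stale fds; the next poll re-reports live ones
    return handled

def safe_accept(listener):
    """Option (c): readiness is only a hint, so accept() may find nothing."""
    try:
        conn, _addr = listener.accept()
        return conn
    except BlockingIOError:
        return None

# Demo for (b): three readable pipes; the first handler to run closes another pipe.
ep = select.epoll()
pipes = {}                               # read fd -> write fd
for _ in range(3):
    r, w = os.pipe()
    os.write(w, b"x")
    ep.register(r, select.EPOLLIN)       # level-triggered by default
    pipes[r] = w

closed = []

def on_readable(fd, mask):
    os.read(fd, 1)                       # drain, so this fd is not reported again
    if closed:
        return False
    victim = next(f for f in pipes if f != fd)
    ep.unregister(victim)
    os.close(victim)
    closed.append(victim)
    return True

handlers = {fd: on_readable for fd in pipes}
first = run_once(ep, handlers)           # one fd handled, then the batch is abandoned
second = run_once(ep, handlers)          # the skipped, still-live fd shows up again

# Demo for (c): a non-blocking listener with no pending connection.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()
listener.setblocking(False)
nothing = safe_accept(listener)          # None instead of an exception

listener.close()
ep.close()
for r, w in pipes.items():
    os.close(w)
    if r not in closed:
        os.close(r)
```

Option (a) is the same loop with maxevents=1. Note that (b) only works because the events are level-triggered: with edge-triggered notifications the abandoned events would not be delivered again.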

Any recommendations on the best course of action?