sockets and close/shutdown/RST packets - brad's life
Brad Fitzpatrick


sockets and close/shutdown/RST packets [Sep. 13th, 2005|02:30 pm]

Dear Lazyweb,

Network programming question:

I have a listening socket. Client connects to it, and I write a bunch of data to it, in chunks as it becomes writable. After I'm done writing and want to properly close the connection, what do I have to do?

a) just close the socket?
b) shutdown and close the socket?
c) wait for socket to become writable again, and (a) or (b)?

I'd thought it was just (a), but Perlbal is sending RST packets like crazy when I do that. If I put in arbitrary delays before the close, the RST packets go away, but that's really hacky and lame.

We'd never noticed this on LiveJournal.com before, because the BIG-IP sanitizes the situation to some degree, but RST packets still reach end users, and it's technically wrong. Also, other sites (like discogs.com) which are behind Alteons don't handle the RST packets well.

*braced for cluestick beating*

Educate me. Thanks!

Comments:
From: toast0
2005-09-13 09:52 pm (UTC)
This may be related to SO_LINGER ?
From: brad
2005-09-13 11:03 pm (UTC)
Thanks!

I forgot about that. I remember it from using lingerd with Apache, but I never used it in Perlbal, and I still think I can't, because I can't have anything blocking in my event loop. But the other commenters below suggest I can just wait for it to become writable again and then close it, so maybe that's the answer. But I thought I tried that and it still didn't work, so maybe writable-to-the-kernel and remote-peer-has-acked-all-data are different, and that's my open question.
From: edm
2005-09-13 10:10 pm (UTC)
If you're using non-blocking operations (e.g., write()), and you want all the data to go out and then the socket to close, then yes, you need to do (c): wait for the socket to become writable again, then close() it. (If you close() the socket before it becomes writable again, it's possible not all the data will go through.)

It's not necessary to both shutdown() and close() the socket; shutdown() is a specialised call used to "close" just one side or the other of the socket (ie, receives or sends) and close() will do both for you. shutdown() is typically used in a protocol where you use a socket closed status as an indicator of "I'm done writing, but I'm listening for what you say" and other such special cases.

The other thing to watch out for is that you're closing all file handles to the socket; if you, eg, fork(), then you'll have two handles to the same socket. The socket will only get properly closed (ie, the remote end told about it) if both of them are closed. Typically the approach is to fork(), have the parent close the socket immediately and go back to listen()ing or accept()ing, and have the child do its thing with the socket then close() it when it is ready.

Finally note that close() can return EINTR (interrupted - by a signal), and if that happens you'll need to call it again at some convenient time. If you wait for the socket to become writable then it should return pretty much immediately which reduces the chances of EINTR.

Ewen
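
A minimal Python sketch of option (c) as described in this comment, assuming a non-blocking TCP socket and a simple select()-based wait. The function name `write_then_close` and the timeout are hypothetical, and a real event loop such as Perlbal's would register for writability instead of calling select() inline.

```python
import select
import socket


def write_then_close(sock, payload, timeout=5.0):
    """Write payload in chunks as the socket becomes writable, then close it.

    Hypothetical helper for illustration; not taken from Perlbal.
    """
    sock.setblocking(False)
    view = memoryview(payload)
    while view:
        # Wait until the kernel is willing to accept more data.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            sock.close()
            raise TimeoutError("peer stopped reading")
        try:
            sent = sock.send(view)
        except BlockingIOError:
            continue  # spurious wakeup, wait again
        view = view[sent:]
    # Everything has been handed to the kernel; close() without SO_LINGER
    # returns immediately and the kernel keeps sending what is queued.
    sock.close()
```

Note that this only guarantees the kernel has accepted the data, not that the peer has acknowledged it, which is the distinction raised later in the thread.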
From: edm
2005-09-13 10:14 pm (UTC)
One more thing. IIRC if you close() a socket that has pending data waiting to be sent, a RST will be sent instead of a FIN, to let the remote end know that some data didn't get sent. From what you've described this may well be your issue: the close() is being called before all the data is sent (because you're not waiting for the socket to come writable again), and hence the "some data not sent" behaviour is being triggered. If so, waiting for the socket to come writable again before issuing close() should resolve it.

Ewen
From: brad
2005-09-13 11:05 pm (UTC)
If so, waiting for the socket to come writable again before issuing close() should resolve it.

I'm pretty sure I tried that without success. Are you sure that writability is the same as the remote peer acking all the data? If the TCP window size is, say, 5000 bytes and they've acked half of it, can't I write 2500 bytes to the kernel, even though 2500 bytes are still in flight?

I can't close on SO_LINGER because I can't block in my event loop, and I can't reliably do an aio_close, so I have to somehow detect things are good before I close...
From: edm
2005-09-13 11:14 pm (UTC)
Writability indicates that enough data has been ACK'd that the kernel is willing to accept more. As you say, there may be more data that is in flight. When I've seen this, this has been sufficient to get the kernel to behave correctly with respect to close(), ie, ensure that the remaining data is sent and a FIN is sent.

I haven't used aio_* at all though, so possibly there's something special there.

Ewen
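
For the "writable vs. acked" question above, a Linux-only sketch: the SIOCOUTQ ioctl (exposed in Python under the synonymous name `termios.TIOCOUTQ`) reports how many bytes are still sitting in the socket's send queue. As far as I know, on Linux this count includes data that has been sent but not yet acknowledged. The helper name `send_queue_bytes` is hypothetical, and other platforms need a different mechanism.

```python
import fcntl
import socket
import struct
import termios


def send_queue_bytes(sock):
    """Return the number of bytes still in the kernel send queue (Linux only).

    Assumption: TIOCOUTQ on a TCP socket behaves like SIOCOUTQ, as on Linux.
    """
    raw = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]
```

An event loop could poll this after the final write and close only once it reaches zero, though that is a judgment call rather than something the thread settled on.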
From: scosol
2005-09-13 10:34 pm (UTC)
TCP actions and buffer behavior upon a socket close() are entirely OS-dependent.
Some immediately RST, some wait for the data to be sent and then do a normal FIN shutdown, and some have alternate close-like methods and/or arguments that let you specify the behavior.
From: taral
2005-09-13 10:47 pm (UTC)
SO_LINGER is what you want, I believe. You get RSTs when there's unsent data at close() time. SO_LINGER makes close() block waiting for the remote to ACK the unsent data.
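
For reference, a small Python sketch of how SO_LINGER is set and read back, assuming the common `struct linger` layout of two C ints (`l_onoff`, `l_linger`) as on Linux. The helper names are hypothetical. How close() then behaves on a non-blocking socket is platform-dependent, so treat this as an illustration of the option itself rather than a recommended fix.

```python
import socket
import struct

# Assumed layout: struct linger { int l_onoff; int l_linger; }
LINGER_FMT = "ii"


def set_linger(sock, seconds):
    """Enable SO_LINGER with the given timeout in seconds."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(LINGER_FMT, 1, seconds))


def get_linger(sock):
    """Return (enabled, seconds) as currently configured on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FMT))
    onoff, seconds = struct.unpack(LINGER_FMT, raw)
    return bool(onoff), seconds
```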
From: midendian
2005-09-13 11:01 pm (UTC)
If your socket is non-blocking, you want (c), then keep calling close until it returns zero. You specifically do NOT want SO_LINGER. That will cause close to return before the data is sent.

It's funny, I was just thinking about this this morning when I was skimming the perlbal page for the nth time and read this: "Almost. When you enable PUT support, the close() operation is blocking. However, it's generally pretty fast (we've had no problems)." Which sounds like you had a similar problem, and you set the socket blocking before calling close, which is an easy cheat.
From: midendian
2005-09-13 11:04 pm (UTC)
Oh, and checking for writability isn't a sure sign that the previous data was sent, since if you sent less than the window, the socket is still writable, and if that data hasn't been acked yet, close will still block (I think).

I'm guessing this is full of portability issues.
From: brad
2005-09-13 11:06 pm (UTC)
Oh, and checking for writability isn't a sure sign that the previous data was sent, since i

Heh, we both posted this at the same time.
From: brad
2005-09-13 11:09 pm (UTC)
Well, closing a file fd is different from closing a socket fd. So for PUT support (which is opening an on-disk file), the close() there only tends to block if you're writing to a filesystem that does some buffer write-out on close. Hence that comment.

But as for your first comment: so I wait for writability, then repeatedly call close? Won't that just send a RST on the first close? What would it return if not zero? I don't see the (Linux) manpages mentioning that EINTR or EIO are sent if there's outstanding data in the buffer.
From: midendian
2005-09-13 11:29 pm (UTC)
Ah, didn't realize it was in reference to files.

Yeah, I don't really know.

I can't believe they made sending RST on close() with outstanding data the default.
From: midendian
2005-09-13 11:41 pm (UTC)
I think I might've misunderstood something.

Are you sure you're reading all the data the client offered? That would be a much more sensible reason for the RSTs.

(It's still not making any sense to me why it would send a RST if not all the data was /sent/. You only send RSTs when there is data loss, and the sender should never lose data unless the other end did something.)
From: brad
2005-09-13 11:55 pm (UTC)
Are you sure you're reading all the data the client offered? That would be a much more sensible reason for the RSTs.

I'm starting to think the same.
From: midendian
2005-09-19 05:25 pm (UTC)
Ever figure this out?
From: brad
2005-09-19 05:26 pm (UTC)
Yeah, there were two places where I was closing connections without reading everything. Where "everything" generally meant IE6 sending an extra "\r\n" after a POST.
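
A minimal Python sketch of the fix described here, assuming a TCP socket that may still have unread client bytes (such as a stray "\r\n") queued in the kernel. The helper name `drain_and_close` and the `max_bytes` cap are hypothetical. It only discards what has already arrived; it does not wait for more input.

```python
import socket


def drain_and_close(conn, max_bytes=65536):
    """Discard bytes already queued for reading, then close the socket.

    Returns the number of bytes discarded. Hypothetical helper.
    """
    conn.setblocking(False)
    drained = 0
    while drained < max_bytes:
        try:
            chunk = conn.recv(4096)
        except BlockingIOError:
            break  # nothing more queued right now
        if not chunk:
            break  # peer has already sent its FIN
        drained += len(chunk)
    # With the receive queue empty, close() can end the connection with a
    # normal FIN instead of resetting it because of unread data.
    conn.close()
    return drained
```

Bytes that arrive after the drain loop but before close() can still trigger a reset, so this narrows the window rather than eliminating it.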
From: ninaf
2005-09-17 07:35 am (UTC)
oh this is my fault. :) I'll reply to that email this weekend.
From: (Anonymous)
2008-07-28 05:16 pm (UTC)

Solution candidate

1. Do shutdown(1) on the socket. This will cause the remote recv() to return 0, and the remote application will then close the connection with close().

2. After the shutdown, wait for our recv()==0. This means that the remote end has closed the connection.
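
The two steps above can be sketched in Python as follows, assuming a blocking wait with a timeout for simplicity. The helper name `half_close_then_wait` and the timeout value are hypothetical. In an event loop, the recv() side would be driven by readability events instead of a blocking call.

```python
import socket


def half_close_then_wait(sock, timeout=5.0):
    """Send FIN via shutdown(SHUT_WR), then read until the peer closes.

    Returns True if the peer closed cleanly (recv() returned b""),
    False on timeout or a connection error. Hypothetical helper.
    """
    try:
        sock.shutdown(socket.SHUT_WR)  # step 1: we are done writing
        sock.settimeout(timeout)
        while True:
            if not sock.recv(4096):    # step 2: wait for recv() == 0
                return True
    except OSError:
        # Covers socket.timeout and resets from the peer.
        return False
    finally:
        sock.close()
```

Reading until EOF also discards any trailing bytes the peer sent, which fits the cause identified earlier in the thread.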

From: (Anonymous)
2008-07-28 05:20 pm (UTC)

URL

Look here

http://vadmyst.blogspot.com/2008/04/proper-way-to-close-tcp-socket.html