This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: VM and non-blocking writes

On Dec 16, 2007 9:07 AM, Corinna Vinschen wrote:
> On Dec 16 14:42, Corinna Vinschen wrote:
> >   I'm contemplating the idea to work around this problem in Cygwin (not
> >   for 1.5.25, but in the main trunk) by capping the number of bytes in a
> >   single send call, according to the patch Lev sent in
> >
> >
> >   Lev, are you interested in reworking your patch (minus the pipe stuff)
> >   to match current CVS?  Is there any gain in raising SO_SNDBUF/SO_RCVBUF
> >   to a value > 8K, especially in the light of my experiences commented
> >   on in, function fdsock()?
> Lev, do you have a copyright assignment in place?  I don't find you on
> my list of signers.

No, I don't have a copyright assignment in place yet. I will see what I
can do about that -- don't think it will be a problem. I'd be
interested in reworking the patch against current CVS (though I
haven't looked to see how far current CVS has moved so I don't know
how much that will involve). But I have to warn you in advance that I
haven't had much time to work on this stuff, and I don't see that
situation changing any time soon, so it may take multiple weeks before
I get a chance. (I'll have some time over Christmas, but I'll be away
from all my network hardware and the OpenBSD box I originally used at
the other end of the wire for testing the patches, so testing would be
a problem). If you were hoping to get something into CVS on a
tighter timescale, better to push on without me -- I'll still try to
get a copyright assignment submitted, in case you wish to derive from
my original patches.

As for changing SO_SNDBUF/SO_RCVBUF, here are a few comments, which I
originally wrote in response to your patch in fdsock() but you had
already #ifdef'd out the patch by the time I wrote this, so I never
bothered to send it:
Your intention with the patch was to make cygwin's default buffer
sizes be more like on linux, but....
1) On windows/cygwin (without my patch), the interpretation of
so_sndbuf is very different from linux. The afd layer will accept
*any* size of send, so long as the current buffer position is less
than so_sndbuf, whereas on linux so_sndbuf limits the total size of
the send buffer. The windows behaviour works nicely for
transaction-oriented apps: for an app which does its side of a
transaction in one large writev() and then waits for the next request
from the client (which will piggyback the ack the server needs in
order to empty its send buffer), the send buffer on windows is
effectively infinite, for all values of so_sndbuf
except 0. So so_sndbuf cannot really be compared between windows and
linux, because the interpretation is totally different.
2) Linux includes all the overheads of its skb structures, the part
of the buffer that's given to the application, etc, etc when it
accounts for the memory used by the send buffer, the result of which
is that you can only put about half as much data into the buffer as
there is memory allocated (linux internally doubles the number from
setsockopt(SO_SNDBUF) to hide this from applications expecting BSD
semantics, but it doesn't halve the number from getsockopt(), a
longstanding point of controversy). The upshot of this is that the
cygwin default send buffer should really be *half* of the linux
tcp_wmem default, if you are going to go that way.
3) Linux does dynamic autotuning on the buffers, so the middle value
in tcp_wmem is more like a hint on what's a convenient chunk of memory
to allocate in one go, rather than a hint on what's actually the best
size for the buffer.
4) Your implementation ignored that some users may have actually
calculated optimal values for their situation and put them in the
relevant registry parameters. It seems it would be best either to set
so_{snd,rcv}buf only when the registry parameters are absent, or not
to touch so_{snd,rcv}buf at all and just advise users experiencing
problems that the registry parameters have the desired effect. I'm
inclined to go with the latter.
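
To make point 1 concrete, here is a toy model in C of the two
acceptance rules as described there. The function names are made up
for illustration, and nothing here is taken from the afd layer or the
linux kernel; it only restates the behaviour described above.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not real afd or kernel code. */

/* windows/afd rule as described in point 1: a send of any size is
   accepted in full, as long as the amount already buffered is below
   so_sndbuf. */
size_t afd_model_accept(size_t buffered, size_t len, size_t sndbuf)
{
    return (buffered < sndbuf) ? len : 0;
}

/* linux rule as described in point 1: so_sndbuf bounds the total amount
   buffered, so only the remaining room is accepted. */
size_t linux_model_accept(size_t buffered, size_t len, size_t sndbuf)
{
    size_t room = (buffered < sndbuf) ? sndbuf - buffered : 0;
    return (len < room) ? len : room;
}
```

With an empty buffer and so_sndbuf at 8kb, a 1MB write is accepted
whole under the first rule and cut down to 8kb under the second, which
is why the same so_sndbuf number means different things on the two
systems.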

Having said all that, the winsock default 8kb really is far too small
for many situations. I find that in my tests (this may be network
hardware/driver dependent) I need 32kb for the stack to start
coalescing packets reliably. Based on this, and on the problems
described in your comments in fdsock(), where the issue was with a
64kb buffer size, it seems that 32kb would be a good size to use
(again, it's possibly better to recommend that users alter their
registry setting to 32kb, rather than have cygwin force it through

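For anyone who wants to experiment with 32kb from the application side
rather than through the registry, a minimal sketch using the standard
sockets calls might look like this. The helper name request_sndbuf is
hypothetical and not part of cygwin; it just sets SO_SNDBUF and reads
back what the stack reports.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of cygwin): ask for a send buffer of
   `bytes` and return the value the stack reports afterwards, or -1 if
   either call fails. */
int request_sndbuf(int fd, int bytes)
{
    int effective = 0;
    socklen_t len = sizeof effective;

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes) != 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &effective, &len) != 0)
        return -1;

    /* On linux the value read back is typically double the request
       (see point 2 above); on other stacks it usually matches. */
    return effective;
}
```

Calling request_sndbuf(fd, 32 * 1024) on a freshly created TCP socket
and comparing the result with the request is a quick way to see which
interpretation a given stack applies.
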
Before getting too set on the plan of having cygwin break
applications' send()s into chunks, maybe it's worth reconsidering the
overall strategy. We're basically at this point implementing our best
attempt at BSD semantics on top of microsoft's half-assed attempt at
BSD semantics on top of the native not-BSD-like-at-all but powerful
and quite self-consistent NT semantics. If we keep having to work
around more issues like this, perhaps we'd be better off bypassing the
afd layer entirely, by setting SO_SNDBUF to 0, using overlapped IO,
and managing buffers ourselves. I'm sure this would bring its own set
of complications, but at least we'd be in a better position to deal
with them, not having to go through the afd layer. What do you think?
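
For reference, the chunking plan mentioned at the start of the previous
paragraph could look roughly like the sketch below. This is not the
patch under discussion and not cygwin code: send_capped, next_chunk_len
and the cap parameter are hypothetical, and the right cap value would
have to be measured.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Size of the next piece to hand to send(); a cap of 0 means "no cap". */
size_t next_chunk_len(size_t remaining, size_t cap)
{
    return (cap != 0 && remaining > cap) ? cap : remaining;
}

/* Hypothetical wrapper: pass buf to send() in pieces of at most `cap`
   bytes. Returns the number of bytes accepted, or -1 if send() failed
   before anything was accepted. */
ssize_t send_capped(int fd, const void *buf, size_t len, size_t cap, int flags)
{
    const char *p = (const char *)buf;
    size_t done = 0;

    while (done < len) {
        size_t want = next_chunk_len(len - done, cap);
        ssize_t n = send(fd, p + done, want, flags);

        if (n < 0)
            return done > 0 ? (ssize_t)done : -1; /* keep partial progress */
        done += (size_t)n;
        if ((size_t)n < want)
            break; /* short write: the layer below is full, stop here */
    }
    return (ssize_t)done;
}
```

On a non-blocking socket this returns a short count as soon as the
layer below stops accepting data, so the caller sees ordinary
partial-write behaviour rather than one oversized send being swallowed
whole.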
