linux/fs
Eric Dumazet 8d2228dd95 tcp: allow splice() to build full TSO packets
[ This combines upstream commit
  2f53384424 and the follow-on bug fix
  commit 35f9c09fe9 ]

vmsplice()/splice(pipe, socket) call do_tcp_sendpages() one page at a
time, adding at most 4096 bytes to an skb (assuming PAGE_SIZE=4096).

The call to tcp_push() at the end of do_tcp_sendpages() forces an
immediate xmit when the pipe is not already filled, and tso_fragment()
tries to split these skbs into MSS multiples.

4096 bytes are usually split into an skb holding 2 MSS and a remaining
sub-MSS skb (assuming MTU=1500).
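
The arithmetic behind that split can be sketched as follows. This is a
user-space illustration only: split_by_mss() and struct mss_split are
hypothetical names, not kernel code, and the MSS values are assumptions
(1448 for a 1500-byte MTU with IPv4 and TCP timestamps, 1460 without
timestamps).

```c
#include <assert.h>
#include <stddef.h>

/* Result of cutting a payload into MSS-sized segments. */
struct mss_split {
	size_t full_segments;	/* segments carrying exactly one MSS */
	size_t tail_bytes;	/* bytes left for a final sub-MSS segment */
};

/*
 * Hypothetical helper (not a kernel function): split `payload` bytes
 * into full MSS segments plus a remainder.
 */
struct mss_split split_by_mss(size_t payload, size_t mss)
{
	struct mss_split s;

	assert(mss > 0);
	s.full_segments = payload / mss;
	s.tail_bytes = payload % mss;
	return s;
}
```

Under these assumptions, split_by_mss(4096, 1448) gives 2 full segments
and a 1200-byte tail, so each page pushed on its own ends with a
sub-MSS frame.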

This makes slow start suboptimal because many small frames are sent to
the qdisc/driver layers instead of big ones (constrained by cwnd and
packets in flight, of course).

In fact, applications using sendmsg() (adding an additional memory copy)
instead of vmsplice()/splice()/sendfile() are a bit faster because of
this anomaly, especially if serving small files in environments with
large initial [c]wnd.

Call tcp_push() only if MSG_MORE is not set in the flags parameter.
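
A minimal user-space model of that rule, as worded above, might look
like this. should_push() and count_pushes() are hypothetical names used
for illustration; the exact condition in do_tcp_sendpages() is not
reproduced here and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>	/* MSG_MORE */

/*
 * Hypothetical model: push only if something was queued and the
 * caller did not announce more data with MSG_MORE.
 */
bool should_push(size_t copied, int flags)
{
	return copied > 0 && !(flags & MSG_MORE);
}

/*
 * Count how many pushes a sequence of per-page calls would trigger,
 * assuming each call queues one non-empty page.
 */
size_t count_pushes(const int *flags, size_t n)
{
	size_t i, pushes = 0;

	assert(flags != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (should_push(1, flags[i]))
			pushes++;
	return pushes;
}
```

In this model, four pages sent with MSG_MORE set on the first three
trigger a single push, where pushing unconditionally would trigger four.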

This bit is automatically provided by splice() internals for every page
except the last one, or for all pages if the user specified the
SPLICE_F_MORE splice() flag.
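
From user space, an application that splices data in several calls can
pass SPLICE_F_MORE itself on every call but the last. The sketch below
is only an illustration: splice_flags_for() and pipe_to_socket() are
hypothetical helpers, the 64 KiB chunk size is an arbitrary assumption,
and EINTR/EAGAIN handling is omitted.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for splice() and SPLICE_F_MORE */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: flags for one splice() call of `chunk` bytes
 * when `remaining` bytes (this chunk included) are still to be sent.
 */
unsigned int splice_flags_for(size_t remaining, size_t chunk)
{
	assert(chunk <= remaining);
	return remaining > chunk ? SPLICE_F_MORE : 0;
}

/*
 * Hypothetical helper: move `len` bytes from the read end of a pipe
 * to `sock`, hinting that more data follows on all but the last call.
 * Returns the number of bytes moved, or -1 on error.
 */
ssize_t pipe_to_socket(int pipe_rd, int sock, size_t len)
{
	size_t left = len;

	while (left > 0) {
		size_t chunk = left < 65536 ? left : 65536;
		ssize_t n = splice(pipe_rd, NULL, sock, NULL, chunk,
				   splice_flags_for(left, chunk));

		if (n < 0)
			return -1;
		if (n == 0)	/* pipe drained early */
			break;
		left -= (size_t)n;
	}
	return (ssize_t)(len - left);
}
```

A caller that knows yet more data will follow after `len` bytes could
instead pass SPLICE_F_MORE on every call.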

In some workloads, this can reduce the number of sent logical packets
by an order of magnitude, making zero-copy TCP actually faster than
one-copy :)

Reported-by: Tom Herbert <therbert@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-27 09:51:18 -07:00