linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-13 23:51:39 +00:00

Author	SHA1	Message	Date
Marcelo Ricardo Leitner	0970f5b366	sctp: signal sk_data_ready earlier on data chunks reception Dave Miller pointed out that `fb586f2530` ("sctp: delay calls to sk_data_ready() as much as possible") may insert latency specially if the receiving application is running on another CPU and that it would be better if we signalled as early as possible. This patch thus basically inverts the logic on `fb586f2530` and signals it as early as possible, similar to what we had before. Fixes: `fb586f2530` ("sctp: delay calls to sk_data_ready() as much as possible") Reported-by: Dave Miller <davem@davemloft.net> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-01 21:06:10 -04:00
Marcelo Ricardo Leitner	311b21774f	sctp: simplify sk_receive_queue locking SCTP already serializes access to rcvbuf through its sock lock: sctp_recvmsg takes it right in the start and release at the end, while rx path will also take the lock before doing any socket processing. On sctp_rcv() it will check if there is an user using the socket and, if there is, it will queue incoming packets to the backlog. The backlog processing will do the same. Even timers will do such check and re-schedule if an user is using the socket. Simplifying this will allow us to remove sctp_skb_list_tail and get ride of some expensive lockings. The lists that it is used on are also mangled with functions like __skb_queue_tail and __skb_unlink in the same context, like on sctp_ulpq_tail_event() and sctp_clear_pd(). sctp_close() will also purge those while using only the sock lock. Therefore the lockings performed by sctp_skb_list_tail() are not necessary. This patch removes this function and replaces its calls with just skb_queue_splice_tail_init() instead. The biggest gain is at sctp_ulpq_tail_event(), because the events always contain a list, even if it's queueing a single skb and this was triggering expensive calls to spin_lock_irqsave/_irqrestore for every data chunk received. As SCTP will deliver each data chunk on a corresponding recvmsg, the more effective the change will be. Before this patch, with chunks with 30 bytes: netperf -t SCTP_STREAM -H 192.168.1.2 -cC -l 60 -- -m 30 -S 400000 400000 -s 400000 400000 on a 10Gbit link with 1500 MTU: SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 425984 425984 30 60.00 137.45 7.34 7.36 52.504 52.608 With it: SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 425984 425984 30 60.00 179.10 7.97 6.70 43.740 36.788 Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-15 17:22:20 -04:00
Marcelo Ricardo Leitner	fb586f2530	sctp: delay calls to sk_data_ready() as much as possible Currently processing of multiple chunks in a single SCTP packet leads to multiple calls to sk_data_ready, causing multiple wake up signals which are costy and doesn't make it wake up any faster. With this patch it will note that the wake up is pending and will do it before leaving the state machine interpreter, latest place possible to do it realiably and cleanly. Note that sk_data_ready events are not dependent on asocs, unlike waking up writers. v2: series re-checked v3: use local vars to cleanup the code, suggested by Jakub Sitnicki Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-13 23:04:44 -04:00
Eric Dumazet	2c8c56e15d	net: introduce SO_INCOMING_CPU Alternative to RPS/RFS is to use hardware support for multiple queues. Then split a set of million of sockets into worker threads, each one using epoll() to manage events on its own socket pool. Ideally, we want one thread per RX/TX queue/cpu, but we have no way to know after accept() or connect() on which queue/cpu a socket is managed. We normally use one cpu per RX queue (IRQ smp_affinity being properly set), so remembering on socket structure which cpu delivered last packet is enough to solve the problem. After accept(), connect(), or even file descriptor passing around processes, applications can use : int cpu; socklen_t len = sizeof(cpu); getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len); And use this information to put the socket into the right silo for optimal performance, as all networking stack should run on the appropriate cpu, without need to send IPI (RPS/RFS). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:00:06 -05:00
Neil Horman	8465a5fcd1	sctp: add support for busy polling to sctp protocol The busy polling socket option adds support for sockets to busy wait on data arriving on the napi queue from which they have most recently received a frame. Currently only tcp and udp support this feature, but theres no reason sctp can't do so as well. Add it in so appliations can take advantage of it Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: Daniel Borkmann <dborkman@redhat.com> CC: netdev@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-20 18:18:55 -04:00
David S. Miller	676d23690f	net: Fix use after free by removing length arg from sk_data_ready callbacks. Several spots in the kernel perform a sequence like: skb_queue_tail(&sk->s_receive_queue, skb); sk->sk_data_ready(sk, skb->len); But at the moment we place the SKB onto the socket receive queue it can be consumed and freed up. So this skb->len access is potentially to freed up memory. Furthermore, the skb->len can be modified by the consumer so it is possible that the value isn't accurate. And finally, no actual implementation of this callback actually uses the length argument. And since nobody actually cared about it's value, lots of call sites pass arbitrary values in such as '0' and even '1'. So just remove the length argument from the callback, that way there is no confusion whatsoever and all of these use-after-free cases get fixed as a side effect. Based upon a patch by Eric Dumazet and his suggestion to audit this issue tree-wide. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-11 16:15:36 -04:00
wangweidong	8d72651d86	sctp: fix checkpatch errors with open brace '{' and trailing statements fix checkpatch errors below: ERROR: that open brace { should be on the previous line ERROR: open brace '{' following function declarations go on the next line ERROR: trailing statements should be on next line Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:48 -05:00
wangweidong	26ac8e5fe1	sctp: fix checkpatch errors with (foo)\|foo bar\|foo* bar fix checkpatch errors below: ERROR: "(foo)" should be "(foo )" ERROR: "foo * bar" should be "foo bar" ERROR: "foo bar" should be "foo *bar" Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:47 -05:00
wangweidong	cb3f837ba9	sctp: fix checkpatch errors with space required or prohibited fix checkpatch errors while the space is required or prohibited to the "=,()++..." Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:47 -05:00
Jeff Kirsher	4b2f13a251	sctp: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Vlad Yasevich <vyasevich@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:56 -05:00
Daniel Borkmann	477143e3fe	net: sctp: trivial: update bug report in header comment With the restructuring of the lksctp.org site, we only allow bug reports through the SCTP mailing list linux-sctp@vger.kernel.org, not via SF, as SF is only used for web hosting and nothing more. While at it, also remove the obvious statement that bugs will be fixed and incooperated into the kernel. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-09 11:33:02 -07:00
Daniel Borkmann	91705c61b5	net: sctp: trivial: update mailing list address The SCTP mailing list address to send patches or questions to is linux-sctp@vger.kernel.org and not lksctp-developers@lists.sourceforge.net anymore. Therefore, update all occurences. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:53:38 -07:00
Daniel Borkmann	c1db7a26ac	net: sctp: sctp_ulpq: remove 'malloced' struct member The structure sctp_ulpq is embedded into sctp_association and never separately allocated, also ulpq->malloced is always 0, so that kfree() is never called. Therefore, remove this code. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-17 14:13:02 -04:00
Lee A. Roberts	d003b41b80	sctp: fix association hangs due to partial delivery errors In sctp_ulpq_tail_data(), use return values 0,1 to indicate whether a complete event (with MSG_EOR set) was delivered. A return value of -ENOMEM continues to indicate an out-of-memory condition was encountered. In sctp_ulpq_retrieve_partial() and sctp_ulpq_retrieve_first(), correct message reassembly logic for SCTP partial delivery. Change logic to ensure that as much data as possible is sent with the initial partial delivery and that following partial deliveries contain all available data. In sctp_ulpq_partial_delivery(), attempt partial delivery only if the data on the head of the reassembly queue is at or before the cumulative TSN ACK point. In sctp_ulpq_renege(), use the modified return values from sctp_ulpq_tail_data() to choose whether to attempt partial delivery or to attempt to drain the reassembly queue as a means to reduce memory pressure. Remove call to sctp_tsnmap_mark(), as this is handled correctly in call to sctp_ulpq_tail_data(). Signed-off-by: Lee A. Roberts <lee.roberts@hp.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2013-02-28 15:34:27 -05:00
Lee A. Roberts	95ac7b859f	sctp: fix association hangs due to errors when reneging events from the ordering queue In sctp_ulpq_renege_list(), events being reneged from the ordering queue may correspond to multiple TSNs. Identify all affected packets; sum freed space and renege from the tsnmap. Signed-off-by: Lee A. Roberts <lee.roberts@hp.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2013-02-28 15:34:26 -05:00
Lee A. Roberts	e67f85ecd8	sctp: fix association hangs due to reneging packets below the cumulative TSN ACK point In sctp_ulpq_renege_list(), do not renege packets below the cumulative TSN ACK point. Signed-off-by: Lee A. Roberts <lee.roberts@hp.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2013-02-28 15:34:26 -05:00
Neil Horman	b26ddd8130	sctp: Clean up type-punning in sctp_cmd_t union Lots of points in the sctp_cmd_interpreter function treat the sctp_cmd_t arg as a void pointer, even though they are written as various other types. Theres no need for this as doing so just leads to possible type-punning issues that could cause crashes, and if we remain type-consistent we can actually just remove the void * member of the union entirely. Change Notes: v2) * Dropped chunk that modified SCTP_NULL to create a marker pattern should anyone try to use a SCTP_NULL() assigned sctp_arg_t, Assigning to .zero provides the same effect and should be faster, per Vlad Y. v3) * Reverted part of V2, opting to use memset instead of .zero, so that the entire union is initalized thus avoiding the i164 speculative load problems previously encountered, per Dave M.. Also rewrote SCTP_[NO]FORCE so as to use common infrastructure a little more Signed-off-by: Neil Horman <nhorman@tuxdriver.com CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-03 14:54:55 -04:00
Eric W. Biederman	b01a24078f	sctp: Make the mib per network namespace Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:36 -07:00
Neil Horman	4244854d22	sctp: be more restrictive in transport selection on bundled sacks It was noticed recently that when we send data on a transport, its possible that we might bundle a sack that arrived on a different transport. While this isn't a major problem, it does go against the SHOULD requirement in section 6.4 of RFC 2960: An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK, etc.) to the same destination transport address from which it received the DATA or control chunk to which it is replying. This rule should also be followed if the endpoint is bundling DATA chunks together with the reply chunk. This patch seeks to correct that. It restricts the bundling of sack operations to only those transports which have moved the ctsn of the association forward since the last sack. By doing this we guarantee that we only bundle outbound saks on a transport that has received a chunk since the last sack. This brings us into stricter compliance with the RFC. Vlad had initially suggested that we strictly allow only sack bundling on the transport that last moved the ctsn forward. While this makes sense, I was concerned that doing so prevented us from bundling in the case where we had received chunks that moved the ctsn on multiple transports. In those cases, the RFC allows us to select any of the transports having received chunks to bundle the sack on. so I've modified the approach to allow for that, by adding a state variable to each transport that tracks weather it has moved the ctsn since the last sack. This I think keeps our behavior (and performance), close enough to our current profile that I think we can do this without a sysctl knob to enable/disable it. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yaseivch <vyasevich@gmail.com> CC: David S. Miller <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Reported-by: Michele Baldessari <michele@redhat.com> Reported-by: sorin serban <sserban@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-30 22:44:35 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Hagen Paul Pfeifer	efea2c6b2e	sctp: several declared/set but unused fixes Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-07 15:51:14 -08:00
Joe Perches	3fa21e07e6	net: Remove unnecessary returns from void function()s This patch removes from net/ (but not any netfilter files) all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ \| \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-05-17 23:23:14 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
David S. Miller	43f59c8939	net: Remove __skb_insert() calls outside of skbuff internals. This minor cleanup simplifies later changes which will convert struct sk_buff and friends over to using struct list_head. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-09-21 21:28:51 -07:00
Vlad Yasevich	c068be5491	[SCTP]: Correctly reap SSNs when processing FORWARD_TSN chunk When we recieve a FORWARD_TSN chunk, we need to reap all the queued fast-forwarded chunks from the ordering queue However, if we don't have them queued, we need to see if the next expected one is there as well. If it is, start deliver from that point instead of waiting for the next chunk to arrive. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2008-02-06 21:26:26 -05:00
Vlad Yasevich	01f2d38498	[SCTP]: Kill silly inlines in ulpqueue.c Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2008-02-05 10:59:30 -05:00
Vlad Yasevich	60c778b259	[SCTP]: Stop claiming that this is a "reference implementation" I was notified by Randy Stewart that lksctp claims to be "the reference implementation". First of all, "the refrence implementation" was the original implementation of SCTP in usersapce written ty Randy and a few others. Second, after looking at the definiton of 'reference implementation', we don't really meet the requirements. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2008-02-05 10:59:07 -05:00
Hideo Aoki	3ab224be6d	[NET] CORE: Introducing new memory accounting interface. This patch introduces new memory accounting functions for each network protocol. Most of them are renamed from memory accounting functions for stream protocols. At the same time, some stream memory accounting functions are removed since other functions do same thing. Renaming: sk_stream_free_skb() -> sk_wmem_free_skb() __sk_stream_mem_reclaim() -> __sk_mem_reclaim() sk_stream_mem_reclaim() -> sk_mem_reclaim() sk_stream_mem_schedule -> __sk_mem_schedule() sk_stream_pages() -> sk_mem_pages() sk_stream_rmem_schedule() -> sk_rmem_schedule() sk_stream_wmem_schedule() -> sk_wmem_schedule() sk_charge_skb() -> sk_mem_charge() Removeing sk_stream_rfree(): consolidates into sock_rfree() sk_stream_set_owner_r(): consolidates into skb_set_owner_r() sk_stream_mem_schedule() The following functions are added. sk_has_account(): check if the protocol supports accounting sk_mem_uncharge(): do the opposite of sk_mem_charge() In addition, to achieve consolidation, updating sk_wmem_queued is removed from sk_mem_charge(). Next, to consolidate memory accounting functions, this patch adds memory accounting calls to network core functions. Moreover, present memory accounting call is renamed to new accounting call. Finally we replace present memory accounting calls with new interface in TCP and SCTP. Signed-off-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Hideo Aoki <haoki@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:18 -08:00
Vlad Yasevich	ef5d4cf2f9	[SCTP]: Flush fragment queue when exiting partial delivery. At the end of partial delivery, we may have complete messages sitting on the fragment queue. These messages are stuck there until a new fragment arrives. This can comletely stall a given association. When clearing partial delivery state, flush any complete messages from the fragment queue and send them on their way up. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-16 14:05:45 -08:00
Vlad Yasevich	cd3ae8e615	SCTP: Fix PR-SCTP to deliver all the accumulated ordered chunks There is a small bug when we process a FWD-TSN. We'll deliver anything upto the current next expected SSN. However, if the next expected is already in the queue, it will take another chunk to trigger its delivery. The fix is to simply check the current queued SSN is the next expected one. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2007-11-09 11:43:41 -05:00
Pavel Emelyanov	16d14ef9f2	[SCTP]: Consolidate sctp_ulpq_renege_xxx functions Both are equal, except for the list to be traversed. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:52 -07:00
Neil Horman	4d93df0abd	[SCTP]: Rewrite of sctp buffer management code This patch introduces autotuning to the sctp buffer management code similar to the TCP. The buffer space can be grown if the advertised receive window still has room. This might happen if small message sizes are used, which is common in telecom environmens. New tunables are introduced that provide limits to buffer growth and memory pressure is entered if to much buffer spaces is used. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:09 -07:00
Vlad Yasevich	ea2dfb3733	SCTP: properly clean up fragment and ordering queues during FWD-TSN. When we recieve a FWD-TSN (meaning the peer has abandoned the data), we need to clean up any partially received messages that may be hanging out on the re-assembly or re-ordering queues. This is a MUST requirement that was not properly done before. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com.>	2007-08-29 13:34:33 -04:00
Stephen Hemminger	3ff50b7997	[NET]: cleanup extra semicolons Spring cleaning time... There seems to be a lot of places in the network code that have extra bogus semicolons after conditionals. Most commonly is a bogus semicolon after: switch() { } Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:24 -07:00
Vlad Yasevich	d49d91d79a	[SCTP]: Implement SCTP_PARTIAL_DELIVERY_POINT option. This option induces partial delivery to run as soon as the specified amount of data has been accumulated on the association. However, we give preference to fully reassembled messages over PD messages. In any case, window and buffer is freed up. Signed-off-by: Vlad Yasevich <vladislav.yasevich@.hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:00 -07:00
Vlad Yasevich	b6e1331f3c	[SCTP]: Implement SCTP_FRAGMENT_INTERLEAVE socket option This option was introduced in draft-ietf-tsvwg-sctpsocket-13. It prevents head-of-line blocking in the case of one-to-many endpoint. Applications enabling this option really must enable SCTP_SNDRCV event so that they would know where the data belongs. Based on an earlier patch by Ivan Skytte Jørgensen. Additionally, this functionality now permits multiple associations on the same endpoint to enter Partial Delivery. Applications should be extra careful, when using this functionality, to track EOR indicators. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:59 -07:00
Vlad Yasevich	d0cf0d9940	[SCTP]: Do not interleave non-fragments when in partial delivery The way partial delivery is currently implemnted, it is possible to intereleave a message (either from another steram, or unordered) that is not part of partial delivery process. The only way to this is for a message to not be a fragment and be 'in order' or unorderd for a given stream. This will result in bypassing the reassembly/ordering queues where things live duing partial delivery, and the message will be delivered to the socket in the middle of partial delivery. This is a two-fold problem, in that: 1. the app now must check the stream-id and flags which it may not be doing. 2. this clearing partial delivery state from the association and results in ulp hanging. This patch is a band-aid over a much bigger problem in that we don't do stream interleave. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-18 14:16:09 -07:00
Vlad Yasevich	0b58a81146	[SCTP]: Clean up stale data during association restart During association restart we may have stale data sitting on the ULP queue waiting for ordering or reassembly. This data may cause severe problems if not cleaned up. In particular stale data pending ordering may cause problems with receive window exhaustion if our peer has decided to restart the association. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-20 00:09:43 -07:00
YOSHIFUJI Hideaki	d808ad9ab8	[NET] SCTP: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:11 -08:00
Vlad Yasevich	331c4ee7fa	[SCTP]: Fix receive buffer accounting. When doing receiver buffer accounting, we always used skb->truesize. This is problematic when processing bundled DATA chunks because for every DATA chunk that could be small part of one large skb, we would charge the size of the entire skb. The new approach is to store the size of the DATA chunk we are accounting for in the sctp_ulpevent structure and use that stored value for accounting. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:44 -07:00
Vladislav Yasevich	672e7cca17	[SCTP]: Prevent possible infinite recursion with multiple bundled DATA. There is a rare situation that causes lksctp to go into infinite recursion and crash the system. The trigger is a packet that contains at least the first two DATA fragments of a message bundled together. The recursion is triggered when the user data buffer is smaller that the full data message. The problem is that we clone the skb for every fragment in the message. When reassembling the full message, we try to link skbs from the "first fragment" clone using the frag_list. However, since the frag_list is shared between two clones in this rare situation, we end up setting the frag_list pointer of the second fragment to point to itself. This causes sctp_skb_pull() to potentially recurse indefinitely. Proposed solution is to make a copy of the skb when attempting to link things using frag_list. Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-05 17:03:49 -07:00
Al Viro	dd0fc66fb3	[PATCH] gfp flags annotations - part 1 - added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-08 15:00:57 -07:00
David S. Miller	8728b834b2	[NET]: Kill skb->list Remove the "list" member of struct sk_buff, as it is entirely redundant. All SKB list removal callers know which list the SKB is on, so storing this in sk_buff does nothing other than taking up some space. Two tricky bits were SCTP, which I took care of, and two ATM drivers which Francois Romieu <romieu@fr.zoreil.com> fixed up. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>	2005-08-29 15:31:14 -07:00
Alexey Dobriyan	3182cd84f0	[SCTP]: __nocast annotations Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-11 20:57:47 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

45 Commits