fanout_add() might return with fanout_mutex held.
Reduce indentation level while we are at it
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds userspace buffers support in skb shared info. A new
struct skb_ubuf_info is needed to maintain the userspace buffers
argument and index, a callback is used to notify userspace to release
the buffers once lower device has done DMA (Last reference to that skb
has gone).
If there is any userspace apps to reference these userspace buffers,
then these userspaces buffers will be copied into kernel. This way we
can prevent userspace apps from holding these userspace buffers too long.
Use destructor_arg to point to the userspace buffer info; a new tx flags
SKBTX_DEV_ZEROCOPY is added for zero-copy buffer check.
Signed-off-by: Shirley Ma <xma@...ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC 6164 requires that routers MUST disable Subnet-Router anycast
for the prefix when /127 prefixes are used.
No need for matching code in addrconf_leave_anycast() as it
will silently ignore any attempt to leave an unknown anycast
address.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we clone the SKB, we forget about the original
one. Avoid this problem by using skb_share_check().
Reported-by: Penttilä Mika <mika.penttila@ixonos.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unfortunately we have to use a real modulus here as
the multiply trick won't work as effectively with cpu
numbers as it does with rxhash values.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add an unsolicited notification of the DCBX negotiated
parameters for the CEE flavor of the DCBX protocol. The notification
message is identical to the aggregated CEE get operation and holds all
the pertinent local and peer information. The notification routine is
exported so it can be invoked by drivers supporting an embedded DCBX
stack.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following couple of patches add dcbnl an unsolicited notification of
the the DCB configuration for the CEE flavor of the DCBX protocol. This
is useful when the user-mode DCB client is not responsible for
conducting and resolving the DCBX negotiation (either because the DCBX
stack is embedded in the HW or the negotiation is handled by another
agent in the host), but still needs to get the negotiated parameters.
This functionality already exists for the IEEE flavor of the DCBX
protocol and these patches add it to the older CEE flavor.
The first patch extends the CEE attribute GET operation to include not
only the peer information, but also all the pertinent local
configuration (negotiated parameters). The second patch adds and export
a CEE specific notification routine.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just add GSO to vlan_features initialization, and update comments.
When we set offload features, vlan_dev_fix_features() will do more check.
In vlan_dev_fix_features(), final features is decided by
features of real device and vlan_features of real device.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb->rxhash cannot be properly computed if the
packet is a fragment. To alleviate this, allow the
AF_PACKET client to ask for defragmentation to be
done at demux time.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fanouts allow packet capturing to be demuxed to a set of AF_PACKET
sockets. Two fanout policies are implemented:
1) Hashing based upon skb->rxhash
2) Pure round-robin
An AF_PACKET socket must be fully bound before it tries to add itself
to a fanout. All AF_PACKET sockets trying to join the same fanout
must all have the same bind settings.
Fanouts are identified (within a network namespace) by a 16-bit ID.
The first socket to try to add itself to a fanout with a particular
ID, creates that fanout. When the last socket leaves the fanout
(which happens only when the socket is closed), that fanout is
destroyed.
Signed-off-by: David S. Miller <davem@davemloft.net>
If gso/gro feature of underlying device is turned off,
then new created vlan device never can turn gso/gro on.
Although underlying device don't support TSO, we still
should use software segments for vlan device.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As is_multicast_ether_addr returns true on broadcast packets as
well, we need to explicitly exclude broadcast packets so that
they're always flooded. This wasn't an issue before as broadcast
packets were considered to be an unregistered multicast group,
which were always flooded. However, as we now only flood such
packets to router ports, this is no longer acceptable.
Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The version number of modules build outside of the tree can get revision
numbers added. This is useful to give hints about the revision of a
distribution package and the used patchset. The prepended source number or
branch name doesn't add any additional information which would help to identify
problems and can therefore be omitted.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The packet aggregation needs to ensure that only compatible packets
are aggregated. Some of the checks are based on the interface number
while assuming that the first interface also is the primary interface
which is not always the case.
This patch addresses the issue by using the primary_if pointer.
Reported-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The primary interface OGM has to be broadcasted on all hard-interfaces
even if the primary interface is not the first interface (if_num = 0).
Therefore the code has to compare the originating interface with the
primary interface instead of checking the if_num.
Reported-by: Linus Luessing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
now tt_local_event() takes a flags argument instead of a sequence of
boolean values which would grow up with the time.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
In order to make possible to use the broadcast list for delayed sendings
the "delay" parameter is now provided instead of using 1 as hardcoded
value.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The tt_local_entry structure now has a 'flags' field. This helps to
unify the flags format to all the client related structures (tt_global_entry
and tt_change). The 'never_purge' field is now encoded in the 'flags' one.
To optimise the usage of this field, its length has been increased to 16bit
in order to use the eight leading bits (from 0 to 7) to store flags that
have to be sent on the wire, while the eight ending ones are used for local
computation only.
Moreover 'enum tt_change_flags' is now called 'enum tt_client_flags' and the
defined values apply to the tt_local_entry, tt_global_entry and the tt_change
'flags' field.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Hi,
Reinhard Max also pointed out that the error should EAFNOSUPPORT according
to POSIX.
The Linux manpages have it as EINVAL, some other OSes (Minix, HPUX, perhaps BSD) use
EAFNOSUPPORT. Windows uses WSAEFAULT according to MSDN.
Other protocols error values in their af bind() methods in current mainline git as far
as a brief look shows:
EAFNOSUPPORT: atm, appletalk, l2tp, llc, phonet, rxrpc
EINVAL: ax25, bluetooth, decnet, econet, ieee802154, iucv, netlink, netrom, packet, rds, rose, unix, x25,
No check?: can/raw, ipv6/raw, irda, l2tp/l2tp_ip
Ciao, Marcus
Signed-off-by: Marcus Meissner <meissner@suse.de>
Cc: Reinhard Max <max@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
CCID-2's cwnd increases like TCP during slow-start, which has implications for
* the local Sequence Window value (should be > cwnd),
* the Ack Ratio value.
Hence an exponential growth, if it does not reflect the actual network
conditions, can quickly lead to instability.
This patch adds congestion-window validation (RFC2861) to CCID-2:
* cwnd is constrained if the sender is application limited;
* cwnd is reduced after a long idle period, as suggested in the '90 paper
by Van Jacobson, in RFC 2581 (sec. 4.1);
* cwnd is never reduced below the RFC 3390 initial window.
As marked in the comments, the code is actually almost a direct copy of the
TCP congestion-window-validation algorithms. By continuing this work, it may
in future be possible to use the TCP code (not possible at the moment).
The mechanism can be turned off using a module parameter. Sampling of the
currently-used window (moving-maximum) is however done constantly; this is
used to determine the expected window, which can be exploited to regulate
DCCP's Sequence Window value.
This patch also sets slow-start-after-idle (RFC 4341, 5.1), i.e. it behaves like
TCP when net.ipv4.tcp_slow_start_after_idle = 1.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This replaces a switch statement with a test, using the equivalent
function dccp_data_packet(skb). It also doubles the range of the field
`rx_num_data_pkts' by changing the type from `int' to `u32', avoiding
signed/unsigned comparison with the u16 field `dccps_r_ack_ratio'.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This moves CCID-2's initial window function into the header file, since several
parts throughout the CCID-2 code need to call it (CCID-2 still uses RFC 3390).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Leandro Melo de Sales <leandro@ic.ufal.br>
Change the CCID (de)activation message to start with the
protocol name, as 'CCID' is already in there.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Realising the following call pattern,
* first dccp_entail() is called to enqueue a new skb and
* then skb_clone() is called to transmit a clone of that skb,
this patch integrates both into the same function.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This patch rearranges the order of statements of the slow-path input processing
(i.e. any other state than OPEN), to resolve the following issues.
1. Dependencies: the order of statements now better matches RFC 4340, 8.5, i.e.
step 7 is before step 9 (previously 9 was before 7), and parsing options in
step 8 (which may consume resources) now comes after step 7.
2. Sequence number checks are omitted if in state LISTEN/REQUEST, due to the
note underneath the table in RFC 4340, 7.5.3.
As a result, CCID processing is now indeed confined to OPEN/PARTOPEN states,
i.e. congestion control is performed only on the flow of data packets. This
avoids pathological cases of doing congestion control on those messages
which set up and terminate the connection.
3. Packets are now passed on to Ack Vector / CCID processing only after
- step 7 (receive unexpected packets),
- step 9 (receive Reset),
- step 13 (receive CloseReq),
- step 14 (receive Close)
and only if the state is PARTOPEN. This simplifies CCID processing:
- in LISTEN/CLOSED the CCIDs are non-existent;
- in RESPOND/REQUEST the CCIDs have not yet been negotiated;
- in CLOSEREQ and active-CLOSING the node has already closed this socket;
- in passive-CLOSING the client is waiting for its Reset.
In the last case, RFC 4340, 8.3 leaves it open to ignore further incoming
data, which is the approach taken here.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Use pr_fmt() without KBUILD_MODNAME to allow AUN and econet prefixes.
Convert printks with KERN_DEBUG to pr_debug.
Hoist assigns from if.
80 column wrapping.
Move open braces to end of line.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Too trivial to live.
cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unused symbols waste space.
Commit 0e34e93177
"(netpoll: add generic support for bridge and bonding devices)"
added the symbol more than a year ago with the promise of "future use".
Because it is so far unused, remove it for now.
It can be easily readded if or when it actually needs to be used.
cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling icmp_send() on a local message size error leads to
an incorrect update of the path mtu. So use ip_local_error()
instead to notify the socket about the error.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We might call ip_ufo_append_data() for packets that will be IPsec
transformed later. This function should be used just for real
udp packets. So we check for rt->dst.header_len which is only
nonzero on IPsec handling and call ip_ufo_append_data() just
if rt->dst.header_len is zero.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The family arg is not used any more, so remove it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6, unlike IPV4, doesn't have a routing cache.
Routing table entries, as well as clones made in response
to route lookup requests, all live in the same table. And
all of these things are together collected in the destination
cache table for ipv6.
This means that routing table entries count against the garbage
collection limits, even though such entries cannot ever be reclaimed
and are added explicitly by the administrator (rather than being
created in response to lookups).
Therefore it makes no sense to count ipv6 routing table entries
against the GC limits.
Add a DST_NOCOUNT destination cache entry flag, and skip the counting
if it is set. Use this flag bit in ipv6 when adding routing table
entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct the syntax so that both array and pointer are const.
Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column line reflowing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column line reflowing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows useless break;s removed after returns
and a comment added to an unnecessary default: break;
because of a dubious gcc warning.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows code on same line as case moved
to separate lines and a "/* Fallthrough */" comment
added where appropriate.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column reflowing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column reflowing,
removal of a useless break after return, and moving
open brace after case instead of separate line.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows no difference.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows miscellaneous 80 column wrapping,
comment reflowing and a comment for a useless gcc
warning for an otherwise unused default: case.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>