linux

Author	SHA1	Message	Date
Patrick McHardy	151d799a61	netfilter: nf_tables: mark stateful expressions Add a flag to mark stateful expressions. This is used for dynamic expression instanstiation to limit the usable expressions. Strictly speaking only the dynset expression can not be used in order to avoid recursion, but since dynamically instantiating non-stateful expressions will simply create an identical copy, which behaves no differently than the original, this limits to expressions where it actually makes sense to dynamically instantiate them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 20:12:31 +02:00
Patrick McHardy	f25ad2e907	netfilter: nf_tables: prepare for expressions associated to set elements Preparation to attach expressions to set elements: add a set extension type to hold an expression and dump the expression information with the set element. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 20:12:31 +02:00
Patrick McHardy	0b2d8a7b63	netfilter: nf_tables: add helper functions for expression handling Add helper functions for initializing, cloning, dumping and destroying a single expression that is not part of a rule. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 20:12:31 +02:00
Kenneth Klette Jonassen	3d0d26c797	tcp: fix bogus RTT for CC when retransmissions are acked Since retransmitted segments are not used for RTT estimation, previously SACKed segments present in the rtx queue are used. This estimation can be several times larger than the actual RTT. When a cumulative ack covers both previously SACKed and retransmitted segments, CC may thus get a bogus RTT. Such segments previously had an RTT estimation in tcp_sacktag_one(), so it seems reasonable to not reuse them in tcp_clean_rtx_queue() at all. Afaik, this has had no effect on SRTT/RTO because of Karn's check. Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-13 13:54:25 -04:00
Daniel Borkmann	4577139b2d	net: use jump label patching for ingress qdisc in __netif_receive_skb_core Even if we make use of classifier and actions from the egress path, we're going into handle_ing() executing additional code on a per-packet cost for ingress qdisc, just to realize that nothing is attached on ingress. Instead, this can just be blinded out as a no-op entirely with the use of a static key. On input fast-path, we already make use of static keys in various places, e.g. skb time stamping, in RPS, etc. It makes sense to not waste time when we're assured that no ingress qdisc is attached anywhere. Enabling/disabling of that code path is being done via two helpers, namely net_{inc,dec}_ingress_queue(), that are being invoked under RTNL mutex when a ingress qdisc is being either initialized or destructed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-13 13:34:40 -04:00
Patrick McHardy	7d7402642e	netfilter: nf_tables: variable sized set element keys / data This patch changes sets to support variable sized set element keys / data up to 64 bytes each by using variable sized set extensions. This allows to use concatenations with bigger data items suchs as IPv6 addresses. As a side effect, small keys/data now don't require the full 16 bytes of struct nft_data anymore but just the space they need. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:31 +02:00
Patrick McHardy	d0a11fc3dc	netfilter: nf_tables: support variable sized data in nft_data_init() Add a size argument to nft_data_init() and pass in the available space. This will be used by the following patches to support variable sized set element data. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:30 +02:00
Patrick McHardy	49499c3e6e	netfilter: nf_tables: switch registers to 32 bit addressing Switch the nf_tables registers from 128 bit addressing to 32 bit addressing to support so called concatenations, where multiple values can be concatenated over multiple registers for O(1) exact matches of multiple dimensions using sets. The old register values are mapped to areas of 128 bits for compatibility. When dumping register numbers, values are expressed using the old values if they refer to the beginning of a 128 bit area for compatibility. To support concatenations, register loads of less than a full 32 bit value need to be padded. This mainly affects the payload and exthdr expressions, which both unconditionally zero the last word before copying the data. Userspace fully passes the testsuite using both old and new register addressing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:29 +02:00
Patrick McHardy	b1c96ed37c	netfilter: nf_tables: add register parsing/dumping helpers Add helper functions to parse and dump register values in netlink attributes. These helpers will later be changed to take care of translation between the old 128 bit and the new 32 bit register numbers. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:28 +02:00
Patrick McHardy	8cd8937ac0	netfilter: nf_tables: convert sets to u32 data pointers Simple conversion to use u32 pointers to the beginning of the data area to keep follow up patches smaller. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:27 +02:00
Patrick McHardy	e562d860d7	netfilter: nf_tables: kill nft_data_cmp() Only needlessly complicates things due to requiring specific argument types. Use memcmp directly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:26 +02:00
Patrick McHardy	fad136ea0d	netfilter: nf_tables: convert expressions to u32 register pointers Simple conversion to use u32 pointers to the beginning of the registers to keep follow up patches smaller. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:25 +02:00
Patrick McHardy	1ca2e1702c	netfilter: nf_tables: use struct nft_verdict within struct nft_data Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:24 +02:00
Patrick McHardy	a55e22e92f	netfilter: nf_tables: get rid of NFT_REG_VERDICT usage Replace the array of registers passed to expressions by a struct nft_regs, containing the verdict as a seperate member, which aliases to the NFT_REG_VERDICT register. This is needed to seperate the verdict from the data registers completely, so their size can be changed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 17:17:07 +02:00
Patrick McHardy	d07db9884a	netfilter: nf_tables: introduce nft_validate_register_load() Change nft_validate_input_register() to not only validate the input register number, but also the length of the load, and rename it to nft_validate_register_load() to reflect that change. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 16:25:50 +02:00
Patrick McHardy	27e6d2017a	netfilter: nf_tables: kill nft_validate_output_register() All users of nft_validate_register_store() first invoke nft_validate_output_register(). There is in fact no use for using it on its own, so simplify the code by folding the functionality into nft_validate_register_store() and kill it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 16:25:50 +02:00
Patrick McHardy	58f40ab6e2	netfilter: nft_lookup: use nft_validate_register_store() to validate types In preparation of validating the length of a register store, use nft_validate_register_store() in nft_lookup instead of open coding the validation. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 16:25:49 +02:00
Patrick McHardy	1ec10212f9	netfilter: nf_tables: rename nft_validate_data_load() The existing name is ambiguous, data is loaded as well when we read from a register. Rename to nft_validate_register_store() for clarity and consistency with the upcoming patch to introduce its counterpart, nft_validate_register_load(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 16:25:49 +02:00
Patrick McHardy	45d9bcda21	netfilter: nf_tables: validate len in nft_validate_data_load() For values spanning multiple registers, we need to validate that enough space is available from the destination register onwards. Add a len argument to nft_validate_data_load() and consolidate the existing length validations in preparation of that. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-13 16:25:49 +02:00
David S. Miller	e60a9de49c	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-04-11 This series contains updates to iflink, ixgbe and ixgbevf. The entire set of changes come from Vlad Zolotarov to ultimately add the ethtool ops to VF driver to allow querying the RSS indirection table and RSS random key. Currently we support only 82599 and x540 devices. On those devices, VFs share the RSS redirection table and hash key with a PF. Letting the VF query this information may introduce some security risks, therefore this feature will be disabled by default. The new netdev op allows a system administrator to change the default behaviour with "ip link set" command. The relevant iproute2 patch has already been sent and awaits for this series upstream. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 21:36:57 -04:00
WANG Cong	7a6c8c34e5	fou: implement FOU_CMD_GET Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 21:25:13 -04:00
WANG Cong	02d793c5bb	fou: add network namespace support Also convert the spinlock to a mutex. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 21:25:13 -04:00
WANG Cong	4cbcdf2b6c	fou: always use be16 for port udp_config.local_udp_port is be16. And iproute2 passes network order for FOU_ATTR_PORT. This doesn't fix any bug, just for consistency. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 21:25:13 -04:00
WANG Cong	67270636a8	fou: exit early when parsing config fails Not a big deal, just for corretness. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 21:25:13 -04:00
WANG Cong	9272f04872	fou: avoid calling udp_del_offload() twice This fixes the following harmless warning: ./ip/ip fou del port 7777 [ 122.907516] udp_del_offload: didn't find offload for port 7777 Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 21:25:13 -04:00
Eric Dumazet	52db70dca5	tcp: do not cache align timewait sockets With recent adoption of skc_cookie in struct sock_common, struct tcp_timewait_sock size increased from 192 to 200 bytes on 64bit arches. SLAB rounds then to 256 bytes. It is time to drop SLAB_HWCACHE_ALIGN constraint for twsk_slab. This saves about 12 MB of memory on typical configuration reaching 262144 timewait sockets, and has no noticeable impact on performance. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 21:16:05 -04:00
David S. Miller	4e78eb0dbf	There isn't much left, but we have * new mac80211 internal software queue to allow drivers to have shorter hardware queues and pull on-demand * use rhashtable for mac80211 station table * minstrel rate control debug improvements and some refactoring * fix noisy message about TX power reduction * fix continuous message printing and activity if CRDA doesn't respond * fix VHT-related capabilities with "iw connect" or "iwconfig ..." * fix Kconfig for cfg80211 wireless extensions compatibility -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVJ7CPAAoJEDBSmw7B7bqr8+IQAKCAbUyd6PFRT5tcz9kW5GCW /ibb+n1e14yWKgNEe1gddUGKG/L3HGCBXNkCYzR2M8mlL7dLPqspaBcGHS4dx8F4 D0AuikqvtXIxfAXi0zmU2uo7rH7u2X34R2LtS8AlKByD+jmFvxMiPPvxNFgzJu/7 63UQm73p2pnu/KdXLW1OQEcpZtZJ9+N/uBiq9zbVdX3A8T84ME0oMyy+EAQqCZdK CcsTXHCnAgmmXWJlu1JRdopr1bd38mSGB70eXduFtPqDdmtQRnoaCQ9e+tJDA4j4 svEw0yDmsc4WG1EKLKKCRd3uFOZsng+lcXrHfpm5wlSPpCOItfQ9BzT3x1u6Y5JU Z1WMOMkkEce+95U7/RLoXwC/2RS3XelUXTde4cGIRMvO5drOrU58P0gdn3J+yKbv 6v+2GGKy/39tdXUOxIl3EZT/huIl+h1UNO8C2hyaEwdXK+X1zl31/u6kk1Ns18Wr YPEJixxHx0zR8jaZgDC7OlWLuqn4Ay+Ls9yCyIesdHzKpizJKqn83PntYnpJmxoA 9hlIyRDWnqH44KxzB85ni1C2Qudec3mcCWIWV7M+UoSC1Cgs/LxDzH7kRejR2ZIl vRhg5pqyr53L0h2lq5DO4Cj4UzbXb7YioKJRxjyKloNOlRrCZtK/VEsHbdsKEcIp d/wHj1AyFZeQfuhk8Qqr =mtuo -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== There isn't much left, but we have * new mac80211 internal software queue to allow drivers to have shorter hardware queues and pull on-demand * use rhashtable for mac80211 station table * minstrel rate control debug improvements and some refactoring * fix noisy message about TX power reduction * fix continuous message printing and activity if CRDA doesn't respond * fix VHT-related capabilities with "iw connect" or "iwconfig ..." * fix Kconfig for cfg80211 wireless extensions compatibility ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-12 20:43:46 -04:00
Al Viro	5d5d568975	make new_sync_{read,write}() static All places outside of core VFS that checked ->read and ->write for being NULL or called the methods directly are gone now, so NULL {read,write} with non-NULL {read,write}_iter will do the right thing in all cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:29:40 -04:00
Al Viro	dcdbd7b269	net/9p: remove (now-)unused helpers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:28:29 -04:00
Al Viro	21c9f5ccb1	p9_client_attach(): set fid->uid correctly it's almost always equal to current_fsuid(), but there's an exception - if the first writeback fid is opened by non-root and that happens before root has done any lookups in /, we end up doing attach for root. The current code leaves the resulting FID owned by root from the server POV and by non-root from the client one. Unfortunately, it means that e.g. massive dcache eviction will leave that user buggered - they'll end up redoing walks from / and picking that FID every time. As soon as they try to create something, the things will get nasty. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:28:28 -04:00
Al Viro	e1200fe68f	9p: switch p9_client_read() to passing struct iov_iter * ... and make it loop Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:28:27 -04:00
Al Viro	070b3656cf	9p: switch p9_client_write() to passing it struct iov_iter * ... and make it loop until it's done Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:28:25 -04:00
Al Viro	4f3b35c157	net/9p: switch the guts of p9_client_{read,write}() to iov_iter ... and have get_user_pages_fast() mapping fewer pages than requested to generate a short read/write. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:28:25 -04:00
Al Viro	c0fec3a98b	Merge branch 'iocb' into for-next	2015-04-11 22:24:41 -04:00
Al Viro	01e97e6517	new helper: msg_data_left() convert open-coded instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 15:53:35 -04:00
Al Viro	a2dd3793a1	Merge remote-tracking branch 'dh/afs' into for-davem	2015-04-11 15:51:09 -04:00
Al Viro	d8725c86ae	get rid of the size argument of sock_sendmsg() it's equal to iov_iter_count(&msg->msg_iter) in all cases Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 15:27:37 -04:00
Vlad Zolotarov	01a3d79681	if_link: Add an additional parameter to ifla_vf_info for RSS querying Add configuration setting for drivers to allow/block an RSS Redirection Table and a Hash Key querying for discrete VFs. On some devices VF share the mentioned above information with PF and querying it may adduce a theoretical security risk. We want to let a system administrator to decide if he/she wants to take this risk or not. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-04-10 21:57:22 -07:00
Thomas Graf	78ebb0d00b	rtnetlink: Mark name argument of rtnl_create_link() const Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-10 12:42:40 -07:00
David S. Miller	c3d0dac693	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-04-09 We've had enough new patches during the past week (especially from Marcel) that it'd be good to still get these queued for 4.1. The majority of the changes are from Marcel with lots of cleanup & refactoring patches for the HCI UART driver. Marcel also split out some Broadcom & Intel vendor specific functionality into two new btintel & btbcm modules. In addition to the HCI driver changes there's the completion of our local OOB data interface for pairing, added support for requesting remote LE features when connecting, as well as a couple of minor fixes for mac802154. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 18:31:50 -04:00
Eric Dumazet	b52e69217b	tcp: md5: fix a typo in tcp_v4_md5_lookup() Lookup key for tcp_md5_do_lookup() has to be taken from addr_sk, not sk (which can be the listener) Fixes: `fd3a154a00` ("tcp: md5: get rid of tcp_v[46]_reqsk_md5_lookup()") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 18:11:16 -04:00
Hubert Sokolowski	1e53d5bb88	net: Pass VLAN ID to rtnl_fdb_notify. When an FDB entry is added or deleted the information about VLAN is not passed to listening applications like 'bridge monitor fdb'. With this patch VLAN ID is passed if it was set in the original netlink message. Also remove an unused bdev variable. Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 17:30:58 -04:00
Eric Dumazet	b50edd7812	tcp: tcp_make_synack() should clear skb->tstamp I noticed tcpdump was giving funky timestamps for locally generated SYNACK messages on loopback interface. 11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S 945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7> 20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S 3160535375:3160535375(0) ack 945476043 win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7> This is because we need to clear skb->tstamp before entering lower stack, otherwise net_timestamp_check() does not set skb->tstamp. Fixes: `7faee5c0d5` ("tcp: remove TCP_SKB_CB(skb)->when") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 17:28:57 -04:00
Jesse Gross	b736a623bd	udptunnels: Call handle_offloads after inserting vlan tag. handle_offloads() calls skb_reset_inner_headers() to store the layer pointers to the encapsulated packet. However, we currently push the vlag tag (if there is one) onto the packet afterwards. This changes the MAC header for the encapsulated packet but it is not reflected in skb->inner_mac_header, which breaks GSO and drivers which attempt to use this for encapsulation offloads. Fixes: `1eaa8178` ("vxlan: Add tx-vlan offload support.") Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 14:56:32 -04:00
David S. Miller	ca69d7102f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. They are: * nf_tables set timeout infrastructure from Patrick Mchardy. 1) Add support for set timeout support. 2) Add support for set element timeouts using the new set extension infrastructure. 4) Add garbage collection helper functions to get rid of stale elements. Elements are accumulated in a batch that are asynchronously released via RCU when the batch is full. 5) Add garbage collection synchronization helpers. This introduces a new element busy bit to address concurrent access from the netlink API and the garbage collector. 5) Add timeout support for the nft_hash set implementation. The garbage collector peridically checks for stale elements from the workqueue. * iptables/nftables cgroup fixes: 6) Ignore non full-socket objects from the input path, otherwise cgroup match may crash, from Daniel Borkmann. 7) Fix cgroup in nf_tables. 8) Save some cycles from xt_socket by skipping packet header parsing when skb->sk is already set because of early demux. Also from Daniel. * br_netfilter updates from Florian Westphal. 9) Save frag_max_size and restore it from the forward path too. 10) Use a per-cpu area to restore the original source MAC address when traffic is DNAT'ed. 11) Add helper functions to access physical devices. 12) Use these new physdev helper function from xt_physdev. 13) Add another nf_bridge_info_get() helper function to fetch the br_netfilter state information. 14) Annotate original layer 2 protocol number in nf_bridge info, instead of using kludgy flags. 15) Also annotate the pkttype mangling when the packet travels back and forth from the IP to the bridge layer, instead of using a flag. * More nf_tables set enhancement from Patrick: 16) Fix possible usage of set variant that doesn't support timeouts. 17) Avoid spurious "set is full" errors from Netlink API when there are pending stale elements scheduled to be released. 18) Restrict loop checks to set maps. 19) Add support for dynamic set updates from the packet path. 20) Add support to store optional user data (eg. comments) per set element. BTW, I have also pulled net-next into nf-next to anticipate the conflict resolution between your okfn() signature changes and Florian's br_netfilter updates. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 14:46:04 -04:00
David S. Miller	9399bdcbb5	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2015-04-09 1) Prohibit the use/abuse of the xfrm netlink interface on 32/64 bit compatibility tasks. We need a full compat layer before we can allow this. From Fan Du. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 14:41:47 -04:00
David S. Miller	634d8ee4de	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2015-04-09 1) We dereferenced the xfrm outer_mode too early, larval SAs don't have it set. Move the dereference of the outer mode below the larval SA check to fix it. From Alexey Dobriyan. 2) Fix vti6 tunnel uninit on namespace crosssing. From Yao Xiwei. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 14:39:37 -04:00
Marcel Holtmann	0fe29fd1cd	Bluetooth: Read LE remote features during connection establishment When establishing a Bluetooth LE connection, read the remote used features mask to determine which features are supported. This was not really needed with Bluetooth 4.0, but since Bluetooth 4.1 and also 4.2 have introduced new optional features, this becomes more important. This works the same as with BR/EDR where the connection enters the BT_CONFIG stage and hci_connect_cfm call is delayed until the remote features have been retrieved. Only after successfully receiving the remote features, the connection enters the BT_CONNECTED state. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-09 08:36:54 +03:00
Al Viro	6aa248145a	switch kernel_sendmsg() and kernel_recvmsg() to iov_iter_kvec() For kernel_sendmsg() that eliminates the need to play with setfs(); for kernel_recvmsg() it does not - a couple of callers are using it with non-NULL ->msg_control, which would be treated as userland address on recvmsg side of things. In all cases we are really setting a kvec-backed iov_iter, though. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-09 00:02:34 -04:00
Al Viro	da18428498	net: switch importing msghdr from userland to {compat_,}import_iovec() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-09 00:02:26 -04:00
Al Viro	602bd0e90e	net: switch sendto() and recvfrom() to import_single_range() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-09 00:02:21 -04:00
Al Viro	237dae8890	Merge branch 'iocb' into for-davem trivial conflict in net/socket.c and non-trivial one in crypto - that one had evaded aio_complete() removal. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-09 00:01:38 -04:00
Eric Dumazet	dd929c1b3d	tcp: do not rearm rsk_timer on FastOpen requests FastOpen requests are not like other regular request sockets. They do not yet use rsk_timer : tcp_fastopen_queue_check() simply manually removes one expired request from fastopenq->rskq_rst list. Therefore, tcp_check_req() must not call mod_timer_pending(), otherwise we crash because rsk_timer was not initialized. Fixes: `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-08 20:19:42 -04:00
David S. Miller	82740b9ac2	NFC: 4.1 pull request This is the NFC pull request for 4.1. This is a shorter one than usual, as the Intel Field Peak NFC driver could not make it in time. We have: - A new driver for NXP NCI based chipsets, like e.g. the NPC100 or the PN7150. It currently only supports an i2c physical layer, but could easily be extended to work on top of e.g. SPI. This driver also includes support for user space triggered firmware updates. - A few minor st21nfc[ab] fixes, cleanups, and comments improvements. - A pn533 error return fix. - A few NFC related logs formatting cleanups. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVJV81AAoJEIqAPN1PVmxKM3sP/R6fKXMxtVxXsiPzBMk+SpBI onNXbCx27rp1lHTFznE16JAaaSfcQEwSQmYx7xa6KXqRYVlDfRfC+5R+rPGrCjfH kV/eLBNnYpmPw0ViVT7dsWK0b4me0k5pr9ki9mze3YxvuMbA5vdv0tvuRFz5IRF/ hl+WI5pntGuRtnIyBKIauAMylylUYVvCBGhmHnveiX0Dp8noJLBSV0wvdzujm51S +Uio/jHlUEV2lotrQBOfWNtEkwonXSwzZWSzimBCyEGLAwTx4lGXHmQftOtPd3zE sOT7Gw77ZCsulHoHiJyC0KpDSDS0NYVrtTI5BiuGgAivGi0YZw2XD3CYRIdy0m2Z lMoodYdiCgsxmrU6I6Rd/7DQBxD0Vhc+CGAyk41f7AwU4Fwq105kpSLupTtU+NzT ZpdvSXeirU8sxt+3WDOgv8esyYGZVVD8GuBbofMZQZ0vLq+FcDpYAW4w3LKvvi6X C+WN8f7SCI0kqpd4leyl6EG3SoQKFyPWobu0IlV520R9b76iBcyqooTIpvVa4Yk7 az6fKhi9gK/T6FHW78y6fnkczd47JKrC924m+g6P/hhTD5zQ956ferp0uFTjkPtF 8kNgRT7OIRi1JO783cnQx1uA61axC3GX1HFzsD9paVkTwRNtJ3eFsxtcYt7Nfv8D WGNWRQp5LmBsD/SqFBfk =MzOJ -----END PGP SIGNATURE----- Merge tag 'nfc-next-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz says: ==================== NFC: 4.1 pull request This is the NFC pull request for 4.1. This is a shorter one than usual, as the Intel Field Peak NFC driver could not make it in time. We have: - A new driver for NXP NCI based chipsets, like e.g. the NPC100 or the PN7150. It currently only supports an i2c physical layer, but could easily be extended to work on top of e.g. SPI. This driver also includes support for user space triggered firmware updates. - A few minor st21nfc[ab] fixes, cleanups, and comments improvements. - A pn533 error return fix. - A few NFC related logs formatting cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-08 17:12:50 -04:00
Andi Kleen	5eeb292215	fou: Don't use const __read_mostly const __read_mostly is a senseless combination. If something is already const it cannot be __read_mostly. Remove the bogus __read_mostly in the fou driver. This fixes section conflicts with LTO. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-08 15:29:08 -04:00
David Miller	c1f8667677	netfilter: Fix switch statement warnings with recent gcc. More recent GCC warns about two kinds of switch statement uses: 1) Switching on an enumeration, but not having an explicit case statement for all members of the enumeration. To show the compiler this is intentional, we simply add a default case with nothing more than a break statement. 2) Switching on a boolean value. I think this warning is dumb but nevertheless you get it wholesale with -Wswitch. This patch cures all such warnings in netfilter. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 15:20:50 -04:00
Sowmini Varadhan	443be0e5af	RDS: make sure not to loop forever inside rds_send_xmit If a determined set of concurrent senders keep the send queue full, we can loop forever inside rds_send_xmit. This fix has two parts. First we are dropping out of the while(1) loop after we've processed a large batch of messages. Second we add a generation number that gets bumped each time the xmit bit lock is acquired. If someone else has jumped in and made progress in the queue, we skip our goto restart. Original patch by Chris Mason. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-08 15:17:32 -04:00
Sowmini Varadhan	1789b2c077	RDS: only use passive connections when addresses match Passive connections were added for the case where one loopback IB connection between identical addresses needs another connection to store the second QP. Unfortunately, they were also created in the case where the addesses differ and we already have both QPs. This lead to a message reordering bug. - two different IB interfaces and addresses on a machine: A B - traffic is sent from A to B - connection from A-B is created, connect request sent - listening accepts connect request, B-A is created - traffic flows, next_rx is incremented - unacked messages exist on the retrans list - connection A-B is shut down, new connect request sent - listen sees existing loopback B-A, creates new passive B-A - retrans messages are sent and delivered because of 0 next_rx The problem is that the second connection request saw the previously existing parent connection. Instead of using it, and using the existing next_rx_seq state for the traffic between those IPs, it mistakenly thought that it had to create a passive connection. We fix this by only using passive connections in the special case where laddr and faddr match. In this case we'll only ever have one parent sending connection requests and one passive connection created as the listening path sees the existing parent connection which initiated the request. Original patch by Zach Brown Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-08 15:17:23 -04:00
Pablo Neira Ayuso	aadd51aa71	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Resolve conflicts between `5888b93` ("Merge branch 'nf-hook-compress'") and Florian Westphal br_netfilter works. Conflicts: net/bridge/br_netfilter.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 18:30:21 +02:00
Hannes Frederic Sowa	1b11287118	ipv6: call iptunnel_xmit with NULL sock pointer if no tunnel sock is available Fixes: `79b16aadea` ("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb().") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-08 12:09:43 -04:00
Hannes Frederic Sowa	926a882f69	ipv4: ip_tunnel: use net namespace from rtable not socket The socket parameter might legally be NULL, thus sock_net is sometimes causing a NULL pointer dereference. Using net_device pointer in dst_entry is more reliable. Fixes: `b6a7719aed` ("ipv4: hash net ptr into fragmentation bucket selection") Reported-by: Rick Jones <rick.jones2@hp.com> Cc: Rick Jones <rick.jones2@hp.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-08 12:09:42 -04:00
Patrick McHardy	68e942e88a	netfilter: nf_tables: support optional userdata for set elements Add an userdata set extension and allow the user to attach arbitrary data to set elements. This is intended to hold TLV encoded data like comments or DNS annotations that have no meaning to the kernel. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:58:27 +02:00
Patrick McHardy	22fe54d5fe	netfilter: nf_tables: add support for dynamic set updates Add a new "dynset" expression for dynamic set updates. A new set op ->update() is added which, for non existant elements, invokes an initialization callback and inserts the new element. For both new or existing elements the extenstion pointer is returned to the caller to optionally perform timer updates or other actions. Element removal is not supported so far, however that seems to be a rather exotic need and can be added later on. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:58:27 +02:00
Patrick McHardy	11113e190b	netfilter: nf_tables: support different set binding types Currently a set binding is assumed to be related to a lookup and, in case of maps, a data load. In order to use bindings for set updates, the loop detection checks must be restricted to map operations only. Add a flags member to the binding struct to hold the set "action" flags such as NFT_SET_MAP, and perform loop detection based on these. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:58:27 +02:00
Patrick McHardy	3dd0673ac3	netfilter: nf_tables: prepare set element accounting for async updates Use atomic operations for the element count to avoid races with async updates. To properly handle the transactional semantics during netlink updates, deleted but not yet committed elements are accounted for seperately and are treated as being already removed. This means for the duration of a netlink transaction, the limit might be exceeded by the amount of elements deleted. Set implementations must be prepared to handle this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:58:27 +02:00
Patrick McHardy	4a8678efbe	netfilter: nf_tables: fix set selection when timeouts are requested The NFT_SET_TIMEOUT flag is ignore in nft_select_set_ops, which may lead to selection of a set implementation that doesn't actually support timeouts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:58:26 +02:00
Florian Westphal	a1e67951e6	netfilter: bridge: make BRNF_PKT_TYPE flag a bool nf_bridge_info->mask is used for several things, for example to remember if skb->pkt_type was set to OTHER_HOST. For a bridge, OTHER_HOST is expected case. For ip forward its a non-starter though -- routing expects PACKET_HOST. Bridge netfilter thus changes OTHER_HOST to PACKET_HOST before hook invocation and then un-does it after hook traversal. This information is irrelevant outside of br_netfilter. After this change, ->mask now only contains flags that need to be known outside of br_netfilter in fast-path. Future patch changes mask into a 2bit state field in sk_buff, so that we can remove skb->nf_bridge pointer for good and consider all remaining places that access nf_bridge info content a not-so fastpath. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:49:12 +02:00
Florian Westphal	3eaf402502	netfilter: bridge: start splitting mask into public/private chunks ->mask is a bit info field that mixes various use cases. In particular, we have flags that are mutually exlusive, and flags that are only used within br_netfilter while others need to be exposed to other parts of the kernel. Remove BRNF_8021Q/PPPoE flags. They're mutually exclusive and only needed within br_netfilter context. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:49:11 +02:00
Florian Westphal	383307838d	netfilter: bridge: add and use nf_bridge_info_get helper Don't access skb->nf_bridge directly, this pointer will be removed soon. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:49:10 +02:00
Florian Westphal	a99074ae1f	netfilter: physdev: use helpers Avoid skb->nf_bridge accesses where possible. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:49:09 +02:00
Florian Westphal	c737b7c451	netfilter: bridge: add helpers for fetching physin/outdev right now we store this in the nf_bridge_info struct, accessible via skb->nf_bridge. This patch prepares removal of this pointer from skb: Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out device (or ifindexes). Followup patches to netfilter will then allow nf_bridge_info to be obtained by a call into the br_netfilter core, rather than keeping a pointer to it in sk_buff. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:49:08 +02:00
Florian Westphal	e70deecbf8	netfilter: bridge: don't use nf_bridge_info data to store mac header br_netfilter maintains an extra state, nf_bridge_info, which is attached to skb via skb->nf_bridge pointer. Amongst other things we use skb->nf_bridge->data to store the original mac header for every processed skb. This is required for ip refragmentation when using conntrack on top of bridge, because ip_fragment doesn't copy it from original skb. However there is no need anymore to do this unconditionally. Move this to the one place where its needed -- when br_netfilter calls ip_fragment(). Also switch to percpu storage for this so we can handle fragmenting without accessing nf_bridge meta data. Only user left is neigh resolution when DNAT is detected, to hold the original source mac address (neigh resolution builds new mac header using bridge mac), so rename ->data and reduce its size to whats needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:49:07 +02:00
Daniel Borkmann	d64d80a2cd	netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match Currently in xt_socket, we take advantage of early demuxed sockets since commit `00028aa370` ("netfilter: xt_socket: use IP early demux") in order to avoid a second socket lookup in the fast path, but we only make partial use of this: We still unnecessarily parse headers, extract proto, {s,d}addr and {s,d}ports from the skb data, accessing possible conntrack information, etc even though we were not even calling into the socket lookup via xt_socket_get_sock_{v4,v6}() due to skb->sk hit, meaning those cycles can be spared. After this patch, we only proceed the slower, manual lookup path when we have a skb->sk miss, thus time to match verdict for early demuxed sockets will improve further, which might be i.e. interesting for use cases such as mentioned in `681f130f39` ("netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-08 16:47:49 +02:00
Johannes Berg	6d00ec0514	cfg80211: don't allow disabling WEXT if it's required The change to only export WEXT symbols when required could break the build if CONFIG_CFG80211_WEXT was explicitly disabled while a driver like orinoco selected it. Fix this by hiding the symbol when it's required so it can't be disabled in that case. Fixes: `2afe38d15c` ("cfg80211-wext: export symbols only when needed") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-08 09:19:29 +02:00
Sheng Yong	8bc0034cf6	net: remove extra newlines Signed-off-by: Sheng Yong <shengyong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 22:24:37 -04:00
Daniel Lee	2646c831c0	tcp: RFC7413 option support for Fast Open client Fast Open has been using an experimental option with a magic number (RFC6994). This patch makes the client by default use the RFC7413 option (34) to get and send Fast Open cookies. This patch makes the client solicit cookies from a given server first with the RFC7413 option. If that fails to elicit a cookie, then it tries the RFC6994 experimental option. If that also fails, it uses the RFC7413 option on all subsequent connect attempts. If the server returns a Fast Open cookie then the client caches the form of the option that successfully elicited a cookie, and uses that form on later connects when it presents that cookie. The idea is to gradually obsolete the use of experimental options as the servers and clients upgrade, while keeping the interoperability meanwhile. Signed-off-by: Daniel Lee <Longinus00@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 18:36:39 -04:00
Daniel Lee	7f9b838b71	tcp: RFC7413 option support for Fast Open server Fast Open has been using the experimental option with a magic number (RFC6994) to request and grant Fast Open cookies. This patch enables the server to support the official IANA option 34 in RFC7413 in addition. The change has passed all existing Fast Open tests with both old and new options at Google. Signed-off-by: Daniel Lee <Longinus00@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 18:36:39 -04:00
Beshay, Joseph	0ad2a83659	netem: Fixes byte backlog accounting for the first of two chained netem instances Fixes byte backlog accounting for the first of two chained netem instances. Bytes backlog reported now corresponds to the number of queued packets. When two netem instances are chained, for instance to apply rate and queue limitation followed by packet delay, the number of backlogged bytes reported by the first netem instance is wrong. It reports the sum of bytes in the queues of the first and second netem. The first netem reports the correct number of backlogged packets but not bytes. This is shown in the example below. Consider a chain of two netem schedulers created using the following commands: $ tc -s qdisc replace dev veth2 root handle 1:0 netem rate 10000kbit limit 100 $ tc -s qdisc add dev veth2 parent 1:0 handle 2: netem delay 50ms Start an iperf session to send packets out on the specified interface and monitor the backlog using tc: $ tc -s qdisc show dev veth2 Output using unpatched netem: qdisc netem 1: root refcnt 2 limit 100 rate 10000Kbit Sent 98422639 bytes 65434 pkt (dropped 123, overlimits 0 requeues 0) backlog 172694b 73p requeues 0 qdisc netem 2: parent 1: limit 1000 delay 50.0ms Sent 98422639 bytes 65434 pkt (dropped 0, overlimits 0 requeues 0) backlog 63588b 42p requeues 0 The interface used to produce this output has an MTU of 1500. The output for backlogged bytes behind netem 1 is 172694b. This value is not correct. Consider the total number of sent bytes and packets. By dividing the number of sent bytes by the number of sent packets, we get an average packet size of ~=1504. If we divide the number of backlogged bytes by packets, we get ~=2365. This is due to the first netem incorrectly counting the 63588b which are in netem 2's queue as being in its own queue. To verify this is the case, we subtract them from the reported value and divide by the number of packets as follows: 172694 - 63588 = 109106 bytes actualled backlogged in netem 1 109106 / 73 packets ~= 1494 bytes (which matches our MTU) The root cause is that the byte accounting is not done at the same time with packet accounting. The solution is to update the backlog value every time the packet queue is updated. Signed-off-by: Joseph D Beshay <joseph.beshay@utdallas.edu> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 18:34:24 -04:00
Johan Hedberg	40f66c05c3	Bluetooth: Add local SSP OOB data to OOB ext data mgmt command The Read Local Out Of Band Extended Data mgmt command is specified to return the SSP values when given a BR/EDR address type as input parameter. The returned values may include either the 192-bit variants of C and R, or their 256-bit variants, or both, depending on the status of Secure Connections and Secure Connections Only modes. If SSP is not enabled the command will only return the Class of Device value (like it has done so far). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-07 23:31:20 +02:00
Nicolas Dichtel	a143c40c32	netns: allow to dump netns ids Which this patch, it's possible to dump the list of ids allocated for peer netns. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 17:29:41 -04:00
Nicolas Dichtel	9a9634545c	netns: notify netns id events With this patch, netns ids that are created and deleted are advertised into the group RTNLGRP_NSID. Because callers of rtnl_net_notifyid() already know the id of the peer, there is no need to call __peernet2id() in rtnl_net_fill(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 17:29:41 -04:00
Nicolas Dichtel	b111e4e111	netns: minor cleanup in rtnl_net_getid() No need to initialize err, it will be overridden by the value of nlmsg_parse(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 17:29:41 -04:00
David Miller	79b16aadea	udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb(). That was we can make sure the output path of ipv4/ipv6 operate on the UDP socket rather than whatever random thing happens to be in skb->sk. Based upon a patch by Jiri Pirko. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>	2015-04-07 15:29:08 -04:00
David Miller	7026b1ddb6	netfilter: Pass socket pointer down through okfn(). On the output paths in particular, we have to sometimes deal with two socket contexts. First, and usually skb->sk, is the local socket that generated the frame. And second, is potentially the socket used to control a tunneling socket, such as one the encapsulates using UDP. We do not want to disassociate skb->sk when encapsulating in order to fix this, because that would break socket memory accounting. The most extreme case where this can cause huge problems is an AF_PACKET socket transmitting over a vxlan device. We hit code paths doing checks that assume they are dealing with an ipv4 socket, but are actually operating upon the AF_PACKET one. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 15:25:55 -04:00
David Miller	1c984f8a5d	netfilter: Add socket pointer to nf_hook_state. It is currently always set to NULL, but nf_queue is adjusted to be prepared for it being set to a real socket by taking and releasing a reference to that socket when necessary. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 15:25:55 -04:00
Marcel Holtmann	2d7cc19eeb	Bluetooth: Remove hci_recv_stream_fragment function The hci_recv_stream_fragment function should have never been introduced in the first place. The Bluetooth core does not need to know anything about the HCI transport protocol. With all transport protocol specific detailed moved back into the drivers where they belong (mainly generic USB and UART drivers), this function can now be removed. This reduces the size of hci_dev structure and also removes an exported symbol from the Bluetooth core module. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-07 18:47:10 +02:00
Marcel Holtmann	5c7d2dd285	Bluetooth: Make data pointer of hci_recv_stream_fragment const The data pointer provided to hci_recv_stream_fragment function should have been marked const. The function has no business in modifying the original data. So fix this now. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-07 18:47:09 +02:00
Steven Rostedt (Red Hat)	1bc1e4d048	mac80211: Move message tracepoints to their own header Every tracing file must have its own TRACE_SYSTEM defined. The mac80211 tracepoint header broke this and add in the middle of the file had: #undef TRACE_SYSTEM #define TRACE_SYSTEM mac80211_msg Unfortunately, this broke new code in the ftrace infrastructure. Moving the mac80211_msg into its own trace file with its own TRACE_SYSTEM defined fixes the issue. Link: http://lkml.kernel.org/r/1428389938.1841.1.camel@sipsolutions.net Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2015-04-07 12:32:09 -04:00
Ilya Dryomov	6d7fdb0ab3	Revert "libceph: use memalloc flags for net IO" This reverts commit `89baaa570a`. Dirty page throttling should be sufficient for us in the general case so there is no need to use __GFP_MEMALLOC - it would be needed only in the swap-over-rbd case, which we currently don't support. (It would probably take approximately the commit that is being reverted to add that support, but we would also need the "swap" option to distinguish from the general case and make sure swap ceph_client-s aren't shared with anything else.) See ceph-devel threads [1] and [2] for the details of why enabling pfmemalloc reserves for all cases is a bad thing. On top of potential system lockups related to drained emergency reserves, this turned out to cause ceph lockups in case peers are on the same host and communicating via loopback due to sk_filter() dropping pfmemalloc skbs on the receiving side because the receiving loopback socket is not tagged with SOCK_MEMALLOC. [1] "SOCK_MEMALLOC vs loopback" http://www.spinics.net/lists/ceph-devel/msg22998.html [2] "[PATCH] libceph: don't set memalloc flags in loopback case" http://www.spinics.net/lists/ceph-devel/msg23392.html Conflicts: net/ceph/messenger.c [ context: tcp_nodelay option ] Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Mel Gorman <mgorman@suse.de> Cc: Sage Weil <sage@redhat.com> Cc: stable@vger.kernel.org # 3.18+, needs backporting Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Mel Gorman <mgorman@suse.de>	2015-04-07 19:08:35 +03:00
David S. Miller	7abccdba25	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-04-04 Here's what's probably the last bluetooth-next pull request for 4.1: - Fixes for LE advertising data & advertising parameters - Fix for race condition with HCI_RESET flag - New BNEPGETSUPPFEAT ioctl, needed for certification - New HCI request callback type to get the resulting skb - Cleanups to use BIT() macro wherever possible - Consolidate Broadcom device entries in the btusb HCI driver - Check for valid flags in CMTP, HIDP & BNEP - Disallow local privacy & OOB data combo to prevent a potential race - Expose SMP & ECDH selftest results through debugfs - Expose current Device ID info through debugfs Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-07 11:47:52 -04:00
Johannes Berg	46b9d18014	cfg80211: send extended capabilities IE in connect If the connect request from userspace didn't include an extended capabilities IE, create one using the driver capabilities. This fixes VHT associations, since those need to set the operating mode notification capability. Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-07 13:56:45 +02:00
Johannes Berg	29464ccc78	cfg80211: move IE split utilities here from mac80211 As the next patch will require the IE splitting utility functions in cfg80211, move them there from mac80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-07 13:56:41 +02:00
Yao Xiwei	092a29a40b	vti6: fix uninit when using x-netns When the kernel deleted a vti6 interface, this interface was not removed from the tunnels list. Thus, when the ip6_vti module was removed, this old interface was found and the kernel tried to delete it again. This was leading to a kernel panic. Fixes: `61220ab349` ("vti6: Enable namespace changing") Signed-off-by: Yao Xiwei <xiwei.yao@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2015-04-07 07:52:28 +02:00
Alexey Dobriyan	68c11e98ef	xfrm: fix xfrm_input/xfrm_tunnel_check oops https://bugzilla.kernel.org/show_bug.cgi?id=95211 Commit `70be6c91c8` ("xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer") added check which dereferences ->outer_mode too early but larval SAs don't have this pointer set (yet). So check for tunnel stuff later. Mike Noordermeer reported this bug and patiently applied all the debugging. Technically this is remote-oops-in-interrupt-context type of thing. BUG: unable to handle kernel NULL pointer dereference at 0000000000000034 IP: [<ffffffff8150dca2>] xfrm_input+0x3c2/0x5a0 ... [<ffffffff81500fc6>] ? xfrm4_esp_rcv+0x36/0x70 [<ffffffff814acc9a>] ? ip_local_deliver_finish+0x9a/0x200 [<ffffffff81471b83>] ? __netif_receive_skb_core+0x6f3/0x8f0 ... RIP [<ffffffff8150dca2>] xfrm_input+0x3c2/0x5a0 Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2015-04-07 07:52:27 +02:00
David S. Miller	c85d6975ef	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/mellanox/mlx4/cmd.c net/core/fib_rules.c net/ipv4/fib_frontend.c The fib_rules.c and fib_frontend.c conflicts were locking adjustments in 'net' overlapping addition and removal of code in 'net-next'. The mlx4 conflict was a bug fix in 'net' happening in the same place a constant was being replaced with a more suitable macro. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 22:34:15 -04:00
Pavel Nakonechny	303038135a	net: dsa: fix filling routing table from OF description According to description in 'include/net/dsa.h', in cascade switches configurations where there are more than one interconnected devices, 'rtable' array in 'dsa_chip_data' structure is used to indicate which port on this switch should be used to send packets to that are destined for corresponding switch. However, dsa_of_setup_routing_table() fills 'rtable' with port numbers of the _target_ switch, but not current one. This commit removes redundant devicetree parsing and adds needed port number as a function argument. So dsa_of_setup_routing_table() now just looks for target switch number by parsing parent of 'link' device node. To remove possible misunderstandings with the way of determining target switch number, a corresponding comment was added to the source code and to the DSA device tree bindings documentation file. This was tested on a custom board with two Marvell 88E6095 switches with following corresponding routing tables: { -1, 10 } and { 8, -1 }. Signed-off-by: Pavel Nakonechny <pavel.nakonechny@skitlab.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 17:31:37 -04:00
WANG Cong	67e04c29ec	l2tp: unregister l2tp_net_ops on failure path Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 16:53:02 -04:00
Alexei Starovoitov	91bc4822c3	tc: bpf: add checksum helpers Commit `608cd71a9c` ("tc: bpf: generalize pedit action") has added the possibility to mangle packet data to BPF programs in the tc pipeline. This patch adds two helpers bpf_l3_csum_replace() and bpf_l4_csum_replace() for fixing up the protocol checksums after the packet mangling. It also adds 'flags' argument to bpf_skb_store_bytes() helper to avoid unnecessary checksum recomputations when BPF programs adjusting l3/l4 checksums and documents all three helpers in uapi header. Moreover, a sample program is added to show how BPF programs can make use of the mangle and csum helpers. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 16:42:35 -04:00
hannes@stressinduktion.org	f60e5990d9	ipv6: protect skb->sk accesses from recursive dereference inside the stack We should not consult skb->sk for output decisions in xmit recursion levels > 0 in the stack. Otherwise local socket settings could influence the result of e.g. tunnel encapsulation process. ipv6 does not conform with this in three places: 1) ip6_fragment: we do consult ipv6_npinfo for frag_size 2) sk_mc_loop in ipv6 uses skb->sk and checks if we should loop the packet back to the local socket 3) ip6_skb_dst_mtu could query the settings from the user socket and force a wrong MTU Furthermore: In sk_mc_loop we could potentially land in WARN_ON(1) if we use a PF_PACKET socket ontop of an IPv6-backed vxlan device. Reuse xmit_recursion as we are currently only interested in protecting tunnel devices. Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 16:12:49 -04:00
David S. Miller	b85c3dc9bd	netfilter: Pass nf_hook_state through arpt_do_table(). Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 13:26:52 -04:00
David S. Miller	073bfd5686	netfilter: Pass nf_hook_state through nft_set_pktinfo*(). Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:54:27 -04:00
David S. Miller	8f8a37152d	netfilter: Pass nf_hook_state through ip6t_do_table(). Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:52:06 -04:00
David S. Miller	8fe22382d1	netfilter: Pass nf_hook_state through nf_nat_ipv6_{in,out,fn,local_fn}(). Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:48:08 -04:00
David S. Miller	1c491ba259	netfilter: Pass nf_hook_state through ipt_do_table(). Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:47:04 -04:00
David S. Miller	d7cf4081ed	netfilter: Pass nf_hook_state through nf_nat_ipv4_{in,out,fn,local_fn}(). Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:45:19 -04:00
David S. Miller	238e54c9cb	netfilter: Make nf_hookfn use nf_hook_state. Pass the nf_hook_state all the way down into the hook functions themselves. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:31:38 -04:00
David S. Miller	1d1de89b9a	netfilter: Use nf_hook_state in nf_queue_entry. That way we don't have to reinstantiate another nf_hook_state on the stack of the nf_reinject() path. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:25:22 -04:00
David S. Miller	cfdfab3146	netfilter: Create and use nf_hook_state. Instead of passing a large number of arguments down into the nf_hook() entry points, create a structure which carries this state down through the hook processing layers. This makes is so that if we want to change the types or signatures of any of these pieces of state, there are less places that need to be changed. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-04 12:17:40 -04:00
Marcel Holtmann	38c8af6004	Bluetooth: Fix location of TX power field in LE advertising data The TX power field in the LE advertising data should be placed last since it needs to be possible to enable kernel controlled TX power, but still allow for userspace provided flags field. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-04 08:50:20 +03:00
Marcel Holtmann	fd6413d882	Bluetooth: hidp: Use BIT(x) instead of (1 << x) Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-04 08:50:20 +03:00
Marcel Holtmann	b2ddeb1173	Bluetooth: cmtp: Use BIT(x) instead of (1 << x) Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-04 08:50:20 +03:00
Grzegorz Kolodziejczyk	836a061b19	Bluetooth: bnep: Handle BNEP connection setup request With this patch kernel will be able to handle setup request. This is needed if we would like to handle control mesages with extension headers. User space will be only resposible for reading setup data and checking if scenario is conformance to specification (dst and src device bnep role). In case of new user space, setup data must be leaved(peek msg) on queue. New bnep session will be responsible for handling this data. Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-03 23:21:34 +02:00
Grzegorz Kolodziejczyk	bf8b9a9cb7	Bluetooth: bnep: Add support to extended headers of control frames Handling extended headers of control frames is required BNEP functionality. This patch refractor bnep rx frame handling function. Extended header for control frames shouldn't be omitted as it was previously done. Every control frame should be checked if it contains extended header and then every extension should be parsed separately. Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-03 23:21:34 +02:00
Grzegorz Kolodziejczyk	0477e2e868	Bluetooth: bnep: Add support for get bnep features via ioctl This is needed if user space wants to know supported bnep features by kernel, e.g. if kernel supports sending response to bnep setup control message. By now there is no possibility to know supported features by kernel in case of bnep. Ioctls allows only to add connection, delete connection, get connection list, get connection info. Adding connection if it's possible (establishing network device connection) is equivalent to starting bnep session. Bnep session handles data queue of transmit, receive messages over bnep channel. It means that if we add connection the received/transmitted data will be parsed immediately. In case of get bnep features we want to know before session start, if we should leave setup data on socket queue and let kernel to handle with it, or in case of no setup handling support, if we should pull this message and handle setup response within user space. Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-03 23:21:34 +02:00
Daniel Borkmann	bcad571824	ebpf: add skb->priority to offset map for usage in {cls, act}_bpf This adds the ability to read out the skb->priority from an eBPF program, so that it can be taken into account from a tc filter or action for the use-case where the priority is not being used to directly override the filter classification in a qdisc, but to tag traffic otherwise for the classifier; the priority can be assigned from various places incl. user space, in future we may also mangle it from an eBPF program. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-03 14:59:15 -04:00
Grzegorz Kolodziejczyk	e0fdbab169	Bluetooth: bnep: Return err value while sending cmd is not understood Send command not understood response should be verified if it was successfully sent, like all send responses. Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-03 19:52:35 +02:00
Nicolas Dichtel	576b7cd2f6	netns: don't allocate an id for dead netns First, let's explain the problem. Suppose you have an ipip interface that stands in the netns foo and its link part in the netns bar (so the netns bar has an nsid into the netns foo). Now, you remove the netns bar: - the bar nsid into the netns foo is removed - the netns exit method of ipip is called, thus our ipip iface is removed: => a netlink message is built in the netns foo to advertise this deletion => this netlink message requests an nsid for bar, thus a new nsid is allocated for bar and never removed. This patch adds a check in peernet2id() so that an id cannot be allocated for a netns which is currently destroyed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-03 12:36:31 -04:00
Nicolas Dichtel	6d458f5b4e	Revert "netns: don't clear nsid too early on removal" This reverts commit `4217291e59` ("netns: don't clear nsid too early on removal"). This is not the right fix, it introduces races. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-03 12:36:31 -04:00
Ian Morris	00db41243e	ipv4: coding style: comparison for inequality with NULL The ipv4 code uses a mixture of coding styles. In some instances check for non-NULL pointer is done as x != NULL and sometimes as x. x is preferred according to checkpatch and this patch makes the code consistent by adopting the latter form. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-03 12:11:15 -04:00
Ian Morris	51456b2914	ipv4: coding style: comparison for equality with NULL The ipv4 code uses a mixture of coding styles. In some instances check for NULL pointer is done as x == NULL and sometimes as !x. !x is preferred according to checkpatch and this patch makes the code consistent by adopting the latter form. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-03 12:11:15 -04:00
WANG Cong	7ba0c47c34	ip6mr: call del_timer_sync() in ip6mr_free_table() We need to wait for the flying timers, since we are going to free the mrtable right after it. Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 20:52:35 -04:00
WANG Cong	419df12fb5	net: move fib_rules_unregister() under rtnl lock We have to hold rtnl lock for fib_rules_unregister() otherwise the following race could happen: fib_rules_unregister(): fib_nl_delrule(): ... ... ... ops = lookup_rules_ops(); list_del_rcu(&ops->list); list_for_each_entry(ops->rules) { fib_rules_cleanup_ops(ops); ... list_del_rcu(); list_del_rcu(); } Note, net->rules_mod_lock is actually not needed at all, either upper layer netns code or rtnl lock guarantees we are safe. Cc: Alexander Duyck <alexander.h.duyck@redhat.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 20:52:34 -04:00
WANG Cong	ed785309c9	ipv4: take rtnl_lock and mark mrt table as freed on namespace cleanup This is the IPv4 part for commit `905a6f96a1` (ipv6: take rtnl_lock and mark mrt6 table as freed on namespace cleanup). Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 20:52:34 -04:00
Neal Cardwell	666b805150	tcp: fix FRTO undo on cumulative ACK of SACKed range On processing cumulative ACKs, the FRTO code was not checking the SACKed bit, meaning that there could be a spurious FRTO undo on a cumulative ACK of a previously SACKed skb. The FRTO code should only consider a cumulative ACK to indicate that an original/unretransmitted skb is newly ACKed if the skb was not yet SACKed. The effect of the spurious FRTO undo would typically be to make the connection think that all previously-sent packets were in flight when they really weren't, leading to a stall and an RTO. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Fixes: `e33099f96d` ("tcp: implement RFC5682 F-RTO") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:35:58 -04:00
Jon Paul Maloy	ed193ece26	tipc: simplify link mtu negotiation When a link is being established, the two endpoints advertise their respective interface MTU in the transmitted RESET and ACTIVATE messages. If there is any difference, the lower of the two MTUs will be selected for use by both endpoints. However, as a remnant of earlier attempts to introduce TIPC level routing. there also exists an MTU discovery mechanism. If an intermediate node has a lower MTU than the two endpoints, they will discover this through a bisectional approach, and finally adopt this MTU for common use. Since there is no TIPC level routing, and probably never will be, this mechanism doesn't make any sense, and only serves to make the link level protocol unecessarily complex. In this commit, we eliminate the MTU discovery algorithm,and fall back to the simple MTU advertising approach. This change is fully backwards compatible. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:27:12 -04:00
Jon Paul Maloy	dff29b1a88	tipc: eliminate delayed link deletion at link failover When a bearer is disabled manually, all its links have to be reset and deleted. However, if there is a remaining, parallel link ready to take over a deleted link's traffic, we currently delay the delete of the removed link until the failover procedure is finished. This is because the remaining link needs to access state from the reset link, such as the last received packet number, and any partially reassembled buffer, in order to perform a successful failover. In this commit, we do instead move the state data over to the new link, so that it can fulfill the procedure autonomously, without accessing any data on the old link. This means that we can now proceed and delete all pertaining links immediately when a bearer is disabled. This saves us from some unnecessary complexity in such situations. We also choose to change the confusing definitions CHANGEOVER_PROTOCOL, ORIGINAL_MSG and DUPLICATE_MSG to the more descriptive TUNNEL_PROTOCOL, FAILOVER_MSG and SYNCH_MSG respectively. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:27:12 -04:00
Jon Paul Maloy	2da7142516	tipc: drop tunneled packet duplicates at reception In commit `8b4ed8634f` ("tipc: eliminate race condition at dual link establishment") we introduced a parallel link synchronization mechanism that guarentees sequential delivery even for users switching from an old to a newly established link. The new mechanism makes it unnecessary to deliver the tunneled duplicate packets back to the old link, as we are currently doing. It is now sufficient to use the last tunneled packet's inner sequence number as synchronization point between the two parallel links, whereafter it can be dropped. In this commit, we drop the duplicate packets arriving on the new link, after updating the synchronization point at each new arrival. Although it would now have been sufficient for the other endpoint to only tunnel the last packet in its send queue, and not the entire queue, we must still do this to maintain compatibility with older nodes. This commit makes it possible to get rid if some complex interaction between the two parallel links. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:27:12 -04:00
David S. Miller	9f0d34bc34	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/asix_common.c drivers/net/usb/sr9800.c drivers/net/usb/usbnet.c include/linux/usb/usbnet.h net/ipv4/tcp_ipv4.c net/ipv6/tcp_ipv6.c The TCP conflicts were overlapping changes. In 'net' we added a READ_ONCE() to the socket cached RX route read, whilst in 'net-next' Eric Dumazet touched the surrounding code dealing with how mini sockets are handled. With USB, it's a case of the same bug fix first going into net-next and then I cherry picked it back into net. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:16:53 -04:00
Marcel Holtmann	e213568ad6	Bluetooth: Disallow LE local out-of-band data when LE privacy is used When the LE pivacy feature is used, then pairing has to happen based on resolvable random addresses (RPA), but currently there is no clean way to retrieve the correct RPA. So instead of returning an outdated RPA, just disallow this command when LE privacy is in use. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 22:18:58 +03:00
Linus Torvalds	8172ba51e2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix use-after-free with mac80211 RX A-MPDU reorder timer, from Johannes Berg. 2) iwlwifi leaks memory every module load/unload cycles, fix from Larry Finger. 3) Need to use for_each_netdev_safe() in rtnl_group_changelink() otherwise we can crash, from WANG Cong. 4) mlx4 driver does register_netdev() too early in the probe sequence, from Ido Shamay. 5) Don't allow router discovery hop limit to decrease the interface's hop limit, from D.S. Ljungmark. 6) tx_packets and tx_bytes improperly accounted for certain classes of USB network devices, fix from Ben Hutchings. 7) ip{6}mr_rules_init() mistakenly use plain kfree to release the ipmr tables in the error path, they must instead use ip{6}mr_free_table(). Fix from WANG Cong. 8) cxgb4 doesn't properly quiesce all RX activity before unregistering the netdevice. Fix from Hariprasad Shenai. 9) Fix hash corruptions in ipvlan driver, from Jiri Benc. 10) nla_memcpy(), like a real memcpy, should fully initialize the destination buffer, even if the source attribute is smaller. Fix from Jiri Benc. 11) Fix wrong error code returned from iucv_sock_sendmsg(). We should use whatever sock_alloc_send_skb() put into 'err'. From Eugene Crosser. 12) Fix slab object leak on module unload in TIPC, from Ying Xue. 13) Need a READ_ONCE() when reading the cached RX socket route in tcp_v{4,6}_early_demux(). From Michal Kubecek. 14) Still too many problems with TPC support in the ath9k driver, so disable it for now. From Felix Fietkau. 15) When in AP mode the rtlwifi driver can leak DMA mappings, fix from Larry Finger. 16) Missing kzalloc() failure check in gs_usb CAN driver, from Colin Ian King. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) cxgb4: Fix to dump devlog, even if FW is crashed cxgb4: Firmware macro changes for fw verison 1.13.32.0 bnx2x: Fix kdump when iommu=on bnx2x: Fix kdump on 4-port device mac80211: fix RX A-MPDU session reorder timer deletion MAINTAINERS: Update Intel Wired Ethernet Driver info tipc: fix a slab object leak net/usb/r8152: add device id for Lenovo TP USB 3.0 Ethernet af_iucv: fix AF_IUCV sendmsg() errno openvswitch: Return vport module ref before destruction netlink: pad nla_memcpy dest buffer with zeroes bonding: Bonding Overriding Configuration logic restored. ipvlan: fix check for IP addresses in control path ipvlan: do not use rcu operations for address list ipvlan: protect against concurrent link removal ipvlan: fix addr hash list corruption net: fec: setup right value for mdio hold time net: tcp6: fix double call of tcp_v6_fill_cb() cxgb4vf: Fix sparse warnings netns: don't clear nsid too early on removal ...	2015-04-02 11:09:41 -07:00
Nicolas Dichtel	e1622baf54	dev: set iflink to 0 for virtual interfaces Virtual interfaces are supposed to set an iflink value != of their ifindex. It was not the case for some of them, like vxlan, bond or bridge. Let's set iflink to 0 when dev->rtnl_link_ops is set. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:05:01 -04:00
Nicolas Dichtel	7a66bbc96c	net: remove iflink field from struct net_device Now that all users of iflink have the ndo_get_iflink handler available, it's possible to remove this field. By default, dev_get_iflink() returns the ifindex of the interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:05:01 -04:00
Nicolas Dichtel	abd2be00d4	dsa: implement ndo_get_iflink Don't use dev->iflink anymore. CC: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:05:01 -04:00
Nicolas Dichtel	2dbf6b5058	vlan: implement ndo_get_iflink Don't use dev->iflink anymore. CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:05:00 -04:00
Nicolas Dichtel	ee9b9596a8	ipmr,ip6mr: implement ndo_get_iflink Don't use dev->iflink anymore. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:05:00 -04:00
Nicolas Dichtel	1e99584b91	ipip,gre,vti,sit: implement ndo_get_iflink Don't use dev->iflink anymore. CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:05:00 -04:00
Nicolas Dichtel	ecf2c06a88	ip6tnl,gre6,vti6: implement ndo_get_iflink Don't use dev->iflink anymore. CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:04:59 -04:00
Nicolas Dichtel	a54acb3a6f	dev: introduce dev_get_iflink() The goal of this patch is to prepare the removal of the iflink field. It introduces a new ndo function, which will be implemented by virtual interfaces. There is no functional change into this patch. All readers of iflink field now call dev_get_iflink(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 14:04:59 -04:00
Johan Hedberg	1b9441f8ec	Bluetooth: Convert local OOB data reading to use HCI request Now that there's a HCI request API available where the callback receives the resulting skb, we can convert the local OOB data reading to use this new API. This patch does the necessary update in mgmt.c (which also requires moving the callback higher up since it's now a static function) and removes the custom calls from hci_event.c that are no-longer necessary. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-02 16:09:29 +02:00
Johan Hedberg	757aa0b56d	Bluetooth: Move hci_get_cmd_complete() to hci_event.c To make the hci_req_run_skb() API consistent with hci_cmd_sync_ev() the callback should receive the cmd_complete parameters in the 'normal' case and the full HCI event if a special event was expected. This patch moves the hci_get_cmd_complete() function from hci_core.c to hci_event.c where it's used to strip the skb from the needed headers before passing it on to the callback. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-02 16:09:28 +02:00
Johan Hedberg	abe66a4d03	Bluetooth: Remove unused hci_req_pending() function The hci_req_pending() function has no users anymore, so simply remove it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-02 16:09:28 +02:00
Johan Hedberg	f7d9e97592	Bluetooth: Remove unneeded recv_event variable Now that the synchronous HCI requests use the new API and a new private variable the recv_evt member of hci_dev is no-longer needed. This patch removes it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-02 16:09:27 +02:00
Johan Hedberg	f60cb30579	Bluetooth: Convert hci_req_sync family of function to new request API Now that there's an API in place that allows passing the resulting skb to the request callback we can conveniently convert the hci_req_sync and related functions to use it. Since we still need to get the skb from the async callback into the sleeping _sync() function the patch adds another req_skb variable to hci_dev where the sync request state is tracked. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-02 16:09:27 +02:00
Johan Hedberg	e621448749	Bluetooth: Add second hci_request callback option for full skb This patch adds a second possible callback for HCI requests where the callback will receive the full skb of the last successfully completed HCI command. This API is useful for cases where we want to use a request to read some data and the existing hci_event.c handlers do not store it e.g. in the hci_dev struct. The reason the patch is a bit bigger than just adding the new API is because the hci_req_cmd_complete() functions required some refactoring to enable it: now hci_req_cmd_complete() is simply used to request the callback pointers if any, and the actual calling of them happens from a single place at the end of hci_event_packet(). The reason for this is that we need to pass the original skb (without any skb_pull, etc modifications done to it) and it's simplest to keep track of it within the hci_event_packet() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-02 16:09:27 +02:00
Johan Hedberg	444c6dd54d	Bluetooth: Add clarifying comment to command status handling When dealing with HCI command status events, the reasoning for trying to mark a request as complete if no specific event is being waited for and status was success is not self-evident. This patch adds a clarifying comment above the if-statement. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-04-02 16:09:27 +02:00
Florian Westphal	0b67c43ce3	netfilter: bridge: really save frag_max_size between PRE and POST_ROUTING We also need to save/store in forward, else br_parse_ip_options call will zero frag_max_size as well. Fixes: `93fdd47e5` ('bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING') Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-02 10:41:14 +02:00
Marcel Holtmann	64dd374eac	Bluetooth: Export SMP selftest result in debugfs When SMP selftest is enabled, then besides printing the result into the kernel message buffer, also create a debugfs file that allows retrieving the same information. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:47:40 +03:00
Marcel Holtmann	6de50f9fdb	Bluetooth: Export ECDH selftest result in debugfs When ECDH selftest is enabled, then besides printing the result into the kernel message buffer, also create a debugfs file that allows retrieving the same information. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:47:38 +03:00
Marcel Holtmann	0151e426b1	Bluetooth: Restrict BNEP flags to only valid ones The BNEP flags should be clearly restricted to valid ones. So this puts extra checks in place to ensure this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:44:02 +03:00
Marcel Holtmann	5f5da99f1d	Bluetooth: Restrict HIDP flags to only valid ones The HIDP flags should be clearly restricted to valid ones. So this puts extra checks in place to ensure this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:43:11 +03:00
Marcel Holtmann	8bf17a3619	Bluetooth: Restrict CMTP flags to only valid ones The CMTP flags should be clearly restricted to valid ones. So this puts extra checks in place to ensure this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:42:21 +03:00
Marcel Holtmann	c3370de64d	Bluetooth: Expose current Device ID information via debugfs For debugging purposes it is good to be able to read the current configured Device ID details. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:40:35 +03:00
Simon Horman	05e8bb860b	pkt_sched: fq: correct spelling of locally Correct spelling of locally. Also remove extra space before tab character in struct fq_flow. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-01 22:52:29 -04:00
Felix Fietkau	ba8c3d6f16	mac80211: add an intermediate software queue implementation This allows drivers to request per-vif and per-sta-tid queues from which they can pull frames. This makes it easier to keep the hardware queues short, and to improve fairness between clients and vifs. The task of scheduling packet transmission is left up to the driver - queueing is controlled by mac80211. Drivers can only dequeue packets by calling ieee80211_tx_dequeue. This makes it possible to add active queue management later without changing drivers using this code. This can also be used as a starting point to implement A-MSDU aggregation in a way that does not add artificially induced latency. Signed-off-by: Felix Fietkau <nbd@openwrt.org> [resolved minor context conflict, minor changes, endian annotations] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:34 +02:00
John Linville	cef2fc1ce4	mac80211: reduce log spam from ieee80211_handle_pwr_constr This changes a couple of messages from sdata_info to sdata_dbg. This should reduce some log spam, as reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1206468 Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:34 +02:00
Thomas Huehn	5f919abc76	mac80211: add standard deviation to Minstrel stats This patch adds the statistical descriptor "standard deviation" to better describe the current properties of Minstrel and Minstrel-HTs success probability distribution. The standard deviation (SD) is calculated as exponential weighted moving standard deviation (EWMSD) and its current value is added as new column in all rc_stats (in debugfs). Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:33 +02:00
Thomas Huehn	ade6d4a2ec	mac80211: reduce calculation costs of EWMA This patch reduces the calculation costs of the EWMA macro from "2x multiplication and 1 addition" down to "1x multiplication and 2x additions". This slightly improves performance depending on the CPU architecture. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:33 +02:00
Thomas Huehn	50e55a8ea7	mac80211: add max lossless throughput per rate This patch adds the new statistic "maximum possible lossless throughput" to Minstrels and Minstrel-HTs rc_stats (in debugfs). This enables comprehensive comparison between current per-rate throughput and max. achievable per-rate throughput. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:32 +02:00
Thomas Huehn	6a27b2c40b	mac80211: restructure per-rate throughput calculation into function This patch moves Minstrels and Minstrel-HTs per-rate throughput calculation (EWMA(thr)) into a dedicated function to be called. Therefore the variable "unsigned int cur_tp" within struct "minstrel_rate_stats" becomes obsolete. and is removed to free up its space. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:32 +02:00
Thomas Huehn	9134073bc6	mac80211: improve Minstrel variable & function naming This patch ensures a consistent usage of variable names for type "minstrel_rate_stats" to be used as "mrs" and from type minstrel_rate as "mr" across both Minstrel & Minstrel-HT. In addition some variable and function names got changed to more meaningful ones. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:31 +02:00
Thomas Huehn	f62838bcc5	mac80211: unify Minstrel & Minstrel-HTs calculation of rate statistics This patch unifies the calculation of Minstrels and Minstrel-HTs per-rate statistic. The new common function minstrel_calc_rate_stats() is called when a statistic update is performed. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:31 +02:00
Thomas Huehn	2cae0b6a70	mac80211: add new Minstrel-HT statistic output via csv This patch adds a new debugfs file "rc_stats_csv" to output Minstrel-HTs statistics in a common csv format that is easy to parse. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> [remove printing current time of day] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:26 +02:00
Thomas Huehn	6d48851779	mac80211: add new Minstrel statistic output via csv This patch adds a new debugfs file "rc_stats_csv" to output Minstrels statistics in a common csv format that is easy to parse. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> [remove printing current time of day] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:41:43 +02:00
David S. Miller	af3e09e666	This contains just a single fix for a crash I happened to randomly run into today during testing. It's clearly been around for a while, but is pretty hard to trigger, even when I tried explicitly (and modified the code to make it more likely) it rarely did. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVG/WSAAoJEDBSmw7B7bqrzmgQALygsyo0GrmHHorg0wkO+PBK l6kVknRwlsMil+vmB6mPTnwgUaGnEWJeRXl4zPJeA4Z+Xr58K1lyTFcMC0nu2MVb MNcTJycRmF4Lqyycd52zFaA+1vMsgG7AZb6vXLYppchUFTmbLVNX/IWhJGNOAsKZ HjYdvLr2kbJunIRofc/0PCC/8J4qFQj2ZphF5WfMckhNrh8+SQjIqmscFdRIqcgj R+LtyxscyKWspQ97J6OdoTsKpxTKSZ8mi9IvohdJhkJVnzBEkJx+Krf6PNs+Xnh1 Z4x1Dkn1RoM8+7cakq2tTwwxokEw7de/3v/s90W+yVuZWNKsaiKHcnU+KGHu6t0J 2766PXv6hC3KSWN6XrfTxYP7CBsT46Vf5+7FSlDsck1tUbn3W55c2kPFMBKk79yi CHjz1O82wzx4bAVNdaKMpR8rz6bSyhZijmduMuYxxrvnKkl5BSHype9IAUo+etVz KxPnwGq9yJjf0RYdr9tttxiwXJaADD6/R/bO21SIi1JeKa5sCyoAA5nLDyJfXwyt KtlrzzM9NWqoQUi2SGmurHHEIBBmgg9RBBWvq+MNM0Ik7d9kIawCBlvPerVc4IH4 bUvIXEnrQNOEwxmY/9nsUShQvkzuQbkwR3rpEYA3XemO0qW1t0Tkdp/3UY3TY6Sq EOTi9LkOquZnKd9GB7yl =Imm5 -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== This contains just a single fix for a crash I happened to randomly run into today during testing. It's clearly been around for a while, but is pretty hard to trigger, even when I tried explicitly (and modified the code to make it more likely) it rarely did. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-01 14:19:22 -04:00
Linus Torvalds	1e848913f0	Merge branch 'for-4.0' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Two main issues: - We found that turning on pNFS by default (when it's configured at build time) was too aggressive, so we want to switch the default before the 4.0 release. - Recent client changes to increase open parallelism uncovered a serious bug lurking in the server's open code. Also fix a krb5/selinux regression. The rest is mainly smaller pNFS fixes" * 'for-4.0' of git://linux-nfs.org/~bfields/linux: sunrpc: make debugfs file creation failure non-fatal nfsd: require an explicit option to enable pNFS NFSD: Fix bad update of layout in nfsd4_return_file_layout NFSD: Take care the return value from nfsd4_encode_stateid NFSD: Printk blocklayout length and offset as format 0x%llx nfsd: return correct lockowner when there is a race on hash insert nfsd: return correct openowner when there is a race to put one in the hash NFSD: Put exports after nfsd4_layout_verify fail NFSD: Error out when register_shrinker() fail NFSD: Take care the return value from nfsd4_decode_stateid NFSD: Check layout type when returning client layouts NFSD: restore trace event lost in mismerge	2015-04-01 09:45:47 -07:00
David Howells	44ba06987c	RxRPC: Handle VERSION Rx protocol packets Handle VERSION Rx protocol packets. We should respond to a VERSION packet with a string indicating the Rx version. This is a maximum of 64 characters and is padded out to 65 chars with NUL bytes. Note that other AFS clients use the version request as a NAT keepalive so we need to handle it rather than returning an abort. The standard formulation seems to be: <project> <version> built <yyyy>-<mm>-<dd> for example: " OpenAFS 1.6.2 built 2013-05-07 " (note the three extra spaces) as obtained with: rxdebug grand.mit.edu -version from the openafs package. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 16:31:26 +01:00
David Howells	382d7974de	RxRPC: Use iov_iter_count() in rxrpc_send_data() instead of the len argument Use iov_iter_count() in rxrpc_send_data() to get the remaining data length instead of using the len argument as the len argument is now redundant. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 15:49:26 +01:00
David Howells	aab94830a7	RxRPC: Don't call skb_add_data() if there's no data to copy Don't call skb_add_data() in rxrpc_send_data() if there's no data to copy and also skip the calculations associated with it in such a case. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 15:48:00 +01:00
David Howells	3af6878eca	RxRPC: Fix the conversion to iov_iter This commit: commit `af2b040e47` Author: Al Viro <viro@zeniv.linux.org.uk> Date: Thu Nov 27 21:44:24 2014 -0500 Subject: rxrpc: switch rxrpc_send_data() to iov_iter primitives incorrectly changes a do-while loop into a while loop in rxrpc_send_data(). Unfortunately, at least one pass through the loop is required - even if there is no data - so that the packet the closes the send phase can be sent if MSG_MORE is not set. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 14:06:00 +01:00
Johannes Berg	788211d81b	mac80211: fix RX A-MPDU session reorder timer deletion There's an issue with the way the RX A-MPDU reorder timer is deleted that can cause a kernel crash like this: * tid_rx is removed - call_rcu(ieee80211_free_tid_rx) * station is destroyed * reorder timer fires before ieee80211_free_tid_rx() runs, accessing the station, thus potentially crashing due to the use-after-free The station deletion is protected by synchronize_net(), but that isn't enough -- ieee80211_free_tid_rx() need not have run when that returns (it deletes the timer.) We could use rcu_barrier() instead of synchronize_net(), but that's much more expensive. Instead, to fix this, add a field tracking that the session is being deleted. In this case, the only re-arming of the timer happens with the reorder spinlock held, so make that code not rearm it if the session is being deleted and also delete the timer after setting that field. This ensures the timer cannot fire after ___ieee80211_stop_rx_ba_session() returns, which fixes the problem. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 14:35:01 +02:00
Pablo Neira Ayuso	c5035c77f8	netfilter: nft_meta: fix cgroup matching We have to stop iterating on the rule expressions if the cgroup mismatches. Moreover, make sure a non-full socket from the input path leads us to a crash. Fixes: `ce67417` ("netfilter: nft_meta: add cgroup support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:33:00 +02:00
Oliver Hartkopp	a5581ef4c2	can: introduce new raw socket option to join the given CAN filters The CAN_RAW socket can set multiple CAN identifier specific filters that lead to multiple filters in the af_can.c filter processing. These filters are indenpendent from each other which leads to logical OR'ed filters when applied. This socket option joines the given CAN filters in the way that only CAN frames are passed to user space that matched all given CAN filters. The semantic for the applied filters is therefore changed to a logical AND. This is useful especially when the filterset is a combination of filters where the CAN_INV_FILTER flag is set in order to notch single CAN IDs or CAN ID ranges from the incoming traffic. As the raw_rcv() function is executed from NET_RX softirq the introduced variables are implemented as per-CPU variables to avoid extensive locking at CAN frame reception time. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-04-01 11:28:22 +02:00
Oliver Hartkopp	514ac99c64	can: fix multiple delivery of a single CAN frame for overlapping CAN filters The CAN_RAW socket can set multiple CAN identifier specific filters that lead to multiple filters in the af_can.c filter processing. These filters are indenpendent from each other which leads to logical OR'ed filters when applied. This patch makes sure that every CAN frame which is filtered for a specific socket is only delivered once to the user space. This is independent from the number of matching CAN filters of this socket. As the raw_rcv() function is executed from NET_RX softirq the introduced variables are implemented as per-CPU variables to avoid extensive locking at CAN frame reception time. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-04-01 11:27:41 +02:00
Daniel Borkmann	afb7718016	netfilter: x_tables: fix cgroup matching on non-full sks While originally only being intended for outgoing traffic, commit `a00e76349f` ("netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks") enabled xt_cgroups for the NF_INET_LOCAL_IN hook as well, in order to allow for nfacct accounting. Besides being currently limited to early demuxes only, commit `a00e76349f` forgot to add a check if we deal with full sockets, i.e. in this case not with time wait sockets. TCP time wait sockets do not have the same memory layout as full sockets, a lower memory footprint and consequently also don't have a sk_classid member; probing for sk_classid member there could potentially lead to a crash. Fixes: `a00e76349f` ("netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks") Cc: Alexey Perevalov <a.perevalov@samsung.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:26:42 +02:00
Thomas Huehn	9c00bb7210	mac80211: enhance readability of Minstrel-HTs rc_stats output This patch restructures the rc_stats debugfs table of Minstrel-HT in order to achieve better human readability. A new layout of the statistics and a new header is added. In addition to the old layout there are two new columns of information added: idx - representing the rate index of each rate in mac80211 which can be used to set specific rates as fixed rate via debugfs airtime - the tx-time in micro seconds that a 1200 Byte packet takes to be transmitted over the air at the given rate The old layout of rc_stats: type rate tpt eprob prob ret ok(*cum) ok( cum) HT20/LGI MCS0 5.6 100.0 100.0 1 0( 0) 1( 1) HT20/LGI B MCS1 10.5 100.0 100.0 0 0( 0) 1( 1) HT20/LGI A MCS2 14.8 100.0 100.0 0 0( 0) 1( 1) ... is changed into this new layout: best ________rate______ __statistics__ ________last_______ ______sum-of________ mode guard # rate [name idx airtime] [ ø(tp) ø(prob)] [prob.\|retry\|suc\|att] [#success \| #attempts] HT20 LGI 1 MCS0 0 1480 0.0 0.0 0.0 1 0 0 0 0 HT20 LGI 1 B MCS1 1 740 10.5 100.0 100.0 0 0 0 1 1 HT20 LGI 1 A MCS2 2 496 14.8 100.0 100.0 0 0 0 1 1 ... Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 11:22:39 +02:00
Thomas Huehn	e161f7f6c4	mac80211: enhance readability of Minstrels rc_stats output This patch restructures the rc_stats debugfs table of Minstrel in order to achieve better human readability. A new layout of the statistics and a new header is added. In addition to the old layout there are two new columns of information added: idx - representing the rate index of each rate in mac80211 which can be used to set specific rates as fixed rate via debugfs airtime - the tx-time in micro seconds that a 1200 Byte packet takes to be transmitted over the air at the given rate The old layout of rc_stats: rate tpt eprob prob ret ok(*cum) ok( cum) DP 1 0.9 93.5 100.0 1 0( 0) 2( 2) 2 0.4 40.0 100.0 0 0( 0) 4( 10) 5.5 0.0 0.0 0.0 0 0( 0) 0( 0) ... is changed into this new layout: best _______rate_____ __statistics__ ________last_______ ______sum-of________ rate [name idx tx-time] [ ø(tp) ø(prob)] [prob.\|retry\|suc\|att] [#success \| #attempts] DP 1 0 9738 0.9 93.5 100.0 1 1 1 2 2 2 1 4922 0.4 40.0 100.0 1 0 0 4 10 5.5 2 1858 0.0 0.0 0.0 2 0 0 0 0 ... Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 11:22:39 +02:00
Ilan peer	c37722bd19	cfg80211: Stop calling crda if it is not responsive Patch `eeca9fce1d` (cfg80211: Schedule timeout for all CRDA call) introduced a regression, where in case that crda is not installed (or not configured properly etc.), the regulatory core will needlessly continue to call it, polluting the log with the following log: "cfg80211: Calling CRDA to update world regulatory domain" Fix this by limiting the number of continuous CRDA request failures. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 11:22:38 +02:00
Patrick McHardy	9d0982927e	netfilter: nft_hash: add support for timeouts Add support for element timeouts to nft_hash. The lookup and walking functions are changed to ignore timed out elements, a periodic garbage collection task cleans out expired entries. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:49 +02:00
Patrick McHardy	6908665826	netfilter: nf_tables: add GC synchronization helpers GC is expected to happen asynchrously to the netlink interface. In the netlink path, both insertion and removal of elements consist of two steps, insertion followed by activation or deactivation followed by removal, during which the element must not be freed by GC. The synchronization helpers use an unused bit in the genmask field to atomically mark an element as "busy", meaning it is either currently being handled through the netlink API or by GC. Elements being processed by GC will never survive, netlink will simply ignore them. Elements being currently processed through netlink will be skipped by GC and reprocessed during the next run. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:29 +02:00
Patrick McHardy	cfed7e1b1f	netfilter: nf_tables: add set garbage collection helpers Add helpers for GC batch destruction: since element destruction needs a RCU grace period for all set implementations, add some helper functions for asynchronous batch destruction. Elements are collected in a batch structure, which is asynchronously released using RCU once its full. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:29 +02:00
Patrick McHardy	c3e1b005ed	netfilter: nf_tables: add set element timeout support Add API support for set element timeouts. Elements can have a individual timeout value specified, overriding the sets' default. Two new extension types are used for timeouts - the timeout value and the expiration time. The timeout value only exists if it differs from the default value. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:28 +02:00
Patrick McHardy	761da2935d	netfilter: nf_tables: add set timeout API support Add set timeout support to the netlink API. Sets with timeout support enabled can have a default timeout value and garbage collection interval specified. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:28 +02:00
Johannes Berg	7bedd0cfad	mac80211: use rhashtable for station table We currently have a hand-rolled table with 256 entries and are using the last byte of the MAC address as the hash. This hash is obviously very fast, but collisions are easily created and we waste a lot of space in the common case of just connecting as a client to an AP where we just have a single station. The other common case of an AP is also suboptimal due to the size of the hash table and the ease of causing collisions. Convert all of this to use rhashtable with jhash, which gives us the advantage of a far better hash function (with random perturbation to avoid hash collision attacks) and of course that the hash table grows and shrinks dynamically with chain length, improving both cases above. Use a specialised hash function (using jhash, but with fixed length) to achieve better compiler optimisation as suggested by Sergey Ryazanov. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 10:06:26 +02:00
Ying Xue	7e43690578	tipc: fix a slab object leak When remove TIPC module, there is a warning to remind us that a slab object is leaked like: root@localhost:~# rmmod tipc [ 19.056226] ============================================================================= [ 19.057549] BUG TIPC (Not tainted): Objects remaining in TIPC on kmem_cache_close() [ 19.058736] ----------------------------------------------------------------------------- [ 19.058736] [ 19.060287] INFO: Slab 0xffffea0000519a00 objects=23 used=1 fp=0xffff880014668b00 flags=0x100000000004080 [ 19.061915] INFO: Object 0xffff880014668000 @offset=0 [ 19.062717] kmem_cache_destroy TIPC: Slab cache still has objects This is because the listening socket of TIPC topology server is not closed before TIPC proto handler is unregistered with proto_unregister(). However, as the socket is closed in tipc_exit_net() which is called by unregister_pernet_subsys() during unregistering TIPC namespace operation, the warning can be eliminated if calling unregister_pernet_subsys() is moved before calling proto_unregister(). Fixes: `e05b31f4bf` ("tipc: make tipc socket support net namespace") Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 23:10:08 -04:00
David S. Miller	7b6249bba9	Lots of updates for net-next; along with the usual flurry of small fixes, cleanups and internal features we have: * VHT support for TDLS and IBSS (conditional on drivers though) * first TX performance improvements (the biggest will come later) * many suspend/resume (race) fixes * name_assign_type support from Tom Gundersen -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVGWlNAAoJEDBSmw7B7bqr5s8P/R9+Q6y4Ixice9dDOJYynl/d dMEUEfCBWUyDaQD1bNQIED2mc0sM5+Ax8OVIVx9fdrLGPxaISBqDJKE1USoTNZzm q+U3dM4Q45SfQSsgaY4FtxTlPWPtUKsNMXY/CxLR+IqVeA3+30rX+hv1f3SAqBj0 68IwW/uUIHb71IZ+hz2Mwudt4JVW8KRg9VlZ0UY6EEvC4m5QD2YkwQQo/hJ2WF+/ wAJbku02L/Vy4J7E6hNcKYWXokht4fVYphjl/1ZDd/+8L8SUv9mC88n1Jzxa428p 1PmbtwzbpOrtTcC2BCPDA04IyfMc7k9DlLKw/h2KLPbHZXheD9eVmo/Am5vz+uH6 926f+FK339SzoJnZ5wBBDiZ8W8TLYNc8ImxtcxjnrtGfr1CKiuh23P1CWyOlKJCO BYFJqkCOqWOHYnN0embaj7JqM/LmQI5ZoBZHZhD2KQXIXpTsjjIMPfJvip5D+tsV +iXIlQwdeK6rbjxdonBgn7n57XIeSVMAYeyDCbzIShfibjHbUZPn+RsZCtv8RWv8 EaZu8PerU5ZDKwdX940+lWrtf/TMDJBYQpAIBRuiZK4DTNWCt3BrDlvb1FXGgA+X vQJnr32vjJ/pLDxNLHQlkKWC4I/CYtG47OgcJN9AQXrig1zApd+C29zy3aqch3ea wxV9dFfheTqZFjtZfSsH =O/cf -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Lots of updates for net-next; along with the usual flurry of small fixes, cleanups and internal features we have: * VHT support for TDLS and IBSS (conditional on drivers though) * first TX performance improvements (the biggest will come later) * many suspend/resume (race) fixes * name_assign_type support from Tom Gundersen ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:39:04 -04:00
Jiri Pirko	fbcb217059	net: rename dev to orig_dev in deliver_ptype_list_skb Unlike other places, this function uses name "dev" for what should be "orig_dev", which might be a bit confusing. So fix this. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:37:43 -04:00
Eugene Crosser	ed4ac42217	af_iucv: fix AF_IUCV sendmsg() errno When sending over AF_IUCV socket, errno was incorrectly set to ENOMEM even when other values where appropriate, notably EAGAIN. With this patch, error indicator returned by sock_alloc_send_skb() is passed to the caller, rather than being overwritten with ENOMEM. Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com> Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:06:50 -04:00
Thomas Graf	fa2d8ff4e3	openvswitch: Return vport module ref before destruction Return module reference before invoking the respective vport ->destroy() function. This is needed as ovs_vport_del() is not invoked inside an RCU read side critical section so the kfree can occur immediately before returning to ovs_vport_del(). Returning the module reference before ->destroy() is safe because the module unregistration is blocked on ovs_lock which we hold while destroying the datapath. Fixes: `62b9c8d037` ("ovs: Turn vports with dependencies into separate modules") Reported-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 15:59:50 -04:00
Jeff Layton	f9c72d10d6	sunrpc: make debugfs file creation failure non-fatal We currently have a problem that SELinux policy is being enforced when creating debugfs files. If a debugfs file is created as a side effect of doing some syscall, then that creation can fail if the SELinux policy for that process prevents it. This seems wrong. We don't do that for files under /proc, for instance, so Bruce has proposed a patch to fix that. While discussing that patch however, Greg K.H. stated: "No kernel code should care / fail if a debugfs function fails, so please fix up the sunrpc code first." This patch converts all of the sunrpc debugfs setup code to be void return functins, and the callers to not look for errors from those functions. This should allow rpc_clnt and rpc_xprt creation to work, even if the kernel fails to create debugfs files for some reason. Symptoms were failing krb5 mounts on systems using gss-proxy and selinux. Fixes: `388f0c7767` "sunrpc: add a debugfs rpc_xprt directory..." Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-03-31 14:15:08 -04:00
Jiri Benc	67b61f6c13	netlink: implement nla_get_in_addr and nla_get_in6_addr Those are counterparts to nla_put_in_addr and nla_put_in6_addr. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Jiri Benc	930345ea63	netlink: implement nla_put_in_addr and nla_put_in6_addr IP addresses are often stored in netlink attributes. Add generic functions to do that. For nla_put_in_addr, it would be nicer to pass struct in_addr but this is not used universally throughout the kernel, in way too many places __be32 is used to store IPv4 address. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Jiri Benc	15e318bdc6	xfrm: simplify xfrm_address_t use In many places, the a6 field is typecasted to struct in6_addr. As the fields are in union anyway, just add in6_addr type to the union and get rid of the typecasting. Modifying the uapi header is okay, the union has still the same size. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Jiri Benc	8f55db4860	tcp: simplify inetpeer_addr_base use In many places, the a6 field is typecasted to struct in6_addr. As the fields are in union anyway, just add in6_addr type to the union and get rid of the typecasting. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Ian Morris	53b24b8f94	ipv6: coding style: comparison for inequality with NULL The ipv6 code uses a mixture of coding styles. In some instances check for NULL pointer is done as x != NULL and sometimes as x. x is preferred according to checkpatch and this patch makes the code consistent by adopting the latter form. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:51:54 -04:00
Ian Morris	63159f29be	ipv6: coding style: comparison for equality with NULL The ipv6 code uses a mixture of coding styles. In some instances check for NULL pointer is done as x == NULL and sometimes as !x. !x is preferred according to checkpatch and this patch makes the code consistent by adopting the latter form. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:51:54 -04:00
Alexander Duyck	6e47d6caff	fib_trie: Cleanup ip_fib_net_exit code path While fixing a recent issue I noticed that we are doing some unnecessary work inside the loop for ip_fib_net_exit. As such I am pulling out the initialization to NULL for the locally stored fib_local, fib_main, and fib_default. In addition I am restoring the original code for flushing the table as there is no need to split up the fib_table_flush and hlist_del work since the code for packing the tnodes with multiple key vectors was dropped. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:18:56 -04:00
Alexander Duyck	ad88d05136	fib_trie: Fix warning on fib4_rules_exit This fixes the following warning: BUG: sleeping function called from invalid context at mm/slub.c:1268 in_atomic(): 1, irqs_disabled(): 0, pid: 6, name: kworker/u8:0 INFO: lockdep is turned off. CPU: 3 PID: 6 Comm: kworker/u8:0 Tainted: G W 4.0.0-rc5+ #895 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: netns cleanup_net 0000000000000006 ffff88011953fa68 ffffffff81a203b6 000000002c3a2c39 ffff88011952a680 ffff88011953fa98 ffffffff8109daf0 ffff8801186c6aa8 ffffffff81fbc9e5 00000000000004f4 0000000000000000 ffff88011953fac8 Call Trace: [<ffffffff81a203b6>] dump_stack+0x4c/0x65 [<ffffffff8109daf0>] ___might_sleep+0x1c3/0x1cb [<ffffffff8109db70>] __might_sleep+0x78/0x80 [<ffffffff8117a60e>] slab_pre_alloc_hook+0x31/0x8f [<ffffffff8117d4f6>] __kmalloc+0x69/0x14e [<ffffffff818ed0e1>] ? kzalloc.constprop.20+0xe/0x10 [<ffffffff818ed0e1>] kzalloc.constprop.20+0xe/0x10 [<ffffffff818ef622>] fib_trie_table+0x27/0x8b [<ffffffff818ef6bd>] fib_trie_unmerge+0x37/0x2a6 [<ffffffff810b06e1>] ? arch_local_irq_save+0x9/0xc [<ffffffff818e9793>] fib_unmerge+0x2d/0xb3 [<ffffffff818f5f56>] fib4_rule_delete+0x1f/0x52 [<ffffffff817f1c3f>] ? fib_rules_unregister+0x30/0xb2 [<ffffffff817f1c8b>] fib_rules_unregister+0x7c/0xb2 [<ffffffff818f64a1>] fib4_rules_exit+0x15/0x18 [<ffffffff818e8c0a>] ip_fib_net_exit+0x23/0xf2 [<ffffffff818e91f8>] fib_net_exit+0x32/0x36 [<ffffffff817c8352>] ops_exit_list+0x45/0x57 [<ffffffff817c8d3d>] cleanup_net+0x13c/0x1cd [<ffffffff8108b05d>] process_one_work+0x255/0x4ad [<ffffffff8108af69>] ? process_one_work+0x161/0x4ad [<ffffffff8108b4b1>] worker_thread+0x1cd/0x2ab [<ffffffff8108b2e4>] ? process_scheduled_works+0x2f/0x2f [<ffffffff81090686>] kthread+0xd4/0xdc [<ffffffff8109ec8f>] ? local_clock+0x19/0x22 [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83 [<ffffffff81a2c0c8>] ret_from_fork+0x58/0x90 [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83 The issue was that as a part of exiting the default rules were being deleted which resulted in the local trie being unmerged. By moving the freeing of the FIB tables up we can avoid the unmerge since there is no local table left when we call the fib4_rules_exit function. Fixes: `0ddcf43d5d` ("ipv4: FIB Local/MAIN table collapse") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:18:56 -04:00
David S. Miller	89a3f3c55b	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-03-27 Here's another set of Bluetooth & 802.15.4 patches for 4.1: - New API to control LE advertising data (i.e. peripheral role) - mac802154 & at86rf230 cleanups - Support for toggling quirks from debugfs (useful for testing) - Memory leak fix for LE scanning - Extra version info reading support for Broadcom controllers Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 12:45:58 -04:00
Chuck Lever	d654788e98	xprtrdma: Make rpcrdma_{un}map_one() into inline functions These functions are called in a loop for each page transferred via RDMA READ or WRITE. Extract loop invariants and inline them to reduce CPU overhead. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	e46ac34c3c	xprtrdma: Handle non-SEND completions via a callout Allow each memory registration mode to plug in a callout that handles the completion of a memory registration operation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	3968cb5850	xprtrdma: Add "open" memreg op The open op determines the size of various transport data structures based on device capabilities and memory registration mode. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	4561f347d4	xprtrdma: Add "destroy MRs" memreg op Memory Region objects associated with a transport instance are destroyed before the instance is shutdown and destroyed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	31a701a947	xprtrdma: Add "reset MRs" memreg op This method is invoked when a transport instance is about to be reconnected. Each Memory Region object is reset to its initial state. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	91e70e70e4	xprtrdma: Add "init MRs" memreg op This method is used when setting up a new transport instance to create a pool of Memory Region objects that will be used to register memory during operation. Memory Regions are not needed for "physical" registration, since ->prepare and ->release are no-ops for that mode. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	6814baead8	xprtrdma: Add a "deregister_external" op for each memreg mode There is very little common processing among the different external memory deregistration functions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	9c1b4d775f	xprtrdma: Add a "register_external" op for each memreg mode There is very little common processing among the different external memory registration functions. Have rpcrdma_create_chunks() call the registration method directly. This removes a stack frame and a switch statement from the external registration path. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	1c9351ee0e	xprtrdma: Add a "max_payload" op for each memreg mode The max_payload computation is generalized to ensure that the payload maximum is the lesser of RPC_MAX_DATA_SEGS and the number of data segments that can be transmitted in an inline buffer. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	a0ce85f595	xprtrdma: Add vector of ops for each memory registration strategy Instead of employing switch() statements, let's use the typical Linux kernel idiom for handling behavioral variation: virtual functions. Start by defining a vector of operations for each supported memory registration mode, and by adding a source file for each mode. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	41f9702896	xprtrdma: Prevent infinite loop in rpcrdma_ep_create() If a provider advertizes a zero max_fast_reg_page_list_len, FRWR depth detection loops forever. Instead of just failing the mount, try other memory registration modes. Fixes: `0fc6c4e7bb` ("xprtrdma: mind the device's max fast . . .") Reported-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	805272406a	xprtrdma: Byte-align FRWR registration The RPC/RDMA transport's FRWR registration logic registers whole pages. This means areas in the first and last pages that are not involved in the RDMA I/O are needlessly exposed to the server. Buffered I/O is typically page-aligned, so not a problem there. But for direct I/O, which can be byte-aligned, and for reply chunks, which are nearly always smaller than a page, the transport could expose memory outside the I/O buffer. FRWR allows byte-aligned memory registration, so let's use it as it was intended. Reported-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	e23779451e	xprtrdma: Perform a full marshal on retransmit Commit `6ab59945f2` ("xprtrdma: Update rkeys after transport reconnect" added logic in the ->send_request path to update the chunk list when an RPC/RDMA request is retransmitted. Note that rpc_xdr_encode() resets and re-encodes the entire RPC send buffer for each retransmit of an RPC. The RPC send buffer is not preserved from the previous transmission of an RPC. Revert `6ab59945f2`, and instead, just force each request to be fully marshaled every time through ->send_request. This should preserve the fix from `6ab59945f2`, while also performing pullup during retransmits. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	0dd39cae26	xprtrdma: Display IPv6 addresses and port numbers correctly Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Johan Hedberg	db6e3e8d01	Bluetooth: Refactor HCI request variables into own struct In order to shrink the size of bt_skb_cb, this patch moves the HCI request related variables into their own req_ctrl struct. Additionall the L2CAP and HCI request structs are placed inside the same union since they will never be used at the same time for the same skb. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-30 23:20:53 +02:00
Johan Hedberg	a4368ff3ed	Bluetooth: Refactor L2CAP variables into l2cap_ctrl We're getting very close to the maximum possible size of bt_skb_cb. To prepare to shrink the struct with the help of a union this patch moves all L2CAP related variables into the l2cap_ctrl struct. To later add other 'ctrl' structs the L2CAP one is renamed simple 'l2cap' instead of 'control'. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-30 23:20:53 +02:00
Johannes Berg	2c44be81f0	mac80211: set QoS capability before changing station state In the upcoming fast-xmit patch, changing station state will build a header cache based on the station's capabilities, and as the QoS capability (sta.wme) impacts the header, it needs to be set before. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 15:11:01 +02:00
Arik Nemtsov	070e176a75	mac80211: send HT/VHT IEs in TDLS discovery response These are mandated by IEEE802.11-2012 section 8.5.8.6 and IEEE802.11ac-2013 section 8.5.8.16. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:48:59 +02:00
Janusz.Dziedzic@tieto.com	abcff6ef01	mac80211: add VHT support for IBSS Add VHT support for IBSS. Drivers could activate this feature by setting NL80211_EXT_FEATURE_VHT_IBSS flag. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:48:26 +02:00
Janusz.Dziedzic@tieto.com	76bed0f43b	mac80211: IBSS fix scan request In case of wide bandwidth (wider than 20MHz) used by IBSS, scan all channels in chandef to be able to find neighboring IBSS netwqworks that use the same overall channels but a different control channel. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:56 +02:00
Johannes Berg	97ffe75791	mac80211: factor out station lookup from ieee80211_build_hdr() In order to look up the RA station earlier to implement a TX fastpath, factor out the lookup from ieee80211_build_hdr(). To always have a valid station pointer, also move some of the checks into the new function. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:19 +02:00
Johannes Berg	527871d720	mac80211: make sta.wme indicate whether QoS is used Indicating just the peer's capability is fairly pointless if the local device doesn't support it. Make the variable track both combined, and remove the 'local support' check in the TX path. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:16 +02:00
Johannes Berg	a73f8e21f3	mac80211: send AP probe as unicast again Louis reported that a static checker was complaining that the 'dst' variable was set (multiple times) but not used. This is due to a previous commit having removed the usage (apparently erroneously), so add it back. Fixes: `a344d6778a` ("mac80211: allow drivers to support NL80211_SCAN_FLAG_RANDOM_ADDR") Reported-by: Louis Langholtz <lou_langholtz@me.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:14 +02:00
Johannes Berg	07862e13e6	mac80111: aes_gcm: clean up ieee80211_aes_gcm_key_setup_encrypt() This code is written using an anti-pattern called "success handling" which makes it hard to read, especially if you are used to normal kernel style. It should instead be written as a list of directives in a row with branches for error handling. (Basically copied from Dan's previous patch for CCM) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:42:02 +02:00
Dan Carpenter	45fd63293a	mac80111: aes_ccm: cleanup ieee80211_aes_key_setup_encrypt() This code is written using an anti-pattern called "success handling" which makes it hard to read, especially if you are used to normal kernel style. It should instead be written as a list of directives in a row with branches for error handling. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:40:03 +02:00
Jouni Malinen	82ca6ef686	mac80211: Fix misplaced return in AES-GMAC key setup Commit `8ade538bf3` ("mac80111: Add BIP-GMAC-128 and BIP-GMAC-256 ciphers") had the success return in incorrect place before the crypto_aead_setauthsize() call which practically ended up skipping that call unconditionally. The missing call did not actually change any functionality since GMAC_MIC_LEN (16) is identical to the maxauthsize in gcm(aes) and as such, the default value used for the authsize parameter. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:38:07 +02:00
Tom Gundersen	6bab2e19c5	cfg80211: pass name_assign_type to rdev_add_virtual_intf() This will expose in /sys whether the ifname of a device is set by userspace or generated by the kernel. The latter kind (wlanX, etc) is not deterministic, so userspace needs to rename these devices to names that are guaranteed to stay the same between reboots. The former, however should never be renamed, so userspace needs to be able to reliably tell the difference. Similar functionality was introduced for the rtnetlink core in commit `5517750f05` ("net: rtnetlink - make create_link take name_assign_type") Signed-off-by: Tom Gundersen <teg@jklm.no> Cc: Kalle Valo <kvalo@qca.qualcomm.com> Cc: Brett Rudley <brudley@broadcom.com> Cc: Arend van Spriel <arend@broadcom.com> Cc: Franky (Zhenhui) Lin <frankyl@broadcom.com> Cc: Hante Meuleman <meuleman@broadcom.com> Cc: Johannes Berg <johannes@sipsolutions.net> [reformat changelog to fit 72 cols] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:36:17 +02:00
Michael Braun	6a8b4adb47	mac80211: fix typo in debug output Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:34:13 +02:00
Johannes Berg	8f9c77fc1e	mac80211: reject aggregation sessions with non-HT peers If a peer or some local agent (rate control, ...) decides to start an aggregation session but doesn't support HT (which also implies QoS), reject it. This is mostly a corner case as such peers normally won't try to use block-ack sessions and rate control wouldn't start them, but technically QoS stations could request it according to the spec. However, since drivers don't really support such non-HT sessions it's better to reject them. Also, while at it, move the tracing for TX sessions earlier so it captures the error cases as well. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:33:19 +02:00
Arik Nemtsov	a38700dd48	cfg/mac80211: add regulatory classes IE during TDLS setup Seems Broadcom TDLS peers (Nexus 5, Xperia Z3) refuse to allow TDLS connection when channel-switching is supported but the regulatory classes IE is missing from the setup request. Add a chandef to reg-class translation function to cfg80211 and use it to add the required IE during setup. For now add only the current regulatory class as supported - it is enough to resolve the compatibility issue. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:26:36 +02:00
David Spinadel	7d830a1986	mac80211: stop scan before connection Stop scan before authentication or association to make sure that nothing interferes with connection flow. Currently mac80211 defers RX auth and assoc packets (among other ones) until after the scan is complete, so auth during scan is likely to fail if scan took too much time. Signed-off-by: David Spinadel <david.spinadel@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:19:13 +02:00
Luciano Coelho	21fea56731	nl80211: add net-detect delay to wowlan info Pass the initial net-detect delay (NL80211_ATTR_SCHED_SCAN_DELAY) attribute in the WoWLAN info response. Additionally, remove a bogus TODO comment. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:18:12 +02:00
Emmanuel Grumbach	a90faa9d64	mac80211: notify the driver about deauth This can allow the driver to take action based on the reason of the deauth. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:12 +02:00
Emmanuel Grumbach	d0d1a12f9c	mac80211: notify the driver about association status This can allow the driver to take action based on the success / failure of the association. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:11 +02:00
Emmanuel Grumbach	a9409093d2	mac80211: notify the driver about authentication status This can allow the driver to take action based on the success / failure of the authentication. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:10 +02:00
Emmanuel Grumbach	a818292952	mac80211: convert rssi_callback() to event_callback() We will be able to add more events, such as MLME events and others. The low level driver may be interested in knowing about these events to dump firmware data upon failures, or to change parameters in case connection attempts fail etc... Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:09 +02:00
Johannes Berg	2c158887f1	mac80211: agg-tx: avoid sending DelBA with sta->lock held The rate control locking caused a potential deadlock here due to the locks being acquired in different orders, so that change cannot yet be applied. However, there's no fundamental reason for this code to hold the sta->lock while transmitting frames. Clearly it's better not to hold the lock for longer periods of time, which can happen here since we call all the way down to the driver. Change the code a bit to not hold it while doing that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 09:46:52 +02:00
Jon Paul Maloy	d482994fca	tipc: fix two bugs in secondary destination lookup A message sent to a node after a successful name table lookup may still find that the destination socket has disappeared, because distribution of name table updates is non-atomic. If so, the message will be rejected back to the sender with error code TIPC_ERR_NO_PORT. If the source socket of the message has disappeared in the meantime, the message should be dropped. However, in the currrent code, the message will instead be subject to an unwanted tertiary lookup, because the function tipc_msg_lookup_dest() doesn't check if there is an error code present in the message before performing the lookup. In the worst case, the message may now find the old destination again, and be redirected once more, instead of being dropped directly as it should be. A second bug in this function is that the "prev_node" field in the message is not updated after successful lookup, something that may have unpredictable consequences. The problems arising from those bugs occur very infrequently. The third change in this function; the test on msg_reroute_msg_cnt() is purely cosmetic, reflecting that the returned value never can be negative. This commit corrects the two bugs described above. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:47:36 -07:00
Alexey Kodanev	4ad19de877	net: tcp6: fix double call of tcp_v6_fill_cb() tcp_v6_fill_cb() will be called twice if socket's state changes from TCP_TIME_WAIT to TCP_LISTEN. That can result in control buffer data corruption because in the second tcp_v6_fill_cb() call it's not copying IP6CB(skb) anymore, but 'seq', 'end_seq', etc., so we can get weird and unpredictable results. Performance loss of up to 1200% has been observed in LTP/vxlan03 test. This can be fixed by copying inet6_skb_parm to the beginning of 'cb' only if xfrm6_policy_check() and tcp_v6_fill_cb() are going to be called again. Fixes: `2dc49d1680` ("tcp6: don't move IP6CB before xfrm6_policy_check()") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:36:05 -07:00
Toshiaki Makita	e38f30256b	net: Introduce passthru_features_check As there are a number of (especially virtual) devices that don't need the multiple vlan check, introduce passthru_features_check() for convenience. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:22 -07:00
Toshiaki Makita	8cb65d0008	net: Move check for multiple vlans to drivers To allow drivers to handle the features check for multiple tags, move the check to ndo_features_check(). As no drivers currently handle multiple tagged TSO, introduce dflt_features_check() and call it if the driver does not have ndo_features_check(). Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:22 -07:00
Toshiaki Makita	f5a7fb88e1	vlan: Introduce helper functions to check if skb is tagged Separate the two checks for single vlan and multiple vlans in netif_skb_features(). This allows us to move the check for multiple vlans to another function later. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:22 -07:00
Toshiaki Makita	8d463504c1	vlan: Add features for stacked vlan device Stacked vlan devices curretly have few features (GRO, HIGHDMA, LLTX). Since we have software fallbacks in case the NIC can not handle some features for multiple vlans, we can add the same features as the lower vlan devices for stacked vlan devices. This allows stacked vlan devices to create large (GSO) packets and not to segment packets. Those packets will be segmented by software on the real device, or even can be segmented by the NIC once TSO for multiple vlans becomes enabled by the following patches. The exception is those related to FCoE, which does not have a software fallback. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:21 -07:00
Alexei Starovoitov	608cd71a9c	tc: bpf: generalize pedit action existing TC action 'pedit' can munge any bits of the packet. Generalize it for use in bpf programs attached as cls_bpf and act_bpf via bpf_skb_store_bytes() helper function. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:26:54 -07:00
Guenter Roeck	339d82626d	net: dsa: Add basic framework to support ndo_fdb functions Provide callbacks for ndo_fdb_add, ndo_fdb_del, and ndo_fdb_dump. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:23:54 -07:00
Nicolas Dichtel	4217291e59	netns: don't clear nsid too early on removal With the current code, ids are removed too early. Suppose you have an ipip interface that stands in the netns foo and its link part in the netns bar (so the netns bar has an nsid into the netns foo). Now, you remove the netns bar: - the bar nsid into the netns foo is removed - the netns exit method of ipip is called, thus our ipip iface is removed: => a netlink message is sent in the netns foo to advertise this deletion => this netlink message requests an nsid for bar, thus a new nsid is allocated for bar and never removed. We must remove nsids when we are sure that nobody will refer to netns currently cleaned. Fixes: `0c7aecd4bd` ("netns: add rtnl cmd to add and get peer netns ids") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:58:21 -07:00
David S. Miller	4ef295e047	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. Basically, nf_tables updates to add the set extension infrastructure and finish the transaction for sets from Patrick McHardy. More specifically, they are: 1) Move netns to basechain and use recently added possible_net_t, from Patrick McHardy. 2) Use LOGLEVEL_<FOO> from nf_log infrastructure, from Joe Perches. 3) Restore nf_log_trace that was accidentally removed during conflict resolution. 4) nft_queue does not depend on NETFILTER_XTABLES, starting from here all patches from Patrick McHardy. 5) Use raw_smp_processor_id() in nft_meta. Then, several patches to prepare ground for the new set extension infrastructure: 6) Pass object length to the hash callback in rhashtable as needed by the new set extension infrastructure. 7) Cleanup patch to restore struct nft_hash as wrapper for struct rhashtable 8) Another small source code readability cleanup for nft_hash. 9) Convert nft_hash to rhashtable callbacks. And finally... 10) Add the new set extension infrastructure. 11) Convert the nft_hash and nft_rbtree sets to use it. 12) Batch set element release to avoid several RCU grace period in a row and add new function nft_set_elem_destroy() to consolidate set element release. 13) Return the set extension data area from nft_lookup. 14) Refactor existing transaction code to add some helper functions and document it. 15) Complete the set transaction support, using similar approach to what we already use, to activate/deactivate elements in an atomic fashion. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:43:43 -07:00
Ying Xue	8a0f6ebe84	tipc: involve reference counter for node structure TIPC node hash node table is protected with rcu lock on read side. tipc_node_find() is used to look for a node object with node address through iterating the hash node table. As the entire process of what tipc_node_find() traverses the table is guarded with rcu read lock, it's safe for us. However, when callers use the node object returned by tipc_node_find(), there is no rcu read lock applied. Therefore, this is absolutely unsafe for callers of tipc_node_find(). Now we introduce a reference counter for node structure. Before tipc_node_find() returns node object to its caller, it first increases the reference counter. Accordingly, after its caller used it up, it decreases the counter again. This can prevent a node being used by one thread from being freed by another thread. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:40:28 -07:00
Ying Xue	b952b2befb	tipc: fix potential deadlock when all links are reset [ 60.988363] ====================================================== [ 60.988754] [ INFO: possible circular locking dependency detected ] [ 60.989152] 3.19.0+ #194 Not tainted [ 60.989377] ------------------------------------------------------- [ 60.989781] swapper/3/0 is trying to acquire lock: [ 60.990079] (&(&n_ptr->lock)->rlock){+.-...}, at: [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.990743] [ 60.990743] but task is already holding lock: [ 60.991106] (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.991738] [ 60.991738] which lock already depends on the new lock. [ 60.991738] [ 60.992174] [ 60.992174] the existing dependency chain (in reverse order) is: [ 60.992174] -> #1 (&(&bclink->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] [<ffffffffa0000f57>] tipc_bclink_add_node+0x97/0xf0 [tipc] [ 60.992174] [<ffffffffa0011815>] tipc_node_link_up+0xf5/0x110 [tipc] [ 60.992174] [<ffffffffa0007783>] link_state_event+0x2b3/0x4f0 [tipc] [ 60.992174] [<ffffffffa00193c0>] tipc_link_proto_rcv+0x24c/0x418 [tipc] [ 60.992174] [<ffffffffa0008857>] tipc_rcv+0x827/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] -> #0 (&(&n_ptr->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a8f7d>] __lock_acquire+0x163d/0x1ca0 [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.992174] [<ffffffffa0001e11>] tipc_bclink_rcv+0x611/0x640 [tipc] [ 60.992174] [<ffffffffa0008646>] tipc_rcv+0x616/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] [ 60.992174] other info that might help us debug this: [ 60.992174] [ 60.992174] Possible unsafe locking scenario: [ 60.992174] [ 60.992174] CPU0 CPU1 [ 60.992174] ---- ---- [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] [ 60.992174] * DEADLOCK * [ 60.992174] [ 60.992174] 3 locks held by swapper/3/0: [ 60.992174] #0: (rcu_read_lock){......}, at: [<ffffffff81646791>] __netif_receive_skb_core+0x71/0x980 [ 60.992174] #1: (rcu_read_lock){......}, at: [<ffffffffa0002c35>] tipc_l2_rcv_msg+0x5/0xd0 [tipc] [ 60.992174] #2: (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] The correct the sequence of grabbing n_ptr->lock and bclink->lock should be that the former is first held and the latter is then taken, which exactly happened on CPU1. But especially when the retransmission of broadcast link is failed, bclink->lock is first held in tipc_bclink_rcv(), and n_ptr->lock is taken in link_retransmit_failure() called by tipc_link_retransmit() subsequently, which is demonstrated on CPU0. As a result, deadlock occurs. If the order of holding the two locks happening on CPU0 is reversed, the deadlock risk will be relieved. Therefore, the node lock taken in link_retransmit_failure() originally is moved to tipc_bclink_rcv() so that it's obtained before bclink lock. But the precondition of the adjustment of node lock is that responding to bclink reset event must be moved from tipc_bclink_unlock() to tipc_node_unlock(). Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:40:27 -07:00
Eric Dumazet	41d25fe092	tcp: tcp_syn_flood_action() can be static After commit `1fb6f159fd` ("tcp: add tcp_conn_request"), tcp_syn_flood_action() is no longer used from IPv6. We can make it static, by moving it above tcp_conn_request() Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:17:18 -07:00
WANG Cong	f243e5a785	ipmr,ip6mr: call ip6mr_free_table() on failure path Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:13:54 -07:00
WANG Cong	85b9909272	fib6: install fib6 ops in the last step We should not commit the new ops until we finish all the setup, otherwise we have to NULL it on failure. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:12:37 -07:00
Marcel Holtmann	20fa110a54	Bluetooth: Remove superfluous extra empty line between functions Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-29 07:57:03 +03:00
Marcel Holtmann	57b0d3e8e7	Bluetooth: Fix error returns for Read Local OOB Extended Data commands The Read Local OOB Extended Data commands are required to return the address type and the data length at least. However currently the error returns only the address type. To fix this and avoid any extra allocations or stack memory, rearrange the code so that the same path can be used for error returns. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-29 07:57:02 +03:00
Marcel Holtmann	efcd8c98e0	Bluetooth: Move memory location outside of hci_dev lock Taking the hci_dev lock for just a memory allocation seems a bit too much and not really needed. So instead try to allocate the memory first and then take the lock. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-29 07:57:00 +03:00
Arman Uguray	880897d4c9	Bluetooth: Update adv. parameters when conn. setting changes This patch fixes a bug where the advertising parameters weren't updated after a call to "Set Connectable" if the HCI_ADVERTISING_INSTANCE setting was set. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 21:31:57 +01:00
Arman Uguray	c7d4883b06	Bluetooth: Use ADV_SCAN_IND for adv. instances With this patch, ADV_SCAN_IND will be used for advertising instances that have non-zero scan response data while the global "connectable" setting is "off". Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 21:31:57 +01:00
Arman Uguray	faccb950f7	Bluetooth: Fix using global connectable settings for adv This patch fixes a bug where ADV_NONCONN_IND was being used for advertising instances >0 while the global connectable setting was set to "on". Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 21:31:57 +01:00
Johan Hedberg	600b21507e	Bluetooth: Fix race condition with HCI_RESET flag During the HCI init phase a completed request might be the last part of the setup procedure after which the actual init procedure starts. The init procedure begins with a call to hci_reset_req() which sets the HCI_RESET flag. The purpose of this flag is to make us ignore any updates to ncmd/cmd_cnt as long as we haven't received the command complete event for the HCI_Reset. There's a potential race with this however: hci_req_cmd_complete(hdev, opcode, status); if (ev->ncmd && !test_bit(HCI_RESET, &hdev->flags)) { atomic_set(&hdev->cmd_cnt, 1); if (!skb_queue_empty(&hdev->cmd_q)) queue_work(hdev->workqueue, &hdev->cmd_work); } Since the hci_req_cmd_complete() will trigger the completion of the setup stage, it's possible that hci_reset_req() gets called before we try to read ev->ncmd and the HCI_RESET flag. Because of this the cmd_cnt would never be updated and the hci_reset_req() in practice ends up blocking itself. This patch fixes the issue by updating cmd_cnt before notifying the request completion, and then reading it again to determine whether the cmd_work should be queued or not. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 20:05:11 +01:00
Alexander Aring	8bf9538a5d	mac802154: cleanup concurrent check This patch cleanups the checking of different mac phy depended values by handling depended mac settings per hw support flag in one condition. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-27 19:18:50 +01:00
Trond Myklebust	0695314ef0	SUNRPC: Fix a regression when reconnecting If the task needs to give up the socket lock in order to allow a reconnect to occur, then it must also clear the 'rq_bytes_sent' field so that when it retransmits, it knows to start from the beginning. Fixes: `718ba5b873` ("SUNRPC: Add helpers to prevent socket create from racing") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-03-27 12:24:36 -04:00
Patrick McHardy	cc02e457bb	netfilter: nf_tables: implement set transaction support Set elements are the last object type not supporting transaction support. Implement similar to the existing rule transactions: The global transaction counter keeps track of two generations, current and next. Each element contains a bitmask specifying in which generations it is inactive. New elements start out as inactive in the current generation and active in the next. On commit, the previous next generation becomes the current generation and the element becomes active. The bitmask is then cleared to indicate that the element is active in all future generations. If the transaction is aborted, the element is removed from the set before it becomes active. When removing an element, it gets marked as inactive in the next generation. On commit the next generation becomes active and the therefor the element inactive. It is then taken out of then set and released. On abort, the element is marked as active for the next generation again. Lookups ignore elements not active in the current generation. The current set types (hash/rbtree) both use a field in the extension area to store the generation mask. This (currently) does not require any additional memory since we have some free space in there. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:35 +01:00
Patrick McHardy	ea4bd995b0	netfilter: nf_tables: add transaction helper functions Add some helper functions for building the genmask as preparation for set transactions. Also add a little documentation how this stuff actually works. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:35 +01:00
Patrick McHardy	b2832dd662	netfilter: nf_tables: return set extensions from ->lookup() Return the extension area from the ->lookup() function to allow to consolidate common actions. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:34 +01:00
Patrick McHardy	61edafbb47	netfilter: nf_tables: consolide set element destruction With the conversion to set extensions, it is now possible to consolidate the different set element destruction functions. The set implementations' ->remove() functions are changed to only take the element out of their internal data structures. Elements will be freed in a batched fashion after the global transaction's completion RCU grace period. This reduces the amount of grace periods required for nft_hash from N to zero additional ones, additionally this guarantees that the set elements' extensions of all implementations can be used under RCU protection. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:34 +01:00
Clément Perrochaud	25af01ed18	NFC: nci: Add firmware download support A simple forward for firmware download (i.e. sending a new firmware to the NFC adapter) from the NFC subsystem to the drivers. This feature is required to update the firmware of NXP-NCI NFC controllers but can be used by any NCI driver. This feature has been present in the HCI subsystem since 9a695d. Signed-off-by: Clément Perrochaud <clement.perrochaud@effinnov.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-03-26 10:56:20 +01:00
Arman Uguray	fdf51784cd	Bluetooth: Unify advertising data code paths This patch simplifies the code paths for assembling the advertising data used by advertising instances 0 and 1. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	089fa8c09e	Bluetooth: Update supported_flags for AD features This patch updates the "supported_flags" parameter returned from the "Read Advertising Features" command. Add Advertising will now return an error if an unsupported flag is provided. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	5507e35811	Bluetooth: Support the "tx-power" adv flag This patch adds support for the "tx-power" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	67e0c0cd8f	Bluetooth: Support the "managed-flags" adv flag This patch adds support for the "managed-flags" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	807ec772bf	Bluetooth: Support the "limited-discoverable" adv flag This patch adds support for the "limited-discoverable" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	b44133ff03	Bluetooth: Support the "discoverable" adv flag This patch adds support for the "discoverable" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:28 +01:00
Arman Uguray	e7a685d316	Bluetooth: Support the "connectable mode" adv flag This patch adds support for the "connectable mode" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:28 +01:00
Marcel Holtmann	08dc0e987e	Bluetooth: Fix minor typo in comment for static address setting Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-25 19:09:45 -07:00
Christoph Hellwig	e2e40f2c1e	fs: move struct kiocb to fs.h struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-25 20:28:11 -04:00
Hannes Frederic Sowa	5a352dd0a3	ipv6: hash net ptr into fragmentation bucket selection As namespaces are sometimes used with overlapping ip address ranges, we should also use the namespace as input to the hash to select the ip fragmentation counter bucket. Cc: Eric Dumazet <edumazet@google.com> Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:07:04 -04:00
Hannes Frederic Sowa	b6a7719aed	ipv4: hash net ptr into fragmentation bucket selection As namespaces are sometimes used with overlapping ip address ranges, we should also use the namespace as input to the hash to select the ip fragmentation counter bucket. Cc: Eric Dumazet <edumazet@google.com> Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:07:04 -04:00
Jon Paul Maloy	8b4ed8634f	tipc: eliminate race condition at dual link establishment Despite recent improvements, the establishment of dual parallel links still has a small glitch where messages can bypass each other. When the second link in a dual-link configuration is established, part of the first link's traffic will be steered over to the new link. Although we do have a mechanism to ensure that packets sent before and after the establishment of the new link arrive in sequence to the destination node, this is not enough. The arriving messages will still be delivered upwards in different threads, something entailing a risk of message disordering during the transition phase. To fix this, we introduce a synchronization mechanism between the two parallel links, so that traffic arriving on the new link cannot be added to its input queue until we are guaranteed that all pre-establishment messages have been delivered on the old, parallel link. This problem seems to always have been around, but its occurrence is so rare that it has not been noticed until recent intensive testing. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Jon Paul Maloy	3127a0200d	tipc: clean up handling of link congestion After the recent changes in message importance handling it becomes possible to simplify handling of messages and sockets when we encounter link congestion. We merge the function tipc_link_cong() into link_schedule_user(), and simplify the code of the latter. The code should now be easier to follow, especially regarding return codes and handling of the message that caused the situation. In case the scheduling function is unable to pre-allocate a wakeup message buffer, it now returns -ENOBUFS, which is a more correct code than the previously used -EHOSTUNREACH. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Jon Paul Maloy	1f66d161ab	tipc: introduce starvation free send algorithm Currently, we only use a single counter; the length of the backlog queue, to determine whether a message should be accepted to the queue or not. Each time a message is being sent, the queue length is compared to a threshold value for the message's importance priority. If the queue length is beyond this threshold, the message is rejected. This algorithm implies a risk of starvation of low importance senders during very high load, because it may take a long time before the backlog queue has decreased enough to accept a lower level message. We now eliminate this risk by introducing a counter for each importance priority. When a message is sent, we check only the queue level for that particular message's priority. If that is ok, the message can be added to the backlog, irrespective of the queue level for other priorities. This way, each level is guaranteed a certain portion of the total bandwidth, and any risk of starvation is eliminated. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Guenter Roeck	b06b107a4c	net: dsa: Handle non-bridge master change Master change notifications may occur other than when joining or leaving a bridge, for example when being added to or removed from a bond or Open vSwitch. In that case, do nothing instead of asking the switch driver to remove a port from a bridge that it didn't join. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:04:57 -04:00
Patrick McHardy	fe2811ebeb	netfilter: nf_tables: convert hash and rbtree to set extensions The set implementations' private struct will only contain the elements needed to maintain the search structure, all other elements are moved to the set extensions. Element allocation and initialization is performed centrally by nf_tables_api instead of by the different set implementations' ->insert() functions. A new "elemsize" member in the set ops specifies the amount of memory to reserve for internal usage. Destruction will also be moved out of the set implementations by a following patch. Except for element allocation, the patch is a simple conversion to using data from the extension area. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:35 +01:00
Patrick McHardy	3ac4c07a24	netfilter: nf_tables: add set extensions Add simple set extension infrastructure for maintaining variable sized and optional per element data. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:34 +01:00
Patrick McHardy	bfd6e327e1	netfilter: nft_hash: convert to use rhashtable callbacks A following patch will convert sets to use so called set extensions, where the key is not located in a fixed position anymore. This will require rhashtable hashing and comparison callbacks to be used. As preparation, convert nft_hash to use these callbacks without any functional changes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:34 +01:00
Patrick McHardy	45d84751fb	netfilter: nft_hash: indent rhashtable parameters Improve readability by indenting the parameter initialization. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:34 +01:00
Patrick McHardy	745f5450d5	netfilter: nft_hash: restore struct nft_hash Following patches will add new private members, restore struct nft_hash as preparation. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:33 +01:00
Patrick McHardy	49f7b33e63	rhashtable: provide len to obj_hashfn nftables sets will be converted to use so called setextensions, moving the key to a non-fixed position. To hash it, the obj_hashfn must be used, however it so far doesn't receive the length parameter. Pass the key length to obj_hashfn() and convert existing users. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:33 +01:00
Ying Xue	bc14b8d6a9	tipc: fix a link reset issue due to retransmission failures When a node joins a cluster while we are transmitting a fragment stream over the broadcast link, it's missing the preceding fragments needed to build a meaningful message. As a result, the node has to drop it. However, as the fragment message is not acknowledged to its sender before it's dropped, it accidentally causes link reset of retransmission failure on the node. Reported-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:43:32 -04:00
D.S. Ljungmark	6fd99094de	ipv6: Don't reduce hop limit for an interface A local route may have a lower hop_limit set than global routes do. RFC 3756, Section 4.2.7, "Parameter Spoofing" > 1. The attacker includes a Current Hop Limit of one or another small > number which the attacker knows will cause legitimate packets to > be dropped before they reach their destination. > As an example, one possible approach to mitigate this threat is to > ignore very small hop limits. The nodes could implement a > configurable minimum hop limit, and ignore attempts to set it below > said limit. Signed-off-by: D.S. Ljungmark <ljungmark@modio.se> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:41:08 -04:00
Ying Xue	7e3ea6d5c4	sctp: avoid to repeatedly declare external variables Move the declaration for external variables to sctp.h file avoiding to repeatedly declare them with extern keyword. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:40:16 -04:00
Patrick McHardy	14d14a5d29	netfilter: nft_meta: use raw_smp_processor_id() Using smp_processor_id() triggers warnings with PREEMPT_RCU. There is no point in disabling preemption since we only collect the numeric value, so use raw_smp_processor_id() instead. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:40 +01:00
Patrick McHardy	d95797252a	netfilter: nf_tables: nft_queue does not depend on x_tables Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:39 +01:00
Pablo Neira Ayuso	fce1528ef6	netfilter: nf_tables: restore nf_log_trace() in nf_tables_core.c As described by `4017a7e` ("netfilter: restore rule tracing via nfnetlink_log"), this accidentally slipped through during conflict resolution in `d5c1d8c`. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:39 +01:00
Joe Perches	a81b2ce850	netfilter: Use LOGLEVEL_<FOO> defines Use the #defines where appropriate. Miscellanea: Add explicit #include <linux/kernel.h> where it was not previously used so that these #defines are a bit more explicitly defined instead of indirectly included via: module.h->moduleparam.h->kernel.h Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:39 +01:00
Patrick McHardy	5ebb335dcb	netfilter: nf_tables: move struct net pointer to base chain The network namespace is only needed for base chains to get at the gencursor. Also convert to possible_net_t. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:38 +01:00
Eric Dumazet	0144a81ccc	tcp: fix ipv4 mapped request socks ss should display ipv4 mapped request sockets like this : tcp SYN-RECV 0 0 ::ffff:192.168.0.1:8080 ::ffff:192.0.2.1:35261 and not like this : tcp SYN-RECV 0 0 192.168.0.1:8080 192.0.2.1:35261 We should init ireq->ireq_family based on listener sk_family, not the actual protocol carried by SYN packet. This means we can set ireq_family in inet_reqsk_alloc() Fixes: `3f66b083a5` ("inet: introduce ireq_family") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 00:57:48 -04:00
Marcel Holtmann	99c679acce	Bluetooth: Filter list of supported commands/events for untrusted users When the user of the management interface is not trusted, then it only has access to a limited set of commands and events. When providing the list of supported commands and events take the trusted vs untrusted status of the user into account and return different lists. This way the untrusted user knows exactly which commands it can execute and which events it can receive. So no guesswork needed. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-24 18:37:42 -07:00
Eric Dumazet	fd3a154a00	tcp: md5: get rid of tcp_v[46]_reqsk_md5_lookup() With request socks convergence, we no longer need different lookup methods. A request socket can use generic lookup function. Add const qualifier to 2nd tcp_v[46]_md5_lookup() parameter. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:30 -04:00
Eric Dumazet	39f8e58e53	tcp: md5: remove request sock argument of calc_md5_hash() Since request and established sockets now have same base, there is no need to pass two pointers to tcp_v4_md5_hash_skb() or tcp_v6_md5_hash_skb() Also add a const qualifier to their struct tcp_md5sig_key argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:30 -04:00
Eric Dumazet	ff74e23f7e	tcp: md5: input path is run under rcu protected sections It is guaranteed that both tcp_v4_rcv() and tcp_v6_rcv() run from rcu read locked sections : ip_local_deliver_finish() and ip6_input_finish() both use rcu_read_lock() Also align tcp_v6_inbound_md5_hash() on tcp_v4_inbound_md5_hash() by returning a boolean. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Eric Dumazet	0980c1e308	tcp: use C99 initializers in new_state[] Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Eric Dumazet	80f03e27a3	tcp: md5: fix rcu lockdep splat While timer handler effectively runs a rcu read locked section, there is no explicit rcu_read_lock()/rcu_read_unlock() annotations and lockdep can be confused here : net/ipv4/tcp_ipv4.c-906- /* caller either holds rcu_read_lock() or socket lock */ net/ipv4/tcp_ipv4.c:907: md5sig = rcu_dereference_check(tp->md5sig_info, net/ipv4/tcp_ipv4.c-908- sock_owned_by_user(sk) \|\| net/ipv4/tcp_ipv4.c-909- lockdep_is_held(&sk->sk_lock.slock)); Let's explicitely acquire rcu_read_lock() in tcp_make_synack() Before commit `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer"), we were holding listener lock so lockdep was happy. Fixes: `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric DUmazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Thomas Graf	6b6f302ced	rhashtable: Add rhashtable_free_and_destroy() rhashtable_destroy() variant which stops rehashes, iterates over the table and calls a callback to release resources. Avoids need for nft_hash to embed rhashtable internals and allows to get rid of the being_destroyed flag. It also saves a 2nd mutex lock upon destruction. Also fixes an RCU lockdep splash on nft set destruction due to calling rht_for_each_entry_safe() without holding bucket locks. Open code this loop as we need know that no mutations may occur in parallel. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:40 -04:00
Thomas Graf	b5e2c150ac	rhashtable: Disable automatic shrinking by default Introduce a new bool automatic_shrinking to require the user to explicitly opt-in to automatic shrinking of tables. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:40 -04:00
Michal Sekletar	27cd545247	filter: introduce SKF_AD_VLAN_TPID BPF extension If vlan offloading takes place then vlan header is removed from frame and its contents, both vlan_tci and vlan_proto, is available to user space via TPACKET interface. However, only vlan_tci can be used in BPF filters. This commit introduces a new BPF extension. It makes possible to load the value of vlan_proto (vlan TPID) to register A. Support for classic BPF and eBPF is being added, analogous to skb->protocol. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Michal Sekletar <msekleta@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:25:15 -04:00
Hannes Frederic Sowa	ff40217e73	ipv6: fix sparse warnings in privacy stable addresses generation Those warnings reported by sparse endianness check (via kbuild test robot) are harmless, nevertheless fix them up and make the code a little bit easier to read. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `622c81d57b` ("ipv6: generation of stable privacy addresses for link-local and autoconf") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:21:35 -04:00
Ying Xue	ed3e852aa5	tipc: fix compile error when IPV6=m and TIPC=y When IPV6=m and TIPC=y, below error will appear during building kernel image: net/tipc/udp_media.c:196: undefined reference to `ip6_dst_lookup' make: *** [vmlinux] Error 1 As ip6_dst_lookup() is implemented in IPV6 and IPV6 is compiled as module, ip6_dst_lookup() is not built-in core kernel image. As a result, compiler cannot find 'ip6_dst_lookup' reference while compiling TIPC code into core kernel image. But with the method introduced by commit `5f81bd2e5d` ("ipv6: export a stub for IPv6 symbols used by vxlan"), we can avoid the compile error through "ipv6_stub" pointer to access ip6_dst_lookup(). Fixes: `d0f91938be` ("tipc: add ip/udp media type") Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:20:29 -04:00
WANG Cong	66400d5430	net: allow to delete a whole device group With dev group, we can change a batch of net devices, so we should allow to delete them together too. Group 0 is not allowed to be deleted since it is the default group. Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:00:01 -04:00
WANG Cong	d079535d5e	net: use for_each_netdev_safe() in rtnl_group_changelink() In case we move the whole dev group to another netns, we should call for_each_netdev_safe(), otherwise we get a soft lockup: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ip:798] irq event stamp: 255424 hardirqs last enabled at (255423): [<ffffffff81a2aa95>] restore_args+0x0/0x30 hardirqs last disabled at (255424): [<ffffffff81a2ad5a>] apic_timer_interrupt+0x6a/0x80 softirqs last enabled at (255422): [<ffffffff81079ebc>] __do_softirq+0x2c1/0x3a9 softirqs last disabled at (255417): [<ffffffff8107a190>] irq_exit+0x41/0x95 CPU: 0 PID: 798 Comm: ip Not tainted 4.0.0-rc4+ #881 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800d1b88000 ti: ffff880119530000 task.ti: ffff880119530000 RIP: 0010:[<ffffffff810cad11>] [<ffffffff810cad11>] debug_lockdep_rcu_enabled+0x28/0x30 RSP: 0018:ffff880119533778 EFLAGS: 00000246 RAX: ffff8800d1b88000 RBX: 0000000000000002 RCX: 0000000000000038 RDX: 0000000000000000 RSI: ffff8800d1b888c8 RDI: ffff8800d1b888c8 RBP: ffff880119533778 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 000000000000b5c2 R12: 0000000000000246 R13: ffff880119533708 R14: 00000000001d5a40 R15: ffff88011a7d5a40 FS: 00007fc01315f740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f367a120988 CR3: 000000011849c000 CR4: 00000000000007f0 Stack: ffff880119533798 ffffffff811ac868 ffffffff811ac831 ffffffff811ac828 ffff8801195337c8 ffffffff811ac8c9 ffff8801195339b0 ffff8801197633e0 0000000000000000 ffff8801195339b0 ffff8801195337d8 ffffffff811ad2d7 Call Trace: [<ffffffff811ac868>] rcu_read_lock+0x37/0x6e [<ffffffff811ac831>] ? rcu_read_unlock+0x5f/0x5f [<ffffffff811ac828>] ? rcu_read_unlock+0x56/0x5f [<ffffffff811ac8c9>] __fget+0x2a/0x7a [<ffffffff811ad2d7>] fget+0x13/0x15 [<ffffffff811be732>] proc_ns_fget+0xe/0x38 [<ffffffff817c7714>] get_net_ns_by_fd+0x11/0x59 [<ffffffff817df359>] rtnl_link_get_net+0x33/0x3e [<ffffffff817df3d7>] do_setlink+0x73/0x87b [<ffffffff810b28ce>] ? trace_hardirqs_off+0xd/0xf [<ffffffff81a2aa95>] ? retint_restore_args+0xe/0xe [<ffffffff817e0301>] rtnl_newlink+0x40c/0x699 [<ffffffff817dffe0>] ? rtnl_newlink+0xeb/0x699 [<ffffffff81a29246>] ? _raw_spin_unlock+0x28/0x33 [<ffffffff8143ed1e>] ? security_capable+0x18/0x1a [<ffffffff8107da51>] ? ns_capable+0x4d/0x65 [<ffffffff817de5ce>] rtnetlink_rcv_msg+0x181/0x194 [<ffffffff817de407>] ? rtnl_lock+0x17/0x19 [<ffffffff817de407>] ? rtnl_lock+0x17/0x19 [<ffffffff817de44d>] ? __rtnl_unlock+0x17/0x17 [<ffffffff818327c6>] netlink_rcv_skb+0x4d/0x93 [<ffffffff817de42f>] rtnetlink_rcv+0x26/0x2d [<ffffffff81830f18>] netlink_unicast+0xcb/0x150 [<ffffffff8183198e>] netlink_sendmsg+0x501/0x523 [<ffffffff8115cba9>] ? might_fault+0x59/0xa9 [<ffffffff817b5398>] ? copy_from_user+0x2a/0x2c [<ffffffff817b7b74>] sock_sendmsg+0x34/0x3c [<ffffffff817b7f6d>] ___sys_sendmsg+0x1b8/0x255 [<ffffffff8115c5eb>] ? handle_pte_fault+0xbd5/0xd4a [<ffffffff8100a2b0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109e94b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109eb9c>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cadbf>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811ac1d8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac946>] ? __fget_light+0x2d/0x52 [<ffffffff817b8adc>] __sys_sendmsg+0x42/0x60 [<ffffffff817b8b0c>] SyS_sendmsg+0x12/0x1c [<ffffffff81a29e32>] system_call_fastpath+0x12/0x17 Fixes: `e7ed828f10` ("netlink: support setting devgroup parameters") Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 13:02:32 -04:00
Sasha Levin	610600c8c5	tipc: validate length of sockaddr in connect() for dgram/rdm Commit `f2f8036` ("tipc: add support for connect() on dgram/rdm sockets") hasn't validated user input length for the sockaddr structure which allows a user to overwrite kernel memory with arbitrary input. Fixes: `f2f8036` ("tipc: add support for connect() on dgram/rdm sockets") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 12:50:39 -04:00
Michal Kubeček	d0c294c53a	tcp: prevent fetching dst twice in early demux code On s390x, gcc 4.8 compiles this part of tcp_v6_early_demux() struct dst_entry *dst = sk->sk_rx_dst; if (dst) dst = dst_check(dst, inet6_sk(sk)->rx_dst_cookie); to code reading sk->sk_rx_dst twice, once for the test and once for the argument of ip6_dst_check() (dst_check() is inline). This allows ip6_dst_check() to be called with null first argument, causing a crash. Protect sk->sk_rx_dst access by READ_ONCE() both in IPv4 and IPv6 TCP early demux code. Fixes: `41063e9dd1` ("ipv4: Early TCP socket demux.") Fixes: `c7109986db` ("ipv6: Early TCP socket demux") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:38:24 -04:00
David S. Miller	d5c1d8c567	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/netfilter/nf_tables_core.c The nf_tables_core.c conflict was resolved using a conflict resolution from Stephen Rothwell as a guide. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:22:43 -04:00
Hannes Frederic Sowa	1855b7c3e8	ipv6: introduce idgen_delay and idgen_retries knobs This is specified by RFC 7217. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	5f40ef77ad	ipv6: do retries on stable privacy addresses If a DAD conflict is detected, we want to retry privacy stable address generation up to idgen_retries (= 3) times with a delay of idgen_delay (= 1 second). Add the logic to addrconf_dad_failure. By design, we don't clean up dad failed permanent addresses. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	8e8e676d0b	ipv6: collapse state_lock and lock Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	64236f3f3d	ipv6: introduce IFA_F_STABLE_PRIVACY flag We need to mark appropriate addresses so we can do retries in case their DAD failed. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	622c81d57b	ipv6: generation of stable privacy addresses for link-local and autoconf This patch implements the stable privacy address generation for link-local and autoconf addresses as specified in RFC7217. RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key) is the RID (random identifier). As the hash function F we chose one round of sha1. Prefix will be either the link-local prefix or the router advertised one. As Net_Iface we use the MAC address of the device. DAD_Counter and secret_key are implemented as specified. We don't use Network_ID, as it couples the code too closely to other subsystems. It is specified as optional in the RFC. As Net_Iface we only use the MAC address: we simply have no stable identifier in the kernel we could possibly use: because this code might run very early, we cannot depend on names, as they might be changed by user space early on during the boot process. A new address generation mode is introduced, IN6_ADDR_GEN_MODE_STABLE_PRIVACY. With iproute2 one can switch back to none or eui64 address configuration mode although the stable_secret is already set. We refuse writes to ipv6/conf/all/stable_secret but only allow ipv6/conf/default/stable_secret and the interface specific file to be written to. The default stable_secret is used as the parameter for the namespace, the interface specific can overwrite the secret, e.g. when switching a network configuration from one system to another while inheriting the secret. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:08 -04:00
Hannes Frederic Sowa	3d1bec9932	ipv6: introduce secret_stable to ipv6_devconf This patch implements the procfs logic for the stable_address knob: The secret is formatted as an ipv6 address and will be stored per interface and per namespace. We track initialized flag and return EIO errors until the secret is set. We don't inherit the secret to newly created namespaces. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:08 -04:00
Herbert Xu	6d02294981	tipc: Use default rhashtable hashfn This patch removes the explicit jhash value for the hashfn parameter of rhashtable. The default is now jhash so removing the setting makes no difference apart from making one less copy of jhash in the kernel. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:51 -04:00
Herbert Xu	11b58ba146	netlink: Use default rhashtable hashfn This patch removes the explicit jhash value for the hashfn parameter of rhashtable. As the key length is a multiple of 4, this means that we will actually end up using jhash2. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:51 -04:00
David S. Miller	40451fd013	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next. Basically, more incremental updates for br_netfilter from Florian Westphal, small nf_tables updates (including one fix for rb-tree locking) and small two-liner to add extra validation for the REJECT6 target. More specifically, they are: 1) Use the conntrack status flags from br_netfilter to know that DNAT is happening. Patch for Florian Westphal. 2) nf_bridge->physoutdev == NULL already indicates that the traffic is bridged, so let's get rid of the BRNF_BRIDGED flag. Also from Florian. 3) Another patch to prepare voidization of seq_printf/seq_puts/seq_putc, from Joe Perches. 4) Consolidation of nf_tables_newtable() error path. 5) Kill nf_bridge_pad used by br_netfilter from ip_fragment(), from Florian Westphal. 6) Access rb-tree root node inside the lock and remove unnecessary locking from the get path (we already hold nfnl_lock there), from Patrick McHardy. 7) You cannot use a NFT_SET_ELEM_INTERVAL_END when the set doesn't support interval, also from Patrick. 8) Enforce IP6T_F_PROTO from ip6t_REJECT to make sure the core is actually restricting matches to TCP. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:02:46 -04:00
Alexander Drozdov	682f048bd4	af_packet: pass checksum validation status to the user Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the af_packet user that at least the transport header checksum has been already validated. For now, the flag may be set for incoming packets only. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:01:28 -04:00
Alexander Drozdov	68c2e5de36	af_packet: make tpacket_rcv to not set status value before run_filter It is just an optimization. We don't need the value of status variable if the packet is filtered. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:00:36 -04:00
Fan Du	c69736696c	inet: fix double request socket freeing Eric Hugne reported following error : I'm hitting this warning on latest net-next when i try to SSH into a machine with eth0 added to a bridge (but i think the problem is older than that) Steps to reproduce: node2 ~ # brctl addif br0 eth0 [ 223.758785] device eth0 entered promiscuous mode node2 ~ # ip link set br0 up [ 244.503614] br0: port 1(eth0) entered forwarding state [ 244.505108] br0: port 1(eth0) entered forwarding state node2 ~ # [ 251.160159] ------------[ cut here ]------------ [ 251.160831] WARNING: CPU: 0 PID: 3 at include/net/request_sock.h:102 tcp_v4_err+0x6b1/0x720() [ 251.162077] Modules linked in: [ 251.162496] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc3+ #18 [ 251.163334] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 251.164078] ffffffff81a8365c ffff880038a6ba18 ffffffff8162ace4 0000000000009898 [ 251.165084] 0000000000000000 ffff880038a6ba58 ffffffff8104da85 ffff88003fa437c0 [ 251.166195] ffff88003fa437c0 ffff88003fa74e00 ffff88003fa43bb8 ffff88003fad99a0 [ 251.167203] Call Trace: [ 251.167533] [<ffffffff8162ace4>] dump_stack+0x45/0x57 [ 251.168206] [<ffffffff8104da85>] warn_slowpath_common+0x85/0xc0 [ 251.169239] [<ffffffff8104db65>] warn_slowpath_null+0x15/0x20 [ 251.170271] [<ffffffff81559d51>] tcp_v4_err+0x6b1/0x720 [ 251.171408] [<ffffffff81630d03>] ? _raw_read_lock_irq+0x3/0x10 [ 251.172589] [<ffffffff81534e20>] ? inet_del_offload+0x40/0x40 [ 251.173366] [<ffffffff81569295>] icmp_socket_deliver+0x65/0xb0 [ 251.174134] [<ffffffff815693a2>] icmp_unreach+0xc2/0x280 [ 251.174820] [<ffffffff8156a82d>] icmp_rcv+0x2bd/0x3a0 [ 251.175473] [<ffffffff81534ea2>] ip_local_deliver_finish+0x82/0x1e0 [ 251.176282] [<ffffffff815354d8>] ip_local_deliver+0x88/0x90 [ 251.177004] [<ffffffff815350f0>] ip_rcv_finish+0xf0/0x310 [ 251.177693] [<ffffffff815357bc>] ip_rcv+0x2dc/0x390 [ 251.178336] [<ffffffff814f5da3>] __netif_receive_skb_core+0x713/0xa20 [ 251.179170] [<ffffffff814f7fca>] __netif_receive_skb+0x1a/0x80 [ 251.179922] [<ffffffff814f97d4>] process_backlog+0x94/0x120 [ 251.180639] [<ffffffff814f9612>] net_rx_action+0x1e2/0x310 [ 251.181356] [<ffffffff81051267>] __do_softirq+0xa7/0x290 [ 251.182046] [<ffffffff81051469>] run_ksoftirqd+0x19/0x30 [ 251.182726] [<ffffffff8106cc23>] smpboot_thread_fn+0x153/0x1d0 [ 251.183485] [<ffffffff8106cad0>] ? SyS_setgroups+0x130/0x130 [ 251.184228] [<ffffffff8106935e>] kthread+0xee/0x110 [ 251.184871] [<ffffffff81069270>] ? kthread_create_on_node+0x1b0/0x1b0 [ 251.185690] [<ffffffff81631108>] ret_from_fork+0x58/0x90 [ 251.186385] [<ffffffff81069270>] ? kthread_create_on_node+0x1b0/0x1b0 [ 251.187216] ---[ end trace c947fc7b24e42ea1 ]--- [ 259.542268] br0: port 1(eth0) entered forwarding state Remove the double calls to reqsk_put() [edumazet] : I got confused because reqsk_timer_handler() _has_ to call reqsk_put(req) after calling inet_csk_reqsk_queue_drop(), as the timer handler holds a reference on req. Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Erik Hugne <erik.hugne@ericsson.com> Fixes: `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 21:40:48 -04:00
Arman Uguray	912098a630	Bluetooth: Add support for adv instance timeout This patch implements support for the timeout parameter of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:47 +01:00
Arman Uguray	4117ed70a5	Bluetooth: Add support for instance scan response This patch implements setting the Scan Response data provided as part of an advertising instance through the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:47 +01:00
Arman Uguray	da929335f2	Bluetooth: Implement the Remove Advertising command This patch implements the "Remove Advertising" mgmt command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:47 +01:00
Arman Uguray	24b4f38fc9	Bluetooth: Implement the Add Advertising command This patch adds the most basic implementation for the "Add Advertisement" command. All state updates between the various HCI settings (POWERED, ADVERTISING, ADVERTISING_INSTANCE, and LE_ENABLED) has been implemented. The command currently supports only setting the advertising data fields, with no flags and no scan response data. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:46 +01:00
Arman Uguray	203fea0178	Bluetooth: Add data structure for advertising instance This patch introduces a new data structure to represent advertising instances that were added using the "Add Advertising" mgmt command. Initially an hci_dev structure will support only one of these instances at a time, so the current instance is simply stored as a direct member of hci_dev. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:46 +01:00
Alexander Duyck	b6f15f828d	fib_trie: Fix regression in handling of inflate/halve failure When I updated the code to address a possible null pointer dereference in resize I ended up reverting an exception handling fix for the suffix length in the event that inflate or halve failed. This change is meant to correct that by reverting the earlier fix and instead simply getting the parent again after inflate has been completed to avoid the possible null pointer issue. Fixes: `ddb4b9a13` ("fib_trie: Address possible NULL pointer dereference in resize") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:58:32 -04:00
YOSHIFUJI Hideaki/吉藤英明	443b5991a7	net: Move the comment about unsettable socket-level options to default clause and update its reference. We implement the SO_SNDLOWAT etc not to be settable and return ENOPROTOOPT per 1003.1g 7. Move the comment to appropriate position and update the reference. Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:54:34 -04:00
Eric Dumazet	52036a4305	ipv6: dccp: handle ICMP messages on DCCP_NEW_SYN_RECV request sockets dccp_v6_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	85645bab57	ipv4: dccp: handle ICMP messages on DCCP_NEW_SYN_RECV request sockets dccp_v4_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. New dccp_req_err() helper is exported so that we can use it in IPv6 in following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	2215089b22	ipv6: tcp: handle ICMP messages on TCP_NEW_SYN_RECV request sockets tcp_v6_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	26e3736090	ipv4: tcp: handle ICMP messages on TCP_NEW_SYN_RECV request sockets tcp_v4_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. New tcp_req_err() helper is exported so that we can use it in IPv6 in following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	b282705336	net: convert syn_wait_lock to a spinlock This is a low hanging fruit, as we'll get rid of syn_wait_lock eventually. We hold syn_wait_lock for such small sections, that it makes no sense to use a read/write lock. A spin lock is simply faster. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	8b929ab12f	inet: remove some sk_listener dependencies listener can be source of false sharing. request sock has some useful information like : ireq->ir_iif, ireq->ir_num, ireq->ireq_net This patch does not solve the major problem of having to read sk->sk_protocol which is sharing a cache line with sk->sk_wmem_alloc. (This same field is read later in ip_build_and_send_pkt()) One idea would be to move sk_protocol close to sk_family (using 8 bits instead of 16 for sk_family seems enough) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	42cb80a235	inet: remove sk_listener parameter from syn_ack_timeout() It is not needed, and req->sk_listener points to the listener anyway. request_sock argument can be const. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:25 -04:00
Eric Dumazet	2b41fab70f	inet: cache listen_sock_qlen() and read rskq_defer_accept once Cache listen_sock_qlen() to limit false sharing, and read rskq_defer_accept once as it might change under us. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:25 -04:00
Roopa Prabhu	558d51fa2f	switchdev: fix stp update API to work with layered netdevices make it same as the netdev_switch_port_bridge_setlink/dellink api (ie traverse lowerdevs to get to the switch port). removes "WARN_ON(!ops->ndo_switch_parent_id_get)" because direct bridge ports can be stacked netdevices (like bonds and team of switch ports) which may not implement this ndo. v2 to v3: - remove changes to bond and team. Bring back the transparently following lowerdevs like i initially had for setlink/getlink (http://www.spinics.net/lists/netdev/msg313436.html) dave and scott feldman also seem to prefer it be that way and move to non-transparent way of doing things if we see a problem down the lane. v3 to v4: - fix ret initialization v4 to v5: - return err on first failure (scott feldman) v5 to v6: - change variable name (err) and initialize to -EOPNOTSUPP (scott feldman). Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:44:56 -04:00
WANG Cong	08b4b8ea79	net: clear skb->priority when forwarding to another netns skb->priority can be set for two purposes: 1) With respect to IP TOS field, which is computed by a mask. Ususally used for priority qdisc's (pfifo, prio etc.), on TX side (we only have ingress qdisc on RX side). 2) Used as a classid or flowid, works in the same way with tc classid. What's more, this can even override the classid of tc filters. For case 1), it has been respected within its netns, I don't see any point of keeping it for another netns, especially when packets will be forwarded to Rx path (no matter from TX path or RX path). For case 2) we care, our applications run inside a netns, and we classify the packets by our own filters outside, If some application sets this priority, it could bypass our filters, therefore clear it when moving out of a netns, it makes no sense to bypass tc filters out of its netns. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:43:08 -04:00
tadeusz.struk@intel.com	0345f93138	net: socket: add support for async operations Add support for async operations. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:41:36 -04:00
David S. Miller	c0e41fa76c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix missing initialization of tuple structure in nfnetlink_cthelper to avoid mismatches when looking up to attach userspace helpers to flows, from Ian Wilson. 2) Fix potential crash in nft_hash when we hit -EAGAIN in nft_hash_walk(), from Herbert Xu. 3) We don't need to indicate the hook information to update the basechain default policy in nf_tables. 4) Restore tracing over nfnetlink_log due to recent rework to accomodate logging infrastructure into nf_tables. 5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY. 6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and REJECT6 from xt over nftables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-22 16:57:07 -04:00
Pablo Neira Ayuso	e35158e401	netfilter: ip6t_REJECT: check for IP6T_F_PROTO Make sure IP6T_F_PROTO is set to enforce layer 4 protocol matching from the ip6_tables core. Suggested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 20:02:46 +01:00
Patrick McHardy	55df35d22f	netfilter: nf_tables: reject NFT_SET_ELEM_INTERVAL_END flag for non-interval sets Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 19:50:35 +01:00
Patrick McHardy	16c45eda96	netfilter: nft_rbtree: fix locking Fix a race condition and unnecessary locking: * the root rb_node must only be accessed under the lock in nft_rbtree_lookup() * the lock is not needed in lookup functions in netlink context Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 19:49:09 +01:00
Florian Westphal	8d0451638a	netfilter: bridge: kill nf_bridge_pad The br_netfilter frag output function calls skb_cow_head() so in case it needs a larger headroom to e.g. re-add a previously stripped PPPOE or VLAN header things will still work (at cost of reallocation). We can then move nf_bridge_encap_header_len to br_netfilter. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 19:45:55 +01:00
Pablo Neira Ayuso	749177ccc7	netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set ip6tables extensions check for this flag to restrict match/target to a given protocol. Without this flag set, SYNPROXY6 returns an error. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net>	2015-03-22 19:32:05 +01:00
Johan Hedberg	baf880a968	Bluetooth: Fix memory leak in le_scan_disable_work_complete() The hci_request in le_scan_disable_work_complete() was being initialized in a general context but only used in a specific branch in the function (when simultaneous discovery is not supported). This patch moves the usage to be limited to the branch where hci_req_run() is actually called. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-22 08:03:54 +01:00
Dominique Martinet	f569d3ef82	net/9p: add a privport option for RDMA transport. RDMA can use the same kind of weak security as TCP by checking the client can bind to a privileged port, which is better than nothing if TAUTH isn't implemented. Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-21 19:32:33 -07:00
Dominique Martinet	b99baa43e5	net/9p: Initialize opts->privport as it should be. We're currently using an uninitialized value if option privport is not set, thus (almost) always using a privileged port. Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-21 19:27:22 -07:00
Herbert Xu	8f2ddaac30	netlink: Remove netlink_compare_arg.trailer Instead of computing the offset from trailer, this patch computes netlink_compare_arg_len from the offset of portid and then adds 4 to it. This allows trailer to be removed. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-21 00:16:39 -04:00
YOSHIFUJI Hideaki/吉藤英明	8da86466b8	net: neighbour: Add mcast_resolicit to configure the number of multicast resolicitations in PROBE state. We send unicast neighbor (ARP or NDP) solicitations ucast_probes times in PROBE state. Zhu Yanjun reported that some implementation does not reply against them and the entry will become FAILED, which is undesirable. We had been dealt with such nodes by sending multicast probes mcast_ solicit times after unicast probes in PROBE state. In 2003, I made a change not to send them to improve compatibility with IPv6 NDP. Let's introduce per-protocol per-interface sysctl knob "mcast_ reprobe" to configure the number of multicast (re)solicitation for reconfirmation in PROBE state. The default is 0, since we have been doing so for 10+ years. Reported-by: Zhu Yanjun <Yanjun.Zhu@windriver.com> CC: Ulf Samuelsson <ulf.samuelsson@ericsson.com> Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 21:47:40 -04:00
Mathieu Olivari	c6f15070e7	net: dsa: make NET_DSA manually selectable from the config Change `bd76a11670` made all DSA drivers depend on NET_DSA rather than selecting them. However, as the only way to select this option was to actually select a driver, it made DSA impossible to enable at all. This patch adds an explicit entry which the user will have to enable prior selecting a driver. Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 21:37:58 -04:00
Eric Dumazet	d3593b5cef	Revert "selinux: add a skb_owned_by() hook" This reverts commit `ca10b9e9a8`. No longer needed after commit `eb8895debe` ("tcp: tcp_make_synack() should use sock_wmalloc") When under SYNFLOOD, we build lot of SYNACK and hit false sharing because of multiple modifications done on sk_listener->sk_wmem_alloc Since tcp_make_synack() uses sock_wmalloc(), there is no need to call skb_set_owner_w() again, as this adds two atomic operations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 21:36:53 -04:00
Daniel Borkmann	a8cb5f556b	act_bpf: add initial eBPF support for actions This work extends the "classic" BPF programmable tc action by extending its scope also to native eBPF code! Together with commit `e2e9b6541d` ("cls_bpf: add initial eBPF support for programmable classifiers") this adds the facility to implement fully flexible classifier and actions for tc that can be implemented in a C subset in user space, "safely" loaded into the kernel, and being run in native speed when JITed. Also, since eBPF maps can be shared between eBPF programs, it offers the possibility that cls_bpf and act_bpf can share data 1) between themselves and 2) between user space applications. That means that, f.e. customized runtime statistics can be collected in user space, but also more importantly classifier and action behaviour could be altered based on map input from the user space application. For the remaining details on the workflow and integration, see the cls_bpf commit `e2e9b6541d`. Preliminary iproute2 part can be found under [1]. [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf-act Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 19:10:44 -04:00
Daniel Borkmann	94caee8c31	ebpf: add sched_act_type and map it to sk_filter's verifier ops In order to prepare eBPF support for tc action, we need to add sched_act_type, so that the eBPF verifier is aware of what helper function act_bpf may use, that it can load skb data and read out currently available skb fields. This is bascially analogous to `96be4325f4` ("ebpf: add sched_cls_type and map it to sk_filter's verifier ops"). BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT need to be separate since both will have a different set of functionality in future (classifier vs action), thus we won't run into ABI troubles when the point in time comes to diverge functionality from the classifier. The future plan for act_bpf would be that it will be able to write into skb->data and alter selected fields mirrored in struct __sk_buff. For an initial support, it's sufficient to map it to sk_filter_ops. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 19:10:44 -04:00
David S. Miller	0fa74a4be4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the real actual conflict only existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easily to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 18:51:09 -04:00
Al Viro	4de930efc2	net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:38:06 -04:00
Catalin Marinas	91edd096e2	net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour Commit `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) introduced the clamping of msg_namelen when the unsigned value was larger than sizeof(struct sockaddr_storage). This caused a msg_namelen of -1 to be valid. The native code was subsequently fixed by commit `dbb490b965` (net: socket: error on a negative msg_namelen). In addition, the native code sets msg_namelen to 0 when msg_name is NULL. This was done in commit (`6a2a2b3ae0` net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland) and subsequently updated by `08adb7dabd` (fold verify_iovec() into copy_msghdr_from_user()). This patch brings the get_compat_msghdr() in line with copy_msghdr_from_user(). Fixes: `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:31:09 -04:00
Herbert Xu	6cca7289d5	tipc: Use inlined rhashtable interface This patch converts tipc to the inlined rhashtable interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	fa3773211e	netfilter: Convert nft_hash to inlined rhashtable This patch converts nft_hash to the inlined rhashtable interface. This patch also replaces the call to rhashtable_lookup_compare with a straight rhashtable_lookup_fast because it's simply doing a memcmp (in fact nft_hash_lookup already uses memcmp instead of nft_data_cmp). Furthermore, the compare function is only meant to compare, it is not supposed to have side-effects. The current side-effect code can simply be moved into the nft_hash_get. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	c428ecd1a2	netlink: Move namespace into hash key Currently the name space is a de facto key because it has to match before we find an object in the hash table. However, it isn't in the hash value so all objects from different name spaces with the same port ID hash to the same bucket. This is bad as the number of name spaces is unbounded. This patch fixes this by using the namespace when doing the hash. Because the namespace field doesn't lie next to the portid field in the netlink socket, this patch switches over to the rhashtable interface without a fixed key. This patch also uses the new inlined rhashtable interface where possible. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Daniel Borkmann	0b8c707ddf	ebpf, filter: do not convert skb->protocol to host endianess during runtime Commit `c249739579` ("bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields") has added support for accessing protocol, vlan_present and vlan_tci into the skb offset map. As referenced in the below discussion, accessing skb->protocol from an eBPF program should be converted without handling endianess. The reason for this is that an eBPF program could simply do a check more naturally, by f.e. testing skb->protocol == htons(ETH_P_IP), where the LLVM compiler resolves htons() against a constant automatically during compilation time, as opposed to an otherwise needed run time conversion. After all, the way of programming both from a user perspective differs quite a lot, i.e. bpf_asm ["ld proto"] versus a C subset/LLVM. Reference: https://patchwork.ozlabs.org/patch/450819/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 15:24:26 -04:00
Johannes Berg	5d8325ecb9	cfg80211: add vlan to station add/change tracing This helps debug issues with VLAN modifications that are otherwise not really visible in any tracing/debugging. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 19:56:41 +01:00
Jakub Pawlowski	b55d1abf56	Bluetooth: Expose quirks through debugfs This patch expose controller quirks through debugfs. It would be useful for BlueZ tests using vhci. Currently there is no way to test quirk dependent behaviour. It might be also useful for manual testing. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-20 19:47:01 +01:00
Marcelo Ricardo Leitner	c4a6853d8f	ipv6: invert join/leave anycast rtnl/socket locking order Commit `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") missed to update two setsockopt options, IPV6_JOIN_ANYCAST and IPV6_LEAVE_ANYCAST, causing a lock inverstion regarding to the updated ones. As ipv6_sock_ac_join and ipv6_sock_ac_leave are only called from do_ipv6_setsockopt, we are good to just move the rtnl lock upper. Fixes: `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") Reported-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:32:38 -04:00
Josh Hunt	d22e153718	tcp: fix tcp fin memory accounting tcp_send_fin() does not account for the memory it allocates properly, so sk_forward_alloc can be negative in cases where we've sent a FIN: ss example output (ss -amn \| grep -B1 f4294): tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080 skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:18:52 -04:00
Steven Barth	73ba57bfae	ipv6: fix backtracking for throw routes for throw routes to trigger evaluation of other policy rules EAGAIN needs to be propagated up to fib_rules_lookup similar to how its done for IPv4 A simple testcase for verification is: ip -6 rule add lookup 33333 priority 33333 ip -6 route add throw 2001:db8::1 ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333 ip route get 2001:db8::1 Signed-off-by: Steven Barth <cyrus@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:57:23 -04:00
Sabrina Dubroca	8e199dfd82	ipv6: call ipv6_proxy_select_ident instead of ipv6_select_ident in udp6_ufo_fragment Matt Grant reported frequent crashes in ipv6_select_ident when udp6_ufo_fragment is called from openvswitch on a skb that doesn't have a dst_entry set. ipv6_proxy_select_ident generates the frag_id without using the dst associated with the skb. This approach was suggested by Vladislav Yasevich. Fixes: `0508c07f5e` ("ipv6: Select fragment id during UFO segmentation if not set.") Cc: Vladislav Yasevich <vyasevic@redhat.com> Reported-by: Matt Grant <matt@mattgrant.net.nz> Tested-by: Matt Grant <matt@mattgrant.net.nz> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:56:11 -04:00
Eric Dumazet	becb74f0ac	net: increase sk_[max_]ack_backlog sk_ack_backlog & sk_max_ack_backlog were 16bit fields, meaning listen() backlog was limited to 65535. It is time to increase the width to allow much bigger backlog, if admins change /proc/sys/net/core/somaxconn & /proc/sys/net/ipv4/tcp_max_syn_backlog default values. Tested: echo 5000000 >/proc/sys/net/core/somaxconn echo 5000000 >/proc/sys/net/ipv4/tcp_max_syn_backlog Ran a SYNFLOOD test against a listener using listen(fd, 5000000) myhost~# grep request_sock_TCP /proc/slabinfo request_sock_TCP 4185642 4411940 304 13 1 : tunables 54 27 8 : slabdata 339380 339380 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	fa76ce7328	inet: get rid of central tcp/dccp listener timer One of the major issue for TCP is the SYNACK rtx handling, done by inet_csk_reqsk_queue_prune(), fired by the keepalive timer of a TCP_LISTEN socket. This function runs for awful long times, with socket lock held, meaning that other cpus needing this lock have to spin for hundred of ms. SYNACK are sent in huge bursts, likely to cause severe drops anyway. This model was OK 15 years ago when memory was very tight. We now can afford to have a timer per request sock. Timer invocations no longer need to lock the listener, and can be run from all cpus in parallel. With following patch increasing somaxconn width to 32 bits, I tested a listener with more than 4 million active request sockets, and a steady SYNFLOOD of ~200,000 SYN per second. Host was sending ~830,000 SYNACK per second. This is ~100 times more what we could achieve before this patch. Later, we will get rid of the listener hash and use ehash instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	52452c5425	inet: drop prev pointer handling in request sock When request sock are put in ehash table, the whole notion of having a previous request to update dl_next is pointless. Also, following patch will get rid of big purge timer, so we want to delete a request sock without holding listener lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Johannes Berg	7c10770f99	mac80211: avoid duplicate TX path station lookup Instead of looking up the destination station twice in the TX path (first to build the header, and then for control processing), save it when building the header and use it later in the TX path. To avoid having to look up the station in the many callers, allow those to pass %NULL which keeps the existing lookup. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:36 +01:00
Johannes Berg	e33f5569aa	mac80211: mesh: avoid pointless station lookup In ieee80211_build_hdr(), the station is looked up to build the header correctly (QoS field) and to check for authorization. For mesh, authorization isn't checked here, and QoS capability is mandatory, so the station lookup can be avoided. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:36 +01:00
Johannes Berg	a8d15ff005	mac80211: drop 4-addr VLAN frames earlier if not connected If there's no station on the 4-addr VLAN interface, then frames cannot be transmitted. Drop such frames earlier, before setting up all the information for them. We should keep the old check though since that code might be used for other internally-generated frames. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:35 +01:00
Johannes Berg	5041006c42	mac80211: don't look up destination station twice There's no need to look up the destination station twice while building the 802.11 header for a given frame if the frame will actually be transmitted to the station we initially looked up. This happens for 4-addr VLAN interfaces and TDLS connections, which both directly send the frame to the station they looked up, though in the case of TDLS some station conditions need to be checked. To avoid that, add a variable indicating that we've looked up the station that the frame is going to be transmitted to, and avoid the lookup/flag checking if it already has been done. In the TDLS case, also move the authorized/wme_sta flag assignment to the correct place, i.e. only when that station is really used. Before this change, the new lookup should always have succeeded so that the potentially erroneous data would be overwritten. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:34 +01:00
Andrey Ryabinin	179a5bc4b8	net/9p: use memcpy() instead of snprintf() in p9_mount_tag_show() p9_mount_tag_show() uses '%s' format string to print non-NULL terminated chan->tag string. This leads to out of bounds memory read, because format '%s' implies that string is NULL-terminated. The length of string is know here, so its simpler and safer to use memcpy instead of snprintf(). Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-20 07:34:43 -07:00
Kirill A. Shutemov	6250a8badb	9p: use unsigned integers for nwqid/count As specification says, all integers in messages are unsigned. Let's fix behaviour of p9pdu_vreadf()/p9pdu_vwritef() accordingly. Fix for p9pdu_vreadf() is critical. If server replies with Rwalk, where nwqid > SHRT_MAX, the value will be interpreted as negative. kmalloc, in its order, will cast the value to (very big) size_t. It should never happen in normal situation: we never submit Twalk with nwname > 16, but malicious or broken server can still produce problematic Rwalk. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-20 07:34:42 -07:00
Fabian Frederick	9bfc52c109	9p: remove unused variable in p9_fd_create() p is initialized but unused. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-20 07:34:41 -07:00
Pablo Neira Ayuso	3d8c6dce53	netfilter: xt_TPROXY: fix invflags check in tproxy_tg6_check() We have to check for IP6T_INV_PROTO in invflags, instead of flags. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Balazs Scheidler <bazsi@balabit.hu>	2015-03-20 14:35:33 +01:00
Marcel Holtmann	dc5d82a9fe	Bluetooth: Use HCI_MAX_AD_LENGTH constant instead hardcoded value Using the HCI_MAX_AD_LENGTH for the max advertising data and max scan response data length makes more sense than hardcoding the value. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-20 14:08:32 +02:00
Marcel Holtmann	e7844ee599	Bluetooth: Gracefully response to enabling LE on LE only devices Currently the enabling of LE on LE only devices causes an error. This is a bit difference from other commands where trying to set the same existing settings causes a positive response. Fix this behavior for this single corner case. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-20 14:05:27 +02:00
Johannes Berg	e8f4fb7c7c	mac80211: remove drop_unencrypted code This mechanism was historic, and only ever used by IBSS, which also doesn't need to have it as it properly manages station's 802.1X PAE state (or, with WEP, always has a key.) Remove the mechanism to clean up the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 11:37:36 +01:00
Marcelo Ricardo Leitner	446981e5fc	tipc: fix build issue when building without IPv6 We can't directly call ipv6_sock_mc_join() but should use the stub instead and protect it around IS_ENABLED. Fixes: `d0f91938be` ("tipc: add ip/udp media type") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 16:06:27 -04:00
David S. Miller	970282d0e1	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-03-19 This wont the last 4.1 bluetooth-next pull request, but we've piled up enough patches in less than a week that I wanted to save you from a single huge "last-minute" pull somewhere closer to the merge window. The main changes are: - Simultaneous LE & BR/EDR discovery support for HW that can do it - Complete LE OOB pairing support - More fine-grained mgmt-command access control (normal user can now do harmless read-only operations). - Added RF power amplifier support in cc2520 ieee802154 driver - Some cleanups/fixes in ieee802154 code Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 15:18:04 -04:00
Linus Torvalds	47226fe1b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix packet header offset calculation in _decode_session6(), from Hajime Tazaki. 2) Fix route leak in error paths of xfrm_lookup(), from Huaibin Wang. 3) Be sure to clear state properly when scans fail in iwlwifi mvm code, from Luciano Coelho. 4) iwlwifi tries to stop scans that aren't actually running, also from Luciano Coelho. 5) mac80211 should drop mesh frames that are not encrypted, fix from Bob Copeland. 6) Add new device ID to b43 wireless driver for BCM432228 chips, from Rafał Miłecki. 7) Fix accidental addition of members after variable sized array in struct tc_u_hnode, from WANG Cong. 8) Don't re-enable interrupts until after we call napi_complete() in ibmveth and WIZnet drivers, frm Yongbae Park. 9) Fix regression in vlan tag handling of fec driver, from Fugang Duan. 10) If a network namespace change fails during rtnl_newlink(), we don't unwind the device registry properly. 11) Fix two TCP regressions, from Neal Cardwell: - Don't allow snd_cwnd_cnt to accumulate huge values due to missing test in tcp_cong_avoid_ai(). - Restore CUBIC back to advancing cwnd by 1.5x packets per RTT. 12) Fix performance regression in xne-netback involving push TX notifications, from David Vrabel. 13) __skb_tstamp_tx() can be called with a NULL sk pointer, do not dereference blindly. From Willem de Bruijn. 14) Fix potential stack overflow in RDS protocol stack, from Arnd Bergmann. 15) VXLAN_VID_MASK used incorrectly in new remote checksum offload support of VXLAN driver. Fix from Alexey Kodanev. 16) Fix too small netlink SKB allocation in inet_diag layer, from Eric Dumazet. 17) ieee80211_check_combinations() does not count interfaces correctly, from Andrei Otcheretianski. 18) Hardware feature determination in bxn2x driver references a piece of software state that actually isn't initialized yet, fix from Michal Schmidt. 19) inet_csk_wait_for_connect() needs a sched_annotate_sleep() annoation, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits) Revert "net: cx82310_eth: use common match macro" net/mlx4_en: Set statistics bitmap at port init IB/mlx4: Saturate RoCE port PMA counters in case of overflow net/mlx4_en: Fix off-by-one in ethtool statistics display IB/mlx4: Verify net device validity on port change event act_bpf: allow non-default TC_ACT opcodes as BPF exec outcome Revert "smc91x: retrieve IRQ and trigger flags in a modern way" inet: Clean up inet_csk_wait_for_connect() vs. might_sleep() ip6_tunnel: fix error code when tunnel exists netdevice.h: fix ndo_bridge_* comments bnx2x: fix encapsulation features on 57710/57711 mac80211: ignore CSA to same channel nl80211: ignore HT/VHT capabilities without QoS/WMM mac80211: ask for ECSA IE to be considered for beacon parse CRC mac80211: count interfaces correctly for combination checks isdn: icn: use strlcpy() when parsing setup options rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() caif: fix MSG_OOB test in caif_seqpkt_recvmsg() bridge: reset bridge mtu after deleting an interface can: kvaser_usb: Fix tx queue start/stop race conditions ...	2015-03-19 11:19:44 -07:00
Erik Hugne	f2f8036e39	tipc: add support for connect() on dgram/rdm sockets Following the example of ip4_datagram_connect, we store the address in the socket structure for dgram/rdm sockets and use that as the default destination for subsequent send() calls. It is allowed to connect to any address types, and the behaviour of send() will be the same as a normal sendto() with this address provided. Binding to an AF_UNSPEC address clears the association. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 12:25:54 -04:00
Erik Hugne	3bd88ee7a2	tipc: do not report -EHOSTUNREACH for failed local delivery Since commit `1186adf7df` ("tipc: simplify message forwarding and rejection in socket layer") -EHOSTUNREACH is propagated back to the sending process if we fail to deliver the message to another socket local to the node. This is wrong, host unreachable should only be reported when the destination port/name does not exist in the cluster, and that check is always done before sending the message. Also, this introduces inconsistent sendmsg() behavior for local/remote destinations. Errors occurring on the receiving side should not trickle up to the sender. If message delivery fails TIPC should either discard the packet or reject it back to the sender based on the destination droppable option. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 12:25:54 -04:00
Erik Hugne	18d6c58415	tipc: remove redundant call to tipc_node_remove_conn tipc_node_remove_conn may be called twice if shutdown() is called on a socket that have messages in the receive queue. Calling this function twice does no harm, but is unnecessary and we remove the redundant call. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 12:25:54 -04:00
Nicolas Iooss	ea6edfbcef	mac802154: fix typo in header guard Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Fixes: `b6eea9ca35` ("mac802154: introduce driver-ops header") Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-19 15:32:22 +01:00
Pablo Neira Ayuso	4017a7ee69	netfilter: restore rule tracing via nfnetlink_log Since `fab4085` ("netfilter: log: nf_log_packet() as real unified interface"), the loginfo structure that is passed to nf_log_packet() is used to explicitly indicate the logger type you want to use. This is a problem for people tracing rules through nfnetlink_log since packets are always routed to the NF_LOG_TYPE logger after the aforementioned patch. We can fix this by removing the trace loginfo structures, but that still changes the log level from 4 to 5 for tracing messages and there may be someone relying on this outthere. So let's just introduce a new nf_log_trace() function that restores the former behaviour. Reported-by: Markus Kötter <koetter@rrzn.uni-hannover.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-19 11:14:48 +01:00
Jörg Thalheim	af615762e9	bridge: add ageing_time, stp_state, priority over netlink Signed-off-by: Jörg Thalheim <joerg@higgsboson.tk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 23:21:06 -04:00
David S. Miller	99c4a26a15	net: Fix high overhead of vlan sub-device teardown. When a networking device is taken down that has a non-trivial number of VLAN devices configured under it, we eat a full synchronize_net() for every such VLAN device. This is because of the call chain: NETDEV_DOWN notifier --> vlan_device_event() --> dev_change_flags() --> __dev_change_flags() --> __dev_close() --> __dev_close_many() --> dev_deactivate_many() --> synchronize_net() This is kind of rediculous because we already have infrastructure for batching doing operation X to a list of net devices so that we only incur one sync. So make use of that by exporting dev_close_many() and adjusting it's interfaace so that the caller can fully manage the batch list. Use this in vlan_device_event() and all the overhead goes away. Reported-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:52:56 -04:00
Eric Dumazet	738e6d30d3	inet: add a schedule point in inet_twsk_purge() On a large hash table, we can easily spend seconds to walk over all entries. Add a cond_resched() to yield cpu if necessary. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:38:13 -04:00
David Ahern	db24a9044e	net: add support for phys_port_name Similar to port id allow netdevices to specify port names and export the name via sysfs. Drivers can implement the netdevice operation to assist udev in having sane default names for the devices using the rule: $ cat /etc/udev/rules.d/80-net-setup-link.rules SUBSYSTEM=="net", ACTION=="add", ATTR{phys_port_name}!="", NAME="$attr{phys_port_name}" Use of phys_name versus phys_id was suggested-by Jiri Pirko. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:30:35 -04:00
Marcelo Ricardo Leitner	54ff9ef36b	ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop} in favor of their inner __ ones, which doesn't grab rtnl. As these functions need to operate on a locked socket, we can't be grabbing rtnl by then. It's too late and doing so causes reversed locking. So this patch: - move rtnl handling to callers instead while already fixing some reversed locking situations, like on vxlan and ipvs code. - renames __ ones to not have the __ mark: __ip_mc_{join,leave}_group -> ip_mc_{join,leave}_group __ipv6_sock_mc_{join,drop} -> ipv6_sock_mc_{join,drop} Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:05:09 -04:00
Marcelo Ricardo Leitner	baf606d9c9	ipv4,ipv6: grab rtnl before locking the socket There are some setsockopt operations in ipv4 and ipv6 that are grabbing rtnl after having grabbed the socket lock. Yet this makes it impossible to do operations that have to lock the socket when already within a rtnl protected scope, like ndo dev_open and dev_stop. We normally take coarse grained locks first but setsockopt inverted that. So this patch invert the lock logic for these operations and makes setsockopt grab rtnl if it will be needed prior to grabbing socket lock. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:05:09 -04:00
Eric Dumazet	08d2cc3b26	inet: request sock should init IPv6/IPv4 addresses In order to be able to use sk_ehashfn() for request socks, we need to initialize their IPv6/IPv4 addresses. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:35 -04:00
Eric Dumazet	b4d6444ea3	inet: get rid of last __inet_hash_connect() argument We now always call __inet_hash_nolisten(), no need to pass it as an argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:35 -04:00
Eric Dumazet	77a6a471bc	ipv6: get rid of __inet6_hash() We can now use inet_hash() and __inet_hash() instead of private functions. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:35 -04:00
Eric Dumazet	d1e559d0b1	inet: add IPv6 support to sk_ehashfn() Intent is to converge IPv4 & IPv6 inet_hash functions to factorize code. IPv4 sockets initialize sk_rcv_saddr and sk_v6_daddr in this patch, thanks to new sk_daddr_set() and sk_rcv_saddr_set() helpers. __inet6_hash can now use sk_ehashfn() instead of a private inet6_sk_ehashfn() and will simply use __inet_hash() in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:34 -04:00
Eric Dumazet	5b441f76f1	net: introduce sk_ehashfn() helper Goal is to unify IPv4/IPv6 inet_hash handling, and use common helpers for all kind of sockets (full sockets, timewait and request sockets) inet_sk_ehashfn() becomes sk_ehashfn() but still only copes with IPv4 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:34 -04:00
Eric Dumazet	6eada0110c	netns: constify net_hash_mix() and various callers const qualifiers ease code review by making clear which objects are not written in a function. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:34 -04:00
John Fastabend	822b3b2ebf	net: Add max rate tx queue attribute This adds a tx_maxrate attribute to the tx queue sysfs entry allowing for max-rate limiting. Along with DCB-ETS and BQL this provides another knob to tune queue performance. The limit units are Mbps. By default it is disabled. To disable the rate limitation after it has been set for a queue, it should be set to zero. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 14:55:18 -04:00
Herbert Xu	446c89ac1f	tipc: Use rhashtable max/min_size instead of max/min_shift This patch converts tipc to use rhashtable max/min_size instead of the obsolete max/min_shift. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 12:46:40 -04:00
Herbert Xu	b06eee59b1	netlink: Use rhashtable max_size instead of max_shift This patch converts netlink to use rhashtable max_size instead of the obsolete max_shift. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 12:46:40 -04:00
Pablo Neira Ayuso	ffdb210eb4	netfilter: nf_tables: consolidate error path of nf_tables_newtable() Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-18 11:57:31 +01:00
Joe Perches	1ca9e41770	netfilter: Remove uses of seq_<foo> return values The seq_printf/seq_puts/seq_putc return values, because they are frequently misused, will eventually be converted to void. See: commit `1f33c41c03` ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public") Miscellanea: o realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-18 10:51:35 +01:00
Marcel Holtmann	63511f6d5b	Bluetooth: Fix potential NULL dereference in SMP channel setup When the allocation of the L2CAP channel for the BR/EDR security manager fails, then the smp variable might be NULL. In that case do not try to free the non-existing crypto contexts Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-18 08:30:03 +02:00
Daniel Borkmann	ced585c83b	act_bpf: allow non-default TC_ACT opcodes as BPF exec outcome Revisiting commit `d23b8ad8ab` ("tc: add BPF based action") with regards to eBPF support, I was thinking that it might be better to improve return semantics from a BPF program invoked through BPF_PROG_RUN(). Currently, in case filter_res is 0, we overwrite the default action opcode with TC_ACT_SHOT. A default action opcode configured through tc's m_bpf can be: TC_ACT_RECLASSIFY, TC_ACT_PIPE, TC_ACT_SHOT, TC_ACT_UNSPEC, TC_ACT_OK. In cls_bpf, we have the possibility to overwrite the default class associated with the classifier in case filter_res is _not_ 0xffffffff (-1). That allows us to fold multiple [e]BPF programs into a single one, where they would otherwise need to be defined as a separate classifier with its own classid, needlessly redoing parsing work, etc. Similarly, we could do better in act_bpf: Since above TC_ACT* opcodes are exported to UAPI anyway, we reuse them for return-code-to-tc-opcode mapping, where we would allow above possibilities. Thus, like in cls_bpf, a filter_res of 0xffffffff (-1) means that the configured _default_ action is used. Any unkown return code from the BPF program would fail in tcf_bpf() with TC_ACT_UNSPEC. Should we one day want to make use of TC_ACT_STOLEN or TC_ACT_QUEUED, which both have the same semantics, we have the option to either use that as a default action (filter_res of 0xffffffff) or non-default BPF return code. All that will allow us to transparently use tcf_bpf() for both BPF flavours. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:15:06 -04:00
Ying Xue	2b9bb7f338	tipc: withdraw tipc topology server name when namespace is deleted The TIPC topology server is a per namespace service associated with the tipc name {1, 1}. When a namespace is deleted, that name must be withdrawn before we call sk_release_kernel because the kernel socket release is done in init_net and trying to withdraw a TIPC name published in another namespace will fail with an error as: [ 170.093264] Unable to remove local publication [ 170.093264] (type=1, lower=1, ref=2184244004, key=2184244005) We fix this by breaking the association between the topology server name and socket before calling sk_release_kernel. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:11:26 -04:00
Ying Xue	8460504bdd	tipc: fix a potential deadlock when nametable is purged [ 28.531768] ============================================= [ 28.532322] [ INFO: possible recursive locking detected ] [ 28.532322] 3.19.0+ #194 Not tainted [ 28.532322] --------------------------------------------- [ 28.532322] insmod/583 is trying to acquire lock: [ 28.532322] (&(&nseq->lock)->rlock){+.....}, at: [<ffffffffa000d219>] tipc_nametbl_remove_publ+0x49/0x2e0 [tipc] [ 28.532322] [ 28.532322] but task is already holding lock: [ 28.532322] (&(&nseq->lock)->rlock){+.....}, at: [<ffffffffa000e0dc>] tipc_nametbl_stop+0xfc/0x1f0 [tipc] [ 28.532322] [ 28.532322] other info that might help us debug this: [ 28.532322] Possible unsafe locking scenario: [ 28.532322] [ 28.532322] CPU0 [ 28.532322] ---- [ 28.532322] lock(&(&nseq->lock)->rlock); [ 28.532322] lock(&(&nseq->lock)->rlock); [ 28.532322] [ 28.532322] * DEADLOCK * [ 28.532322] [ 28.532322] May be due to missing lock nesting notation [ 28.532322] [ 28.532322] 3 locks held by insmod/583: [ 28.532322] #0: (net_mutex){+.+.+.}, at: [<ffffffff8163e30f>] register_pernet_subsys+0x1f/0x50 [ 28.532322] #1: (&(&tn->nametbl_lock)->rlock){+.....}, at: [<ffffffffa000e091>] tipc_nametbl_stop+0xb1/0x1f0 [tipc] [ 28.532322] #2: (&(&nseq->lock)->rlock){+.....}, at: [<ffffffffa000e0dc>] tipc_nametbl_stop+0xfc/0x1f0 [tipc] [ 28.532322] [ 28.532322] stack backtrace: [ 28.532322] CPU: 1 PID: 583 Comm: insmod Not tainted 3.19.0+ #194 [ 28.532322] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 28.532322] ffffffff82394460 ffff8800144cb928 ffffffff81792f3e 0000000000000007 [ 28.532322] ffffffff82394460 ffff8800144cba28 ffffffff810a8080 ffff8800144cb998 [ 28.532322] ffffffff810a4df3 ffff880013e9cb10 ffffffff82b0d330 ffff880013e9cb38 [ 28.532322] Call Trace: [ 28.532322] [<ffffffff81792f3e>] dump_stack+0x4c/0x65 [ 28.532322] [<ffffffff810a8080>] __lock_acquire+0x740/0x1ca0 [ 28.532322] [<ffffffff810a4df3>] ? __bfs+0x23/0x270 [ 28.532322] [<ffffffff810a7506>] ? check_irq_usage+0x96/0xe0 [ 28.532322] [<ffffffff810a8a73>] ? __lock_acquire+0x1133/0x1ca0 [ 28.532322] [<ffffffffa000d219>] ? tipc_nametbl_remove_publ+0x49/0x2e0 [tipc] [ 28.532322] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 28.532322] [<ffffffffa000d219>] ? tipc_nametbl_remove_publ+0x49/0x2e0 [tipc] [ 28.532322] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 28.532322] [<ffffffffa000d219>] ? tipc_nametbl_remove_publ+0x49/0x2e0 [tipc] [ 28.532322] [<ffffffffa000d219>] tipc_nametbl_remove_publ+0x49/0x2e0 [tipc] [ 28.532322] [<ffffffffa000e11e>] tipc_nametbl_stop+0x13e/0x1f0 [tipc] [ 28.532322] [<ffffffffa000dfe5>] ? tipc_nametbl_stop+0x5/0x1f0 [tipc] [ 28.532322] [<ffffffffa0004bab>] tipc_init_net+0x13b/0x150 [tipc] [ 28.532322] [<ffffffffa0004a75>] ? tipc_init_net+0x5/0x150 [tipc] [ 28.532322] [<ffffffff8163dece>] ops_init+0x4e/0x150 [ 28.532322] [<ffffffff810aa66d>] ? trace_hardirqs_on+0xd/0x10 [ 28.532322] [<ffffffff8163e1d3>] register_pernet_operations+0xf3/0x190 [ 28.532322] [<ffffffff8163e31e>] register_pernet_subsys+0x2e/0x50 [ 28.532322] [<ffffffffa002406a>] tipc_init+0x6a/0x1000 [tipc] [ 28.532322] [<ffffffffa0024000>] ? 0xffffffffa0024000 [ 28.532322] [<ffffffff810002d9>] do_one_initcall+0x89/0x1c0 [ 28.532322] [<ffffffff811b7cb0>] ? kmem_cache_alloc_trace+0x50/0x1b0 [ 28.532322] [<ffffffff810e725b>] ? do_init_module+0x2b/0x200 [ 28.532322] [<ffffffff810e7294>] do_init_module+0x64/0x200 [ 28.532322] [<ffffffff810e9353>] load_module+0x12f3/0x18e0 [ 28.532322] [<ffffffff810e5890>] ? show_initstate+0x50/0x50 [ 28.532322] [<ffffffff810e9a19>] SyS_init_module+0xd9/0x110 [ 28.532322] [<ffffffff8179f3b3>] sysenter_dispatch+0x7/0x1f Before tipc_purge_publications() calls tipc_nametbl_remove_publ() to remove a publication with a name sequence, the name sequence's lock is held. However, when tipc_nametbl_remove_publ() calling tipc_nameseq_remove_publ() to remove the publication, it first tries to query name sequence instance with the publication, and then holds the lock of the found name sequence. But as the lock may be already taken in tipc_purge_publications(), deadlock happens like above scenario demonstrated. As tipc_nameseq_remove_publ() doesn't grab name sequence's lock, the deadlock can be avoided if it's directly invoked by tipc_purge_publications(). Fixes: `97ede29e80` ("tipc: convert name table read-write lock to RCU") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:11:26 -04:00
Ying Xue	76100a8a64	tipc: fix netns refcnt leak When the TIPC module is loaded, we launch a topology server in kernel space, which in its turn is creating TIPC sockets for communication with topology server users. Because both the socket's creator and provider reside in the same module, it is necessary that the TIPC module's reference count remains zero after the server is started and the socket created; otherwise it becomes impossible to perform "rmmod" even on an idle module. Currently, we achieve this by defining a separate "tipc_proto_kern" protocol struct, that is used only for kernel space socket allocations. This structure has the "owner" field set to NULL, which restricts the module reference count from being be bumped when sk_alloc() for local sockets is called. Furthermore, we have defined three kernel-specific functions, tipc_sock_create_local(), tipc_sock_release_local() and tipc_sock_accept_local(), to avoid the module counter being modified when module local sockets are created or deleted. This has worked well until we introduced name space support. However, after name space support was introduced, we have observed that a reference count leak occurs, because the netns counter is not decremented in tipc_sock_delete_local(). This commit remedies this problem. But instead of just modifying tipc_sock_delete_local(), we eliminate the whole parallel socket handling infrastructure, and start using the regular sk_create_kern(), kernel_accept() and sk_release_kernel() calls. Since those functions manipulate the module counter, we must now compensate for that by explicitly decrementing the counter after module local sockets are created, and increment it just before calling sk_release_kernel(). Fixes: `a62fbccecd` ("tipc: make subscriber server support net namespace") Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Cong Wang <cwang@twopensource.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:11:26 -04:00
Eric Dumazet	0470c8ca1d	inet: fix request sock refcounting While testing last patch series, I found req sock refcounting was wrong. We must set skc_refcnt to 1 for all request socks added in hashes, but also on request sockets created by FastOpen or syncookies. It is tricky because we need to defer this initialization so that future RCU lookups do not try to take a refcount on a not yet fully initialized request socket. Also get rid of ireq_refcnt alias. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `13854e5a60` ("inet: add proper refcounting to request sock") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:02:29 -04:00
Eric Dumazet	e3d95ad7da	inet: avoid fastopen lock for regular accept() It is not because a TCP listener is FastOpen ready that all incoming sockets actually used FastOpen. Avoid taking queue->fastopenq->lock if not needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:01:56 -04:00
Eric Dumazet	9439ce00f2	tcp: rename struct tcp_request_sock listener The listener field in struct tcp_request_sock is a pointer back to the listener. We now have req->rsk_listener, so TCP only needs one boolean and not a full pointer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:01:56 -04:00
Eric Dumazet	4e9a578e5b	inet: add rsk_listener field to struct request_sock Once we'll be able to lookup request sockets in ehash table, we'll need to get access to listener which created this request. This avoid doing a lookup to find the listener, which benefits for a more solid SO_REUSEPORT, and is needed once we no longer queue request sock into a listener private queue. Note that 'struct tcp_request_sock'->listener could be reduced to a single bit, as TFO listener should match req->rsk_listener. TFO will no longer need to hold a reference on the listener. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:01:56 -04:00
Eric Dumazet	e49bb337d7	inet: uninline inet_reqsk_alloc() inet_reqsk_alloc() is becoming fat and should not be inlined. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:01:56 -04:00
Eric Dumazet	407640de21	inet: add sk_listener argument to inet_reqsk_alloc() listener socket can be used to set net pointer, and will be later used to hold a reference on listener. Add a const qualifier to first argument (struct request_sock_ops *), and factorize all write_pnet(&ireq->ireq_net, sock_net(sk)); Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 22:01:55 -04:00
Eric Dumazet	7970ddc8f9	tcp: uninline tcp_oow_rate_limited() tcp_oow_rate_limited() is hardly used in fast path, there is no point inlining it. Signed-of-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:18:00 -04:00
Eric Dumazet	1bfc4438a7	tcp: move tcp_openreq_init() to tcp_input.c This big helper is called once from tcp_conn_request(), there is no point having it in an include. Compiler will inline it anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:18:00 -04:00
Eric Dumazet	a940700003	netfilter: xt_socket: prepare for TCP_NEW_SYN_RECV support TCP request socks soon will be visible in ehash table. xt_socket will be able to match them, but first we need to make sure to not consider them as full sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:17:59 -04:00
Eric Dumazet	8b58014779	netfilter: tproxy: prepare TCP_NEW_SYN_RECV support TCP request socks soon will be visible in ehash table. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:17:59 -04:00
Eric Dumazet	a8399231f0	netfilter: use sk_fullsock() helper Upcoming request sockets have TCP_NEW_SYN_RECV state and should be special cased a bit like TCP_TIME_WAIT sockets. Signed-off-by; Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:17:59 -04:00
Alexei Starovoitov	c249739579	bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields as a follow on to patch `70006af955` ("bpf: allow eBPF access skb fields") this patch allows 'protocol' and 'vlan_tci' fields to be accessible from extended BPF programs. The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the same as corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG accesses in classic BPF. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:06:31 -04:00
Eric Dumazet	cb7cf8a33f	inet: Clean up inet_csk_wait_for_connect() vs. might_sleep() I got the following trace with current net-next kernel : [14723.885290] WARNING: CPU: 26 PID: 22658 at kernel/sched/core.c:7285 __might_sleep+0x89/0xa0() [14723.885325] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810e8734>] prepare_to_wait_exclusive+0x34/0xa0 [14723.885355] CPU: 26 PID: 22658 Comm: netserver Not tainted 4.0.0-dbg-DEV #1379 [14723.885359] ffffffff81a223a8 ffff881fae9e7ca8 ffffffff81650b5d 0000000000000001 [14723.885364] ffff881fae9e7cf8 ffff881fae9e7ce8 ffffffff810a72e7 0000000000000000 [14723.885367] ffffffff81a57620 000000000000093a 0000000000000000 ffff881fae9e7e64 [14723.885371] Call Trace: [14723.885377] [<ffffffff81650b5d>] dump_stack+0x4c/0x65 [14723.885382] [<ffffffff810a72e7>] warn_slowpath_common+0x97/0xe0 [14723.885386] [<ffffffff810a73e6>] warn_slowpath_fmt+0x46/0x50 [14723.885390] [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0 [14723.885393] [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0 [14723.885396] [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0 [14723.885399] [<ffffffff810ccdc9>] __might_sleep+0x89/0xa0 [14723.885403] [<ffffffff81581846>] lock_sock_nested+0x36/0xb0 [14723.885406] [<ffffffff815829a3>] ? release_sock+0x173/0x1c0 [14723.885411] [<ffffffff815ea1f7>] inet_csk_accept+0x157/0x2a0 [14723.885415] [<ffffffff810e8900>] ? abort_exclusive_wait+0xc0/0xc0 [14723.885419] [<ffffffff8161b96d>] inet_accept+0x2d/0x150 [14723.885424] [<ffffffff8157db6f>] SYSC_accept4+0xff/0x210 [14723.885428] [<ffffffff8165a451>] ? retint_swapgs+0xe/0x44 [14723.885431] [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0 [14723.885437] [<ffffffff81369c0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [14723.885441] [<ffffffff8157ef40>] SyS_accept+0x10/0x20 [14723.885444] [<ffffffff81659872>] system_call_fastpath+0x12/0x17 [14723.885447] ---[ end trace ff74cd83355b1873 ]--- In commit `26cabd3125` Peter added a sched_annotate_sleep() in sk_wait_event() Is the following patch needed as well ? Alternative would be to use sk_wait_event() from inet_csk_wait_for_connect() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:03:54 -04:00
Nicolas Dichtel	37355565ba	ip6_tunnel: fix error code when tunnel exists After commit `2b0bb01b6e`, the kernel returns -ENOBUFS when user tries to add an existing tunnel with ioctl API: $ ip -6 tunnel add ip6tnl1 mode ip6ip6 dev eth1 add tunnel "ip6tnl0" failed: No buffer space available It's confusing, the right error is EEXIST. This patch also change a bit the code returned: - ENOBUFS -> ENOMEM - ENOENT -> ENODEV Fixes: `2b0bb01b6e` ("ip6_tunnel: Return an error when adding an existing tunnel.") CC: Steffen Klassert <steffen.klassert@secunet.com> Reported-by: Pierre Cheynier <me@pierre-cheynier.net> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-17 15:01:18 -04:00
Johan Hedberg	19c5ce9c5f	Bluetooth: Add workaround for broken OS X legacy SMP pairing OS X version 10.10.2 (and possibly older versions) doesn't support LE Secure Connections but incorrectly copies all authentication request bits from a Security Request to its Pairing Request. The result is that an SC capable initiator (such as BlueZ) will think OS X intends to do SC when in fact it's incapable of it: < ACL Data TX: Handle 3585 flags 0x00 dlen 6 SMP: Security Request (0x0b) len 1 Authentication requirement: Bonding, No MITM, SC, No Keypresses (0x09) > ACL Data RX: Handle 3585 flags 0x02 dlen 11 SMP: Pairing Request (0x01) len 6 IO capability: KeyboardDisplay (0x04) OOB data: Authentication data not present (0x00) Authentication requirement: Bonding, No MITM, SC, No Keypresses (0x09) Max encryption key size: 16 Initiator key distribution: EncKey (0x01) Responder key distribution: EncKey IdKey Sign (0x07) < ACL Data TX: Handle 3585 flags 0x00 dlen 11 SMP: Pairing Response (0x02) len 6 IO capability: NoInputNoOutput (0x03) OOB data: Authentication data not present (0x00) Authentication requirement: Bonding, No MITM, SC, No Keypresses (0x09) Max encryption key size: 16 Initiator key distribution: EncKey (0x01) Responder key distribution: EncKey Sign (0x05) The pairing eventually fails when we get an unexpected Pairing Confirm PDU instead of a Public Key PDU: > ACL Data RX: Handle 3585 flags 0x02 dlen 21 SMP: Pairing Confirm (0x03) len 16 Confim value: bcc3bed31b8f313a78ec3cce32685faf It is only at this point that we can speculate that the remote doesn't really support SC. This patch creates a workaround for the just-works model, however the MITM case is unsolvable because the OS X user has already been requested to enter a PIN which we're now expected to randomly generate and show the user (i.e. a chicken-and-egg problem). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-17 18:58:24 +01:00
Linus Torvalds	4d272f90a7	Not entirely surprising: the ongoing QEMU work on virtio 1.0 has revealed more minor issues with our virtio 1.0 drivers just introduced in the kernel. (I would normally use my fixes branch for this, but there were a batch of them...) Thanks, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVB4gHAAoJENkgDmzRrbjx+BAP/0Z8TPfP5v2XQ0YlLPFpKMQ0 DdsL9s1Se74e0RUJ6fvpVZaVgBxFq1+xi2wiSOhlOyIn0fXlO8+eYK886QX13cYO CUAvuwbQhGHgj+1DfQ+7pJlQe905VvHVxWIVZU1dq+PEI181kOQQE9lOHhsKsYvQ IxEAX3/avYALAV29FN+PvGZcmY5fgeJf58RkQb5h3XXVaBlI175HhiGm3izgqyKe qmZqEDuUnsdlxT5rvJnb0kg/VfRJ2MdkpIcpjpqO4DK3lY+x90LibMmnGLdLkigS cEfjSXPmJKNug+IoxxQuDH6zRsWqTLnwI4gU/FqbOkN12Ovt5k4F+FrFCuXD7GdW tCiBblkQjQ8xS+OP5slXwYKE3a4q4ih6u+9/STNlSVlG1mqZxxmK56WD00CvBpvR CDyTO4yHUV9rjDIhD/r8guFXsPwaaiZxKiGP1k+nnel0E9dMmZf0dE8xqHpTrl0Z 8OAv3TgJFqhmfecFCBxj28e++dl7KvhiAGRKiZvHYkoxWZmJ5EFkw5E8FUOlJQNS 2apGPbBEyeEQho7emzb0l9vvAu+0jJGEObxvA9wUdEcXg2kbDmGIpidZfN6xBemJ 5WuMoGJh9UnYeImtGyXINTuQXagXdzt5bB/IkVmUYMcvsGty5lKIH1G87Q/qV4BE YCbKn/3J+G2TmF3+m8AH =8QwR -----END PGP SIGNATURE----- Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio fixes from Rusty Russell: "Not entirely surprising: the ongoing QEMU work on virtio 1.0 has revealed more minor issues with our virtio 1.0 drivers just introduced in the kernel. (I would normally use my fixes branch for this, but there were a batch of them...)" * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: virtio_mmio: fix access width for mmio uapi/virtio_scsi: allow overriding CDB/SENSE size virtio_mmio: generation support virtio_rpmsg: set DRIVER_OK before using device 9p/trans_virtio: fix hot-unplug virtio-balloon: do not call blocking ops when !TASK_RUNNING virtio_blk: fix comment for virtio 1.0 virtio_blk: typo fix virtio_balloon: set DRIVER_OK before using device virtio_console: avoid config access from irq virtio_console: init work unconditionally	2015-03-17 10:36:01 -07:00
Johan Hedberg	fa4335d71a	Bluetooth: Move generic mgmt command dispatcher to hci_sock.c The mgmt.c file should be reserved purely for HCI_CHANNEL_CONTROL. The mgmt_control() function in it is already completely generic and has a single user in hci_sock.c. This patch moves the function there and renames it a bit more appropriately to hci_mgmt_cmd() (as it's a command dispatcher). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-17 18:03:08 +01:00
Johan Hedberg	88b94ce925	Bluetooth: Add hdev_init callback for HCI channels In order to make the mgmt command handling more generic we can't have a direct call to mgmt_init_hdev() from mgmt_control(). This patch adds a new callback to struct hci_mgmt_chan. And sets it to point to the mgmt_init_hdev() function for the HCI_CHANNEL_CONTROL instance. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-17 18:03:08 +01:00
Johan Hedberg	a380b6cff1	Bluetooth: Add generic mgmt helper API There are several mgmt protocol features that will be needed by more than just the current HCI_CHANNEL_CONTROL. These include sending generic events as well as handling pending commands. This patch moves these functions out from mgmt.c to a new mgmt_util.c file. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-17 18:03:08 +01:00
Johan Hedberg	333ae95d05	Bluetooth: Add channel parameter to mgmt_pending_find() API To be able to have pending commands for different HCI channels we need to be able to distinguish for which channel a command was sent to. The channel information is already part of the socket data and can be fetched using the recently added hci_sock_get_channel() function. To not require all mgmt.c code to pass an extra channel parameter this patch also adds a helper pending_find() & pending_find_data() functions which act as a wrapper to the new mgmt_pending_find() & mgmt_pending_find_data() APIs. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-17 18:03:08 +01:00
Johan Hedberg	d0f172b14a	Bluetooth: Add helper to get HCI channel of a socket We'll need to have access to which HCI channel a socket is bound to, in order to manage pending mgmt commands in clean way. This patch adds a helper for the purpose. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-17 18:03:07 +01:00
Jakub Pawlowski	07d2334ae7	Bluetooth: Add simultaneous dual mode scan When doing scan through mgmt api, some controllers can do both le and classic scan at same time. They can be distinguished by HCI_QUIRK_SIMULTANEOUS_DISCOVERY set. This patch enables them to use this feature when doing dual mode scan. Instead of doing le, then classic scan, both scans are run at once. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-17 18:31:00 +02:00
Jakub Pawlowski	812abb13a9	Bluetooth: Refactor BR/EDR inquiry and LE scan triggering. This patch refactor BR/EDR inquiry and LE scan triggering logic into separate methods. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-17 18:30:59 +02:00
Pablo Neira Ayuso	d6b6cb1d3e	netfilter: nf_tables: allow to change chain policy without hook if it exists If there's an existing base chain, we have to allow to change the default policy without indicating the hook information. However, if the chain doesn't exists, we have to enforce the presence of the hook attribute. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-17 13:48:04 +01:00
Cedric Izoard	c7ef38e0cc	mac80211: Get IV len from key conf and not cipher scheme When a key is installed using a cipher scheme, set a new internal key flag (KEY_FLAG_CIPHER_SCHEME) on it, to allow distinguishing such keys more easily. In particular, use this flag on the TX path instead of testing the sta->cipher_scheme pointer, as the station is NULL for broad-/multicast message, and use the key's iv_len instead of the cipher scheme information. Signed-off-by: Cedric Izoard <cedric.izoard@ceva-dsp.com> [add missing documentation, rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-17 12:34:26 +01:00
Janusz.Dziedzic@tieto.com	8a4988d137	mac80211: IBSS: refactor ieee80211_rx_bss_info Put station specific code in ieee80211_update_sta_info function. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-17 11:06:53 +01:00
Felix Fietkau	d66c258278	mac80211: minstrel_ht: fix rounding issue in MCS duration calculation On very high MCS bitrates, the calculated duration of rates that are next to each other can be very imprecise, due to the small packet size used as reference (1200 bytes). This is most visible in VHT80 nss=2 MCS8/9, for which minstrel shows the same throughput when the probability is also the same. This leads to a bad rate selection for such rates. Fix this issue by introducing an average A-MPDU size factor into the calculation. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-17 11:03:30 +01:00
Ben	2e54a6895e	cfg80211: Process all pending regulatory requests/hints It is possible that there are several regulatory requests pending, but the processing of the last one does not call CRDA, and thus the other requests are not handled. Fix this by rescheduling the work until all requests have been processed. Signed-off-by: Ben Rosenfeld <ben.rosenfeld@intel.com> Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-17 11:02:42 +01:00
Marek Puzyniak	c23e31cf7b	mac80211: initialize rate control earlier for tdls station Currently when TDLS station in driver goes from authenticated to associated state it can not use rate control parameters because rate control is not initialized yet. Some drivers require parameters already initialized by rate control when entering associated state. It can be done by initializing rate control after station transition to associated state but before notifying driver about that. Signed-off-by: Marek Puzyniak <marek.puzyniak@tieto.com> Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> [fix comment to say 'associated' instead of 'authorized'] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-17 11:00:07 +01:00
Marcel Holtmann	72000df2c0	Bluetooth: Add support for Local OOB Extended Data Update events When a different user requests a new set of local out-of-band data, then inform all previous users that the data has been updated. To limit the scope of users, the updates are limited to previous users. If a user has never requested out-of-band data, it will also not see the update. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-17 08:16:48 +02:00
Marcel Holtmann	5425f98e86	Bluetooth: Fix length for Read Local OOB Extended Data respone packet The length of the respone packet for Read Local OOB Extended Data command has a calculation error. In case LE Secure Connections support is not enabled, the actual response is shorter. Keep this in mind and update the value accordingly. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-17 08:16:45 +02:00
Eric Dumazet	9f1ab18672	tcp_metrics: fix wrong lockdep annotations Changes in tcp_metric hash table are protected by tcp_metrics_lock only, not by genl_mutex While we are at it use deref_locked() instead of rcu_dereference() in tcp_new() to avoid unnecessary barrier, as we hold tcp_metrics_lock as well. Reported-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `098a697b49` ("tcp_metrics: Use a single hash table for all network namespaces.") Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 16:32:23 -04:00
Jiri Pirko	bd76a11670	dsa: change "select" to "depends on" for NET_SWITCHDEV and for NET_DSA This would fix randconfig compile error: net/built-in.o: In function `netdev_switch_fib_ipv4_abort': (.text+0xf7811): undefined reference to `fib_flush_external' Also it fixes following warnings: warning: (NET_DSA) selects NET_SWITCHDEV which has unmet direct dependencies (NET && INET) warning: (NET_DSA_MV88E6060 && NET_DSA_MV88E6131 && NET_DSA_MV88E6123_61_65 && NET_DSA_MV88E6171 && NET_DSA_MV88E6352 && NET_DSA_BCM_SF2) selects NET_DSA which has unmet direct dependencies (NET && HAVE_NET_DSA && NET_SWITCHDEV) Reported-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 16:29:18 -04:00
Ying Xue	c243d7e209	net: kernel socket should be released in init_net namespace Creating a kernel socket with sock_create_kern() happens in "init_net" namespace, however, releasing it with sk_release_kernel() occurs in the current namespace which may be different with "init_net" namespace. Therefore, we should guarantee that the namespace in which a kernel socket is created is same as the socket is created. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 16:25:06 -04:00
David S. Miller	48b810d9bc	Here are a few fixes that I'd like to still get in: * disable U-APSD for better interoperability, from Michal Kazior * drop unencrypted frames in mesh forwarding, from Bob Copeland * treat non-QoS/WMM HT stations as non-HT, to fix confusion when they connect and then get QoS packets anyway due to HT * fix counting interfaces for combination checks, otherwise the interface combinations aren't properly enforced (from Andrei) * fix pure ECSA by reacting to the IE change * ignore erroneous (E)CSA to the current channel which sometimes happens due to AP/GO bugs -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVBpZgAAoJEDBSmw7B7bqrqsUP/0muBLLr3A+KXTY8OgbR8bbH bDp68XPWcPKZyQCGsan+CrChhf1+9/5Mqf6Jf4kwcO1pjm+b9Lz7LVBWY68WLKF6 MAPBdp0sQ79zw6FT3pISEJ7XVEXH5JZJKjgY834DFTvr2l6k1eRElrEnObc1881V ltB++FuRzXeACOJ0PxVm8oTVBhA2GEcYqemJC0RZDFgIk3QVm8116cyEMYzLVvus NYX6tHanVfYbE2/mA7nfpWTaXTYzUupFd/ooVonkxB5E76bAFancY19s6Lfm04ye c2p7KBm1muaXfGJOBqA9eO+1942gLFINKrzJjN0yJkH1B/QHdVADvuNr3FyblkGW W0K5q6u+2LlX8ILBQLPEa5RITqV6qKhr8NX8zvriIA5qS5gJftW0WE+DrA0k65qw SRIj6ppPoYvEzU2KMbJenxG+fSNZZNESDAa9uWt1tyT3eyev+YBC9bl57oFUGjaS 49y2SDLlD0FTNezhbscMPZ9JxodN3NfbRzF+4FtOQluE/o+7YLvZVCA1Jt/Vfjs1 unurv8RKHqzTPCGBfDtVRuwtgTr6LTTl1UTFRyM0wyJYfUzGTII4Rp7GWzNd70ZI mZSAnTyBiOSBuQNNjhwxnhNSn9Hq5PITHs2q3vbzQe7VoqUE+OThbQbW2+0lAisJ oKTqesKeJU3WtLofpWiq =Ecdh -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Here are a few fixes that I'd like to still get in: * disable U-APSD for better interoperability, from Michal Kazior * drop unencrypted frames in mesh forwarding, from Bob Copeland * treat non-QoS/WMM HT stations as non-HT, to fix confusion when they connect and then get QoS packets anyway due to HT * fix counting interfaces for combination checks, otherwise the interface combinations aren't properly enforced (from Andrei) * fix pure ECSA by reacting to the IE change * ignore erroneous (E)CSA to the current channel which sometimes happens due to AP/GO bugs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 16:17:48 -04:00
David S. Miller	ca00942a81	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2015-03-16 1) Fix the network header offset in _decode_session6 when multiple IPv6 extension headers are present. From Hajime Tazaki. 2) Fix an interfamily tunnel crash. We set outer mode protocol too early and may dispatch to the wrong address family. Move the setting of the outer mode protocol behind the last accessing of the inner mode to fix the crash. 3) Most callers of xfrm_lookup() expect that dst_orig is released on error. But xfrm_lookup_route() may need dst_orig to handle certain error cases. So introduce a flag that tells what should be done in case of error. From Huaibin Wang. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 16:16:49 -04:00
Eric Dumazet	13854e5a60	inet: add proper refcounting to request sock reqsk_put() is the generic function that should be used to release a refcount (and automatically call reqsk_free()) reqsk_free() might be called if refcount is known to be 0 or undefined. refcnt is set to one in inet_csk_reqsk_queue_add() As request socks are not yet in global ehash table, I added temporary debugging checks in reqsk_put() and reqsk_free() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 15:55:29 -04:00
Eric Dumazet	2c13270b44	inet: factorize sock_edemux()/sock_gen_put() code sock_edemux() is not used in fast path, and should really call sock_gen_put() to save some code. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 15:55:29 -04:00
Eric Dumazet	a58917f584	inet_diag: allow sk_diag_fill() to handle request socks inet_diag_fill_req() is renamed to inet_req_diag_fill() and moved up, so that it can be called fom sk_diag_fill() inet_diag_bc_sk() is ready to handle request socks. inet_twsk_diag_dump() is no longer needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 15:55:29 -04:00
Eric Dumazet	f7e4eb03f9	inet: ip early demux should avoid request sockets When a request socket is created, we do not cache ip route dst entry, like for timewait sockets. Let's use sk_fullsock() helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 15:55:29 -04:00
Marcel Holtmann	5082a59965	Bluetooth: Do not include LE SC out-of-band data if not enabled In case LE Secure Connections is not enabled, then the command for returning local out-of-band data should not include the confirmation and random value for LE SC pairing. All other fields are still valid, but these two need to be left out. In that case it is also no needed to generate the public/private key pair for out-of-band pairing. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 21:39:45 +02:00
Marcel Holtmann	b880ab869c	Bluetooth: The P-256 randomizer is 16 octets long and not 19 octets This seems to be a simple typo in the debugfs entry for the remote out-of-band data entries. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 21:36:24 +02:00
Marcel Holtmann	fb334fee60	Bluetooth: Rename smp->local_rr into smp->local_rand The variable for the out-of-band random number was badly named and with that confusing. Just rename it to local_rand so it is clear what value it represents. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 21:36:23 +02:00
Marcel Holtmann	bc07cd696e	Bluetooth: Add extra SMP_DBG statement for remote OOB data Just for pure debugging purposes print the remote out-of-band data that has been received and is going to be used. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 21:36:21 +02:00
Marcel Holtmann	e091526dfd	Bluetooth: Use smp->remote_pk + 32 instead of &smp->remote_pk[32] Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 21:36:19 +02:00
Johan Hedberg	cb06d366fb	Bluetooth: Add clarifying comment when setting local OOB flag It might be a bit counterintuitive to set a 'local' flag based on remote data. This patch adds a clarifying comment to the pairing req/rsp handlers when setting the LOCAL_OOB flag based on the PDU received from the remote side. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-16 20:16:08 +01:00
Johan Hedberg	a8ca617c13	Bluetooth: Don't send public key if OOB data verification fails When we receive the remote public key, if we have remote OOB data there's no point in sending our public key to the remote if the OOB data doesn't match. This patch moves the test for this higher up in the smp_cmd_public_key() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-16 17:17:38 +01:00
Johan Hedberg	94ea7257ef	Bluetooth: Fix verifying confirm value when lacking remote OOB data If we haven't received remote OOB data we cannot perform any special checks on the confirm value. This patch updates the check after having received the public key to only perform the verification if we have remote OOB data present. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-16 17:16:46 +01:00
Johan Hedberg	58428563b5	Bluetooth: Set local OOB data flag if remote has our OOB data If the SMP Pairing Request or Response PDU received from the remote device indicates that it has received our OOB data we should set the SMP_FLAG_LOCAL_OOB flag. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-16 17:16:46 +01:00
Johan Hedberg	1a8bab4f39	Bluetooth: Track local vs remote OOB data availability There are several decisions in the SMP logic that depend not only on whether we're doing SMP or not, but also whether local and/or remote OOB data is present. This patch splits the existing SMP_FLAG_OOB into two new flags to track local and remote OOB data respectively. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-16 17:16:45 +01:00
Johan Hedberg	882fafad71	Bluetooth: Fix local OOB data handling for SMP We need to store the local ra/rb value in order to verify the Check value received from the remote. This patch adds a new 'lr' for the local ra/rb value and makes sure it gets used when verifying the DHKey Check PDU received from the remote. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-16 17:16:45 +01:00
Florian Westphal	e4bb9bcbfb	netfilter: bridge: remove BRNF_STATE_BRIDGED flag Its not needed anymore since `2bf540b73e` ([NETFILTER]: bridge-netfilter: remove deferred hooks). Before this it was possible to have physoutdev set for locally generated packets -- this isn't the case anymore: BRNF_STATE_BRIDGED flag is set when we assign nf_bridge->physoutdev, so physoutdev != NULL means BRNF_STATE_BRIDGED is set. If physoutdev is NULL, then we are looking at locally-delivered and routed packet. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-16 14:35:02 +01:00
Florian Westphal	c055d5b03b	netfilter: bridge: query conntrack about skb dnat ask conntrack instead of storing ipv4 address in nf_bridge_info->data. Ths avoids the need to use ->data during NF_PRE_ROUTING. Only two functions that need ->data remain. These will be addressed in followup patches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-16 14:34:34 +01:00
Johannes Berg	f84eaa1068	mac80211: ignore CSA to same channel If the AP is confused and starts doing a CSA to the same channel, just ignore that request instead of trying to act it out since it was likely sent in error anyway. In the case of the bug I was investigating the GO was misbehaving and sending out a beacon with CSA IEs still included after having actually done the channel switch. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:36:12 +01:00
Johannes Berg	496fcc294d	nl80211: ignore HT/VHT capabilities without QoS/WMM As HT/VHT depend heavily on QoS/WMM, it's not a good idea to let userspace add clients that have HT/VHT but not QoS/WMM. Since it does so in certain cases we've observed (client is using HT IEs but not QoS/WMM) just ignore the HT/VHT info at this point and don't pass it down to the drivers which might unconditionally use it. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:36:11 +01:00
Johannes Berg	70a3fd6c61	mac80211: ask for ECSA IE to be considered for beacon parse CRC When a beacon from the AP contains only the ECSA IE, and not a CSA IE as well, this ECSA IE is not considered for calculating the CRC and the beacon might be dropped as not being interesting. This is clearly wrong, it should be handled and the channel switch should be executed. Fix this by including the ECSA IE ID in the bitmap of interesting IEs. Reported-by: Gil Tribush <gil.tribush@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:36:11 +01:00
Andrei Otcheretianski	0f611d28fc	mac80211: count interfaces correctly for combination checks Since moving the interface combination checks to mac80211, it's broken because it now only considers interfaces with an assigned channel context, so for example any interface that isn't active can still be up, which is clearly an issue; also, in particular P2P-Device wdevs are an issue since they never have a chanctx. Fix this by counting running interfaces instead the ones with a channel context assigned. Cc: stable@vger.kernel.org [3.16+] Fixes: `73de86a389` ("cfg80211/mac80211: move interface counting for combination check to mac80211") Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> [rewrite commit message, dig out the commit it fixes] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:35:59 +01:00
Marcel Holtmann	8e4e2ee5d8	Bluetooth: Use smp->local_pk + 32 instead of &smp->local_pk[32] Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 10:31:31 +02:00
Marcel Holtmann	33d0c03071	Bluetooth: Use OOB key pair for LE SC pairing with OOB method The OOB public and secret key pair is different from the non-OOB pairing procedure. SO when OOB method is in use, then use this key pair instead of generating a new one. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 10:31:29 +02:00
Marcel Holtmann	0821a2c5ab	Bluetooth: Return LE SC confirm and random values for out-of-band data Then the local out-of-band data for LE SC pairing is requested via Read Local OOB Extended Data command, then fill in the values generated by the smp_generate_oob function. Every call of this command will overwrite previously generated values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 10:31:28 +02:00
Marcel Holtmann	60a27d653d	Bluetooth: Add function for generating LE SC out-of-band data This patch adds a smp_generate_oob function that allows to create local out-of-band data that can be used for pairing and also provides the confirmation and random value. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 10:31:27 +02:00
Marcel Holtmann	6e2dc6d113	Bluetooth: Add support for AES-CMAC hash for security manager device The security manager device will require the use of AES-CMAC hash for out-of-band data generation. This patch makes sure it is correctly set up and available. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 10:31:25 +02:00
Marcel Holtmann	88a479d950	Bluetooth: Create SMP device structure for local crypto context Every Bluetooth Low Energy controller requires a local crypto context to handle the resolvable private addresses. At the moment this is just a single crypto context, but for out-of-band data generation it will require an additional. To facility this, create a struct smp_dev that will hold all the extra information. This patch is just the refactoring in preparation for future changes. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 10:31:05 +02:00
Johannes Berg	f3b0bbb35d	mac80211: refactor drop connection/unlock in CSA processing The schedule_work()/mutex unlocking code is duplicated many times, refactor that to a common place in the function. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:31:02 +01:00
Marcel Holtmann	276812ec3e	Bluetooth: Use kzfree instead of kfree in security manager Within the security manager, it makes sense to use kzfree instead of kfree for all data structures. This ensures that no key material leaks by accident. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 10:30:53 +02:00
Emmanuel Grumbach	dc5a1ad7bd	mac80211: allow to get wireless_dev structure from ieee80211_vif This will allow mac80211 drivers to call cfg80211 APIs with the right handle. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:30:30 +01:00
Johannes Berg	45ceeee81e	mac80211: add comment for rx_path_lock Add a comment explaining how the RX path lock is used. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:08:20 +01:00
Johannes Berg	4cc0dba95a	mac80211: move netdev stats to common function Move the netdev stats accounting into the common function ieee80211_deliver_skb() that is called in both places. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-16 09:05:27 +01:00
Marcel Holtmann	aefedc1a4c	Bluetooth: Remove unneeded HCI_CONN_REMOTE_OOB connection flag The HCI_CONN_REMOTE_OOB connection flag is used to indicate if the pairing initiator has provided out-of-band data. However since that value is no longer used in any decision making, just remove it. It is actually unclear what purpose the OOB data present field from the HCI IO Capability Response event serves in the first place. If either side provided out-of-band data, then that data will be used for pairing. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 06:53:43 +02:00
Marcel Holtmann	455c2ff0a5	Bluetooth: Fix BR/EDR out-of-band pairing with only initiator data When only the pairing initiator is providing out-of-band data, then the receiver side was ignoring the data. For some reason the code was checking if the initiator has received out-of-band data and only then also provide the required inidication that the acceptor actually has the needed data available. For BR/EDR out-of-band pairing it is enough if one side has received out-of-band data. There are no extra checks needed here to make this work smoothly. The only thing that is needed is to tell the controller if data is present (and if it is P-192 or P-256 or both) and then let the controller actually figure out the rest. This means the check for outgoing connection or if the initiator has indicated data are completely pointless and are in fact actually causing harm. The check in question is this one: if (conn->out \|\| test_bit(HCI_CONN_REMOTE_OOB, &conn->flags)) { After just taking the conditional check out and always executing the code for determining the type of out-of-band data, the pairing works flawlessly and prodcudes authenticated link keys. The patch itself looks more complicated due to the reformatting of the indentation, but it essentially just a two-line change. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-16 06:53:19 +02:00
Scott Feldman	98237d433b	switchdev: use new swdev ops Move swdev wrappers over to new swdev ops (from previous ndo ops). No functional changes to the implementation. Signed-off-by: Scott Feldman <sfeldma@gmail.com> rocker: move to new swdev ops Signed-off-by: Scott Feldman <sfeldma@gmail.com> dsa: move to new swdev ops Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 00:14:43 -04:00
Al Viro	7d985ed1dc	rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() [I would really like an ACK on that one from dhowells; it appears to be quite straightforward, but...] MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit in there. It gets passed via flags; in fact, another such check in the same function is done correctly - as flags & MSG_PEEK. It had been that way (effectively disabled) for 8 years, though, so the patch needs beating up - that case had never been tested. If it is correct, it's -stable fodder. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 22:20:09 -04:00
Al Viro	3eeff778e0	caif: fix MSG_OOB test in caif_seqpkt_recvmsg() It should be checking flags, not msg->msg_flags. It's ->sendmsg() instances that need to look for that in ->msg_flags, ->recvmsg() ones (including the other ->recvmsg() instance in that file, as well as unix_dgram_recvmsg() this one claims to be imitating) check in flags. Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC in receive") back in 2010, so it goes quite a while back. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 22:19:17 -04:00
Alexei Starovoitov	9bac3d6d54	bpf: allow extended BPF programs access skb fields introduce user accessible mirror of in-kernel 'struct sk_buff': struct __sk_buff { __u32 len; __u32 pkt_type; __u32 mark; __u32 queue_mapping; }; bpf programs can do: int bpf_prog(struct __sk_buff skb) { __u32 var = skb->pkt_type; which will be compiled to bpf assembler as: dst_reg = (u32 )(src_reg + 4) // 4 == offsetof(struct __sk_buff, pkt_type) bpf verifier will check validity of access and will convert it to: dst_reg = (u8 *)(src_reg + offsetof(struct sk_buff, __pkt_type_offset)) dst_reg &= 7 since skb->pkt_type is a bitfield. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 22:02:28 -04:00
Daniel Borkmann	c04167ce2c	ebpf: add helper for obtaining current processor id This patch adds the possibility to obtain raw_smp_processor_id() in eBPF. Currently, this is only possible in classic BPF where commit `da2033c282` ("filter: add SKF_AD_RXHASH and SKF_AD_CPU") has added facilities for this. Perhaps most importantly, this would also allow us to track per CPU statistics with eBPF maps, or to implement a poor-man's per CPU data structure through eBPF maps. Example function proto-type looks like: u32 (smp_processor_id)(void) = (void )BPF_FUNC_get_smp_processor_id; Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 21:57:25 -04:00
Daniel Borkmann	03e69b508b	ebpf: add prandom helper for packet sampling This work is similar to commit `4cd3675ebf` ("filter: added BPF random opcode") and adds a possibility for packet sampling in eBPF. Currently, this is only possible in classic BPF and useful to combine sampling with f.e. packet sockets, possible also with tc. Example function proto-type looks like: u32 (prandom_u32)(void) = (void )BPF_FUNC_get_prandom_u32; Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 21:57:25 -04:00
Marcel Holtmann	4f0f155cea	Bluetooth: Add simple version of Read Local OOB Extended Data command This adds support for the simplest possible version of Read Local OOB Extended Data management command. It includes all mandatory fields, but none of the actual pairing related ones. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 10:05:29 +02:00
Marcel Holtmann	bea41609de	Bluetooth: Move eir_append_data function to a different location The eir_append_data helper function is needed for generating the extended local OOB data fields. So move it up into the right location. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 10:05:26 +02:00
Marcel Holtmann	d3d5305bfd	Bluetooth: Add simple version of Read Advertising Features command This adds support for the simplest possible version of Read Advertising Features management command. It allows basic testing of the interface. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 10:03:41 +02:00
Marcel Holtmann	f6b7712eb6	Bluetooth: Send global configuration updates to all management users Changes to the global configuration updates like settings, class of device, name etc. can be received by every user. They are allowed to read them in the first place so provide the updates via events as well. Otherwise untrusted users start polling for updates and that is not a desired behavior. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:59:39 +02:00
Marcel Holtmann	1195fbb8d0	Bluetooth: Open management interface for untrusted users Until now the management interface was restricted to CAP_NET_ADMIN. With this change every user can open the management socket. However the list of commands is heavily restricted to getting basic information about the attached controllers. No access for configuration or other operation is provided. The events are also limited. This is done so that no keys can leak or untrusted users can mess with the Bluetooth configuration. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:59:18 +02:00
Marcel Holtmann	c927a10487	Bluetooth: Add support for trust verification of management commands Check the required trust level of each management command with the trust level of the management socket. If it does not match up, then return the newly introduced permission denied error. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:58:56 +02:00
Marcel Holtmann	7aea8616cd	Bluetooth: Remove unneeded initializer for management command table The flags field for the management command table will be always initialized to zero and thus no need to do that manually. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:58:51 +02:00
Marcel Holtmann	c91041dc4e	Bluetooth: Add support for untrusted access to management commands Some management commands are safe to be accessed from any user without special permissions. First step for allowing access to any of these commands from untrusted application is to mark them accordingly. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:57:35 +02:00
Marcel Holtmann	c85be545ea	Bluetooth: Add hci_sock_test_flag helper function The management interface will need access to the socket flags and so provide a helper function for checking them. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:57:31 +02:00
Marcel Holtmann	c08b1a1dba	Bluetooth: Consolidate socket channel sending function back into one With the introduction of trusted socket flag for control and monitor channels, it is now possible to use a single function for sending packets to these sockets. And with that consolidate the handling. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:56:41 +02:00
Marcel Holtmann	50ebc055fa	Bluetooth: Introduce trusted flag for management control sockets Providing a global trusted flag for management control sockets provides an easy way for identifying sockets and imposing restriction on it. For now all management sockets are trusted since they require CAP_NET_ADMIN. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:56:00 +02:00
Marcel Holtmann	96f1474af0	Bluetooth: Add support for extended index management command The Read Extended Contoller Index List command can be used for retrieving the complete list of local available controllers. This included configured, unconfigured and also AMP controllers. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:55:51 +02:00
Marcel Holtmann	ced85549c3	Bluetooth: Add support for extended index management events This introduces support for using Extended Index Added and Extended Index Removed events. These events contain the controller type and also the hardware bus information from the driver. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:53:08 +02:00
Marcel Holtmann	f920733885	Bluetooth: Use special function to send filter management index events For sending Index Added, Index Removed, Unconfigured Index Added and Unconfigured Index Removed managment events the new helper functions allows taking into account if these events are enabled for a certain management socket or not. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:47:51 +02:00
Marcel Holtmann	17711c6291	Bluetooth: Provide hci_send_to_flagged_channel helper function The hci_send_to_flagged_channel helper function can be used to send packets to all channels that have a certain HCI socket flag set. This is especially useful for managment events that are limited to sockets that have first enabled certain functionality. This allows for filtering of events without confusing existing users. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:46:41 +02:00
Marcel Holtmann	6befc6445f	Bluetooth: Add flags field and setting function for HCI sockets To filter out certain actions for certain HCI sockets introcuce a flags field that allows to configure specific settings on individual sockets. Since the hci_pinfo structure is private in hci_sock.c, provide helper functions for setting and clearing a given flag. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-15 09:45:39 +02:00
Florian Fainelli	96026d057a	net: dsa: do not use slave MII bus for fixed PHYs Commit `cd28a1a9ba` ("net: dsa: fully divert PHY reads/writes if requested") introduced a check for particular PHYs that need to be accessed using the slave MII bus created by DSA, but this check was too inclusive. This would prevent fixed PHYs from being successfully registered because those should not go through the slave MII bus created by DSA. Make sure we check that the PHY is not a fixed PHY to prevent that from happening. Fixes: `cd28a1a9ba` ("net: dsa: fully divert PHY reads/writes if requested") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 19:16:44 -04:00
Venkat Venkatsubra	4c906c2798	bridge: reset bridge mtu after deleting an interface On adding an interface br_add_if() sets the MTU to the min of all the interfaces. Do the same thing on removing an interface too in br_del_if. Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 19:12:38 -04:00
Eric Dumazet	a4458343ac	inet_diag: factorize code in new inet_diag_msg_common_fill() helper Now the three type of sockets share a common base, we can factorize code in inet_diag_msg_common_fill(). inet_diag_entry no longer requires saddr_storage & daddr_storage and the extra copies. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 15:05:10 -04:00
Eric Dumazet	a07c92078d	inet_diag: adjust inet_sk_diag_fill() bug condition inet_sk_diag_fill() only copes with non timewait and non request socks Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 15:05:10 -04:00
Eric Dumazet	16f86165bd	inet: fill request sock ir_iif for IPv4 Once request socks will be in ehash table, they will need to have a valid ir_iff field. This is currently true only for IPv6. This patch extends support for IPv4 as well. This means inet_diag_fill_req() can now properly use ir_iif, which is better for IPv6 link locals anyway, as request sockets and established sockets will propagate consistent netlink idiag_if. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 15:05:10 -04:00
Jon Paul Maloy	e3eea1eb47	tipc: clean up handling of message priorities Messages transferred by TIPC are assigned an "importance priority", -an integer value indicating how to treat the message when there is link or destination socket congestion. There is no separate header field for this value. Instead, the message user values have been chosen in ascending order according to perceived importance, so that the message user field can be used for this. This is not a good solution. First, we have many more users than the needed priority levels, so we end up with treating more priority levels than necessary. Second, the user field cannot always accurately reflect the priority of the message. E.g., a message fragment packet should really have the priority of the enveloped user data message, and not the priority of the MSG_FRAGMENTER user. Until now, we have been working around this problem in different ways, but it is now time to implement a consistent way of handling such priorities, although still within the constraint that we cannot allocate any more bits in the regular data message header for this. In this commit, we define a new priority level, TIPC_SYSTEM_IMPORTANCE, that will be the only one used apart from the four (lower) user data levels. All non-data messages map down to this priority. Furthermore, we take some free bits from the MSG_FRAGMENTER header and allocate them to store the priority of the enveloped message. We then adjust the functions msg_importance()/msg_set_importance() so that they read/set the correct header fields depending on user type. This small protocol change is fully compatible, because the code at the receiving end of a link currently reads the importance level only from user data messages, where there is no change. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:38:32 -04:00
Jon Paul Maloy	05dcc5aa4d	tipc: split link outqueue struct tipc_link contains one single queue for outgoing packets, where both transmitted and waiting packets are queued. This infrastructure is hard to maintain, because we need to keep a number of fields to keep track of which packets are sent or unsent, and the number of packets in each category. A lot of code becomes simpler if we split this queue into a transmission queue, where sent/unacknowledged packets are kept, and a backlog queue, where we keep the not yet sent packets. In this commit we do this separation. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:38:32 -04:00
Jon Paul Maloy	2cdf3918e4	tipc: eliminate unnecessary call to broadcast ack function The unicast packet header contains a broadcast acknowledge sequence number, that may need to be conveyed to the broadcast link for proper treatment. Currently, the function tipc_rcv(), which is on the most critical data path, calls the function tipc_bclink_acknowledge() to have this done. This call is made for each received packet, and results in the unconditional grabbing of the broadcast link spinlock. This is unnecessary, since we can see directly from tipc_rcv() if the acknowledged number differs from what has been previously acked from the node in question. In the vast majority of cases the numbers won't differ, and there is nothing to update. We now make the call to tipc_bclink_acknowledge() conditional to that the two ack values differ. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:38:32 -04:00
Jon Paul Maloy	c1336ee472	tipc: extract bundled buffers by cloning instead of copying When we currently extract a bundled buffer from a message bundle in the function tipc_msg_extract(), we allocate a new buffer and explicitly copy the linear data area. This is unnecessary, since we can just clone the buffer and do skb_pull() on the clone to move the data pointer to the correct position. This is what we do in this commit. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:38:32 -04:00
Jon Paul Maloy	1149557d64	tipc: eliminate unnecessary linearization of incoming buffers Currently, TIPC linearizes all incoming buffers directly at reception before passing them upwards in the stack. This is clearly a waste of CPU resources, and must be avoided. In this commit, we eliminate this unnecessary linearization. We still ensure that at least the message header is linear, and that the buffer is linearized where this is still needed, i.e. when unbundling and when reversing messages. In addition, we ensure that fragmented messages are validated after reassembly before delivering them upwards in the stack. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:38:32 -04:00
Jon Paul Maloy	cf2157f88a	tipc: move message validation function to msg.c The function link_buf_validate() is in reality re-entrant and context independent, and will in later commits be called from several locations. Therefore, we move it to msg.c, make it outline and rename the it to tipc_msg_validate(). We also redesign the function to make proper use of pskb_may_pull() Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:38:32 -04:00
Jon Paul Maloy	7764d6e83d	tipc: add framework for node capabilities exchange The TIPC protocol spec has defined a 13 bit capability bitmap in the neighbor discovery header, as a means to maintain compatibility between different code and protocol generations. Until now this field has been unused. We now introduce the basic framework for exchanging capabilities between nodes at first contact. After exchange, a peer node's capabilities are stored as a 16 bit bitmap in struct tipc_node. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:38:32 -04:00
David S. Miller	5f1764ddfe	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== Here's another set of Bluetooth & ieee802154 patches intended for 4.1: - Added support for QCA ROME chipset family in the btusb driver - at86rf230 driver fixes & cleanups - ieee802154 cleanups - Refactoring of Bluetooth mgmt API to allow new users - New setting for static Bluetooth address exposed to user space - Refactoring of hci_dev flags to remove limit of 32 - Remove unnecessary fast-connectable setting usage restrictions - Fix behavior to be consistent when trying to pair already paired device - Service discovery corner-case fixes Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-14 14:29:45 -04:00
Julia Lawall	b6d595e3f7	ieee802154: don't export static symbol The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ type T; identifier f; @@ static T f (...) { ... } @@ identifier r.f; declarer name EXPORT_SYMBOL; @@ -EXPORT_SYMBOL(f); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-14 17:11:31 +01:00
Alexander Aring	3f3c4bb5ec	mac802154: correct max sifs size handling This patch fix the max sifs size correction when the IEEE802154_HW_TX_OMIT_CKSUM flag is set. With this flag the sk_buff doesn't contain the CRC, because the transceiver will add the CRC while transmit. Also add some defines for the max sifs frame size value and frame check sequence according to 802.15.4 standard. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-14 17:11:30 +01:00
Alexander Aring	022d07e3d8	ieee802154: remove deprecated sysfs entries It's only necessary to offer the name and index, others value are available over netlink. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-14 17:11:30 +01:00
Alexander Aring	c4dd7471de	ieee802154: change wpan-phy name to phy Currently the wpan_phy under /sys/class/ieee802154/ is named as "wpan-phy#", this patch will change the name to phy. This will introduce the same naming convention like wireless. Note: wpan-tools users will not type "wpan-phy#" anymore, just a simple "phy#" is enough. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-14 17:11:30 +01:00
Alexander Aring	965e613d29	ieee802154: 6lowpan: fix ARPHRD to ARPHRD_6LOWPAN Currently there exists two interface types with ARPHRD_IEEE802154. These are the 802.15.4 interfaces and 802.15.4 6LoWPAN interfaces. This is more a bug because some userspace applications checks on this value like wireshark. This occurs that wireshark will always try to parse a lowpan interface as 802.15.4 frames. With ARPHRD_6LOWPAN wireshark will parse it as IPv6 frames which is correct. Much applications checks on this value to readout the EUI64 mac address which should be the same for ARPHRD_6LOWPAN. BTLE 6LoWPAN and ieee802154 6LoWPAN will share now the same ARPHRD. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-14 17:11:30 +01:00
Eric Dumazet	c8e2c80d7e	inet_diag: fix possible overflow in inet_diag_dump_one_icsk() inet_diag_dump_one_icsk() allocates too small skb. Add inet_sk_attr_size() helper right before inet_sk_diag_fill() so that it can be updated if/when new attributes are added. iproute2/ss currently does not use this dump_one() interface, this might explain nobody noticed this problem yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 15:54:27 -04:00
Marcel Holtmann	b7cb93e528	Bluetooth: Merge hdev->dbg_flags fields into hdev->dev_flags With the extension of hdev->dev_flags utilizing a bitmap now, the space is no longer restricted. Merge the hdev->dbg_flags into hdev->dev_flags to save space on 64-bit architectures. On 32-bit architectures no size reduction happens. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 19:28:36 +02:00
Marcel Holtmann	eacb44dff9	Bluetooth: Use DECLARE_BITMAP for hdev->dev_flags field The hdev->dev_flags field has outgrown itself on 32-bit systems. So instead of hacking around it, switch to using DECLARE_BITMAP. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 18:35:45 +02:00
Christoph Hellwig	599bd19bdc	fs: don't allow to complete sync iocbs through aio_complete The AIO interface is fairly complex because it tries to allow filesystems to always work async and then wakeup a synchronous caller through aio_complete. It turns out that basically no one was doing this to avoid the complexity and context switches, and we've already fixed up the remaining users and can now get rid of this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-13 12:10:22 -04:00
Nicholas Mc Guire	55cc1d7806	SUNRPC: fix build-warning due to format missmatch fix build-warning introduced by commit: `f0eede10fd` ("SUNRPC: use jiffies_to_msecs for converting jiffies") which did not fixup the format properly (my bad). Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-03-13 09:05:20 -04:00
Herbert Xu	d8bdff59ce	netfilter: Fix potential crash in nft_hash walker When we get back an EAGAIN from rhashtable_walk_next we were treating it as a valid object which obviously doesn't work too well. Luckily this is hard to trigger so it seems nobody has run into it yet. This patch fixes it by redoing the next call when we get an EAGAIN. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-13 12:03:00 +01:00
Marcel Holtmann	238be788fc	Bluetooth: Introduce hci_dev_test_and_set_flag helper macro Instead of manually coding test_and_set_bit on hdev->dev_flags all the time, use hci_dev_test_and_set_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:09:33 +02:00
Marcel Holtmann	a69d892726	Bluetooth: Introduce hci_dev_test_and_clear_flag helper macro Instead of manually coding test_and_clear_bit on hdev->dev_flags all the time, use hci_dev_test_and_clear_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:09:32 +02:00
Marcel Holtmann	516018a9c0	Bluetooth: Introduce hci_dev_test_and_change_flag helper macro Instead of manually coding test_and_change_bit on hdev->dev_flags all the time, use hci_dev_test_and_change_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:09:31 +02:00
Marcel Holtmann	ce05d603af	Bluetooth: Introduce hci_dev_change_flag helper macro Instead of manually coding change_bit on hdev->dev_flags all the time, use hci_dev_change_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:09:29 +02:00
Marcel Holtmann	a358dc11d8	Bluetooth: Introduce hci_dev_clear_flag helper macro Instead of manually coding clear_bit on hdev->dev_flags all the time, use hci_dev_clear_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:09:27 +02:00
Marcel Holtmann	a1536da255	Bluetooth: Introduce hci_dev_set_flag helper macro Instead of manually coding set_bit on hdev->dev_flags all the time, use hci_dev_set_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:09:26 +02:00
Marcel Holtmann	d7a5a11d7f	Bluetooth: Introduce hci_dev_test_flag helper macro Instead of manually coding test_bit on hdev->dev_flags all the time, use hci_dev_test_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:09:25 +02:00
Marcel Holtmann	cc91cb042c	Bluetooth: Add support connectable advertising setting The patch adds a second advertising setting that allows switching of the controller into connectable mode independent of the global connectable setting. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-13 12:07:54 +02:00
Eric W. Biederman	098a697b49	tcp_metrics: Use a single hash table for all network namespaces. Now that all of the operations are safe on a single hash table accross network namespaces, allocate a single global hash table and update the code to use it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 01:57:07 -04:00
Eric W. Biederman	04f721c671	tcp_metrics: Rewrite tcp_metrics_flush_all Rewrite tcp_metrics_flush_all so that it can cope with entries from different network namespaces on it's hash chain. This is based on the logic in tcp_metrics_nl_cmd_del for deleting a selection of entries from a tcp metrics hash chain. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 01:57:07 -04:00
Eric W. Biederman	8a4bff714f	tcp_metrics: Remove the unused return code from tcp_metrics_flush_all tcp_metrics_flush_all always returns 0. Remove the unnecessary return code. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 01:57:07 -04:00
Eric W. Biederman	849e8a0ca8	tcp_metrics: Add a field tcpm_net and verify it matches on lookup In preparation for using one tcp metrics hash table for all network namespaces add a field tcpm_net to struct tcp_metrics_block, and verify that field on all hash table lookups. Make the field tcpm_net of type possible_net_t so it takes no space when network namespaces are disabled. Further add a function tm_net to read that field so we can be efficient when network namespaces are disabled and concise the rest of the time. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 01:57:07 -04:00
Eric W. Biederman	3e5da62d0b	tcp_metrics: Mix the network namespace into the hash function. In preparation for using one hash table for all network namespaces mix the network namespace into the hash value. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 01:57:07 -04:00
Eric W. Biederman	6493517eae	tcp_metrics: panic when tcp_metrics_init fails. There is not a practical way to cleanup during boot so just panic if there is a problem initializing tcp_metrics. That will at least give us a clear place to start debugging if something does go wrong. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 01:57:07 -04:00
Michael S. Tsirkin	8051a2a518	9p/trans_virtio: fix hot-unplug On device hot-unplug, 9p/virtio currently will kfree channel while it might still be in use. Of course, it might stay used forever, so it's an extremely ugly hack, but it seems better than use-after-free that we have now. [ Unused variable removed, whitespace cleanup, msg single-lined --RR ] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2015-03-13 15:55:41 +10:30
Christoph Hellwig	66ee59af63	fs: remove ki_nbytes There is no need to pass the total request length in the kiocb, as we already get passed in through the iov_iter argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-12 23:50:23 -04:00
Eric W. Biederman	76fecd8275	mpls: In mpls_egress verify the packet length. Reobert Shearman noticed that mpls_egress is failing to verify that the bytes to be examined are in fact present in the packet before mpls_egress reads those bytes. As suggested by David Miller reduce this to a single pskb_may_pull call so that we don't do unnecessary work in the fast path. Reported-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 23:05:04 -04:00
Eric Dumazet	3f66b083a5	inet: introduce ireq_family Before inserting request socks into general hash table, fill their socket family. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	d4f06873b6	inet: get_openreq4() & get_openreq6() do not need listener ireq->ir_num contains local port, use it. Also, get_openreq4() dumping listen_sk->refcnt makes litle sense. inet_diag_fill_req() can also use ireq->ir_num Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	41b822c59e	inet: prepare sock_edemux() & sock_gen_put() for new SYN_RECV state sock_edemux() & sock_gen_put() should be ready to cope with request socks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	0159dfd3d7	net: add req_prot_cleanup() & req_prot_init() helpers Make proto_register() & proto_unregister() a bit nicer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	bd337c581b	ipv6: add missing ireq_net & ir_cookie initializations I forgot to update dccp_v6_conn_request() & cookie_v6_check(). They both need to set ireq->ireq_net and ireq->ir_cookie Lets clear ireq->ir_cookie in inet_reqsk_alloc() Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `33cf7c90fe` ("net: add real socket cookies") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:12 -04:00
Daniel Borkmann	54720df130	cls_bpf: do eBPF invocation under non-bh RCU lock variant for maps Currently, it is possible in cls_bpf to access eBPF maps only under rcu_read_lock_bh() variants: while on ingress side, that is, handle_ing(), the classifier would be called from __netif_receive_skb_core() under rcu_read_lock(); on egress side, however, it's rcu_read_lock_bh() via __dev_queue_xmit(). This rcu/rcu_bh mix doesn't work together with eBPF maps as they require soley to be called under rcu_read_lock(). eBPF maps could also be shared among various other eBPF programs (possibly even with other eBPF program types, f.e. tracing) and user space processes, so any context is assumed. Therefore, a possible fix for cls_bpf is to wrap/nest eBPF program invocation under non-bh RCU lock variant. Fixes: `e2e9b6541d` ("cls_bpf: add initial eBPF support for programmable classifiers") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 18:33:15 -04:00
Alexander Duyck	0b65bd97ba	fib_trie: Provide a deterministic order for fib_alias w/ tables merged This change makes it so that we should always have a deterministic ordering for the main and local aliases within the merged table when two leaves overlap. So for example if we have a leaf with a key of 192.168.254.0. If we previously added two aliases with a prefix length of 24 from both local and main the first entry would be first and the second would be second. When I was coding this I had added a WARN_ON should such a situation occur as I wasn't sure how likely it would be. However this WARN_ON has been triggered so this is something that should be addressed. With this patch the ordering of the aliases is as follows. First they are sorted on prefix length, then on their table ID, then tos, and finally priority. This way what we end up doing is essentially interleaving the two tables on what used to be leaf_info structure boundaries. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 18:26:51 -04:00
Alexander Duyck	3c9e9f7320	fib_trie: Avoid NULL pointer if local table is not allocated The function fib_unmerge assumed the local table had already been allocated. If that is not the case however when custom rules are applied then this can result in a NULL pointer dereference. In order to prevent this we must check the value of the local table pointer and if it is NULL simply return 0 as there is no local table to separate from the main. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Reported-by: Madhu Challa <challa@noironetworks.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 18:26:51 -04:00
Eric W. Biederman	0c5c9fb551	net: Introduce possible_net_t Having to say > #ifdef CONFIG_NET_NS > struct net net; > #endif in structures is a little bit wordy and a little bit error prone. Instead it is possible to say: > typedef struct { > #ifdef CONFIG_NET_NS > struct net net; > #endif > } possible_net_t; And then in a header say: > possible_net_t net; Which is cleaner and easier to use and easier to test, as the possible_net_t is always there no matter what the compile options. Further this allows read_pnet and write_pnet to be functions in all cases which is better at catching typos. This change adds possible_net_t, updates the definitions of read_pnet and write_pnet, updates optional struct net * variables that write_pnet uses on to have the type possible_net_t, and finally fixes up the b0rked users of read_pnet and write_pnet. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:39:40 -04:00
Eric W. Biederman	efd7ef1c19	net: Kill hold_net release_net hold_net and release_net were an idea that turned out to be useless. The code has been disabled since 2008. Kill the code it is long past due. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:39:40 -04:00
Nicholas Mc Guire	f0eede10fd	SUNRPC: use jiffies_to_msecs for converting jiffies Use jiffies_to_msecs for converting jiffies as it handles all of the corner cases reliably and also helps readability. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-03-12 11:53:55 -04:00
Ian Wilson	78146572b9	netfilter: Zero the tuple in nfnl_cthelper_parse_tuple() nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(), nfnl_cthelper_get() and nfnl_cthelper_del(). In each case they pass a pointer to an nf_conntrack_tuple data structure local variable: struct nf_conntrack_tuple tuple; ... ret = nfnl_cthelper_parse_tuple(&tuple, tb[NFCTH_TUPLE]); The problem is that this local variable is not initialized, and nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and dst.protonum. This leaves all other fields with undefined values based on whatever is on the stack: tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM])); tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]); The symptom observed was that when the rpc and tns helpers were added then traffic to port 1536 was being sent to user-space. Signed-off-by: Ian Wilson <iwilson@brocade.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-12 13:07:36 +01:00
Marcel Holtmann	983f9814c0	Bluetooth: Remove two else branches that are not needed The SMP code contains two else branches that are not needed since the successful test will actually leave the function. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-12 09:00:48 +02:00
Arnd Bergmann	f862e07cf9	rds: avoid potential stack overflow The rds_iw_update_cm_id function stores a large 'struct rds_sock' object on the stack in order to pass a pair of addresses. This happens to just fit withint the 1024 byte stack size warning limit on x86, but just exceed that limit on ARM, which gives us this warning: net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=] As the use of this large variable is basically bogus, we can rearrange the code to not do that. Instead of passing an rds socket into rds_iw_get_device, we now just pass the two addresses that we have available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly, to create two address structures on the stack there. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 00:28:01 -04:00
Willem de Bruijn	3a8dd9711e	sock: fix possible NULL sk dereference in __skb_tstamp_tx Test that sk != NULL before reading sk->sk_tsflags. Fixes: `49ca0d8bfa` ("net-timestamp: no-payload option") Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 00:09:55 -04:00
Eric Dumazet	c29390c6df	xps: must clear sender_cpu before forwarding John reported that my previous commit added a regression on his router. This is because sender_cpu & napi_id share a common location, so get_xps_queue() can see garbage and perform an out of bound access. We need to make sure sender_cpu is cleared before doing the transmit, otherwise any NIC busy poll enabled (skb_mark_napi_id()) can trigger this bug. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: John <jw@nuclearfallout.net> Bisected-by: John <jw@nuclearfallout.net> Fixes: `2bd82484bb` ("xps: fix xps for stacked devices") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:51:18 -04:00
Eric Dumazet	d77c555d32	net: fix CONFIG_NET_NS=n compilation I forgot to use write_pnet() in three locations. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `33cf7c90fe` ("net: add real socket cookies") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:28:49 -04:00
Lubomir Rintel	c78ba6d64c	ipv6: expose RFC4191 route preference via rtnetlink This makes it possible to retain the route preference when RAs are handled in userspace. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:28:09 -04:00
Simon Horman	ac70c05b6f	switchdev: correct spelling of notifier in comments Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:06:39 -04:00
Eric Dumazet	33cf7c90fe	net: add real socket cookies A long standing problem in netlink socket dumps is the use of kernel socket addresses as cookies. 1) It is a security concern. 2) Sockets can be reused quite quickly, so there is no guarantee a cookie is used once and identify a flow. 3) request sock, establish sock, and timewait socks for a given flow have different cookies. Part of our effort to bring better TCP statistics requires to switch to a different allocator. In this patch, I chose to use a per network namespace 64bit generator, and to use it only in the case a socket needs to be dumped to netlink. (This might be refined later if needed) Note that I tried to carry cookies from request sock, to establish sock, then timewait sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eric Salo <salo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 21:55:28 -04:00
Alexey Kodanev	b1cb59cf2e	net: sysctl_net_core: check SNDBUF and RCVBUF for min length sysctl has sysctl.net.core.rmem_/wmem_ parameters which can be set to incorrect values. Given that 'struct sk_buff' allocates from rcvbuf, incorrectly set buffer length could result to memory allocation failures. For example, set them as follows: # sysctl net.core.rmem_default=64 net.core.wmem_default = 64 # sysctl net.core.wmem_default=64 net.core.wmem_default = 64 # ping localhost -s 1024 -i 0 > /dev/null This could result to the following failure: skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32 head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL> kernel BUG at net/core/skbuff.c:102! invalid opcode: 0000 [#1] SMP ... task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000 RIP: 0010:[<ffffffff8155fbd1>] [<ffffffff8155fbd1>] skb_put+0xa1/0xb0 RSP: 0018:ffff88003ae8bc68 EFLAGS: 00010296 RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000 RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8 RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000 R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900 FS: 00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0 Stack: ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550 ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00 Call Trace: [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470 [<ffffffff81556f56>] sock_write_iter+0x146/0x160 [<ffffffff811d9612>] new_sync_write+0x92/0xd0 [<ffffffff811d9cd6>] vfs_write+0xd6/0x180 [<ffffffff811da499>] SyS_write+0x59/0xd0 [<ffffffff81651532>] system_call_fastpath+0x12/0x17 Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00 00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 RIP [<ffffffff8155fbd1>] skb_put+0xa1/0xb0 RSP <ffff88003ae8bc68> Kernel panic - not syncing: Fatal exception Moreover, the possible minimum is 1, so we can get another kernel panic: ... BUG: unable to handle kernel paging request at ffff88013caee5c0 IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0 ... Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 21:25:13 -04:00
Alexander Duyck	654eff4516	fib_trie: Only display main table in /proc/net/route When we merged the tries for local and main I had overlooked the iterator for /proc/net/route. As a result it was outputting both local and main when the two tries were merged. This patch resolves that by only providing output for aliases that are actually in the main trie. As a result we should go back to the original behavior which I assume will be necessary to maintain legacy support. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 21:24:32 -04:00
Florian Fainelli	cd28a1a9ba	net: dsa: fully divert PHY reads/writes if requested In case a PHY is found via Device Tree, and is also flagged by the switch driver as needing indirect reads/writes using the switch driver implemented MDIO bus, make sure that we bind this PHY to the slave MII bus in order for this to happen. Without this, we would succeed in having the PHY driver probe()'s function to use slave MII bus read/write functions, because this is done during dsa_slave_mii_init(), but past that point, the PHY driver would not go through these diverted reads and writes. Fixes: `0d8bcdd383` ("net: dsa: allow for more complex PHY setups") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 17:56:29 -04:00
Florian Fainelli	c305c1651c	net: dsa: move PHY setup on DSA MII bus to its own function In preparation for dealing with indirect reads and writes towards certain PHY devices, move the code which deals with binding the PHY device to the slave MII bus created by DSA to its own function: dsa_slave_phy_connect(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 17:56:28 -04:00
Alexander Duyck	61f0d861fc	fib_trie: Fix uninitialized variable warning The 0-day kernel test infrastructure reported a use of uninitialized variable warning for local_table due to the fact that the local and main allocations had been swapped from the original setup. This change corrects that by making it so that we free the main table if the local table allocation fails. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 17:33:44 -04:00
Neal Cardwell	d578e18ce9	tcp: restore 1.5x per RTT limit to CUBIC cwnd growth in congestion avoidance Commit `814d488c61` ("tcp: fix the timid additive increase on stretch ACKs") fixed a bug where tcp_cong_avoid_ai() would either credit a connection with an increase of snd_cwnd_cnt, or increase snd_cwnd, but not both, resulting in cwnd increasing by 1 packet on at most every alternate invocation of tcp_cong_avoid_ai(). Although the commit correctly implemented the CUBIC algorithm, which can increase cwnd by as much as 1 packet per 1 packet ACKed (2x per RTT), in practice that could be too aggressive: in tests on network paths with small buffers, YouTube server retransmission rates nearly doubled. This commit restores CUBIC to a maximum cwnd growth rate of 1 packet per 2 packets ACKed (1.5x per RTT). In YouTube tests this restored retransmit rates to low levels. Testing: This patch has been tested in datacenter netperf transfers and live youtube.com and google.com servers. Fixes: `9cd981dcf1` ("tcp: fix stretch ACK bugs in CUBIC") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:51:51 -04:00
Neal Cardwell	9949afa42b	tcp: fix tcp_cong_avoid_ai() credit accumulation bug with decreases in w The recent change to tcp_cong_avoid_ai() to handle stretch ACKs introduced a bug where snd_cwnd_cnt could accumulate a very large value while w was large, and then if w was reduced snd_cwnd could be incremented by a large delta, leading to a large burst and high packet loss. This was tickled when CUBIC's bictcp_update() sets "ca->cnt = 100 * cwnd". This bug crept in while preparing the upstream version of `814d488c61`. Testing: This patch has been tested in datacenter netperf transfers and live youtube.com and google.com servers. Fixes: `814d488c61` ("tcp: fix the timid additive increase on stretch ACKs") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:51:51 -04:00
Sabrina Dubroca	6dede75b7e	fib_trie: call fib_table_flush_external under RTNL Move rtnl_lock() before the call to fib4_rules_exit so that fib_table_flush_external is called under RTNL. Fixes: `104616e74e` ("switchdev: don't support custom ip rules, for now") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:46:26 -04:00
Robert Shearman	8a08919f43	mpls: Allow mpls_gso and mpls_router to be built as modules CONFIG_MPLS=m doesn't result in a kernel module being built because it applies to the net/mpls directory, rather than to .o files. So revert the MPLS menuitem to being a boolean and make MPLS_GSO and MPLS_ROUTING tristates to allow mpls_gso and mpls_router modules to be produced as desired. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:38:54 -04:00
Alexander Duyck	0ddcf43d5d	ipv4: FIB Local/MAIN table collapse This patch is meant to collapse local and main into one by converting tb_data from an array to a pointer. Doing this allows us to point the local table into the main while maintaining the same variables in the table. As such the tb_data was converted from an array to a pointer, and a new array called data is added in order to still provide an object for tb_data to point to. In order to track the origin of the fib aliases a tb_id value was added in a hole that existed on 64b systems. Using this we can also reverse the merge in the event that custom FIB rules are enabled. With this patch I am seeing an improvement of 20ns to 30ns for routing lookups as long as custom rules are not enabled, with custom rules enabled we fall back to split tables and the original behavior. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:22:14 -04:00
Johan Hedberg	4ba9faf35f	Bluetooth: Check for matching IRK when looking for paired LE devices If we're given an RPA when checking whether we're paired or not, we should consult the local RPA storage whether there's a matching IRK. This we we ensure that hci_bdaddr_is_paired() gives the right result even when trying to pair a second time with the same device with an RPA. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-11 15:54:23 +01:00
Johan Hedberg	87c8b28d29	Bluetooth: Fix missing rcu_read_unlock() in hci_bdaddr_is_paired() When finding a matching LTK the rcu_read_unlock() function was failing to release the RCU read lock. This patch adds the missing call to rcu_reaD_unlock(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-11 08:52:32 +01:00
Marcel Holtmann	beb1c21b8e	Bluetooth: Increment management interface revision This patch increments the management interface revision due to introduction of new static address setting and fixes for the fast connectable feature. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-11 09:28:41 +02:00
David S. Miller	4363890079	net: Handle unregister properly when netdev namespace change fails. If rtnl_newlink() fails on it's call to dev_change_net_namespace(), we have to make use of the ->dellink() method, if present, just like we do when rtnl_configure_link() fails. Fixes: `317f4810e4` ("rtnl: allow to create device with IFLA_LINK_NETNSID set") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 21:59:46 -04:00
Jon Paul Maloy	169bf9121b	tipc: ensure that idle links are deleted when a bearer is disabled commit `afaa3f65f6` (tipc: purge links when bearer is disabled) was an attempt to resolve a problem that turned out to have a more profound reason. When we disable a bearer, we delete all its pertaining links if there is no other bearer to perform failover to, or if the module is shutting down. In case there are dual bearers, we wait with deleting links until the failover procedure is finished. However, this misses the case when a link on the removed bearer was already down, so that there will be no failover procedure to finish the link delete. This causes confusion if a new bearer is added to replace the removed one, and also entails a small memory leak. This commit takes the current state of the link into account when deciding when to delete it, and also reverses the above-mentioned commit. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 18:37:36 -04:00
Alexander Duyck	ddb4b9a132	fib_trie: Address possible NULL pointer dereference in resize If the inflate call failed it would return NULL. As a result tp would be set to NULL and cause use to trigger a NULL pointer dereference in should_halve if the inflate failed on the first attempt. In order to prevent this we should decrement max_work before we actually attempt to inflate as this will force us to exit before attempting to halve a node we should have inflated. In order to keep things symmetric between inflate and halve I went ahead and also moved the decrement of max_work for the halve case as well so we take care of that before we actually attempt to halve the tnode. Fixes: `88bae714` ("fib_trie: Add key vector to root, return parent key_vector in resize") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 18:36:56 -04:00
Johan Hedberg	55e76b3898	Bluetooth: Add 'Already Paired' error for Pair Device command To make the behavior predictable when attempting to pair with a device for which we already have a Link Key or Long Term Key, this patch adds a new 'Already Paired' error which gets sent in such a scenario. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-10 21:42:05 +01:00
Alexander Duyck	3ec320dd5c	fib_trie: Correctly handle case of key == 0 in leaf_walk_rcu In the case of a trie that had no tnodes with a key of 0 the initial look-up would fail resulting in an out-of-bounds cindex on the first tnode. This resulted in an entire trie being skipped. In order resolve this I have updated the cindex logic in the initial look-up so that if the key is zero we will always traverse the child zero path. Fixes: `8be33e95` ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf") Reported-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 16:13:55 -04:00
Oliver Hartkopp	7768eed8bf	net: add comment for sock_efree() usage Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 16:12:20 -04:00
Johan Hedberg	406ef2a67b	Bluetooth: Make Fast Connectable available while powered off To maximize the usability of the Fast Connectable feature we should make it possible to set (or unset) it at any given moment. This means removing the dependency on the 'connectable' setting as well as the 'powered' setting. The former makes also sense since page scan may get enabled through add_device even if 'connectable' is false. To keep the setting available over power cycles its flag also needs to be removed from the flags that are cleared upon HCI_Reset. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-10 19:37:02 +01:00
Eric Dumazet	34160ea3f9	inet_diag: add const to inet_diag_req_v2 diag dumpers should not modify the request. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 13:45:28 -04:00
Eric Dumazet	e31c5e0e48	inet_diag: cleanups Remove all inline keywords, add some const, and cleanup style. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 13:45:28 -04:00
Eric Dumazet	491da2a477	net: constify sock_diag_check_cookie() sock_diag_check_cookie() second parameter is constant Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 13:45:28 -04:00
David S. Miller	515fb5c317	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter fixes for net-next The following batch contains a couple of fixes to address some fallout from the previous pull request, they are: 1) Address link problems in the bridge code after `e5de75b`. Fix it by using rcu hook to address to avoid ifdef pollution and hard dependency between bridge and br_netfilter. 2) Address sparse warnings in the netfilter reject code, patch from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 12:48:47 -04:00
Pablo Neira Ayuso	1a4ba64d16	netfilter: bridge: use rcu hook to resolve br_netfilter dependency `e5de75b` ("netfilter: bridge: move DNAT helper to br_netfilter") results in the following link problem: net/bridge/br_device.c:29: undefined reference to `br_nf_prerouting_finish_bridge` Moreover it creates a hard dependency between br_netfilter and the bridge core, which is what we've been trying to avoid so far. Resolve this problem by using a hook structure so we reduce #ifdef pollution and keep bridge netfilter specific code under br_netfilter.c which was the original intention. Reported-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-10 15:03:02 +01:00
Florian Westphal	a03a8dbe20	netfilter: fix sparse warnings in reject handling make C=1 CF=-D__CHECK_ENDIAN__ shows following: net/bridge/netfilter/nft_reject_bridge.c:65:50: warning: incorrect type in argument 3 (different base types) net/bridge/netfilter/nft_reject_bridge.c:65:50: expected restricted __be16 [usertype] protocol [..] net/bridge/netfilter/nft_reject_bridge.c:102:37: warning: cast from restricted __be16 net/bridge/netfilter/nft_reject_bridge.c:102:37: warning: incorrect type in argument 1 (different base types) [..] net/bridge/netfilter/nft_reject_bridge.c:121:50: warning: incorrect type in argument 3 (different base types) [..] net/bridge/netfilter/nft_reject_bridge.c:168:52: warning: incorrect type in argument 3 (different base types) [..] net/bridge/netfilter/nft_reject_bridge.c:233:52: warning: incorrect type in argument 3 (different base types) [..] Caused by two (harmless) errors: 1. htons() instead of ntohs() 2. __be16 for protocol in nf_reject_ipXhdr_put API, use u8 instead. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-10 15:01:32 +01:00
Scott Feldman	f8f2147150	switchdev: add netlink flags to IPv4 FIB add op Pass in the netlink flags (NLM_F_*) into switchdev driver for IPv4 FIB add op to allow driver to 1) optimize hardware updates, 2) handle ip route prepend and append commands correctly. Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:56:52 -04:00
Florian Fainelli	769a020289	net: dsa: utilize of_find_net_device_by_node Using of_find_device_by_node() restricts the search to platform_device that match the specified device_node pointer. This is not even remotely true for network devices backed by a pci_device for instance. of_find_net_device_by_node() allows us to do a more thorough lookup to find the struct net_device corresponding to a particular device_node pointer. For symetry with the non-OF code path, we hold the net_device pointer in dsa_probe() just like what dev_to_net_dev() does when we call this function. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:50:21 -04:00
Florian Fainelli	aa836df958	net: core: add of_find_net_device_by_node() Add a helper function which allows getting the struct net_device pointer associated with a given struct device_node pointer. This is useful for instance for DSA Ethernet devices not backed by a platform_device, but a PCI device. Since we need to access net_class which is not accessible outside of net/core/net-sysfs.c, this helper function is also added here and gated with CONFIG_OF_NET. Network devices initialized with SET_NETDEV_DEV() are also taken into account by checking for dev->parent first and then falling back to checking the device pointer within struct net_device. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:50:20 -04:00
WANG Cong	5778d39d07	net_sched: fix struct tc_u_hnode layout in u32 We dynamically allocate divisor+1 entries for ->ht[] in tc_u_hnode: ht = kzalloc(sizeof(ht) + divisorsizeof(void *), GFP_KERNEL); So ->ht is supposed to be the last field of this struct, however this is broken, since an rcu head is appended after it. Fixes: `1ce87720d4` ("net: sched: make cls_u32 lockless") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:44:31 -04:00
David S. Miller	3cef5c5b0b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/cadence/macb.c Overlapping changes in macb driver, mostly fixes and cleanups in 'net' overlapping with the integration of at91_ether into macb in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:38:02 -04:00
Linus Torvalds	36bef88380	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) nft_compat accidently truncates ethernet protocol to 8-bits, from Arturo Borrero. 2) Memory leak in ip_vs_proc_conn(), from Julian Anastasov. 3) Don't allow the space required for nftables rules to exceed the maximum value representable in the dlen field. From Patrick McHardy. 4) bcm63xx_enet can accidently leave interrupts permanently disabled due to errors in the NAPI polling exit logic. Fix from Nicolas Schichan. 5) Fix OOPSes triggerable by the ping protocol module, due to missing address family validations etc. From Lorenzo Colitti. 6) Don't use RCU locking in sleepable context in team driver, from Jiri Pirko. 7) xen-netback miscalculates statistic offset pointers when reporting the stats to userspace. From David Vrabel. 8) Fix a leak of up to 256 pages per VIF destroy in xen-netaback, also from David Vrabel. 9) ip_check_defrag() cannot assume that skb_network_offset(), particularly when it is used by the AF_PACKET fanout defrag code. From Alexander Drozdov. 10) gianfar driver doesn't query OF node names properly when trying to determine the number of hw queues available. Fix it to explicitly check for OF nodes named queue-group. From Tobias Waldekranz. 11) MID field in macb driver should be 12 bits, not 16. From Punnaiah Choudary Kalluri. 12) Fix unintentional regression in traceroute due to timestamp socket option changes. Empty ICMP payloads should be allowed in non-timestamp cases. From Willem de Bruijn. 13) When devices are unregistered, we have to get rid of AF_PACKET multicast list entries that point to it via ifindex. Fix from Francesco Ruggeri. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits) tipc: fix bug in link failover handling net: delete stale packet_mclist entries net: macb: constify macb configuration data MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer MAINTAINERS: linux-can moved to github can: kvaser_usb: Read all messages in a bulk-in URB buffer can: kvaser_usb: Avoid double free on URB submission failures can: peak_usb: fix missing ctrlmode_ init for every dev can: add missing initialisations in CAN related skbuffs ip: fix error queue empty skb handling bgmac: Clean warning messages tcp: align tcp_xmit_size_goal() on tcp_tso_autosize() net: fec: fix unbalanced clk disable on driver unbind net: macb: Correct the MID field length value net: gianfar: correctly determine the number of queue groups ipv4: ip_check_defrag should not assume that skb_network_offset is zero net: bcmgenet: properly disable password matching net: eth: xgene: fix booting with devicetree bnx2x: Force fundamental reset for EEH recovery xen-netback: refactor xenvif_handle_frag_list() ...	2015-03-09 18:17:21 -07:00
Jon Paul Maloy	e6441bae32	tipc: fix bug in link failover handling In commit `c637c10355` ("tipc: resolve race problem at unicast message reception") we introduced a new mechanism for delivering buffers upwards from link to socket layer. That code contains a bug in how we handle the new link input queue during failover. When a link is reset, some of its users may be blocked because of congestion, and in order to resolve this, we add any pending wakeup pseudo messages to the link's input queue, and deliver them to the socket. This misses the case where the other, remaining link also may have congested users. Currently, the owner node's reference to the remaining link's input queue is unconditionally overwritten by the reset link's input queue. This has the effect that wakeup events from the remaining link may be unduely delayed (but not lost) for a potentially long period. We fix this by adding the pending events from the reset link to the input queue that is currently referenced by the node, whichever one it is. This commit should be applied to both net and net-next. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 16:20:41 -04:00
Francesco Ruggeri	82f17091e6	net: delete stale packet_mclist entries When an interface is deleted from a net namespace the ifindex in the corresponding entries in PF_PACKET sockets' mclists becomes stale. This can create inconsistencies if later an interface with the same ifindex is moved from a different namespace (not that unlikely since ifindexes are per-namespace). In particular we saw problems with dev->promiscuity, resulting in "promiscuity touches roof, set promiscuity failed. promiscuity feature of device might be broken" warnings and EOVERFLOW failures of setsockopt(PACKET_ADD_MEMBERSHIP). This patch deletes the mclist entries for interfaces that are deleted. Since this now causes setsockopt(PACKET_DROP_MEMBERSHIP) to fail with EADDRNOTAVAIL if called after the interface is deleted, also make packet_mc_drop not fail. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 16:17:43 -04:00
Eric W. Biederman	ddb3b6033c	net: Remove protocol from struct dst_ops After my change to neigh_hh_init to obtain the protocol from the neigh_table there are no more users of protocol in struct dst_ops. Remove the protocol field from dst_ops and all of it's initializers. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 16:06:10 -04:00
David S. Miller	5428aef811	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. Basically, improvements for the packet rejection infrastructure, deprecation of CLUSTERIP, cleanups for nf_tables and some untangling for br_netfilter. More specifically they are: 1) Send packet to reset flow if checksum is valid, from Florian Westphal. 2) Fix nf_tables reject bridge from the input chain, also from Florian. 3) Deprecate the CLUSTERIP target, the cluster match supersedes it in functionality and it's known to have problems. 4) A couple of cleanups for nf_tables rule tracing infrastructure, from Patrick McHardy. 5) Another cleanup to place transaction declarations at the bottom of nf_tables.h, also from Patrick. 6) Consolidate Kconfig dependencies wrt. NF_TABLES. 7) Limit table names to 32 bytes in nf_tables. 8) mac header copying in bridge netfilter is already required when calling ip_fragment(), from Florian Westphal. 9) move nf_bridge_update_protocol() to br_netfilter.c, also from Florian. 10) Small refactor in br_netfilter in the transmission path, again from Florian. 11) Move br_nf_pre_routing_finish_bridge_slow() to br_netfilter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:58:21 -04:00
Geert Uytterhoeven	26c459a807	mpls: Spelling: s/conceved/conceived/, s/as/a/ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:51:59 -04:00
Erik Hugne	143fe22f50	tipc: fix inconsistent signal handling regression Commit `9bbb4ecc68` ("tipc: standardize recvmsg routine") changed the sleep/wakeup behaviour for sockets entering recv() or accept(). In this process the order of reporting -EAGAIN/-EINTR was reversed. This caused problems with wrong errno being reported back if the timeout expires. The same problem happens if the socket is nonblocking and recv()/accept() is called when the process have pending signals. If there is no pending data read or connections to accept, -EINTR will be returned instead of -EAGAIN. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reported-by László Benedek <laszlo.benedek@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:42:19 -04:00
Jiri Pirko	f4427bc3e2	switchdev: use gpl variant of symbol export Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:37:39 -04:00
Erik Hugne	4fee6be813	tipc: sparse: fix htons conversion warnings Commit `d0f91938be` ("tipc: add ip/udp media type") introduced some new sparse warnings. Clean them up. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:37:04 -04:00
Cong Wang	1e052be69d	net_sched: destroy proto tp when all filters are gone Kernel automatically creates a tp for each (kind, protocol, priority) tuple, which has handle 0, when we add a new filter, but it still is left there after we remove our own, unless we don't specify the handle (literally means all the filters under the tuple). For example this one is left: # tc filter show dev eth0 filter parent 8001: protocol arp pref 49152 basic The user-space is hard to clean up these for kernel because filters like u32 are organized in a complex way. So kernel is responsible to remove it after all filters are gone. Each type of filter has its own way to store the filters, so each type has to provide its way to check if all filters are gone. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim<jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:35:55 -04:00
Pablo Neira Ayuso	e5de75bf88	netfilter: bridge: move DNAT helper to br_netfilter Only one caller, there is no need to keep this in a header. Move it to br_netfilter.c where this belongs to. Based on patch from Florian Westphal. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 17:56:07 +01:00
Florian Westphal	7a8d831df5	netfilter: bridge: refactor conditional in br_nf_dev_queue_xmit simpilifies followup patch that re-works brnf ip_fragment handling. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 13:22:58 +01:00
Florian Westphal	4a9d2f2008	netfilter: bridge: move nf_bridge_update_protocol to where its used no need to keep it in a header file. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 13:21:31 +01:00
Florian Westphal	8bd63cf1a4	bridge: move mac header copying into br_netfilter The mac header only has to be copied back into the skb for fragments generated by ip_fragment(), which only happens for bridge forwarded packets with nf-call-iptables=1 && active nf_defrag. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 13:20:48 +01:00
Oliver Hartkopp	969439016d	can: add missing initialisations in CAN related skbuffs When accessing CAN network interfaces with AF_PACKET sockets e.g. by dhclient this can lead to a skb_under_panic due to missing skb initialisations. Add the missing initialisations at the CAN skbuff creation times on driver level (rx path) and in the network layer (tx path). Reported-by: Austin Schuh <austin@peloton-tech.com> Reported-by: Daniel Steer <daniel.steer@mclaren.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-03-09 10:22:24 +01:00
Willem de Bruijn	c247f0534c	ip: fix error queue empty skb handling When reading from the error queue, msg_name and msg_control are only populated for some errors. A new exception for empty timestamp skbs added a false positive on icmp errors without payload. `traceroute -M udpconn` only displayed gateways that return payload with the icmp error: the embedded network headers are pulled before sock_queue_err_skb, leaving an skb with skb->len == 0 otherwise. Fix this regression by refining when msg_name and msg_control branches are taken. The solutions for the two fields are independent. msg_name only makes sense for errors that configure serr->port and serr->addr_offset. Test the first instead of skb->len. This also fixes another issue. saddr could hold the wrong data, as serr->addr_offset is not initialized in some code paths, pointing to the start of the network header. It is only valid when serr->port is set (non-zero). msg_control support differs between IPv4 and IPv6. IPv4 only honors requests for ICMP and timestamps with SOF_TIMESTAMPING_OPT_CMSG. The skb->len test can simply be removed, because skb->dev is also tested and never true for empty skbs. IPv6 honors requests for all errors aside from local errors and timestamps on empty skbs. In both cases, make the policy more explicit by moving this logic to a new function that decides whether to process msg_control and that optionally prepares the necessary fields in skb->cb[]. After this change, the IPv4 and IPv6 paths are more similar. The last case is rxrpc. Here, simply refine to only match timestamps. Fixes: `49ca0d8bfa` ("net-timestamp: no-payload option") Reported-by: Jan Niehusmann <jan@gondor.com> Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes v1->v2 - fix local origin test inversion in ip6_datagram_support_cmsg - make v4 and v6 code paths more similar by introducing analogous ipv4_datagram_support_cmsg - fix compile bug in rxrpc Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 23:01:54 -04:00
Eric W. Biederman	b79bda3d38	neigh: Use neigh table index for neigh_packet_xmit Remove a little bit of unnecessary work when transmitting a packet with neigh_packet_xmit. Use the neighbour table index not the address family as a parameter. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	7d5f41f276	mpls: Fix the openvswitch select of NET_MPLS_GSO Fix the OPENVSWITCH Kconfig option and old Kconfigs by having OPENVSWITCH select both NET_MPLS_GSO and MPLSO. A Kbuild test robot reported that when NET_MPLS_GSO is selected by OPENVSWITCH the generated .config is broken because MPLS is not selected. Cc: Simon Horman <horms@verge.net.au> Fixes: `cec9166ca4` mpls: Refactor how the mpls module is built Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	aa7da93756	mpls: Correct the ttl decrement. According to RFC3032 section 2.4.2 packets with an outgoing ttl of 0 MUST NOT be forwarded. According to section 2.4.1 an outgoing TTL of 0 comes from an incomming TTL <= 1. Therefore any packets that is received with a ttl <= 1 should not have it's ttl decremented and forwarded. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	0f7bbd5805	mpls: Better error code for unsupported option. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	19d0c341d9	mpls: Cleanup the rcu usage in the code. Sparse was generating a lot of warnings mostly from missing annotations in the code. Add missing annotations and in a few cases tweak the code for performance by moving work before loops. This also fixes a problematic ommision of rcu_assign_pointer and rcu_dereference. Hopefully with complete rcu annotations any new rcu errors will stick out like a sore thumb. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	d865616e18	mpls: Fix the kzalloc argument order in mpls_rt_alloc Blink I got the argument order wrong to kzalloc and the code was working properly when tested. Blink Fix that. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Al Viro	1711fd9add	sunrpc: fix braino in ->poll() POLL_OUT isn't what callers of ->poll() are expecting to see; it's actually __SI_POLL \| 2 and it's a siginfo code, not a poll bitmap bit... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Cc: Bruce Fields <bfields@fieldses.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-03-08 12:53:46 -07:00
Linus Torvalds	bbbce516bb	TTY/Serial fixes for 4.0-rc3 Here are some tty and serial driver fixes for 4.0-rc3. Along with the atime fix that you know about, here are some other serial driver bugfixes as well. Most notable is a wait_until_sent bugfix that was traced back to being around since before 2.6.12 that Johan has fixed up. All have been in linux-next successfully. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlT8RCYACgkQMUfUDdst+yk62QCgycxS4giC2hyRver3dyvaNR6g zYYAn2w0uRndW+AqP4Tls54isRz6owpF =gA2k -----END PGP SIGNATURE----- Merge tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some tty and serial driver fixes for 4.0-rc3. Along with the atime fix that you know about, here are some other serial driver bugfixes as well. Most notable is a wait_until_sent bugfix that was traced back to being around since before 2.6.12 that Johan has fixed up. All have been in linux-next successfully" * tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: TTY: fix tty_wait_until_sent maximum timeout TTY: fix tty_wait_until_sent on 64-bit machines USB: serial: fix infinite wait_until_sent timeout TTY: bfin_jtag_comm: remove incorrect wait_until_sent operation net: irda: fix wait_until_sent poll timeout serial: uapi: Declare all userspace-visible io types serial: core: Fix iotype userspace breakage serial: sprd: Fix missing spin_unlock in sprd_handle_irq() console: Fix console name size mismatch tty: fix up atime/mtime mess, take four serial: 8250_dw: Fix get_mctrl behaviour serial:8250:8250_pci: delete unneeded quirk entries serial:8250:8250_pci: fix redundant entry report for WCH_CH352_2S Change email address for 8250_pci serial: 8250: Revert "tty: serial: 8250_core: read only RX if there is something in the FIFO" Revert "tty/serial: of_serial: add DT alias ID handling"	2015-03-08 12:25:40 -07:00
Alexander Aring	0402d9f233	Bluetooth: fix sco_exit compile warning While compiling the following warning occurs: WARNING: net/built-in.o(.init.text+0x602c): Section mismatch in reference from the function bt_init() to the function .exit.text:sco_exit() The function __init bt_init() references a function __exit sco_exit(). This is often seen when error handling in the init function uses functionality in the exit path. The fix is often to remove the __exit annotation of sco_exit() so it may be used outside an exit section. Since commit `6d785aa345` ("Bluetooth: Convert mgmt to use HCI chan registration API") the function "sco_exit" is used inside of function "bt_init". The suggested solution by remove the __exit annotation solved this issue. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-07 22:13:17 +02:00
Eric Dumazet	58025e46ea	net: gro: remove obsolete code from skb_gro_receive() Some drivers use copybreak to copy tiny frames into smaller skb, and this smaller skb might not have skb->head_frag set for various reasons. skb_gro_receive() currently doesn't allow to aggregate the smaller skb into the previous GRO packet if this GRO packet has at least 2 MSS in it. Following workload easily demonstrates the problem. netperf -t TCP_RR -H target -- -r 3000,3000 (tcpdump shows one GRO packet with 2 MSS, plus one additional packet of 104 bytes that should have been appended.) It turns out that we can remove code from skb_gro_receive(), because commit `8a29111c7c` ("net: gro: allow to build full sized skb") and its followups removed the assumption that a GRO packet with a frag_list had to have an empty head. Removing this code allows the aggregation of the last (incomplete) frame in some RPC workloads. Note that tcp_gro_receive() already takes care of forcing a flush if necessary, including this case. If we want to avoid using frag_list in the first place (in forwarding workloads for example, as the outgoing NIC is generally not able to cope with skbs having a frag_list), we need to address this separately. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 21:50:55 -05:00
Shani Michaeli	c93682477b	net/dcb: Add IEEE QCN attribute As specified in 802.1Qau spec. Add this optional attribute to the DCB netlink layer. To allow for application to use the new attribute, NIC drivers should implement and register the callbacks ieee_getqcn, ieee_setqcn and ieee_getqcnstats. The QCN attribute holds a set of parameters for management, and a set of statistics to provide informative data on Congestion-Control defined by this spec. Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 21:50:02 -05:00
Johan Hovold	2c3fbe3cf2	net: irda: fix wait_until_sent poll timeout In case an infinite timeout (0) is requested, the irda wait_until_sent implementation would use a zero poll timeout rather than the default 200ms. Note that wait_until_sent is currently never called with a 0-timeout argument due to a bug in tty_wait_until_sent. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable <stable@vger.kernel.org> # v2.6.12 Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-03-07 03:44:14 +01:00
Masanari Iida	d939be3add	treewide: Fix typo in printk messages This patch fix spelling typo in printk messages. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2015-03-06 23:05:39 +01:00
Masanari Iida	f42cf8d6a3	treewide: Fix typo in printk messages This patch fix spelling typo in printk messages. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2015-03-06 23:04:40 +01:00
Alexander Duyck	88bae7149a	fib_trie: Add key vector to root, return parent key_vector in resize This change makes it so that the root of the trie contains a key_vector, by doing this we make room to essentially collapse the entire trie by at least one cache line as we can store the information about the tnode or leaf that is pointed to in the root. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	f23e59fbd7	fib_trie: Move parent from key_vector to tnode This change pulls the parent pointer from the key_vector and places it in the tnode structure. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	6e22d174ba	fib_trie: Pull empty_children and full_children into tnode This pulls the information about the child array out of the key_vector and places it in the tnode since that is where it is needed. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	56ca2adf6a	fib_trie: Move rcu from key_vector to tnode, add accessors. RCU is only needed once for the entire node, not once per key_vector so we can pull that out and move it to the tnode structure. In addition add accessors to be used inside the RCU functions so that we can more easily get from the key vector to either the tnode or the trie pointers. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	dc35dbeda3	fib_trie: Add tnode struct as a container for fields not needed in key_vector This change pulls the fields not explicitly needed in the key_vector and placed them in the new tnode structure. By doing this we will eventually be able to reduce the key_vector down to 16 bytes on 64 bit systems, and 12 bytes on 32 bit systems. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	2e1ac88a48	fib_trie: Rename tnode_child_length to child_length We are now checking the length of a key_vector instead of a tnode so it makes sense to probably just rename this to child_length since it would probably even be applicable to a leaf. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	754baf8dec	fib_trie: replace tnode_get_child functions with get_child macros I am replacing the tnode_get_child call with get_child since we are techically pulling the child out of a key_vector now and not a tnode. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Alexander Duyck	35c6edac19	fib_trie: Rename tnode to key_vector Rename the tnode to key_vector. The key_vector will be the eventual container for all of the information needed by either a leaf or a tnode. The final result should be much smaller than the 40 bytes currently needed for either one. This also updates the trie struct so that it contains an array of size 1 of tnode pointers. This is to bring the structure more inline with how an actual tnode itself is configured. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Alexander Duyck	8d8e810ca8	fib_trie: Return pointer to tnode pointer in resize/inflate/halve Resize related functions now all return a pointer to the pointer that references the object that was resized. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Alexander Duyck	72be72607a	fib_trie: Minor cleanups to fib_table_flush_external This change just does a couple of minor cleanups on fib_table_flush_external. Specifically it addresses the fact that resize was being called even though nothing was being removed from the table, and it drops an unecessary indent since we could just call continue on the inverse of the fi && flag check. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Robert Shearman	f8d54afc4c	mpls: Properly validate RTA_VIA payload length If the nla length is less than 2 then the nla data could be accessed beyond the accessible bounds. So ensure that the nla is big enough to at least read the via_family before doing so. Replace magic value of 2. Fixes: `03c0566542` ("mpls: Basic support for adding and removing routes") Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:19:06 -05:00
Fan Du	05cbc0db03	ipv4: Create probe timer for tcp PMTU as per RFC4821 As per RFC4821 7.3. Selecting Probe Size, a probe timer should be armed once probing has converged. Once this timer expired, probing again to take advantage of any path PMTU change. The recommended probing interval is 10 minutes per RFC1981. Probing interval could be sysctled by sysctl_tcp_probe_interval. Eric Dumazet suggested to implement pseudo timer based on 32bits jiffies tcp_time_stamp instead of using classic timer for such rare event. Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 14:57:42 -05:00
Fan Du	6b58e0a5f3	ipv4: Use binary search to choose tcp PMTU probe_size Current probe_size is chosen by doubling mss_cache, the probing process will end shortly with a sub-optimal mss size, and the link mtu will not be taken full advantage of, in return, this will make user to tweak tcp_base_mss with care. Use binary search to choose probe_size in a fine granularity manner, an optimal mss will be found to boost performance as its maxmium. In addition, introduce a sysctl_tcp_probe_threshold to control when probing will stop in respect to the width of search range. Test env: Docker instance with vxlan encapuslation(82599EB) iperf -c 10.0.0.24 -t 60 before this patch: 1.26 Gbits/sec After this patch: increase 26% 1.59 Gbits/sec Signed-off-by: Fan Du <fan.du@intel.com> Acked-by: John Heffner <johnwheffner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 14:57:41 -05:00
Eric W. Biederman	aaa4e70404	DECnet: Only use neigh_ops for adding the link layer header Other users users of the neighbour table use neigh->output as the method to decided when and which link-layer header to place on a packet. DECnet has been using neigh->output to decide which DECnet headers to place on a packet depending which neighbour the packet is destined for. The DECnet usage isn't totally wrong but it can run into problems if the neighbour output function is run for a second time as the teql driver and the bridge netfilter code can do. Therefore to avoid pathologic problems later down the line and make the neighbour code easier to understand by refactoring the decnet output code to only use a neighbour method to add a link layer header to a packet. This is done by moving the neigbhour operations lookup from dn_to_neigh_output to dn_neigh_output_packet. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 14:54:22 -05:00
Johan Hedberg	7a00ff445f	Bluetooth: Add mgmt_send_event() helper to send to any HCI channel Currently the mgmt_event() function is only capable of sending to HCI_CHANNEL_CONTROL. To void having to change all users of it, add a new mgmt_send_event() function that takes a channel parameter, and make the old mgmt_event() a wrapper that passes MGMT_CHANNEL_CONTROL to it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:22 +01:00
Johan Hedberg	3b0602cd01	Bluetooth: Rename pending_cmd to mgmt_pending_cmd This patch renames the pending_cmd struct (used for tracking pending mgmt commands) to mgmt_pending_cmd, so that it can be moved to a more generic place and be used also by other modules using other HCI channels. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	2a1afb5ac8	Bluetooth: Rename cmd_complete() to mgmt_cmd_complete() This patch renames the cmd_complete() function to mgmt_cmd_complete() in preparation of making it a generic helper for other modules to use too. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	a69e8375a1	Bluetooth: Rename cmd_status() to mgmt_cmd_status() This patch renames the cmd_status() function to mgmt_cmd_status() in preparation of making it a generic helper for other modules to use too. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	b9a245fb12	Bluetooth: Move all mgmt command quirks to handler table In order to completely generalize the mgmt command handling we need to move away command-specific information from mgmt_control() into the actual command table. This patch adds a new 'flags' field to the handler entries which can now contain the following command specific information: - Command takes variable length parameters - Command doesn't target any specific HCI device - Command can be sent when the HCI device is unconfigured After this the mgmt_control() function is completely generic and can potentially be reused by new HCI channels. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	6d785aa345	Bluetooth: Convert mgmt to use HCI chan registration API This patch converts the existing mgmt code to use the newly introduced generic API for registering HCI channels with mgmt-like semantics. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	801c1e8da5	Bluetooth: Add mgmt HCI channel registration API This patch adds an API for registering HCI channels with mgmt-like semantics. For now the only user will be HCI_CHANNEL_CONTROL, but e.g. 6lowpan is intended to use this as well in the future. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Marcel Holtmann	93690c227a	Bluetooth: Introduce controller setting information for static address Currently it is not possible to determine if the static address is used by the controller. It is also not possible to determine if using a static on a dual-mode controller with disabled BR/EDR is possible or not. To address this issue, introduce a new setting called static-address. If support for this setting is signaled that means that the kernel supports using static addresses. And if used on dual-mode controllers with BR/EDR disabled it means that a configured static address can be used. In addition utilize the same setting for the list of current active settings that indicates if a static address is configured and if that address will be actually used. With this in mind the existing Set Static Address management command has been extended to return the current settings. That way the caller of that command can easily determine if the programmed address will be used or if extra steps are required. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-06 20:43:07 +02:00
Linus Torvalds	1b1bd56191	NFS client bugfixes for Linux 4.0 Highlights include: - Fix a regression in the NFSv4 open state recovery code - Fix a regression in the NFSv4 close code - Fix regressions and side-effects of the loop-back mounted NFS fixes in 3.18, that cause the NFS read() syscall to return EBUSY. - Fix regressions around the readdirplus code and how it interacts with the VFS lazy unmount changes that went into v3.18. - Fix issues with out-of-order RPC call replies replacing updated attributes with stale ones (particularly after a truncate()). - Fix an underflow checking issue with RPC/RDMA credits - Fix a number of issues with the NFSv4 delegation return/free code. - Fix issues around stale NFSv4.1 leases when doing a mount -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU+SJ6AAoJEGcL54qWCgDyVgQQAKNsF/O2O9ip2uAHZRZvM6TS Ev8c3Spj/FmRI1tlCcGi1zZ8uCSwvPQz3uAN0vOTMocUWjokT5sAgN5yBIxHasem 6YK7jxs9WHiP7MGyReaAFJwG/W6LZndnlNqPWPs9KiaWwKVXIsQ3uFm/Y0lr90Fi ew16DQm0DUd4Yvv42WJR9ay7UwUPvT7wmaGIVVK2hjQqr2lx02jspt5kfrC+vsMU OYU/0YDofb2ajs5krbah6tUHf3VDnSVrXP6if66IrukCM9S4AvowpnMJQ5QJALh+ cPlqHDm2ZzuIecpqZEgYLM73wQ2q+KBXTlDLcgYg6LjnqBEivwO9RDn6tfCwKTcS tCFohQc9iDOj9rTZ9EQlQME6u/FdxWncpxyTsMyBk7FlcLsOQRqio/FXZhbyJGuH gvIdIW3fPseijtejpYkxgabe6JL9NRvjv3SnOay7xHs9Vn4tRkFF2mkkQDZOG6HT atkxQp8kB3m9gMoeAmTLdTcJkcFdk6AKnNrcyJaa1GW4msmMuGZtq/6vayKJsBZb OIw788bDSOVpcVR+6SAC24/dutcl+WJHlSJShqvIrTKejBxPCc7IVSeg63FJsWTO sxfXdUr3wVJ1ooDFCspeBj+3zAimIq7qDmRRs85ekgEtxrgdUldhg/VwbDjvLjmb whXFpiCS7Ii0fVYtZvjz =qMB7 -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Highlights include: - Fix a regression in the NFSv4 open state recovery code - Fix a regression in the NFSv4 close code - Fix regressions and side-effects of the loop-back mounted NFS fixes in 3.18, that cause the NFS read() syscall to return EBUSY. - Fix regressions around the readdirplus code and how it interacts with the VFS lazy unmount changes that went into v3.18. - Fix issues with out-of-order RPC call replies replacing updated attributes with stale ones (particularly after a truncate()). - Fix an underflow checking issue with RPC/RDMA credits - Fix a number of issues with the NFSv4 delegation return/free code. - Fix issues around stale NFSv4.1 leases when doing a mount" * tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits) NFSv4.1: Clear the old state by our client id before establishing a new lease NFSv4: Fix a race in NFSv4.1 server trunking discovery NFS: Don't write enable new pages while an invalidation is proceeding NFS: Fix a regression in the read() syscall NFSv4: Ensure we skip delegations that are already being returned NFSv4: Pin the superblock while we're returning the delegation NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation() NFSv4: Ensure that we don't reap a delegation that is being returned NFS: Fix stateid used for NFS v4 closes NFSv4: Don't call put_rpccred() under the rcu_read_lock() NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache() NFSv3: Use the readdir fileid as the mounted-on-fileid NFS: Don't invalidate a submounted dentry in nfs_prime_dcache() NFSv4: Set a barrier in the update_changeattr() helper NFS: Fix nfs_post_op_update_inode() to set an attribute barrier NFS: Remove size hack in nfs_inode_attrs_need_update() NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit NFS: Add attribute update barriers to NFS writebacks NFS: Set an attribute barrier on all updates NFS: Add attribute update barriers to nfs_setattr_update_inode() ...	2015-03-06 10:09:57 -08:00
Scott Feldman	e1315db17d	switchdev: fix CONFIG_IP_MULTIPLE_TABLES compile issue Signed-off-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 12:43:54 -05:00
Ilan peer	eeca9fce1d	cfg80211: Schedule timeout for all CRDA calls Timeout was scheduled only in case CRDA was called due to user hints, but was not scheduled for other cases. This can result in regulatory hint processing getting stuck in case that there is no CRDA configured. Change this by scheduling a timeout every time CRDA is called. In addition, in restore_regulatory_settings() all pending requests are restored (and not only the user ones). Signed-off-by: Ilan Peer <ilan.peer@intel.com> Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-06 09:37:59 +01:00
Ilan peer	0505075360	cfg80211: Add API to change the indoor regulatory setting Previously, the indoor setting configuration assumed that as long as a station interface is connected, the indoor environment setting does not change. However, this assumption is problematic as: - It is possible that a station interface is connected to a mobile AP, e.g., softAP or a P2P GO, where it is possible that both the station and the mobile AP move out of the indoor environment making the indoor setting invalid. In such a case, user space has no way to invalidate the setting. - A station interface disconnection does not necessarily imply that the device is no longer operating in an indoor environment, e.g., it is possible that the station interface is roaming but is still stays indoor. To handle the above, extend the indoor configuration API to allow user space to indicate a change of indoor settings, and allow it to indicate weather it controls the indoor setting, such that: 1. If the user space process explicitly indicates that it is going to control the indoor setting, do not clear the indoor setting internally, unless the socket is released. The user space process should use the NL80211_ATTR_SOCKET_OWNER attribute in the command to state that it is going to control the indoor setting. 2. Reset the indoor setting when restoring the regulatory settings in case it is not owned by a user space process. Based on the above, a user space tool that continuously monitors the indoor settings, i.e., tracking power setting, location etc., can indicate environment changes to the regulatory core. It should be noted that currently user space is the only provided mechanism used to hint to the regulatory core over the indoor/outdoor environment -- while the country IEs do have an environment setting this has been completely ignored by the regulatory core by design for a while now since country IEs typically can contain bogus data. Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: ArikX Nemtsov <arik@wizery.com> Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-06 09:37:47 +01:00
Ilan peer	0c4ddcd214	cfg80211: Simplify the handling of regulatory indoor setting Directly update the indoor setting without wrapping it as a regulatory request, to simplify the processing. Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-06 09:37:38 +01:00
David S. Miller	23375a0fd5	ipv4: Fix unused variable warnings in fib_table_flush_external. net/ipv4/fib_trie.c: In function ‘fib_table_flush_external’: net/ipv4/fib_trie.c:1572:6: warning: unused variable ‘found’ [-Wunused-variable] int found = 0; ^ net/ipv4/fib_trie.c:1571:16: warning: unused variable ‘slen’ [-Wunused-variable] unsigned char slen; ^ Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:38:35 -05:00
Scott Feldman	8e05fd7166	fib: hook IPv4 fib for hardware offload Call into the switchdev driver any time an IPv4 fib entry is added/modified/deleted from the kernel's FIB. The switchdev driver may or may not install the route to the offload device. In the case where the driver tries to install the route and something goes wrong (device's routing table is full, etc), then all of the offloaded routes will be flushed from the device, route forwarding falls back to the kernel, and no more routes are offloading. We can refine this logic later. For now, use the simplist model of offloading routes up to the point of failure, and then on failure, undo everything and mark IPv4 offloading disabled. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	b5d6fbdeed	switchdev: implement IPv4 fib ndo wrappers Flesh out ndo wrappers to call into device driver. To call into device driver, the wrapper must interate over route's nexthops to ensure all nexthop devs belong to the same switch device. Currently, there is no support for route's nexthops spanning offloaded and non-offloaded devices, or spanning ports of multiple offload devices. Since switch device ports may be stacked under virtual interfaces (bonds and/or bridges), and the route's nexthop may be on the virtual interface, the wrapper will traverse the nexthop dev down to the base dev. It's the base dev that's passed to the switchdev driver's ndo ops. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	104616e74e	switchdev: don't support custom ip rules, for now Keep switchdev FIB offload model simple for now and don't allow custom ip rules. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	5e8d90497d	switchdev: add IPv4 fib ndo ops wrappers Add IPv4 fib ndo wrapper funcs and stub them out for now. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Florian Fainelli	c86e59b9e6	net: dsa: extract dsa switch tree setup and removal Extract the core logic that setups a 'struct dsa_switch_tree' and removes it, update dsa_probe() and dsa_remove() to use the two helper functions. This will be useful to allow for other callers to setup this structure differently. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	5929903103	net: dsa: let switches specify their tagging protocol In order to support the new DSA device driver model, a dsa_switch should be able to advertise the type of tagging protocol supported by the underlying switch device. This also removes constraints on how tagging can be stacked to each other. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	df197195a5	net: dsa: split dsa_switch_setup into two functions Split the part of dsa_switch_setup() which is responsible for allocating and initializing a 'struct dsa_switch' and the part which is doing a given switch device setup and slave network device creation. This is a preliminary change to allow a separate caller of dsa_switch_setup_one() which may have externally initialized the dsa_switch structure, outside of dsa_switch_setup(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	b324c07ac4	net: dsa: allow deferred probing In preparation for allowing a different model to register DSA switches, update dsa_of_probe() and dsa_probe() to return -EPROBE_DEFER where appropriate. Failure to find a phandle or Device Tree property is still fatal, but looking up the internal device structure associated with a Device Tree node is something that might need to be delayed based on driver probe ordering. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Florian Fainelli	f1a26a062f	net: dsa: update dsa_of_{probe, remove} to use a device pointer In preparation for allowing a different mechanism to register DSA switch devices and driver, update dsa_of_probe and dsa_of_remove to take a struct device pointer since neither of these two functions uses the struct platform_device pointer. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00
Eric Dumazet	496127290f	inet_diag: remove duplicate code from inet_twsk_diag_dump() timewait sockets now share a common base with established sockets. inet_twsk_diag_dump() can use inet_diag_bc_sk() instead of duplicating code, granted that inet_diag_bc_sk() does proper userlocks initialization. twsk_build_assert() will catch any future changes that could break the assumptions. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:55:44 -05:00
Eric Dumazet	6c09fa09d4	tcp: align tcp_xmit_size_goal() on tcp_tso_autosize() With some mss values, it is possible tcp_xmit_size_goal() puts one segment more in TSO packet than tcp_tso_autosize(). We send then one TSO packet followed by one single MSS. It is not a serious bug, but we can do slightly better, especially for drivers using netif_set_gso_max_size() to lower gso_max_size. Using same formula avoids these corner cases and makes tcp_xmit_size_goal() a bit faster. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `605ad7f184` ("tcp: refine TSO autosizing") Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:31:12 -05:00
Erik Hugne	d0f91938be	tipc: add ip/udp media type The ip/udp bearer can be configured in a point-to-point mode by specifying both local and remote ip/hostname, or it can be enabled in multicast mode, where links are established to all tipc nodes that have joined the same multicast group. The multicast IP address is generated based on the TIPC network ID, but can be overridden by using another multicast address as remote ip. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:08:42 -05:00
Erik Hugne	948fa2d115	tipc: increase size of tipc discovery messages The payload area following the TIPC discovery message header is an opaque area defined by the media. INT_H_SIZE was enough for Ethernet/IB/IPv4 but needs to be expanded to carry IPv6 addressing information. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 22:08:42 -05:00
David S. Miller	9d73b42bbf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) Don't truncate ethernet protocol type to u8 in nft_compat, from Arturo Borrero. 2) Fix several problems in the addition/deletion of elements in nf_tables. 3) Fix module refcount leak in ip_vs_sync, from Julian Anastasov. 4) Fix a race condition in the abort path in the nf_tables transaction infrastructure. Basically aborted rules can show up as active rules until changes are unrolled, oneliner from Patrick McHardy. 5) Check for overflows in the data area of the rule, also from Patrick. 6) Fix off-by-one in the per-rule user data size field. This introduces a new nft_userdata structure that is placed at the beginning of the user data area that contains the length to save some bits from the rule and we only need one bit to indicate its presence, from Patrick. 7) Fix rule replacement error path, the replaced rule is deleted on error instead of leaving it in place. This has been fixed by relying on the abort path to undo the incomplete replacement. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:51:07 -05:00
Alexander Drozdov	3e32e733d1	ipv4: ip_check_defrag should not assume that skb_network_offset is zero ip_check_defrag() may be used by af_packet to defragment outgoing packets. skb_network_offset() of af_packet's outgoing packets is not zero. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:43:48 -05:00
WANG Cong	33f8b9ecdb	net_sched: move tp->root allocation into fw_init() Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:30:44 -05:00
WANG Cong	a05c2d112c	net_sched: move tp->root allocation into route4_init() Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:30:44 -05:00
Stephen Rothwell	4b5edb2f4a	mpls: using vzalloc requires including vmalloc.h Fixes this build error: net/mpls/af_mpls.c: In function 'resize_platform_label_table': net/mpls/af_mpls.c:767:4: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration] labels = vzalloc(size); ^ Fixes: `7720c01f3f` ("mpls: Add a sysctl to control the size of the mpls label table") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 21:01:33 -05:00
Pablo Neira Ayuso	1cae565e8b	netfilter: nf_tables: limit maximum table name length to 32 bytes Set the same as we use for chain names, it should be enough. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:21 +01:00
Pablo Neira Ayuso	f04e599e20	netfilter: nf_tables: consolidate Kconfig options Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:15 +01:00
Patrick McHardy	354bf5a0d7	netfilter: nf_tables: consolidate tracing invocations * JUMP and GOTO are equivalent except for JUMP pushing the current context to the stack * RETURN and implicit RETURN (CONTINUE) are equivalent except that the logged rule number differs Result: nft_do_chain \| -112 1 function changed, 112 bytes removed, diff: -112 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:12 +01:00
Patrick McHardy	01ef16c2dd	netfilter: nf_tables: minor tracing cleanups The tracing code is squeezed between multiple related parts of the evaluation code, move it out. Also add an inline wrapper for the reoccuring test for skb->nf_trace. Small code savings in nft_do_chain(): nft_trace_packet \| -137 nft_do_chain \| -8 2 functions changed, 145 bytes removed, diff: -145 net/netfilter/nf_tables_core.c: __nft_trace_packet \| +137 1 function changed, 137 bytes added, diff: +137 net/netfilter/nf_tables_core.o: 3 functions changed, 137 bytes added, 145 bytes removed, diff: -8 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:11 +01:00
Pablo Neira Ayuso	43270b1bc5	netfilter: ipt_CLUSTERIP: deprecate it in favour of xt_cluster xt_cluster supersedes ipt_CLUSTERIP since it can be also used in gateway configurations (not only from the backend side). ipt_CLUSTER is also known to leak the netdev that it uses on device removal, which requires a rather large fix to workaround the problem: http://patchwork.ozlabs.org/patch/358629/ So let's deprecate this so we can probably kill code this in the future. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-06 01:21:05 +01:00
Jouni Malinen	842a9ae08a	bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi This extends the design in commit `958501163d` ("bridge: Add support for IEEE 802.11 Proxy ARP") with optional set of rules that are needed to meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The previously added BR_PROXYARP behavior is left as-is and a new BR_PROXYARP_WIFI alternative is added so that this behavior can be configured from user space when required. In addition, this enables proxyarp functionality for unicast ARP requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible to use unicast as well as broadcast for these frames. The key differences in functionality: BR_PROXYARP: - uses the flag on the bridge port on which the request frame was received to determine whether to reply - block bridge port flooding completely on ports that enable proxy ARP BR_PROXYARP_WIFI: - uses the flag on the bridge port to which the target device of the request belongs - block bridge port flooding selectively based on whether the proxyarp functionality replied Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 14:52:23 -05:00
kbuild test robot	787fb2bd42	ax25: Fix the build when CONFIG_INET is disabled > > >> net/ax25/ax25_ip.c:225:26: error: unknown type name 'sturct' > netdev_tx_t ax25_ip_xmit(sturct sk_buff skb) > ^ > > vim +/sturct +225 net/ax25/ax25_ip.c > > 219 unsigned short type, const void daddr, > 220 const void saddr, unsigned int len) > 221 { > 222 return -AX25_HEADER_LEN; > 223 } > 224 > > 225 netdev_tx_t ax25_ip_xmit(sturct sk_buff skb) > 226 { > 227 kfree_skb(skb); > 228 return NETDEV_TX_OK; Ooops I misspelled struct... Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 13:17:39 -05:00
Jakub Pawlowski	82f8b651a9	Bluetooth: fix service discovery behaviour for empty uuids filter This patch fixes service discovery behaviour, when provided uuid filter is empty and HCI_QUIRK_STRICT_DUPLICATE_FILTER is set. Before this patch, empty uuid filter was unable to trigger scan restart, and that caused inconsistent behaviour in applications. Example: two DBus clients call BlueZ, one to find all devices with service abcd, second to find all devices with rssi smaller than -90. Sum of those filters, that is passed to mgmt_service_scan is empty filter, with no rssi or uuids set. That caused kernel not to restart scan when quirk was set. That was inconsistent with what happen when there's only one of those two filters set (scan is restarted and reports devices). To fix that, new variable hdev->discovery.result_filtering was introduced. It can indicate that filtered scan is running, no matter what uuid or rssi filter is set. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-05 09:50:50 +02:00
Jakub Pawlowski	2976cdeb27	Bluetooth: Refactor service discovery filter logic This patch refactor code responsible for filtering when service discovery method is used. Previously this code was mixed with mgmt_device found logic. Now when it's in one place whole logic can be greatly simplified. That includes removing no longer necessary length field and merging checks for eir and scan_rsp. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-05 09:50:50 +02:00
Jakub Pawlowski	48f86b7f26	Bluetooth: Move Service Discovery logic before refactoring This patch moves whole packet filering logic of service discovery into new function is_filter_match. It's done because logic inside mgmt_device_found is very complicated and needs some simplification. Also having whole logic in one place will allow to simplify it in the future. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-05 09:50:50 +02:00
Alexander Duyck	1de3d87bcd	fib_trie: Prevent allocating tnode if bits is too big for size_t This patch adds code to prevent us from attempting to allocate a tnode with a size larger than what can be represented by size_t. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:18 -05:00
Alexander Duyck	71e8b67d0f	fib_trie: Update last spot w/ idx >> n->bits code and explanation This change updates the fib_table_lookup function so that it is in sync with the fib_find_node function in terms of the explanation for the index check based on the bits value. I have also updated it from doing a mask to just doing a compare as I have found that seems to provide more options to the compiler as I have seen it turn this into a shift of the value and test under some circumstances. In addition I addressed one minor issue in which we kept computing the key ^ n->key when checking the fib aliases. I pulled the xor out of the loop in order to reduce the number of memory reads in the lookup. As a result we should save a couple cycles since the xor is only done once much earlier in the lookup. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:18 -05:00
Alexander Duyck	a7e5353123	fib_trie: Make fib_table rcu safe The fib_table was wrapped in several places with an rcu_read_lock/rcu_read_unlock however after looking over the code I found several spots where the tables were being accessed as just standard pointers without any protections. This change fixes that so that all of the proper protections are in place when accessing the table to take RCU replacement or removal of the table into account. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:18 -05:00
Alexander Duyck	41b489fd6c	fib_trie: move leaf and tnode to occupy the same spot in the key vector If we are going to compact the leaf and tnode we first need to make sure the fields are all in the same place. In that regard I am moving the leaf pointer which represents the fib_alias hash list to occupy what is currently the first key_vector pointer. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:18 -05:00
Alexander Duyck	d5d6487cb8	fib_trie: Update insert and delete to make use of tp from find_node This change makes it so that the insert and delete functions make use of the tnode pointer returned in the fib_find_node call. By doing this we will not have to rely on the parent pointer in the leaf which will be going away soon. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:18 -05:00
Alexander Duyck	d4a975e83f	fib_trie: Fib find node should return parent This change makes it so that the parent pointer is returned by reference in fib_find_node. By doing this I can use it to find the parent node when I am performing an insertion and I don't have to look for it again in fib_insert_node. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:17 -05:00
Alexander Duyck	8be33e955c	fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf This change makes it so that leaf_walk_rcu takes a tnode and a key instead of the trie and a leaf. The main idea behind this is to avoid using the leaf parent pointer as that can have additional overhead in the future as I am trying to reduce the size of a leaf down to 16 bytes on 64b systems and 12b on 32b systems. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:17 -05:00
Alexander Duyck	7289e6ddb6	fib_trie: Only resize tnodes once instead of on each leaf removal in fib_table_flush This change makes it so that we only call resize on the tnodes, instead of from each of the leaves. By doing this we can significantly reduce the amount of time spent resizing as we can update all of the leaves in the tnode first before we make any determinations about resizing. As a result we can simply free the tnode in the case that all of the leaves from a given tnode are flushed instead of resizing with each leaf removed. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 23:35:17 -05:00
Wu Fengguang	f0126539c7	mpls: rtm_mpls_policy[] can be static Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 16:39:45 -05:00
Lorenzo Colitti	9145736d48	net: ping: Return EAFNOSUPPORT when appropriate. 1. For an IPv4 ping socket, ping_check_bind_addr does not check the family of the socket address that's passed in. Instead, make it behave like inet_bind, which enforces either that the address family is AF_INET, or that the family is AF_UNSPEC and the address is 0.0.0.0. 2. For an IPv6 ping socket, ping_check_bind_addr returns EINVAL if the socket family is not AF_INET6. Return EAFNOSUPPORT instead, for consistency with inet6_bind. 3. Make ping_v4_sendmsg and ping_v6_sendmsg return EAFNOSUPPORT instead of EINVAL if an incorrect socket address structure is passed in. 4. Make IPv6 ping sockets be IPv6-only. The code does not support IPv4, and it cannot easily be made to support IPv4 because the protocol numbers for ICMP and ICMPv6 are different. This makes connect(::ffff:192.0.2.1) fail with EAFNOSUPPORT instead of making the socket unusable. Among other things, this fixes an oops that can be triggered by: int s = socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP); struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6, .sin6_addr = in6addr_any, }; bind(s, (struct sockaddr *) &sin6, sizeof(sin6)); Change-Id: If06ca86d9f1e4593c0d6df174caca3487c57a241 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 15:46:51 -05:00
Pablo Neira Ayuso	59900e0a01	netfilter: nf_tables: fix error handling of rule replacement In general, if a transaction object is added to the list successfully, we can rely on the abort path to undo what we've done. This allows us to simplify the error handling of the rule replacement path in nf_tables_newrule(). This implicitly fixes an unnecessary removal of the old rule, which needs to be left in place if we fail to replace. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-04 18:46:08 +01:00
Patrick McHardy	86f1ec3231	netfilter: nf_tables: fix userdata length overflow The NFT_USERDATA_MAXLEN is defined to 256, however we only have a u8 to store its size. Introduce a struct nft_userdata which contains a length field and indicate its presence using a single bit in the rule. The length field of struct nft_userdata is also a u8, however we don't store zero sized data, so the actual length is udata->len + 1. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-04 18:46:06 +01:00
Patrick McHardy	9889840f59	netfilter: nf_tables: check for overflow of rule dlen field Check that the space required for the expressions doesn't exceed the size of the dlen field, which would lead to the iterators crashing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-04 18:46:05 +01:00
Patrick McHardy	8670c3a55e	netfilter: nf_tables: fix transaction race condition A race condition exists in the rule transaction code for rules that get added and removed within the same transaction. The new rule starts out as inactive in the current and active in the next generation and is inserted into the ruleset. When it is deleted, it is additionally set to inactive in the next generation as well. On commit the next generation is begun, then the actions are finalized. For the new rule this would mean clearing out the inactive bit for the previously current, now next generation. However nft_rule_clear() clears out the bits for both generations, activating the rule in the current generation, where it should be deactivated due to being deleted. The rule will thus be active until the deletion is finalized, removing the rule from the ruleset. Similarly, when aborting a transaction for the same case, the undo of insertion will remove it from the RCU protected rule list, the deletion will clear out all bits. However until the next RCU synchronization after all operations have been undone, the rule is active on CPUs which can still see the rule on the list. Generally, there may never be any modifications of the current generations' inactive bit since this defeats the entire purpose of atomicity. Change nft_rule_clear() to only touch the next generations bit to fix this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-04 18:46:04 +01:00
Nicholas Mc Guire	0df2f6c118	mesh_plink: fixup type of timeout to match usage timeout was being passed as int but assigned from u32/u16 values and used as unsigned type. This is really only for better readability. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:14 +01:00
Nicholas Mc Guire	cc57ac536a	mesh_plink: use msecs_to_jiffies for proper time conversion This is primarily an API consolidation and should make things more readable it replaces var * HZ / 1000 by msecs_to_jiffies(var) which also handles corner cases correctly. There is a change of behavior as e.g. for HZ 100, t * HZ / 1000 will return 0 for t < 10 but msecs_to_jiffies will return at least 1 always. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:13 +01:00
SenthilKumar Jegadeesan	64a8cef41a	mac80211: provide station PMF configuration to driver Some device drivers offload part of aggregation including AddBA/DelBA negotiations to firmware. In such scenario, the PMF configuration of the station needs to be provided to driver to enable encryption of AddBA/DelBA action frames. Signed-off-by: SenthilKumar Jegadeesan <sjegadee@qti.qualcomm.com> [fix commit log, documentation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:12 +01:00
Arik Nemtsov	3384d757d4	mac80211: allow iterating inactive interfaces Sometimes the driver might want to modify private data in interfaces that are down. One possible use-case is cleaning up interface state after HW recovery. Some interfaces that were up before the recovery took place might be down now, but they might still be "dirty". Introduce a new iterate_interfaces() API and a new ACTIVE iterator flag. This way the internal implementation of the both active and inactive APIs remains the same. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:12 +01:00
Johannes Berg	98fc43864a	nl80211: prohibit mixing 'any' and regular wowlan triggers If the device supports waking up on 'any' signal - i.e. it continues operating as usual and wakes up the host on pretty much anything that happens, then it makes no sense to also configure the more restricted WoWLAN mode where the device operates more autonomously but also in a more restricted fashion. Currently only cw2100 supports both 'any' and other triggers, but it seems to be broken as it doesn't configure anything to the device, so we can't currently get into a situation where both even can correctly be configured. This is about to change (Intel devices are going to support both and have different behaviour depending on configuration) so make sure the conflicting modes cannot be configured. (It seems that cw2100 advertises 'any' and 'disconnect' as a means of saying that's what it will always do, but that isn't really the way this API was meant to be used nor does it actually mean anything as 'any' always implies 'disconnect' already, and the driver doesn't change device configuration in any way depending on the settings.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:11 +01:00
Johannes Berg	88724a81b4	mac80211: check and dequeue skb in ieee80211_tx_prepare_skb() The ieee80211_tx_prepare_skb() function currently entirely ignores the fact that the SKB that is passed in might be split into more than one due to fragmentation and doesn't check the list of skbs that the TX handlers may create. In case this happens, it would leak them. Fix this and also don't leave the skb next/prev pointers dangling pointing to the on-stack sk_buff_head. Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:11 +01:00
Luciano Coelho	0b4e11074a	mac80211: remove duplicate check for quiescing when queueing work In ieee80211_queue_work() we check if we're quiescing or suspended, so it's not necessary to check for quiescing before calling this function. Remove duplicate checks. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:10 +01:00
Arik Nemtsov	ae2e9fba85	mac80211: allow TDLS setup code to take wdev lock TDLS off-channel can be allowed in channels marked with GO_CONCURRENT, provided the device is connected to an AP on the same UNII. When relaxing the NO-IR requirements for TDLS, we might hit flows in cfg80211_reg_can_beacon that acquire the wdev lock. Take some measures to allow this during TDLS setup. Acquire the RCU read lock later in the flow that invokes cfg80211_reg_can_beacon. Avoid taking local->mtx when preparing the setup packet to avoid circular deadlocks with mac80211 code that is invoked with wdev-mtx held. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:10 +01:00
Johannes Berg	23e370989c	mac80211: start queues if driver rejected wowlan If the driver rejects WoWLAN, restart the queues before returning to cfg80211. cfg80211 will return to mac80211, but not before it disconnects all interfaces. If we don't start the queues, any of the packets needed for disconnecting won't be transmitted, which is strange. Fix that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:09 +01:00
Luciano Coelho	8bb6f4b9c5	mac80211: remove useless double check for open_count in __ieee80211_suspend() We check local->open_count at the top of the __ieee80211_suspend(), so there's no need to check for it again. open_count is protected by the rtnl, so there's no chance for it to have change between the two calls. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:09 +01:00
Johannes Berg	ef7c67257c	mac80211: don't do driver suspend with auth/assoc in progress Drivers can't really be expected to suspend properly while auth or assoc is in progress since then they don't have any state they could keep with WoWLAN, nor can they actually finish the authentication or association. In fact, keeping this can cause subtle issues with drivers like iwlwifi that refuse WoWLAN if not associated, but have trouble figuring out what's going on in the middle of association. In any case, regardless of possible driver issues in this area, it doesn't make sense for mac80211 to try to WoWLAN-suspend in the middle of such operations, so stop them before. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-04 10:34:08 +01:00
Eric W. Biederman	8de147dc8e	mpls: Multicast route table change notifications Unlike IPv4 this code notifies on all cases where mpls routes are added or removed and it never automatically removes routes. Avoiding both the userspace confusion that is caused by omitting route updates and the possibility of a flood of netlink traffic when an interface goes doew. For now reserved labels are handled automatically and userspace is not notified. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:26:06 -05:00
Eric W. Biederman	03c0566542	mpls: Netlink commands to add, remove, and dump routes This change adds two new netlink routing attributes: RTA_VIA and RTA_NEWDST. RTA_VIA specifies the specifies the next machine to send a packet to like RTA_GATEWAY. RTA_VIA differs from RTA_GATEWAY in that it includes the address family of the address of the next machine to send a packet to. Currently the MPLS code supports addresses in AF_INET, AF_INET6 and AF_PACKET. For AF_INET and AF_INET6 the destination mac address is acquired from the neighbour table. For AF_PACKET the destination mac_address is specified in the netlink configuration. I think raw destination mac address support with the family AF_PACKET will prove useful. There is MPLS-TP which is defined to operate on machines that do not support internet packets of any flavor. Further seem to be corner cases where it can be useful. At this point I don't care much either way. RTA_NEWDST specifies the destination address to forward the packet with. MPLS typically changes it's destination address at every hop. For a swap operation RTA_NEWDST is specified with a length of one label. For a push operation RTA_NEWDST is specified with two or more labels. For a pop operation RTA_NEWDST is not specified or equivalently an emtpy RTAN_NEWDST is specified. Those new netlink attributes are used to implement handling of rt-netlink RTM_NEWROUTE, RTM_DELROUTE, and RTM_GETROUTE messages, to maintain the MPLS label table. rtm_to_route_config parses a netlink RTM_NEWROUTE or RTM_DELROUTE message, verify no unhandled attributes or unhandled values are present and sets up the data structures for mpls_route_add and mpls_route_del. I did my best to match up with the existing conventions with the caveats that MPLS addresses are all destination-specific-addresses, and so don't properly have a scope. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:26:06 -05:00
Eric W. Biederman	966bae3349	mpls: Functions for reading and wrinting mpls labels over netlink Reading and writing addresses in network byte order in netlink is traditional and I see no reason to change that. MPLS is interesting as effectively it has variabely length addresses (the MPLS label stack). To represent these variable length addresses in netlink I use a valid MPLS label stack (complete with stop bit). This achieves two things: a well defined existing format is used, and the data can be interpreted without looking at it's length. Not needed to look at the length to decode the variable length network representation allows existing userspace functions such as inet_ntop to be used without needed to change their prototype. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:26:06 -05:00
Eric W. Biederman	a2519929ab	mpls: Basic support for adding and removing routes mpls_route_add and mpls_route_del implement the basic logic for adding and removing Next Hop Label Forwarding Entries from the MPLS input label map. The addition and subtraction is done in a way that is consistent with how the existing routing table in Linux are maintained. Thus all of the work to deal with NLM_F_APPEND, NLM_F_EXCL, NLM_F_REPLACE, and NLM_F_CREATE. Cases that are not clearly defined such as changing the interpretation of the mpls reserved labels is not allowed. Because it seems like the right thing to do adding an MPLS route without specifying an input label and allowing the kernel to pick a free label table entry is supported. The implementation is currently less than optimal but that can be changed. As I don't have anything else to test with only ethernet and the loopback device are the only two device types currently supported for forwarding MPLS over. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:26:06 -05:00
Eric W. Biederman	7720c01f3f	mpls: Add a sysctl to control the size of the mpls label table This sysctl gives two benefits. By defaulting the table size to 0 mpls even when compiled in and enabled defaults to not forwarding any packets. This prevents unpleasant surprises for users. The other benefit is that as mpls labels are allocated locally a dense table a small dense label table may be used which saves memory and is extremely simple and efficient to implement. This sysctl allows userspace to choose the restrictions on the label table size userspace applications need to cope with. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:26:06 -05:00
Eric W. Biederman	0189197f44	mpls: Basic routing support This change adds a new Kconfig option MPLS_ROUTING. The core of this change is the code to look at an mpls packet received from another machine. Look that packet up in a routing table and forward the packet on. Support of MPLS over ATM is not considered or attempted here. This implemntation follows RFC3032 and implements the MPLS shim header that can pass over essentially any network. What RFC3021 refers to as the as the Incoming Label Map (ILM) I call net->mpls.platform_label[]. What RFC3031 refers to as the Next Label Hop Forwarding Entry (NHLFE) I call mpls_route. Though calling it the label fordwarding information base (lfib) might also be valid. Further the implemntation forwards packets as described in RFC3032. There is no need and given the original motivation for MPLS a strong discincentive to have a flexible label forwarding path. In essence the logic is the topmost label is read, looked up, removed, and replaced by 0 or more new lables and the sent out the specified interface to it's next hop. Quite a few optional features are not implemented here. Among them are generation of ICMP errors when the TTL is exceeded or the packet is larger than the next hop MTU (those conditions are detected and the packets are dropped instead of generating an icmp error). The traffic class field is always set to 0. The implementation focuses on IP over MPLS and does not handle egress of other kinds of protocols. Instead of implementing coordination with the neighbour table and sorting out how to input next hops in a different address family (for which there is value). I was lazy and implemented a next hop mac address instead. The code is simpler and there are flavor of MPLS such as MPLS-TP where neither an IPv4 nor an IPv6 next hop is appropriate so a next hop by mac address would need to be implemented at some point. Two new definitions AF_MPLS and PF_MPLS are exposed to userspace. Decoding the mpls header must be done by first byeswapping a 32bit bit endian word into the local cpu endian and then bit shifting to extract the pieces. There is no C bit-field that can represent a wire format mpls header on a little endian machine as the low bits of the 20bit label wind up in the wrong half of third byte. Therefore internally everything is deal with in cpu native byte order except when writing to and reading from a packet. For management simplicity if a label is configured to forward out an interface that is down the packet is dropped early. Similarly if an network interface is removed rt_dev is updated to NULL (so no reference is preserved) and any packets for that label are dropped. Keeping the label entries in the kernel allows the kernel label table to function as the definitive source of which labels are allocated and which are not. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:26:06 -05:00
Eric W. Biederman	cec9166ca4	mpls: Refactor how the mpls module is built This refactoring is needed to allow more than just mpls gso support to be built into the mpls moddule. Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:26:06 -05:00
Eric W. Biederman	4fd3d7d9e8	neigh: Add helper function neigh_xmit For MPLS I am building the code so that either the neighbour mac address can be specified or we can have a next hop in ipv4 or ipv6. The kind of next hop we have is indicated by the neighbour table pointer. A neighbour table pointer of NULL is a link layer address. A non-NULL neighbour table pointer indicates which neighbour table and thus which address family the next hop address is in that we need to look up. The code either sends a packet directly or looks up the appropriate neighbour table entry and sends the packet. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:23:23 -05:00
Eric W. Biederman	60395a20ff	neigh: Factor out ___neigh_lookup_noref While looking at the mpls code I found myself writing yet another version of neigh_lookup_noref. We currently have __ipv4_lookup_noref and __ipv6_lookup_noref. So to make my work a little easier and to make it a smidge easier to verify/maintain the mpls code in the future I stopped and wrote ___neigh_lookup_noref. Then I rewote __ipv4_lookup_noref and __ipv6_lookup_noref in terms of this new function. I tested my new version by verifying that the same code is generated in ip_finish_output2 and ip6_finish_output2 where these functions are inlined. To get to ___neigh_lookup_noref I added a new neighbour cache table function key_eq. So that the static size of the key would be available. I also added __neigh_lookup_noref for people who want to to lookup a neighbour table entry quickly but don't know which neibhgour table they are going to look up. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:23:23 -05:00
Johannes Berg	2f56f6be47	bridge: fix bridge netlink RCU usage When the STP timer fires, it can call br_ifinfo_notify(), which in turn ends up in the new br_get_link_af_size(). This function is annotated to be using RTNL locking, which clearly isn't the case here, and thus lockdep warns: =============================== [ INFO: suspicious RCU usage. ] 3.19.0+ #569 Not tainted ------------------------------- net/bridge/br_private.h:204 suspicious rcu_dereference_protected() usage! Fix this by doing RCU locking here. Fixes: `b7853d73e3` ("bridge: add vlan info to bridge setlink and dellink notification messages") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-04 00:20:22 -05:00
David S. Miller	71a83a6db6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/rocker/rocker.c The rocker commit was two overlapping changes, one to rename the ->vport member to ->pport, and another making the bitmask expression use '1ULL' instead of plain '1'. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 21:16:48 -05:00
Linus Torvalds	a6c5170d1e	Merge branch 'for-4.0' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Three miscellaneous bugfixes, most importantly the clp->cl_revoked bug, which we've seen several reports of people hitting" * 'for-4.0' of git://linux-nfs.org/~bfields/linux: sunrpc: integer underflow in rsc_parse() nfsd: fix clp->cl_revoked list deletion causing softlock in nfsd svcrpc: fix memory leak in gssp_accept_sec_context_upcall	2015-03-03 15:52:50 -08:00
Linus Torvalds	789d7f60cd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) If an IPVS tunnel is created with a mixed-family destination address, it cannot be removed. Fix from Alexey Andriyanov. 2) Fix module refcount underflow in netfilter's nft_compat, from Pablo Neira Ayuso. 3) Generic statistics infrastructure can reference variables sitting on a released function stack, therefore use dynamic allocation always. Fix from Ignacy Gawędzki. 4) skb_copy_bits() return value test is inverted in ip_check_defrag(). 5) Fix network namespace exit in openvswitch, we have to release all of the per-net vports. From Pravin B Shelar. 6) Fix signedness bug in CAIF's cfpkt_iterate(), from Dan Carpenter. 7) Fix rhashtable grow/shrink behavior, only expand during inserts and shrink during deletes. From Daniel Borkmann. 8) Netdevice names with semicolons should never be allowed, because they serve as a separator. From Matthew Thode. 9) Use {,__}set_current_state() where appropriate, from Fabian Frederick. 10) Revert byte queue limits support in r8169 driver, it's causing regressions we can't figure out. 11) tcp_should_expand_sndbuf() erroneously uses tp->packets_out to measure packets in flight, properly use tcp_packets_in_flight() instead. From Neal Cardwell. 12) Fix accidental removal of support for bluetooth in CSR based Intel wireless cards. From Marcel Holtmann. 13) We accidently added a behavioral change between native and compat tasks, wrt testing the MSG_CMSG_COMPAT bit. Just ignore it if the user happened to set it in a native binary as that was always the behavior we had. From Catalin Marinas. 14) Check genlmsg_unicast() return valud in hwsim netlink tx frame handling, from Bob Copeland. 15) Fix stale ->radar_required setting in mac80211 that can prevent starting new scans, from Eliad Peller. 16) Fix memory leak in nl80211 monitor, from Johannes Berg. 17) Fix race in TX index handling in xen-netback, from David Vrabel. 18) Don't enable interrupts in amx-xgbe driver until all software et al. state is ready for the interrupt handler to run. From Thomas Lendacky. 19) Add missing netlink_ns_capable() checks to rtnl_newlink(), from Eric W Biederman. 20) The amount of header space needed in macvtap was not calculated properly, fix it otherwise we splat past the beginning of the packet. From Eric Dumazet. 21) Fix bcmgenet TCP TX perf regression, from Jaedon Shin. 22) Don't raw initialize or mod timers, use setup_timer() and mod_timer() instead. From Vaishali Thakkar. 23) Fix software maintained statistics in bcmgenet and systemport drivers, from Florian Fainelli. 24) DMA descriptor updates in sh_eth need proper memory barriers, from Ben Hutchings. 25) Don't do UDP Fragmentation Offload on RAW sockets, from Michal Kubecek. 26) Openvswitch's non-masked set actions aren't constructed properly into netlink messages, fix from Joe Stringer. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) openvswitch: Fix serialization of non-masked set actions. gianfar: Reduce logging noise seen due to phy polling if link is down ibmveth: Add function to enable live MAC address changes net: bridge: add compile-time assert for cb struct size udp: only allow UFO for packets from SOCK_DGRAM sockets sh_eth: Really fix padding of short frames on TX Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790" sh_eth: Fix RX recovery on R-Car in case of RX ring underrun sh_eth: Ensure proper ordering of descriptor active bit write/read net/mlx4_en: Disbale GRO for incoming loopback/selftest packets net/mlx4_core: Fix wrong mask and error flow for the update-qp command net: systemport: fix software maintained statistics net: bcmgenet: fix software maintained statistics rxrpc: don't multiply with HZ twice rxrpc: terminate retrans loop when sending of skb fails net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface. net: pasemi: Use setup_timer and mod_timer net: stmmac: Use setup_timer and mod_timer net: 8390: axnet_cs: Use setup_timer and mod_timer net: 8390: pcnet_cs: Use setup_timer and mod_timer ...	2015-03-03 15:30:07 -08:00
Joe Perches	1cea7e2c9f	l2tp: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:38 -05:00
Joe Perches	d2beae1078	wireless: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:38 -05:00
Joe Perches	c84a67a2fc	mac80211: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:38 -05:00
Joe Perches	afc130dd39	ethernet: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:38 -05:00
Joe Perches	211b85349c	bluetooth: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:37 -05:00
Joe Perches	19ffa562ec	atm: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:37 -05:00
Joe Perches	1a73de0719	appletalk: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:37 -05:00
Joe Perches	423049ab1e	8021q: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:37 -05:00
Eric W. Biederman	1d5da757da	ax25: Stop using magic neighbour cache operations. Before the ax25 stack calls dev_queue_xmit it always calls ax25_type_trans which sets skb->protocol to ETH_P_AX25. Which means that by looking at the protocol type it is possible to detect IP packets that have not been munged by the ax25 stack in ndo_start_xmit and call a function to munge them. Rename ax25_neigh_xmit to ax25_ip_xmit and tweak the return type and value to be appropriate for an ndo_start_xmit function. Update all of the ax25 devices to test the protocol type for ETH_P_IP and return ax25_ip_xmit as the first thing they do. This preserves the existing semantics of IP packet processing, but the timing will be a little different as the IP packets now pass through the qdisc layer before reaching the ax25 ip packet processing. Remove the now unnecessary ax25 neighbour table operations. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 14:44:41 -05:00
Joe Stringer	f4f8e73850	openvswitch: Fix serialization of non-masked set actions. Set actions consist of a regular OVS_KEY_ATTR_* attribute nested inside of a OVS_ACTION_ATTR_SET action attribute. When converting masked actions back to regular set actions, the inner attribute length was not changed, ie, double the length being serialized. This patch fixes the bug. Fixes: `83d2b9b` ("net: openvswitch: Support masked set actions.") Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 14:38:57 -05:00
Fabian Frederick	bcc90e3fb1	net/atm/signaling.c: remove WAIT_FOR_DEMON code WAIT_FOR_DEMON code is directly undefined at the beginning of signaling.c since initial git version and thus never compiled. This also removes buggy current->state direct access. Suggested-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 14:22:11 -05:00
Florian Westphal	71e168b151	net: bridge: add compile-time assert for cb struct size make build fail if structure no longer fits into ->cb storage. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 14:07:04 -05:00
Luciano Coelho	c8fff3dc72	mac80211: handle drv_add_interface failures properly during reconfig If any interface fails to be added to the driver in during reconfig, we should remove all the successfully added interfaces and report reconfig failure, so things can be cleaned up properly. Failing to do so can lead to subsequent failures and leave the drivers in a messed up state. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:09 +01:00
Johannes Berg	be72afe0a4	mac80211: fix another suspend vs. association race Since cfg80211 disconnects, but has no insight into the association process, it can happen that it disconnects while association is in progress. We then try to abort association in mac80211, but this is only later so the association can complete between the two. This results in removing an interface from the driver while bound to the channel context, obviously causing confusion and issues. Solve this by also checking if we're associated during quiesce and if so deauthenticating. The frame will no longer go out to the AP which is a bit unfortunate, but it'll resolve the crash (and before we would have suspended without telling the AP as well.) I'm working on a better, but more complex solution as well, which should avoid that problem. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:08 +01:00
Arik Nemtsov	fb28ec0ce4	mac80211: TDLS: support VHT between peers Add the AID and VHT-cap/operation IEs during TDLS setup. Remove the block of TDLS peers when setting HT-caps of the peer station. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:07 +01:00
Eliad Peller	954a86ef45	cfg80211: add operating classes 128-130 Operating classes 128-130 are defined in the 11ac spec for the 5GHz band. Update ieee80211_operating_class_to_band() to support them. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:07 +01:00
Alexander Bondar	2ecc3905e6	mac80211: Update beacon's timing and DTIM count on every beacon Beacon's timestamp, device system time associated with this beacon and DTIM count parameters are not updated in the associated vif context if the latest beacon's content is identical to the previously received. It make sense to update these changing parameters on every beacon so the driver can get most updated values. This may be necessary, for example, to avoid either beacons' drift effect or device time stamp overrun. IMPORTANT: Three sync_* parameters - sync_ts, sync_device_ts and sync_dtim_count would possibly be out of sync by the time the driver will use them. The synchronized view is currently guaranteed only in certain callbacks. Signed-off-by: Alexander Bondar <alexander.bondar@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:06 +01:00
Ahmad Kholaif	6c09e791b2	cfg80211: Allow NL80211_ATTR_IFINDEX to be added to vendor events This modifies cfg80211_vendor_event_alloc() with an additional argument struct wireless_dev wdev. __cfg80211_alloc_event_skb() is modified to take in wdev argument, if wdev != NULL, both the NL80211_ATTR_IFINDEX and wdev identifier are added to the vendor event. These changes make it easier for drivers to add ifindex indication in vendor events cleanly. This also updates all existing users of cfg80211_vendor_event_alloc() and __cfg80211_alloc_event_skb() in the kernel tree. Signed-off-by: Ahmad Kholaif <akholaif@qca.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:05 +01:00
Janusz.Dziedzic@tieto.com	ffc1199122	cfg80211: add VHT support for IBSS Add NL80211_EXT_FEATURE_VHT_IBSS flag and VHT support for IBSS. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:04 +01:00
Dedy Lansky	6eb1813764	cfg80211: add bss_type and privacy arguments in cfg80211_get_bss() 802.11ad adds new a network type (PBSS) and changes the capability field interpretation for the DMG (60G) band. The same 2 bits that were interpreted as "ESS" and "IBSS" before are re-used as a 2-bit field with 3 valid values (and 1 reserved). Valid values are: "IBSS", "PBSS" (new) and "AP". In order to get the BSS struct for the new PBSS networks, change the cfg80211_get_bss() function to take a new enum ieee80211_bss_type argument with the valid network types, as "capa_mask" and "capa_val" no longer work correctly (the search must be band-aware now.) The remaining bits in "capa_mask" and "capa_val" are used only for privacy matching so replace those two with a privacy enum as well. Signed-off-by: Dedy Lansky <dlansky@codeaurora.org> [rewrite commit log, tiny fixes] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 15:56:01 +01:00
Michal Kazior	aa75ebc275	mac80211: disable u-APSD queues by default Some APs experience problems when working with U-APSD. Decreasing the probability of that happening by using legacy mode for all ACs but VO isn't enough. Cisco 4410N originally forced us to enable VO by default only because it treated non-VO ACs as legacy. However some APs (notably Netgear R7000) silently reclassify packets to different ACs. Since u-APSD ACs require trigger frames for frame retrieval clients would never see some frames (e.g. ARP responses) or would fetch them accidentally after a long time. It makes little sense to enable u-APSD queues by default because it needs userspace applications to be aware of it to actually take advantage of the possible additional powersavings. Implicitly depending on driver autotrigger frame support doesn't make much sense. Cc: stable@vger.kernel.org Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 10:14:47 +01:00
Fan Du	74005991b7	xfrm: Do not parse 32bits compiled xfrm netlink msg on 64bits host structure like xfrm_usersa_info or xfrm_userpolicy_info has different sizeof when compiled as 32bits and 64bits due to not appending pack attribute in their definition. This will result in broken SA and SP information when user trying to configure them through netlink interface. Inform user land about this situation instead of keeping silent, the upper test scripts would behave accordingly. Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2015-03-03 10:10:16 +01:00
Bob Copeland	d0c22119f5	mac80211: drop unencrypted frames in mesh fwding The mesh forwarding path was not checking that data frames were protected when running an encrypted network; add the necessary check. Cc: stable@vger.kernel.org Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-03 09:27:28 +01:00
Michal Kubeček	acf8dd0a9d	udp: only allow UFO for packets from SOCK_DGRAM sockets If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer checksum is to be computed on segmentation. However, in this case, skb->csum_start and skb->csum_offset are never set as raw socket transmit path bypasses udp_send_skb() where they are usually set. As a result, driver may access invalid memory when trying to calculate the checksum and store the result (as observed in virtio_net driver). Moreover, the very idea of modifying the userspace provided UDP header is IMHO against raw socket semantics (I wasn't able to find a document clearly stating this or the opposite, though). And while allowing CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit too intrusive change just to handle a corner case like this. Therefore disallowing UFO for packets from SOCK_DGRAM seems to be the best option. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 22:19:29 -05:00
Florian Westphal	72500bc11e	netfilter: bridge: rework reject handling bridge reject handling is not straightforward, there are many subtle differences depending on configuration. skb->dev is either the bridge port (PRE_ROUTING) or the bridge itself (INPUT), so we need to use indev instead. Also, checksum validation will only work reliably if we trim skb according to the l3 header size. While at it, add csum validation for ipv6 and skip existing tests if skb was already checked e.g. by GRO. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-03 02:10:51 +01:00
Florian Westphal	ee586bbc28	netfilter: reject: don't send icmp error if csum is invalid tcp resets are never emitted if the packet that triggers the reject/reset has an invalid checksum. For icmp error responses there was no such check. It allows to distinguish icmp response generated via iptables -I INPUT -p udp --dport 42 -j REJECT and those emitted by network stack (won't respond if csum is invalid, REJECT does). Arguably its possible to avoid this by using conntrack and only using REJECT with -m conntrack NEW/RELATED. However, this doesn't work when connection tracking is not in use or when using nf_conntrack_checksum=0. Furthermore, sending errors in response to invalid csums doesn't make much sense so just add similar test as in nf_send_reset. Validate csum if needed and only send the response if it is ok. Reference: http://bugzilla.redhat.com/show_bug.cgi?id=1169829 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-03 02:10:35 +01:00
Eric W. Biederman	435e8eb27e	neigh: Don't require a dst in neigh_resolve_output Having a dst helps a little bit for teql but is fundamentally unnecessary and there are code paths where a dst is not available that it would be nice to use the neighbour cache. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:41 -05:00
Eric W. Biederman	bdf53c5849	neigh: Don't require dst in neigh_hh_init - Add protocol to neigh_tbl so that dst->ops->protocol is not needed - Acquire the device from neigh->dev This results in a neigh_hh_init that will cache the samve values regardless of the packets flowing through it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:41 -05:00
Eric W. Biederman	59b2af26b9	arp: Kill arp_find There are no more callers so kill this function. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:41 -05:00
Eric W. Biederman	d476059e77	net: Kill dev_rebuild_header Now that there are no more users kill dev_rebuild_header and all of it's implementations. This is long overdue. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:41 -05:00
Eric W. Biederman	945db424bf	ax25: Stop depending on arp_find Have ax25_neigh_output perform ordinary arp resolution before calling ax25_neigh_xmit. Call dev_hard_header in ax25_neigh_output with a destination address so it will not fail, and the destination mac address will not need to be set in ax25_neigh_xmit. Remove arp_find from ax25_neigh_xmit (the ordinary arp resolution added to ax25_neigh_output removes the need for calling arp_find). Document how close ax25_neigh_output is to neigh_resolve_output. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:41 -05:00
Eric W. Biederman	abb7b755d9	ax25: Stop calling/abusing dev_rebuild_header - Rename ax25_rebuild_header to ax25_neigh_xmit and call it from ax25_neigh_output directly. The rename is to make it clear that this is not a rebuild_header operation. - Remove ax25_rebuild_header from ax25_header_ops. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:40 -05:00
Eric W. Biederman	def6775369	neigh: Move neigh_compat_output into ax25_ip.c The only caller is now is ax25_neigh_construct so move neigh_compat_output into ax25_ip.c make it static and rename it ax25_neigh_output. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:40 -05:00
Eric W. Biederman	21bfb8e933	arp: Remove special case to give AX25 it's open arp operations. The special case has been pushed out into ax25_neigh_construct so there is no need to keep this code in arp.c Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:40 -05:00
Eric W. Biederman	3b6a94bed0	ax25: Refactor to use private neighbour operations. AX25 already has it's own private arp cache operations to isolate it's abuse of dev_rebuild_header to transmit packets. Add a function ax25_neigh_construct that will allow all of the ax25 devices to force using these operations, so that the generic arp code does not need to. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:40 -05:00
Eric W. Biederman	46d4e47abe	ax25: Make ax25_header and ax25_rebuild_header static The only user is in ax25_ip.c so stop exporting these functions. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:40 -05:00
Eric W. Biederman	03ec2ac097	rose: Transmit packets in rose_xmit not rose_rebuild_header Patterned after the similar code in net/rom this turns out to be a trivial obviously correct transmformation. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:39 -05:00
Eric W. Biederman	b753af31ab	rose: Set the destination address in rose_header Not setting the destination address is a bug that I suspect causes no problems today, as only the arp code seems to call dev_hard_header and the description I have of rose is that it is expected to be used with a static neigbour table. I have derived the offset and the length of the rose destination address from rose_rebuild_header where arp_find calls neigh_ha_snapshot to set the destination address. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:39 -05:00
Eric W. Biederman	e18dbd0593	ax25: In ax25_rebuild_header add missing kfree_skb In the unlikely (impossible?) event that we attempt to transmit an ax25 packet over a non-ax25 device free the skb so we don't leak it. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-hams@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 16:43:39 -05:00
David S. Miller	77f0379fa8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next A small batch with accumulated updates in nf-next, mostly IPVS updates, they are: 1) Add 64-bits stats counters to IPVS, from Julian Anastasov. 2) Move NETFILTER_XT_MATCH_ADDRTYPE out of NETFILTER_ADVANCED as docker seem to require this, from Anton Blanchard. 3) Use boolean instead of numeric value in set_match_v*(), from coccinelle via Fengguang Wu. 4) Allows rescheduling of new connections in IPVS when port reuse is detected, from Marcelo Ricardo Leitner. 5) Add missing bits to support arptables extensions from nft_compat, from Arturo Borrero. Patrick is preparing a large batch to enhance the set infrastructure, named expressions among other things, that should follow up soon after this batch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 14:55:05 -05:00
Daniel Borkmann	49b31e576a	filter: refactor common filter attach code into __sk_attach_prog Both sk_attach_filter() and sk_attach_bpf() are setting up sk_filter, charging skmem and attaching it to the socket after we got the eBPF prog up and ready. Lets refactor that into a common helper. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 14:51:38 -05:00
David S. Miller	70c836a4d1	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-03-02 Here's the first bluetooth-next pull request targeting the 4.1 kernel: - ieee802154/6lowpan cleanups - SCO routing to host interface support for the btmrvl driver - AMP code cleanups - Fixes to AMP HCI init sequence - Refactoring of the HCI callback mechanism - Added shutdown routine for Intel controllers in the btusb driver - New config option to enable/disable Bluetooth debugfs information - Fix for early data reception on L2CAP fixed channels Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 14:47:12 -05:00
Ying Xue	1b78414047	net: Remove iocb argument from sendmsg and recvmsg After TIPC doesn't depend on iocb argument in its internal implementations of sendmsg() and recvmsg() hooks defined in proto structure, no any user is using iocb argument in them at all now. Then we can drop the redundant iocb argument completely from kinds of implementations of both sendmsg() and recvmsg() in the entire networking stack. Cc: Christoph Hellwig <hch@lst.de> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 13:06:31 -05:00
Ying Xue	39a0295f90	tipc: Don't use iocb argument in socket layer Currently the iocb argument is used to idenfiy whether or not socket lock is hold before tipc_sendmsg()/tipc_send_stream() is called. But this usage prevents iocb argument from being dropped through sendmsg() at socket common layer. Therefore, in the commit we introduce two new functions called __tipc_sendmsg() and __tipc_send_stream(). When they are invoked, it assumes that their callers have taken socket lock, thereby avoiding the weird usage of iocb argument. Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 13:06:31 -05:00
Arturo Borrero	5f15893943	netfilter: nft_compat: add support for arptables extensions This patch adds support to arptables extensions from nft_compat. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-02 12:28:13 +01:00
Eyal Birger	744d5a3e9f	net: move skb->dropcount to skb->cb[] Commit `977750076d` ("af_packet: add interframe drop cmsg (v6)") unionized skb->mark and skb->dropcount in order to allow recording of the socket drop count while maintaining struct sk_buff size. skb->dropcount was introduced since there was no available room in skb->cb[] in packet sockets. However, its introduction led to the inability to export skb->mark, or any other aliased field to userspace if so desired. Moving the dropcount metric to skb->cb[] eliminates this problem at the expense of 4 bytes less in skb->cb[] for protocol families using it. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 00:19:30 -05:00
Eyal Birger	3bc3b96f3b	net: add common accessor for setting dropcount on packets As part of an effort to move skb->dropcount to skb->cb[], use a common function in order to set dropcount in struct sk_buff. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 00:19:30 -05:00
Eyal Birger	b4772ef879	net: use common macro for assering skb->cb[] available size in protocol families As part of an effort to move skb->dropcount to skb->cb[] use a common macro in protocol families using skb->cb[] for ancillary data to validate available room in skb->cb[]. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 00:19:30 -05:00
Eyal Birger	2472d7613b	net: packet: use sockaddr_ll fields as storage for skb original length in recvmsg path As part of an effort to move skb->dropcount to skb->cb[], 4 bytes of additional room are needed in skb->cb[] in packet sockets. Store the skb original length in the first two fields of sockaddr_ll (sll_family and sll_protocol) as they can be derived from the skb when needed. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 00:19:29 -05:00
Eyal Birger	2cfdf9fcb8	net: rxrpc: change call to sock_recv_ts_and_drops() on rxrpc recvmsg to sock_recv_timestamp() Commit `3b885787ea` ("net: Generalize socket rx gap / receive queue overflow cmsg") allowed receiving packet dropcount information as a socket level option. RXRPC sockets recvmsg function was changed to support this by calling sock_recv_ts_and_drops() instead of sock_recv_timestamp(). However, protocol families wishing to receive dropcount should call sock_queue_rcv_skb() or set the dropcount specifically (as done in packet_rcv()). This was not done for rxrpc and thus this feature never worked on these sockets. Formalizing this by not calling sock_recv_ts_and_drops() in rxrpc as part of an effort to move skb->dropcount into skb->cb[] Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 00:19:29 -05:00
Eyal Birger	6368c23577	net: bluetooth: compact struct bt_skb_cb by converting boolean fields to bit fields Convert boolean fields incoming and req_start to bit fields and move force_active in order save space in bt_skb_cb in an effort to use a portion of skb->cb[] for storing skb->dropcount. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 00:19:29 -05:00
Eyal Birger	49a6fe0557	net: bluetooth: compact struct bt_skb_cb by inlining struct hci_req_ctrl struct hci_req_ctrl is never used outside of struct bt_skb_cb; Inlining it frees 8 bytes on a 64 bit system in skb->cb[] allowing the addition of more ancillary data. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 00:19:29 -05:00
Daniel Borkmann	e2e9b6541d	cls_bpf: add initial eBPF support for programmable classifiers This work extends the "classic" BPF programmable tc classifier by extending its scope also to native eBPF code! This allows for user space to implement own custom, 'safe' C like classifiers (or whatever other frontend language LLVM et al may provide in future), that can then be compiled with the LLVM eBPF backend to an eBPF elf file. The result of this can be loaded into the kernel via iproute2's tc. In the kernel, they can be JITed on major archs and thus run in native performance. Simple, minimal toy example to demonstrate the workflow: #include <linux/ip.h> #include <linux/if_ether.h> #include <linux/bpf.h> #include "tc_bpf_api.h" __section("classify") int cls_main(struct sk_buff *skb) { return (0x800 << 16) \| load_byte(skb, ETH_HLEN + __builtin_offsetof(struct iphdr, tos)); } char __license[] __section("license") = "GPL"; The classifier can then be compiled into eBPF opcodes and loaded via tc, for example: clang -O2 -emit-llvm -c cls.c -o - \| llc -march=bpf -filetype=obj -o cls.o tc filter add dev em1 parent 1: bpf cls.o [...] As it has been demonstrated, the scope can even reach up to a fully fledged flow dissector (similarly as in samples/bpf/sockex2_kern.c). For tc, maps are allowed to be used, but from kernel context only, in other words, eBPF code can keep state across filter invocations. In future, we perhaps may reattach from a different application to those maps e.g., to read out collected statistics/state. Similarly as in socket filters, we may extend functionality for eBPF classifiers over time depending on the use cases. For that purpose, cls_bpf programs are using BPF_PROG_TYPE_SCHED_CLS program type, so we can allow additional functions/accessors (e.g. an ABI compatible offset translation to skb fields/metadata). For an initial cls_bpf support, we allow the same set of helper functions as eBPF socket filters, but we could diverge at some point in time w/o problem. I was wondering whether cls_bpf and act_bpf could share C programs, I can imagine that at some point, we introduce i) further common handlers for both (or even beyond their scope), and/or if truly needed ii) some restricted function space for each of them. Both can be abstracted easily through struct bpf_verifier_ops in future. The context of cls_bpf versus act_bpf is slightly different though: a cls_bpf program will return a specific classid whereas act_bpf a drop/non-drop return code, latter may also in future mangle skbs. That said, we can surely have a "classify" and "action" section in a single object file, or considered mentioned constraint add a possibility of a shared section. The workflow for getting native eBPF running from tc [1] is as follows: for f_bpf, I've added a slightly modified ELF parser code from Alexei's kernel sample, which reads out the LLVM compiled object, sets up maps (and dynamically fixes up map fds) if any, and loads the eBPF instructions all centrally through the bpf syscall. The resulting fd from the loaded program itself is being passed down to cls_bpf, which looks up struct bpf_prog from the fd store, and holds reference, so that it stays available also after tc program lifetime. On tc filter destruction, it will then drop its reference. Moreover, I've also added the optional possibility to annotate an eBPF filter with a name (e.g. path to object file, or something else if preferred) so that when tc dumps currently installed filters, some more context can be given to an admin for a given instance (as opposed to just the file descriptor number). Last but not least, bpf_prog_get() and bpf_prog_put() needed to be exported, so that eBPF can be used from cls_bpf built as a module. Thanks to `60a3b2253c` ("net: bpf: make eBPF interpreter images read-only") I think this is of no concern since anything wanting to alter eBPF opcode after verification stage would crash the kernel. [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 14:05:19 -05:00
Daniel Borkmann	24701ecea7	ebpf: move read-only fields to bpf_prog and shrink bpf_prog_aux is_gpl_compatible and prog_type should be moved directly into bpf_prog as they stay immutable during bpf_prog's lifetime, are core attributes and they can be locked as read-only later on via bpf_prog_select_runtime(). With a bit of rearranging, this also allows us to shrink bpf_prog_aux to exactly 1 cacheline. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 14:05:19 -05:00
Daniel Borkmann	96be4325f4	ebpf: add sched_cls_type and map it to sk_filter's verifier ops As discussed recently and at netconf/netdev01, we want to prevent making bpf_verifier_ops registration available for modules, but have them at a controlled place inside the kernel instead. The reason for this is, that out-of-tree modules can go crazy and define and register any verfifier ops they want, doing all sorts of crap, even bypassing available GPLed eBPF helper functions. We don't want to offer such a shiny playground, of course, but keep strict control to ourselves inside the core kernel. This also encourages us to design eBPF user helpers carefully and generically, so they can be shared among various subsystems using eBPF. For the eBPF traffic classifier (cls_bpf), it's a good start to share the same helper facilities as we currently do in eBPF for socket filters. That way, we have BPF_PROG_TYPE_SCHED_CLS look like it's own type, thus one day if there's a good reason to diverge the set of helper functions from the set available to socket filters, we keep ABI compatibility. In future, we could place all bpf_prog_type_list at a central place, perhaps. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 14:05:19 -05:00
Daniel Borkmann	d4052c4aea	ebpf: remove CONFIG_BPF_SYSCALL ifdefs in socket filter code This gets rid of CONFIG_BPF_SYSCALL ifdefs in the socket filter code, now that the BPF internal header can deal with it. While going over it, I also changed eBPF related functions to a sk_filter prefix to be more consistent with the rest of the file. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 14:05:19 -05:00
Daniel Borkmann	a2c83fff58	ebpf: constify various function pointer structs We can move bpf_map_ops and bpf_verifier_ops and other structs into ro section, bpf_map_type_list and bpf_prog_type_list into read mostly. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 14:05:18 -05:00
Florian Westphal	765dd3bb44	rxrpc: don't multiply with HZ twice rxrpc_resend_timeout has an initial value of 4 * HZ; use it as-is. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 13:40:23 -05:00
Florian Westphal	c03ae533a9	rxrpc: terminate retrans loop when sending of skb fails Typo, 'stop' is never set to true. Seems intent is to not attempt to retransmit more packets after sendmsg returns an error. This change is based on code inspection only. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 13:40:23 -05:00
Arvid Brodin	56b08fdcf6	net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface. To repeat: $ sudo ip link del hsr0 BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff8187f495>] hsr_del_port+0x15/0xa0 etc... Bug description: As part of the hsr master device destruction, hsr_del_port() is called for each of the hsr ports. At each such call, the master device is updated regarding features and mtu. When the master device is freed before the slave interfaces, master will be NULL in hsr_del_port(), which led to a NULL pointer dereference. Additionally, dev_put() was called on the master device itself in hsr_del_port(), causing a refcnt error. A third bug in the same code path was that the rtnl lock was not taken before hsr_del_port() was called as part of hsr_dev_destroy(). The reporter (Nicolas Dichtel) also said: "hsr_netdev_notify() supposes that the port will always be available when the notification is for an hsr interface. It's wrong. For example, netdev_wait_allrefs() may resend NETDEV_UNREGISTER.". As a precaution against this, a check for port == NULL was added in hsr_dev_notify(). Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Fixes: `51f3c60531` ("net/hsr: Move slave init to hsr_slave.c.") Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 13:40:23 -05:00
Eric Dumazet	cac5e65e8a	net: do not use rcu in rtnl_dump_ifinfo() We did a failed attempt in the past to only use rcu in rtnl dump operations (commit `e67f88dd12` "net: dont hold rtnl mutex during netlink dump callbacks") Now that dumps are holding RTNL anyway, there is no need to also use rcu locking, as it forbids any scheduling ability, like GFP_KERNEL allocations that controlling path should use instead of GFP_ATOMIC whenever possible. This should fix following splat Cong Wang reported : [ INFO: suspicious RCU usage. ] 3.19.0+ #805 Tainted: G W include/linux/rcupdate.h:538 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ip/771: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff8182b8f4>] netlink_dump+0x21/0x26c #1: (rcu_read_lock){......}, at: [<ffffffff817d785b>] rcu_read_lock+0x0/0x6e stack backtrace: CPU: 3 PID: 771 Comm: ip Tainted: G W 3.19.0+ #805 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000001 ffff8800d51e7718 ffffffff81a27457 0000000029e729e6 ffff8800d6108000 ffff8800d51e7748 ffffffff810b539b ffffffff820013dd 00000000000001c8 0000000000000000 ffff8800d7448088 ffff8800d51e7758 Call Trace: [<ffffffff81a27457>] dump_stack+0x4c/0x65 [<ffffffff810b539b>] lockdep_rcu_suspicious+0x107/0x110 [<ffffffff8109796f>] rcu_preempt_sleep_check+0x45/0x47 [<ffffffff8109e457>] ___might_sleep+0x1d/0x1cb [<ffffffff8109e67d>] __might_sleep+0x78/0x80 [<ffffffff814b9b1f>] idr_alloc+0x45/0xd1 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff814b9f9d>] ? idr_for_each+0x53/0x101 [<ffffffff817c1383>] alloc_netid+0x61/0x69 [<ffffffff817c14c3>] __peernet2id+0x79/0x8d [<ffffffff817c1ab7>] peernet2id+0x13/0x1f [<ffffffff817d8673>] rtnl_fill_ifinfo+0xa8d/0xc20 [<ffffffff810b17d9>] ? __lock_is_held+0x39/0x52 [<ffffffff817d894f>] rtnl_dump_ifinfo+0x149/0x213 [<ffffffff8182b9c2>] netlink_dump+0xef/0x26c [<ffffffff8182bcba>] netlink_recvmsg+0x17b/0x2c5 [<ffffffff817b0adc>] __sock_recvmsg+0x4e/0x59 [<ffffffff817b1b40>] sock_recvmsg+0x3f/0x51 [<ffffffff817b1f9a>] ___sys_recvmsg+0xf6/0x1d9 [<ffffffff8115dc67>] ? handle_pte_fault+0x6e1/0xd3d [<ffffffff8100a3a0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109f45b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109f6ac>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811abde8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac556>] ? __fget_light+0x2d/0x52 [<ffffffff817b376f>] __sys_recvmsg+0x42/0x60 [<ffffffff817b379f>] SyS_recvmsg+0x12/0x1c Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `0c7aecd4bd` ("netns: add rtnl cmd to add and get peer netns ids") Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-01 00:07:00 -05:00
David S. Miller	32034e0580	A few patches have accumulated, among them the fix for Linus's four-way-handshake problem. The others are various small fixes for problems all over, nothing really stands out. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJU8HQ5AAoJEDBSmw7B7bqrGeUQAK7O/UliGPxPNSMs1Mmb+Fj/ GT51sGhZLYXUj2nMMFTTPuhR5qXdgSsBaonitNQaAmJtMtmO6R21rcd74NDB29i7 1xHgtdxJDyDYh6zWUX6+IETV/eRHuS7vlfiiBY/PHAymAQZVa1doC5nN0yCfPHkd XanD4HRXEZppiw+OVh3VkgvxQZkhqmExQDdYZYWGVm8b3cspqa0l/e9WpLrU/Z19 15liupuhINcDAl5KAsR0oaBy/vUsETA6+H9K4fzZG6bemlaD5T+nkNqjlEPbw806 sMxiWZ7jrI1n5v2KiNHkKTJakzHsz08zb9nooA9swusSImflr8BUm3lNz+IVsqtL zBvtMEiGwNlGkSLdHieICBktXw7/Lw2VUSKgWlRj1LZSXfd3kISXrwdR46NOr93e cBzaGn7H8YrVM2Sl/EraRimYL9womICPr6+flE+YAeRnZzsAFrBd9qegc2N3LTt+ NtTqS/JbGAKyOS84JCge+1omGOKPIuQGBOtEIy1U0zlxibZzEzCulrKSeGjQ8o3Q dfrMYEwpicK4ADh7wxeslD6QlgkcQ/Ft7agkP+ps4m5ngFXo3swup+hxNMDoyyi3 BYiOGDum6SBzrOK8bN4hv3jEHK7e3fo68NeLhJ34Ap4K7jY9kOi8sk7WY8NerZBI 19oVt0+0yyzKVlaI7IQE =H3+7 -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A few patches have accumulated, among them the fix for Linus's four-way-handshake problem. The others are various small fixes for problems all over, nothing really stands out. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-28 23:33:53 -05:00
Eric Dumazet	74abc20ced	tcp: cleanup static functions tcp_fastopen_create_child() is static and should not be exported. tcp4_gso_segment() and tcp6_gso_segment() should be static. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-28 16:56:51 -05:00
James Minor	76a70e9c4b	cfg80211-wext: return -E2BIG when buffer can't hold full BSS entry When using the wext compatibility code in cfg80211, part of the IEs can be truncated if the passed user buffer is large enough for part of the BSS but not large enough for all of the IEs. This can cause an EAP network to show up as a PSK network. Always return -E2BIG in this case to avoid truncating data. Since this changes the control flow, use an on-stack variable for a small buffer instead of allocating it. Signed-off-by: James Minor <james.minor@ni.com> [rework patch to error out immediately, use _check wrappers] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-28 21:31:12 +01:00
Johannes Berg	abfbc3af57	mac80211: remove TX latency measurement code Revert commit `ad38bfc916` ("mac80211: Tx frame latency statistics") (along with some follow-up fixes). This code turned out not to be as useful in the current form as we thought, and we've internally hacked it up more, but that's not very suitable for upstream (for now), and we might just do that with tracing instead. Therefore, for now at least, remove this code. We might also need to use the skb->tstamp field for the TCP performance issue, which is more important than the debugging. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-28 21:31:11 +01:00
Masashi Honma	31f909a2c0	nl/mac80211: allow zero plink timeout to disable STA expiration Both wpa_supplicant and mac80211 have and inactivity timer. By default wpa_supplicant will be timed out in 5 minutes and mac80211's it is 30 minutes. If wpa_supplicant uses a longer timer than mac80211, it will get unexpected disconnection by mac80211. Using 0xffffffff instead as the configured value could solve this w/o changing the code, but due to integer overflow in the expression used this doesn't work. The expression is: (current jiffies) > (frame Rx jiffies + NL80211_MESHCONF_PLINK_TIMEOUT * 250) On 32bit system, the right side would overflow and be a very small value if NL80211_MESHCONF_PLINK_TIMEOUT is sufficiently large, causing unexpectedly early disconnections. Instead allow disabling the inactivity timer to avoid this situation, by passing the (previously invalid and useless) value 0. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> [reword/rewrap commit log] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-28 21:31:10 +01:00
Johannes Berg	2afe38d15c	cfg80211-wext: export symbols only when needed When a fully converted cfg80211 driver needs cfg80211-wext for userspace API purposes, the symbols need not be exported. When other drivers (orinoco/hermes or ipw2200) are enabled, they do need the symbols exported as they use them directly. Make those drivers select a new CFG80211_WEXT_EXPORT Kconfig symbol (instead of just CFG80211_WEXT) and export the functions only if requested - this saves about 1/2k due to the size of EXPORT_SYMBOL() itself. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-28 21:31:09 +01:00
Johannes Berg	7d9bb2f065	mac80211: iterate using station list in AP SMPS When changing AP SMPS, we need to look up all the stations for this interface, so there's no reason to iterate over hash chains rather than doing the simpler iteration over the station list. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-28 21:31:09 +01:00
Johannes Berg	9d6b106b54	mac80211: don't look up stations for multicast addresses Since multicast addresses don't exist as stations, don't attempt to look them up in the hashtable on TX. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-28 21:28:44 +01:00
Eric W. Biederman	06615bed60	net: Verify permission to link_net in newlink When applicable verify that the caller has permisson to the underlying network namespace for a newly created network device. Similary checks exist for the network namespace a network device will be created in. Fixes: `317f4810e4` ("rtnl: allow to create device with IFLA_LINK_NETNSID set") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-28 15:14:44 -05:00
Eric W. Biederman	505ce4154a	net: Verify permission to dest_net in newlink When applicable verify that the caller has permision to create a network device in another network namespace. This check is already present when moving a network device between network namespaces in setlink so all that is needed is to duplicate that check in newlink. This change almost backports cleanly, but there are context conflicts as the code that follows was added in v4.0-rc1 Fixes: `b51642f6d7` net: Enable a userns root rtnl calls that are safe for unprivilged users Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-28 15:14:44 -05:00
Eric Dumazet	a0ea700e40	tcp: tso: allow CA_CWR state in tcp_tso_should_defer() Another TCP issue is triggered by ECN. Under pressure, receiver gets ECN marks, and send back ACK packets with ECE TCP flag. Senders enter CA_CWR state. In this state, tcp_tso_should_defer() is short cut : if (icsk->icsk_ca_state != TCP_CA_Open) goto send_now; This means that about all ACK packets we receive are triggering a partial send, and because cwnd is kept small, we can only send a small amount of data for each incoming ACK, which in return generate more ACK packets. Allowing CA_Open and CA_CWR states to enable TSO defer in tcp_tso_should_defer() brings performance back : TSO autodefer has more chance to defer under pressure. This patch increases TSO and LRO/GRO efficiency back to normal levels, and does not impact overall ECN behavior. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-28 15:10:39 -05:00
Eric Dumazet	50c8339e92	tcp: tso: restore IW10 after TSO autosizing With sysctl_tcp_min_tso_segs being 4, it is very possible that tcp_tso_should_defer() decides not sending last 2 MSS of initial window of 10 packets. This also applies if autosizing decides to send X MSS per GSO packet, and cwnd is not a multiple of X. This patch implements an heuristic based on age of first skb in write queue : If it was sent very recently (less than half srtt), we can predict that no ACK packet will come in less than half rtt, so deferring might cause an under utilization of our window. This is visible on initial send (IW10) on web servers, but more generally on some RPC, as the last part of the message might need an extra RTT to get delivered. Tested: Ran following packetdrill test // A simple server-side test that sends exactly an initial window (IW10) // worth of packets. `sysctl -e -q net.ipv4.tcp_min_tso_segs=4` 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +.1 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.1 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 +0 write(4, ..., 14600) = 14600 +0 > . 1:5841(5840) ack 1 win 457 +0 > . 5841:11681(5840) ack 1 win 457 // Following packet should be sent right now. +0 > P. 11681:14601(2920) ack 1 win 457 +.1 < . 1:1(0) ack 14601 win 257 +0 close(4) = 0 +0 > F. 14601:14601(0) ack 1 +.1 < F. 1:1(0) ack 14602 win 257 +0 > . 14602:14602(0) ack 2 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-28 15:10:39 -05:00
Eric Dumazet	5f852eb536	tcp: tso: remove tp->tso_deferred TSO relies on ability to defer sending a small amount of packets. Heuristic is to wait for future ACKS in hope to send more packets at once. Current algorithm uses a per socket tso_deferred field as a pseudo timer. This pseudo timer relies on future ACK, but there is no guarantee we receive them in time. Fix would be to use a real timer, but cost of such timer is probably too expensive for typical cases. This patch changes the logic to test the time of last transmit, because we should not add bursts of more than 1ms for any given flow. We've used this patch for about two years at Google, before FQ/pacing as it would reduce a fair amount of bursts. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-28 15:10:39 -05:00
Erik Hugne	d76a436d50	tipc: make media address offset a common define With the exception of infiniband media which does not use media offsets, the media address is always located at offset 4 in the media info field as defined by the protocol, so we move the definition to the generic bearer.h Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 18:18:48 -05:00
Erik Hugne	91e2eb5684	tipc: rename media/msg related definitions The TIPC_MEDIA_ADDR_SIZE and TIPC_MEDIA_ADDR_OFFSET names are misleading, as they actually define the size and offset of the whole media info field and not the address part. This patch does not have any functional changes. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 18:18:48 -05:00
Erik Hugne	afaa3f65f6	tipc: purge links when bearer is disabled If a bearer is disabled by manual intervention, all links over that bearer should be purged, indicated with the 'shutting_down' flag. Otherwise tipc will get confused if a new bearer is enabled using a different media type. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 18:18:47 -05:00
Erik Hugne	7fe8097cef	tipc: fix nullpointer bug when subscribing to events If a subscription request is sent to a topology server connection, and any error occurs (malformed request, oom or limit reached) while processing this request, TIPC should terminate the subscriber connection. While doing so, it tries to access fields in an already freed (or never allocated) subscription element leading to a nullpointer exception. We fix this by removing the subscr_terminate function and terminate the connection immediately upon any subscription failure. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 18:18:47 -05:00
Erik Hugne	3622c36f37	tipc: only create header copy for name distr messages The TIPC name distributor pushes topology updates to the cluster neighbors. Currently this is done in a unicast manner, and the skb holding the update is cloned for each cluster member. This is unnecessary, as we only modify the destnode field in the header so we change it to do pskb_copy instead. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 18:18:47 -05:00
Alexander Duyck	79e5ad2ceb	fib_trie: Remove leaf_info At this point the leaf_info hash is redundant. By adding the suffix length to the fib_alias hash list we no longer have need of leaf_info as we can determine the prefix length from fa_slen. So we can compress things by dropping the leaf_info structure from fib_trie and instead directly connect the leaves to the fib_alias hash list. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:37:07 -05:00
Alexander Duyck	9b6ebad5c3	fib_trie: Add slen to fib alias Make use of an empty spot in the alias to store the suffix length so that we don't need to pull that information from the leaf_info structure. This patch also makes a slight change to the user statistics. Instead of incrementing semantic_match_miss once per leaf_info miss we now just increment it once per leaf if a match was not found. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:37:07 -05:00
Alexander Duyck	5786ec6054	fib_trie: Replace plen with slen in leaf_info This replaces the prefix length variable in the leaf_info structure with a suffix length value, or host identifier length in bits. By doing this it makes it easier to sort out since the tnodes and leaf are carrying this value as well since it is compatible with the ->pos field in tnodes. I also cleaned up one spot that had some list manipulation that could be simplified. I basically updated it so that we just use hlist_add_head_rcu instead of calling hlist_add_before_rcu on the first node in the list. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:37:06 -05:00
Alexander Duyck	56315f9e6e	fib_trie: Convert fib_alias to hlist from list There isn't any advantage to having it as a list and by making it an hlist we make the fib_alias more compatible with the list_info in terms of the type of list used. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:37:06 -05:00
Madhu Challa	93a714d6b5	multicast: Extend ip address command to enable multicast group join/leave on Joining multicast group on ethernet level via "ip maddr" command would not work if we have an Ethernet switch that does igmp snooping since the switch would not replicate multicast packets on ports that did not have IGMP reports for the multicast addresses. Linux vxlan interfaces created via "ip link add vxlan" have the group option that enables then to do the required join. By extending ip address command with option "autojoin" we can get similar functionality for openvswitch vxlan interfaces as well as other tunneling mechanisms that need to receive multicast traffic. The kernel code is structured similar to how the vxlan driver does a group join / leave. example: ip address add 224.1.1.10/24 dev eth5 autojoin ip address del 224.1.1.10/24 dev eth5 Signed-off-by: Madhu Challa <challa@noironetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:25:25 -05:00
Madhu Challa	46a4dee074	igmp v6: add __ipv6_sock_mc_join and __ipv6_sock_mc_drop Based on the igmp v4 changes from Eric Dumazet. 959d10f6bbf6("igmp: add __ip_mc_{join\|leave}_group()") These changes are needed to perform igmp v6 join/leave while RTNL is held. Make ipv6_sock_mc_join and ipv6_sock_mc_drop wrappers around __ipv6_sock_mc_join and __ipv6_sock_mc_drop to avoid proliferation of work queues. Signed-off-by: Madhu Challa <challa@noironetworks.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:25:24 -05:00
Daniel Borkmann	4c4b52d9b2	rhashtable: remove indirection for grow/shrink decision functions Currently, all real users of rhashtable default their grow and shrink decision functions to rht_grow_above_75() and rht_shrink_below_30(), so that there's currently no need to have this explicitly selectable. It can/should be generic and private inside rhashtable until a real use case pops up. Since we can make this private, we'll save us this additional indirection layer and can improve insertion/deletion time as well. Reference: http://patchwork.ozlabs.org/patch/443040/ Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:06:02 -05:00
Tom Herbert	723b8e460d	udp: In udp_flow_src_port use random hash value if skb_get_hash fails In the unlikely event that skb_get_hash is unable to deduce a hash in udp_flow_src_port we use a consistent random value instead. This is specified in GRE/UDP draft section 3.2.1: https://tools.ietf.org/html/draft-ietf-tsvwg-gre-in-udp-encap-04 Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:00:01 -05:00
Jiri Slaby	9391976a4d	Bluetooth: make hci_test_bit's addr const gcc5 warns about passing a const array to hci_test_bit which takes a non-const pointer: net/bluetooth/hci_sock.c: In function ‘hci_sock_sendmsg’: net/bluetooth/hci_sock.c:955:8: warning: passing argument 2 of ‘hci_test_bit’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers] &hci_sec_filter.ocf_mask[ogf])) && ^ net/bluetooth/hci_sock.c:49:19: note: expected ‘void ’ but argument is of type ‘const __u32 ()[4] {aka const unsigned int ()[4]}’ static inline int hci_test_bit(int nr, void addr) ^ So make 'addr' 'const void *'. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com>	2015-02-27 18:29:19 +01:00
Johan Hedberg	4cd3928a8b	Bluetooth: Update New CSRK event to match latest specification The 'master' parameter of the New CSRK event was recently renamed to 'type', with the old values kept for backwards compatibility as unauthenticated local/remote keys. This patch updates the code to take into account the two new (authenticated) values and ensures they get used based on the security level of the connection that the respective keys get distributed over. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-27 18:25:48 +01:00
Dan Carpenter	76cb4be993	sunrpc: integer underflow in rsc_parse() If we call groups_alloc() with invalid values then it's might lead to memory corruption. For example, with a negative value then we might not allocate enough for sizeof(struct group_info). (We're doing this in the caller for consistency with other callers of groups_alloc(). The other alternative might be to move the check out of all the callers into groups_alloc().) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-02-26 15:40:16 -05:00
Jouni Malinen	9c1c98a3bb	mac80211: Send EAPOL frames at lowest rate The current minstrel_ht rate control behavior is somewhat optimistic in trying to find optimum TX rate. While this is usually fine for normal Data frames, there are cases where a more conservative set of retry parameters would be beneficial to make the connection more robust. EAPOL frames are critical to the authentication and especially the EAPOL-Key message 4/4 (the last message in the 4-way handshake) is important to get through to the AP. If that message is lost, the only recovery mechanism in many cases is to reassociate with the AP and start from scratch. This can often be avoided by trying to send the frame with more conservative rate and/or with more link layer retries. In most cases, minstrel_ht is currently using the initial EAPOL-Key frames for probing higher rates and this results in only five link layer transmission attempts (one at high(ish) MCS and four at MCS0). While this works with most APs, it looks like there are some deployed APs that may have issues with the EAPOL frames using HT MCS immediately after association. Similarly, there may be issues in cases where the signal strength or radio environment is not good enough to be able to get frames through even at couple of MCS 0 tries. The best approach for this would likely to be to reduce the TX rate for the last rate (3rd rate parameter in the set) to a low basic rate (say, 6 Mbps on 5 GHz and 2 or 5.5 Mbps on 2.4 GHz), but doing that cleanly requires some more effort. For now, we can start with a simple one-liner that forces the minimum rate to be used for EAPOL frames similarly how the TX rate is selected for the IEEE 802.11 Management frames. This does result in a small extra latency added to the cases where the AP would be able to receive the higher rate, but taken into account how small number of EAPOL frames are used, this is likely to be insignificant. A future optimization in the minstrel_ht design can also allow this patch to be reverted to get back to the more optimized initial TX rate. It should also be noted that many drivers that do not use minstrel as the rate control algorithm are already doing similar workarounds by forcing the lowest TX rate to be used for EAPOL frames. Cc: stable@vger.kernel.org Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-26 21:03:06 +01:00
Roopa Prabhu	fed0a159c8	bridge: fix link notification skb size calculation to include vlan ranges my previous patch skipped vlan range optimizations during skb size calculations for simplicity. This incremental patch considers vlan ranges during skb size calculations. This leads to a bit of code duplication in the fill and size calculation functions. But, I could not find a prettier way to do this. will take any suggestions. Previously, I had reused the existing br_get_link_af_size size calculation function to calculate skb size for notifications. Reusing it this time around creates some change in behaviour issues for the usual .get_link_af_size callback. This patch adds a new br_get_link_af_size_filtered() function to base the size calculation on the incoming filter flag and include vlan ranges. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reviewed-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-26 11:25:43 -05:00
Guenter Roeck	d79d210736	net: dsa: Introduce dsa_is_port_initialized To avoid race conditions when using the ds->ports[] array, we need to check if the accessed port has been initialized. Introduce and use helper function dsa_is_port_initialized for that purpose and use it where needed. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-25 17:57:48 -05:00
Florian Fainelli	b73adef677	net: dsa: integrate with SWITCHDEV for HW bridging In order to support bridging offloads in DSA switch drivers, select NET_SWITCHDEV to get access to the port_stp_update and parent_get_id NDOs that we are required to implement. To facilitate the integratation at the DSA driver level, we implement 3 types of operations: - port_join_bridge - port_leave_bridge - port_stp_update DSA will resolve which switch ports that are currently bridge port members as some Switch hardware/drivers need to know about that to limit the register programming to just the relevant registers (especially for slow MDIO buses). We also take care of setting the correct STP state when slave network devices are brought up/down while being bridge members. Finally, when a port is leaving the bridge, we make sure we set in BR_STATE_FORWARDING state, otherwise the bridge layer would leave it disabled as a result of having left the bridge. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-25 17:03:38 -05:00
Guenter Roeck	d87d6f44d7	net: dsa: Ensure that port array elements are initialized before being used A network device notifier can be called for one or more of the created slave devices before all slave devices have been registered. This can result in a mismatch between ds->phys_port_mask and the registered devices by the time the call is made, and it can result in a slave device being added to a bridge before its entry in ds->ports[] has been initialized. Rework the initialization code to initialize entries in ds->ports[] in dsa_slave_create. With this change, dsa_slave_create no longer needs to return slave_dev but can return an error code instead. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-25 17:03:38 -05:00
Marcelo Ricardo Leitner	d752c36457	ipvs: allow rescheduling of new connections when port reuse is detected Currently, when TCP/SCTP port reusing happens, IPVS will find the old entry and use it for the new one, behaving like a forced persistence. But if you consider a cluster with a heavy load of small connections, such reuse will happen often and may lead to a not optimal load balancing and might prevent a new node from getting a fair load. This patch introduces a new sysctl, conn_reuse_mode, that allows controlling how to proceed when port reuse is detected. The default value will allow rescheduling of new connections only if the old entry was in TIME_WAIT state for TCP or CLOSED for SCTP. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-25 13:46:35 +09:00
Ian Morris	49e64dcda2	ipv6: remove dead debug code from ip6_tunnel.c The IP6_TNL_TRACE macro is no longer used anywhere in the code so remove definition. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-24 16:25:25 -05:00
Alexander Drozdov	41a50d621a	af_packet: don't pass empty blocks for PACKET_V3 Before `da413eec72` ("packet: Fixed TPACKET V3 to signal poll when block is closed rather than every packet") poll listening for an af_packet socket was not signaled if there was no packets to process. After the patch poll is signaled evety time when block retire timer expires. That happens because af_packet closes the current block on timeout even if the block is empty. Passing empty blocks to the user not only wastes CPU but also wastes ring buffer space increasing probability of packets dropping on small timeouts. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Cc: Dan Collins <dan@dcollins.co.nz> Cc: Willem de Bruijn <willemb@google.com> Cc: Guy Harris <guy@alum.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-24 16:08:42 -05:00
Sasha Levin	4e10fd5b4a	rtnetlink: avoid 0 sized arrays Arrays (when not in a struct) "shall have a value greater than zero". GCC complains when it's not the case here. Fixes: `ba7d49b1f0` ("rtnetlink: provide api for getting and setting slave info") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-24 15:39:09 -05:00
Jiri Slaby	17dce15801	mac80211/minstrel: fix !x!=0 confusion Commit `06d961a8e2` ("mac80211/minstrel: use the new rate control API") inverted the condition 'if (msr->sample_limit != 0)' to 'if (!msr->sample_limit != 0)'. But it is confusing both to people and compilers (gcc5): net/mac80211/rc80211_minstrel.c: In function 'minstrel_get_rate': net/mac80211/rc80211_minstrel.c:376:26: warning: logical not is only applied to the left hand side of comparison if (!msr->sample_limit != 0) ^ Let there be only 'if (!msr->sample_limit)'. Fixes: `06d961a8e2` ("mac80211/minstrel: use the new rate control API") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-24 21:12:07 +01:00
Pablo Neira Ayuso	8f711a601d	Merge https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs Simon Horman says: ==================== Second Round of IPVS Fixes for v3.20 This patch resolves some memory leaks in connection synchronisation code that date back to v2.6.39. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-02-24 19:35:27 +01:00
Junjie Mao	81daf735f9	cfg80211: calls nl80211_exit on error nl80211_exit should be called in cfg80211_init if nl80211_init succeeds but regulatory_init or create_singlethread_workqueue fails. Signed-off-by: Junjie Mao <junjie_mao@yeah.net> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-24 11:41:21 +01:00
Jason Abele	28981e5eb4	cfg80211: fix n_reg_rules to match world_regdom There are currently 8 rules in the world_regdom, but only the first 6 are applied due to an incorrect value for n_reg_rules. This causes channels 149-165 and 60GHz to be disabled. Signed-off-by: Jason Abele <jason@aether.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-24 10:58:37 +01:00
Johannes Berg	a18c7192aa	nl80211: fix memory leak in monitor flags parsing If monitor flags parsing results in active monitor but that isn't supported, the already allocated message is leaked. Fix this by moving the allocation after this check. Reported-by: Christian Engelmayer <cengelma@gmx.at> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-24 10:56:42 +01:00
Samuel Tan	5528fae886	nl80211: use loop index as type for net detect frequency results We currently add nested members of the NL80211_ATTR_SCAN_FREQUENCIES as NLA_U32 attributes of type NL80211_ATTR_WIPHY_FREQ in cfg80211_net_detect_results. However, since there can be an arbitrary number of frequency results, we should use the loop index of the loop used to add the frequency results to NL80211_ATTR_SCAN_FREQUENCIES as the type (i.e. nla_type) for each result attribute, rather than a fixed type. This change is in line with how nested members are added to NL80211_ATTR_SCAN_FREQUENCIES in the functions nl80211_send_wowlan_nd and nl80211_add_scan_req. Signed-off-by: Samuel Tan <samueltan@chromium.org> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-24 10:53:50 +01:00
Eliad Peller	104f5a6206	mac80211: clear sdata->radar_required If ieee80211_vif_use_channel() fails, we have to clear sdata->radar_required (which we might have just set). Failing to do it results in stale radar_required field which prevents starting new scan requests. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Eliad Peller <eliad@wizery.com> [use false instead of 0] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-24 10:51:06 +01:00
Trond Myklebust	7a27eae362	NFS: RDMA Client Sparse Fix #2 This patch fixes another sparse fix found by Dan Carpenter's tool. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJU66HvAAoJENfLVL+wpUDrJdYP/Ap2xiMpEmO6Zl4h/MiDJKYs GIL3o+Ayc+wzlga2qBuNMfFBqzochCL2xqgS/9bvA0z/OT3ilq+FKZRr8FuDsmb+ 5XtIS49Kvo2mjWNpnh8GMoCbBoQfkBMJLETtR0KhhrMBNX5sidBPi2cSeZJxqnAw 9LIc3Od9jyZSLh2je3gJCKxh90CVWgjs9OfJookfksNyVMnwK+TRWjo1FInEfkW6 2fMFrC20xBg0qWWgtZL73ib8ojaIwv5MxnyFNgsiv/xcDWShZhdaMizBlqjDiNUq co9S1skJ9RTgInoDLeWvKoKkc7JjkLTpSzqsZx+Kke75p/D41Sah215jp0yyz2tb fEi+lZ8q1f5iorjamw5UDQPAuNcPx/eBzQNmkFGbue2xDP/kOXErb5AtnfMpDqav FXX/C3Vf6BxeAVB+8VifcdKEchE6cN9St6SBUludIisdmQWLLjSxazTAG7MXZdYW X0dxJ7D2hcwGEbjYKdNOn6a8+N3m1++XiJdPc1KSN71EO6z0xqjQM+3PytXCXaGI t3dKCDoEM/Ee1+4VxVa02OjW62ApNDd8APrCrn+8zMglG0eFwPmocNx/U42WanXd bGnEI9qd6mvn22eJrrOc55gShaDjovhXRmvvc2E8tHKVnaMXdfG8o5CjtB6LPXbW 6deK+m1/LxkzFhRvMySs =ooZn -----END PGP SIGNATURE----- Merge tag 'nfs-rdma-for-4.0-3' of git://git.linux-nfs.org/projects/anna/nfs-rdma NFS: RDMA Client Sparse Fix #2 This patch fixes another sparse fix found by Dan Carpenter's tool. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> * tag 'nfs-rdma-for-4.0-3' of git://git.linux-nfs.org/projects/anna/nfs-rdma: xprtrdma: Store RDMA credits in unsigned variables	2015-02-23 19:16:43 -05:00
Marcelo Leitner	77751427a1	ipv6: addrconf: validate new MTU before applying it Currently we don't check if the new MTU is valid or not and this allows one to configure a smaller than minimum allowed by RFCs or even bigger than interface own MTU, which is a problem as it may lead to packet drops. If you have a daemon like NetworkManager running, this may be exploited by remote attackers by forging RA packets with an invalid MTU, possibly leading to a DoS. (NetworkManager currently only validates for values too small, but not for too big ones.) The fix is just to make sure the new value is valid. That is, between IPV6_MIN_MTU and interface's MTU. Note that similar check is already performed at ndisc_router_discovery(), for when kernel itself parses the RA. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-23 18:16:12 -05:00
Catalin Marinas	d720d8cec5	net: compat: Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg With commit `a7526eb5d0` (net: Unbreak compat_sys_{send,recv}msg), the MSG_CMSG_COMPAT flag is blocked at the compat syscall entry points, changing the kernel compat behaviour from the one before the commit it was trying to fix (`1be374a051`, net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg). On 32-bit kernels (!CONFIG_COMPAT), MSG_CMSG_COMPAT is 0 and the native 32-bit sys_sendmsg() allows flag 0x80000000 to be set (it is ignored by the kernel). However, on a 64-bit kernel, the compat ABI is different with commit `a7526eb5d0`. This patch changes the compat_sys_{send,recv}msg behaviour to the one prior to commit `1be374a051`. The problem was found running 32-bit LTP (sendmsg01) binary on an arm64 kernel. Arguably, LTP should not pass 0xffffffff as flags to sendmsg() but the general rule is not to break user ABI (even when the user behaviour is not entirely sane). Fixes: `a7526eb5d0` (net: Unbreak compat_sys_{send,recv}msg) Cc: Andy Lutomirski <luto@amacapital.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-23 17:22:05 -05:00
Fabian Frederick	a948f8ce77	irda: replace current->state by set_current_state() Use helper functions to access current->state. Direct assignments are prone to races and therefore buggy. current->state = TASK_RUNNING can be replaced by __set_current_state() Thanks to Peter Zijlstra for the exact definition of the problem. Suggested-By: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-23 17:21:11 -05:00
Chuck Lever	9b1dcbc8cf	xprtrdma: Store RDMA credits in unsigned variables Dan Carpenter's static checker pointed out: net/sunrpc/xprtrdma/rpc_rdma.c:879 rpcrdma_reply_handler() warn: can 'credits' be negative? "credits" is defined as an int. The credits value comes from the server as a 32-bit unsigned integer. A malicious or broken server can plant a large unsigned integer in that field which would result in an underflow in the following logic, potentially triggering a deadlock of the mount point by blocking the client from issuing more RPC requests. net/sunrpc/xprtrdma/rpc_rdma.c: 876 credits = be32_to_cpu(headerp->rm_credit); 877 if (credits == 0) 878 credits = 1; /* don't deadlock */ 879 else if (credits > r_xprt->rx_buf.rb_max_requests) 880 credits = r_xprt->rx_buf.rb_max_requests; 881 882 cwnd = xprt->cwnd; 883 xprt->cwnd = credits << RPC_CWNDSHIFT; 884 if (xprt->cwnd > cwnd) 885 xprt_release_rqst_cong(rqst->rq_task); Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `eba8ff660b` ("xprtrdma: Move credit update to RPC . . .") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-02-23 16:54:04 -05:00
Rasmus Villemoes	46b9e4bb76	decnet: Fix obvious o/0 typo Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-23 15:28:50 -05:00
Neal Cardwell	6514890f7a	tcp: fix tcp_should_expand_sndbuf() to use tcp_packets_in_flight() tcp_should_expand_sndbuf() does not expand the send buffer if we have filled the congestion window. However, it should use tcp_packets_in_flight() instead of tp->packets_out to make this check. Testing has established that the difference matters a lot if there are many SACKed packets, causing a needless performance shortfall. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-22 23:07:11 -05:00
Roopa Prabhu	b7853d73e3	bridge: add vlan info to bridge setlink and dellink notification messages vlan add/deletes are not notified to userspace today. This patch adds vlan info to bridge newlink/dellink notifications generated from the bridge driver. Notifications use the RTEXT_FILTER_BRVLAN_COMPRESSED flag to compress vlans into ranges whereever applicable. The size calculations does not take ranges into account for simplicity. This has the potential for allocating a larger skb than required. There is an existing inconsistency with bridge NEWLINK and DELLINK change notifications. Both generate NEWLINK notifications. Since its always a NEWLINK notification, this patch includes all vlans the port belongs to in the notification. The NEWLINK and DELLINK request messages however only include the vlans to be added and deleted. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-22 23:00:45 -05:00
Eric Dumazet	52d6c8c6ca	net: pktgen: disable xmit_clone on virtual devices Trying to use burst capability (aka xmit_more) on a virtual device like bonding is not supported. For example, skb might be queued multiple times on a qdisc, with various list corruptions. Fixes: `38b2cf2982` ("net: pktgen: packet bursting via skb->xmit_more") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-22 22:43:20 -05:00
Ameen Ali	e099b2d9df	net: __aligned(size) is preferred over __attribute__((aligned(size))) Signed-off-by: Ameen Ali <AmeenAli023@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-22 17:01:22 -05:00
Joe Perches	92b8391750	batman-adv: Fix use of seq_has_overflowed() net-next commit `6d91147d18` ("batman-adv: Remove uses of return value of seq_printf") incorrectly changed the overflow occurred return from -1 to 1. Change it back so that the test of batadv_write_buffer_text's return value in batadv_gw_client_seq_print_text works properly. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-22 17:00:08 -05:00
Julian Anastasov	528c943f3b	ipvs: add missing ip_vs_pe_put in sync code ip_vs_conn_fill_param_sync() gets in param.pe a module reference for persistence engine from __ip_vs_pe_getbyname() but forgets to put it. Problem occurs in backup for sync protocol v1 (2.6.39). Also, pe_data usually comes in sync messages for connection templates and ip_vs_conn_new() copies the pointer only in this case. Make sure pe_data is not leaked if it comes unexpectedly for normal connections. Leak can happen only if bogus messages are sent to backup server. Fixes: `fe5e7a1efb` ("IPVS: Backup, Adding Version 1 receive capability") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-22 16:16:36 -05:00
Bojan Prtvar	059a2440fd	net: Remove state argument from skb_find_text() Although it is clear that textsearch state is intentionally passed to skb_find_text() as uninitialized argument, it was never used by the callers. Therefore, we can simplify skb_find_text() by making it local variable. Signed-off-by: Bojan Prtvar <prtvar.b@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-22 15:59:54 -05:00
Pablo Neira Ayuso	02263db00b	netfilter: nf_tables: fix addition/deletion of elements from commit/abort We have several problems in this path: 1) There is a use-after-free when removing individual elements from the commit path. 2) We have to uninit() the data part of the element from the abort path to avoid a chain refcount leak. 3) We have to check for set->flags to see if there's a mapping, instead of the element flags. 4) We have to check for !(flags & NFT_SET_ELEM_INTERVAL_END) to skip elements that are part of the interval that have no data part, so they don't need to be uninit(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-02-22 21:05:08 +01:00
Arturo Borrero	2156d321b8	netfilter: nft_compat: don't truncate ethernet protocol type to u8 Use u16 for protocol and then cast it to __be16 >> net/netfilter/nft_compat.c:140:37: sparse: incorrect type in assignment (different base types) net/netfilter/nft_compat.c:140:37: expected restricted __be16 [usertype] ethproto net/netfilter/nft_compat.c:140:37: got unsigned char [unsigned] [usertype] proto >> net/netfilter/nft_compat.c:351:37: sparse: incorrect type in assignment (different base types) net/netfilter/nft_compat.c:351:37: expected restricted __be16 [usertype] ethproto net/netfilter/nft_compat.c:351:37: got unsigned char [unsigned] [usertype] proto Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-02-22 21:04:06 +01:00
Alexander Drozdov	3f34b24a73	af_packet: allow packets defragmentation not only for hash fanout type Packets defragmentation was introduced for PACKET_FANOUT_HASH only, see `7736d33f42` ("packet: Add pre-defragmentation support for ipv4 fanouts") It may be useful to have defragmentation enabled regardless of fanout type. Without that, the AF_PACKET user may have to: 1. Collect fragments from different rings 2. Defragment by itself Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-21 23:00:18 -05:00
Dan Carpenter	d340c862e7	ethtool: use "ops" name consistenty in ethtool_set_rxfh() "dev->ethtool_ops" and "ops" are the same, but we should use "ops" everywhere to be consistent. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-21 22:05:53 -05:00
Alex W Slater	29778bec12	ipv6: Replace "#include <asm/uaccess>" with "#include <linux/uaccess>" Fix checkpatch.pl warning "Use #include <linux/uaccess.h> instead of <asm/uaccess.h>" Signed-off-by: Alex W Slater <alex.slater.dev@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-21 22:05:53 -05:00
Matthew Thode	a4176a9391	net: reject creation of netdev names with colons colons are used as a separator in netdev device lookup in dev_ioctl.c Specific functions are SIOCGIFTXQLEN SIOCETHTOOL SIOCSIFNAME Signed-off-by: Matthew Thode <mthode@mthode.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-21 21:45:25 -05:00
Linus Torvalds	24a52e412e	NFS client updates for Linux 3.20 Highlights include: - Fix a use-after-free in decode_cb_sequence_args() - Fix a compile error when #undef CONFIG_PROC_FS - NFSv4.1 backchannel spinlocking issue - Cleanups in the NFS unstable write code requested by Linus - NFSv4.1 fix issues when the server denies our backchannel request - Cleanups in create_session and bind_conn_to_session -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU51zBAAoJEGcL54qWCgDyyvwQALp7vxXCTwC3/OSoPkhujd2C HGMoI2nVSl8CY0LFYxoT6auhefZCgQChmQXkGtlZwO+bkEr9rvJ6ii8lhuDR+JXF M0bAwOX+bNxUFkpqYYF3Q4Hi//tMJCIZdqUp2irtyFLL/qlNoN2ktoEYMIjMY5uO 4L1fxj9KaVuMtFuqt3xeSSe41LaXxitKyefyVJfbqz5qbcPGiXzS7WKYAun4RyvM 7rCir9kfnxxEX3+hc7xHxWeFJnW0jUMklrjNvnrubnpHE/fX2sUAnjX2hmx5bLg/ 4puxADzhPT4f3LdGDKXVWaULuTy20VksOJnB82TKJ3rLIELJixGZw88svOrFcGvt 5CMI8BvOihwn8ov+sSj7Xedz4046btwA1YHkUcwPV3LZAlyx/FSq4nasO4Cn27yl OPdkcAL1YR5I83mEKA+8BOVXJuFx5vKhkwmMdReJkBmxsWbSwB/qp6caPD9DtuXI K0qJYWHMqN+Dv9npi0Q4WR6vmnzxV+Eq7Z2D9WPW0FL8nT3STE0eWIrNipyqOv+7 bHptPxrrZej+i3c922bZ6hdaE2uAlwG8FPgEGhHFtm+s09RgDPDG7NiaeCutf9cQ 9ub82Hlk9nLTqq5X3poUPV35RS6THnZcybhngNMy8F1cPSUTs+/H+l91sEW/Zhgw odMB/DEa9sRnZGX5rQXE =NieR -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull more NFS client updates from Trond Myklebust: "Highlights include: - Fix a use-after-free in decode_cb_sequence_args() - Fix a compile error when #undef CONFIG_PROC_FS - NFSv4.1 backchannel spinlocking issue - Cleanups in the NFS unstable write code requested by Linus - NFSv4.1 fix issues when the server denies our backchannel request - Cleanups in create_session and bind_conn_to_session" * tag 'nfs-for-3.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4.1: Clean up bind_conn_to_session NFSv4.1: Always set up a forward channel when binding the session NFSv4.1: Don't set up a backchannel if the server didn't agree to do so NFSv4.1: Clean up create_session pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit NFSv4: Kill unused nfs_inode->delegation_state field NFS: struct nfs_commit_info.lock must always point to inode->i_lock nfs: Can call nfs_clear_page_commit() instead nfs: Provide and use helper functions for marking a page as unstable SUNRPC: Always manipulate rpc_rqst::rq_bc_pa_list under xprt->bc_pa_lock SUNRPC: Fix a compile error when #undef CONFIG_PROC_FS NFSv4.1: Convert open-coded array allocation calls to kmalloc_array() NFSv4.1: Fix a kfree() of uninitialised pointers in decode_cb_sequence_args	2015-02-21 14:02:59 -08:00
David S. Miller	ee92259849	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains updates for your net tree, they are: 1) Fix removal of destination in IPVS when the new mixed family support is used, from Alexey Andriyanov via Simon Horman. 2) Fix module refcount undeflow in nft_compat when reusing a match / target. 3) Fix iptables-restore when the recent match is used with a new hitcount that exceeds threshold, from Florian Westphal. 4) Fix stack corruption in xt_socket due to using stack storage to save the inner IPv6 header, from Eric Dumazet. I'll follow up soon with another batch with more fixes that are still cooking. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 17:36:20 -05:00
Dan Carpenter	278f7b4fff	caif: fix a signedness bug in cfpkt_iterate() The cfpkt_iterate() function can return -EPROTO on error, but the function is a u16 so the negative value gets truncated to a positive unsigned short. This causes a static checker warning. The only caller which might care is cffrml_receive(), when it's checking the frame checksum. I modified cffrml_receive() so that it never says -EPROTO is a valid checksum. Also this isn't ever going to be inlined so I removed the "inline". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 17:35:14 -05:00
Rami Rosen	65e9256c4e	ethtool: Add hw-switch-offload to netdev_features_strings. commit `aafb3e98b2` (netdev: introduce new NETIF_F_HW_SWITCH_OFFLOAD feature flag for switch device offloads) add a new feature without adding it to netdev_features_strings array; this patch fixes this. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 16:36:43 -05:00
Eric Dumazet	997d5c3f44	sock: sock_dequeue_err_skb() needs hard irq safety Non NAPI drivers can call skb_tstamp_tx() and then sock_queue_err_skb() from hard IRQ context. Therefore, sock_dequeue_err_skb() needs to block hard irq or corruptions or hangs can happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `364a9e9324` ("sock: deduplicate errqueue dequeue") Fixes: `cb820f8e4b` ("net: Provide a generic socket error queue delivery method for Tx time stamps.") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 15:52:21 -05:00
Pravin B Shelar	7b4577a9da	openvswitch: Fix net exit. Open vSwitch allows moving internal vport to different namespace while still connected to the bridge. But when namespace deleted OVS does not detach these vports, that results in dangling pointer to netdevice which causes kernel panic as follows. This issue is fixed by detaching all ovs ports from the deleted namespace at net-exit. BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 IP: [<ffffffffa0aadaa5>] ovs_vport_locate+0x35/0x80 [openvswitch] Oops: 0000 [#1] SMP Call Trace: [<ffffffffa0aa6391>] lookup_vport+0x21/0xd0 [openvswitch] [<ffffffffa0aa65f9>] ovs_vport_cmd_get+0x59/0xf0 [openvswitch] [<ffffffff8167e07c>] genl_family_rcv_msg+0x1bc/0x3e0 [<ffffffff8167e319>] genl_rcv_msg+0x79/0xc0 [<ffffffff8167d919>] netlink_rcv_skb+0xb9/0xe0 [<ffffffff8167deac>] genl_rcv+0x2c/0x40 [<ffffffff8167cffd>] netlink_unicast+0x12d/0x1c0 [<ffffffff8167d3da>] netlink_sendmsg+0x34a/0x6b0 [<ffffffff8162e140>] sock_sendmsg+0xa0/0xe0 [<ffffffff8162e5e8>] ___sys_sendmsg+0x408/0x420 [<ffffffff8162f541>] __sys_sendmsg+0x51/0x90 [<ffffffff8162f592>] SyS_sendmsg+0x12/0x20 [<ffffffff81764ee9>] system_call_fastpath+0x12/0x17 Reported-by: Assaf Muller <amuller@redhat.com> Fixes: 46df7b81454("openvswitch: Add support for network namespaces.") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 15:32:08 -05:00
Ignacy Gawędzki	34eea79e26	ematch: Fix auto-loading of ematch modules. In tcf_em_validate(), after calling request_module() to load the kind-specific module, set em->ops to NULL before returning -EAGAIN, so that module_put() is not called again by tcf_em_tree_destroy(). Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 15:30:56 -05:00
Eric Dumazet	959d10f6bb	igmp: add __ip_mc_{join\|leave}_group() There is a need to perform igmp join/leave operations while RTNL is held. Make ip_mc_{join\|leave}_group() wrappers around __ip_mc_{join\|leave}_group() to avoid the proliferation of work queues. For example, vxlan_igmp_join() could possibly be removed. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 15:24:04 -05:00
Alexander Drozdov	fba04a9e0c	ipv4: ip_check_defrag should correctly check return value of skb_copy_bits skb_copy_bits() returns zero on success and negative value on error, so it is needed to invert the condition in ip_check_defrag(). Fixes: `1bf3751ec9` ("ipv4: ip_check_defrag must not modify skb before unsharing") Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 15:22:38 -05:00
Joe Perches	6d91147d18	batman-adv: Remove uses of return value of seq_printf This function is soon going to return void so remove the return value use. Convert the return value to test seq_has_overflowed() instead. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 15:18:26 -05:00
stephen hemminger	db2855ae24	tcp: silence registration message This message isn't really needed it justs waits time/space. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 15:04:03 -05:00
Johan Hedberg	03f310efd4	Bluetooth: Remove unnecessary queue_monitor_skb() function Now that there's the general purpose hci_send_to_channel() API it will do the exact same thing as queue_monitor_skb() when passed the monitor HCI channel. This patch removes queue_monitor_skb() and replaces any users of it with calls to hci_send_to_channel(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-20 18:20:17 +01:00
Johan Hedberg	7129069e84	Bluetooth: Rename hci_send_to_control to hci_send_to_channel The hci_send_to_control() can be made more general purpose with a small change of passing the desired HCI channel as a parameter to it. This allows using it for the monitor channel as well as e.g. 6lowpan in the future. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-20 18:20:17 +01:00
Johan Hedberg	39e3e74423	Bluetooth: Use hci_copy_identity_addr() helper for SMP chan creation The only reason the SMP code is essentially duplicating the hci_copy_identity_addr() function is that the helper returns the address type in the HCI format rather than the three-value format expected by l2cap_chan. This patch converts the SMP code to use the helper and then do a simple conversion from one address type to another. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-20 18:15:41 +01:00
Linus Torvalds	4533f6e27a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph changes from Sage Weil: "On the RBD side, there is a conversion to blk-mq from Christoph, several long-standing bug fixes from Ilya, and some cleanup from Rickard Strandqvist. On the CephFS side there is a long list of fixes from Zheng, including improved session handling, a few IO path fixes, some dcache management correctness fixes, and several blocking while !TASK_RUNNING fixes. The core code gets a few cleanups and Chaitanya has added support for TCP_NODELAY (which has been used on the server side for ages but we somehow missed on the kernel client). There is also an update to MAINTAINERS to fix up some email addresses and reflect that Ilya and Zheng are doing most of the maintenance for RBD and CephFS these days. Do not be surprised to see a pull request come from one of them in the future if I am unavailable for some reason" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits) MAINTAINERS: update Ceph and RBD maintainers libceph: kfree() in put_osd() shouldn't depend on authorizer libceph: fix double __remove_osd() problem rbd: convert to blk-mq ceph: return error for traceless reply race ceph: fix dentry leaks ceph: re-send requests when MDS enters reconnecting stage ceph: show nocephx_require_signatures and notcp_nodelay options libceph: tcp_nodelay support rbd: do not treat standalone as flatten ceph: fix atomic_open snapdir ceph: properly mark empty directory as complete client: include kernel version in client metadata ceph: provide seperate {inode,file}_operations for snapdir ceph: fix request time stamp encoding ceph: fix reading inline data when i_size > PAGE_SIZE ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_close_sessions) ceph: avoid block operation when !TASK_RUNNING (ceph_get_caps) ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_sync) rbd: fix error paths in rbd_dev_refresh() ...	2015-02-19 14:14:42 -08:00
Ignacy Gawędzki	1c4cff0cf5	gen_stats.c: Duplicate xstats buffer for later use The gnet_stats_copy_app() function gets called, more often than not, with its second argument a pointer to an automatic variable in the caller's stack. Therefore, to avoid copying garbage afterwards when calling gnet_stats_finish_copy(), this data is better copied to a dynamically allocated memory that gets freed after use. [xiyou.wangcong@gmail.com: remove a useless kfree()] Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-19 15:45:53 -05:00
Linus Torvalds	b11a278397	Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kconfig updates from Michal Marek: "Yann E Morin was supposed to take over kconfig maintainership, but this hasn't happened. So I'm sending a few kconfig patches that I collected: - Fix for missing va_end in kconfig - merge_config.sh displays used if given too few arguments - s/boolean/bool/ in Kconfig files for consistency, with the plan to only support bool in the future" * 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kconfig: use va_end to match corresponding va_start merge_config.sh: Display usage if given too few arguments kconfig: use bool instead of boolean for type definition attributes	2015-02-19 10:36:45 -08:00
Johan Hedberg	a2cb01de1c	Bluetooth: Fix checking for pending Set SSP in Set HS handler Changing the HS setting requires that SSP is enabled, however so far the code only checked for the SSP flag but not a potentially ongoing Set SSP operation. This patch adds a check for a pending Set SSP command in the Set HS handler, and returns a 'busy' error if one is found. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-19 17:05:09 +01:00
Johan Hedberg	94d52dad9e	Bluetooth: Remove bogus check for pending mgmt Set HS command The command handler for Set HS doesn't use mgmt_pending_add() so we can never have a pending Set HS command that mgmt_pending_find() would return. This patch removes an unnecessary lookup for it in the set_ssp() handler function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-19 17:05:09 +01:00
Ilya Dryomov	b28ec2f37e	libceph: kfree() in put_osd() shouldn't depend on authorizer `a255651d4c` ("ceph: ensure auth ops are defined before use") made kfree() in put_osd() conditional on the authorizer. A mechanical mistake most likely - fix it. Cc: Alex Elder <elder@linaro.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>	2015-02-19 14:27:51 +03:00
Ilya Dryomov	7eb71e0351	libceph: fix double __remove_osd() problem It turns out it's possible to get __remove_osd() called twice on the same OSD. That doesn't sit well with rb_erase() - depending on the shape of the tree we can get a NULL dereference, a soft lockup or a random crash at some point in the future as we end up touching freed memory. One scenario that I was able to reproduce is as follows: <osd3 is idle, on the osd lru list> <con reset - osd3> con_fault_finish() osd_reset() <osdmap - osd3 down> ceph_osdc_handle_map() <takes map_sem> kick_requests() <takes request_mutex> reset_changed_osds() __reset_osd() __remove_osd() <releases request_mutex> <releases map_sem> <takes map_sem> <takes request_mutex> __kick_osd_requests() __reset_osd() __remove_osd() <-- !!! A case can be made that osd refcounting is imperfect and reworking it would be a proper resolution, but for now Sage and I decided to fix this by adding a safe guard around __remove_osd(). Fixes: http://tracker.ceph.com/issues/8087 Cc: Sage Weil <sage@redhat.com> Cc: stable@vger.kernel.org # 3.9+: `7c6e6fc53e`: libceph: assert both regular and lingering lists in __remove_osd() Cc: stable@vger.kernel.org # 3.9+: `cc9f1f518c`: libceph: change from BUG to WARN for __remove_osd() asserts Cc: stable@vger.kernel.org # 3.9+ Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>	2015-02-19 14:27:50 +03:00
Chaitanya Huilgol	ba988f87f5	libceph: tcp_nodelay support TCP_NODELAY socket option set on connection sockets, disables Nagle’s algorithm and improves latency characteristics. tcp_nodelay(default)/notcp_nodelay option flags provided to enable/disable setting the socket option. Signed-off-by: Chaitanya Huilgol <chaitanya.huilgol@sandisk.com> [idryomov@redhat.com: NO_TCP_NODELAY -> TCP_NODELAY, minor adjustments] Signed-off-by: Ilya Dryomov <idryomov@redhat.com>	2015-02-19 13:31:40 +03:00
Ilya Dryomov	f646912d10	libceph: use mon_client.c/put_generic_request() more Signed-off-by: Ilya Dryomov <idryomov@redhat.com>	2015-02-19 13:31:37 +03:00
Ilya Dryomov	7a6fdeb2b1	libceph: nuke pool op infrastructure On Mon, Dec 22, 2014 at 5:35 PM, Sage Weil <sage@newdream.net> wrote: > On Mon, 22 Dec 2014, Ilya Dryomov wrote: >> Actually, pool op stuff has been unused for over two years - looks like >> it was added for rbd create_snap and that got ripped out in 2012. It's >> unlikely we'd ever need to manage pools or snaps from the kernel client >> so I think it makes sense to nuke it. Sage? > > Yep! Signed-off-by: Ilya Dryomov <idryomov@redhat.com>	2015-02-19 13:31:37 +03:00
Johan Hedberg	3a6d576be9	Bluetooth: Convert disconn_cfm to be triggered through hci_cb This patch moves all the disconn_cfm callbacks to be based on the hci_cb list. This means making l2cap_disconn_cfm private to l2cap_core.c and sco_conn_cb private to sco.c respectively. Since the hci_conn type filtering isn't done any more on the wrapper level the callbacks themselves need to check that they were passed a relevant type of connection. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-19 08:44:29 +01:00
Johan Hedberg	539c496d88	Bluetooth: Convert connect_cfm to be triggered through hci_cb This patch moves all the connect_cfm callbacks to be based on the hci_cb list. This means making l2cap_connect_cfm private to l2cap_core.c and sco_connect_cb private to sco.c respectively. Since the hci_conn type filtering isn't done any more on the wrapper level the callbacks themselves need to check that they were passed a relevant type of connection. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-19 08:44:29 +01:00
Johan Hedberg	354fe804ed	Bluetooth: Convert L2CAP security callback to use hci_cb There's no reason to have the custom hci_proto_auth/encrypt_cfm helpers when the hci_cb list works equally well. This patch adds L2CAP to the hci_cb list and makes l2cap_security_cfm a private function of l2cap_core.c. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-19 08:44:28 +01:00
Johan Hedberg	fba7ecf09b	Bluetooth: Convert hci_cb_list_lock to a mutex We'll soon need to be able to sleep inside the loops that iterate the hci_cb list, so neither a spinlock, rwlock or rcu are usable. This patch changes the lock to a mutex which permits sleeping while holding the lock. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-19 08:44:28 +01:00
Johan Hedberg	00629e0fd5	Bluetooth: Add new hci_cb entries to the tail rather than the head When processing hci_cb entries we want first registered callbacks to be called first and later ones later. This is because eventually the L2CAP callbacks that are part of the core will use this list and get registered first. To keep the same order of calling L2CAP callbacks before e.g. RFCOMM the order of elements needs to be this way. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-19 08:44:28 +01:00
Linus Torvalds	53861af9a1	OK, this has the big virtio 1.0 implementation, as specified by OASIS. On top of tht is the major rework of lguest, to use PCI and virtio 1.0, to double-check the implementation. Then comes the inevitable fixes and cleanups from that work. Thanks, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU5B9cAAoJENkgDmzRrbjxPacP/jajliXX353JJ/g/hkZ6oDN5 o7FhELBKiUMr7enVZYwj2BBYk5OM36nB9pQkiqHMSbjJGoS5IK70enxb4YRxSHBn YCLblZMNqutGS0kclZ9DDysztjAhxH7CvLM6pMZ7eHP0f3+FM/QhbxHfbG9DTBUH 2U/nybvd3M/+YBe7ptwQdrH8aOCAD6RTIsXellfm99dNMK6K/5lqnWQ98WSXmNXq vyvdaAQsqqUkmxtajjcBumaCH4/SehOJJjUqojCMsR3aBkgOBWDZJURMek+KA5Dt X996fBsTAlvTtCUKRrmLTb2ScDH7fu+jwbWRqMYDk8zpEr3XqiLTTPV4/TiHGmi7 Wiw3g1wIY1YbETlZyongB5MIoVyUfmDAd+bT8nBsj3KIITD84gOUQFDMl6d63c0I z6A9Pu/UzpJGsXZT3WoFLi6TO67QyhOseqZnhS4wBgLabjxffNM7yov9RVKUVH/n JHunnpUk2iTtSgscBarOBz5867dstuurnaUIspZthVBo6y6N0z+GrU+agJ8Y4DXx mvwzeYLhQH2208PjxPFiah/kA/gHNm1m678TbpS+CUsgmpQiJ4gTwtazDSi4TwZY Hs9T9GulkzpZIzEyKL3qG2TsfyDhW5Avn+GvKInAT9+Fkig4BnP3DUONBxcwGZ78 eI3FDUWsE36NqE5ECWmz =ivCe -----END PGP SIGNATURE----- Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio updates from Rusty Russell: "OK, this has the big virtio 1.0 implementation, as specified by OASIS. On top of tht is the major rework of lguest, to use PCI and virtio 1.0, to double-check the implementation. Then comes the inevitable fixes and cleanups from that work" * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (80 commits) virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice. virtio_net: unconditionally define struct virtio_net_hdr_v1. tools/lguest: don't use legacy definitions for net device in example launcher. virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined. tools/lguest: use common error macros in the example launcher. tools/lguest: give virtqueues names for better error messages tools/lguest: more documentation and checking of virtio 1.0 compliance. lguest: don't look in console features to find emerg_wr. tools/lguest: don't start devices until DRIVER_OK status set. tools/lguest: handle indirect partway through chain. tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: rename virtio_pci_cfg_cap field to match spec. tools/lguest: fix features_accepted logic in example launcher. tools/lguest: handle device reset correctly in example launcher. virtual: Documentation: simplify and generalize paravirt_ops.txt lguest: remove NOTIFY call and eventfd facility. lguest: remove NOTIFY facility from demonstration launcher. lguest: use the PCI console device's emerg_wr for early boot messages. lguest: always put console in PCI slot #1. ...	2015-02-18 09:24:01 -08:00
Trond Myklebust	65d2918e71	Merge branch 'cleanups' Merge cleanups requested by Linus. * cleanups: (3 commits) pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit nfs: Can call nfs_clear_page_commit() instead nfs: Provide and use helper functions for marking a page as unstable	2015-02-18 07:28:37 -08:00
Linus Torvalds	f5af19d10d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking updates from David Miller: 1) Missing netlink attribute validation in nft_lookup, from Patrick McHardy. 2) Restrict ipv6 partial checksum handling to UDP, since that's the only case it works for. From Vlad Yasevich. 3) Clear out silly device table sentinal macros used by SSB and BCMA drivers. From Joe Perches. 4) Make sure the remote checksum code never creates a situation where the remote checksum is applied yet the tunneling metadata describing the remote checksum transformation is still present. Otherwise an external entity might see this and apply the checksum again. From Tom Herbert. 5) Use msecs_to_jiffies() where applicable, from Nicholas Mc Guire. 6) Don't explicitly initialize timer struct fields, use setup_timer() and mod_timer() instead. From Vaishali Thakkar. 7) Don't invoke tg3_halt() without the tp->lock held, from Jun'ichi Nomura. 8) Missing __percpu annotation in ipvlan driver, from Eric Dumazet. 9) Don't potentially perform skb_get() on shared skbs, also from Eric Dumazet. 10) Fix COW'ing of metrics for non-DST_HOST routes in ipv6, from Martin KaFai Lau. 11) Fix merge resolution error between the iov_iter changes in vhost and some bug fixes that occurred at the same time. From Jason Wang. 12) If rtnl_configure_link() fails we have to perform a call to ->dellink() before unregistering the device. From WANG Cong. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits) net: dsa: Set valid phy interface type rtnetlink: call ->dellink on failure when ->newlink exists com20020-pci: add support for eae single card vhost_net: fix wrong iter offset when setting number of buffers net: spelling fixes net/core: Fix warning while make xmldocs caused by dev.c net: phy: micrel: disable NAND-tree for KSZ8021, KSZ8031, KSZ8051, KSZ8081 ipv6: fix ipv6_cow_metrics for non DST_HOST case openvswitch: Fix key serialization. r8152: restore hw settings hso: fix rx parsing logic when skb allocation fails tcp: make sure skb is not shared before using skb_get() bridge: netfilter: Move sysctl-specific error code inside #ifdef ipv6: fix possible deadlock in ip6_fl_purge / ip6_fl_gc ipvlan: add a missing __percpu pcpu_stats tg3: Hold tp->lock before calling tg3_halt() from tg3_init_one() bgmac: fix device initialization on Northstar SoCs (condition typo) qlcnic: Delete existing multicast MAC list before adding new net/mlx5_core: Fix configuration of log_uar_page_sz sunvnet: don't change gso data on clones ...	2015-02-17 17:41:19 -08:00
David Ramos	a1d1e9be5a	svcrpc: fix memory leak in gssp_accept_sec_context_upcall Our UC-KLEE tool found a kernel memory leak of 512 bytes (on x86_64) for each call to gssp_accept_sec_context_upcall() (net/sunrpc/auth_gss/gss_rpc_upcall.c). Since it appears that this call can be triggered by remote connections (at least, from a cursory a glance at the call chain), it may be exploitable to cause kernel memory exhaustion. We found the bug in kernel 3.16.3, but it appears to date back to commit `9dfd87da1a` (2013-08-20). The gssp_accept_sec_context_upcall() function performs a pair of calls to gssp_alloc_receive_pages() and gssp_free_receive_pages(). The first allocates memory for arg->pages. The second then frees the pages pointed to by the arg->pages array, but not the array itself. Reported-by: David A. Ramos <daramos@stanford.edu> Fixes: `9dfd87da1a` ("rpc: fix huge kmalloc's in gss-proxy”) Signed-off-by: David A. Ramos <daramos@stanford.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-02-17 18:09:02 -05:00
Guenter Roeck	19334920ea	net: dsa: Set valid phy interface type If the phy interface mode is not found in devicetree, or if devicetree is not configured, of_get_phy_mode returns -ENODEV. The current code sets the phy interface mode to the return value from of_get_phy_mode without checking if it is valid. This invalid phy interface mode is passed as parameter to of_phy_connect or to phy_connect_direct. This sets the phy interface mode to the invalid value, which in turn causes problems for any code using phydev->interface. Fixes: `b31f65fb43` ("net: dsa: slave: Fix autoneg for phys on switch MDIO bus") Fixes: `0d8bcdd383` ("net: dsa: allow for more complex PHY setups") Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-17 10:37:39 -08:00
Johan Hedberg	0af801b9bf	Bluetooth: Fix AMP init for certain AMP controllers Some AMP controllers do not support the Read Local Features HCI commands (even though according to the spec they should). Luckily they at least correctly omit this from the supported commands bitmask, so we can work around the issue by creating a second AMP init phase and issuing the HCI command conditionally there. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-17 18:52:39 +01:00
Eric Dumazet	78296c97ca	netfilter: xt_socket: fix a stack corruption bug As soon as extract_icmp6_fields() returns, its local storage (automatic variables) is deallocated and can be overwritten. Lets add an additional parameter to make sure storage is valid long enough. While we are at it, adds some const qualifiers. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `b64c9256a9` ("tproxy: added IPv6 support to the socket match") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-02-16 17:00:48 +01:00
Florian Westphal	cef9ed86ed	netfilter: xt_recent: don't reject rule if new hitcount exceeds table max given: -A INPUT -m recent --update --seconds 30 --hitcount 4 and iptables-save > foo then iptables-restore < foo will fail with: kernel: xt_recent: hitcount (4) is larger than packets to be remembered (4) for table DEFAULT Even when the check is fixed, the restore won't work if the hitcount is increased to e.g. 6, since by the time checkentry runs it will find the 'old' incarnation of the table. We can avoid this by increasing the maximum threshold silently; we only have to rm all the current entries of the table (these entries would not have enough room to handle the increased hitcount). This even makes (not-very-useful) -A INPUT -m recent --update --seconds 30 --hitcount 4 -A INPUT -m recent --update --seconds 30 --hitcount 42 work. Fixes: `abc86d0f99` (netfilter: xt_recent: relax ip_pkt_list_tot restrictions) Tracked-down-by: Chris Vine <chris@cvine.freeserve.co.uk> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-02-16 17:00:47 +01:00
Pablo Neira Ayuso	520aa7414b	netfilter: nft_compat: fix module refcount underflow Feb 12 18:20:42 nfdev kernel: ------------[ cut here ]------------ Feb 12 18:20:42 nfdev kernel: WARNING: CPU: 4 PID: 4359 at kernel/module.c:963 module_put+0x9b/0xba() Feb 12 18:20:42 nfdev kernel: CPU: 4 PID: 4359 Comm: ebtables-compat Tainted: G W 3.19.0-rc6+ #43 [...] Feb 12 18:20:42 nfdev kernel: Call Trace: Feb 12 18:20:42 nfdev kernel: [<ffffffff815fd911>] dump_stack+0x4c/0x65 Feb 12 18:20:42 nfdev kernel: [<ffffffff8103e6f7>] warn_slowpath_common+0x9c/0xb6 Feb 12 18:20:42 nfdev kernel: [<ffffffff8109919f>] ? module_put+0x9b/0xba Feb 12 18:20:42 nfdev kernel: [<ffffffff8103e726>] warn_slowpath_null+0x15/0x17 Feb 12 18:20:42 nfdev kernel: [<ffffffff8109919f>] module_put+0x9b/0xba Feb 12 18:20:42 nfdev kernel: [<ffffffff813ecf7c>] nft_match_destroy+0x45/0x4c Feb 12 18:20:42 nfdev kernel: [<ffffffff813e683f>] nf_tables_rule_destroy+0x28/0x70 Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>	2015-02-16 17:00:36 +01:00
Johan Hedberg	315917e0a6	Bluetooth: Fix accepting early data on fixed channels On BR/EDR the L2CAP channel instances for fixed channels have so far been marked as ready only once the L2CAP information req/rsp procedure is complete and we have the fixed channel mask. This could however lead to data being dropped if we receive it on the channel before knowing the remote mask. Since it is valid for a remote to send data this early, simply assume that the channel is supported when we receive data on it. So far this hasn't been noticed much because of limited use of fixed channels on BR/EDR, but e.g. with SMP over BR/EDR this is already now visible with automated tests failing randomly. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-16 16:49:36 +01:00
Marcel Holtmann	035a07d5df	Bluetooth: Provide option to enable/disable debugfs information The Bluetooth controllers can export extensive information about internal states via debugfs. This patch provides an option to choose if these information are provided or not. For backwards compatibility with existing kernel configuration, this option defaults to yes. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-15 18:54:13 +02:00
WANG Cong	7afb8886a0	rtnetlink: call ->dellink on failure when ->newlink exists Ignacy reported that when eth0 is down and add a vlan device on top of it like: ip link add link eth0 name eth0.1 up type vlan id 1 We will get a refcount leak: unregister_netdevice: waiting for eth0.1 to become free. Usage count = 2 The problem is when rtnl_configure_link() fails in rtnl_newlink(), we simply call unregister_device(), but for stacked device like vlan, we almost do nothing when we unregister the upper device, more work is done when we unregister the lower device, so call its ->dellink(). Reported-by: Ignacy Gawedzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-15 08:30:10 -08:00
Marcel Holtmann	87e2a020ca	Bluetooth: Make __next_ident function static. The __next_ident function is a local function and so do not export it and make it static. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-15 10:14:54 +02:00
Marcel Holtmann	bc333cc465	Bluetooth: Make a2mp_send function static The a2mp_send function is a local function and so do not export it and make it static. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-15 10:14:52 +02:00
Marcel Holtmann	469cd4c5a6	Bluetooth: Make amp_mgr_lookup_by_state function static The amp_mgr_lookup_by_state function does not need to be exported. So just move it to a different location and make it static. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-15 10:14:51 +02:00
Marcel Holtmann	59d4d0863e	Bluetooth: Make amp_mgr_list and amp_mgr_list_lock static There is no reason to have amp_mgr_list and amp_mgr_list_lock exported from a2mp.c and thus make both of them static. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-15 10:14:49 +02:00
Marcel Holtmann	055540a176	Bluetooth: Move A2MP_FEAT_EXT declaration into A2MP source The A2MP_FEAT_EXT declaration has a single user in a2mp.c and thus just move it there. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-15 10:14:48 +02:00
Stephen Hemminger	ca9f1fd263	net: spelling fixes Spelling errors caught by codespell. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-14 20:36:08 -08:00
Masanari Iida	4a26e453d9	net/core: Fix warning while make xmldocs caused by dev.c This patch fix following warning wile make xmldocs. Warning(.//net/core/dev.c:5345): No description found for parameter 'bonding_info' Warning(.//net/core/dev.c:5345): Excess function parameter 'netdev_bonding_info' description in 'netdev_bonding_info_change' This warning starts to appear after following patch was added into Linus's tree during merger period. commit `61bd3857ff` net/core: Add event for a change in slave state Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-14 20:34:49 -08:00
Martin KaFai Lau	3b4711757d	ipv6: fix ipv6_cow_metrics for non DST_HOST case ipv6_cow_metrics() currently assumes only DST_HOST routes require dynamic metrics allocation from inetpeer. The assumption breaks when ndisc discovered router with RTAX_MTU and RTAX_HOPLIMIT metric. Refer to ndisc_router_discovery() in ndisc.c and note that dst_metric_set() is called after the route is created. This patch creates the metrics array (by calling dst_cow_metrics_generic) in ipv6_cow_metrics(). Test: radvd.conf: interface qemubr0 { AdvLinkMTU 1300; AdvCurHopLimit 30; prefix fd00:face:face:face::/64 { AdvOnLink on; AdvAutonomous on; AdvRouterAddr off; }; }; Before: [root@qemu1 ~]# ip -6 r show \| egrep -v unreachable fd00:face:face:face::/64 dev eth0 proto kernel metric 256 expires 27sec fe80::/64 dev eth0 proto kernel metric 256 default via fe80::74df:d0ff:fe23:8ef2 dev eth0 proto ra metric 1024 expires 27sec After: [root@qemu1 ~]# ip -6 r show \| egrep -v unreachable fd00:face:face:face::/64 dev eth0 proto kernel metric 256 expires 27sec mtu 1300 fe80::/64 dev eth0 proto kernel metric 256 mtu 1300 default via fe80::74df:d0ff:fe23:8ef2 dev eth0 proto ra metric 1024 expires 27sec mtu 1300 hoplimit 30 Fixes: `8e2ec63917` (ipv6: don't use inetpeer to store metrics for routes.) Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-14 20:26:16 -08:00
Pravin B Shelar	26ad0b8358	openvswitch: Fix key serialization. Fix typo where mask is used rather than key. Fixes: 74ed7ab9264("openvswitch: Add support for unique flow IDs.") Reported-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-14 20:20:40 -08:00
Tedd Ho-Jeong An	a44fecbd52	Bluetooth: Add shutdown callback before closing the device This callback allows a vendor to send the vendor specific commands before cloing the hci interface. Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-15 00:37:52 +01:00
Alexander Aring	ff0fcc2987	6lowpan: nhc: add other known rfc6282 compressions This patch adds other known rfc6282 compression formats to the nhc framework. These compression formats are known but not implemented yet. For now this is useful to printout a warning which compression format isn't supported. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Martin Townsend <mtownsend1973@gmail.com> Reviewed-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-14 23:08:44 +01:00
Alexander Aring	cc6ed26847	6lowpan: add udp compression via nhc layer This patch move UDP header compression and uncompression into the generic 6LoWPAN nhc header compression layer. Moreover this patch activates the nhc layer compression in iphc compression and uncompression functions. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Martin Townsend <mtownsend1973@gmail.com> Reviewed-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-14 23:08:44 +01:00
Alexander Aring	92aa7c65d2	6lowpan: add generic nhc layer interface This patch adds a generic next header compression layer interface. There exists various methods to do a header compression after 6LoWPAN header to save payload. This introduce a generic nhc header which allow a simple adding of a new header compression format instead of a static implementation inside the 6LoWPAN header compression and uncompression function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Martin Townsend <mtownsend1973@gmail.com> Reviewed-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-14 23:08:44 +01:00
Tejun Heo	f09068276c	net: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:38 -08:00
Lukasz Rymanowski	faa810303d	Bluetooth: Enhance error codes pair device command If user space is trying to pair on not enabled transport MGMT_STATUS_REJECT will be returned. If user space is trying to pair on transport which controller does not support, MGMT_STATUS_NOT_SUPPORTED will be returned. Having separate error code for that scenario might be useful for debugging at least. Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-14 05:19:59 +01:00
Lukasz Rymanowski	c411110e1f	Bluetooth: Improve error handling in connect acl With this patch -EOPNOTSUPP will be returned by hci_connect_acl for LE only controllers. If it is dual device with disabled BREDR -ECONNREFUSED will be returned Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-14 05:19:59 +01:00
Lukasz Rymanowski	152d386e11	Bluetooth: Do not allow LE connection if LE is not enabled Kernel gives possibility to enable/disable LE host support. There is flag HCI_LE_ENABLED which is set when this support is enabled and some parts of the code checks this flag e.g. SMP However it is still possible to make LE connection if LE Host support is disabled, what might be confused for remote device. This patch makes sure that kernel will not send HCI LE Create Connection if LE HOST support is not enabled. Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-14 05:19:59 +01:00
Nicolas Dichtel	f9d1ce8f81	ieee802154: fix netns settings 6LoWPAN currently doesn't supports x-netns and works only in init_net. With this patch, we ensure that: - the wpan interface cannot be moved to another netns; - the 6lowpan interface cannot be moved to another netns; - the wpan interface is in the same netns than the 6lowpan interface; - the 6lowpan interface is in init_net. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-14 05:19:58 +01:00
Chuck Lever	813b00d63f	SUNRPC: Always manipulate rpc_rqst::rq_bc_pa_list under xprt->bc_pa_lock Other code that accesses rq_bc_pa_list holds xprt->bc_pa_lock. xprt_complete_bc_request() should do the same. Fixes: `2ea24497a1` ("SUNRPC: RPC callbacks may be split . . .") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-13 17:41:10 -05:00
Eric Dumazet	ba34e6d9d3	tcp: make sure skb is not shared before using skb_get() IPv6 can keep a copy of SYN message using skb_get() in tcp_v6_conn_request() so that caller wont free the skb when calling kfree_skb() later. Therefore TCP fast open has to clone the skb it is queuing in child->sk_receive_queue, as all skbs consumed from receive_queue are freed using __kfree_skb() (ie assuming skb->users == 1) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Fixes: `5b7ed0892f` ("tcp: move fastopen functions to tcp_fastopen.c") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-13 07:11:40 -08:00
Vladimir Davydov	f48b80a5e2	memcg: cleanup static keys decrement Move memcg_socket_limit_enabled decrement to tcp_destroy_cgroup (called from memcg_destroy_kmem -> mem_cgroup_sockets_destroy) and zap a bunch of wrapper functions. Although this patch moves static keys decrement from __mem_cgroup_free to mem_cgroup_css_free, it does not introduce any functional changes, because the keys are incremented on setting the limit (tcp or kmem), which can only happen after successful mem_cgroup_css_online. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Glauber Costa <glommer@parallels.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:10 -08:00
Linus Torvalds	61845143fe	Merge branch 'for-3.20' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "The main change is the pNFS block server support from Christoph, which allows an NFS client connected to shared disk to do block IO to the shared disk in place of NFS reads and writes. This also requires xfs patches, which should arrive soon through the xfs tree, barring unexpected problems. Support for other filesystems is also possible if there's interest. Thanks also to Chuck Lever for continuing work to get NFS/RDMA into shape" * 'for-3.20' of git://linux-nfs.org/~bfields/linux: (32 commits) nfsd: default NFSv4.2 to on nfsd: pNFS block layout driver exportfs: add methods for block layout exports nfsd: add trace events nfsd: update documentation for pNFS support nfsd: implement pNFS layout recalls nfsd: implement pNFS operations nfsd: make find_any_file available outside nfs4state.c nfsd: make find/get/put file available outside nfs4state.c nfsd: make lookup/alloc/unhash_stid available outside nfs4state.c nfsd: add fh_fsid_match helper nfsd: move nfsd_fh_match to nfsfh.h fs: add FL_LAYOUT lease type fs: track fl_owner for leases nfs: add LAYOUT_TYPE_MAX enum value nfsd: factor out a helper to decode nfstime4 values sunrpc/lockd: fix references to the BKL nfsd: fix year-2038 nfs4 state problem svcrdma: Handle additional inline content svcrdma: Move read list XDR round-up logic ...	2015-02-12 10:39:41 -08:00
Geert Uytterhoeven	9672723973	bridge: netfilter: Move sysctl-specific error code inside #ifdef If CONFIG_SYSCTL=n: net/bridge/br_netfilter.c: In function ‘br_netfilter_init’: net/bridge/br_netfilter.c:996: warning: label ‘err1’ defined but not used Move the label and the code after it inside the existing #ifdef to get rid of the warning. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-12 08:44:46 -08:00
Jan Stancek	4762fb9804	ipv6: fix possible deadlock in ip6_fl_purge / ip6_fl_gc Use spin_lock_bh in ip6_fl_purge() to prevent following potentially deadlock scenario between ip6_fl_purge() and ip6_fl_gc() timer. ================================= [ INFO: inconsistent lock state ] 3.19.0 #1 Not tainted --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/5/0 [HC0[0]:SC1[1]:HE1:SE0] takes: (ip6_fl_lock){+.?...}, at: [<ffffffff8171155d>] ip6_fl_gc+0x2d/0x180 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ee9a0>] __lock_acquire+0x4a0/0x10b0 [<ffffffff810efd54>] lock_acquire+0xc4/0x2b0 [<ffffffff81751d2d>] _raw_spin_lock+0x3d/0x80 [<ffffffff81711798>] ip6_flowlabel_net_exit+0x28/0x110 [<ffffffff815f9759>] ops_exit_list.isra.1+0x39/0x60 [<ffffffff815fa320>] cleanup_net+0x100/0x1e0 [<ffffffff810ad80a>] process_one_work+0x20a/0x830 [<ffffffff810adf4b>] worker_thread+0x11b/0x460 [<ffffffff810b42f4>] kthread+0x104/0x120 [<ffffffff81752bfc>] ret_from_fork+0x7c/0xb0 irq event stamp: 84640 hardirqs last enabled at (84640): [<ffffffff81752080>] _raw_spin_unlock_irq+0x30/0x50 hardirqs last disabled at (84639): [<ffffffff81751eff>] _raw_spin_lock_irq+0x1f/0x80 softirqs last enabled at (84628): [<ffffffff81091ad1>] _local_bh_enable+0x21/0x50 softirqs last disabled at (84629): [<ffffffff81093b7d>] irq_exit+0x12d/0x150 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(ip6_fl_lock); <Interrupt> lock(ip6_fl_lock); * DEADLOCK * Signed-off-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-12 07:13:03 -08:00
huaibin Wang	ac37e2515c	xfrm: release dst_orig in case of error in xfrm_lookup() dst_orig should be released on error. Function like __xfrm_route_forward() expects that behavior. Since a recent commit, xfrm_lookup() may also be called by xfrm_lookup_route(), which expects the opposite. Let's introduce a new flag (XFRM_LOOKUP_KEEP_DST_REF) to tell what should be done in case of error. Fixes: f92ee61982d("xfrm: Generate blackhole routes only from route lookup functions") Signed-off-by: huaibin Wang <huaibin.wang@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2015-02-12 07:10:56 +01:00
Linus Torvalds	8cc748aa76	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer updates from James Morris: "Highlights: - Smack adds secmark support for Netfilter - /proc/keys is now mandatory if CONFIG_KEYS=y - TPM gets its own device class - Added TPM 2.0 support - Smack file hook rework (all Smack users should review this!)" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (64 commits) cipso: don't use IPCB() to locate the CIPSO IP option SELinux: fix error code in policydb_init() selinux: add security in-core xattr support for pstore and debugfs selinux: quiet the filesystem labeling behavior message selinux: Remove unused function avc_sidcmp() ima: /proc/keys is now mandatory Smack: Repair netfilter dependency X.509: silence asn1 compiler debug output X.509: shut up about included cert for silent build KEYS: Make /proc/keys unconditional if CONFIG_KEYS=y MAINTAINERS: email update tpm/tpm_tis: Add missing ifdef CONFIG_ACPI for pnp_acpi_device smack: fix possible use after frees in task_security() callers smack: Add missing logging in bidirectional UDS connect check Smack: secmark support for netfilter Smack: Rework file hooks tpm: fix format string error in tpm-chip.c char/tpm/tpm_crb: fix build error smack: Fix a bidirectional UDS connect check typo smack: introduce a special case for tmpfs in smack_d_instantiate() ...	2015-02-11 20:25:11 -08:00
Linus Torvalds	59d53737a8	Merge branch 'akpm' (patches from Andrew) Merge second set of updates from Andrew Morton: "More of MM" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits) mm/nommu.c: fix arithmetic overflow in __vm_enough_memory() mm/mmap.c: fix arithmetic overflow in __vm_enough_memory() vmstat: Reduce time interval to stat update on idle cpu mm/page_owner.c: remove unnecessary stack_trace field Documentation/filesystems/proc.txt: describe /proc/<pid>/map_files mm: incorporate read-only pages into transparent huge pages vmstat: do not use deferrable delayed work for vmstat_update mm: more aggressive page stealing for UNMOVABLE allocations mm: always steal split buddies in fallback allocations mm: when stealing freepages, also take pages created by splitting buddy page mincore: apply page table walker on do_mincore() mm: /proc/pid/clear_refs: avoid split_huge_page() mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP) mempolicy: apply page table walker on queue_pages_range() arch/powerpc/mm/subpage-prot.c: use walk->vma and walk_page_vma() memcg: cleanup preparation for page table walk numa_maps: remove numa_maps->vma numa_maps: fix typo in gather_hugetbl_stats pagemap: use walk->vma instead of calling find_vma() clear_refs: remove clear_refs_private->vma and introduce clear_refs_test_walk() ...	2015-02-11 18:23:28 -08:00
Linus Torvalds	6f83e5bd3e	NFS client updates for Linux 3.20 Highlights incluse: Features: - Removing the forced serialisation of open()/close() calls in NFSv4.x (x>0) makes for a significant performance improvement in metadata intensive workloads. - Full support for the pNFS "flexible files" layout type - Further RPC/RDMA client improvements from Chuck Bugfixes: - Stable fix: NFSv4.1 backchannel calls blocking operations with !TASK_RUNNING - Stable fix: pnfs_generic_pg_init_read/write can be called with lseg == NULL - Stable fix: Fix an Oopsable condition when nsm_mon_unmon is called as part of the namespace cleanup, - Stable fix: Ensure we reference the inode for return-on-close in delegreturn - Use SO_REUSEPORT to ensure that NFSv3 TCP connections can rebind to the same source address/port combination during a disconnect/reconnect event. This is a requirement imposed by most NFSv3 server duplicate reply cache implementations. Optimisations: - Ask for no NFSv4.1 delegations on OPEN if using O_DIRECT Other: - Add Anna Schumaker as co-maintainer for the NFS client -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU2swgAAoJEGcL54qWCgDyCWoP/1bxN8PesqaiwsBm3fsEqcra WZtMirDIpJYpHwgysdv9t5otBQrb7GrLlNyGZ9NBOVNakifoyj2tHe+/XGDx7Qny iYxXam0QdyjLU+bi4QoG4bdFncwQ/NmC6fqoG0rc25Il96Oggnc6LeSwL6Koc3CD QitRLLi/PaU5qtuaV80+tYMJiqZbpBdVjB+xfSpu7rhyWzm1QNdEeQYor5CozzMi 6cRJuvHgjoZ1xriCWdxQHjqOiEaKNLwfm3uZ3XVaaUAIDhStXugdhIihj3J6Wi7k MKNuY+AKJiy3yOdFfhYALyq+TPundDbYoM9x1foigjgP8zxXVfIU3VS6l33TSlzX zH+/lcnXAHFWjFYoAijG1gv1H+OYcTuDlKaYAShQ/cOkTfWFrmlWv+pZs3SSkmPY 4Aeu97YYOkB5ZZ7wTWKksQMeAu/LYNRSA3h+ANvEIR+NLlTSQTcaChlvBmS0IY5D qMmko1Xgmsxv+B8UeIY7PLfGBGrUdFho1JiDTfL8Xk7fGOfM7iBtMeaMAfdyOSUq AMqH9EDUUOWaFDggO2iisLtMCY6kJ0iFGKRTwzR38jAqm3bjWaIDitUqshNrNbC+ mbwvAVxn0IFSCJGFsVd3kD2rTLGDElZ25GLFW9JMalarE6nlLG7e4p65g209Q9bT HYKiyinJJM2Zji07kmG/ =c47U -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights incluse: Features: - Removing the forced serialisation of open()/close() calls in NFSv4.x (x>0) makes for a significant performance improvement in metadata intensive workloads. - Full support for the pNFS "flexible files" layout type - Further RPC/RDMA client improvements from Chuck Bugfixes: - Stable fix: NFSv4.1 backchannel calls blocking operations with !TASK_RUNNING - Stable fix: pnfs_generic_pg_init_read/write can be called with lseg == NULL - Stable fix: Fix an Oopsable condition when nsm_mon_unmon is called as part of the namespace cleanup, - Stable fix: Ensure we reference the inode for return-on-close in delegreturn - Use SO_REUSEPORT to ensure that NFSv3 TCP connections can rebind to the same source address/port combination during a disconnect/ reconnect event. This is a requirement imposed by most NFSv3 server duplicate reply cache implementations. Optimisations: - Ask for no NFSv4.1 delegations on OPEN if using O_DIRECT Other: - Add Anna Schumaker as co-maintainer for the NFS client" * tag 'nfs-for-3.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (119 commits) SUNRPC: Cleanup to remove xs_tcp_close() pnfs: delete an unintended goto pnfs/flexfiles: Do not dprintk after the free SUNRPC: Fix stupid typo in xs_sock_set_reuseport SUNRPC: Define xs_tcp_fin_timeout only if CONFIG_SUNRPC_DEBUG SUNRPC: Handle connection reset more efficiently. SUNRPC: Remove the redundant XPRT_CONNECTION_CLOSE flag SUNRPC: Make xs_tcp_close() do a socket shutdown rather than a sock_release SUNRPC: Ensure xs_tcp_shutdown() requests a full close of the connection SUNRPC: Cleanup to remove remaining uses of XPRT_CONNECTION_ABORT SUNRPC: Remove TCP socket linger code SUNRPC: Remove TCP client connection reset hack SUNRPC: TCP/UDP always close the old socket before reconnecting SUNRPC: Add helpers to prevent socket create from racing SUNRPC: Ensure xs_reset_transport() resets the close connection flags SUNRPC: Do not clear the source port in xs_reset_transport SUNRPC: Handle EADDRINUSE on connect SUNRPC: Set SO_REUSEPORT socket option for TCP connections NFSv4.1: Fix pnfs_put_lseg races NFSv4.1: pnfs_send_layoutreturn should use GFP_NOFS ...	2015-02-11 17:14:54 -08:00
Andrea Arcangeli	7e33912849	mm: gup: use get_user_pages_unlocked This allows those get_user_pages calls to pass FAULT_FLAG_ALLOW_RETRY to the page fault in order to release the mmap_sem during the I/O. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andres Lagar-Cavilla <andreslc@google.com> Cc: Peter Feiner <pfeiner@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-11 17:06:05 -08:00
Johannes Weiner	650c5e5654	mm: page_counter: pull "-1" handling out of page_counter_memparse() The unified hierarchy interface for memory cgroups will no longer use "-1" to mean maximum possible resource value. In preparation for this, make the string an argument and let the caller supply it. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-11 17:06:02 -08:00
Tom Herbert	fe881ef11c	gue: Use checksum partial with remote checksum offload Change remote checksum handling to set checksum partial as default behavior. Added an iflink parameter to configure not using checksum partial (calling csum_partial to update checksum). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 15:12:13 -08:00
Tom Herbert	15e2396d4e	net: Infrastructure for CHECKSUM_PARTIAL with remote checsum offload This patch adds infrastructure so that remote checksum offload can set CHECKSUM_PARTIAL instead of calling csum_partial and writing the modfied checksum field. Add skb_remcsum_adjust_partial function to set an skb for using CHECKSUM_PARTIAL with remote checksum offload. Changed skb_remcsum_process and skb_gro_remcsum_process to take a boolean argument to indicate if checksum partial can be set or the checksum needs to be modified using the normal algorithm. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 15:12:12 -08:00
Tom Herbert	6db93ea13b	udp: Set SKB_GSO_UDP_TUNNEL* in UDP GRO path Properly set GSO types and skb->encapsulation in the UDP tunnel GRO complete so that packets are properly represented for GSO. This sets SKB_GSO_UDP_TUNNEL or SKB_GSO_UDP_TUNNEL_CSUM depending on whether non-zero checksums were received, and sets SKB_GSO_TUNNEL_REMCSUM if the remote checksum option was processed. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 15:12:10 -08:00
Tom Herbert	26c4f7da3e	net: Fix remcsum in GRO path to not change packet Remote checksum offload processing is currently the same for both the GRO and non-GRO path. When the remote checksum offload option is encountered, the checksum field referred to is modified in the packet. So in the GRO case, the packet is modified in the GRO path and then the operation is skipped when the packet goes through the normal path based on skb->remcsum_offload. There is a problem in that the packet may be modified in the GRO path, but then forwarded off host still containing the remote checksum option. A remote host will again perform RCO but now the checksum verification will fail since GRO RCO already modified the checksum. To fix this, we ensure that GRO restores a packet to it's original state before returning. In this model, when GRO processes a remote checksum option it still changes the checksum per the algorithm but on return from lower layer processing the checksum is restored to its original value. In this patch we add define gro_remcsum structure which is passed to skb_gro_remcsum_process to save offset and delta for the checksum being changed. After lower layer processing, skb_gro_remcsum_cleanup is called to restore the checksum before returning from GRO. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 15:12:09 -08:00
Geert Uytterhoeven	13101602c4	openvswitch: Add missing initialization in validate_and_copy_set_tun() net/openvswitch/flow_netlink.c: In function ‘validate_and_copy_set_tun’: net/openvswitch/flow_netlink.c:1749: warning: ‘err’ may be used uninitialized in this function If ipv4_tun_from_nlattr() returns a different positive value than OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS, err will be uninitialized, and validate_and_copy_set_tun() may return an undefined value instead of a zero success indicator. Initialize err to zero to fix this. Fixes: `1dd144cf5b` ("openvswitch: Support VXLAN Group Policy extension") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 14:40:15 -08:00
Pravin B Shelar	b35725a285	openvswitch: Reset key metadata for packet execution. Userspace packet execute command pass down flow key for given packet. But userspace can skip some parameter with zero value. Therefore kernel needs to initialize key metadata to zero. Fixes: `0714812134` ("openvswitch: Eliminate memset() from flow_extract.") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 14:40:15 -08:00
Sowmini Varadhan	80ad0d4a7a	rds: rds_cong_queue_updates needs to defer the congestion update transmission When the RDS transport is TCP, we cannot inline the call to rds_send_xmit from rds_cong_queue_update because (a) we are already holding the sock_lock in the recv path, and will deadlock when tcp_setsockopt/tcp_sendmsg try to get the sock lock (b) cong_queue_update does an irqsave on the rds_cong_lock, and this will trigger warnings (for a good reason) from functions called out of sock_lock. This patch reverts the change introduced by `2fa57129d` ("RDS: Bypass workqueue when queueing cong updates"). The patch has been verified for both RDS/TCP as well as RDS/RDMA to ensure that there are not regressions for either transport: - for verification of RDS/TCP a client-server unit-test was used, with the server blocked in gdb and thus unable to drain its rcvbuf, eventually triggering a RDS congestion update. - for RDS/RDMA, the standard IB regression tests were used Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 14:35:44 -08:00
Vlad Yasevich	bf250a1fa7	ipv6: Partial checksum only UDP packets ip6_append_data is used by other protocols and some of them can't be partially checksummed. Only partially checksum UDP protocol. Fixes: `32dce968dd` (ipv6: Allow for partial checksums on non-ufo packets) Reported-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 14:32:08 -08:00
David S. Miller	4a3046d68a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains two small Netfilter updates for your net-next tree, they are: 1) Add ebtables support to nft_compat, from Arturo Borrero. 2) Fix missing validation of the SET_ID attribute in the lookup expressions, from Patrick McHardy. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 14:27:24 -08:00
Paul Moore	04f81f0154	cipso: don't use IPCB() to locate the CIPSO IP option Using the IPCB() macro to get the IPv4 options is convenient, but unfortunately NetLabel often needs to examine the CIPSO option outside of the scope of the IP layer in the stack. While historically IPCB() worked above the IP layer, due to the inclusion of the inet_skb_param struct at the head of the {tcp,udp}_skb_cb structs, recent commit `971f10ec` ("tcp: better TCP_SKB_CB layout to reduce cache line misses") reordered the tcp_skb_cb struct and invalidated this IPCB() trick. This patch fixes the problem by creating a new function, cipso_v4_optptr(), which locates the CIPSO option inside the IP header without calling IPCB(). Unfortunately, this isn't as fast as a simple lookup so some additional tweaks were made to limit the use of this new function. Cc: <stable@vger.kernel.org> # 3.18 Reported-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Tested-by: Casey Schaufler <casey@schaufler-ca.com>	2015-02-11 14:46:37 -05:00
Wu Fengguang	7f73b9f1ca	netfilter: ipset: fix boolreturn.cocci warnings net/netfilter/xt_set.c:196:9-10: WARNING: return of 0/1 in function 'set_match_v3' with return type bool net/netfilter/xt_set.c:242:9-10: WARNING: return of 0/1 in function 'set_match_v4' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-02-11 16:13:30 +01:00
Anton Blanchard	5cca4ace0f	netfilter: Don't hide NETFILTER_XT_MATCH_ADDRTYPE behind NETFILTER_ADVANCED Docker needs NETFILTER_XT_MATCH_ADDRTYPE, so move it out from behind NETFILTER_ADVANCED and make it default to a module. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-02-11 16:09:29 +01:00
Trond Myklebust	c627d31ba0	SUNRPC: Cleanup to remove xs_tcp_close() xs_tcp_close() is now just a call to xs_tcp_shutdown(), so remove it, and replace the entry in xs_tcp_ops. Suggested-by: Anna Schumaker <anna.schumaker@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-10 11:06:04 -05:00
Fan Du	b0f9ca53cb	ipv4: Namespecify TCP PMTU mechanism Packetization Layer Path MTU Discovery works separately beside Path MTU Discovery at IP level, different net namespace has various requirements on which one to chose, e.g., a virutalized container instance would require TCP PMTU to probe an usable effective mtu for underlying tunnel, while the host would employ classical ICMP based PMTU to function. Hence making TCP PMTU mechanism per net namespace to decouple two functionality. Furthermore the probe base MSS should also be configured separately for each namespace. Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 18:45:00 -08:00
David S. Miller	2573beec56	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:35:57 -08:00
Trond Myklebust	402e23b4ed	SUNRPC: Fix stupid typo in xs_sock_set_reuseport Yes, kernel_setsockopt() hates you for using a char argument. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 17:31:02 -05:00
Yuchung Cheng	531c94a968	tcp: don't include Fast Open option in SYN-ACK on pure SYN-data If a server has enabled Fast Open and it receives a pure SYN-data packet (without a Fast Open option), it won't accept the data but it incorrectly returns a SYN-ACK with a Fast Open cookie and also increments the SNMP stat LINUX_MIB_TCPFASTOPENPASSIVEFAIL. This patch makes the server include a Fast Open cookie in SYN-ACK only if the SYN has some Fast Open option (i.e., when client requests or presents a cookie). Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:26:55 -08:00
Thomas Graf	fd3137cd33	openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set This avoids setting TUNNEL_VXLAN_OPT for VXLAN frames which don't have any GBP metadata set. It is not invalid to set it but unnecessary. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:25:52 -08:00
Vlad Yasevich	8381eacf5c	ipv6: Make __ipv6_select_ident static Make __ipv6_select_ident() static as it isn't used outside the file. Fixes: `0508c07f5e` (ipv6: Select fragment id during UFO segmentation if not set.) Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:21:03 -08:00
Vlad Yasevich	51f30770e5	ipv6: Fix fragment id assignment on LE arches. Recent commit: `0508c07f5e` Author: Vlad Yasevich <vyasevich@gmail.com> Date: Tue Feb 3 16:36:15 2015 -0500 ipv6: Select fragment id during UFO segmentation if not set. Introduced a bug on LE in how ipv6 fragment id is assigned. This was cought by nightly sparce check: Resolve the following sparce error: net/ipv6/output_core.c:57:38: sparse: incorrect type in assignment (different base types) net/ipv6/output_core.c:57:38: expected restricted __be32 [usertype] ip6_frag_id net/ipv6/output_core.c:57:38: got unsigned int [unsigned] [assigned] [usertype] id Fixes: `0508c07f5e` (ipv6: Select fragment id during UFO segmentation if not set.) Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:21:03 -08:00
Toshiaki Makita	25d3b493a5	bridge: Fix inability to add non-vlan fdb entry Bridge's default_pvid adds a vid by default, by which we cannot add a non-vlan fdb entry by default, because br_fdb_add() adds fdb entries for all vlans instead of a non-vlan one when any vlan is configured. # ip link add br0 type bridge # ip link set eth0 master br0 # bridge fdb add 12:34:56:78:90:ab dev eth0 master temp # bridge fdb show brport eth0 \| grep 12:34:56:78:90:ab 12:34:56:78:90:ab dev eth0 vlan 1 static We expect a non-vlan fdb entry as well as vlan 1: 12:34:56:78:90:ab dev eth0 static To fix this, we need to insert a non-vlan fdb entry if vlan is not specified, even when any vlan is configured. Fixes: `5be5a2df40` ("bridge: Add filtering support for default_pvid") Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:19:04 -08:00
Andrew Lunn	b750f5b427	net: dsa: Remove redundant phy_attach() dsa_slave_phy_setup() finds the phy for the port via device tree and using of_phy_connect(), or it uses the fall back of taking a phy from the switch internal mdio bus and calling phy_connect_direct(). Either way, if a phy is found, phy_attach_direct() is called to attach the phy to the slave device. In dsa_slave_create(), a second call to phy_attach() is made. This results in the warning "PHY already attached". Remove this second, redundant attaching of the phy. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:05:42 -08:00
Richard Alpe	941787b829	tipc: remove tipc_snprintf tipc_snprintf() was heavily utilized by the old netlink API which no longer exists (now netlink compat). In this patch we swap tipc_snprintf() to the identical scnprintf() in the only remaining occurrence. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:49 -08:00
Richard Alpe	22ae7cff50	tipc: nl compat add noop and remove legacy nl framework Add TIPC_CMD_NOOP to compat layer and remove the old framework. All legacy nl commands are now converted to the compat layer in netlink_compat.c. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:49 -08:00
Richard Alpe	5a81a6377b	tipc: convert legacy nl stats show to nl compat Convert TIPC_CMD_SHOW_STATS to compat layer. This command does not have any counterpart in the new API, meaning it now solely exists as a function in the compat layer. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:49 -08:00
Richard Alpe	3c26181c5b	tipc: convert legacy nl net id get to nl compat Convert TIPC_CMD_GET_NETID to compat dumpit. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:49 -08:00
Richard Alpe	964f9501c1	tipc: convert legacy nl net id set to nl compat Convert TIPC_CMD_SET_NETID to compat doit. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:49 -08:00
Richard Alpe	d7cc75d3cb	tipc: convert legacy nl node addr set to nl compat Convert TIPC_CMD_SET_NODE_ADDR to compat doit. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:49 -08:00
Richard Alpe	4b28cb581d	tipc: convert legacy nl node dump to nl compat Convert TIPC_CMD_GET_NODES to compat dumpit and remove global node counter solely used by the legacy API. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:49 -08:00
Richard Alpe	5bfc335a63	tipc: convert legacy nl media dump to nl compat Convert TIPC_CMD_GET_MEDIA_NAMES to compat dumpit. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:48 -08:00
Richard Alpe	487d2a3a13	tipc: convert legacy nl socket dump to nl compat Convert socket (port) listing to compat dumpit call. If a socket (port) has publications a second dumpit call is issued to collect them and format then into the legacy buffer before continuing to process the sockets (ports). Command converted in this patch: TIPC_CMD_SHOW_PORTS Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:48 -08:00
Richard Alpe	44a8ae94fd	tipc: convert legacy nl name table dump to nl compat Add functionality for printing a dump header and convert TIPC_CMD_SHOW_NAME_TABLE to compat dumpit. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:48 -08:00
Richard Alpe	1817877b3c	tipc: convert legacy nl link stat reset to nl compat Convert TIPC_CMD_RESET_LINK_STATS to compat doit. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:48 -08:00
Richard Alpe	37e2d4843f	tipc: convert legacy nl link prop set to nl compat Convert setting of link proprieties to compat doit calls. Commands converted in this patch: TIPC_CMD_SET_LINK_TOL TIPC_CMD_SET_LINK_PRI TIPC_CMD_SET_LINK_WINDOW Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:48 -08:00
Richard Alpe	357ebdbfca	tipc: convert legacy nl link dump to nl compat Convert TIPC_CMD_GET_LINKS to compat dumpit and remove global link counter solely used by the legacy API. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:48 -08:00
Richard Alpe	f2b3b2d4cc	tipc: convert legacy nl link stat to nl compat Add functionality for safely appending string data to a TLV without keeping write count in the caller. Convert TIPC_CMD_SHOW_LINK_STATS to compat dumpit. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:47 -08:00
Richard Alpe	9ab154658a	tipc: convert legacy nl bearer enable/disable to nl compat Introduce a framework for transcoding legacy nl action into actions (.doit) calls from the new nl API. This is done by converting the incoming TLV data into netlink data with nested netlink attributes. Unfortunately due to the randomness of the legacy API we can't do this generically so each legacy netlink command requires a specific transcoding recipe. In this case for bearer enable and bearer disable. Convert TIPC_CMD_ENABLE_BEARER and TIPC_CMD_DISABLE_BEARER into doit compat calls. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:47 -08:00
Richard Alpe	d0796d1ef6	tipc: convert legacy nl bearer dump to nl compat Introduce a framework for dumping netlink data from the new netlink API and formatting it to the old legacy API format. This is done by looping the dump data and calling a format handler for each entity, in this case a bearer. We dump until either all data is dumped or we reach the limited buffer size of the legacy API. Remember, the legacy API doesn't scale. In this commit we convert TIPC_CMD_GET_BEARER_NAMES to use the compat layer. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:47 -08:00
Richard Alpe	bfb3e5dd8d	tipc: move and rename the legacy nl api to "nl compat" The new netlink API is no longer "v2" but rather the standard API and the legacy API is now "nl compat". We split them into separate start/stop and put them in different files in order to further distinguish them. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 13:20:47 -08:00
Trond Myklebust	54c0987492	SUNRPC: Define xs_tcp_fin_timeout only if CONFIG_SUNRPC_DEBUG Now that the linger code is gone, the xs_tcp_fin_timeout variable has no real function. Keep it for now, since it is part of the /proc interface, but only define it if that /proc interface is enabled. Suggested-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 11:27:45 -05:00
Trond Myklebust	b70ae915e4	SUNRPC: Handle connection reset more efficiently. If the connection reset is due to an active call on our side, then the state change is sometimes not reported. Catch those instances using xs_error_report() instead. Also remove the xs_tcp_shutdown() call in xs_tcp_send_request() as the change in behaviour makes it redundant. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 11:27:42 -05:00
Trond Myklebust	9e2b9f3776	SUNRPC: Remove the redundant XPRT_CONNECTION_CLOSE flag Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 11:26:06 -05:00
Trond Myklebust	caf4ccd4e8	SUNRPC: Make xs_tcp_close() do a socket shutdown rather than a sock_release Use of socket shutdown() means that we monitor the shutdown process through the xs_tcp_state_change() callback, so it is preferable to a full close in all cases unless we're destroying the transport. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 11:20:44 -05:00
Trond Myklebust	0efeac261c	SUNRPC: Ensure xs_tcp_shutdown() requests a full close of the connection The previous behaviour left the connection half-open in order to try to scrape the last replies from the socket. Now that we have more reliable reconnection, change the behaviour to close down the socket faster. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 09:31:11 -05:00
Trond Myklebust	505936f59f	SUNRPC: Cleanup to remove remaining uses of XPRT_CONNECTION_ABORT Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 09:20:40 -05:00
Trond Myklebust	9cbc94fb06	SUNRPC: Remove TCP socket linger code Now that we no longer use the partial shutdown code when closing the socket, we no longer need to worry about the TCP linger2 state. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-09 09:20:40 -05:00
Steffen Klassert	044a832a77	xfrm: Fix local error reporting crash with interfamily tunnels We set the outer mode protocol too early. As a result, the local error handler might dispatch to the wrong address family and report the error to a wrong socket type. We fix this by setting the outer protocol to the skb after we accessed the inner mode for the last time, right before we do the atcual encapsulation where we switch finally to the outer mode. Reported-by: Chris Ruehl <chris.ruehl@gtsys.com.hk> Tested-by: Chris Ruehl <chris.ruehl@gtsys.com.hk> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2015-02-09 11:14:17 +01:00
Julian Anastasov	cd67cd5eb2	ipvs: use 64-bit rates in stats IPVS stats are limited to 2^(32-10) conns/s and packets/s, 2^(32-5) bytes/s. It is time to use 64 bits: * Change all conn/packet kernel counters to 64-bit and update them in u64_stats_update_{begin,end} section * In kernel use struct ip_vs_kstats instead of the user-space struct ip_vs_stats_user and use new func ip_vs_export_stats_user to export it to sockopt users to preserve compatibility with 32-bit values * Rename cpu counters "ustats" to "cnt" * To netlink users provide additionally 64-bit stats: IPVS_SVC_ATTR_STATS64 and IPVS_DEST_ATTR_STATS64. Old stats remain for old binaries. * We can use ip_vs_copy_stats in ip_vs_stats_percpu_show Thanks to Chris Caputo for providing initial patch for ip_vs_est.c Signed-off-by: Chris Caputo <ccaputo@alt.net> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-09 16:59:03 +09:00
Eric Dumazet	93c1af6ca9	net:rfs: adjust table size checking Make sure root user does not try something stupid. Also make sure mask field in struct rps_sock_flow_table does not share a cache line with the potentially often dirtied flow table. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `567e4b7973` ("net: rfs: add hash collision detection") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 21:54:09 -08:00
Alexey Andriyanov	dd3733b3e7	ipvs: fix inability to remove a mixed-family RS The current code prevents any operation with a mixed-family dest unless IP_VS_CONN_F_TUNNEL flag is set. The problem is that it's impossible for the client to follow this rule, because ip_vs_genl_parse_dest does not even read the destination conn_flags when cmd = IPVS_CMD_DEL_DEST (need_full_dest = 0). Also, not every client can pass this flag when removing a dest. ipvsadm, for example, does not support the "-i" command line option together with the "-d" option. This change disables any checks for mixed-family on IPVS_CMD_DEL_DEST command. Signed-off-by: Alexey Andriyanov <alan@al-an.info> Fixes: `bc18d37f67` ("ipvs: Allow heterogeneous pools now that we support them") Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-09 14:13:30 +09:00
Trond Myklebust	4efdd92c92	SUNRPC: Remove TCP client connection reset hack Instead we rely on SO_REUSEPORT to provide the reconnection semantics that we need for NFSv2/v3. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-08 21:47:30 -05:00
Trond Myklebust	de84d89030	SUNRPC: TCP/UDP always close the old socket before reconnecting It is not safe to call xs_reset_transport() from inside xs_udp_setup_socket() or xs_tcp_setup_socket(), since they do not own the correct locks. Instead, do it in xs_connect(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-08 21:47:30 -05:00
Trond Myklebust	718ba5b873	SUNRPC: Add helpers to prevent socket create from racing The socket lock is currently held by the task that is requesting the connection be established. While that is efficient in the case where the connection happens quickly, it is racy in the case where it doesn't. What we really want is for the connect helper to be able to block access to the socket while it is being set up. This patch does so by arranging to transfer the socket lock from the task that is requesting the connect attempt, and then releasing that lock once everything is done. This scheme also gives us automatic protection against collisions with the RPC close code, so we can kill the cancel_delayed_work_sync() call in xs_close(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-08 21:47:29 -05:00
Trond Myklebust	6cc7e90836	SUNRPC: Ensure xs_reset_transport() resets the close connection flags Otherwise, we may end up looping. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-08 21:47:28 -05:00
Trond Myklebust	76698b2358	SUNRPC: Do not clear the source port in xs_reset_transport Now that we can reuse bound ports after a close, we never really want to clear the transport's source port after it has been set. Doing so really messes up the NFSv3 DRC on the server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-08 21:47:28 -05:00
Trond Myklebust	3913c78c3a	SUNRPC: Handle EADDRINUSE on connect Now that we're setting SO_REUSEPORT, we still need to handle the case where a connect() is attempted, but the old socket is still lingering. Essentially, all we want to do here is handle the error by waiting a few seconds and then retrying. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-08 21:47:27 -05:00
Eric Dumazet	567e4b7973	net: rfs: add hash collision detection Receive Flow Steering is a nice solution but suffers from hash collisions when a mix of connected and unconnected traffic is received on the host, when flow hash table is populated. Also, clearing flow in inet_release() makes RFS not very good for short lived flows, as many packets can follow close(). (FIN , ACK packets, ...) This patch extends the information stored into global hash table to not only include cpu number, but upper part of the hash value. I use a 32bit value, and dynamically split it in two parts. For host with less than 64 possible cpus, this gives 6 bits for the cpu number, and 26 (32-6) bits for the upper part of the hash. Since hash bucket selection use low order bits of the hash, we have a full hash match, if /proc/sys/net/core/rps_sock_flow_entries is big enough. If the hash found in flow table does not match, we fallback to RPS (if it is enabled for the rxqueue). This means that a packet for an non connected flow can avoid the IPI through a unrelated/victim CPU. This also means we no longer have to clear the table at socket close time, and this helps short lived flows performance. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 16:53:57 -08:00
Sabrina Dubroca	3e97fa7059	gre/ipip: use be16 variants of netlink functions encap.sport and encap.dport are __be16, use nla_{get,put}_be16 instead of nla_{get,put}_u16. Fixes the sparse warnings: warning: incorrect type in assignment (different base types) expected restricted __be32 [addressable] [usertype] o_key got restricted __be16 [addressable] [usertype] i_flags warning: incorrect type in assignment (different base types) expected restricted __be16 [usertype] sport got unsigned short warning: incorrect type in assignment (different base types) expected restricted __be16 [usertype] dport got unsigned short warning: incorrect type in argument 3 (different base types) expected unsigned short [unsigned] [usertype] value got restricted __be16 [usertype] sport warning: incorrect type in argument 3 (different base types) expected unsigned short [unsigned] [usertype] value got restricted __be16 [usertype] dport Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 16:28:06 -08:00
Trond Myklebust	4dda9c8a5e	SUNRPC: Set SO_REUSEPORT socket option for TCP connections When using TCP, we need the ability to reuse port numbers after a disconnection, so that the NFSv3 server knows that we're the same client. Currently we use a hack to work around the TCP socket's TIME_WAIT: we send an RST instead of closing, which doesn't always work... The SO_REUSEPORT option added in Linux 3.9 allows us to bind multiple TCP connections to the same source address+port combination, and thus to use ordinary TCP close() instead of the current hack. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-08 18:52:11 -05:00
Jon Paul Maloy	51a00daf73	tipc: fix bug in socket reception function In commit `c637c10355` ("tipc: resolve race problem at unicast message reception") we introduced a time limit for how long the function tipc_sk_eneque() would be allowed to execute its loop. Unfortunately, the test for when this limit is passed was put in the wrong place, resulting in a lost message when the test is true. We fix this by moving the test to before we dequeue the next buffer from the input queue. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 13:09:25 -08:00
Michael Büsch	662f5533c4	rt6_probe_deferred: Do not depend on struct ordering rt6_probe allocates a struct __rt6_probe_work and schedules a work handler rt6_probe_deferred. But rt6_probe_deferred kfree's the struct work_struct instead of struct __rt6_probe_work. This works, because struct work_struct is the first element of struct __rt6_probe_work. Change it to kfree struct __rt6_probe_work to not implicitly depend on struct work_struct being the first element. This does not affect the generated code. Signed-off-by: Michael Buesch <m@bues.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 13:00:43 -08:00
Trond Myklebust	bc3203cdca	NFS: RDMA Client Sparse Fixes This patch fixes a sparse warning in the initial submission. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJU09WdAAoJENfLVL+wpUDr2HMP/2rvjkncrlbumSUlkMMUPenZ PUM6dDAnvvhkRj/YrhmJ94v9SPb4BsR0ay4apn5aqmoEHqEy/LfAsoy1TlcspRO6 DwIKD8wHQ9RkpIVaEt1IGhApWlWK3R5FqoDTSMW9VLq72QuCsnbX3EXWN9B+hgT7 UVmcsGe0/5zd9v8RXzWJjyabOsx7rx7gNuuAyULO3RudNMzEpldMf7oHq0RwndOt akxSSvRSfMFyfJscfI3F2IJsbXBST+ZzzJ0oRXXWn0RfiWR+Z438Nk9HDuOfGBFv LACoELeJxtBEFAjBv6GnBILdSxRi6dTZPpAEzvNR45SJAk4c1gGDqgUbyIYXqeYM k8alEMCS+eKvDe7DL6N1Nlxq6oc5pAt7BYYpRzCz+2dd7gfHIOwurBnZwFgxN/rK GhtXCaBqqVr4lmu/lbgWjVFN/Jt6TGZ6+FqQhDM70CmJGBWbOEJNE6H50R7SiFpv 7UapXg9iknCpjexkZS3cyAB0Qu31qvIK64ZzsPhVafmSYh9ed5hlT2WSwphfDki5 ZQOktuPFtVkOI55Rz+2EpNNYe9nRuL5wBqD4Na3uUiEWgbnDLRmx+sDyzcfPujTN trFTwUO082SFK0yZVusZ04vlqjfzxSd4aYSeC5Zzgt0CYJCZd7z0tFrdLpyO/fDn PLfvcJyFRBBbaPgiwBHg =3fRY -----END PGP SIGNATURE----- Merge tag 'nfs-rdma-for-3.20-part-2' of git://git.linux-nfs.org/projects/anna/nfs-rdma NFS: RDMA Client Sparse Fixes This patch fixes a sparse warning in the initial submission. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> * tag 'nfs-rdma-for-3.20-part-2' of git://git.linux-nfs.org/projects/anna/nfs-rdma: xprtrdma: Address sparse complaint in rpcr_to_rdmar()	2015-02-08 10:37:34 -05:00
Neal Cardwell	4fb17a6091	tcp: mitigate ACK loops for connections as tcp_timewait_sock Ensure that in state FIN_WAIT2 or TIME_WAIT, where the connection is represented by a tcp_timewait_sock, we rate limit dupacks in response to incoming packets (a) with TCP timestamps that fail PAWS checks, or (b) with sequence numbers that are out of the acceptable window. We do not send a dupack in response to out-of-window packets if it has been less than sysctl_tcp_invalid_ratelimit (default 500ms) since we last sent a dupack in response to an out-of-window packet. Reported-by: Avery Fay <avery@mixpanel.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 01:03:13 -08:00
Neal Cardwell	f2b2c582e8	tcp: mitigate ACK loops for connections as tcp_sock Ensure that in state ESTABLISHED, where the connection is represented by a tcp_sock, we rate limit dupacks in response to incoming packets (a) with TCP timestamps that fail PAWS checks, or (b) with sequence numbers or ACK numbers that are out of the acceptable window. We do not send a dupack in response to out-of-window packets if it has been less than sysctl_tcp_invalid_ratelimit (default 500ms) since we last sent a dupack in response to an out-of-window packet. There is already a similar (although global) rate-limiting mechanism for "challenge ACKs". When deciding whether to send a challence ACK, we first consult the new per-connection rate limit, and then the global rate limit. Reported-by: Avery Fay <avery@mixpanel.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 01:03:12 -08:00
Neal Cardwell	a9b2c06dbe	tcp: mitigate ACK loops for connections as tcp_request_sock In the SYN_RECV state, where the TCP connection is represented by tcp_request_sock, we now rate-limit SYNACKs in response to a client's retransmitted SYNs: we do not send a SYNACK in response to client SYN if it has been less than sysctl_tcp_invalid_ratelimit (default 500ms) since we last sent a SYNACK in response to a client's retransmitted SYN. This allows the vast majority of legitimate client connections to proceed unimpeded, even for the most aggressive platforms, iOS and MacOS, which actually retransmit SYNs 1-second intervals for several times in a row. They use SYN RTO timeouts following the progression: 1,1,1,1,1,2,4,8,16,32. Reported-by: Avery Fay <avery@mixpanel.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 01:03:12 -08:00
Neal Cardwell	032ee42369	tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks Helpers for mitigating ACK loops by rate-limiting dupacks sent in response to incoming out-of-window packets. This patch includes: - rate-limiting logic - sysctl to control how often we allow dupacks to out-of-window packets - SNMP counter for cases where we rate-limited our dupack sending The rate-limiting logic in this patch decides to not send dupacks in response to out-of-window segments if (a) they are SYNs or pure ACKs and (b) the remote endpoint is sending them faster than the configured rate limit. We rate-limit our responses rather than blocking them entirely or resetting the connection, because legitimate connections can rely on dupacks in response to some out-of-window segments. For example, zero window probes are typically sent with a sequence number that is below the current window, and ZWPs thus expect to thus elicit a dupack in response. We allow dupacks in response to TCP segments with data, because these may be spurious retransmissions for which the remote endpoint wants to receive DSACKs. This is safe because segments with data can't realistically be part of ACK loops, which by their nature consist of each side sending pure/data-less ACKs to each other. The dupack interval is controlled by a new sysctl knob, tcp_invalid_ratelimit, given in milliseconds, in case an administrator needs to dial this upward in the face of a high-rate DoS attack. The name and units are chosen to be analogous to the existing analogous knob for ICMP, icmp_ratelimit. The default value for tcp_invalid_ratelimit is 500ms, which allows at most one such dupack per 500ms. This is chosen to be 2x faster than the 1-second minimum RTO interval allowed by RFC 6298 (section 2, rule 2.4). We allow the extra 2x factor because network delay variations can cause packets sent at 1 second intervals to be compressed and arrive much closer. Reported-by: Avery Fay <avery@mixpanel.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 01:03:12 -08:00
Pravin B Shelar	ca539345f8	openvswitch: Initialize unmasked key and uid len Flow alloc needs to initialize unmasked key pointer. Otherwise it can crash kernel trying to free random unmasked-key pointer. general protection fault: 0000 [#1] SMP 3.19.0-rc6-net-next+ #457 Hardware name: Supermicro X7DWU/X7DWU, BIOS 1.1 04/30/2008 RIP: 0010:[<ffffffff8111df0e>] [<ffffffff8111df0e>] kfree+0xac/0x196 Call Trace: [<ffffffffa060bd87>] flow_free+0x21/0x59 [openvswitch] [<ffffffffa060bde0>] ovs_flow_free+0x21/0x23 [openvswitch] [<ffffffffa0605b4a>] ovs_packet_cmd_execute+0x2f3/0x35f [openvswitch] [<ffffffffa0605995>] ? ovs_packet_cmd_execute+0x13e/0x35f [openvswitch] [<ffffffff811fe6fb>] ? nla_parse+0x4f/0xec [<ffffffff8139a2fc>] genl_family_rcv_msg+0x26d/0x2c9 [<ffffffff8107620f>] ? __lock_acquire+0x90e/0x9aa [<ffffffff8139a3be>] genl_rcv_msg+0x66/0x89 [<ffffffff8139a358>] ? genl_family_rcv_msg+0x2c9/0x2c9 [<ffffffff81399591>] netlink_rcv_skb+0x3e/0x95 [<ffffffff81399898>] ? genl_rcv+0x18/0x37 [<ffffffff813998a7>] genl_rcv+0x27/0x37 [<ffffffff81399033>] netlink_unicast+0x103/0x191 [<ffffffff81399382>] netlink_sendmsg+0x2c1/0x310 [<ffffffff811007ad>] ? might_fault+0x50/0xa0 [<ffffffff8135c773>] do_sock_sendmsg+0x5f/0x7a [<ffffffff8135c799>] sock_sendmsg+0xb/0xd [<ffffffff8135cacf>] ___sys_sendmsg+0x1a3/0x218 [<ffffffff8113e54b>] ? get_close_on_exec+0x86/0x86 [<ffffffff8115a9d0>] ? fsnotify+0x32c/0x348 [<ffffffff8115a720>] ? fsnotify+0x7c/0x348 [<ffffffff8113e5f5>] ? __fget+0xaa/0xbf [<ffffffff8113e54b>] ? get_close_on_exec+0x86/0x86 [<ffffffff8135cccd>] __sys_sendmsg+0x3d/0x5e [<ffffffff8135cd02>] SyS_sendmsg+0x14/0x16 [<ffffffff81411852>] system_call_fastpath+0x12/0x17 Fixes: 74ed7ab9264("openvswitch: Add support for unique flow IDs.") CC: Joe Stringer <joestringer@nicira.com> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 00:51:14 -08:00
Roopa Prabhu	1fd0bddb61	bridge: add missing bridge port check for offloads This patch fixes a missing bridge port check caught by smatch. setlink/dellink of attributes like vlans can come for a bridge device and there is no need to offload those today. So, this patch adds a bridge port check. (In these cases however, the BRIDGE_SELF flags will always be set and we may not hit a problem with the current code). smatch complaint: The patch `68e331c785`: "bridge: offload bridge port attributes to switch asic if feature flag set" from Jan 29, 2015, leads to the following Smatch complaint: net/bridge/br_netlink.c:552 br_setlink() error: we previously assumed 'p' could be null (see line 518) net/bridge/br_netlink.c 517 518 if (p && protinfo) { ^ Check for NULL. Reported-By: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:49:47 -08:00
Eric Dumazet	91e83133e7	net: use netif_rx_ni() from process context Hotpluging a cpu might be rare, yet we have to use proper handlers when taking over packets found in backlog queues. dev_cpu_callback() runs from process context, thus we should call netif_rx_ni() to properly invoke softirq handler. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:43:52 -08:00
Sowmini Varadhan	d0a47d3272	rds: Make rds_message_copy_from_user() return 0 on success. Commit `083735f4b0` ("rds: switch rds_message_copy_from_user() to iov_iter") breaks rds_message_copy_from_user() semantics on success, and causes it to return nbytes copied, when it should return 0. This commit fixes that bug. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:41:56 -08:00
Rasmus Villemoes	11ac11999b	net: rds: Remove repeated function names from debug output The macro rdsdebug is defined as pr_debug("%s(): " fmt, __func__ , ##args) Hence it doesn't make sense to include the name of the calling function explicitly in the format string passed to rdsdebug. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:41:56 -08:00
Jarno Rajahalme	83d2b9ba1a	net: openvswitch: Support masked set actions. OVS userspace already probes the openvswitch kernel module for OVS_ACTION_ATTR_SET_MASKED support. This patch adds the kernel module implementation of masked set actions. The existing set action sets many fields at once. When only a subset of the IP header fields, for example, should be modified, all the IP fields need to be exact matched so that the other field values can be copied to the set action. A masked set action allows modification of an arbitrary subset of the supported header bits without requiring the rest to be matched. Masked set action is now supported for all writeable key types, except for the tunnel key. The set tunnel action is an exception as any input tunnel info is cleared before action processing starts, so there is no tunnel info to mask. The kernel module converts all (non-tunnel) set actions to masked set actions. This makes action processing more uniform, and results in less branching and duplicating the action processing code. When returning actions to userspace, the fully masked set actions are converted back to normal set actions. We use a kernel internal action code to be able to tell the userspace provided and converted masked set actions apart. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:40:17 -08:00
David S. Miller	3c09e92fb6	NFC: 3.20 second pull request This is the second NFC pull request for 3.20. It brings: - NCI NFCEE (NFC Execution Environment, typically an embedded or external secure element) discovery and enabling/disabling support. In order to communicate with an NFCEE, we also added NCI's logical connections support to the NCI stack. - HCI over NCI protocol support. Some secure elements only understand HCI and thus we need to send them HCI frames when they're part of an NCI chipset. - NFC_EVT_TRANSACTION userspace API addition. Whenever an application running on a secure element needs to notify its host counterpart, we send an NFC_EVENT_SE_TRANSACTION event to userspace through the NFC netlink socket. - Secure element and HCI transaction event support for the st21nfcb chipset. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU06ozAAoJEIqAPN1PVmxKdtkP/2CeQ0+xJnJWgyh4xjkVxfPb wPYrz3cHeHdnVX36mnIdRoQZRjxh/Ef/vgK8RsXohcldzHA9D8GjxRpj0GrkVGQ9 xoDkiHsFDdsZHeezSrlnXNnObvZrkeMsmnUDJxfwLtcX1nzFKzBgW2GaDiNjUazC r5Ip4YIfoCaVLq5MSqYRqJ6uhplXd/ePdtPoo54L/E81y4ptQi8pUsQBy4K3GrFn FbXoemcymhi9AVBGl+OlppZ93t6bdOV1D5tcOwpytAbfa3jp2oUnJPZay7EoWruR aj8l5v3P+SSphAJwaGSbny0L83tTS7sq+N4QG+VxxdMkw2YjpmxgN3AGguNBtPGG SUThpZV6QhrkAOnwzP2BJnh68WSU1eJrGvk8zUWW0SanA8k2jaHO/VzXCA3/ZkC4 IFcXl4Jr6R3iUxDFgS/6MB5E9EGedzFTacsWoHVyZPtaDAQp4WVdAIRQ4QKgJCLw 1yRmCeVunTsrs6O0aenU1xHinPlnx+tkuiNh4H64APQYTz54WwXH1/S04OL1SkqI mTzUvFgWqJvSIOuU19g0HBw+xZ/9vvX8LLkm9kSTbe7tcmFH5CI7ZCN/hNDHvMSd sjNYOyag/SAHJ01tTKcSPAgWO49bHWPodI04zP0lK0gMLjKvRknVMdTgIDfu49HN 2da9E23iBmNNgG79iVHN =8caD -----END PGP SIGNATURE----- Merge tag 'nfc-next-3.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next NFC: 3.20 second pull request This is the second NFC pull request for 3.20. It brings: - NCI NFCEE (NFC Execution Environment, typically an embedded or external secure element) discovery and enabling/disabling support. In order to communicate with an NFCEE, we also added NCI's logical connections support to the NCI stack. - HCI over NCI protocol support. Some secure elements only understand HCI and thus we need to send them HCI frames when they're part of an NCI chipset. - NFC_EVT_TRANSACTION userspace API addition. Whenever an application running on a secure element needs to notify its host counterpart, we send an NFC_EVENT_SE_TRANSACTION event to userspace through the NFC netlink socket. - Secure element and HCI transaction event support for the st21nfcb chipset. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:22:25 -08:00
Daniel Borkmann	364d5716a7	rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARY ifla_vf_policy[] is wrong in advertising its individual member types as NLA_BINARY since .type = NLA_BINARY in combination with .len declares the len member as max attribute length [0, len]. The issue is that when do_setvfinfo() is being called to set up a VF through ndo handler, we could set corrupted data if the attribute length is less than the size of the related structure itself. The intent is exactly the opposite, namely to make sure to pass at least data of minimum size of len. Fixes: `ebc08a6f47` ("rtnetlink: Add VF config code to rtnetlink") Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:13:10 -08:00
Tobias Waldekranz	e04449fcf2	dsa: correctly determine the number of switches in a system The number of connected switches was sourced from the number of children to the DSA node, change it to the number of available children, skipping any disabled switches. Fixes: `5e95329b70` ("dsa: add device tree bindings to register DSA switches") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-07 22:07:37 -08:00
Daniel Borkmann	11b1f8288d	ipv6: addrconf: add missing validate_link_af handler We still need a validate_link_af() handler with an appropriate nla policy, similarly as we have in IPv4 case, otherwise size validations are not being done properly in that case. Fixes: `f53adae4ea` ("net: ipv6: add tokenized interface identifier support") Fixes: `bc91b0f07a` ("ipv6: addrconf: implement address generation modes") Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 12:51:33 -08:00
Hajime Tazaki	cd3bafc73d	xfrm6: Fix a offset value for network header in _decode_session6 When a network-layer header has multiple IPv6 extension headers, then offset for mobility header goes wrong. This regression breaks an xfrm policy lookup for a particular receive packet. Binding update packets of Mobile IPv6 are all discarded without this fix. Fixes: `de3b7a06df` ("xfrm6: Fix transport header offset in _decode_session6.") Signed-off-by: Hajime Tazaki <tazaki@sfc.wide.ad.jp> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2015-02-06 07:00:32 +01:00
Jon Paul Maloy	cb1b728096	tipc: eliminate race condition at multicast reception In a previous commit in this series we resolved a race problem during unicast message reception. Here, we resolve the same problem at multicast reception. We apply the same technique: an input queue serializing the delivery of arriving buffers. The main difference is that here we do it in two steps. First, the broadcast link feeds arriving buffers into the tail of an arrival queue, which head is consumed at the socket level, and where destination lookup is performed. Second, if the lookup is successful, the resulting buffer clones are fed into a second queue, the input queue. This queue is consumed at reception in the socket just like in the unicast case. Both queues are protected by the same lock, -the one of the input queue. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:03 -08:00
Jon Paul Maloy	3c724acdd5	tipc: simplify socket multicast reception The structure 'tipc_port_list' is used to collect port numbers representing multicast destination socket on a receiving node. The list is not based on a standard linked list, and is in reality optimized for the uncommon case that there are more than one multicast destinations per node. This makes the list handling unecessarily complex, and as a consequence, even the socket multicast reception becomes more complex. In this commit, we replace 'tipc_port_list' with a new 'struct tipc_plist', which is based on a standard list. We give the new list stack (push/pop) semantics, someting that simplifies the implementation of the function tipc_sk_mcast_rcv(). Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:03 -08:00
Jon Paul Maloy	708ac32cb5	tipc: simplify connection abort notifications when links break The new input message queue in struct tipc_link can be used for delivering connection abort messages to subscribing sockets. This makes it possible to simplify the code for such cases. This commit removes the temporary list in tipc_node_unlock() used for transforming abort subscriptions to messages. Instead, the abort messages are now created at the moment of lost contact, and then added to the last failed link's generic input queue for delivery to the sockets concerned. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:02 -08:00
Jon Paul Maloy	c637c10355	tipc: resolve race problem at unicast message reception TIPC handles message cardinality and sequencing at the link layer, before passing messages upwards to the destination sockets. During the upcall from link to socket no locks are held. It is therefore possible, and we see it happen occasionally, that messages arriving in different threads and delivered in sequence still bypass each other before they reach the destination socket. This must not happen, since it violates the sequentiality guarantee. We solve this by adding a new input buffer queue to the link structure. Arriving messages are added safely to the tail of that queue by the link, while the head of the queue is consumed, also safely, by the receiving socket. Sequentiality is secured per socket by only allowing buffers to be dequeued inside the socket lock. Since there may be multiple simultaneous readers of the queue, we use a 'filter' parameter to reduce the risk that they peek the same buffer from the queue, hence also reducing the risk of contention on the receiving socket locks. This solves the sequentiality problem, and seems to cause no measurable performance degradation. A nice side effect of this change is that lock handling in the functions tipc_rcv() and tipc_bcast_rcv() now becomes uniform, something that will enable future simplifications of those functions. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:02 -08:00
Jon Paul Maloy	94153e36e7	tipc: use existing sk_write_queue for outgoing packet chain The list for outgoing traffic buffers from a socket is currently allocated on the stack. This forces us to initialize the queue for each sent message, something costing extra CPU cycles in the most critical data path. Later in this series we will introduce a new safe input buffer queue, something that would force us to initialize even the spinlock of the outgoing queue. A closer analysis reveals that the queue always is filled and emptied within the same lock_sock() session. It is therefore safe to use a queue aggregated in the socket itself for this purpose. Since there already exists a queue for this in struct sock, sk_write_queue, we introduce use of that queue in this commit. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:02 -08:00
Jon Paul Maloy	e3a77561e7	tipc: split up function tipc_msg_eval() The function tipc_msg_eval() is in reality doing two related, but different tasks. First it tries to find a new destination for named messages, in case there was no first lookup, or if the first lookup failed. Second, it does what its name suggests, evaluating the validity of the message and its destination, and returning an appropriate error code depending on the result. This is confusing, and in this commit we choose to break it up into two functions. A new function, tipc_msg_lookup_dest(), first attempts to find a new destination, if the message is of the right type. If this lookup fails, or if the message should not be subject to a second lookup, the already existing tipc_msg_reverse() is called. This function performs prepares the message for rejection, if applicable. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:02 -08:00
Jon Paul Maloy	d570d86497	tipc: enqueue arrived buffers in socket in separate function The code for enqueuing arriving buffers in the function tipc_sk_rcv() contains long code lines and currently goes to two indentation levels. As a cosmetic preparaton for the next commits, we break it out into a separate function. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:02 -08:00
Jon Paul Maloy	1186adf7df	tipc: simplify message forwarding and rejection in socket layer Despite recent improvements, the handling of error codes and return values at reception of messages in the socket layer is still confusing. In this commit, we try to make it more comprehensible. First, we separate between the return values coming from the functions called by tipc_sk_rcv(), -those are TIPC specific error codes, and the return values returned by tipc_sk_rcv() itself. Second, we don't use the returned TIPC error code as indication for whether a buffer should be forwarded/rejected or not; instead we use the buffer pointer passed along with filter_msg(). This separation is necessary because we sometimes want to forward messages even when there is no error (i.e., protocol messages and successfully secondary looked up data messages). Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:01 -08:00
Jon Paul Maloy	c5898636c4	tipc: reduce usage of context info in socket and link The most common usage of namespace information is when we fetch the own node addess from the net structure. This leads to a lot of passing around of a parameter of type 'struct net *' between functions just to make them able to obtain this address. However, in many cases this is unnecessary. The own node address is readily available as a member of both struct tipc_sock and tipc_link, and can be fetched from there instead. The fact that the vast majority of functions in socket.c and link.c anyway are maintaining a pointer to their respective base structures makes this option even more compelling. In this commit, we introduce the inline functions tsk_own_node() and link_own_node() to make it easy for functions to fetch the node address from those structs instead of having to pass along and dereference the namespace struct. In particular, we make calls to the msg_xx() functions in msg.{h,c} context independent by directly passing them the own node address as parameter when needed. Those functions should be regarded as leaves in the code dependency tree, and it is hence desirable to keep them namspace unaware. Apart from a potential positive effect on cache behavior, these changes make it easier to introduce the changes that will follow later in this series. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:00:01 -08:00
Sabrina Dubroca	7744b5f369	pktgen: fix UDP checksum computation This patch fixes two issues in UDP checksum computation in pktgen. First, the pseudo-header uses the source and destination IP addresses. Currently, the ports are used for IPv4. Second, the UDP checksum covers both header and data. So we need to generate the data earlier (move pktgen_finalize_skb up), and compute the checksum for UDP header + data. Fixes: `c26bf4a513` ("pktgen: Add UDPCSUM flag to support UDP checksums") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 15:41:34 -08:00
Erik Kline	c58da4c659	net: ipv6: allow explicitly choosing optimistic addresses RFC 4429 ("Optimistic DAD") states that optimistic addresses should be treated as deprecated addresses. From section 2.1: Unless noted otherwise, components of the IPv6 protocol stack should treat addresses in the Optimistic state equivalently to those in the Deprecated state, indicating that the address is available for use but should not be used if another suitable address is available. Optimistic addresses are indeed avoided when other addresses are available (i.e. at source address selection time), but they have not heretofore been available for things like explicit bind() and sendmsg() with struct in6_pktinfo, etc. This change makes optimistic addresses treated more like deprecated addresses than tentative ones. Signed-off-by: Erik Kline <ek@google.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 15:37:41 -08:00
Miroslav Urbanek	233c96fc07	flowcache: Fix kernel panic in flow_cache_flush_task flow_cache_flush_task references a structure member flow_cache_gc_work where it should reference flow_cache_flush_task instead. Kernel panic occurs on kernels using IPsec during XFRM garbage collection. The garbage collection interval can be shortened using the following sysctl settings: net.ipv4.xfrm4_gc_thresh=4 net.ipv6.xfrm6_gc_thresh=4 With the default settings, our productions servers crash approximately once a week. With the settings above, they crash immediately. Fixes: `ca925cf153` ("flowcache: Make flow cache name space aware") Reported-by: Tomáš Charvát <tc@excello.cz> Tested-by: Jan Hejl <jh@excello.cz> Signed-off-by: Miroslav Urbanek <mu@miroslavurbanek.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 14:38:53 -08:00
David S. Miller	6e03f896b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 14:33:28 -08:00
Chuck Lever	b625a61698	xprtrdma: Address sparse complaint in rpcr_to_rdmar() With "make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__": linux-2.6/net/sunrpc/xprtrdma/xprt_rdma.h:273:30: warning: incorrect type in initializer (different base types) linux-2.6/net/sunrpc/xprtrdma/xprt_rdma.h:273:30: expected restricted __be32 [usertype] buffer linux-2.6/net/sunrpc/xprtrdma/xprt_rdma.h:273:30: got unsigned int [usertype] rq_buffer As far as I can tell this is a false positive. Reported-by: kbuild-all@01.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-02-05 15:38:29 -05:00
Eric Dumazet	a409caecb2	sit: fix some __be16/u16 mismatches Fixes following sparse warnings : net/ipv6/sit.c:1509:32: warning: incorrect type in assignment (different base types) net/ipv6/sit.c:1509:32: expected restricted __be16 [usertype] sport net/ipv6/sit.c:1509:32: got unsigned short net/ipv6/sit.c:1514:32: warning: incorrect type in assignment (different base types) net/ipv6/sit.c:1514:32: expected restricted __be16 [usertype] dport net/ipv6/sit.c:1514:32: got unsigned short net/ipv6/sit.c:1711:38: warning: incorrect type in argument 3 (different base types) net/ipv6/sit.c:1711:38: expected unsigned short [unsigned] [usertype] value net/ipv6/sit.c:1711:38: got restricted __be16 [usertype] sport net/ipv6/sit.c:1713:38: warning: incorrect type in argument 3 (different base types) net/ipv6/sit.c:1713:38: expected unsigned short [unsigned] [usertype] value net/ipv6/sit.c:1713:38: got restricted __be16 [usertype] dport Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 00:43:14 -08:00
Eric Dumazet	2ce1ee1780	net: remove some sparse warnings netdev_adjacent_add_links() and netdev_adjacent_del_links() are static. queue->qdisc has __rcu annotation, need to use RCU_INIT_POINTER() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 00:41:17 -08:00
Sabrina Dubroca	d1e158e2d7	ip6_gre: fix endianness errors in ip6gre_err info is in network byte order, change it back to host byte order before use. In particular, the current code sets the MTU of the tunnel to a wrong (too big) value. Fixes: `c12b395a46` ("gre: Support GRE over IPv6") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 00:33:10 -08:00
David S. Miller	3fcf901118	Revert "bridge: Let bridge not age 'externally' learnt FDB entries, they are removed when 'external' entity notifies the aging" This reverts commit `9a05dde59a`. Requested by Scott Feldman. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 23:52:44 -08:00
Eric Dumazet	06eb395fa9	pkt_sched: fq: better control of DDOS traffic FQ has a fast path for skb attached to a socket, as it does not have to compute a flow hash. But for other packets, FQ being non stochastic means that hosts exposed to random Internet traffic can allocate million of flows structure (104 bytes each) pretty easily. Not only host can OOM, but lookup in RB trees can take too much cpu and memory resources. This patch adds a new attribute, orphan_mask, that is adding possibility of having a stochastic hash for orphaned skb. Its default value is 1024 slots, to mimic SFQ behavior. Note: This does not apply to locally generated TCP traffic, and no locally generated traffic will share a flow structure with another perfect or stochastic flow. This patch also handles the specific case of SYNACK messages: They are attached to the listener socket, and therefore all map to a single hash bucket. If listener have set SO_MAX_PACING_RATE, hoping to have new accepted socket inherit this rate, SYNACK might be paced and even dropped. This is very similar to an internal patch Google have used more than one year. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 22:15:45 -08:00
David S. Miller	f2683b743f	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs More iov_iter work from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:46:55 -08:00
Eric Dumazet	9878196578	tcp: do not pace pure ack packets When we added pacing to TCP, we decided to let sch_fq take care of actual pacing. All TCP had to do was to compute sk->pacing_rate using simple formula: sk->pacing_rate = 2 * cwnd * mss / rtt It works well for senders (bulk flows), but not very well for receivers or even RPC : cwnd on the receiver can be less than 10, rtt can be around 100ms, so we can end up pacing ACK packets, slowing down the sender. Really, only the sender should pace, according to its own logic. Instead of adding a new bit in skb, or call yet another flow dissection, we tweak skb->truesize to a small value (2), and we instruct sch_fq to use new helper and not pace pure ack. Note this also helps TCP small queue, as ack packets present in qdisc/NIC do not prevent sending a data packet (RPC workload) This helps to reduce tx completion overhead, ack packets can use regular sock_wfree() instead of tcp_wfree() which is a bit more expensive. This has no impact in the case packets are sent to loopback interface, as we do not coalesce ack packets (were we would detect skb->truesize lie) In case netem (with a delay) is used, skb_orphan_partial() also sets skb->truesize to 1. This patch is a combination of two patches we used for about one year at Google. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:36:31 -08:00
Herbert Xu	9a77662882	netfilter: Use rhashtable walk iterator This patch gets rid of the manual rhashtable walk in nft_hash which touches rhashtable internals that should not be exposed. It does so by using the rhashtable iterator primitives. Note that I'm leaving nft_hash_destroy alone since it's only invoked on shutdown and it shouldn't be affected by changes to rhashtable internals (or at least not what I'm planning to change). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:34:53 -08:00
Herbert Xu	56d28b1e92	netlink: Use rhashtable walk iterator This patch gets rid of the manual rhashtable walk in netlink which touches rhashtable internals that should not be exposed. It does so by using the rhashtable iterator primitives. In fact the existing code was very buggy. Some sockets weren't shown at all while others were shown more than once. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:34:53 -08:00
Ignacy Gawędzki	b057df24a7	cls_api.c: Fix dumping of non-existing actions' stats. In tcf_exts_dump_stats(), ensure that exts->actions is not empty before accessing the first element of that list and calling tcf_action_copy_stats() on it. This fixes some random segvs when adding filters of type "basic" with no particular action. This also fixes the dumping of those "no-action" filters, which more often than not made calls to tcf_action_copy_stats() fail and consequently netlink attributes added by the caller to be removed by a call to nla_nest_cancel(). Fixes: `33be627159` ("net_sched: act: use standard struct list_head") Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:26:12 -08:00
Kenneth Klette Jonassen	3725a26981	pkt_sched: fq: avoid hang when quantum 0 Configuring fq with quantum 0 hangs the system, presumably because of a non-interruptible infinite loop. Either way quantum 0 does not make sense. Reproduce with: sudo tc qdisc add dev lo root fq quantum 0 initial_quantum 0 ping 127.0.0.1 Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:07:39 -08:00
Moni Shoua	61bd3857ff	net/core: Add event for a change in slave state Add event which provides an indication on a change in the state of a bonding slave. The event handler should cast the pointer to the appropriate type (struct netdev_bonding_info) in order to get the full info about the slave. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:24 -08:00
Jon Paul Maloy	af9946fde9	tipc: separate link starting event from link timeout event When a new link instance is created, it is trigged to start by sending it a TIPC_STARTING_EVT, whereafter a regular link reset is applied to it. The starting event is codewise treated as a timeout event, and prompts a link RESET message to be sent to the peer node, carrying a link session identifier. The later link_reset() call nudges this session identifier, whereafter all subsequent RESET messages will be sent out with the new identifier. The latter session number overrides the former, causing the peer to unconditionally accept it irrespective of its current working state. We don't think that this causes any problem, but it is not in accordance with the protocol spec, and may cause confusion when debugging TIPC sessions. To avoid this, we make the starting event distinct from the subsequent timeout events, by not allowing the former to send out any RESET message. This eliminates the described problem. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:09:31 -08:00
Jon Paul Maloy	b45db71b52	tipc: eliminate race during node creation Instances of struct node are created in the function tipc_disc_rcv() under the assumption that there is no race between received discovery messages arriving from the same node. This assumption is wrong. When we use more than one bearer, it is possible that discovery messages from the same node arrive at the same moment, resulting in creation of two instances of struct tipc_node. This may later cause confusion during link establishment, and may result in one of the links never becoming activated. We fix this by making lookup and potential creation of nodes atomic. Instead of first looking up the node, and in case of failure, create it, we now start with looking up the node inside node_link_create(), and return a reference to that one if found. Otherwise, we go ahead and create the node as we did before. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:09:31 -08:00
Jon Paul Maloy	7d24dcdb3f	tipc: avoid stale link after aborted failover During link failover it may happen that the remaining link goes down while it is still in the process of taking over traffic from a previously failed link. When this happens, we currently abort the failover procedure and reset the first failed link to non-failover mode, so that it will be ready to re-establish contact with its peer when it comes available. However, if the first link goes down because its bearer was manually disabled, it is not enough to reset it; it must also be deleted; which is supposed to happen when the failover procedure is finished. Otherwise it will remain a zombie link: attached to the owner node structure, in mode LINK_STOPPED, and permanently blocking any re- establishing of the link to the peer via the interface in question. We fix this by amending the failover abort procedure. Apart from resetting the link to non-failover state, we test if the link is also in LINK_STOPPED mode. If so, we delete it, using the conditional tipc_link_delete() function introduced in the previous commit. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:09:31 -08:00
Jon Paul Maloy	2d72d49553	tipc: add reference count to struct tipc_link When a bearer is disabled, all pertaining links will be reset and deleted. However, if there is a second active link towards a killed link's destination, the delete has to be postponed until the failover is finished. During this interval, we currently put the link in zombie mode, i.e., we take it out of traffic, delete its timer, but leave it attached to the owner node structure until all missing packets have been received. When this is done, we detach the link from its node and delete it, assuming that the synchronous timer deletion that was initiated earlier in a different thread has finished. This is unsafe, as the failover may finish before del_timer_sync() has returned in the other thread. We fix this by adding an atomic reference counter of type kref in struct tipc_link. The counter keeps track of the references kept to the link by the owner node and the timer. We then do a conditional delete, based on the reference counter, both after the failover has been finished and when the timer expires, if applicable. Whoever comes last, will actually delete the link. This approach also implies that we can make the deletion of the timer asynchronous. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:09:31 -08:00
Sasha Levin	db27ebb111	net: rds: use correct size for max unacked packets and bytes Max unacked packets/bytes is an int while sizeof(long) was used in the sysctl table. This means that when they were getting read we'd also leak kernel memory to userspace along with the timeout values. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:07:27 -08:00
David S. Miller	940288b6a5	Last round of updates for net-next: * revert a patch that caused a regression with mesh userspace (Bob) * fix a number of suspend/resume related races (from Emmanuel, Luca and myself - we'll look at backporting later) * add software implementations for new ciphers (Jouni) * add a new ACPI ID for Broadcom's rfkill (Mika) * allow using netns FD for wireless (Vadim) * some other cleanups (various) -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJU0NEaAAoJEDBSmw7B7bqr2DAP/3nbk6WB2Lz4Vwi8fh9C4y6X 4ZzN2v7NHaimC+Wpxg3wP2wCdX2VG3ZWwF3yLm6qgVHGnE35RFLxURen1eqNeW77 Yf5gvZ066nCGEQ8l8J6YK9vrLX4qp5c4lyE00bbxpZTA4Qq71SgTg+rmGC1be8uX vacfaLwfDWffuOOnjBAPfanj7f4AQaUEY2uN1WkBFC7iEeOtPcWVkHAFVIeGjJfQ vfgQJcwOjgWjYbwZdQQi7Aj+k1Yzda4pg1yEWn3CkZ8zyeCYGs1gk2ovoBgPgXBP yT9ypIdFsN242VJvy7nkFnCKA8mhKyltMQ1Xjs0Q9lAxWdaq9U+iqt5cUn4jxVIG T9Vi3PCbx/nVOqcfR81dBTQ3uDU5AyosPsmh2YTxi5lpRBrsjNY2FtaKE0sm2Om3 wRiSPOdPrXBeEnU0KssI9e6euXgS4JQV78Naq85OWZDd2yZ1fT5U2fi8y4drRGlz rucbbobdVhQch5L4FStPz1uW5pNuJrhekXeZIE8MruTNg2A2oBAK3ApO7hxn68sE RnbAnkxVLwgedC9042JF5eiS1PDIU46w4e782j/+XskVKMEakqd23iJycx3tmgHZ cxDi/qKZ2RCE74YsT61o/9ErSVPvCfNPL3+CVov918jidQQg2WHfKm/jFlIxHIxk 4wBP7p2VGlENMgw/R8GJ =wOaR -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-02-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Last round of updates for net-next: * revert a patch that caused a regression with mesh userspace (Bob) * fix a number of suspend/resume related races (from Emmanuel, Luca and myself - we'll look at backporting later) * add software implementations for new ciphers (Jouni) * add a new ACPI ID for Broadcom's rfkill (Mika) * allow using netns FD for wireless (Vadim) * some other cleanups (various) Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 14:57:45 -08:00
David S. Miller	45e826fd57	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-02-03 Here's what's likely the last bluetooth-next pull request for 3.20. Notable changes include: - xHCI workaround + a new id for the ath3k driver - Several new ids for the btusb driver - Support for new Intel Bluetooth controllers - Minor cleanups to ieee802154 code - Nested sleep warning fix in socket accept() code path - Fixes for Out of Band pairing handling - Support for LE scan restarting for HCI_QUIRK_STRICT_DUPLICATE_FILTER - Improvements to data we expose through debugfs - Proper handling of Hardware Error HCI events Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 13:56:37 -08:00
Tom Herbert	dcdc899469	net: add skb functions to process remote checksum offload This patch adds skb_remcsum_process and skb_gro_remcsum_process to perform the appropriate adjustments to the skb when receiving remote checksum offload. Updated vxlan and gue to use these functions. Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did not see any change in performance. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 13:54:07 -08:00
Siva Mannem	9a05dde59a	bridge: Let bridge not age 'externally' learnt FDB entries, they are removed when 'external' entity notifies the aging When 'learned_sync' flag is turned on, the offloaded switch port syncs learned MAC addresses to bridge's FDB via switchdev notifier (NETDEV_SWITCH_FDB_ADD). Currently, FDB entries learnt via this mechanism are wrongly being deleted by bridge aging logic. This patch ensures that FDB entries synced from offloaded switch ports are not deleted by bridging logic. Such entries can only be deleted via switchdev notifier (NETDEV_SWITCH_FDB_DEL). Signed-off-by: Siva Mannem <siva.mannem.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 13:51:10 -08:00
Eric Dumazet	2bd82484bb	xps: fix xps for stacked devices A typical qdisc setup is the following : bond0 : bonding device, using HTB hierarchy eth1/eth2 : slaves, multiqueue NIC, using MQ + FQ qdisc XPS allows to spread packets on specific tx queues, based on the cpu doing the send. Problem is that dequeues from bond0 qdisc can happen on random cpus, due to the fact that qdisc_run() can dequeue a batch of packets. CPUA -> queue packet P1 on bond0 qdisc, P1->ooo_okay=1 CPUA -> queue packet P2 on bond0 qdisc, P2->ooo_okay=0 CPUB -> dequeue packet P1 from bond0 enqueue packet on eth1/eth2 CPUC -> dequeue packet P2 from bond0 enqueue packet on eth1/eth2 using sk cache (ooo_okay is 0) get_xps_queue() then might select wrong queue for P1, since current cpu might be different than CPUA. P2 might be sent on the old queue (stored in sk->sk_tx_queue_mapping), if CPUC runs a bit faster (or CPUB spins a bit on qdisc lock) Effect of this bug is TCP reorders, and more generally not optimal TX queue placement. (A victim bulk flow can be migrated to the wrong TX queue for a while) To fix this, we have to record sender cpu number the first time dev_queue_xmit() is called for one tx skb. We can union napi_id (used on receive path) and sender_cpu, granted we clear sender_cpu in skb_scrub_packet() (credit to Willem for this union idea) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 13:02:54 -08:00
Christophe Ricard	fa00e8fed4	NFC: nci: Move NFCEE discovery logic NFCEE_DISCOVER_CMD is a specified NCI command used to discover NFCEE IDs. Move nci_nfcee_discover() call to nci_discover_se() in order to guarantee: - NFCEE_DISCOVER_CMD run when the NCI state machine is initialized - NFCEE_DISCOVER_CMD is not run in case there is not discover_se hook defined by a NFC device driver. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-04 09:15:18 +01:00
Christophe Ricard	15d4a8da0e	NFC: nci: Move logical connection structure allocation conn_info is currently allocated only after nfcee_discovery_ntf which is not generic enough for logical connection other than NFCEE. The corresponding conn_info is now created in nci_core_conn_create_rsp(). Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-04 09:14:09 +01:00
Christophe Ricard	3ba5c8466b	NFC: nci: Change credits field to credits_cnt For consistency sake change nci_core_conn_create_rsp structure credits field to credits_cnt. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-04 09:13:15 +01:00
Christophe Ricard	b16ae7160a	NFC: nci: Support all destinations type when creating a connection The current implementation limits nci_core_conn_create_req() to only manage NCI_DESTINATION_NFCEE. Add new parameters to nci_core_conn_create() to support all destination types described in the NCI specification. Because there are some parameters with variable size dynamic buffer allocation is needed. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-04 09:10:50 +01:00
Christophe Ricard	12bdf27d46	NFC: nci: Add reference to the RF logical connection The NCI_STATIC_RF_CONN_ID logical connection is the most used connection. Keeping it directly accessible in the nci_dev structure will simplify and optimize the access. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-04 09:09:53 +01:00
Vlad Yasevich	0508c07f5e	ipv6: Select fragment id during UFO segmentation if not set. If the IPv6 fragment id has not been set and we perform fragmentation due to UFO, select a new fragment id. We now consider a fragment id of 0 as unset and if id selection process returns 0 (after all the pertrubations), we set it to 0x80000000, thus giving us ample space not to create collisions with the next packet we may have to fragment. When doing UFO integrity checking, we also select the fragment id if it has not be set yet. This is stored into the skb_shinfo() thus allowing UFO to function correclty. This patch also removes duplicate fragment id generation code and moves ipv6_select_ident() into the header as it may be used during GSO. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-03 23:06:43 -08:00
Al Viro	8ae5e030f3	net: switch sockets to ->read_iter/->write_iter Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:15 -05:00
Al Viro	6d65233020	net/socket.c: fold do_sock_{read,write} into callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:15 -05:00
Al Viro	31a25fae85	net: bury net/core/iovec.c - nothing in there is used anymore Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:15 -05:00
Al Viro	f25dcc7687	tipc: tipc ->sendmsg() conversion This one needs to copy the same data from user potentially more than once. Sadly, MTU changes can trigger that ;-/ Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:15 -05:00
Al Viro	21226abb4e	net: switch memcpy_fromiovec()/memcpy_fromiovecend() users to copy_from_iter() That takes care of the majority of ->sendmsg() instances - most of them via memcpy_to_msg() or assorted getfrag() callbacks. One place where we still keep memcpy_fromiovecend() is tipc - there we potentially read the same data over and over; separate patch, that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:15 -05:00
Al Viro	57be5bdad7	ip: convert tcp_sendmsg() to iov_iter primitives patch is actually smaller than it seems to be - most of it is unindenting the inner loop body in tcp_sendmsg() itself... the bit in tcp_input.c is going to get reverted very soon - that's what memcpy_from_msg() will become, but not in this commit; let's keep it reasonably contained... There's one potentially subtle change here: in case of short copy from userland, mainline tcp_send_syn_data() discards the skb it has allocated and falls back to normal path, where we'll send as much as possible after rereading the same data again. This patch trims SYN+data skb instead - that way we don't need to copy from the same place twice. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:14 -05:00
Al Viro	cacdc7d2f9	ip: stash a pointer to msghdr in struct ping_fakehdr ... instead of storing its ->mgs_iter.iov there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:14 -05:00
Al Viro	2e90b1c45e	rxrpc: make the users of rxrpc_kernel_send_data() set kvec-backed msg_iter properly Use iov_iter_kvec() there, get rid of set_fs() games - now that rxrpc_send_data() uses iov_iter primitives, it'll handle ITER_KVEC just fine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:14 -05:00
Al Viro	af2b040e47	rxrpc: switch rxrpc_send_data() to iov_iter primitives Convert skb_add_data() to iov_iter; allows to get rid of the explicit messing with iovec in its only caller - skb_add_data() will keep advancing ->msg_iter for us, so there's no need to similate that manually. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:14 -05:00
Al Viro	4c946d9c11	vmci: propagate msghdr all way down to __qp_memcpy_to_queue() Switch from passing msg->iov_iter.iov to passing msg itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:14 -05:00
Al Viro	c3c1a7dbe2	ipv6: rawv6_send_hdrinc(): pass msghdr Switch from passing msg->iov_iter.iov to passing msg itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:14 -05:00
Al Viro	7ae9abfd9d	ipv4: raw_send_hdrinc(): pass msghdr Switch from passing msg->iov_iter.iov to passing msg itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:13 -05:00
Al Viro	a8866ff6a5	netlink: make the check for "send from tx_ring" deterministic As it is, zero msg_iovlen means that the first iovec in the kernel array of iovecs is left uninitialized, so checking if its ->iov_base is NULL is random. Since the real users of that thing are doing sendto(fd, NULL, 0, ...), they are getting msg_iovlen = 1 and msg_iov[0] = {NULL, 0}, which is what this test is trying to catch. As suggested by davem, let's just check that msg_iovlen was 1 and msg_iov[0].iov_base was NULL - _that_ is well-defined and it catches what we want to catch. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:13 -05:00
Markus Elfring	4de46d5ebc	netlabel: Less function calls in netlbl_mgmt_add_common() after error detection The functions "cipso_v4_doi_putdef" and "kfree" could be called in some cases by the netlbl_mgmt_add_common() function during error handling even if the passed variables contained still a null pointer. * This implementation detail could be improved by adjustments for jump labels. * Let us return immediately after the first failed function call according to the current Linux coding style convention. * Let us delete also an unnecessary check for the variable "entry" there. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-03 16:22:13 -08:00
Markus Elfring	7a11b1d303	netlabel: Deletion of an unnecessary check before the function call "cipso_v4_doi_free" The cipso_v4_doi_free() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-03 16:22:12 -08:00
Markus Elfring	79b7cf60e1	netlabel: Deletion of an unnecessary check before the function call "cipso_v4_doi_putdef" The cipso_v4_doi_putdef() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-03 16:22:12 -08:00
Trond Myklebust	03a9a42a1a	SUNRPC: NULL utsname dereference on NFS umount during namespace cleanup Fix an Oopsable condition when nsm_mon_unmon is called as part of the namespace cleanup, which now apparently happens after the utsname has been freed. Link: http://lkml.kernel.org/r/20150125220604.090121ae@neptune.home Reported-by: Bruno Prémont <bonbons@linux-vserver.org> Cc: stable@vger.kernel.org # 3.18 Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-02-03 16:40:17 -05:00
Trond Myklebust	e2c63e091e	Merge branch 'flexfiles' * flexfiles: (53 commits) pnfs: lookup new lseg at lseg boundary nfs41: .init_read and .init_write can be called with valid pg_lseg pnfs: Update documentation on the Layout Drivers pnfs/flexfiles: Add the FlexFile Layout Driver nfs: count DIO good bytes correctly with mirroring nfs41: wait for LAYOUTRETURN before retrying LAYOUTGET nfs: add a helper to set NFS_ODIRECT_RESCHED_WRITES to direct writes nfs41: add NFS_LAYOUT_RETRY_LAYOUTGET to layout header flags nfs/flexfiles: send layoutreturn before freeing lseg nfs41: introduce NFS_LAYOUT_RETURN_BEFORE_CLOSE nfs41: allow async version layoutreturn nfs41: add range to layoutreturn args pnfs: allow LD to ask to resend read through pnfs nfs: add nfs_pgio_current_mirror helper nfs: only reset desc->pg_mirror_idx when mirroring is supported nfs41: add a debug warning if we destroy an unempty layout pnfs: fail comparison when bucket verifier not set nfs: mirroring support for direct io nfs: add mirroring support to pgio layer pnfs: pass ds_commit_idx through the commit path ... Conflicts: fs/nfs/pnfs.c fs/nfs/pnfs.h	2015-02-03 16:01:27 -05:00
Weston Andros Adamson	840210fc48	sunrpc: add rpc_count_iostats_idx Add a call to tally stats for a task under a different statsidx than what's contained in the task structure. This is needed to properly account for pnfs reads/writes when the DS nfs version != the MDS version. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>	2015-02-03 11:06:38 -08:00
Trond Myklebust	cc3ea893cb	NFS: Client side changes for RDMA These patches improve the scalability of the NFSoRDMA client and take large variables off of the stack. Additionally, the GFP_* flags are updated to match what TCP uses. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJUy/21AAoJENfLVL+wpUDrDR4QAJ+aZ1LqO5muJG5LDIReh0p+ IyvAO3ekWKaaXZd5e/wOviHRRSBCs3oRBm1sNyjOOZYf7nl67RrndjGq2Ccp6tGw 6V6ligIlhu3QAApFiQmLfpDXQeDBysMfhEQMXDj8kVhAvn3SWwbT6q1Bb8tMZvnu lwSPYCuLAdtPzDs3j3Pt3jSrOD7gS06Vtumco1yS82YiLK86E7JiOLb9w5Ltk0Wo DT9hZ7JXtflBy3trBsNYQ7GTjoYEp792Ca9DT/nie8/Tuuy3DhooLJKdwVErq/0/ ycI033mw//RneY+/eF2wLeS4PlzaiSq7DmXKpE5MnkpVuNAdPbHbCuVzuXa4/LLD NhKyw/NTAE9ucFJ8e3apj3m42O00bMV0rx0rOUvMMdbNPBKzXk3rrb4uRYfT4k+D yss7aEZg0EcDbe9KqfuDdwXVJIZnvAvXD+V1iL4X3QmVM0JyJZouNZPoG7fCwA1g uoqGJK9zu7osr59YMomf2okWQFUQYCea5ombXYOo3ZtANw+OYaHDcMKTXaiCtMdA CZUEOND8mB/EFtgTMA2TZmDzch8iOOuD7wmtifb8h1DqN/3GxgecBcT5XMCx5zDR bMpt9vbCfe4YP6idpl5XEnVbcPG9CZF2W2Hg2gI7axX8R+ZMvwQQlWJjOqxoR6+F bpD/jjWFYaS6U7NGyylM =YCZe -----END PGP SIGNATURE----- Merge tag 'nfs-rdma-for-3.20' of git://git.linux-nfs.org/projects/anna/nfs-rdma NFS: Client side changes for RDMA These patches improve the scalability of the NFSoRDMA client and take large variables off of the stack. Additionally, the GFP_* flags are updated to match what TCP uses. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> * tag 'nfs-rdma-for-3.20' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (21 commits) xprtrdma: Update the GFP flags used in xprt_rdma_allocate() xprtrdma: Clean up after adding regbuf management xprtrdma: Allocate zero pad separately from rpcrdma_buffer xprtrdma: Allocate RPC/RDMA receive buffer separately from struct rpcrdma_rep xprtrdma: Allocate RPC/RDMA send buffer separately from struct rpcrdma_req xprtrdma: Allocate RPC send buffer separately from struct rpcrdma_req xprtrdma: Add struct rpcrdma_regbuf and helpers xprtrdma: Refactor rpcrdma_buffer_create() and rpcrdma_buffer_destroy() xprtrdma: Simplify synopsis of rpcrdma_buffer_create() xprtrdma: Take struct ib_qp_attr and ib_qp_init_attr off the stack xprtrdma: Take struct ib_device_attr off the stack xprtrdma: Free the pd if ib_query_qp() fails xprtrdma: Remove rpcrdma_ep::rep_func and ::rep_xprt xprtrdma: Move credit update to RPC reply handler xprtrdma: Remove rl_mr field, and the mr_chunk union xprtrdma: Remove rpcrdma_ep::rep_ia xprtrdma: Rename "xprt" and "rdma_connect" fields in struct rpcrdma_xprt xprtrdma: Clean up hdrlen xprtrdma: Display XIDs in host byte order xprtrdma: Modernize htonl and ntohl ...	2015-02-03 11:54:58 -05:00
Mika Westerberg	79044f60ca	net: rfkill: Add Broadcom BCM2E40 bluetooth ACPI ID This is yet another Broadcom bluetooth chip with ACPI ID BCM2E40. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-02-03 14:38:22 +01:00
Johan Hedberg	88d9077c27	Bluetooth: Fix potential NULL dereference The bnep_get_device function may be triggered by an ioctl just after a connection has gone down. In such a case the respective L2CAP chan->conn pointer will get set to NULL (by l2cap_chan_del). This patch adds a missing NULL check for this case in the bnep_get_device() function. Reported-by: Patrik Flykt <patrik.flykt@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-03 09:02:12 +01:00
David S. Miller	3ae55826ae	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) Validate hooks for nf_tables NAT expressions, otherwise users can crash the kernel when using them from the wrong hook. We already got one user trapped on this when configuring masquerading. 2) Fix a BUG splat in nf_tables with CONFIG_DEBUG_PREEMPT=y. Reported by Andreas Schultz. 3) Avoid unnecessary reroute of traffic in the local input path in IPVS that triggers a crash in in xfrm. Reported by Florian Wiessner and fixes by Julian Anastasov. 4) Fix memory and module refcount leak from the error path of nf_tables_newchain(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:30:53 -08:00
Markus Elfring	7d37d0c159	net: sctp: Deletion of an unnecessary check before the function call "kfree" The kfree() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-By: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:29:43 -08:00
Vlad Yasevich	32dce968dd	ipv6: Allow for partial checksums on non-ufo packets Currntly, if we are not doing UFO on the packet, all UDP packets will start with CHECKSUM_NONE and thus perform full checksum computations in software even if device support IPv6 checksum offloading. Let's start start with CHECKSUM_PARTIAL if the device supports it and we are sending only a single packet at or below mtu size. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:28:05 -08:00
Vlad Yasevich	03485f2adc	udpv6: Add lockless sendmsg() support This commit adds the same functionaliy to IPv6 that commit `903ab86d19` Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Mar 1 02:36:48 2011 +0000 udp: Add lockless transmit path added to IPv4. UDP transmit path can now run without a socket lock, thus allowing multiple threads to send to a single socket more efficiently. This is only used when corking/MSG_MORE is not used. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:28:04 -08:00
Vlad Yasevich	d39d938c82	ipv6: Introduce udpv6_send_skb() Now that we can individually construct IPv6 skbs to send, add a udpv6_send_skb() function to populate the udp header and send the skb. This allows udp_v6_push_pending_frames() to re-use this function as well as enables us to add lockless sendmsg() support. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:28:04 -08:00
Vlad Yasevich	6422398c2a	ipv6: introduce ipv6_make_skb This commit is very similar to commit `1c32c5ad6f` Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Mar 1 02:36:47 2011 +0000 inet: Add ip_make_skb and ip_finish_skb It adds IPv6 version of the helpers ip6_make_skb and ip6_finish_skb. The job of ip6_make_skb is to collect messages into an ipv6 packet and poplulate ipv6 eader. The job of ip6_finish_skb is to transmit the generated skb. Together they replicated the job of ip6_push_pending_frames() while also provide the capability to be called independently. This will be needed to add lockless UDP sendmsg support. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:28:04 -08:00
Vlad Yasevich	0bbe84a67b	ipv6: Append sending data to arbitrary queue Add the ability to append data to arbitrary queue. This will be needed later to implement lockless UDP sends. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:28:04 -08:00
Vlad Yasevich	366e41d977	ipv6: pull cork initialization into its own function. Pull IPv6 cork initialization into its own function that can be re-used. IPv6 specific cork data did not have an explicit data structure. This patch creats eone so that just ipv6 cork data can be as arguemts. Also, since IPv6 tries to save the flow label into inet_cork_full tructure, pass the full cork. Adjust ip6_cork_release() to take cork data structures. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:28:04 -08:00
Florian Westphal	843c2fdf7a	net: dctcp: loosen requirement to assert ECT(0) during 3WHS One deployment requirement of DCTCP is to be able to run in a DC setting along with TCP traffic. As Glenn Judd's NSDI'15 paper "Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter" [1] (tba) explains, one way to solve this on switch side is to split DCTCP and TCP traffic in two queues per switch port based on the DSCP: one queue soley intended for DCTCP traffic and one for non-DCTCP traffic. For the DCTCP queue, there's the marking threshold K as explained in commit `e3118e8359` ("net: tcp: add DCTCP congestion control algorithm") for RED marking ECT(0) packets with CE. For the non-DCTCP queue, there's f.e. a classic tail drop queue. As already explained in `e3118e8359`, running DCTCP at scale when not marking SYN/SYN-ACK packets with ECT(0) has severe consequences as for non-ECT(0) packets, traversing the RED marking DCTCP queue will result in a severe reduction of connection probability. This is due to the DCTCP queue being dominated by ECT(0) traffic and switches handle non-ECT traffic in the RED marking queue after passing K as drops, where K is usually a low watermark in order to leave enough tailroom for bursts. Splitting DCTCP traffic among several queues (ECN and non-ECN queue) is being considered a terrible idea in the network community as it splits single flows across multiple network paths. Therefore, commit `e3118e8359` implements this on Linux as ECT(0) marked traffic, as we argue that marking all packets of a DCTCP flow is the only viable solution and also doesn't speak against the draft. However, recently, a DCTCP implementation for FreeBSD hit also their mainline kernel [2]. In order to let them play well together with Linux' DCTCP, we would need to loosen the requirement that ECT(0) has to be asserted during the 3WHS as not implemented in FreeBSD. This simplifies the ECN test and lets DCTCP work together with FreeBSD. Joint work with Daniel Borkmann. [1] https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/judd [2] `8ad8794452` Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Glenn Judd <glenn.judd@morganstanley.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 18:48:55 -08:00
Willem de Bruijn	b245be1f4d	net-timestamp: no-payload only sysctl Tx timestamps are looped onto the error queue on top of an skb. This mechanism leaks packet headers to processes unless the no-payload options SOF_TIMESTAMPING_OPT_TSONLY is set. Add a sysctl that optionally drops looped timestamp with data. This only affects processes without CAP_NET_RAW. The policy is checked when timestamps are generated in the stack. It is possible for timestamps with data to be reported after the sysctl is set, if these were queued internally earlier. No vulnerability is immediately known that exploits knowledge gleaned from packet headers, but it may still be preferable to allow administrators to lock down this path at the cost of possible breakage of legacy applications. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes (v1 -> v2) - test socket CAP_NET_RAW instead of capable(CAP_NET_RAW) (rfc -> v1) - document the sysctl in Documentation/sysctl/net.txt - fix access control race: read .._OPT_TSONLY only once, use same value for permission check and skb generation. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 18:46:51 -08:00
Willem de Bruijn	49ca0d8bfa	net-timestamp: no-payload option Add timestamping option SOF_TIMESTAMPING_OPT_TSONLY. For transmit timestamps, this loops timestamps on top of empty packets. Doing so reduces the pressure on SO_RCVBUF. Payload inspection and cmsg reception (aside from timestamps) are no longer possible. This works together with a follow on patch that allows administrators to only allow tx timestamping if it does not loop payload or metadata. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes (rfc -> v1) - add documentation - remove unnecessary skb->len test (thanks to Richard Cochran) Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 18:46:51 -08:00
Christophe Ricard	6095b0f07d	NFC: nci: Change NCI state machine to LISTEN_ACTIVE When receiving an interface activation notification, if the RF interface is NCI_RF_INTERFACE_NFCEE_DIRECT, we need to ignore the following parameters and change the NCI state machine to NCI_LISTEN_ACTIVE. According to the NCI specification, the parameters should be 0 and shall be ignored. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:41 +01:00
Christophe Ricard	a41bb8448e	NFC: nci: Add RF NFCEE action notification support The NFCC sends an NCI_OP_RF_NFCEE_ACTION_NTF notification to the host (DH) to let it know that for example an RF transaction with a payment reader is done. For now the notification handler is empty. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:41 +01:00
Christophe Ricard	447b27c4f2	NFC: Forward NFC_EVT_TRANSACTION to user space NFC_EVT_TRANSACTION is sent through netlink in order for a specific application running on a secure element to notify userspace of an event. Typically the secure element application counterpart on the host could interpret that event and act upon it. Forwarded information contains: - SE host generating the event - Application IDentifier doing the operation - Applications parameters Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:40 +01:00
Christophe Ricard	11f54f2286	NFC: nci: Add HCI over NCI protocol support According to the NCI specification, one can use HCI over NCI to talk with specific NFCEE. The HCI network is viewed as one logical NFCEE. This is needed to support secure element running HCI only firmwares embedded on an NCI capable chipset, like e.g. the st21nfcb. There is some duplication between this piece of code and the HCI core code, but the latter would need to be abstracted even more to be able to use NCI as a logical transport for HCP packets. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:40 +01:00
Christophe Ricard	736bb95774	NFC: nci: Support logical connections management In order to communicate with an NFCEE, we need to open a logical connection to it, by sending the NCI_OP_CORE_CONN_CREATE_CMD command to the NFCC. It's left up to the drivers to decide when to close an already opened logical connection. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:39 +01:00
Christophe Ricard	f7f793f313	NFC: nci: Add NFCEE enabling and disabling support NFCEEs can be enabled or disabled by sending the NCI_OP_NFCEE_MODE_SET_CMD command to the NFCC. This patch provides an API for drivers to enable and disable e.g. their NCI discoveredd secure elements. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:39 +01:00
Christophe Ricard	af9c8aa67d	NFC: nci: Add NFCEE discover support NFCEEs (NFC Execution Environment) have to be explicitly discovered by sending the NCI_OP_NFCEE_DISCOVER_CMD command. The NFCC will respond to this command by telling us how many NFCEEs are connected to it. Then the NFCC sends a notification command for each and every NFCEE connected. Here we implement support for sending NCI_OP_NFCEE_DISCOVER_CMD command, receiving the response and the potential notifications. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:38 +01:00
Christophe Ricard	4aeee6871e	NFC: nci: Add dynamic logical connections support The current NCI core only support the RF static connection. For other NFC features such as Secure Element communication, we may need to create logical connections to the NFCEE (Execution Environment. In order to track each logical connection ID dynamically, we add a linked list of connection info pointers to the nci_dev structure. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-02-02 21:50:31 +01:00
Johan Hedberg	66f096f791	Bluetooth: Remove mgmt_rp_read_local_oob_ext_data struct This extended return parameters struct conflicts with the new Read Local OOB Extended Data command definition. To avoid the conflict simply rename the old "extended" version to the normal one and update the code appropriately to take into account the two possible response PDU sizes. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-02 18:27:56 +01:00
J. Bruce Fields	a584143b01	Merge branch 'locks-3.20' of git://git.samba.org/jlayton/linux into for-3.20 Christoph's block pnfs patches have some minor dependencies on these lock patches.	2015-02-02 11:29:29 -05:00
Jakub Pawlowski	4b0e0ceddf	Bluetooth: Add restarting to service discovery When using LE_SCAN_FILTER_DUP_ENABLE, some controllers would send advertising report from each LE device only once. That means that we don't get any updates on RSSI value, and makes Service Discovery very slow. This patch adds restarting scan when in Service Discovery, and device with filtered uuid is found, but it's not in RSSI range to send event yet. This way if device moves into range, we will quickly get RSSI update. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-02 08:52:34 +01:00
Jakub Pawlowski	2d28cfe7aa	Bluetooth: Add le_scan_restart work for LE scan restarting Currently there is no way to restart le scan, and it's needed in service scan method. The way it work: it disable, and then enable le scan on controller. During the restart, we must remember when the scan was started, and it's duration, to later re-schedule the le_scan_disable work, that was stopped during the stop scan phase. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-02-02 08:52:33 +01:00
Roopa Prabhu	68e331c785	bridge: offload bridge port attributes to switch asic if feature flag set This patch adds support to set/del bridge port attributes in hardware from the bridge driver. With this, when the user sends a bridge setlink message with no flags or master flags set, - the bridge driver ndo_bridge_setlink handler sets settings in the kernel - calls the swicthdev api to propagate the attrs to the switchdev hardware You can still use the self flag to go to the switch hw or switch port driver directly. With this, it also makes sure a notification goes out only after the attributes are set both in the kernel and hw. The patch calls switchdev api only if BRIDGE_FLAGS_SELF is not set. This is because the offload cases with BRIDGE_FLAGS_SELF are handled in the caller (in rtnetlink.c). Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-01 23:16:34 -08:00
Roopa Prabhu	8a44dbb202	swdevice: add new apis to set and del bridge port attributes This patch adds two new api's netdev_switch_port_bridge_setlink and netdev_switch_port_bridge_dellink to offload bridge port attributes to switch port (The names of the apis look odd with 'switch_port_bridge', but am more inclined to change the prefix of the api to something else. Will take any suggestions). The api's look at the NETIF_F_HW_SWITCH_OFFLOAD feature flag to pass bridge port attributes to the port device. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-01 23:16:34 -08:00
Roopa Prabhu	add511b382	bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink bridge flags are needed inside ndo_bridge_setlink/dellink handlers to avoid another call to parse IFLA_AF_SPEC inside these handlers This is used later in this series Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-01 23:16:33 -08:00
Eric Dumazet	bdbbb8527b	ipv4: tcp: get rid of ugly unicast_sock In commit `be9f4a44e7` ("ipv4: tcp: remove per net tcp_sock") I tried to address contention on a socket lock, but the solution I chose was horrible : commit `3a7c384ffd` ("ipv4: tcp: unicast_sock should not land outside of TCP stack") addressed a selinux regression. commit `0980e56e50` ("ipv4: tcp: set unicast_sock uc_ttl to -1") took care of another regression. commit `b5ec8eeac4` ("ipv4: fix ip_send_skb()") fixed another regression. commit `811230cd85` ("tcp: ipv4: initialize unicast_sock sk_pacing_rate") was another shot in the dark. Really, just use a proper socket per cpu, and remove the skb_orphan() call, to re-enable flow control. This solves a serious problem with FQ packet scheduler when used in hostile environments, as we do not want to allocate a flow structure for every RST packet sent in response to a spoofed packet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-01 23:06:19 -08:00
Marcel Holtmann	bf21d7931a	Bluetooth: Fix OOB data present for BR/EDR Secure Connections Only mode When using Secure Connections Only mode, then only P-256 OOB data is valid and should be provided. In case userspace provides P-192 and P-256 OOB data, then the P-192 values will be set to zero. However the present value of the IO capability exchange still mentioned that both values would be available. Fix this by telling the controller clearly that only the P-256 OOB data is present. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-01 11:52:54 +02:00
Marcel Holtmann	6858bcd073	Bluetooth: Expose remote OOB information as debugfs entry For debugging purposes it is good to know which OOB data is actually currently loaded for each controller. So expose that list via debugfs. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-01 09:15:21 +02:00
Marcel Holtmann	5789f37cbc	Bluetooth: Expose hardware error code as debugfs entry When the Hardware Error event is send by the controller, the Bluetooth core stores the error code. Expose it via debugfs so it can be retrieved later on. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-01 09:14:55 +02:00
Marcel Holtmann	0886aea6ac	Bluetooth: Expose debug keys usage setting via debugfs To allow easier debugging when debug keys are generated, provide debugfs entry for checking the setting of debug keys usage. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-01 09:14:19 +02:00
Marcel Holtmann	c50b33c80e	Bluetooth: Track changes from HCI Write Simple Pairing Debug Mode command When the HCI Write Simple Pairing Debug Mode command has been issued, the result needs to be tracked and stored. The hdev->ssp_debug_mode variable is already present, but was never updated when the mode in the controller was actually changed. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-01 09:13:23 +02:00
Marcel Holtmann	6e07231a80	Bluetooth: Expose Secure Simple Pairing debug mode setting in debugfs The value of the ssp_debug_mode should be accessible via debugfs to be able to determine if a BR/EDR controller generates debugs keys or not. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-02-01 09:12:56 +02:00
Eric Dumazet	349c9e3c73	ipv4: icmp: use percpu allocation Get rid of nr_cpu_ids and use modern percpu allocation. Note that the sockets themselves are not yet allocated using NUMA affinity. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-31 17:48:18 -08:00
Kenneth Klette Jonassen	932eb7638a	tcp: use SACK RTTs for CC Current behavior only passes RTTs from sequentially acked data to CC. If sender gets a combined ACK for segment 1 and SACK for segment 3, then the computed RTT for CC is the time between sending segment 1 and receiving SACK for segment 3. Pass the minimum computed RTT from any acked data to CC, i.e. time between sending segment 3 and receiving SACK for segment 3. Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-31 17:25:37 -08:00
Marcel Holtmann	41bcfd50d5	Bluetooth: Allow remote OOB data to only provide P-192 or P-256 values In case the remote only provided P-192 or P-256 data for OOB pairing, then make sure that the data value pointers are correctly set. That way the core can provide correct information when remote OOB data present information have to be communicated. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-31 21:26:14 +01:00
Marcel Holtmann	4775a4ea14	Bluetooth: Fix OOB data present value for SMP pairing Before setting the OOB data present flag with SMP pairing, check the newly introduced present tracking that actual OOB data values have been provided. The existence of remote OOB data structure does not actually mean that the correct data values are available. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-31 21:26:14 +01:00
Marcel Holtmann	659c7fb084	Bluetooth: Fix OOB data present value for BR/EDR Secure Connections When BR/EDR Secure Connections has been enabled, the OOB data present value can take 2 additional values. The host has to clearly provide details about if P-192 OOB data, P-256 OOB data or a combination of P-192 and P-256 OOB data is present. In case BR/EDR Secure Connections is not enabled or not supported, then check that P-192 OOB data is actually present and return the correct value based on that. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-31 21:26:12 +01:00
Marcel Holtmann	f7697b1602	Bluetooth: Store OOB data present value for each set of remote OOB data Instead of doing complex calculation every time the OOB data is used, just calculate the OOB data present value and store it with the OOB data raw values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-31 09:59:45 +02:00
Nicholas Mc Guire	a994a09097	irda: use msecs_to_jiffies for conversions This is only an API consolidation and should make things more readable it replaces var * HZ / 1000 constructs by msecs_to_jiffies(var). Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-30 18:08:25 -08:00
Toshiaki Makita	d4bcef3fbe	net: Fix vlan_get_protocol for stacked vlan vlan_get_protocol() could not get network protocol if a skb has a 802.1ad vlan tag or multiple vlans, which caused incorrect checksum calculation in several drivers. Fix vlan_get_protocol() to retrieve network protocol instead of incorrect vlan protocol. As the logic is the same as skb_network_protocol(), create a common helper function __vlan_get_protocol() and call it from existing functions. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-30 18:03:47 -08:00
Daniel Borkmann	207895fd38	net: mark some potential candidates __read_mostly They are all either written once or extremly rarely (e.g. from init code), so we can move them to the .data..read_mostly section. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-30 17:58:39 -08:00
Saran Maruti Ramanara	cfbf654efc	net: sctp: fix passing wrong parameter header to param_type2af in sctp_process_param When making use of RFC5061, section 4.2.4. for setting the primary IP address, we're passing a wrong parameter header to param_type2af(), resulting always in NULL being returned. At this point, param.p points to a sctp_addip_param struct, containing a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation id. Followed by that, as also presented in RFC5061 section 4.2.4., comes the actual sctp_addr_param, which also contains a sctp_paramhdr, but this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that param_type2af() can make use of. Since we already hold a pointer to addr_param from previous line, just reuse it for param_type2af(). Fixes: `d6de309759` ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT") Signed-off-by: Saran Maruti Ramanara <saran.neti@telus.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-30 17:45:23 -08:00
Pablo Neira	8b7c36d810	netlink: fix wrong subscription bitmask to group mapping in The subscription bitmask passed via struct sockaddr_nl is converted to the group number when calling the netlink_bind() and netlink_unbind() callbacks. The conversion is however incorrect since bitmask (1 << 0) needs to be mapped to group number 1. Note that you cannot specify the group number 0 (usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP since this is rejected through -EINVAL. This problem became noticeable since `97840cb` ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") when binding to bitmask (1 << 0) in ctnetlink. Reported-by: Andre Tomt <andre@tomt.net> Reported-by: Ivan Delalande <colona@arista.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-30 17:43:47 -08:00
Patrick McHardy	4c1017aa80	netfilter: nft_lookup: add missing attribute validation for NFTA_LOOKUP_SET_ID Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-30 19:08:20 +01:00
Arturo Borrero	5191f4d82d	netfilter: nft_compat: add ebtables support This patch extends nft_compat to support ebtables extensions. ebtables verdict codes are translated to the ones used by the nf_tables engine, so we can properly use ebtables target extensions from nft_compat. This patch extends previous work by Giuseppe Longo <giuseppelng@gmail.com>. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-30 19:07:59 +01:00
Pablo Neira Ayuso	f5553c19ff	netfilter: nf_tables: fix leaks in error path of nf_tables_newchain() Release statistics and module refcount on memory allocation problems. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-30 18:42:08 +01:00
Chuck Lever	a0a1d50cd1	xprtrdma: Update the GFP flags used in xprt_rdma_allocate() Reflect the more conservative approach used in the socket transport's version of this transport method. An RPC buffer allocation should avoid forcing not just FS activity, but any I/O. In particular, two recent changes missed updating xprtrdma: - Commit `c6c8fe79a8` ("net, sunrpc: suppress allocation warning ...") - Commit `a564b8f039` ("nfs: enable swap on NFS") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 12:18:48 -05:00
Chuck Lever	df515ca7b3	xprtrdma: Clean up after adding regbuf management rpcrdma_{de}register_internal() are used only in verbs.c now. MAX_RPCRDMAHDR is no longer used and can be removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:49 -05:00
Chuck Lever	c05fbb5a59	xprtrdma: Allocate zero pad separately from rpcrdma_buffer Use the new rpcrdma_alloc_regbuf() API to shrink the amount of contiguous memory needed for a buffer pool by moving the zero pad buffer into a regbuf. This is for consistency with the other uses of internally registered memory. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:49 -05:00
Chuck Lever	6b1184cd4f	xprtrdma: Allocate RPC/RDMA receive buffer separately from struct rpcrdma_rep The rr_base field is currently the buffer where RPC replies land. An RPC/RDMA reply header lands in this buffer. In some cases an RPC reply header also lands in this buffer, just after the RPC/RDMA header. The inline threshold is an agreed-on size limit for RDMA SEND operations that pass from server and client. The sum of the RPC/RDMA reply header size and the RPC reply header size must be less than this threshold. The largest RDMA RECV that the client should have to handle is the size of the inline threshold. The receive buffer should thus be the size of the inline threshold, and not related to RPCRDMA_MAX_SEGS. RPC replies received via RDMA WRITE (long replies) are caught in rq_rcv_buf, which is the second half of the RPC send buffer. Ie, such replies are not involved in any way with rr_base. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:49 -05:00
Chuck Lever	85275c874e	xprtrdma: Allocate RPC/RDMA send buffer separately from struct rpcrdma_req The rl_base field is currently the buffer where each RPC/RDMA call header is built. The inline threshold is an agreed-on size limit to for RDMA SEND operations that pass between client and server. The sum of the RPC/RDMA header size and the RPC header size must be less than or equal to this threshold. Increasing the r/wsize maximum will require MAX_SEGS to grow significantly, but the inline threshold size won't change (both sides agree on it). The server's inline threshold doesn't change. Since an RPC/RDMA header can never be larger than the inline threshold, make all RPC/RDMA header buffers the size of the inline threshold. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:49 -05:00
Chuck Lever	0ca77dc372	xprtrdma: Allocate RPC send buffer separately from struct rpcrdma_req Because internal memory registration is an expensive and synchronous operation, xprtrdma pre-registers send and receive buffers at mount time, and then re-uses them for each RPC. A "hardway" allocation is a memory allocation and registration that replaces a send buffer during the processing of an RPC. Hardway must be done if the RPC send buffer is too small to accommodate an RPC's call and reply headers. For xprtrdma, each RPC send buffer is currently part of struct rpcrdma_req so that xprt_rdma_free(), which is passed nothing but the address of an RPC send buffer, can find its matching struct rpcrdma_req and rpcrdma_rep quickly via container_of / offsetof. That means that hardway currently has to replace a whole rpcrmda_req when it replaces an RPC send buffer. This is often a fairly hefty chunk of contiguous memory due to the size of the rl_segments array and the fact that both the send and receive buffers are part of struct rpcrdma_req. Some obscure re-use of fields in rpcrdma_req is done so that xprt_rdma_free() can detect replaced rpcrdma_req structs, and restore the original. This commit breaks apart the RPC send buffer and struct rpcrdma_req so that increasing the size of the rl_segments array does not change the alignment of each RPC send buffer. (Increasing rl_segments is needed to bump up the maximum r/wsize for NFS/RDMA). This change opens up some interesting possibilities for improving the design of xprt_rdma_allocate(). xprt_rdma_allocate() is now the one place where RPC send buffers are allocated or re-allocated, and they are now always left in place by xprt_rdma_free(). A large re-allocation that includes both the rl_segments array and the RPC send buffer is no longer needed. Send buffer re-allocation becomes quite rare. Good send buffer alignment is guaranteed no matter what the size of the rl_segments array is. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:49 -05:00
Chuck Lever	9128c3e794	xprtrdma: Add struct rpcrdma_regbuf and helpers There are several spots that allocate a buffer via kmalloc (usually contiguously with another data structure) and then register that buffer internally. I'd like to split the buffers out of these data structures to allow the data structures to scale. Start by adding functions that can kmalloc and register a buffer, and can manage/preserve the buffer's associated ib_sge and ib_mr fields. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:49 -05:00
Chuck Lever	1392402c40	xprtrdma: Refactor rpcrdma_buffer_create() and rpcrdma_buffer_destroy() Move the details of how to create and destroy rpcrdma_req and rpcrdma_rep structures into helper functions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	ac920d04a7	xprtrdma: Simplify synopsis of rpcrdma_buffer_create() Clean up: There is one call site for rpcrdma_buffer_create(). All of the arguments there are fields of an rpcrdma_xprt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	ce1ab9ab47	xprtrdma: Take struct ib_qp_attr and ib_qp_init_attr off the stack Reduce stack footprint of the connection upcall handler function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	7bc7972cdd	xprtrdma: Take struct ib_device_attr off the stack Device attributes are large, and are used in more than one place. Stash a copy in dynamically allocated memory. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	5ae711a246	xprtrdma: Free the pd if ib_query_qp() fails If ib_query_qp() fails or the memory registration mode isn't supported, don't leak the PD. An orphaned IB/core resource will cause IB module removal to hang. Fixes: `bd7ed1d133` ("RPC/RDMA: check selected memory registration ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	afadc468eb	xprtrdma: Remove rpcrdma_ep::rep_func and ::rep_xprt Clean up: The rep_func field always refers to rpcrdma_conn_func(). rep_func should have been removed by commit `b45ccfd25d` ("xprtrdma: Remove MEMWINDOWS registration modes"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	eba8ff660b	xprtrdma: Move credit update to RPC reply handler Reduce work in the receive CQ handler, which can be run at hardware interrupt level, by moving the RPC/RDMA credit update logic to the RPC reply handler. This has some additional benefits: More header sanity checking is done before trusting the incoming credit value, and the receive CQ handler no longer touches the RPC/RDMA header (the CPU stalls while waiting for the header contents to be brought into the cache). This further extends work begun by commit `e7ce710a88` ("xprtrdma: Avoid deadlock when credit window is reset"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	3eb3581066	xprtrdma: Remove rl_mr field, and the mr_chunk union Clean up: Since commit `0ac531c183` ("xprtrdma: Remove REGISTER memory registration mode"), the rl_mr pointer is no longer used anywhere. After removal, there's only a single member of the mr_chunk union, so mr_chunk can be removed as well, in favor of a single pointer field. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	5d410ba061	xprtrdma: Remove rpcrdma_ep::rep_ia Clean up: This field is not used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	5abefb861f	xprtrdma: Rename "xprt" and "rdma_connect" fields in struct rpcrdma_xprt Clean up: Use consistent field names in struct rpcrdma_xprt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	f2846481b4	xprtrdma: Clean up hdrlen Clean up: Replace naked integers with a documenting macro. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	052151a979	xprtrdma: Display XIDs in host byte order xprtsock.c and the backchannel code display XIDs in host byte order. Follow suit in xprtrdma. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	284f4902a6	xprtrdma: Modernize htonl and ntohl Clean up: Replace htonl and ntohl with the be32 equivalents. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:48 -05:00
Chuck Lever	8502427ccd	xprtrdma: human-readable completion status Make it easier to grep the system log for specific error conditions. The wc.opcode field is not included because opcode numbers are sparse, and because wc.opcode is not necessarily valid when completion reports an error. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-01-30 10:47:47 -05:00
Julian Anastasov	579eb62ac3	ipvs: rerouting to local clients is not needed anymore commit `f5a41847ac` ("ipvs: move ip_route_me_harder for ICMP") from 2.6.37 introduced ip_route_me_harder() call for responses to local clients, so that we can provide valid rt_src after SNAT. It was used by TCP to provide valid daddr for ip_send_reply(). After commit `0a5ebb8000` ("ipv4: Pass explicit daddr arg to ip_send_reply()." from 3.0 this rerouting is not needed anymore and should be avoided, especially in LOCAL_IN. Fixes 3.12.33 crash in xfrm reported by Florian Wiessner: "3.12.33 - BUG xfrm_selector_match+0x25/0x2f6" Reported-by: Smart Weblications GmbH - Florian Wiessner <f.wiessner@smart-weblications.de> Tested-by: Smart Weblications GmbH - Florian Wiessner <f.wiessner@smart-weblications.de> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2015-01-30 10:05:55 +09:00
Li Wei	3cdaa5be9e	ipv4: Don't increase PMTU with Datagram Too Big message. RFC 1191 said, "a host MUST not increase its estimate of the Path MTU in response to the contents of a Datagram Too Big message." Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-29 15:28:59 -08:00
Salam Noureddine	7866a62104	dev: add per net_device packet type chains When many pf_packet listeners are created on a lot of interfaces the current implementation using global packet type lists scales poorly. This patch adds per net_device packet type lists to fix this problem. The patch was originally written by Eric Biederman for linux-2.6.29. Tested on linux-3.16. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-29 14:41:39 -08:00
Nicolas Dichtel	7b4ce694b2	rtnetlink: pass link_net to the newlink handler When IFLA_LINK_NETNSID is used, the netdevice should be built in this link netns and moved at the end to another netns (pointed by the socket netns or IFLA_NET_NS_[PID\|FD]). Existing user of the newlink handler will use the netns argument (src_net) to find a link netdevice or to check some other information into the link netns. For example, to find a netdevice, two information are required: an ifindex (usually from IFLA_LINK) and a netns (this link netns). Note: when using IFLA_LINK_NETNSID and IFLA_NET_NS_[PID\|FD], a user may create a netdevice that stands in netnsX and with its link part in netnsY, by sending a rtnl message from netnsZ. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-29 14:23:25 -08:00
Nicolas Dichtel	8997c27ec4	caif: remove wrong dev_net_set() call src_net points to the netns where the netlink message has been received. This netns may be different from the netns where the interface is created (because the user may add IFLA_NET_NS_[PID\|FD]). In this case, src_net is the link netns. It seems wrong to override the netns in the newlink() handler because if it was not already src_net, it means that the user explicitly asks to create the netdevice in another netns. CC: Sjur Brændeland <sjur.brandeland@stericsson.com> CC: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Fixes: `8391c4aab1` ("caif: Bugfixes in CAIF netdevice for close and flow control") Fixes: `c412540063` ("caif-hsi: Add rtnl support") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-29 14:20:02 -08:00
Szymon Janc	ac363cf9eb	Bluetooth: Fix sending Read Remote Extended Features command This command should only be used if remote device reports that it supports extended features. Otherwise command will fail and connection will be dropped. Some devices support SSP but don't support extended features so current check for SSP support is not enought. Instead of checking for SSP support just check if both ends support Extended Feature. < HCI Command: Create Connection (0x01\|0x0005) plen 13 Address: D0:9C:30:00:19:6F (Foster Electric Company, Limited) Packet type: 0xcc18 DM1 may be used DH1 may be used DM3 may be used DH3 may be used DM5 may be used DH5 may be used Page scan repetition mode: R1 (0x01) Page scan mode: Mandatory (0x00) Clock offset: 0x94c8 Role switch: Allow slave (0x01) > HCI Event: Command Status (0x0f) plen 4 Create Connection (0x01\|0x0005) ncmd 1 Status: Success (0x00) > HCI Event: Connect Complete (0x03) plen 11 Status: Success (0x00) Handle: 5 Address: D0:9C:30:00:19:6F (Foster Electric Company, Limited) Link type: ACL (0x01) Encryption: Disabled (0x00) < HCI Command: Read Remote Supported Features (0x01\|0x001b) plen 2 Handle: 5 > HCI Event: Command Status (0x0f) plen 4 Read Remote Supported Features (0x01\|0x001b) ncmd 1 Status: Success (0x00) > HCI Event: Page Scan Repetition Mode Change (0x20) plen 7 Address: D0:9C:30:00:19:6F (Foster Electric Company, Limited) Page scan repetition mode: R1 (0x01) > HCI Event: Read Remote Supported Features (0x0b) plen 11 Status: Success (0x00) Handle: 5 Features: 0xff 0xff 0x8f 0xfe 0xdb 0xff 0x5b 0x07 3 slot packets 5 slot packets Encryption Slot offset Timing accuracy Role switch Hold mode Sniff mode Park state Power control requests Channel quality driven data rate (CQDDR) SCO link HV2 packets HV3 packets u-law log synchronous data A-law log synchronous data CVSD synchronous data Paging parameter negotiation Power control Transparent synchronous data Broadcast Encryption Enhanced Data Rate ACL 2 Mbps mode Enhanced Data Rate ACL 3 Mbps mode Enhanced inquiry scan Interlaced inquiry scan Interlaced page scan RSSI with inquiry results Extended SCO link (EV3 packets) EV4 packets EV5 packets AFH capable slave AFH classification slave LE Supported (Controller) 3-slot Enhanced Data Rate ACL packets 5-slot Enhanced Data Rate ACL packets Sniff subrating Pause encryption AFH capable master AFH classification master Enhanced Data Rate eSCO 2 Mbps mode Enhanced Data Rate eSCO 3 Mbps mode 3-slot Enhanced Data Rate eSCO packets Extended Inquiry Response Simultaneous LE and BR/EDR (Controller) Secure Simple Pairing Encapsulated PDU Non-flushable Packet Boundary Flag Link Supervision Timeout Changed Event Inquiry TX Power Level Enhanced Power Control < HCI Command: Read Remote Extended Features (0x01\|0x001c) plen 3 Handle: 5 Page: 1 > HCI Event: Command Status (0x0f) plen 4 Read Remote Extended Features (0x01\|0x001c) ncmd 1 Status: Command Disallowed (0x0c) < HCI Command: Read Clock Offset (0x01\|0x001f) plen 2 Handle: 5 > HCI Event: Command Status (0x0f) plen 4 Read Clock Offset (0x01\|0x001f) ncmd 1 Status: Success (0x00) < HCI Command: Disconnect (0x01\|0x0006) plen 3 Handle: 5 Reason: Remote User Terminated Connection (0x13) Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-29 16:59:53 +01:00
Eric Dumazet	811230cd85	tcp: ipv4: initialize unicast_sock sk_pacing_rate When I added sk_pacing_rate field, I forgot to initialize its value in the per cpu unicast_sock used in ip_send_unicast_reply() This means that for sch_fq users, RST packets, or ACK packets sent on behalf of TIME_WAIT sockets might be sent to slowly or even dropped once we reach the per flow limit. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `95bd09eb27` ("tcp: TSO packets automatic sizing") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 23:24:47 -08:00
Eric Dumazet	86b3bfe914	pkt_sched: fq: remove useless TIME_WAIT check TIME_WAIT sockets are not owning any skb. ip_send_unicast_reply() and tcp_v6_send_response() both use regular sockets. We can safely remove a test in sch_fq and save one cache line miss, as sk_state is far away from sk_pacing_rate. Tested at Google for about one year. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 23:23:57 -08:00
Arnd Bergmann	2dbce096ca	act_connmark: fix dependencies better NET_ACT_CONNMARK fails to build if NF_CONNTRACK_MARK is disabled, and `d7924450e1` ("act_connmark: Add missing dependency on NF_CONNTRACK_MARK") fixed that case, but missed the cased where NF_CONNTRACK is a loadable module. This adds the second dependency to ensure that NET_ACT_CONNMARK can only be built-in if NF_CONNTRACK is also part of the kernel rather than a loadable module. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 23:23:06 -08:00
Christoph Hellwig	7cc0566268	net: remove sock_iocb The sock_iocb structure is allocate on stack for each read/write-like operation on sockets, and contains various fields of which only the embedded msghdr and sometimes a pointer to the scm_cookie is ever used. Get rid of the sock_iocb and put a msghdr directly on the stack and pass the scm_cookie explicitly to netlink_mmap_sendmsg. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 23:15:07 -08:00
Jesse Gross	b8693877ae	openvswitch: Add support for checksums on UDP tunnels. Currently, it isn't possible to request checksums on the outer UDP header of tunnels - the TUNNEL_CSUM flag is ignored. This adds support for requesting that UDP checksums be computed on transmit and properly reported if they are present on receive. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 23:04:15 -08:00
David S. Miller	b8f8be3f04	NFC: 3.20 first pull request This is the first NFC pull request for 3.20. With this one we have: - Secure element support for the ST Micro st21nfca driver. This depends on a few HCI internal changes in order for example to support more than one secure element per controller. - ACPI support for NXP's pn544 HCI driver. This controller is found on many x86 SoCs and is typically enumerated on the ACPI bus there. - A few st21nfca and st21nfcb fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUyXy7AAoJEIqAPN1PVmxKOY4P/1WJtEZfzBOFMlh9qeA8YESv cS1xbZuAVeEJ3r/sgDc87iA/4MMNZDzfhDHQlQJs/pJWAXE3Am1/dZGWHJC6pbk/ roTlbEh5OvQU8cRIdAvOcEgrBIWk+E30Mkd2OtOMpWyhbgChN7hKz0KUs7znVHBJ G3YCcZOPr7K2ra78UlYzApvGqoxiVaiEWQyj6rzx2HbVzxzICL6A5m9cRcZuGxYR sK3Y/DpKJKGwH3p1kkLIOqHy3nGhq2LttVSXF/f5xkOB8teERSkt4i8aJBiSb9ym a++iAWp2syH3sGh2rsb3G4KRYCYq2J9mJD7oBB3G/6UoNwyeSgW3GdxgH/yxt6C9 KfzHm9T8jfvQIBlf4lGeboV8zu+ysQehJFNKtxdQDlvwqPiR14clT5JFejocr9+N SegCbF+LJaZW8boOZVvhxTxASpEZ0RTFwUIkKKxtXOpK4ha0s1gEAtuZUpTYmRtJ wGqds8wszkE0wOdZJi7dsCGpp5JNI0LZZCsuKa6Ko3E7LkTAor8Bbmob0RR51ClD srtoz6kJgoozAMRLMISXBimk0geOp38iGs26GPSpRHwSV/1loN6+abaOMYlGbDCq +oVpqMmJsS/z5US5+wEkWU+O4y9TS0O74d/TxxcwUM+CLNyKI+Ms1rMOyYi0domo 2xa7MuDCrmztQbs6ROKT =SNWp -----END PGP SIGNATURE----- Merge tag 'nfc-next-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next NFC: 3.20 first pull request This is the first NFC pull request for 3.20. With this one we have: - Secure element support for the ST Micro st21nfca driver. This depends on a few HCI internal changes in order for example to support more than one secure element per controller. - ACPI support for NXP's pn544 HCI driver. This controller is found on many x86 SoCs and is typically enumerated on the ACPI bus there. - A few st21nfca and st21nfcb fixes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 22:49:55 -08:00
Roopa Prabhu	59ccaaaa49	bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081 This patch avoids calling rtnl_notify if the device ndo_bridge_getlink handler does not return any bytes in the skb. Alternately, the skb->len check can be moved inside rtnl_notify. For the bridge vlan case described in 92081, there is also a fix needed in bridge driver to generate a proper notification. Will fix that in subsequent patch. v2: rebase patch on net tree Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 22:21:31 -08:00
Neal Cardwell	d6b1a8a92a	tcp: fix timing issue in CUBIC slope calculation This patch fixes a bug in CUBIC that causes cwnd to increase slightly too slowly when multiple ACKs arrive in the same jiffy. If cwnd is supposed to increase at a rate of more than once per jiffy, then CUBIC was sometimes too slow. Because the bic_target is calculated for a future point in time, calculated with time in jiffies, the cwnd can increase over the course of the jiffy while the bic_target calculated as the proper CUBIC cwnd at time t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp only increases on jiffy tick boundaries. So since the cnt is set to: ca->cnt = cwnd / (bic_target - cwnd); as cwnd increases but bic_target does not increase due to jiffy granularity, the cnt becomes too large, causing cwnd to increase too slowly. For example: - suppose at the beginning of a jiffy, cwnd=40, bic_target=44 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10 - suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai() increases cwnd to 41 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13 So now CUBIC will wait for 13 packets to be ACKed before increasing cwnd to 42, insted of 10 as it should. The fix is to avoid adjusting the slope (determined by ca->cnt) multiple times within a jiffy, and instead skip to compute the Reno cwnd, the "TCP friendliness" code path. Reported-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 22:18:38 -08:00
Neal Cardwell	9cd981dcf1	tcp: fix stretch ACK bugs in CUBIC Change CUBIC to properly handle stretch ACKs in additive increase mode by passing in the count of ACKed packets to tcp_cong_avoid_ai(). In addition, because we are now precisely accounting for stretch ACKs, including delayed ACKs, we can now remove the delayed ACK tracking and estimation code that tracked recent delayed ACK behavior in ca->delayed_ack. Reported-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 22:18:38 -08:00
Neal Cardwell	c22bdca947	tcp: fix stretch ACK bugs in Reno Change Reno to properly handle stretch ACKs in additive increase mode by passing in the count of ACKed packets to tcp_cong_avoid_ai(). In addition, if snd_cwnd crosses snd_ssthresh during slow start processing, and we then exit slow start mode, we need to carry over any remaining "credit" for packets ACKed and apply that to additive increase by passing this remaining "acked" count to tcp_cong_avoid_ai(). Reported-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 22:18:38 -08:00
Neal Cardwell	814d488c61	tcp: fix the timid additive increase on stretch ACKs tcp_cong_avoid_ai() was too timid (snd_cwnd increased too slowly) on "stretch ACKs" -- cases where the receiver ACKed more than 1 packet in a single ACK. For example, suppose w is 10 and we get a stretch ACK for 20 packets, so acked is 20. We ought to increase snd_cwnd by 2 (since acked/w = 20/10 = 2), but instead we were only increasing cwnd by 1. This patch fixes that behavior. Reported-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 22:18:37 -08:00
Neal Cardwell	e73ebb0881	tcp: stretch ACK fixes prep LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that cover more than the RFC-specified maximum of 2 packets. These stretch ACKs can cause serious performance shortfalls in common congestion control algorithms that were designed and tuned years ago with receiver hosts that were not using LRO or GRO, and were instead politely ACKing every other packet. This patch series fixes Reno and CUBIC to handle stretch ACKs. This patch prepares for the upcoming stretch ACK bug fix patches. It adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and changes all congestion control algorithms to pass in 1 for the ACKed count. It also changes tcp_slow_start() to return the number of packet ACK "credits" that were not processed in slow start mode, and can be processed by the congestion control module in additive increase mode. In future patches we will fix tcp_cong_avoid_ai() to handle stretch ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start and additive increase mode. Reported-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-28 22:18:37 -08:00
Marcel Holtmann	64dae967ca	Bluetooth: Move smp_unregister() into hci_dev_do_close() function The smp_unregister() function needs to be called every time the controller is powered down. There are multiple entry points when this can happen. One is "hciconfig hci0 reset" which will throw a WARN_ON when LE support has been enabled. [ 78.564620] WARNING: CPU: 0 PID: 148 at net/bluetooth/smp.c:3075 smp_register+0xf1/0x170() [ 78.564622] Modules linked in: [ 78.564628] CPU: 0 PID: 148 Comm: kworker/u3:1 Not tainted 3.19.0-rc4-devel+ #404 [ 78.564629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS [ 78.564635] Workqueue: hci0 hci_rx_work [ 78.564638] ffffffff81b4a7a2 ffff88001cb2fb38 ffffffff8161d881 0000000080000000 [ 78.564642] 0000000000000000 ffff88001cb2fb78 ffffffff8103b870 696e55206e6f6f6d [ 78.564645] ffff88001d965000 0000000000000000 0000000000000000 ffff88001d965000 [ 78.564648] Call Trace: [ 78.564655] [<ffffffff8161d881>] dump_stack+0x4f/0x7b [ 78.564662] [<ffffffff8103b870>] warn_slowpath_common+0x80/0xc0 [ 78.564667] [<ffffffff81544b00>] ? add_uuid+0x1f0/0x1f0 [ 78.564671] [<ffffffff8103b955>] warn_slowpath_null+0x15/0x20 [ 78.564674] [<ffffffff81562d81>] smp_register+0xf1/0x170 [ 78.564680] [<ffffffff81081236>] ? lock_timer_base.isra.30+0x26/0x50 [ 78.564683] [<ffffffff81544bf0>] powered_complete+0xf0/0x120 [ 78.564688] [<ffffffff8152e622>] hci_req_cmd_complete+0x82/0x260 [ 78.564692] [<ffffffff8153554f>] hci_cmd_complete_evt+0x6cf/0x2e20 [ 78.564697] [<ffffffff81623e43>] ? _raw_spin_unlock_irqrestore+0x13/0x30 [ 78.564701] [<ffffffff8106b0af>] ? __wake_up_sync_key+0x4f/0x60 [ 78.564705] [<ffffffff8153a2ab>] hci_event_packet+0xbcb/0x2e70 [ 78.564709] [<ffffffff814094d3>] ? skb_release_all+0x23/0x30 [ 78.564711] [<ffffffff81409529>] ? kfree_skb+0x29/0x40 [ 78.564715] [<ffffffff815296c8>] hci_rx_work+0x1c8/0x3f0 [ 78.564719] [<ffffffff8105bd91>] ? get_parent_ip+0x11/0x50 [ 78.564722] [<ffffffff8105be25>] ? preempt_count_add+0x55/0xb0 [ 78.564727] [<ffffffff8104f65f>] process_one_work+0x12f/0x360 [ 78.564731] [<ffffffff8104ff9b>] worker_thread+0x6b/0x4b0 [ 78.564735] [<ffffffff8104ff30>] ? cancel_delayed_work_sync+0x10/0x10 [ 78.564738] [<ffffffff810542fa>] kthread+0xea/0x100 [ 78.564742] [<ffffffff81620000>] ? __schedule+0x3e0/0x980 [ 78.564745] [<ffffffff81054210>] ? kthread_create_on_node+0x180/0x180 [ 78.564749] [<ffffffff816246ec>] ret_from_fork+0x7c/0xb0 [ 78.564752] [<ffffffff81054210>] ? kthread_create_on_node+0x180/0x180 [ 78.564755] ---[ end trace 8b0d943af76d3736 ]--- This warning is not critical and has only been placed in the code to actually catch this exact situation. To avoid triggering it move the smp_unregister() into hci_dev_do_close() which will now also take care of remove the SMP channel. It is safe to call this function since it only remove the channel if it has been previously registered. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-29 07:53:42 +02:00
Marcel Holtmann	c7741d16a5	Bluetooth: Perform a power cycle when receiving hardware error event When receiving a HCI Hardware Error event, the controller should be assumed to be non-functional until issuing a HCI Reset command. The Bluetooth hardware errors are vendor specific and so add a new hdev->hw_error callback that drivers can provide to run extra code to handle the hardware error. After completing the vendor specific error handling perform a full reset of the Bluetooth stack by closing and re-opening the transport. Based-on-patch-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-28 21:26:24 +01:00
Marcel Holtmann	5c912495b7	Bluetooth: Introduce hci_dev_do_reset helper function Split the hci_dev_reset ioctl handling into using hci_dev_do_reset helper function. Similar to what has been done with hci_dev_do_open and hci_dev_do_close. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-28 21:26:24 +01:00
Johan Hedberg	8f502f847a	Bluetooth: Fix notifying discovery state when powering off The discovery state should be set to stopped when the HCI device is powered off. This patch adds the appropriate call to the hci_discovery_set_state() function from hci_dev_do_close() which is responsible for the power-off procedure. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-28 21:26:23 +01:00
Johan Hedberg	39c5d970d4	Bluetooth: Fix notifying discovery state upon reset When HCI_Reset is issued the discovery state is assumed to be stopped. The hci_cc_reset() handler was trying to set the state but it was doing it without using the hci_discovery_set_state() function. Because of this e.g. the mgmt Discovering event could go without being sent. This patch fixes the code to use the hci_discovery_set_state() function instead of just blindly setting the state value. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-28 21:26:23 +01:00
Johan Hedberg	592002863a	Bluetooth: Fix check for SSP when enabling SC There's a check in set_secure_conn() that's supposed to ensure that SSP is enabled before we try to request the controller to enable SC (since SSP is a pre-requisite for it). However, this check only makes sense for controllers actually supporting BR/EDR SC. If we have a 4.0 controller we're only interested in the LE part of SC and should therefore not be requiring SSP to be enabled. This patch adds an additional condition to check for lmp_sc_capable(hdev) before requiring SSP to be enabled. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-28 21:26:22 +01:00
Marcel Holtmann	aa5b034565	Bluetooth: Check for P-256 OOB values in Secure Connections Only mode If Secure Connections Only mode has been enabled, the it is important to check that OOB data for P-256 values is provided. In case it is not, then tell the remote side that no OOB data is present. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-28 21:26:21 +01:00
Marcel Holtmann	a83ed81ef5	Bluetooth: Use helper function to determine BR/EDR OOB data present When replying to the IO capability request for Secure Simple Pairing and Secure Connections, the OOB data present fields needs to set. Instead of making the calculation inline, split this into a separate helper function. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-28 21:26:20 +01:00
Marcel Holtmann	6665d057fb	Bluetooth: Clear P-192 values for OOB when in Secure Connections Only mode When Secure Connections Only mode has been enabled and remote OOB data is requested, then only provide P-256 hash and randomizer vaulues. The fields for P-192 hash and randomizer should be set to zero. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-28 21:26:20 +01:00
Johan Hedberg	d25b78e2ed	Bluetooth: Enforce zero-valued hash/rand192 for LE OOB Until legacy SMP OOB pairing is implemented user space should be given a clear error when trying to use it. This patch adds a corresponding check to the Add Remote OOB Data handler function which returns "invalid parameters" if non-zero Rand192 or Hash192 parameters were given for an LE address. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-28 21:26:19 +01:00
David S. Miller	95f873f2ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: arch/arm/boot/dts/imx6sx-sdb.dts net/sched/cls_bpf.c Two simple sets of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 16:59:56 -08:00
Christophe Ricard	ec14b6c93c	NFC: hci: Remove nfc_hci_pipe2gate function With the newly introduced pipes table hci_dev fields, the nfc_hci_pipe2gate routine is no longer needed. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-28 00:03:36 +01:00
Christophe Ricard	8409e4283c	NFC: hci: Add cmd_received handler When a command is received, it is sometime needed to let the CLF driver do some additional operations. (ex: count remaining pipe notification...) Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-28 00:03:34 +01:00
Christophe Ricard	615b524aca	NFC: hci: Reference every pipe information according to notification We update the tracked pipes status when receiving HCI commands. Also we forward HCI errors and we reply to any HCI command, even though we don't support it. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-28 00:03:29 +01:00
Christophe Ricard	af77522320	NFC: hci: Change nfc_hci_send_response gate parameter to pipe As there can be several pipes connected to the same gate, we need to know which pipe ID to use when sending an HCI response. A gate ID is not enough. Instead of changing the nfc_hci_send_response() API to something not aligned with the rest of the HCI API, we call nfc_hci_hcp_message_tx directly. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-27 23:55:20 +01:00
Christophe Ricard	118278f20a	NFC: hci: Add pipes table to reference them with a tuple {gate, host} In order to keep host source information on specific hci event (such as evt_connectivity or evt_transaction) and because 2 pipes can be connected to the same gate, it is necessary to add a table referencing every pipe with a {gate, host} tuple. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-27 23:39:32 +01:00
Christophe Ricard	fda7a49cb9	NFC: hci: Change event_received handler gate parameter to pipe Several pipes may point to the same CLF gate, so getting the gate ID as an input is not enough. For example dual secure element may have 2 pipes (1 for uicc and 1 for eSE) pointing to the connectivity gate. As resolving gate and host IDs can be done from a pipe, we now pass the pipe ID to the event received handler. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-27 23:39:23 +01:00
Christoph Hellwig	06539d3071	net: don't OOPS on socket aio Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 12:25:33 -08:00
Jouni Malinen	8ade538bf3	mac80111: Add BIP-GMAC-128 and BIP-GMAC-256 ciphers This allows mac80211 to configure BIP-GMAC-128 and BIP-GMAC-256 to the driver and also use software-implementation within mac80211 when the driver does not support this with hardware accelaration. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-27 11:10:13 +01:00
Jouni Malinen	56c52da2d5	mac80111: Add BIP-CMAC-256 cipher This allows mac80211 to configure BIP-CMAC-256 to the driver and also use software-implementation within mac80211 when the driver does not support this with hardware accelaration. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-27 11:09:13 +01:00
Jouni Malinen	2b2ba0db1c	mac80111: Add CCMP-256 cipher This allows mac80211 to configure CCMP-256 to the driver and also use software-implementation within mac80211 when the driver does not support this with hardware accelaration. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> [squash ccmp256 -> mic_len argument change] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-27 11:07:35 +01:00
Jouni Malinen	00b9cfa3ff	mac80111: Add GCMP and GCMP-256 ciphers This allows mac80211 to configure GCMP and GCMP-256 to the driver and also use software-implementation within mac80211 when the driver does not support this with hardware accelaration. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> [remove a spurious newline] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-27 11:06:09 +01:00
Jouni Malinen	cfcf1682c4	cfg80211: Add new GCMP, CCMP-256, BIP-GMAC, BIP-CMAC-256 ciphers This makes cfg80211 aware of the GCMP, GCMP-256, CCMP-256, BIP-GMAC-128, BIP-GMAC-256, and BIP-CMAC-256 cipher suites. These new cipher suites were defined in IEEE Std 802.11ac-2013. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-27 11:04:57 +01:00
Jouni Malinen	37720569cc	cfg80211: Fix BIP (AES-CMAC) cipher validation This cipher can be used only as a group management frame cipher and as such, there is no point in validating that it is not used with non-zero key-index. Instead, verify that it is not used as a pairwise cipher regardless of the key index. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> [change code to use switch statement which is easier to extend] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-27 11:03:41 +01:00
Luciano Coelho	9120d94e8f	mac80211: handle potential race between suspend and scan completion If suspend starts while ieee80211_scan_completed() is running, between the point where SCAN_COMPLETED is set and the work is queued, ieee80211_scan_cancel() will not catch the work and we may finish suspending before the work is actually executed, leaving the scan running while suspended. To fix this race, queue the scan work during resume if the SCAN_COMPLETED flag is set and flush it immediately. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-27 09:58:46 +01:00
Herbert Xu	8ea65f4a2d	netlink: Kill redundant net argument in netlink_insert The socket already carries the net namespace with it so there is no need to be passing another net around. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:30:30 -08:00
David S. Miller	bf693f7beb	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== ipsec 2015-01-26 Just two small fixes for _decode_session6() where we might decode to wrong header information in some rare situations. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:28:38 -08:00
Hannes Frederic Sowa	6e9e16e614	ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too Lubomir Rintel reported that during replacing a route the interface reference counter isn't correctly decremented. To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>: \| [root@rhel7-5 lkundrak]# sh -x lal \| + ip link add dev0 type dummy \| + ip link set dev0 up \| + ip link add dev1 type dummy \| + ip link set dev1 up \| + ip addr add 2001:db8:8086::2/64 dev dev0 \| + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20 \| + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10 \| + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20 \| + ip link del dev0 type dummy \| Message from syslogd@rhel7-5 at Jan 23 10:54:41 ... \| kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 \| \| Message from syslogd@rhel7-5 at Jan 23 10:54:51 ... \| kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 During replacement of a rt6_info we must walk all parent nodes and check if the to be replaced rt6_info got propagated. If so, replace it with an alive one. Fixes: `4a287eba2d` ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag") Reported-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:22:14 -08:00
subashab@codeaurora.org	fc752f1f43	ping: Fix race in free in receive path An exception is seen in ICMP ping receive path where the skb destructor sock_rfree() tries to access a freed socket. This happens because ping_rcv() releases socket reference with sock_put() and this internally frees up the socket. Later icmp_rcv() will try to free the skb and as part of this, skb destructor is called and which leads to a kernel panic as the socket is freed already in ping_rcv(). -->\|exception -007\|sk_mem_uncharge -007\|sock_rfree -008\|skb_release_head_state -009\|skb_release_all -009\|__kfree_skb -010\|kfree_skb -011\|icmp_rcv -012\|ip_local_deliver_finish Fix this incorrect free by cloning this skb and processing this cloned skb instead. This patch was suggested by Eric Dumazet Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:04:53 -08:00
Herbert Xu	86f3cddbc3	udp_diag: Fix socket skipping within chain While working on rhashtable walking I noticed that the UDP diag dumping code is buggy. In particular, the socket skipping within a chain never happens, even though we record the number of sockets that should be skipped. As this code was supposedly copied from TCP, this patch does what TCP does and resets num before we walk a chain. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:02:41 -08:00
David S. Miller	7d63585bf0	Another set of last-minute fixes: * fix station double-removal when suspending while associating * fix the HT (802.11n) header length calculation * fix the CCK radiotap flag used for monitoring, a pretty old regression but a simple one-liner * fix per-station group-key handling -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUwmkTAAoJEDBSmw7B7bqr3X0P/3Y/2UnQSBSi2LnqmsuV8igR noAwaPuCf2ASs25v9nQv64JM1qC0kiemYesi3fFWHiqPhoxx9WrMSvjXdarynrQN LwBG74tcQVEeW70w+36yeKCS+wMU35eTzDDxdnYOfJeNOhW2rQE7WVT2drkc6lVL hvGRhqlBx0mVLQ3VCW5c1kmMN/VAttZKx5WWWurmA8OSQzcHJGZW6ZTuqwpDx4RT iqwi5uBtMkqTQV+22Zb4xI+YxLGwcGiYjy15KksPsrSZZeS5pJzeyovrDhACBVpE aVj3K+OAGVmXCN7HVpQ3tTqCaQea5o+EDqtnT/IQrjnmw6p4mC3co6ShiVn+DkTX 6nv/ka92k9q1LMGIzl+lje2cNmaerVrDa84gUqnHXbTr80+ugKu4NpZowQgdW4yG Qu4kQALkTRhUoNf+RUzXKcFqAW2qLKf83qLcrhbQWkSrY0hzLVWKI/1/FAkCyo6I zKhsB44mHMMaKGVq5qsHTD0E89PtiJuizDLiU04uJExYJCi1FCh9NDlc+HaVlTnt 5LHFbrsPDuAUKoKSiJmJUIyueL+Of5N34+epuLRZb55aE2AQhApg6Nxd+FMuro1m 0aeHEREEO04QZ7IUrYgBA4G7L1dsZJDD8auU6Y4V3chAg6ArbswbtzAv8ON1tAk+ pr0kThACKqkfg2tN5fny =DhIp -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Another set of last-minute fixes: * fix station double-removal when suspending while associating * fix the HT (802.11n) header length calculation * fix the CCK radiotap flag used for monitoring, a pretty old regression but a simple one-liner * fix per-station group-key handling Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:32:24 -08:00
Hannes Frederic Sowa	df4d92549f	ipv4: try to cache dst_entries which would cause a redirect Not caching dst_entries which cause redirects could be exploited by hosts on the same subnet, causing a severe DoS attack. This effect aggravated since commit `f886497212` ("ipv4: fix dst race in sk_dst_get()"). Lookups causing redirects will be allocated with DST_NOCACHE set which will force dst_release to free them via RCU. Unfortunately waiting for RCU grace period just takes too long, we can end up with >1M dst_entries waiting to be released and the system will run OOM. rcuos threads cannot catch up under high softirq load. Attaching the flag to emit a redirect later on to the specific skb allows us to cache those dst_entries thus reducing the pressure on allocation and deallocation. This issue was discovered by Marcelo Leitner. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:28:27 -08:00
Daniel Borkmann	600ddd6825	net: sctp: fix slab corruption from use after free on INIT collisions When hitting an INIT collision case during the 4WHS with AUTH enabled, as already described in detail in commit `1be9a950c6` ("net: sctp: inherit auth_capable on INIT collisions"), it can happen that we occasionally still remotely trigger the following panic on server side which seems to have been uncovered after the fix from commit `1be9a950c6` ... [ 533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff [ 533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230 [ 533.940559] PGD 5030f2067 PUD 0 [ 533.957104] Oops: 0000 [#1] SMP [ 533.974283] Modules linked in: sctp mlx4_en [...] [ 534.939704] Call Trace: [ 534.951833] [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0 [ 534.984213] [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0 [ 535.015025] [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170 [ 535.045661] [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0 [ 535.074593] [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50 [ 535.105239] [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp] [ 535.138606] [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0 [ 535.166848] [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b ... or depending on the the application, for example this one: [ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff [ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0 [ 1370.054568] PGD 633c94067 PUD 0 [ 1370.070446] Oops: 0000 [#1] SMP [ 1370.085010] Modules linked in: sctp kvm_amd kvm [...] [ 1370.963431] Call Trace: [ 1370.974632] [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960 [ 1371.000863] [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960 [ 1371.027154] [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170 [ 1371.054679] [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130 [ 1371.080183] [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b With slab debugging enabled, we can see that the poison has been overwritten: [ 669.826368] BUG kmalloc-128 (Tainted: G W ): Poison overwritten [ 669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b [ 669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494 [ 669.826424] __slab_alloc+0x4bf/0x566 [ 669.826433] __kmalloc+0x280/0x310 [ 669.826453] sctp_auth_create_key+0x23/0x50 [sctp] [ 669.826471] sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp] [ 669.826488] sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp] [ 669.826505] sctp_do_sm+0x29d/0x17c0 [sctp] [...] [ 669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494 [ 669.826635] __slab_free+0x39/0x2a8 [ 669.826643] kfree+0x1d6/0x230 [ 669.826650] kzfree+0x31/0x40 [ 669.826666] sctp_auth_key_put+0x19/0x20 [sctp] [ 669.826681] sctp_assoc_update+0x1ee/0x2d0 [sctp] [ 669.826695] sctp_do_sm+0x674/0x17c0 [sctp] Since this only triggers in some collision-cases with AUTH, the problem at heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice when having refcnt 1, once directly in sctp_assoc_update() and yet again from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on the already kzfree'd memory, which is also consistent with the observation of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected at a later point in time when poison is checked on new allocation). Reference counting of auth keys revisited: Shared keys for AUTH chunks are being stored in endpoints and associations in endpoint_shared_keys list. On endpoint creation, a null key is being added; on association creation, all endpoint shared keys are being cached and thus cloned over to the association. struct sctp_shared_key only holds a pointer to the actual key bytes, that is, struct sctp_auth_bytes which keeps track of users internally through refcounting. Naturally, on assoc or enpoint destruction, sctp_shared_key are being destroyed directly and the reference on sctp_auth_bytes dropped. User space can add keys to either list via setsockopt(2) through struct sctp_authkey and by passing that to sctp_auth_set_key() which replaces or adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes with refcount 1 and in case of replacement drops the reference on the old sctp_auth_bytes. A key can be set active from user space through setsockopt() on the id via sctp_auth_set_active_key(), which iterates through either endpoint_shared_keys and in case of an assoc, invokes (one of various places) sctp_auth_asoc_init_active_key(). sctp_auth_asoc_init_active_key() computes the actual secret from local's and peer's random, hmac and shared key parameters and returns a new key directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops the reference if there was a previous one. The secret, which where we eventually double drop the ref comes from sctp_auth_asoc_set_secret() with intitial refcount of 1, which also stays unchanged eventually in sctp_assoc_update(). This key is later being used for crypto layer to set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac(). To close the loop: asoc->asoc_shared_key is freshly allocated secret material and independant of the sctp_shared_key management keeping track of only shared keys in endpoints and assocs. Hence, also commit `4184b2a79a` ("net: sctp: fix memory leak in auth key management") is independant of this bug here since it concerns a different layer (though same structures being used eventually). asoc->asoc_shared_key is reference dropped correctly on assoc destruction in sctp_association_free() and when active keys are being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is to remove that sctp_auth_key_put() from there which fixes these panics. Fixes: `730fc3d05c` ("[SCTP]: Implete SCTP-AUTH parameter processing") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:02:05 -08:00
Erik Hugne	08bfc9cb76	flow_dissector: add tipc support The flows are hashed on the sending node address, which allows us to spread out the TIPC link processing to RPS enabled cores. There is no point to include the destination address in the hash as that will always be the same for all inbound links. We have experimented with a 3-tuple hash over [srcnode, sport, dport], but this showed to give slightly lower performance because of increased lock contention when the same link was handled by multiple cores. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 16:58:09 -08:00
Erik Hugne	3fa9cacd69	tipc: fix excessive network event logging If a large number of namespaces is spawned on a node and TIPC is enabled in each of these, the excessive printk tracing of network events will cause the system to grind down to a near halt. The traces are still of debug value, so instead of removing them completely we fix it by changing the link state and node availability logging debug traces. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 16:58:08 -08:00
Daniel Borkmann	fd3e646c87	net: act_bpf: fix size mismatch on filter preparation Similarly as in cls_bpf, also this code needs to reject mismatches. Reference: http://article.gmane.org/gmane.linux.network/347406 Fixes: `d23b8ad8ab` ("tc: add BPF based action") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 16:08:55 -08:00
Daniel Borkmann	1c1bc6bdb7	net: cls_basic: return from walking on match in basic_get As soon as we've found a matching handle in basic_get(), we can return it. There's no need to continue walking until the end of a filter chain, since they are unique anyway. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 16:08:55 -08:00
Daniel Borkmann	3f2ab13594	net: cls_bpf: fix auto generation of per list handles When creating a bpf classifier in tc with priority collisions and invoking automatic unique handle assignment, cls_bpf_grab_new_handle() will return a wrong handle id which in fact is non-unique. Usually altering of specific filters is being addressed over major id, but in case of collisions we result in a filter chain, where handle ids address individual cls_bpf_progs inside the classifier. Issue is, in cls_bpf_grab_new_handle() we probe for head->hgen handle in cls_bpf_get() and in case we found a free handle, we're supposed to use exactly head->hgen. In case of insufficient numbers of handles, we bail out later as handle id 0 is not allowed. Fixes: `7d1d65cb84` ("net: sched: cls_bpf: add BPF-based classifier") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 15:50:19 -08:00
Daniel Borkmann	7913ecf69e	net: cls_bpf: fix size mismatch on filter preparation In cls_bpf_modify_existing(), we read out the number of filter blocks, do some sanity checks, allocate a block on that size, and copy over the BPF instruction blob from user space, then pass everything through the classic BPF checker prior to installation of the classifier. We should reject mismatches here, there are 2 scenarios: the number of filter blocks could be smaller than the provided instruction blob, so we do a partial copy of the BPF program, and thus the instructions will either be rejected from the verifier or a valid BPF program will be run; in the other case, we'll end up copying more than we're supposed to, and most likely the trailing garbage will be rejected by the verifier as well (i.e. we need to fit instruction pattern, ret {A,K} needs to be last instruction, load/stores must be correct, etc); in case not, we would leak memory when dumping back instruction patterns. The code should have only used nla_len() as Dave noted to avoid this from the beginning. Anyway, lets fix it by rejecting such load attempts. Fixes: `7d1d65cb84` ("net: sched: cls_bpf: add BPF-based classifier") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 15:50:18 -08:00
Joe Stringer	74ed7ab926	openvswitch: Add support for unique flow IDs. Previously, flows were manipulated by userspace specifying a full, unmasked flow key. This adds significant burden onto flow serialization/deserialization, particularly when dumping flows. This patch adds an alternative way to refer to flows using a variable-length "unique flow identifier" (UFID). At flow setup time, userspace may specify a UFID for a flow, which is stored with the flow and inserted into a separate table for lookup, in addition to the standard flow table. Flows created using a UFID must be fetched or deleted using the UFID. All flow dump operations may now be made more terse with OVS_UFID_F_* flags. For example, the OVS_UFID_F_OMIT_KEY flag allows responses to omit the flow key from a datapath operation if the flow has a corresponding UFID. This significantly reduces the time spent assembling and transacting netlink messages. With all OVS_UFID_F_OMIT_* flags enabled, the datapath only returns the UFID and statistics for each flow during flow dump, increasing ovs-vswitchd revalidator performance by 40% or more. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 15:45:50 -08:00
Joe Stringer	272c2cf841	openvswitch: Use sw_flow_key_range for key ranges. These minor tidyups make a future patch a little tidier. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 15:45:50 -08:00
Joe Stringer	d29ab6f8a9	openvswitch: Refactor ovs_flow_tbl_insert(). Rework so that ovs_flow_tbl_insert() calls flow_{key,mask}_insert(). This tidies up a future patch. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 15:45:49 -08:00
Joe Stringer	5b4237bbc9	openvswitch: Refactor ovs_nla_fill_match(). Refactor the ovs_nla_fill_match() function into separate netlink serialization functions ovs_nla_put_{unmasked_key,mask}(). Modify ovs_nla_put_flow() to handle attribute nesting and expose the 'is_mask' parameter - all callers need to nest the flow, and callers have better knowledge about whether it is serializing a mask or not. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 15:45:49 -08:00
Christophe Ricard	511e78a38a	NFC: nfc_disable_se Remove useless blank line at beginning of function Remove one useless blank line at beginning of nfc_disable_se function. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-26 23:14:33 +01:00
Christophe Ricard	ec0684898f	NFC: nfc_enable_se Remove useless blank line at beginning of function Remove one useless blank line at beginning of nfc_enable_se function. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-01-26 23:14:33 +01:00
Pablo Neira Ayuso	e8781f70a5	netfilter: nf_tables: disable preemption when restoring chain counters With CONFIG_DEBUG_PREEMPT=y [22144.496057] BUG: using smp_processor_id() in preemptible [00000000] code: iptables-compat/10406 [22144.496061] caller is debug_smp_processor_id+0x17/0x1b [22144.496065] CPU: 2 PID: 10406 Comm: iptables-compat Not tainted 3.19.0-rc4+ # [...] [22144.496092] Call Trace: [22144.496098] [<ffffffff8145b9fa>] dump_stack+0x4f/0x7b [22144.496104] [<ffffffff81244f52>] check_preemption_disabled+0xd6/0xe8 [22144.496110] [<ffffffff81244f90>] debug_smp_processor_id+0x17/0x1b [22144.496120] [<ffffffffa07c557e>] nft_stats_alloc+0x94/0xc7 [nf_tables] [22144.496130] [<ffffffffa07c73d2>] nf_tables_newchain+0x471/0x6d8 [nf_tables] [22144.496140] [<ffffffffa07c5ef6>] ? nft_trans_alloc+0x18/0x34 [nf_tables] [22144.496154] [<ffffffffa063c8da>] nfnetlink_rcv_batch+0x2b4/0x457 [nfnetlink] Reported-by: Andreas Schultz <aschultz@tpip.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-26 11:50:02 +01:00
Eric Dumazet	1dc7b90f7c	ipv6: tcp: fix race in IPV6_2292PKTOPTIONS IPv6 TCP sockets store in np->pktoptions skbs, and use skb_set_owner_r() to charge the skb to socket. It means that destructor must be called while socket is locked. Therefore, we cannot use skb_get() or atomic_inc(&skb->users) to protect ourselves : kfree_skb() might race with other users manipulating sk->sk_forward_alloc Fix this race by holding socket lock for the duration of ip6_datagram_recv_ctl() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 00:44:08 -08:00
Dan Carpenter	1b846f9282	bridge: simplify br_getlink() a bit Static checkers complain that we should maybe set "ret" before we do the "goto out;". They interpret the NULL return from br_port_get_rtnl() as a failure and forgetting to set the error code is a common bug in this situation. The code is confusing but it's actually correct. We are returning zero deliberately. Let's re-write it a bit to be more clear. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 23:32:49 -08:00
Martin KaFai Lau	b0a1ba5992	ipv6: Fix __ip6_route_redirect In my last commit (`a3c00e4`: ipv6: Remove BACKTRACK macro), the changes in __ip6_route_redirect is incorrect. The following case is missed: 1. The for loop tries to find a valid gateway rt. If it fails to find one, rt will be NULL. 2. When rt is NULL, it is set to the ip6_null_entry. 3. The newly added 'else if', from `a3c00e4`, will stop the backtrack from happening. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 22:09:51 -08:00
Vivien Didelot	24df8986f3	net: dsa: set slave MII bus PHY mask When registering a mdio bus, Linux assumes than every port has a PHY and tries to scan it. If a switch port has no PHY registered, DSA will fail to register the slave MII bus. To fix this, set the slave MII bus PHY mask to the switch PHYs mask. As an example, if we use a Marvell MV88E6352 (which is a 7-port switch with no registered PHYs for port 5 and port 6), with the following declared names: static struct dsa_chip_data switch_cdata = { [...] .port_names[0] = "sw0", .port_names[1] = "sw1", .port_names[2] = "sw2", .port_names[3] = "sw3", .port_names[4] = "sw4", .port_names[5] = "cpu", }; DSA will fail to create the switch instance. With the PHY mask set for the slave MII bus, only the PHY for ports 0-4 will be scanned and the instance will be successfully created. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 16:00:54 -08:00
Harout Hedeshian	c2943f1453	net: ipv6: Add sysctl entry to disable MTU updates from RA The kernel forcefully applies MTU values received in router advertisements provided the new MTU is less than the current. This behavior is undesirable when the user space is managing the MTU. Instead a sysctl flag 'accept_ra_mtu' is introduced such that the user space can control whether or not RA provided MTU updates should be applied. The default behavior is unchanged; user space must explicitly set this flag to 0 for RA MTUs to be ignored. Signed-off-by: Harout Hedeshian <harouth@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:54:41 -08:00
Alexander Duyck	64c6272351	fib_trie: Various clean-ups for handling slen While doing further work on the fib_trie I noted a few items. First I was using calls that were far more complicated than they needed to be for determining when to push/pull the suffix length. I have updated the code to reflect the simplier logic. The second issue is that I realised we weren't necessarily handling the case of a leaf_info struct surviving a flush. I have updated the logic so that now we will call pull_suffix in the event of having a leaf info value left in the leaf after flushing it. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:47:16 -08:00
Alexander Duyck	02525368f4	fib_trie: Move fib_find_alias to file where it is used The function fib_find_alias is only accessed by functions in fib_trie.c as such it makes sense to relocate it and cast it as static so that the compiler can take advantage of optimizations it can do to it as a local function. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:47:16 -08:00
Alexander Duyck	30cfe7c9c8	fib_trie: Use empty_children instead of counting empty nodes in stats collection It doesn't make much sense to count the pointers ourselves when empty_children already has a count for the number of NULL pointers stored in the tnode. As such save ourselves the cycles and just use empty_children. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:47:16 -08:00
Alexander Duyck	95f60ea3e9	fib_trie: Add collapse() and should_collapse() to resize This patch really does two things. First it pulls the logic for determining if we should collapse one node out of the tree and the actual code doing the collapse into a separate pair of functions. This helps to make the changes to these areas more readable. Second it encodes the upper 32b of the empty_children value onto the full_children value in the case of bits == KEYLENGTH. By doing this we are able to handle the case of a 32b node where empty_children would appear to be 0 when it was actually 1ul << 32. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:47:16 -08:00
Alexander Duyck	a80e89d4c6	fib_trie: Fall back to slen update on inflate/halve failure This change corrects an issue where if inflate or halve fails we were exiting the resize function without at least updating the slen for the node. To correct this I have moved the update of max_size into the while loop so that it is only decremented on a successful call to either inflate or halve. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:47:16 -08:00
Alexander Duyck	69fa57b1e4	fib_trie: Fix RCU bug and merge similar bits of inflate/halve This patch addresses two issues. The first issue is the fact that I believe I had the RCU freeing sequence slightly out of order. As a result we could get into an issue if a caller went into a child of a child of the new node, then backtraced into the to be freed parent, and then attempted to access a child of a child that may have been consumed in a resize of one of the new nodes children. To resolve this I have moved the resize after we have freed the oldtnode. The only side effect of this is that we will now be calling resize on more nodes in the case of inflate due to the fact that we don't have a good way to test to see if a full_tnode on the new node was there before or after the allocation. This should have minimal impact however since the node should already be correctly size so it is just the cost of calling should_inflate that we will be taking on the node which is only a couple of cycles. The second issue is the fact that inflate and halve were essentially doing the same thing after the new node was added to the trie replacing the old one. As such it wasn't really necessary to keep the code in both functions so I have split it out into two other functions, called replace and update_children. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:47:15 -08:00
Alexander Duyck	b3832117b4	fib_trie: Use index & (~0ul << n->bits) instead of index >> n->bits In doing performance testing and analysis of the changes I recently found that by shifting the index I had created an unnecessary dependency. I have updated the code so that we instead shift a mask by bits and then just test against that as that should save us about 2 CPU cycles since we can generate the mask while the key and pos are being processed. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:47:15 -08:00
Sasha Levin	6b8d9117cc	net: llc: use correct size for sysctl timeout entries The timeout entries are sizeof(int) rather than sizeof(long), which means that when they were getting read we'd also leak kernel memory to userspace along with the timeout values. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 00:23:21 -08:00
Tom Herbert	af33c1adae	vxlan: Eliminate dependency on UDP socket in transmit path In the vxlan transmit path there is no need to reference the socket for a tunnel which is needed for the receive side. We do, however, need the vxlan_dev flags. This patch eliminate references to the socket in the transmit path, and changes VXLAN_F_UNSHAREABLE to be VXLAN_F_RCV_FLAGS. This mask is used to store the flags applicable to receive (GBP, CSUM6_RX, and REMCSUM_RX) in the vxlan_sock flags. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-24 23:15:40 -08:00
Tom Herbert	d998f8efa4	udp: Do not require sock in udp_tunnel_xmit_skb The UDP tunnel transmit functions udp_tunnel_xmit_skb and udp_tunnel6_xmit_skb include a socket argument. The socket being passed to the functions (from VXLAN) is a UDP created for receive side. The only thing that the socket is used for in the transmit functions is to get the setting for checksum (enabled or zero). This patch removes the argument and and adds a nocheck argument for checksum setting. This eliminates the unnecessary dependency on a UDP socket for UDP tunnel transmit. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-24 23:15:40 -08:00
Trond Myklebust	c4a7ca7749	SUNRPC: Allow waiting on memory allocation We should be safe now, as long as we don't do GFP_IO or higher allocations Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-01-24 18:46:50 -05:00
Trond Myklebust	127b21b89f	SUNRPC: Adjust rpciod workqueue parameters Increase the concurrency level for rpciod threads to allow for allocations etc that happen in the RPCSEC_GSS layer. Also note that the NFSv4 byte range locks may now need to allocate memory from inside rpciod. Add the WQ_HIGHPRI flag to improve latency guarantees while we're at it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-01-24 18:46:49 -05:00
Nicolas Dichtel	193523bf93	vxlan: advertise netns of vxlan dev in fdb msg Netlink FDB messages are sent in the link netns. The header of these messages contains the ifindex (ndm_ifindex) of the netdevice, but this ifindex is unusable in case of x-netns vxlan. I named the new attribute NDA_NDM_IFINDEX_NETNSID, to avoid confusion with NDA_IFINDEX. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-23 17:51:15 -08:00
Nicolas Dichtel	1f17257b1f	vlan: advertise link netns via netlink Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-23 17:51:15 -08:00
Nicolas Dichtel	3390e39761	ip6gretap: advertise link netns via netlink Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-23 17:51:14 -08:00
Nicolas Dichtel	bdef279b99	rtnl: fix error path when adding an iface with a link net If an error occurs when the netdevice is moved to the link netns, a full cleanup must be done. Fixes: `317f4810e4` ("rtnl: allow to create device with IFLA_LINK_NETNSID set") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-23 17:51:14 -08:00
Thomas Graf	d7924450e1	act_connmark: Add missing dependency on NF_CONNTRACK_MARK Depending on NETFILTER is not sufficient to ensure the presence of the 'mark' field in nf_conn, also needs to depend on NF_CONNTRACK_MARK. Fixes: 22a5dc ("net: sched: Introduce connmark action") Cc: Felix Fietkau <nbd@openwrt.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-23 17:46:09 -08:00
Peter Hurley	dfb2fae7cd	Bluetooth: Fix nested sleeps l2cap/rfcomm/sco_sock_accept() are wait loops which may acquire sleeping locks. Since both wait loops and sleeping locks use task_struct.state to sleep and wake, the nested sleeping locks destroy the wait loop state. Use the newly-minted wait_woken() and DEFINE_WAIT_FUNC() for the wait loop. DEFINE_WAIT_FUNC() allows an alternate wake function to be specified; in this case, the predefined scheduler function, woken_wake_function(). This wait construct ensures wakeups will not be missed without requiring the wait loop to set the task state before condition evaluation. How this works: CPU 0 \| CPU 1 \| \| is <condition> set? \| no set <condition> \| \| wake_up_interruptible \| woken_wake_function \| set WQ_FLAG_WOKEN \| try_to_wake_up \| \| wait_woken \| set TASK_INTERRUPTIBLE \| WQ_FLAG_WOKEN? yes \| set TASK_RUNNING \| \| - loop - \| \| is <condition> set? \| yes - exit wait loop Fixes "do not call blocking ops when !TASK_RUNNING" warnings in l2cap_sock_accept(), rfcomm_sock_accept() and sco_sock_accept(). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-23 20:29:42 +02:00
Johan Hedberg	a1443f5a27	Bluetooth: Convert Set SC to use HCI Request This patch converts the Set Secure Connection HCI handling to use a HCI request instead of using a hard-coded callback in hci_event.c. This e.g. ensures that we don't clear the flags incorrectly if something goes wrong with the power up process (not related to a mgmt Set SC command). The code can also be simplified a bit since only one pending Set SC command is allowed, i.e. mgmt_pending_foreach usage is not needed. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-23 19:07:03 +01:00
Johan Hedberg	484aabc1c4	Bluetooth: Remove incorrect check for BDADDR_BREDR address type The Add Remote OOB Data mgmt command should allow data to be passed for LE as well. This patch removes a left-over check for BDADDR_BREDR that should not be there anymore. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-23 18:59:31 +01:00
Johan Hedberg	5d57e7964c	Bluetooth: Check for valid bdaddr in add_remote_oob_data Before doing any other verifications, the add_remote_oob_data function should first check that the given address is valid. This patch adds such a missing check to the beginning of the function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-23 18:59:30 +01:00
Jeff Layton	3c5199143b	sunrpc/lockd: fix references to the BKL The BKL is completely out of the picture in the lockd and sunrpc code these days. Update the antiquated comments that refer to it. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-23 10:29:12 -05:00
Johannes Berg	225b818982	mac80211: support beacon statistics For drivers without beacon filtering, support beacon statistics entirely, i.e. report the number of beacons and average signal. For drivers with beacon filtering, give them the number of beacons received by mac80211 -- in case the device reports only the number of filtered beacons then driver doesn't have to count all beacons again as mac80211 already does. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 15:51:38 +01:00
Johannes Berg	3d6dc3431e	mac80211: fix per-TID RX-MSDU counter In the case of non-QoS association, the counter was actually wrong. The right index isn't security_idx but seqno_idx, as security_idx will be 0 for data frames, while 16 is needed. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 13:26:25 +01:00
Johannes Berg	c5309ba787	mac80211: tdls: disentangle HT supported conditions These conditions are rather difficult to follow, for example because "!sta" only exists to not crash in the case that we don't have a station pointer (WLAN_TDLS_SETUP_REQUEST) in which the additional condition (peer supports HT) doesn't actually matter anyway. Cleaning this up only duplicates two lines of code but makes the rest far easier to read, so do that. As a side effect, smatch stops complaining about the lack of a sta pointer test after the !sta (since the !sta goes away) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 11:42:14 +01:00
Johannes Berg	d6f5cc091b	mac80211: tdls: remove shadowing variable There's no need to use another local 'sta' variable as the original (outer scope) one isn't needed any more and has become invalid anyway when exiting the RCU read section. Remove the inner scope one and along with it the useless NULL initialization. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 11:37:44 +01:00
Johannes Berg	13874e4b23	nl80211: suppress smatch warnings smatch warns that we once checked request->ssids in two functions and then unconditionally used it later again. This is actually fine, because the code has a relationship between attrs[NL80211_ATTR_SCAN_SSIDS], n_ssids and request->ssids, but smatch isn't smart enough to realize that. Suppress the warnings by always checking just n_ssids - that way smatch won't know that request->ssids could be NULL, and since it is only NULL when n_ssids is 0 we still check everything correctly. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 11:25:20 +01:00
Johannes Berg	0fa7b39131	nl80211: fix per-station group key get/del and memory leak In case userspace attempts to obtain key information for or delete a unicast key, this is currently erroneously rejected unless the driver sets the WIPHY_FLAG_IBSS_RSN flag. Apparently enough drivers do so it was never noticed. Fix that, and while at it fix a potential memory leak: the error path in the get_key() function was placed after allocating a message but didn't free it - move it to a better place. Luckily admin permissions are needed to call this operation. Cc: stable@vger.kernel.org Fixes: `e31b82136d` ("cfg80211/mac80211: allow per-station GTKs") Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 11:21:02 +01:00
Bob Copeland	985e88b13a	Revert "mac80211: keep sending peer candidate events while in listen state" This reverts commit `2ae70efcea`. The new peer events that are generated by the change are causing problems with wpa_supplicant in userspace: wpa_s tries to restart SAE authentication with the peer when receiving the event, even though authentication may be in progress already, and it gets very confused. Revert back to the original operating mode, which is to only get events when there is no corresponding station entry. Cc: Nishikawa, Kenzoh <Kenzoh.Nishikawa@jp.sony.com> Cc: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:57:19 +01:00
Luciano Coelho	332ff7fe36	mac80211: complete scan work immediately if quiesced or suspended It is possible that a deferred scan is queued after the queues are flushed in __ieee80211_suspend(). The deferred scan work may be scheduled by ROC or ieee80211_stop_poll(). To make sure don't start a new scan while suspending, check whether we're quiescing or suspended and complete the scan immediately if that's the case. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:54:22 +01:00
Emmanuel Grumbach	4afaff176a	mac80211: avoid races related to suspend flow When we go to suspend, there is complex set of states that avoids races. The quiescing variable is set whlie __ieee80211_suspend is running. Then suspended is set. The code makes sure there is no window without any of these flags. The problem is that workers can still be enqueued while we are quiescing. This leads to situations where the driver is already suspending and other flows like disassociation are handled by a worker. To fix this, we need to check quiescing and suspended flags in the worker itself and not only before enqueueing it. I also add here extensive documentation to ease the understanding of these complex issues. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:54:22 +01:00
Emmanuel Grumbach	14f2ae83d0	mac80211: synchronize_net() before flushing the queues When mac80211 disconnects, it drops all the packets on the queues. This happens after the net stack has been notified that we have no link anymore (netif_carrier_off). netif_carrier_off ensures that no new packets are sent to xmit() callback, but we might have older packets in the middle of the Tx path. These packets will land in the driver's queues after the latter have been flushed. Synchronize_net() between netif_carrier_off and drv_flush() will fix this. Note that we can't call synchronize_net inside ieee80211_flush_queues since there are flows that call ieee80211_flush_queues and don't need synchronize_net() which is an expensive operation. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> [reword comment to be more accurate] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:54:21 +01:00
Mathy Vanhoef	3a5c5e81d8	mac80211: properly set CCK flag in radiotap Fix a regression introduced by commit `a5e70697d0` ("mac80211: add radiotap flag and handling for 5/10 MHz") where the IEEE80211_CHAN_CCK channel type flag was incorrectly replaced by the IEEE80211_CHAN_OFDM flag. This commit fixes that by using the CCK flag again. Cc: stable@vger.kernel.org Fixes: `a5e70697d0` ("mac80211: add radiotap flag and handling for 5/10 MHz") Signed-off-by: Mathy Vanhoef <vanhoefm@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:53:58 +01:00
Fred Chou	fb142f4bbb	mac80211: correct header length calculation HT Control field may also be present in management frames, as defined in 8.2.4.1.10 of 802.11-2012. Account for this in calculation of header length. Signed-off-by: Fred Chou <fred.chou.nd@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:52:48 +01:00
Luciano Coelho	2af81d6718	mac80211: only roll back station states for WDS when suspending In normal cases (i.e. when we are fully associated), cfg80211 takes care of removing all the stations before calling suspend in mac80211. But in the corner case when we suspend during authentication or association, mac80211 needs to roll back the station states. But we shouldn't roll back the station states in the suspend function, because this is taken care of in other parts of the code, except for WDS interfaces. For AP types of interfaces, cfg80211 takes care of disconnecting all stations before calling the driver's suspend code. For station interfaces, this is done in the quiesce code. For WDS interfaces we still need to do it here, so move the code into a new switch case for WDS. Cc: stable@kernel.org [3.15+] Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:47:40 +01:00
Luciano Coelho	9c74893441	nl80211: add an attribute to allow delaying the first scheduled scan cycle The userspace may want to delay the the first scheduled scan or net-detect cycle. Add an optional attribute to the scheduled scan configuration to pass the delay to be (optionally) used by the driver. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> [add the attribute to the policy to validate it] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:30:47 +01:00
Lorenzo Bianconi	db82d8a966	mac80211: enable TPC through mac80211 stack Control per packet Transmit Power Control (TPC) in lower drivers according to TX power settings configured by the user. In particular TPC is enabled if value passed in enum nl80211_tx_power_setting is NL80211_TX_POWER_LIMITED (allow using less than specified from userspace), whereas TPC is disabled if nl80211_tx_power_setting is set to NL80211_TX_POWER_FIXED (use value configured from userspace) Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:28:51 +01:00
Vadim Kochan	4b681c82d2	nl80211: Allow set network namespace by fd Added new NL80211_ATTR_NETNS_FD which allows to set namespace via nl80211 by fd. Signed-off-by: Vadim Kochan <vadim4j@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-23 10:25:25 +01:00
Johannes Berg	fa7e1fbcb5	mac80211: allow drivers to control software crypto Some drivers unfortunately cannot support software crypto, but mac80211 currently assumes that they do. This has the issue that if the hardware enabling fails for some reason, the software fallback is used, which won't work. This clearly isn't desirable, the error should be reported and the key setting refused. Support this in mac80211 by allowing drivers to set a new HW flag IEEE80211_HW_SW_CRYPTO_CONTROL, in which case mac80211 will only allow software fallback if the set_key() method returns 1. The driver will also need to advertise supported cipher suites so that mac80211 doesn't advertise any (future) software ciphers that the driver can't actually do. While at it, to make it easier to support this, refactor the ieee80211_init_cipher_suites() code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-22 22:01:01 +01:00
Marcel Holtmann	ed93ec69c7	Bluetooth: Require SSP enabling before BR/EDR Secure Connections When BR/EDR is supported by a controller, then it is required to enable Secure Simple Pairing first before enabling the Secure Connections feature. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-22 21:44:20 +02:00
Marcel Holtmann	3a5486e1fd	Bluetooth: Limit BR/EDR switching for LE only with secure connections When a powered on dual-mode controller has been configured to operate as LE only with secure connections, then the BR/EDR side of things can not be switched back on. Do reconfigure the controller it first needs to be powered down. The secure connections feature is implemented in the BR/EDR controller while for LE it is implemented in the host. So explicitly forbid such a transaction to avoid inconsistent states. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-22 21:42:45 +02:00
Marcel Holtmann	574ea3c713	Bluetooth: Fix dependency for BR/EDR Secure Connections mode on SSP The BR/EDR Secure Connections feature should only be enabled when the Secure Simple Pairing mode has been enabled first. However since secure connections is feature that is valid for BR/EDR and LE, this needs special handling. When enabling secure connections on a LE only configured controller, thent the BR/EDR side should not be enabled in the controller. This patches makes the BR/EDR Secure Connections feature depending on enabling Secure Simple Pairing mode first. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-22 21:42:18 +02:00
Szymon Janc	91200e9f3e	Bluetooth: Fix reporting invalid RSSI for LE devices Start Discovery was reporting 0 RSSI for invalid RSSI only for BR/EDR devices. LE devices were reported with RSSI 127. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.19+	2015-01-22 18:06:43 +01:00
Johannes Berg	54330bf63b	mac80211: fix HW registration error paths Station info state is started in allocation, so should be destroyed on free (it's just a timer); rate control must be freed if anything afterwards fails to initialize. LED exit should be later, no need for locking there, but it needs to be done also when rate init failed. Also clean up the code by moving a label so the locking doesn't have to be done separately. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-21 17:34:37 +01:00
Michael S. Tsirkin	7754f53e94	virtio/9p: verify device has config space Some devices might not implement config space access (e.g. remoteproc used not to - before 3.9). virtio/9p needs config space access so make it fail gracefully if not there. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2015-01-21 16:28:45 +10:30
David S. Miller	0c49087462	Some further updates for net-next: * fix network-manager which was broken by the previous changes * fix delete-station events, which were broken by me making the genlmsg_end() mistake * fix a timer left running during suspend in some race conditions that would cause an annoying (but harmless) warning * (less important, but in the tree already) remove 80+80 MHz rate reporting since the spec doesn't distinguish it from 160 MHz; as the bitrate they're both 160 MHz bandwidth -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUvUZlAAoJEDBSmw7B7bqrfNMQAJT5jjOSjmwW8Zdvujvda/qt bFpYa9t0NsN3izzMpjPSrCwPrHN5qE86ZA8TcZrIzejPH4rpltjaXB6JNHZardVo deCUWU9xotoPELoE0Xex9mHPEkYYvOaht/P8A/88qP1S2PykMmj9fqNeijyUvwuo Jlsh0wKe4Jq6bCmdxvy/bde84ceAQcuh2TKNov1S0tB38tRY9qSu2n6ZGpoMNcEe CWuW+23jL1uAvt6Ljk2fTKdLR8iyXykfM0UMX2/4R2PMnJXK/dQqV/eeXTjpxtMk UV4aIMcSS19D7HxICKOXOdZLdMMuXaFUnUMlGLBtXZz3n9lZoP7RtVIHoib8VBXZ tY7xQrK6YNvwZ4SZZPuv/yr6YWP2+vBM2FUfXjzD+or6uYsej203a5q0RsOY+3Tp 6Yklr+mQNlrOtpMsHMSgrBUUZsAK1I95ALMVVqaq1hgf1cDvRIDHlOo4A7bjwNFw eA3L1o4O1/E/IGp4v6+2Iukc9rIwm11sNr/wuD8dDkZTradUPH1iY6J5sxJNb2Nq YdpCnQ/lNXj650+z9/G2omSA6DTTTOtXJPxKR+FOHZVKDpZYtF6TxKb0S79fINps 6ZlWIna5bUiXF1b6xad1x+vtyjNMgTvkg6mR+WQnvF57Ri8hucbtpv5wpA5bhYUQ Fbz9VZF2nfMeIbXfTaWi =Bvmr -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Some further updates for net-next: * fix network-manager which was broken by the previous changes * fix delete-station events, which were broken by me making the genlmsg_end() mistake * fix a timer left running during suspend in some race conditions that would cause an annoying (but harmless) warning * (less important, but in the tree already) remove 80+80 MHz rate reporting since the spec doesn't distinguish it from 160 MHz; as the bitrate they're both 160 MHz bandwidth Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 16:22:19 -05:00
Johannes Berg	926e9878a3	phonet netlink: allow multiple messages per skb in route dump My previous patch to this file changed the code to be bug-compatible towards userspace. Unless userspace (which I wasn't able to find) implements the dump reader by hand in a wrong way, this isn't needed. If it uses libnl or similar code putting multiple messages into a single SKB is far more efficient. Change the code to do this. While at it, also clean it up and don't use so many variables - just store the address in the callback args directly. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 16:20:17 -05:00
Felix Fietkau	22a5dc0e5e	net: sched: Introduce connmark action This tc action allows you to retrieve the connection tracking mark This action has been used heavily by openwrt for a few years now. There are known limitations currently: doesn't work for initial packets, since we only query the ct table. Fine given use case is for returning packets no implicit defrag. frags should be rare so fix later.. won't work for more complex tasks, e.g. lookup of other extensions since we have no means to store results we still have a 2nd lookup later on via normal conntrack path. This shouldn't break anything though since skb->nfct isn't altered. V2: remove unnecessary braces (Jiri) change the action identifier to 14 (Jiri) Fix some stylistic issues caught by checkpatch V3: Move module params to bottom (Cong) Get rid of tcf_hashinfo_init and friends and conform to newer API (Cong) Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 16:02:06 -05:00
Florian Fainelli	8db0a2ee2c	net: bridge: reject DSA-enabled master netdevices as bridge members DSA-enabled master network devices with a switch tagging protocol should strip the protocol specific format before handing the frame over to higher layer. When adding such a DSA master network device as a bridge member, we go through the following code path when receiving a frame: __netif_receive_skb_core -> first ptype check against ptype_all is not returning any handler for this skb -> check and invoke rx_handler: -> deliver frame to the bridge layer: br_handle_frame DSA registers a ptype handler with the fake ETH_XDSA ethertype, which is called after the bridge-layer rx_handler has run. br_handle_frame() tries to parse the frame it received from the DSA master network device, and will not be able to match any of its conditions and jumps straight at the end of the end of br_handle_frame() and returns RX_HANDLER_CONSUMED there. Since we returned RX_HANDLER_CONSUMED, __netif_receive_skb_core() stops RX processing for this frame and returns NET_RX_SUCCESS, so we never get a chance to call our switch tag packet processing logic and deliver frames to the DSA slave network devices, and so we do not get any functional bridge members at all. Instead of cluttering the bridge receive path with DSA-specific checks, and rely on assumptions about how __netif_receive_skb_core() is processing frames, we simply deny adding the DSA master network device (conduit interface) as a bridge member, leaving only the slave DSA network devices to be bridge members, since those will work correctly in all circumstances. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 15:45:10 -05:00
Florian Fainelli	728c02089a	net: ipv4: handle DSA enabled master network devices The logic to configure a network interface for kernel IP auto-configuration is very simplistic, and does not handle the case where a device is stacked onto another such as with DSA. This causes the kernel not to open and configure the master network device in a DSA switch tree, and therefore slave network devices using this master network devices as conduit device cannot be open. This restriction comes from a check in net/dsa/slave.c, which is basically checking the master netdev flags for IFF_UP and returns -ENETDOWN if it is not the case. Automatically bringing-up DSA master network devices allows DSA slave network devices to be used as valid interfaces for e.g: NFS root booting by allowing kernel IP autoconfiguration to succeed on these interfaces. On the reverse path, make sure we do not attempt to close a DSA-enabled device as this would implicitely prevent the slave DSA network device from operating. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 15:45:10 -05:00
Hagen Paul Pfeifer	9d289715eb	ipv6: stop sending PTB packets for MTU < 1280 Reduce the attack vector and stop generating IPv6 Fragment Header for paths with an MTU smaller than the minimum required IPv6 MTU size (1280 byte) - called atomic fragments. See IETF I-D "Deprecating the Generation of IPv6 Atomic Fragments" [1] for more information and how this "feature" can be misused. [1] https://tools.ietf.org/html/draft-ietf-6man-deprecate-atomfrag-generation-00 Signed-off-by: Fernando Gont <fgont@si6networks.com> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 14:52:07 -05:00
Nicolas Dichtel	317f4810e4	rtnl: allow to create device with IFLA_LINK_NETNSID set This patch adds the ability to create a netdevice in a specified netns and then move it into the final netns. In fact, it allows to have a symetry between get and set rtnl messages. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 14:32:03 -05:00
Nicolas Dichtel	1728d4fabd	tunnels: advertise link netns via netlink Implement rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 14:32:03 -05:00
Nicolas Dichtel	d37512a277	rtnl: add link netns id to interface messages This patch adds a new attribute (IFLA_LINK_NETNSID) which contains the 'link' netns id when this netns is different from the netns where the interface stands (for example for x-net interfaces like ip tunnels). With this attribute, it's possible to interpret correctly all advertised information (like IFLA_LINK, etc.). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 14:21:26 -05:00
Nicolas Dichtel	0c7aecd4bd	netns: add rtnl cmd to add and get peer netns ids With this patch, a user can define an id for a peer netns by providing a FD or a PID. These ids are local to the netns where it is added (ie valid only into this netns). The main function (ie the one exported to other module), peernet2id(), allows to get the id of a peer netns. If no id has been assigned by the user, this function allocates one. These ids will be used in netlink messages to point to a peer netns, for example in case of a x-netns interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 14:21:18 -05:00
Emmanuel Grumbach	c1e140bf79	mac80211: delete the assoc/auth timer upon suspend While suspending, we destroy the authentication / association that might be taking place. While doing so, we forgot to delete the timer which can be firing after local->suspended is already set, producing the warning below. Fix that by deleting the timer. [66722.825487] WARNING: CPU: 2 PID: 5612 at net/mac80211/util.c:755 ieee80211_can_queue_work.isra.18+0x32/0x40 [mac80211]() [66722.825487] queueing ieee80211 work while going to suspend [66722.825529] CPU: 2 PID: 5612 Comm: kworker/u16:69 Tainted: G W O 3.16.1+ #24 [66722.825537] Workqueue: events_unbound async_run_entry_fn [66722.825545] Call Trace: [66722.825552] <IRQ> [<ffffffff817edbb2>] dump_stack+0x4d/0x66 [66722.825556] [<ffffffff81075cad>] warn_slowpath_common+0x7d/0xa0 [66722.825572] [<ffffffffa06b5b90>] ? ieee80211_sta_bcn_mon_timer+0x50/0x50 [mac80211] [66722.825573] [<ffffffff81075d1c>] warn_slowpath_fmt+0x4c/0x50 [66722.825586] [<ffffffffa06977a2>] ieee80211_can_queue_work.isra.18+0x32/0x40 [mac80211] [66722.825598] [<ffffffffa06977d5>] ieee80211_queue_work+0x25/0x50 [mac80211] [66722.825611] [<ffffffffa06b5bac>] ieee80211_sta_timer+0x1c/0x20 [mac80211] [66722.825614] [<ffffffff8108655a>] call_timer_fn+0x8a/0x300 Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-19 18:59:20 +01:00
Johannes Berg	6e9f3fa4f0	Revert "wireless: Support of IFLA_INFO_KIND rtnl attribute" This reverts commit `ba1debdfed`. Oliver reported that it breaks network-manager, for some reason with this patch NM decides that the device isn't wireless but "generic" (ethernet), sees no carrier (as expected with wifi) and fails to do anything else with it. Revert this to unbreak userspace. Reported-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-19 18:49:56 +01:00
Pablo Neira Ayuso	75e8d06d43	netfilter: nf_tables: validate hooks in NAT expressions The user can crash the kernel if it uses any of the existing NAT expressions from the wrong hook, so add some code to validate this when loading the rule. This patch introduces nft_chain_validate_hooks() which is based on an existing function in the bridge version of the reject expression. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-19 14:52:39 +01:00
Rosen, Rami	4de8b41370	bridge: remove oflags from setlink/dellink. Commit `02dba4388d` ("bridge: fix setlink/dellink notifications") removed usage of oflags in both rtnl_bridge_setlink() and rtnl_bridge_dellink() methods. This patch removes this variable as it is no longer needed. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-19 01:22:48 -05:00
David S. Miller	7b46a644a4	netlink: Fix bugs in nlmsg_end() conversions. Commit `053c095a82` ("netlink: make nlmsg_end() and genlmsg_end() void") didn't catch all of the cases where callers were breaking out on the return value being equal to zero, which they no longer should when zero means success. Fix all such cases. Reported-by: Marcel Holtmann <marcel@holtmann.org> Reported-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-18 23:36:08 -05:00
Johannes Berg	053c095a82	netlink: make nlmsg_end() and genlmsg_end() void Contrary to common expectations for an "int" return, these functions return only a positive value -- if used correctly they cannot even return 0 because the message header will necessarily be in the skb. This makes the very common pattern of if (genlmsg_end(...) < 0) { ... } be a whole bunch of dead code. Many places also simply do return nlmsg_end(...); and the caller is expected to deal with it. This also commonly (at least for me) causes errors, because it is very common to write if (my_function(...)) /* error condition */ and if my_function() does "return nlmsg_end()" this is of course wrong. Additionally, there's not a single place in the kernel that actually needs the message length returned, and if anyone needs it later then it'll be very easy to just use skb->len there. Remove this, and make the functions void. This removes a bunch of dead code as described above. The patch adds lines because I did - return nlmsg_end(...); + nlmsg_end(...); + return 0; I could have preserved all the function's return values by returning skb->len, but instead I've audited all the places calling the affected functions and found that none cared. A few places actually compared the return value with <= 0 in dump functionality, but that could just be changed to < 0 with no change in behaviour, so I opted for the more efficient version. One instance of the error I've made numerous times now is also present in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't check for <0 or <=0 and thus broke out of the loop every single time. I've preserved this since it will (I think) have caused the messages to userspace to be formatted differently with just a single message for every SKB returned to userspace. It's possible that this isn't needed for the tools that actually use this, but I don't even know what they are so couldn't test that changing this behaviour would be acceptable. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-18 01:03:45 -05:00
Richard Alpe	d6e164e321	tipc: fix socket list regression in new nl api Commit `07f6c4bc` (tipc: convert tipc reference table to use generic rhashtable) introduced a problem with port listing in the new netlink API. It broke the resume functionality resulting in a never ending loop. This was caused by starting with the first hash table every time subsequently never returning an empty skb (terminating). This patch fixes the resume mechanism by keeping a logical reference to the last hash table along with a logical reference to the socket (port) that didn't fit in the previous message. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-18 00:27:05 -05:00
David S. Miller	e445dd5f67	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-01-16 Here are some more bluetooth & ieee802154 patches intended for 3.20: - Refactoring & cleanups of ieee802154 & 6lowpan code - Various fixes to the btmrvl driver - Fixes for Bluetooth Low Energy Privacy feature handling - Added build-time sanity checks for sockaddr sizes - Fixes for Security Manager registration on LE-only controllers - Refactoring of broken inquiry mode handling to a generic quirk Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-18 00:25:30 -05:00
Jiri Pirko	3aeb66176f	net: replace br_fdb_external_learn_* calls with switchdev notifier events This patch benefits from newly introduced switchdev notifier and uses it to propagate fdb learn events from rocker driver to bridge. That avoids direct function calls and possible use by other listeners (ovs). Suggested-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-18 00:23:57 -05:00
Jiri Pirko	03bf0c2812	switchdev: introduce switchdev notifier This patch introduces new notifier for purposes of exposing events which happen on switch driver side. The consumers of the event messages are mainly involved masters, namely bridge and ovs. Suggested-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-18 00:23:57 -05:00
Nicolas Dichtel	66c1a12c65	socket: use ki_nbytes instead of iov_length() This field already contains the length of the iovec, no need to calculate it again. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-17 23:58:37 -05:00
Daniel Borkmann	2061dcd6bf	net: sctp: fix race for one-to-many sockets in sendmsg's auto associate I.e. one-to-many sockets in SCTP are not required to explicitly call into connect(2) or sctp_connectx(2) prior to data exchange. Instead, they can directly invoke sendmsg(2) and the SCTP stack will automatically trigger connection establishment through 4WHS via sctp_primitive_ASSOCIATE(). However, this in its current implementation is racy: INIT is being sent out immediately (as it cannot be bundled anyway) and the rest of the DATA chunks are queued up for later xmit when connection is established, meaning sendmsg(2) will return successfully. This behaviour can result in an undesired side-effect that the kernel made the application think the data has already been transmitted, although none of it has actually left the machine, worst case even after close(2)'ing the socket. Instead, when the association from client side has been shut down e.g. first gracefully through SCTP_EOF and then close(2), the client could afterwards still receive the server's INIT_ACK due to a connection with higher latency. This INIT_ACK is then considered out of the blue and hence responded with ABORT as there was no alive assoc found anymore. This can be easily reproduced f.e. with sctp_test application from lksctp. One way to fix this race is to wait for the handshake to actually complete. The fix defers waiting after sctp_primitive_ASSOCIATE() and sctp_primitive_SEND() succeeded, so that DATA chunks cooked up from sctp_sendmsg() have already been placed into the output queue through the side-effect interpreter, and therefore can then be bundeled together with COOKIE_ECHO control chunks. strace from example application (shortened): socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 3 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(0)=[], msg_controllen=48, {cmsg_len=48, cmsg_level=0x84 /* SOL_??? */, cmsg_type=, ...}, msg_flags=0}, 0) = 0 // graceful shutdown for SOCK_SEQPACKET via SCTP_EOF close(3) = 0 tcpdump before patch (fooling the application): 22:33:36.306142 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 3879023686] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3139201684] 22:33:36.316619 IP 192.168.1.115.8888 > 192.168.1.114.41462: sctp (1) [INIT ACK] [init tag: 3345394793] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 3380109591] 22:33:36.317600 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [ABORT] tcpdump after patch: 14:28:58.884116 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 438593213] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3092969729] 14:28:58.888414 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [INIT ACK] [init tag: 381429855] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 2141904492] 14:28:58.888638 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [COOKIE ECHO] , (2) [DATA] (B)(E) [TSN: 3092969729] [...] 14:28:58.893278 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [COOKIE ACK] , (2) [SACK] [cum ack 3092969729] [a_rwnd 106491] [#gap acks 0] [#dup tsns 0] 14:28:58.893591 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969730] [...] 14:28:59.096963 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969730] [a_rwnd 106496] [#gap acks 0] [#dup tsns 0] 14:28:59.097086 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969731] [...] , (2) [DATA] (B)(E) [TSN: 3092969732] [...] 14:28:59.103218 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969732] [a_rwnd 106486] [#gap acks 0] [#dup tsns 0] 14:28:59.103330 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN] 14:28:59.107793 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SHUTDOWN ACK] 14:28:59.107890 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN COMPLETE] Looks like this bug is from the pre-git history museum. ;) Fixes: 08707d5482df ("lksctp-2_5_31-0_5_1.patch") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-17 23:52:20 -05:00
Jiri Pirko	33e9fcc666	tc: cls_bpf: rename bpf_len to bpf_num_ops It was suggested by DaveM to change the name as "len" might indicate unit bytes. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-17 23:51:10 -05:00
Jiri Pirko	d23b8ad8ab	tc: add BPF based action This action provides a possibility to exec custom BPF code. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-17 23:51:10 -05:00
Roopa Prabhu	02dba4388d	bridge: fix setlink/dellink notifications problems with bridge getlink/setlink notifications today: - bridge setlink generates two notifications to userspace - one from the bridge driver - one from rtnetlink.c (rtnl_bridge_notify) - dellink generates one notification from rtnetlink.c. Which means bridge setlink and dellink notifications are not consistent - Looking at the code it appears, If both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF were set, the size calculation in rtnl_bridge_notify can be wrong. Example: if you set both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF in a setlink request to rocker dev, rtnl_bridge_notify will allocate skb for one set of bridge attributes, but, both the bridge driver and rocker dev will try to add attributes resulting in twice the number of attributes being added to the skb. (rocker dev calls ndo_dflt_bridge_getlink) There are multiple options: 1) Generate one notification including all attributes from master and self: But, I don't think it will work, because both master and self may use the same attributes/policy. Cannot pack the same set of attributes in a single notification from both master and slave (duplicate attributes). 2) Generate one notification from master and the other notification from self (This seems to be ideal): For master: the master driver will send notification (bridge in this example) For self: the self driver will send notification (rocker in the above example. It can use helpers from rtnetlink.c to do so. Like the ndo_dflt_bridge_getlink api). This patch implements 2) (leaving the 'rtnl_bridge_notify' around to be used with 'self'). v1->v2 : - rtnl_bridge_notify is now called only for self, so, remove 'BRIDGE_FLAGS_SELF' check and cleanup a few things - rtnl_bridge_dellink used to always send a RTM_NEWLINK msg earlier. So, I have changed the notification from br_dellink to go as RTM_NEWLINK Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-17 23:49:51 -05:00
Johannes Berg	ee1c244219	genetlink: synchronize socket closing and family removal In addition to the problem Jeff Layton reported, I looked at the code and reproduced the same warning by subscribing and removing the genl family with a socket still open. This is a fairly tricky race which originates in the fact that generic netlink allows the family to go away while sockets are still open - unlike regular netlink which has a module refcount for every open socket so in general this cannot be triggered. Trying to resolve this issue by the obvious locking isn't possible as it will result in deadlocks between unregistration and group unbind notification (which incidentally lockdep doesn't find due to the home grown locking in the netlink table.) To really resolve this, introduce a "closing socket" reference counter (for generic netlink only, as it's the only affected family) in the core netlink code and use that in generic netlink to wait for all the sockets that are being closed at the same time as a generic netlink family is removed. This fixes the race that when a socket is closed, it will should call the unbind, but if the family is removed at the same time the unbind will not find it, leading to the warning. The real problem though is that in this case the unbind could actually find a new family that is registered to have a multicast group with the same ID, and call its mcast_unbind() leading to confusing. Also remove the warning since it would still trigger, but is now no longer a problem. This also moves the code in af_netlink.c to before unreferencing the module to avoid having the same problem in the normal non-genl case. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-16 17:04:25 -05:00
Johannes Berg	5ad6300524	genetlink: disallow subscribing to unknown mcast groups Jeff Layton reported that he could trigger the multicast unbind warning in generic netlink using trinity. I originally thought it was a race condition between unregistering the generic netlink family and closing the socket, but there's a far simpler explanation: genetlink currently allows subscribing to groups that don't (yet) exist, and the warning is triggered when unsubscribing again while the group still doesn't exist. Originally, I had a warning in the subscribe case and accepted it out of userspace API concerns, but the warning was of course wrong and removed later. However, I now think that allowing userspace to subscribe to groups that don't exist is wrong and could possibly become a security problem: Consider a (new) genetlink family implementing a permission check in the mcast_bind() function similar to the like the audit code does today; it would be possible to bypass the permission check by guessing the ID and subscribing to the group it exists. This is only possible in case a family like that would be dynamically loaded, but it doesn't seem like a huge stretch, for example wireless may be loaded when you plug in a USB device. To avoid this reject such subscription attempts. If this ends up causing userspace issues we may need to add a workaround in af_netlink to deny such requests but not return an error. Reported-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-16 17:04:24 -05:00
Johannes Berg	5700712122	cfg80211: fix checking nl80211_send_station() return value The return value from nl80211_send_station() is the length of the skb, or a negative error, so abort sending the message only when the return value was negative. This fixes the ibss_rsn wpa_supplicant test case. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-16 21:05:52 +01:00
Johannes Berg	5e06a9e8b6	mac80211: remove doubled semicolon Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-16 13:27:56 +01:00
Rickard Strandqvist	0026b6551b	Bluetooth: Remove unused function Remove the function hci_conn_change_link_key() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-16 13:06:38 +02:00
Herbert Xu	919d9db952	netlink: Fix netlink_insert EADDRINUSE error The patch `c5adde9468` ("netlink: eliminate nl_sk_hash_lock") introduced a bug where the EADDRINUSE error has been replaced by ENOMEM. This patch rectifies that problem. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-16 02:38:07 -05:00
Eric Dumazet	ac64da0b83	net: rps: fix cpu unplug softnet_data.input_pkt_queue is protected by a spinlock that we must hold when transferring packets from victim queue to an active one. This is because other cpus could still be trying to enqueue packets into victim queue. A second problem is that when we transfert the NAPI poll_list from victim to current cpu, we absolutely need to special case the percpu backlog, because we do not want to add complex locking to protect process_queue : Only owner cpu is allowed to manipulate it, unless cpu is offline. Based on initial patch from Prasad Sodagudi & Subash Abhinov Kasiviswanathan. This version is better because we do not slow down packet processing, only make migration safer. Reported-by: Prasad Sodagudi <psodagud@codeaurora.org> Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-16 01:02:42 -05:00
Willem de Bruijn	f812116b17	ip: zero sockaddr returned on error queue The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That structure is defined and allocated on the stack as struct { struct sock_extended_err ee; struct sockaddr_in(6) offender; } errhdr; The second part is only initialized for certain SO_EE_ORIGIN values. Always initialize it completely. An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that would return uninitialized bytes. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Also verified that there is no padding between errhdr.ee and errhdr.offender that could leak additional kernel data. Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 19:41:16 -05:00
Nicolas Dichtel	12d872511c	bridge: use MDBA_SET_ENTRY_MAX for maxtype in nlmsg_parse() This is just a cleanup, because in the current code MDBA_SET_ENTRY_MAX == MDBA_SET_ENTRY. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 19:37:20 -05:00
David S. Miller	aaef66b837	Just two fixes - one for an uninialized variable and one for a deadlock in regulatory processing. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUt7Z6AAoJEDBSmw7B7bqrgxoQALEEWcIJ1wmu+M7ijdiXLiUM vRNuxGENwIgfdmoTs6R7pgKhEYFzePWccjHOzt9cQB5efdRzDjrxj2fDrPf4o5JB 7of2uHGoaD2RI2H+pJS1URT8igmxDJii+bOEzHn/WL730Hgr2J2iuJizxZ2lzsVM VKkiwOykV3kfN5MGsj7yvJQXR32DlGfmiT86+3bjNhE8hgU38NgE0TeUUnyF0AS9 jLV5mpJfkLmZyZmnvszV5tiqQQmQAdHImI+vbHuhzNUUAn6RLswxbWBzUrLXpXqu 5KBR2P/6TU4X89NcYGm+JhTI9PghsMbh1zDuqDQ9gq8j0mrV7Kzh1K6LdYoVpfXf s42gHe32+Mh0l6LRTlsjftMxJbFla7I6madPcVTqJCV2y1LocD1BseJ+qX5bngU1 lBSSbzE9MlAl5gyHVDh1CAV+8FM0CP8Ff3WtAyr8XtDxfAUwmo3xBqmL8pvLq6nh 49kDqDVOzC5KzASYIjqBwmRMcqW2AnaNQG64iIOzM3ure/l5trncPHHPsMkxgwu+ dDgEXwjWhJNaxWt7fcTSZndATLCRvkeb6ZeRoqmY6A2GJgzpUIhm6HETXc9BNGbg 3J56176xx04LYg6U5+vMiU5A+gFjlrUknQ3MGXF0KPgw0MvtSyempobV68Lpul4r 6DviuT9NiRqxloaBimyx =bMKg -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Just two fixes - one for an uninialized variable and one for a deadlock in regulatory processing. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 19:28:36 -05:00
David S. Miller	27f097177d	Here's a big pile of changes for this round. We have * a lot of regulatory code changes to deal with the way newer Intel devices handle this * a change to drop packets while disconnecting from an AP instead of trying to wait for them * a new attempt at improving the tailroom accounting to not kick in too much for performance reasons * improvements in wireless link statistics * many other small improvements and small fixes that didn't seem necessary for 3.19 (e.g. in hwsim which is testing only code) -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUt7WEAAoJEDBSmw7B7bqrVBoP/2EViE62HMmXdqG1SZWz8q9o Iigq8STC/sT2WCx1pYm+tKuVW4LD2O3mCriGNP8A3RwzDZ6H7sKJYb1gV6QCPV6f 4+yT5VSAB3D3lHmp/bbyNsmKCBQ5uS4LVgDrokrkbGpacDu94PYS5Wv9t3x6PBVB 5Xjky6g6A/pSuxTIstSO9k5xkzNjaB1TxvVRz/gJrGcFQVkDFSlVbuTHUVxs8p+p k6mwY/2WYijZkswWZVQTJLQlF9vRI7PYkKs5m8gz4pjNU48oFJoyu4IP3Z1Xj/Sm zgT1C9rgp0Du74HYO2niGAvLWgKajAZuW5hIacDndUPjYQQBLgGs/bCJGSntM+x9 XoOdPixdFPT/58ijyYZlmHc8rxPOd2kHsVbwGplp8f195S4VO04D+ejfOaoAUFwX v/kMvO3XIFmEH1jjkDAC3OTcRMYVMuENyWl7pFzxHIzPeRiEpQUd9iSdM4yol0F2 ZyWvKud4U75Sh+aCiDIIBETtdfCRFe12hgKs4COYbI/UYkGPTPrNei/uisopdubT JC+7pZOYdSgoX12yVi6ds6DmKE/ZpIQyhIK4wTWgVoszbnfdb9Mw7mJEThwNRjeK JJPsbuty7u8HWjXzEqHLoTV3BFv1cgRSJc5Wt0zfME+LzD7iQpEpv+QBAguwwChD Osn55Z3FnKEmBdGcOIje =vaEW -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Here's a big pile of changes for this round. We have * a lot of regulatory code changes to deal with the way newer Intel devices handle this * a change to drop packets while disconnecting from an AP instead of trying to wait for them * a new attempt at improving the tailroom accounting to not kick in too much for performance reasons * improvements in wireless link statistics * many other small improvements and small fixes that didn't seem necessary for 3.19 (e.g. in hwsim which is testing only code) Conflicts: drivers/staging/rtl8723au/os_dep/ioctl_cfg80211.c Minor overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 19:16:56 -05:00
Eric Dumazet	5055c371bf	ipv4: per cpu uncached list RAW sockets with hdrinc suffer from contention on rt_uncached_lock spinlock. One solution is to use percpu lists, since most routes are destroyed by the cpu that created them. It is unclear why we even have to put these routes in uncached_list, as all outgoing packets should be freed when a device is dismantled. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `caacf05e5a` ("ipv4: Properly purge netdev references on uncached routes.") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 18:26:16 -05:00
Johannes Berg	b51f3beecf	cfg80211: change bandwidth reporting to explicit field For some reason, we made the bandwidth separate flags, which is rather confusing - a single rate cannot have different bandwidths at the same time. Change this to no longer be flags but use a separate field for the bandwidth ('bw') instead. While at it, add support for 5 and 10 MHz rates - these are reported as regular legacy rates with their real bitrate, but tagged as 5/10 now to make it easier to distinguish them. In the nl80211 API, the flags are preserved, but the code now can also clearly only set a single one of the flags. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-15 22:41:32 +01:00
Chuck Lever	a97c331f9a	svcrdma: Handle additional inline content Most NFS RPCs place their large payload argument at the end of the RPC header (eg, NFSv3 WRITE). For NFSv3 WRITE and SYMLINK, RPC/RDMA sends the complete RPC header inline, and the payload argument in the read list. Data in the read list is the last part of the XDR stream. One important case is not like this, however. NFSv4 COMPOUND is a counted array of operations. A WRITE operation, with its large data payload, can appear in the middle of the compound's operations array. Thus NFSv4 WRITE compounds can have header content after the WRITE payload. The Linux client, for example, performs an NFSv4 WRITE like this: { PUTFH, WRITE, GETATTR } Though RFC 5667 is not precise about this, the proper way to convey this compound is to place the GETATTR inline, _after_ the front of the RPC header. The receiver inserts the read list payload into the XDR stream after the initial WRITE arguments, and before the GETATTR operation, thanks to the value of the read list "position" field. The Linux client currently sends the GETATTR at the end of the RPC/RDMA read list, which is incorrect. It will be corrected in the future. The Linux server currently rejects NFSv4 compounds with inline content after the read list. For the above NFSv4 WRITE compound, the NFS compound header indicates there are three operations, but the server finds nonsense when it looks in the XDR stream for the third operation, and the compound fails with OP_ILLEGAL. Move trailing inline content to the end of the XDR buffer's page list. This presents incoming NFSv4 WRITE compounds to NFSD in the same way the socket transport does. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:49 -05:00
Chuck Lever	fcbeced5b4	svcrdma: Move read list XDR round-up logic This is a pre-requisite for a subsequent patch. Read list XDR round-up needs to be done _before_ additional inline content is copied to the end of the XDR buffer's page list. Move the logic added by commit `e560e3b510` ("svcrdma: Add zero padding if the client doesn't send it"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:48 -05:00
Chuck Lever	0b056c224b	svcrdma: Support RDMA_NOMSG requests Currently the Linux server can not decode RDMA_NOMSG type requests. Operations whose length exceeds the fixed size of RDMA SEND buffers, like large NFSv4 CREATE(NF4LNK) operations, must be conveyed via RDMA_NOMSG. For an RDMA_MSG type request, the client sends the RPC/RDMA, RPC headers, and some or all of the NFS arguments via RDMA SEND. For an RDMA_NOMSG type request, the client sends just the RPC/RDMA header via RDMA SEND. The request's read list contains elements for the entire RPC message, including the RPC header. NFSD expects the RPC/RMDA header and RPC header to be contiguous in page zero of the XDR buffer. Add logic in the RDMA READ path to make the read list contents land where the server prefers, when the incoming message is a type RDMA_NOMSG message. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:47 -05:00
Chuck Lever	61edbcb7c7	svcrdma: rc_position sanity checking An RPC/RDMA client may send large RPC arguments via a read list. This is a list of scatter/gather elements which convey RPC call arguments too large to fit in a small RDMA SEND. Each entry in the read list has a "position" field, whose value is the byte offset in the XDR stream where the data in that entry is to be inserted. Entries which share the same "position" value make up the same RPC argument. The receiver inserts entries with the same position field value in list order into the XDR stream. Currently the Linux NFS/RDMA server cannot handle receiving read chunks in more than one position, mostly because no current client sends read lists with elements in more than one position. As a sanity check, ensure that all received chunks have the same "rc_position." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:47 -05:00
Chuck Lever	e54524111f	svcrdma: Plant reader function in struct svcxprt_rdma The RDMA reader function doesn't change once an svcxprt_rdma is instantiated. Instead of checking sc_devcap during every incoming RPC, set the reader function once when the connection is accepted. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:46 -05:00
Chuck Lever	e5523bd281	svcrdma: Find rmsgp more reliably xdr_start() can return the wrong rmsgp address if an assumption about how the xdr_buf was constructed changes. When it gets it wrong, the client receives a reply that has gibberish in the RPC/RDMA header, preventing it from matching a waiting RPC request. Instead, make (and document) just one assumption: that the RDMA header for the client's RPC call is at the start of the first page in rq_pages. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:45 -05:00
Chuck Lever	3fe04ee9f9	svcrdma: Scrub BUG_ON() and WARN_ON() call sites Current convention is to avoid using BUG_ON() in places where an oops could cause complete system failure. Replace BUG_ON() call sites in svcrdma with an assertion error message and allow execution to continue safely. Some BUG_ON() calls are removed because they have never fired in production (that we are aware of). Some WARN_ON() calls are also replaced where a back trace is not helpful; e.g., in a workqueue task. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:45 -05:00
Chuck Lever	2397aa8b51	svcrdma: Clean up read chunk counting The byte_count argument is not used, and the function is called only from one place. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:44 -05:00
Chuck Lever	83f2bedfc6	svcrdma: Remove unused variable Nit: remove an unused variable to squelch a compiler warning. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:43 -05:00
Chuck Lever	597561bf6a	svcrdma: Clean up dprintk Nit: Fix inconsistent white space in dprintk messages. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-15 15:01:43 -05:00
Marcel Holtmann	2b8df32395	Bluetooth: Add paranoid check for existing LE and BR/EDR SMP channels When the SMP channels have been already registered, then print out a clear WARN_ON message that something went wrong. Also unregister the existing channels in this case before trying to register new ones. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-15 21:59:38 +02:00
Nicolas Dichtel	7eb35b1483	socket: use iov_length() Better to use available helpers. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 13:56:15 -05:00
Johan Hedberg	327a71910c	Bluetooth: Fix lookup of fixed channels by local bdaddr The comparing of chan->src should always be done against the local identity address, represented by hcon->src and hcon->src_type. This patch modifies l2cap_global_fixed_chan() to take the full hci_conn so that we can easily compare against hcon->src and hcon->src_type. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-15 16:54:14 +01:00
Johan Hedberg	a250e048a7	Bluetooth: Add helpers for src/dst bdaddr type conversion The current bdaddr_type() usage in l2cap_core.c is a bit funny in that it's always passed a hci_conn + a hci_conn member. Because of this only the hci_conn is really needed. Since the second parameter is always either hcon->src_type or hcon->dst type this patch adds two helper functions for each purpose: bdaddr_src_type() and bdaddr_dst_type(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-15 16:54:14 +01:00
Johannes Berg	97d910d0aa	cfg80211: remove 80+80 MHz rate reporting These rates are treated the same as 160 MHz in the spec, so it makes no sense to distinguish them. As no driver uses them yet, this is also not a problem, just remove them. In the userspace API the field remains reserved to preserve API and ABI. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-15 16:05:21 +01:00
Johannes Berg	f89903d53f	mac80211: remove 80+80 MHz rate reporting These rates are treated the same as 160 MHz in the spec, so it makes no sense to distinguish them. As no driver uses them yet, this is also not a problem, just remove them. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-15 16:02:46 +01:00
Marcel Holtmann	162a3bac8d	Bluetooth: Bind the SMP channel registration to management power state When the controller gets powered on via the management interface, then register the supported SMP channels. There is no point in registering these channels earlier since it is not know what identity address the controller is going to operate with. When powering down a controller unregister all SMP channels. This is required since a powered down controller is allowed to change its identity address. In addition the SMP channels are only available when the controller is powered via the management interface. When using legacy ioctl, then Bluetooth Low Energy is not supported and registering kernel side SMP integration may actually cause confusion. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-15 12:54:31 +02:00
Marcel Holtmann	7e7ec44564	Bluetooth: Don't register any SMP channel if LE is not supported When LE features are not supported, then do not bother registering any kind of SMP channel. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-15 12:54:30 +02:00
Marcel Holtmann	157029ba30	Bluetooth: Fix LE SMP channel source address and source address type The source address and source address type of the LE SMP channel can either be the public address of the controller or the static random address configured by the host. Right now the public address is used for the LE SMP channel and obviously that is not correct if the controller operates with the configured static random address. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-15 12:54:30 +02:00
Marcel Holtmann	111e4bccd1	Bluetooth: Fix issue with switching BR/EDR back on when disabled For dual-mode controllers it is possible to disable BR/EDR and operate as LE single mode controllers with a static random address. If that is the case, then refuse switching BR/EDR back on after the controller has been powered. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-15 10:27:47 +02:00
Marcel Holtmann	eeb5a067d1	Bluetooth: Show device address type for L2CAP debugfs entries The devices address types are BR/EDR Public, LE Public and LE Random and any of these three is valid for L2CAP connections. So show the correct type in the debugfs list. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-15 10:23:47 +02:00
David S. Miller	4e7a84b1a5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== netfilter updates for net-next The following patchset contains netfilter updates for net-next, just a bunch of cleanups and small enhancement to selectively flush conntracks in ctnetlink, more specifically the patches are: 1) Rise default number of buckets in conntrack from 16384 to 65536 in systems with >= 4GBytes, patch from Marcelo Leitner. 2) Small refactor to save one level on indentation in xt_osf, from Joe Perches. 3) Remove unnecessary sizeof(char) in nf_log, from Fabian Frederick. 4) Another small cleanup to remove redundant variable in nfnetlink, from Duan Jiong. 5) Fix compilation warning in nfnetlink_cthelper on parisc, from Chen Gang. 6) Fix wrong format in debugging for ctseqadj, from Gao feng. 7) Selective conntrack flushing through the mark for ctnetlink, patch from Kristian Evensen. 8) Remove nf_ct_conntrack_flush_report() exported symbol now that is not required anymore after the selective flushing patch, again from Kristian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 01:50:25 -05:00
Thomas Graf	1dd144cf5b	openvswitch: Support VXLAN Group Policy extension Introduces support for the group policy extension to the VXLAN virtual port. The extension is disabled by default and only enabled if the user has provided the respective configuration. ovs-vsctl add-port br0 vxlan0 -- \ set Interface vxlan0 type=vxlan options:exts=gbp The configuration interface to enable the extension is based on a new attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION which can carry additional extensions as needed in the future. The group policy metadata is stored as binary blob (struct ovs_vxlan_opts) internally just like Geneve options but transported as nested Netlink attributes to user space. Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced. The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 01:11:41 -05:00
Thomas Graf	81bfe3c3cf	openvswitch: Allow for any level of nesting in flow attributes nlattr_set() is currently hardcoded to two levels of nesting. This change introduces struct ovs_len_tbl to define minimal length requirements plus next level nesting tables to traverse the key attributes to arbitrary depth. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 01:11:41 -05:00
Thomas Graf	d91641d9b5	openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS() Also factors out Geneve validation code into a new separate function validate_and_copy_geneve_opts(). A subsequent patch will introduce VXLAN options. Rename the existing GENEVE_TUN_OPTS() to reflect its extended purpose of carrying generic tunnel metadata options. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 01:11:41 -05:00
Thomas Graf	3511494ce2	vxlan: Group Policy extension Implements supports for the Group Policy VXLAN extension [0] to provide a lightweight and simple security label mechanism across network peers based on VXLAN. The security context and associated metadata is mapped to/from skb->mark. This allows further mapping to a SELinux context using SECMARK, to implement ACLs directly with nftables, iptables, OVS, tc, etc. The group membership is defined by the lower 16 bits of skb->mark, the upper 16 bits are used for flags. SELinux allows to manage label to secure local resources. However, distributed applications require ACLs to implemented across hosts. This is typically achieved by matching on L2-L4 fields to identify the original sending host and process on the receiver. On top of that, netlabel and specifically CIPSO [1] allow to map security contexts to universal labels. However, netlabel and CIPSO are relatively complex. This patch provides a lightweight alternative for overlay network environments with a trusted underlay. No additional control protocol is required. Host 1: Host 2: Group A Group B Group B Group A +-----+ +-------------+ +-------+ +-----+ \| lxc \| \| SELinux CTX \| \| httpd \| \| VM \| +--+--+ +--+----------+ +---+---+ +--+--+ \---+---/ \----+---/ \| \| +---+---+ +---+---+ \| vxlan \| \| vxlan \| +---+---+ +---+---+ +------------------------------+ Backwards compatibility: A VXLAN-GBP socket can receive standard VXLAN frames and will assign the default group 0x0000 to such frames. A Linux VXLAN socket will drop VXLAN-GBP frames. The extension is therefore disabled by default and needs to be specifically enabled: ip link add [...] type vxlan [...] gbp In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket must run on a separate port number. Examples: iptables: host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200 host2# iptables -I INPUT -m mark --mark 0x200 -j DROP OVS: # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL' # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop' [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy [1] http://lwn.net/Articles/204905/ Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 01:11:41 -05:00
David S. Miller	3f3558bb51	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/xen-netfront.c Minor overlapping changes in xen-netfront.c, mostly to do with some buffer management changes alongside the split of stats into TX and RX. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 00:53:17 -05:00
Linus Torvalds	a6391a924c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Don't use uninitialized data in IPVS, from Dan Carpenter. 2) conntrack race fixes from Pablo Neira Ayuso. 3) Fix TX hangs with i40e, from Jesse Brandeburg. 4) Fix budget return from poll calls in dnet and alx, from Eric Dumazet. 5) Fix bugus "if (unlikely(x) < 0)" test in AF_PACKET, from Christoph Jaeger. 6) Fix bug introduced by conversion to list_head in TIPC retransmit code, from Jon Paul Maloy. 7) Don't use GFP_NOIO under spinlock in USB kaweth driver, from Alexey Khoroshilov. 8) Fix bridge build with INET disabled, from Arnd Bergmann. 9) Fix netlink array overrun for PROBE attributes in openvswitch, from Thomas Graf. 10) Don't hold spinlock across synchronize_irq() in tg3 driver, from Prashant Sreedharan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) tg3: Release tp->lock before invoking synchronize_irq() tg3: tg3_reset_task() needs to use rtnl_lock to synchronize tg3: tg3_timer() should grab tp->lock before checking for tp->irq_sync team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin openvswitch: packet messages need their own probe attribtue i40e: adds FCoE configure option cxgb4vf: Fix queue allocation for 40G adapter netdevice: Add missing parentheses in macro bridge: only provide proxy ARP when CONFIG_INET is enabled neighbour: fix base_reachable_time(_ms) not effective immediatly when changed net: fec: fix MDIO bus assignement for dual fec SoC's xen-netfront: use different locks for Rx and Tx stats drivers: net: cpsw: fix multicast flush in dual emac mode cxgb4vf: Initialize mdio_addr before using it net: Corrected the comment describing the ndo operations to reflect the actual prototype for couple of operations usb/kaweth: use GFP_ATOMIC under spin_lock in usb_start_wait_urb() MAINTAINERS: add me as ibmveth maintainer tipc: fix bug in broadcast retransmit code update ip-sysctl.txt documentation (v2) net/at91_ether: prepare and unprepare clock ...	2015-01-15 11:17:37 +13:00
Thomas Graf	1ba398041f	openvswitch: packet messages need their own probe attribtue User space is currently sending a OVS_FLOW_ATTR_PROBE for both flow and packet messages. This leads to an out-of-bounds access in ovs_packet_cmd_execute() because OVS_FLOW_ATTR_PROBE > OVS_PACKET_ATTR_MAX. Introduce a new OVS_PACKET_ATTR_PROBE with the same numeric value as OVS_FLOW_ATTR_PROBE to grow the range of accepted packet attributes while maintaining to be binary compatible with existing OVS binaries. Fixes: `05da589` ("openvswitch: Add support for OVS_FLOW_ATTR_PROBE.") Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Tracked-down-by: Florian Westphal <fw@strlen.de> Signed-off-by: Thomas Graf <tgraf@suug.ch> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-14 16:49:44 -05:00
Jukka Rissanen	7b2ed60ed4	Bluetooth: 6lowpan: Remove PSM setting code Removing PSM setting debugfs interface as the IPSP has a well defined PSM value that should be used. The patch introduces enable flag that can be used to toggle 6lowpan on/off. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-14 22:48:13 +01:00
Johan Hedberg	e12af489b9	Bluetooth: Fix valid Identity Address check According to the Bluetooth core specification valid identity addresses are either Public Device Addresses or Static Random Addresses. IRKs received with any other type of address should be discarded since we cannot assume to know the permanent identity of the peer device. This patch fixes a missing check for the Identity Address when receiving the Identity Address Information SMP PDU. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.17+	2015-01-14 22:48:06 +01:00
zhuyj	9a6b4b392d	ipv6:icmp:remove unnecessary brackets There are too many brackets. Maybe only one bracket is enough. Signed-off-by: Zhu Yanjun <Yanjun.Zhu@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-14 16:35:49 -05:00
Fan Du	3f4c1d87af	openvswitch: Introduce ovs_tunnel_route_lookup Introduce ovs_tunnel_route_lookup to consolidate route lookup shared by vxlan, gre, and geneve ports. Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-14 16:32:06 -05:00
Tom Herbert	a2b12f3c7a	udp: pass udp_offload struct to UDP gro callbacks This patch introduces udp_offload_callbacks which has the same GRO functions (but not a GSO function) as offload_callbacks, except there is an argument to a udp_offload struct passed to gro_receive and gro_complete functions. This additional argument can be used to retrieve the per port structure of the encapsulation for use in gro processing (mostly by doing container_of on the structure). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-14 15:20:04 -05:00
Arnd Bergmann	d92cfdbbea	bridge: only provide proxy ARP when CONFIG_INET is enabled When IPV4 support is disabled, we cannot call arp_send from the bridge code, which would result in a kernel link error: net/built-in.o: In function `br_handle_frame_finish': :(.text+0x59914): undefined reference to `arp_send' :(.text+0x59a50): undefined reference to `arp_tbl' This makes the newly added proxy ARP support in the bridge code depend on the CONFIG_INET symbol and lets the compiler optimize the code out to avoid the link error. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `958501163d` ("bridge: Add support for IEEE 802.11 Proxy ARP") Cc: Kyeyoon Park <kyeyoonp@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-14 15:08:02 -05:00
Gowtham Anandha Babu	36c269cecf	Bluetooth: Remove dead code Variable 'controller' is assigned a value that is never used. Identified by cppcheck tool. Signed-off-by: Gowtham Anandha Babu <gowtham.ab@samsung.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-14 11:16:17 +02:00
Luciano Coelho	75453ccb61	nl80211: send netdetect configuration info in NL80211_CMD_GET_WOWLAN Send the netdetect configuration information in the response to NL8021_CMD_GET_WOWLAN commands. This includes the scan interval, SSIDs to match and frequencies to scan. Additionally, add the NL80211_WOWLAN_TRIG_NET_DETECT with NL80211_ATTR_WOWLAN_TRIGGERS_SUPPORTED. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:45:17 +01:00
Arik Nemtsov	ef51fb1d1c	cfg80211: avoid reg-hints in self-managed only systems When a system contains only self-managed regulatory devices all hints from the regulatory core are ignored. Stop hint processing early in this case. These systems usually don't have CRDA deployed, which results in endless (irrelevent) logs of the form: cfg80211: Calling CRDA to update world regulatory domain Make sure there's at least one self-managed device before discarding a hint, in order to prevent initial hints from disappearing on CRDA managed systems. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:43:44 +01:00
Arik Nemtsov	2c3e861c94	cfg80211: introduce sync regdom set API for self-managed A self-managed device will sometimes need to set its regdomain synchronously. Notably it should be set before usermode has a chance to query it. Expose a new API to accomplish this which requires the RTNL. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:43:44 +01:00
Eliad Peller	2726f23d2d	mac80211: don't defer scans in case of radar detection Radar detection can last indefinite time. There is no point in deferring a scan request in this case - simply return -EBUSY. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:37:07 +01:00
Eliad Peller	e7f2337ae7	mac80211: consider only relevant vifs for radar_required calculation ctx->conf.radar_enabled should reflect whether radar detection is enabled for the channel context. When calculating it, make it consider only the vifs that have this context assigned (instead of all the vifs). Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:37:06 +01:00
Eliad Peller	5cbc95a749	mac80211: remove local->radar_detect_enabled local->radar_detect_enabled should tell whether radar_detect is enabled on any interface belonging to local. However, it's not getting updated correctly in many cases (actually, when testing with hwsim it's never been set, even when the dfs master is beaconing). Instead of handling all the corner cases (e.g. channel switch), simply check whether radar detection is enabled only when needed, instead of caching the result. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:37:06 +01:00
Arik Nemtsov	50075892ba	mac80211: add TDLS supported channels correctly The function adding the supported channels IE during a TDLS connection had several issues: 1. If the entire subband is usable, the function exitted the loop without adding it 2. The function only checked chandef_usable, ignoring flags like RADAR which would prevent TDLS off-channel communcation. 3. HT20 was explicitly required in the chandef, while not a requirement for TDLS off-channel. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:34:33 +01:00
Emmanuel Grumbach	3b24f4c653	mac80211: let flush() drop packets when possible When roaming / suspending, it makes no sense to wait until the transmit queues of the device are empty. In extreme condition they can be starved (VO saturating the air), but even in regular cases, it is pointless to delay the roaming because the low level driver is trying to send packets to an AP which is far away. We'd rather drop these packets and let TCP retransmit if needed. This will allow to speed up the roaming. For suspend, the explanation is even more trivial. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-14 09:31:18 +01:00
Marcel Holtmann	5ced24644b	Bluetooth: Use %llu for printing duration details of selftests The duration variable for the selftests is unsigned long long and with that use %llu instead of %lld when printing the results. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-14 10:02:45 +02:00
Marcel Holtmann	36f260ceff	Bluetooth: Move Delete Stored Link Key to 4th phase of initialization This moves the execution of Delete Stored Link Key command to the hci_init4_req phase. No actual code has been changed. The command is just executed at a later stage of the initialization. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-14 10:02:21 +02:00
Jean-Francois Remy	4bf6980dd0	neighbour: fix base_reachable_time(_ms) not effective immediatly when changed When setting base_reachable_time or base_reachable_time_ms on a specific interface through sysctl or netlink, the reachable_time value is not updated. This means that neighbour entries will continue to be updated using the old value until it is recomputed in neigh_period_work (which recomputes the value every 300*HZ). On systems with HZ equal to 1000 for instance, it means 5mins before the change is effective. This patch changes this behavior by recomputing reachable_time after each set on base_reachable_time or base_reachable_time_ms. The new value will become effective the next time the neighbour's timer is triggered. Changes are made in two places: the netlink code for set and the sysctl handling code. For sysctl, I use a proc_handler. The ipv6 network code does provide its own handler but it already refreshes reachable_time correctly so it's not an issue. Any other user of neighbour which provide its own handlers must refresh reachable_time. Signed-off-by: Jean-Francois Remy <jeff@melix.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-14 00:28:00 -05:00
Jiri Pirko	df8a39defa	net: rename vlan_tx_* helpers since "tx" is misleading there The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:51:08 -05:00
Jiri Pirko	d8b9605d26	net: sched: fix skb->protocol use in case of accelerated vlan path tc code implicitly considers skb->protocol even in case of accelerated vlan paths and expects vlan protocol type here. However, on rx path, if the vlan header was already stripped, skb->protocol contains value of next header. Similar situation is on tx path. So for skbs that use skb->vlan_tci for tagging, use skb->vlan_proto instead. Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:51:08 -05:00
Sasha Levin	357c4774b5	tipc: correctly handle releasing a not fully initialized sock Commit `f2f9800d49` "tipc: make tipc node table aware of net namespace" has added a dereference of sock->sk before making sure it's not NULL, which makes releasing a tipc socket NULL pointer dereference for sockets that are not fully initialized. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:26:27 -05:00
Ying Xue	3721e9c7c1	tipc: remove redundant timer defined in tipc_sock struct Remove the redundant timer defined in tipc_sock structure, instead we can directly reuse the sk_timer defined in sock structure. Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 16:45:55 -05:00
Roopa Prabhu	0fe6de4903	bridge: fix uninitialized variable warning net/bridge/br_netlink.c: In function ‘br_fill_ifinfo’: net/bridge/br_netlink.c:146:32: warning: ‘vid_range_flags’ may be used uninitialized in this function [-Wmaybe-uninitialized] err = br_fill_ifvlaninfo_range(skb, vid_range_start, ^ net/bridge/br_netlink.c:108:6: note: ‘vid_range_flags’ was declared here u16 vid_range_flags; Reported-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 16:39:36 -05:00
Syam Sidhardhan	a440edf1fc	openvswitch: Remove unnecessary version.h inclusion version.h inclusion is not necessary as detected by versioncheck. Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 14:31:41 -05:00
Sébastien Barré	08abdffa1c	tcp: avoid reducing cwnd when ACK+DSACK is received With TLP, the peer may reply to a probe with an ACK+D-SACK, with ack value set to tlp_high_seq. In the current code, such ACK+DSACK will be missed and only at next, higher ack will the TLP episode be considered done. Since the DSACK is not present anymore, this will cost a cwnd reduction. This patch ensures that this scenario does not cause a cwnd reduction, since receiving an ACK+DSACK indicates that both the initial segment and the probe have been received by the peer. The following packetdrill test, from Neal Cardwell, validates this patch: // Establish a connection. 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.020 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 // Send 1 packet. +0 write(4, ..., 1000) = 1000 +0 > P. 1:1001(1000) ack 1 // Loss probe retransmission. // packets_out == 1 => schedule PTO in max(2RTT, 1.5RTT + 200ms) // In this case, this means: 1.5*RTT + 200ms = 230ms +.230 > P. 1:1001(1000) ack 1 +0 %{ assert tcpi_snd_cwnd == 10 }% // Receiver ACKs at tlp_high_seq with a DSACK, // indicating they received the original packet and probe. +.020 < . 1:1(0) ack 1001 win 257 <sack 1:1001,nop,nop> +0 %{ assert tcpi_snd_cwnd == 10 }% // Send another packet. +0 write(4, ..., 1000) = 1000 +0 > P. 1001:2001(1000) ack 1 // Receiver ACKs above tlp_high_seq, which should end the TLP episode // if we haven't already. We should not reduce cwnd. +.020 < . 1:1(0) ack 2001 win 257 +0 %{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }% Credits: -Gregory helped in finding that tcp_process_tlp_ack was where the cwnd got reduced in our MPTCP tests. -Neal wrote the packetdrill test above -Yuchung reworked the patch to make it more readable. Cc: Gregory Detal <gregory.detal@uclouvain.be> Cc: Nandita Dukkipati <nanditad@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 14:22:02 -05:00
Ying Xue	c5adde9468	netlink: eliminate nl_sk_hash_lock As rhashtable_lookup_compare_insert() can guarantee the process of search and insertion is atomic, it's safe to eliminate the nl_sk_hash_lock. After this, object insertion or removal will be protected with per bucket lock on write side while object lookup is guarded with rcu read lock on read side. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 14:01:00 -05:00
Pankaj Gupta	1059590254	net: allow large number of rx queues netif_alloc_rx_queues() uses kcalloc() to allocate memory for "struct netdev_queue *_rx" array. If we are doing large rx queue allocation kcalloc() might fail, so this patch does a fallback to vzalloc(). Similar implementation is done for tx queue allocation in netif_alloc_netdev_queues(). We avoid failure of high order memory allocation with the help of vzalloc(), this allows us to do large rx and tx queue allocation which in turn helps us to increase the number of queues in tun. As vmalloc() adds overhead on a critical network path, __GFP_REPEAT flag is used with kzalloc() to do this fallback only when really needed. Signed-off-by: Pankaj Gupta <pagupta@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Gibson <dgibson@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 17:05:05 -05:00
Rickard Strandqvist	ddcde70cbf	net: sched: sch_teql: Remove unused function Remove the function teql_neigh_release() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:50:46 -05:00
Rickard Strandqvist	83400b990c	net: xfrm: xfrm_algo: Remove unused function Remove the function aead_entries() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:50:46 -05:00
Roopa Prabhu	36cd0ffbab	bridge: new function to pack vlans into ranges during gets This patch adds new function to pack vlans into ranges whereever applicable using the flags BRIDGE_VLAN_INFO_RANGE_BEGIN and BRIDGE VLAN_INFO_RANGE_END Old vlan packing code is moved to a new function and continues to be called when filter_mask is RTEXT_FILTER_BRVLAN. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:47:05 -05:00
Roopa Prabhu	bdced7ef78	bridge: support for multiple vlans and vlan ranges in setlink and dellink requests This patch changes bridge IFLA_AF_SPEC netlink attribute parser to look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows userspace to pack more than one vlan in the setlink msg. The dumps were already sending more than one vlan info in the getlink msg. This patch also adds bridge_vlan_info flags BRIDGE_VLAN_INFO_RANGE_BEGIN and BRIDGE_VLAN_INFO_RANGE_END to indicate start and end of vlan range This patch also deletes unused ifla_br_policy. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:47:04 -05:00
Ying Xue	d49e204161	tipc: make netlink support net namespace Currently tipc module only allows users sitting on "init_net" namespace to configure it through netlink interface. But now almost each tipc component is able to be aware of net namespace, so it's time to open the permission for users residing in other namespaces, allowing them to configure their own tipc stack instance through netlink interface. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:34 -05:00
Ying Xue	bafa29e341	tipc: make tipc random value aware of net namespace After namespace is supported, each namespace should own its private random value. So the global variable representing the random value must be moved to tipc_net structure. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:33 -05:00
Ying Xue	a62fbccecd	tipc: make subscriber server support net namespace TIPC establishes one subscriber server which allows users to subscribe their interesting name service status. After tipc supports namespace, one dedicated tipc stack instance is created for each namespace, and each instance can be deemed as one independent TIPC node. As a result, subscriber server must be built for each namespace. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:33 -05:00
Ying Xue	3474753954	tipc: make tipc node address support net namespace If net namespace is supported in tipc, each namespace will be treated as a separate tipc node. Therefore, every namespace must own its private tipc node address. This means the "tipc_own_addr" global variable of node address must be moved to tipc_net structure to satisfy the requirement. It's turned out that users also can assign node address for every namespace. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:33 -05:00
Ying Xue	4ac1c8d0ee	tipc: name tipc name table support net namespace TIPC name table is used to store the mapping relationship between TIPC service name and socket port ID. When tipc supports namespace, it allows users to publish service names only owned by a certain namespace. Therefore, every namespace must have its private name table to prevent service names published to one namespace from being contaminated by other service names in another namespace. Therefore, The name table global variable (ie, nametbl) and its lock must be moved to tipc_net structure, and a parameter of namespace must be added for necessary functions so that they can obtain name table variable defined in tipc_net structure. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:33 -05:00
Ying Xue	e05b31f4bf	tipc: make tipc socket support net namespace Now tipc socket table is statically allocated as a global variable. Through it, we can look up one socket instance with port ID, insert a new socket instance to the table, and delete a socket from the table. But when tipc supports net namespace, each namespace must own its specific socket table. So the global variable of socket table must be redefined in tipc_net structure. As a concequence, a new socket table will be allocated when a new namespace is created, and a socket table will be deallocated when namespace is destroyed. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:33 -05:00
Ying Xue	1da465683a	tipc: make tipc broadcast link support net namespace TIPC broadcast link is statically established and its relevant states are maintained with the global variables: "bcbearer", "bclink" and "bcl". Allowing different namespace to own different broadcast link instances, these variables must be moved to tipc_net structure and broadcast link instances would be allocated and initialized when namespace is created. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:33 -05:00
Ying Xue	7f9f95d9d9	tipc: make bearer list support net namespace Bearer list defined as a global variable is used to store bearer instances. When tipc supports net namespace, bearers created in one namespace must be isolated with others allocated in other namespaces, which requires us that the bearer list(bearer_list) must be moved to tipc_net structure. As a result, a net namespace pointer has to be passed to functions which access the bearer list. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:32 -05:00
Ying Xue	f2f9800d49	tipc: make tipc node table aware of net namespace Global variables associated with node table are below: - node table list (node_htable) - node hash table list (tipc_node_list) - node table lock (node_list_lock) - node number counter (tipc_num_nodes) - node link number counter (tipc_num_links) To make node table support namespace, above global variables must be moved to tipc_net structure in order to keep secret for different namespaces. As a consequence, these variables are allocated and initialized when namespace is created, and deallocated when namespace is destroyed. After the change, functions associated with these variables have to utilize a namespace pointer to access them. So adding namespace pointer as a parameter of these functions is the major change made in the commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:32 -05:00
Ying Xue	c93d3baa24	tipc: involve namespace infrastructure Involve namespace infrastructure, make the "tipc_net_id" global variable aware of per namespace, and rename it to "net_id". In order that the conversion can be successfully done, an instance of networking namespace must be passed to relevant functions, allowing them to access the "net_id" variable of per namespace. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:32 -05:00
Ying Xue	54fef04ad0	tipc: remove unused tipc_link_get_max_pkt routine Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:32 -05:00
Ying Xue	f2f2a96a20	tipc: feed tipc sock pointer to tipc_sk_timeout routine In order to make tipc socket table aware of namespace, a networking namespace instance must be passed to tipc_sk_lookup(), allowing it to look up tipc socket instance with a given port ID from a concrete socket table. However, as now tipc_sk_timeout() only has one port ID parameter and is not namespace aware, it's unable to obtain a correct socket instance through tipc_sk_lookup() just with a port ID, especially after namespace is completely supported. If port ID is replaced with socket instance as tipc_sk_timeout()'s parameter, it's unnecessary to look up socket table. But as the timer handler - tipc_sk_timeout() is run asynchronously, socket reference must be held before its timer is launched, and must be carefully checked to identify whether the socket reference needs to be put or not when its timer is terminated. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:32 -05:00
Ying Xue	859fc7c0ce	tipc: cleanup core.c and core.h files Only the works of initializing and shutting down tipc module are done in core.h and core.c files, so all stuffs which are not closely associated with the two tasks should be moved to appropriate places. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:31 -05:00
Ying Xue	2f55c43788	tipc: remove unnecessary wrapper functions of kernel timer APIs Not only some wrapper function like k_term_timer() is empty, but also some others including k_start_timer() and k_cancel_timer() don't return back any value to its caller, what's more, there is no any component in the kernel world to do such thing. Therefore, these timer interfaces defined in tipc module should be purged. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:31 -05:00
Ying Xue	6b8326ed14	tipc: remove tipc_core_start/stop routines Remove redundant wrapper functions like tipc_core_start() and tipc_core_stop(), and directly move them to their callers, such as tipc_init() and tipc_exit(), having us clearly know what are really done in both initialization and deinitialzation functions. Signed-off-by: Ying Xue <ying.xue@windriver.com> Tested-by: Tero Aho <Tero.Aho@coriant.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:31 -05:00
Jon Maloy	703068eee6	tipc: fix bug in broadcast retransmit code In commit `58dc55f256` ("tipc: use generic SKB list APIs to manage link transmission queue") we replace all list traversal loops with the macros skb_queue_walk() or skb_queue_walk_safe(). While the previous loops were based on the assumption that the list was NULL-terminated, the standard macros stop when the iterator reaches the list head, which is non-NULL. In the function bclink_retransmit_pkt() this macro replacement has lead to a bug. When we receive a BCAST STATE_MSG we unconditionally call the function bclink_retransmit_pkt(), whether there really is anything to retransmit or not, assuming that the sequence number comparisons will lead to the correct behavior. However, if the transmission queue is empty, or if there are no eligible buffers in the transmission queue, we will by mistake pass the list head pointer to the function tipc_link_retransmit(). Since the list head is not a valid sk_buff, this leads to a crash. In this commit we fix this by only calling tipc_link_retransmit() if we actually found eligible buffers in the transmission queue. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:24:31 -05:00
Toshiaki Makita	f902e8812e	bridge: Add ability to enable TSO Currently a bridge device turns off TSO feature if no bridge ports support it. We can always enable it, since packets can be segmented on ports by software as well as on the bridge device. This will reduce the number of packets processed in the bridge. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:18:09 -05:00
Jon Paul Maloy	164167794c	tipc: fix bug in broadcast retransmit code In commit `58dc55f256` ("tipc: use generic SKB list APIs to manage link transmission queue") we replace all list traversal loops with the macros skb_queue_walk() or skb_queue_walk_safe(). While the previous loops were based on the assumption that the list was NULL-terminated, the standard macros stop when the iterator reaches the list head, which is non-NULL. In the function bclink_retransmit_pkt() this macro replacement has lead to a bug. When we receive a BCAST STATE_MSG we unconditionally call the function bclink_retransmit_pkt(), whether there really is anything to retransmit or not, assuming that the sequence number comparisons will lead to the correct behavior. However, if the transmission queue is empty, or if there are no eligible buffers in the transmission queue, we will by mistake pass the list head pointer to the function tipc_link_retransmit(). Since the list head is not a valid sk_buff, this leads to a crash. In this commit we fix this by only calling tipc_link_retransmit() if we actually found eligible buffers in the transmission queue. Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:01:59 -05:00
Willem de Bruijn	eee2f04b80	packet: make packet too small warning match condition The expression in ll_header_truncated() tests less than or equal, but the warning prints less than. Update the warning. Reported-by: Jouni Malinen <jkmalinen@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 16:00:55 -05:00
Marcel Holtmann	a936612036	Bluetooth: Process result of HCI Delete Stored Link Key command When the HCI Delete Stored Link Key command completes, then update the value of current stored keys in hci_dev structure. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 21:56:06 +02:00
Marcel Holtmann	48ce62c4fa	Bluetooth: Read stored link key information when powering on controller The information about max stored link keys and current stored link keys should be read at controller initialization. So issue HCI Read Stored Link Key command with BDADDR_ANY and read_all flag set to 0x01 to retrieve this information. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 21:54:48 +02:00
Marcel Holtmann	c2f0f97927	Bluetooth: Handle command complete event for HCI Read Stored Link Keys When the HCI Read Stored Link Keys command completes it gives useful information of the current stored keys and maximum keys a controller can actually store. So process this event and store these information in hci_dev structure. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 21:54:16 +02:00
Marcel Holtmann	41e91e71f6	Bluetooth: Replace send_monitor_event with queue_monitor_skb The send_monitor_event function is essentially the same as the newly introduced queue_monitor_skb. So instead of having duplicated code, replace send_monitor_event with queue_monitor_skb. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:26:09 +02:00
Marcel Holtmann	d7f72f6195	Bluetooth: Create generic queue_monitor_skb helper function The hci_send_to_monitor function contains generic code for queueing the packet into the receive queue of every monitor client. To avoid code duplication, create a generic queue_monitor_skb function to interate over all monitor sockets. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:26:07 +02:00
Marcel Holtmann	2b531294b0	Bluetooth: Simplify packet copy in hci_send_to_monitor function Within the monitor functionality, the global atomic variable called monitor_promisc ensures that no memory allocation happend when there is actually no client listening. This means it is safe to just create a copy of the skb since it is guaranteed that at least one client exists. No extra checks needed. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:26:04 +02:00
Marcel Holtmann	15762fa772	Bluetooth: Add BUILD_BUG_ON for size of struct sockaddr_sco This adds an extra check for ensuring that the size of sockaddr_sco does not grow larger than sockaddr. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:24:24 +02:00
Marcel Holtmann	74b3fb8d0d	Bluetooth: Add BUILD_BUG_ON for size of struct sockaddr_rc This adds an extra check for ensuring that the size of sockaddr_rc does not grow larger than sockaddr. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:24:21 +02:00
Marcel Holtmann	dd6255588a	Bluetooth: Add BUILD_BUG_ON for size of struct sockaddr_l2 This adds an extra check for ensuring that the size of sockaddr_l2 does not grow larger than sockaddr. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:24:19 +02:00
Marcel Holtmann	b0a8e282b5	Bluetooth: Add BUILD_BUG_ON for size of struct sockaddr_hci This adds an extra check for ensuring that the size of sockaddr_hci does not grow larger than sockaddr. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:24:16 +02:00
Marcel Holtmann	1904a853fa	Bluetooth: Add opcode parameter to hci_req_complete_t callback When hci_req_run() calls its provided complete function and one of the HCI commands in the sequence fails, then provide the opcode of failing command. In case of success HCI_OP_NOP is provided since all commands completed. This patch fixes the prototype of hci_req_complete_t and all its users. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-12 11:16:31 +02:00
David S. Miller	2bd8221804	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== netfilter/ipvs fixes for net The following patchset contains netfilter/ipvs fixes, they are: 1) Small fix for the FTP helper in IPVS, a diff variable may be left unset when CONFIG_IP_VS_IPV6 is set. Patch from Dan Carpenter. 2) Fix nf_tables port NAT in little endian archs, patch from leroy christophe. 3) Fix race condition between conntrack confirmation and flush from userspace. This is the second reincarnation to resolve this problem. 4) Make sure inner messages in the batch come with the nfnetlink header. 5) Relax strict check from nfnetlink_bind() that may break old userspace applications using all 1s group mask. 6) Schedule removal of chains once no sets and rules refer to them in the new nf_tables ruleset flush command. Reported by Asbjoern Sloth Toennesen. Note that this batch comes later than usual because of the short winter holidays. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-12 00:14:49 -05:00
Christoph Jaeger	46d2cfb192	packet: bail out of packet_snd() if L2 header creation fails Due to a misplaced parenthesis, the expression (unlikely(offset) < 0), which expands to (__builtin_expect(!!(offset), 0) < 0), never evaluates to true. Therefore, when sending packets with PF_PACKET/SOCK_DGRAM, packet_snd() does not abort as intended if the creation of the layer 2 header fails. Spotted by Coverity - CID 1259975 ("Operands don't affect result"). Fixes: `9c7077622d` ("packet: make packet_snd fail on len smaller than l2 header") Signed-off-by: Christoph Jaeger <cj@linux.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-11 21:54:03 -05:00
Linus Torvalds	dc9319f5a3	Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux Pull two nfsd bugfixes from Bruce Fields. * 'for-3.19' of git://linux-nfs.org/~bfields/linux: rpc: fix xdr_truncate_encode to handle buffer ending on page boundary nfsd: fix fi_delegees leak when fi_had_conflict returns true	2015-01-09 18:10:48 -08:00
Linus Torvalds	20ebb34528	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull two Ceph fixes from Sage Weil: "These are both pretty trivial: a sparse warning fix and size_t printk thing" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: fix sparse endianness warnings ceph: use %zu for len in ceph_fill_inline_data()	2015-01-09 17:55:00 -08:00
Johannes Berg	9b7a86f351	mac80211: fix handling TIM IE when stations disconnect When a station disconnects with frames still pending, we clear the TIM bit, but too late - it's only cleared when the station is already removed from the driver, and thus the driver can get confused (and hwsim will loudly complain.) Fix this by clearing the TIM bit earlier, when the station has been unlinked but not removed from the driver yet. To do this, refactor the TIM recalculation to in that case ignore traffic and simply assume no pending traffic - this is correct for the disconnected station even though the frames haven't been freed yet at that point. This patch isn't needed for current drivers though as they don't check the station argument to the set_tim() operation and thus don't really run into the possible confusion. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-09 11:48:37 +01:00
David S. Miller	fb57720daf	Included changes: - remove useless return in void functions - remove unused member 'primary_iface' from 'struct orig_node' - improve existing kernel doc - fix several checkpatch complaints - ensure socket's control block is cleared for received skbs - add missing DEBUG_FS dependency to BATMAN_ADV_DEBUG symbol -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJUraKDAAoJEJgn97Bh2u9eJsQQAI0HHsTbv1yWi0LyQO25ZlJG H0sR6Mj21ol36jvlSa4OYVsEn9QdoJ5RO+KDwkxvFPQQU0VyN0TZHEKzcPzFC98P r7BboF7JGI7P/ixGWhhdjoH/ECNJOoS0MioEX2YjGpGYIdDzl2BH7HHPV8WEEJTW A2l11owAv+Sv6PkYrx8OOqNCtSaO5ogX9BxZkFZwFZDA8VeFZO9eYIDStSnHKt5x /gT8EQd7bsVZc4IRZaPH8Sc18hkpcEZoDgMQ2Wwk4pw+5g+KgoLTzfAU6xXDnX34 GqVqWNlrcdOQg5337mBH+xTszq6bvVSlYvcN+q8QZfRS/bkLEEwigcBh+H5H//Gf +MueLW8JFKKIH5sP2wbeTgdj4l7JYs5TlZsn7O+jroXrblT/SUSji8TJkkCiSfBP MjR9WTOrqZTzxRN/KoT+DgmQ1t2ZhKN9WVBOukKaBpMh9lxOPZD/pxJIJJLmFyWh VWcszxerll3ilFKgWR7YRz6h6tB3xs6DUMgl3PEFzjTa0xhZPT4NzOGIfuBTDLPc vg7HAm+DCIwCDfYn0N5/HLv0cU14CMWSy0SCJBIzvo+0fHppLgxBTeJ24lSywfqN 89soGNmdGADTPkXnt4kX4U/XPRHTC3stzdMYiWoCmkoo5zQkiGmVLQ7TWG2Soqez QQ+ookKhShBLDZ2FQJYk =2Y3v -----END PGP SIGNATURE----- Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - remove useless return in void functions - remove unused member 'primary_iface' from 'struct orig_node' - improve existing kernel doc - fix several checkpatch complaints - ensure socket's control block is cleared for received skbs - add missing DEBUG_FS dependency to BATMAN_ADV_DEBUG symbol Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-08 20:14:32 -08:00
Ying Xue	07f6c4bc04	tipc: convert tipc reference table to use generic rhashtable As tipc reference table is statically allocated, its memory size requested on stack initialization stage is quite big even if the maximum port number is just restricted to 8191 currently, however, the number already becomes insufficient in practice. But if the maximum ports is allowed to its theory value - 2^32, its consumed memory size will reach a ridiculously unacceptable value. Apart from this, heavy tipc users spend a considerable amount of time in tipc_sk_get() due to the read-lock on ref_table_lock. If tipc reference table is converted with generic rhashtable, above mentioned both disadvantages would be resolved respectively: making use of the new resizable hash table can avoid locking on the lookup; smaller memory size is required at initial stage, for example, 256 hash bucket slots are requested at the beginning phase instead of allocating the entire 8191 slots in old mode. The hash table will grow if entries exceeds 75% of table size up to a total table size of 1M, and it will automatically shrink if usage falls below 30%, but the minimum table size is allowed down to 256. Also converts ref_table_lock to a separate mutex to protect hash table mutations on write side. Lastly defers the release of the socket reference using call_rcu() to allow using an RCU read-side protected call to rhashtable_lookup(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Erik Hugne <erik.hugne@ericsson.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-08 19:47:14 -08:00
Ilya Dryomov	d7d5a007b1	libceph: fix sparse endianness warnings The only real issue is the one in auth_x.c and it came with 3.19-rc1 merge. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>	2015-01-08 20:36:57 +03:00
Alexander Aring	bc6efeeeb5	ieee802154: 6lowpan: fix Makefile entry Since commit `ea81ac2e70` ("ieee802154: create 6lowpan sub-directory") we have a subdirectory for the ieee802154 6lowpan implementation. This commit also moves the Kconfig entry inside of net/ieee802154/6lowpan/ and forgot to rename the Makefile entry from obj-$(CONFIG_IEEE802154_6LOWPAN) to obj-y and handle the obj-$(CONFIG_IEEE802154_6LOWPAN) inside the created 6lowpan directory. This will occur that the ieee802154_6lowpan can't be build. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-08 15:48:06 +01:00
Johannes Berg	79c892b850	mac80211: provide per-TID RX/TX MSDU counters Implement the new counters cfg80211 can now advertise to userspace. The TX code is in the sequence number handler, which is a bit odd, but that place already knows the TID and frame type, so it was easiest and least impact there. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:28:20 +01:00
Johannes Berg	6de39808cf	nl80211: support per-TID station statistics The base for the current statistics is pretty mixed up, support exporting RX/TX statistics for MSDUs per TID. This (currently) covers received MSDUs, transmitted MSDUs and retries/failures thereof. Doing it per TID for MSDUs makes more sense than say only per AC because it's symmetric - we could export per-AC statistics for all frames (which AC we used for transmission can be determined also for management frames) but per TID is better and usually data frames are really the ones we care about. Also, on RX we can't determine the AC - but we do know the TID for any QoS MPDU we received. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:28:18 +01:00
Johannes Berg	a76b1942a1	cfg80211: add nl80211 beacon-only statistics Add these two values: * BEACON_RX: number of beacons received from this peer * BEACON_SIGNAL_AVG: signal strength average for beacons only These can then be used for Android Lollipop's statistics request. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:28:13 +01:00
Johannes Berg	319090bf6c	cfg80211: remove enum station_info_flags This is really just duplicating the list of information that's already available in the nl80211 attribute, so remove the list. Two small changes are needed: * remove STATION_INFO_ASSOC_REQ_IES complete, but the length (assoc_req_ies_len) can be used instead * add NL80211_STA_INFO_RX_DROP_MISC which exists internally but not in nl80211 yet This gets rid of the duplicate maintenance of the two lists. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:28:10 +01:00
Johannes Berg	2b9a7e1bac	mac80211: allow drivers to provide most station statistics In many cases, drivers can filter things like beacons that will skew statistics reported by mac80211. To get correct statistics in these cases, call drivers to obtain statistics and let them override all values, filling values from mac80211 if the driver didn't provide them. Not all of them make sense for the driver to fill, so some are still always done by mac80211. Note that this doesn't currently allow a driver to say "I know this value is wrong, don't report it at all", or to sum it up with a mac80211 value (as could be useful for "dropped misc"), that can be added if it turns out to be needed. This also gets rid of the get_rssi() method as is can now be implemented using sta_statistics(). Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:28:06 +01:00
Johannes Berg	6f7a8d26e2	mac80211: send statistics with delete station event Use the new cfg80211_del_sta_sinfo() function to send the statistics about the deleted station with the delete event. This lets userspace see how much traffic etc. the deleted station used. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:28:03 +01:00
Johannes Berg	cf5ead822d	cfg80211: allow including station info in delete event When a station is removed, its statistics may be interesting to userspace, for example for further aggregation of statistics of all stations that ever connected to an AP. Introduce a new cfg80211_del_sta_sinfo() function (and make the cfg80211_del_sta() a static inline calling it) to allow passing a struct station_info along with this, and send the data in the nl80211 event message. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:28:00 +01:00
Johannes Berg	052536abfa	cfg80211: add scan time to survey data Add the time spent scanning to the survey data so it can be reported by drivers that collect such information. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:27:58 +01:00
Johannes Berg	11f78ac32b	cfg80211: allow survey data to return global data Not all devices are able to report survey data (particularly time spent for various operations) per channel. As all these statistics already exist in survey data, allow such devices to report them (if userspace requested it) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:27:54 +01:00
Johannes Berg	4ed20bebf5	cfg80211: remove "channel" from survey names All of the survey data is (currently) per channel anyway, so having the word "channel" in the name does nothing. In the next patch I'll introduce global data to the survey, where the word "channel" is actually confusing. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-08 15:27:52 +01:00
Kristian Evensen	ae406bd057	netfilter: conntrack: Remove nf_ct_conntrack_flush_report The only user of nf_ct_conntrack_flush_report() was ctnetlink_del_conntrack(). After adding support for flushing connections with a given mark, this function is no longer called. Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-08 12:16:58 +01:00
Kristian Evensen	866476f323	netfilter: conntrack: Flush connections with a given mark This patch adds support for selective flushing of conntrack mappings. By adding CTA_MARK and CTA_MARK_MASK to a delete-message, the mark (and mask) is checked before a connection is deleted while flushing. Configuring the flush is moved out of ctnetlink_del_conntrack(), and instead of calling nf_conntrack_flush_report(), we always call nf_ct_iterate_cleanup(). This enables us to only make one call from the new ctnetlink_flush_conntrack() and makes it easy to add more filter parameters. Filtering is done in the ctnetlink_filter_match()-function, which is also called from ctnetlink_dump_table(). ctnetlink_dump_filter has been renamed ctnetlink_filter, to indicated that it is no longer only used when dumping conntrack entries. Moreover, reject mark filters with -EOPNOTSUPP if no ct mark support is available. Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-08 12:14:20 +01:00
Alexander Aring	52d84ffc54	ieee802154: 6lowpan: rename to core This patch renames the 6lowpan_rtnl.c file to core.c. 6lowpan_rtnl.c contains functionality to put all 802.15.4 6LoWPAN functionality together. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-08 07:25:59 +01:00
Alexander Aring	4dc315e267	ieee802154: 6lowpan: move transmit functionality This patch moves all relevant transmit functionality into a separate tx.c file. We can simple separate this functionality like we did it in mac802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-08 07:25:59 +01:00
Alexander Aring	4662a0da54	ieee802154: 6lowpan: move receive functionality This patch moves all relevant receive functionality into a separate rx.c file. We can simple separate this functionality like we did it in mac802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-08 07:25:59 +01:00
Alexander Aring	8691ee592c	ieee802154: 6lowpan: rename internal header This patch renames the internal header for af802154. This naming convention is like ieee802154_i.h in mac802154 and avoids naming confusing with the global af802154 header. Furthermore this header contains more ieee802154 specific definitions. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-08 07:25:59 +01:00
Alexander Aring	ea81ac2e70	ieee802154: create 6lowpan sub-directory This patch creates an 6lowpan sub-directory inside ieee802154. Additional we move all ieee802154 6lowpan relevant files into this sub-directory instead of placing the 6lowpan related files inside ieee802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-08 07:25:59 +01:00
Markus Pargmann	9535395640	batman-adv: Kconfig, Add missing DEBUG_FS dependency BATMAN_ADV_DEBUG is using debugfs files for the debugging log. So it depends on DEBUG_FS which is missing as dependency in the Kconfig file. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 22:17:11 +01:00
J. Bruce Fields	49a068f82a	rpc: fix xdr_truncate_encode to handle buffer ending on page boundary A struct xdr_stream at a page boundary might point to the end of one page or the beginning of the next, but xdr_truncate_encode isn't prepared to handle the former. This can cause corruption of NFSv4 READDIR replies in the case that a readdir entry that would have exceeded the client's dircount/maxcount limit would have ended exactly on a 4k page boundary. You're more likely to hit this case on large directories. Other xdr_truncate_encode callers are probably also affected. Reported-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Fixes: `3e19ce762b` "rpc: xdr_truncate_encode" Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-07 14:03:58 -05:00
Simon Wunderlich	dcba6c9b45	batman-adv: Start new development cycle Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 17:21:58 +01:00
Antonio Quartulli	3f68785e61	batman-adv: fix misspelled words Reported-by: checkpatch Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2015-01-07 17:21:57 +01:00
Martin Hundebøll	e0d9677ea3	batman-adv: clear control block of received socket buffers Since other network components (and some drivers) uses the control block provided in skb's, the network coding feature might wrongly assume that an SKB has been decoded, and thus not try to code it with another packet again. This happens for instance when batman-adv is running on a bridge device. Fix this by clearing the control block for every received SKB. Introduced by `3c12de9a5c` ("batman-adv: network coding - code and transmit packets if possible") Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 17:21:57 +01:00
Antonio Quartulli	32db6aaafe	batman-adv: checkpatch - remove unnecessary parentheses Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2015-01-07 17:21:56 +01:00
Antonio Quartulli	8a3f8b6ac5	batman-adv: checkpatch - Please don't use multiple blank lines Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2015-01-07 17:21:56 +01:00
Antonio Quartulli	aa143d2837	batman-adv: checkpatch - Please use a blank line after declarations Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2015-01-07 17:21:55 +01:00
Antonio Quartulli	9687a31c84	batman-adv: checkpatch - No space is necessary after a cast Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2015-01-07 17:21:55 +01:00
Antonio Quartulli	24820df144	batman-adv: checkpatch - else is not generally useful after a break or return Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2015-01-07 17:21:55 +01:00
Martin Hundebøll	a0e2877505	batman-adv: kernel doc fixes for main.{c, h} Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 17:21:54 +01:00
Martin Hundebøll	ba97abb863	batman-adv: kernel doc fix for distributed-arp-table.h Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 17:21:54 +01:00
Martin Hundebøll	e335718988	batman-adv: kernel doc fixes for bridge_loop_avoidance.c Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Acked-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 17:21:54 +01:00
Martin Hundebøll	95298a91f9	batman-adv: kernel doc fixes for bat_iv_ogm.c Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 17:21:53 +01:00
Simon Wunderlich	320b936678	batman-adv: remove obsolete variable primary_iface from orig_node This variable became obsolete when changing to the new bonding mechanism based on the multi interface optimization. Since its not used anywhere, remove it. Reported-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-07 17:21:53 +01:00
Antonio Quartulli	193924472a	batman-adv: avoid useless return in void functions Cc: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2015-01-07 17:21:52 +01:00
Arik Nemtsov	20658702e0	cfg80211: fix deadlock during reg chan check If a P2P GO is active, the cfg80211_reg_can_beacon function will take the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock is already taken by the parent channel-checking function, causing a deadlock. Split the checking code into two parts. The first part will check if the wdev is active and saves the channel under the wdev lock. The second part will check actual channel validity according to type. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-07 14:53:46 +01:00
Ido Yariv	db12847ca8	mac80211: Re-fix accounting of the tailroom-needed counter When hw acceleration is enabled, the GENERATE_IV or PUT_IV_SPACE flags only require headroom space. Therefore, the tailroom-needed counter can safely be decremented for most drivers. The older incarnation of this patch (`ca34e3b5`) assumed that the above holds true for all drivers. As reported by Christopher Chavez and researched by Christian Lamparter and Larry Finger, this isn't a valid assumption for p54 and cw1200. Drivers that still require tailroom for ICV/MIC even when HW encryption is enabled can use IEEE80211_KEY_FLAG_RESERVE_TAILROOM to indicate it. Signed-off-by: Ido Yariv <idox.yariv@intel.com> Cc: Christopher Chavez <chrischavez@gmx.us> Cc: Christian Lamparter <chunkeey@googlemail.com> Cc: Larry Finger <Larry.Finger@lwfinger.net> Cc: Solomon Peachy <pizza@shaftnet.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-07 14:39:32 +01:00
Johannes Berg	3a4b0c948d	Merge branch 'mac80211' into mac80211-next Merge mac80211.git to get some changes that would otherwise cause conflicts with new changes coming here. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-07 14:39:16 +01:00
John Linville	cc72f6e227	mac80211: uninitialized return val in __ieee80211_sta_handle_tspec_ac_params The return value should be initialized to false so that there's a valid return value when there are no sessions that need work to be done on them. Luckily, the side effect of using the uninitialized value is an extra harmless driver call. Coverity: CID 1260096 Fixes: `02219b3abc` ("mac80211: add WMM admission control support") Signed-off-by: John W. Linville <linville@tuxdriver.com> [extend commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-07 13:57:34 +01:00
Christoph Jaeger	6341e62b21	kconfig: use bool instead of boolean for type definition attributes Support for keyword 'boolean' will be dropped later on. No functional change. Reference: http://lkml.kernel.org/r/cover.1418003065.git.cj@linux.com Signed-off-by: Christoph Jaeger <cj@linux.com> Signed-off-by: Michal Marek <mmarek@suse.cz>	2015-01-07 13:08:04 +01:00
David S. Miller	44d84d7272	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-01-06 22:29:20 -05:00
Linus Torvalds	bdec419638	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Just a pile of random fixes, including: 1) Do not apply TSO limits to non-TSO packets, fix from Herbert Xu. 2) MDI{,X} eeprom check in e100 driver is reversed, from John W. Linville. 3) Missing error return assignments in several ethernet drivers, from Julia Lawall. 4) Altera TSE device doesn't come back up after ifconfig down/up sequence, fix from Kostya Belezko. 5) Add more cases to the check for whether the qmi_wwan device has a bogus MAC address and needs to be assigned a random one. From Kristian Evensen. 6) Fix interrupt hangs in CPSW, from Felipe Balbi. 7) Implement ndo_features_check in r8152 so that the stack doesn't feed GSO packets which are outside of the chip's capabilities. From Hayes Wang" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits) qla3xxx: don't allow never end busy loop xen-netback: fixing the propagation of the transmit shaper timeout r8152: support ndo_features_check batman-adv: fix potential TT client + orig-node memory leak batman-adv: fix multicast counter when purging originators batman-adv: fix counter for multicast supporting nodes batman-adv: fix lock class for decoding hash in network-coding.c batman-adv: fix delayed foreign originator recognition batman-adv: fix and simplify condition when bonding should be used Revert "mac80211: Fix accounting of the tailroom-needed counter" net: ethernet: cpsw: fix hangs with interrupts enic: free all rq buffs when allocation fails qmi_wwan: Set random MAC on devices with buggy fw openvswitch: Consistently include VLAN header in flow and port stats. tcp: Do not apply TSO segment limit to non-TSO packets Altera TSE: Add missing phydev net/mlx4_core: Fix error flow in mlx4_init_hca() net/mlx4_core: Correcly update the mtt's offset in the MR re-reg flow qlcnic: Fix return value in qlcnic_probe() net: axienet: fix error return code ...	2015-01-06 17:48:14 -08:00
Ed Swierk	2f4383667d	ethtool: Extend ethtool plugin module eeprom API to phylib This patch extends the ethtool plugin module eeprom API to support cards whose phy support is delegated to a separate driver. The handlers for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPROM call the module_info and module_eeprom functions if the phy driver provides them; otherwise the handlers call the equivalent ethtool_ops functions provided by network drivers with built-in phy support. Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-06 17:16:55 -05:00
Pablo Neira Ayuso	a2f18db0c6	netfilter: nf_tables: fix flush ruleset chain dependencies Jumping between chains doesn't mix well with flush ruleset. Rules from a different chain and set elements may still refer to us. [ 353.373791] ------------[ cut here ]------------ [ 353.373845] kernel BUG at net/netfilter/nf_tables_api.c:1159! [ 353.373896] invalid opcode: 0000 [#1] SMP [ 353.373942] Modules linked in: intel_powerclamp uas iwldvm iwlwifi [ 353.374017] CPU: 0 PID: 6445 Comm: 31c3.nft Not tainted 3.18.0 #98 [ 353.374069] Hardware name: LENOVO 5129CTO/5129CTO, BIOS 6QET47WW (1.17 ) 07/14/2010 [...] [ 353.375018] Call Trace: [ 353.375046] [<ffffffff81964c31>] ? nf_tables_commit+0x381/0x540 [ 353.375101] [<ffffffff81949118>] nfnetlink_rcv+0x3d8/0x4b0 [ 353.375150] [<ffffffff81943fc5>] netlink_unicast+0x105/0x1a0 [ 353.375200] [<ffffffff8194438e>] netlink_sendmsg+0x32e/0x790 [ 353.375253] [<ffffffff818f398e>] sock_sendmsg+0x8e/0xc0 [ 353.375300] [<ffffffff818f36b9>] ? move_addr_to_kernel.part.20+0x19/0x70 [ 353.375357] [<ffffffff818f44f9>] ? move_addr_to_kernel+0x19/0x30 [ 353.375410] [<ffffffff819016d2>] ? verify_iovec+0x42/0xd0 [ 353.375459] [<ffffffff818f3e10>] ___sys_sendmsg+0x3f0/0x400 [ 353.375510] [<ffffffff810615fa>] ? native_sched_clock+0x2a/0x90 [ 353.375563] [<ffffffff81176697>] ? acct_account_cputime+0x17/0x20 [ 353.375616] [<ffffffff8110dc78>] ? account_user_time+0x88/0xa0 [ 353.375667] [<ffffffff818f4bbd>] __sys_sendmsg+0x3d/0x80 [ 353.375719] [<ffffffff81b184f4>] ? int_check_syscall_exit_work+0x34/0x3d [ 353.375776] [<ffffffff818f4c0d>] SyS_sendmsg+0xd/0x20 [ 353.375823] [<ffffffff81b1826d>] system_call_fastpath+0x16/0x1b Release objects in this order: rules -> sets -> chains -> tables, to make sure no references to chains are held anymore. Reported-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.biz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-06 22:27:48 +01:00
Pablo Neira Ayuso	62924af247	netfilter: nfnetlink: relax strict multicast group check from netlink_bind Relax the checking that was introduced in `97840cb` ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") when the subscription bitmask is used. Existing userspace code code may request to listen to all of the existing netlink groups by setting an all to one subscription group bitmask. Netlink already validates subscription via setsockopt() for us. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-06 22:27:47 +01:00
Pablo Neira Ayuso	9ea2aa8b7d	netfilter: nfnetlink: validate nfnetlink header from batch Make sure there is enough room for the nfnetlink header in the netlink messages that are part of the batch. There is a similar check in netlink_rcv_skb(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-06 22:27:46 +01:00
Pablo Neira Ayuso	8ca3f5e974	netfilter: conntrack: fix race between confirmation and flush Commit `5195c14c8b` ("netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse") aimed to resolve the race condition between the confirmation (packet path) and the flush command (from control plane). However, it introduced a crash when several packets race to add a new conntrack, which seems easier to reproduce when nf_queue is in place. Fix this race, in __nf_conntrack_confirm(), by removing the CT from unconfirmed list before checking the DYING bit. In case race occured, re-add the CT to the dying list This patch also changes the verdict from NF_ACCEPT to NF_DROP when we lose race. Basically, the confirmation happens for the first packet that we see in a flow. If you just invoked conntrack -F once (which should be the common case), then this is likely to be the first packet of the flow (unless you already called flush anytime soon in the past). This should be hard to trigger, but better drop this packet, otherwise we leave things in inconsistent state since the destination will likely reply to this packet, but it will find no conntrack, unless the origin retransmits. The change of the verdict has been discussed in: https://www.marc.info/?l=linux-netdev&m=141588039530056&w=2 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-01-06 22:27:45 +01:00
David S. Miller	627d2cc016	Included changes: - ensure bonding is used (if enabled) for packets coming in the soft interface - fix race condition to avoid orig_nodes to be deleted right after being added - avoid false positive lockdep splats by assigning lockclass to the proper hashtable lock objects - avoid miscounting of multicast 'disabled' nodes in the network - fix memory leak in the Global Translation Table in case of originator interval change -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJUq7PtAAoJEJgn97Bh2u9ejD0P/AqFX7KAqlzEZo7SSIwduX/e OLUWJyplATljKAwmTM+gFxyTC6HOUhqG/0UZpUu83OA0uC5HmT2ATJXUal1vG8uN ewfxqhsnz8UMxmjENcjsExGdhjW0CDeD+lRdxPGbbq5vEbqe9dXXr4HVAlJFGhnX fzR65WB2om9BTYMgIWmZutRjP9VTPArgOCKtCgLclHut19VXmZDALPWZ5S6kPfUG O9fSPi0uzr7WIy6p+5XCi69pw+rlHv23A3soXK9G4Ze+ET2QcQSgkL6Q3VJ2+uMV wBzTahRN7sJgJ8GYBDrhXgw3yRl2FkjLNPav3+gOnbK0+PycrnGiC1YPbX8w/cpr EHyFB/kZCeBusPaRQAYP91YI3cYJ3Onq3kjbUPVV+ASMoJ/vdVORRjDHY8ALf12w uGp/+9aAEJYqAOZhVZKTofTgfT3ULQycd+Ffi/0LC8OLaKQeqdEcmGRrIKNJmsP5 a+gsKEMe7QCbHkWAVnKLGKthLH4IRU98gVApS1nld9OE2yrPyywrA5ac353iFB1K Y/56RFAqU1nywv+hhArcCZymiOPxKyF+lK3VAHvnlFgMMJs5gNZzoGX7V7TODkiA yiZCuzOGsB4hPSdzf1KfCVR3LkFfaI2jGr7X+38OdK9WGMG2mcPEB0tyXr7uFuky V/VPJDENwQwmgfdR+D+3 =30sl -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - ensure bonding is used (if enabled) for packets coming in the soft interface - fix race condition to avoid orig_nodes to be deleted right after being added - avoid false positive lockdep splats by assigning lockclass to the proper hashtable lock objects - avoid miscounting of multicast 'disabled' nodes in the network - fix memory leak in the Global Translation Table in case of originator interval change Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-06 14:24:49 -05:00
David S. Miller	15ecf7a063	Here's just a single fix - a revert of a patch that broke the p54 and cw2100 drivers (arguably due to bad assumptions there.) Since this affects kernels since 3.17, I decided to revert for now and we'll revisit this optimisation properly for -next. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUq9EvAAoJEDBSmw7B7bqr7mMQAJRbdepVVqK5IFH1BH0NC7vi LkBE2lp8ZAz1Crg+OAQNUdlZUHtGyfYoXSfzezmrMG51i5xjHyOYQQikW7aJ2SQ0 XsjJJ5TcqKe83NwoakXUrMpE7KmCt/LnbjKNXDsZIvLlUkqa7ksXaS7btK195aXy WlVrmUE+BqT9a16VjFLZ6wRjI43+3bGxhtFL+g1eXw6nZ4a2o4EbIXdc9SN+/bT4 tAhWJfdAQqQc34jhesWGbMIvkXWhzy2R6Js+9gMIBNsmlAiYbFa4QZ/9tI3nBI/O yHSiDc7JnPNjkkC+3wTJxMl7mEd6fEKnAS1ryZ5L4XhPrQpV39iZuWSPvPGw6LLW kB6+wXkIyQdCSoyrQZxY75ibqOUKYYxhhkSYfMePXRKTYY6MlHYqiH8wPWFpPoqO iumLqx8/CtRW1q1t2EBAG6rZLRF8HqmfqtB+ptT0DWcAP8E81q8BImPoPFr+P9S2 XfuuSw97xKCcilOcYJ0uYSBe4XNNhy1dtC/zJ8cA9nV4WNkofALga5Z/t8ARhDsM wvP1D2uIX3U9My17bXq+Xn/fSSS7yhpLZjEHj/JNRvpDCWGf/tQl6A3ydMy//Oqe lRSKfmiAGysqhXnmK12+YhfO+4ioTz8dA88tHs1AO8qasfQwx45eRsUPemWeExiL 9Lntb0U6MhYvgiTdWqt6 =CnbJ -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Here's just a single fix - a revert of a patch that broke the p54 and cw2100 drivers (arguably due to bad assumptions there.) Since this affects kernels since 3.17, I decided to revert for now and we'll revisit this optimisation properly for -next. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-06 13:29:27 -05:00
Arik Nemtsov	fa44b988d2	mac80211: skip disabled channels in VHT check The patch "40a11ca mac80211: check if channels allow 80 MHz for VHT probe requests" considered disabled channels as VHT enabled, and mistakenly sent out probe-requests with the VHT IE. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-06 14:59:07 +01:00
Johannes Berg	71b836eca7	nl80211: define multicast group names in header Put the group names into the userspace API header file so that userspace clients can use symbolic names from there instead of hardcoding the actual names. This doesn't really change much, but seems somewhat cleaner. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-06 12:10:26 +01:00
Gautam Kumar Shukla	d75bb06b61	cfg80211: add extensible feature flag attribute With the wiphy::features flag being used up this patch adds a new field wiphy::ext_features. Considering extensibility this new field is declared as a byte array. This extensible flag is exposed to user-space by NL80211_ATTR_EXT_FEATURES. Cc: Avinash Patil <patila@marvell.com> Signed-off-by: Gautam (Gautam Kumar) Shukla <gautams@broadcom.com> Signed-off-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-06 12:10:24 +01:00
Linus Lüssing	9d31b3ce81	batman-adv: fix potential TT client + orig-node memory leak This patch fixes a potential memory leak which can occur once an originator times out. On timeout the according global translation table entry might not get purged correctly. Furthermore, the non purged TT entry will cause its orig-node to leak, too. Which additionally can lead to the new multicast optimization feature not kicking in because of a therefore bogus counter. In detail: The batadv_tt_global_entry->orig_list holds the reference to the orig-node. Usually this reference is released after BATADV_PURGE_TIMEOUT through: _batadv_purge_orig()-> batadv_purge_orig_node()->batadv_update_route()->_batadv_update_route()-> batadv_tt_global_del_orig() which purges this global tt entry and releases the reference to the orig-node. However, if between two batadv_purge_orig_node() calls the orig-node timeout grew to 2*BATADV_PURGE_TIMEOUT then this call path isn't reached. Instead the according orig-node is removed from the originator hash in _batadv_purge_orig(), the batadv_update_route() part is skipped and won't be reached anymore. Fixing the issue by moving batadv_tt_global_del_orig() out of the rcu callback. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-06 11:07:01 +01:00
Linus Lüssing	a5164886b0	batman-adv: fix multicast counter when purging originators When purging an orig_node we should only decrease counter tracking the number of nodes without multicast optimizations support if it was increased through this orig_node before. A not yet quite initialized orig_node (meaning it did not have its turn in the mcast-tvlv handler so far) which gets purged would not adhere to this and will lead to a counter imbalance. Fixing this by adding a check whether the orig_node is mcast-initalized before decreasing the counter in the mcast-orig_node-purging routine. Introduced by `60432d756c` ("batman-adv: Announce new capability via multicast TVLV") Reported-by: Tobias Hachmer <tobias@hachmer.de> Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-06 11:06:04 +01:00
Linus Lüssing	e8829f007e	batman-adv: fix counter for multicast supporting nodes A miscounting of nodes having multicast optimizations enabled can lead to multicast packet loss in the following scenario: If the first OGM a node receives from another one has no multicast optimizations support (no multicast tvlv) then we are missing to increase the counter. This potentially leads to the wrong assumption that we could safely use multicast optimizations. Fixings this by increasing the counter if the initial OGM has the multicast TVLV unset, too. Introduced by `60432d756c` ("batman-adv: Announce new capability via multicast TVLV") Reported-by: Tobias Hachmer <tobias@hachmer.de> Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-06 11:05:42 +01:00
Martin Hundebøll	f44d54077a	batman-adv: fix lock class for decoding hash in network-coding.c batadv_has_set_lock_class() is called with the wrong hash table as first argument (probably due to a copy-paste error), which leads to false positives when running with lockdep. Introduced-by: `612d2b4fe0` ("batman-adv: network coding - save overheard and tx packets for decoding") Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-06 11:05:12 +01:00
Linus Lüssing	2c667a339c	batman-adv: fix delayed foreign originator recognition Currently it can happen that the reception of an OGM from a new originator is not being accepted. More precisely it can happen that an originator struct gets allocated and initialized (batadv_orig_node_new()), even the TQ gets calculated and set correctly (batadv_iv_ogm_calc_tq()) but still the periodic orig_node purging thread will decide to delete it if it has a chance to jump between these two function calls. This is because batadv_orig_node_new() initializes the last_seen value to zero and its caller (batadv_iv_ogm_orig_get()) makes it visible to other threads by adding it to the hash table already. batadv_iv_ogm_calc_tq() will set the last_seen variable to the correct, current time a few lines later but if the purging thread jumps in between that it will think that the orig_node timed out and will wrongly schedule it for deletion already. If the purging interval is the same as the originator interval (which is the default: 1 second), then this game can continue for several rounds until the random OGM jitter added enough difference between these two (in tests, two to about four rounds seemed common). Fixing this by initializing the last_seen variable of an orig_node to the current time before adding it to the hash table. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-06 11:05:09 +01:00
Simon Wunderlich	329887ad13	batman-adv: fix and simplify condition when bonding should be used The current condition actually does NOT consider bonding when the interface the packet came in from is the soft interface, which is the opposite of what it should do (and the comment describes). Fix that and slightly simplify the condition. Reported-by: Ray Gibson <booray@gmail.com> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2015-01-06 11:05:07 +01:00
Daniel Borkmann	81164413ad	net: tcp: add per route congestion control This work adds the possibility to define a per route/destination congestion control algorithm. Generally, this opens up the possibility for a machine with different links to enforce specific congestion control algorithms with optimal strategies for each of them based on their network characteristics, even transparently for a single application listening on all links. For our specific use case, this additionally facilitates deployment of DCTCP, for example, applications can easily serve internal traffic/dsts in DCTCP and external one with CUBIC. Other scenarios would also allow for utilizing e.g. long living, low priority background flows for certain destinations/routes while still being able for normal traffic to utilize the default congestion control algorithm. We also thought about a per netns setting (where different defaults are possible), but given its actually a link specific property, we argue that a per route/destination setting is the most natural and flexible. The administrator can utilize this through ip-route(8) by appending "congctl [lock] <name>", where <name> denotes the name of a congestion control algorithm and the optional lock parameter allows to enforce the given algorithm so that applications in user space would not be allowed to overwrite that algorithm for that destination. The dst metric lookups are being done when a dst entry is already available in order to avoid a costly lookup and still before the algorithms are being initialized, thus overhead is very low when the feature is not being used. While the client side would need to drop the current reference on the module, on server side this can actually even be avoided as we just got a flat-copied socket clone. Joint work with Florian Westphal. Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:55:24 -05:00
Daniel Borkmann	ea69763999	net: tcp: add RTAX_CC_ALGO fib handling This patch adds the minimum necessary for the RTAX_CC_ALGO congestion control metric to be set up and dumped back to user space. While the internal representation of RTAX_CC_ALGO is handled as a u32 key, we avoided to expose this implementation detail to user space, thus instead, we chose the netlink attribute that is being exchanged between user space to be the actual congestion control algorithm name, similarly as in the setsockopt(2) API in order to allow for maximum flexibility, even for 3rd party modules. It is a bit unfortunate that RTAX_QUICKACK used up a whole RTAX slot as it should have been stored in RTAX_FEATURES instead, we first thought about reusing it for the congestion control key, but it brings more complications and/or confusion than worth it. Joint work with Florian Westphal. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:55:24 -05:00
Daniel Borkmann	c5c6a8ab45	net: tcp: add key management to congestion control This patch adds necessary infrastructure to the congestion control framework for later per route congestion control support. For a per route congestion control possibility, our aim is to store a unique u32 key identifier into dst metrics, which can then be mapped into a tcp_congestion_ops struct. We argue that having a RTAX key entry is the most simple, generic and easy way to manage, and also keeps the memory footprint of dst entries lower on 64 bit than with storing a pointer directly, for example. Having a unique key id also allows for decoupling actual TCP congestion control module management from the FIB layer, i.e. we don't have to care about expensive module refcounting inside the FIB at this point. We first thought of using an IDR store for the realization, which takes over dynamic assignment of unused key space and also performs the key to pointer mapping in RCU. While doing so, we stumbled upon the issue that due to the nature of dynamic key distribution, it just so happens, arguably in very rare occasions, that excessive module loads and unloads can lead to a possible reuse of previously used key space. Thus, previously stale keys in the dst metric are now being reassigned to a different congestion control algorithm, which might lead to unexpected behaviour. One way to resolve this would have been to walk FIBs on the actually rare occasion of a module unload and reset the metric keys for each FIB in each netns, but that's just very costly. Therefore, we argue a better solution is to reuse the unique congestion control algorithm name member and map that into u32 key space through jhash. For that, we split the flags attribute (as it currently uses 2 bits only anyway) into two u32 attributes, flags and key, so that we can keep the cacheline boundary of 2 cachelines on x86_64 and cache the precalculated key at registration time for the fast path. On average we might expect 2 - 4 modules being loaded worst case perhaps 15, so a key collision possibility is extremely low, and guaranteed collision-free on LE/BE for all in-tree modules. Overall this results in much simpler code, and all without the overhead of an IDR. Due to the deterministic nature, modules can now be unloaded, the congestion control algorithm for a specific but unloaded key will fall back to the default one, and on module reload time it will switch back to the expected algorithm transparently. Joint work with Florian Westphal. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:55:24 -05:00
Daniel Borkmann	29ba4fffd3	net: tcp: refactor reinitialization of congestion control We can just move this to an extra function and make the code a bit more readable, no functional change. Joint work with Florian Westphal. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:55:24 -05:00
Florian Westphal	e715b6d3a5	net: fib6: convert cfg metric to u32 outside of table write lock Do the nla validation earlier, outside the write lock. This is needed by followup patch which needs to be able to call request_module (which can sleep) if needed. Joint work with Daniel Borkmann. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:55:24 -05:00
Daniel Borkmann	0409c9a5a7	net: fib6: fib6_commit_metrics: fix potential NULL pointer dereference When IPv6 host routes with metrics attached are being added, we fetch the metrics store from the dst via COW through dst_metrics_write_ptr(), added through commit `e5fd387ad5`. One remaining problem here is that we actually call into inet_getpeer() and may end up allocating/creating a new peer from the kmemcache, which may fail. Example trace from perf probe (inet_getpeer:41) where create is 1: ip 6877 [002] 4221.391591: probe:inet_getpeer: (ffffffff8165e293) 85e294 inet_getpeer.part.7 (<- kmem_cache_alloc()) 85e578 inet_getpeer 8eb333 ipv6_cow_metrics 8f10ff fib6_commit_metrics Therefore, a check for NULL on the return of dst_metrics_write_ptr() is necessary here. Joint work with Florian Westphal. Fixes: `e5fd387ad5` ("ipv6: do not overwrite inetpeer metrics prematurely") Cc: Michal Kubeček <mkubecek@suse.cz> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:55:24 -05:00
Hubert Sokolowski	6cb69742da	net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined Add checking whether the call to ndo_dflt_fdb_dump is needed. It is not expected to call ndo_dflt_fdb_dump unconditionally by some drivers (i.e. qlcnic or macvlan) that defines own ndo_fdb_dump. Other drivers define own ndo_fdb_dump and don't want ndo_dflt_fdb_dump to be called at all. At the same time it is desirable to call the default dump function on a bridge device. Fix attributes that are passed to dev->netdev_ops->ndo_fdb_dump. Add extra checking in br_fdb_dump to avoid duplicate entries as now filter_dev can be NULL. Following tests for filtering have been performed before the change and after the patch was applied to make sure they are the same and it doesn't break the filtering algorithm. [root@localhost ~]# cd /root/iproute2-3.18.0/bridge [root@localhost bridge]# modprobe dummy [root@localhost bridge]# ./bridge fdb add f1:f2:f3:f4:f5:f6 dev dummy0 [root@localhost bridge]# brctl addbr br0 [root@localhost bridge]# brctl addif br0 dummy0 [root@localhost bridge]# ip link set dev br0 address 02:00:00:12:01:04 [root@localhost bridge]# # show all [root@localhost bridge]# ./bridge fdb show 33:33:00:00:00:01 dev p2p1 self permanent 01:00:5e:00:00:01 dev p2p1 self permanent 33:33:ff:ac:ce:32 dev p2p1 self permanent 33:33:00:00:02:02 dev p2p1 self permanent 01:00:5e:00:00:fb dev p2p1 self permanent 33:33:00:00:00:01 dev p7p1 self permanent 01:00:5e:00:00:01 dev p7p1 self permanent 33:33:ff:79:50:53 dev p7p1 self permanent 33:33:00:00:02:02 dev p7p1 self permanent 01:00:5e:00:00:fb dev p7p1 self permanent f2:46:50:85:6d:d9 dev dummy0 master br0 permanent f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent 33:33:00:00:00:01 dev dummy0 self permanent f1:f2:f3:f4:f5:f6 dev dummy0 self permanent 33:33:00:00:00:01 dev br0 self permanent 02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent 02:00:00:12:01:04 dev br0 master br0 permanent [root@localhost bridge]# # filter by bridge [root@localhost bridge]# ./bridge fdb show br br0 f2:46:50:85:6d:d9 dev dummy0 master br0 permanent f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent 33:33:00:00:00:01 dev dummy0 self permanent f1:f2:f3:f4:f5:f6 dev dummy0 self permanent 33:33:00:00:00:01 dev br0 self permanent 02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent 02:00:00:12:01:04 dev br0 master br0 permanent [root@localhost bridge]# # filter by port [root@localhost bridge]# ./bridge fdb show brport dummy0 f2:46:50:85:6d:d9 master br0 permanent f2:46:50:85:6d:d9 vlan 1 master br0 permanent 33:33:00:00:00:01 self permanent f1:f2:f3:f4:f5:f6 self permanent [root@localhost bridge]# # filter by port + bridge [root@localhost bridge]# ./bridge fdb show br br0 brport dummy0 f2:46:50:85:6d:d9 master br0 permanent f2:46:50:85:6d:d9 vlan 1 master br0 permanent 33:33:00:00:00:01 self permanent f1:f2:f3:f4:f5:f6 self permanent [root@localhost bridge]# Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:52:06 -05:00
Tom Herbert	ad6f939ab1	ip: Add offset parameter to ip_cmsg_recv Add ip_cmsg_recv_offset function which takes an offset argument that indicates the starting offset in skb where data is being received from. This will be useful in the case of UDP and provided checksum to user space. ip_cmsg_recv is an inline call to ip_cmsg_recv_offset with offset of zero. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:44:46 -05:00
Tom Herbert	5961de9f19	ip: Add offset parameter to ip_cmsg_recv Add ip_cmsg_recv_offset function which takes an offset argument that indicates the starting offset in skb where data is being received from. This will be useful in the case of UDP and provided checksum to user space. ip_cmsg_recv is an inline call to ip_cmsg_recv_offset with offset of zero. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:44:46 -05:00
Tom Herbert	c44d13d6f3	ip: IP cmsg cleanup Move the IP_CMSG_* constants from ip_sockglue.c to inet_sock.h so that they can be referenced in other source files. Restructure ip_cmsg_recv to not go through flags using shift, check for flags by 'and'. This eliminates both the shift and a conditional per flag check. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:44:46 -05:00
Tom Herbert	224d019c4f	ip: Move checksum convert defines to inet Move convert_csum from udp_sock to inet_sock. This allows the possibility that we can use convert checksum for different types of sockets and also allows convert checksum to be enabled from inet layer (what we'll want to do when enabling IP_CHECKSUM cmsg). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-05 22:44:46 -05:00
Gao feng	b44b565cf5	netfilter: nf_ct_seqadj: print ack seq in the right host byte order new_start_seq and new_end_seq are network byte order, print the host byte order in debug message and print seq number as the type of unsigned int. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@soleta.eu>	2015-01-05 13:52:20 +01:00
Chen Gang	b18c5d15e8	netfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings The related code can be simplified, and also can avoid related warnings (with allmodconfig under parisc): CC [M] net/netfilter/nfnetlink_cthelper.o net/netfilter/nfnetlink_cthelper.c: In function ‘nfnl_cthelper_from_nlattr’: net/netfilter/nfnetlink_cthelper.c:97:9: warning: passing argument 1 o ‘memcpy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers] memcpy(&help->data, nla_data(attr), help->helper->data_len); ^ In file included from include/linux/string.h:17:0, from include/uapi/linux/uuid.h:25, from include/linux/uuid.h:23, from include/linux/mod_devicetable.h:12, from ./arch/parisc/include/asm/hardware.h:4, from ./arch/parisc/include/asm/processor.h:15, from ./arch/parisc/include/asm/spinlock.h:6, from ./arch/parisc/include/asm/atomic.h:21, from include/linux/atomic.h:4, from ./arch/parisc/include/asm/bitops.h:12, from include/linux/bitops.h:36, from include/linux/kernel.h:10, from include/linux/list.h:8, from include/linux/module.h:9, from net/netfilter/nfnetlink_cthelper.c:11: ./arch/parisc/include/asm/string.h:8:8: note: expected ‘void ’ but argument is of type ‘const char ()[]’ void * memcpy(void * dest,const void *src,size_t count); ^ Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@soleta.eu>	2015-01-05 12:42:54 +01:00
Duan Jiong	0f8162326f	netfilter: nfnetlink: remove redundant variable nskb Actually after netlink_skb_clone() is called, the nskb and skb will point to the same thing, but they are used just like they are different, sometimes this is confusing, so i think there is no necessary to keep nskb anymore. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@soleta.eu>	2015-01-05 12:10:49 +01:00
Johannes Berg	1e359a5de8	Revert "mac80211: Fix accounting of the tailroom-needed counter" This reverts commit `ca34e3b5c8`. It turns out that the p54 and cw2100 drivers assume that there's tailroom even when they don't say they really need it. However, there's currently no way for them to explicitly say they do need it, so for now revert this. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=90331. Cc: stable@vger.kernel.org Fixes: `ca34e3b5c8` ("mac80211: Fix accounting of the tailroom-needed counter") Reported-by: Christopher Chavez <chrischavez@gmx.us> Bisected-by: Larry Finger <Larry.Finger@lwfinger.net> Debugged-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-01-05 10:33:46 +01:00
Jesse Gross	46b1e4f911	geneve: Check family when reusing sockets. When searching for an existing socket to reuse, the address family is not taken into account - only port number. This means that an IPv4 socket could be used for IPv6 traffic and vice versa, which is sure to cause problems when passing packets. It is not possible to trigger this problem currently because the only user of Geneve creates just IPv4 sockets. However, that is likely to change in the near future. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
Jesse Gross	df5dba8e52	geneve: Remove socket hash table. The hash table for open Geneve ports is used only on creation and deletion time. It is not performance critical and is not likely to grow to a large number of items. Therefore, this can be changed to use a simple linked list. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
Jesse Gross	829a3ada9c	geneve: Simplify locking. The existing Geneve locking scheme was pulled over directly from VXLAN. However, VXLAN has a number of built in mechanisms which make the locking more complex and are unlikely to be necessary with Geneve. This simplifies the locking to use a basic scheme of a mutex when doing updates plus RCU on receive. In addition to making the code easier to read, this also avoids the possibility of a race when creating or destroying sockets since UDP sockets and the list of Geneve sockets are protected by different locks. After this change, the entire operation is atomic. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
Jesse Gross	61f3cade76	geneve: Remove workqueue. The work queue is used only to free the UDP socket upon destruction. This is not necessary with Geneve and generally makes the code more difficult to reason about. It also introduces nondeterministic behavior such as when a socket is rapidly deleted and recreated, which could fail as the the deletion happens asynchronously. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
Marcel Holtmann	043ec9bf7b	Bluetooth: Introduce HCI_QUIRK_FIXUP_INQUIRY_MODE option The HCI_QUIRK_FIXUP_INQUIRY_MODE option allows to force Inquiry Result with RSSI setting on controllers that do not indicate support for it, but where it is known to be fully functional. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-03 22:31:09 +02:00
Marcel Holtmann	04422da990	Bluetooth: Remove dead code for manufacturer inquiry mode quirks There are some old Bluetooth modules from Silicon Wave and Broadcom which support Inquiry Result with RSSI, but do not advertise it. The core has quirks in the code to enable that inquiry mode. However as it stands right now, that code is not even executed since entering the function to determine which inquiry mode requires that the device has the feature bit for Inquiry Result with RSSI set in the first place. So this makes this dead code that hasn't work for a long time. In conclusion, just remove these extra quirks and simplify the setup of the inquiry mode to be inline and with that a lot easier to read and understand. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-03 22:31:08 +02:00
Thomas Graf	21e4902aea	netlink: Lockless lookup with RCU grace period in socket release Defers the release of the socket reference using call_rcu() to allow using an RCU read-side protected call to rhashtable_lookup() This restores behaviour and performance gains as previously introduced by `e341694` ("netlink: Convert netlink_lookup() to use RCU protected hash table") without the side effect of severely delayed socket destruction. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	97defe1ecf	rhashtable: Per bucket locks & deferred expansion/shrinking Introduces an array of spinlocks to protect bucket mutations. The number of spinlocks per CPU is configurable and selected based on the hash of the bucket. This allows for parallel insertions and removals of entries which do not share a lock. The patch also defers expansion and shrinking to a worker queue which allows insertion and removal from atomic context. Insertions and deletions may occur in parallel to it and are only held up briefly while the particular bucket is linked or unzipped. Mutations of the bucket table pointer is protected by a new mutex, read access is RCU protected. In the event of an expansion or shrinking, the new bucket table allocated is exposed as a so called future table as soon as the resize process starts. Lookups, deletions, and insertions will briefly use both tables. The future table becomes the main table after an RCU grace period and initial linking of the old to the new table was performed. Optimization of the chains to make use of the new number of buckets follows only the new table is in use. The side effect of this is that during that RCU grace period, a bucket traversal using any rht_for_each() variant on the main table will not see any insertions performed during the RCU grace period which would at that point land in the future table. The lookup will see them as it searches both tables if needed. Having multiple insertions and removals occur in parallel requires nelems to become an atomic counter. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	897362e446	nft_hash: Remove rhashtable_remove_pprev() The removal function of nft_hash currently stores a reference to the previous element during lookup which is used to optimize removal later on. This was possible because a lock is held throughout calling rhashtable_lookup() and rhashtable_remove(). With the introdution of deferred table resizing in parallel to lookups and insertions, the nftables lock will no longer synchronize all table mutations and the stored pprev may become invalid. Removing this optimization makes removal slightly more expensive on average but allows taking the resize cost out of the insert and remove path. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: netfilter-devel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	88d6ed15ac	rhashtable: Convert bucket iterators to take table and index This patch is in preparation to introduce per bucket spinlocks. It extends all iterator macros to take the bucket table and bucket index. It also introduces a new rht_dereference_bucket() to handle protected accesses to buckets. It introduces a barrier() to the RCU iterators to the prevent the compiler from caching the first element. The lockdep verifier is introduced as stub which always succeeds and properly implement in the next patch when the locks are introduced. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:56 -05:00
Thomas Graf	8d24c0b431	rhashtable: Do hashing inside of rhashtable_lookup_compare() Hash the key inside of rhashtable_lookup_compare() like rhashtable_lookup() does. This allows to simplify the hashing functions and keep them private. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: netfilter-devel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:56 -05:00
Alexander Aring	36cf942adf	mac802154: fix kbuild test robot warning This patch fixs the following kbuild test robot warning: coccinelle warnings: (new ones prefixed by >>) >> net/mac802154/cfg.c:53:1-3: WARNING: PTR_ERR_OR_ZERO can be used Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-03 01:51:51 +01:00
Alexander Aring	9dc52d49e2	ieee802154: handle config as menuconfig This patch handles the IEEE802154 Kconfig entry as menuconfig. Furthermore we move this entry out of "Network Options" and put it into "Networking" where the other networking subsystems are. This requires a menuconfig entry like all other networking subsystems. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-03 01:49:24 +01:00
Alexander Aring	71e36b1b01	ieee802154: rename af_ieee802154.c to socket.c This patch renames the "af_ieee802154.c" to "socket.c". This is just a cleanup to have a short name for it which describes the implementationm stuff more human understandable. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-03 01:49:24 +01:00
Alexander Aring	df2f65de4b	ieee802154: socket: fix checkpatch issue This patch solves the following checkpatch issue: CHECK: Comparison to NULL could be written "skb" + if (skb != NULL) { Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-03 01:49:24 +01:00
Alexander Aring	78f821b648	ieee802154: socket: put handling into one file This patch puts all related socket handling into one file. This is just a cleanup to do all socket handling stuff inside of one implementation file. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-03 01:49:24 +01:00
Alexander Aring	79300e3a43	ieee802154: socket: change module name This patch changes the module name of af_802154 to ieee802154_socket. Just for keeping the name convention according to the 6LoWPAN module ieee802154_6lowpan. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-03 01:49:24 +01:00
Alexander Aring	955d7fc93e	ieee802154: handle socket functionality as module This patch makes the ieee802154 socket handling as module. Currently this is part of ieee802154 module. It pointed out that ieee802154 module has also two module_init/module_exit functions. One inside of core.c and the other in af_ieee802154.c. This patch will also solve this issue by handle the af_802154 as separate module. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-01-03 01:49:23 +01:00
Marcel Holtmann	ec6cef9cd9	Bluetooth: Fix SMP channel registration for unconfigured controllers When the Bluetooth controllers requires an unconfigured state (for example when the BD_ADDR is missing), then it is important to try to register the SMP channels when the controller transitions to the configured state. This also fixes an issue with the debugfs entires that are not present for controllers that start out as unconfigured. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-02 22:22:04 +01:00
Marcel Holtmann	203de21bf6	Bluetooth: Fix for a leftover debug of pairing credentials One of the LE Secure Connections security credentials was still using the BT_DBG instead of SMP_DBG. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-02 22:22:04 +01:00
Marcel Holtmann	cb0d2faeb1	Bluetooth: Fix scope of sc_only_mode debugfs entry The sc_only_mode debugfs entry is used to read the current state of the Secure Connections Only mode. Before Bluetooth 4.2 this mode was only for BR/EDR controllers and with that tight to the support Secure Simple Pairing. Since Secure Connections is now available for BR/EDR and LE this debugfs entry is no longer correctly place. Move it to the common section and enable it when either BR/EDR Secure Connections feature is supported or when the controller has LE support. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-02 22:22:04 +01:00
Marcel Holtmann	05b3c3e790	Bluetooth: Remove no longer needed force_sc_support debugfs option The force_sc_support debugfs option was introduced to easily work with pre-production Bluetooth 4.1 silicon. This option is no longer needed since controllers supporting BR/EDR Secure Connections feature are now available. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-02 22:22:04 +01:00
Marcel Holtmann	91389af67c	Bluetooth: Remove broken force_lesc_support debugfs option The force_lesc_support debugfs option never really worked. It has a race condition between creating the debugfs entry and registering the L2CAP fixed channel for BR/EDR SMP support. Also this has been replaced with a working force_bredr_smp debugfs switch that developers can use now. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-02 22:22:03 +01:00
Marcel Holtmann	300acfdec9	Bluetooth: Introduce force_bredr_smp debugfs option for testing Testing cross-transport pairing that starts on BR/EDR is only valid when using a controller with BR/EDR Secure Connections. Devices will indicate this by providing BR/EDR SMP fixed channel over L2CAP. To allow testing of this feature on Bluetooth 4.0 controller or controllers without the BR/EDR Secure Connections features, introduce a force_bredr_smp debugfs option that allows faking the required AES connection. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-01-02 22:22:03 +01:00
Ben Pfaff	24cc59d1eb	openvswitch: Consistently include VLAN header in flow and port stats. Until now, when VLAN acceleration was in use, the bytes of the VLAN header were not included in port or flow byte counters. They were however included when VLAN acceleration was not used. This commit corrects the inconsistency, by always including the VLAN header in byte counters. Previous discussion at http://openvswitch.org/pipermail/dev/2014-December/049521.html Reported-by: Motonori Shindo <mshindo@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Reviewed-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:14:20 -05:00
Herbert Xu	843925f33f	tcp: Do not apply TSO segment limit to non-TSO packets Thomas Jarosch reported IPsec TCP stalls when a PMTU event occurs. In fact the problem was completely unrelated to IPsec. The bug is also reproducible if you just disable TSO/GSO. The problem is that when the MSS goes down, existing queued packet on the TX queue that have not been transmitted yet all look like TSO packets and get treated as such. This then triggers a bug where tcp_mss_split_point tells us to generate a zero-sized packet on the TX queue. Once that happens we're screwed because the zero-sized packet can never be removed by ACKs. Fixes: `1485348d24` ("tcp: Apply device TSO segment limit earlier") Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:13:20 -05:00
Florian Westphal	e8768f9715	net: skbuff: don't zero tc members when freeing skb Not needed, only four cases: - kfree_skb (or one of its aliases). Don't need to zero, memory will be freed. - kfree_skb_partial and head was stolen: memory will be freed. - skb_morph: The skb header fields (including tc ones) will be copied over from the 'to-be-morphed' skb right after skb_release_head_state returns. - skb_segment: Same as before, all the skb header fields are copied over from the original skb right away. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:04:29 -05:00
David S. Miller	6c032edc8a	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg say: ==================== pull request: bluetooth-next 2014-12-31 Here's the first batch of bluetooth patches for 3.20. - Cleanups & fixes to ieee802154 drivers - Fix synchronization of mgmt commands with respective HCI commands - Add self-tests for LE pairing crypto functionality - Remove 'BlueFritz!' specific handling from core using a new quirk flag - Public address configuration support for ath3012 - Refactor debugfs support into a dedicated file - Initial support for LE Data Length Extension feature from Bluetooth 4.2 Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:58:21 -05:00
Joe Stringer	a4c9ea5e8f	geneve: Add Geneve GRO support This results in an approximately 30% increase in throughput when handling encapsulated bulk traffic. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:46:41 -05:00
Jesse Gross	9b174d88c2	net: Add Transparent Ethernet Bridging GRO support. Currently the only tunnel protocol that supports GRO with encapsulated Ethernet is VXLAN. This pulls out the Ethernet code into a proper layer so that it can be used by other tunnel protocols such as GRE and Geneve. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:46:41 -05:00
Alexander Duyck	5405afd1a3	fib_trie: Add tracking value for suffix length This change adds a tracking value for the maximum suffix length of all prefixes stored in any given tnode. With this value we can determine if we need to backtrace or not based on if the suffix is greater than the pos value. By doing this we can reduce the CPU overhead for lookups in the local table as many of the prefixes there are 32b long and have a suffix length of 0 meaning we can immediately backtrace to the root node without needing to test any of the nodes between it and where we ended up. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	21d1f11db0	fib_trie: Remove checks for index >= tnode_child_length from tnode_get_child For some reason the compiler doesn't seem to understand that when we are in a loop that runs from tnode_child_length - 1 to 0 we don't expect the value of tn->bits to change. As such every call to tnode_get_child was rerunning tnode_chile_length which ended up consuming quite a bit of space in the resultant assembly code. I have gone though and verified that in all cases where tnode_get_child is used we are either winding though a fixed loop from tnode_child_length - 1 to 0, or are in a fastpath case where we are verifying the value by either checking for any remaining bits after shifting index by bits and testing for leaf, or by using tnode_child_length. size net/ipv4/fib_trie.o Before: text data bss dec hex filename 15506 376 8 15890 3e12 net/ipv4/fib_trie.o After: text data bss dec hex filename 14827 376 8 15211 3b6b net/ipv4/fib_trie.o Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	12c081a5c8	fib_trie: inflate/halve nodes in a more RCU friendly way This change pulls the node_set_parent functionality out of put_child_reorg and instead leaves that to the function to take care of as well. By doing this we can fully construct the new cluster of tnodes and all of the pointers out of it before we start routing pointers into it. I am suspecting this will likely fix some concurency issues though I don't have a good test to show as such. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	fc86a93b46	fib_trie: Push tnode flushing down to inflate/halve This change pushes the tnode freeing down into the inflate and halve functions. It makes more sense here as we have a better grasp of what is going on and when a given cluster of nodes is ready to be freed. I believe this may address a bug in the freeing logic as well. For some reason if the freelist got to a certain size we would call synchronize_rcu(). I'm assuming that what they meant to do is call synchronize_rcu() after they had handed off that much memory via call_rcu(). As such that is what I have updated the behavior to be. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	ff181ed876	fib_trie: Push assignment of child to parent down into inflate/halve This change makes it so that the assignment of the tnode to the parent is handled directly within whatever function is currently handling the node be it inflate, halve, or resize. By doing this we can avoid some of the need to set NULL pointers in the tree while we are resizing the subnodes. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	f05a48198b	fib_trie: Add functions should_inflate and should_halve This change pulls the logic for if we should inflate/halve the nodes out into separate functions. It also addresses what I believe is a bug where 1 full node is all that is needed to keep a node from ever being halved. Simple script to reproduce the issue: modprobe dummy; ifconfig dummy0 up for i in `seq 0 255`; do ifconfig dummy0:$i 10.0.${i}.1/24 up; done ifconfig dummy0:256 10.0.255.33/16 up for i in `seq 0 254`; do ifconfig dummy0:$i down; done Results from /proc/net/fib_triestat Before: Local: Aver depth: 3.00 Max depth: 4 Leaves: 17 Prefixes: 18 Internal nodes: 11 1: 8 2: 2 10: 1 Pointers: 1048 Null ptrs: 1021 Total size: 11 kB After: Local: Aver depth: 3.41 Max depth: 5 Leaves: 17 Prefixes: 18 Internal nodes: 12 1: 8 2: 3 3: 1 Pointers: 36 Null ptrs: 8 Total size: 3 kB Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	cf3637bb8f	fib_trie: Move resize to after inflate/halve This change consists of a cut/paste of resize to behind inflate and halve so that I could remove the two function prototypes. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	345e9b5426	fib_trie: Push rcu_read_lock/unlock to callers This change is to start cleaning up some of the rcu_read_lock/unlock handling. I realized while reviewing the code there are several spots that I don't believe are being handled correctly or are masking warnings by locally calling rcu_read_lock/unlock instead of calling them at the correct level. A common example is a call to fib_get_table followed by fib_table_lookup. The rcu_read_lock/unlock ought to wrap both but there are several spots where they were not wrapped. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	98293e8d2f	fib_trie: Use unsigned long for anything dealing with a shift by bits This change makes it so that anything that can be shifted by, or compared to a value shifted by bits is updated to be an unsigned long. This is mostly a precaution against an insanely huge address space that somehow starts coming close to the 2^32 root node size which would require something like 1.5 billion addresses. I chose unsigned long instead of unsigned long long since I do not believe it is possible to allocate a 32 bit tnode on a 32 bit system as the memory consumed would be 16GB + 28B which exceeds the addressible space for any one process. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	e9b44019d4	fib_trie: Update meaning of pos to represent unchecked bits This change moves the pos value to the other side of the "bits" field. By doing this it actually simplifies a significant amount of code in the trie. For example when halving a tree we know that the bit lost exists at oldnode->pos, and if we inflate the tree the new bit being add is at tn->pos. Previously to find those bits you would have to subtract pos and bits from the keylength or start with a value of (1 << 31) and then shift that. There are a number of spots throughout the code that benefit from this. In the case of the hot-path searches the main advantage is that we can drop 2 or more operations from the search path as we no longer need to compute the value for the index to be shifted by and can instead just use the raw pos value. In addition the tkey_extract_bits is now defunct and can be replaced by get_index since the two operations were doing the same thing, but now get_index does it much more quickly as it is only an xor and shift versus a pair of shifts and a subtraction. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	836a0123c9	fib_trie: Optimize fib_table_insert This patch updates the fib_table_insert function to take advantage of the changes made to improve the performance of fib_table_lookup. As a result the code should be smaller and run faster then the original. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	939afb0657	fib_trie: Optimize fib_find_node This patch makes use of the same features I made use of for fib_table_lookup to streamline fib_find_node. The resultant code should be smaller and run faster than the original. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	9f9e636d4f	fib_trie: Optimize fib_table_lookup to avoid wasting time on loops/variables This patch is meant to reduce the complexity of fib_table_lookup by reducing the number of variables to the bare minimum while still keeping the same if not improved functionality versus the original. Most of this change was started off by the desire to rid the function of chopped_off and current_prefix_length as they actually added very little to the function since they only applied when computing the cindex. I was able to replace them mostly with just a check for the prefix match. As long as the prefix between the key and the node being tested was the same we know we can search the tnode fully versus just testing cindex 0. The second portion of the change ended up being a massive reordering. Originally the calls to check_leaf were up near the start of the loop, and the backtracing and descending into lower levels of tnodes was later. This didn't make much sense as the structure of the tree means the leaves are always the last thing to be tested. As such I reordered things so that we instead have a loop that will delve into the tree and only exit when we have either found a leaf or we have exhausted the tree. The advantage of rearranging things like this is that we can fully inline check_leaf since there is now only one reference to it in the function. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	adaf981685	fib_trie: Merge leaf into tnode This change makes it so that leaf and tnode are the same struct. As a result there is no need for rt_trie_node anymore since everyting can be merged into tnode. On 32b systems this results in the leaf being 4 bytes larger, however I don't know if that is really an issue as this and an eariler patch that added bits & pos have increased the size from 20 to 28. If I am not mistaken slub/slab allocate on power of 2 sizes so 20 was likely being rounded up to 32 anyway. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	37fd30f2da	fib_trie: Merge tnode_free and leaf_free into node_free Both the leaf and the tnode had an rcu_head in them, but they had them in slightly different places. Since we now have them in the same spot and know that any node with bits == 0 is a leaf and the rest are either vmalloc or kmalloc tnodes depending on the value of bits it makes it easy to combine the functions and reduce overhead. In addition I have taken advantage of the rcu_head pointer to go ahead and put together a simple linked list instead of using the tnode pointer as this way we can merge either type of structure for freeing. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	64c9b6fb26	fib_trie: Make leaf and tnode more uniform This change makes some fundamental changes to the way leaves and tnodes are constructed. The big differences are: 1. Leaves now populate pos and bits indicating their full key size. 2. Trie nodes now mask out their lower bits to be consistent with the leaf 3. Both structures have been reordered so that rt_trie_node now consisists of a much larger region including the pos, bits, and rcu portions of the tnode structure. On 32b systems this will result in the leaf being 4B larger as the pos and bits values were added to a hole created by the key as it was only 4B in length. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:54 -05:00
Alexander Duyck	8274a97aa4	fib_trie: Update usage stats to be percpu instead of global variables The trie usage stats were currently being shared by all threads that were calling fib_table_lookup. As a result when multiple threads were performing lookups simultaneously the trie would begin to cache bounce between those threads. In order to prevent this I have updated the usage stats to use a set of percpu variables. By doing this we should be able to avoid the cache bouncing and still make use of these stats. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:53 -05:00
stephen hemminger	bec94d430f	gre: allow live address change The GRE tap device supports Ethernet over GRE, but doesn't care about the source address of the tunnel, therefore it can be changed without bring device down. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 14:18:28 -05:00
Bill Hong	33f72e6f0c	l2tp : multicast notification to the registered listeners Previously l2tp module did not provide any means for the user space to get notified when tunnels/sessions are added/modified/deleted. This change contains the following - create a multicast group for the listeners to register. - notify the registered listeners when the tunnels/sessions are created/modified/deleted. Signed-off-by: Bill Hong <bhong@brocade.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Sven-Thorsten Dietrich <sven@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 14:17:20 -05:00
Fabian Frederick	886eaa1fe6	tipc: replace 0 by NULL for pointers Fix sparse warning: net/tipc/link.c:1924:40: warning: Using plain integer as NULL pointer Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 13:11:39 -05:00
Jiri Kosina	831a39c241	Revert "cfg80211: make WEXT compatibility unselectable" This reverts commit `24a0aa212e`. It's causing severe userspace breakage. Namely, all the utilities from wireless-utils which are relying on CONFIG_WEXT (which means tools like 'iwconfig', 'iwlist', etc) are not working anymore. There is a 'iw' utility in newer wireless-tools, which is supposed to be a replacement for all the "deprecated" binaries, but it's far away from being massively adopted. Please see [1] for example of the userspace breakage this is causing. In addition to that, Larry Finger reports [2] that this patch is also causing ipw2200 driver being impossible to build. To me this clearly shows that CONFIG_WEXT is far, far away from being "deprecated enough" to be removed. [1] http://thread.gmane.org/gmane.linux.kernel/1857010 [2] http://thread.gmane.org/gmane.linux.network/343688 Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-12-30 16:42:29 -08:00
Linus Torvalds	2c90331cf5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix double SKB free in bluetooth 6lowpan layer, from Jukka Rissanen. 2) Fix receive checksum handling in enic driver, from Govindarajulu Varadarajan. 3) Fix NAPI poll list corruption in virtio_net and caif_virtio, from Herbert Xu. Also, add code to detect drivers that have this mistake in the future. 4) Fix doorbell endianness handling in mlx4 driver, from Amir Vadai. 5) Don't clobber IP6CB() before xfrm6_policy_check() is called in TCP input path,f rom Nicolas Dichtel. 6) Fix MPLS action validation in openvswitch, from Pravin B Shelar. 7) Fix double SKB free in vxlan driver, also from Pravin. 8) When we scrub a packet, which happens when we are switching the context of the packet (namespace, etc.), we should reset the secmark. From Thomas Graf. 9) ->ndo_gso_check() needs to do more than return true/false, it also has to allow the driver to clear netdev feature bits in order for the caller to be able to proceed properly. From Jesse Gross. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits) genetlink: A genl_bind() to an out-of-range multicast group should not WARN(). netlink/genetlink: pass network namespace to bind/unbind ne2k-pci: Add pci_disable_device in error handling bonding: change error message to debug message in __bond_release_one() genetlink: pass multicast bind/unbind to families netlink: call unbind when releasing socket netlink: update listeners directly when removing socket genetlink: pass only network namespace to genl_has_listeners() netlink: rename netlink_unbind() to netlink_undo_bind() net: Generalize ndo_gso_check to ndo_features_check net: incorrect use of init_completion fixup neigh: remove next ptr from struct neigh_table net: xilinx: Remove unnecessary temac_property in the driver net: phy: micrel: use generic config_init for KSZ8021/KSZ8031 net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding openvswitch: fix odd_ptr_err.cocci warnings Bluetooth: Fix accepting connections when not using mgmt Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR brcmfmac: Do not crash if platform data is not populated ipw2200: select CFG80211_WEXT ...	2014-12-30 10:45:47 -08:00
Marcel Holtmann	e64b4fb66c	Bluetooth: Add timing information to ECDH test case runs After successful completion of the ECDH test cases, print the time it took to run them. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-30 10:32:11 +02:00
Marcel Holtmann	255047b0dc	Bluetooth: Add timing information to SMP test case runs After successful completion of the SMP test cases, print the time it took to run them. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-30 10:32:08 +02:00
Johan Hedberg	fb2969a3a9	Bluetooth: Add LE Secure Connections tests for SMP This patch adds SMP self-tests for the Secure Connections crypto functions. The sample data has been taken from the core specification. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-30 08:54:34 +01:00
Johan Hedberg	cfc4198e71	Bluetooth: Add legacy SMP tests This patch adds self-tests for legacy SMP crypto functions. The sample data has been taken from the core specification. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-30 08:54:33 +01:00
Johan Hedberg	0a2b0f0452	Bluetooth: Add skeleton for SMP self-tests This patch adds the initial skeleton and kernel config option for SMP self-tests. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-30 08:54:33 +01:00
Johan Hedberg	0b6415b652	Bluetooth: Add support for ECDH test cases This patch adds the test cases for ECDH cryptographic functionality used by Bluetooth Low Energy Secure Connections feature. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-30 09:46:57 +02:00
Marcel Holtmann	ee485290c6	Bluetooth: Add support for self testing framework This add support for the Bluetooth self testing framework that allows running certain test cases of sample data to ensure correctness of its basic functionality. With this patch only the basic framework will be added. It contains the build magic that allows running this at module loading time or at late_initcall stage when built into the kernel image. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-30 08:53:55 +02:00
Johan Hedberg	4da50de895	Bluetooth: Fix const declarations for smp_f5 and smp_f6 These SMP crypto functions should have all their input parameters declared as const. This patch fixes the parameters that were missing the const declaration. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-30 07:30:18 +01:00
David S. Miller	dc97a1a947	genetlink: A genl_bind() to an out-of-range multicast group should not WARN(). Users can request to bind to arbitrary multicast groups, so warning when the requested group number is out of range is not appropriate. And with the warning removed, and the 'err' variable properly given an initial value, we can remove 'found' altogether. Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-29 16:31:49 -05:00
Johannes Berg	023e2cfa36	netlink/genetlink: pass network namespace to bind/unbind Netlink families can exist in multiple namespaces, and for the most part multicast subscriptions are per network namespace. Thus it only makes sense to have bind/unbind notifications per network namespace. To achieve this, pass the network namespace of a given client socket to the bind/unbind functions. Also do this in generic netlink, and there also make sure that any bind for multicast groups that only exist in init_net is rejected. This isn't really a problem if it is accepted since a client in a different namespace will never receive any notifications from such a group, but it can confuse the family if not rejected (it's also possible to silently (without telling the family) accept it, but it would also have to be ignored on unbind so families that take any kind of action on bind/unbind won't do unnecessary work for invalid clients like that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-27 03:07:50 -05:00
Johannes Berg	c380d9a7af	genetlink: pass multicast bind/unbind to families In order to make the newly fixed multicast bind/unbind functionality in generic netlink, pass them down to the appropriate family. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-27 02:20:23 -05:00
Johannes Berg	7d68536bed	netlink: call unbind when releasing socket Currently, netlink_unbind() is only called when the socket explicitly unbinds, which limits its usefulness (luckily there are no users of it yet anyway.) Call netlink_unbind() also when a socket is released, so it becomes possible to track listeners with this callback and without also implementing a netlink notifier (and checking netlink_has_listeners() in there.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-27 02:20:23 -05:00
Johannes Berg	b10dcb3b94	netlink: update listeners directly when removing socket The code is now confusing to read - first in one function down (netlink_remove) any group subscriptions are implicitly removed by calling __sk_del_bind_node(), but the subscriber database is only updated far later by calling netlink_update_listeners(). Move the latter call to just after removal from the list so it is easier to follow the code. This also enables moving the locking inside the kernel-socket conditional, which improves the normal socket destruction path. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-27 02:20:23 -05:00
Johannes Berg	f8403a2e47	genetlink: pass only network namespace to genl_has_listeners() There's no point to force the caller to know about the internal genl_sock to use inside struct net, just have them pass the network namespace. This doesn't really change code generation since it's an inline, but makes the caller less magic - there's never any reason to pass another socket. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-27 02:20:23 -05:00
Johannes Berg	02c81ab95d	netlink: rename netlink_unbind() to netlink_undo_bind() The new name is more expressive - this isn't a generic unbind function but rather only a little undo helper for use only in netlink_bind(). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-27 02:20:23 -05:00
David S. Miller	eb46e2215f	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== Here's one more bluetooth pull request for 3.19. We've got two fixes: - Fix for accepting connections with old user space versions of BlueZ - Fix for Bluetooth controllers that don't have a public address Both of these are regressions that were introduced in 3.17, so the appropriate Cc: stable annotations are provided. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-26 18:23:37 -05:00
Jesse Gross	5f35227ea3	net: Generalize ndo_gso_check to ndo_features_check GSO isn't the only offload feature with restrictions that potentially can't be expressed with the current features mechanism. Checksum is another although it's a general issue that could in theory apply to anything. Even if it may be possible to implement these restrictions in other ways, it can result in duplicate code or inefficient per-packet behavior. This generalizes ndo_gso_check so that drivers can remove any features that don't make sense for a given packet, similar to netif_skb_features(). It also converts existing driver restrictions to the new format, completing the work that was done to support tunnel protocols since the issues apply to checksums as well. By actually removing features from the set that are used to do offloading, it solves another problem with the existing interface. In these cases, GSO would run with the original set of features and not do anything because it appears that segmentation is not required. CC: Tom Herbert <therbert@google.com> CC: Joe Stringer <joestringer@nicira.com> CC: Eric Dumazet <edumazet@google.com> CC: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Tom Herbert <therbert@google.com> Fixes: `04ffcb255f` ("net: Add ndo_gso_check") Tested-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-26 17:20:56 -05:00
Jay Vosburgh	2c26d34bbc	net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding When using VXLAN tunnels and a sky2 device, I have experienced checksum failures of the following type: [ 4297.761899] eth0: hw csum failure [...] [ 4297.765223] Call Trace: [ 4297.765224] <IRQ> [<ffffffff8172f026>] dump_stack+0x46/0x58 [ 4297.765235] [<ffffffff8162ba52>] netdev_rx_csum_fault+0x42/0x50 [ 4297.765238] [<ffffffff8161c1a0>] ? skb_push+0x40/0x40 [ 4297.765240] [<ffffffff8162325c>] __skb_checksum_complete+0xbc/0xd0 [ 4297.765243] [<ffffffff8168c602>] tcp_v4_rcv+0x2e2/0x950 [ 4297.765246] [<ffffffff81666ca0>] ? ip_rcv_finish+0x360/0x360 These are reliably reproduced in a network topology of: container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch When VXLAN encapsulated traffic is received from a similarly configured peer, the above warning is generated in the receive processing of the encapsulated packet. Note that the warning is associated with the container eth0. The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and because the packet is an encapsulated Ethernet frame, the checksum generated by the hardware includes the inner protocol and Ethernet headers. The receive code is careful to update the skb->csum, except in __dev_forward_skb, as called by dev_forward_skb. __dev_forward_skb calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN) to skip over the Ethernet header, but does not update skb->csum when doing so. This patch resolves the problem by adding a call to skb_postpull_rcsum to update the skb->csum after the call to eth_type_trans. Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-26 16:16:51 -05:00
Marcel Holtmann	0f3adeae60	Bluetooth: Remove BlueFritz! specific check from initialization The AVM BlueFritz! USB controllers had a special handling in the Bluetooth core when it comes to reading the supported commands. Both drivers now set the HCI_QUIRK_BROKEN_LOCAL_COMMANDS and with that it is no longer needed to look for vendor specific details. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-26 20:16:14 +02:00
Wu Fengguang	4aa6118811	openvswitch: fix odd_ptr_err.cocci warnings net/openvswitch/vport-gre.c:188:5-11: inconsistent IS_ERR and PTR_ERR, PTR_ERR on line 189 PTR_ERR should access the value just tested by IS_ERR Semantic patch information: There can be false positives in the patch case, where it is the call IS_ERR that is wrong. Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci CC: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-24 15:18:09 -05:00
Johan Hedberg	6a8fc95c87	Bluetooth: Fix accepting connections when not using mgmt When connectable mode is enabled (page scan on) through some non-mgmt method the HCI_CONNECTABLE flag will not be set. For backwards compatibility with user space versions not using mgmt we should not require HCI_CONNECTABLE to be set if HCI_MGMT is not set. Reported-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.17+	2014-12-24 20:02:00 +01:00
Marcel Holtmann	8bfe8442ff	Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR When controllers set the HCI_QUIRK_INVALID_BDADDR flag, it is required by userspace to program a valid public Bluetooth device address into the controller before it can be used. After successful address configuration, the internal state changes and the controller runs the complete initialization procedure. However one small difference is that this is no longer the HCI_SETUP stage. The HCI_SETUP stage is only valid during initial controller setup. In this case the stack runs the initialization as part of the HCI_CONFIG stage. The controller version information, default name and supported commands are only stored during HCI_SETUP. While these information are static, they are not read initially when HCI_QUIRK_INVALID_BDADDR is set. So when running in HCI_CONFIG state, these information need to be updated as well. This especially impacts Bluetooth 4.1 and later controllers using extended feature pages and second event mask page. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org # 3.17+	2014-12-24 20:35:46 +02:00
Thomas Graf	b8fb4e0648	net: Reset secmark when scrubbing packet skb_scrub_packet() is called when a packet switches between a context such as between underlay and overlay, between namespaces, or between L3 subnets. While we already scrub the packet mark, connection tracking entry, and cached destination, the security mark/context is left intact. It seems wrong to inherit the security context of a packet when going from overlay to underlay or across forwarding paths. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-24 00:21:43 -05:00
Toshiaki Makita	796f2da81b	net: Fix stacked vlan offload features computation When vlan tags are stacked, it is very likely that the outer tag is stored in skb->vlan_tci and skb->protocol shows the inner tag's vlan_proto. Currently netif_skb_features() first looks at skb->protocol even if there is the outer tag in vlan_tci, thus it incorrectly retrieves the protocol encapsulated by the inner vlan instead of the inner vlan protocol. This allows GSO packets to be passed to HW and they end up being corrupted. Fixes: `58e998c6d2` ("offloading: Force software GSO for multiple vlan tags.") Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-24 00:08:33 -05:00
Pravin B Shelar	997e068ebc	openvswitch: Fix vport_send double free Today vport-send has complex error handling because it involves freeing skb and updating stats depending on return value from vport send implementation. This can be simplified by delegating responsibility of freeing skb to the vport implementation for all cases. So that vport-send needs just update stats. Fixes: `91b7514cdf` ("openvswitch: Unify vport error stats handling") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:57:31 -05:00
Pravin B Shelar	cbe7e76d94	openvswitch: Fix GSO with multiple MPLS label. MPLS GSO needs to know inner most protocol to process GSO packets. Fixes: `25cd9ba0ab` ("openvswitch: Add basic MPLS support to kernel"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:57:31 -05:00
Pravin B Shelar	ec449f40bb	openvswitch: Fix MPLS action validation. Linux stack does not implement GSO for packet with multiple encapsulations. Therefore there was check in MPLS action validation to detect such case, But this check introduced bug which deleted one or more actions from actions list. Following patch removes this check to fix the validation. Fixes: `25cd9ba0ab` ("openvswitch: Add basic MPLS support to kernel"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reported-by: Srinivas Neginhal <sneginha@vmware.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:57:31 -05:00
Pravin B Shelar	4cc1beca30	mpls: Fix allowed protocols for mpls gso MPLS and Tunnel GSO does not work together. Reject packet which request such GSO. Fixes: `0d89d2035f` ("MPLS: Add limited GSO support"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:57:31 -05:00
Pravin B Shelar	d0edc7bf39	mpls: Fix config check for mpls. Fixes MPLS GSO for case when mpls is compiled as kernel module. Fixes: `0d89d2035f` ("MPLS: Add limited GSO support"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:57:30 -05:00
Herbert Xu	ceb8d5bf17	net: Rearrange loop in net_rx_action This patch rearranges the loop in net_rx_action to reduce the amount of jumping back and forth when reading the code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:20:21 -05:00
Herbert Xu	6bd373ebba	net: Always poll at least one device in net_rx_action We should only perform the softnet_break check after we have polled at least one device in net_rx_action. Otherwise a zero or negative setting of netdev_budget can lock up the whole system. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:20:21 -05:00
Herbert Xu	001ce546bb	net: Detect drivers that reschedule NAPI and exhaust budget The commit `d75b1ade56` (net: less interrupt masking in NAPI) required drivers to leave poll_list empty if the entire budget is consumed. We have already had two broken drivers so let's add a check for this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:20:21 -05:00
Herbert Xu	726ce70e9e	net: Move napi polling code out of net_rx_action This patch creates a new function napi_poll and moves the napi polling code from net_rx_action into it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:20:21 -05:00
Antonio Quartulli	0d16449195	batman-adv: avoid NULL dereferences and fix if check Gateway having bandwidth_down equal to zero are not accepted at all and so never added to the Gateway list. For this reason checking the bandwidth_down member in batadv_gw_out_of_range() is useless. This is probably a copy/paste error and this check was supposed to be "!gw_node" only. Moreover, the way the check is written now may also lead to a NULL dereference. Fix this by rewriting the if-condition properly. Introduced by `414254e342` ("batman-adv: tvlv - gateway download/upload bandwidth container") Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:13:37 -05:00
Sven Eckelmann	0402e444cd	batman-adv: Unify fragment size calculation The fragmentation code was replaced in `610bfc6bc9` ("batman-adv: Receive fragmented packets and merge") by an implementation which can handle up to 16 fragments of a packet. The packet is prepared for the split in fragments by the function batadv_frag_send_packet and the actual split is done by batadv_frag_create. Both functions calculate the size of a fragment themself. But their calculation differs because batadv_frag_send_packet also subtracts ETH_HLEN. Therefore, the check in batadv_frag_send_packet "can a full fragment can be created?" may return true even when batadv_frag_create cannot create a full fragment. The function batadv_frag_create doesn't check the size of the skb before splitting it and therefore might try to create a larger fragment than the remaining buffer. This creates an integer underflow and an invalid len is given to skb_split. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:13:37 -05:00
Sven Eckelmann	5b6698b0e4	batman-adv: Calculate extra tail size based on queued fragments The fragmentation code was replaced in `610bfc6bc9` ("batman-adv: Receive fragmented packets and merge"). The new code provided a mostly unused parameter skb for the merging function. It is used inside the function to calculate the additionally needed skb tailroom. But instead of increasing its own tailroom, it is only increasing the tailroom of the first queued skb. This is not correct in some situations because the first queued entry can be a different one than the parameter. An observed problem was: 1. packet with size 104, total_size 1464, fragno 1 was received - packet is queued 2. packet with size 1400, total_size 1464, fragno 0 was received - packet is queued at the end of the list 3. enough data was received and can be given to the merge function (1464 == (1400 - 20) + (104 - 20)) - merge functions gets 1400 byte large packet as skb argument 4. merge function gets first entry in queue (104 byte) - stored as skb_out 5. merge function calculates the required extra tail as total_size - skb->len - pskb_expand_head tail of skb_out with 64 bytes 6. merge function tries to squeeze the extra 1380 bytes from the second queued skb (1400 byte aka skb parameter) in the 64 extra tail bytes of skb_out Instead calculate the extra required tail bytes for skb_out also using skb_out instead of using the parameter skb. The skb parameter is only used to get the total_size from the last received packet. This is also the total_size used to decide that all fragments were received. Reported-by: Philipp Psurek <philipp.psurek@gmail.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:13:37 -05:00
Jason Wang	af6dabc9c7	net: drop the packet when fails to do software segmentation or header check Commit `cecda693a9` ("net: keep original skb which only needs header checking during software GSO") keeps the original skb for packets that only needs header check, but it doesn't drop the packet if software segmentation or header check were failed. Fixes `cecda693a9` ("net: keep original skb which only needs header checking during software GSO") Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-23 23:12:11 -05:00
leroy christophe	7b5bca4676	netfilter: nf_tables: fix port natting in little endian archs Make sure this fetches 16-bits port data from the register. Remove casting to make sparse happy, not needed anymore. Signed-off-by: leroy christophe <christophe.leroy@c-s.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-23 15:34:28 +01:00
Fabian Frederick	8aefc4d1c6	netfilter: log: remove unnecessary sizeof(char) sizeof(char) is always 1. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-23 14:33:58 +01:00
Joe Perches	372e28661a	netfilter: xt_osf: Use continue to reduce indentation Invert logic in test to use continue. This routine already uses continue, use it a bit more to minimize > 80 column long lines and unnecessary indentation. No change in compiled object file. Other miscellanea: o Remove trailing whitespace o Realign arguments to multiline statement Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-23 14:20:10 +01:00
Marcelo Leitner	88eab472ec	netfilter: conntrack: adjust nf_conntrack_buckets default value Manually bumping either nf_conntrack_buckets or nf_conntrack_max has become a common task as our Linux servers tend to serve more and more clients/applications, so let's adjust nf_conntrack_buckets this to a more updated value. Now for systems with more than 4GB of memory, nf_conntrack_buckets becomes 65536 instead of 16384, resulting in nf_conntrack_max=256k entries. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-23 14:20:10 +01:00
Eliad Peller	85b89af07d	mac80211: fix dot11MulticastTransmittedFrameCount tested address dot11MulticastTransmittedFrameCount should be updated according to the DA, which might be different from A1. Checking A1 results in the counter being 0 in case of station, as to-DS data frames use A1 for the BSSID. This behaviour is defined in state machines, specifically in the sta_tx_dcf_3.1d(10) description of 802.11-2012. Signed-off-by: Eliad Peller <eliad@wizery.com> [rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-23 10:16:43 +01:00
Nicolas Dichtel	2dc49d1680	tcp6: don't move IP6CB before xfrm6_policy_check() When xfrm6_policy_check() is used, _decode_session6() is called after some intermediate functions. This function uses IP6CB(), thus TCP_SKB_CB() must be prepared after the call of xfrm6_policy_check(). Before this patch, scenarii with IPv6 + TCP + IPsec Transport are broken. Fixes: `971f10eca1` ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Reported-by: Huaibin Wang <huaibin.wang@6wind.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-22 16:48:01 -05:00
Dan Collins	da413eec72	packet: Fixed TPACKET V3 to signal poll when block is closed rather than every packet Make TPACKET_V3 signal poll when block is closed rather than for every packet. Side effect is that poll will be signaled when block retire timer expires which didn't previously happen. Issue was visible when sending packets at a very low frequency such that all blocks are retired before packets are received by TPACKET_V3. This caused avoidable packet loss. The fix ensures that the signal is sent when blocks are closed which covers the normal path where the block is filled as well as the path where the timer expires. The case where a block is filled without moving to the next block (ie. all blocks are full) will still cause poll to be signaled. Signed-off-by: Dan Collins <dan@dcollins.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-22 15:41:15 -05:00
Marcel Holtmann	72e4a6bd02	Bluetooth: Remove duplicate constant for RFCOMM PSM The RFCOMM_PSM constant is actually a duplicate. So remove it and use the L2CAP_PSM_RFCOMM constant instead. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 19:55:04 +02:00
Marcel Holtmann	23b9ceb74f	Bluetooth: Create debugfs directory for each connection handle For every internal representation of a Bluetooth connection which is identified by hci_conn, create a debugfs directory with the handle number as directory name. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 19:54:24 +02:00
Marcel Holtmann	a8e1bfaa55	Bluetooth: Store default and maximum LE data length settings When the controller supports the LE Data Length Extension feature, the default and maximum data length are read and now stored. For backwards compatibility all values are initialized to the data length values from Bluetooth 4.1 and earlier specifications. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 17:52:21 +02:00
Marcel Holtmann	a9f6068e00	Bluetooth: Enable basics for LE Data Length Extension feature When the controller supports the new LE Data Length Extension feature from Bluetooth 4.2 specification, enable the new events and read the values for default and maxmimum data length supported by the controller. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 17:52:01 +02:00
Marcel Holtmann	3a5c82b78f	Bluetooth: Move LE debugfs file creation into hci_debugfs.c This patch moves the creation of the debugs files for LE controllers into hci_debugfs.c file. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 17:51:23 +02:00
Marcel Holtmann	71c3b60ec6	Bluetooth: Move BR/EDR debugfs file creation into hci_debugfs.c This patch moves the creation of the debugs files for BR/EDR controllers into hci_debugfs.c file. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 17:51:07 +02:00
Marcel Holtmann	40ce72b195	Bluetooth: Move common debugfs file creation into hci_debugfs.c This patch moves the creation of the debugs files common for all controllers into hci_debugfs.c file. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 17:51:01 +02:00
Marcel Holtmann	60c5f5fb1f	Bluetooth: Add skeleton functions for debugfs creation The debugfs file creation has been part of the core initialization handling of controllers. With the introduction of Bluetooth 4.2 core specification, the number of debugfs files is increasing even further. To avoid cluttering the core controller handling, create a separate file hci_debugfs.c to centralize all debugfs file creation. For now leave the current files in the core, but in the future all debugfs file creation will be moved. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 17:50:34 +02:00
Marcel Holtmann	50b5b952b7	Bluetooth: Support static address when BR/EDR has been disabled Every BR/EDR/LE dual-mode controller requires to have a public address and so far that has become the identity address and own address. The only way to change that behavior was with a force_static_address debugfs option. However the host can actually disable the BR/EDR part of a dual-mode controller and turn into a single mode LE controller. In that case it makes perfect sense for a host to use a chosen static address instead of the public address. So if the host disables BR/EDR and configures a static address, then that static address is used as identity address and own address. If the host does not configure a static address, then the public address is used as before. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-20 09:29:49 +02:00
Linus Torvalds	ecb5ec044a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile #3 from Al Viro: "Assorted fixes and patches from the last cycle" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: [regression] chunk lost from bd9b51 vfs: make mounts and mountstats honor root dir like mountinfo does vfs: cleanup show_mountinfo init: fix read-write root mount unfuck binfmt_misc.c (broken by commit `e6084d4`) vm_area_operations: kill ->migrate() new helper: iter_is_iovec() move_extent_per_page(): get rid of unused w_flags lustre: get rid of playing with ->fs btrfs: filp_open() returns ERR_PTR() on failure, not NULL...	2014-12-19 18:19:19 -08:00
Alexander Aring	cd1c56653a	ieee802154: iface: move multiple node type check This patch moves the handling for checking on multiple node type interface to the corresponding concurrent iface check function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-20 00:06:55 +01:00
Alexander Aring	bd37a78de9	mac802154: iface: check concurrent ifaces This patch adds a check for concurrent interfaces while calling interface up. This avoids to have different mac parameters on one phy. Otherwise it could be that a interface can overwrite current phy mac settings which is set by an another interface. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-20 00:06:55 +01:00
Johan Hedberg	405a26110a	Bluetooth: Move hci_update_page_scan to hci_request.c This is a left-over from the patch that created hci_request.c. The hci_update_page_scan functions should have been moved from hci_core.c there. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 22:23:19 +01:00
Johan Hedberg	9df7465351	Bluetooth: Add return parameter to cmd_complete callbacks The cmd_complete callbacks for pending mgmt commands may fail e.g. in the case of memory allocation. Previously this error would be caught and returned to user space in the form of a failed write on the mgmt socket (when the error happened in the mgmt command handler) but with the introduction of the generic cmd_complete callback this information was lost. This patch returns the feature by making cmd_complete callbacks return int instead of void. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 22:06:37 +01:00
Johan Hedberg	5a154e6f71	Bluetooth: Fix Add Device to wait for HCI before sending cmd_complete This patch updates the Add Device mgmt command handler to use a hci_request to wait for HCI command completion before notifying user space of the mgmt command completion. To do this we need to add an extra hci_request parameter to the hci_conn_params_set function. Since this function has no other users besides mgmt.c it's moved there as a static function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 22:06:37 +01:00
Johan Hedberg	51ef3ebe7b	Bluetooth: Fix Remove Device to wait for HCI before sending cmd_complete This patch updates the Remove Device mgmt command handler to use a hci_request to wait for HCI command completion before notifying user space of the mgmt command completion. This way we ensure that once the mgmt command returns all HCI commands triggered by it have also completed. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 22:06:37 +01:00
Johan Hedberg	2cf22218b0	Bluetooth: Add hci_request support for hci_update_background_scan Many places using hci_update_background_scan() try to synchronize whatever they're doing with the help of hci_request callbacks. However, since the hci_update_background_scan() function hasn't so far accepted a hci_request pointer any commands triggered by it have been left out by the synchronization. This patch modifies the API in a similar way as was done for hci_update_page_scan, i.e. there's a variant that takes a hci_request and another one that takes a hci_dev. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 22:06:37 +01:00
Felix Fietkau	8d819a92cc	mac80211: minstrel: reduce size of struct minstrel_rate_stats On minstrel_ht, the size of the per-sta struct is almost 18k, making it an order-3 allocation. A few fields inside the per-rate statistics are bigger than they need to be. This patch reduces the size enough to cut down the per-sta struct to about 13k (order-2 allocation). Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-19 21:34:22 +01:00
Al Viro	71bb99a02b	Bluetooth: bnep: bnep_add_connection() should verify that it's dealing with l2cap socket same story as cmtp Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 13:48:27 +01:00
Al Viro	96c26653ce	Bluetooth: cmtp: cmtp_add_connection() should verify that it's dealing with l2cap socket ... rather than relying on ciptool(8) never passing it anything else. Give it e.g. an AF_UNIX connected socket (from socketpair(2)) and it'll oops, trying to evaluate &l2cap_pi(sock->sk)->chan->dst... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 13:48:21 +01:00
Al Viro	51bda2bca5	Bluetooth: hidp_connection_add() unsafe use of l2cap_pi() it's OK after we'd verified the sockets, but not before that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 13:40:07 +01:00
Jukka Rissanen	004fa5ed08	Bluetooth: 6lowpan: Do not free skb when packet is dropped If we need to drop the message because of some error in the compression etc, then do not free the skb as that is done automatically in other part of networking stack. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 13:39:50 +01:00
Al Viro	e3bb504efd	[regression] chunk lost from bd9b51 Reported-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-19 07:13:21 -05:00
Johan Hedberg	0857dd3bed	Bluetooth: Split hci_request helpers to hci_request.[ch] None of the hci_request related things in net/bluetooth/hci_core.h are needed anywhere outside of the core bluetooth module. This patch creates a new net/bluetooth/hci_request.c file with its corresponding h-file and moves the functionality there from hci_core.c and hci_core.h. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 13:04:42 +01:00
Johan Hedberg	1d2dc5b7b3	Bluetooth: Split hci_update_page_scan into two functions To keep the parameter list and its semantics clear it makes sense to split the hci_update_page_scan function into two separate functions: one taking a hci_dev and another taking a hci_request. The one taking a hci_dev constructs its own hci_request and then calls the other function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 12:52:18 +01:00
Linus Torvalds	00c845dbfe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix NBMA tunnel mac header handling in GRE, from Timo Teräs. 2) Fix a NAPI race in the fec driver, from Nimrod Andy. 3) The new IFF_VNET_LE bit is outside the size of the flags member it is stored in (which is 16-bits), store the state locally in the drivers. From Michael S Tsirkin. 4) We are kicking the tires with the new wireless maintainership situation. Bluetooth fixes via Johan Hedberg, and mac80211 fixes from Johannes Berg. 5) Fix locking and leaks in geneve driver, from Jesse Gross. 6) Make netlink TX mmap code always copy, so we don't have to be potentially exposed to the user changing the underlying contents from underneath us. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits) be2net: Fix incorrect setting of tunnel offload flag in netdev features bnx2x: fix typos in "configure" xen-netback: support frontends without feature-rx-notify again MAINTAINERS: changes for wireless cxgb4: Fix decoding QSA module for ethtool get settings geneve: Fix races between socket add and release. geneve: Remove socket and offload handlers at destruction. netlink: Don't reorder loads/stores before marking mmap netlink frame as available netlink: Always copy on mmap TX. Bluetooth: Fix bug with filter in service discovery optimization mac80211: free management frame keys when removing station net: Disallow providing non zero VLAN ID for NIC drivers FDB add flow net/mlx4: Cache line CQE/EQE stride fixes net: fec: Fix NAPI race xen-netfront: use napi_complete() correctly to prevent Rx stalling ip_tunnel: Add missing validation of encap type to ip_tunnel_encap_setup() ip_tunnel: Add sanity checks to ip_tunnel_encap_add_ops() net: Allow FIXED_PHY to be modular. if_tun: drop broken IFF_VNET_LE macvtap: drop broken IFF_VNET_LE ...	2014-12-18 16:41:13 -08:00
Alexander Aring	ba2a9506a7	nl802154: introduce support for cca settings This patch adds support for setting cca parameters via nl802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 00:19:23 +01:00
Alexander Aring	7fe9a3882b	ieee802154: rework cca setting The current cca setting handle is a driver specific call. We need to introduce some 802.15.4 specific layer and mapping 802.15.4 cca modes to driver specific ones inside the 802.15.4 driver. This patch will add such 802.15.4 layer and mapping the cca settings to driver specific ones. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-19 00:19:23 +01:00
David S. Miller	86c8fc4bbe	This relatively large collection of fixes is all over - from things that were broken a long time ago (the management key issue) but not noticed yet, to small issues that were only introduced into 3.19 (like the multicast issue). At least one issue is old but can crash the kernel based on invalid userspace requests (the nl80211 matches array one.) -----BEGIN PGP SIGNATURE----- iQIbBAABCAAGBQJUkt82AAoJEDBSmw7B7bqrk5YP+NN821jzWqTPZy7mS1NgwRrN 4VjPw1J6ba5LRCfS8cYOcG5zPq4i88DdiRzsFe7TW1g3ayy5tXNp0FzcgJ4lBV5T cWpXybtCYwnJxlaRBNmS9H7ZOCSjVJH8B1U75t5jmY8ondZ7mRUpN6Od2b6rKfq6 8HDZ+I7fr+tPeb/1tj9iq0XaJAPvZOpoJzA5Hgjvs6UOURAxIxRrU4l9LZ/bd6Uf jz0Kg0u59d9YNDn9MPX2KmMOyjbJ2t5MYHsAfxOB9usnaq8Toq1ZEszApiIXuyz/ GsgLBYL9gVpIUU95Us7lDv+sqT7cFflNvE9JtgcB5dkSxjZrRNLnJM09NggkOI9a A1BGReZODj2UPEKdXgWMC4Jr63xLZW5vr7mX9c8egQu2U/eqLACcz0ewMxMxaoEO bXEjegDv6FaGuGU1ul8D03cvgznLjoMUtnkcu5UIFLIt7uf0Ohrn8bFovrpo0DX2 wQt8VkOIrRBHXkvTDIZCyTbfZn9LI9M50q8YEZWX84wrA7gDkFQ6ByNrxdFET4zA Hjpjkwi9CbpaJ66Xp0WFbWJ6AMAfVJHYBTHnRU36nfbNb82qD6O+1nZVH7xGjbHB 7zk+nrXKjhdnCBhhIZBku+Fkug6+wBdZNkTFSch6Q1ZO+aLb4Z02j+yb1fylAzjf ar3E7Ssx+jtGfJ9RQAk= =iSDP -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2014-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== pull-request: mac80211 2014-12-18 Also from me a first pull request - we have a number of really old issues that happened to crop up now with new work (or just more testing) in the right areas as well as some small bugs newly introduced in 3.19. Let me know if there are any problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-18 15:33:49 -05:00
David S. Miller	7dce675b28	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2014-12-17 Here's the first direct (i.e. skipping the wireless tree) bluetooth pull request for you, intended for 3.19. It's just one patch: a fix from Marcel for for remote service discovery filtering which also fixes a 'used uninitialized' compiler warning. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-18 15:32:27 -05:00
Pablo Neira Ayuso	70314fc684	Merge tag 'ipvs2-for-v3.19' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next into ipvs-next Simon Horman says: ==================== Second round of IPVS Updates for v3.19 please consider these IPVS updates for v3.19 or alternatively v3.20. The single patch in this series fixes a long standing bug that has not caused any trouble and thus is not being prioritised as a fix. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-18 20:54:26 +01:00
Jesse Gross	12069401d8	geneve: Fix races between socket add and release. Currently, searching for a socket to add a reference to is not synchronized with deletion of sockets. This can result in use after free if there is another operation that is removing a socket at the same time. Solving this requires both holding the appropriate lock and checking the refcount to ensure that it has not already hit zero. Inspired by a related (but not exactly the same) issue in the VXLAN driver. Fixes: `0b5e8b8e` ("net: Add Geneve tunneling protocol driver") CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-18 12:38:13 -05:00
Jesse Gross	7ed767f731	geneve: Remove socket and offload handlers at destruction. Sockets aren't currently removed from the the global list when they are destroyed. In addition, offload handlers need to be cleaned up as well. Fixes: `0b5e8b8e` ("net: Add Geneve tunneling protocol driver") CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-18 12:38:13 -05:00
Thomas Graf	a18e6a186f	netlink: Don't reorder loads/stores before marking mmap netlink frame as available Each mmap Netlink frame contains a status field which indicates whether the frame is unused, reserved, contains data or needs to be skipped. Both loads and stores may not be reordeded and must complete before the status field is changed and another CPU might pick up the frame for use. Use an smp_mb() to cover needs of both types of callers to netlink_set_status(), callers which have been reading data frame from the frame, and callers which have been filling or releasing and thus writing to the frame. - Example code path requiring a smp_rmb(): memcpy(skb->data, (void *)hdr + NL_MMAP_HDRLEN, hdr->nm_len); netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED); - Example code path requiring a smp_wmb(): hdr->nm_uid = from_kuid(sk_user_ns(sk), NETLINK_CB(skb).creds.uid); hdr->nm_gid = from_kgid(sk_user_ns(sk), NETLINK_CB(skb).creds.gid); netlink_frame_flush_dcache(hdr); netlink_set_status(hdr, NL_MMAP_STATUS_VALID); Fixes: f9c228 ("netlink: implement memory mapped recvmsg()") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-18 12:35:55 -05:00
David Miller	4682a03586	netlink: Always copy on mmap TX. Checking the file f_count and the nlk->mapped count is not completely sufficient to prevent the mmap'd area contents from changing from under us during netlink mmap sendmsg() operations. Be careful to sample the header's length field only once, because this could change from under us as well. Fixes: `5fd96123ee` ("netlink: implement memory mapped sendmsg()") Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch>	2014-12-18 12:35:23 -05:00
Jukka Rissanen	93a1e86ce1	nl80211: Stop scheduled scan if netlink client disappears An attribute NL80211_ATTR_SOCKET_OWNER can be set by the scan initiator. If present, the attribute will cause the scan to be stopped if the client dies. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-18 14:38:44 +01:00
Jukka Rissanen	31a60ed1e9	nl80211: Convert sched_scan_req pointer to RCU pointer Because of possible races when accessing sched_scan_req pointer in rdev, the sched_scan_req is converted to RCU pointer. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-18 14:38:09 +01:00
Linus Torvalds	57666509b7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph updates from Sage Weil: "The big item here is support for inline data for CephFS and for message signatures from Zheng. There are also several bug fixes, including interrupted flock request handling, 0-length xattrs, mksnap, cached readdir results, and a message version compat field. Finally there are several cleanups from Ilya, Dan, and Markus. Note that there is another series coming soon that fixes some bugs in the RBD 'lingering' requests, but it isn't quite ready yet" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits) ceph: fix setting empty extended attribute ceph: fix mksnap crash ceph: do_sync is never initialized libceph: fixup includes in pagelist.h ceph: support inline data feature ceph: flush inline version ceph: convert inline data to normal data before data write ceph: sync read inline data ceph: fetch inline data when getting Fcr cap refs ceph: use getattr request to fetch inline data ceph: add inline data to pagecache ceph: parse inline data in MClientReply and MClientCaps libceph: specify position of extent operation libceph: add CREATE osd operation support libceph: add SETXATTR/CMPXATTR osd operations support rbd: don't treat CEPH_OSD_OP_DELETE as extent op ceph: remove unused stringification macros libceph: require cephx message signature by default ceph: introduce global empty snap context ceph: message versioning fixes ...	2014-12-17 16:03:12 -08:00
Marcel Holtmann	ea8ae2516a	Bluetooth: Fix bug with filter in service discovery optimization The optimization for filtering out extended inquiry results, advertising reports or scan response data based on provided UUID list has a logic bug. In case no match is found in the advertising data, the scan response is ignored and not checked against the filter. This will lead to events being filtered wrongly. Change the code to actually only drop the events when the scan response data is not present. If it is present, it needs to be checked against the provided filter. The patch is a bit more complex than it needs to be. That is because it also fixes this compiler warning that some gcc versions produce. CC net/bluetooth/mgmt.o net/bluetooth/mgmt.c: In function ‘mgmt_device_found’: net/bluetooth/mgmt.c:7028:7: warning: ‘match’ may be used uninitialized in this function [-Wmaybe-uninitialized] bool match; ^ It seems that gcc can not clearly figure out the context of the match variable. So just change the branches for the extended inquiry response and advertising data around so that it is clear. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-17 22:03:49 +02:00
Yan, Zheng	715e4cd405	libceph: specify position of extent operation allow specifying position of extent operation in multi-operations osd request. This is required for cephfs to convert inline data to normal data (compare xattr, then write object). Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>	2014-12-17 20:09:52 +03:00
Yan, Zheng	864e9197f1	libceph: add CREATE osd operation support Add CEPH_OSD_OP_CREATE support. Also change libceph to not treat CEPH_OSD_OP_DELETE as an extent op and add an assert to that end. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>	2014-12-17 20:09:51 +03:00
Yan, Zheng	d74b50bed0	libceph: add SETXATTR/CMPXATTR osd operations support Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>	2014-12-17 20:09:51 +03:00
Yan, Zheng	a3fc98005c	libceph: require cephx message signature by default Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>	2014-12-17 20:09:51 +03:00
Yan, Zheng	33d0733796	libceph: message signature support Signed-off-by: Yan, Zheng <zyan@redhat.com>	2014-12-17 20:09:50 +03:00
Yan, Zheng	ae385eaf24	libceph: store session key in cephx authorizer Session key is required when calculating message signature. Save the session key in authorizer, this avoid lookup ticket handler for each message Signed-off-by: Yan, Zheng <zyan@redhat.com>	2014-12-17 20:09:50 +03:00
Ilya Dryomov	4965fc38c4	libceph: nuke ceph_kvfree() Use kvfree() from linux/mm.h instead, which is identical. Also fix the ceph_buffer comment: we will allocate with kmalloc() up to 32k - the value of PAGE_ALLOC_COSTLY_ORDER, but that really is just an implementation detail so don't mention it at all. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>	2014-12-17 20:09:50 +03:00
Eliad Peller	0f8b824561	mac80211: avoid reconfig if no interfaces are up If there are no interfaces up, there is no reason to continue the reconfig flow. The current code might end up calling driver callbacks (e.g. resume(), reconfig_complete()) while the driver is already stopped. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 15:45:17 +01:00
Luciano Coelho	1a952c94b5	mac80211: remove unused variable in ieee80211_parse_ch_switch_ie() The ht_oper variable is assigned a value, but never used in ieee80211_parse_ch_switch_ie(). Remove it. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 15:45:17 +01:00
Eliad Peller	1c45c5ce32	mac80211: update sta bw on ht chanwidth action frame Commit `e1a0c6b` ("mac80211: stop toggling IEEE80211_HT_CAP_SUP_WIDTH_20_40") mistakenly removed the actual update of sta->sta.bandwidth. Refactor ieee80211_sta_cur_vht_bw() into multiple functions (calculate caps-bw and chandef-bw separately, and min them with cur_max_bandwidth). On ht chanwidth action frame set only cur_max_bandwidth (according to the sta capabilities) and recalc the sta bw. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 15:45:16 +01:00
Moshe Benji	a5fee9cb62	mac80211: handle power constraint and country IEs in RRM In beacons, handle the Country IE even if no Power Constraint IE is present, and, capability wise, also in case that the Radio Measurements capability is enabled. In cases where the Country IE should be handled and that the Power Constraint IE is not present, the Country IE alone will set the power limit (and not both Country and Power Constraint IEs). Signed-off-by: Moshe Benji <moshe.benji@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 15:45:16 +01:00
Chaya Rachel Ivgi	179b8fc7fb	mac80211: Fix ignored HT override configurations HT override configurations was ignored when choosing the channel (until now, the override configuration affected only the capabilities shown in the IEs). The override configurations received only on association time, so in this case we should determine the channel again. Signed-off-by: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 15:45:06 +01:00
Johannes Berg	28a9bc6812	mac80211: free management frame keys when removing station When writing the code to allow per-station GTKs, I neglected to take into account the management frame keys (index 4 and 5) when freeing the station and only added code to free the first four data frame keys. Fix this by iterating the array of keys over the right length. Cc: stable@vger.kernel.org Fixes: `e31b82136d` ("cfg80211/mac80211: allow per-station GTKs") Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 14:00:17 +01:00
Arik Nemtsov	db8dfee57d	cfg80211: avoid intersection when applying self-managed reg The custom-reg handling function can currently only add flags to a given channel. This results in stale flags being left applied. In some cases a channel was disabled and even the orig_flags were changed to reflect this. Previously the API was designed for a single invocation before wiphy registration, so this didn't matter. The previous approach doesn't scale well to self-managed regulatory devices, particularly when a more permissive regdom is applied after a restrictive one. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 11:49:55 +01:00
Arik Nemtsov	1bdd716cbc	cfg80211: return private regdom for self-managed devices If a device has self-managed regulatory, insist on returning the wiphy specific regdomain if a wiphy-idx is specified. The global regdomain is meaningless for such devices. Also add an attribute for self-managed devices, so usermode can distinguish them as such. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 11:49:55 +01:00
Jonathan Doron	b0d7aa5959	cfg80211: allow wiphy specific regdomain management Add a new regulatory flag that allows a driver to manage regdomain changes/updates for its own wiphy. A self-managed wiphys only employs regulatory information obtained from the FW and driver and does not use other cfg80211 sources like beacon-hints, country-code IEs and hints from other devices on the same system. Conversely, a self-managed wiphy does not share its regulatory hints with other devices in the system. If a system contains several devices, one or more of which are self-managed, there might be contradictory regulatory settings between them. Usage of flag is generally discouraged. Only use it if the FW/driver is incompatible with non-locally originated hints. A new API lets the driver send a complete regdomain, to be applied on its wiphy only. After a wiphy-specific regdomain change takes place, usermode will get a new type of change notification. The regulatory core also takes care enforce regulatory restrictions, in case some interfaces are on forbidden channels. Signed-off-by: Jonathan Doron <jonathanx.doron@intel.com> Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 11:49:55 +01:00
Arik Nemtsov	ad30ca2c03	cfg80211: allow usermode to query wiphy specific regdom If a wiphy-idx is specified, the kernel will return the wiphy specific regdomain, if such exists. Otherwise return the global regdom. When no wiphy-idx is specified, return the global regdomain as well as all wiphy-specific regulatory domains in the system, via a new nested list of attributes. Add a new attribute for each wiphy-specific regdomain, for usermode to identify it as such. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 11:49:55 +01:00
Nishikawa, Kenzoh	2ae70efcea	mac80211: keep sending peer candidate events while in listen state Instead of sending peer candidate events just once, send them as long as the peer remains in the LISTEN state in the peering state machine, when userspace is implementing the peering manager. Userspace may silence the events from a peer by progressing the state machine or by setting the link state to BLOCKED. Fixes the problem that a mesh peering process won't be fired again after the previous first peering trial fails due to like air propagation error if the peering is managed by user space such as wpa_supplicant. This patch works with another patch for wpa_supplicant described here which fires a peering process again triggered by the notice from kernel. http://lists.shmoo.com/pipermail/hostap/2014-November/031235.html Signed-off-by: Kenzoh Nishikawa <Kenzoh.Nishikawa@jp.sony.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 11:49:24 +01:00
Luciano Coelho	688b1ecfb9	mac80211: notify channel switch at the end of ieee80211_chswitch_post_beacon() The call to cfg80211_ch_switch_notify() should be at the end of the ieee80211_chswitch_post_beacon() function, because it should only be sent if everything succeeded. Fixes: `d04b5ac9e7` ("cfg80211/mac80211: allow any interface to send channel switch notifications") Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 11:47:42 +01:00
Janusz Dziedzic	d295445715	mac80211: notify NSS changed when IBSS and HT When using IBSS in HT mode, we always get NSS=1 in rc_update callback. Force NSS recalculation when rates updated and notify driver that NSS changed. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-17 11:47:26 +01:00
Linus Torvalds	603ba7e41b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile #2 from Al Viro: "Next pile (and there'll be one or two more). The large piece in this one is getting rid of /proc//ns/ weirdness; among other things, it allows to (finally) make nameidata completely opaque outside of fs/namei.c, making for easier further cleanups in there" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: coda_venus_readdir(): use file_inode() fs/namei.c: fold link_path_walk() call into path_init() path_init(): don't bother with LOOKUP_PARENT in argument fs/namei.c: new helper (path_cleanup()) path_init(): store the "base" pointer to file in nameidata itself make default ->i_fop have ->open() fail with ENXIO make nameidata completely opaque outside of fs/namei.c kill proc_ns completely take the targets of /proc//ns/ symlinks to separate fs bury struct proc_ns in fs/proc copy address of proc_ns_ops into ns_common new helpers: ns_alloc_inum/ns_free_inum make proc_ns_operations work with struct ns_common * instead of void * switch the rest of proc_ns_operations to working with &...->ns netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns common object embedded into various struct ....ns	2014-12-16 15:53:03 -08:00
Linus Torvalds	0b233b7c79	Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "A comparatively quieter cycle for nfsd this time, but still with two larger changes: - RPC server scalability improvements from Jeff Layton (using RCU instead of a spinlock to find idle threads). - server-side NFSv4.2 ALLOCATE/DEALLOCATE support from Anna Schumaker, enabling fallocate on new clients" * 'for-3.19' of git://linux-nfs.org/~bfields/linux: (32 commits) nfsd4: fix xdr4 count of server in fs_location4 nfsd4: fix xdr4 inclusion of escaped char sunrpc/cache: convert to use string_escape_str() sunrpc: only call test_bit once in svc_xprt_received fs: nfsd: Fix signedness bug in compare_blob sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt sunrpc: convert to lockless lookup of queued server threads sunrpc: fix potential races in pool_stats collection sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it sunrpc: require svc_create callers to pass in meaningful shutdown routine sunrpc: have svc_wake_up only deal with pool 0 sunrpc: convert sp_task_pending flag to use atomic bitops sunrpc: move rq_cachetype field to better optimize space sunrpc: move rq_splice_ok flag into rq_flags sunrpc: move rq_dropme flag into rq_flags sunrpc: move rq_usedeferral flag to rq_flags sunrpc: move rq_local field to rq_flags sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it nfsd: minor off by one checks in __write_versions() sunrpc: release svc_pool_map reference when serv allocation fails ...	2014-12-16 15:25:31 -08:00
Or Gerlitz	65891feac2	net: Disallow providing non zero VLAN ID for NIC drivers FDB add flow The current implementations all use dev_uc_add_excl() and such whose API doesn't support vlans, so we can't make it with NICs HW for now. Fixes: `f6f6424ba7` ('net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-16 15:41:19 -05:00
Thomas Graf	f1fb521f7d	ip_tunnel: Add missing validation of encap type to ip_tunnel_encap_setup() The encap->type comes straight from Netlink. Validate it against max supported encap types just like ip_encap_hlen() already does. Fixes: a8c5f9 ("ip_tunnel: Ops registration for secondary encap (fou, gue)") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-16 15:20:41 -05:00
Thomas Graf	bb1553c800	ip_tunnel: Add sanity checks to ip_tunnel_encap_add_ops() The symbols are exported and could be used by external modules. Fixes: a8c5f9 ("ip_tunnel: Ops registration for secondary encap (fou, gue)") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-16 15:20:41 -05:00
David S. Miller	c9f2c3d36c	Merge tag 'master-2014-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-12-16 Please pull this batch of fixes intended for the 3.19 stream! For the Bluetooth bits, Johan says: "The patches consist of: - Coccinelle warning fix - hci_dev_lock/unlock fixes - Fixes for pending mgmt command handling - Fixes for properly following the force_lesc_support switch - Fix for a Microsoft branded Broadcom adapter - New device id for Atheros AR3012 - Fix for BR/EDR Secure Connections enabling" Along with that... Brian Norris avoids leaking some kernel memory contents via printk in brcmsmac. Julia Lawall corrects some misspellings in a few drivers. Larry Finger gives us one more rtlwifi fix to correct a porting oversight. Wei Yongjun fixes a sparse warning in rtlwifi. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-16 15:16:48 -05:00
John W. Linville	a463e9c57a	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-12-15 13:23:09 -05:00
Geert Uytterhoeven	6ff4a8ad4b	rds: Fix min() warning in rds_message_inc_copy_to_user() net/rds/message.c: In function ‘rds_message_inc_copy_to_user’: net/rds/message.c:328: warning: comparison of distinct pointer types lacks a cast Use min_t(unsigned long, ...) like is done in rds_message_copy_from_user(). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-15 11:49:09 -05:00
Timo Teräs	8a0033a947	gre: fix the inner mac header in nbma tunnel xmit path The NBMA GRE tunnels temporarily push GRE header that contain the per-packet NBMA destination on the skb via header ops early in xmit path. It is the later pulled before the real GRE header is constructed. The inner mac was thus set differently in nbma case: the GRE header has been pushed by neighbor layer, and mac header points to beginning of the temporary gre header (set by dev_queue_xmit). Now that the offloads expect mac header to point to the gre payload, fix the xmit patch to: - pull first the temporary gre header away - and reset mac header to point to gre payload This fixes tso to work again with nbma tunnels. Fixes: `14051f0452` ("gre: Use inner mac length when computing tunnel length") Signed-off-by: Timo Teräs <timo.teras@iki.fi> Cc: Tom Herbert <therbert@google.com> Cc: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-15 11:46:04 -05:00
Johannes Berg	848955ccf0	mac80211: move U-APSD enablement to vif flags In order to let drivers have more dynamic U-APSD support, move the enablement flag to the virtual interface driver flags. This lets drivers not only set it up differently for different interfaces, but also enable/disable on the fly if needed. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-15 12:34:45 +01:00
Johannes Berg	4bcc56bb3a	mac80211: ask driver to look at power level when starting AP The power level might have been set, but as the interface was idle it might not have taken effect yet. Ask the driver to check the power level when starting up an AP so that in this case the correct power level is used in case the device/driver can only set it when the interface is actually active. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-15 12:27:02 +01:00
Linus Torvalds	e6b5be2be4	Driver core patches for 3.19-rc1 Here's the set of driver core patches for 3.19-rc1. They are dominated by the removal of the .owner field in platform drivers. They touch a lot of files, but they are "simple" changes, just removing a line in a structure. Other than that, a few minor driver core and debugfs changes. There are some ath9k patches coming in through this tree that have been acked by the wireless maintainers as they relied on the debugfs changes. Everything has been in linux-next for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlSOD20ACgkQMUfUDdst+ylLPACg2QrW1oHhdTMT9WI8jihlHVRM 53kAoLeteByQ3iVwWurwwseRPiWa8+MI =OVRS -----END PGP SIGNATURE----- Merge tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core update from Greg KH: "Here's the set of driver core patches for 3.19-rc1. They are dominated by the removal of the .owner field in platform drivers. They touch a lot of files, but they are "simple" changes, just removing a line in a structure. Other than that, a few minor driver core and debugfs changes. There are some ath9k patches coming in through this tree that have been acked by the wireless maintainers as they relied on the debugfs changes. Everything has been in linux-next for a while" * tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (324 commits) Revert "ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries" fs: debugfs: add forward declaration for struct device type firmware class: Deletion of an unnecessary check before the function call "vunmap" firmware loader: fix hung task warning dump devcoredump: provide a one-way disable function device: Add dev_<level>_once variants ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries ath: use seq_file api for ath9k debugfs files debugfs: add helper function to create device related seq_file drivers/base: cacheinfo: remove noisy error boot message Revert "core: platform: add warning if driver has no owner" drivers: base: support cpu cache information interface to userspace via sysfs drivers: base: add cpu_device_create to support per-cpu devices topology: replace custom attribute macros with standard DEVICE_ATTR* cpumask: factor out show_cpumap into separate helper function driver core: Fix unbalanced device reference in drivers_probe driver core: fix race with userland in device_add() sysfs/kernfs: make read requests on pre-alloc files use the buffer. sysfs/kernfs: allow attributes to request write buffer be pre-allocated. fs: sysfs: return EGBIG on write if offset is larger than file size ...	2014-12-14 16:10:09 -08:00
Linus Torvalds	e3aa91a7cb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: - The crypto API is now documented :) - Disallow arbitrary module loading through crypto API. - Allow get request with empty driver name through crypto_user. - Allow speed testing of arbitrary hash functions. - Add caam support for ctr(aes), gcm(aes) and their derivatives. - nx now supports concurrent hashing properly. - Add sahara support for SHA1/256. - Add ARM64 version of CRC32. - Misc fixes. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits) crypto: tcrypt - Allow speed testing of arbitrary hash functions crypto: af_alg - add user space interface for AEAD crypto: qat - fix problem with coalescing enable logic crypto: sahara - add support for SHA1/256 crypto: sahara - replace tasklets with kthread crypto: sahara - add support for i.MX53 crypto: sahara - fix spinlock initialization crypto: arm - replace memset by memzero_explicit crypto: powerpc - replace memset by memzero_explicit crypto: sha - replace memset by memzero_explicit crypto: sparc - replace memset by memzero_explicit crypto: algif_skcipher - initialize upon init request crypto: algif_skcipher - removed unneeded code crypto: algif_skcipher - Fixed blocking recvmsg crypto: drbg - use memzero_explicit() for clearing sensitive data crypto: drbg - use MODULE_ALIAS_CRYPTO crypto: include crypto- module prefix in template crypto: user - add MODULE_ALIAS crypto: sha-mb - remove a bogus NULL check crytpo: qat - Fix 64 bytes requests ...	2014-12-13 13:33:26 -08:00
Alexander Duyck	e962f30297	fib_trie: Fix trie balancing issue if new node pushes down existing node This patch addresses an issue with the level compression of the fib_trie. Specifically in the case of adding a new leaf that triggers a new node to be added that takes the place of the old node. The result is a trie where the 1 child tnode is on one side and one leaf is on the other which gives you a very deep trie. Below is the script I used to generate a trie on dummy0 with a 10.X.X.X family of addresses. ip link add type dummy ipval=184549374 bit=2 for i in `seq 1 23` do ifconfig dummy0:$bit $ipval/8 ipval=`expr $ipval - $bit` bit=`expr $bit \* 2` done cat /proc/net/fib_triestat Running the script before the patch: Local: Aver depth: 10.82 Max depth: 23 Leaves: 29 Prefixes: 30 Internal nodes: 27 1: 26 2: 1 Pointers: 56 Null ptrs: 1 Total size: 5 kB After applying the patch and repeating: Local: Aver depth: 4.72 Max depth: 9 Leaves: 29 Prefixes: 30 Internal nodes: 12 1: 3 2: 2 3: 7 Pointers: 70 Null ptrs: 30 Total size: 4 kB What this fix does is start the rebalance at the newly created tnode instead of at the parent tnode. This way if there is a gap between the parent and the new node it doesn't prevent the new tnode from being coalesced with any pre-existing nodes that may have been pushed into one of the new nodes child branches. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-12 10:58:53 -05:00
Toshiaki Makita	53f6b708f8	vlan: Add ability to always enable TSO/UFO Since the real device can segment packets by software, a vlan device can set TSO/UFO even when the real device doesn't have those features. Unlike GSO, this allows packets to be segmented after Qdisc. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-12 10:58:53 -05:00
Sujith Manoharan	5cf16616e1	mac80211: Fix accounting of multicast frames Since multicast frames are marked as no-ack, using IEEE80211_TX_STAT_ACK to check if they have been successfully transmitted by the driver is incorrect since a driver can choose to ignore transmission status for no-ack frames. This results in incorrect accounting for such frames. To fix this issue, this patch introduces a new flag that can be used by drivers to indicate error-free transmission of no-ack frames. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> [add a note about not setting the flag for non-no-ack frames] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 13:48:43 +01:00
Sujith Manoharan	6b127c71fb	mac80211: Move IEEE80211_TX_CTL_PS_RESPONSE Move IEEE80211_TX_CTL_PS_RESPONSE to info->control.flags since this is used only in the TX path (by ath9k). This frees up a bit which can be used for other purposes. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 13:48:26 +01:00
Vadim Kochan	ba1debdfed	wireless: Support of IFLA_INFO_KIND rtnl attribute It allows to identify the wlan kind of device for the user application, e.g.: # ip -d link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 2: enp0s25: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff promiscuity 0 3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff promiscuity 0 wlan Signed-off-by: Vadim Kochan <vadim4j@gmail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> [make wireless_link_ops const] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 13:48:25 +01:00
Arik Nemtsov	185076d6db	cfg80211: correctly check ad-hoc channels Ad-hoc requires beaconing for regulatory purposes. Validate that the channel is valid for beaconing, and not only enabled. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 13:40:38 +01:00
Emmanuel Grumbach	70dcec5a48	cfg80211: don't WARN about two consecutive Country IE hint This can happen and there is no point in added more detection code lower in the stack. Catching these in one single point (cfg80211) is enough. Stop WARNING about this case. This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=89001 Cc: stable@vger.kernel.org Fixes: `2f1c6c572d` ("cfg80211: process non country IE conflicting first") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 13:29:02 +01:00
Johan Hedberg	9845904fd4	Bluetooth: Fix mgmt response status when removing adapter When an adapter is removed (hci_unregister_dev) any pending mgmt commands for that adapter should get the appropriate INVALID_INDEX response. Since hci_unregister_dev() calls hci_dev_do_close() first that'd so far have caused "not powered" responses to be sent. Skipping the HCI_UNREGISTER case in mgmt_powered() is also not a solution since before reaching the mgmt_index_removed() stage any hci_conn callbacks (e.g. used by pairing) will get called, thereby causing "disconnected" status responses to be sent. The fix that covers all scenarios is to handle both INVALID_INDEX and NOT_POWERED responses through the mgmt_powered() function. The INVALID_INDEX response sending from mgmt_index_removed() is left untouched since there are a couple of places not related to powering off or removing an adapter that call it (e.g. configuring a new bdaddr). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-12 13:20:12 +01:00
Johan Hedberg	ec6f99b807	Bluetooth: Fix enabling BR/EDR SC when powering on If we're in the AUTO_OFF stage the powered_update_hci() function is responsible for doing the updates to the HCI state that were not done during the actual mgmt command handlers. One of the updates needing done is for BR/EDR SC support. This patch adds the missing HCI command for SC support to the powered_update_hci() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-12 13:17:41 +01:00
Emmanuel Grumbach	722ddb0dab	mac80211: update the channel context after channel switch When the channel switch has been made, a vif is now using the channel context which was reserved. When that happens, we need to update the channel context since its parameters may change. I hit a case in which I switched to a 40Mhz channel but the reserved channel context was still on 20Mhz. The rate control would try to send 40Mhz packets on a 20Mhz channel context and that made iwlwifi's firmware unhappy. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 12:35:38 +01:00
Luciano Coelho	f89f46cf3a	nl80211: check matches array length before acessing it If the userspace passes a malformed sched scan request (or a net detect wowlan configuration) by adding a NL80211_ATTR_SCHED_SCAN_MATCH attribute without any nested matchsets, a NULL pointer dereference will occur. Fix this by checking that we do have matchsets in our array before trying to access it. BUG: unable to handle kernel NULL pointer dereference at 0000000000000024 IP: [<ffffffffa002fd69>] nl80211_parse_sched_scan.part.67+0x6e9/0x900 [cfg80211] PGD 865c067 PUD 865b067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: iwlmvm(O) iwlwifi(O) mac80211(O) cfg80211(O) compat(O) [last unloaded: compat] CPU: 2 PID: 2442 Comm: iw Tainted: G O 3.17.2 #31 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff880013800790 ti: ffff880008d80000 task.ti: ffff880008d80000 RIP: 0010:[<ffffffffa002fd69>] [<ffffffffa002fd69>] nl80211_parse_sched_scan.part.67+0x6e9/0x900 [cfg80211] RSP: 0018:ffff880008d838d0 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 000000000000143c RSI: 0000000000000000 RDI: ffff880008ee8dd0 RBP: ffff880008d83948 R08: 0000000000000002 R09: 0000000000000019 R10: ffff88001d1b3c40 R11: 0000000000000002 R12: ffff880019e85e00 R13: 00000000fffffed4 R14: ffff880009757800 R15: 0000000000001388 FS: 00007fa3b6d13700(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000024 CR3: 0000000008670000 CR4: 00000000000006e0 Stack: ffff880009757800 ffff880000000001 0000000000000000 ffff880008ee84e0 0000000000000000 ffff880009757800 00000000fffffed4 ffff880008d83948 ffffffff814689c9 ffff880009757800 ffff880008ee8000 0000000000000000 Call Trace: [<ffffffff814689c9>] ? nla_parse+0xb9/0x120 [<ffffffffa00306de>] nl80211_set_wowlan+0x75e/0x960 [cfg80211] [<ffffffff810bf3d5>] ? mark_held_locks+0x75/0xa0 [<ffffffff8161a77b>] genl_family_rcv_msg+0x18b/0x360 [<ffffffff810bf66d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8161a9d4>] genl_rcv_msg+0x84/0xc0 [<ffffffff8161a950>] ? genl_family_rcv_msg+0x360/0x360 [<ffffffff81618e79>] netlink_rcv_skb+0xa9/0xd0 [<ffffffff81619458>] genl_rcv+0x28/0x40 [<ffffffff816184a5>] netlink_unicast+0x105/0x180 [<ffffffff8161886f>] netlink_sendmsg+0x34f/0x7a0 [<ffffffff8105a097>] ? kvm_clock_read+0x27/0x40 [<ffffffff815c644d>] sock_sendmsg+0x8d/0xc0 [<ffffffff811a75c9>] ? might_fault+0xb9/0xc0 [<ffffffff811a756e>] ? might_fault+0x5e/0xc0 [<ffffffff815d5d26>] ? verify_iovec+0x56/0xe0 [<ffffffff815c73e0>] ___sys_sendmsg+0x3d0/0x3e0 [<ffffffff810a7be8>] ? sched_clock_cpu+0x98/0xd0 [<ffffffff810611b4>] ? __do_page_fault+0x254/0x580 [<ffffffff810bb39f>] ? up_read+0x1f/0x40 [<ffffffff810611b4>] ? __do_page_fault+0x254/0x580 [<ffffffff812146ed>] ? __fget_light+0x13d/0x160 [<ffffffff815c7b02>] __sys_sendmsg+0x42/0x80 [<ffffffff815c7b52>] SyS_sendmsg+0x12/0x20 [<ffffffff81751f69>] system_call_fastpath+0x16/0x1b Fixes: `ea73cbce4e` ("nl80211: fix scheduled scan RSSI matchset attribute confusion") Cc: stable@vger.kernel.org [3.15+] Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 12:33:25 +01:00
Johannes Berg	cec3f0ed7d	cfg80211: use __force __rcu to suppress sparse warning The code assigns a constant value (a pointer to a static variable) to an RCU pointer, which results in a sparse warning: reg.c:112:10: warning: cast adds address space to expression (<asn:4>) Suppress this warning by using __force. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 12:27:23 +01:00
Arik Nemtsov	34f05f543f	cfg80211: avoid mem leak on driver hint set In the already-set and intersect case of a driver-hint, the previous wiphy regdomain was not freed before being reset with a copy of the cfg80211 regdomain. Cc: stable@vger.kernel.org Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 12:25:33 +01:00
Jouni Malinen	08f6f14777	cfg80211: Fix 160 MHz channels with 80+80 and 160 MHz drivers The VHT supported channel width field is a two bit integer, not a bitfield. cfg80211_chandef_usable() was interpreting it incorrectly and ended up rejecting 160 MHz channel width if the driver indicated support for both 160 and 80+80 MHz channels. Cc: stable@vger.kernel.org (3.16+) Fixes: `3d9d1d6656` ("nl80211/cfg80211: support VHT channel configuration") (however, no real drivers had 160 MHz support it until 3.16) Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 12:18:47 +01:00
Andreas Müller	d025933e29	mac80211: fix multicast LED blinking and counter As multicast-frames can't be fragmented, "dot11MulticastReceivedFrameCount" stopped being incremented after the use-after-free fix. Furthermore, the RX-LED will be triggered by every multicast frame (which wouldn't happen before) which wouldn't allow the LED to rest at all. Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the patch. Cc: stable@vger.kernel.org Fixes: `b8fff407a1` ("mac80211: fix use-after-free in defragmentation") Signed-off-by: Andreas Müller <goo@stapelspeicher.org> [rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 12:11:14 +01:00
Jes Sorensen	7e6225a160	mac80211: avoid using uninitialized stack data Avoid a case where we would access uninitialized stack data if the AP advertises HT support without 40MHz channel support. Cc: stable@vger.kernel.org Fixes: `f3000e1b43` ("mac80211: fix broken use of VHT/20Mhz with some APs") Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-12-12 12:00:46 +01:00
Florian Fainelli	9697f1cde9	net: dsa: propagate error code from dsa_slave_phy_setup In case we cannot attach to our slave netdevice PHY, error out and propagate that error up to the caller: dsa_slave_create(). Fixes: `0d8bcdd383` ("net: dsa: allow for more complex PHY setups") Signed-off-by: Andrey Volkov <andrey.volkov@nexvision.fr> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 21:00:08 -05:00
Florian Fainelli	53013c7743	net: dsa: handle non-existing PHYs on switch internal bus In case there is no PHY at the designated address on the internal switch, we would basically de-reference a null pointer here: dsa_slave_phy_setup(...) { p->phy = ds->slave_mii_bus->phy_map[p->port]; phy_connect_direct(slave_dev, p->phy, dsa_slave_adjust_link, ^------ This can be triggered when the platform configuration (platform_data or Device Tree) indicates there should be a PHY device at this address, but the HW is non-responsive, such that we cannot attach a PHY device at this specific location. Fix this by checking the return value prior to calling phy_connect_direct(). CC: Andrew Lunn <andrew@lunn.ch> Fixes: `b31f65fb43` ("net: dsa: slave: Fix autoneg for phys on switch MDIO bus") Reported-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Andrey Volkov <andrey.volkov@nexvision.fr> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 20:58:50 -05:00
Linus Torvalds	70e71ca0af	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu. 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe. 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau. 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca. 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet. 7) Add a variable of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet. 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel. 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov. 10) Support TSO/LSO in sunvnet driver, from David L Stevens. 11) Allow controlling ECN usage via routing metrics, from Florian Westphal. 12) Remote checksum offload, from Tom Herbert. 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky. 14) Add MPLS support to openvswitch, from Simon Horman. 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert. 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic. 17) Allow userspace to ask for the CPU upon what a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet. 18) Limit GSO packets to half the current congestion window, from Eric Dumazet. 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet. 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan. 21) Add VLAN packet scheduler action, from Jiri Pirko. 22) Support configurable RSS hash functions via ethtool, from Eyal Perry. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits) Fix race condition between vxlan_sock_add and vxlan_sock_release net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header net/mlx4: Add support for A0 steering net/mlx4: Refactor QUERY_PORT net/mlx4_core: Add explicit error message when rule doesn't meet configuration net/mlx4: Add A0 hybrid steering net/mlx4: Add mlx4_bitmap zone allocator net/mlx4: Add a check if there are too many reserved QPs net/mlx4: Change QP allocation scheme net/mlx4_core: Use tasklet for user-space CQ completion events net/mlx4_core: Mask out host side virtualization features for guests net/mlx4_en: Set csum level for encapsulated packets be2net: Export tunnel offloads only when a VxLAN tunnel is created gianfar: Fix dma check map error when DMA_API_DEBUG is enabled cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call net: fec: only enable mdio interrupt before phy device link up net: fec: clear all interrupt events to support i.MX6SX net: fec: reset fep link status in suspend function net: sock: fix access via invalid file descriptor net: introduce helper macro for_each_cmsghdr ...	2014-12-11 14:27:06 -08:00
Linus Torvalds	6b9e2cea42	virtio: virtio 1.0 support, misc patches This adds a lot of infrastructure for virtio 1.0 support. Notable missing pieces: virtio pci, virtio balloon (needs spec extension), vhost scsi. Plus, there are some minor fixes in a couple of places. Cc: David Miller <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUh1CVAAoJECgfDbjSjVRpWZcH/2+EGPyng7Lca820UHA0cU1U u4D8CAAwOGaVdnUUo8ox1eon3LNB2UgRtgsl3rBDR3YTgFfNPrfuYdnHO0dYIDc1 lS26NuPrVrTX0lA+OBPe2nlKrsrOkn8aw1kxG9Y0gKtNg/+HAGNW5e2eE7R/LrA5 94XbWZ8g9Yf4GPG1iFmih9vQvvN0E68zcUlojfCnllySgaIEYr8nTiGQBWpRgJat fCqFAp1HMDZzGJQO+m1/Vw0OftTRVybyfai59e6uUTa8x1djvzPb/1MvREqQjegM ylSuofIVyj7JPu++FbAjd9mikkb53GSc8ql3YmWNZLdr69rnkzP0GdzQvrdheAo= =RtrR -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "virtio: virtio 1.0 support, misc patches This adds a lot of infrastructure for virtio 1.0 support. Notable missing pieces: virtio pci, virtio balloon (needs spec extension), vhost scsi. Plus, there are some minor fixes in a couple of places. Note: some net drivers are affected by these patches. David said he's fine with merging these patches through my tree. Rusty's on vacation, he acked using my tree for these, too" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (70 commits) virtio_ccw: finalize_features error handling virtio_ccw: future-proof finalize_features virtio_pci: rename virtio_pci -> virtio_pci_common virtio_pci: update file descriptions and copyright virtio_pci: split out legacy device support virtio_pci: setup config vector indirectly virtio_pci: setup vqs indirectly virtio_pci: delete vqs indirectly virtio_pci: use priv for vq notification virtio_pci: free up vq->priv virtio_pci: fix coding style for structs virtio_pci: add isr field virtio: drop legacy_only driver flag virtio_balloon: drop legacy_only driver flag virtio_ccw: rev 1 devices set VIRTIO_F_VERSION_1 virtio: allow finalize_features to fail virtio_ccw: legacy: don't negotiate rev 1/features virtio: add API to detect legacy devices virtio_console: fix sparse warnings vhost: remove unnecessary forward declarations in vhost.h ...	2014-12-11 12:20:31 -08:00
Johan Hedberg	1aeb9c651c	Bluetooth: Fix notifying mgmt power off before flushing connection list This patch moves the mgmt_powered() notification earlier in the hci_dev_do_close() function. This way the correct "not powered" error gets passed to any pending mgmt commands. Without the patch the pending commands would instead get a misleading "disconnected" response when powering down the adapter. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-11 20:57:39 +01:00
Johan Hedberg	a511b35ba4	Bluetooth: Fix incorrect pending cmd removal in pairing_complete() The pairing_complete() function is used as a pending mgmt command cmd_complete callback. The expectation of such functions is that they are not responsible themselves for calling mgmt_pending_remove(). This patch fixes the incorrect mgmt_pending_remove() call in pairing_complete() and adds it to the appropriate changes. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-11 20:57:38 +01:00
Johan Hedberg	15013aeb63	Bluetooth: Fix calling hci_conn_put too early The pairing_complete() function relies on a hci_conn reference to be able to access the hci_conn object. It should therefore only release this reference once it's done accessing the object, i.e. at the end of the function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-11 20:57:38 +01:00
Marcel Holtmann	417287de88	Bluetooth: Fix check for support for page scan related commands The Read Page Scan Activity and Read Page Scan Type commands are not supported by all controllers. Move the execution of both commands into the 3rd phase of the init procedure. And then check the bit mask of supported commands before adding them to the init sequence. With this re-ordering of the init sequence, the extra check for AVM BlueFritz! controllers is no longer needed. They will report that these two commands are not supported. This fixes an issue with the Microsoft Corp. Wireless Transceiver for Bluetooth 2.0 (ID 045e:009c). Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-11 21:42:11 +02:00
Jaganath Kanakkassery	5c1a4c8f28	Bluetooth: Fix missing hci_dev_lock/unlock in hci_event mgmt_pending_remove() should be called with hci_dev_lock protection and all hci_event.c functions which calls mgmt_complete() (which eventually calls mgmt_pending_remove()) should hold the lock. So this patch fixes the same Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-11 15:09:04 +01:00
Jaganath Kanakkassery	3ad675827f	Bluetooth: Fix missing hci_dev_lock/unlock in mgmt req_complete() mgmt_pending_remove() should be called with hci_dev_lock protection and currently the rule to take dev lock is that all mgmt req_complete functions should take dev lock. So this patch fixes the same in the missing functions Without this patch there is a chance of invalid memory access while accessing the mgmt_pending list like below bluetoothd: 392] [0] Backtrace: bluetoothd: 392] [0] [<c04ec770>] (pending_eir_or_class+0x0/0x68) from [<c04f1830>] (add_uuid+0x34/0x1c4) bluetoothd: 392] [0] [<c04f17fc>] (add_uuid+0x0/0x1c4) from [<c04f3cc4>] (mgmt_control+0x204/0x274) bluetoothd: 392] [0] [<c04f3ac0>] (mgmt_control+0x0/0x274) from [<c04f609c>] (hci_sock_sendmsg+0x80/0x308) bluetoothd: 392] [0] [<c04f601c>] (hci_sock_sendmsg+0x0/0x308) from [<c03d4d68>] (sock_aio_write+0x144/0x174) bluetoothd: 392] [0] r8:00000000 r7 7c1be90 r6 7c1be18 r5:00000017 r4 a90ea80 bluetoothd: 392] [0] [<c03d4c24>] (sock_aio_write+0x0/0x174) from [<c00e2d4c>] (do_sync_write+0xb0/0xe0) bluetoothd: 392] [0] [<c00e2c9c>] (do_sync_write+0x0/0xe0) from [<c00e371c>] (vfs_write+0x134/0x13c) bluetoothd: 392] [0] r8:00000000 r7 7c1bf70 r6:beeca5c8 r5:00000017 r4 7c05900 bluetoothd: 392] [0] [<c00e35e8>] (vfs_write+0x0/0x13c) from [<c00e3910>] (sys_write+0x44/0x70) bluetoothd: 392] [0] r8:00000000 r7:00000004 r6:00000017 r5:beeca5c8 r4 7c05900 bluetoothd: 392] [0] [<c00e38cc>] (sys_write+0x0/0x70) from [<c000e3c0>] (ret_fast_syscall+0x0/0x30) bluetoothd: 392] [0] r9 `7c1a000` r8:c000e568 r6:400b5f10 r5:403896d8 r4:beeca604 bluetoothd: 392] [0] Code: e28cc00c e152000c 0a00000f e3a00001 (e1d210b8) bluetoothd: 392] [0] ---[ end trace 67b6ac67435864c4 ]--- bluetoothd: 392] [0] Kernel panic - not syncing: Fatal exception Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-11 14:08:47 +01:00
Linus Torvalds	92a578b064	ACPI and power management updates for 3.19-rc1 This time we have some more new material than we used to have during the last couple of development cycles. The most important part of it to me is the introduction of a unified interface for accessing device properties provided by platform firmware. It works with Device Trees and ACPI in a uniform way and drivers using it need not worry about where the properties come from as long as the platform firmware (either DT or ACPI) makes them available. It covers both devices and "bare" device node objects without struct device representation as that turns out to be necessary in some cases. This has been in the works for quite a few months (and development cycles) and has been approved by all of the relevant maintainers. On top of that, some drivers are switched over to the new interface (at25, leds-gpio, gpio_keys_polled) and some additional changes are made to the core GPIO subsystem to allow device drivers to manipulate GPIOs in the "canonical" way on platforms that provide GPIO information in their ACPI tables, but don't assign names to GPIO lines (in which case the driver needs to do that on the basis of what it knows about the device in question). That also has been approved by the GPIO core maintainers and the rfkill driver is now going to use it. Second is support for hardware P-states in the intel_pstate driver. It uses CPUID to detect whether or not the feature is supported by the processor in which case it will be enabled by default. However, it can be disabled entirely from the kernel command line if necessary. Next is support for a platform firmware interface based on ACPI operation regions used by the PMIC (Power Management Integrated Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms. That interface is used for manipulating power resources and for thermal management: sensor temperature reporting, trip point setting and so on. Also the ACPI core is now going to support the _DEP configuration information in a limited way. Basically, _DEP it supposed to reflect off-the-hierarchy dependencies between devices which may be very indirect, like when AML for one device accesses locations in an operation region handled by another device's driver (usually, the device depended on this way is a serial bus or GPIO controller). The support added this time is sufficient to make the ACPI battery driver work on Asus T100A, but it is general enough to be able to cover some other use cases in the future. Finally, we have a new cpufreq driver for the Loongson1B processor. In addition to the above, there are fixes and cleanups all over the place as usual and a traditional ACPICA update to a recent upstream release. As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for Intel platforms should be able to handle power management of the DMA engine correctly, the cpufreq-dt driver should interact with the thermal subsystem in a better way and the ACPI backlight driver should handle some more corner cases, among other things. On top of the ACPICA update there are fixes for race conditions in the ACPICA's interrupt handling code which might lead to some random and strange looking failures on some systems. In the cleanups department the most visible part is the series of commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration option. That was triggered by a discussion regarding the generic power domains code during which we realized that trying to support certain combinations of PM config options was painful and not really worth it, because nobody would use them in production anyway. For this reason, we decided to make CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and that lead to the conclusion that the latter became redundant and CONFIG_PM could be used instead of it. The material here makes that replacement in a major part of the tree, but there will be at least one more batch of that in the second part of the merge window. Specifics: - Support for retrieving device properties information from ACPI _DSD device configuration objects and a unified device properties interface for device drivers (and subsystems) on top of that. As stated above, this works with Device Trees and ACPI and allows device drivers to be written in a platform firmware (DT or ACPI) agnostic way. The at25, leds-gpio and gpio_keys_polled drivers are now going to use this new interface and the GPIO subsystem is additionally modified to allow device drivers to assign names to GPIO resources returned by ACPI _CRS objects (in case _DSD is not present or does not provide the expected data). The changes in this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron Lu, and Darren Hart with some fixes from others (Fabio Estevam, Geert Uytterhoeven). - Support for Hardware Managed Performance States (HWP) as described in Volume 3, section 14.4, of the Intel SDM in the intel_pstate driver. CPUID is used to detect whether or not the feature is supported by the processor. If supported, it will be enabled automatically unless the intel_pstate=no_hwp switch is present in the kernel command line. From Dirk Brandewie. - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie). - Support for firmware interface based on ACPI operation regions used by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR platforms for power resource control and thermal management (Aaron Lu). - Limited support for retrieving off-the-hierarchy dependencies between devices from ACPI _DEP device configuration objects and deferred probing support for the ACPI battery driver based on the _DEP information to make that driver work on Asus T100A (Lan Tianyu). - New cpufreq driver for the Loongson1B processor (Kelvin Cheung). - ACPICA update to upstream revision 20141107 which only affects tools (Bob Moore). - Fixes for race conditions in the ACPICA's interrupt handling code and in the ACPI code related to system suspend and resume (Lv Zheng and Rafael J Wysocki). - ACPI core fix for an RCU-related issue in the ioremap() regions management code that slowed down significantly after CPUs had been allowed to enter idle states even if they'd had RCU callbakcs queued and triggered some problems in certain proprietary graphics driver (and elsewhere). The fix replaces synchronize_rcu() in that code with synchronize_rcu_expedited() which makes the issue go away. From Konstantin Khlebnikov. - ACPI LPSS (Low-Power Subsystem) driver fix to handle power management of the DMA engine included into the LPSS correctly. The problem is that the DMA engine doesn't have ACPI PM support of its own and it simply is turned off when the last LPSS device having ACPI PM support goes into D3cold. To work around that, the PM domain used by the ACPI LPSS driver is redesigned so at least one device with ACPI PM support will be on as long as the DMA engine is in use. From Andy Shevchenko. - ACPI backlight driver fix to avoid using it on "Win8-compatible" systems where it doesn't work and where it was used by default by mistake (Aaron Lu). - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki, Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin Chaugule (mostly related to the upcoming ARM64 support). - Intel RAPL (Running Average Power Limit) power capping driver fixes and improvements including new processor IDs (Jacob Pan). - Generic power domains modification to power up domains after attaching devices to them to meet the expectations of device drivers and bus types assuming devices to be accessible at probe time (Ulf Hansson). - Preliminary support for controlling device clocks from the generic power domains core code and modifications of the ARM/shmobile platform to use that feature (Ulf Hansson). - Assorted minor fixes and cleanups of the generic power domains core code (Ulf Hansson, Geert Uytterhoeven). - Assorted minor fixes and cleanups of the device clocks control code in the PM core (Geert Uytterhoeven, Grygorii Strashko). - Consolidation of device power management Kconfig options by making CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter which is now redundant (Rafael J Wysocki and Kevin Hilman). That is the first batch of the changes needed for this purpose. - Core device runtime power management support code cleanup related to the execution of callbacks (Andrzej Hajda). - cpuidle ARM support improvements (Lorenzo Pieralisi). - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and Bartlomiej Zolnierkiewicz). - New cpufreq driver callback (->ready) to be executed when the cpufreq core is ready to use a given policy object and cpufreq-dt driver modification to use that callback for cooling device registration (Viresh Kumar). - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James Geboski, Tomeu Vizoso). - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate, cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao, Stefan Wahren, Petr Cvek). - OPP (Operating Performance Points) framework modification to allow OPPs to be removed too and update of a few cpufreq drivers (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added during initialization) on driver removal (Viresh Kumar). - Hibernation core fixes and cleanups (Tina Ruchandani and Markus Elfring). - PM Kconfig fix related to CPU power management (Pankaj Dubey). - cpupower tool fix (Prarit Bhargava). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJUhj6JAAoJEILEb/54YlRxTM4P/j5g5SfqvY0QKsn7sR7MGZ6v nsgCBhJAqTw3ocNC7EAs8z9h2GWy1KbKpakKYWAh9Fs1yZoey7tFSlcv/Rgjlp70 uU5sDQHtpE9mHKiymdsowiQuWgpl962L4k+k8hUslhlvgk1PvVbpajR6OqG8G+pD asuIW9eh1APNkLyXmRJ3ZPomzs0VmRdZJ0NEs0lKX9mJskqEvxPIwdaxq3iaJq9B Fo0J345zUDcJnxWblDRdHlOigCimglElfN5qJwaC4KpwUKuBvLRKbp4f69+wfT0c kYFiR29X5KjJ2kLfP/wKsLyuDCYYXRq3tCia5M1tAqOjZ+UA89H/GDftx/5lntmv qUlBa35VfdS1SX4HyApZitOHiLgo+It/hl8Z9bJnhyVw66NxmMQ8JYN2imb8Lhqh XCLR7BxLTah82AapLJuQ0ZDHPzZqMPG2veC2vAzRMYzVijict/p4Y2+qBqONltER 4rs9uRVn+hamX33lCLg8BEN8zqlnT3rJFIgGaKjq/wXHAU/zpE9CjOrKMQcAg9+s t51XMNPwypHMAYyGVhEL89ImjXnXxBkLRuquhlmEpvQchIhR+mR3dLsarGn7da44 WPIQJXzcsojXczcwwfqsJCR4I1FTFyQIW+UNh02GkDRgRovQqo+Jk762U7vQwqH+ LBdhvVaS1VW4v+FWXEoZ =5dox -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "This time we have some more new material than we used to have during the last couple of development cycles. The most important part of it to me is the introduction of a unified interface for accessing device properties provided by platform firmware. It works with Device Trees and ACPI in a uniform way and drivers using it need not worry about where the properties come from as long as the platform firmware (either DT or ACPI) makes them available. It covers both devices and "bare" device node objects without struct device representation as that turns out to be necessary in some cases. This has been in the works for quite a few months (and development cycles) and has been approved by all of the relevant maintainers. On top of that, some drivers are switched over to the new interface (at25, leds-gpio, gpio_keys_polled) and some additional changes are made to the core GPIO subsystem to allow device drivers to manipulate GPIOs in the "canonical" way on platforms that provide GPIO information in their ACPI tables, but don't assign names to GPIO lines (in which case the driver needs to do that on the basis of what it knows about the device in question). That also has been approved by the GPIO core maintainers and the rfkill driver is now going to use it. Second is support for hardware P-states in the intel_pstate driver. It uses CPUID to detect whether or not the feature is supported by the processor in which case it will be enabled by default. However, it can be disabled entirely from the kernel command line if necessary. Next is support for a platform firmware interface based on ACPI operation regions used by the PMIC (Power Management Integrated Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms. That interface is used for manipulating power resources and for thermal management: sensor temperature reporting, trip point setting and so on. Also the ACPI core is now going to support the _DEP configuration information in a limited way. Basically, _DEP it supposed to reflect off-the-hierarchy dependencies between devices which may be very indirect, like when AML for one device accesses locations in an operation region handled by another device's driver (usually, the device depended on this way is a serial bus or GPIO controller). The support added this time is sufficient to make the ACPI battery driver work on Asus T100A, but it is general enough to be able to cover some other use cases in the future. Finally, we have a new cpufreq driver for the Loongson1B processor. In addition to the above, there are fixes and cleanups all over the place as usual and a traditional ACPICA update to a recent upstream release. As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for Intel platforms should be able to handle power management of the DMA engine correctly, the cpufreq-dt driver should interact with the thermal subsystem in a better way and the ACPI backlight driver should handle some more corner cases, among other things. On top of the ACPICA update there are fixes for race conditions in the ACPICA's interrupt handling code which might lead to some random and strange looking failures on some systems. In the cleanups department the most visible part is the series of commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration option. That was triggered by a discussion regarding the generic power domains code during which we realized that trying to support certain combinations of PM config options was painful and not really worth it, because nobody would use them in production anyway. For this reason, we decided to make CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and that lead to the conclusion that the latter became redundant and CONFIG_PM could be used instead of it. The material here makes that replacement in a major part of the tree, but there will be at least one more batch of that in the second part of the merge window. Specifics: - Support for retrieving device properties information from ACPI _DSD device configuration objects and a unified device properties interface for device drivers (and subsystems) on top of that. As stated above, this works with Device Trees and ACPI and allows device drivers to be written in a platform firmware (DT or ACPI) agnostic way. The at25, leds-gpio and gpio_keys_polled drivers are now going to use this new interface and the GPIO subsystem is additionally modified to allow device drivers to assign names to GPIO resources returned by ACPI _CRS objects (in case _DSD is not present or does not provide the expected data). The changes in this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron Lu, and Darren Hart with some fixes from others (Fabio Estevam, Geert Uytterhoeven). - Support for Hardware Managed Performance States (HWP) as described in Volume 3, section 14.4, of the Intel SDM in the intel_pstate driver. CPUID is used to detect whether or not the feature is supported by the processor. If supported, it will be enabled automatically unless the intel_pstate=no_hwp switch is present in the kernel command line. From Dirk Brandewie. - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie). - Support for firmware interface based on ACPI operation regions used by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR platforms for power resource control and thermal management (Aaron Lu). - Limited support for retrieving off-the-hierarchy dependencies between devices from ACPI _DEP device configuration objects and deferred probing support for the ACPI battery driver based on the _DEP information to make that driver work on Asus T100A (Lan Tianyu). - New cpufreq driver for the Loongson1B processor (Kelvin Cheung). - ACPICA update to upstream revision 20141107 which only affects tools (Bob Moore). - Fixes for race conditions in the ACPICA's interrupt handling code and in the ACPI code related to system suspend and resume (Lv Zheng and Rafael J Wysocki). - ACPI core fix for an RCU-related issue in the ioremap() regions management code that slowed down significantly after CPUs had been allowed to enter idle states even if they'd had RCU callbakcs queued and triggered some problems in certain proprietary graphics driver (and elsewhere). The fix replaces synchronize_rcu() in that code with synchronize_rcu_expedited() which makes the issue go away. From Konstantin Khlebnikov. - ACPI LPSS (Low-Power Subsystem) driver fix to handle power management of the DMA engine included into the LPSS correctly. The problem is that the DMA engine doesn't have ACPI PM support of its own and it simply is turned off when the last LPSS device having ACPI PM support goes into D3cold. To work around that, the PM domain used by the ACPI LPSS driver is redesigned so at least one device with ACPI PM support will be on as long as the DMA engine is in use. From Andy Shevchenko. - ACPI backlight driver fix to avoid using it on "Win8-compatible" systems where it doesn't work and where it was used by default by mistake (Aaron Lu). - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki, Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin Chaugule (mostly related to the upcoming ARM64 support). - Intel RAPL (Running Average Power Limit) power capping driver fixes and improvements including new processor IDs (Jacob Pan). - Generic power domains modification to power up domains after attaching devices to them to meet the expectations of device drivers and bus types assuming devices to be accessible at probe time (Ulf Hansson). - Preliminary support for controlling device clocks from the generic power domains core code and modifications of the ARM/shmobile platform to use that feature (Ulf Hansson). - Assorted minor fixes and cleanups of the generic power domains core code (Ulf Hansson, Geert Uytterhoeven). - Assorted minor fixes and cleanups of the device clocks control code in the PM core (Geert Uytterhoeven, Grygorii Strashko). - Consolidation of device power management Kconfig options by making CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter which is now redundant (Rafael J Wysocki and Kevin Hilman). That is the first batch of the changes needed for this purpose. - Core device runtime power management support code cleanup related to the execution of callbacks (Andrzej Hajda). - cpuidle ARM support improvements (Lorenzo Pieralisi). - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and Bartlomiej Zolnierkiewicz). - New cpufreq driver callback (->ready) to be executed when the cpufreq core is ready to use a given policy object and cpufreq-dt driver modification to use that callback for cooling device registration (Viresh Kumar). - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James Geboski, Tomeu Vizoso). - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate, cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao, Stefan Wahren, Petr Cvek). - OPP (Operating Performance Points) framework modification to allow OPPs to be removed too and update of a few cpufreq drivers (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added during initialization) on driver removal (Viresh Kumar). - Hibernation core fixes and cleanups (Tina Ruchandani and Markus Elfring). - PM Kconfig fix related to CPU power management (Pankaj Dubey). - cpupower tool fix (Prarit Bhargava)" * tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (120 commits) i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM tools: cpupower: fix return checks for sysfs_get_idlestate_count() drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM leds: leds-gpio: Fix multiple instances registration without 'label' property iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM hwrandom / exynos / PM: Use CONFIG_PM in #ifdef block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM USB / PM: Drop CONFIG_PM_RUNTIME from the USB core PM: Merge the SET*_RUNTIME_PM_OPS() macros ...	2014-12-10 21:17:00 -08:00
Alexei Starovoitov	198bf1b046	net: sock: fix access via invalid file descriptor 0day robot reported the following crash: [ 21.233581] BUG: unable to handle kernel NULL pointer dereference at 0000000000000007 [ 21.234709] IP: [<ffffffff8156ebda>] sk_attach_bpf+0x39/0xc2 It's due to bpf_prog_get() returning ERR_PTR. Check it properly. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Fixes: `89aa075832` ("net: sock: allow eBPF programs to be attached to sockets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 23:34:27 -05:00
Gu Zheng	f95b414edb	net: introduce helper macro for_each_cmsghdr Introduce helper macro for_each_cmsghdr as a wrapper of the enumerating cmsghdr from msghdr, just cleanup. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 22:41:55 -05:00
Linus Torvalds	b6da0076ba	Merge branch 'akpm' (patchbomb from Andrew) Merge first patchbomb from Andrew Morton: - a few minor cifs fixes - dma-debug upadtes - ocfs2 - slab - about half of MM - procfs - kernel/exit.c - panic.c tweaks - printk upates - lib/ updates - checkpatch updates - fs/binfmt updates - the drivers/rtc tree - nilfs - kmod fixes - more kernel/exit.c - various other misc tweaks and fixes * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) exit: pidns: fix/update the comments in zap_pid_ns_processes() exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting exit: exit_notify: re-use "dead" list to autoreap current exit: reparent: call forget_original_parent() under tasklist_lock exit: reparent: avoid find_new_reaper() if no children exit: reparent: introduce find_alive_thread() exit: reparent: introduce find_child_reaper() exit: reparent: document the ->has_child_subreaper checks exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper() exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting exit: proc: don't try to flush /proc/tgid/task/tgid exit: release_task: fix the comment about group leader accounting exit: wait: drop tasklist_lock before psig->c* accounting exit: wait: don't use zombie->real_parent exit: wait: cleanup the ptrace_reparented() checks usermodehelper: kill the kmod_thread_locker logic usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper() fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races ...	2014-12-10 18:34:42 -08:00
Al Viro	bd9b51e79c	make default ->i_fop have ->open() fail with ENXIO As it is, default ->i_fop has NULL ->open() (along with all other methods). The only case where it matters is reopening (via procfs symlink) a file that didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned to something sane (default would fail on read/write/ioctl/etc.). Unfortunately, such case exists - alloc_file() users, especially anon_get_file() ones. There we have tons of opened files of very different kinds sharing the same inode. As the result, attempt to reopen those via procfs succeeds and you get a descriptor you can't do anything with. Moreover, in case of sockets we set ->i_fop that will only be used on such reopen attempts - and put a failing ->open() into it to make sure those do not succeed. It would be simpler to put such ->open() into default ->i_fop and leave it unchanged both for anon inode (as we do anyway) and for socket ones. Result: * everything going through do_dentry_open() works as it used to * sock_no_open() kludge is gone * attempts to reopen anon-inode files fail as they really ought to * ditto for aio_private_file() * ditto for perfmon - this one actually tried to imitate sock_no_open() trick, but failed to set ->i_fop, so in the current tree reopens succeed and yield completely useless descriptor. Intent clearly had been to fail with -ENXIO on such reopens; now it actually does. * everything else that used alloc_file() keeps working - it has ->i_fop set for its inodes anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-10 21:32:15 -05:00
Al Viro	707c5960f1	Merge branch 'nsfs' into for-next	2014-12-10 21:31:59 -05:00
Johannes Weiner	3e32cb2e0a	mm: memcontrol: lockless page counters Memory is internally accounted in bytes, using spinlock-protected 64-bit counters, even though the smallest accounting delta is a page. The counter interface is also convoluted and does too many things. Introduce a new lockless word-sized page counter API, then change all memory accounting over to it. The translation from and to bytes then only happens when interfacing with userspace. The removed locking overhead is noticable when scaling beyond the per-cpu charge caches - on a 4-socket machine with 144-threads, the following test shows the performance differences of 288 memcgs concurrently running a page fault benchmark: vanilla: 18631648.500498 task-clock (msec) # 140.643 CPUs utilized ( +- 0.33% ) 1,380,638 context-switches # 0.074 K/sec ( +- 0.75% ) 24,390 cpu-migrations # 0.001 K/sec ( +- 8.44% ) 1,843,305,768 page-faults # 0.099 M/sec ( +- 0.00% ) 50,134,994,088,218 cycles # 2.691 GHz ( +- 0.33% ) <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 8,049,712,224,651 instructions # 0.16 insns per cycle ( +- 0.04% ) 1,586,970,584,979 branches # 85.176 M/sec ( +- 0.05% ) 1,724,989,949 branch-misses # 0.11% of all branches ( +- 0.48% ) 132.474343877 seconds time elapsed ( +- 0.21% ) lockless: 12195979.037525 task-clock (msec) # 133.480 CPUs utilized ( +- 0.18% ) 832,850 context-switches # 0.068 K/sec ( +- 0.54% ) 15,624 cpu-migrations # 0.001 K/sec ( +- 10.17% ) 1,843,304,774 page-faults # 0.151 M/sec ( +- 0.00% ) 32,811,216,801,141 cycles # 2.690 GHz ( +- 0.18% ) <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 9,999,265,091,727 instructions # 0.30 insns per cycle ( +- 0.10% ) 2,076,759,325,203 branches # 170.282 M/sec ( +- 0.12% ) 1,656,917,214 branch-misses # 0.08% of all branches ( +- 0.55% ) 91.369330729 seconds time elapsed ( +- 0.45% ) On top of improved scalability, this also gets rid of the icky long long types in the very heart of memcg, which is great for 32 bit and also makes the code a lot more readable. Notable differences between the old and new API: - res_counter_charge() and res_counter_charge_nofail() become page_counter_try_charge() and page_counter_charge() resp. to match the more common kernel naming scheme of try_do()/do() - res_counter_uncharge_until() is only ever used to cancel a local counter and never to uncharge bigger segments of a hierarchy, so it's replaced by the simpler page_counter_cancel() - res_counter_set_limit() is replaced by page_counter_limit(), which expects its callers to serialize against themselves - res_counter_memparse_write_strategy() is replaced by page_counter_limit(), which rounds down to the nearest page size - rather than up. This is more reasonable for explicitely requested hard upper limits. - to keep charging light-weight, page_counter_try_charge() charges speculatively, only to roll back if the result exceeds the limit. Because of this, a failing bigger charge can temporarily lock out smaller charges that would otherwise succeed. The error is bounded to the difference between the smallest and the biggest possible charge size, so for memcg, this means that a failing THP charge can send base page charges into reclaim upto 2MB (4MB) before the limit would have been reached. This should be acceptable. [akpm@linux-foundation.org: add includes for WARN_ON_ONCE and memparse] [akpm@linux-foundation.org: add includes for WARN_ON_ONCE, memparse, strncmp, and PAGE_SIZE] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-12-10 17:41:04 -08:00
Linus Torvalds	cbfe0de303	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS changes from Al Viro: "First pile out of several (there _definitely_ will be more). Stuff in this one: - unification of d_splice_alias()/d_materialize_unique() - iov_iter rewrite - killing a bunch of ->f_path.dentry users (and f_dentry macro). Getting that completed will make life much simpler for unionmount/overlayfs, since then we'll be able to limit the places sensitive to file _dentry_ to reasonably few. Which allows to have file_inode(file) pointing to inode in a covered layer, with dentry pointing to (negative) dentry in union one. Still not complete, but much closer now. - crapectomy in lustre (dead code removal, mostly) - "let's make seq_printf return nothing" preparations - assorted cleanups and fixes There _definitely_ will be more piles" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) copy_from_iter_nocache() new helper: iov_iter_kvec() csum_and_copy_..._iter() iov_iter.c: handle ITER_KVEC directly iov_iter.c: convert copy_to_iter() to iterate_and_advance iov_iter.c: convert copy_from_iter() to iterate_and_advance iov_iter.c: get rid of bvec_copy_page_{to,from}_iter() iov_iter.c: convert iov_iter_zero() to iterate_and_advance iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds iov_iter.c: convert iov_iter_npages() to iterate_all_kinds iov_iter.c: iterate_and_advance iov_iter.c: macros for iterating over iov_iter kill f_dentry macro dcache: fix kmemcheck warning in switch_names new helper: audit_file() nfsd_vfs_write(): use file_inode() ncpfs: use file_inode() kill f_dentry uses lockd: get rid of ->f_path.dentry->d_sb ...	2014-12-10 16:10:49 -08:00
Linus Torvalds	e20db597b6	NFS client updates for Linux 3.19 Highlights include: Features: - NFSv4.2 client support for hole punching and preallocation. - Further RPC/RDMA client improvements. - Add more RPC transport debugging tracepoints. - Add RPC debugging tools in debugfs. Bugfixes: - Stable fix for layoutget error handling - Fix a change in COMMIT behaviour resulting from the recent io code updates -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUhRVTAAoJEGcL54qWCgDyfeUP/RoFo3ImTMbGxfcPJqoELjcO lZbQ+27pOE/whFDkWgiOVTwlgGct5a0WRo7GCZmpYJA4q1kmSv4ngTb3nMTCUztt xMJ0mBr0BqttVs+ouKiVPm3cejQXedEhttwWcloIXS8lNenlpL29Zlrx2NHdU8UU 13+souocj0dwIyTYYS/4Lm9KpuCYnpDBpP5ShvQjVaMe/GxJo6GyZu70c7FgwGNz Nh9onzZV3mz1elhfizlV38aVA7KWVXtLWIqOFIKlT2fa4nWB8Hc07miR5UeOK0/h r+icnF2qCQe83MbjOxYNxIKB6uiA/4xwVc90X4AQ7F0RX8XPWHIQWG5tlkC9jrCQ 3RGzYshWDc9Ud2mXtLMyVQxHVVYlFAe1WtdP8ZWb1oxDInmhrarnWeNyECz9xGKu VzIDZzeq9G8slJXATWGRfPsYr+Ihpzcen4QQw58cakUBcqEJrYEhlEOfLovM71k3 /S/jSHBAbQqiw4LPMw87bA5A6+ZKcVSsNE0XCtNnhmqFpLc1kKRrl5vaN+QMk5tJ v4/zR0fPqH7SGAJWYs4brdfahyejEo0TwgpDs7KHmu1W9zQ0LCVTaYnQuUmQjta6 WyYwIy3TTibdfR191O0E3NOW82Q/k/NBD6ySvabN9HqQ9eSk6+rzrWAslXCbYohb BJfzcQfDdx+lsyhjeTx9 =wOP3 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - NFSv4.2 client support for hole punching and preallocation. - Further RPC/RDMA client improvements. - Add more RPC transport debugging tracepoints. - Add RPC debugging tools in debugfs. Bugfixes: - Stable fix for layoutget error handling - Fix a change in COMMIT behaviour resulting from the recent io code updates" * tag 'nfs-for-3.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits) sunrpc: add a debugfs rpc_xprt directory with an info file in it sunrpc: add debugfs file for displaying client rpc_task queue nfs: Add DEALLOCATE support nfs: Add ALLOCATE support NFS: Clean up nfs4_init_callback() NFS: SETCLIENTID XDR buffer sizes are incorrect SUNRPC: serialize iostats updates xprtrdma: Display async errors xprtrdma: Enable pad optimization xprtrdma: Re-write rpcrdma_flush_cqs() xprtrdma: Refactor tasklet scheduling xprtrdma: unmap all FMRs during transport disconnect xprtrdma: Cap req_cqinit xprtrdma: Return an errno from rpcrdma_register_external() nfs: define nfs_inc_fscache_stats and using it as possible nfs: replace nfs_add_stats with nfs_inc_stats when add one NFS: Deletion of unnecessary checks before the function call "nfs_put_client" sunrpc: eliminate RPC_TRACEPOINTS sunrpc: eliminate RPC_DEBUG lockd: eliminate LOCKD_DEBUG ...	2014-12-10 15:13:13 -08:00
David S. Miller	22f10923dd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/amd/xgbe/xgbe-desc.c drivers/net/ethernet/renesas/sh_eth.c Overlapping changes in both conflict cases. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:48:20 -05:00
Joe Perches	785c20a08b	irda: Convert function pointer arrays and uses to const Making things const is a good thing. (x86-64 defconfig with all irda) $ size net/irda/built-in.o* text data bss dec hex filename 109276 1868 244 111388 1b31c net/irda/built-in.o.new 108828 2316 244 111388 1b31c net/irda/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:33:16 -05:00
Joe Perches	22bbf5f3e4	llc: Make llc_sap_action_t function pointer arrays const It's better when function pointer arrays aren't modifiable. Net change: $ size net/llc/built-in.o.* text data bss dec hex filename 61193 12758 1344 75295 1261f net/llc/built-in.o.new 47113 27030 1344 75487 126df net/llc/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:21:24 -05:00
Joe Perches	9b37306935	llc: Make llc_conn_ev_qfyr_t function pointer arrays const It's better when function pointer arrays aren't modifiable. Net change from original: $ size net/llc/built-in.o.* text data bss dec hex filename 61065 12886 1344 75295 1261f net/llc/built-in.o.new 47113 27030 1344 75487 126df net/llc/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:21:24 -05:00
Joe Perches	14b7d95fd2	llc: Make function pointer arrays const It's better when function pointer arrays aren't modifiable. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:21:24 -05:00
Daniel Borkmann	87545899b5	net: replace remaining users of arch_fast_hash with jhash This patch effectively reverts commit `500f808726` ("net: ovs: use CRC32 accelerated flow hash if available"), and other remaining arch_fast_hash() users such as from nfsd via commit `6282cd5655` ("NFSD: Don't hand out delegations for 30 seconds after recalling them.") where it has been used as a hash function for bloom filtering. While we think that these users are actually not much of concern, it has been requested to remove the arch_fast_hash() library bits that arose from [1] entirely as per recent discussion [2]. The main argument is that using it as a hash may introduce bias due to its linearity (see avalanche criterion) and thus makes it less clear (though we tried to document that) when this security/performance trade-off is actually acceptable for a general purpose library function. Lets therefore avoid any further confusion on this matter and remove it to prevent any future accidental misuse of it. For the time being, this is going to make hashing of flow keys a bit more expensive in the ovs case, but future work could reevaluate a different hashing discipline. [1] https://patchwork.ozlabs.org/patch/299369/ [2] https://patchwork.ozlabs.org/patch/418756/ Cc: Neil Brown <neilb@suse.de> Cc: Francesco Fusco <fusco@ntop.org> Cc: Jesse Gross <jesse@nicira.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:17:45 -05:00
Daniel Borkmann	7f19fc5e0b	netlink: use jhash as hashfn for rhashtable For netlink, we shouldn't be using arch_fast_hash() as a hashing discipline, but rather jhash() instead. Since netlink sockets can be opened by any user, a local attacker would be able to easily create collisions with the DPDK-derived arch_fast_hash(), which trades off performance for security by using crc32 CPU instructions on x86_64. While it might have a legimite use case in other places, it should be avoided in netlink context, though. As rhashtable's API is very flexible, we could later on still decide on other hashing disciplines, if legitimate. Reference: http://thread.gmane.org/gmane.linux.kernel/1844123 Fixes: `e341694e3e` ("netlink: Convert netlink_lookup() to use RCU protected hash table") Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 15:17:45 -05:00
Richard Alpe	340b6e59fb	tipc: fix broadcast wakeup contention after congestion commit `908344cdda` ("tipc: fix bug in multicast congestion handling") introduced a race in the broadcast link wakeup functionality. This patch eliminates this broadcast link wakeup race caused by operation on the wakeup list without proper locking. If this race hit and corrupted the list all subsequent wakeup messages would be lost, resulting in a considerable memory leak. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 14:45:33 -05:00
Alexander Duyck	fd11a83dd3	net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb This change pulls the core functionality out of __netdev_alloc_skb and places them in a new function named __alloc_rx_skb. The reason for doing this is to make these bits accessible to a new function __napi_alloc_skb. In addition __alloc_rx_skb now has a new flags value that is used to determine which page frag pool to allocate from. If the SKB_ALLOC_NAPI flag is set then the NAPI pool is used. The advantage of this is that we do not have to use local_irq_save/restore when accessing the NAPI pool from NAPI context. In my test setup I saw at least 11ns of savings using the napi_alloc_skb function versus the netdev_alloc_skb function, most of this being due to the fact that we didn't have to call local_irq_save/restore. The main use case for napi_alloc_skb would be for things such as copybreak or page fragment based receive paths where an skb is allocated after the data has been received instead of before. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 13:31:57 -05:00
Alexander Duyck	ffde7328a3	net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag This patch splits the netdev_alloc_frag function up so that it can be used on one of two page frag pools instead of being fixed on the netdev_alloc_cache. By doing this we can add a NAPI specific function __napi_alloc_frag that accesses a pool that is only used from softirq context. The advantage to this is that we do not need to call local_irq_save/restore which can be a significant savings. I also took the opportunity to refactor the core bits that were placed in __alloc_page_frag. First I updated the allocation to do either a 32K allocation or an order 0 page. This is based on the changes in commmit `d9b2938aa` where it was found that latencies could be reduced in case of failures. Then I also rewrote the logic to work from the end of the page to the start. By doing this the size value doesn't have to be used unless we have run out of space for page fragments. Finally I cleaned up the atomic bits so that we just do an atomic_sub_and_test and if that returns true then we set the page->_count via an atomic_set. This way we can remove the extra conditional for the atomic_read since it would have led to an atomic_inc in the case of success anyway. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 13:31:57 -05:00
David S. Miller	6e5f59aacb	Merge branch 'for-davem-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs More iov_iter work for the networking from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 13:17:23 -05:00
Dan Carpenter	3b05ac3824	ipvs: uninitialized data with IP_VS_IPV6 The app_tcp_pkt_out() function expects "*diff" to be set and ends up using uninitialized data if CONFIG_IP_VS_IPV6 is turned on. The same issue is there in app_tcp_pkt_in(). Thanks to Julian Anastasov for noticing that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-12-10 17:36:47 +09:00
Linus Torvalds	86c6a2fddf	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle are: - 'Nested Sleep Debugging', activated when CONFIG_DEBUG_ATOMIC_SLEEP=y. This instruments might_sleep() checks to catch places that nest blocking primitives - such as mutex usage in a wait loop. Such bugs can result in hard to debug races/hangs. Another category of invalid nesting that this facility will detect is the calling of blocking functions from within schedule() -> sched_submit_work() -> blk_schedule_flush_plug(). There's some potential for false positives (if secondary blocking primitives themselves are not ready yet for this facility), but the kernel will warn once about such bugs per bootup, so the warning isn't much of a nuisance. This feature comes with a number of fixes, for problems uncovered with it, so no messages are expected normally. - Another round of sched/numa optimizations and refinements, for CONFIG_NUMA_BALANCING=y. - Another round of sched/dl fixes and refinements. Plus various smaller fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) sched: Add missing rcu protection to wake_up_all_idle_cpus sched/deadline: Introduce start_hrtick_dl() for !CONFIG_SCHED_HRTICK sched/numa: Init numa balancing fields of init_task sched/deadline: Remove unnecessary definitions in cpudeadline.h sched/cpupri: Remove unnecessary definitions in cpupri.h sched/deadline: Fix rq->dl.pushable_tasks bug in push_dl_task() sched/fair: Fix stale overloaded status in the busiest group finding logic sched: Move p->nr_cpus_allowed check to select_task_rq() sched/completion: Document when to use wait_for_completion_io_() sched: Update comments about CLONE_NEWUTS and CLONE_NEWIPC sched/fair: Kill task_struct::numa_entry and numa_group::task_list sched: Refactor task_struct to use numa_faults instead of numa_ pointers sched/deadline: Don't check CONFIG_SMP in switched_from_dl() sched/deadline: Reschedule from switched_from_dl() after a successful pull sched/deadline: Push task away if the deadline is equal to curr during wakeup sched/deadline: Add deadline rq status print sched/deadline: Fix artificial overrun introduced by yield_task_dl() sched/rt: Clean up check_preempt_equal_prio() sched/core: Use dl_bw_of() under rcu_read_lock_sched() sched: Check if we got a shallowest_idle_cpu before searching for least_loaded_cpu ...	2014-12-09 21:21:34 -08:00
Jiri Pirko	6ea3b446b9	net: sched: cls: use nla_nest_cancel instead of nlmsg_trim To cancel nesting, this function is more convenient. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 21:49:58 -05:00
Valdis.Kletnieks@vt.edu	69204cf7eb	net: fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.c commit `46e5da40ae` (net: qdisc: use rcu prefix and silence sparse warnings) triggers a spurious warning: net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage! The code should be using the _bh variant of rcu_dereference. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 21:49:09 -05:00
Eric Dumazet	0f85feae6b	tcp: fix more NULL deref after prequeue changes When I cooked commit `c3658e8d0f` ("tcp: fix possible NULL dereference in tcp_vX_send_reset()") I missed other spots we could deref a NULL skb_dst(skb) Again, if a socket is provided, we do not need skb_dst() to get a pointer to network namespace : sock_net(sk) is good enough. Reported-by: Dann Frazier <dann.frazier@canonical.com> Bisected-by: Dann Frazier <dann.frazier@canonical.com> Tested-by: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `ca777eff51` ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 21:38:44 -05:00
Ying Xue	023160bc8f	tipc: avoid double lock 'spin_lock:&seq->lock' The commit `fb9962f3ce` ("tipc: ensure all name sequences are properly protected with its lock") involves below errors: net/tipc/name_table.c:980 tipc_purge_publications() error: double lock 'spin_lock:&seq->lock' Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 18:27:03 -05:00
Roopa Prabhu	1d460b988d	rocker: remove swdev mode Remove use of 'swdev' mode in rocker. rocker dev offloads can use the BRIDGE_FLAGS_SELF to indicate offload to hardware. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 18:24:47 -05:00
David S. Miller	b5f185f33d	Merge tag 'master-2014-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-12-08 Please pull this last batch of pending wireless updates for the 3.19 tree... For the wireless bits, Johannes says: "This time I have Felix's no-status rate control work, which will allow drivers to work better with rate control even if they don't have perfect status reporting. In addition to this, a small hwsim fix from Patrik, one of the regulatory patches from Arik, and a number of cleanups and fixes I did myself. Of note is a patch where I disable CFG80211_WEXT so that compatibility is no longer selectable - this is intended as a wake-up call for anyone who's still using it, and is still easily worked around (it's a one-line patch) before we fully remove the code as well in the future." For the Bluetooth bits, Johan says: "Here's one more bluetooth-next pull request for 3.19: - Minor cleanups for ieee802154 & mac802154 - Fix for the kernel warning with !TASK_RUNNING reported by Kirill A. Shutemov - Support for another ath3k device - Fix for tracking link key based security level - Device tree bindings for btmrvl + a state update fix - Fix for wrong ACL flags on LE links" And... "In addition to the previous one this contains two more cleanups to mac802154 as well as support for some new HCI features from the Bluetooth 4.2 specification. From the original request: 'Here's what should be the last bluetooth-next pull request for 3.19. It's rather large but the majority of it is the Low Energy Secure Connections feature that's part of the Bluetooth 4.2 specification. The specification went public only this week so we couldn't publish the corresponding code before that. The code itself can nevertheless be considered fairly mature as it's been in development for over 6 months and gone through several interoperability test events. Besides LE SC the pull request contains an important fix for command complete events for mgmt sockets which also fixes some leaks of hci_conn objects when powering off or unplugging Bluetooth adapters. A smaller feature that's part of the pull request is service discovery support. This is like normal device discovery except that devices not matching specific UUIDs or strong enough RSSI are filtered out. Other changes that the pull request contains are firmware dump support to the btmrvl driver, firmware download support for Broadcom BCM20702A0 variants, as well as some coding style cleanups in 6lowpan & ieee802154/mac802154 code.'" For the NFC bits, Samuel says: "With this one we get: - NFC digital improvements for DEP support: Chaining, NACK and ATN support added. - NCI improvements: Support for p2p target, SE IO operand addition, SE operands extensions to support proprietary implementations, and a few fixes. - NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support, and SE IO operand addition. - A bunch of minor improvements and fixes for STMicro st21nfcb and st21nfca" For the iwlwifi bits, Emmanuel says: "Major works are CSA and TDLS. On top of that I have a new firmware API for scan and a few rate control improvements. Johannes find a few tricks to improve our CPU utilization and adds support for a new spin of 7265 called 7265D. Along with this a few random things that don't stand out." And... "I deprecate here -8.ucode since -9 has been published long ago. Along with that I have a new activity, we have now better a infrastructure for firmware debugging. This will allow to have configurable probes insides the firmware. Luca continues his work on NetDetect, this feature is now complete. All the rest is minor fixes here and there." For the Atheros bits, Kalle says: "Only ath10k changes this time and no major changes. Most visible are: o new debugfs interface for runtime firmware debugging (Yanbo) o fix shared WEP (Sujith) o don't rebuild whenever kernel version changes (Johannes) o lots of refactoring to make it easier to add new hw support (Michal) There's also smaller fixes and improvements with no point of listing here." In addition, there are a few last minute updates to ath5k, ath9k, brcmfmac, brcmsmac, mwifiex, rt2x00, rtlwifi, and wil6210. Also included is a pull of the wireless tree to pick-up the fixes originally included in "pull request: wireless 2014-12-03"... Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 18:12:03 -05:00
Li RongQing	e008f3f07f	net: avoid to call skb_queue_len again the queue length of sd->input_pkt_queue has been put into qlen, and impossible to change, since hold the lock Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 17:03:19 -05:00
David S. Miller	602de7ead5	linux-can-next-for-3.19-20141207 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUhNqoAAoJECte4hHFiupU57IP/1ioNl+EkM8ZXCH+pZCsMuoF S33lLQjJ2WEh2WZXDEJqGWdv7FRh5dUyRB67TpMCzQa8lsyPykapFAy4s1DEEZ46 EbsRHjkJdw+fg3dRGp33XPD55t2xXz9CYB7OuVGLjBEWdFb5a2by+JYCctTynqum xI+qGo6IKcAvlyYAmiopZ+FOBUMhRo30GkkzVnoIsQn+Z1HYEdJ+QGryL1rOY01D Gt4d+hZ6bT08yy+4ZB3Sr6/H3w4e8saUCS8H+JyLVYR+quM0T/uV4drqk/21kUNU 954LPu5GY5l6gYDEaki96Rc6DpuqsWlgy7oh1E3p9XN0vZFPEjmFXkic28hHpvKm nDThB9qllwYUu9hmALaMuxkbRmJK/NvFwlzdtp0uZIiiENGGQrD368wiWxyzD3aP HvthWTNM2E+T15gmmzUNnGPbaTWgxjp4G4wEucX/yLiZDTu0ftoFBvnRy3emWhI0 3N1Lf3ZBGYuHQvyUMWMgQ53nwuPuDgcVy/wYEUu11rI4zFcP7OmrznPhtnwfwQmz lMppDC0d3L0PGjI4/oKXJAXrCuAVldv+eLFOpHaJuXU+VuglEOpetjUDMv2A0hbQ 23HcX+rIRd+8M8H+RtAYrqmocAOw70/cy0NzuLfI8a7kOW9H55dHADx4IFTae2E+ X1dBTj1EHrIlyw6lkC9e =icE1 -----END PGP SIGNATURE----- Merge tag 'linux-can-next-for-3.19-20141207' of git://gitorious.org/linux-can/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2014-12-07 this is a pull request of 8 patches for net-next/master. Andri Yngvason contributes 4 patches in which the CAN state change handling is consolidated and unified among the sja1000, mscan and flexcan driver. The three patches by Jeremiah Mahler fix spelling mistakes and eliminate the banner[] variable in various parts. And a patch by me that switches on sparse endianess checking by default. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 16:49:00 -05:00
Eric Dumazet	605ad7f184	tcp: refine TSO autosizing Commit `95bd09eb27` ("tcp: TSO packets automatic sizing") tried to control TSO size, but did this at the wrong place (sendmsg() time) At sendmsg() time, we might have a pessimistic view of flow rate, and we end up building very small skbs (with 2 MSS per skb). This is bad because : - It sends small TSO packets even in Slow Start where rate quickly increases. - It tends to make socket write queue very big, increasing tcp_ack() processing time, but also increasing memory needs, not necessarily accounted for, as fast clones overhead is currently ignored. - Lower GRO efficiency and more ACK packets. Servers with a lot of small lived connections suffer from this. Lets instead fill skbs as much as possible (64KB of payload), but split them at xmit time, when we have a precise idea of the flow rate. skb split is actually quite efficient. Patch looks bigger than necessary, because TCP Small Queue decision now has to take place after the eventual split. As Neal suggested, introduce a new tcp_tso_autosize() helper, so that tcp_tso_should_defer() can be synchronized on same goal. Rename tp->xmit_size_goal_segs to tp->gso_segs, as this variable contains number of mss that we can put in GSO packet, and is not related to the autosizing goal anymore. Tested: 40 ms rtt link nstat >/dev/null netperf -H remote -l -2000000 -- -s 1000000 nstat \| egrep "IpInReceives\|IpOutRequests\|TcpOutSegs\|IpExtOutOctets" Before patch : Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/s 87380 2000000 2000000 0.36 44.22 IpInReceives 600 0.0 IpOutRequests 599 0.0 TcpOutSegs 1397 0.0 IpExtOutOctets 2033249 0.0 After patch : Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 2000000 2000000 0.36 44.27 IpInReceives 221 0.0 IpOutRequests 232 0.0 TcpOutSegs 1397 0.0 IpExtOutOctets 2013953 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 16:39:22 -05:00
Al Viro	d3a9632f09	skb_copy_datagram_iovec() can die no callers other than itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:29:11 -05:00
Al Viro	e5a4b0bb80	switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives ... making both non-draining. That means that tcp_recvmsg() becomes non-draining. And _that_ would break iscsit_do_rx_data() unless we a) make sure tcp_recvmsg() is uniformly non-draining (it is) b) make sure it copes with arbitrary (including shifted) iov_iter (it does, all it uses is iov_iter primitives) c) make iscsit_do_rx_data() initialize ->msg_iter only once. Fortunately, (c) is doable with minimal work and we are rid of one the two places where kernel send/recvmsg users would be unhappy with non-draining behaviour. Actually, that makes all but one of ->recvmsg() instances iov_iter-clean. The exception is skcipher_recvmsg() and it also isn't hard to convert to primitives (iov_iter_get_pages() is needed there). That'll wait a bit - there's some interplay with ->sendmsg() path for that one. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:29:10 -05:00
Al Viro	17836394e5	first fruits - kill l2cap ->memcpy_fromiovec() Just use copy_from_iter(). That's what this method is trying to do in all cases, in a very convoluted fashion. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:29:10 -05:00
Al Viro	c0371da604	put iov_iter into msghdr Note that the code _using_ ->msg_iter at that point will be very unhappy with anything other than unshifted iovec-backed iov_iter. We still need to convert users to proper primitives. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:29:03 -05:00
Al Viro	d838df2e5d	vmci: propagate msghdr all way down to __qp_memcpy_from_queue() ... and switch it to memcpy_to_msg() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:28:23 -05:00
Al Viro	56c39fb67c	switch l2cap ->memcpy_fromiovec() to msghdr it'll die soon enough - now that kvec-backed iov_iter works regardless of set_fs(), both instances will become copy_from_iter() as soon as we introduce ->msg_iter... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:28:23 -05:00
Al Viro	f4362a2c95	switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:28:22 -05:00
Al Viro	f69e6d131f	ip_generic_getfrag, udplite_getfrag: switch to passing msghdr Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:28:22 -05:00
Al Viro	19e3c66b52	ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:28:21 -05:00
Al Viro	b61e9dcc5e	raw.c: stick msghdr into raw_frag_vec we'll want access to ->msg_iter Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-09 16:28:21 -05:00
Hannes Frederic Sowa	dbfc4fb7d5	dst: no need to take reference on DST_NOCACHE dsts Since commit `f886497212` ("ipv4: fix dst race in sk_dst_get()") DST_NOCACHE dst_entries get freed by RCU. So there is no need to get a reference on them when we are in rcu protected sections. Cc: Eric Dumazet <edumazet@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 16:08:17 -05:00
Jiri Benc	d2b2a13245	openvswitch: set correct protocol on route lookup Respect what the caller passed to ovs_tunnel_get_egress_info. Fixes: `8f0aad6f35` ("openvswitch: Extend packet attribute for egress tunnel info") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 16:01:21 -05:00
Jiri Pirko	bd42b78860	net: sched: cls_basic: fix error path in basic_change() Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 15:41:56 -05:00
Gu Zheng	0cf00c6f36	net/socket.c : introduce helper function do_sock_sendmsg to replace reduplicate code Introduce helper function do_sock_sendmsg() to simplify sock_sendmsg{_nosec}, and replace reduplicate code. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 15:24:26 -05:00
Eric Dumazet	42eef7a0bb	tcp_cubic: refine Hystart delay threshold In commit `2b4636a5f8` ("tcp_cubic: make the delay threshold of HyStart less sensitive"), HYSTART_DELAY_MIN was changed to 4 ms. The remaining problem is that using delay_min + (delay_min/16) as the threshold is too sensitive. 6.25 % of variation is too small for rtt above 60 ms, which are not uncommon. Lets use 12.5 % instead (delay_min + (delay_min/8)) Tested: 80 ms RTT between peers, FQ/pacing packet scheduler on sender. 10 bulk transfers of 10 seconds : nstat >/dev/null for i in `seq 1 10` do netperf -H remote -- -k THROUGHPUT \| grep THROUGHPUT done nstat \| grep Hystart With the 6.25 % threshold : THROUGHPUT=20.66 THROUGHPUT=249.38 THROUGHPUT=254.10 THROUGHPUT=14.94 THROUGHPUT=251.92 THROUGHPUT=237.73 THROUGHPUT=19.18 THROUGHPUT=252.89 THROUGHPUT=21.32 THROUGHPUT=15.58 TcpExtTCPHystartTrainDetect 2 0.0 TcpExtTCPHystartTrainCwnd 4756 0.0 TcpExtTCPHystartDelayDetect 5 0.0 TcpExtTCPHystartDelayCwnd 180 0.0 With the 12.5 % threshold THROUGHPUT=251.09 THROUGHPUT=247.46 THROUGHPUT=250.92 THROUGHPUT=248.91 THROUGHPUT=250.88 THROUGHPUT=249.84 THROUGHPUT=250.51 THROUGHPUT=254.15 THROUGHPUT=250.62 THROUGHPUT=250.89 TcpExtTCPHystartTrainDetect 1 0.0 TcpExtTCPHystartTrainCwnd 3175 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 14:58:23 -05:00
Eric Dumazet	6e3a8a937c	tcp_cubic: add SNMP counters to track how effective is Hystart When deploying FQ pacing, one thing we noticed is that CUBIC Hystart triggers too soon. Having SNMP counters to have an idea of how often the various Hystart methods trigger is useful prior to any modifications. This patch adds SNMP counters tracking, how many time "ack train" or "Delay" based Hystart triggers, and cumulative sum of cwnd at the time Hystart decided to end SS (Slow Start) myhost:~# nstat -a \| grep Hystart TcpExtTCPHystartTrainDetect 9 0.0 TcpExtTCPHystartTrainCwnd 20650 0.0 TcpExtTCPHystartDelayDetect 10 0.0 TcpExtTCPHystartDelayCwnd 360 0.0 -> Train detection was triggered 9 times, and average cwnd was 20650/9=2294, Delay detection was triggered 10 times and average cwnd was 36 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 14:58:23 -05:00
Jiri Pirko	57d743a3de	net: sched: cls: remove unused op put from tcf_proto_ops It is never called and implementations are void. So just remove it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 14:49:02 -05:00
Erik Hugne	4988bb4a3f	tipc: fix missing spinlock init and nullptr oops commit `908344cdda` ("tipc: fix bug in multicast congestion handling") introduced two bugs with the bclink wakeup function. This commit fixes the missing spinlock init for the waiting_sks list. We also eliminate the race condition between the waiting_sks length check/dequeue operations in tipc_bclink_wakeup_users by simply removing the redundant length check. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Tero Aho <Tero.Aho@coriant.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 13:41:54 -05:00
Eric Dumazet	6ffe75eb53	net: avoid two atomic operations in fast clones Commit `ce1a4ea3f1` ("net: avoid one atomic operation in skb_clone()") took the wrong way to save one atomic operation. It is actually possible to avoid two atomic operations, if we do not change skb->fclone values, and only rely on clone_ref content to signal if the clone is available or not. skb_clone() can simply use the fast clone if clone_ref is 1. kfree_skbmem() can avoid the atomic_dec_and_test() if clone_ref is 1. Note that because we usually free the clone before the original skb, this particular attempt is only done for the original skb to have better branch prediction. SKB_FCLONE_FREE is removed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Chris Mason <clm@fb.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 13:40:20 -05:00
Andrew Shewmaker	5d330cddb9	Update old iproute2 and Xen Remus links Signed-off-by: Andrew Shewmaker <agshew@gmail.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 13:38:13 -05:00
Mahesh Bandewar	395eea6ccf	rtnetlink: delay RTM_DELLINK notification until after ndo_uninit() The commit `56bfa7ee7c` ("unregister_netdevice : move RTM_DELLINK to until after ndo_uninit") tried to do this ealier but while doing so it created a problem. Unfortunately the delayed rtmsg_ifinfo() also delayed call to fill_info(). So this translated into asking driver to remove private state and then query it's private state. This could have catastropic consequences. This change breaks the rtmsg_ifinfo() into two parts - one takes the precise snapshot of the device by called fill_info() before calling the ndo_uninit() and the second part sends the notification using collected snapshot. It was brought to notice when last link is deleted from an ipvlan device when it has free-ed the port and the subsequent .fill_info() call is trying to get the info from the port. kernel: [ 255.139429] ------------[ cut here ]------------ kernel: [ 255.139439] WARNING: CPU: 12 PID: 11173 at net/core/rtnetlink.c:2238 rtmsg_ifinfo+0x100/0x110() kernel: [ 255.139493] Modules linked in: ipvlan bonding w1_therm ds2482 wire cdc_acm ehci_pci ehci_hcd i2c_dev i2c_i801 i2c_core msr cpuid bnx2x ptp pps_core mdio libcrc32c kernel: [ 255.139513] CPU: 12 PID: 11173 Comm: ip Not tainted 3.18.0-smp-DEV #167 kernel: [ 255.139514] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 1.0.10 05/15/2012 kernel: [ 255.139515] 0000000000000009 ffff880851b6b828 ffffffff815d87f4 00000000000000e0 kernel: [ 255.139516] 0000000000000000 ffff880851b6b868 ffffffff8109c29c 0000000000000000 kernel: [ 255.139518] 00000000ffffffa6 00000000000000d0 ffffffff81aaf580 0000000000000011 kernel: [ 255.139520] Call Trace: kernel: [ 255.139527] [<ffffffff815d87f4>] dump_stack+0x46/0x58 kernel: [ 255.139531] [<ffffffff8109c29c>] warn_slowpath_common+0x8c/0xc0 kernel: [ 255.139540] [<ffffffff8109c2ea>] warn_slowpath_null+0x1a/0x20 kernel: [ 255.139544] [<ffffffff8150d570>] rtmsg_ifinfo+0x100/0x110 kernel: [ 255.139547] [<ffffffff814f78b5>] rollback_registered_many+0x1d5/0x2d0 kernel: [ 255.139549] [<ffffffff814f79cf>] unregister_netdevice_many+0x1f/0xb0 kernel: [ 255.139551] [<ffffffff8150acab>] rtnl_dellink+0xbb/0x110 kernel: [ 255.139553] [<ffffffff8150da90>] rtnetlink_rcv_msg+0xa0/0x240 kernel: [ 255.139557] [<ffffffff81329283>] ? rhashtable_lookup_compare+0x43/0x80 kernel: [ 255.139558] [<ffffffff8150d9f0>] ? __rtnl_unlock+0x20/0x20 kernel: [ 255.139562] [<ffffffff8152cb11>] netlink_rcv_skb+0xb1/0xc0 kernel: [ 255.139563] [<ffffffff8150a495>] rtnetlink_rcv+0x25/0x40 kernel: [ 255.139565] [<ffffffff8152c398>] netlink_unicast+0x178/0x230 kernel: [ 255.139567] [<ffffffff8152c75f>] netlink_sendmsg+0x30f/0x420 kernel: [ 255.139571] [<ffffffff814e0b0c>] sock_sendmsg+0x9c/0xd0 kernel: [ 255.139575] [<ffffffff811d1d7f>] ? rw_copy_check_uvector+0x6f/0x130 kernel: [ 255.139577] [<ffffffff814e11c9>] ? copy_msghdr_from_user+0x139/0x1b0 kernel: [ 255.139578] [<ffffffff814e1774>] ___sys_sendmsg+0x304/0x310 kernel: [ 255.139581] [<ffffffff81198723>] ? handle_mm_fault+0xca3/0xde0 kernel: [ 255.139585] [<ffffffff811ebc4c>] ? destroy_inode+0x3c/0x70 kernel: [ 255.139589] [<ffffffff8108e6ec>] ? __do_page_fault+0x20c/0x500 kernel: [ 255.139597] [<ffffffff811e8336>] ? dput+0xb6/0x190 kernel: [ 255.139606] [<ffffffff811f05f6>] ? mntput+0x26/0x40 kernel: [ 255.139611] [<ffffffff811d2b94>] ? __fput+0x174/0x1e0 kernel: [ 255.139613] [<ffffffff814e2129>] __sys_sendmsg+0x49/0x90 kernel: [ 255.139615] [<ffffffff814e2182>] SyS_sendmsg+0x12/0x20 kernel: [ 255.139617] [<ffffffff815df092>] system_call_fastpath+0x12/0x17 kernel: [ 255.139619] ---[ end trace 5e6703e87d984f6b ]--- Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Cc: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 13:36:57 -05:00
Erik Hugne	88b17b6a22	tipc: drop tx side permission checks Part of the old remote management feature is a piece of code that checked permissions on the local system to see if a certain operation was permitted, and if so pass the command to a remote node. This serves no purpose after the removal of remote management with commit `5902385a24` ("tipc: obsolete the remote management feature") so we remove it. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 13:30:13 -05:00
Daniel Borkmann	9772b54c55	net: sctp: use MAX_HEADER for headroom reserve in output path To accomodate for enough headroom for tunnels, use MAX_HEADER instead of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs of trinity an skb_under_panic() via SCTP output path (see reference). I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere in other protocols might be one possible cause for this. In any case, it looks like accounting on chunks themself seems to look good as the skb already passed the SCTP output path and did not hit any skb_over_panic(). Given tunneling was enabled in his .config, the headroom would have been expanded by MAX_HEADER in this case. Reported-by: Robert Święcki <robert@swiecki.net> Reference: https://lkml.org/lkml/2014/12/1/507 Fixes: `594ccc14df` ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 13:24:03 -05:00
Duan Jiong	86fe8f8920	ipv6: remove useless spin_lock/spin_unlock xchg is atomic, so there is no necessary to use spin_lock/spin_unlock to protect it. At last, remove the redundant opt = xchg(&inet6_sk(sk)->opt, opt); statement. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-09 13:18:09 -05:00
Andy Shevchenko	1b2e122d16	sunrpc/cache: convert to use string_escape_str() There is nice kernel helper to escape a given strings by provided rules. Let's use it instead of custom approach. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [bfields@redhat.com: fix length calculation] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:30:20 -05:00
Jeff Layton	acf06a7fa1	sunrpc: only call test_bit once in svc_xprt_received ...move the WARN_ON_ONCE inside the following if block since they use the same condition. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:29:14 -05:00
Jeff Layton	83a712e0af	sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt These were useful when I was tracking down a race condition between svc_xprt_do_enqueue and svc_get_next_xprt. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:29:14 -05:00
Jeff Layton	b1691bc03d	sunrpc: convert to lockless lookup of queued server threads Testing has shown that the pool->sp_lock can be a bottleneck on a busy server. Every time data is received on a socket, the server must take that lock in order to dequeue a thread from the sp_threads list. Address this problem by eliminating the sp_threads list (which contains threads that are currently idle) and replacing it with a RQ_BUSY flag in svc_rqst. This allows us to walk the sp_all_threads list under the rcu_read_lock and find a suitable thread for the xprt by doing a test_and_set_bit. Note that we do still have a potential atomicity problem however with this approach. We don't want svc_xprt_do_enqueue to set the rqst->rq_xprt pointer unless a test_and_set_bit of RQ_BUSY returned zero (which indicates that the thread was idle). But, by the time we check that, the bit could be flipped by a waking thread. To address this, we acquire a new per-rqst spinlock (rq_lock) and take that before doing the test_and_set_bit. If that returns false, then we can set rq_xprt and drop the spinlock. Then, when the thread wakes up, it must set the bit under the same spinlock and can trust that if it was already set then the rq_xprt is also properly set. With this scheme, the case where we have an idle thread no longer needs to take the highly contended pool->sp_lock at all, and that removes the bottleneck. That still leaves one issue: What of the case where we walk the whole sp_all_threads list and don't find an idle thread? Because the search is lockess, it's possible for the queueing to race with a thread that is going to sleep. To address that, we queue the xprt and then search again. If we find an idle thread at that point, we can't attach the xprt to it directly since that might race with a different thread waking up and finding it. All we can do is wake the idle thread back up and let it attempt to find the now-queued xprt. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Chris Worley <chris.worley@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:22 -05:00
Jeff Layton	403c7b4444	sunrpc: fix potential races in pool_stats collection In a later patch, we'll be removing some spinlocking around the socket and thread queueing code in order to fix some contention problems. At that point, the stats counters will no longer be protected by the sp_lock. Change the counters to atomic_long_t fields, except for the "sockets_queued" counter which will still be manipulated under a spinlock. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Chris Worley <chris.worley@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:22 -05:00
Jeff Layton	812443865c	sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it ...also make the manipulation of sp_all_threads list use RCU-friendly functions. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Chris Worley <chris.worley@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:22 -05:00
Jeff Layton	0b5707e452	sunrpc: require svc_create callers to pass in meaningful shutdown routine Currently all svc_create callers pass in NULL for the shutdown parm, which then gets fixed up to be svc_rpcb_cleanup if the service uses rpcbind. Simplify this by instead having the the only caller that requires it (lockd) pass in svc_rpcb_cleanup and get rid of the special casing. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:21 -05:00
Jeff Layton	ceff739c53	sunrpc: have svc_wake_up only deal with pool 0 The way that svc_wake_up works is a bit inefficient. It walks all of the available pools for a service and either wakes up a task in each one or sets the SP_TASK_PENDING flag in each one. When svc_wake_up is called, there is no need to wake up more than one thread to do this work. In practice, only lockd currently uses this function and it's single threaded anyway. Thus, this just boils down to doing a wake up of a thread in pool 0 or setting a single flag. Eliminate the for loop in this function and change it to just operate on pool 0. Also update the comments that sit above it and get rid of some code that has been commented out for years now. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:21 -05:00
Jeff Layton	4d5db3f536	sunrpc: convert sp_task_pending flag to use atomic bitops In a later patch, we'll want to be able to handle this flag without holding the sp_lock. Change this field to an unsigned long flags field, and declare a new flag in it that can be managed with atomic bitops. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:21 -05:00
Jeff Layton	779fb0f3af	sunrpc: move rq_splice_ok flag into rq_flags Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:21 -05:00
Jeff Layton	78b65eb3fd	sunrpc: move rq_dropme flag into rq_flags Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:20 -05:00
Jeff Layton	30660e04b0	sunrpc: move rq_usedeferral flag to rq_flags Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:20 -05:00
Jeff Layton	7501cc2bcf	sunrpc: move rq_local field to rq_flags Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:21:21 -05:00
Jeff Layton	4d152e2c9a	sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it In a later patch, we're going to need some atomic bit flags. Since that field will need to be an unsigned long, we mitigate that space consumption by migrating some other bitflags to the new field. Start with the rq_secure flag. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:21:20 -05:00
J. Bruce Fields	2941b0e91b	NFS client updates for Linux 3.19 Highlights include: Features: - NFSv4.2 client support for hole punching and preallocation. - Further RPC/RDMA client improvements. - Add more RPC transport debugging tracepoints. - Add RPC debugging tools in debugfs. Bugfixes: - Stable fix for layoutget error handling - Fix a change in COMMIT behaviour resulting from the recent io code updates -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUhRVTAAoJEGcL54qWCgDyfeUP/RoFo3ImTMbGxfcPJqoELjcO lZbQ+27pOE/whFDkWgiOVTwlgGct5a0WRo7GCZmpYJA4q1kmSv4ngTb3nMTCUztt xMJ0mBr0BqttVs+ouKiVPm3cejQXedEhttwWcloIXS8lNenlpL29Zlrx2NHdU8UU 13+souocj0dwIyTYYS/4Lm9KpuCYnpDBpP5ShvQjVaMe/GxJo6GyZu70c7FgwGNz Nh9onzZV3mz1elhfizlV38aVA7KWVXtLWIqOFIKlT2fa4nWB8Hc07miR5UeOK0/h r+icnF2qCQe83MbjOxYNxIKB6uiA/4xwVc90X4AQ7F0RX8XPWHIQWG5tlkC9jrCQ 3RGzYshWDc9Ud2mXtLMyVQxHVVYlFAe1WtdP8ZWb1oxDInmhrarnWeNyECz9xGKu VzIDZzeq9G8slJXATWGRfPsYr+Ihpzcen4QQw58cakUBcqEJrYEhlEOfLovM71k3 /S/jSHBAbQqiw4LPMw87bA5A6+ZKcVSsNE0XCtNnhmqFpLc1kKRrl5vaN+QMk5tJ v4/zR0fPqH7SGAJWYs4brdfahyejEo0TwgpDs7KHmu1W9zQ0LCVTaYnQuUmQjta6 WyYwIy3TTibdfR191O0E3NOW82Q/k/NBD6ySvabN9HqQ9eSk6+rzrWAslXCbYohb BJfzcQfDdx+lsyhjeTx9 =wOP3 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.19-1' into nfsd for-3.19 branch Mainly what I need is `860a0d9e51` "sunrpc: add some tracepoints in svc_rqst handling functions", which subsequent server rpc patches from jlayton depend on. I'm merging this later tag on the assumption that's more likely to be a tested and stable point.	2014-12-09 11:12:26 -05:00
Michael S. Tsirkin	dc9e51534b	af_packet: virtio 1.0 stubs This merely fixes sparse warnings, without actually adding support for the new APIs. Still working out the best way to enable the new functionality. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-12-09 12:06:32 +02:00
Fengguang Wu	fe70077197	Bluetooth: fix err_cast.cocci warnings net/bluetooth/smp.c:2650:9-16: WARNING: ERR_CAST can be used with tfm_aes Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)) Generated by: scripts/coccinelle/api/err_cast.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-09 08:06:51 +01:00
David S. Miller	6db70e3e1d	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2014-12-03 1) Fix a set but not used warning. From Fabian Frederick. 2) Currently we make sequence number values available to userspace only if we use ESN. Make the sequence number values also available for non ESN states. From Zhi Ding. 3) Remove socket policy hashing. We don't need it because socket policies are always looked up via a linked list. From Herbert Xu. 4) After removing socket policy hashing, we can use __xfrm_policy_link in xfrm_policy_insert. From Herbert Xu. 5) Add a lookup method for vti6 tunnels with wildcard endpoints. I forgot this when I initially implemented vti6. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 21:30:21 -05:00
Eyal Perry	892311f66f	ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 21:07:10 -05:00
Jiri Pirko	18b5427ae1	net_sched: cls_cgroup: remove unnecessary if since head->handle == handle (checked before), just assign handle. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:53:41 -05:00
Jiri Pirko	2f8a2965da	net_sched: cls_flow: remove duplicate assignments Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:53:41 -05:00
Jiri Pirko	6a659cd061	net_sched: cls_flow: remove faulty use of list_for_each_entry_rcu rcu variant is not correct here. The code is called by updater (rtnl lock is held), not by reader (no rcu_read_lock is held). Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:53:40 -05:00
Jiri Pirko	3fe6b49e2f	net_sched: cls_bpf: remove faulty use of list_for_each_entry_rcu rcu variant is not correct here. The code is called by updater (rtnl lock is held), not by reader (no rcu_read_lock is held). Signed-off-by: Jiri Pirko <jiri@resnulli.us> ACKed-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:53:40 -05:00
Jiri Pirko	472f583701	net_sched: cls_bpf: remove unnecessary iteration and use passed arg Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:53:40 -05:00
Jiri Pirko	e4386456ae	net_sched: cls_basic: remove unnecessary iteration and use passed arg Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:53:40 -05:00
Ying Xue	97ede29e80	tipc: convert name table read-write lock to RCU Convert tipc name table read-write lock to RCU. After this change, a new spin lock is used to protect name table on write side while RCU is applied on read side. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:57 -05:00
Ying Xue	834caafa3e	tipc: remove unnecessary INIT_LIST_HEAD When a list_head variable is seen as a new entry to be added to a list head, it's unnecessary to be initialized with INIT_LIST_HEAD(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:57 -05:00
Ying Xue	5492390a94	tipc: simplify relationship between name table lock and node lock When tipc name sequence is published, name table lock is released before name sequence buffer is delivered to remote nodes through its underlying unicast links. However, when name sequence is withdrawn, the name table lock is held until the transmission of the removal message of name sequence is finished. During the process, node lock is nested in name table lock. To prevent node lock from being nested in name table lock, while withdrawing name, we should adopt the same locking policy of publishing name sequence: name table lock should be released before message is sent. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:57 -05:00
Ying Xue	3493d25cfb	tipc: any name table member must be protected under name table lock As tipc_nametbl_lock is used to protect name_table structure, the lock must be held while all members of name_table structure are accessed. However, the lock is not obtained while a member of name_table structure - local_publ_count is read in tipc_nametbl_publish(), as a consequence, an inconsistent value of local_publ_count might be got. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:57 -05:00
Ying Xue	fb9962f3ce	tipc: ensure all name sequences are properly protected with its lock TIPC internally created a name table which is used to store name sequences. Now there is a read-write lock - tipc_nametbl_lock to protect the table, and each name sequence saved in the table is protected with its private lock. When a name sequence is inserted or removed to or from the table, its members might need to change. Therefore, in normal case, the two locks must be held while TIPC operates the table. However, there are still several places where we only hold tipc_nametbl_lock without proprerly obtaining name sequence lock, which might cause the corruption of name sequence. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:56 -05:00
Ying Xue	38622f4195	tipc: ensure all name sequences are released when name table is stopped As TIPC subscriber server is terminated before name table, no user depends on subscription list of name sequence when name table is stopped. Therefore, all name sequences stored in name table should be released whatever their subscriptions lists are empty or not, otherwise, memory leak might happen. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:56 -05:00
Ying Xue	993bfe5daf	tipc: make name table allocated dynamically Name table locking policy is going to be adjusted from read-write lock protection to RCU lock protection in the future commits. But its essential precondition is to convert the allocation way of name table from static to dynamic mode. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:56 -05:00
Ying Xue	1b61e70ad1	tipc: remove size variable from publ_list struct The size variable is introduced in publ_list struct to help us exactly calculate SKB buffer sizes needed by publications when all publications in name table are delivered in bulk in named_distribute(). But if publication SKB buffer size is assumed to MTU, the size variable in publ_list struct can be completely eliminated at the cost of wasting a bit memory space for last SKB. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Tero Aho <tero.aho@coriant.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:39:55 -05:00
Al Viro	ba00410b81	Merge branch 'iov_iter' into for-next	2014-12-08 20:39:29 -05:00
Joe Perches	60c04aecd8	udp: Neaten and reduce size of compute_score functions The compute_score functions are a bit difficult to read. Neaten them a bit to reduce object sizes and make them a bit more intelligible. Return early to avoid indentation and avoid unnecessary initializations. (allyesconfig, but w/ -O2 and no profiling) $ size net/ipv[46]/udp.o.* text data bss dec hex filename 28680 1184 25 29889 74c1 net/ipv4/udp.o.new 28756 1184 25 29965 750d net/ipv4/udp.o.old 17600 1010 2 18612 48b4 net/ipv6/udp.o.new 17632 1010 2 18644 48d4 net/ipv6/udp.o.old Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:28:47 -05:00
Willem de Bruijn	829ae9d611	net-timestamp: allow reading recv cmsg on errqueue with origin tstamp Allow reading of timestamps and cmsg at the same time on all relevant socket families. One use is to correlate timestamps with egress device, by asking for cmsg IP_PKTINFO. on AF_INET sockets, call the relevant function (ip_cmsg_recv). To avoid changing legacy expectations, only do so if the caller sets a new timestamping flag SOF_TIMESTAMPING_OPT_CMSG. on AF_INET6 sockets, IPV6_PKTINFO and all other recv cmsg are already returned for all origins. only change is to set ifindex, which is not initialized for all error origins. In both cases, only generate the pktinfo message if an ifindex is known. This is not the case for ACK timestamps. The difference between the protocol families is probably a historical accident as a result of the different conditions for generating cmsg in the relevant ip(v6)_recv_error function: ipv4: if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP) { ipv6: if (serr->ee.ee_origin != SO_EE_ORIGIN_LOCAL) { At one time, this was the same test bar for the ICMP/ICMP6 distinction. This is no longer true. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes v1 -> v2 large rewrite - integrate with existing pktinfo cmsg generation code - on ipv4: only send with new flag, to maintain legacy behavior - on ipv6: send at most a single pktinfo cmsg - on ipv6: initialize fields if not yet initialized The recv cmsg interfaces are also relevant to the discussion of whether looping packet headers is problematic. For v6, cmsgs that identify many headers are already returned. This patch expands that to v4. If it sounds reasonable, I will follow with patches 1. request timestamps without payload with SOF_TIMESTAMPING_OPT_TSONLY (http://patchwork.ozlabs.org/patch/366967/) 2. sysctl to conditionally drop all timestamps that have payload or cmsg from users without CAP_NET_RAW. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:20:48 -05:00
Willem de Bruijn	7ce875e5ec	ipv4: warn once on passing AF_INET6 socket to ip_recv_error One line change, in response to catching an occurrence of this bug. See also fix `f4713a3dfa` ("net-timestamp: make tcp_recvmsg call ...") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:20:48 -05:00
John W. Linville	81c412600f	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2014-12-08 13:58:58 -05:00
Rafael J. Wysocki	d3eaf5875e	Merge branch 'device-properties' * device-properties: leds: leds-gpio: Fix multiple instances registration without 'label' property leds: leds-gpio: Fix legacy GPIO number case ACPI / property: Drop size_prop from acpi_dev_get_property_reference() leds: leds-gpio: Convert gpio_blink_set() to use GPIO descriptors ACPI / GPIO: Document ACPI GPIO mappings API net: rfkill: gpio: Add default GPIO driver mappings for ACPI ACPI / GPIO: Driver GPIO mappings for ACPI GPIOs input: gpio_keys_polled: Make use of device property API leds: leds-gpio: Make use of device property API gpio: Support for unified device properties interface Driver core: Unified interface for firmware node properties input: gpio_keys_polled: Add support for GPIO descriptors leds: leds-gpio: Add support for GPIO descriptors gpio: sch: Consolidate core and resume banks gpio / ACPI: Add support for _DSD device properties misc: at25: Make use of device property API ACPI: Allow drivers to match using Device Tree compatible property Driver core: Unified device properties interface for platform firmware ACPI: Add support for device specific properties	2014-12-08 19:50:17 +01:00
Marcel Holtmann	9437d2edc3	Bluetooth: Fix generation of non-resolvable private addresses When the host decides to use a non-resolvable private address, it must ensure that this generated address does not match the public address of the controller. Add an extra check to ensure this required behavior. In addition rename the variable from urpa to nrpa and fix all of the comments in the code that use the term unresolvable instead of the term non-resolvable as used in the Bluetooth specification. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-08 09:21:14 +02:00
Steffen Klassert	f855691975	xfrm6: Fix the nexthdr offset in _decode_session6. xfrm_decode_session() was originally designed for the usage in the receive path where the correct nexthdr offset is stored in IP6CB(skb)->nhoff. Over time this function spread to code that is used in the output path (netfilter, vti) where IP6CB(skb)->nhoff is not set. As a result, we get a wrong nexthdr and the upper layer flow informations are wrong. This can leed to incorrect policy lookups. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-12-08 07:56:18 +01:00
Steffen Klassert	de3b7a06df	xfrm6: Fix transport header offset in _decode_session6. skb->transport_header might not be valid when we do a reverse decode because the ipv6 tunnel error handlers don't update it to the inner transport header. This leads to a wrong offset calculation and to wrong layer 4 informations. We fix this by using the size of the ipv6 header as the first offset. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-12-08 07:38:14 +01:00
Jeremiah Mahler	069f8457ae	can: fix spelling errors Fix various spelling errors in the comments of the CAN modules. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2014-12-07 21:22:05 +01:00
Jeremiah Mahler	b111b78c6e	can: eliminate banner[] variable and switch to pr_info() Several CAN modules use a design pattern with a banner[] variable at the top which defines a string that is used once during init to print the banner. The string is also embedded with KERN_INFO which makes it printk() specific. Improve the code by eliminating the banner[] variable and moving the string to where it is printed. Then switch from printk(KERN_INFO to pr_info() for the lines that were changed. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2014-12-07 21:22:01 +01:00
Marcel Holtmann	08f63cc502	Bluetooth: Check for force_lesc_support before rejecting SMP over BR/EDR The SMP over BR/EDR requests for cross-transport pairing should also accepted when the debugfs setting force_lesc_support has been enabled. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-07 20:03:58 +02:00
Marcel Holtmann	f9be9e8661	Bluetooth: Check for force_lesc_support when enabling SMP over BR/EDR The SMP over BR/EDR support for cross-transport pairing should also be enabled when the debugfs setting force_lesc_support has been enabled. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-06 09:51:41 +02:00
Alexei Starovoitov	89aa075832	net: sock: allow eBPF programs to be attached to sockets introduce new setsockopt() command: setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd)) where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...) and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER setsockopt() calls bpf_prog_get() which increments refcnt of the program, so it doesn't get unloaded while socket is using the program. The same eBPF program can be attached to multiple sockets. User task exit automatically closes socket which calls sk_filter_uncharge() which decrements refcnt of eBPF program Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-05 21:47:32 -08:00
Pravin B Shelar	f2a01517f2	openvswitch: Fix flow mask validation. Following patch fixes typo in the flow validation. This prevented installation of ARP and IPv6 flows. Fixes: `19e7a3df72` ("openvswitch: Fix NDP flow mask validation") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-05 21:42:16 -08:00
Tom Herbert	6fb2a75673	gre: Set inner mac header in gro complete Set the inner mac header to point to the GRE payload when doing GRO. This is needed if we proceed to send the packet through GRE GSO which now uses the inner mac header instead of inner network header to determine the length of encapsulation headers. Fixes: `14051f0452` ("gre: Use inner mac length when computing tunnel length") Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-05 21:18:34 -08:00
David S. Miller	244ebd9f8f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains netfilter updates for net-next. Basically, enhancements for xt_recent, skip zeroing of timer in conntrack, fix linking problem with recent redirect support for nf_tables, ipset updates and a couple of cleanups. More specifically, they are: 1) Rise maximum number per IP address to be remembered in xt_recent while retaining backward compatibility, from Florian Westphal. 2) Skip zeroing timer area in nf_conn objects, also from Florian. 3) Inspect IPv4 and IPv6 traffic from the bridge to allow filtering using using meta l4proto and transport layer header, from Alvaro Neira. 4) Fix linking problems in the new redirect support when CONFIG_IPV6=n and IP6_NF_IPTABLES=n. And ipset updates from Jozsef Kadlecsik: 5) Support updating element extensions when the set is full (fixes netfilter bugzilla id 880). 6) Fix set match with 32-bits userspace / 64-bits kernel. 7) Indicate explicitly when /0 networks are supported in ipset. 8) Simplify cidr handling for hash:net types. 9) Allocate the proper size of memory when /0 networks are supported. 10) Explicitly add padding elements to hash:net,net and hash:net,port, because the elements must be u32 sized for the used hash function. Jozsef is also cooking ipset RCU conversion which should land soon if they reach the merge window in time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-05 20:56:46 -08:00
John W. Linville	f700076a9d	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-12-05 14:12:24 -05:00
Marcel Holtmann	5a34bd5f5d	Bluetooth: Enable events for P-256 Public Key and DHKey commands When the LE Read Local P-256 Public Key command is supported, then enable its corresponding complete event. And when the LE Generate DHKey command is supported, enable its corresponding complete event as well. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 18:17:49 +02:00
Marcel Holtmann	4efbb2ce8b	Bluetooth: Add support for enabling Extended Scanner Filter Policies The new Extended Scanner Filter Policies feature has to be enabled by selecting the correct filter policy for the scan parameters. This patch does that when the controller has been enabled to use LE Privacy. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 18:17:19 +02:00
Marcel Holtmann	2f010b5588	Bluetooth: Add support for handling LE Direct Advertising Report events When the controller sends a LE Direct Advertising Report event, the host must confirm that the resolvable random address provided matches with its own identity resolving key. If it does, then that advertising report needs to be processed. If it does not match, the report needs to be ignored. This patch adds full support for handling these new reports and using them for device discovery and connection handling. This means when a Bluetooth controller supports the Extended Scanner Filter Policies, it is possible to use directed advertising with LE privacy. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 18:16:41 +02:00
Marcel Holtmann	4b71bba45c	Bluetooth: Enabled LE Direct Advertising Report event if supported When the controller supports the Extended Scanner Filter Policies, it supports the LE Direct Advertising Report event. However by default that event is blocked by the LE event mask. It is required to enable it during controller setup. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 18:15:33 +02:00
Varka Bhadram	bcb47aabf4	mac802154: use goto label on failure Signed-off-by: Varka Bhadram <varkab@cdac.in> Reviewed-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 14:18:42 +01:00
Marcel Holtmann	da25cf6a98	Bluetooth: Report invalid RSSI for service discovery and background scan When using Start Service Discovery and when background scanning is used to report devices, the RSSI is reported or the value 127 is provided in case RSSI in unavailable. For Start Discovery the value 0 is reported to keep backwards compatibility with the existing users. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 14:14:28 +02:00
Marcel Holtmann	efb2513fd6	Bluetooth: Fix discovery filter when no RSSI is available When no RSSI value is available then make sure that the result is filtered out when the RSSI threshold filter is active. This means that all Bluetooth 1.1 or earlier devices will not report any results when using a RSSI threshold filter. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 14:14:26 +02:00
Johan Hedberg	189f6ad21f	Bluetooth: Remove redundant reverse_base_uuid variable The mgmt.c file already has a bluetooth_base_uuid variable which has the exact same value as the reverse_base_uuid one. This patch removes the redundant variable. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:47:24 +01:00
Johan Hedberg	9981bdb05a	Bluetooth: Fix Get Conn Info to use cmd_complete callback This patch fixes the Get Connection Information mgmt command to take advantage of the new cmd_complete callback. This allows for great simplifications in the logic for constructing the cmd_complete event. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:28 +01:00
Johan Hedberg	ebf86aa3ae	Bluetooth: Fix initializing hci_conn RSSI to invalid value When we create the hci_conn object we should properly initialize the RSSI to HCI_RSSI_INVALID. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:10 +01:00
Johan Hedberg	69487371d1	Bluetooth: Convert Get Clock Info to use cmd_complete callback This patch converts the Get Clock Information mgmt command to take advantage of the new cmd_complete callback for pending commands. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:10 +01:00
Johan Hedberg	2922a94fcc	Bluetooth: Convert discovery commands to use cmd_complete callback This patch converts the Start/Stop Discovery mgmt commands to use the cmd_complete callback of struct pending_cmd. Since both of these commands return the same parameters as they take as input we can use the existing generic_cmd_complete() helper for this. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:10 +01:00
Johan Hedberg	d8b7b1e49a	Bluetooth: Convert Unpair Device to use cmd_complete callback This patch updates the Unpair Device code to take advantage of the cmd_complete callback of struct pending_cmd. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:10 +01:00
Johan Hedberg	04ab2749ea	Bluetooth: Convert Pair Device to use cmd_complete callback This patch converts the Pair Device mgmt command to use the new cmd_complete callback for pending mgmt commands. The already existing pairing_complete() function is exactly what's needed and doesn't need changing. In addition to getting the return parameters always right this patch actually fixes a reference counting bug and memory leak with the hci_conn that's attached to the pending mgmt command - something that would occur when powering off or unplugging the adapter while pairing is in progress. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:10 +01:00
Johan Hedberg	7776d1d805	Bluetooth: Use cmd_complete callback for authentication mgmt commands This patch converts the user confirmation & PIN code mgmt commands to take advantage of the new cmd_complete callback for pending mgmt commands. The patch also adds a new generic addr_cmd_complete() helper function to be used with commands that send a mgmt_addr_info response based on a mgmt_addr_info in the beginning of the command parameters. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:09 +01:00
Johan Hedberg	f5818c2241	Bluetooth: Convert Disconnect mgmt command to use cmd_complete callback This patch converts the Disconnect mgmt command to take advantage of the new cmd_complete callback that's part of the pending_cmd struct. There are many commands whose response parameters map 1:1 to the command parameters and Disconnect is one of them. This patch adds a generic_cmd_complete() function for such commands that can be reused in subsequent patches. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:09 +01:00
Johan Hedberg	323b0b885b	Bluetooth: Store parameter length with pending mgmt commands As preparation for making generic cmd_complete responses possible we'll need to track the parameter length in addition to just a pointer to them. This patch adds the necessary variable to the pending_cmd struct. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:09 +01:00
Johan Hedberg	1b9b5ee530	Bluetooth: Add callback to create proper cmd_complete events We've got a couple of generic scenarios where all pending mgmt commands are processed and responses are sent to them. These scenarios are powering off the adapter and removing the adapter. So far the code has been generating cmd_status responses with NOT_POWERED and INVALID_INDEX resposes respectively, but this violates the mgmt specification for commands that should always generate a cmd_complete. This patch adds support for specifying a callback for the pending_cmd context that each command handler can use for command-specific cmd_complete event generation. The actual per-command event generators will come in subsequent patches. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:46:09 +01:00
Stefan Schmidt	1db9960452	net/mac802154: No need for an extra space when casting Coding style cleanup. Signed-off-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:43:05 +01:00
Stefan Schmidt	a4dc132196	net/mac802154: Remove extra blank lines. Signed-off-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:43:05 +01:00
Stefan Schmidt	6c10dc05b8	net/ieee802154: Remove and add extra blank lines as needed. Some have been missing and some have been needed. Just cosmetics. Signed-off-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:43:05 +01:00
Stefan Schmidt	cad865dc4b	net/ieee802154: Make sure alignment matches parenthesis.. Follow coding style of the kernel. Signed-off-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:43:04 +01:00
Stefan Schmidt	a413e3fd4b	net/6lowpan: Remove FSF address from GPL statement. This might change and we already deliver a copy of the license with the kernel. This was already removed form the ieee802154 code but missed here. Signed-off-by: Stefan Schmidt <s.schmidt@samsung.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-05 12:43:04 +01:00
Marcel Holtmann	ee3c3ca5ba	Bluetooth: Clear discovery filter before starting background scan Currently the discovery filter information are only cleared when the actual discovery procedure has been stopped. To make sure that none of the filters interfere with the background scanning and its device found event reporting, clear the filter before starting background scanning. This means that the discovery filter is now cleared before either Start Discovery, Start Service Discovery or background scanning. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 13:13:01 +02:00
Marcel Holtmann	22078800c3	Bluetooth: Fix memory leaks from discovery filter UUID list In case of failure or when unplugging a controller, the allocated memory for the UUID list of the discovery filter is not freed. Use the newly introduced helper for reset the discovery filter and with that also freeing existing memory. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 13:13:00 +02:00
Marcel Holtmann	0256325ed6	Bluetooth: Add helper function for clearing the discovery filter The discovery filter allocates memory for its UUID list. So use a helper function to free it and reset it to default states. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 13:12:58 +02:00
Jakub Pawlowski	66ea9427e0	Bluetooth: Add support for Start Service Discovery command This patch adds support for the Start Service Discovery command. It does all the checks for command parameters and configured the discovery filter settings correctly. However the actual support for filtering will be added with another patch. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 12:37:37 +02:00
Jakub Pawlowski	799ce93df0	Bluetooth: Add logic for UUID filter handling The previous patch provided the framework for integrating the UUID filtering into the service discovery. This patch now provides the actual filter logic. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 12:37:36 +02:00
Marcel Holtmann	b487b9ce93	Bluetooth: Add framework for device found filtering based on UUID Using Start Service Discovery provides the option to specifiy a list of UUID that are used to filter out device found events. This patch provides the framework for hooking up the UUID filter. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 12:37:35 +02:00
Marcel Holtmann	bda157a400	Bluetooth: Filter device found events based on RSSI threshold Using Start Service Discovery allows to provide a RSSI threshold. This patch implements support for filtering out device found events based on the provided value. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 12:37:34 +02:00
Jakub Pawlowski	37eab042be	Bluetooth: Add extra discovery fields for storing filter information With the upcoming addition of support for Start Service Discovery, the discovery handling needs to filter on RSSI and UUID values. For that they need to be stored in the discovery handling. This patch adds the appropiate fields and also make sure they are reset when discovery has been stopped. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-05 12:37:33 +02:00
Al Viro	f77c80142e	bury struct proc_ns in fs/proc a) make get_proc_ns() return a pointer to struct ns_common b) mirror ns_ops in dentry->d_fsdata of ns dentries, so that is_mnt_ns_file() could get away with fewer dereferences. That way struct proc_ns becomes invisible outside of fs/proc/*.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-04 14:34:54 -05:00
Al Viro	33c429405a	copy address of proc_ns_ops into ns_common Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-04 14:34:47 -05:00
Al Viro	6344c433a4	new helpers: ns_alloc_inum/ns_free_inum take struct ns_common *, for now simply wrappers around proc_{alloc,free}_inum() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-04 14:34:36 -05:00
Al Viro	64964528b2	make proc_ns_operations work with struct ns_common * instead of void * We can do that now. And kill ->inum(), while we are at it - all instances are identical. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-04 14:34:17 -05:00
Al Viro	ff24870f46	netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-04 14:34:04 -05:00
Al Viro	435d5f4bb2	common object embedded into various struct ....ns for now - just move corresponding ->proc_inum instances over there Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-12-04 14:31:00 -05:00
John W. Linville	de51f1649a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg <johannes@sipsolutions.net> says: "This time I have Felix's no-status rate control work, which will allow drivers to work better with rate control even if they don't have perfect status reporting. In addition to this, a small hwsim fix from Patrik, one of the regulatory patches from Arik, and a number of cleanups and fixes I did myself. Of note is a patch where I disable CFG80211_WEXT so that compatibility is no longer selectable - this is intended as a wake-up call for anyone who's still using it, and is still easily worked around (it's a one-line patch) before we fully remove the code as well in the future." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-12-04 11:29:10 -05:00
John W. Linville	04bb7ecf88	NFC: 3.19 pull request This is the NFC pull request for 3.19. With this one we get: - NFC digital improvements for DEP support: Chaining, NACK and ATN support added. - NCI improvements: Support for p2p target, SE IO operand addition, SE operands extensions to support proprietary implementations, and a few fixes. - NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support, and SE IO operand addition. - A bunch of minor improvements and fixes for STMicro st21nfcb and st21nfca -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUfjvWAAoJEIqAPN1PVmxK46MP/1QFb093QmAK3AfMS7XBZdwg vgJkZIyFmTeRlDTonRJpv1MXZhPL+JHYK0wGANts1Ba04nZIgK9jzdcqjvd1niD1 Hk9ggHpfxsIl35FfRVcWAUpyb71Ykug2V396gJH2iiHvt2jHr8IsRWOWCVaItWMO ZpVpV58xkROEu115wjKdrmKijL0xf7/F70RlReV4tOLGip3E/bP7EhutF2s5D8Lt A5hfauQS3lUJIhlJ1IT59XtdnBSIS5D8OuNYjuJuFkPfIcXAHacEUBrta3KahQPy kFGYYjj+V0vZA9JQG/j6vomP4AI0pP4DPkwWqgmioaPshkOOktwf1l21IKQrC7aw h7EFEG6q5AxnL5JFbm4y+b5IEp8ALeVi5rv1xdRZ7/AW/n6FWGrkBHatsJceQgX+ wFhZ52USGHQUzWoJ6SC0O82U831/dgxWm5M42CV5/Bl9d69XX7Om9MWytdjQ94yB ERoSJcivc+Q30bAiZmvb44ghv3qMGNArlBVVlcN7+3rbz4XOWLhduqCJaUs12fgY lyGdNHxVVmEeEY9ByOjr6RD1pYpSDgEQSYvn5pbKadX+DqpXHQuGC4JHtbBkzKn0 Se9c442HRBjts0VprJ7wjLgbf5VfqFjtgsfqubD/mcTn61B4/t9zyGgV1VuI7UgL QxK0m0rB1nOxpdWFesl1 =WHHp -----END PGP SIGNATURE----- Merge tag 'nfc-next-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz <sameo@linux.intel.com> says: "NFC: 3.19 pull request This is the NFC pull request for 3.19. With this one we get: - NFC digital improvements for DEP support: Chaining, NACK and ATN support added. - NCI improvements: Support for p2p target, SE IO operand addition, SE operands extensions to support proprietary implementations, and a few fixes. - NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support, and SE IO operand addition. - A bunch of minor improvements and fixes for STMicro st21nfcb and st21nfca" Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-12-04 11:27:40 -05:00
Marcel Holtmann	8019044dcb	Bluetooth: Split triggering of discovery commands into separate function The actual process of compiling the correct HCI commands for triggering discovery is something that should be generic. So instead of mixing it into the Start Discover operation handling, split it out into its own function utilizing HCI request handling and just providing status in case of errors or invalid parameters. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-04 12:52:29 +02:00
Marcel Holtmann	11e6e25d05	Bluetooth: Use {start,stop}_discovery_complete handler for cmd_complete Sending the required cmd_complete for the management commands should be done in one place and not in multiple places. Especially for Start and Stop Discovery commands this is split into to sending it in case of failure from the complete handler, but in case of success from the event state update function triggering mgmt_discovering. This is way too convoluted and since hci_request serializes the HCI command processing, send the cmd_complete response from the complete handler for all cases. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-04 12:50:34 +02:00
Marcel Holtmann	f5a969f23b	Bluetooth: Simplify the error handling of Start Discovery command The Start Discovery command has some complicated code when it comes to error handling. With the future introduction of Start Service Discovery simplifying this makes it easier to read. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-04 12:50:21 +02:00
Marcel Holtmann	854bda1982	Bluetooth: Increment management interface revision This patch increments the management interface revision due to the addition of support for LE Secure Connection feature. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-03 21:58:27 +02:00
Marcel Holtmann	8ab9731d8c	Bluetooth: Increase minor version of core module With the addition of support for Bluetooth Low Energy Secure Connections feature, it makes sense to increase the minor version of the Bluetooth core module. The module version is not used anywhere, but it gives a nice extra hint for debugging purposes. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-12-03 21:58:25 +02:00
Johan Hedberg	580039e838	Bluetooth: Fix false-positive "uninitialized" compiler warning Some gcc versions don't seem to be able to properly track the flow of the smp_cmd_pairing_random() function and end up causing the following types of (false-positive) warnings: smp.c:1995:6: warning: ‘nb’ may be used uninitialized in this function [-Wmaybe-uninitialized] err = smp_g2(smp->tfm_cmac, pkax, pkbx, na, nb, &passkey); smp.c:1995:6: warning: ‘na’ may be used uninitialized in this function [-Wmaybe-uninitialized] err = smp_g2(smp->tfm_cmac, pkax, pkbx, na, nb, &passkey); ^ smp.c:1995:6: warning: ‘pkbx’ may be used uninitialized in this function [-Wmaybe-uninitialized] err = smp_g2(smp->tfm_cmac, pkax, pkbx, na, nb, &passkey); ^ smp.c:1995:6: warning: ‘pkax’ may be used uninitialized in this function [-Wmaybe-uninitialized] err = smp_g2(smp->tfm_cmac, pkax, pkbx, na, nb, &passkey); This patch fixes the issue by moving the pkax/pkbx and na/nb initialization earlier in the function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:22 +01:00
Johan Hedberg	7f376cd6dc	Bluetooth: Fix minor coding style issue in smp.c The convention for checking for NULL pointers is !ptr and not ptr == NULL. This patch fixes such an occurrence in smp.c. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:22 +01:00
Johan Hedberg	923e24143e	Bluetooth: Fix SMP debug key handling We need to keep debug keys around at least until the point that they are used - otherwise e.g. slave role behavior wouldn't work as there'd be no key to be looked up. The correct behavior should therefore be to return any stored keys but when we clean up the SMP context to remove the key from the hdev list if keeping debug keys around hasn't been requestsed. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:22 +01:00
Johan Hedberg	06edf8deb5	Bluetooth: Organize SMP crypto functions to logical sections This patch organizes the various SMP crypto functions so that the LE SC functions appear in one section and the legacy SMP functions in a separate one. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:22 +01:00
Johan Hedberg	cd08279762	Bluetooth: Fix missing const declarations in SMP functions Several SMP functions take read-only data. This patch fixes the declaration of these parameters to use the const specifier as appropriate. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:21 +01:00
Johan Hedberg	c7a3d57db6	Bluetooth: Introduce SMP_DBG macro for low-level debuging The various inputs & outputs of the crypto functions as well as the values of the ECDH keys can be considered security sensitive. They should therefore not end up in dmesg by mistake. This patch introduces a new SMP_DBG macro which requires explicit compilation with -DDEBUG to be enabled. All crypto related data logs now use this macro instead of BT_DBG. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:21 +01:00
Johan Hedberg	a29b073351	Bluetooth: Add basic LE SC OOB support for remote OOB data This patch adds basic OOB pairing support when we've received the remote OOB data. This includes tracking the remote r value (in smp->rr) as well as doing the appropriate f4() call when needed. Previously the OOB rand would have been stored in smp->rrnd however these are actually two independent values so we need separate variables for them. Na/Nb in the spec maps to smp->prnd/rrnd and ra/rb maps to smp->rr with smp->pr to come once local OOB data is supported. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:21 +01:00
Johan Hedberg	02b05bd8b0	Bluetooth: Set SMP OOB flag if OOB data is available If we have OOB data available for the remote device in question we should set the OOB flag appropriately in the SMP pairing request or response. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:21 +01:00
Johan Hedberg	86df9200c7	Bluetooth: Add support for adding remote OOB data for LE This patch adds proper support for passing LE OOB data to the hci_add_remote_oob_data() function. For LE the 192-bit values are not valid and should therefore be passed as NULL values. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:21 +01:00
Johan Hedberg	6928a9245f	Bluetooth: Store address type with OOB data To be able to support OOB data for LE pairing we need to store the address type of the remote device. This patch extends the relevant functions and data types with a bdaddr_type variable. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:21 +01:00
Johan Hedberg	81328d5cca	Bluetooth: Unify remote OOB data functions There's no need to duplicate code for the 192 vs 192+256 variants of the OOB data functions. This is also helpful to pave the way to support LE SC OOB data where only 256 bit data is provided. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:20 +01:00
Johan Hedberg	903b71c78d	Bluetooth: Add SC-only mode support for SMP When Secure Connections-only mode is enabled we should reject any pairing command that does not have Secure Connections set in the authentication requirements. This patch adds the appropriate logic for this to the command handlers of Pairing Request/Response and Security Request. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:20 +01:00
Johan Hedberg	b5ae344d4c	Bluetooth: Add full SMP BR/EDR support When doing SMP over BR/EDR some of the routines can be shared with the LE functionality whereas others needs to be split into their own BR/EDR specific branches. This patch implements the split of BR/EDR specific SMP code from the LE-only code, making sure SMP over BR/EDR works as specified. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:20 +01:00
Johan Hedberg	ef8efe4bf8	Bluetooth: Add skeleton for BR/EDR SMP channel This patch adds the very basic code for creating and destroying SMP L2CAP channels for BR/EDR connections. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:20 +01:00
Johan Hedberg	858cdc78be	Bluetooth: Add debugfs switch for forcing SMP over BR/EDR To make it possible to use LE SC functionality over BR/EDR with pre-4.1 controllers (that do not support BR/EDR SC links) it's useful to be able to force LE SC operations even over a traditional SSP protected link. This patch adds a debugfs switch to force a special debug flag which is used to skip the checks for BR/EDR SC support. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:20 +01:00
Johan Hedberg	fe8bc5ac67	Bluetooth: Add hci_conn flag for new link key generation For LE Secure Connections we want to trigger cross transport key generation only if a new link key was actually created during the BR/EDR connection. This patch adds a new flag to track this information. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:20 +01:00
Johan Hedberg	70157ef539	Bluetooth: Use debug keys for SMP when HCI_USE_DEBUG_KEYS is set The HCI_USE_DEBUG_KEYS flag is intended to force our side to always use debug keys for pairing. This means both BR/EDR SSP as well as SMP with LE Secure Connections. This patch updates the SMP code to use the debug keys instead of generating a random local key pair when the flag is set. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:19 +01:00
Johan Hedberg	1408bb6efb	Bluetooth: Add dummy handler for LE SC keypress notification Since we don not actively try to clear the keypress notification bit we might get these PDUs. To avoid failing the pairing process add a simple dummy handler for these for now. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:19 +01:00
Johan Hedberg	d3e54a876e	Bluetooth: Fix DHKey Check sending order for slave role According to the LE SC specification the initiating device sends its DHKey check first and the non-initiating devices sends its DHKey check as a response to this. It's also important that the non-initiating device doesn't send the response if it's still waiting for user input. In order to synchronize all this a new flag is added. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:19 +01:00
Johan Hedberg	38606f1418	Bluetooth: Add passkey entry support for LE SC The passkey entry mechanism involves either both sides requesting the user for a passkey, or one side requesting the passkey while the other one displays it. The behavior as far as SMP PDUs are concerned are considerably different from numeric comparison and therefore requires several new functions to handle it. In essence passkey entry involves both sides gradually committing to each bit of the passkey which involves 20 rounds of pairing confirm and pairing random PDUS being sent in both directions. This patch adds a new smp->passkey_round variable to track the current round of the passkey commitment and reuses the variables already present in struct hci_conn for the passkey and entered key count. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:19 +01:00
Johan Hedberg	e3befab970	Bluetooth: Fix BR/EDR Link Key type when derived through LE SC We need to set the correct Link Key type based on the properties of the LE SC pairing that it was derived from. If debug keys were used the type should be a debug key, and the authenticated vs unauthenticated information should be set on what kind of security level was reached. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:19 +01:00
Johan Hedberg	dddd3059e3	Bluetooth: Add support for SC just-works pairing If the just-works method was chosen we shouldn't send anything to user space but simply proceed with sending the DHKey Check PDU. This patch adds the necessary code for it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:19 +01:00
Johan Hedberg	d378a2d776	Bluetooth: Set correct LTK type and authentication for SC After generating the LTK we should set the correct type (normal SC or debug) and authentication information for it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:18 +01:00
Johan Hedberg	6c0dcc5014	Bluetooth: Add check for accidentally generating a debug key It is very unlikely, but to have a 100% guarantee of the generated key type we need to reject any keys which happen to match the debug key. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:18 +01:00
Johan Hedberg	aeb7d461f9	Bluetooth: Detect SMP SC debug keys We need to be able to detect if the remote side used a debug key for the pairing. This patch adds the debug key defines and sets a flag to indicate that a debug key was used. The debug private key (debug_sk) is also added in this patch but will only be used in a subsequent patch when local debug key support is implemented. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:18 +01:00
Johan Hedberg	5e3d3d9b3c	Bluetooth: Add selection of the SC authentication method This patch adds code to select the authentication method for Secure Connections based on the local and remote capabilities. A new DSP_PASSKEY method is also added for displaying the passkey - something that is not part of legacy SMP pairing. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:18 +01:00
Johan Hedberg	783e057462	Bluetooth: Track authentication method in SMP context For Secure Connections we'll select the authentication method as soon as we receive the public key, but only use it later (both when actually triggering the method as well as when determining the quality of the resulting LTK). Store the method therefore in the SMP context. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:18 +01:00
Johan Hedberg	6a77083af5	Bluetooth: Add support for LE SC key generation As the last step of the LE SC pairing process it's time to generate and distribute keys. The generation part is unique to LE SC and so this patch adds a dedicated function for it. We also clear the distribution bits for keys which are not distributed with LE SC, so that the code shared with legacy SMP will not go ahead and try to distribute them. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:18 +01:00
Johan Hedberg	6433a9a2c4	Bluetooth: Add support for LE SC DHKey check PDU Once we receive the DHKey check PDU it's time to first verify that the value is correct and then proceed with encrypting the link. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:17 +01:00
Johan Hedberg	760b018b6c	Bluetooth: Add support for handling LE SC user response With LE SC, once the user has responded to the numeric comparison it's time to send DHKey check values in both directions. The DHKey check value is generated using new smp_f5 and smp_f6 cryptographic functions. The smp_f5 function is responsible for generating the LTK and the MacKey values whereas the smp_f6 function takes the MacKey as input and generates the DHKey Check value. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:17 +01:00
Johan Hedberg	191dc7fe2d	Bluetooth: Add support for LE SC numeric comparison After the Pairing Confirm and Random PDUs have been exchanged in LE SC it's time to generate a numeric comparison value using a new smp_g2 cryptographic function (which also builds on AES-CMAC). This patch adds the smp_g2 implementation and updates the Pairing Random PDU handler to proceed with the value genration and user confirmation. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:17 +01:00
Johan Hedberg	dcee2b3221	Bluetooth: Add LE SC support for responding to Pairing Confirm PDU When LE SC is being used we should always respond to it by sending our local random number. This patch adds a convenience function for it which also contains a check for the pre-requisite public key exchange completion Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:17 +01:00
Johan Hedberg	cbbbe3e242	Bluetooth: Add support for sending LE SC Confirm value Once the public key exchange is complete the next step is for the non-initiating device to send a SMP Pairing Confirm PDU to the initiating device. This requires the use of a new smp_f4 confirm value generation function which in turn builds on the AES-CMAC cryptographic function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:17 +01:00
Johan Hedberg	d8f8edbe93	Bluetooth: Add handler function for receiving LE SC public key This patch adds a handler function for the LE SC SMP Public Key PDU. When we receive the key we proceed with generating the shared DHKey value from the remote public key and local private key. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:17 +01:00
Johan Hedberg	3b19146d23	Bluetooth: Add basic support for sending our LE SC public key When the initial pairing request & response PDUs have been exchanged and both have had the LE SC bit set the next step is to generate a ECDH key pair and to send the public key to the remote side. This patch adds basic support for generating the key pair and sending the public key using the new Public Key SMP PDU. It is the initiating device that sends the public key first and the non-initiating device responds by sending its public key respectively (in a subsequent patch). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:17 +01:00
Johan Hedberg	05ddb47a91	Bluetooth: Add ECC library for LE Secure Connections This patch adds a simple ECC library that will act as a fundamental building block for LE Secure Connections. The library has a simple API consisting of two functions: one for generating a public/private key pair and another one for generating a Diffie-Hellman key from a local private key and a remote public key. The code has been taken from https://github.com/kmackay/easy-ecc and modified to conform with the kernel coding style. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:16 +01:00
Johan Hedberg	407cecf6c7	Bluetooth: Add basic support for AES-CMAC Most of the LE Secure Connections SMP crypto functions build on top of the AES-CMAC function. This patch adds access to AES-CMAC in the kernel crypto subsystem by allocating a crypto_hash handle for it in a similar way that we have one for AES-CBC. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:16 +01:00
Johan Hedberg	df8e1a4c73	Bluetooth: Set link key generation bit if necessary for LE SC Depending on whether Secure Connections is enabled or not we may need to add the link key generation bit to the key distribution. This patch does the necessary modifications to the build_pairing_cmd() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:16 +01:00
Johan Hedberg	f3a73d97b3	Bluetooth: Rename hci_find_ltk_by_addr to hci_find_ltk Now that hci_find_ltk_by_addr is the only LTK lookup function there's no need to keep the long name anymore. This patch shortens the function name to simply hci_find_ltk. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:16 +01:00
Johan Hedberg	0ac3dbf999	Bluetooth: Remove unused hci_find_ltk function Now that LTKs are always looked up based on bdaddr (with EDiv/Rand checks done after a successful lookup) the hci_find_ltk function is not needed anymore. This patch removes the function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:16 +01:00
Johan Hedberg	5378bc5622	Bluetooth: Update LTK lookup to correctly deal with SC LTKs LTKs derived from Secure Connections based pairing are symmetric, i.e. they should match both master and slave role. This patch updates the LTK lookup functions to ignore the desired role when dealing with SC LTKs. Furthermore, with Secure Connections the EDiv and Rand values are not used and should always be set to zero. This patch updates the LTK lookup to first use the bdaddr as key and then do the necessary verifications of EDiv and Rand based on whether the found LTK is for SC or not. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:16 +01:00
Johan Hedberg	a3209694f8	Bluetooth: Add mgmt_set_secure_conn support for any LE adapter Since LE Secure Connections is a purely host-side feature we should offer the Secure Connections mgmt setting for any adapter with LE support. This patch updates the supported settings value and the set_secure_conn command handler accordingly. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:15 +01:00
Johan Hedberg	710f11c08e	Bluetooth: Use custom macro for testing BR/EDR SC enabled Since the HCI_SC_ENABLED flag will also be used for controllers without BR/EDR Secure Connections support whenever we need to check specifically for SC for BR/EDR we also need to check that the controller actually supports it. This patch adds a convenience macro for check all the necessary conditions and converts the places in the code that need it to use it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:15 +01:00
Johan Hedberg	8f5eeca321	Bluetooth: Set the correct security level for SC LTKs When the looked-up LTK is one generated by Secure Connections pairing the security level it gives is BT_SECURITY_FIPS. This patch updates the LTK request event handler to correctly set this level. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:15 +01:00
Johan Hedberg	23fb8de376	Bluetooth: Add mgmt support for LE Secure Connections LTK types We need a dedicated LTK type for LTK resulting from a Secure Connections based SMP pairing. This patch adds a new define for it and ensures that both the New LTK event as well as the Load LTKs command supports it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:15 +01:00
Johan Hedberg	d2eb9e10f7	Bluetooth: Update SMP security level to/from auth_req for SC This patch updates the functions which map the SMP authentication request to a security level and vice-versa to take into account the Secure Connections feature. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:15 +01:00
Johan Hedberg	6566877694	Bluetooth: Add SMP flag for SC and set it when necessary. This patch adds a new SMP flag for tracking whether Secure Connections is in use and sets the flag when both remote and local side have elected to use Secure Connections. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:15 +01:00
Johan Hedberg	0edb14de56	Bluetooth: Make auth_req mask dependent on SC enabled or not If we haven't enabled SC support on our side we should use the same mask for the authentication requirement as we were using before SC support was added, otherwise we should use the extended mask for SC. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:14 +01:00
Johan Hedberg	e65392e2cc	Bluetooth: Add basic SMP defines for LE Secure Connections This patch adds basic SMP defines for commands, error codes and PDU definitions for the LE Secure Connections feature. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 16:51:14 +01:00
Jozsef Kadlecsik	cac3763967	netfilter: ipset: Explicitly add padding elements to hash:net, net and hash:net, port, net The elements must be u32 sized for the used hash function. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-03 12:43:36 +01:00
Jozsef Kadlecsik	77b4311d20	netfilter: ipset: Allocate the proper size of memory when /0 networks are supported Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-03 12:43:36 +01:00
Jozsef Kadlecsik	25a76f3463	netfilter: ipset: Simplify cidr handling for hash:net types Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-03 12:43:36 +01:00
Jozsef Kadlecsik	59de79cf57	netfilter: ipset: Indicate when /0 networks are supported Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-03 12:43:36 +01:00
Jozsef Kadlecsik	a51b9199b1	netfilter: ipset: Alignment problem between 64bit kernel 32bit userspace Sven-Haegar Koch reported the issue: sims:~# iptables -A OUTPUT -m set --match-set testset src -j ACCEPT iptables: Invalid argument. Run `dmesg' for more information. In syslog: x_tables: ip_tables: set.3 match: invalid size 48 (kernel) != (user) 32 which was introduced by the counter extension in ipset. The patch fixes the alignment issue with introducing a new set match revision with the fixed underlying 'struct ip_set_counter_match' structure. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-03 12:43:35 +01:00
Jozsef Kadlecsik	86ac79c7be	netfilter: ipset: Support updating extensions when the set is full When the set was full (hash type and maxelem reached), it was not possible to update the extension part of already existing elements. The patch removes this limitation. Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=880 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-12-03 12:43:34 +01:00
Johan Hedberg	82c13d42bb	Bluetooth: Simplify Link Key Notification event handling logic When we get a Link Key Notification HCI event we should already have a hci_conn object. This should have been created either in the Connection Request event handler, the hci_connect_acl() function or the hci_cs_create_conn() function (if the request was not sent by the kernel). Since the only case that we'd end up not having a hci_conn in the Link Key Notification event handler would be essentially broken hardware it's safe to simply bail out from the function if this happens. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-03 10:39:43 +01:00
Scott Feldman	2c3c031c8f	bridge: add brport flags to dflt bridge_getlink To allow brport device to return current brport flags set on port. Add returned flags to nested IFLA_PROTINFO netlink msg built in dflt getlink. With this change, netlink msg returned for bridge_getlink contains the port's offloaded flag settings (the port's SELF settings). Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:24 -08:00
Scott Feldman	065c212a9e	bridge: move private brport flags to if_bridge.h so port drivers can use flags Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:22 -08:00
Scott Feldman	cf6b8e1eed	bridge: add API to notify bridge driver of learned FBD on offloaded device When the swdev device learns a new mac/vlan on a port, it sends some async notification to the driver and the driver installs an FDB in the device. To give a holistic system view, the learned mac/vlan should be reflected in the bridge's FBD table, so the user, using normal iproute2 cmds, can view what is currently learned by the device. This API on the bridge driver gives a way for the swdev driver to install an FBD entry in the bridge FBD table. (And remove one). This is equivalent to the device running these cmds: bridge fdb [add\|del] <mac> dev <dev> vid <vlan id> master This patch needs some extra eyeballs for review, in paricular around the locking and contexts. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:22 -08:00
Scott Feldman	38dcf357ae	bridge: call netdev_sw_port_stp_update when bridge port STP status changes To notify switch driver of change in STP state of bridge port, add new .ndo op and provide switchdev wrapper func to call ndo op. Use it in bridge code then. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:22 -08:00
Jiri Pirko	aecbe01e74	net-sysfs: expose physical switch id for particular device Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:22 -08:00
Jiri Pirko	82f2841291	rtnl: expose physical switch id for particular device The netdevice represents a port in a switch, it will expose IFLA_PHYS_SWITCH_ID value via rtnl. Two netdevices with the same value belong to one physical switch. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:21 -08:00
Jiri Pirko	007f790c82	net: introduce generic switch devices support The goal of this is to provide a possibility to support various switch chips. Drivers should implement relevant ndos to do so. Now there is only one ndo defined: - for getting physical switch id is in place. Note that user can use random port netdevice to access the switch. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:20 -08:00
Jiri Pirko	02637fce3e	net: rename netdev_phys_port_id to more generic name So this can be reused for identification of other "items" as well. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:19 -08:00
Jiri Pirko	f6f6424ba7	net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del Do the work of parsing NDA_VLAN directly in rtnetlink code, pass simple u16 vid to drivers from there. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:18 -08:00
Jiri Pirko	93859b13fa	bridge: convert flags in fbd entry into bitfields Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:17 -08:00
Jiri Pirko	020ec6ba2a	bridge: rename fdb__hw to fdb__hw_addr to avoid confusion The current name might seem that this actually offloads the fdb entry to hw. So rename it to clearly present that this for hardware address addition/removal. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:16 -08:00
Julien Lefrique	e479ce4797	NFC: NCI: Fix max length of General Bytes in ATR_RES The maximum size of ATR_REQ and ATR_RES is 64 bytes. The maximum number of General Bytes is calculated by the maximum number of data bytes in the ATR_REQ/ATR_RES, substracted by the number of mandatory data bytes. ATR_REQ: 16 mandatory data bytes, giving a maximum of 48 General Bytes. ATR_RES: 17 mandatory data bytes, giving a maximum of 47 General Bytes. Regression introduced in commit `a99903ec`. Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:59:28 +01:00
Christophe Ricard	3ff24012dd	NFC: nci: Fix warning: cast to restricted __le16 Fixing: net/nfc/nci/ntf.c:106:31: warning: cast to restricted __le16 message when building with make C=1 CF=-D__CHECK_ENDIAN__ Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:49:17 +01:00
Christophe Ricard	e5b53c0a2e	NFC: Fix warning "warning: incorrect type in assignment" Fix warnings: net/nfc/llcp_commands.c:421:14: warning: incorrect type in assignment (different base types) net/nfc/llcp_commands.c:421:14: expected unsigned short [unsigned] [usertype] miux net/nfc/llcp_commands.c:421:14: got restricted __be16 net/nfc/llcp_commands.c:477:14: warning: incorrect type in assignment (different base types) net/nfc/llcp_commands.c:477:14: expected unsigned short [unsigned] [usertype] miux net/nfc/llcp_commands.c:477:14: got restricted __be16 Procedure to reproduce: make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:49:15 +01:00
Christophe Ricard	b3a55b9c5d	NFC: hci: Add specific hci macro to not create a pipe Some pipe are only created by other host (different than the Terminal Host). The pipe values will for example be notified by NFC_HCI_ADM_NOTIFY_PIPE_CREATED. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:48:13 +01:00
Christophe Ricard	cd96db6fd0	NFC: Add se_io NFC operand se_io allows to send apdu over the CLF to the embedded Secure Element. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:47:56 +01:00
Christophe Ricard	3682f49f32	NFC: netlink: Add new netlink command NFC_CMD_ACTIVATE_TARGET Some tag might get deactivated after some read or write tentative. This may happen for example with Mifare Ultralight C tag when trying to read the last 4 blocks (starting block 0x2c) configured as write only. NFC_CMD_ACTIVATE_TARGET will try to reselect the tag in order to detect if it got remove from the field or if it is still present. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:47:37 +01:00
Christophe Ricard	9295b5b569	NFC: nci: Add support for different NCI_DEACTIVATE_TYPE nci_rf_deactivate_req only support NCI_DEACTIVATE_TYPE_IDLE_MODE. In some situation, it might be necessary to be able to support other NCI_DEACTIVATE_TYPE such as NCI_DEACTIVATE_TYPE_SLEEP_MODE in order for example to reactivate the selected target. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:47:17 +01:00
Christophe Ricard	4391590c40	NFC: nci: Add management for NCI state for machine rf_deactivate_ntf A notification for rf deaction can be IDLE_MODE, SLEEP_MODE, SLEEP_AF_MODE and DISCOVERY. According to each type and the NCI state machine is different (see figure 10 RF Communication State Machine in NCI specification) Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:47:07 +01:00
Christophe Ricard	98ff416f97	NFC: nci: Add status byte management in case of error. The nci status byte was ignored. In case of tag reading for example, if the tag is removed from the antenna there is no way for the upper layers (aka: stack) to get inform about such event. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 22:46:47 +01:00
Johan Hedberg	0bd49fc75a	Bluetooth: Track both local and remote L2CAP fixed channel mask To pave the way for future fixed channels to be added easily we should track both the local and remote mask on a per-L2CAP connection (struct l2cap_conn) basis. So far the code has used a global variable in a racy way which anyway needs fixing. This patch renames the existing conn->fixed_chan_mask that tracked the remote mask to conn->remote_fixed_chan and adds a new variable conn->local_fixed_chan to track the local mask. Since the HS support info is now available in the local mask we can remove the conn->hs_enabled variable. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-12-02 09:26:50 +01:00
Christophe Ricard	a2ae218298	NFC: hci: Add support for NOTIFY_ALL_PIPE_CLEARED When switching from UICC to another, the CLF may signals to the Terminal Host that some existing pipe are cleared for future update. This notification needs to be "acked" by the Terminal Host with a ANY_OK message. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 02:02:00 +01:00
Christophe Ricard	deff5aa469	NFC: hci: Add open pipe command handler If our terminal connect with other host like UICC, it may create a pipe with us, the host controller will notify us new pipe created, after that UICC will open that pipe, if we don't handle that request, UICC may failed to continue initialize which may lead to card emulation feature failed to work Signed-off-by: Arron Wang <arron.wang@intel.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 02:02:00 +01:00
Christophe Ricard	a688bf55c5	NFC: nci: Add se_io NCI operand se_io allows to send apdu over the CLF to the embedded Secure Element. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 02:01:21 +01:00
Christophe Ricard	e9ef9431a3	NFC: nci: Update nci_disable_se to run proprietary commands to disable a secure element Some NFC controller using NCI protocols may need a proprietary commands flow to disable a secure element Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 02:01:21 +01:00
Christophe Ricard	93bca2bfa4	NFC: nci: Update nci_enable_se to run proprietary commands to enable a secure element Some NFC controller using NCI protocols may need a proprietary commands flow to enable a secure element Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 02:01:21 +01:00
Christophe Ricard	ba4db551bb	NFC: nci: Update nci_discover_se to run proprietary commands to discover all available secure element Some NFC controller using NCI protocols may need a proprietary commands flow to discover all available secure element Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 02:01:21 +01:00
Christophe Ricard	c7dea2525b	NFC: nci: Fix sparse: symbol 'nci_get_prop_rf_protocol' was not declared. Fix sparse warning introduced by commit: `9e87f9a9c4` It was generating the following warning: net/nfc/nci/ntf.c:170:7: sparse: symbol 'nci_get_prop_rf_protocol' was not declared. Should it be static? Procedure to reproduce it: # apt-get install sparse git checkout `9e87f9a9c4` make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 01:50:42 +01:00
Christophe Ricard	9b8d32b7ac	NFC: hci: Add se_io HCI operand se_io allows to send apdu over the CLF to the embedded Secure Element. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-12-02 01:49:58 +01:00
John W. Linville	992066c8d3	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-12-01 15:49:58 -05:00
Jeff Layton	067f96ef17	sunrpc: release svc_pool_map reference when serv allocation fails Currently, it leaks when the allocation fails. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-01 12:45:27 -07:00
Jeff Layton	8d65ef760d	sunrpc: eliminate the XPT_DETACHED flag All it does is indicate whether a xprt has already been deleted from a list or not, which is unnecessary since we use list_del_init and it's always set and checked under the sv_lock anyway. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-01 12:45:26 -07:00
Nicolas Dichtel	e0ebde0e13	rtnetlink: release net refcnt on error in do_setlink() rtnl_link_get_net() holds a reference on the 'struct net', we need to release it in case of error. CC: Eric W. Biederman <ebiederm@xmission.com> Fixes: `b51642f6d7` ("net: Enable a userns root rtnl calls that are safe for unprivilged users") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-29 21:05:43 -08:00
David S. Miller	60b7379dc5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-11-29 20:47:48 -08:00
Steven Noonan	4338c57259	netfilter: nf_log_ipv6: correct typo in module description It incorrectly identifies itself as "IPv4" packet logging. Signed-off-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-28 17:28:17 +01:00
Felix Fietkau	f027c2aca0	mac80211: add ieee80211_tx_status_noskb This can be used by drivers that cannot reliably map tx status information onto specific skbs. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 15:01:51 +01:00
Felix Fietkau	7e1cdcbb09	mac0211: add a helper function for fixing up tx status rates Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 15:01:50 +01:00
Felix Fietkau	ad9dda6383	mac80211: pass tx info to ieee80211_lost_packet instead of an skb Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 15:01:50 +01:00
Felix Fietkau	63558a650a	mac80211: minstrel_ht: switch to .tx_status_noskb Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 15:01:50 +01:00
Felix Fietkau	fb7acfb8e1	mac80211: minstrel: switch to .tx_status_noskb Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 15:01:50 +01:00
Felix Fietkau	f684565e0a	mac80211: add tx_status_noskb to rate_control_ops This op works like .tx_status, except it does not need access to the skb. This will be used by drivers that cannot match tx status information to specific packets. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 15:01:50 +01:00
Felix Fietkau	95943425c0	mac80211: minstrel_ht: move aggregation check to .get_rate() Preparation for adding a no-skb tx status path Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 15:01:50 +01:00
Johannes Berg	ea372c5452	cfg80211: remove unneeded initialisations in nl80211_set_reg Some variables are assigned unconditionally, remove their initialisations to help avoid introducing errors later. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 14:54:31 +01:00
Arik Nemtsov	ad932f046f	cfg80211: leave invalid channels on regdomain change When the regulatory settings change, some channels might become invalid. Disconnect interfaces acting on these channels, after giving userspace code a grace period to leave them. This mode is currently opt-in, and not all interface operating modes are supported for regulatory-enforcement checks. A wiphy that wishes to use the new enforcement code must specify an appropriate regulatory flag, and all its supported interface modes must be supported by the checking code. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> [fix some indentation, typos] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 14:33:41 +01:00
Felix Fietkau	336004e291	mac80211: add more missing checks for VHT tx rates Fixes a crash on attempting to calculate the frame duration for a VHT packet (which needs to be handled by hw/driver instead). Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-28 14:24:23 +01:00
Julien Lefrique	d7979e130e	NFC: NCI: Signal deactivation in Target mode Before signaling the deactivation, send a deactivation request if in RFST_DISCOVERY state because neard assumes polling is stopped and will try to restart it. Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00
Julien Lefrique	6ff5462b67	NFC: NCI: Handle Discovery deactivation type When the deactivation type reported by RF_DEACTIVATE_NTF is Discovery, go in RFST_DISCOVERY state. The NFCC stays in Poll mode and/or Listen mode. Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00
Julien Lefrique	966efbfb0d	NFC: Fix a memory leak Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00
Julien Lefrique	122c195872	NFC: NCI: Forward data received in Target mode to nfc core Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00
Julien Lefrique	485f442fd5	NFC: NCI: Implement Target mode send function As specified in NCI 1.0 and NCI 1.1, when using the NFC-DEP RF Interface, the DH and the NFCC shall only use the Static RF Connection for data communication with a Remote NFC Endpoint. Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00
Julien Lefrique	529ee06682	NFC: NCI: Configure ATR_RES general bytes The Target responds to the ATR_REQ with the ATR_RES. Configure the General Bytes in ATR_RES with the first three octets equal to the NFC Forum LLCP magic number, followed by some LLC Parameters TLVs described in section 4.5 of [LLCP]. Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00
Julien Lefrique	a99903ec45	NFC: NCI: Handle Target mode activation Changes: * Extract the Listen mode activation parameters from RF_INTF_ACTIVATED_NTF. * Store the General Bytes of ATR_REQ. * Signal that Target mode is activated in case of an activation in NFC-DEP. * Update the NCI state accordingly. * Use the various constants defined in nfc.h. * Fix the ATR_REQ and ATR_RES maximum size. As per NCI 1.0 and NCI 1.1, the Activation Parameters for both Poll and Listen mode contain all the bytes of ATR_REQ/ATR_RES starting and including Byte 3 as defined in [DIGITAL]. In [DIGITAL], the maximum size of ATR_REQ/ATR_RES is 64 bytes and they are numbered starting from Byte 1. Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00
Julien Lefrique	90d78c1396	NFC: NCI: Enable NFC-DEP in Listen A and Listen F Send LA_SEL_INFO and LF_PROTOCOL_TYPE with NFC-DEP protocol enabled. Configure 212 Kbit/s and 412 Kbit/s bit rates for Listen F. Signed-off-by: Julien Lefrique <lefrique@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-11-28 14:07:51 +01:00

... 36 37 38 39 40 ...

39145 Commits