Add NL80211_CMD_UPDATE_FT_IES to support update of FT IEs to the WLAN
driver and NL80211_CMD_FT_EVENT to send FT events from the WLAN driver.
This will carry the target AP's MAC address along with the relevant
Information Elements. This event is used to report received FT IEs
(MDIE, FTIE, RSN IE, TIE, RICIE). These changes allow FT to be supported
with drivers that use an internal SME instead of user space option (like
FT implementation in wpa_supplicant with mac80211-based drivers).
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some devices can handle remain on channel requests differently
based on the request type/priority. Add support to
differentiate between different ROC types, i.e., indicate that
the ROC is required for sending managment frames.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
For testing it's sometimes useful to be able to
override certain VHT capability advertisement,
add the ability to do that in cfg80211.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The station change API isn't being checked properly before
drivers are called, and as a result it is difficult to see
what should be allowed and what not.
In order to comprehensively check the API parameters parse
everything first, and then have the driver call a function
(cfg80211_check_station_change()) with the additionally
information about the kind of station that is being changed;
this allows the function to make better decisions than the
old code could.
While at it, also add a few checks, particularly in mesh
and clarify the TDLS station lifetime in documentation.
To be able to reduce a few checks, ignore any flag set bits
when the mask isn't set, they shouldn't be applied then.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
All the pointers point right into the skb data and
not to anything that would be useful to change, so
make them const.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Make the ability to leave the plink_state unchanged not use a
magic -1 variable that isn't in the enum, but an explicit change
flag; reject invalid plink states or actions and move the needed
constants for plink actions to the right header file. Also
reject plink_state changes for non-mesh interfaces.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pull networking fixes from David Miller:
"A moderately sized pile of fixes, some specifically for merge window
introduced regressions although others are for longer standing items
and have been queued up for -stable.
I'm kind of tired of all the RDS protocol bugs over the years, to be
honest, it's way out of proportion to the number of people who
actually use it.
1) Fix missing range initialization in netfilter IPSET, from Jozsef
Kadlecsik.
2) ieee80211_local->tim_lock needs to use BH disabling, from Johannes
Berg.
3) Fix DMA syncing in SFC driver, from Ben Hutchings.
4) Fix regression in BOND device MAC address setting, from Jiri
Pirko.
5) Missing usb_free_urb in ISDN Hisax driver, from Marina Makienko.
6) Fix UDP checksumming in bnx2x driver for 57710 and 57711 chips,
fix from Dmitry Kravkov.
7) Missing cfgspace_lock initialization in BCMA driver.
8) Validate parameter size for SCTP assoc stats getsockopt(), from
Guenter Roeck.
9) Fix SCTP association hangs, from Lee A Roberts.
10) Fix jumbo frame handling in r8169, from Francois Romieu.
11) Fix phy_device memory leak, from Petr Malat.
12) Omit trailing FCS from frames received in BGMAC driver, from Hauke
Mehrtens.
13) Missing socket refcount release in L2TP, from Guillaume Nault.
14) sctp_endpoint_init should respect passed in gfp_t, rather than use
GFP_KERNEL unconditionally. From Dan Carpenter.
15) Add AISX AX88179 USB driver, from Freddy Xin.
16) Remove MAINTAINERS entries for drivers deleted during the merge
window, from Cesar Eduardo Barros.
17) RDS protocol can try to allocate huge amounts of memory, check
that the user's request length makes sense, from Cong Wang.
18) SCTP should use the provided KMALLOC_MAX_SIZE instead of it's own,
bogus, definition. From Cong Wang.
19) Fix deadlocks in FEC driver by moving TX reclaim into NAPI poll,
from Frank Li. Also, fix a build error introduced in the merge
window.
20) Fix bogus purging of default routes in ipv6, from Lorenzo Colitti.
21) Don't double count RTT measurements when we leave the TCP receive
fast path, from Neal Cardwell."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
tcp: fix double-counted receiver RTT when leaving receiver fast path
CAIF: fix sparse warning for caif_usb
rds: simplify a warning message
net: fec: fix build error in no MXC platform
net: ipv6: Don't purge default router if accept_ra=2
net: fec: put tx to napi poll function to fix dead lock
sctp: use KMALLOC_MAX_SIZE instead of its own MAX_KMALLOC_SIZE
rds: limit the size allocated by rds_message_alloc()
MAINTAINERS: remove eexpress
MAINTAINERS: remove drivers/net/wan/cycx*
MAINTAINERS: remove 3c505
caif_dev: fix sparse warnings for caif_flow_cb
ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver
sctp: use the passed in gfp flags instead GFP_KERNEL
ipv[4|6]: correct dropwatch false positive in local_deliver_finish
l2tp: Restore socket refcount when sendmsg succeeds
net/phy: micrel: Disable asymmetric pause for KSZ9021
bgmac: omit the fcs
phy: Fix phy_device_free memory leak
bnx2x: Fix KR2 work-around condition
...
Pull more VFS bits from Al Viro:
"Unfortunately, it looks like xattr series will have to wait until the
next cycle ;-/
This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
more file_inode() work"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
constify path_get/path_put and fs_struct.c stuff
fix nommu breakage in shmem.c
cache the value of file_inode() in struct file
9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
9p: make sure ->lookup() adds fid to the right dentry
9p: untangle ->lookup() a bit
9p: double iput() in ->lookup() if d_materialise_unique() fails
9p: v9fs_fid_add() can't fail now
v9fs: get rid of v9fs_dentry
9p: turn fid->dlist into hlist
9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
more file_inode() open-coded instances
selinux: opened file can't have NULL or negative ->f_path.dentry
(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)
TCP prequeue mechanism purpose is to let incoming packets
being processed by the thread currently blocked in tcp_recvmsg(),
instead of behalf of the softirq handler, to better adapt flow
control on receiver host capacity to schedule the consumer.
But in typical request/answer workloads, we send request, then
block to receive the answer. And before the actual answer, TCP
stack receives the ACK packets acknowledging the request.
Processing pure ACK on behalf of the thread blocked in tcp_recvmsg()
is a waste of resources, as thread has to immediately sleep again
because it got no payload.
This patch avoids the extra context switches and scheduler overhead.
Before patch :
a:~# echo 0 >/proc/sys/net/ipv4/tcp_low_latency
a:~# perf stat ./super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k
231676
Performance counter stats for './super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k':
116251.501765 task-clock # 11.369 CPUs utilized
5,025,463 context-switches # 0.043 M/sec
1,074,511 CPU-migrations # 0.009 M/sec
216,923 page-faults # 0.002 M/sec
311,636,972,396 cycles # 2.681 GHz
260,507,138,069 stalled-cycles-frontend # 83.59% frontend cycles idle
155,590,092,840 stalled-cycles-backend # 49.93% backend cycles idle
100,101,255,411 instructions # 0.32 insns per cycle
# 2.60 stalled cycles per insn
16,535,930,999 branches # 142.243 M/sec
646,483,591 branch-misses # 3.91% of all branches
10.225482774 seconds time elapsed
After patch :
a:~# echo 0 >/proc/sys/net/ipv4/tcp_low_latency
a:~# perf stat ./super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k
233297
Performance counter stats for './super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k':
91084.870855 task-clock # 8.887 CPUs utilized
2,485,916 context-switches # 0.027 M/sec
815,520 CPU-migrations # 0.009 M/sec
216,932 page-faults # 0.002 M/sec
245,195,022,629 cycles # 2.692 GHz
202,635,777,041 stalled-cycles-frontend # 82.64% frontend cycles idle
124,280,372,407 stalled-cycles-backend # 50.69% backend cycles idle
83,457,289,618 instructions # 0.34 insns per cycle
# 2.43 stalled cycles per insn
13,431,472,361 branches # 147.461 M/sec
504,470,665 branch-misses # 3.76% of all branches
10.249594448 seconds time elapsed
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking fixes from David Miller:
1) ping_err() ICMP error handler looks at wrong ICMP header, from Li
Wei.
2) TCP socket hash function on ipv6 is too weak, from Eric Dumazet.
3) netif_set_xps_queue() forgets to drop mutex on errors, fix from
Alexander Duyck.
4) sum_frag_mem_limit() can deadlock due to lack of BH disabling, fix
from Eric Dumazet.
5) TCP SYN data is miscalculated in tcp_send_syn_data(), because the
amount of TCP option space was not taken into account properly in
this code path. Fix from yuchung Cheng.
6) MLX4 driver allocates device queues with the wrong size, from Kleber
Sacilotto.
7) sock_diag can access past the end of the sock_diag_handlers[] array,
from Mathias Krause.
8) vlan_set_encap_proto() makes incorrect assumptions about where
skb->data points, rework the logic so that it works regardless of
where skb->data happens to be. From Jesse Gross.
9) Fix gianfar build failure with NET_POLL enabled, from Paul
Gortmaker.
10) Fix Ipv4 ID setting and checksum calculations in GRE driver, from
Pravin B Shelar.
11) bgmac driver does:
int i;
for (i = 0; ...; ...) {
...
for (i = 0; ...; ...) {
effectively corrupting the outer loop index, use a seperate
variable for the inner loops. From Rafał Miłecki.
12) Fix suspend bugs in smsc95xx driver, from Ming Lei.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
usbnet: smsc95xx: rename FEATURE_AUTOSUSPEND
usbnet: smsc95xx: fix broken runtime suspend
usbnet: smsc95xx: fix suspend failure
bgmac: fix indexing of 2nd level loops
b43: Fix lockdep splat on module unload
Revert "ip_gre: propogate target device GSO capability to the tunnel device"
IP_GRE: Fix GRE_CSUM case.
VXLAN: Use tunnel_ip_select_ident() for tunnel IP-Identification.
IP_GRE: Fix IP-Identification.
net/pasemi: Fix missing coding style
vmxnet3: fix ethtool ring buffer size setting
vmxnet3: make local function static
bnx2x: remove dead code and make local funcs static
gianfar: fix compile fail for NET_POLL=y due to struct packing
vlan: adjust vlan_set_encap_proto() for its callers
sock_diag: Simplify sock_diag_handlers[] handling in __sock_diag_rcv_msg
sock_diag: Fix out-of-bounds access to sock_diag_handlers[]
vxlan: remove depends on CONFIG_EXPERIMENTAL
mlx4_en: fix allocation of CPU affinity reverse-map
mlx4_en: fix allocation of device tx_cq
...
Pull user namespace and namespace infrastructure changes from Eric W Biederman:
"This set of changes starts with a few small enhnacements to the user
namespace. reboot support, allowing more arbitrary mappings, and
support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
user namespace root.
I do my best to document that if you care about limiting your
unprivileged users that when you have the user namespace support
enabled you will need to enable memory control groups.
There is a minor bug fix to prevent overflowing the stack if someone
creates way too many user namespaces.
The bulk of the changes are a continuation of the kuid/kgid push down
work through the filesystems. These changes make using uids and gids
typesafe which ensures that these filesystems are safe to use when
multiple user namespaces are in use. The filesystems converted for
3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The
changes for these filesystems were a little more involved so I split
the changes into smaller hopefully obviously correct changes.
XFS is the only filesystem that remains. I was hoping I could get
that in this release so that user namespace support would be enabled
with an allyesconfig or an allmodconfig but it looks like the xfs
changes need another couple of days before it they are ready."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
cifs: Enable building with user namespaces enabled.
cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
cifs: Convert struct cifs_sb_info to use kuids and kgids
cifs: Modify struct smb_vol to use kuids and kgids
cifs: Convert struct cifsFileInfo to use a kuid
cifs: Convert struct cifs_fattr to use kuid and kgids
cifs: Convert struct tcon_link to use a kuid.
cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
cifs: Convert from a kuid before printing current_fsuid
cifs: Use kuids and kgids SID to uid/gid mapping
cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
cifs: Override unmappable incoming uids and gids
nfsd: Enable building with user namespaces enabled.
nfsd: Properly compare and initialize kuids and kgids
nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
nfsd: Modify nfsd4_cb_sec to use kuids and kgids
nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
nfsd: Convert nfsxdr to use kuids and kgids
nfsd: Convert nfs3xdr to use kuids and kgids
...
GRE-GSO generates ip fragments with id 0,2,3,4... for every
GSO packet, which is not correct. Following patch fixes it
by setting ip-header id unique id of fragments are allowed.
As Eric Dumazet suggested it is optimized by using inner ip-header
whenever inner packet is ipv4.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Jones reported a lockdep splat occurring in IP defrag code.
commit 6d7b857d54 (net: use lib/percpu_counter API for
fragmentation mem accounting) added a possible deadlock.
Because percpu_counter_sum_positive() needs to acquire
a lock that can be used from softirq, we need to disable BH
in sum_frag_mem_limit()
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now we handle icmp errors in each transport protocol's err_handler,
for icmp protocols, that is ping_err. Since this handler only care
of those icmp errors triggered by echo request, errors triggered
by echo reply(which sent by kernel) are sliently ignored.
So wrap ping_err() with icmp_err() to deal with those icmp errors.
Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull trivial tree from Jiri Kosina:
"Assorted tiny fixes queued in trivial tree"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (22 commits)
DocBook: update EXPORT_SYMBOL entry to point at export.h
Documentation: update top level 00-INDEX file with new additions
ARM: at91/ide: remove unsused at91-ide Kconfig entry
percpu_counter.h: comment code for better readability
x86, efi: fix comment typo in head_32.S
IB: cxgb3: delay freeing mem untill entirely done with it
net: mvneta: remove unneeded version.h include
time: x86: report_lost_ticks doesn't exist any more
pcmcia: avoid static analysis complaint about use-after-free
fs/jfs: Fix typo in comment : 'how may' -> 'how many'
of: add missing documentation for of_platform_populate()
btrfs: remove unnecessary cur_trans set before goto loop in join_transaction
sound: soc: Fix typo in sound/codecs
treewide: Fix typo in various drivers
btrfs: fix comment typos
Update ibmvscsi module name in Kconfig.
powerpc: fix typo (utilties -> utilities)
of: fix spelling mistake in comment
h8300: Fix home page URL in h8300/README
xtensa: Fix home page URL in Kconfig
...
It looks like its possible to open thousands of TCP IPv6
sessions on a server, all landing in a single slot of TCP hash
table. Incoming packets have to lookup sockets in a very
long list.
We should hash all bits from foreign IPv6 addresses, using
a salt and hash mix, not a simple XOR.
inet6_ehashfn() can also separately use the ports, instead
of xoring them.
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet wrote:
| Some strange crashes happen in rt6_check_expired(), with access
| to random addresses.
|
| At first glance, it looks like the RTF_EXPIRES and
| stuff added in commit 1716a96101
| (ipv6: fix problem with expired dst cache)
| are racy : same dst could be manipulated at the same time
| on different cpus.
|
| At some point, our stack believes rt->dst.from contains a dst pointer,
| while its really a jiffie value (as rt->dst.expires shares the same area
| of memory)
|
| rt6_update_expires() should be fixed, or am I missing something ?
|
| CC Neil because of https://bugzilla.redhat.com/show_bug.cgi?id=892060
Because we do not have any locks for dst_entry, we cannot change
essential structure in the entry; e.g., we cannot change reference
to other entity.
To fix this issue, split 'from' and 'expires' field in dst_entry
out of union. Once it is 'from' is assigned in the constructor,
keep the reference until the very last stage of the life time of
the object.
Of course, it is unsafe to change 'from', so make rt6_set_from simple
just for fresh entries.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Neil Horman <nhorman@tuxdriver.com>
CC: Gao Feng <gaofeng@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reported-by: Steinar H. Gunderson <sesse@google.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contain updates for your net-next tree, they are:
* Fix (for just added) connlabel dependencies, from Florian Westphal.
* Add aliasing support for conntrack, thus users can either use -m state
or -m conntrack from iptables while using the same kernel module, from
Jozsef Kadlecsik.
* Some code refactoring for the CT target to merge common code in
revision 0 and 1, from myself.
* Add aliasing support for CT, based on patch from Jozsef Kadlecsik.
* Add one mutex per nfnetlink subsystem, from myself.
* Improved logging for packets that are dropped by helpers, from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull in 'net' to take in the bug fixes that didn't make it into
3.8-final.
Also, deal with the semantic conflict of the change made to
net/ipv6/xfrm6_policy.c A missing rt6->n neighbour release
was added to 'net', but in 'net-next' we no longer cache the
neighbour entries in the ipv6 routes so that change is not
appropriate there.
Signed-off-by: David S. Miller <davem@davemloft.net>
Connection tracking helpers have to drop packets under exceptional
situations. Currently, the user gets the following logging message
in case that happens:
nf_ct_%s: dropping packet ...
However, depending on the helper, there are different reasons why a
packet can be dropped.
This patch modifies the existing code to provide more specific
error message in the scope of each helper to help users to debug
the reason why the packet has been dropped, ie:
nf_ct_%s: dropping packet: reason ...
Thanks to Joe Perches for many formatting suggestions.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When SOCK_REFCNT_DEBUG is enabled, below build error is met:
kernel/sysctl_binary.o: In function `sk_refcnt_debug_release':
include/net/sock.h:1025: multiple definition of `sk_refcnt_debug_release'
kernel/sysctl.o:include/net/sock.h:1025: first defined here
kernel/audit.o: In function `sk_refcnt_debug_release':
include/net/sock.h:1025: multiple definition of `sk_refcnt_debug_release'
kernel/sysctl.o:include/net/sock.h:1025: first defined here
make[1]: *** [kernel/built-in.o] Error 1
make: *** [kernel] Error 2
So we decide to make sk_refcnt_debug_release static to eliminate
the error.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The information of the peer's capabilities and extended capabilities are
required for the driver to perform TDLS Peer UAPSD operations and off
channel operations. This information of the peer is passed from user space
using NL80211_CMD_SET_STATION command. This commit enhances
the function nl80211_set_station to pass the capability information of
the peer to the driver.
Similarly, there may be need for capability information for other modes,
so allow this to be provided with both add_station and change_station.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In many cases, userspace may need to know which of the
802.11 extended capabilities ("Extended Capabilities
element") are implemented in the driver or device, to
include them e.g. in beacons, assoc request/response
or other frames. Add a new nl80211 attribute to hold
the extended capabilities bitmap for this.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Instead of modifying the HT SMPS capability field
for stations, track the SMPS mode explicitly in a
new field in the station struct and use it in the
drivers that care about it. This simplifies the
code using it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some drivers might support 80 or 160 MHz only on some
channels for whatever reason, so allow them to disable
these channel widths. Also maintain the new flags when
regulatory bandwidth limitations would disable these
wide channels.
Reviewed-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
For HT and VHT the current bandwidth can change,
add the function ieee80211_vif_change_bandwidth()
to take care of this. It returns a failure if the
new bandwidth isn't compatible with the existing
channel context, the caller has to handle that.
When it happens, also inform the driver that the
bandwidth changed for this virtual interface (no
drivers would actually care today though.)
Changing to/from HT/VHT isn't allowed though.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Handle the operating mode notification action frame.
When the supported streams or the bandwidth change
let the driver and rate control algorithm know.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
With VHT, a station can change the number of spatial
streams it can receive on the fly, not unlike spatial
multiplexing in HT. Prepare for that by tracking the
maximum number of spatial streams it can receive when
the connection is established.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
For VHT, many more bandwidth changes are possible. As a first
step, stop toggling the IEEE80211_HT_CAP_SUP_WIDTH_20_40 flag
in the HT capabilities and instead introduce a bandwidth field
indicating the currently usable bandwidth to transmit to the
station. Of course, make all drivers use it.
To achieve this, make ieee80211_ht_cap_ie_to_sta_ht_cap() get
the station as an argument, rather than the new capabilities,
so it can set up the new bandwidth field.
If the station is a VHT station and VHT bandwidth is in use,
also set the bandwidth accordingly.
Doing this allows us to get rid of the supports_40mhz flag as
the HT capabilities now reflect the true capability instead of
the current setting.
While at it, also fix ieee80211_ht_cap_ie_to_sta_ht_cap() to not
ignore HT cap overrides when MCS TX isn't supported (not that it
really happens...)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add command to trigger radar detection in the driver/FW.
Once radar detection is started it should continuously
monitor for radars as long as the channel active.
If radar is detected usermode notified with 'radar
detected' event.
Scanning and remain on channel functionality must be disabled
while doing radar detection/scanning, and vice versa.
Based on original patch by Victor Goldenshtein <victorg@ti.com>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add new NL80211_CMD_RADAR_DETECT, which starts the Channel
Availability Check (CAC). This command will also notify the
usermode about events (CAC finished, CAC aborted, radar
detected, NOP finished).
Once radar detection has started it should continuously
monitor for radars as long as the channel is active.
This patch enables DFS for AP mode in nl80211/cfg80211.
Based on original patch by Victor Goldenshtein <victorg@ti.com>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
[remove WIPHY_FLAG_HAS_RADAR_DETECT again -- my mistake]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Steffen Klassert says:
====================
1) Remove a duplicated call to skb_orphan() in pf_key, from Cong Wang.
2) Prepare xfrm and pf_key for algorithms without pf_key support,
from Jussi Kivilinna.
3) Fix an unbalanced lock in xfrm_output_one(), from Li RongQing.
4) Add an IPsec state resolution packet queue to handle
packets that are send before the states are resolved.
5) xfrm4_policy_fini() is unused since 2.6.11, time to remove it.
From Michal Kubecek.
6) The xfrm gc threshold was configurable just in the initial
namespace, make it configurable in all namespaces. From
Michal Kubecek.
7) We currently can not insert policies with mark and mask
such that some flows would be matched from both policies.
Allow this if the priorities of these policies are different,
the one with the higher priority is used in this case.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Intel Wireless devices are able to make a TCP connection
after suspending, sending some data and waking up when
the connection receives wakeup data (or breaks). Add the
WoWLAN configuration and feature advertising API for it.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If user knows the location of a wowlan pattern to be matched in
Rx packet, he can provide an offset with the pattern. This will
help drivers to ignore initial bytes and match the pattern
efficiently.
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
[refactor pattern sending]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
It's not used anywhere else, so move it.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tbf will need to schedule watchdog in ns. No need to convert it twice.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it is going to be used in tbf as well, push these to generic code.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad says: The whole multiple cookie keys code is completely unused
and has been all this time. Noone uses anything other then the
secret_key[0] since there is no changeover support anywhere.
Thus, for now clean up its left-over fragments.
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change struct 9p_fid and it's associated functions to
use kuid_t's instead of uid_t.
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
9p has thre strucrtures that can encode inode stat information. Modify
all of those structures to contain kuid_t and kgid_t values. Modify
he wire encoders and decoders of those structures to use 'u' and 'g' instead of
'd' in the format string where uids and gids are present.
This results in all kuid and kgid conversion to and from on the wire values
being performed by the same code in protocol.c where the client is known
at the time of the conversion.
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Modify the p9_client_rpc format specifiers of every function that
directly transmits a uid or a gid from 'd' to 'u' or 'g' as
appropriate.
Modify those same functions to take kuid_t and kgid_t parameters
instead of uid_t and gid_t parameters.
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Scans currently work by stopping the netdev tx queues but leaving the
mac80211 queues active. This stops the flow of incoming packets while
still allowing mac80211 to transmit nullfunc and probe request frames to
facilitate scanning. However, the driver may try to wake the mac80211
queues while in this state, which will also wake the netdev queues.
To prevent this, add a new queue stop reason,
IEEE80211_QUEUE_STOP_REASON_OFFCHANNEL, to be used when stopping the tx
queues for off-channel operation. This prevents the netdev queues from
waking when a driver wakes the mac80211 queues.
This also stops all frames from being transmitted, even those meant to
be sent off-channel. Add a new tx control flag,
IEEE80211_TX_CTL_OFFCHAN_TX_OK, which allows frames to be transmitted
when the queues are stopped only for the off-channel stop reason. Update
all locations transmitting off-channel frames to use this flag.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There are only a few drivers that use HW scan, and
all of those don't need a non-idle transition before
starting the scan -- some don't even care about idle
at all. Remove the flag and code associated with it.
The only driver that really actually needed this is
wl1251 and it can just do it itself in the hw_scan
callback -- implement that.
Acked-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The functions were added for some sort of Bluetooth
coexistence, but aren't used, so remove them again.
Reviewed-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In order to be able to predict the next DTIM TBTT
in the driver, add the ability to use timing data
from beacons only with the new hardware flag
IEEE80211_HW_TIMING_BEACON_ONLY and the BSS info
value sync_dtim_count which is only valid if the
timing data came from a beacon. The data can only
come from a beacon, and if no beacon was received
before association it is updated later together
with the DTIM count notification.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
While technically the TSF isn't an IE, it can be
necessary to distinguish between the TSF from a
beacon and a probe response, in particular in
order to know the next DTIM TBTT, as not all APs
are spec compliant wrt. TSF==0 being a DTIM TBTT
and thus the DTIM count needs to be taken into
account as well.
To allow this, move the TSF into the IE struct
so it can be known whence it came.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There's no way scan BSS IEs can be NULL as even
if the allocation fails the frame is discarded.
Remove some code checking for this and document
that it is always non-NULL.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add debugfs driver callbacks so drivers can add
debugfs entries for interfaces. Note that they
_must_ remove the entries again as add/remove in
the driver doesn't correspond to add/remove in
debugfs; the former is up/down while the latter
is netdev create/destroy.
Signed-off-by: Alexander Bondar <alexander.bondar@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently, cfg80211 will copy beacon IEs from a previously
received hidden SSID beacon to a probe response entry, if
that entry is created after the beacon entry. However, if
it is the other way around, or if the beacon is updated,
such changes aren't propagated.
Fix this by tracking the relation between the probe
response and beacon BSS structs in this case.
In case drivers have private data stored in a BSS struct
and need access to such data from a beacon entry, cfg80211
now provides the hidden_beacon_bss pointer from the probe
response entry to the beacon entry.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fix most kernel-doc warnings, for some reason it
seems to have issues with __aligned, don't remove
the documentation entries it considers to be in
excess due to that.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This prepares for using the spinlock instead of krefs
which is needed in the next patch to track the refs
of combined BSSes correctly.
Acked-by: Bing Zhao <bzhao@marvell.com> [mwifiex]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
To allow both of protocol-specific data and device-specific data
attached with neighbour entry, and to eliminate size calculation
cost when allocating entry, sizeof protocol-speicic data must be
multiple of NEIGH_PRIV_ALIGN. On 64bit archs,
sizeof(struct dn_neigh) is multiple of NEIGH_PRIV_ALIGN, but on
32bit archs, it was not.
Introduce NEIGH_ENTRY_SPACE() macro to ensure that protocol-specific
entry-size meets our requirement.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initial implementation of the Multiple Registration Protocol (MRP)
from IEEE 802.1Q-2011, based on the existing implementation of the
Generic Attribute Registration Protocol (GARP).
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed-up drivers/net/wireless/iwlwifi/mvm/mac80211.c to change change
IEEE80211_HW_NEED_DTIM_PERIOD to IEEE80211_HW_NEED_DTIM_BEFORE_ASSOC
as requested by Johannes Berg. -- JWL
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The xfrm gc threshold can be configured via xfrm{4,6}_gc_thresh
sysctl but currently only in init_net, other namespaces always
use the default value. This can substantially limit the number
of IPsec tunnels that can be effectively used.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
As the default, we blackhole packets until the key manager resolves
the states. This patch implements a packet queue where IPsec packets
are queued until the states are resolved. We generate a dummy xfrm
bundle, the output routine of the returned route enqueues the packet
to a per policy queue and arms a timer that checks for state resolution
when dst_output() is called. Once the states are resolved, the packets
are sent out of the queue. If the states are not resolved after some
time, the queue is flushed.
This patch keeps the defaut behaviour to blackhole packets as long
as we have no states. To enable the packet queue the sysctl
xfrm_larval_drop must be switched off.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
TCP Appropriate Byte Count was added by me, but later disabled.
There is no point in maintaining it since it is a potential source
of bugs and Linux already implements other better window protection
heuristics.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/intel/e1000e/ethtool.c
drivers/net/vmxnet3/vmxnet3_drv.c
drivers/net/wireless/iwlwifi/dvm/tx.c
net/ipv6/route.c
The ipv6 route.c conflict is simple, just ignore the 'net' side change
as we fixed the same problem in 'net-next' by eliminating cached
neighbours from ipv6 routes.
The e1000e conflict is an addition of a new statistic in the ethtool
code, trivial.
The vmxnet3 conflict is about one change in 'net' removing a guarding
conditional, whilst in 'net-next' we had a netdev_info() conversion.
The iwlwifi conflict is dealing with a WARN_ON() conversion in
'net-next' vs. a revert happening in 'net'.
Signed-off-by: David S. Miller <davem@davemloft.net>
As Thomas pointed out, cfg80211_get_mesh() is
unused and can be removed.
Cc: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In per-station statistics, present 32bit counters are too small
for practical purposes - with gigabit speeds, it get overlapped
every few seconds.
Expand counters in the struct station_info to be 64-bit.
Driver can still fill only 32-bit and indicate in @filled
only bits like STATION_INFO_[TR]X_BYTES; in case driver provides
full 64-bit counter, it should also set in @filled
bit STATION_INFO_[TR]RX_BYTES64
Netlink sends both 32-bit and 64-bit counters, if present, to not
break userspace.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
[change to also have 32-bit counters if driver advertises 64-bit]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
With multi-channel, there's a corner case where a driver
doesn't receive a beacon soon enough to be able to sync
its timers with the AP. In this case, the only recovery
(after trying again) is to disconnect from the AP. Allow
calling ieee80211_connection_loss() for such cases. To
make that possible, modify the work function to not rely
on the IEEE80211_HW_CONNECTION_MONITOR flag but use new
state kept in the interface instead.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The primary purpose of the UUIDs is to enable generation of EIR and AD
data. In these data formats the UUIDs are split into separate fields
based on whether they're 16, 32 or 128 bit UUIDs. To make the generation
of these data fields simpler this patch adds a type member to the
bt_uuid struct and assigns a value to it as soon as the UUID is added to
the kernel. This way the type doesn't need to be calculated each time
the UUID list is later iterated.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Mark existing algorithms as pfkey supported and make pfkey only use algorithms
that have pfkey_supported set.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific. Since
datagram_send_ctl is publicly exported it should be appropriately named to
reflect the fact that it's for IPv6 only.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending authentication/association frames they
might take a bit of time to go out because we may
have to synchronise with the AP, in particular in
the case where it's really a P2P GO. In this case
the 200ms fixed timeout could potentially be too
short if the beacon interval is relatively large.
For drivers that report TX status we can do better.
Instead of starting the timeout directly, start it
only when the frame status arrives. Since then the
frame was out on the air, we can wait shorter (the
typical response time is supposed to be 30ms, wait
100ms.) Also, if the frame failed to be transmitted
try again right away instead of waiting.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Now that mac80211 no longer uses this API, remove
it completely. If anyone needs it again, we can
revert this patch of course, but mac80211 was the
only user right now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently, when the driver requires the DTIM period,
mac80211 will wait to hear a beacon before association.
This behavior is suboptimal since some drivers may be
able to deal with knowing the DTIM period after the
association, if they get it at all.
To address this, notify the drivers with bss_info_changed
with the new BSS_CHANGED_DTIM_PERIOD flag when the DTIM
becomes known. This might be when changing to associated,
or later when the entire association was done with only
probe response information.
Rename the hardware flag for the current behaviour to
IEEE80211_HW_NEED_DTIM_BEFORE_ASSOC to more accurately
reflect its behaviour. IEEE80211_HW_NEED_DTIM_PERIOD is
no longer accurate as all drivers get the DTIM period
now, just not before association.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When waking up from WoWLAN, it is useful to know
what triggered the wakeup. Support reporting the
wakeup reason(s) in cfg80211 (and a pass-through
in mac80211) to allow userspace to know.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
All users of xfrm_addr_cmp() use its result as boolean.
Introduce xfrm_addr_equal() (which is equal to !xfrm_addr_cmp())
and convert all users.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bring in the 'net' tree so that we can get some ipv4/ipv6 bug
fixes that some net-next work will build upon.
Signed-off-by: David S. Miller <davem@davemloft.net>
There are some usecase when lifetime of ipv4 addresses might be helpful.
For example:
1) initramfs networkmanager uses a DHCP daemon to learn network
configuration parameters
2) initramfs networkmanager addresses, routes and DNS configuration
3) initramfs networkmanager is requested to stop
4) initramfs networkmanager stops all daemons including dhclient
5) there are addresses and routes configured but no daemon running. If
the system doesn't start networkmanager for some reason, addresses and
routes will be used forever, which violates RFC 2131.
This patch is essentially a backport of ivp6 address lifetime mechanism
for ipv4 addresses.
Current "ip" tool supports this without any patch (since it does not
distinguish between ipv4 and ipv6 addresses in this perspective.
Also, this should be back-compatible with all current netlink users.
Reported-by: Pavel Šimerda <psimerda@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updating the fragmentation queues LRU (Least-Recently-Used) list,
required taking the hash writer lock. However, the LRU list isn't
tied to the hash at all, so we can use a separate lock for it.
Original-idea-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the per network namespace shared atomic "mem" accounting
variable, in the fragmentation code, with a lib/percpu_counter.
Getting percpu_counter to scale to the fragmentation code usage
requires some tweaks.
At first view, percpu_counter looks superfast, but it does not
scale on multi-CPU/NUMA machines, because the default batch size
is too small, for frag code usage. Thus, I have adjusted the
batch size by using __percpu_counter_add() directly, instead of
percpu_counter_sub() and percpu_counter_add().
The batch size is increased to 130.000, based on the largest 64K
fragment memory usage. This does introduce some imprecise
memory accounting, but its does not need to be strict for this
use-case.
It is also essential, that the percpu_counter, does not
share cacheline with other writers, to make this scale.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change is primarily a preparation to ease the extension of memory
limit tracking.
The change does reduce the number atomic operation, during freeing of
a frag queue. This does introduce a some performance improvement, as
these atomic operations are at the core of the performance problems
seen on NUMA systems.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fragmentation code cacheline adjusting of struct inet_frag_queue.
Take advantage of the size of struct timer_list, and move all but
spinlock_t lock, below the timer struct. On 64-bit 'lru_list',
'list' and 'refcnt', fits exactly into the next cacheline, and a
new cacheline starts at 'fragments'.
The netns_frags *net pointer is moved to the end of the struct,
because its used in a compare, with "next/close-by" elements of
which this struct is embedded into.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The globally shared rwlock, of struct inet_frags, shares
cacheline with the 'rnd' number, which is used by the hash
calculations. Fix this, as this obviously is a bad idea, as
unnecessary cache-misses will occur when accessing the 'rnd'
number.
Also small note that, moving function ptr (*match) up in struct,
is to avoid it lands on the next cacheline (on 64-bit).
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This small cacheline adjustment of struct netns_frags improves
performance significantly for the fragmentation code.
Struct members 'lru_list' and 'mem' are both hot elements, and it
hurts performance, due to cacheline bouncing at every call point,
when they share a cacheline. Also notice, how mem is placed
together with 'high_thresh' and 'low_thresh', as they are used in
the compare operations together.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is basically a revert of:
commit 5b632fe85e
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date: Mon Dec 3 12:56:33 2012 +0100
mac80211: introduce IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL
We do not need this flag any longer, rt2x00 BAR/BA problem was fixed
correctly by wireless-testing commit:
commit 84e9e8ebd3
Author: Helmut Schaa <helmut.schaa@googlemail.com>
Date: Thu Jan 17 17:34:32 2013 +0100
rt2x00: Improve TX status handling for BlockAckReq frames
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When allocating memory for neighbour cache entry, if
tbl->entry_size is not set, we always calculate
sizeof(struct neighbour) + tbl->key_len, which is common
in the same table.
With this change, set tbl->entry_size during the table
initialization phase, if it was not set, and use it in
neigh_alloc() and neighbour_priv().
This change also allow us to have both of protocol private
data and device priate data at tha same time.
Note that the only user of prototcol private is DECnet
and the only user of device private is ATM CLIP.
Since those are exclusive, we have not been facing issues
here.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
sock->sk_dst_cache is protected by RCU.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sock->sk_dst_cache is protected by RCU, therefore we should
use __sk_dst_get() to deref it once we lock the sock.
This fixes several sparse warnings.
Cc: linux-decnet-user@lists.sourceforge.net
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
This batch contains netfilter updates for you net-next tree, they are:
* The new connlabel extension for x_tables, that allows us to attach
labels to each conntrack flow. The kernel implementation uses a
bitmask and there's a file in user-space that maps the bits with the
corresponding string for each existing label. By now, you can attach
up to 128 overlapping labels. From Florian Westphal.
* A new round of improvements for the netns support for conntrack.
Gao feng has moved many of the initialization code of each module
of the netns init path. He also made several code refactoring, that
code looks cleaner to me now.
* Added documentation for all possible tweaks for nf_conntrack via
sysctl, from Jiri Pirko.
* Cisco 7941/7945 IP phone support for our SIP conntrack helper,
from Kevin Cernekee.
* Missing header file in the snmp helper, from Stephen Hemminger.
* Finally, a couple of fixes to resolve minor issues with these
changes, from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add API to enable drivers to implement MAC address based
access control in AP/P2P GO mode. Capable drivers advertise
this capability by setting the maximum number of MAC
addresses in such a list in wiphy->max_acl_mac_addrs.
An initial ACL may be given to the NL80211_CMD_START_AP
command and/or changed later with NL80211_CMD_SET_MAC_ACL.
Black- and whitelists are supported, but not simultaneously.
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com>
[rewrite commit log, many cleanups]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
struct mac_address will be used by ACL related configuration ops.
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since drivers can support several BSS / P2P Client
interfaces, the rssi callback needs to inform the driver
about the interface teh rssi event relates to.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Steffen Klassert says:
====================
1) Add a statistic counter for invalid output states and
remove a superfluous state valid check, from Li RongQing.
2) Probe for asynchronous block ciphers instead of synchronous block
ciphers to make the asynchronous variants available even if no
synchronous block ciphers are found, from Jussi Kivilinna.
3) Make rfc3686 asynchronous block cipher and make use of
the new asynchronous variant, from Jussi Kivilinna.
4) Replace some rwlocks by rcu, from Cong Wang.
5) Remove some unused defines.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Motivation for soreuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have it's own listener socket. This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads. In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets. We have seen the disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest. With so_reusport the distribution is
uniform.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>