4658 Commits

Author SHA1 Message Date
David S. Miller
5e58e5283a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-04-01 17:15:25 -07:00
David S. Miller
9b12c75bf4 net: Order ports in same order as addresses in flow objects.
For consistency.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 18:03:35 -07:00
Gustavo F. Padovan
220b881a77 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-2.6 2011-03-31 16:26:01 -03:00
Gustavo F. Padovan
105721328f Bluetooth: Fix HCI_RESET command synchronization
We can't send new commands before a cmd_complete for the HCI_RESET command
shows up.

Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Reported-by: Justin P. Mattock <justinmattock@gmail.com>
Reported-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Ed Tomlinson <edt@aei.ca>
2011-03-31 14:25:25 -03:00
Johan Hedberg
80a1e1dbf6 Bluetooth: Add local Extended Inquiry Response (EIR) support
This patch adds automated creation of the local EIR data based on what
16-bit UUIDs are registered and what the device name is. This should
cover the majority use cases, however things like 32/128-bit UUIDs, TX
power and Device ID will need to be added later to be on par with what
bluetoothd is capable of doing (without the Management interface).

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:58 -03:00
Gustavo F. Padovan
f3dd4f0f58 Bluetooth: Remove unused struct l2cap_conn item
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Szymon Janc
2763eda6cc Bluetooth: Add add/remove_remote_oob_data management commands
This patch adds commands to add and remove remote OOB data to the management
interface. Remote data is stored in kernel and can be used by corresponding
HCI commands and events when needed.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Szymon Janc
c35938b2f5 Bluetooth: Add read_local_oob_data management command
This patch adds a command to read local OOB data to the management interface.
The command maps directly to the Read Local OOB Data HCI command.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Gustavo F. Padovan
b0d2199d6f Bluetooth: Remove unused struct item
num in struct l2cap_chan_list isn't used anywhere.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
Johan Hedberg
b312b161ec Bluetooth: mgmt: Add support for setting the local name
This patch adds a new set_local_name management command as well as a
local_name_changed management event. With these user space can both
change the local name as well as monitor changes to it by others.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
Johan Hedberg
dc4fe30b86 Bluetooth: mgmt: Add local name information to read_info reply
This patch adds the name of the adapter to the reply of the read_info
management command.

The management messages reserve 249 bytes for the name instead of 248
(like in the HCI spec) so that there is always a guarantee that it is
nul-terminated. That way it can safely be passed onto string
manipulation functions.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
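The 249-byte reservation described in dc4fe30b86 can be sketched as follows. This is a minimal userspace illustration, not the kernel code: the names DEMO_HCI_MAX_NAME_LENGTH, struct demo_read_info_reply and demo_set_reply_name() are hypothetical, and the 248-byte HCI limit is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant: HCI-level maximum name length per the commit text. */
#define DEMO_HCI_MAX_NAME_LENGTH 248

/* Hypothetical reply layout: one extra byte so the name is always terminated. */
struct demo_read_info_reply {
    char name[DEMO_HCI_MAX_NAME_LENGTH + 1];
};

/*
 * Copy a fixed-size HCI name field (which may lack a terminator) into the
 * reply and force a trailing NUL, so string functions can be used safely.
 */
static void demo_set_reply_name(struct demo_read_info_reply *rp,
                                const unsigned char *hci_name)
{
    assert(rp != NULL && hci_name != NULL);
    memcpy(rp->name, hci_name, DEMO_HCI_MAX_NAME_LENGTH);
    rp->name[DEMO_HCI_MAX_NAME_LENGTH] = '\0';
}
```

Even when all 248 bytes of the source field are in use, strlen() on the copy stops at the reserved final byte.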
Johan Hedberg
1f6c6378c5 Bluetooth: Add define for the maximum name length on HCI level
This patch adds a clear define for the maximum device name length in HCI
messages and thereby avoids magic numbers in the code.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
David S. Miller
94b92b8834 ipv4: Use flowi4_init_output() in net/route.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:52:59 -07:00
David S. Miller
83229aa5e2 net: Add helper flowi4_init_output().
On-stack initialization via assignment of flow structures is
expensive because GCC emits a memset() to clear the entire
structure out no matter what.

Add a helper for ipv4 output flow key setup which we can use to avoid
the memset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:52:14 -07:00
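The idea behind the helper added in 83229aa5e2 can be sketched as below: every member is stored explicitly, so the caller does not need to clear the whole structure first. The structure and function here (struct demo_flow4, demo_flow4_init_output()) are simplified, hypothetical stand-ins and do not reproduce the kernel's struct flowi4 or the real helper's parameter list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified IPv4 flow key (not the kernel layout). */
struct demo_flow4 {
    int      oif;
    uint32_t mark;
    uint8_t  tos;
    uint8_t  proto;
    uint32_t daddr;
    uint32_t saddr;
    uint16_t dport;
    uint16_t sport;
};

/*
 * Assign each member individually. Because nothing is left unset, there is
 * no need for an initializer that zeroes the entire object beforehand.
 */
static inline void demo_flow4_init_output(struct demo_flow4 *fl, int oif,
                                          uint32_t mark, uint8_t tos,
                                          uint8_t proto, uint32_t daddr,
                                          uint32_t saddr, uint16_t dport,
                                          uint16_t sport)
{
    assert(fl != NULL);
    fl->oif = oif;
    fl->mark = mark;
    fl->tos = tos;
    fl->proto = proto;
    fl->daddr = daddr;
    fl->saddr = saddr;
    fl->dport = dport;
    fl->sport = sport;
}
```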
Jouni Malinen
ec15e68ba6 cfg80211: Add nl80211 event for deletion of a station entry
Indicate an NL80211_CMD_DEL_STATION event when a station entry in
mac80211 is deleted to match with the NL80211_CMD_NEW_STATION event
that is used when the entry was added. This is needed, e.g., to allow
user space to remove a peer from RSN IBSS Authenticator state machine
to avoid re-authentication and re-keying delays when the peer is not
reachable anymore.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-30 14:15:18 -04:00
Timo Teräs
93ca3bb5df net: gre: provide multicast mappings for ipv4 and ipv6
My commit 6d55cb91a0 (gre: fix hard header destination
address checking) broke multicast.

The reason is that ip_gre used to get ipgre_header() calls with
zero destination if we have NOARP or multicast destination. Instead
the actual target was decided at ipgre_tunnel_xmit() time based on
per-protocol dissection.

Instead of allowing the "abuse" of ->header() calls with invalid
destination, this creates multicast mappings for ip_gre. This also
fixes "ip neigh show nud noarp" to display the proper multicast
mappings used by the gre device.

Reported-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 00:10:47 -07:00
Steffen Klassert
af2f464e32 xfrm: Assign esn pointers when cloning a state
When we clone a xfrm state we have to assign the replay_esn
and the preplay_esn pointers to the state if we use the
new replay detection method. To this end, we add a
xfrm_replay_clone() function that allocates memory for
the replay detection and takes over the necessary values
from the original state.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-28 23:34:52 -07:00
Ben Hutchings
e0bccd315d rose: Add length checks to CALL_REQUEST parsing
Define some constant offsets for CALL_REQUEST based on the description
at <http://www.techfest.com/networking/wan/x25plp.htm> and the
definition of ROSE as using 10-digit (5-byte) addresses.  Use them
consistently.  Validate all implicit and explicit facilities lengths.
Validate the address length byte rather than either trusting or
assuming its value.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:59:04 -07:00
Steffen Klassert
e433430a0c dst: Clone child entry in skb_dst_pop
We clone the child entry in skb_dst_pop before we call
skb_dst_drop(). Otherwise we might kill the child right
before we return it to the caller.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:55:01 -07:00
Steffen Klassert
6df59a84ec route: Take the right src and dst addresses in ip_route_newports
When we set up the flow information in ip_route_newports(), we take
the address information from the rt_key_src and rt_key_dst fields
of the rtable. They appear to be empty. So take the address
information from rt_src and rt_dst instead. This issue was introduced
by commit 5e2b61f784 ("ipv4: Remove
flowi from struct rtable.")

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-25 01:28:45 -07:00
David S. Miller
37e826c513 ipv4: Fix nexthop caching wrt. scoping.
Move the scope value out of the fib alias entries and into fib_info,
so that we always use the correct scope when recomputing the nexthop
cached source address.

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 18:06:47 -07:00
David S. Miller
436c3b66ec ipv4: Invalidate nexthop cache nh_saddr more correctly.
Any operation that:

1) Brings up an interface
2) Adds an IP address to an interface
3) Deletes an IP address from an interface

can potentially invalidate the nh_saddr value, requiring
it to be recomputed.

Perform the recomputation lazily using a generation ID.

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 17:42:21 -07:00
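The lazy recomputation with a generation ID described in 436c3b66ec can be sketched roughly as below. All names (demo_addr_genid, struct demo_nexthop, demo_nh_saddr() and so on) are hypothetical, and the source-address selection is a placeholder; the sketch only shows the invalidation pattern, and it ignores counter wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Global generation, bumped on every event that may invalidate cached values.
 * Starts at 1 so that a zero-initialized cache entry is treated as stale. */
static unsigned int demo_addr_genid = 1;

/* Counts how often the expensive path ran (for illustration only). */
static unsigned int demo_recompute_count;

struct demo_nexthop {
    uint32_t     saddr;        /* cached source address */
    unsigned int saddr_genid;  /* generation at which saddr was computed */
};

/* Called when an interface comes up or an address is added or deleted. */
static void demo_addr_changed(void)
{
    demo_addr_genid++;
}

/* Placeholder for the real (expensive) source address selection. */
static uint32_t demo_select_saddr(void)
{
    demo_recompute_count++;
    return 0x0a000000u + demo_addr_genid;
}

/* Return the cached address, recomputing only if the generation moved on. */
static uint32_t demo_nh_saddr(struct demo_nexthop *nh)
{
    assert(nh != NULL);
    if (nh->saddr_genid != demo_addr_genid) {
        nh->saddr = demo_select_saddr();
        nh->saddr_genid = demo_addr_genid;
    }
    return nh->saddr;
}
```

The events themselves only increment a counter; the cost of recomputation is paid by the next reader, and only once per generation.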
Gustavo F. Padovan
f630cf0d54 Bluetooth: Fix HCI_RESET command synchronization
We can't send new commands before a cmd_complete for the HCI_RESET command
shows up.

Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Reported-by: Justin P. Mattock <justinmattock@gmail.com>
Reported-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Ed Tomlinson <edt@aei.ca>
2011-03-24 17:04:44 -03:00
Eric Dumazet
ef352e7cdf net_sched: fix THROTTLED/RUNNING race
commit fd245a4adb (net_sched: move TCQ_F_THROTTLED flag)
added a race.

qdisc_watchdog() is run from softirq, so special care should be taken or
we can lose one state transition (THROTTLED/RUNNING)

Prior to fd245a4adb, we were manipulating q->flags (qdisc->flags &=
~TCQ_F_THROTTLED;) and this manipulation could only race with
qdisc_warn_nonwc().

Since we want to avoid atomic ops in qdisc fast path - it was the
meaning of commit 3711210576 (QDISC_STATE_RUNNING dont need atomic
bit ops) - fix is to move THROTTLE bit into 'state' field, this one
being manipulated with SMP and IRQ safe operations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 00:13:14 -07:00
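The approach in ef352e7cdf, keeping the THROTTLED bit in a field that is only touched with atomic read-modify-write operations, can be illustrated in userspace with C11 atomics. This is only an analogy: the kernel uses its own bit operations, and the names below (struct demo_qdisc, DEMO_STATE_THROTTLED, demo_throttled() and friends) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state bits sharing one atomically updated word. */
enum {
    DEMO_STATE_RUNNING   = 1 << 0,
    DEMO_STATE_THROTTLED = 1 << 1
};

struct demo_qdisc {
    atomic_uint state;
};

static void demo_qdisc_init(struct demo_qdisc *q)
{
    assert(q != NULL);
    atomic_init(&q->state, 0u);
}

/* Set the bit with an atomic OR so a concurrent update of another bit
 * in the same word cannot be lost. */
static void demo_throttled(struct demo_qdisc *q)
{
    atomic_fetch_or(&q->state, (unsigned int)DEMO_STATE_THROTTLED);
}

/* Clear the bit with an atomic AND for the same reason. */
static void demo_unthrottle(struct demo_qdisc *q)
{
    atomic_fetch_and(&q->state, ~(unsigned int)DEMO_STATE_THROTTLED);
}

static bool demo_is_throttled(struct demo_qdisc *q)
{
    return (atomic_load(&q->state) & (unsigned int)DEMO_STATE_THROTTLED) != 0;
}
```

A plain `flags &= ~BIT` is a separate load, modify and store, which is where a transition can be lost if another context modifies the same word in between.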
Florian Westphal
9c7a4f9ce6 ipv6: ip6_route_output does not modify sk parameter, so make it const
This avoids explicit cast to avoid 'discards qualifiers'
compiler warning in a netfilter patch that I've been working on.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 19:17:36 -07:00
David S. Miller
db138908cc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-03-22 14:36:18 -07:00
Julian Anastasov
e6abbaa272 ipv4: fix route deletion for IPs on many subnets
Alex Sidorenko reported problems with local
routes left after IP addresses are deleted. It happens
when same IPs are used in more than one subnet for the
device.

	Fix fib_del_ifaddr to restrict the checks for duplicate
local and broadcast addresses only to the IFAs that use
our primary IFA or another primary IFA with same address.
And we expect the prefsrc to be matched when the routes
are deleted because it is possible for them to differ only by
prefsrc. This patch prevents local and broadcast routes
from being leaked until their primary IP is finally deleted
from the box.

	As the secondary address promotion needs to delete
the routes for all secondaries that used the old primary IFA,
add option to ignore these secondaries from the checks and
to assume they are already deleted, so that we can safely
delete the route while these IFAs are still on the device list.

Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 01:06:32 -07:00
Simon Horman
736561a01f IPVS: Use global mutex in ip_vs_app.c
As part of the work to make IPVS network namespace aware
__ip_vs_app_mutex was replaced by a per-namespace lock,
ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.

Unfortunately this implementation results in ipvs->app_key residing
in non-static storage which at the very least causes a lockdep warning.

This patch takes the rather heavy-handed approach of reinstating
__ip_vs_app_mutex which will cover access to the ipvs->list_head
of all network namespaces.

[   12.610000] IPVS: Creating netns size=2456 id=0
[   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[   12.640000] BUG: key ffff880003bbf1a0 not in .data!
[   12.640000] ------------[ cut here ]------------
[   12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
[   12.640000] Hardware name: Bochs
[   12.640000] Pid: 1, comm: swapper Tainted: G        W 2.6.38-kexec-06330-g69b7efe-dirty #122
[   12.650000] Call Trace:
[   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
[   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
[   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
[   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
[   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
[   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
[   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
[   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
[   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
[   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
[   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
[   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
[   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
[   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
[   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
[   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x10

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 20:39:24 -07:00
Eric Dumazet
20246a8003 snmp: SNMP_UPD_PO_STATS_BH() always called from softirq
We don't need to test if we run from softirq context, we definitely are.

This saves a few instructions in ip_rcv() & ip_rcv_finish()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:12:54 -07:00
Wei Yongjun
a454f0ccef xfrm: Fix initialize repl field of struct xfrm_state
Commit 'xfrm: Move IPsec replay detection functions to a separate file'
  (9fdc4883d9)
introduced the repl field to struct xfrm_state, but only initializes it
in the SA's netlink create path. In the other paths, such as pf_key and
ipcomp/ipcomp6, the repl field remains uninitialized. So if
the SA is created by pf_key, any input packet with the SA's encryption
algorithm will cause a panic.

    int xfrm_input()
    {
        ...
        x->repl->advance(x, seq);
        ...
    }

This patch fixes it by introducing a new function, __xfrm_init_state().

Pid: 0, comm: swapper Not tainted 2.6.38-next+ #14 Bochs Bochs
EIP: 0060:[<c078e5d5>] EFLAGS: 00010206 CPU: 0
EIP is at xfrm_input+0x31c/0x4cc
EAX: dd839c00 EBX: 00000084 ECX: 00000000 EDX: 01000000
ESI: dd839c00 EDI: de3a0780 EBP: dec1de88 ESP: dec1de64
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper (pid: 0, ti=dec1c000 task=c09c0f20 task.ti=c0992000)
Stack:
 00000000 00000000 00000002 c0ba27c0 00100000 01000000 de3a0798 c0ba27c0
 00000033 dec1de98 c0786848 00000000 de3a0780 dec1dea4 c0786868 00000000
 dec1debc c074ee56 e1da6b8c de3a0780 c074ed44 de3a07a8 dec1decc c074ef32
Call Trace:
 [<c0786848>] xfrm4_rcv_encap+0x22/0x27
 [<c0786868>] xfrm4_rcv+0x1b/0x1d
 [<c074ee56>] ip_local_deliver_finish+0x112/0x1b1
 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
 [<c074ef77>] ip_local_deliver+0x3e/0x44
 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
 [<c074ec03>] ip_rcv_finish+0x30a/0x332
 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
 [<c074f188>] ip_rcv+0x20b/0x247
 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
 [<c072797d>] __netif_receive_skb+0x373/0x399
 [<c0727bc1>] netif_receive_skb+0x4b/0x51
 [<e0817e2a>] cp_rx_poll+0x210/0x2c4 [8139cp]
 [<c072818f>] net_rx_action+0x9a/0x17d
 [<c0445b5c>] __do_softirq+0xa1/0x149
 [<c0445abb>] ? __do_softirq+0x0/0x149

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:08:28 -07:00
Randy Dunlap
858022aa6f wireless: fix 80211 kernel-doc warnings
Fix many of each of these warnings:

Warning(include/net/cfg80211.h:519): No description found for parameter 'rxrate'
Warning(include/net/mac80211.h:1163): bad line:

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-21 15:19:48 -04:00
Linus Torvalds
7a6362800c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
  bonding: enable netpoll without checking link status
  xfrm: Refcount destination entry on xfrm_lookup
  net: introduce rx_handler results and logic around that
  bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
  bonding: wrap slave state work
  net: get rid of multiple bond-related netdevice->priv_flags
  bonding: register slave pointer for rx_handler
  be2net: Bump up the version number
  be2net: Copyright notice change. Update to Emulex instead of ServerEngines
  e1000e: fix kconfig for crc32 dependency
  netfilter ebtables: fix xt_AUDIT to work with ebtables
  xen network backend driver
  bonding: Improve syslog message at device creation time
  bonding: Call netif_carrier_off after register_netdevice
  bonding: Incorrect TX queue offset
  net_sched: fix ip_tos2prio
  xfrm: fix __xfrm_route_forward()
  be2net: Fix UDP packet detected status in RX compl
  Phonet: fix aligned-mode pipe socket buffer header reserve
  netxen: support for GbE port settings
  ...

Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
2011-03-16 16:29:25 -07:00
Linus Torvalds
e6bee325e4 Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (76 commits)
  pch_uart: reference clock on CM-iTC
  pch_phub: add new device ML7213
  n_gsm: fix UIH control byte : P bit should be 0
  n_gsm: add a documentation
  serial: msm_serial_hs: Add MSM high speed UART driver
  tty_audit: fix tty_audit_add_data live lock on audit disabled
  tty: move cd1865.h to drivers/staging/tty/
  Staging: tty: fix build with epca.c driver
  pcmcia: synclink_cs: fix prototype for mgslpc_ioctl()
  Staging: generic_serial: fix double locking bug
  nozomi: don't use flush_scheduled_work()
  tty/serial: Relax the device_type restriction from of_serial
  MAINTAINERS: Update HVC file patterns
  tty: phase out of ioctl file pointer for tty3270 as well
  tty: forgot to remove ipwireless from drivers/char/pcmcia/Makefile
  pch_uart: Fix DMA channel miss-setting issue.
  pch_uart: fix exclusive access issue
  pch_uart: fix auto flow control miss-setting issue
  pch_uart: fix uart clock setting issue
  pch_uart : Use dev_xxx not pr_xxx
  ...

Fix up trivial conflicts in drivers/misc/pch_phub.c (same patch applied
twice, then changes to the same area in one branch)
2011-03-16 15:11:04 -07:00
David S. Miller
918690f981 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-03-15 13:57:18 -07:00
David S. Miller
31111c26d9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
2011-03-15 13:03:27 -07:00
John W. Linville
106af2c99a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-03-15 14:16:48 -04:00
Aneesh Kumar K.V
c0aa4caf4c net/9p: Implement syncfs 9P operation
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:38 -05:00
Venkateswararao Jujjuri (JV)
f735195d51 [net/9p] Small non-IO PDUs for zero-copy supporting transports.
If a transport prefers payload to be sent separate from the PDU
(P9_TRANS_PREF_PAYLOAD_SEP), there is no need to allocate msize
PDU buffers(struct p9_fcall).

This patch allocates only up to 4k buffers for this kind of transport
and there won't be any change to the legacy transports.

Hence, this patch on top of zero copy changes allows user to
specify higher msizes through the mount option
without hogging the kernel heap.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:36 -05:00
Venkateswararao Jujjuri (JV)
6f69c395ce [net/9p] Add preferences to transport layer.
This patch adds preferences field to the p9_trans_module.
Through this, now transport layer can express its preference about the
payload, i.e. whether the payload needs to be part of the PDU or whether it
prefers it to be sent separately so that the transport layer can handle it in
a better way.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:35 -05:00
Venkateswararao Jujjuri (JV)
022cae3655 [net/9p] Preparation and helper functions for zero copy
This patch prepares p9_fcall structure for zero copy. Added
fields send the payload buffer information to the transport layer.
In addition it adds a 'private' field for the transport layer to
store mapped/pinned page information so that it can be freed/unpinned
during req_done.

This patch also creates trans_common.[ch] to house helper functions.
It adds the following helper functions.

p9_release_req_pages - Release pages after the transaction.
p9_nr_pages - Return number of pages needed to accommodate the payload.
payload_gup - Translates user buffer into kernel pages.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:34 -05:00
Simon Horman
f2247fbdc4 IPVS: Conditionally include sysctl members of struct netns_ipvs
There is now no need to include sysctl members of struct netns_ipvs
unless CONFIG_SYSCTL is defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:37:02 +09:00
Simon Horman
a4e2f5a700 IPVS: Conditional ip_vs_conntrack_enabled()
ip_vs_conntrack_enabled() becomes a noop when CONFIG_SYSCTL is undefined.

In preparation for not including sysctl_conntrack in
struct netns_ipvs when CONFIG_SYSCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:37:00 +09:00
Simon Horman
3a1bbf1885 IPVS: ip_vs_todrop() becomes a noop when CONFIG_SYSCTL is undefined
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:59 +09:00
Simon Horman
7532e8d40c IPVS: Add sysctl_sync_ver()
In preparation for not including sysctl_sync_ver in
struct netns_ipvs when CONFIG_SYSCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:57 +09:00
Simon Horman
59e0350ead IPVS: Add {sysctl_sync_threshold,period}()
In preparation for not including sysctl_sync_threshold in
struct netns_ipvs when CONFIG_SYSCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:57 +09:00
Julian Anastasov
6ef757f965 ipvs: rename estimator functions
Rename ip_vs_new_estimator to ip_vs_start_estimator
and ip_vs_kill_estimator to ip_vs_stop_estimator to better
match their logic.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:54 +09:00
Julian Anastasov
ea9f22cce9 ipvs: optimize rates reading
Move the estimator reading from estimation_timer to user
context. ip_vs_read_estimator() will be used to decode the rate
values. As the decoded rates are not set by estimation timer
there is no need to reset them in ip_vs_zero_stats.

 	There is no need for ip_vs_new_estimator() to encode stats
to rates, if the destination is in trash both the stats and the
rates are inactive.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:53 +09:00
Julian Anastasov
87d68a15e2 ipvs: remove unused seqcount stats
Remove ustats_seq, IPVS_STAT_INC and IPVS_STAT_ADD
because they are not used. They were replaced with u64_stats.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:53 +09:00
Julian Anastasov
55a3d4e15c ipvs: properly zero stats and rates
Currently, the new percpu counters are not zeroed and
the zero commands do not work as expected, we still show the old
sum of percpu values. OTOH, we can not reset the percpu counters
from user context without causing the incrementing to use old
and bogus values.

 	So, as Eric Dumazet suggested fix that by moving all overhead
to stats reading in user context. Do not introduce overhead in
timer context (estimator) and incrementing (packet handling in
softirqs).

 	The new ustats0 field holds the zero point for all
counter values, the rates always use 0 as base value as before.
When showing the values to user space just give the difference
between counters and the base values. The only drawback is that
percpu stats are not zeroed, they are accessible only from /proc
and are new interface, so it should not be a compatibility problem
as long as the sum stats are correct after zeroing.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:52 +09:00
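The zero-point scheme described in 55a3d4e15c can be sketched as below: the per-CPU counters are never reset, the "zero" command records their current sum as a baseline, and readers report the difference. The names (struct demo_stats, pkts0, demo_stats_read() and so on) are hypothetical, only one counter is modeled, and locking between readers and writers is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4  /* hypothetical fixed CPU count for the sketch */

struct demo_stats {
    uint64_t percpu_pkts[DEMO_NR_CPUS]; /* only ever incremented */
    uint64_t pkts0;                     /* zero point set by the last "zero" */
};

/* Sum of all per-CPU counters (the raw, never-reset total). */
static uint64_t demo_stats_sum(const struct demo_stats *s)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < DEMO_NR_CPUS; i++)
        sum += s->percpu_pkts[i];
    return sum;
}

/* Fast path: plain increment, no knowledge of the zero point needed. */
static void demo_stats_inc(struct demo_stats *s, unsigned int cpu)
{
    assert(cpu < DEMO_NR_CPUS);
    s->percpu_pkts[cpu]++;
}

/* "Zero" command: remember the current total instead of clearing counters. */
static void demo_stats_zero(struct demo_stats *s)
{
    s->pkts0 = demo_stats_sum(s);
}

/* Reader: report the total relative to the recorded zero point. */
static uint64_t demo_stats_read(const struct demo_stats *s)
{
    return demo_stats_sum(s) - s->pkts0;
}
```

All extra work happens in the reader and in the zero command, which matches the commit's goal of adding no overhead to the incrementing path.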
Julian Anastasov
2a0751af09 ipvs: reorganize tot_stats
The global tot_stats contains cpustats field just like the
stats for dest and svc, so better use it to simplify the usage
in estimation_timer. As tot_stats is registered as estimator
we can remove the special ip_vs_read_cpu_stats call for
tot_stats. Fix ip_vs_read_cpu_stats to be called under
stats lock because it is still used as synchronization between
estimation timer and user context (the stats readers).

 	Also, make sure ip_vs_stats_percpu_show reads properly
the u64 stats from user context.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:52 +09:00
Julian Anastasov
2553d064ff ipvs: move struct netns_ipvs
Remove include/net/netns/ip_vs.h because it depends on
structures from include/net/ip_vs.h. As ipvs is pointer in
struct net it is better to move struct netns_ipvs into
include/net/ip_vs.h, so that we can easily use other structures
in struct netns_ipvs.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:50 +09:00
Jesper Juhl
06b69390a6 IPVS: Fix variable assignment in ip_vs_notrack
There's no sense to 'ct = ct = ' in ip_vs_notrack(). Just assign
nf_ct_get()'s return value directly to the pointer variable 'ct' once.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:49 +09:00
Steffen Klassert
2cd084678f xfrm: Add support for IPsec extended sequence numbers
This patch adds support for IPsec extended sequence numbers (esn)
as defined in RFC 4303. The bits to manage the anti-replay window
are based on a patch from Alex Badea.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:31 -07:00
Steffen Klassert
9fdc4883d9 xfrm: Move IPsec replay detection functions to a separate file
To support multiple versions of replay detection, we move the replay
detection functions to a separate file and make them accessible
via function pointers contained in the struct xfrm_replay.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:30 -07:00
Steffen Klassert
1ce3644ade xfrm: Use separate low and high order bits of the sequence numbers in xfrm_skb_cb
To support IPsec extended sequence numbers, we split the
output sequence numbers of xfrm_skb_cb in low and high order 32 bits
and we add the high order 32 bits to the input sequence numbers.
All users are updated accordingly.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:28 -07:00
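Splitting a 64-bit sequence number into low and high order 32 bits, as 1ce3644ade describes for xfrm_skb_cb, can be sketched as follows. The type and function names (struct demo_seq, demo_seq_split(), demo_seq_join(), demo_seq_next()) are hypothetical and do not mirror the kernel's field names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair holding the two 32-bit halves of a 64-bit sequence. */
struct demo_seq {
    uint32_t low;
    uint32_t hi;
};

static struct demo_seq demo_seq_split(uint64_t seq)
{
    struct demo_seq s;
    s.low = (uint32_t)(seq & 0xffffffffu);
    s.hi  = (uint32_t)(seq >> 32);
    return s;
}

static uint64_t demo_seq_join(struct demo_seq s)
{
    return ((uint64_t)s.hi << 32) | s.low;
}

/* Advance by one, carrying into the high half when the low half wraps. */
static struct demo_seq demo_seq_next(struct demo_seq s)
{
    s.low++;
    if (s.low == 0)
        s.hi++;
    return s;
}

/* Small round-trip check that callers may run once at startup. */
static void demo_seq_selfcheck(void)
{
    assert(demo_seq_join(demo_seq_split(0x0123456789abcdefull)) ==
           0x0123456789abcdefull);
}
```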
Steffen Klassert
9736acf395 xfrm: Add basic infrastructure to support IPsec extended sequence numbers
This patch adds the struct xfrm_replay_state_esn which will be
used to support IPsec extended sequence numbers and anti replay windows
bigger than 32 packets. Also we add a function that returns the actual
size of the xfrm_replay_state_esn, a xfrm netlink attribute and a xfrm state
flag for the use of extended sequence numbers.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:28 -07:00
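A variable-size replay state with a size helper, in the spirit of 9736acf395, might look like the sketch below. The layout is an assumption made for illustration (struct demo_replay_state_esn and its members are hypothetical); it shows only how a window larger than 32 packets maps to a bitmap of 32-bit words and how the total size follows from the word count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: fixed header followed by a bitmap of 32-bit words. */
struct demo_replay_state_esn {
    uint32_t bmp_len;        /* number of 32-bit words in bmp[] */
    uint32_t seq;
    uint32_t seq_hi;
    uint32_t replay_window;  /* window size in packets */
    uint32_t bmp[];          /* flexible array member */
};

/* Number of 32-bit bitmap words needed to cover a window of n packets. */
static uint32_t demo_bmp_words(uint32_t window)
{
    return (window + 31) / 32;
}

/* Actual size of the state: header plus bmp_len bitmap words. */
static size_t demo_replay_state_esn_len(const struct demo_replay_state_esn *r)
{
    assert(r != NULL);
    return sizeof(*r) + (size_t)r->bmp_len * sizeof(uint32_t);
}
```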
David S. Miller
bef55aebd5 decnet: Convert to use flowidn where applicable.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:55 -08:00
David S. Miller
1958b856c1 net: Put fl6_* macros to struct flowi6 and use them again.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:55 -08:00
David S. Miller
4c9483b2fb ipv6: Convert to use flowi6 where applicable.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:54 -08:00
David S. Miller
9cce96df5b net: Put fl4_* macros to struct flowi4 and use them again.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:54 -08:00
David S. Miller
7e1dc7b6f7 net: Use flowi4 and flowi6 in xfrm layer.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:52 -08:00
David S. Miller
2032656e76 net: Add flowi6_* member helper macros.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:52 -08:00
David S. Miller
9d6ec93801 ipv4: Use flowi4 in public route lookup interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:48 -08:00
David S. Miller
22bd5b9b13 ipv4: Pass ipv4 flow objects into fib_lookup() paths.
To start doing these conversions, we need to add some temporary
flow4_* macros which will eventually go away when all the protocol
code paths are changed to work on AF specific flowi objects.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:47 -08:00
David S. Miller
59b1a94c9a net: Add flowiX_to_flowi() shorthands.
This is just a shorthand which will help in passing around AF
specific flow structures as generic ones.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:47 -08:00
David S. Miller
56bb8059e1 net: Break struct flowi out into AF specific instances.
Now we have struct flowi4, flowi6, and flowidn for each address
family.  And struct flowi is just a union of them all.
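
The layout described above can be sketched in plain C. This is only an illustration: the demo_* names, fields, and sizes are hypothetical stand-ins, not the kernel's actual definitions of flowi_common, flowi4, flowi6, or flowi.

```c
#include <stdint.h>

/* Hypothetical stand-ins; field names and sizes are illustrative only. */
struct demo_flow_common {
	int     oif;    /* output interface index */
	uint8_t proto;  /* L4 protocol number */
};

struct demo_flow4 {
	struct demo_flow_common common;  /* shared prefix, must come first */
	uint32_t daddr, saddr;
	uint16_t dport, sport;
};

struct demo_flow6 {
	struct demo_flow_common common;  /* shared prefix, must come first */
	uint8_t  daddr[16], saddr[16];
	uint16_t dport, sport;
};

/* Generic flow: a union large enough for any address family. */
union demo_flow {
	struct demo_flow_common common;
	struct demo_flow4 ip4;
	struct demo_flow6 ip6;
};

/* AF-independent code only touches the common prefix. */
static inline uint8_t demo_flow_proto(const union demo_flow *fl)
{
	return fl->common.proto;
}
```

In this sketch an IPv4-only code path can work with a struct demo_flow4, which is noticeably smaller than the union, while generic code still reads the shared members through the common prefix.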

It might have been troublesome to convert flow_cache_uli_match() but
as it turns out this function is completely unused and therefore can
be simply removed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:46 -08:00
David S. Miller
6281dcc94a net: Make flowi ports AF dependent.
Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*

This will let us create AF optimal flow instances.

It will work because in every context in which we access the ports,
we already have to be fully aware of which AF the flowi belongs to.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:46 -08:00
David S. Miller
08704bcbf0 net: Create union flowi_uli
This will be used when we have separate flowi types.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:45 -08:00
David S. Miller
806566cc78 net: Create struct flowi_common
Pull out the AF independent members of struct flowi into a
new struct flowi_common

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:45 -08:00
David S. Miller
1d28f42c1b net: Put flowi_* prefix on AF independent members of struct flowi
I intend to turn struct flowi into a union of AF specific flowi
structs.  There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:44 -08:00
David S. Miller
fbef0a4091 net: Remove unnecessary padding in struct flowi
Move tos, scope, proto, and flags to the beginning of
the structure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:43 -08:00
David S. Miller
78fbfd8a65 ipv4: Create and use route lookup helpers.
The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:42 -08:00
John W. Linville
38c091590f mac80211: implement support for cfg80211_ops->{get,set}_ringparam
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-11 15:34:10 -05:00
John W. Linville
3677713b79 wireless: add support for ethtool_ops->{get,set}_ringparam
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-11 14:16:58 -05:00
John W. Linville
409ec36c32 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-03-11 14:11:11 -05:00
David S. Miller
1b7fe59322 ipv4: Kill flowi arg to fib_select_multipath()
Completely unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 17:03:45 -08:00
Rémi Denis-Courmont
f7ae8d59f6 Phonet: allocate sock from accept syscall rather than soft IRQ
This moves most of the accept logic to process context like other
socket stacks do. Then we can use a few more common socket helpers
and simplify a bit.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:32 -08:00
David S. Miller
a7ac8fc1d8 ipv4: Fix scope value used in route src-address caching.
We have to use cfg->fc_scope not the final nh_scope value.

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-08 11:03:21 -08:00
David S. Miller
1fc050a134 ipv4: Cache source address in nexthop entries.
When doing output route lookups, we have to select the source address
if the user has not specified an explicit one.

First, if the route has an explicit preferred source address
specified, then we use that.

Otherwise we search the route's outgoing interface for a suitable
address.

This search can be precomputed and cached at route insertion time.

The only missing part is that we have to refresh this precomputed
value any time addresses are added or removed from the interface, and
this is accomplished by fib_update_nh_saddrs().

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-07 20:54:48 -08:00
David S. Miller
5e2b61f784 ipv4: Remove flowi from struct rtable.
The only necessary parts are the src/dst addresses, the
interface indexes, the TOS, and the mark.

The rest is unnecessary bloat, which amounts to nearly
50 bytes on 64-bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-04 21:55:31 -08:00
David S. Miller
4157434c23 ipv4: Use passed-in protocol in ip_route_newports().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-04 21:31:48 -08:00
David S. Miller
d72751ede1 Merge branch 'for-davem' of ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-03-04 12:48:25 -08:00
John W. Linville
85a7045a90 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-03-04 14:10:40 -05:00
John W. Linville
a177584609 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-next-2.6 2011-03-04 13:59:44 -05:00
David S. Miller
0a0e9ae1bd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x.h
2011-03-03 21:27:42 -08:00
Eric Dumazet
d276055c4e net_sched: reduce fifo qdisc size
Because of various alignments [SLUB / qdisc], we use 512 bytes of
memory for one {p|b}fifo qdisc, instead of 256 bytes on 64bit arches and
192 bytes on 32bit ones.

Move the "u32 limit" inside "struct Qdisc" (no impact on other qdiscs)

Change qdisc_alloc(), first trying a regular allocation before an
oversized one.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03 11:10:02 -08:00
Shmulik Ravid
dc6ed1df5a dcbnl: add support for retrieving peer configuration - cee
This patch adds the support for retrieving the remote or peer DCBX
configuration via dcbnl for embedded DCBX stacks supporting the CEE DCBX
standard.

Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 21:58:55 -08:00
Shmulik Ravid
eed84713bc dcbnl: add support for retrieving peer configuration - ieee
These 2 patches add support for retrieving the remote or peer DCBX
configuration via dcbnl for embedded DCBX stacks. The peer configuration
is part of the DCBX MIB and is useful for debugging and diagnostics of
the overall DCB configuration. The first patch adds this support for the
IEEE 802.1Qaz standard; the second patch adds the same support for the
older CEE standard. Diff for v2 - the peer-app-info is CEE specific.

Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 21:58:54 -08:00
David S. Miller
5bfa787fb2 ipv4: ip_route_output_key() is better as an inline.
This avoids a stack frame at zero cost.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 14:56:30 -08:00
David S. Miller
b23dd4fe42 ipv4: Make output route lookup return rtable directly.
Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 14:31:35 -08:00
David S. Miller
452edd598f xfrm: Return dst directly from xfrm_lookup()
Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 13:27:41 -08:00
David S. Miller
3872b28408 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2011-03-02 11:30:24 -08:00
David S. Miller
2774c131b1 xfrm: Handle blackhole route creation via afinfo.
That way we don't have to potentially do this in every xfrm_lookup()
caller.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:59:04 -08:00
David S. Miller
69ead7afdf ipv6: Normalize arguments to ip6_dst_blackhole().
Return a dst pointer which is potentially error encoded.

Don't pass original dst pointer by reference, pass a struct net
instead of a socket, and elide the flow argument since it is
unnecessary.
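
An "error encoded" pointer return can be sketched in userspace as follows. The demo_* helpers are hypothetical re-creations of the general idiom (a small negative errno stored in the pointer value), not the kernel's own ERR_PTR()/IS_ERR()/PTR_ERR() macros, and demo_lookup() is an invented stand-in for a route lookup.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095  /* assumption: errno values fit in the top 4095 addresses */

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the negative errno value from an error-encoded pointer. */
static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if the pointer value lies in the range reserved for errors. */
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_dst { int id; };

static struct demo_dst demo_route = { .id = 1 };

/* Hypothetical lookup: returns the object itself, or an encoded error. */
static struct demo_dst *demo_lookup(int reachable)
{
	if (!reachable)
		return demo_err_ptr(-ENETUNREACH);
	return &demo_route;
}
```

A caller checks demo_is_err() on the result and, on failure, extracts the code with demo_ptr_err(), so no separate by-reference output argument is needed.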

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:45:33 -08:00
David S. Miller
80c0bc9e37 xfrm: Kill XFRM_LOOKUP_WAIT flag.
This can be determined from the flow flags instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:36:37 -08:00
David S. Miller
a1414715f0 ipv6: Change final dst lookup arg name to "can_sleep"
Since it indicates whether we are invoked from a sleepable
context or not.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:32:04 -08:00
David S. Miller
273447b352 ipv4: Kill can_sleep arg to ip_route_output_flow()
This boolean state is now available in the flow flags.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:27:04 -08:00
David S. Miller
5df65e5567 net: Add FLOWI_FLAG_CAN_SLEEP.
And set it in contexts where the route resolution can sleep.
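
As a rough sketch of the idea, a "can sleep" hint can travel inside the flow's flags instead of as a separate function argument. The names and the flag value below are hypothetical and are not taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLOW_FLAG_CAN_SLEEP 0x01u  /* hypothetical bit value */

struct demo_flow {
	uint8_t flags;
};

/* Set up a flow; callers in process context mark it as sleepable. */
static void demo_flow_init(struct demo_flow *fl, bool process_context)
{
	fl->flags = 0;
	if (process_context)
		fl->flags |= DEMO_FLOW_FLAG_CAN_SLEEP;
}

/* Lookup code reads the flag rather than taking a "can_sleep" parameter. */
static bool demo_flow_can_sleep(const struct demo_flow *fl)
{
	return (fl->flags & DEMO_FLOW_FLAG_CAN_SLEEP) != 0;
}
```

Keeping the hint in the flow means every function that already receives the flow gets the information without widening its argument list.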

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:22:19 -08:00
David S. Miller
420d44daa7 ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep"
Since that is what the current vague "flags" argument means.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:19:23 -08:00
David S. Miller
abdf7e7239 ipv4: Change final ip_route_connect() arg to boolean "can_sleep".
Since that's what the current vague "flags" thing means.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:15:24 -08:00
David S. Miller
68d0c6d34d ipv6: Consolidate route lookup sequences.
Route lookups follow a general pattern in the ipv6 code wherein
we first find the non-IPSEC route, potentially override the
flow destination address due to ipv6 options settings, and then
finally make an IPSEC search using either xfrm_lookup() or
__xfrm_lookup().

__xfrm_lookup() is used when we want to generate a blackhole route
if the key manager needs to resolve the IPSEC rules (in this case
-EREMOTE is returned and the original 'dst' is left unchanged).

Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC
resolution is necessary, we simply fail the lookup completely.

All of these cases are encapsulated into two routines,
ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow.  The latter of which
handles unconnected UDP datagram sockets.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 13:19:07 -08:00
Herbert Xu
f6b9664f8b udp: Switch to ip_finish_skb
This patch converts UDP to use the new ip_finish_skb API.  This
would then allow us to more easily use ip_make_skb which allows
UDP to run without a socket lock.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 12:35:03 -08:00
Herbert Xu
1c32c5ad6f inet: Add ip_make_skb and ip_finish_skb
This patch adds the helper ip_make_skb which is like ip_append_data
and ip_push_pending_frames all rolled into one, except that it does
not send the skb produced.  The sending part is carried out by
ip_send_skb, which the transport protocol can call after it has
tweaked the skb.

It is meant to be called in cases where corking is not used and should
have a one-to-one correspondence to sendmsg.

This patch also adds the helper ip_finish_skb which is meant to
replace ip_push_pending_frames when corking is required.
Previously the protocol stack would peek at the socket write
queue and add its header to the first packet.  With ip_finish_skb,
the protocol stack can directly operate on the final skb instead,
just like the non-corking case with ip_make_skb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 12:35:03 -08:00
Herbert Xu
1470ddf7f8 inet: Remove explicit write references to sk/inet in ip_append_data
In order to allow simultaneous calls to ip_append_data on the same
socket, it must not modify any shared state in sk or inet (other
than those that are designed to allow that such as atomic counters).

This patch abstracts out write references to sk and inet_sk in
ip_append_data and its friends so that we may use the underlying
code in parallel.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 12:35:02 -08:00
Felix Fietkau
c8dcfd8a04 cfg80211: add a field for the bitrate of the last rx data packet from a station
Also fix a typo in the STATION_INFO_TX_BITRATE description

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-01 13:48:21 -05:00
David S. Miller
e3dfa389fd xfrm: Pass const xfrm_mark to xfrm_mark_put().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-27 23:20:19 -08:00
David S. Miller
a70486f0e6 xfrm: Pass const xfrm_address_t objects to xfrm_state_lookup* and xfrm_find_acq.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-27 23:17:24 -08:00
David S. Miller
851586218f xfrm: Pass const arg to xfrm_alg_len and xfrm_alg_auth_len.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-27 23:07:02 -08:00
David S. Miller
6f2f19ed95 xfrm: Pass name as const to xfrm_*_get_byname().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-27 23:04:45 -08:00
Szymon Janc
4e51eae9cd Bluetooth: Move index to common header in management interface
Most mgmt commands and events are related to an hci adapter. Moving the index to
the common header makes it easy to use in command status when reporting errors.
For those not related to an adapter, use MGMT_INDEX_NONE (0xFFFF) as the index.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-27 16:56:41 -03:00
Johannes Berg
5f16a43617 mac80211: support direct offchannel TX offload
For devices supported by iwlwifi sometimes
off-channel transmissions need to be handled
by the device completely. To support this
mac80211 needs to pass the frame directly
to the driver and not through the TX path
as the driver needs the frame and channel
information at the same time.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-25 15:33:40 -05:00
Johannes Berg
7bb4568372 mac80211: make tx() operation return void
The return value of the tx operation is commonly
misused by drivers, leading to errors. All drivers
will drop frames if they fail to TX the frame, and
they must also properly manage the queues (if they
didn't, mac80211 would already warn).

Removing the ability for drivers to return a BUSY
value also allows significant cleanups of the TX
handling code in mac80211.

Note that this also fixes a bug in ath9k_htc, the
old "return -1" there was wrong.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Sedat Dilek <sedat.dilek@googlemail.com> [ath5k]
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> [rt2x00]
Acked-by: Larry Finger <Larry.Finger@lwfinger.net> [b43, rtl8187, rtlwifi]
Acked-by: Luciano Coelho <coelho@ti.com> [wl12xx]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-25 15:32:34 -05:00
Rémi Denis-Courmont
8f44fcc72a Phonet: fix flawed "SYN/ACK" logic
* Do not fail if the peer supports more or less than 3 algorithms.
* Ignore unknown congestion control algorithms instead of failing.
* Simplify congestion algorithm negotiation (largest is best).
* Do not use a static buffer.
* Fix off-by-two read overflow.
* Avoid extra memory copy (in addition to skb_copy_bits()).

The previous code really made no sense.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 11:19:37 -08:00
Rémi Denis-Courmont
0165d69bcb Phonet: don't bother with transaction IDs (especially for indications)
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 11:19:36 -08:00
Rémi Denis-Courmont
2feb61816f Phonet: remove redundant pep->pipe_state
sk->sk_state already contains the pipe state.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 11:19:36 -08:00
Rémi Denis-Courmont
14ba8faebc Phonet: use socket destination in pipe protocol
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 11:19:35 -08:00
Rémi Denis-Courmont
a8059512b1 Phonet: implement per-socket destination/peer address
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 11:19:35 -08:00
David S. Miller
1b0db64fb7 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-02-24 22:35:12 -08:00
Changli Gao
b552f7e3a9 ipvs: unify the formula to estimate the overhead of processing connections
lc and wlc use the same formula, but lblc and lblcr use another one. There
is no reason for using two different formulas for the lc variants.

The formula used by lc is used by all the lc variants in this patch.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Wensong Zhang <wensong@linux-vs.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-02-25 11:35:41 +09:00
David S. Miller
dca8b089c9 ipv4: Rearrange how ip_route_newports() gets port keys.
ip_route_newports() is the only place in the entire kernel that
cares about the port members in the routing cache entry's lookup
flow key.

Therefore the only reason we store an entire flow inside of the
struct rtentry is for this one special case.

Rewrite ip_route_newports() such that:

1) The caller passes in the original port values, so we don't need
   to use the rth->fl.fl_ip_{s,d}port values to remember them.

2) The lookup flow is constructed by hand instead of being copied
   from the routing cache entry's flow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-24 13:38:12 -08:00
Greg Kroah-Hartman
f227e08b71 Merge 2.6.38-rc6 into tty-next
This was to resolve a merge issue with drivers/char/Makefile and
drivers/tty/serial/68328serial.c

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-02-24 11:36:31 -08:00
David S. Miller
33765d0603 xfrm: Const'ify xfrm_address_t args to xfrm_state_find.
This required a const'ification in xfrm_init_tempstate() too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:08:47 -08:00
David S. Miller
f8848067ca xfrm: Const'ify ptr args to xfrm_state_*_check and xfrm_state_kern.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:45 -08:00
David S. Miller
21eddb5c1e xfrm: Const'ify xfrm_tmpl and xfrm_state args to xfrm_state_addr_cmp.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:45 -08:00
David S. Miller
63eb23f5d8 xfrm: Const'ify policy arg to xp_net.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:44 -08:00
David S. Miller
b4b7c0b389 xfrm: Const'ify selector args in xfrm_migrate paths.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:42 -08:00
David S. Miller
183cad1278 xfrm: Const'ify pointer args to km_migrate() and implementations.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:41 -08:00
David S. Miller
6cc329610f xfrm: Const'ify address argument to xfrm_addr_any()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:40 -08:00
David S. Miller
ff6acd1682 xfrm: Const'ify address arguments to xfrm_addr_cmp()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:39 -08:00
David S. Miller
5e6b930f21 xfrm: Const'ify address arguments to ->dst_lookup()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:38 -08:00
David S. Miller
200ce96e56 xfrm: Const'ify selector argument to xfrm_selector_match()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:38 -08:00
David S. Miller
19bd62441c xfrm: Const'ify tmpl and address arguments to ->init_temprop()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:37 -08:00
David S. Miller
214e005bc3 xfrm: Pass km_event pointers around as const when possible.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 23:07:37 -08:00
Eric Dumazet
9e924cf407 net_sched: long word align struct qdisc_skb_cb data
netem_skb_cb() does :

return (struct netem_skb_cb *)qdisc_skb_cb(skb)->data;

Unfortunately struct qdisc_skb_cb data is not long word aligned, so
access to psched_time_t time_to_send uses a non aligned access.
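
The alignment problem can be reproduced with two small structs. This is a sketch under assumptions: the demo_* types are hypothetical, and using a long array for the private area is one way to get long word alignment, not necessarily the exact change made by this commit.

```c
#include <stddef.h>

/* Private area declared as bytes: it starts right after pkt_len. */
struct demo_cb_bytes {
	unsigned int  pkt_len;
	unsigned char data[];
};

/* Private area declared as longs: the compiler pads so that data[]
 * starts on a boundary suitable for a long. */
struct demo_cb_longs {
	unsigned int pkt_len;
	long         data[];
};

/* Offset of the private area in each layout. */
static size_t demo_bytes_offset(void)
{
	return offsetof(struct demo_cb_bytes, data);
}

static size_t demo_longs_offset(void)
{
	return offsetof(struct demo_cb_longs, data);
}
```

On a typical 64-bit ABI the first offset is 4 and the second is 8, so a 64-bit timestamp stored at the start of data[] is only naturally aligned in the second layout (assuming the containing buffer is itself long word aligned).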

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 14:17:02 -08:00
Johannes Berg
6ebacbb79d mac80211: rename RX_FLAG_TSFT
The flag isn't very descriptive -- the intention
is that the driver provides a TSF timestamp at
the beginning of the MPDU -- make that clearer
by renaming the flag to RX_FLAG_MACTIME_MPDU.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-23 16:25:29 -05:00
David S. Miller
dee9f4bceb net: Make flow cache paths use a const struct flowi.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 18:44:31 -08:00
David S. Miller
0730b9a150 net: Mark flowi arg to flow_cache_uli_match() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 18:27:22 -08:00
David S. Miller
b520e9f616 xfrm: Mark flowi arg to xfrm_state_find() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 18:24:19 -08:00
David S. Miller
e1ad2ab2cf xfrm: Mark flowi arg to xfrm_selector_match() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 18:07:39 -08:00
David S. Miller
1744a8fe09 xfrm: Mark token args to addr_match() const.
Also, make it return a real bool.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 18:02:12 -08:00
David S. Miller
8f029de281 xfrm: Mark flowi arg to xfrm_type->reject() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 17:59:59 -08:00
David S. Miller
73e5ebb20f xfrm: Mark flowi arg to ->init_tempsel() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 17:51:44 -08:00
David S. Miller
0c7b3eefb4 xfrm: Mark flowi arg to ->fill_dst() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 17:48:57 -08:00
David S. Miller
05d8402576 xfrm: Mark flowi arg to ->get_tos() const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 17:47:10 -08:00
David S. Miller
e8a4e37716 xfrm: Mark flowi arg const in key extraction helpers.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 17:42:56 -08:00
John W. Linville
5db5e44cdc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-02-22 15:10:22 -05:00
Eric Dumazet
eaefd1105b net: add __rcu annotations to sk_wq and wq
Add proper RCU annotations/verbs to sk_wq and wq members

Fix __sctp_write_space() sk_sleep() abuse (and sock->wq access)

Fix sunrpc sk_sleep() abuse too

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:19:31 -08:00
Linus Lüssing
5ced133961 ipv6: Add IPv6 multicast address flag defines
This commit adds the missing IPv6 multicast address flag defines to
complement the already existing multicast address scope defines and to
be able to check these flags nicely in the future.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:27 -08:00
Changli Gao
731109e784 ipvs: use hlist instead of list
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-02-22 15:45:39 +09:00
Johan Hedberg
2a61169209 Bluetooth: Add mgmt_auth_failed event
To properly track bonding completion an event to indicate authentication
failure is needed. This event will be sent whenever an authentication
complete HCI event with a non-zero status comes. It will also be sent
when we're acting in acceptor role for SSP authentication in which case
the controller will send a Simple Pairing Complete event.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-21 17:22:44 -03:00
Johan Hedberg
ac56fb13c0 Bluetooth: Fix mgmt_pin_code_reply return parameters
The command complete event for mgmt_pin_code_reply &
mgmt_pin_code_neg_reply should have the adapter index, Bluetooth address
as well as the status.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-21 17:22:44 -03:00
Johan Hedberg
a5c296832b Bluetooth: Add management support for user confirmation request
This patch adds support for the user confirmation (numeric comparison)
Secure Simple Pairing authentication method.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-21 17:22:44 -03:00
Johan Hedberg
e9a416b5ce Bluetooth: Add mgmt_pair_device command
This patch adds a new mgmt_pair_device which can be used to initiate a
dedicated bonding procedure. Some extra callbacks are added to the
hci_conn struct so that the pairing code can get notified of the
completion of the procedure.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-21 17:22:43 -03:00
Shan Wei
089c34827e tcp: Remove debug macro of TCP_CHECK_TIMER
Now, TCP_CHECK_TIMER is not used for debugging, it does nothing.
And, it has been there for several years, maybe 6 years.

Remove it to keep code clearer.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-20 11:10:14 -08:00
David S. Miller
da935c66ba Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
	drivers/net/e1000e/netdev.c
	net/xfrm/xfrm_policy.c
2011-02-19 19:17:35 -08:00
David S. Miller
ece639caa3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-02-19 16:42:37 -08:00
David S. Miller
982721f391 ipv4: Use const'ify fib_result deep in the route call chains.
The only troublesome bit here is __mkroute_output which wants
to override res->fi and res->type, compute those in local
variables instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:54:42 -08:00
David S. Miller
b6bf3ca032 ipv4: Mark fib_combine_itag()'s 'res' arg as const.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:52:59 -08:00
David S. Miller
3c7bd1a140 net: Add initial_ref arg to dst_alloc().
This allows avoiding multiple writes to the initial __refcnt.

The simplest cases of wanting an initial reference of "1"
in ipv4 and ipv6 have been converted, the rest have been left
alone and kept at the existing "0".

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:44:00 -08:00
Alan Cox
6caa76b778 tty: now phase out the ioctl file pointer for good
Only oddities here are a couple of drivers that bogusly called the ldisc
helpers instead of returning -ENOIOCTLCMD. Fix the bug and the rest goes
away.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-02-17 11:59:56 -08:00
Alan Cox
20b9d17715 tiocmset: kill the file pointer argument
Doing tiocmget was such fun we should do tiocmset as well for the same
reasons

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-02-17 11:52:43 -08:00
Alan Cox
60b33c133c tiocmget: kill off the passing of the struct file
We don't actually need this and it causes problems for internal use of
this functionality. Currently there is a single use of the FILE * pointer.
That is the serial core which uses it to check tty_hung_up_p. However if
that is true then IO_ERROR is also already set so the check may be removed.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-02-17 11:47:33 -08:00
Szymon Janc
adc4266d87 Bluetooth: Fix some code style issues in hci_core.h
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-17 13:30:04 -03:00
Florian Westphal
d503b30bd6 netfilter: tproxy: do not assign timewait sockets to skb->sk
Assigning a socket in timewait state to skb->sk can trigger
kernel oops, e.g. in nfnetlink_log, which does:

if (skb->sk) {
        read_lock_bh(&skb->sk->sk_callback_lock);
        if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...

in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
is invalid.

Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
or xt_TPROXY must not assign a timewait socket to skb->sk.

This does the latter.

If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.

The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
listener socket.

Cc: Balazs Scheidler <bazsi@balabit.hu>
Cc: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-17 11:32:38 +01:00
Claudio Takahasi
2ce603ebe1 Bluetooth: Send LE Connection Update Command
If the new connection update parameters are accepted, the LE master
host sends the LE Connection Update Command to its controller, informing
it of the new requested parameters.

Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 20:13:21 -03:00
Ville Tervo
6bd32326cd Bluetooth: Use proper timer for hci command timeout
Use proper timer instead of hci command flow control to timeout
failed hci commands. Otherwise stack ends up sending commands
when flow control is used to block new commands.

2010-09-01 18:29:41.592132 < HCI Command: Remote Name Request (0x01|0x0019) plen 10
    bdaddr 00:16:CF:E1:C7:D7 mode 2 clkoffset 0x0000
2010-09-01 18:29:41.592681 > HCI Event: Command Status (0x0f) plen 4
    Remote Name Request (0x01|0x0019) status 0x00 ncmd 0
2010-09-01 18:29:51.022033 < HCI Command: Remote Name Request Cancel (0x01|0x001a) plen 6
    bdaddr 00:16:CF:E1:C7:D7

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:33:26 -03:00
Claudio Takahasi
de73115a7d Bluetooth: Add connection parameter update response
Implements L2CAP Connection Parameter Update Response defined in
the Bluetooth Core Specification, Volume 3, Part A, section 4.21.
Address the LE Connection Parameter Procedure initiated by the slave.

Connection Interval Minimum and Maximum have the same range: 6 to
3200. Time = N * 1.25ms. Minimum shall be less than or equal to Maximum.
The Slave Latency field shall have a value in the range of 0 to
((connSupervisionTimeout / connIntervalMax) - 1). Latency field shall
be less than 500. connSupervisionTimeout = Timeout Multiplier * 10 ms.
Multiplier field shall have a value in the range of 10 to 3200.
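
The range rules above can be written as a small standalone check. This is a sketch, not the code added by the patch: the function name is hypothetical, and the latency bound is computed as (multiplier * 8 / max) - 1 on the assumption that the supervision timeout (multiplier * 10 ms) and the maximum interval (max * 1.25 ms) are compared in the same time unit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validator for the four connection parameter fields. */
static bool demo_conn_param_ok(uint16_t min, uint16_t max,
			       uint16_t latency, uint16_t to_multiplier)
{
	uint32_t ratio;

	/* Interval range 6..3200, with min <= max. */
	if (min < 6 || max > 3200 || min > max)
		return false;

	/* Timeout multiplier range 10..3200. */
	if (to_multiplier < 10 || to_multiplier > 3200)
		return false;

	/* (multiplier * 10 ms) / (max * 1.25 ms) == multiplier * 8 / max */
	ratio = (uint32_t)to_multiplier * 8 / max;
	if (ratio == 0)
		return false;  /* the latency upper bound would be negative */

	/* Latency must be below 500 and at most ratio - 1. */
	if (latency >= 500 || latency > ratio - 1)
		return false;

	return true;
}
```

For example, with an interval of 24..40 and a multiplier of 42, the ratio is 8, so latencies 0 through 7 are accepted and 8 is rejected.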

Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:33:24 -03:00
Claudio Takahasi
3300d9a930 Bluetooth: Add LE signaling commands handling
This patch splits the L2CAP command handling function in order to
have a clear separation between the commands related to BR/EDR and
LE. Commands and responses in the LE signaling channel are not being
handled yet; a command reject is sent for all received requests. Bluetooth
Core Specification, Volume 3, Part A, section 4 defines the signaling
packets formats and allowed commands/responses over the LE signaling
channel.

Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:33:23 -03:00
Ville Tervo
aff2cae354 Bluetooth: Add SMP command structures
Add command structures for security manager protocol.

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:33:19 -03:00
Ville Tervo
b62f328b8f Bluetooth: Add server socket support for LE connection
Add support for LE server sockets.

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:33:02 -03:00
Ville Tervo
acd7d37085 Bluetooth: Add LE connection support to L2CAP
Add basic LE connection support to L2CAP. An LE
connection can be created by specifying the cid
in struct sockaddr_l2.

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:32:55 -03:00
Ville Tervo
6ed58ec520 Bluetooth: Use LE buffers for LE traffic
Bluetooth chips may have separate buffers for LE traffic.
This patch adds support for using the LE buffers provided by the chip.

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:32:51 -03:00
Ville Tervo
fcd89c09a5 Bluetooth: Add LE connect support
Bluetooth V4.0 adds support for Low Energy (LE) connections.
The specification introduces a new set of HCI commands to control LE
connections. This patch adds logic to create, cancel and disconnect
LE connections.

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:32:45 -03:00
Ville Tervo
63185f64ef Bluetooth: Add low energy commands and events
Add needed HCI command and event structs to
create LE connections.

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 16:32:15 -03:00
Patrick Schaaf
41ac51eeda ipvs: make "no destination available" message more informative
When IP_VS schedulers do not find a destination, they output a terse
"WLC: no destination available" message through kernel syslog, which I
can only make sense of because syslog puts them in a logfile
together with keepalived checker results.

This patch makes the output a bit more informative, by telling you which
virtual service failed to find a destination.

Example output:

kernel: [1539214.552233] IPVS: wlc: TCP 192.168.8.30:22 - no destination available
kernel: [1539299.674418] IPVS: wlc: FWM 22 0x00000016 - no destination available

I have tested the code for IPv4 and FWM services, as you can see from
the example; I do not have an IPv6 setup to test the third code path
with.

To avoid code duplication, I put a new function ip_vs_scheduler_err()
into ip_vs_sched.c, and use that from the schedulers instead of calling
IP_VS_ERR_RL directly.
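
A rough sketch of the message format shown in the example output follows. It is an illustration in Python, not the kernel function; scheduler_err_message and its parameters are hypothetical names, and only the two code paths whose output appears above (address/port and firewall mark) are modelled.

```python
def scheduler_err_message(sched_name, protocol=None, addr=None, port=None,
                          fwmark=None):
    """Build a 'no destination available' line naming the virtual service.

    Hypothetical helper; mirrors only the two example lines quoted above.
    """
    if fwmark is not None:
        # Firewall-mark service: mark printed in decimal and in hex.
        service = "FWM %d 0x%08X" % (fwmark, fwmark)
    else:
        # Address/port service, e.g. "TCP 192.168.8.30:22".
        service = "%s %s:%d" % (protocol, addr, port)
    return "IPVS: %s: %s - no destination available" % (sched_name, service)
```

Keeping the formatting in one helper is what lets every scheduler report the failing service without repeating the same branches.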

Signed-off-by: Patrick Schaaf <netdev@bof.de>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-02-16 14:53:33 +09:00
Gustavo F. Padovan
c531a12ae6 Bluetooth: remove l2cap_load() hack
l2cap_load() was added to trigger l2cap.ko module loading from the RFCOMM
and BNEP modules. Now that the L2CAP module is gone, we don't need it anymore.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-15 09:45:52 -03:00
Gustavo F. Padovan
642745184f Bluetooth: Merge L2CAP and SCO modules into bluetooth.ko
It doesn't actually make sense to have these modules built separately.
The L2CAP layer is needed by almost all Bluetooth protocols and profiles.
There isn't any real use case without having L2CAP loaded.
SCO is only essential for audio transfers, but it is so small that we can
have it loaded always in bluetooth.ko without problems.
If you really don't want it you can disable SCO in the kernel config.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-14 17:27:36 -03:00
David S. Miller
6431cbc25f inet: Create a mechanism for upward inetpeer propagation into routes.
If we didn't have a routing cache, we would not be able to properly
propagate certain kinds of dynamic path attributes, for example
PMTU information and redirects.

The reason is that if we didn't have a routing cache, then there would
be no way to look up all of the active cached routes hanging off of
sockets, tunnels, IPSEC bundles, etc.

Consider the case where we created a cached route, but no inetpeer
entry existed and also we were not asked to pre-COW the route metrics
and therefore did not force the creation of a new inetpeer entry.

If we later get a PMTU message, or a redirect, and store this
information in a new inetpeer entry, there is no way to teach that
cached route about the newly existing inetpeer entry.

The facilities implemented here handle this problem.

First we create a generation ID.  When we create a cached route of any
kind, we remember the generation ID at the time of attachment.  Any
time we force-create an inetpeer entry in response to new path
information, we bump that generation ID.

The dst_ops->check() callback is where the knowledge of this event
is propagated.  If the global generation ID does not equal the one
stored in the cached route, and the cached route has not attached
to an inetpeer yet, we look it up and attach if one is found.  Now
that we've updated the cached route's information, we update the
route's generation ID too.
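
The generation ID scheme described above can be modelled in a few lines. This is a toy sketch under stated assumptions, not the kernel data structures: PeerCache, CachedRoute and store_path_info are hypothetical names, and a plain dict stands in for the inetpeer tree.

```python
class PeerCache:
    """Toy stand-in for the inetpeer cache, keyed by destination address."""

    def __init__(self):
        self.genid = 0
        self.peers = {}

    def store_path_info(self, daddr, **info):
        # Force-creating an entry in response to new path information
        # bumps the global generation ID.
        if daddr not in self.peers:
            self.peers[daddr] = {}
            self.genid += 1
        self.peers[daddr].update(info)


class CachedRoute:
    def __init__(self, daddr, cache):
        self.daddr = daddr
        self.peer = cache.peers.get(daddr)  # may be None at creation time
        self.genid = cache.genid            # remembered at attachment time

    def check(self, cache):
        """Analogue of the dst_ops->check() step described above."""
        if self.genid != cache.genid:
            if self.peer is None:
                # Late attach: the entry may have been created after us.
                self.peer = cache.peers.get(self.daddr)
            self.genid = cache.genid
        return self.peer
```

A route created before any peer entry exists picks the entry up on its next check() once store_path_info() has bumped the generation ID; routes whose generation ID already matches skip the lookup.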

This clears the way for implementing PMTU and redirects directly in
the inetpeer cache.  There is absolutely no need to consult cached
route information in order to maintain this information.

At this point nothing bumps the inetpeer genids; that comes in the
later changes which handle PMTUs and redirects using inetpeers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10 13:33:41 -08:00
David S. Miller
ddd4aa424b inetpeer: Add redirect and PMTU discovery cached info.
Validity of the cached PMTU information is indicated by its
expiration value being non-zero, just as per dst->expires.

The scheme we will use is that we will remember the pre-ICMP value
held in the metrics or route entry, and then at expiration time
we will restore that value.

In this way PMTU expiration does not kill off the cached route as is
done currently.

Redirect information is permanent, or at least until another redirect
is received.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10 13:29:30 -08:00
David S. Miller
7a71ed899e inetpeer: Abstract address representation further.
Future changes will add caching information, and some of
these new elements will be addresses.

Since the family is implicit via the ->daddr.family member,
replicating the family in every address we store is entirely
redundant.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10 13:22:28 -08:00
David S. Miller
8d13a2a9fb net: Kill NETEVENT_PMTU_UPDATE.
Nobody actually does anything in response to the event,
so just kill it off.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 16:17:55 -08:00
David S. Miller
e7b66bdc02 net: Remove bogus barrier() in dst_allfrag().
I simply missed this one when modifying the other dst
metric interfaces earlier.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 15:33:22 -08:00
Nicolas Dichtel
fa9921e46f ipsec: allow to align IPv4 AH on 32 bits
The Linux IPv4 AH stack aligns the AH header on a 64 bit boundary
(like in IPv6). This is not RFC compliant (see RFC4302, Section
3.3.3.2.1), it should be aligned on 32 bits.

For most of the authentication algorithms, the ICV size is 96 bits.
The AH header alignment on 32 or 64 bits gives the same results.

However for SHA-256-128 for instance, the wrong 64 bit alignment results
in adding useless padding in IPv4 AH, which is forbidden by the RFC.
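
The arithmetic behind the two cases above can be worked through as follows. This is a sketch for illustration; ah_header_len is a hypothetical helper, and it assumes the 12-byte fixed part of the AH header from RFC 4302 (Next Header, Payload Len, Reserved, SPI, Sequence Number).

```python
AH_FIXED_LEN = 12  # bytes before the ICV, per RFC 4302


def ah_header_len(icv_bits, align_bytes):
    """Length of the AH header after rounding up to align_bytes."""
    raw = AH_FIXED_LEN + icv_bits // 8
    return (raw + align_bytes - 1) // align_bytes * align_bytes


# 96-bit ICV: 24 bytes either way, so the alignment choice is invisible.
same = ah_header_len(96, 4) == ah_header_len(96, 8) == 24
# 128-bit ICV (e.g. SHA-256-128): 28 bytes on 32 bits, 32 bytes on 64 bits.
extra_padding = ah_header_len(128, 8) - ah_header_len(128, 4)  # 4 bytes
```

The 4 extra bytes in the last line are the padding the commit message calls useless.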

To avoid breaking backward compatibility, we use a new flag
(XFRM_STATE_ALIGN4) to change the original behavior.

Initial patch from Dang Hongwu <hongwu.dang@6wind.com> and
Christophe Gouault <christophe.gouault@6wind.com>.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 14:00:40 -08:00
David S. Miller
c0c84ef5c1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-02-08 13:52:31 -08:00
Gustavo F. Padovan
6de0702b5b Bluetooth: move __l2cap_sock_close() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:46:02 -02:00
Gustavo F. Padovan
fd83ccdb39 Bluetooth: move l2cap_sock_sendmsg() to l2cap_sock.c
Also moves the declarations of some L2CAP sending functions to l2cap.h

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:31 -02:00
Gustavo F. Padovan
dcba0dba54 Bluetooth: move l2cap_sock_shutdown() to l2cap_sock.c
Declare __l2cap_wait_ack() and l2cap_sock_clear_timer() in l2cap.h

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:31 -02:00
Gustavo F. Padovan
6898325923 Bluetooth: move l2cap_sock_recvmsg() to l2cap_sock.c
This also moves the declarations of three functions to l2cap.h:
l2cap_get_ident(), l2cap_send_cmd(), l2cap_build_conf_req()

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:31 -02:00
Gustavo F. Padovan
4e34c50bfe Bluetooth: move l2cap_sock_connect() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:31 -02:00
Gustavo F. Padovan
99f4808db0 Bluetooth: move l2cap_sock_getsockopt() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:31 -02:00
Gustavo F. Padovan
33575df7be Bluetooth: move l2cap_sock_setsockopt() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:31 -02:00
Gustavo F. Padovan
d7175d5525 Bluetooth: move l2cap_sock_getname() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:30 -02:00
Gustavo F. Padovan
c47b7c724b Bluetooth: move l2cap_sock_accept() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:30 -02:00
Gustavo F. Padovan
af6bcd8205 Bluetooth: move l2cap_sock_bind()/listen() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:30 -02:00
Gustavo F. Padovan
554f05bb8a Bluetooth: move l2cap_sock_release() to l2cap_sock.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:30 -02:00
Gustavo F. Padovan
65390587c7 Bluetooth: move l2cap_sock_ops to l2cap_sock.c
First step to move all l2cap_sock_ops functions to l2cap_sock.c

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:30 -02:00
Gustavo F. Padovan
bb58f747e5 Bluetooth: Initial work for L2CAP split.
This patch does the minimum needed to move l2cap_sock_create() and its
dependencies to l2cap_sock.c. It creates an API to initialize and clean up
the L2CAP sockets from l2cap_core.c through l2cap_init_sockets() and
l2cap_cleanup_sockets().

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:43:30 -02:00
Johan Hedberg
17fa4b9dff Bluetooth: Add set_io_capability management command
This patch adds a new set_io_capability management command which is used
to set the IO capability for Secure Simple Pairing (SSP) as well as the
Security Manager Protocol (SMP). The value is per hci_dev and each
hci_conn object inherits it upon creation.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:08 -02:00
Johan Hedberg
980e1a537f Bluetooth: Add support for PIN code handling in the management interface
This patch adds the necessary commands and events needed to communicate
PIN code related actions between the kernel and userspace. This includes
a pin_code_request event as well as pin_code_reply and
pin_code_negative_reply commands.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:07 -02:00
Johan Hedberg
2784eb41b1 Bluetooth: Add get_connections managment interface command
This patch adds a get_connections command to the management interface.
With this command userspace can get the current list of connected
devices. Typically this command would only be used once when enumerating
existing adapters. After that the connected and disconnected events are
used to track connections.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:07 -02:00
Johan Hedberg
17d5c04cb5 Bluetooth: Add support for connect failed management event
This patch adds a new connect failed management event to track failures
in connecting to remote devices. It is particularly useful for security
mode 3 scenarios when we don't have a connected state while pairing but
still need to detect when the connect attempt failed.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:07 -02:00
Johan Hedberg
8962ee74be Bluetooth: Add disconnect managment command
This patch adds a disconnect command to the management interface. Using
this command user space is able to force the disconnection of connected
devices. The command maps directly to the Disconnect HCI command.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:07 -02:00
Johan Hedberg
f7520543ab Bluetooth: Add connected/disconnected management events
This patch adds connected and disconnected management events to track the
connection status to remote devices. The events map directly to
successful connection complete and disconnection complete HCI events for
ACL links.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:07 -02:00
Johan Hedberg
55ed8ca10f Bluetooth: Implement link key handling for the management interface
This patch adds management commands to feed the kernel with all stored
link keys as well as remove specific ones or all of them. Once the
load_keys command has been called the kernel takes over link key
replies. A new_key event is also added to inform userspace of newly
created link keys that should be stored permanently.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:07 -02:00
Johan Hedberg
1aff6f0949 Bluetooth: Add class of device control to the management interface
This patch adds the possibility for user space to fully control the
Class of Device value of local adapters. To control the service class
bits each UUID that's added comes with a service class "hint" which acts
as a mask of bits that the UUID needs to have enabled. The
set_service_cache management command is used to make sure we queue up
all UUID changes as user space initializes its drivers and then send a
single HCI_Write_Class_of_Device command when initialization is
complete.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:06 -02:00
Johan Hedberg
d5859e22cd Bluetooth: Implement a more complete adapter initialization sequence
Using the management interface means that user space doesn't need to do
any HCI command sending at all. This patch moves the remaining
initialization commands from user space to the kernel side. The patch
makes use of the new feature of __hci_request which allows the request
to be dynamically modified while it is ongoing (something that is needed
to react appropriately to the local features and the version of the
adapter).

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:06 -02:00
Johan Hedberg
b0916ea0d9 Bluetooth: Add controller side link key clearing to hci_init_req
The controller may have link keys in its own memory and these keys could
be used for secure connections. However, since the interface to access
these keys doesn't provide information about the key types (which would
be needed to infer the level of security each key provides) using these
keys is rather useless. Therefore, simply clear the controller side list
in the initialization procedure.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:06 -02:00
Johan Hedberg
a5040efa20 Bluetooth: Add special handling with __hci_request and HCI_INIT
To support a more dynamic HCI initialization sequence the __hci_request
behavior requires some more changes. Particularly, the init sequence
should be able to have conditionals in it (sending some HCI commands
depending on the outcome of a previous command) instead of being a fixed
list as it is right now.

The reasons for these additional requirements are the move of all
previously user space driven initialization commands to the kernel side
as well as support for Low Energy controllers.

To fulfill these requirements the init sequence is made the only special
case for multi-command requests and req_last_cmd is renamed to
init_last_cmd. The hci_send_cmd function is changed to update
init_last_cmd as long as the HCI_INIT flag is set.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:06 -02:00
Johan Hedberg
03b555e119 Bluetooth: Reject pairing requests when in non-pairable mode
This patch adds the necessary logic to act accordingly when the
HCI_PAIRABLE flag is not set. In that case PIN code replies as well as
Secure Simple Pairing requests without a NoBonding requirement need to
be rejected.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:06 -02:00
Johan Hedberg
2aeb9a1ae0 Bluetooth: Implement UUID handling through the management interface
This patch adds methods to the management interface for userspace to
notify the kernel of which services have been registered for specific
adapters. This information is needed for setting the appropriate Class
of Device value as well as the Extended Inquiry Response value. This
patch doesn't actually implement setting of these values but just
provides the storage of the UUIDs so the needed functionality can be
built on top of it.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:05 -02:00
Johan Hedberg
c542a06c29 Bluetooth: Implement set_pairable managment command
This patch implements a new set_pairable management command to control
the pairable state of local adapters. The state is represented using a
new HCI_PAIRABLE flag in the hci_dev struct.

For backwards compatibility with older user space versions the
HCI_PAIRABLE flag gets automatically set when the existence of an
adapter is reported to user space through legacy methods and the
HCI_MGMT flag is not set.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:05 -02:00
Johan Hedberg
ebc99feba7 Bluetooth: Add flag to track managment controlled adapters
This patch adds a HCI_MGMT flag to track adapters which are under the
control of the management interface. This is needed to make sure that
new kernels will work with old user space versions. I.e. behaviour which
could break old user space versions (but is needed by the management
interface) should not be exhibited when the HCI_MGMT flag is not set.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:05 -02:00
Johan Hedberg
72a734ec1a Bluetooth: Unify mode related management messages to a single struct
The powered, connectable and discoverable messages all have the same
format. By using a single struct for all of them a lot of code can be
simplified and reused.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:05 -02:00
Johan Hedberg
9fbcbb455d Bluetooth: Add set_connectable management command
This patch adds a set_connectable command as well as a corresponding
event to the management interface. It's mainly useful for setting an
adapter as connectable from a non-initialized state as well as setting
an already initialized adapter as non-connectable (mostly useful for
qualification purposes).

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:05 -02:00
Johan Hedberg
73f22f6238 Bluetooth: Add support for set_discoverable management command
This patch adds a set_discoverable command to the management interface
as well as the corresponding event. The command is used to control the
discoverable state of adapters.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:04 -02:00
Johan Hedberg
eec8d2bcc8 Bluetooth: Add support for set_powered management command
This patch adds a set_powered command to the management interface
through which the powered state of local adapters can be controlled.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:04 -02:00
Johan Hedberg
5add6af8fc Bluetooth: Add support for management powered event
This patch adds support for the powered event that's used to indicate to
userspace when the powered state of a local adapter changes.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:04 -02:00
Johan Hedberg
ab81cbf99c Bluetooth: Implement automatic setup procedure for local adapters
This patch implements automatic initialization of basic information
about newly registered Bluetooth adapters. E.g. the address and features
are always needed so it makes sense for the kernel to automatically
power on adapters and read this information. A new HCI_SETUP flag is
added to track this state.

In order to not consume unnecessary amounts of power if there isn't a
user space available that could switch the adapter back off, a timer is
added to do this automatically as long as no Bluetooth user space seems
to be present. A new HCI_AUTO_OFF flag is added that user space needs to
clear to avoid the automatic power off.

Additionally, the management interface index_added event is moved to the
end of the HCI_SETUP stage so a user space supporting the management
interface has all the necessary information available for fetching when
it gets notified of a new adapter. The HCI_DEV_REG event is kept in the
same place as before since existing HCI raw socket based user space
versions depend on seeing the kernel's initialization sequence
(hci_init_req) to determine when the adapter is ready for use.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:04 -02:00
Andrei Emeltchenko
e702112ff6 Bluetooth: Use non-flushable by default L2CAP data packets
Modification of Nick Pelly <npelly@google.com> patch.

With Bluetooth 2.1 ACL packets can be flushable or non-flushable. This commit
makes ACL data packets non-flushable by default on compatible chipsets, and
adds the BT_FLUSHABLE socket option to explicitly request flushable ACL
data packets for a given L2CAP socket. This is useful for A2DP data which can
be safely discarded if it can not be delivered within a short time (while
other ACL data should not be discarded).

Note that making ACL data flushable has no effect unless the automatic flush
timeout for that ACL link is changed from its default of 0 (infinite).

Default packet types (for compatible chipsets):
Frame 34: 13 bytes on wire (104 bits), 13 bytes captured (104 bits)
Bluetooth HCI H4
Bluetooth HCI ACL Packet
    .... 0000 0000 0010 = Connection Handle: 0x0002
    ..00 .... .... .... = PB Flag: First Non-automatically Flushable Packet (0)
    00.. .... .... .... = BC Flag: Point-To-Point (0)
    Data Total Length: 8
Bluetooth L2CAP Packet

After setting BT_FLUSHABLE
(sock.setsockopt(274 /*SOL_BLUETOOTH*/, 8 /* BT_FLUSHABLE */, 1 /* flush */))
Frame 34: 13 bytes on wire (104 bits), 13 bytes captured (104 bits)
Bluetooth HCI H4
Bluetooth HCI ACL Packet
    .... 0000 0000 0010 = Connection Handle: 0x0002
    ..10 .... .... .... = PB Flag: First Automatically Flushable Packet (2)
    00.. .... .... .... = BC Flag: Point-To-Point (0)
    Data Total Length: 8
Bluetooth L2CAP Packet
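
The only difference between the two dumps is the 2-bit PB flag packed into the 16-bit handle field of the ACL header. A minimal decoding sketch follows; decode_acl_handle_field is a hypothetical helper name, and the bit positions are taken from the dumps above (12-bit connection handle, then PB flag, then BC flag).

```python
def decode_acl_handle_field(value):
    """Split the 16-bit ACL handle field into (handle, pb_flag, bc_flag)."""
    handle = value & 0x0FFF         # low 12 bits: connection handle
    pb_flag = (value >> 12) & 0x3   # next 2 bits: packet boundary flag
    bc_flag = (value >> 14) & 0x3   # top 2 bits: broadcast flag
    return handle, pb_flag, bc_flag


# Default (non-flushable) packet from the first dump: PB flag 0.
default_pkt = decode_acl_handle_field(0x0002)    # (2, 0, 0)
# After setting BT_FLUSHABLE, as in the second dump: PB flag 2.
flushable_pkt = decode_acl_handle_field(0x2002)  # (2, 2, 0)
```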

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-08 01:40:04 -02:00
David S. Miller
7eb38527c4 tcp: Add reference to initial CWND ietf draft.
Suggested by Alexander Zimmermann

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-05 18:13:45 -08:00
David S. Miller
92d8682926 inetpeer: Move ICMP rate limiting state into inet_peer entries.
Like metrics, the ICMP rate limiting bits are cached state about
a destination.  So move it into the inet_peer entries.

If an inet_peer cannot be bound (the reason is memory allocation
failure or similar), the policy is to allow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-04 15:59:53 -08:00
David S. Miller
bd4a6974cc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-02-04 14:28:58 -08:00
Julia Lawall
38db9e1db1 include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument
nlmsg_cancel can accept NULL as its second argument, so for similarity,
this patch extends genlmsg_cancel to be able to accept a NULL second
argument as well.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-03 20:47:08 -08:00
Jouni Malinen
681d119047 mac80211: Add testing functionality for TKIP
TKIP countermeasures depend on devices being able to detect Michael
MIC failures on received frames and for stations to report errors to
the AP. In order to test that behavior, it is useful to be able to
send out TKIP frames with incorrect Michael MIC. This testing behavior
has minimal effect on the TX path, so it can be added to mac80211 for
convenient use.

The interface for using this functionality is a file in mac80211
netdev debugfs (tkip_mic_test). Writing a MAC address to the file
makes mac80211 generate a dummy data frame that will be sent out using
invalid Michael MIC value. In AP mode, the address needs to be for one
of the associated stations or ff:ff:ff:ff:ff:ff to use a broadcast
frame. In station mode, the address can be anything, e.g., the current
BSSID. It should be noted that this functionality works correctly only
when associated and using TKIP.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-03 16:45:29 -05:00
Arik Nemtsov
d057e5a381 mac80211: add HW flag for disabling auto link-PS in AP mode
When operating in AP mode the wl1271 hardware filters out null-data
packets as well as management packets. This makes it impossible for
mac80211 to monitor the PS mode by using the PM bit of incoming frames.

Implement a HW flag to indicate that mac80211 should ignore the PM bit.
In addition, expose ieee80211_sta_ps_transition() to make low-level
drivers capable of controlling PS-mode.

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-03 16:44:44 -05:00
David S. Miller
fd95240568 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-02-03 13:06:43 -08:00
David S. Miller
442b9635c5 tcp: Increase the initial congestion window to 10.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Nandita Dukkipati <nanditad@google.com>
2011-02-02 20:48:47 -08:00
David S. Miller
0bc0be7f20 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-02-02 15:52:23 -08:00
David S. Miller
8fe73503fa Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2011-02-02 15:24:48 -08:00
David S. Miller
5348ba85a0 ipv4: Update some fib_hash centric interface names.
fib_hash_init() --> fib_trie_init()
fib_hash_table() --> fib_trie_table()

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-01 15:35:25 -08:00
Simon Horman
a13676476e IPVS: Remove unused variables
These variables are unused as a result of the recent netns work.

Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Tested-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 18:27:51 +01:00
Pablo Neira Ayuso
3db7e93d33 netfilter: ecache: always set events bits, filter them later
For the following rule:

iptables -I PREROUTING -t raw -j CT --ctevents assured

The event delivered looks like the following:

 [UPDATE] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]

Note that the TCP protocol state is not included. For that reason
the CT event filtering is not very useful for conntrackd.

To resolve this issue, instead of conditionally setting the CT events
bits based on the ctmask, we always set them and perform the filtering
in the late stage, just before the delivery.

Thus, the event delivered looks like the following:

 [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
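
The "set always, filter late" idea can be sketched as below. This is a simplified model under stated assumptions, not the ecache code: the Conntrack class, its method names and the EV_* bit values are hypothetical, and the sketch assumes that a delivery carries every cached event bit once at least one bit matches ctmask.

```python
EV_ASSURED = 1 << 0     # illustrative bit values, not the kernel's
EV_PROTOINFO = 1 << 1


class Conntrack:
    def __init__(self, ctmask):
        self.ctmask = ctmask   # events the user asked for (--ctevents)
        self.pending = 0       # cached event bits awaiting delivery

    def event_cache(self, event_bit):
        # Always record the event; ctmask is not consulted here.
        self.pending |= event_bit

    def deliver(self):
        events, self.pending = self.pending, 0
        # Filter at the last moment, just before delivery.
        if not (events & self.ctmask):
            return None
        return events
```

With ctmask set to EV_ASSURED, a delivery triggered by the assured event still carries the protocol-info bit, which corresponds to the TCP state appearing in the second example line; a protocol-info change on its own is filtered out.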

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 16:06:30 +01:00
Jozsef Kadlecsik
f703651ef8 netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros
The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the
vanilla kernel.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 15:20:14 +01:00
David S. Miller
0c838ff1ad ipv4: Consolidate all default route selection implementations.
Both fib_trie and fib_hash have a local implementation of
fib_table_select_default().  This is completely unnecessary
code duplication.

Since we now remember the fib_table and the head of the fib
alias list of the default route, we can implement one single
generic version of this routine.

Looking at the fib_hash implementation you may get the impression
that it's possible for there to be multiple top-level routes in
the table for the default route.  The truth is, it isn't, the
insert code will only allow one entry to exist in the zero
prefix hash table, because all keys evaluate to zero and all
keys in a hash table must be unique.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31 16:16:50 -08:00
David S. Miller
5b4704419c ipv4: Remember FIB alias list head and table in lookup results.
This will be used later to implement fib_select_default() in a
completely generic manner, instead of the current situation where the
default route is re-looked up in the TRIE/HASH table and then the
available aliases are analyzed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31 16:10:03 -08:00
David S. Miller
5403c8a295 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-01-31 13:13:24 -08:00
Eric W. Biederman
709b46e8d9 net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT
SIOCGETSGCNT is not a unique ioctl value as it maps to SIOCPROTOPRIVATE + 1,
which unfortunately means the existing infrastructure for compat networking
ioctls is insufficient.  A trivial compat ioctl implementation would conflict
with:

SIOCAX25ADDUID
SIOCAIPXPRISLT
SIOCGETSGCNT_IN6
SIOCGETSGCNT
SIOCRSSCAUSE
SIOCX25SSUBSCRIP
SIOCX25SDTEFACILITIES

To make this work I have updated the compat_ioctl decode path to mirror the
normal ioctl decode path.  I have added an ipv4 inet_compat_ioctl function
so that I can have ipv4 specific compat ioctls.  I have added a compat_ioctl
function into struct proto so I can break out ioctls by which kind of ip socket
I am using.  I have added a compat_raw_ioctl function because SIOCGETSGCNT only
works on raw sockets.  I have added an ipmr_compat_ioctl that mirrors the normal
ipmr_ioctl.

This was necessary because unfortunately the struct layout for SIOCGETSGCNT
has unsigned longs in it, so it changes between 32bit and 64bit kernels.

This change was sufficient to run a 32bit ip multicast routing daemon on a
64bit kernel.
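
The layout problem can be reproduced in user space. The following is only a sketch under stated assumptions: the two struct names and the helper `sg_req_to_compat()` are hypothetical and are not the identifiers used by the patch. It shows a request whose counters are `unsigned long` next to the fixed-width layout a 32bit process would expect, and a field-by-field copy between them.

```c
#include <stdint.h>

/* Hypothetical native layout: counters are unsigned long, so the struct
 * size depends on whether the kernel is built for 32 or 64 bits. */
struct sg_req_native {
	uint32_t src;            /* IPv4 source address */
	uint32_t grp;            /* IPv4 group address */
	unsigned long pktcnt;
	unsigned long bytecnt;
	unsigned long wrong_if;
};

/* Hypothetical layout seen by a 32bit caller: every field is 32 bits. */
struct sg_req_compat32 {
	uint32_t src;
	uint32_t grp;
	uint32_t pktcnt;
	uint32_t bytecnt;
	uint32_t wrong_if;
};

/* Copy results into the 32bit layout one field at a time; a plain
 * memcpy() would be wrong whenever the two layouts differ. */
static void sg_req_to_compat(struct sg_req_compat32 *dst,
			     const struct sg_req_native *src)
{
	dst->src = src->src;
	dst->grp = src->grp;
	dst->pktcnt = (uint32_t)src->pktcnt;     /* truncates on 64bit */
	dst->bytecnt = (uint32_t)src->bytecnt;
	dst->wrong_if = (uint32_t)src->wrong_if;
}
```

On a 64bit build the native struct is larger than the 32bit one, which is why a compat handler has to translate rather than pass the buffer through.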

Reported-by: Bill Fenner <fenner@aristanetworks.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-30 01:14:38 -08:00
David S. Miller
725d1e1b45 ipv4: Attach FIB info to dst_default_metrics when possible
If there are no explicit metrics attached to a route, hook
fi->fib_info up to dst_default_metrics.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-28 14:05:05 -08:00
David S. Miller
9c150e82ac ipv4: Allocate fib metrics dynamically.
This is the initial gateway towards super-sharing metrics
if they are all set to zero for a route.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-28 14:01:25 -08:00
John W. Linville
3e11210d46 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
Conflicts:
	drivers/net/wireless/ath/ath9k/init.c
2011-01-28 16:23:14 -05:00
Johannes Berg
6d744bacee mac80211: add MCS information to radiotap
This adds the MCS information we currently get
from the drivers into radiotap.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-28 15:44:29 -05:00
David S. Miller
a4daad6b09 net: Pre-COW metrics for TCP.
TCP is going to record metrics for the connection,
so pre-COW the route metrics at route cache entry
creation time.

This avoids several atomic operations that have to
occur if we COW the metrics after the entry reaches
global visibility.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27 22:01:53 -08:00
David S. Miller
8571a19c4a Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-01-27 16:00:37 -08:00
David S. Miller
144001bddc inetpeer: Mark metrics as "new" in fresh inetpeer entries.
Set the RTAX_LOCKED metric to INETPEER_METRICS_NEW (basically,
all ones) on fresh inetpeer entries.

This way code can determine if default metrics have been loaded
in from a routing table entry already.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27 13:52:16 -08:00
David S. Miller
606598237c inetpeer: Add metrics storage to inetpeer entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27 13:48:26 -08:00
David S. Miller
62fa8a846d net: Implement read-only protection and COW'ing of metrics.
Routing metrics are now copy-on-write.

Initially a route entry points its metrics at a read-only location.
If a routing table entry exists, it will point there.  Else it will
point at the all zero metric place-holder called 'dst_default_metrics'.

The writeability state of the metrics is stored in the low bits of the
metrics pointer, we have two bits left to spare if we want to store
more states.

For the initial implementation, COW is implemented simply via kmalloc.
However future enhancements will change this to place the writable
metrics somewhere else, in order to increase sharing.  Very likely
this "somewhere else" will be the inetpeer cache.

Note also that this means that metrics updates may transiently fail
if we cannot COW the metrics successfully.
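
The scheme described so far can be sketched in user-space C. This is an illustration under assumptions, not the kernel code: `NUM_METRICS`, `struct route` and all function names are hypothetical, `malloc()` stands in for kmalloc, and only one low bit of the pointer is used as the read-only marker.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_METRICS 4             /* hypothetical; stands in for RTAX_MAX */
#define METRICS_READ_ONLY 0x1UL   /* state kept in the pointer's low bit */

/* Shared all-zero defaults, playing the role of dst_default_metrics. */
static const uint32_t default_metrics[NUM_METRICS];

struct route {
	uintptr_t metrics;        /* pointer value plus state bit */
};

/* Strip the state bit to recover the real array address. */
static uint32_t *metrics_ptr(const struct route *r)
{
	return (uint32_t *)(r->metrics & ~METRICS_READ_ONLY);
}

static void route_init(struct route *r)
{
	r->metrics = (uintptr_t)default_metrics | METRICS_READ_ONLY;
}

/* Return a writable array, copying on first write; NULL if the copy fails. */
static uint32_t *metrics_write_ptr(struct route *r)
{
	if (r->metrics & METRICS_READ_ONLY) {
		uint32_t *copy = malloc(sizeof(default_metrics));

		if (!copy)
			return NULL;
		memcpy(copy, metrics_ptr(r), sizeof(default_metrics));
		r->metrics = (uintptr_t)copy;   /* low bit clear: writable */
	}
	return metrics_ptr(r);
}

/* An update can fail transiently when the private copy cannot be made. */
static int metric_set(struct route *r, int idx, uint32_t val)
{
	uint32_t *p = metrics_write_ptr(r);

	if (!p)
		return -1;
	p[idx] = val;
	return 0;
}
```

Routes that never write keep pointing at the shared array, which is where the memory and cache-locality savings would come from.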

But even by itself, this patch should decrease memory usage and
increase cache locality especially for routing workloads.  In those
cases the read-only metric copies stay in place and never get written
to.

TCP workloads where metrics get updated, and those rare cases where
PMTU triggers occur, will take a very slight performance hit.  But
that hit will be alleviated when the long-term writable metrics
move to a more sharable location.

Since the metrics storage went from a u32 array of RTAX_MAX entries to
what is essentially a pointer, some retooling of the dst_entry layout
was necessary.

Most importantly, we need to preserve the alignment of the reference
count so that it doesn't share cache lines with the read-mostly state,
as per Eric Dumazet's alignment assertion checks.

The only non-trivial bit here is the move of the 'flags' member into
the writeable cacheline.  This is OK since we are always accessing the
flags around the same moment when we made a modification to the
reference count.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-26 20:51:05 -08:00
David S. Miller
b4e69ac670 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-01-26 13:49:30 -08:00
David S. Miller
9b6941d8b1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-01-26 11:49:49 -08:00
Michał Mirosław
04ed3e741d net: change netdev->features to u32
Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.

Occurrences found by `grep -r` on net/, drivers/net, include/

[ Move features and vlan_features next to each other in
  struct netdev, as per Eric Dumazet's suggestion -DaveM ]

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 15:32:47 -08:00
David S. Miller
5bdc22a565 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/sched/sch_hfsc.c
	net/sched/sch_htb.c
	net/sched/sch_tbf.c
2011-01-24 14:09:35 -08:00
David S. Miller
e92427b289 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2011-01-24 13:17:06 -08:00
Bruno Randolf
59eb21a650 cfg80211: Extend channel to frequency mapping for 802.11j
Extend channel to frequency mapping for 802.11j Japan 4.9GHz band, according to
IEEE802.11 section 17.3.8.3.2 and Annex J. Because there are now overlapping
channel numbers in the 2GHz and 5GHz band we can't map from channel to
frequency without knowing the band. This is no problem as in most contexts we
know the band. In places where we don't know the band (and WEXT compatibility)
we assume the 2GHz band for channels below 14.
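
A band-aware mapping along these lines can be sketched as follows. The enum and function names are hypothetical, and the channel range 182..196 for the 4.9GHz band is an assumption about the 802.11j channel set rather than something stated in this message.

```c
/* Hypothetical band identifiers for the sketch. */
enum band { BAND_2GHZ, BAND_5GHZ };

/* Map a channel number to a centre frequency in MHz; 0 means unsupported.
 * The same channel number can map to different frequencies per band. */
static int channel_to_frequency(int chan, enum band band)
{
	if (band == BAND_5GHZ) {
		/* Assumed 802.11j 4.9GHz range: counted up from 4000 MHz. */
		if (chan >= 182 && chan <= 196)
			return 4000 + chan * 5;
		return 5000 + chan * 5;
	}

	/* 2GHz band: channel 14 is a special case. */
	if (chan == 14)
		return 2484;
	if (chan >= 1 && chan < 14)
		return 2407 + chan * 5;
	return 0;
}
```

With this shape, channel 8 resolves to 2447 MHz in the 2GHz band and to 5040 MHz in the 5GHz band, which is the ambiguity the extra band argument removes.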

This patch does not implement all channel to frequency mappings defined in
802.11, it's just an extension for 802.11j 20MHz channels. 5MHz and 10MHz
channels as well as 802.11y channels have been omitted.

The following drivers have been updated to reflect the API changes:
iwl-3945, iwl-agn, iwmc3200wifi, libertas, mwl8k, rt2x00, wl1251, wl12xx.
The drivers have been compile-tested only.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: Brian Prodoehl <bprodoehl@gmail.com>
Acked-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-21 15:34:17 -05:00
Eric Dumazet
9190b3b320 net_sched: accurate bytes/packets stats/rates
In commit 44b8288308 (net_sched: pfifo_head_drop problem), we fixed
a problem with pfifo_head drops that incorrectly decreased
sch->bstats.bytes and sch->bstats.packets

Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
previously enqueued packet, and bstats cannot be changed, so
bstats/rates are not accurate (overestimated).

This patch changes the qdisc_bstats updates to be done at dequeue() time
instead of enqueue() time. bstats counters no longer account for dropped
frames, and rates are more correct, since enqueue() bursts don't have
an effect on the dequeue() rate.
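
The accounting change can be illustrated with a toy queue. This is a sketch only: `struct toy_qdisc` and its helpers are hypothetical and share nothing with the real qdisc API beyond the idea of counting at dequeue time.

```c
#include <stddef.h>
#include <stdint.h>

#define QLEN 8                    /* hypothetical ring size */

struct toy_qdisc {
	uint32_t len[QLEN];       /* queued packet lengths (ring buffer) */
	size_t head, count;
	uint64_t bytes, packets;  /* plays the role of bstats */
};

/* Enqueue does not touch the byte/packet counters. */
static int toy_enqueue(struct toy_qdisc *q, uint32_t pkt_len)
{
	if (q->count == QLEN)
		return -1;
	q->len[(q->head + q->count) % QLEN] = pkt_len;
	q->count++;
	return 0;
}

/* Dropping an already queued packet needs no counter correction. */
static void toy_drop_head(struct toy_qdisc *q)
{
	if (q->count) {
		q->head = (q->head + 1) % QLEN;
		q->count--;
	}
}

/* Counters are updated only for packets that actually leave the queue. */
static uint32_t toy_dequeue(struct toy_qdisc *q)
{
	uint32_t len;

	if (!q->count)
		return 0;
	len = q->len[q->head];
	q->head = (q->head + 1) % QLEN;
	q->count--;
	q->bytes += len;
	q->packets++;
	return len;
}
```

Because the counters move to the dequeue side, a packet that is enqueued and later dropped never shows up in them.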

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20 23:31:33 -08:00
Eric Dumazet
a2da570d62 net_sched: RCU conversion of stab
This patch converts stab qdisc management to RCU, so that we can perform
the qdisc_calculate_pkt_len() call before getting qdisc lock.

This shortens the lock's held time in __dev_xmit_skb().

This permits more qdiscs to get TCQ_F_CAN_BYPASS status, avoiding a lot of
cache misses and so reducing latencies.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jesper Dangaard Brouer <hawk@diku.dk>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20 16:59:32 -08:00
Eric Dumazet
fd245a4adb net_sched: move TCQ_F_THROTTLED flag
In commit 3711210576 (net: QDISC_STATE_RUNNING dont need atomic bit
ops) I moved QDISC_STATE_RUNNING flag to __state container, located in
the cache line containing qdisc lock and often dirtied fields.

I now move the TCQ_F_THROTTLED bit too, so that the first cache line stays
read-mostly and shared by all cpus. This should speed up HTB/CBQ for example.

Not using test_bit()/__clear_bit()/__test_and_set_bit() allows using an
"unsigned int" for the __state container, reducing Qdisc size by 8 bytes.

Introduce helpers to hide implementation details.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jesper Dangaard Brouer <hawk@diku.dk>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20 16:59:32 -08:00
Patrick McHardy
2f1e317672 netfilter: nf_conntrack: fix linker error with NF_CONNTRACK_TIMESTAMP=n
net/built-in.o: In function `nf_conntrack_init_net':
net/netfilter/nf_conntrack_core.c:1521:
	undefined reference to `nf_conntrack_tstamp_init'
net/netfilter/nf_conntrack_core.c:1531:
	undefined reference to `nf_conntrack_tstamp_fini'

Add dummy inline functions for the =n case to fix this.
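
The pattern looks roughly like the sketch below. `FEATURE_TSTAMP` is a hypothetical macro standing in for the Kconfig option, and the function names are illustrative rather than the ones in the patch.

```c
/* Leave FEATURE_TSTAMP undefined to model the =n configuration. */

#ifdef FEATURE_TSTAMP
/* Real implementations live in a separately compiled object file. */
int tstamp_init(void);
void tstamp_fini(void);
#else
/* Stubs: callers compile and link unchanged when the feature is off. */
static inline int tstamp_init(void)
{
	return 0;
}

static inline void tstamp_fini(void)
{
}
#endif

/* The caller needs no #ifdef of its own. */
static int core_init(void)
{
	int ret = tstamp_init();

	if (ret < 0)
		return ret;
	return 0;
}

static void core_fini(void)
{
	tstamp_fini();
}
```

Keeping the conditional in the header means the call sites stay free of preprocessor checks.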

Reported-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-20 20:46:52 +01:00
David S. Miller
a07aa004c8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2011-01-20 00:06:15 -08:00
Linus Torvalds
1268afe676 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits)
  sctp: user perfect name for Delayed SACK Timer option
  net: fix can_checksum_protocol() arguments swap
  Revert "netlink: test for all flags of the NLM_F_DUMP composite"
  gianfar: Fix misleading indentation in startup_gfar()
  net/irda/sh_irda: return to RX mode when TX error
  net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan.
  USB CDC NCM: tx_fixup() race condition fix
  ns83820: Avoid bad pointer deref in ns83820_init_one().
  ipv6: Silence privacy extensions initialization
  bnx2x: Update bnx2x version to 1.62.00-4
  bnx2x: Fix AER setting for BCM57712
  bnx2x: Fix BCM84823 LED behavior
  bnx2x: Mark full duplex on some external PHYs
  bnx2x: Fix BCM8073/BCM8727 microcode loading
  bnx2x: LED fix for BCM8727 over BCM57712
  bnx2x: Common init will be executed only once after POR
  bnx2x: Swap BCM8073 PHY polarity if required
  iwlwifi: fix valid chain reading from EEPROM
  ath5k: fix locking in tx_complete_poll_work
  ath9k_hw: do PA offset calibration only on longcal interval
  ...
2011-01-19 20:25:45 -08:00
Shan Wei
4580ccc04d sctp: user perfect name for Delayed SACK Timer option
The option name of Delayed SACK Timer should be SCTP_DELAYED_SACK,
not SCTP_DELAYED_ACK.

Keep SCTP_DELAYED_ACK alongside SCTP_DELAYED_SACK for
compatibility with existing applications.

Reference:
8.1.19.  Get or Set Delayed SACK Timer (SCTP_DELAYED_SACK)
(http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-25)

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19 16:51:29 -08:00
Patrick McHardy
14f0290ba4 Merge branch 'master' of /repos/git/net-next-2.6 2011-01-19 23:51:37 +01:00
Johan Hedberg
765c2a964b Bluetooth: Fix race condition with conn->sec_level
The conn->sec_level value is supposed to represent the current level of
security that the connection has. However, by assigning to it before
requesting authentication it will have the wrong value during the
authentication procedure. To fix this a pending_sec_level variable is
added which is used to track the desired security level while making
sure that sec_level always represents the current level of security.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-01-19 14:43:11 -02:00
Johannes Berg
5dd36bc933 mac80211: allow advertising correct maximum aggregate size
Currently, mac80211 always advertises that it may send
up to 64 subframes in an aggregate. This is fine, since
it's the max, but might as well be set to zero instead
since it doesn't have any information.

However, drivers might have that information, so allow
them to set a variable giving it, which will then be
used. The default of zero will be fine since to the
peer that means we don't know and it will just use its
own limit for the buffer size.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-19 11:36:12 -05:00
Johannes Berg
0b01f030d3 mac80211: track receiver's aggregation reorder buffer size
The aggregation code currently doesn't implement the
buffer size negotiation. It will always request a max
buffer size (which is fine, if a little pointless, as
the mac80211 code doesn't know and might just use 0
instead), but if the peer requests a smaller size it
isn't possible to honour this request.

In order to fix this, look at the buffer size in the
addBA response frame, keep track of it and pass it to
the driver in the ampdu_action callback when called
with the IEEE80211_AMPDU_TX_OPERATIONAL action. That
way the driver can limit the number of subframes in
aggregates appropriately.

Note that this doesn't fix any drivers apart from the
addition of the new argument -- they all need to be
updated separately to use this variable!

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-19 11:36:11 -05:00
Luciano Coelho
df6ba5d80d mac80211: add hw configuration for max ampdu buffer size
Some devices don't support the maximum AMPDU buffer size of 64, so we
need to add an option to configure this in the hardware configuration.
This value will be used in the ADDBA response instead of the value
suggested in the request, if the latter is greater than the max
supported.

Signed-off-by: Luciano Coelho <coelho@ti.com>
Tested-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-19 11:36:09 -05:00
Pablo Neira Ayuso
a992ca2a04 netfilter: nf_conntrack_tstamp: add flow-based timestamp extension
This patch adds flow-based timestamping for conntracks. Basically, we use
two 64-bit variables: one to store the creation timestamp once the
conntrack has been confirmed, and the other to store the deletion
time. This extension is disabled by default; to enable it, you
have to:

echo 1 > /proc/sys/net/netfilter/nf_conntrack_timestamp

This patch saves memory for user-space flow-based
loggers such as ulogd2. In short, ulogd2 does not need to
keep a hashtable with the conntracks in user-space to know
when they were created and destroyed; instead we use the
kernel timestamp. If we want to have a sane IPFIX implementation
in user-space, these nanosecond-resolution timestamps are also
useful. Other custom user-space applications can benefit from
this via libnetfilter_conntrack.

This patch modifies the /proc output to display the delta time
in seconds since the flow start. You can also obtain the
flow-start date by means of the conntrack-tools.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-19 16:00:07 +01:00
Eric Dumazet
80f8f1027b net: filter: dont block softirqs in sk_run_filter()
Packet filter (BPF) doesn't need to disable softirqs, being fully
re-entrant and lock-less.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-18 21:33:05 -08:00
Jiri Olsa
93557f53e1 netfilter: nf_conntrack: nf_conntrack snmp helper
Add support for SNMP broadcast connection tracking. The SNMP
broadcast requests are now paired with the SNMP responses,
which allows SNMP broadcasts to be used with a firewall enabled.

Please refer to the following conversation:
http://marc.info/?l=netfilter-devel&m=125992205006600&w=2

Patrick McHardy wrote:
> > The best solution would be to add generic broadcast tracking, the
> > use of expectations for this is a bit of abuse.
> > The second best choice I guess would be to move the help() function
> > to a shared module and generalize it so it can be used for both.
This patch implements the "second best choice".

Since the netbios-ns conntrack module uses the same helper
functionality as the snmp, only one helper function is added
for both snmp and netbios-ns modules into the new object -
nf_conntrack_broadcast.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-18 18:12:24 +01:00
Changli Gao
a7c2f4d7da netfilter: nf_nat: fix conversion to non-atomic bit ops
My previous patch (netfilter: nf_nat: don't use atomic bit operation)
made a mistake when converting atomic_set to a normal bit 'or'.
IPS_*_BIT should be replaced with IPS_*.
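
The difference between a bit number and a bit mask is easy to show in isolation. The flag names and the bit position below are hypothetical; they only mirror the `*_BIT` versus mask naming convention mentioned above.

```c
enum {
	FLAG_SEQ_ADJUST_BIT = 6,                       /* bit position */
	FLAG_SEQ_ADJUST = (1 << FLAG_SEQ_ADJUST_BIT),  /* mask, value 64 */
};

/* Wrong: ORs in the bit *number* (6 == 0b110), setting bits 1 and 2. */
static unsigned long set_flag_wrong(unsigned long status)
{
	return status | FLAG_SEQ_ADJUST_BIT;
}

/* Right: ORs in the mask, setting exactly bit 6. */
static unsigned long set_flag_right(unsigned long status)
{
	return status | FLAG_SEQ_ADJUST;
}
```

Bit-number constants are what set_bit()-style helpers take, so a conversion to a plain `|=` has to switch to the mask constant at the same time.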

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Cc: Tim Gardner <tim.gardner@canonical.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-18 15:02:48 +01:00
Linus Torvalds
d018b6f4f1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  GRETH: resolve SMP issues and other problems
  GRETH: handle frame error interrupts
  GRETH: avoid writing bad speed/duplex when setting transfer mode
  GRETH: fixed skb buffer memory leak on frame errors
  GRETH: GBit transmit descriptor handling optimization
  GRETH: fix opening/closing
  GRETH: added raw AMBA vendor/device number to match against.
  cassini: Fix build bustage on x86.
  e1000e: consistent use of Rx/Tx vs. RX/TX/rx/tx in comments/logs
  e1000e: update Copyright for 2011
  e1000: Avoid unhandled IRQ
  r8169: keep firmware in memory.
  netdev: tilepro: Use is_unicast_ether_addr helper
  etherdevice.h: Add is_unicast_ether_addr function
  ks8695net: Use default implementation of ethtool_ops::get_link
  ks8695net: Disable non-working ethtool operations
  USB CDC NCM: Don't deref NULL in cdc_ncm_rx_fixup() and don't use uninitialized variable.
  vxge: Remember to release firmware after upgrading firmware
  netdev: bfin_mac: Remove is_multicast_ether_addr use in netdev_for_each_mc_addr
  ipsec: update MAX_AH_AUTH_LEN to support sha512
  ...
2011-01-14 13:25:30 -08:00
Patrick McHardy
d862a6622e netfilter: nf_conntrack: use is_vmalloc_addr()
Use is_vmalloc_addr() in nf_ct_free_hashtable() and get rid of
the vmalloc flags to indicate that a hash table has been allocated
using vmalloc().

Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-14 15:45:56 +01:00
Patrick McHardy
0134e89c7b Merge branch 'master' of git://1984.lsi.us.es/net-next-2.6
Conflicts:
	net/ipv4/route.c

Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-14 14:12:37 +01:00
Patrick McHardy
c7066f70d9 netfilter: fix Kconfig dependencies
Fix dependencies of netfilter realm match: it depends on NET_CLS_ROUTE,
which itself depends on NET_SCHED; this dependency is missing from netfilter.

Since matching on realms is also useful without having NET_SCHED enabled and
the option really only controls whether the tclassid member is included in
route and dst entries, rename the config option to IP_ROUTE_CLASSID and move
it outside of traffic scheduling context to get rid of the NET_SCHED dependency.

Reported-by: Vladis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-14 13:36:42 +01:00
Nicolas Dichtel
78d0736946 ipsec: update MAX_AH_AUTH_LEN to support sha512
icv_truncbits is set to 256 for sha512, so update
MAX_AH_AUTH_LEN to 64.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-13 21:48:25 -08:00
Linus Torvalds
008d23e485 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  Documentation/trace/events.txt: Remove obsolete sched_signal_send.
  writeback: fix global_dirty_limits comment runtime -> real-time
  ppc: fix comment typo singal -> signal
  drivers: fix comment typo diable -> disable.
  m68k: fix comment typo diable -> disable.
  wireless: comment typo fix diable -> disable.
  media: comment typo fix diable -> disable.
  remove doc for obsolete dynamic-printk kernel-parameter
  remove extraneous 'is' from Documentation/iostats.txt
  Fix spelling milisec -> ms in snd_ps3 module parameter description
  Fix spelling mistakes in comments
  Revert conflicting V4L changes
  i7core_edac: fix typos in comments
  mm/rmap.c: fix comment
  sound, ca0106: Fix assignment to 'channel'.
  hrtimer: fix a typo in comment
  init/Kconfig: fix typo
  anon_inodes: fix wrong function name in comment
  fix comment typos concerning "consistent"
  poll: fix a typo in comment
  ...

Fix up trivial conflicts in:
 - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
 - fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13 10:05:56 -08:00
stephen hemminger
838b4dc6d8 sched: remove unused backlog in RED stats
The RED statistics structure includes backlog field which is not
set or used by any code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-12 19:00:39 -08:00
David S. Miller
464143c911 Merge branch 'master' of git://1984.lsi.us.es/net-2.6 2011-01-12 18:58:40 -08:00
David S. Miller
bb1231052e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-01-12 18:52:31 -08:00
Hans Schillstrom
763f8d0ed4 IPVS: netns, svc counters moved in ip_vs_ctl,c
Last two global vars to be moved,
ip_vs_ftpsvc_counter and ip_vs_nullsvc_counter.

[horms@verge.net.au: removed whitespace-change-only hunk]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
f2431e6e92 IPVS: netns, trash handling
trash list per namespace,
and reordering of some params in dst struct.

[ horms@verge.net.au: Use cancel_delayed_work_sync() instead of
	              cancel_rearming_delayed_work(). Found during
		      merge conflict resolution ]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
f6340ee0c6 IPVS: netns, defense work timer.
This patch makes defense work timer per name-space,
A net ptr had to be added to the ipvs struct,
since it's needed by defense_work_handler.

[ horms@verge.net.au: Use cancel_delayed_work_sync() instead of
	              cancel_rearming_delayed_work(). Found during
		      merge conflict resolution ]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
a0840e2e16 IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.
Moving global vars to ipvs struct, except for svc table lock.
Next patch for ctl will be drop-rate handling.

*v3
__ip_vs_mutex remains global
 ip_vs_conntrack_enabled(struct netns_ipvs *ipvs)

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
6e67e586e7 IPVS: netns, connection hash got net as param.
Connection hash table is now name space aware.
i.e. net ptr >> 8 is xor:ed to the hash,
and this is the first param to be compared.
The net struct is 0xa40 in size ( a little bit smaller for 32 bit arch:s)
and cache-line aligned, so a ptr >> 5 might be a more clever solution ?
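
As a rough user-space sketch of the idea: `struct net_ns`, `CONN_TAB_BITS` and both functions are hypothetical stand-ins, and `flow_hash` represents whatever hash the connection tuple already produces.

```c
#include <stdint.h>

#define CONN_TAB_BITS 12          /* hypothetical table size: 4096 buckets */
#define CONN_TAB_MASK ((1u << CONN_TAB_BITS) - 1)

struct net_ns {                   /* hypothetical stand-in for struct net */
	int dummy;
};

/* Mix the name space pointer, shifted right by 8, into the bucket index. */
static uint32_t conn_hash(const struct net_ns *net, uint32_t flow_hash)
{
	uint32_t ns_bits = (uint32_t)((uintptr_t)net >> 8);

	return (flow_hash ^ ns_bits) & CONN_TAB_MASK;
}

/* Lookup comparison: both the flow and the name space must match. */
static int conn_match(const struct net_ns *a_net, uint32_t a_flow,
		      const struct net_ns *b_net, uint32_t b_flow)
{
	return a_flow == b_flow && a_net == b_net;
}
```

The shift discards low pointer bits that are identical for all aligned objects; how many bits to drop is the open question raised above.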

All lookups where net is compared use net_eq() which returns 1 when netns
is disabled, and the compiler seems to do something clever in that case.

ip_vs_conn_fill_param() has *net as first param now.

Three new inlines added to keep conn struct smaller
when name space is disabled.
- ip_vs_conn_net()
- ip_vs_conn_net_set()
- ip_vs_conn_net_eq()

*v3
  moved net compare to the end in "fast path"

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
b17fc9963f IPVS: netns, ip_vs_stats and its procfs
The statistic counter locks for every packet are now removed,
and that statistic is now per CPU, i.e. no locks needed.
However summing is made in ip_vs_est into ip_vs_stats struct
which is moved to ipvs struct.
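
The per-CPU idea can be sketched in plain C with an array indexed by CPU number. Everything here is hypothetical (names, the fixed CPU count), and the sketch is single-threaded: it leaves out whatever protection a real implementation needs when 64-bit counters are read from another CPU.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4          /* hypothetical fixed CPU count */

struct cpu_stats {
	uint64_t inpkts;
	uint64_t inbytes;
};

struct stats_set {
	struct cpu_stats percpu[NR_CPUS_SKETCH];  /* one slot per CPU */
	struct cpu_stats total;                   /* filled in by stats_sum() */
};

/* Packet path: each CPU only touches its own slot, so no lock is taken. */
static void stats_in(struct stats_set *s, int cpu, uint32_t len)
{
	s->percpu[cpu].inpkts++;
	s->percpu[cpu].inbytes += len;
}

/* Slow path: fold all per-CPU slots into the grand total. */
static void stats_sum(struct stats_set *s)
{
	int cpu;

	s->total.inpkts = 0;
	s->total.inbytes = 0;
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		s->total.inpkts += s->percpu[cpu].inpkts;
		s->total.inbytes += s->percpu[cpu].inbytes;
	}
}
```

The cost of summing moves to the reader, which runs far less often than the per-packet update.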

procfs, ip_vs_stats now have a "per cpu" count and a grand total.
A new function seq_file_single_net() in ip_vs.h created for handling of
single_open_net() since it does not place net ptr in a struct, like others.

/var/lib/lxc # cat /proc/net/ip_vs_stats_percpu
       Total Incoming Outgoing         Incoming         Outgoing
CPU    Conns  Packets  Packets            Bytes            Bytes
  0        0        3        1               9D               34
  1        0        1        2               49               70
  2        0        1        2               34               76
  3        1        2        2               70               74
  ~        1        7        7              18A              18E

     Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
           0        0        0                0                0

*v3
ip_vs_stats remains as before, instead ip_vs_stats_percpu is added.
u64 seq lock added

*v4
Bug correction inbytes and outbytes as own vars..
per_cpu counter for all stats now as suggested by Julian.

[horms@verge.net.au: removed whitespace-change-only hunk]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
f131315fa2 IPVS: netns awareness to ip_vs_sync
All global variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)
in sync_buf create  + 4 replaced by sizeof(struct..)

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
29c2026fd4 IPVS: netns awareness to ip_vs_est
All variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)

*v3
 timer per ns instead of a common timer in estimator.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
ab8a5e8408 IPVS: netns awareness to ip_vs_app
All variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)

in ip_vs_protocol param struct net *net added to:
 - register_app()
 - unregister_app()
This affected almost all proto_xxx.c files

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:28 +09:00
Hans Schillstrom
9bbac6a904 IPVS: netns, common protocol changes and use of appcnt.
appcnt and timeout_table moved from struct ip_vs_protocol to
ip_vs proto_data.

struct net *net added as first param to
 - register_app()
 - unregister_app()
 - app_conn_bind()
 - ip_vs_conn_new()

[horms@verge.net.au: removed cosmetic-change-only hunk]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
9330419d9a IPVS: netns, use ip_vs_proto_data as param.
ip_vs_protocol *pp is replaced by ip_vs_proto_data *pd in
function call in ip_vs_protocol struct i.e. :,
 - timeout_change()
 - state_transition()

ip_vs_protocol_timeout_change() got ipvs as param, due to above
and an upcoming patch - defence work

Most of these changes are triggered by Julian's comment:
"tcp_timeout_change should work with the new struct ip_vs_proto_data
        so that tcp_state_table will go to pd->state_table
        and set_tcp_state will get pd instead of pp"

*v3
Mostly comments from Julian
The pp -> pd conversion should start from functions like
ip_vs_out() that use pp = ip_vs_proto_get(iph.protocol),
now they should use ip_vs_proto_data_get(net, iph.protocol).
conn_in_get() and conn_out_get() unused param *pp, removed.

*v4
ip_vs_protocol_timeout_change() walk the proto_data path.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
9d934878e7 IPVS: netns preparation for proto_sctp
In this phase (one), all local vars will be moved to ipvs struct.

Remaining work, add param struct net *net to a couple of
functions that are common for all protos and use ip_vs_proto_data

*v3
 Removed unused function set_state_timeout()

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
78b16bde10 IPVS: netns preparation for proto_udp
In this phase (one), all local vars will be moved to ipvs struct.

Remaining work, add param struct net *net to a couple of
functions that are common for all protos and use ip_vs_proto_data

*v3
Removed unused function set_state_timeout()

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
4a85b96c08 IPVS: netns preparation for proto_tcp
In this phase (one), all local vars will be moved to ipvs struct.

Remaining work, add param struct net *net to a couple of
functions that are common for all protos and use all
ip_vs_proto_data

*v3
Removed unused function as suggested by Simon

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
252c641032 IPVS: netns, prepare protocol
Add support for protocol data per name-space.
in struct ip_vs_protocol, appcnt will be removed when all protos
are modified for network name-space.

This patch causes warnings of unused functions, they will be used
when next patch will be applied.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
b6e885ddb9 IPVS: netns awarness to lblc sheduler
var sysctl_ip_vs_lblc_expiration moved to ipvs struct as
    sysctl_lblc_expiration

procfs updated to handle this.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
d0a1eef9c3 IPVS: netns awarness to lblcr sheduler
var sysctl_ip_vs_lblcr_expiration moved to ipvs struct as
    sysctl_lblcr_expiration

procfs updated to handle this.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:27 +09:00
Hans Schillstrom
fc723250c9 IPVS: netns to services part 1
Services hash tables got the netns ptr as a hash arg,
while Real Servers (rs) have been moved to ipvs struct.
Two new inline functions added to get net ptr from skb.

Since ip_vs is called from different contexts there are two
places to dig for the net ptr: skb->dev or skb->sk.
This is handled in skb_net() and skb_sknet().

Global functions such as ip_vs_service_get() and
ip_vs_lookup_real_service() now take struct net *net as first param.
Where possible the net ptr is taken from the skb;
if not, &init_net is used at this early stage of patching.

ip_vs_ctl.c  procfs not ready for netns yet.

*v3
 Comments by Julian
- __ip_vs_service_find and __ip_vs_svc_fwm_find are fast path,
  net_eq(svc->net, net) so the check is at the end now.
- net = skb_net(skb) in ip_vs_out moved after check for skb_dst.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:26 +09:00
Hans Schillstrom
61b1ab4583 IPVS: netns, add basic init per netns.
Preparation for network name-space init; at this stage
some empty functions exist.

In most files there is a check if it is root ns i.e. init_net
if (!net_eq(net, &init_net))
        return ...
this will be removed by the last patch, when enabling name-space.

*v3
 ip_vs_conn.c merge error corrected.
 net_ipvs #ifdef removed as suggested by Jan Engelhardt

[ horms@verge.net.au: Removed whitespace-change-only hunks ]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-13 10:30:26 +09:00
Simon Horman
fee1cc0895 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 into HEAD 2011-01-13 10:29:21 +09:00
KOVACS Krisztian
2fc72c7b84 netfilter: fix compilation when conntrack is disabled but tproxy is enabled
The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but
failed to update the #ifdef stanzas guarding the defragmentation related
fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c.

This patch adds the required #ifdefs so that IPv6 tproxy can truly be used
without connection tracking.

Original report:
http://marc.info/?l=linux-netdev&m=129010118516341&w=2

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-12 20:25:08 +01:00
Nicolas Dichtel
e44f391187 ah: update maximum truncated ICV length
For SHA256, RFC4868 requires to truncate ICV length to 128 bits,
hence MAX_AH_AUTH_LEN should be updated to 16.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-11 14:03:10 -08:00
Maxim Levitsky
545ecdc3b3 arp: allow to invalidate specific ARP entries
IPv4 over firewire needs to be able to remove ARP entries
from the ARP cache that belong to nodes that are removed, because
IPv4 over firewire uses ARP packets for private information
about nodes.

This information becomes invalid as soon as a node drops
off the bus, and when it reconnects, it's only possible
to start talking to it after it has responded to an ARP packet.
But the ARP cache prevents such packets from being sent.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10 16:10:37 -08:00
Eric Dumazet
bfe0d0298f net_sched: factorize qdisc stats handling
HTB takes into account skb is segmented in stats updates.
Generalize this to all schedulers.

They should use qdisc_bstats_update() helper instead of manipulating
bstats.bytes and bstats.packets

Add bstats_update() helper too for classes that use
gnet_stats_basic_packed fields.
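
As a rough userspace illustration of the accounting described above
(not the kernel code itself): struct pkt and struct basic_stats are
hypothetical stand-ins for sk_buff and gnet_stats_basic_packed, and
stats_update() is an assumed name. The point is only that a segmented
packet counts as one packet per segment.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the
 * fields needed for the illustration are modelled. */
struct pkt {
	uint32_t len;      /* total bytes carried by the packet */
	uint16_t gso_segs; /* number of segments, 0 if not segmented */
};

struct basic_stats {
	uint64_t bytes;
	uint32_t packets;
};

/* Add one packet to the counters, counting each segment of a
 * segmented packet as a separate packet. */
static void stats_update(struct basic_stats *st, const struct pkt *p)
{
	st->bytes += p->len;
	st->packets += p->gso_segs ? p->gso_segs : 1;
}
```

Keeping this in one helper means each scheduler does not have to
remember the segmentation case on its own.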

Note: Right now, the TCQ_F_CAN_BYPASS shortcut can be taken only if no
stab is set up on the qdisc.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10 16:07:54 -08:00
Dan Carpenter
facb4edc1e phonet: some signedness bugs
Dan Rosenberg pointed out that there were some signed comparison bugs
in the phonet protocol.

http://marc.info/?l=full-disclosure&m=129424528425330&w=2

The problem is that we check for array overflows but "protocol" is
signed and we don't check for array underflows.  If you already
have CAP_SYS_ADMIN then you could use the bugs to get root, or someone
could cause an oops by mistake.
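
A minimal sketch of this class of bug, assuming a hypothetical
four-entry table (proto_table, PROTO_MAX and the function names are
made up for illustration and are not the phonet code). A signed index
that is only compared against the upper bound lets negative values
through; comparing as unsigned rejects them.

```c
#include <stddef.h>

#define PROTO_MAX 4 /* hypothetical table size */

static const char *const proto_table[PROTO_MAX] = {
	"p0", "p1", "p2", "p3"
};

/* Buggy check: only guards against overflow, so -1 passes. */
static int index_ok_buggy(int protocol)
{
	return protocol < PROTO_MAX;
}

/* Fixed check: a negative value converted to unsigned becomes a
 * huge number and fails the same comparison. */
static int index_ok_fixed(unsigned int protocol)
{
	return protocol < PROTO_MAX;
}

/* Lookup that uses the fixed check before indexing. */
static const char *proto_lookup(int protocol)
{
	if (!index_ok_fixed((unsigned int)protocol))
		return NULL;
	return proto_table[protocol];
}
```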

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10 13:33:17 -08:00
Johannes Berg
f52555a4b2 cfg80211: add mesh join/leave callback docs
When I made the patch to add mesh join/leave I
didn't pay attention to docs because it was a
proof of concept, and then when we actually did
merge it I forgot -- add docs now.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-10 15:40:52 -05:00
Johannes Berg
610dbc980f mac80211: add missing docs for off-chan TX flag
The flag is IEEE80211_TX_CTL_TX_OFFCHAN and I had
added that in a previous patch but forgotten docs.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-10 15:40:52 -05:00
Johannes Berg
4976b4eb9d mac80211: add remain-on-channel docs
Add documentation for the new callbacks that I
forgot in the patch adding the callbacks.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-10 15:40:51 -05:00
Randy Dunlap
928c41e7a1 net/sock.h: make some fields private to fix kernel-doc warning(s)
Fix new kernel-doc notation warning in sock.h by annotating skc_dontcopy_*
as private fields.

Warning(include/net/sock.h:163): No description found for parameter 'skc_dontcopy_end[0]'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 16:26:51 -08:00
Changli Gao
f682cefa5a netfilter: fix the race when initializing nf_ct_expect_hash_rnd
Since nf_ct_expect_dst_hash() may be called without nf_conntrack_lock
locked, nf_ct_expect_hash_rnd should be initialized in the atomic way.

In this patch, we use nf_conntrack_hash_rnd instead of
nf_ct_expect_hash_rnd.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-06 11:22:20 -08:00
Johannes Berg
21f8358964 mac80211: implement hardware offload for remain-on-channel
This allows drivers to support remain-on-channel
offload if they implement smarter timing or need
to use a device implementation like iwlwifi.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-05 16:07:12 -05:00
John W. Linville
c96e96354a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	net/bluetooth/Makefile
2011-01-05 16:06:25 -05:00
John W. Linville
782a9e31e8 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/padovan/bluetooth-next-2.6 2011-01-04 14:25:28 -05:00
Shmulik Ravid
ea45fe4e17 dcbnl: adding DCBX feature flags get-set
Adding a pair of set-get routines to dcbnl for setting the negotiation
flags of the various DCB features. Conforms to the CEE flavor of DCBX
The user sets these flags (enable, advertise, willing) for each feature
to be used by the DCBX engine. The 'get' routine returns which of the
features is enabled after the negotiation.

This patch is dependent on the following patches:
[net-next-2.6 PATCH 1/3] dcbnl: add support for ieee8021Qaz attributes
[net-next-2.6 PATCH 2/3] dcbnl: add appliction tlv handlers
[net-next-2.6 PATCH 3/3] net_dcb: add application notifiers

Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-31 10:50:54 -08:00
Shmulik Ravid
6241b6259b dcbnl: adding DCBX engine capability
Adding an optional DCBX capability and a pair of get-set routines for
setting the device DCBX mode. The DCBX capability is a bit field of
supported attributes. The user is expected to set the DCBX mode with a
subset of the advertised attributes.

This patch is dependent on the following patches:
[net-next-2.6 PATCH 1/3] dcbnl: add support for ieee8021Qaz attributes
[net-next-2.6 PATCH 2/3] dcbnl: add appliction tlv handlers
[net-next-2.6 PATCH 3/3] net_dcb: add application notifiers

Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-31 10:50:54 -08:00
John Fastabend
96b99684e3 net_dcb: add application notifiers
DCBx applications priorities can be changed dynamically. If
application stacks are expected to keep the skb priority
consistent with the dcbx priority the stack will need to
be notified when these changes occur.

This patch adds application notifiers for the stack to register
with.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-31 10:47:46 -08:00
John Fastabend
9ab933ab2c dcbnl: add appliction tlv handlers
This patch adds application tlv handlers. Networking stacks
may use the application priority to set the skb priority of
their stack using the negotiated dcbx priority.

This patch provides the dcb_{get|set}app() routines for the
stack to query these parameters. Notice lower layer drivers
can use the dcbnl_ops routines if additional handling is
needed. Perhaps in the firmware case for example

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-31 10:47:45 -08:00
John Fastabend
3e29027af4 dcbnl: add support for ieee8021Qaz attributes
The IEEE8021Qaz is the IEEE standard version of CEE. The
standard has had enough significant changes from the CEE
version that many of the CEE attributes have no meaning
in the new spec or do not easily map to IEEE standards.

Rather than attempt to create a complicated mapping
between CEE and IEEE standards this patch adds a nested
IEEE attribute to the list of DCB attributes. The policy
is,

	[DCB_ATTR_IFNAME]
	[DCB_ATTR_STATE]
	...
	[DCB_ATTR_IEEE]
		[DCB_ATTR_IEEE_ETS]
		[DCB_ATTR_IEEE_PFC]
		[DCB_ATTR_IEEE_APP_TABLE]
			[DCB_ATTR_IEEE_APP]
			...

The following dcbnl_rtnl_ops routines were added to handle
the IEEE standard,

	int (*ieee_getets) (struct net_device *, struct ieee_ets *);
	int (*ieee_setets) (struct net_device *, struct ieee_ets *);
	int (*ieee_getpfc) (struct net_device *, struct ieee_pfc *);
	int (*ieee_setpfc) (struct net_device *, struct ieee_pfc *);
	int (*ieee_getapp) (struct net_device *, struct dcb_app *);
	int (*ieee_setapp) (struct net_device *, struct dcb_app *);

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-31 10:47:45 -08:00
David S. Miller
17f7f4d9fc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/ipv4/fib_frontend.c
2010-12-26 22:37:05 -08:00
David S. Miller
e058464990 Revert "ipv4: Allow configuring subnets as local addresses"
This reverts commit 4465b46900.

Conflicts:

	net/ipv4/fib_frontend.c

As reported by Ben Greear, this causes regressions:

> Change 4465b46900 caused rules
> to stop matching the input device properly because the
> FLOWI_FLAG_MATCH_ANY_IIF is always defined in ip_dev_find().
>
> This breaks rules such as:
>
> ip rule add pref 512 lookup local
> ip rule del pref 0 lookup local
> ip link set eth2 up
> ip -4 addr add 172.16.0.102/24 broadcast 172.16.0.255 dev eth2
> ip rule add to 172.16.0.102 iif eth2 lookup local pref 10
> ip rule add iif eth2 lookup 10001 pref 20
> ip route add 172.16.0.0/24 dev eth2 table 10001
> ip route add unreachable 0/0 table 10001
>
> If you had a second interface 'eth0' that was on a different
> subnet, pinging a system on that interface would fail:
>
>   [root@ct503-60 ~]# ping 192.168.100.1
>   connect: Invalid argument

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-23 12:03:57 -08:00
David S. Miller
b7e03ec9a6 Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-12-22 17:34:40 -08:00
Johan Hedberg
23bb57633d Bluetooth: Fix __hci_request synchronization for hci_open_dev
The initialization function used by hci_open_dev (hci_init_req) sends
many different HCI commands. The __hci_request function should only
return when all of these commands have completed (or a timeout occurs).
Several of these commands cause hci_req_complete to be called which
causes __hci_request to return prematurely.

This patch fixes the issue by adding a new hdev->req_last_cmd variable
which is set during the initialization procedure. The hci_req_complete
function will no longer mark the request as complete until the command
matching hdev->req_last_cmd completes.
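
The idea can be sketched roughly as follows. This is a reduced model,
not the actual patch: struct dev_req, the enum and req_complete() are
hypothetical names, and last_cmd plays the role of
hdev->req_last_cmd.

```c
#include <stdint.h>

enum req_state { REQ_PEND, REQ_DONE };

/* Hypothetical, reduced model of the per-device request state. */
struct dev_req {
	enum req_state status;
	uint16_t last_cmd; /* opcode that ends the sequence, 0 if unset */
	int result;
};

/* Called whenever a command completes. While last_cmd is set,
 * completions of any other command are ignored, so the waiter is
 * only released by the final command of the sequence. */
static void req_complete(struct dev_req *r, uint16_t cmd, int result)
{
	if (r->last_cmd && cmd != r->last_cmd)
		return;

	if (r->status == REQ_PEND) {
		r->result = result;
		r->status = REQ_DONE;
	}
}
```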

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-22 22:58:07 -02:00
Johan Hedberg
c71e97bfaa Bluetooth: Add management events for controller addition & removal
This patch adds Bluetooth Management interface events for controller
addition and removal. The events correspond to the existing HCI_DEV_REG
and HCI_DEV_UNREG stack internal events.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-22 22:58:00 -02:00
Johan Hedberg
f7b64e69c7 Bluetooth: Add read_info management command
This patch implements the read_info command which is used to fetch basic
info about an adapter.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-22 22:57:51 -02:00
Johan Hedberg
faba42eb2a Bluetooth: Add read_index_list management command
This patch implements the read_index_list command through which
userspace can get a list of current adapter indices.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-22 22:57:44 -02:00
Johan Hedberg
02d981292a Bluetooth: Add read_version management command
This patch implements the initial read_version command that userspace
will use before any other management interface operations.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-22 22:57:37 -02:00
Johannes Berg
67408c8c7b mac80211: selective throughput LED trigger active
The throughput LED trigger was always active when
the radio was enabled. In most cases that's likely
the desired behaviour, but iwlwifi requires it to
be only active when one of the virtual interfaces
is actually "connected" in some way.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-22 14:33:37 -05:00
Johannes Berg
e1e5406854 mac80211: add throughput based LED blink trigger
iwlwifi and other drivers like to blink their LED
based on throughput. Implement this generically in
mac80211, based on a throughput table the driver
specifies. That way, drivers can set the blink
frequencies depending on their desired behaviour
and max throughput.

All the drivers need to do is provide an LED class
device, best with blink hardware offload.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-22 14:33:37 -05:00
John W. Linville
63e35cd9bd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-1000.c
	drivers/net/wireless/iwlwifi/iwl-6000.c
	drivers/net/wireless/iwlwifi/iwl-core.h
2010-12-22 14:27:21 -05:00
Jiri Kosina
4b7bd36470 Merge branch 'master' into for-next
Conflicts:
	MAINTAINERS
	arch/arm/mach-omap2/pm24xx.c
	drivers/scsi/bfa/bfa_fcpim.c

Needed to update to apply fixes for which the old branch was too
outdated.
2010-12-22 18:57:02 +01:00
David S. Miller
da521b2c4f net: Fix range checks in tcf_valid_offset().
This function has three bugs:

1) The offset should be valid most of the time, this is just
   a sanity check, therefore we should use "likely" not "unlikely"

2) This is the only place where we can check for arithmetic overflow
   of the pointer plus the length.

3) The existing range checks are off by one, the valid range is
   skb->head to skb_tail_pointer(), inclusive.
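
For illustration only, points 2 and 3 can be sketched in userspace
like this. struct buf and valid_offset() are hypothetical; head and
tail stand in for skb->head and skb_tail_pointer(). The arithmetic is
done on uintptr_t so that the overflow test itself is well defined,
which is a choice of this sketch and not necessarily how the kernel
writes it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of an skb that matters here. */
struct buf {
	const unsigned char *head; /* first valid byte */
	const unsigned char *tail; /* one past the last valid byte */
};

/* Return nonzero if [ptr, ptr + len) lies inside [head, tail].
 * Both ends are inclusive, so ptr == tail with len == 0 is valid. */
static int valid_offset(const struct buf *b,
			const unsigned char *ptr, size_t len)
{
	uintptr_t start = (uintptr_t)ptr;
	uintptr_t end = start + len; /* wraps around on overflow */

	return end >= start &&               /* no arithmetic overflow */
	       start >= (uintptr_t)b->head && /* not before the buffer */
	       end <= (uintptr_t)b->tail;     /* not past the tail */
}
```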

Based almost entirely upon a patch by Ralph Loader.

Reported-by: Ralph Loader <suckfish@ihug.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-21 12:43:16 -08:00
Nandita Dukkipati
356f039822 TCP: increase default initial receive window.
This patch changes the default initial receive window to 10 mss
(defined constant). The default window is limited to the maximum
of 10*1460 and 2*mss (when mss > 1460).
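
Taking the description above literally, the limit works out as in the
sketch below. INIT_RCVWND_SEGS and init_rcvwnd_cap() are names
assumed for this example, and the kernel implementation may round
differently (for instance to whole segments).

```c
/* Hypothetical name for the "10 mss" constant described above. */
#define INIT_RCVWND_SEGS 10u

/* Upper limit, in bytes, on the default initial receive window
 * for a given mss, following the commit description literally. */
static unsigned int init_rcvwnd_cap(unsigned int mss)
{
	unsigned int base, two_mss;

	if (mss <= 1460)
		return INIT_RCVWND_SEGS * mss;

	/* Large mss: the larger of 10*1460 and 2*mss. */
	base = INIT_RCVWND_SEGS * 1460;
	two_mss = 2 * mss;
	return base > two_mss ? base : two_mss;
}
```

With a standard Ethernet mss of 1460 this gives 14600 bytes; with a
jumbo-frame mss of 9000 the 2*mss term wins and the limit is 18000
bytes.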

draft-ietf-tcpm-initcwnd-00 is a proposal to the IETF that recommends
increasing TCP's initial congestion window to 10 mss or about 15KB.
Leading up to this proposal were several large-scale live Internet
experiments with an initial congestion window of 10 mss (IW10), where
we showed that the average latency of HTTP responses improved by
approximately 10%. This was accompanied by a slight increase in
retransmission rate (0.5%), most of which is coming from applications
opening multiple simultaneous connections. To understand the extreme
worst case scenarios, and fairness issues (IW10 versus IW3), we further
conducted controlled testbed experiments. We came away finding minimal
negative impact even under low link bandwidths (dial-ups) and small
buffers.  These results are extremely encouraging to adopting IW10.

However, an initial congestion window of 10 mss is useless unless a TCP
receiver advertises an initial receive window of at least 10 mss.
Fortunately, in the large-scale Internet experiments we found that most
widely used operating systems advertised large initial receive windows
of 64KB, allowing us to experiment with a wide range of initial
congestion windows. Linux systems were among the few exceptions that
advertised a small receive window of 6KB. The purpose of this patch is
to fix this shortcoming.

References:
1. A comprehensive list of all IW10 references to date.
http://code.google.com/speed/protocols/tcpm-IW10.html

2. Paper describing results from large-scale Internet experiments with IW10.
http://ccr.sigcomm.org/drupal/?q=node/621

3. Controlled testbed experiments under worst case scenarios and a
fairness study.
http://www.ietf.org/proceedings/79/slides/tcpm-0.pdf

4. Raw test data from testbed experiments (Linux senders/receivers)
with initial congestion and receive windows of both 10 mss.
http://research.csc.ncsu.edu/netsrv/?q=content/iw10

5. Internet-Draft. Increasing TCP's Initial Window.
https://datatracker.ietf.org/doc/draft-ietf-tcpm-initcwnd/

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-20 21:33:00 -08:00
David S. Miller
d9993be65a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-20 13:24:14 -08:00
Bruno Randolf
7f531e03ab cfg80211: Separate available antennas for RX and TX
As has been pointed out by Daniel Halperin some devices (e.g. Intel IWL5100)
can only TX from a subset of RX antennas, so use separate availability masks
for RX and TX.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-20 14:46:58 -05:00
Javier Cardona
c80d545da3 mac80211: Let userspace enable and configure vendor specific path selection.
Userspace will now be allowed to toggle between the default path
selection algorithm (HWMP, implemented in the kernel), and a vendor
specific alternative.  Also in the same patch, allow userspace to add
information elements to mesh beacons.  This is in accordance with the
Extensible Path Selection Framework specified in version 7.0 of the
802.11s draft.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-20 14:46:57 -05:00
Javier Cardona
24bdd9f4c9 mac80211: Rename mesh_params to mesh_config to prepare for mesh_setup
Mesh parameters can be used to set up a mesh or to configure it.
This patch renames the ambiguous name mesh_params to mesh_config
in preparation for mesh_setup.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-20 14:46:57 -05:00
Johannes Stezenbach
9f333281a7 mac80211/rt2x00: add ieee80211_tx_status_ni()
All rt2x00 drivers except rt2800pci call ieee80211_tx_status() from
a workqueue, which causes "NOHZ: local_softirq_pending 08" messages.

To fix it, add ieee80211_tx_status_ni() similar to ieee80211_rx_ni()
which can be called from process context, and call it from
rt2x00lib_txdone().  For the rt2800pci special case a driver
flag is introduced.

https://bugzilla.kernel.org/show_bug.cgi?id=24892

Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-20 13:48:04 -05:00
David S. Miller
6561a3b12d ipv4: Flush per-ns routing cache more sanely.
Flush the routing cache only of entries that match the
network namespace in which the purge event occurred.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2010-12-20 10:37:19 -08:00
Changli Gao
173021072e net_sched: always clone skbs
Pawel reported a panic related to handling shared skbs in ixgbe
incorrectly. So we need to revert my previous patch to work around
this bug. Instead of reverting the patch completely, I just revert
the essential lines, so we can add the previous optimization
back more easily in future.

    commit 3511c9132f
    Author: Changli Gao <xiaosuo@gmail.com>
    Date:   Sat Oct 16 13:04:08 2010 +0000

        net_sched: remove the unused parameter of qdisc_create_dflt()

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-20 10:27:19 -08:00
Shan Wei
4c306a9291 net: kill unused macros
These macros are never used, so remove them.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-19 21:59:35 -08:00
David Stevens
ad0081e43a ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed.
This patch modifies IPsec6 to fragment IPv6 packets that are
locally generated as needed.

This version of the patch only fragments in tunnel mode, so that fragment
headers will not be obscured by ESP in transport mode.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-19 20:22:23 -08:00
David S. Miller
b4aa9e05a6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x.h
	drivers/net/wireless/iwlwifi/iwl-1000.c
	drivers/net/wireless/iwlwifi/iwl-6000.c
	drivers/net/wireless/iwlwifi/iwl-core.h
	drivers/vhost/vhost.c
2010-12-17 12:27:22 -08:00
Octavian Purdila
fcbdf09d96 net: fix nulls list corruptions in sk_prot_alloc
Special care is taken inside sk_prot_alloc to avoid overwriting
skc_node/skc_nulls_node. We should also avoid overwriting
skc_bind_node/skc_portaddr_node.

The patch fixes the following crash:

 BUG: unable to handle kernel paging request at fffffffffffffff0
 IP: [<ffffffff812ec6dd>] udp4_lib_lookup2+0xad/0x370
 [<ffffffff812ecc22>] __udp4_lib_lookup+0x282/0x360
 [<ffffffff812ed63e>] __udp4_lib_rcv+0x31e/0x700
 [<ffffffff812bba45>] ? ip_local_deliver_finish+0x65/0x190
 [<ffffffff812bbbf8>] ? ip_local_deliver+0x88/0xa0
 [<ffffffff812eda35>] udp_rcv+0x15/0x20
 [<ffffffff812bba45>] ip_local_deliver_finish+0x65/0x190
 [<ffffffff812bbbf8>] ip_local_deliver+0x88/0xa0
 [<ffffffff812bb2cd>] ip_rcv_finish+0x32d/0x6f0
 [<ffffffff8128c14c>] ? netif_receive_skb+0x99c/0x11c0
 [<ffffffff812bb94b>] ip_rcv+0x2bb/0x350
 [<ffffffff8128c14c>] netif_receive_skb+0x99c/0x11c0

Signed-off-by: Leonard Crestez <lcrestez@ixiacom.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:26:56 -08:00
Eric Dumazet
bc2ce894e1 tcp: relax tcp_paws_check()
Some windows versions have wrong RFC1323 implementations, with SYN and
SYNACKS messages containing zero tcp timestamps.

We relaxed the passive connection case in commit fc1ad92dfc
(Windows connects to a Linux machine), but the reverse case (Linux
connects to a Windows machine) has an analogous problem when tsvals from
the Windows machine are 'negative' (high order bit set): PAWS triggers
and we drop incoming messages.

Fix this by making zero ts_recent value special, allowing frame to be
processed.
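
A simplified sketch of the relaxed check, for illustration. paws_ok()
is a hypothetical name, and the real function also takes the age of
ts_recent into account, which is left out here.

```c
#include <stdint.h>

/* Return nonzero if a segment carrying rcv_tsval passes the
 * timestamp check against the last recorded ts_recent. */
static int paws_ok(uint32_t ts_recent, uint32_t rcv_tsval,
		   int32_t paws_win)
{
	/* Normal case: compare modulo 2^32 via a signed difference. */
	if ((int32_t)(ts_recent - rcv_tsval) <= paws_win)
		return 1;

	/* Relaxed case: a zero ts_recent (as sent in SYN/SYNACK by the
	 * affected peers) is treated as "no timestamp seen yet". */
	if (ts_recent == 0)
		return 1;

	return 0;
}
```

Without the zero special case, a peer whose first real tsval has the
high bit set would look older than ts_recent == 0 and every segment
would be dropped.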

Based on a report and initial patch from Dmitriy Balakin

Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=24842

Reported-by: dmitriy.balakin@nicneiron.ru
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:08:34 -08:00
Octavian Purdila
443457242b net: factorize sync-rcu call in unregister_netdevice_many
Add dev_close_many and dev_deactivate_many to factorize another
sync-rcu operation on the netdevice unregister path.

$ modprobe dummy numdummies=10000
$ ip link set dev dummy* up
$ time rmmod dummy

Without the patch           With the patch

real    0m 24.63s           real    0m 5.15s
user    0m 0.00s            user    0m 0.00s
sys     0m 6.05s            sys     0m 5.14s

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:04:44 -08:00
Luis R. Rodriguez
2784fe915c cfg80211: fix null pointer dereference with a custom regulatory request
Once we moved the core regulatory request to the queue and let
the scheduler process it, last_request will have been left NULL
until the scheduler decides to process the first request. When
this happens and we are loading a driver with a custom regulatory
request, like all Atheros drivers, we end up with a NULL pointer
dereference. We fix this by checking if the request was a
custom one.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: [<ffffffffa016de87>] freq_reg_info_regd.clone.2+0x27/0x130 [cfg80211]
PGD 71f91067 PUD 712b2067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/firmware/2-1/loading
CPU 0
Modules linked in: ath9k_htc(+) ath9k_common ath9k_hw ath <etc>
Pid: 3094, comm: insmod Tainted: G        W   2.6.37-rc5-wl #16 INVALID/28427ZQ
RIP: 0010:[<ffffffffa016de87>]  [<ffffffffa016de87>] freq_reg_info_regd.clone.2+0x27/0x130 [cfg80211]
RSP: 0018:ffff88007045db78  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffa047d9a0 RCX: ffff88007045dbd0
RDX: 0000000000004e20 RSI: 000000000024cde0 RDI: ffff8800700483e0
RBP: ffff88007045db98 R08: ffffffffa02f5b40 R09: 0000000000000001
R10: 000000000000000e R11: 0000000000000001 R12: 0000000000000000
R13: ffff88007004e3b0 R14: 0000000000000000 R15: ffff880070048340
FS:  00007f635a707700(0000) GS:ffff880077400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000004 CR3: 00000000708a9000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process insmod (pid: 3094, threadinfo ffff88007045c000, task ffff8800713e3ec0)
Stack:
 ffffffffa047d9a0 0000000000000000 ffff88007004e3b0 0000000000000000
 ffff88007045dc08 ffffffffa016e147 000000007045dc08 0000000000000002
 ffff8800700483e0 ffffffffa02f5b40 ffff88007045dbd8 0000000000000000
Call Trace:
 [<ffffffffa016e147>] wiphy_apply_custom_regulatory+0x137/0x1d0 [cfg80211]
 [<ffffffffa047a690>] ? ath9k_reg_notifier+0x0/0x50 [ath9k_htc]
 [<ffffffffa02f47f7>] ath_regd_init+0x347/0x430 [ath]
 [<ffffffffa047b1f5>] ath9k_htc_probe_device+0x6c5/0x960 [ath9k_htc]
 [<ffffffffa0472a2c>] ath9k_htc_hw_init+0xc/0x30 [ath9k_htc]
 [<ffffffffa04747e6>] ath9k_hif_usb_probe+0x216/0x3b0 [ath9k_htc]
 [<ffffffffa03bb6bc>] usb_probe_interface+0x10c/0x210 [usbcore]
 [<ffffffff812aec26>] driver_probe_device+0x96/0x1c0
 [<ffffffff812aedf3>] __driver_attach+0xa3/0xb0
 [<ffffffff812aed50>] ? __driver_attach+0x0/0xb0
 [<ffffffff812adaae>] bus_for_each_dev+0x5e/0x90
 [<ffffffff812ae8c9>] driver_attach+0x19/0x20
 [<ffffffff812ae438>] bus_add_driver+0x168/0x320
 [<ffffffff812af071>] driver_register+0x71/0x140
 [<ffffffff811fc4a8>] ? __raw_spin_lock_init+0x38/0x70
 [<ffffffffa03ba39c>] usb_register_driver+0xdc/0x190 [usbcore]
 [<ffffffffa03a2000>] ? ath9k_htc_init+0x0/0x4f [ath9k_htc]
 [<ffffffffa047499e>] ath9k_hif_usb_init+0x1e/0x20 [ath9k_htc]
 [<ffffffffa03a202b>] ath9k_htc_init+0x2b/0x4f [ath9k_htc]
 [<ffffffff8100212f>] do_one_initcall+0x3f/0x180
 [<ffffffff8109ef5b>] sys_init_module+0xbb/0x200
 [<ffffffff8100bf52>] system_call_fastpath+0x16/0x1b
Code: <etc, who cares>
RIP  [<ffffffffa016de87>] freq_reg_info_regd.clone.2+0x27/0x130 [cfg80211]
 RSP <ffff88007045db78>
CR2: 0000000000000004
---[ end trace 79e4193601c8b713 ]---

Reported-by: Sujith Manoharan <Sujith.Manoharan@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-16 15:22:31 -05:00
Jouni Malinen
cf4e594ea7 nl80211: Add notification for dropped Deauth/Disassoc
Add a new notification to indicate that a received, unprotected
Deauthentication or Disassociation frame was dropped due to
management frame protection being in use. This notification is
needed to allow user space (e.g., wpa_supplicant) to implement
SA Query procedure to recover from association state mismatch
between an AP and STA.

This is needed to avoid getting stuck in non-working state when MFP
(IEEE 802.11w) is used and a protected Deauthentication or
Disassociation frame is dropped for any reason. After that, the
station would silently discard any unprotected Deauthentication or
Disassociation frame that could be indicating that the AP does not
have association for the STA (when the Reason Code would be 6 or 7).
IEEE Std 802.11w-2009, 11.13 describes this recovery mechanism.

Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-16 15:22:30 -05:00
KOVACS Krisztian
ae90bdeaea netfilter: fix compilation when conntrack is disabled but tproxy is enabled
The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but
failed to update the #ifdef stanzas guarding the defragmentation related
fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c.

This patch adds the required #ifdefs so that IPv6 tproxy can truly be used
without connection tracking.

Original report:
http://marc.info/?l=linux-netdev&m=129010118516341&w=2

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-12-15 23:53:41 +01:00
Sujith Manoharan
bd2ce6e43f mac80211: Add timeout to BA session start API
Allow drivers or rate control algorithms to specify BlockAck session
timeout when initiating an ADDBA transaction. This is useful in cases
where maintaining persistent BA sessions does not incur any overhead.

The current timeout value of 5000 TUs is retained for all non ath9k/ath9k_htc
drivers.

Signed-off-by: Sujith Manoharan <Sujith.Manoharan@atheros.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-15 17:03:59 -05:00
Johannes Berg
a293911d4f nl80211: advertise maximum remain-on-channel duration
With the upcoming hardware offload implementation,
some devices will have a different maximum duration
for the remain-on-channel command. Advertise the
maximum duration in mac80211, and make mac80211 set
it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-15 17:03:56 -05:00
David S. Miller
d33e455337 net: Abstract default MTU metric calculation behind an accessor.
Like RTAX_ADVMSS, make the default calculation go through a dst_ops
method rather than caching the computation in the routing cache
entries.

Now dst metrics are pretty much left as-is when new entries are
created, thus optimizing metric sharing becomes a real possibility.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-14 13:01:14 -08:00
David S. Miller
6389aa73ab Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-12-14 10:52:54 -08:00
David S. Miller
0dbaee3b37 net: Abstract default ADVMSS behind an accessor.
Make all RTAX_ADVMSS metric accesses go through a new helper function,
dst_metric_advmss().

Leave the actual default metric as "zero" in the real metric slot,
and compute the actual default value dynamically via a new dst_ops
AF specific callback.

For stacked IPSEC routes, we use the advmss of the path which
preserves existing behavior.

Unlike ipv4/ipv6, DecNET ties the advmss to the mtu and thus updates
advmss on pmtu updates.  This inconsistency in advmss handling
results in more raw metric accesses than I wish we ended up with.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-13 12:52:14 -08:00
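The accessor pattern described in this commit can be sketched in user-space C as follows. This is only an illustration: `struct route`, `struct route_ops`, `route_advmss()` and `v4_default_advmss()` are hypothetical stand-ins, not the kernel's `dst_entry`, `dst_ops` or `dst_metric_advmss()`, and the "MTU minus 40" default is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct route;

/* Hypothetical, simplified stand-in for an ops table with a default callback. */
struct route_ops {
    /* Address-family-specific default, computed on demand. */
    unsigned int (*default_advmss)(const struct route *rt);
};

struct route {
    const struct route_ops *ops;
    unsigned int mtu;
    unsigned int advmss_slot;   /* 0 means "no explicit value stored" */
};

/* Accessor: return the stored metric, or ask the ops table for a default. */
static unsigned int route_advmss(const struct route *rt)
{
    assert(rt != NULL && rt->ops != NULL);
    if (rt->advmss_slot != 0)
        return rt->advmss_slot;
    return rt->ops->default_advmss(rt);
}

/* Example default (assumption): MTU minus 40 bytes of IP and TCP headers. */
static unsigned int v4_default_advmss(const struct route *rt)
{
    return rt->mtu > 40 ? rt->mtu - 40 : 0;
}

static const struct route_ops v4_ops = { .default_advmss = v4_default_advmss };
```

Because the slot stays zero until someone sets an explicit value, freshly created entries carry no per-entry default, which is what makes sharing metrics between entries plausible.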
Johannes Berg
dbd2fd656f cfg80211/nl80211: separate unicast/multicast default TX keys
Allow userspace to specify that a given key
is default only for unicast and/or multicast
transmissions. Only WEP keys are for both,
WPA/RSN keys set here are GTKs for multicast
only. For more future flexibility, allow to
specify all combinations.

Wireless extensions can only set both so use
nl80211; WEP keys (connect keys) must be set
as default for both (but 802.1X WEP is still
possible).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-13 15:23:28 -05:00
Bruno Randolf
a7ffac9591 cfg80211: Add antenna availability information
Add a field to wiphy for the hardware to report the available antennas for
configuration. Only if this is set to something bigger than zero, will the
antenna configuration ops be executed.

Although this could be a simple number of antennas, I defined it as a bitmap
of antennas which are available for configuration, since it's more consistent
with the rest of the antenna API and there could be cases where the
hardware allows only configuration of certain antennas. As it does not make
much of a difference in size or normal usage, I think it's better to be able to
support this, in case the need arises.

The antenna configuration is now also checked against the available antennas and
rejected if it does not match.

Signed-off-by: Bruno Randolf <br1@einfach.org>

--
v3:	always apply available antenna mask (for "all" antennas case).

v2:	reject antenna configurations which don't match the available antennas
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-13 15:23:27 -05:00
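The bitmap check described in this commit might look roughly like the sketch below. `resolve_antennas()` is a hypothetical helper, not a cfg80211 function, and treating an all-ones request as "all antennas" is an assumption drawn from the v3 changelog note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Validate a requested antenna bitmap against the bitmap of antennas the
 * hardware reports as configurable. Returns 0 and stores the effective
 * bitmap in *out on success, or -1 if the request is rejected.
 */
static int resolve_antennas(uint32_t available, uint32_t requested,
                            uint32_t *out)
{
    assert(out != NULL);

    /* Nothing reported as configurable: do not run the antenna ops. */
    if (available == 0)
        return -1;

    /* Assumption: all-ones means "all antennas" and is never rejected. */
    if (requested != UINT32_MAX && (requested & ~available) != 0)
        return -1;

    /* Always apply the available mask, including in the "all" case. */
    *out = requested & available;
    return 0;
}
```

A bitmap rather than a count lets a driver expose, say, antennas 0 and 2 but not 1, at no extra cost in the common case.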
John W. Linville
1d212aa96e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-12-13 15:20:45 -05:00
David S. Miller
323e126f0c ipv4: Don't pre-seed hoplimit metric.
Always go through a new ip4_dst_hoplimit() helper, just like ipv6.

This allowed several simplifications:

1) The interim dst_metric_hoplimit() can go as it's no longer
   used.

2) The sysctl_ip_default_ttl entry no longer needs to use
   ipv4_doint_and_flush, since the sysctl is not cached in
   routing cache metrics any longer.

3) ipv4_doint_and_flush no longer needs to be exported and
   therefore can be marked static.

When ipv4_doint_and_flush_strategy was removed some time ago,
the external declaration in ip.h was mistakenly left around
so kill that off too.

We have to move the sysctl_ip_default_ttl declaration into
ipv4's route cache definition header net/route.h, because
currently net/ip.h (where the declaration lives now) has
a back dependency on net/route.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-12 22:08:17 -08:00
David S. Miller
5170ae824d net: Abstract RTAX_HOPLIMIT metric accesses behind helper.
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-12 21:35:57 -08:00
Martin Willi
35d2856b46 xfrm: Add Traffic Flow Confidentiality padding XFRM attribute
The XFRMA_TFCPAD attribute for XFRM state installation configures
Traffic Flow Confidentiality by padding ESP packets to a specified
length.

Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 14:43:58 -08:00
David S. Miller
1e13f863ca Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
Conflicts:
	drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
2010-12-10 09:50:47 -08:00
Eric Dumazet
68835aba4d net: optimize INET input path further
Followup of commit b178bb3dfc (net: reorder struct sock fields)

Optimize INET input path a bit further, by :

1) moving sk_refcnt close to sk_lock.

This reduces number of dirtied cache lines by one on 64bit arches (and
64 bytes cache line size).

2) moving inet_daddr & inet_rcv_saddr at the beginning of sk

(same cache line as hash / family / bound_dev_if / nulls_node)

This reduces the number of accessed cache lines in lookups by one, and does not
increase the size of inet and timewait socks.
inet and tw sockets now share same place-holder for these fields.

Before patch :

offsetof(struct sock, sk_refcnt) = 0x10
offsetof(struct sock, sk_lock) = 0x40
offsetof(struct sock, sk_receive_queue) = 0x60
offsetof(struct inet_sock, inet_daddr) = 0x270
offsetof(struct inet_sock, inet_rcv_saddr) = 0x274

After patch :

offsetof(struct sock, sk_refcnt) = 0x44
offsetof(struct sock, sk_lock) = 0x48
offsetof(struct sock, sk_receive_queue) = 0x68
offsetof(struct inet_sock, inet_daddr) = 0x0
offsetof(struct inet_sock, inet_rcv_saddr) = 0x4

compute_score() (udp or tcp) now use a single cache line per ignored
item, instead of two.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-09 20:05:58 -08:00
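The cache-line claims in this commit can be checked with simple arithmetic on the quoted offsets. The helpers below are illustrative only (their names are made up here) and assume the 64-byte line size mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64u   /* assumption: 64-byte lines, as in the commit text */

/* Index of the cache line that holds a given byte offset. */
static size_t cache_line_of(size_t offset)
{
    return offset / CACHE_LINE_SIZE;
}

/* Nonzero when two offsets fall in the same cache line. */
static int same_cache_line(size_t a, size_t b)
{
    return cache_line_of(a) == cache_line_of(b);
}

/* Replays the offsets quoted in the commit message. */
static void check_commit_offsets(void)
{
    /* Before: sk_refcnt (0x10) and sk_lock (0x40) sit in different lines. */
    assert(!same_cache_line(0x10, 0x40));
    /* After: sk_refcnt (0x44) and sk_lock (0x48) share one line. */
    assert(same_cache_line(0x44, 0x48));
    /* After: inet_daddr (0x0) and inet_rcv_saddr (0x4) are in the first line. */
    assert(cache_line_of(0x0) == 0 && cache_line_of(0x4) == 0);
}
```

With the old layout a lookup touched line 0 for the hash fields and line 9 (0x270 / 64) for the addresses; with the new one both live in line 0.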
David S. Miller
defb3519a6 net: Abstract away all dst_entry metrics accesses.
Use helper functions to hide all direct accesses, especially writes,
to dst_entry metrics values.

This will allow us to:

1) More easily change how the metrics are stored.

2) Implement COW for metrics.

In particular this will help us put metrics into the inetpeer
cache if that is what we end up doing.  We can make the _metrics
member a pointer instead of an array, initially have it point
at the read-only metrics in the FIB, and then on the first set
grab an inetpeer entry and point the _metrics member there.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2010-12-09 10:46:36 -08:00
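The copy-on-write idea outlined in this commit (point at shared read-only metrics, switch to a private copy on the first write) could be sketched like this. All names are hypothetical, and the private copy lives inside the entry itself rather than in an inetpeer entry, purely to keep the example self-contained.

```c
#include <assert.h>
#include <string.h>

#define NUM_METRICS 4   /* hypothetical; chosen only for the example */

struct entry {
    const unsigned int *metrics;            /* current view, possibly shared */
    unsigned int private_copy[NUM_METRICS]; /* used only after a write */
    int owns_copy;                          /* set on the first write */
};

/* Start out pointing at the shared, read-only metrics. */
static void entry_init(struct entry *e, const unsigned int *shared)
{
    e->metrics = shared;
    e->owns_copy = 0;
}

/* Read helper: callers never touch the array directly. */
static unsigned int entry_metric(const struct entry *e, int idx)
{
    assert(idx >= 0 && idx < NUM_METRICS);
    return e->metrics[idx];
}

/* Write helper: copy the shared metrics on the first write, then modify. */
static void entry_metric_set(struct entry *e, int idx, unsigned int val)
{
    assert(idx >= 0 && idx < NUM_METRICS);
    if (!e->owns_copy) {
        memcpy(e->private_copy, e->metrics, sizeof(e->private_copy));
        e->metrics = e->private_copy;
        e->owns_copy = 1;
    }
    e->private_copy[idx] = val;
}
```

Routing every access through the two helpers is what allows the storage behind `metrics` to change later without touching callers.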
David S. Miller
fe6c791570 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
	net/llc/af_llc.c
2010-12-08 13:47:38 -08:00
Helmut Schaa
50b12f597b cfg80211: Add new BSS attribute ht_opmode
Add a new BSS attribute to allow hostapd to set the current HT opmode.
Otherwise drivers won't be able to set up protection for HT rates in
AP mode.

Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-08 15:38:43 -05:00
Johan Hedberg
a40c406cbd Bluetooth: Make hci_send_to_sock usable for management control sockets
In order to send data to management control sockets the function should:

  - skip checks intended for raw HCI data and stack internal events
  - make sure RAW HCI data or stack internal events don't go to
    management control sockets

In order to accomplish this the patch adds a new member to the bluetooth
skb private data to flag skb's that are destined for management control
sockets.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-07 23:03:39 -02:00
Johan Hedberg
0381101fd6 Bluetooth: Add initial Bluetooth Management interface callbacks
Add initial code for handling Bluetooth Management interface messages.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-07 23:03:38 -02:00
Johan Hedberg
c02178d22b Bluetooth: Add Bluetooth Management interface definitions
Add initial definitions for the new Bluetooth Management interface to
the bluetooth headers.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-07 22:59:54 -02:00
Bruno Randolf
541a45a142 nl80211/mac80211: Report signal average
Extend nl80211 to report an exponential weighted moving average (EWMA) of the
signal value. Since the signal value usually fluctuates between different
packets, an average can be more useful than the value of the last packet.

This uses the recently added generic EWMA library function.

--
v2:	fix ABI breakage and change factor to be a power of 2.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-07 16:09:12 -05:00
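For readers unfamiliar with the technique, a fixed-point EWMA with a power-of-two weight can be sketched as below. This is a hypothetical illustration, not the kernel's generic EWMA library; the struct and function names are invented, and samples are assumed to be non-negative.

```c
#include <assert.h>

/* Hypothetical fixed-point EWMA state. */
struct ewma {
    unsigned long internal;   /* running average, scaled by 'factor' */
    unsigned long factor;     /* fixed-point scaling for extra precision */
    unsigned int shift;       /* weight is 1 << shift */
};

static void ewma_setup(struct ewma *e, unsigned long factor, unsigned int shift)
{
    assert(factor > 0 && shift > 0 && shift < 16);
    e->internal = 0;
    e->factor = factor;
    e->shift = shift;
}

/* Fold one sample in: new = old * (w - 1) / w + sample / w, w = 1 << shift. */
static void ewma_push(struct ewma *e, unsigned long sample)
{
    unsigned long scaled = sample * e->factor;

    if (e->internal == 0)
        e->internal = scaled;   /* the first sample seeds the average */
    else
        e->internal = ((e->internal << e->shift) - e->internal + scaled)
                      >> e->shift;
}

/* Current average in the same units as the samples. */
static unsigned long ewma_value(const struct ewma *e)
{
    return e->internal / e->factor;
}
```

With a weight of 8, a jump from a steady 40 to a single sample of 80 moves the average to 45 rather than 80, which is the smoothing effect the commit is after.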
Johannes Berg
29cbe68c51 cfg80211/mac80211: add mesh join/leave commands
Instead of tying mesh activity to interface up,
add join and leave commands for mesh. Since we
must be backward compatible, let cfg80211 handle
joining a mesh if a mesh ID was pre-configured
when the device goes up.

Note that this therefore must modify mac80211 as
well since mac80211 needs to lose the logic to
start the mesh on interface up.

We now allow querying mesh parameters before the
mesh is connected, which simply returns defaults.
Setting them (internally renamed to "update") is
only allowed while connected. Specify them with
the new mesh join command instead where needed.

In mac80211, beaconing must now also follow the
mesh enabled/not enabled state, which is done
by testing the mesh ID.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-06 16:01:29 -05:00
Johannes Berg
f9e10ce4cf cfg80211: require add_virtual_intf to return new dev
cfg80211 used to do all its bookkeeping in
the notifier, but some new stuff will have
to use local variables so make the callback
return the netdev pointer.

Tested-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-06 16:01:28 -05:00
Javier Cardona
45904f2165 nl80211/mac80211: define and allow configuring mesh element TTL
The TTL in path selection information elements is different from
the mesh ttl used in mesh data frames.  Version 7.03 of the 11s
draft calls this ttl 'Element TTL'.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-06 16:01:28 -05:00
John W. Linville
f435d9eea0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-12-06 15:35:34 -05:00
Eric Dumazet
46bcf14f44 filter: fix sk_filter rcu handling
Pavel Emelyanov tried to fix a race between sk_filter_(de|at)tach and
sk_clone() in commit 47e958eac2

Problem is we can have several clones sharing a common sk_filter, and
these clones might want to sk_filter_attach() their own filters at the
same time, and can overwrite old_filter->rcu, corrupting RCU queues.

We can not use filter->rcu without being sure no other thread could do
the same thing.

Switch code to a more conventional ref-counting technique : Do the
atomic decrement immediately and queue one rcu call back when last
reference is released.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-06 09:29:43 -08:00
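The "decrement immediately, queue one callback on the last release" scheme can be sketched with C11 atomics. This is a user-space analogy only: `struct filter`, `filter_hold()` and `filter_put()` are hypothetical names, and the `release` function pointer stands in for queuing an RCU callback.

```c
#include <assert.h>
#include <stdatomic.h>

struct filter {
    atomic_int refcnt;
    void (*release)(struct filter *f);   /* stand-in for a deferred free */
};

/* Counter used by the example release hook below. */
static int released_count;

/* Example release hook: only records that it ran. */
static void count_release(struct filter *f)
{
    (void)f;
    released_count++;
}

/* Take an additional reference. */
static void filter_hold(struct filter *f)
{
    atomic_fetch_add(&f->refcnt, 1);
}

/* Drop one reference; returns 1 if this call released the last one. */
static int filter_put(struct filter *f)
{
    int old = atomic_fetch_sub(&f->refcnt, 1);

    assert(old > 0);
    if (old == 1) {
        /* Only the caller that saw the count go 1 -> 0 gets here. */
        f->release(f);
        return 1;
    }
    return 0;
}
```

Since exactly one caller observes the transition to zero, the per-object callback storage is written once, which avoids the overwrite described above.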
Allan Stephens
d265fef6dd tipc: Remove obsolete native API files and exports
As part of the removal of TIPC's native API support it is no longer
necessary for TIPC to export symbols for routines that can be called
by kernel-based applications, nor for it to have header files that
kernel-based applications can include to access the declarations for
those routines. This commit eliminates the exporting of symbols by
TIPC and migrates the contents of each obsolete native API include
file into its corresponding non-native API equivalent.

The code which was migrated in this commit was migrated intact, in
that there are no technical changes combined with the relocation.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02 13:34:01 -08:00
Shan Wei
dca9b2404a net: kill unused macros from head file
These macros have been defined since v2.6.12-rc2 (traced with git),
but have never been used. So remove them.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02 13:27:33 -08:00
Shan Wei
a9527a3b62 net: snmp: fix the wrong ICMP_MIB_MAX value
__ICMP_MIB_MAX is already equal to the total number of ICMP MIB entries,
so there is no need to add 1. The extra slot wastes 4/8 bytes of memory.

Change it to be same as ICMP6_MIB_MAX, TCP_MIB_MAX, UDP_MIB_MAX.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02 13:27:31 -08:00
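The off-by-one can be illustrated with a small enum whose trailing sentinel already counts every entry. The `DEMO_*` names are invented for this sketch, and the reserved slot 0 and the `unsigned long` counter type are assumptions about the layout.

```c
#include <assert.h>

enum {
    DEMO_MIB_NUM = 0,     /* slot 0 reserved (assumption) */
    DEMO_MIB_INMSGS,
    DEMO_MIB_INERRORS,
    DEMO_MIB_OUTMSGS,
    DEMO_MIB_LAST         /* sentinel: already equals the number of slots */
};

#define DEMO_MIB_MAX_OLD (DEMO_MIB_LAST + 1)   /* one slot too many */
#define DEMO_MIB_MAX     DEMO_MIB_LAST         /* exact fit */

struct demo_mib_old { unsigned long mibs[DEMO_MIB_MAX_OLD]; };
struct demo_mib     { unsigned long mibs[DEMO_MIB_MAX]; };

/* The sentinel counts all four entries above it. */
static_assert(DEMO_MIB_MAX == 4, "sentinel equals the number of entries");

/* Bytes lost to the extra, never-used slot. */
static unsigned long wasted_bytes(void)
{
    return sizeof(struct demo_mib_old) - sizeof(struct demo_mib);
}
```

The waste is one `unsigned long`, which is where the "4/8 bytes" figure for 32-bit and 64-bit builds comes from.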
John W. Linville
c30ae138aa Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-next-2.6 2010-12-02 15:17:46 -05:00
Bruno Randolf
547025d5d4 cfg80211: Add documentation for antenna ops
The last patch with the same title was for mac80211 ops, accidentally.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-02 15:16:59 -05:00
David S. Miller
ae4694b2d3 ipv6: Create inet6_csk_route_req().
Brother of ipv4's inet_csk_route_req().

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02 10:59:22 -08:00
David S. Miller
15c054251a ipv6: Add rt6_get_peer() helper.
To go alongside ipv4's rt_get_peer().

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02 10:16:06 -08:00
David S. Miller
ccb7c410dd timewait_sock: Create and use getpeer op.
The only thing AF-specific about remembering the timestamp
for a time-wait TCP socket is getting the peer.

Abstract that behind a new timewait_sock_ops vector.

Support for real IPV6 sockets is not filled in yet, but
curiously this makes timewait recycling start to work
for v4-mapped ipv6 sockets.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01 18:09:13 -08:00
David S. Miller
4399ce402c inetpeer: Fix incorrect comment about inetpeer struct size.
Now with ipv6 support it is no longer less than 64 bytes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01 17:29:08 -08:00
David S. Miller
8790ca172a inetpeer: Kill use of inet_peer_address_t typedef.
They are verboten these days.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01 17:28:18 -08:00
Andrei Emeltchenko
be21871f24 Bluetooth: clean up legal text
Remove extra spaces from legal text so that legal stuff looks
the same for all bluetooth code.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-01 21:04:43 -02:00
Andrei Emeltchenko
70f23020e6 Bluetooth: clean up hci code
Do not use assignment in IF condition, remove extra spaces,
fixing typos, simplify code.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-01 21:04:43 -02:00
Andrei Emeltchenko
894718a6be Bluetooth: clean up l2cap code
Do not initialize static vars to zero, macros with complex values
shall be enclosed with (), remove unneeded braces.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-01 21:04:43 -02:00
Andrei Emeltchenko
285b4e9031 Bluetooth: clean up rfcomm code
Remove extra spaces, assignments in if statement, zeroing static
variables, extra braces. Fix includes.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-01 21:04:43 -02:00
Andrei Emeltchenko
735cbc4784 Bluetooth: clean up sco code
Do not use assignments in IF condition, remove extra spaces

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-12-01 21:04:43 -02:00
David S. Miller
3f419d2d48 inet: Turn ->remember_stamp into ->get_peer in connection AF ops.
Then we can make a completely generic tcp_remember_stamp()
that uses ->get_peer() as a helper, minimizing the AF specific
code and minimizing the eventual code duplication when we implement
the ipv6 side of TW recycling.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30 12:28:06 -08:00
David S. Miller
b341936380 ipv6: Add infrastructure to bind inet_peer objects to routes.
They are only allowed on cached ipv6 routes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30 12:27:11 -08:00
David S. Miller
672f007d65 inetpeer: Add inet_getpeer_v6()
Now that all of the infrastructure is in place, we can add
the ipv6 shorthand for peer creation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30 12:20:00 -08:00
David S. Miller
b534ecf1cd inetpeer: Make inet_getpeer() take an inet_peer_address_t pointer.
And make an inet_getpeer_v4() helper, update callers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30 11:54:19 -08:00
David S. Miller
582a72da9a inetpeer: Introduce inet_peer_address_t.
Currently only the v4 aspect is used, but this will change.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30 11:53:55 -08:00
Johannes Stezenbach
20ed3166c8 mac80211/rt2x00: add ieee80211_tx_status_ni()
All rt2x00 drivers except rt2800pci call ieee80211_tx_status() from
a workqueue, which causes "NOHZ: local_softirq_pending 08" messages.

To fix it, add ieee80211_tx_status_ni() similar to ieee80211_rx_ni()
which can be called from process context, and call it from
rt2x00lib_txdone().  For the rt2800pci special case a driver
flag is introduced.

Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-30 13:53:46 -05:00
Johannes Berg
f7ca38dfe5 nl80211/cfg80211: extend mgmt-tx API for off-channel
With p2p, it is sometimes necessary to transmit
a frame (typically an action frame) on another
channel than the current channel. Enable this
through the CMD_FRAME API, and allow it to wait
for a response. A new command allows that wait
to be aborted.

However, allow userspace to specify whether or
not it wants to allow off-channel TX, it may
actually want to use the same channel only.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-29 15:24:35 -05:00
David S. Miller
77148625e1 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-11-29 11:19:09 -08:00
Eric Dumazet
25888e3031 af_unix: limit recursion level
It's easy to eat all kernel memory and trigger NMI watchdog, using an
exploit program that queues unix sockets on top of others.

lkml ref : http://lkml.org/lkml/2010/11/25/8

This mechanism is used in applications, so one choice we have is to add a
recursion limit.

Other limits might be needed as well (if we queue other types of files),
since the passfd mechanism is currently limited by socket receive queue
sizes only.

Add a recursion_level to unix socket, allowing up to 4 levels.

Each time we send a unix socket through sendfd mechanism, we copy its
recursion level (plus one) to receiver. This recursion level is cleared
when socket receive queue is emptied.

Reported-by: Марк Коренберг <socketpair@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-29 09:45:15 -08:00
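The recursion-level bookkeeping described here can be modelled in a few lines. This is a simplified sketch, not the af_unix code: `struct usock`, `pass_socket()` and `queue_drained()` are hypothetical, and exactly where the limit is checked is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RECURSION_LEVEL 4   /* the limit named in the commit message */

struct usock {
    unsigned int recursion_level;
};

/*
 * Queue 'passed' on 'receiver'. The receiver inherits the passed socket's
 * level plus one. Returns 0 on success, -1 if the nesting would be too deep.
 */
static int pass_socket(struct usock *receiver, const struct usock *passed)
{
    unsigned int level;

    assert(receiver != NULL && passed != NULL);
    level = passed->recursion_level + 1;
    if (level > MAX_RECURSION_LEVEL)
        return -1;
    if (level > receiver->recursion_level)
        receiver->recursion_level = level;
    return 0;
}

/* The level is cleared once the receive queue has been emptied. */
static void queue_drained(struct usock *s)
{
    s->recursion_level = 0;
}
```

A chain of sockets queued on top of each other therefore stops growing after four levels, while ordinary descriptor passing between unrelated sockets is unaffected.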
Shan Wei
49b4a6546f sctp: kill unused macros in head file
1. SCTP_CMD_NUM_VERBS,SCTP_CMD_MAX
These two macros have never been used for several years since v2.6.12-rc2.

2. sctp_port_rover, sctp_port_alloc_lock
Commit 063930 abandoned the global variables port_rover and port_alloc_lock,
but two macros that refer to them were kept.
So, remove them now.

commit 0639300900
Author: Stephen Hemminger <shemminger@linux-foundation.org>
Date:   Wed Oct 10 17:30:18 2007 -0700

    [SCTP]: port randomization

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-29 09:41:12 -08:00
Timo Teräs
aa285b1740 xfrm: fix gre key endianess
fl->fl_gre_key is network byte order contrary to fl->fl_icmp_*.
Make xfrm_flowi_{s|d}port return network byte order values for gre
key too.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28 11:22:17 -08:00
andrew hendry
5595a1a599 X25 remove bkl in subscription ioctls
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28 11:12:20 -08:00
Shan Wei
5584b8078a sctp: kill unused macro definition
These macros have existed since v2.6.12-rc2
but have never been used. So remove them now.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28 10:47:15 -08:00
Thomas Graf
cf7afbfeb8 rtnl: make link af-specific updates atomic
As David pointed out correctly, updates to af-specific attributes
are currently not atomic. If multiple changes are requested and
one of them fails, previous updates may have been applied already
leaving the link behind in an undefined state.

This patch splits the function parse_link_af() into two functions,
validate_link_af() and set_link_af(). validate_link_af() is placed
in validate_linkmsg() to check for errors as early as possible, before
any changes to the link have been made. set_link_af() is called to
commit the changes later.

This method is not fail-proof. While it is currently sufficient
to make set_link_af() unable to fail and thus 100% atomic, the
validation function will not be able to detect all error
scenarios in the future. There will likely always be errors
that depend on state which is, for example, not protected by rtnl_mutex
and thus may change between validation and setting.

Also, instead of silently ignoring unknown address families and
config blocks for address families which did not register a set
function, the errors EAFNOSUPPORT and EOPNOTSUPP, respectively, are
returned to avoid committing 4 out of 5 update requests without
notifying the user.

Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-27 22:56:08 -08:00
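The validate-then-commit split can be sketched generically as below. The types, field names and limits are hypothetical and chosen only for the example; this is not the rtnetlink code.

```c
#include <assert.h>
#include <stddef.h>

struct link   { int mtu; int txqlen; };
struct change { int mtu; int txqlen; };   /* hypothetical request fields */

/* Phase 1: check a request without touching the link. 0 if acceptable. */
static int validate_change(const struct change *c)
{
    if (c->mtu < 68 || c->mtu > 65535)
        return -1;
    if (c->txqlen < 0)
        return -1;
    return 0;
}

/* Phase 2: apply an already validated request. Written so it cannot fail. */
static void commit_change(struct link *l, const struct change *c)
{
    l->mtu = c->mtu;
    l->txqlen = c->txqlen;
}

/* Validate every request first, and only then commit any of them. */
static int apply_changes(struct link *l, const struct change *reqs, size_t n)
{
    size_t i;

    assert(l != NULL && (reqs != NULL || n == 0));
    for (i = 0; i < n; i++)
        if (validate_change(&reqs[i]) != 0)
            return -1;          /* nothing has been modified yet */
    for (i = 0; i < n; i++)
        commit_change(l, &reqs[i]);
    return 0;
}
```

As the message notes, this is only atomic as long as the commit step cannot fail and the validated state cannot change between the two loops.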
Hans Schillstrom
b880c1f077 IPVS: Backup, adding version 0 sending capabilities
This patch adds a sysctl net.ipv4.vs.sync_version
that can be used to send sync msg in version 0 or 1 format.

sync_version value is logical,
     Value 1 (default) New version
           0 Plain old version

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-11-25 10:42:59 +09:00
Hans Schillstrom
986a075795 IPVS: Backup, Change sending to Version 1 format
Enable sending and removal of version 0 sending
Affected functions,

ip_vs_sync_buff_create()
ip_vs_sync_conn()

ip_vs_core.c removal of IPv4 check.

*v5
 Just check cp->pe_data_len in ip_vs_sync_conn
 Check if padding needed before adding a new sync_conn
 to the buffer, i.e. avoid sending padding at the end.

*v4
 moved sanity check and pe_name_len after sloop.
 use cp->pe instead of cp->dest->svc->pe
 real length in each sync_conn, not padded length
 however total size of a sync_msg includes padding.

*v3
 Sending ip_vs_sync_conn_options in network order.
 Sending Templates for ONE_PACKET conn.
 Renaming of ip_vs_sync_mesg to ip_vs_sync_mesg_v0

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-11-25 10:42:59 +09:00
Hans Schillstrom
fe5e7a1efb IPVS: Backup, Adding Version 1 receive capability
Functionality improvements
 * flags  changed from 16 to 32 bits
 * fwmark added (32 bits)
 * timeout in sec. added (32 bits)
 * pe data added (Variable length)
 * IPv6 capabilities (3x16 bytes for addr.)
 * Version and type in every conn msg.

ip_vs_process_message() now handles Version 1 messages
and will call ip_vs_process_message_v0() for version 0 messages.

ip_vs_proc_conn() is common for both version, and handles the update of
connection hash.

ip_vs_conn_fill_param_sync()    - Version 1 messages only
ip_vs_conn_fill_param_sync_v0() - Version 0 messages only

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-11-25 10:42:59 +09:00
Hans Schillstrom
0e051e683b IPVS: Backup, Prepare for transferring firewall marks (fwmark) to the backup daemon.
One struct will have fwmark added:
 * ip_vs_conn

ip_vs_conn_new() and ip_vs_find_dest()
will have an extra param - fwmark
The effects of that, is in this patch.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-11-25 10:42:58 +09:00
John W. Linville
51cce8a590 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-11-24 16:49:20 -05:00
Johannes Berg
c063dbf52b cfg80211: allow using CQM event to notify packet loss
This adds the ability for drivers to use CQM events
to notify about packet loss for specific stations
(which could be the AP for the managed mode case).
Since the threshold might be determined by the
driver (it isn't passed in right now) it will be
passed out of the driver to userspace in the event.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24 16:19:36 -05:00
Bruno Randolf
79b1c460a0 cfg80211: Add documentation for antenna ops
Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24 16:19:36 -05:00
Felix Fietkau
dd5b4cc71c cfg80211/mac80211: improve ad-hoc multicast rate handling
- store the multicast rate as an index instead of the rate value
  (reduces cpu overhead in a hotpath)
- validate the rate values (must match a bitrate in at least one sband)

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24 16:19:35 -05:00
John W. Linville
d7a066c923 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-11-24 16:19:24 -05:00
John W. Linville
ccb1435401 Revert "nl80211/mac80211: Report signal average"
This reverts commit 86107fd170.

This patch inadvertently changed the userland ABI.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24 16:18:36 -05:00
Eric Dumazet
bba14de987 scm: lower SCM_MAX_FD
Lower SCM_MAX_FD from 255 to 253 so that allocations for scm_fp_list are
halved. (commit f8d570a4 added two pointers in this structure)

scm_fp_dup() should not copy whole structure (and trigger kmemcheck
warnings), but only the used part. While we are at it, only allocate
needed size.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24 11:16:43 -08:00
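Why dropping two descriptors halves the allocation can be shown with a little arithmetic. The sketch assumes the structure consists of two list pointers, an `int` count padded to pointer size, and the descriptor array, and that the allocator rounds sizes up to a power of two; both are assumptions made for illustration, and the helper names are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of the assumed layout for a given pointer size. */
static size_t fp_list_bytes(size_t ptr_size, size_t max_fd)
{
    assert(ptr_size == 4 || ptr_size == 8);
    return 2 * ptr_size          /* list head: two pointers */
         + ptr_size              /* count, padded to pointer alignment */
         + max_fd * ptr_size;    /* one pointer per passed descriptor */
}

/* Smallest power of two >= n, modelling a power-of-two size class. */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}
```

On a 64-bit build this gives 2064 bytes for 255 descriptors, which lands in a 4096-byte class, and exactly 2048 bytes for 253.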
Eric Dumazet
456b61bca8 ipv6: mcast: RCU conversion
ipv6_sk_mc_lock rwlock becomes a spinlock.

Readers (inet6_mc_check()) now take rcu_read_lock() instead of the read
lock. Writers don't need to disable BH anymore.

struct ipv6_mc_socklist objects are reclaimed after one RCU grace
period.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24 11:16:42 -08:00
Luis R. Rodriguez
b2e253cf30 cfg80211: Fix regulatory bug with multiple cards and delays
When two cards are connected with the same regulatory domain
if CRDA had a delayed response then cfg80211's own set regulatory
domain would still be the world regulatory domain. There was a bug
on cfg80211's logic such that it assumed that once you pegged a
request as the last request it was already the currently set
regulatory domain. This would mean we would race setting a stale
regulatory domain to secondary cards which had the same regulatory
domain since the alpha2 would match.

We fix this by processing each regulatory request atomically,
and only move on to the next one once we get it fully processed.
In the case CRDA is not present we will simply world roam.

This issue is only present when you have a slow system and the
CRDA processing is delayed. Because of this it is not a known
regression.

Without this fix when a delay is present with CRDA the second card
would end up with an intersected regulatory domain and not allow it
to use the channels it really is designed for. When two cards with
two different regulatory domains were inserted you'd end up
rejecting the second card's regulatory domain request.
This fails with mac80211_hwsim's regtest=2 (two requests, same alpha2)
and regtest=3 (two requests, different alpha2) module parameter
options.

This was reproduced and tested against mac80211_hwsim using this
CRDA delayer:

       #!/bin/bash
       echo $COUNTRY >> /tmp/log
       sleep 2
       /sbin/crda.orig

And these regulatory tests:

       modprobe mac80211_hwsim regtest=2
       modprobe mac80211_hwsim regtest=3

Reported-by: Mark Mentovai <mark@moxienet.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Tested-by: Mark Mentovai <mark@moxienet.com>
Tested-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-22 15:48:51 -05:00
Jan Engelhardt
20a95a2169 netns: let net_generic take pointer-to-const args
This commit is the same in nature as v2.6.37-rc1-755-g3654654; the network
namespace itself is not modified when calling net_generic, so the
parameter can be const.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-21 10:05:10 -08:00
David S. Miller
24912420e9 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bonding/bond_main.c
	net/core/net-sysfs.c
	net/ipv6/addrconf.c
2010-11-19 13:13:47 -08:00
David S. Miller
07bfa524d4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-11-18 11:56:09 -08:00
Bruno Randolf
86107fd170 nl80211/mac80211: Report signal average
Extend nl80211 to report an exponential weighted moving average (EWMA) of the
signal value. Since the signal value usually fluctuates between different
packets, an average can be more useful than the value of the last packet.

This uses the recently added generic EWMA library function.
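
As a rough illustration of the averaging idea (not the kernel's EWMA
library, whose interface is not shown here), a minimal Python sketch might
look like this. The function name ewma and the weight parameter are
hypothetical and chosen only for this example.

```python
def ewma(samples, weight=8):
    """Return the exponentially weighted moving average of samples.

    Each new sample moves the running average by 1/weight of its
    distance from that average. `weight` is a hypothetical knob for
    this sketch, not a parameter taken from the kernel library.
    """
    avg = None
    for sample in samples:
        if avg is None:
            avg = float(sample)              # first sample seeds the average
        else:
            avg += (sample - avg) / weight   # pull the average toward the sample
    return avg


# Signal readings in dBm: a single outlier only nudges the average.
readings = [-60, -60, -60, -40, -60]
print(ewma(readings))
```

A larger weight makes the average react more slowly to individual packets.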

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-18 14:22:20 -05:00
Tetsuo Handa
ef22b7b65f net: Fix duplicate volatile warning.
jiffies is defined as "volatile".

  extern unsigned long volatile __jiffy_data jiffies;

ACCESS_ONCE() uses "volatile".
As a result, some compilers warn about a duplicate `volatile' for ACCESS_ONCE(jiffies).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18 09:40:04 -08:00
Johannes Berg
4bce22b9b8 mac80211: defines for AC numbers
In many places we've just hardcoded the
AC numbers -- which is a relic from the
original mac80211 (d80211). Add constants
for them so we know what we're talking
about.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-17 16:19:31 -05:00
Changli Gao
5811662b15 net: use the macros defined for the members of flowi
Use the macros defined for the members of flowi to clean the code up.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17 12:27:45 -08:00
Thomas Graf
f8ff182c71 rtnetlink: Link address family API
Each net_device contains address family specific data such as
per device settings and statistics. We already expose this data
via procfs/sysfs and partially netlink.

The netlink method requires the requester to send one RTM_GETLINK
request for each address family it wishes to receive data of
and then merge this data itself.

This patch implements a new API which combines all address family
specific link data in a new netlink attribute IFLA_AF_SPEC.
IFLA_AF_SPEC contains a sequence of nested attributes, one for each
address family which in turn defines the structure of its own
attribute. Example:

   [IFLA_AF_SPEC] = {
       [AF_INET] = {
           [IFLA_INET_CONF] = ...,
       },
       [AF_INET6] = {
           [IFLA_INET6_FLAGS] = ...,
           [IFLA_INET6_CONF] = ...,
       }
   }

The API also allows for address families to implement a function
which parses the IFLA_AF_SPEC attribute sent by userspace to
implement address family specific link options.

Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17 11:28:24 -08:00
Felix Fietkau
8f0729b16a mac80211: add support for setting the ad-hoc multicast rate
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-16 16:39:08 -05:00
Felix Fietkau
885a46d0f7 cfg80211: add support for setting the ad-hoc multicast rate
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-16 16:39:08 -05:00
Juuso Oikarinen
a619a4c0e1 mac80211: Add function to get probe request template for current AP
Chipsets with hardware based connection monitoring need to autonomously
send directed probe-request frames to the AP (in the event of beacon loss,
for example).

For the hardware to be able to do this, it requires a template for the frame
to transmit to the AP, filled in with the BSSID and SSID of the AP, but also
the supported rate IEs.

This patch adds a function to mac80211, which allows the hardware driver to
fetch this template after association, so it can be configured to the hardware.

Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-16 16:37:08 -05:00
Bruno Randolf
15d9675321 mac80211: Add antenna configuration
Allow antenna configuration by calling driver's function for it.

We disallow antenna configuration if the wiphy is already running, mainly to
make life easier for 802.11n drivers which need to recalculate HT capabilities.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-16 16:37:05 -05:00
Bruno Randolf
afe0cbf875 cfg80211: Add nl80211 antenna configuration
Allow setting of the TX and RX antenna configuration via nl80211.

The antenna configuration is defined as a bitmap of allowed antennas to use.
This API can be used to mask out antennas which are not attached or should not
be used for other reasons like regulatory concerns or special setups.

Separate bitmaps are used for RX and TX to allow configuring different antennas
for receiving and transmitting. Each bitmap is 32 bit long, each bit
representing one antenna, starting with antenna 1 at the first bit. If an
antenna bit is set, this means the driver is allowed to use this antenna for RX
or TX respectively; if the bit is not set the hardware is not allowed to use
this antenna.

Using bitmaps has the benefit of allowing for a flexible configuration
interface which can support many different configurations and which can be used
for 802.11n as well as non-802.11n devices. Instead of relying on some hardware
specific assumptions, drivers can use this information to know which antennas
are actually attached to the system and derive their capabilities based on
that.

802.11n devices should enable or disable chains, based on which antennas are
present (If all antennas belonging to a particular chain are disabled, the
entire chain should be disabled). HT capabilities (like STBC, TX Beamforming,
Antenna selection) should be calculated based on the available chains after
applying the antenna masks. Should an 802.11n device have diversity antennas
attached to one of its chains, diversity can be enabled or disabled based on
the antenna information.

Non-802.11n drivers can use the antenna masks to select RX and TX antennas and
to enable or disable antenna diversity.

While covering chainmasks for 802.11n and the standard "legacy" modes "fixed
antenna 1", "fixed antenna 2" and "diversity", this API also allows rarer
but useful configurations such as the following:

1) Send on antenna 1, receive on antenna 2 (or vice versa). This can be used to
have a low gain antenna for TX in order to keep within the regulatory
constraints and a high gain antenna for RX in order to receive weaker signals
("speak softly, but listen harder"). This can be useful for building long-shot
outdoor links. Another usage of this setup is having a low-noise pre-amplifier
on antenna 1 and a power amplifier on the other antenna. This way transmit
noise is mostly kept out of the low noise receive channel.
(This would be bitmaps: tx 1 rx 2).

2) Another similar setup is: Use RX diversity on both antennas, but always send
on antenna 1. Again that would allow us to benefit from a higher gain RX
antenna, while staying within the legal limits.
(This would be: tx 1 rx 3).

3) And finally there can be special experimental setups in research and
development even with pre 802.11n hardware where more than 2 antennas are
available. It's good to keep the API simple, yet flexible.
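
The bitmap encoding described above (antenna 1 at the first bit, one bit per
antenna, 32 bits wide) can be sketched in Python as follows. The helper names
antenna_mask and allowed are hypothetical and exist only for this
illustration; they are not part of nl80211.

```python
MASK_BITS = 32  # each bitmap is 32 bits wide, per the description above


def antenna_mask(*antennas):
    """Build a bitmap from 1-based antenna numbers (antenna 1 -> bit 0)."""
    mask = 0
    for n in antennas:
        if not 1 <= n <= MASK_BITS:
            raise ValueError(f"antenna number out of range: {n}")
        mask |= 1 << (n - 1)
    return mask


def allowed(mask, antenna):
    """Return True if the bitmap permits use of the given antenna."""
    return bool(mask & (1 << (antenna - 1)))


# Send on antenna 1, receive on antenna 2.
tx = antenna_mask(1)
rx = antenna_mask(2)
print(tx, rx)

# Allowing antennas 1 and 2 together sets the two lowest bits.
both = antenna_mask(1, 2)
print(both, allowed(both, 2), allowed(tx, 2))
```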

Signed-off-by: Bruno Randolf <br1@einfach.org>

--
v7:	Made bitmasks 32 bit wide and rebased to latest wireless-testing.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-16 16:37:05 -05:00
Arik Nemtsov
f23a478075 mac80211: support hardware TX fragmentation offload
The lower driver is notified when the fragmentation threshold changes
and upon a reconfig of the interface.

If the driver supports hardware TX fragmentation, don't fragment
packets in the stack.

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-16 16:37:04 -05:00
Eric Dumazet
b178bb3dfc net: reorder struct sock fields
Right now, fields in struct sock are not optimally ordered, because each
path (RX softirq, TX completion, RX user,  TX user) has to touch fields
that are contained in many different cache lines.

The really critical thing is to shrink number of cache lines that are
used at RX softirq time : CPU handling softirqs for a device can receive
many frames per second for many sockets. If load is too big, we can drop
frames at NIC level. RPS or multiqueue cards can help, but better reduce
latency if possible.

This patch starts with UDP protocol, then additional patches will try to
reduce latencies of other ones as well.

At RX softirq time, fields of interest for UDP protocol are :
(not counting ones in inet struct for the lookup)

Read/Written:
sk_refcnt   (atomic increment/decrement)
sk_rmem_alloc & sk_backlog.len (to check if there is room in queues)
sk_receive_queue
sk_backlog (if socket locked by user program)
sk_rxhash
sk_forward_alloc
sk_drops

Read only:
sk_rcvbuf (sk_rcvqueues_full())
sk_filter
sk_wq
sk_policy[0]
sk_flags

Additional notes :

- sk_backlog has one hole on 64bit arches. We can fill it to save 8
bytes.
- sk_backlog is used only if RX softirq handler finds the socket while
locked by user.
- sk_rxhash is written only once per flow.
- sk_drops is written only if queues are full

Final layout :

[1] One section grouping all read/write fields, but placing rxhash and
sk_backlog at the end of this section.

[2] One section grouping all read fields in RX handler
   (sk_filter, sk_rcvbuf, sk_wq)

[3] Section used by other paths

I'll post a patch on its own to put sk_refcnt at the end of struct
sock_common so that it shares the same cache line as section [1].

New offsets on 64bit arch :

sizeof(struct sock)=0x268
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x48
offsetof(struct sock, sk_receive_queue)=0x68
offsetof(struct sock, sk_backlog)=0x80
offsetof(struct sock, sk_rmem_alloc)=0x80
offsetof(struct sock, sk_forward_alloc)=0x98
offsetof(struct sock, sk_rxhash)=0x9c
offsetof(struct sock, sk_rcvbuf)=0xa4
offsetof(struct sock, sk_drops) =0xa0
offsetof(struct sock, sk_filter)=0xa8
offsetof(struct sock, sk_wq)=0xb0
offsetof(struct sock, sk_policy)=0xd0
offsetof(struct sock, sk_flags) =0xe0

Instead of :

sizeof(struct sock)=0x270
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x50
offsetof(struct sock, sk_receive_queue)=0xc0
offsetof(struct sock, sk_backlog)=0x70
offsetof(struct sock, sk_rmem_alloc)=0xac
offsetof(struct sock, sk_forward_alloc)=0x10c
offsetof(struct sock, sk_rxhash)=0x128
offsetof(struct sock, sk_rcvbuf)=0x4c
offsetof(struct sock, sk_drops) =0x16c
offsetof(struct sock, sk_filter)=0x198
offsetof(struct sock, sk_wq)=0x88
offsetof(struct sock, sk_policy)=0x98
offsetof(struct sock, sk_flags) =0x130
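
To get a feel for what the two layouts mean in cache terms, the offsets
listed above can be mapped to cache line indexes. This is only a sketch: it
assumes 64-byte cache lines (the message does not state the line size) and
counts only the line holding the first byte of each field, ignoring fields
that span more than one line.

```python
CACHE_LINE = 64  # assumption: 64-byte cache lines

# Offsets copied from the two lists above.
NEW = {
    "sk_refcnt": 0x10, "sk_lock": 0x48, "sk_receive_queue": 0x68,
    "sk_backlog": 0x80, "sk_rmem_alloc": 0x80, "sk_forward_alloc": 0x98,
    "sk_rxhash": 0x9c, "sk_rcvbuf": 0xa4, "sk_drops": 0xa0,
    "sk_filter": 0xa8, "sk_wq": 0xb0, "sk_policy": 0xd0, "sk_flags": 0xe0,
}
OLD = {
    "sk_refcnt": 0x10, "sk_lock": 0x50, "sk_receive_queue": 0xc0,
    "sk_backlog": 0x70, "sk_rmem_alloc": 0xac, "sk_forward_alloc": 0x10c,
    "sk_rxhash": 0x128, "sk_rcvbuf": 0x4c, "sk_drops": 0x16c,
    "sk_filter": 0x198, "sk_wq": 0x88, "sk_policy": 0x98, "sk_flags": 0x130,
}


def lines_touched(offsets, line_size=CACHE_LINE):
    """Return the sorted cache line indexes where the listed fields start."""
    return sorted({off // line_size for off in offsets.values()})


print("new layout:", lines_touched(NEW))
print("old layout:", lines_touched(OLD))
```

Under these assumptions the listed fields start in fewer distinct cache
lines in the new layout than in the old one.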

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16 11:17:43 -08:00
Eric Dumazet
c31504dc0d udp: use atomic_inc_not_zero_hint
UDP sockets refcount is usually 2, unless an incoming frame is going to
be queued in receive or backlog queue.

Using atomic_inc_not_zero_hint() reduces latency, because the
processor issues fewer memory transactions.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16 11:17:43 -08:00
Jan Engelhardt
3654654f7a netlink: let nlmsg and nla functions take pointer-to-const args
The changed functions do not modify the NL messages and/or attributes
at all. They should use const (similar to strchr), so that callers
which have a const nlmsg/nlattr around can make use of them without
casting.

While at it, constify a data array.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16 09:52:32 -08:00
David S. Miller
b5e4156743 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-11-16 09:17:12 -08:00
Simon Horman
d494262b8a IPVS: Make the cp argument to ip_vs_sync_conn() static
Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-11-16 08:13:07 +09:00
Simon Horman
e9e5eee873 IPVS: Add persistence engine to connection entry
The dest of a connection may not exist if it has been created as the result
of connection synchronisation. But for connection entries for templates with
persistence engine data created through connection synchronisation to be
valid, access to the persistence engine pointer is required. So add the
persistence engine to the connection itself.

Signed-off-by: Simon Horman <horms@verge.net.au>
2010-11-16 08:13:07 +09:00
Jussi Kivilinna
309075cf08 cfg80211: fix WIPHY_FLAG_IBSS_RSN bit
WIPHY_FLAG_IBSS_RSN is BIT(7) as is WIPHY_FLAG_CONTROL_PORT_PROTOCOL. Change
to BIT(8).
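
Why two flags sharing one bit is a problem can be shown with a small Python
sketch. BIT() mirrors the kernel macro of the same name; the dictionaries
and the has_collision helper are hypothetical and hold only the two flags
named above.

```python
def BIT(n):
    """Python stand-in for the kernel's BIT() macro."""
    return 1 << n


before = {
    "WIPHY_FLAG_CONTROL_PORT_PROTOCOL": BIT(7),
    "WIPHY_FLAG_IBSS_RSN": BIT(7),   # same bit: the two flags alias
}
after = {
    "WIPHY_FLAG_CONTROL_PORT_PROTOCOL": BIT(7),
    "WIPHY_FLAG_IBSS_RSN": BIT(8),   # moved to its own bit
}


def has_collision(flags):
    """Return True if any two flag names map to the same bit value."""
    return len(set(flags.values())) != len(flags)


# With the aliased value, setting one flag makes the other test true too.
wiphy_flags = before["WIPHY_FLAG_CONTROL_PORT_PROTOCOL"]
print(bool(wiphy_flags & before["WIPHY_FLAG_IBSS_RSN"]))
print(has_collision(before), has_collision(after))
```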

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-15 15:00:42 -05:00
Arnd Hannemann
62370e2b93 b43legacy: Fix compile on ARM architecture
When b43legacy is compiled on the arm platform, the following errors are seen:

  CC [M]  drivers/net/wireless/b43legacy/xmit.o
In file included from include/net/dst.h:11,
from drivers/net/wireless/b43legacy/xmit.c:31:
include/net/dst_ops.h:28: error: expected ':', ',', ';', '}' or '__attribute__'
   before '____cacheline_aligned_in_smp'
include/net/dst_ops.h: In function 'dst_entries_get_fast':
include/net/dst_ops.h:33: error: 'struct dst_ops' has no member named
   'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_get_slow':
include/net/dst_ops.h:41: error: 'struct dst_ops' has no member named
   'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_add':
include/net/dst_ops.h:49: error: 'struct dst_ops' has no member named
   'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_init':
include/net/dst_ops.h:55: error: 'struct dst_ops' has no member named
   'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_destroy':
include/net/dst_ops.h:60: error: 'struct dst_ops' has no member named
   'pcpuc_entries'
make[4]: *** [drivers/net/wireless/b43legacy/xmit.o] Error 1
make[3]: *** [drivers/net/wireless/b43legacy] Error 2
make[2]: *** [drivers/net/wireless] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2

The cause is a missing include of <linux/cache.h>, which is present for
i386 and x86_64 architectures, but not for arm.

Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-15 15:00:42 -05:00
Joe Perches
d577f1ccdd include/net/caif/cfctrl.h: Remove unnecessary semicolons
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15 11:07:16 -08:00
Timo Teräs
cc9ff19da9 xfrm: use gre key as flow upper protocol info
The GRE Key field is intended to be used for identifying an individual
traffic flow within a tunnel. It is useful to be able to have XFRM
policy selector matches to have different policies for different
GRE tunnels.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15 10:44:04 -08:00
Luis R. Rodriguez
749b527b21 cfg80211: fix allowing country IEs for WIPHY_FLAG_STRICT_REGULATORY
We should be enabling country IE hints for WIPHY_FLAG_STRICT_REGULATORY
even if we haven't yet received a regulatory domain hint for the driver
if it needed one. Without this, Country IEs are not passed on to drivers
that have set WIPHY_FLAG_STRICT_REGULATORY. Today this is just the
Atheros chipset drivers: ath5k, ath9k, ar9170, carl9170.

This was part of the original design, however it was completely
overlooked...

Cc: Easwar Krishnan <easwar.krishnan@atheros.com>
Cc: stable@kernel.org
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-15 13:24:09 -05:00
Eric Dumazet
0e60ebe04c netfilter: add __rcu annotations
Add some __rcu annotations and use helpers to reduce number of sparse
warnings (CONFIG_SPARSE_RCU_POINTER=y)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-15 18:17:21 +01:00
Changli Gao
03c0e5bb34 netfilter: nf_nat: define nat_pptp_info as needed
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-15 12:27:27 +01:00
Changli Gao
e0e76c83be netfilter: ct_extend: define NF_CT_EXT_* as needed
Fewer IDs make nf_ct_ext smaller.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-15 12:23:24 +01:00
Changli Gao
76a2d3bcfc netfilter: nf_nat: don't use atomic bit operation
As we own the conntrack and the others can't see it until we confirm it,
we don't need to use atomic bit operation on ct->status.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-15 11:59:03 +01:00
Changli Gao
0f8e80044b netfilter: nf_conntrack: define ct_*_info as needed
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-15 11:51:06 +01:00
David S. Miller
c25ecd0a21 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-11-14 11:57:05 -08:00
Linus Torvalds
9457b24a09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits)
  can-bcm: fix minor heap overflow
  gianfar: Do not call device_set_wakeup_enable() under a spinlock
  ipv6: Warn users if maximum number of routes is reached.
  docs: Add neigh/gc_thresh3 and route/max_size documentation.
  axnet_cs: fix resume problem for some Ax88790 chip
  ipv6: addrconf: don't remove address state on ifdown if the address is being kept
  tcp: Don't change unlocked socket state in tcp_v4_err().
  x25: Prevent crashing when parsing bad X.25 facilities
  cxgb4vf: add call to Firmware to reset VF State.
  cxgb4vf: Fail open if link_start() fails.
  cxgb4vf: flesh out PCI Device ID Table ...
  cxgb4vf: fix some errors in Gather List to skb conversion
  cxgb4vf: fix bug in Generic Receive Offload
  cxgb4vf: don't implement trivial (and incorrect) ndo_select_queue()
  ixgbe: Look inside vlan when determining offload protocol.
  bnx2x: Look inside vlan when determining checksum proto.
  vlan: Add function to retrieve EtherType from vlan packets.
  virtio-net: init link state correctly
  ucc_geth: Fix deadlock
  ucc_geth: Do not bring the whole IF down when TX failure.
  ...
2010-11-12 17:17:55 -08:00
Eric Dumazet
1d7138de87 igmp: RCU conversion of in_dev->mc_list
in_dev->mc_list is protected by one rwlock (in_dev->mc_list_lock).

This can easily be converted to a RCU protection.

Writers hold RTNL, so mc_list_lock is removed, not replaced by a
spinlock.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Cypher Wu <cypher.w@gmail.com>
Cc: Américo Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12 13:18:57 -08:00
Changli Gao
e5fc9e7a66 netfilter: nf_conntrack: don't always initialize ct->proto
ct->proto is big (60 bytes) due to structure ip_ct_tcp, and we don't need
to initialize the whole of it for all the other protocols. This patch moves
proto to the end of structure nf_conn, and pushes the initialization down
to the individual protocols.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-12 17:33:17 +01:00
David S. Miller
c753796769 ipv4: Make rt->fl.iif tests lest obscure.
When we test rt->fl.iif against zero, we're seeing if it's
an output or an input route.

Make that explicit with some helper functions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11 17:07:48 -08:00
Eric Dumazet
72cdd1d971 net: get rid of rtable->idev
It seems the idev field in struct rtable has no special purpose other than
adding extra atomic ops.

We hold refcounts on the device itself (using percpu data, so pretty
cheap in current kernel).

The infiniband case is solved by using dst.dev instead of idev->dev.

Removal of this field means routing without route cache is now using
shared data, percpu data, and only potential contention is a pair of
atomic ops on struct neighbour per forwarded packet.

About 5% speedup on routing test.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11 10:29:40 -08:00
Eric Dumazet
46b13fc5c0 neigh: reorder struct neighbour
It is important to move nud_state outside of the often modified cache
line (because of refcnt), to reduce false sharing in neigh_event_send()

This is a followup of commit 0ed8ddf404 (neigh: Protect neigh->ha[]
with a seqlock)

This gives a 7% speedup on routing test with IP route cache disabled.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11 10:29:40 -08:00
Eric Dumazet
8d987e5c75 net: avoid limits overflow
Robin Holt tried to boot a 16TB machine and found some limits were
reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch infrastructure to use "long" instead of "int", now that
atomic_long_t primitives are available for free.
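
As a back-of-the-envelope illustration of how a page-denominated quantity
can outgrow a 32-bit int on such a machine, consider the arithmetic below.
The 4 KiB page size is an assumption made for this sketch (the message does
not state it), and the real limits are derived from the page count rather
than being equal to it.

```python
INT_MAX = 2**31 - 1     # largest value of a signed 32-bit int
LONG_MAX = 2**63 - 1    # largest value of a signed 64-bit long
PAGE_SIZE = 4096        # assumption: 4 KiB pages


def pages(mem_bytes, page_size=PAGE_SIZE):
    """Number of whole pages in mem_bytes of memory."""
    return mem_bytes // page_size


total = pages(16 * 2**40)   # 16 TiB of RAM
print(total, total > INT_MAX, total <= LONG_MAX)
```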

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Robin Holt <holt@sgi.com>
Reviewed-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-10 12:12:00 -08:00
Eric Dumazet
fc766e4c49 decnet: RCU conversion and get rid of dev_base_lock
While tracking dev_base_lock users, I found decnet used it in
dnet_select_source(), but for a wrong purpose:

Writers only hold RTNL, not dev_base_lock, so readers must use RCU if
they cannot use RTNL.

Adds an rcu_head in struct dn_ifaddr and handles proper RCU management.

Adds __rcu annotation in dn_route as well.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-08 13:50:08 -08:00
David S. Miller
d0eaeec8e8 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-11-08 12:38:28 -08:00
Paul Mundt
43b81f85eb net dst: need linux/cache.h for ____cacheline_aligned_in_smp.
Presently the b43legacy build fails on an sh randconfig:

In file included from include/net/dst.h:12,
                 from drivers/net/wireless/b43legacy/xmit.c:32:
include/net/dst_ops.h:28: error: expected ':', ',', ';', '}' or '__attribute__' before '____cacheline_aligned_in_smp'
include/net/dst_ops.h: In function 'dst_entries_get_fast':
include/net/dst_ops.h:33: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_get_slow':
include/net/dst_ops.h:41: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_add':
include/net/dst_ops.h:49: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_init':
include/net/dst_ops.h:55: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_destroy':
include/net/dst_ops.h:60: error: 'struct dst_ops' has no member named 'pcpuc_entries'
make[5]: *** [drivers/net/wireless/b43legacy/xmit.o] Error 1
make[5]: *** Waiting for unfinished jobs....

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-07 19:58:05 -08:00
Linus Torvalds
4b4a2700f4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits)
  inet_diag: Make sure we actually run the same bytecode we audited.
  netlink: Make nlmsg_find_attr take a const nlmsghdr*.
  fib: fib_result_assign() should not change fib refcounts
  netfilter: ip6_tables: fix information leak to userspace
  cls_cgroup: Fix crash on module unload
  memory corruption in X.25 facilities parsing
  net dst: fix percpu_counter list corruption and poison overwritten
  rds: Remove kfreed tcp conn from list
  rds: Lost locking in loop connection freeing
  de2104x: fix panic on load
  atl1 : fix panic on load
  netxen: remove unused firmware exports
  caif: Remove noisy printout when disconnecting caif socket
  caif: SPI-driver bugfix - incorrect padding.
  caif: Bugfix for socket priority, bindtodev and dbg channel.
  smsc911x: Set Ethernet EEPROM size to supported device's size
  ipv4: netfilter: ip_tables: fix information leak to userland
  ipv4: netfilter: arp_tables: fix information leak to userland
  cxgb4vf: remove call to stop TX queues at load time.
  cxgb4: remove call to stop TX queues at load time.
  ...
2010-11-05 15:25:48 -07:00
Nelson Elhage
6b8c92ba07 netlink: Make nlmsg_find_attr take a const nlmsghdr*.
This will let us use it on a nlmsghdr stored inside a netlink_callback.

Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-04 12:26:34 -07:00
Sjur Brændeland
2c24a5d1b4 caif: SPI-driver bugfix - incorrect padding.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-03 18:50:03 -07:00
André Carvalho de Matos
f2527ec436 caif: Bugfix for socket priority, bindtodev and dbg channel.
Changes:
o Bugfix: SO_PRIORITY for SOL_SOCKET could not be handled
  in caif's setsockopt,  using the struct sock attribute priority instead.

o Bugfix: SO_BINDTODEVICE for SOL_SOCKET could not be handled
  in caif's setsockopt,  using the struct sock attribute ifindex instead.

o Wrong assert statement for RFM layer segmentation.

o CAIF Debug channels were not working over SPI; caif_payload_info
  containing padding info must be initialized.

o Check on pointer before dereferencing when unregister dev in caif_dev.c

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-03 18:50:03 -07:00
Uwe Kleine-König
b595076a18 tree-wide: fix comment/printk typos
"gadget", "through", "command", "maintain", "maintain", "controller", "address",
"between", "initiali[zs]e", "instead", "function", "select", "already",
"equal", "access", "management", "hierarchy", "registration", "interest",
"relative", "memory", "offset", "already",

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-01 15:38:34 -04:00
Linus Torvalds
1840897ab5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits)
  b43: Fix warning at drivers/mmc/core/core.c:237 in mmc_wait_for_cmd
  mac80211: fix failure to check kmalloc return value in key_key_read
  libertas: Fix sd8686 firmware reload
  ath9k: Fix incorrect access of rate flags in RC
  netfilter: xt_socket: Make tproto signed in socket_mt6_v1().
  stmmac: enable/disable rx/tx in the core with a single write.
  net: atarilance - flags should be unsigned long
  netxen: fix kdump
  pktgen: Limit how much data we copy onto the stack.
  net: Limit socket I/O iovec total length to INT_MAX.
  USB: gadget: fix ethernet gadget crash in gether_setup
  fib: Fix fib zone and its hash leak on namespace stop
  cxgb3: Fix panic in free_tx_desc()
  cxgb3: fix crash due to manipulating queues before registration
  8390: Don't oops on starting dev queue
  dccp ccid-2: Stop polling
  dccp: Refine the wait-for-ccid mechanism
  dccp: Extend CCID packet dequeueing interface
  dccp: Return-value convention of hc_tx_send_packet()
  igbvf: fix panic on load
  ...
2010-10-29 14:17:12 -07:00
Pavel Emelyanov
4aa2c466a7 fib: Fix fib zone and its hash leak on namespace stop
When we stop a namespace we flush the table and free it, but the
added fn_zone-s (and their hashes if grown) are leaked. They need to be
freed. The trie implementation releases all of its state in the flushing code.

Shame on us - this bug has existed since the very first make-fib-per-net
patches in 2.6.27 :(

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-28 10:27:03 -07:00
Venkateswararao Jujjuri (JV)
b165d60145 9p: Add datasync to client side TFSYNC/RFSYNC for dotl
SYNOPSIS
    size[4] Tfsync tag[2] fid[4] datasync[4]

    size[4] Rfsync tag[2]

DESCRIPTION

    The Tfsync transaction transfers ("flushes") all modified in-core data of
    the file identified by fid to the disk device (or other permanent storage
    device) where that file resides.

    If the datasync flag is specified, data will be flushed, but modified
    metadata will not be flushed unless that metadata is needed in order to
    allow a subsequent data retrieval to be correctly handled.
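
    The Tfsync layout in the synopsis can be sketched with Python's struct
    module. This assumes the usual 9P framing, where the message name in
    the synopsis stands for a one-byte type field and all integers are
    little-endian. The type value 50 and the function name pack_tfsync are
    assumptions for this example and are not given in this message.

```python
import struct

TFSYNC = 50  # assumption: message type number, not stated in this message


def pack_tfsync(tag, fid, datasync, msg_type=TFSYNC):
    """Pack size[4] type[1] tag[2] fid[4] datasync[4], little-endian."""
    body = struct.pack("<BHII", msg_type, tag, fid, datasync)
    size = 4 + len(body)              # size[4] counts itself as well
    return struct.pack("<I", size) + body


msg = pack_tfsync(tag=1, fid=7, datasync=1)
print(len(msg), msg.hex())
```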

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28 09:08:49 -05:00
M. Mohan Kumar
329176cc2c 9p: Implement TREADLINK operation for 9p2000.L
Synopsis

	size[4] TReadlink tag[2] fid[4]
	size[4] RReadlink tag[2] target[s]

Description
	Readlink is used to return the contents of the symbolic link
	referred to by fid. The contents of the symbolic link are returned
	in the response.

	target[s] - Contents of the symbolic link referred to by fid.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28 09:08:48 -05:00
M. Mohan Kumar
1d769cd192 9p: Implement TGETLOCK
Synopsis

    size[4] TGetlock tag[2] fid[4] getlock[n]
    size[4] RGetlock tag[2] getlock[n]

Description

TGetlock is used to test for the existence of byte range posix locks on a file
identified by the given fid. The reply contains a getlock structure. If the lock
could be placed it returns F_UNLCK in the type field of the getlock structure.
Otherwise it returns the details of the conflicting locks in the getlock structure.

    getlock structure:
      type[1] - Type of lock: F_RDLCK, F_WRLCK
      start[8] - Starting offset for lock
      length[8] - Number of bytes to check for the lock
             If length is 0, check for lock in all bytes starting at the location
            'start' through to the end of file
      pid[4] - PID of the process that wants to take lock/owns the task
               in case of reply
      client[4] - Client id of the system that owns the process which
                  has the conflicting lock

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28 09:08:47 -05:00
M. Mohan Kumar
a099027c77 9p: Implement TLOCK
Synopsis

    size[4] TLock tag[2] fid[4] flock[n]
    size[4] RLock tag[2] status[1]

Description

Tlock is used to acquire/release byte-range POSIX locks on a file
identified by the given fid. The reply contains the status of the lock request.

    flock structure:
        type[1] - Type of lock: F_RDLCK, F_WRLCK, F_UNLCK
        flags[4] - Flags could be either of
          P9_LOCK_FLAGS_BLOCK - Blocking lock request; if a conflicting
            lock exists, wait for that lock to be released.
          P9_LOCK_FLAGS_RECLAIM - Reclaim lock request, used when the client is
            trying to reclaim a lock after a server restart (due to a crash)
        start[8] - Starting offset for lock
        length[8] - Number of bytes to lock
          If length is 0, lock all bytes starting at the location 'start'
          through to the end of file
        pid[4] - PID of the process that wants to take lock
        client_id[4] - Unique client id

        status[1] - Status of the lock request, can be
          P9_LOCK_SUCCESS(0), P9_LOCK_BLOCKED(1), P9_LOCK_ERROR(2) or
          P9_LOCK_GRACE(3)
          P9_LOCK_SUCCESS - Request was successful
          P9_LOCK_BLOCKED - A conflicting lock is held by another process
          P9_LOCK_ERROR - Error while processing the lock request
          P9_LOCK_GRACE - Server is in grace period, it can't accept new lock
            requests in this period (except locks with
            P9_LOCK_FLAGS_RECLAIM flag set)
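
A client-side sketch of the flock request and of status handling might look as follows. The status values are the ones listed above. The flag bit values, the action strings, and the helper names pack_flock and next_action are hypothetical and only serve the illustration.

```python
import struct
from enum import IntEnum


class LockStatus(IntEnum):
    # Numeric values as listed in the status[1] description.
    SUCCESS = 0
    BLOCKED = 1
    ERROR = 2
    GRACE = 3


# Flag bit values are assumptions made for this sketch.
P9_LOCK_FLAGS_BLOCK = 1
P9_LOCK_FLAGS_RECLAIM = 2


def pack_flock(lock_type, flags, start, length, pid, client_id):
    """type[1] flags[4] start[8] length[8] pid[4] client_id[4], little-endian."""
    return struct.pack("<BIQQII", lock_type, flags, start, length, pid, client_id)


def next_action(status_byte):
    """Map the status[1] byte of an RLock reply to a hypothetical client action."""
    status = LockStatus(status_byte)
    if status is LockStatus.SUCCESS:
        return "done"
    if status is LockStatus.BLOCKED:
        return "retry-later"
    if status is LockStatus.GRACE:
        # New locks are refused during the grace period; only reclaims pass.
        return "wait-for-grace-end"
    return "fail"
```

An unknown status byte raises ValueError from the LockStatus constructor, which keeps unexpected replies from being silently treated as success.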

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28 09:08:47 -05:00
Venkateswararao Jujjuri (JV)
920e65dc69 [9p] Introduce client side TFSYNC/RFSYNC for dotl.
SYNOPSIS
    size[4] Tfsync tag[2] fid[4]

    size[4] Rfsync tag[2]

DESCRIPTION

The Tfsync transaction transfers ("flushes") all modified in-core data of
file identified by fid to the disk device (or other permanent storage
device) where that file resides.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28 09:08:47 -05:00
Arun R Bharadwaj
4f7ebe8072 net/9p: This patch implements TLERROR/RLERROR on the 9P client.
Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-10-28 09:08:45 -05:00
Amarnath Revanna
a10c02036f caif-u5500: Adding shared memory include
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27 12:29:51 -07:00
Eric Dumazet
b914c4ea92 inetpeer: __rcu annotations
Adds __rcu annotations to inetpeer
	(struct inet_peer)->avl_left
	(struct inet_peer)->avl_right

This is a tedious cleanup, but removes one smp_wmb() from link_to_pool()
since we now use the more self-documenting rcu_assign_pointer().

Note the use of RCU_INIT_POINTER() instead of rcu_assign_pointer() in
all cases where we don't need a memory barrier.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27 11:37:33 -07:00
Eric Dumazet
7a2b03c517 fib_rules: __rcu annotates ctarget
Adds __rcu annotation to (struct fib_rule)->ctarget

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27 11:37:32 -07:00
Eric Dumazet
b33eab0844 tunnels: add __rcu annotations
Add __rcu annotations to :
        (struct ip_tunnel)->prl
        (struct ip_tunnel_prl_entry)->next
        (struct xfrm_tunnel)->next
	struct xfrm_tunnel *tunnel4_handlers
	struct xfrm_tunnel *tunnel64_handlers

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27 11:37:32 -07:00
Eric Dumazet
e0ad61ec86 net: add __rcu annotations to protocol
Add __rcu annotations to :
        struct net_protocol *inet_protos
        struct net_protocol *inet6_protos

And use appropriate casts to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27 11:37:31 -07:00
Eric Dumazet
1c31720a74 ipv4: add __rcu annotations to routes.c
Add __rcu annotations to :
        (struct dst_entry)->rt_next
        (struct rt_hash_bucket)->chain

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-27 11:37:31 -07:00
Eric Dumazet
43a951e999 ipv4: add __rcu annotations to ip_ra_chain
Add __rcu annotations to :
        (struct ip_ra_chain)->next
	struct ip_ra_chain *ip_ra_chain;

And use appropriate rcu primitives.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25 14:18:28 -07:00
Eric Dumazet
0d7da9ddd9 net: add __rcu annotation to sk_filter
Add __rcu annotation to :
        (struct sock)->sk_filter

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25 14:18:28 -07:00
Eric Dumazet
1c87733d06 net_ns: add __rcu annotations
add __rcu annotation to (struct net)->gen, and use
rcu_dereference_protected() in net_assign_generic()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25 14:18:27 -07:00
Eric Dumazet
6f0bcf1525 tunnels: add __rcu annotations
(struct ip6_tnl)->next is rcu protected :
(struct ip_tunnel)->next is rcu protected :
(struct xfrm6_tunnel)->next is rcu protected :

add __rcu annotation and proper rcu primitives.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25 13:09:45 -07:00
Eric Dumazet
3cc77ec74e net/802: add __rcu annotations
(struct net_device)->garp_port is rcu protected :
(struct garp_port)->applicants is rcu protected :

add __rcu annotation and proper rcu primitives.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25 13:09:44 -07:00
Linus Torvalds
5f05647dd8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits)
  bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL.
  vlan: Calling vlan_hwaccel_do_receive() is always valid.
  tproxy: use the interface primary IP address as a default value for --on-ip
  tproxy: added IPv6 support to the socket match
  cxgb3: function namespace cleanup
  tproxy: added IPv6 support to the TPROXY target
  tproxy: added IPv6 socket lookup function to nf_tproxy_core
  be2net: Changes to use only priority codes allowed by f/w
  tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
  tproxy: added tproxy sockopt interface in the IPV6 layer
  tproxy: added udp6_lib_lookup function
  tproxy: added const specifiers to udp lookup functions
  tproxy: split off ipv6 defragmentation to a separate module
  l2tp: small cleanup
  nf_nat: restrict ICMP translation for embedded header
  can: mcp251x: fix generation of error frames
  can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set
  can-raw: add msg_flags to distinguish local traffic
  9p: client code cleanup
  rds: make local functions/variables static
  ...

Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and
drivers/net/wireless/ath/ath9k/debug.c as per David
2010-10-23 11:47:02 -07:00
Linus Torvalds
888a6f77e0 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (52 commits)
  sched: fix RCU lockdep splat from task_group()
  rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value
  sched: suppress RCU lockdep splat in task_fork_fair
  net: suppress RCU lockdep false positive in sock_update_classid
  rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held
  rcu: Add advice to PROVE_RCU_REPEATEDLY kernel config parameter
  rcu: Add tracing data to support queueing models
  rcu: fix sparse errors in rcutorture.c
  rcu: only one evaluation of arg in rcu_dereference_check() unless sparse
  kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC
  rcu: fix _oddness handling of verbose stall warnings
  rcu: performance fixes to TINY_PREEMPT_RCU callback checking
  rcu: upgrade stallwarn.txt documentation for CPU-bound RT processes
  vhost: add __rcu annotations
  rcu: add comment stating that list_empty() applies to RCU-protected lists
  rcu: apply TINY_PREEMPT_RCU read-side speedup to TREE_PREEMPT_RCU
  rcu: combine duplicate code, courtesy of CONFIG_PREEMPT_RCU
  rcu: Upgrade srcu_read_lock() docbook about SRCU grace periods
  rcu: document ways of stalling updates in low-memory situations
  rcu: repair code-duplication FIXMEs
  ...
2010-10-21 12:54:12 -07:00
David S. Miller
9941fb6276 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2010-10-21 08:21:34 -07:00
Patrick McHardy
3b1a1ce6f4 Merge branch 'for-patrick' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6 2010-10-21 16:25:51 +02:00
Balazs Scheidler
3b9afb2991 tproxy: added IPv6 socket lookup function to nf_tproxy_core
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21 16:12:14 +02:00
Balazs Scheidler
aa976fc011 tproxy: added udp6_lib_lookup function
Just like with IPv4, we need access to the UDP hash table to look up local
sockets, but instead of exporting the global udp_table, export a lookup
function.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21 16:05:41 +02:00
Balazs Scheidler
e97c3e278e tproxy: split off ipv6 defragmentation to a separate module
Like with IPv4, TProxy needs IPv6 defragmentation but does not
require connection tracking. Since defragmentation was coupled
with conntrack, I split off the two, creating an nf_defrag_ipv6 module,
similar to the already existing nf_defrag_ipv4.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21 16:03:43 +02:00
stephen hemminger
32a875adcd 9p: client code cleanup
Make p9_client_version static since only used in one file.
Remove p9_client_auth because it is defined but never used.
Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 04:26:39 -07:00
Balazs Scheidler
093d282321 tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()
When __inet_inherit_port() is called on a tproxy connection the wrong locks are
held for the inet_bind_bucket it is added to. __inet_inherit_port() made an
implicit assumption that the child socket shares the listener's port number
(and thus its bind bucket).
Unfortunately, if you're using the TPROXY target to redirect skbs to a
transparent proxy that assumption is not true anymore and things break.

This patch adds code to __inet_inherit_port() so that it can handle this case
by looking up or creating a new bind bucket for the child socket and updates
callers of __inet_inherit_port() to gracefully handle __inet_inherit_port()
failing.

Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>.
See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion.

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21 13:06:43 +02:00
Balazs Scheidler
6006db84a9 tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4()
Also, inline this function as the lookup_type is always a literal
and inlining removes branches performed at runtime.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21 12:47:34 +02:00
Balazs Scheidler
106e4c26b1 tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple
Without tproxy redirections an incoming SYN kicks out conflicting
TIME_WAIT sockets, in order to handle clients that reuse ports
within the TIME_WAIT period.

The same mechanism didn't work in case TProxy is involved in finding
the proper socket, as the time_wait processing code looked up the
listening socket assuming that the listener addr/port matches those
of the established connection.

This is not the case with TProxy as the listener addr/port is possibly
changed with the tproxy rule.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21 12:45:14 +02:00
Changli Gao
3511c9132f net_sched: remove the unused parameter of qdisc_create_dflt()
The first parameter dev isn't in use in qdisc_create_dflt().

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:47 -07:00
stephen hemminger
1c4c40c42d xfrm: make xfrm_bundle_ok local
Only used in one place.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:46 -07:00
stephen hemminger
8d8a0b1cc2 rtnetlink: remove rtnl_kill_links
The function rtnl_kill_links is defined but never used.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:45 -07:00
stephen hemminger
6f747aca5e xfrm6: make xfrm6_tunnel_free_spi local
Function only defined and used in one file.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:45 -07:00
Julian Anastasov
0d79641a96 ipvs: provide address family for debugging
As skb->protocol is not valid in LOCAL_OUT, add a
parameter for the address family in the packet debugging functions.
Even if ports are not present in AH and ESP, change them to
use ip_vs_tcpudp_debug_packet to show at least valid addresses
as before. This patch removes the last user of skb->protocol
in IPVS.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21 11:04:43 +02:00
Julian Anastasov
fc60476761 ipvs: changes for local real server
This patch deals with local real servers:

- Add support for DNAT to local address (different real server port).
It needs ip_vs_out hook in LOCAL_OUT for both families because
skb->protocol is not set for locally generated packets and can not
be used to set 'af'.

- Skip packets in ip_vs_in marked with skb->ipvs_property because
ip_vs_out processing can be executed in LOCAL_OUT but we still
have the conn_out_get check in ip_vs_in.

- Ignore packets with inet->nodefrag from local stack

- Require skb_dst(skb) != NULL because we use it to get struct net

- Add support for changing the route to local IPv4 stack after DNAT
depending on the source address type. Local client sets output
route and the remote client sets input route. It looks like
IPv6 does not need such rerouting because the replies use
addresses from initial incoming header, not from skb route.

- All transmitters now have strict checks for the destination
address type: redirect from non-local address to local real
server requires NAT method, local address can not be used as
source address when talking to remote real server.

- Now LOCALNODE is not set explicitly as forwarding
method in real server to allow the connections to provide
correct forwarding method to the backup server. Not sure if
this breaks tools that expect to see 'Local' real server type.
If needed, this can be supported with new flag IP_VS_DEST_F_LOCAL.
Now it should be possible for connections in the backup that lost
their fwmark information during sync to be forwarded properly
to their daddr, even if it is a local address in the backup server.
This way the backup could be used as a real server for DR or TUN;
for NAT there are some restrictions because tuple collisions
in conntracks can create problems for the traffic.

- Call ip_vs_dst_reset when destination is updated in case
some real server IP type is changed between local and remote.

[ horms@verge.net.au: removed trailing whitespace ]
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21 11:03:46 +02:00
Julian Anastasov
190ecd27cd ipvs: do not schedule conns from real servers
This patch is needed to avoid scheduling of
packets from local real server when we add ip_vs_in
in LOCAL_OUT hook to support local client.

 	Currently, when ip_vs_in can not find existing
connection it tries to create new one by calling ip_vs_schedule.

 	The default indication from ip_vs_schedule was if
connection was scheduled to real server. If real server is
not available we try to use the bypass forwarding method
or to send ICMP error. But in some cases we do not want to use
the bypass feature. So, add flag 'ignored' to indicate if
the scheduler ignores this packet.

 	Make sure we do not create new connections from replies.
We can hit this problem for persistent services and local real
server when ip_vs_in is added to LOCAL_OUT hook to handle
local clients.

 	Also, make sure ip_vs_schedule ignores SYN packets
for Active FTP DATA from local real server. The FTP DATA
connection should be created on SYN+ACK from client to assign
correct connection daddr.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21 10:50:41 +02:00
Julian Anastasov
cf356d69db ipvs: switch to notrack mode
Change skb->ipvs_property semantic. This is preparation
to support ip_vs_out processing in LOCAL_OUT. ipvs_property=1
will be used to avoid expensive lookups for traffic sent by
transmitters. Now when conntrack support is not used we call
ip_vs_notrack method to avoid problems in OUTPUT and
POST_ROUTING hooks instead of exiting POST_ROUTING as before.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21 10:50:20 +02:00
Julian Anastasov
8b27b10f58 ipvs: optimize checksums for apps
Avoid full checksum calculation for apps that can provide
info whether csum was broken after payload mangling. For now only
ip_vs_ftp mangles payload and it updates the csum, so the full
recalculation is avoided for all packets.

 	Add CHECKSUM_UNNECESSARY for snat_handler (TCP and UDP).
It is needed to support SNAT from local address for the case
when csum is fully recalculated.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21 10:50:02 +02:00
David S. Miller
5eeaa2db16 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-10-20 01:59:48 -07:00
Hans Schillstrom
714f095f74 ipvs: IPv6 tunnel mode
IPv6 encapsulation uses a bad source address for the tunnel.
i.e. VIP will be used as local-addr and encap. dst addr.
Decapsulation will not accept this.

Example
LVS (eth1 2003::2:0:1/96, VIP 2003::2:0:100)
   (eth0 2003::1:0:1/96)
RS  (ethX 2003::1:0:5/96)

tcpdump
2003::2:0:100 > 2003::1:0:5: IP6 (hlim 63, next-header TCP (6) payload length: 40)  2003::3:0:10.50991 > 2003::2:0:100.http: Flags [S], cksum 0x7312 (correct), seq 3006460279, win 5760, options [mss 1440,sackOK,TS val 1904932 ecr 0,nop,wscale 3], length 0

In the Linux IPv6 implementation you can't have a tunnel with an anycast
address receiving packets (I have not tried to interpret RFC 2473).
To have receive capabilities the tunnel must have:
 - Local address set as a multicast addr or a unicast addr
 - Remote address set as a unicast addr.
 - Loopback and link-local addresses are not allowed.

This causes us to set up a tunnel in the Real Server with the
LVS as the remote address; here you can't use the VIP address since it's
used inside the tunnel.

Solution
Use outgoing interface IPv6 address (match against the destination).
i.e. use ip6_route_output() to look up the route cache and
then use ipv6_dev_get_saddr(...) to set the source address of the
encapsulated packet.

Additionally, cache the results in new destination
fields: dst_cookie and dst_saddr and properly check the
returned dst from ip6_route_output. We now add xfrm_lookup
call only for the tunneling method where the source address
is a local one.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-19 10:38:48 +02:00
Pablo Neira Ayuso
ebbf41df4a netfilter: ctnetlink: add expectation deletion events
This patch allows listening to events that report
destroyed expectations.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-19 10:19:06 +02:00
Eric Dumazet
8e602ce298 netns: reorder fields in struct net
In a network bench, I noticed an unfortunate false sharing between
'loopback_dev' and 'count' fields in "struct net".

'count' is written each time a socket is created or destroyed, while
loopback_dev might be often read in routing code.

Move loopback_dev into a read-mostly section of "struct net".

Note: struct netns_xfrm is cache line aligned on SMP.
(It contains a "struct dst_ops")
Move it to the end to avoid holes, and reduce sizeof(struct net) by 128
bytes on ia32.
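
The effect of field ordering on padding holes can be illustrated outside the kernel with ctypes, which applies the platform's C alignment rules. This sketch only covers the "avoid holes" part of the change, not false sharing. The structures Interleaved and Reordered and the helper layout are hypothetical and unrelated to the real struct net.

```python
import ctypes


class Interleaved(ctypes.Structure):
    # A small field on each side of a wide one forces padding holes.
    _fields_ = [
        ("flag_a", ctypes.c_uint8),
        ("wide", ctypes.c_uint64),
        ("flag_b", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Same fields, with the wide member first and the small ones grouped.
    _fields_ = [
        ("wide", ctypes.c_uint64),
        ("flag_a", ctypes.c_uint8),
        ("flag_b", ctypes.c_uint8),
    ]


def layout(struct_type):
    """Return ([(field name, offset, size), ...], total size in bytes)."""
    fields = [
        (name, getattr(struct_type, name).offset, getattr(struct_type, name).size)
        for name, _ in struct_type._fields_
    ]
    return fields, ctypes.sizeof(struct_type)
```

Comparing layout(Interleaved) with layout(Reordered) shows the same members occupying fewer bytes once the strictly aligned member no longer sits between smaller ones; the exact sizes depend on the platform ABI.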

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-17 13:49:14 -07:00
stephen hemminger
31e3c3f6f1 tipc: cleanup function namespace
Do some cleanups of TIPC based on make namespacecheck
  1. Don't export unused symbols
  2. Eliminate dead code
  3. Make functions and variables local
  4. Rename buf_acquire to tipc_buf_acquire since it is used in several files

Compile tested only.
This may break out-of-tree kernel modules that depend on TIPC routines.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-16 11:13:24 -07:00
John W. Linville
c64557d666 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-10-15 16:11:56 -04:00
John W. Linville
1a63c353c8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-next-2.6 into for-davem 2010-10-15 16:00:02 -04:00
Kumar Sanghvi
b3d6255388 Phonet: 'connect' socket implementation for Pipe controller
Based on a suggestion by Rémi Denis-Courmont to implement 'connect'
for the Pipe controller logic, this patch implements the 'connect' socket
call for the Pipe controller logic.
The patch does the following:
- Removes setsockopts for PNPIPE_CREATE and PNPIPE_DESTROY
- Adds setsockopt for setting the Pipe handle value
- Implements connect socket call
- Updates the Pipe controller logic

User space should now follow the sequence below with the Pipe controller:
- socket
- bind
- setsockopt for PNPIPE_PIPE_HANDLE
- connect
- setsockopt for PNPIPE_ENCAP_IP
- setsockopt for PNPIPE_ENABLE

GPRS/3G data has been tested working fine with this.

Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-13 14:40:34 -07:00
Johannes Berg
7be5086d4c mac80211: add probe request filter flag
Using the frame registration notification, we
can see when probe requests are requested and
notify the low-level driver via filtering. The
flag is also set in AP and IBSS modes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-13 15:45:22 -04:00
Johannes Berg
271733cf84 cfg80211: notify drivers about frame registrations
Drivers may need to adjust their filters according
to frame registrations, so notify them about them.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-13 15:45:22 -04:00
Andrei Emeltchenko
534c92fde7 Bluetooth: clean up rfcomm code
Remove dead code and unused rfcomm thread events

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-10-12 12:44:53 -03:00
Mat Martineau
796c86eec8 Bluetooth: Add common code for stream-oriented recvmsg()
This commit adds a bt_sock_stream_recvmsg() function for use by any
Bluetooth code that uses SOCK_STREAM sockets.  This code is copied
from rfcomm_sock_recvmsg() with minimal modifications to remove
RFCOMM-specific functionality and improve readability.

L2CAP (with the SOCK_STREAM socket type) and RFCOMM have common needs
when it comes to reading data.  Proper stream read semantics require
that applications can read from a stream one byte at a time and not
lose any data.  The RFCOMM code already operated on and pulled data
from the underlying L2CAP socket, so very few changes were required to
make the code more generic for use with non-RFCOMM data over L2CAP.
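
The stream read semantics described here can be demonstrated with any SOCK_STREAM socket. The sketch below uses a local socketpair() as a stand-in for a Bluetooth socket, which is an assumption made so that it runs without Bluetooth hardware; read_byte_by_byte and demo are hypothetical helper names.

```python
import socket


def read_byte_by_byte(sock, expected_length):
    """Read a stream one byte at a time until expected_length bytes arrive."""
    received = bytearray()
    while len(received) < expected_length:
        chunk = sock.recv(1)
        if not chunk:  # peer closed the connection
            break
        received += chunk
    return bytes(received)


def demo():
    # socketpair() returns two connected local stream sockets; it stands in
    # for an L2CAP or RFCOMM SOCK_STREAM socket purely for illustration.
    sender, receiver = socket.socketpair()
    try:
        first, second = b"two writes, ", b"one stream"
        sender.sendall(first)
        sender.sendall(second)
        payload = first + second
        return payload, read_byte_by_byte(receiver, len(payload))
    finally:
        sender.close()
        receiver.close()
```

The two sendall() calls are not visible as separate units to the reader, which is the property that distinguishes SOCK_STREAM from the frame-preserving SOCK_SEQPACKET type mentioned in this message.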

Applications that need more awareness of L2CAP frame boundaries are
still free to use SOCK_SEQPACKET sockets, and may verify that their
connection did not fall back to basic mode by calling getsockopt().

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-10-12 12:44:51 -03:00
David Vrabel
8f1e174223 Bluetooth: HCI devices are either BR/EDR or AMP radios
HCI transport drivers may not know what type of radio an AMP device has
so only say whether they're BR/EDR or AMP devices.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-10-12 12:44:51 -03:00
Eric Dumazet
e37ef961e5 neigh: reorder struct neighbour fields
On Tuesday 12 October 2010 at 00:02 +0200, Eric Dumazet wrote:
> Here is the followup patch.
>
> Thanks !
>

Oops, this was an old version; the up-to-date one also takes care of the
"used" field.

I guess it's time for some sleep, sorry again.

[PATCH net-next V2] neigh: reorder struct neighbour fields

(refcnt) and (ha_lock, ha, used, dev, output, ops, primary_key) should
be placed on separate cache lines.

refcnt can be written often, while the other fields are mostly read.

This gave me good results on a stress test:

Before:

real    0m45.570s
user    0m15.525s
sys     9m56.669s

After:

real    0m41.841s
user    0m15.261s
sys     8m45.949s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11 16:09:14 -07:00
Eric Dumazet
fc66f95c68 net dst: use a percpu_counter to track entries
struct dst_ops tracks number of allocated dst in an atomic_t field,
subject to high cache line contention in stress workload.

Switch to a percpu_counter, to reduce the number of times we need to dirty a
central location. Place it on a separate cache line to avoid dirtying
read only fields.
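
The idea behind a per-CPU counter can be modelled in a few lines: each CPU accumulates a private delta and folds it into the shared total only when the delta reaches a batch threshold. This is a toy, single-threaded model with a hypothetical class name and an arbitrary batch size, not the kernel's percpu_counter implementation.

```python
class BatchedCounter:
    """Toy model of a per-CPU counter with a shared, rarely written total."""

    def __init__(self, num_cpus, batch=32):
        self.shared_total = 0        # the contended "central location"
        self.local = [0] * num_cpus  # one private delta per CPU
        self.batch = batch
        self.shared_writes = 0       # how often the shared total was touched

    def add(self, cpu, amount):
        delta = self.local[cpu] + amount
        if abs(delta) >= self.batch:
            # Fold the local delta into the shared total only occasionally.
            self.shared_total += delta
            self.shared_writes += 1
            delta = 0
        self.local[cpu] = delta

    def read_approximate(self):
        """Cheap read; can lag by less than batch per CPU."""
        return self.shared_total

    def read_exact(self):
        """Slower read that also sums every per-CPU delta."""
        return self.shared_total + sum(self.local)
```

With a plain atomic counter every allocation writes the shared location; in this model only about one update in every batch does, at the cost of an approximate cheap read.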

Stress test :

(Sending 160.000.000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_TRIE, SLUB/NUMA)

Before:

real    0m51.179s
user    0m15.329s
sys     10m15.942s

After:

real	0m45.570s
user	0m15.525s
sys	9m56.669s

With a small reordering of struct neighbour fields, subject of a
following patch, (to separate refcnt from other read mostly fields)

real	0m41.841s
user	0m15.261s
sys	8m45.949s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11 13:06:53 -07:00
Eric Dumazet
0ed8ddf404 neigh: Protect neigh->ha[] with a seqlock
Add a seqlock in struct neighbour to protect neigh->ha[], and avoid
dirtying neighbour in stress situation (many different flows / dsts)

Dirtying takes place because of read_lock(&n->lock) and n->used writes.

Switching to a seqlock, and writing n->used only on jiffies changes
permits less dirtying.
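
The reader side of a sequence lock can be sketched as below. Readers only read the sequence counter and the data, so they never write to the shared object, which is why this scheme dirties less than read_lock(). The model is single-threaded and omits memory barriers; the class and method names are hypothetical.

```python
class SeqProtected:
    """Single-threaded model of seqlock-style reads (no real concurrency)."""

    def __init__(self, value):
        self.sequence = 0  # even: stable, odd: write in progress
        self.value = value

    def write_begin(self):
        self.sequence += 1  # becomes odd

    def write_end(self, value):
        self.value = value
        self.sequence += 1  # becomes even again

    def try_read(self):
        """Return (ok, value); ok is False if a writer interfered."""
        start = self.sequence
        if start % 2:  # writer active, a snapshot could be torn
            return False, None
        snapshot = self.value
        # The snapshot is valid only if no write started in the meantime.
        return self.sequence == start, snapshot
```

A real reader would loop on try_read() until it reports success, while writers still serialize among themselves with a conventional lock.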

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11 12:54:04 -07:00
David S. Miller
d122179a3c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/core/ethtool.c
2010-10-11 12:30:34 -07:00
Felix Fietkau
8610c29a2c cfg80211: add channel utilization stats to the survey command
Using these, user space can calculate a relative channel utilization
with arbitrary intervals by regularly taking snapshots of the survey
results.
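
The snapshot arithmetic can be sketched as follows, assuming the survey reports cumulative "channel active" and "channel busy" times. The dictionary keys active_ms and busy_ms and the function name are hypothetical; how the snapshots are obtained from nl80211 is outside this sketch.

```python
def channel_utilization(earlier, later):
    """Fraction of time the channel was busy between two survey snapshots.

    Each snapshot is a dict of cumulative millisecond counters; the key
    names 'active_ms' and 'busy_ms' are assumptions for illustration.
    """
    active = later["active_ms"] - earlier["active_ms"]
    busy = later["busy_ms"] - earlier["busy_ms"]
    if active <= 0:
        return None  # no time spent on this channel, or counters were reset
    return busy / active
```

Because only differences are used, the caller is free to choose any sampling interval, which matches the "arbitrary intervals" wording above.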

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-11 15:04:20 -04:00
Ben Greear
5a5c731aa5 wireless: Set some stats used by /proc/net/wireless (wext)
Some stats for /proc/net/wireless (and wext in general) are not
being set.  This patch addresses a few of those with values easily
obtained from mac80211 core.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-11 15:04:19 -04:00
John W. Linville
e9a68707d7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	Documentation/feature-removal-schedule.txt
	drivers/net/wireless/ipw2x00/ipw2200.c
2010-10-08 15:39:28 -04:00
Johannes Berg
388ac775be cfg80211: constify WDS address
There's no need for the WDS peer address
to not be const, so make it const.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-07 14:41:28 -04:00
Ingo Molnar
556ef63255 Merge commit 'v2.6.36-rc7' into core/rcu
Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-07 09:43:45 +02:00
Ingo Molnar
d4f8f217b8 Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-10-07 09:43:11 +02:00
Eric Dumazet
767e97e1e0 neigh: RCU conversion of struct neighbour
This is the second step for neighbour RCU conversion.

(first was commit d6bf7817 : RCU conversion of neigh hash table)

neigh_lookup() becomes lockless, but still takes a reference on the found
neighbour. (no more read_lock()/read_unlock() on tbl->lock)

struct neighbour gets an additional rcu_head field and is freed after an
RCU grace period.

Future work would need to eventually not take a reference on neighbour
for temporary dst (DST_NOCACHE), but this would need dst->_neighbour to
use a noref bit like we did for skb->_dst.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-06 18:01:33 -07:00
Bruno Randolf
b206b4ef06 nl80211/mac80211: Add retry and failed transmission count to station info
This information is already available in mac80211, we just need to export it
via cfg80211 and nl80211.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-06 16:30:43 -04:00
Johannes Berg
e31b82136d cfg80211/mac80211: allow per-station GTKs
This adds API to allow adding per-station GTKs,
updates mac80211 to support it, and also allows
drivers to remove a key from hwaccel again when
this may be necessary due to multiple GTKs.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-06 16:30:40 -04:00
Eric Dumazet
ebc0ffae5d fib: RCU conversion of fib_lookup()
fib_lookup() converted to be called in RCU protected context, no
reference taken and released on a contended cache line (fib_clntref)

fib_table_lookup() and fib_semantic_match() get an additional parameter.

struct fib_info gets an rcu_head field, and is freed after an rcu grace
period.

Stress test :
(Sending 160.000.000 UDP frames on same neighbour,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_HASH) (about same results for FIB_TRIE)

Before patch :

real	1m31.199s
user	0m13.761s
sys	23m24.780s

After patch:

real	1m5.375s
user	0m14.997s
sys	15m50.115s

Before patch Profile :

13044.00 15.4% __ip_route_output_key vmlinux
 8438.00 10.0% dst_destroy           vmlinux
 5983.00  7.1% fib_semantic_match    vmlinux
 5410.00  6.4% fib_rules_lookup      vmlinux
 4803.00  5.7% neigh_lookup          vmlinux
 4420.00  5.2% _raw_spin_lock        vmlinux
 3883.00  4.6% rt_set_nexthop        vmlinux
 3261.00  3.9% _raw_read_lock        vmlinux
 2794.00  3.3% fib_table_lookup      vmlinux
 2374.00  2.8% neigh_resolve_output  vmlinux
 2153.00  2.5% dst_alloc             vmlinux
 1502.00  1.8% _raw_read_lock_bh     vmlinux
 1484.00  1.8% kmem_cache_alloc      vmlinux
 1407.00  1.7% eth_header            vmlinux
 1406.00  1.7% ipv4_dst_destroy      vmlinux
 1298.00  1.5% __copy_from_user_ll   vmlinux
 1174.00  1.4% dev_queue_xmit        vmlinux
 1000.00  1.2% ip_output             vmlinux

After patch Profile :

13712.00 15.8% dst_destroy             vmlinux
 8548.00  9.9% __ip_route_output_key   vmlinux
 7017.00  8.1% neigh_lookup            vmlinux
 4554.00  5.3% fib_semantic_match      vmlinux
 4067.00  4.7% _raw_read_lock          vmlinux
 3491.00  4.0% dst_alloc               vmlinux
 3186.00  3.7% neigh_resolve_output    vmlinux
 3103.00  3.6% fib_table_lookup        vmlinux
 2098.00  2.4% _raw_read_lock_bh       vmlinux
 2081.00  2.4% kmem_cache_alloc        vmlinux
 2013.00  2.3% _raw_spin_lock          vmlinux
 1763.00  2.0% __copy_from_user_ll     vmlinux
 1763.00  2.0% ip_output               vmlinux
 1761.00  2.0% ipv4_dst_destroy        vmlinux
 1631.00  1.9% eth_header              vmlinux
 1440.00  1.7% _raw_read_unlock_bh     vmlinux

Reference results, if IP route cache is enabled :

real	0m29.718s
user	0m10.845s
sys	7m37.341s

25213.00 29.5% __ip_route_output_key   vmlinux
 9011.00 10.5% dst_release             vmlinux
 4817.00  5.6% ip_push_pending_frames  vmlinux
 4232.00  5.0% ip_finish_output        vmlinux
 3940.00  4.6% udp_sendmsg             vmlinux
 3730.00  4.4% __copy_from_user_ll     vmlinux
 3716.00  4.4% ip_route_output_flow    vmlinux
 2451.00  2.9% __xfrm_lookup           vmlinux
 2221.00  2.6% ip_append_data          vmlinux
 1718.00  2.0% _raw_spin_lock_bh       vmlinux
 1655.00  1.9% __alloc_skb             vmlinux
 1572.00  1.8% sock_wfree              vmlinux
 1345.00  1.6% kfree                   vmlinux
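
As a rough illustration of the timings quoted above, the following sketch
converts the time(1) values to seconds and computes the relative drop in
real and sys time. The helper names (to_seconds, reduction) are made up
for this example; only the figures come from the commit message.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a time(1) style value such as '1m31.199s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# (real, sys) pairs copied from the commit message.
runs = {
    "before patch": ("1m31.199s", "23m24.780s"),
    "after patch": ("1m5.375s", "15m50.115s"),
    "route cache enabled": ("0m29.718s", "7m37.341s"),
}


def reduction(before: str, after: str) -> float:
    """Percentage drop going from `before` to `after`."""
    b, a = to_seconds(before), to_seconds(after)
    return 100.0 * (b - a) / b


real_drop = reduction(runs["before patch"][0], runs["after patch"][0])
sys_drop = reduction(runs["before patch"][1], runs["after patch"][1])
print(f"real: {real_drop:.1f}% lower, sys: {sys_drop:.1f}% lower")
```

The same helper can be pointed at the "route cache enabled" row to see how
far the uncached path still is from the cached one.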

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 20:39:38 -07:00
Eric Dumazet
d6bf781712 net neigh: RCU conversion of neigh hash table
David

This is the first step for RCU conversion of neigh code.

Next patches will convert hash_buckets[] and "struct neighbour" to RCU
protected objects.

Thanks

[PATCH net-next] net neigh: RCU conversion of neigh hash table

Instead of storing hash_buckets, hash_mask and hash_rnd in "struct
neigh_table", a new structure is defined :

struct neigh_hash_table {
       struct neighbour        **hash_buckets;
       unsigned int            hash_mask;
       __u32                   hash_rnd;
       struct rcu_head         rcu;
};

And "struct neigh_table" has an RCU protected pointer to such a
neigh_hash_table.

This means the signature of (*hash)() function changed: We need to add a
third parameter with the actual hash_rnd value, since this is not
anymore a neigh_table field.
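
A minimal user-space sketch of the idea, written in Python rather than
kernel C: the table owner keeps a single reference to a hash-table object,
readers fetch that reference once per operation, and resizing builds a new
object and swaps the reference. The class and function names other than
the three struct fields are hypothetical, the hash mixing is arbitrary,
and choosing a fresh hash_rnd on resize is an assumption of this sketch.
There is no real RCU here; the kernel would defer freeing the old table
until after a grace period.

```python
from dataclasses import dataclass
from typing import List, Tuple

Entry = Tuple[int, int]  # (protocol key, device index), a stand-in for struct neighbour


@dataclass
class NeighHashTable:
    """Mirrors the fields of struct neigh_hash_table, minus the rcu_head."""
    hash_buckets: List[List[Entry]]
    hash_mask: int
    hash_rnd: int


def example_hash(pkey: int, dev_index: int, hash_rnd: int) -> int:
    """Hypothetical (*hash)() callback: hash_rnd arrives as a third argument."""
    return ((pkey ^ dev_index) * 2654435761 ^ hash_rnd) & 0xFFFFFFFF


def make_table(buckets: int, hash_rnd: int) -> NeighHashTable:
    # buckets must be a power of two so that hash_mask works as a bit mask
    return NeighHashTable([[] for _ in range(buckets)], buckets - 1, hash_rnd)


class NeighTable:
    def __init__(self) -> None:
        self.nht = make_table(buckets=4, hash_rnd=0x1234ABCD)

    def insert(self, pkey: int, dev_index: int) -> None:
        nht = self.nht  # fetch the table reference once
        idx = example_hash(pkey, dev_index, nht.hash_rnd) & nht.hash_mask
        nht.hash_buckets[idx].append((pkey, dev_index))

    def lookup(self, pkey: int, dev_index: int) -> bool:
        nht = self.nht  # mask and seed always come from the same table object
        idx = example_hash(pkey, dev_index, nht.hash_rnd) & nht.hash_mask
        return (pkey, dev_index) in nht.hash_buckets[idx]

    def grow(self, buckets: int, hash_rnd: int) -> None:
        new = make_table(buckets, hash_rnd)
        for bucket in self.nht.hash_buckets:
            for pkey, dev_index in bucket:
                idx = example_hash(pkey, dev_index, new.hash_rnd) & new.hash_mask
                new.hash_buckets[idx].append((pkey, dev_index))
        self.nht = new  # single reference swap publishes the new table
```

Keeping hash_buckets, hash_mask and hash_rnd in one object is what lets a
reader see a consistent triple even while a resize is in progress.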

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 14:54:36 -07:00
Johannes Berg
ff4c92d85c genetlink: introduce pre_doit/post_doit hooks
Each family may have some amount of boilerplate
locking code that applies to most, or even all,
commands.

This allows a family to handle such things in
a more generic way, by allowing it to
 a) include private flags in each operation
 b) specify a pre_doit hook that is called,
    before an operation's doit() callback and
    may return an error directly,
 c) specify a post_doit hook that can undo
    locking or similar things done by pre_doit,
    and finally
 d) include two private pointers in each info
    struct passed between all these operations
    including doit(). (It's two because I'll
    need two in nl80211 -- can be extended.)
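
The call order described in points a) to d) can be sketched as follows.
This is a Python model, not the genetlink API: the Family, Op and Info
classes, the NEED_LOCK flag and the "locked-device" value are invented for
illustration, and only the ordering (pre_doit, then doit(), then
post_doit, with an early return on a pre_doit error) is taken from the
description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Info:
    # two private pointers shared by pre_doit, doit() and post_doit
    user_ptr: List[object] = field(default_factory=lambda: [None, None])


@dataclass
class Op:
    name: str
    doit: Callable[[Info], int]
    internal_flags: int = 0  # private, family-defined flags


@dataclass
class Family:
    pre_doit: Optional[Callable[[Op, Info], int]] = None
    post_doit: Optional[Callable[[Op, Info], None]] = None


def dispatch(family: Family, op: Op, info: Info) -> int:
    if family.pre_doit is not None:
        err = family.pre_doit(op, info)
        if err:
            return err  # doit() and post_doit are skipped
    err = op.doit(info)
    if family.post_doit is not None:
        family.post_doit(op, info)
    return err


NEED_LOCK = 0x1  # hypothetical private flag
events: List[str] = []


def pre(op: Op, info: Info) -> int:
    if op.internal_flags & NEED_LOCK:
        events.append("lock")
        info.user_ptr[0] = "locked-device"
    return 0


def post(op: Op, info: Info) -> None:
    if op.internal_flags & NEED_LOCK:
        events.append("unlock")


def get_doit(info: Info) -> int:
    events.append(f"doit:{info.user_ptr[0]}")
    return 0


family = Family(pre_doit=pre, post_doit=post)
result = dispatch(family, Op("get", get_doit, NEED_LOCK), Info())
```

With this shape each doit() callback no longer repeats the locking
boilerplate; it only reads what pre_doit left in the private pointers.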

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-05 13:35:30 -04:00
Helmut Schaa
78be49ec2a mac80211: distinct between max rates and the number of rates the hw can report
Some drivers cannot handle multiple retry rates specified by the rc
algorithm but instead use their own retry table (for example rt2800).
However, if such a device registers itself with a max_rates value of 1
the rc algorithm cannot make use of the extended information the device
can provide about retried rates. On the other hand, if a device
registers itself with a max_rates value > 1 the rc algorithm assumes
that the device can handle multi rate retries.

Fix this issue by introducing another hw parameter max_report_rates that
can be set to a different value than max_rates to indicate if a device
is capable of reporting more rates than specified in max_rates.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-05 13:35:28 -04:00
Johannes Berg
ea229e6826 cfg80211: remove spurious __KERNEL__ ifdef
The net/cfg80211.h header file isn't exported to
userspace, so there's no need for any kind of
__KERNEL__ protection in it. If it was exported,
everything else in it would need protection as
well, not just the logging stuff ...

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-05 13:35:23 -04:00
Felix Fietkau
17e5a80828 nl80211: allow drivers to indicate whether the survey data channel is in use
Some user space applications only want to display survey data for
the operating channel, however there is no API to get that yet.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-05 13:35:22 -04:00
stephen hemminger
c61393ea83 ipv6: make __ipv6_isatap_ifid static
Another exported symbol only used in one file

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 00:47:39 -07:00
stephen hemminger
1df9916e46 fib: fib_rules_cleanup can be static
fib_rules_cleanup_ops is only defined and used in one place.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 00:47:39 -07:00
Patrick McHardy
eecc545856 netfilter: add missing xt_log.h file
Forgot to add xt_log.h in commit a8defca0 (netfilter: ipt_LOG:
add bufferisation to call printk() once)

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-04 23:24:21 +02:00
David S. Miller
21a180cda0 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/ipv4/Kconfig
	net/ipv4/tcp_timer.c
2010-10-04 11:56:38 -07:00
Stephen Hemminger
0c200d9353 netfilter: nf_nat: make find/put static
The functions nf_nat_proto_find_get and nf_nat_proto_put are
only used internally in nf_nat_core. This might break some out
of tree NAT module.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-04 20:53:18 +02:00
Simon Horman
0d1e71b04a IPVS: Allow configuration of persistence engines
Allow the persistence engine of a virtual service to be set, edited
and unset.

This feature only works with the netlink user-space interface.

Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04 22:45:24 +09:00
Simon Horman
8be67a6617 IPVS: management of persistence engine modules
This is based heavily on the scheduler management code

Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04 22:45:24 +09:00
Simon Horman
a3c918acd2 IPVS: Add persistence engine data to /proc/net/ip_vs_conn
This shouldn't break compatibility with userspace as the new data
is at the end of the line.

I have confirmed that this doesn't break ipvsadm, the main (only?)
user-space user of this data.

Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04 22:45:24 +09:00
Simon Horman
85999283a2 IPVS: Add struct ip_vs_pe
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04 22:45:24 +09:00
Simon Horman
f11017ec2d IPVS: Add struct ip_vs_conn_param
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04 22:45:24 +09:00
Eric Dumazet
c7d4426a98 net: introduce DST_NOCACHE flag
While doing stress tests with IP route cache disabled, and multi queue
devices, I noticed a very high contention on one rwlock used in
neighbour code.

When many cpus are trying to send frames (possibly using a high
performance multiqueue device) to the same neighbour, they fight for the
neigh->lock rwlock in order to call neigh_hh_init(), and fight on
hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test())

But we don't need to call neigh_hh_init() for dst that are used only
once. It costs four atomic operations at least, on two contended cache
lines, plus the high contention on neigh->lock rwlock.

Introduce a new dst flag, DST_NOCACHE, that is set when dst was not
inserted in route cache.

With the stress test bench, sending 160000000 frames on one neighbour,
results are :

Before patch:

real	2m28.406s
user	0m11.781s
sys	36m17.964s


After patch:

real	1m26.532s
user	0m12.185s
sys	20m3.903s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 22:17:54 -07:00
John W. Linville
41f4a6f71f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-10-01 11:12:36 -04:00
Eric Dumazet
367e5e3769 neigh: reorder fields in struct neighbour
On 64bit arches, there are two 32bit holes that we can remove.

sizeof(struct neighbour) shrinks from 0xf8 to 0xf0 bytes
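
To show how reordering removes alignment holes, here is a small sketch
using Python's ctypes, which follows the platform's C layout rules. The
two structures and their field names are invented for this example and
are not the real struct neighbour; on a typical 64-bit build the first
layout has two 4-byte holes and the second has none.

```python
import ctypes


class WithHoles(ctypes.Structure):
    # 32-bit field, then pointer: the pointer needs 8-byte alignment on
    # 64-bit builds, so 4 bytes of padding follow each 32-bit field.
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("next", ctypes.c_void_p),
        ("refcnt", ctypes.c_uint32),
        ("dev", ctypes.c_void_p),
    ]


class Reordered(ctypes.Structure):
    # Pointers first, then the two 32-bit fields packed side by side.
    _fields_ = [
        ("next", ctypes.c_void_p),
        ("dev", ctypes.c_void_p),
        ("flags", ctypes.c_uint32),
        ("refcnt", ctypes.c_uint32),
    ]


print(hex(ctypes.sizeof(WithHoles)), hex(ctypes.sizeof(Reordered)))
```

On 32-bit builds pointers are 4 bytes wide, so both layouts come out the
same size, which matches the commit's remark that the holes exist on
64bit arches.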

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01 00:36:51 -07:00
Gustavo F. Padovan
e454c84464 Bluetooth: Fix deadlock in the ERTM logic
The Enhanced Retransmission Mode (ERTM) is a reliable mode of operation
of the Bluetooth L2CAP layer. Think of it as a simplified version of
TCP.
The problem we were facing here was a deadlock. ERTM uses a backlog
queue to queue incoming packets while the user is holding the lock. At
some point sk_sndbuf can be exceeded and we can't allocate new skbs;
the code then sleeps with the lock held while waiting for memory. That
stalls the ERTM connection, because we can't read the acknowledgement
packets in the backlog queue to free memory and let the allocation of
the outgoing skb succeed.

This patch actually affects all users of bt_skb_send_alloc(), i.e., all
L2CAP modes and SCO.

We are safe against socket state changes or channel deletion while
we are sleeping waiting for memory. Checking sk->sk_err and
sk->sk_shutdown makes the code safe, since any action that can leave the
socket or the channel in an unusable state sets at least one of these
struct members. We can then check both of them when getting the lock
again and return with the proper error if something unexpected happens.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Ulisses Furquim <ulisses@profusion.mobi>
2010-09-30 12:19:35 -03:00
stephen hemminger
1b9f409293 tcp: tcp_enter_quickack_mode can be static
Function only used in tcp_input.c

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29 19:45:36 -07:00
stephen hemminger
a64de47c09 arp: remove unnecessary export of arp_broken_ops
arp_broken_ops is only used in arp.c

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29 19:45:35 -07:00
Tom Herbert
4465b46900 ipv4: Allow configuring subnets as local addresses
This patch allows a host to be configured to respond to any address in
a specified range as if it were local, without actually needing to
configure the address on an interface.  This is done through routing
table configuration.  For instance, to configure a host to respond
to any address in 10.1/16 received on eth0 as a local address we can do:

ip rule add from all iif eth0 lookup 200
ip route add local 10.1/16 dev lo proto kernel scope host src 127.0.0.1 table 200

This host is now reachable by any 10.1/16 address (route lookup on
input for packets received on eth0 can find the route).  On output, the
rule will not be matched so that this host can still send packets to
10.1/16 (not sent on loopback).  Presumably, external routing can be
configured to make sense out of this.
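
A toy model of the behaviour described above, assuming only what the two
ip commands state: the rule sends packets that arrived on eth0 to table
200, and table 200 holds a local route for 10.1/16. The function names
are hypothetical and this does not model real FIB lookup order.

```python
import ipaddress
from typing import Optional

LOCAL_SUBNET = ipaddress.ip_network("10.1.0.0/16")  # "10.1/16" written out in full
RULE_IIF = "eth0"  # from "ip rule add from all iif eth0 lookup 200"


def uses_table_200(iif: Optional[str]) -> bool:
    """The rule only matches packets received on eth0."""
    return iif == RULE_IIF


def treated_as_local(dst: str, iif: Optional[str]) -> bool:
    """True when the destination would hit the local route in table 200."""
    return uses_table_200(iif) and ipaddress.ip_address(dst) in LOCAL_SUBNET


print(treated_as_local("10.1.42.7", "eth0"))  # input path, rule matches
print(treated_as_local("10.1.42.7", None))    # output path, no input interface
```

Passing None for iif stands for locally generated traffic, which is why
the host can still send packets to 10.1/16 addresses elsewhere.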

To make this work, we needed to modify the logic in finding the
interface which is assigned a given source address for output
(dev_ip_find).  We perform a normal fib_lookup instead of just a
lookup on the local table, and in the lookup we ignore the input
interface for matching.

This patch is useful to implement IP-anycast for subnets of virtual
addresses.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28 23:38:15 -07:00
Pablo Neira Ayuso
bc01befdcf netfilter: ctnetlink: add support for user-space expectation helpers
This patch adds the basic infrastructure to support user-space
expectation helpers via ctnetlink and the netfilter queuing
infrastructure NFQUEUE. Basically, this patch:

* adds NF_CT_EXPECT_USERSPACE flag to identify user-space
  created expectations. I have also added a sanity check in
  __nf_ct_expect_check() to avoid that kernel-space helpers
  may create an expectation if the master conntrack has no
  helper assigned.
* adds some branches to check if the master conntrack helper
  exists, otherwise we skip the code that refers to kernel-space
  helper such as the local expectation list and the expectation
  policy.
* allows setting the timeout for user-space expectations with
  no helper assigned.
* adds a list of expectations created from user-space that depends
  on ctnetlink (if this module is removed, they are deleted).
* includes USERSPACE in the /proc output for expectations
  that have been created by a user-space helper.

This patch also modifies ctnetlink to skip including the helper
name in the Netlink messages if no kernel-space helper is set
(since a user-space expectation has no kernel-space helper
assigned).

You can access an example user-space FTP conntrack helper at:
http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-28 21:06:34 +02:00
Eric Dumazet
290b895e0b tunnels: prepare percpu accounting
Tunnels are going to use percpu for their accounting.

They are going to use a new tstats field in net_device.

skb_tunnel_rx() is changed to be a wrapper around __skb_tunnel_rx()

IPTUNNEL_XMIT() is changed to be a wrapper around __IPTUNNEL_XMIT()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27 21:30:42 -07:00
Kumar Sanghvi
8d98efa84b Phonet: Implement Pipe Controller to support Nokia Slim Modems
The Phonet stack assumes the presence of a Pipe Controller, either in the
modem or in user-space on the Application Processing Engine, for the Pipe data.
Nokia Slim Modems like the WG2.5 used in the ST-Ericsson U8500 platform do not
implement a Pipe Controller.
This patch adds a Pipe Controller implementation to the Phonet stack to support
Pipe data over Phonet for Nokia Slim Modems.

Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27 21:30:41 -07:00
Ulrich Weber
fb0c5f0bc8 tproxy: check for transparent flag in ip_route_newports
as done in ip_route_connect()

Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27 15:03:33 -07:00
Johannes Berg
554891e63a mac80211: move packet flags into packet
commit 8c0c709eea
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Wed Nov 25 17:46:15 2009 +0100

    mac80211: move cmntr flag out of rx flags

moved the CMNTR flag into the skb RX flags for
some aggregation cleanups, but this was wrong
since the optimisation this flag tried to make
requires that it is kept across the processing
of multiple interfaces -- which isn't true for
flags in the skb. The patch not only broke the
optimisation, it also introduced a bug: under
some (common!) circumstances the flag will be
set on an already freed skb!

However, investigating this in more detail, I
found that most of the flags that we set should
be per packet, _except_ for this one, due to
a-MPDU processing. Additionally, the flags used
for processing (currently just this one) need
to be reset before processing a new packet.

Since we haven't actually seen bugs reported as
a result of the wrong flags handling (which is
not too surprising -- the only real bug case I
can come up with is an a-MSDU contained in an
a-MPDU), I'll make a different fix for rc.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-27 15:57:54 -04:00
Ben Greear
686b9cb994 mac80211/ath9k: Support AMPDU with multiple VIFs.
The old ieee80211_find_sta_by_hw method didn't properly
find VIFS when there was more than one per AP.  This caused
AMPDU logic in ath9k to get the wrong VIF when trying to
account for transmitted SKBs.

This patch changes ieee80211_find_sta_by_hw to take a
localaddr argument to distinguish between VIFs with the
same AP but different local addresses.  The method name
is changed to ieee80211_find_sta_by_ifaddr.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-27 15:57:45 -04:00
David S. Miller
e40051d134 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/qlcnic/qlcnic_init.c
	net/ipv4/ip_output.c
2010-09-27 01:03:03 -07:00
Neil Horman
2cc6d2bf3d ipv6: add a missing unregister_pernet_subsys call
Clean up a missing exit path in the ipv6 module init routines.  In
addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
for the ipv6_addr_label_ops structure.  But if module loading fails, or if the
ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
which leaves a now-bogus address on the pernet_list, leading to oopses in
subsequent registrations.  This patch cleans up both the failed load path and
the unload path.  Tested by myself with good results.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

 include/net/addrconf.h |    1 +
 net/ipv6/addrconf.c    |   11 ++++++++---
 net/ipv6/addrlabel.c   |    5 +++++
 3 files changed, 14 insertions(+), 3 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26 19:09:25 -07:00
Eric Dumazet
7a91b434e2 net: update SOCK_MIN_RCVBUF
SOCK_MIN_RCVBUF current value is 256 bytes

It doesn't permit receiving the smallest possible frame, considering that
socket sk_rmem_alloc/sk_rcvbuf account for skb truesizes. On 64bit arches,
sizeof(struct sk_buff) is 240 bytes. Add the typical 64 bytes of
headroom, and we go over the limit.
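
The arithmetic above, spelled out as a short sketch. The three numbers
are the ones given in this message; the constant and function names are
made up for the example, and the payload is taken as zero to represent
the smallest possible frame.

```python
OLD_SOCK_MIN_RCVBUF = 256   # bytes, the value being replaced
SK_BUFF_SIZE_64BIT = 240    # sizeof(struct sk_buff) on 64bit arches
TYPICAL_HEADROOM = 64       # bytes of headroom in front of the data


def smallest_truesize(payload: int = 0) -> int:
    """Memory charged to the socket for one frame in this simplified model."""
    return SK_BUFF_SIZE_64BIT + TYPICAL_HEADROOM + payload


def fits(limit: int, payload: int = 0) -> bool:
    return smallest_truesize(payload) <= limit


print(smallest_truesize(), fits(OLD_SOCK_MIN_RCVBUF))
```

Even an empty frame is charged 304 bytes here, so a socket limited to
256 bytes could not accept it.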

With old kernels and 32bit arches, we were under the limit, if netdriver
was doing copybreak.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26 18:53:07 -07:00
Tom Herbert
693019e90c net: reset skb queue mapping when rx'ing over tunnel
Reset queue mapping when an skb is reentering the stack via a tunnel.
On second pass, the queue mapping from the original device is no
longer valid.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26 18:48:40 -07:00
Christian Lamparter
eb7d3066cf mac80211: clear txflags for ps-filtered frames
This patch fixes stale mac80211_tx_control_flags for
filtered / retried frames.

Because ieee80211_handle_filtered_frame feeds skbs back
into the tx path, they have to be stripped of some tx
flags so they won't confuse the stack, driver or device.

Cc: <stable@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-24 15:54:30 -04:00
Eric Dumazet
a02cec2155 net: return operator cleanup
Change "return (EXPR);" to "return EXPR;"

return is not a function, parentheses are not required.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-23 14:33:39 -07:00
Pablo Neira Ayuso
8b008faf92 netfilter: ctnetlink: allow to specify the expectation flags
With this patch, you can specify the expectation flags for user-space
created expectations.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-22 08:36:59 +02:00
David S. Miller
a0741ca949 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-09-21 18:17:19 -07:00
Eric Dumazet
48daa3bb84 ipv6: addrconf.h cleanups
- Use rcu_dereference_rtnl() in __in6_dev_get
- kerneldoc for __in6_dev_get() and in6_dev_get()
- Use inline functions instead of macros

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-21 18:04:47 -07:00
John W. Linville
b618f6f885 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	arch/arm/mach-omap2/board-omap3pandora.c
	drivers/net/wireless/ath/ath5k/base.c
2010-09-21 15:49:14 -04:00
Julian Anastasov
8a8030407f ipvs: make rerouting optional with snat_reroute
Add new sysctl flag "snat_reroute". Recent kernels use
ip_route_me_harder() to route LVS-NAT responses properly by
VIP when there are multiple paths to client. But setups
that do not have alternative default routes can skip this
routing lookup by using snat_reroute=0.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-21 17:38:57 +02:00
Julian Anastasov
f4bc17cdd2 ipvs: netfilter connection tracking changes
Add more code to IPVS to work with Netfilter connection
tracking and fix some problems.

- Allow IPVS to be compiled without connection tracking as in
2.6.35 and before. This can avoid keeping conntracks for all
IPVS connections because this costs memory. ip_vs_ftp still
depends on connection tracking and NAT as implemented for 2.6.36.

- Add sysctl var "conntrack" to enable connection tracking for
all IPVS connections. For loaded IPVS directors it needs
tuning of nf_conntrack_max limit.

- Add IP_VS_CONN_F_NFCT connection flag to request the connection
to use connection tracking. This allows user space to provide this
flag, for example, in dest->conn_flags. This can be useful to
request connection tracking per real server instead of forcing it
for all connections with the "conntrack" sysctl. This flag is
set currently only by ip_vs_ftp and of course by "conntrack" sysctl.

- Add ip_vs_nfct.c file to hold all connection tracking code;
this way the main code does not depend on netfilter conntrack
support.

- Return back the ip_vs_post_routing handler as in 2.6.35 and use
skb->ipvs_property=1 to allow IPVS to work without connection
tracking

Connection tracking:

- most of the code is already in 2.6.36-rc

- alter conntrack reply tuple for LVS-NAT connections when first packet
from client is forwarded and conntrack state is NEW or RELATED.
Additionally, alter reply for RELATED connections from real server,
again for packet in original direction.

- add IP_VS_XMIT_TUNNEL to confirm conntrack (without altering
reply) for LVS-TUN early because we want to call nf_reset. It is
needed because we add IPIP header and the original conntrack
should be preserved, not destroyed. The transmitted IPIP packets
can reuse same conntrack, so we do not set skb->ipvs_property.

- try to destroy conntrack when the IPVS connection is destroyed.
It is not fatal if conntrack disappears before that, it depends
on the used timers.

Fix long-standing problems:

- add skb->ip_summed = CHECKSUM_NONE for the LVS-TUN transmitters

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-21 17:35:41 +02:00
Thomas Egerer
8444cf712c xfrm: Allow different selector family in temporary state
The family parameter xfrm_state_find is used to find a state matching a
certain policy. This value is set to the template's family
(encap_family) right before xfrm_state_find is called.
The family parameter is however also used to construct a temporary state
in xfrm_state_find itself which is wrong for inter-family scenarios
because it produces a selector for the wrong family. Since this selector
is included in the xfrm_user_acquire structure, user space programs
misinterpret IPv6 addresses as IPv4 and vice versa.
This patch splits up the original init_tempsel function into one part that
initializes the selector and another that initializes the props and id of the
temporary state, to allow for differing IP address families within the state.

Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-20 11:11:38 -07:00
Julian Anastasov
3575792e00 ipvs: extend connection flags to 32 bits
- the sync protocol supports 16 bits only, so bits 0..15 should be
used only for flags that should go to backup server, bits 16 and
above should be allocated for flags not sent to backup.

- use IP_VS_CONN_F_DEST_MASK as mask of connection flags in
destination that can be changed by user space

- allow IP_VS_CONN_F_ONE_PACKET to be set in destination
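
The bit split in the first point can be sketched like this. The mask
follows directly from "bits 0..15"; the two example flags and the helper
names are hypothetical and are not the real IP_VS_CONN_F_* values.

```python
SYNC_BITS = 16
SYNC_MASK = (1 << SYNC_BITS) - 1  # bits 0..15 travel to the backup server

FLAG_SYNCED_EXAMPLE = 1 << 3       # hypothetical flag that is synced
FLAG_LOCAL_ONLY_EXAMPLE = 1 << 16  # hypothetical flag that stays local


def flags_for_backup(flags: int) -> int:
    """Keep only the part of a 32-bit flags word that fits the sync protocol."""
    return flags & SYNC_MASK


def is_local_only(flag: int) -> bool:
    """True when a flag lives entirely above the synced range."""
    return (flag & SYNC_MASK) == 0 and flag != 0


conn_flags = FLAG_SYNCED_EXAMPLE | FLAG_LOCAL_ONLY_EXAMPLE
print(hex(conn_flags), hex(flags_for_backup(conn_flags)))
```

Allocating new local-only flags from bit 16 upward means the mask never
has to change when such a flag is added.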

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-17 14:18:16 +02:00
Johannes Berg
2ca27bcff7 mac80211: add p2p device type support
When a driver advertises p2p device support,
mac80211 will handle it, but internally it will
rewrite the interface type to STA/AP rather than
P2P-STA/GO since otherwise a lot of paths need
to be touched that are otherwise identical. A
p2p boolean tells drivers whether or not a given
interface will be used for p2p or not.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-16 15:46:07 -04:00
Joe Perches
9c37663929 include/net/cfg80211.h: wiphy_<level> messages use dev_printk
The output becomes:

[   41.261941] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-16 15:19:44 -04:00
Rémi Denis-Courmont
507215f8d0 Phonet: list subscribed resources via proc_fs
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-15 21:31:33 -07:00
Rémi Denis-Courmont
4e3d16ce5e Phonet: resource routing backend
When both destination device and object are null, Phonet routes the
packet according to the resource field. In fact, this is the most
common pattern when sending Phonet "request" packets. In this case,
the packet is delivered to whichever endpoint (socket) has
registered the resource.

This adds a new table so that Linux processes can register their
Phonet sockets to Phonet resources, if they have adequate privileges.

(Namespace support is not implemented at the moment.)

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-15 21:31:32 -07:00
Rémi Denis-Courmont
6482f554e2 Phonet: remove dangling pipe if an endpoint is closed early
Closing a pipe endpoint is not normally allowed by the Phonet pipe,
other than as a side effect of removing the pipe between two
endpoints. But there is no way to prevent Linux userspace processes
from being killed or suffering from bugs, so this can still happen.
We might as well forcefully close Phonet pipe endpoints then.

The cellular modem supports only a few existing pipes at a time. So we
really should not leak them. This change instructs the modem to destroy
the pipe if either of the pipe's endpoints (Linux sockets) is closed too
early.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-15 21:31:31 -07:00
Alexey Kuznetsov
01f83d6984 tcp: Prevent overzealous packetization by SWS logic.
If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised
window, the SWS logic will packetize to half the MSS unnecessarily.

This causes problems with some embedded devices.

However for large MSS devices we do want to half-MSS packetize
otherwise we never get enough packets into the pipe for things
like fast retransmit and recovery to work.

Be careful also to handle the case where MSS > window, otherwise
we'll never send until the probe timer.

Reported-by: ツ Leandro Melo de Sales <leandroal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-15 12:01:44 -07:00
Joe Perches
55b1804c67 net/irda: Use static const char * const where possible
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-14 20:22:05 -07:00
Felix Fietkau
2944f45d9d mac80211: add a note about iterating interfaces during add_interface()
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-09-14 16:14:26 -04:00
David S. Miller
e548833df8 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/mac80211/main.c
2010-09-09 22:27:33 -07:00
Eric Dumazet
719f835853 udp: add rehash on connect()
commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
added a secondary hash on UDP, hashed on (local addr, local port).

Problem is that following sequence :

fd = socket(...)
connect(fd, &remote, ...)

not only selects remote end point (address and port), but also sets
local address, while UDP stack stored in secondary hash table the socket
while its local address was INADDR_ANY (or ipv6 equivalent)

Sequence is :
 - autobind() : choose a random local port, insert socket in hash tables
              [while local address is INADDR_ANY]
 - connect() : set remote address and port, change local address to IP
              given by a route lookup.

When an incoming UDP frame comes, if more than 10 sockets are found in
primary hash table, we switch to secondary table, and fail to find
socket because its local address changed.

One solution to this problem is to rehash datagram socket if needed.

We add a new rehash(struct socket *) method in "struct proto", and
implement this method for UDP v4 & v6, using a common helper.

This rehashing only takes care of secondary hash table, since primary
hash (based on local port only) is not changed.
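
The failure and the fix can be illustrated with a toy model of the two tables. This is a sketch only: the class, the dictionaries standing in for sockets, and the `rehash` flag are hypothetical and do not mirror the kernel data structures.

```python
from collections import defaultdict

INADDR_ANY = "0.0.0.0"


class UdpTables:
    """Toy model of the primary and secondary UDP hash tables."""

    def __init__(self):
        self.by_port = defaultdict(list)        # primary: local port only
        self.by_addr_port = defaultdict(list)   # secondary: (local addr, local port)

    def autobind(self, sock: dict, port: int) -> None:
        # The socket is inserted while its local address is still INADDR_ANY.
        sock["addr"], sock["port"] = INADDR_ANY, port
        self.by_port[port].append(sock)
        self.by_addr_port[(INADDR_ANY, port)].append(sock)

    def connect(self, sock: dict, local_addr: str, rehash: bool) -> None:
        old_key = (sock["addr"], sock["port"])
        sock["addr"] = local_addr               # address chosen by the route lookup
        if rehash:
            # Move the socket to the secondary slot matching its new address.
            self.by_addr_port[old_key].remove(sock)
            self.by_addr_port[(local_addr, sock["port"])].append(sock)

    def lookup_secondary(self, addr: str, port: int):
        # Try the exact address first, then the wildcard slot; a socket only
        # matches if its own local address equals the address being searched.
        for key in ((addr, port), (INADDR_ANY, port)):
            for sock in self.by_addr_port[key]:
                if sock["addr"] == key[0]:
                    return sock
        return None
```

Without the rehash, the connected socket sits in the INADDR_ANY slot with a non-wildcard address, so neither probe of the secondary table matches it.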

Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 21:45:01 -07:00
Joe Perches
e3634169bc include/net/raw.h: Convert raw_seq_private macro to inline
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 13:42:22 -07:00
Julian Anastasov
6523ce1525 ipvs: fix active FTP
- Do not create expectation when forwarding the PORT
  command to avoid blocking the connection. The problem is that
  nf_conntrack_ftp.c:help() tries to create the same expectation later in
  POST_ROUTING and drops the packet with "dropping packet" message after
  failure in nf_ct_expect_related.

- Change ip_vs_update_conntrack to alter the conntrack
  for related connections from real server. If we do not alter the reply in
  this direction the next packet from client sent to vport 20 comes as NEW
  connection. We alter it but maybe some collision happens for both
  conntracks and the second conntrack gets destroyed immediately. The
  connection gets stuck too.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 10:39:57 -07:00
Li Zefan
3fb5a99191 cls_cgroup: Fix rcu lockdep warning
Dave reported an rcu lockdep warning on 2.6.35.4 kernel

task->cgroups and task->cgroups->subsys[i] are protected by RCU.
So we avoid accessing invalid pointers here. This might happen,
for example, when you are deref-ing those pointers while someone
moves @task from one cgroup to another.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-03 09:55:24 -07:00
John W. Linville
78ab952717 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-09-02 13:30:07 -04:00
Gerrit Renker
3d5b99ae82 TCP: update initial windows according to RFC 5681
This updates the use of larger initial windows, as originally specified in
RFC 3390, to use the newer IW values specified in RFC 5681, section 3.1.

The changes made in RFC 5681 are:
 a) the setting now is more clearly specified in units of segments (as the
    comments by John Heffner emphasized, this was not very clear in RFC 3390);
 b) for connections with 1095 < SMSS <= 2190 there is now a change:
    - RFC 3390 says that IW <= 4380,
    - RFC 5681 says that IW = 3 * SMSS <= 6570.

Since RFC 3390 is older and "only" proposed standard, whereas the newer RFC 5681
is already draft standard, it seems preferable to use the newer IW variant.
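
As a worked comparison, the two upper bounds on the initial window can be written out as below. The function names are made up for this sketch; the formulas are the ones given in RFC 3390 and in RFC 5681, section 3.1, with results in bytes.

```python
def iw_rfc3390(smss: int) -> int:
    """Upper bound on the initial window per RFC 3390, in bytes."""
    return min(4 * smss, max(2 * smss, 4380))


def iw_rfc5681(smss: int) -> int:
    """Upper bound on the initial window per RFC 5681 section 3.1, in bytes."""
    if smss > 2190:
        return 2 * smss
    if smss > 1095:
        return 3 * smss
    return 4 * smss
```

For SMSS = 2190 the older rule caps the window at 4380 bytes, while the newer one allows 3 * 2190 = 6570 bytes, which is the difference noted in item b).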

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-30 13:50:44 -07:00
Gerrit Renker
22b71c8f4f tcp/dccp: Consolidate common code for RFC 3390 conversion
This patch consolidates initial-window code common to TCP and CCID-2:
 * TCP uses RFC 3390 in a packet-oriented manner (tcp_input.c) and
 * CCID-2 uses RFC 3390 in packet-oriented manner (RFC 4341).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-30 13:45:26 -07:00
Jerry Chu
dca43c75e7 tcp: Add TCP_USER_TIMEOUT socket option.
This patch provides "user timeout" support as described in RFC793. The
socket option is also needed for the local half of RFC5482 "TCP User
Timeout Option".

TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int,
when > 0, to specify the maximum amount of time in ms that transmitted
data may remain unacknowledged before TCP will forcefully close the
corresponding connection and return ETIMEDOUT to the application. If
0 is given, TCP will continue to use the system default.

Increasing the user timeouts allows a TCP connection to survive extended
periods without end-to-end connectivity. Decreasing the user timeouts
allows applications to "fail fast" if so desired. Otherwise it may take
up to 20 minutes with the current system defaults in a normal WAN
environment.

The socket option can be set during any state of a TCP connection, but
is only effective during the synchronized states of a connection
(ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, or LAST-ACK).
Moreover, when used with the TCP keepalive (SO_KEEPALIVE) option,
TCP_USER_TIMEOUT will overtake keepalive to determine when to close a
connection due to keepalive failure.

The option does not change in any way when TCP retransmits a packet, nor
when a keepalive probe will be sent.

This option, like many others, will be inherited by an acceptor from its
listener.
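
From user space the option would be used roughly as in the sketch below, written in Python for brevity. The helper name is hypothetical, and the fallback value 18 is the Linux constant from <linux/tcp.h>, used only when the Python build does not export `socket.TCP_USER_TIMEOUT`; the example assumes a Linux kernel that includes this patch.

```python
import socket

# Linux value from <linux/tcp.h>; used only if this Python build lacks the constant.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def set_user_timeout(sock: socket.socket, timeout_ms: int) -> int:
    """Set TCP_USER_TIMEOUT (milliseconds) and return the value the kernel reports.

    A value of 0 leaves the system default in place.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)
```

A caller that wants to fail fast might use `set_user_timeout(sock, 30000)` before or after connecting, since the option can be set in any state.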

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-30 13:23:33 -07:00
Johannes Berg
34d4bc4d41 mac80211: support runtime interface type changes
Add support to mac80211 for changing the interface
type even when the interface is UP, if the driver
supports it.

To achieve this
 * add a new driver callback for switching,
 * split some of the interface up/down code out
   into new functions (do_open/do_stop), and
 * maintain our own __SDATA_RUNNING bit that will
   not be set during an interface type change, so
   that other code doesn't use the interface.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:53:31 -04:00
Johannes Berg
c0692b8fe2 cfg80211: allow changing port control protocol
Some vendor specified mechanisms for 802.1X-style
functionality use a different protocol than EAP
(even if EAP is vendor-extensible). Allow setting
the ethertype for the protocol when a driver has
support for this. The default if unspecified is
EAP, of course.

Note: This is suitable only for station mode, not
      for AP implementation.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:07 -04:00
Johannes Berg
8789d459bc mac80211: allow scan to complete from any context
The ieee80211_scan_completed() function was a frequent
source of potential deadlocks, since it is called by
drivers but may call back into drivers, so drivers had
to make sure to call it without any locks held, which
frequently led to more complex code in drivers. Avoid
that problem by allowing the function to be called in
any context, and queueing the actual work it does.
Also update the documentation for it to indicate this.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:06 -04:00
Joe Perches
145ce502e4 net/sctp: Use pr_fmt and pr_<level>
Change SCTP_DEBUG_PRINTK and SCTP_DEBUG_PRINTK_IPADDR to
use do { print } while (0) guards.
Add SCTP_DEBUG_PRINTK_CONT to fix errors in log when
lines were continued.
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Add a missing newline in "Failed bind hash alloc"

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 14:11:48 -07:00
John W. Linville
e569aa78ba Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	drivers/net/wireless/libertas/if_sdio.c
2010-08-25 14:51:42 -04:00
Bob Copeland
2738bd682d mac80211: trivial spelling fixes
Fix spelling and readability of a few lines of kernel doc:

    s/issueing/issuing/g
    s/approriate/appropriate/g
    s/supported by simply/supported simply by/
    s/IEEE80211_HW_BEACON_FILTERING/IEEE80211_HW_BEACON_FILTER/g

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-25 14:33:17 -04:00
David S. Miller
ad1af0fedb tcp: Combat per-cpu skew in orphan tests.
As reported by Anton Blanchard when we use
percpu_counter_read_positive() to make our orphan socket limit checks,
the check can be off by up to num_cpus_online() * batch (which is 32
by default) which on a 128 cpu machine can be as large as the default
orphan limit itself.

Fix this by doing the full expensive sum check if the optimized check
triggers.
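
The skew and the two-stage check can be illustrated with a small model. This is a sketch, not the kernel's percpu_counter: the class and the `too_many_orphans` helper below are simplified stand-ins written for this example.

```python
class PerCpuCounter:
    """Toy model: each CPU batches updates locally before folding them into `count`."""

    def __init__(self, ncpus: int, batch: int = 32):
        self.count = 0
        self.local = [0] * ncpus
        self.batch = batch

    def add(self, cpu: int, amount: int) -> None:
        self.local[cpu] += amount
        if abs(self.local[cpu]) >= self.batch:
            self.count += self.local[cpu]   # fold into the shared counter
            self.local[cpu] = 0

    def read_positive(self) -> int:
        # Cheap read of the shared counter; can be off by up to ncpus * batch.
        return max(self.count, 0)

    def sum_positive(self) -> int:
        # Exact value; has to visit every CPU's local delta.
        return max(self.count + sum(self.local), 0)


def too_many_orphans(counter: PerCpuCounter, limit: int) -> bool:
    if counter.read_positive() <= limit:
        return False                        # fast path: clearly under the limit
    return counter.sum_positive() > limit   # confirm with the expensive sum
```

With 128 CPUs and a batch of 32, the cheap read can be wrong by as much as 128 * 32 = 4096, which is why the second, exact check is needed before declaring the limit exceeded.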

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2010-08-25 02:27:49 -07:00
Johannes Berg
d70e96932d cfg80211: add some documentation
Add some documentation for cfg80211. I'm hoping some of
the regulatory documentation will be filled by somebody
more familiar with it, hint hint! :)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-24 16:32:06 -04:00
Johannes Berg
633dd1ea68 mac80211: fix docbook
Fix a small problem in the documentation for
ieee80211_request_smps, and a now erroneous
inclusion of enum ieee80211_key_alg, which no
longer exists after the change to ciphers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-24 16:32:04 -04:00
Johannes Berg
2e161f78e5 cfg80211/mac80211: extensible frame processing
Allow userspace to register for more than just
action frames by giving the frame subtype, and
make it possible to use this in various modes
as well.

With some tweaks and some added functionality
this will, in the future, also be usable in AP
mode and be able to replace the cooked monitor
interface currently used in that case.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-24 16:27:56 -04:00
Johannes Berg
633adf1ad1 cfg80211: mark ieee80211_hdrlen const
This function analyses only its single, value-passed
argument, and has no side effects. Thus it can be
const, which makes mac80211 smaller, for example:

   text	   data	    bss	    dec	    hex	filename
 362518	  16720	    884	 380122	  5ccda	mac80211.ko (before)
 362358	  16720	    884	 379962	  5cc3a	mac80211.ko (after)

a 160 byte saving in text size, and an optimisation
because the function won't be called as often.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-24 16:27:54 -04:00
Eric Dumazet
81ce790bd7 irda: use net_device_stats from struct net_device
struct net_device has its own struct net_device_stats member, so use
this one instead of a private copy in the irlan_cb struct.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-21 23:32:31 -07:00
Dmitry Kozlov
00959ade36 PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)
PPP: introduce "pptp" module which implements point-to-point tunneling protocol using pppox framework
NET: introduce the "gre" module for demultiplexing GRE packets on version criteria
     (required so that pptp and ip_gre can coexist)
NET: ip_gre: update to use the "gre" module

This patch introduces pptp support to the linux kernel, which
dramatically speeds up pptp vpn connections and decreases cpu usage in
comparison with the existing user-space implementation
(poptop/pptpclient). There is an accel-pptp project
(https://sourceforge.net/projects/accel-pptp/) to utilize this module;
it contains a plugin for pppd to use pptp in client mode and a modified
pptpd (poptop) to build a high-performance pptp NAS.

There were many changes from the initially submitted patch; the most important are:
1. using rcu instead of read-write locks
2. using static bitmap instead of dynamically allocated
3. using vmalloc for memory allocation instead of BITS_PER_LONG + __get_free_pages
4. fixed many coding style issues
Thanks to Eric Dumazet.

Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-21 23:05:39 -07:00
Grégoire Baron
eb4d406545 net/sched: add ACT_CSUM action to update packets checksums
ACT_CSUM can be called just after ACT_PEDIT in order to re-compute some
altered checksums in IPv4 and IPv6 packets. The following checksums are
supported by this patch:
 - IPv4: IPv4 header, ICMP, IGMP, TCP, UDP & UDPLite
 - IPv6: ICMPv6, TCP, UDP & UDPLite
It's possible to request updates of different kinds of checksums in the
same action, for when the packet flow mixes TCP, UDP and UDPLite, ...

An example of usage is done in the associated iproute2 patch.

Version 3 changes:
 - remove useless goto instructions
 - improve IPv6 hop options decoding

Version 2 changes:
 - coding style correction
 - remove useless arguments of some functions
 - use stack in tcf_csum_dump()
 - add tcf_csum_skb_nextlayer() to factor code

Signed-off-by: Gregoire Baron <baronchon@n7mm.org>
Acked-by: jamal <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-20 01:42:59 -07:00
Arnd Bergmann
0906a372f2 net/netfilter: __rcu annotations
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2010-08-19 17:18:01 -07:00
Paul E. McKenney
d34a16661e net: convert to rcu_dereference_index_check()
The task_cls_classid() function applies rcu_dereference() to integers,
which does not work with the shiny new sparse-based checking in
rcu_dereference().  This commit therefore moves to the new RCU API
rcu_dereference_index_check().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-08-19 17:17:59 -07:00
Oliver Hartkopp
2244d07bfa net: simplify flags for tx timestamping
This patch removes the abstraction introduced by the union skb_shared_tx in
the shared skb data.

The access of the different union elements at several places led to some
confusion about accessing the shared tx_flags e.g. in skb_orphan_try().

    http://marc.info/?l=linux-netdev&m=128084897415886&w=2

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 00:08:30 -07:00
Johannes Berg
97359d1235 mac80211: use cipher suite selectors
Currently, mac80211 translates the cfg80211
cipher suite selectors into ALG_* values.
That isn't all too useful, and some drivers
benefit from the distinction between WEP40
and WEP104 as well. Therefore, convert it
all to use the cipher suite selectors.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 16:45:11 -04:00
Johannes Berg
d1f5b7a34a mac80211: allow drivers to request SM PS mode change
Sometimes drivers have more information than the
stack about how their antennas/chains are used,
and may require that the SM PS mode be changed.
This could happen, for example, when detecting
that the user disconnected an antenna. Thus this
patch introduces API to allow drivers to request
SM PS mode changes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:40 -04:00
Johannes Berg
7da7cc1d42 mac80211: per interface idle notification
Sometimes we don't just need to know whether or
not the device is idle, but also per interface.
This adds that reporting capability to mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:40 -04:00
John W. Linville
9714d315d2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-08-16 14:40:44 -04:00
John W. Linville
4e6cbfd09c mac80211: support use of NAPI for bottom-half processing
This patch implements basic infrastructure to support use of NAPI by
mac80211-based hardware drivers.

Because mac80211 devices can support multiple netdevs, a dummy netdev
is used for interfacing with the NAPI code in the core of the network
stack.  That structure is hidden from the hardware drivers, but the
actual napi_struct is exposed in the ieee80211_hw structure so that the
poll routines in drivers can retrieve that structure.  Hardware drivers
can also specify their own weight value for NAPI polling.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 14:39:46 -04:00
David S. Miller
1c114f42a5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-08-10 15:59:38 -07:00
Mat Martineau
db12d647cc Bluetooth: Use 3-DH5 payload size for default ERTM max PDU size
The previous value of 672 for L2CAP_DEFAULT_MAX_PDU_SIZE is based on
the default L2CAP MTU.  That default MTU is calculated from the size
of two DH5 packets, minus ACL and L2CAP b-frame header overhead.

ERTM is used with newer basebands that typically support larger 3-DH5
packets, and i-frames and s-frames have more header overhead.  With
clean RF conditions, basebands will typically attempt to use 1021-byte
3-DH5 packets for maximum throughput.  Adjusting for 2 bytes of ACL
headers plus 10 bytes of worst-case L2CAP headers yields 1009 bytes
of payload.

This PDU size imposes less overhead for header bytes and gives the
baseband the option to choose 3-DH5 packets, but is small enough for
ERTM traffic to interleave well with other L2CAP or SCO data.
672-byte payloads do not allow the most efficient over-the-air
packet choice, and cannot achieve maximum throughput over BR/EDR.
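
The payload figure follows directly from the sizes quoted above; a minimal worked version of the arithmetic, with constant and function names invented for this sketch:

```python
PAYLOAD_3DH5 = 1021             # bytes carried by a 3-DH5 packet
ACL_HEADER = 2                  # bytes of ACL header overhead
L2CAP_WORST_CASE_HEADER = 10    # bytes of worst-case L2CAP header overhead


def ertm_max_pdu_size(baseband_payload: int = PAYLOAD_3DH5) -> int:
    """Payload left in one baseband packet after ACL and L2CAP headers."""
    return baseband_payload - ACL_HEADER - L2CAP_WORST_CASE_HEADER
```

Calling `ertm_max_pdu_size()` with the defaults gives the 1009 bytes used as the new default.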

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-08-10 07:59:11 -04:00
Mat Martineau
fa235562fb Bluetooth: Change default L2CAP ERTM retransmit timeout
The L2CAP specification requires that the ERTM retransmit timeout be at
least 2 seconds for BR/EDR connections.

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-08-10 07:59:11 -04:00
Randy Dunlap
53c3fa2064 net/sock.h: add missing kernel-doc notation
Add missing kernel-doc notation to struct sock:

Warning(include/net/sock.h:324): No description found for parameter 'sk_peer_pid'
Warning(include/net/sock.h:324): No description found for parameter 'sk_peer_cred'
Warning(include/net/sock.h:324): No description found for parameter 'sk_classid'
Warning(include/net/sock.h:324): Excess struct/union/enum/typedef member 'sk_peercred' description in 'sock'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-10 00:09:20 -07:00
David S. Miller
e225567960 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-08-06 13:30:43 -07:00
John W. Linville
c0068c8589 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6 2010-08-05 15:54:28 -04:00
Linus Torvalds
6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
David S. Miller
83bf2e4089 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2010-08-02 15:07:58 -07:00
Aneesh Kumar K.V
eda25e4616 net/9p: Implement TXATTRCREATE 9p call
TXATTRCREATE:  Prepare a fid for setting xattr value on a file system object.

 size[4] TXATTRCREATE tag[2] fid[4] name[s] attr_size[8] flags[4]
 size[4] RXATTRCREATE tag[2]

txattrcreate gets a fid pointing to xattr. This fid can later be
used to set the xattr value.

The flag value is derived from Linux setxattr. The manpage says
"The flags parameter can be used to refine the semantics of the operation.
XATTR_CREATE specifies a pure create, which fails if the named attribute
exists already. XATTR_REPLACE specifies a pure replace operation, which
fails if the named attribute does not already exist. By default (no flags),
the extended attribute will be created if need be, or will simply replace
the value if the attribute exists."

The actual setxattr operation happens when the fid is clunked. At that point
the written byte count and the attr_size specified in TXATTRCREATE should be
same otherwise an error will be returned.
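
The fid life cycle described above can be modelled as in the sketch below. It is a toy model, not client or server code: the class name, the dictionary standing in for the object's xattrs, and the exception types are assumptions; the two flag values are the Linux setxattr(2) constants.

```python
XATTR_CREATE = 0x1    # pure create: fail if the attribute already exists
XATTR_REPLACE = 0x2   # pure replace: fail if the attribute does not exist


class XattrCreateFid:
    """Toy model of a fid prepared by TXATTRCREATE."""

    def __init__(self, name: str, attr_size: int, flags: int = 0):
        self.name = name
        self.attr_size = attr_size
        self.flags = flags
        self.written = bytearray()

    def write(self, data: bytes) -> int:
        # Writes only accumulate the value; nothing is set yet.
        self.written += data
        return len(data)

    def clunk(self, xattrs: dict) -> None:
        # The actual setxattr happens here, and only if the sizes agree.
        if len(self.written) != self.attr_size:
            raise ValueError("written byte count differs from attr_size")
        exists = self.name in xattrs
        if self.flags & XATTR_CREATE and exists:
            raise FileExistsError(self.name)
        if self.flags & XATTR_REPLACE and not exists:
            raise KeyError(self.name)
        xattrs[self.name] = bytes(self.written)
```

Deferring the operation to clunk lets the value arrive in several writes while still giving the server a single point at which to validate the size and apply the flags.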

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:34 -05:00
Aneesh Kumar K.V
0ef63f345c net/9p: Implement attrwalk 9p call
TXATTRWALK: Descend an ATTR namespace

 size[4] TXATTRWALK tag[2] fid[4] newfid[4] name[s]
 size[4] RXATTRWALK tag[2] size[8]

txattrwalk gets a fid pointing to xattr. This fid can later be
used to read the xattr value. If name is NULL the fid returned
can be used to get the list of extended attributes associated with
the file system object.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:33 -05:00
M. Mohan Kumar
ef56547efa 9p: Implement LOPEN
Implement 9p2000.L version of open(LOPEN) interface in 9p client.

For LOPEN, no need to convert the flags to and from 9p mode to VFS mode.

Synopsis:

    size[4] Tlopen tag[2] fid[4] mode[4]

    size[4] Rlopen tag[2] qid[13] iounit[4]

[Fix mode bit format - jvrao@linux.vnet.ibm.com]

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:32 -05:00
Venkateswararao Jujjuri (JV)
5643135a28 fs/9p: This patch implements TLCREATE for 9p2000.L protocol.
SYNOPSIS

    size[4] Tlcreate tag[2] fid[4] name[s] flags[4] mode[4] gid[4]

    size[4] Rlcreate tag[2] qid[13] iounit[4]

DESCRIPTION

The Tlcreate request asks the file server to create a new regular file with the
name supplied, in the directory (dir) represented by fid.
The mode argument specifies the permissions to use. The new file is created with
the uid of the fid and with the supplied gid.

The flags argument represents the Linux access mode flags with which the caller
is requesting to open the file. The protocol allows all the Linux access
modes, but it is up to the server to allow or disallow any of these access modes.
If the server doesn't support any of the access modes, it is expected to
return an error.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:32 -05:00
M. Mohan Kumar
01a622bd74 9p: Implement TMKDIR
Implement TMKDIR as part of 2000.L Work

Synopsis

    size[4] Tmkdir tag[2] fid[4] name[s] mode[4] gid[4]

    size[4] Rmkdir tag[2] qid[13]

Description

    mkdir asks the file server to create a directory with given name,
    mode and gid. The qid for the new directory is returned with
    the mkdir reply message.

Note: 72 is selected as the opcode for TMKDIR from the reserved list.
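
For illustration, the request could be serialized as in the sketch below. Only the field order and opcode 72 come from this message; the little-endian encoding, the one-byte type field between size and tag, the size field counting itself, and the two-byte length prefix on strings are the usual 9P framing conventions, stated here as assumptions. The function names are hypothetical.

```python
import struct

P9_TMKDIR = 72  # opcode given in the note above


def p9_string(s: str) -> bytes:
    """Encode a 9P string: two-byte length followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return struct.pack("<H", len(data)) + data


def pack_tmkdir(tag: int, fid: int, name: str, mode: int, gid: int) -> bytes:
    """Build size[4] Tmkdir tag[2] fid[4] name[s] mode[4] gid[4]."""
    body = struct.pack("<BHI", P9_TMKDIR, tag, fid)
    body += p9_string(name)
    body += struct.pack("<II", mode, gid)
    return struct.pack("<I", 4 + len(body)) + body   # size includes its own 4 bytes


def unpack_qid(raw: bytes) -> tuple:
    """Split a qid[13] into (type, version, path)."""
    return struct.unpack("<BIQ", raw[:13])
```

The qid returned in Rmkdir is 13 bytes because it carries a one-byte type, a four-byte version and an eight-byte path.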

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:31 -05:00
M. Mohan Kumar
4b43516ab1 9p: Implement TMKNOD
Synopsis

    size[4] Tmknod tag[2] fid[4] name[s] mode[4] major[4] minor[4] gid[4]

    size[4] Rmknod tag[2] qid[13]

Description

    mknod asks the file server to create a device node with given major and
    minor number, mode and gid. The qid for the new device node is returned
    with the mknod reply message.

[sripathik@in.ibm.com: Fix error handling code]

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:30 -05:00
Venkateswararao Jujjuri (JV)
50cc42ff3d 9p: Define and implement TSYMLINK for 9P2000.L
Create a symbolic link

SYNOPSIS

size[4] Tsymlink tag[2] fid[4] name[s] symtgt[s] gid[4]

size[4] Rsymlink tag[2] qid[13]

DESCRIPTION

Create a symbolic link named 'name' pointing to 'symtgt'.
gid represents the effective group id of the caller.
The permissions of a symbolic link are irrelevant, hence they are omitted
from the protocol.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Reviewed-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:29 -05:00
Venkateswararao Jujjuri (JV)
652df9a7fd 9p: Define and implement TLINK for 9P2000.L
This patch adds a helper function to get the dentry from inode and
uses it in creating a Hardlink

SYNOPSIS

size[4] Tlink tag[2] dfid[4] oldfid[4] newpath[s]

size[4] Rlink tag[2]

DESCRIPTION

Create a link 'newpath' in the directory pointed to by dfid, linking to the oldfid path.

[sripathik@in.ibm.com : p9_client_link should not free req structure
if p9_client_rpc has returned an error.]

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:28:25 -05:00
Sripathi Kodi
87d7845aa0 9p: Implement client side of setattr for 9P2000.L protocol.
SYNOPSIS

      size[4] Tsetattr tag[2] attr[n]

      size[4] Rsetattr tag[2]

    DESCRIPTION

      The setattr command changes some of the file status information.
      attr resembles the iattr structure used in Linux kernel. It
      specifies which status parameter is to be changed and to what
      value. It is laid out as follows:

         valid[4]
            specifies which status information is to be changed. Possible
            values are:
            ATTR_MODE       (1 << 0)
            ATTR_UID        (1 << 1)
            ATTR_GID        (1 << 2)
            ATTR_SIZE       (1 << 3)
            ATTR_ATIME      (1 << 4)
            ATTR_MTIME      (1 << 5)
            ATTR_ATIME_SET  (1 << 7)
            ATTR_MTIME_SET  (1 << 8)

            The last two bits represent whether the time information
            is being sent by the client's user space. In the absence
            of these bits the server always uses the server's time.

         mode[4]
            File permission bits

         uid[4]
            Owner id of file

         gid[4]
            Group id of the file

         size[8]
            File size

         atime_sec[8]
            Time of last file access, seconds

         atime_nsec[8]
            Time of last file access, nanoseconds

         mtime_sec[8]
            Time of last file modification, seconds

         mtime_nsec[8]
            Time of last file modification, nanoseconds
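
A sketch of how a client might derive valid[4] from the changes it wants to make. The bit values are the ones listed above; the function name and the `SERVER_TIME` sentinel are hypothetical and only serve the example.

```python
ATTR_MODE = 1 << 0
ATTR_UID = 1 << 1
ATTR_GID = 1 << 2
ATTR_SIZE = 1 << 3
ATTR_ATIME = 1 << 4
ATTR_MTIME = 1 << 5
ATTR_ATIME_SET = 1 << 7
ATTR_MTIME_SET = 1 << 8

SERVER_TIME = object()  # hypothetical sentinel: let the server pick the timestamp


def setattr_valid_mask(mode=None, uid=None, gid=None, size=None,
                       atime=None, mtime=None) -> int:
    """Return the valid[4] mask for the fields that are not None."""
    valid = 0
    if mode is not None:
        valid |= ATTR_MODE
    if uid is not None:
        valid |= ATTR_UID
    if gid is not None:
        valid |= ATTR_GID
    if size is not None:
        valid |= ATTR_SIZE
    if atime is not None:
        valid |= ATTR_ATIME
        if atime is not SERVER_TIME:
            valid |= ATTR_ATIME_SET   # client supplies the access time
    if mtime is not None:
        valid |= ATTR_MTIME
        if mtime is not SERVER_TIME:
            valid |= ATTR_MTIME_SET   # client supplies the modification time
    return valid
```

Without the *_SET bits the server stamps the file with its own clock, so a client asking for "now" only sets ATTR_ATIME or ATTR_MTIME.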

Explanation of the patches:
--------------------------

*) The kernel just copies relevant contents of iattr structure to
   p9_iattr_dotl structure and passes it down to the client. The
   only check it has is calling inode_change_ok()
*) The p9_iattr_dotl structure does not have ctime and ia_file
   parameters because I don't think these are needed in our case.
   The client user space can request updating just ctime by calling
   chown(fd, -1, -1). This is handled on server side without a need
   for putting ctime on the wire.
*) The server currently supports changing mode, time, ownership and
   size of the file.
*) 9P RFC says "Either all the changes in wstat request happen, or
   none of them does: if the request succeeds, all changes were made;
   if it fails, none were."
   I have not done anything to implement this specifically because I
   don't see a reason.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:25:10 -05:00
Sripathi Kodi
f085312204 9p: getattr client implementation for 9P2000.L protocol.
SYNOPSIS

              size[4] Tgetattr tag[2] fid[4] request_mask[8]

              size[4] Rgetattr tag[2] lstat[n]

           DESCRIPTION

              The getattr transaction inquires about the file identified by fid.
              request_mask is a bit mask that specifies which fields of the
              stat structure the client is interested in.

              The reply will contain a machine-independent directory entry,
              laid out as follows:

                 st_result_mask[8]
                    Bit mask that indicates which fields in the stat structure
                    have been populated by the server

                 qid.type[1]
                    the type of the file (directory, etc.), represented as a bit
                    vector corresponding to the high 8 bits of the file's mode
                    word.

                 qid.vers[4]
                    version number for given path

                 qid.path[8]
                    the file server's unique identification for the file

                 st_mode[4]
                    Permission and flags

                 st_uid[4]
                    User id of owner

                 st_gid[4]
                    Group ID of owner

                 st_nlink[8]
                    Number of hard links

                 st_rdev[8]
                    Device ID (if special file)

                 st_size[8]
                    Size, in bytes

                 st_blksize[8]
                    Block size for file system IO

                 st_blocks[8]
                    Number of file system blocks allocated

                 st_atime_sec[8]
                    Time of last access, seconds

                 st_atime_nsec[8]
                    Time of last access, nanoseconds

                 st_mtime_sec[8]
                    Time of last modification, seconds

                 st_mtime_nsec[8]
                    Time of last modification, nanoseconds

                 st_ctime_sec[8]
                    Time of last status change, seconds

                 st_ctime_nsec[8]
                    Time of last status change, nanoseconds

                 st_btime_sec[8]
                    Time of creation (birth) of file, seconds

                 st_btime_nsec[8]
                    Time of creation (birth) of file, nanoseconds

                 st_gen[8]
                    Inode generation

                 st_data_version[8]
                    Data version number

              The request_mask and result_mask bit masks contain the following bits:
                 #define P9_STATS_MODE          0x00000001ULL
                 #define P9_STATS_NLINK         0x00000002ULL
                 #define P9_STATS_UID           0x00000004ULL
                 #define P9_STATS_GID           0x00000008ULL
                 #define P9_STATS_RDEV          0x00000010ULL
                 #define P9_STATS_ATIME         0x00000020ULL
                 #define P9_STATS_MTIME         0x00000040ULL
                 #define P9_STATS_CTIME         0x00000080ULL
                 #define P9_STATS_INO           0x00000100ULL
                 #define P9_STATS_SIZE          0x00000200ULL
                 #define P9_STATS_BLOCKS        0x00000400ULL

                 #define P9_STATS_BTIME         0x00000800ULL
                 #define P9_STATS_GEN           0x00001000ULL
                 #define P9_STATS_DATA_VERSION  0x00002000ULL

                 #define P9_STATS_BASIC         0x000007ffULL
                 #define P9_STATS_ALL           0x00003fffULL
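
A minimal sketch of how a client might read st_result_mask, using only the bit values listed above. The dictionary and the helper name populated_fields are hypothetical and are not part of the patch.

```python
# Bit values copied from the #define list in the commit message.
P9_STATS = {
    "MODE": 0x00000001,
    "NLINK": 0x00000002,
    "UID": 0x00000004,
    "GID": 0x00000008,
    "RDEV": 0x00000010,
    "ATIME": 0x00000020,
    "MTIME": 0x00000040,
    "CTIME": 0x00000080,
    "INO": 0x00000100,
    "SIZE": 0x00000200,
    "BLOCKS": 0x00000400,
    "BTIME": 0x00000800,
    "GEN": 0x00001000,
    "DATA_VERSION": 0x00002000,
}
P9_STATS_BASIC = 0x000007FF
P9_STATS_ALL = 0x00003FFF


def populated_fields(result_mask):
    """Return the names of the stat groups the server marked as filled in."""
    return [name for name, bit in P9_STATS.items() if result_mask & bit]
```

With a result mask of P9_STATS_BASIC the helper lists MODE through BLOCKS and leaves out BTIME, GEN and DATA_VERSION, which matches the note that the server currently sends back just the basic fields.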

        This patch implements the client side of getattr for
        9P2000.L. It introduces a new structure, p9_stat_dotl, for getting
        Linux stat information along with QID. The data layout is similar to
        stat structure in Linux user space with the following major
        differences:

        inode (st_ino) is not part of data. Instead qid is.

        device (st_dev) is not part of data because this doesn't make sense
        on the client.

        All time variables are 64 bit wide on the wire. The kernel seems to use
        32 bit variables for these variables. However, some of the architectures
        have used 64 bit variables and glibc exposes 64 bit variables to user
        space on some architectures. Hence to be on the safer side we have made
        these 64 bit in the protocol. Refer to the comments in
        include/asm-generic/stat.h

        There are some additional fields: st_btime_sec, st_btime_nsec, st_gen,
        st_data_version apart from the bitmask, st_result_mask. The bit mask
        is filled by the server to indicate which stat fields have been
        populated by the server. Currently there is no clean way for the
        server to obtain these additional fields, so it sends back just the
        basic fields.
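
As an illustration of the layout above, the following sketch unpacks the stat portion of a reply with Python's struct module. It assumes the usual 9P convention of little-endian integers with no padding; STAT_DOTL, FIELDS and decode_stat_dotl are hypothetical names, and this is not the kernel's p9_stat_dotl code.

```python
import struct

# Little-endian, unpadded: st_result_mask[8], qid.type[1], qid.vers[4],
# qid.path[8], st_mode[4], st_uid[4], st_gid[4], then fifteen 8-byte fields.
STAT_DOTL = struct.Struct("<QBIQIII15Q")

FIELDS = (
    "st_result_mask", "qid_type", "qid_vers", "qid_path",
    "st_mode", "st_uid", "st_gid",
    "st_nlink", "st_rdev", "st_size", "st_blksize", "st_blocks",
    "st_atime_sec", "st_atime_nsec", "st_mtime_sec", "st_mtime_nsec",
    "st_ctime_sec", "st_ctime_nsec", "st_btime_sec", "st_btime_nsec",
    "st_gen", "st_data_version",
)


def decode_stat_dotl(payload):
    """Map the fields listed in the commit message onto a dictionary."""
    return dict(zip(FIELDS, STAT_DOTL.unpack_from(payload)))
```

Under these assumptions the fields add up to 153 bytes. A caller would still consult st_result_mask before trusting any individual field.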

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:25:09 -05:00
Sripathi Kodi
7751bdb3a0 9p: readdir implementation for 9p2000.L
This patch implements the kernel part of readdir() for 9p2000.L.

    Change from V3: Instead of inode, server now sends qids for each dirent

    SYNOPSIS

    size[4] Treaddir tag[2] fid[4] offset[8] count[4]
    size[4] Rreaddir tag[2] count[4] data[count]

    DESCRIPTION

    The readdir request asks the server to read the directory specified by 'fid'
    at an offset specified by 'offset' and return as many dirent structures as
    possible that fit into count bytes. Each dirent structure is laid out as
    follows.

            qid.type[1]
              the type of the file (directory, etc.), represented as a bit
              vector corresponding to the high 8 bits of the file's mode
              word.

            qid.vers[4]
              version number for given path

            qid.path[8]
              the file server's unique identification for the file

            offset[8]
              offset into the next dirent.

            type[1]
              type of this directory entry.

            name[256]
              name of this directory entry.

    This patch adds v9fs_dir_readdir_dotl() as the readdir() call for 9p2000.L.
    This function sends P9_TREADDIR command to the server. In response the server
    sends a buffer filled with dirent structures. This is different from the
    existing v9fs_dir_readdir() call which receives stat structures from the server.
    This results in significant speedup of readdir() on large directories.
    For example, doing 'ls >/dev/null' on a directory with 10000 files on my
    laptop takes 1.088 seconds with the existing code, but only takes 0.339 seconds
    with the new readdir.
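
A rough sketch of walking the data[count] buffer of an Rreaddir reply, following the dirent layout above. It assumes little-endian integers and, as a further assumption, that the name is sent as a 9P string (a 2-byte length followed by the bytes), reading name[256] as a maximum length. The names DIRENT_HEAD and parse_dirents are hypothetical.

```python
import struct

# Fixed-size head of one dirent, little-endian with no padding:
# qid.type[1] qid.vers[4] qid.path[8] offset[8] type[1]
DIRENT_HEAD = struct.Struct("<BIQQB")


def parse_dirents(data):
    """Split an Rreaddir data buffer into a list of dirent dictionaries."""
    entries = []
    pos = 0
    while pos < len(data):
        qid_type, qid_vers, qid_path, offset, dtype = DIRENT_HEAD.unpack_from(data, pos)
        pos += DIRENT_HEAD.size
        # Assumption: the name is a 9P string, a 2-byte length then the bytes.
        (name_len,) = struct.unpack_from("<H", data, pos)
        pos += 2
        name = data[pos:pos + name_len].decode("utf-8")
        pos += name_len
        entries.append({
            "qid_type": qid_type, "qid_vers": qid_vers, "qid_path": qid_path,
            "offset": offset, "type": dtype, "name": name,
        })
    return entries
```

Because each entry carries its own qid, offset and type, a client reading a directory this way does not need a full stat structure per entry, which is the difference from the older readdir path that the message describes.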

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-08-02 14:25:07 -05:00
Changli Gao
f43dc98b3b netfilter: nf_nat: make unique_tuple return void
The only user of unique_tuple(), get_unique_tuple(), doesn't care about the
return value of unique_tuple(), so make unique_tuple() return void (nothing).

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-08-02 17:20:54 +02:00
Simon Horman
5c0d2374a1 ipvs: provide default ip_vs_conn_{in,out}_get_proto
This removes duplicate code by providing a default implementation
which is used by 3 of the 4 modules that provide these calls.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-08-02 17:12:44 +02:00
Changli Gao
ee92d37861 netfilter: nf_conntrack_extend: introduce __nf_ct_ext_exist()
Some users of nf_ct_ext_exist() know ct->ext isn't NULL. For these users, the
check for ct->ext isn't necessary, so the function __nf_ct_ext_exist() can be
used instead.

The type of the return value of nf_ct_ext_exist() is changed to bool.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-08-02 17:06:19 +02:00
David Miller
ea4bd8ba80 Bluetooth: Use list_head for HCI blacklist head
The bdaddr in the list root is completely unused and just
taking up space.

Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-31 16:06:58 -07:00
John W. Linville
ae3568adf4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-07-29 14:47:07 -04:00
Christian Lamparter
b7753c8cd5 cfg80211: fix dev <-> wiphy typo
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-29 12:55:00 -04:00
Johannes Berg
e5b900d228 mac80211: allow drivers to request DTIM period
Some features require knowing the DTIM period
before associating. This implements the ability
to wait for a beacon in mac80211 before assoc
to provide this value. It is optional since
most likely not all drivers will need this.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-29 12:55:00 -04:00
Felix Fietkau
4552124543 mac80211: inform drivers about the off-channel status on channel changes
For some drivers it can be useful to know whether the channel they're
supposed to switch to is going to be used for short off-channel work or
scanning, or whether the hardware is expected to stay on it for a while
longer. This is important for various kinds of calibration work, which
takes longer to complete and should keep some persistent state, even if
the channel temporarily changes.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-28 16:24:02 -04:00
John W. Linville
099284bdec Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6 2010-07-28 16:17:49 -04:00
David S. Miller
bb7e95c8fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x_main.c

Merge bnx2x bug fixes in by hand... :-/

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-27 21:01:35 -07:00
Marcel Holtmann
e73439d8c0 Bluetooth: Defer SCO setup if mode change is pending
Certain headsets such as the Motorola H350 will reject SCO and eSCO
connection requests while the ACL is transitioning from sniff mode
to active mode. Add synchronization so that SCO and eSCO connection
requests will wait until the ACL has fully transitioned to active mode.

< HCI Command: Exit Sniff Mode (0x02|0x0004) plen 2
    handle 12
> HCI Event: Command Status (0x0f) plen 4
    Exit Sniff Mode (0x02|0x0004) status 0x00 ncmd 1
< HCI Command:  Setup Synchronous Connection (0x01|0x0028) plen 17
    handle 12 voice setting 0x0040
> HCI Event: Command Status (0x0f) plen 4
    Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Number of Completed Packets (0x13) plen 5
    handle 12 packets 1
> HCI Event: Mode Change (0x14) plen 6
    status 0x00 handle 12 mode 0x00 interval 0
    Mode: Active
> HCI Event: Synchronous Connect Complete (0x2c) plen 17
    status 0x10 handle 14 bdaddr 00:1A:0E:50:28:A4 type SCO
    Error: Connection Accept Timeout Exceeded

Signed-off-by: Ron Shaffer <rshaffer@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-27 12:29:04 -07:00
Joe Perches
073730d771 wireless: Convert wiphy_debug macro to function
Save a few bytes of text

(allyesconfig)
$ size drivers/net/wireless/built-in.o*
   text	   data	    bss	    dec	    hex	filename
3924568	 100548	 871056	4896172	 4ab5ac	drivers/net/wireless/built-in.o.new
3926520	 100548	 871464	4898532	 4abee4	drivers/net/wireless/built-in.o.old

$ size net/wireless/core.o*
   text	   data	    bss	    dec	    hex	filename
  12843	    216	   3768	  16827	   41bb	net/wireless/core.o.new
  12328	    216	   3656	  16200	   3f48	net/wireless/core.o
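
The two size listings can be combined into a net figure. The short sketch below does that arithmetic on the text column only, treating the two objects as separate; the helper name text_delta is hypothetical.

```python
# "text" column from the two `size` runs above, as (new, old) pairs.
drivers_builtin = (3924568, 3926520)  # drivers/net/wireless/built-in.o
wireless_core = (12843, 12328)        # net/wireless/core.o


def text_delta(new, old):
    """Change in text size in bytes; negative means the new build is smaller."""
    return new - old


drivers_change = text_delta(*drivers_builtin)
core_change = text_delta(*wireless_core)
net_change = drivers_change + core_change
```

The driver objects shrink by 1952 bytes while core.o grows by 515 bytes, because the function body now lives there once, for a net saving of 1437 bytes of text.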

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-27 15:14:13 -04:00
Joe Perches
e1db74fcc3 include/net/cfg80211.h: Add wiphy_<level> printk equivalents
Simplify logging messages for wiphy devices

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-27 15:14:13 -04:00
John W. Linville
800f65bba8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-commands.h
2010-07-27 11:59:19 -04:00
John W. Linville
3289a8368c lib80211: remove unused host_build_iv option
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-26 15:09:04 -04:00
stephen hemminger
3b87956ea6 net sched: fix race in mirred device removal
This fixes a hang when the target device of a mirred packet classifier
action is removed.

If a mirror or redirection action is configured to cause packets
to go to another device, the classifier holds a ref count, but it assumed
the administrator had cleaned up all redirections before removing the
device. The fix is to add a notifier and clean up during unregister.

The new list is implicitly protected by RTNL mutex because
it is held during filter add/delete as well as notifier.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24 21:04:20 -07:00
David S. Miller
2a88e7e559 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-commands.h
2010-07-23 14:03:38 -07:00
Hannes Eder
7f1c407579 IPVS: make FTP work with full NAT support
Use nf_conntrack/nf_nat code to do the packet mangling and the TCP
sequence adjusting.  The function 'ip_vs_skb_replace' is now dead
code, so it is removed.

To SNAT FTP, use something like:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
    --vport 21 -j SNAT --to-source 192.168.10.10
and for the data connections in passive mode:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
    --vportctl 21 -j SNAT --to-source 192.168.10.10
using '-m state --state RELATED' would also work.

Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and
nf_nat_ftp are loaded.

[ up-port and minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 12:48:52 +02:00
Gustavo F. Padovan
3f30fc1570 net: remove last uses of __attribute__((packed))
Network code uses the __packed macro instead of __attribute__((packed)).

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-21 14:44:29 -07:00
Gustavo F. Padovan
942875ffc1 irda: Use __packed annotation instead of IRDA_PACKED macro
Remove IRDA_PACKED macro, which maps to __attribute__((packed)). IRDA is
one of the last users of __attribute__((packed)). Networking code uses
__packed now.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-21 14:44:29 -07:00
Gustavo F. Padovan
66c853cc21 Bluetooth: Use __packed annotation
To make net/ and include/net/ code consistent use __packed instead of
__attribute__ ((packed)). Bluetooth subsystem was one of the last net
subsys still using __attribute__ ((packed)).

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:13 -07:00
Gustavo F. Padovan
5d8868ff3d Bluetooth: Add Google's copyright to L2CAP
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Suraj Sumangala
9981151086 Bluetooth: Implemented HCI frame reassembly for RX from stream
Implemented frame reassembly for fragments
received from a stream.

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Suraj Sumangala
33e882a5f2 Bluetooth: Implement hci_reassembly helper to reassemble RX packets
Implements feature to reassemble received HCI frames from any input stream

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Suraj Sumangala
cd4c53919e Bluetooth: Add one more buffer for HCI stream reassembly
Additional reassembly buffer to keep track of stream reassembly

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Gustavo F. Padovan
ce5706bd69 Bluetooth: Add Copyright notice to L2CAP
Copyright for the time I worked on L2CAP during the Google Summer of Code
program.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:11 -07:00
Gustavo F. Padovan
e0f66218b3 Bluetooth: Remove the send_lock spinlock from ERTM
Using a lock to deal with the ERTM race condition - interruption with
new data from the hci layer - is wrong. We should use the native skb
backlog queue.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:09 -07:00
Gustavo F. Padovan
cf6c2c0b9f Bluetooth: Disconnect early if mode is not supported
When mode is mandatory we shall not send connect request and report this
to the userspace as well.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:09 -07:00
Ron Shaffer
2d0a03460a Bluetooth: Reassigned copyright to Code Aurora Forum
Qualcomm, Inc. has reassigned rights to Code Aurora Forum. Accordingly,
as files are modified by Code Aurora Forum members, the copyright
statement will be updated.

Signed-off-by: Ron Shaffer <rshaffer@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:06 -07:00
Ron Shaffer
04fafe4ed7 Bluetooth: Remove extraneous white space
Deleted extraneous white space from the end of several lines

Signed-off-by: Ron Shaffer <rshaffer@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:05 -07:00
Johan Hedberg
f03585689f Bluetooth: Add blacklist support for incoming connections
In some circumstances it could be desirable to reject incoming
connections on the baseband level. This patch adds this feature through
two new ioctls: HCIBLOCKADDR and HCIUNBLOCKADDR. Both take a simple
Bluetooth address as a parameter. BDADDR_ANY can be used with
HCIUNBLOCKADDR to remove all devices from the blacklist.
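
The semantics described above can be modelled in a few lines. This is a user-space illustration only, not the kernel code and not a wrapper around the real ioctls; the class Blacklist and its methods are hypothetical, and error handling (for example blocking the same address twice) is left out.

```python
BDADDR_ANY = "00:00:00:00:00:00"


class Blacklist:
    """Toy model of the per-device list of rejected Bluetooth addresses."""

    def __init__(self):
        self._blocked = set()

    def block(self, bdaddr):
        # Models HCIBLOCKADDR: remember one address.
        self._blocked.add(bdaddr.upper())

    def unblock(self, bdaddr):
        # Models HCIUNBLOCKADDR: BDADDR_ANY empties the whole list.
        if bdaddr == BDADDR_ANY:
            self._blocked.clear()
        else:
            self._blocked.discard(bdaddr.upper())

    def accepts(self, bdaddr):
        """True if an incoming connection from bdaddr would not be rejected."""
        return bdaddr.upper() not in self._blocked
```

The special case for BDADDR_ANY is the only part that differs from a plain set: it gives user space a way to reset the list without enumerating it.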

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:05 -07:00
David S. Miller
11fe883936 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/vhost/net.c
	net/bridge/br_device.c

Fix merge conflict in drivers/vhost/net.c with guidance from
Stephen Rothwell.

Revert the effects of net-2.6 commit 573201f36f
since net-next-2.6 has fixes that make bridge netpoll work properly thus
we don't need it disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-20 18:25:24 -07:00
John W. Linville
c28991a02c wireless: correct sparse warning in wext-compat.c
CHECK   net/wireless/wext-compat.c
net/wireless/wext-compat.c:1434:5: warning: symbol 'cfg80211_wext_siwpmksa' was not declared. Should it be static?

Add declaration in cfg80211.h.  Also add an EXPORT_SYMBOL_GPL, since all
the peer functions have it.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:49:37 -04:00
John W. Linville
4f366c5dab wireless: only use alpha2 regulatory information from country IE
The meaning and/or usage of the country IE is somewhat poorly defined.
In practice, this means that regulatory rulesets in a country IE are
often incomplete and might be untrustworthy.  This removes the code
associated with interpreting those rulesets while preserving respect
for country "alpha2" codes also contained in the country IE.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:44:35 -04:00
Johannes Berg
4ced3f74da mac80211: move QoS-enable to BSS info
Ever since

commit e1b3ec1a2a
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Mon Mar 29 12:18:34 2010 +0200

    mac80211: explicitly disable/enable QoS

mac80211 is telling drivers, in particular
iwlwifi, whether QoS is enabled or not.

However, this is only relevant for station mode,
since only then will any device send nullfunc
frames and need to know whether they should be
QoS frames or not. In other modes, there are
(currently) no frames the device is supposed to
send.

When you now consider virtual interfaces, it
becomes apparent that the current mechanism is
inadequate since it enables/disables QoS on a
global scale, where for nullfunc frames it has
to be on a per-interface scale.

Due to the above considerations, we can change
the way mac80211 advertises the QoS state to
drivers to only ever advertise it as "off" in
station mode, and make it a per-BSS setting.

Tested-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:02:58 -04:00
Eric Dumazet
f86586fa48 tcp: sizeof struct tcp_skb_cb is 44
Correct the comment stating sizeof(struct tcp_skb_cb) is 36 or 40: it has been
44 bytes since commit 951dbc8ac7 ([IPV6]: Move nextheader offset
to the IP6CB).

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-15 21:41:00 -07:00
Pablo Neira Ayuso
cca5cf91c7 nfnetlink_log: do not expose NFULNL_COPY_DISABLED to user-space
This patch moves the NFULNL_COPY_DISABLED definition from
linux/netfilter/nfnetlink_log.h to net/netfilter/nfnetlink_log.h
since this copy mode is only for internal use.

I have also changed the value from 0x03 to 0xff. Thus, we avoid
a gap from user-space that may confuse users if we add new
copy modes in the future.

This change was introduced in:
http://www.spinics.net/lists/netfilter-devel/msg13535.html

Since this change is not included in any stable Linux kernel,
I think it's safe to make this change now. Anyway, this copy
mode does not make any sense from user-space, so this patch
should not break any existing setup.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-15 11:27:41 +02:00
Tom Herbert
b0f77d0eae net: fix problem in reading sock TX queue
Fix problem in reading the tx_queue recorded in a socket.  In
dev_pick_tx, the TX queue is read by doing a check with
sk_tx_queue_recorded on the socket, followed by a sk_tx_queue_get.
The problem is that there is no mutual exclusion across these
calls in the socket, so it is possible that the queue in the
sock can be invalidated after sk_tx_queue_recorded is called, so
that sk_tx_queue_get returns -1, which sets 65535 in queue_index
and thus dev_pick_tx returns 65536 which is a bogus queue and
can cause a crash in dev_queue_xmit.

We fix this by only calling sk_tx_queue_get which does the proper
checks.  The interface is that sk_tx_queue_get returns the TX queue
if the sock argument is non-NULL and TX queue is recorded, else it
returns -1.  sk_tx_queue_recorded is no longer used so it can be
completely removed.
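
To illustrate why the stray -1 is harmful and why a single read helps, here is a small model. It is a sketch, not the kernel code: as_u16 and pick_queue are hypothetical helpers, and the recorded queue is represented by a plain integer.

```python
NO_QUEUE = -1  # what sk_tx_queue_get reports when no queue is recorded


def as_u16(value):
    """Truncate an integer the way storing it in a 16-bit field would."""
    return value & 0xFFFF


def pick_queue(recorded, fallback):
    """Read the recorded queue once, then decide on that single value."""
    queue = recorded
    if queue < 0:
        return fallback
    return queue
```

Storing -1 in a 16-bit index gives 65535, the bogus value mentioned above. Testing the value that was actually read, instead of checking first and reading again, leaves no window for the queue to be invalidated between the two steps.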

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 20:50:29 -07:00
Changli Gao
7ba4291007 inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage()
A new boolean flag no_autobind is added to structure proto to avoid the autobind
calls when the protocol is TCP. Then sock_rps_record_flow() is called in the
TCP's sendmsg() and sendpage() paths.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_common.h |    4 ++++
 include/net/sock.h        |    1 +
 include/net/tcp.h         |    8 ++++----
 net/ipv4/af_inet.c        |   15 +++++++++------
 net/ipv4/tcp.c            |   11 +++++------
 net/ipv4/tcp_ipv4.c       |    3 +++
 net/ipv6/af_inet6.c       |    8 ++++----
 net/ipv6/tcp_ipv6.c       |    3 +++
 8 files changed, 33 insertions(+), 20 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:21:46 -07:00
Changli Gao
53d3176b28 net: cleanups
remove useless blanks.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_common.h |   55 ++++-------
 include/net/tcp.h         |  222 +++++++++++++++++-----------------------------
 include/net/udp.h         |   38 +++----
 3 files changed, 123 insertions(+), 192 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:21:45 -07:00
David S. Miller
597e608a84 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-07-07 15:59:38 -07:00
David S. Miller
e490c1defe Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2010-07-02 22:42:06 -07:00
John Fastabend
f0796d5c73 net: decreasing real_num_tx_queues needs to flush qdisc
Reducing real_num_tx_queues needs to flush the qdisc, otherwise
skbs with queue_mappings greater than real_num_tx_queues can
be sent to the underlying driver.

The flow for this is,

dev_queue_xmit()
	dev_pick_tx()
		skb_tx_hash()  => hash using real_num_tx_queues
		skb_set_queue_mapping()
	...
	qdisc_enqueue_root() => enqueue skb on txq from hash
...
dev->real_num_tx_queues -= n
...
sch_direct_xmit()
	dev_hard_start_xmit()
		ndo_start_xmit(skb,dev) => skb queue set with old hash

skbs are enqueued on the qdisc with skb->queue_mapping set
0 < queue_mappings < real_num_tx_queues.  When the driver
decreases real_num_tx_queues skbs may be dequeued from the
qdisc with a queue_mapping greater than real_num_tx_queues.

This fixes a case in ixgbe where this was occurring with DCB
and FCoE. Because the driver is using queue_mapping to map
skbs to tx descriptor rings we can potentially map skbs to
rings that no longer exist.
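
The flow above can be replayed in a toy model. This is only a sketch: pick_tx uses a simple modulo as a stand-in for skb_tx_hash(), packets are dictionaries, and stale_after_shrink is a hypothetical helper.

```python
from collections import deque


def pick_tx(flow_hash, real_num_tx_queues):
    """Stand-in for the hash step: map a flow onto one of the active queues."""
    return flow_hash % real_num_tx_queues


def stale_after_shrink(qdisc, new_real_num):
    """Queued packets whose mapping points past the reduced queue count."""
    return [skb for skb in qdisc if skb["queue_mapping"] >= new_real_num]


# Enqueue eight packets while eight TX queues are active.
qdisc = deque({"id": h, "queue_mapping": pick_tx(h, 8)} for h in range(8))

# The driver now drops to four queues; half of the backlog is mis-mapped.
stale = stale_after_shrink(qdisc, 4)
```

In this model, clearing the backlog when the count drops (qdisc.clear()) leaves nothing that can reach a ring that no longer exists, which is the effect the flush is meant to have.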

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 21:59:07 -07:00
John Fastabend
4ef6acff83 sched: qdisc_reset_all_tx is calling qdisc_reset without qdisc_lock
When calling qdisc_reset() the qdisc lock needs to be held.  In
this case there is at least one driver, i4l, which is using this
without holding the lock.  Add the locking here.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 21:59:07 -07:00
David S. Miller
05318bc905 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
Conflicts:
	drivers/net/wireless/libertas/host.h
2010-07-01 17:34:14 -07:00
Changli Gao
d6bebca92c fragment: add fast path for in-order fragments
add fast path for in-order fragments

As fragments are sent in order by most OSes, such as Windows, Darwin and
FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
In the fast path, we check if the skb at the end of the inet_frag_queue is the
prev we expect.
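
A simplified sketch of the idea, with fragments reduced to (offset, length) tuples in a Python list. It ignores overlap handling and everything else the real reassembly code does; insert_fragment is a hypothetical name.

```python
def insert_fragment(queue, offset, length):
    """Insert a fragment keeping the queue sorted by offset.

    Returns "fast" when the tail check was enough and "slow" when the
    queue had to be scanned for the insertion point.
    """
    frag = (offset, length)
    # Fast path: the queue is empty or the tail is the expected predecessor.
    if not queue or queue[-1][0] < offset:
        queue.append(frag)
        return "fast"
    # Slow path: scan for the first fragment at or beyond the new offset.
    # A match always exists here because the tail offset is >= offset.
    index = next(i for i, (off, _) in enumerate(queue) if off >= offset)
    queue.insert(index, frag)
    return "slow"
```

For a sender that emits fragments in order, every insertion takes the fast path and the cost per fragment no longer grows with the queue length.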

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_frag.h |    1 +
 net/ipv4/ip_fragment.c  |   12 ++++++++++++
 net/ipv6/reassembly.c   |   11 +++++++++++
 3 files changed, 24 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 13:44:29 -07:00
Eric Dumazet
4ce3c183fc snmp: 64bit ipstats_mib for all arches
/proc/net/snmp and /proc/net/netstat expose SNMP counters.

Width of these counters is either 32 or 64 bits, depending on the size
of "unsigned long" in kernel.

This means user programs parsing these files must already be prepared to
deal with 64bit values, regardless of the user program being 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

# netstat -s|egrep "InOctets|OutOctets"
    InOctets: 244068329096
    OutOctets: 244069348848
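
A short calculation shows what a 32-bit counter would have done with the InOctets value above. The 1 Gbit/s line rate used for the wrap time is an assumption chosen for illustration, and seconds_to_wrap is a hypothetical helper.

```python
IN_OCTETS = 244068329096  # InOctets from the netstat output above

# How often a 32-bit counter would have wrapped, and what it would show now.
wraps, shown = divmod(IN_OCTETS, 2 ** 32)


def seconds_to_wrap(bits, bytes_per_second):
    """Time for a counter of the given width to overflow at a steady rate."""
    return 2 ** bits / bytes_per_second


# Assumed rate: 1 Gbit/s, i.e. 125,000,000 bytes per second.
wrap_time_32 = seconds_to_wrap(32, 125_000_000)
```

The value would have wrapped 56 times and read 3550160520, and at the assumed rate a 32-bit octet counter overflows roughly every 34 seconds.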

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 13:31:19 -07:00
Kulikov Vasiliy
787a34456d net/neighbour.h: fix typo
'Shoul' must be 'should'.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 12:06:52 -07:00
Andreas Steffen
4efd7e8335 xfrm: fix XFRMA_MARK extraction in xfrm_mark_get
Determine the size of the xfrm_mark struct, not of its pointer.

Signed-off-by: Andreas Steffen <andreas.steffen@strongswan.org>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 10:43:31 -07:00
Sjur Braendeland
529d6dad5b caif-driver: Add CAIF-SPI Protocol driver.
This patch introduces the CAIF SPI Protocol Driver for
CAIF Link Layer.

This driver implements a platform driver to accommodate a
platform-specific SPI device. A general platform driver is not
possible as there is no SPI slave-side kernel API defined.
A sample CAIF SPI Platform device can be found in
.../Documentation/networking/caif/spi_porting.txt

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-29 00:08:21 -07:00
Changli Gao
210d6de78c act_mirred: don't clone skb when skb isn't shared
don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-28 23:24:32 -07:00
Florian Westphal
172d69e63c syncookies: add support for ECN
Allows use of ECN when syncookies are in effect by encoding ecn_ok
into the syn-ack tcp timestamp.

While at it, remove an unneeded #ifdef CONFIG_SYN_COOKIES.
With CONFIG_SYN_COOKIES=nm want_cookie is ifdef'd to 0 and gcc
removes the "if (0)".

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-26 22:00:03 -07:00
Eric Dumazet
1823e4c80e snmp: add align parameter to snmp_mib_init()
In preparation for 64bit snmp counters for some mibs,
add an 'align' parameter to snmp_mib_init(), instead
of assuming mibs only contain 'unsigned long' fields.

Callers can use __alignof__(type) to provide correct
alignment.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:17 -07:00
Tim Gardner
a8756201ba netfilter: xt_connbytes: Force CT accounting to be enabled
Check at rule install time that CT accounting is enabled. Force it
to be enabled if not while also emitting a warning since this is not
the default state.

This is in preparation for deprecating CONFIG_NF_CT_ACCT upon which
CONFIG_NETFILTER_XT_MATCH_CONNBYTES depended being set.

Added 2 CT accounting support functions:

nf_ct_acct_enabled() - Get CT accounting state.
nf_ct_set_acct() - Enable/disable CT accounting.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-25 14:44:07 +02:00
Juuso Oikarinen
fa61cf70a6 cfg80211/mac80211: Update set_tx_power to use mBm instead of dBm units
In preparation for a TX power setting interface in the nl80211, change the
.set_tx_power function to use mBm units instead of dBm for greater accuracy and
smaller power levels.
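
For reference, 1 dBm equals 100 mBm, so the unit change is a factor of 100. The helpers below are a hypothetical sketch of the conversion, not the kernel's own macros.

```python
def dbm_to_mbm(dbm):
    """Convert dBm (possibly fractional) to whole mBm."""
    return int(round(dbm * 100))


def mbm_to_dbm(mbm):
    """Convert mBm to whole dBm, dropping the fractional part."""
    return int(mbm / 100)
```

A level such as 0.5 dBm becomes 50 mBm, which an integer dBm interface could not express.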

Also move the tx_power_setting enumeration to nl80211 in advance.

This change affects the .set_tx_power function prototype. As a result,
corresponding changes are needed in the modules using it. These are mac80211,
iwmc3200wifi and rndis_wlan.

Cc: Samuel Ortiz <samuel.ortiz@intel.com>
Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Acked-by: Samuel Ortiz <samuel.ortiz@intel.com>
Acked-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-24 15:42:33 -04:00
David S. Miller
8244132ea8 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/ipv4/ip_output.c
2010-06-23 18:26:27 -07:00
Jiri Olsa
7b2ff18ee7 net - IP_NODEFRAG option for IPv4 socket
This patch implements the IP_NODEFRAG option for IPv4 sockets.
The reason is that there is no other way to send out a packet with a
user-customized header of the reassembly part.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-23 13:16:38 -07:00
Justin P. Mattock
1dc8d8c06d net: Fix a typo in netlink.h
Fix a typo in include/net/netlink.h
should be finalize instead of finanlize

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-23 12:58:40 -07:00
Eric Dumazet
8f1c14b2e3 snmp: fix SNMP_ADD_STATS()
commit aa2ea0586d (tcp: fix outsegs stat for TSO segments) incorrectly
assumed SNMP_ADD_STATS() was used from BH context.

Fix this using mib[!in_softirq()] instead of mib[0]

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-23 11:48:19 -07:00
Juuso Oikarinen
f90754c15f mac80211: Add interface for driver to temporarily disable dynamic ps
The mechanism introduced in this patch applies (at least) to hardware
designs using a single shared antenna for both WLAN and BT. In these designs,
the antenna must be toggled between WLAN and BT.

On such hardware, managing WLAN co-existence with Bluetooth requires WLAN
full power save whenever there is Bluetooth activity in order for WLAN to be
able to periodically relinquish the antenna to be used for BT. This is because
BT can only access the shared antenna when WLAN is idle or asleep.

Some hardware, for instance the wl1271, is able to indicate to the host
whenever there is BT traffic. In essence, the hardware will send an indication
to the host whenever there is, for example, SCO traffic or A2DP traffic, and
will send another indication when the traffic is over.

The hardware gets information of Bluetooth traffic via hardware co-existence
control lines - these lines are used to negotiate the shared antenna
ownership. The hardware will give the antenna to BT whenever WLAN is sleeping.

This patch adds an interface to mac80211 for temporarily disabling
dynamic power save at the request of the WLAN driver. This interface will
immediately force WLAN to full powersave, hence allowing BT coexistence as
described above.

In these kinds of shared-antenna designs, when WLAN powersave is fully disabled,
Bluetooth will not work simultaneously with WLAN at all. This patch does not
address that problem. This interface will not change PSM state, so if PSM is
disabled it will remain so. Solving this problem requires knowledge about BT
state, and is best done in user-space.

Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-21 15:39:59 -04:00
Sjur Braendeland
2aa40aef9d caif: Use link layer MTU instead of fixed MTU
Previously CAIF supported a maximum transfer size of ~4050 bytes.
The transfer size is now calculated dynamically based on the
link layer's MTU.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20 19:46:06 -07:00
Sjur Braendeland
a7da1f55a8 caif: Bugfix - RFM must support segmentation.
The CAIF Remote File Manager may send or receive more than 4050 bytes.
Because of this, the CAIF RFM service has to support segmentation.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20 19:46:05 -07:00
Sjur Braendeland
b1c74247b9 caif: Bugfix not all services uses flow-ctrl.
Flow control is not used by all CAIF services.
The usage of flow control is now part of the general
initialization function for CAIF services.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20 19:46:05 -07:00
David S. Miller
bb9c03d8a6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-06-17 14:19:06 -07:00
Patrick McHardy
c68cd6cc21 netfilter: nf_nat: support user-specified SNAT rules in LOCAL_IN
2.6.34 introduced 'conntrack zones' to deal with cases where packets
from multiple identical networks are handled by conntrack/NAT. Packets
are looped through veth devices, during which they are NATed to private
addresses, after which they can continue normally through the stack
and possibly have NAT rules applied a second time.

This works well, but is needlessly complicated for cases where only
a single SNAT/DNAT mapping needs to be applied to these packets. In that
case, all that needs to be done is to assign each network to a separate
zone and perform NAT as usual. However, this doesn't work for packets
destined for the machine performing NAT itself, since it's currently not
possible to configure SNAT mappings for the LOCAL_IN chain.

This patch adds a new INPUT chain to the NAT table and changes the
targets performing SNAT to be usable in that chain.

Example usage with two identical networks (192.168.0.0/24) on eth0/eth1:

iptables -t raw -A PREROUTING -i eth0 -j CT --zone 1
iptables -t raw -A PREROUTING -i eth0 -j MARK --set-mark 1
iptables -t raw -A PREROUTING -i eth1 -j CT --zone 2
iptables -t raw -A PREROUTING -i eth1 -j MARK --set-mark 2

iptables -t nat -A INPUT       -m mark --mark 1 -j NETMAP --to 10.0.0.0/24
iptables -t nat -A POSTROUTING -m mark --mark 1 -j NETMAP --to 10.0.0.0/24
iptables -t nat -A INPUT       -m mark --mark 2 -j NETMAP --to 10.0.1.0/24
iptables -t nat -A POSTROUTING -m mark --mark 2 -j NETMAP --to 10.0.1.0/24

iptables -t raw -A PREROUTING -d 10.0.0.0/24 -j CT --zone 1
iptables -t raw -A OUTPUT     -d 10.0.0.0/24 -j CT --zone 1
iptables -t raw -A PREROUTING -d 10.0.1.0/24 -j CT --zone 2
iptables -t raw -A OUTPUT     -d 10.0.1.0/24 -j CT --zone 2

iptables -t nat -A PREROUTING -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A OUTPUT     -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A PREROUTING -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A OUTPUT     -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-17 06:12:26 +02:00
Eric W. Biederman
7361c36c52 af_unix: Allow credentials to work across user and pid namespaces.
In unix_skb_parms store pointers to struct pid and struct cred instead
of raw uid, gid, and pid values, then translate the credentials on
reception into values that are meaningful in the receiving process's
namespaces.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:58:16 -07:00
Eric W. Biederman
257b5358b3 scm: Capture the full credentials of the scm sender.
Start capturing not only the userspace pid, uid and gid values of the
sending process but also its struct pid and struct cred.

This is in preparation for properly supporting SCM_CREDENTIALS for
sockets that have different uid and/or pid namespaces at the different
ends.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:55:56 -07:00
Eric W. Biederman
109f6e39fa af_unix: Allow SO_PEERCRED to work across namespaces.
Use struct pid and struct cred to store the peer credentials on struct
sock.  This gives enough information to convert the peer credential
information to a value relative to whatever namespace the socket is in
at the time.

This removes nasty surprises when using SO_PEERCRED on socket
connections where the processes on either side are in different pid and
user namespaces.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:55:55 -07:00
Eric W. Biederman
812e876e84 scm: Reorder scm_cookie.
Reorder the fields in scm_cookie so they pack better on 64bit.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:55:34 -07:00
Florian Westphal
8c76368174 syncookies: check decoded options against sysctl settings
Discard the ACK if we find options that do not match current sysctl
settings.

Previously it was possible to create a connection with sack, wscale,
etc. enabled even if the feature was disabled via sysctl.

Also remove an unneeded call to tcp_sack_reset() in
cookie_check_timestamp: Both call sites (cookie_v4_check,
cookie_v6_check) zero "struct tcp_options_received", hand it to
tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack)
and then call cookie_check_timestamp().

Even if num_sacks/dsacks were changed, the structure is allocated on
the stack and after cookie_check_timestamp returns only a few selected
members are copied to the inet_request_sock.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:42:15 -07:00
Eric Dumazet
317fe0e6c5 inetpeer: restore small inet_peer structures
Addition of rcu_head to struct inet_peer added 16bytes on 64bit arches.

That's a bit unfortunate, since the old size was exactly 64 bytes.

This can be solved using a union between this rcu_head and four fields
that are normally used only when a refcount is taken on inet_peer.
rcu_head is used only when refcnt=-1, right before the structure is freed.

Add an inet_peer_refcheck() function to check this assertion for a while.

We can bring back SLAB_HWCACHE_ALIGN qualifier in kmem cache creation.
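
The space sharing described above can be sketched in user space as follows.
This is only a model: rcu_head_model stands in for struct rcu_head, the
32-bit field types are assumptions, the remaining inet_peer members are
omitted, and peer_fields_usable() is a hypothetical analogue of the refcheck
idea rather than the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rcu_head: two pointers, 16 bytes on 64-bit. */
struct rcu_head_model {
	struct rcu_head_model *next;
	void (*func)(struct rcu_head_model *head);
};

/* Before: rcu_head sits next to the four fields. */
struct peer_separate {
	int refcnt;
	uint32_t rid, ip_id_count, tcp_ts, tcp_ts_stamp;
	struct rcu_head_model rcu;
};

/* After: rcu_head overlays the four fields. */
struct peer_shared {
	int refcnt;
	union {
		struct {
			uint32_t rid, ip_id_count, tcp_ts, tcp_ts_stamp;
		};
		struct rcu_head_model rcu;	/* used only once refcnt == -1 */
	};
};

/* Bytes saved per object by overlaying the two. */
size_t peer_bytes_saved(void)
{
	return sizeof(struct peer_separate) - sizeof(struct peer_shared);
}

/* The overlaid fields may be touched only while a reference is held. */
int peer_fields_usable(const struct peer_shared *p)
{
	assert(p != NULL);
	return p->refcnt > 0;
}
```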

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 11:55:39 -07:00
Eric Dumazet
aa1039e73c inetpeer: RCU conversion
inetpeer currently uses an AVL tree protected by an rwlock.

It's possible to make most lookups use RCU:

1) Add a struct rcu_head to struct inet_peer

2) add a lookup_rcu_bh() helper to perform lockless and opportunistic
lookup. This is a normal function, not a macro like lookup().

3) Add a limit to number of links followed by lookup_rcu_bh(). This is
needed in case we fall in a loop.

4) add an smp_wmb() in link_to_pool() right before node insert.

5) make unlink_from_pool() use atomic_cmpxchg() to make sure it can take
last reference to an inet_peer, since lockless readers could increase
refcount, even while we hold peers.lock.

6) Delay struct inet_peer freeing after rcu grace period so that
lookup_rcu_bh() cannot crash.

7) inet_getpeer() first attempts lockless lookup.
   Note this lookup can fail even if the target is in the AVL tree, because a
concurrent writer can leave the tree in an inconsistent state.
   If this attempt fails, the lock is taken and a regular lookup is performed
again.

8) convert peers.lock from rwlock to a spinlock

9) Remove SLAB_HWCACHE_ALIGN when peer_cachep is created, because
rcu_head adds 16 bytes on 64bit arches, doubling effective size (64 ->
128 bytes).
In a future patch, it is probably possible to revert this part, if the rcu
field is put in a union to share space with rid, ip_id_count, tcp_ts &
tcp_ts_stamp, since these fields are manipulated only with refcnt > 0.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15 14:23:38 -07:00
David S. Miller
16fb62b6b4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2010-06-15 13:49:24 -07:00
Juuso Oikarinen
ff61638105 mac80211: Fix ps-qos network latency handling
The ps-qos latency handling is broken. It uses predetermined latency values
to select specific dynamic PS timeouts. With common AP configurations, these
values overlap with beacon interval and are therefore essentially useless
(for network latencies less than the beacon interval, PSM is disabled.)

This patch remedies the problem by replacing the predetermined network latency
values with one high value (1900ms) which is used to trigger full PSM. For
backwards compatibility, the value 2000ms is still mapped to a dynamic ps
timeout of 100ms.

Also, the mac80211-internal value for storing the dynamic PSM values configured
from user space currently lives, incorrectly, in the driver-visible
ieee80211_conf struct. Move it to the ieee80211_local struct.
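
The latency mapping described above might look roughly like the sketch
below. It works in milliseconds for readability. The function and macro names
are hypothetical, and two points are assumptions not stated in this message:
that a timeout of 0 stands for "go to full PSM", and that latencies at or
below the threshold keep the 100ms dynamic ps timeout.

```c
#include <assert.h>

#define PS_LATENCY_FULL_PSM_MS	1900	/* threshold named in the commit */
#define PS_LATENCY_COMPAT_MS	2000	/* legacy value kept for compatibility */

/*
 * Hypothetical mapping from the configured network latency (ms) to a
 * dynamic PS timeout (ms). A result of 0 is assumed to mean full PSM.
 */
int dynamic_ps_timeout_ms(int latency_ms)
{
	assert(latency_ms >= 0);
	/* High latency tolerance selects full PSM, except the legacy 2000ms. */
	if (latency_ms > PS_LATENCY_FULL_PSM_MS &&
	    latency_ms != PS_LATENCY_COMPAT_MS)
		return 0;
	return 100;
}
```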

Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-15 16:00:48 -04:00
Changli Gao
a3433f35a5 tcp: unify tcp flag macros
Unify tcp flag macros: TCPHDR_FIN, TCPHDR_SYN, TCPHDR_RST, TCPHDR_PSH,
TCPHDR_ACK, TCPHDR_URG, TCPHDR_ECE and TCPHDR_CWR. TCPCB_FLAG_* are replaced
with the corresponding TCPHDR_*.
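
For orientation, a small sketch of how such a unified set of flag macros can
be used. The values below are the standard bit positions of the flags byte in
the TCP header; this message does not list them, and the helper
tcp_flags_is_initial_syn() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits as they appear in the TCP header's flags byte. */
#define TCPHDR_FIN 0x01
#define TCPHDR_SYN 0x02
#define TCPHDR_RST 0x04
#define TCPHDR_PSH 0x08
#define TCPHDR_ACK 0x10
#define TCPHDR_URG 0x20
#define TCPHDR_ECE 0x40
#define TCPHDR_CWR 0x80

/* Together the eight flags cover the whole byte. */
static_assert((TCPHDR_FIN | TCPHDR_SYN | TCPHDR_RST | TCPHDR_PSH |
	       TCPHDR_ACK | TCPHDR_URG | TCPHDR_ECE | TCPHDR_CWR) == 0xff,
	      "TCP flag bits must cover one byte");

/* True for a SYN without ACK, as sent when opening a connection. */
int tcp_flags_is_initial_syn(uint8_t flags)
{
	return (flags & (TCPHDR_SYN | TCPHDR_ACK)) == TCPHDR_SYN;
}
```

Having one set of names lets the TCP stack and netfilter code test the same
bits without each keeping private definitions.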

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/tcp.h                      |   24 ++++++-------
 net/ipv4/tcp.c                         |    8 ++--
 net/ipv4/tcp_input.c                   |    2 -
 net/ipv4/tcp_output.c                  |   59 ++++++++++++++++-----------------
 net/netfilter/nf_conntrack_proto_tcp.c |   32 ++++++-----------
 net/netfilter/xt_TCPMSS.c              |    4 --
 6 files changed, 58 insertions(+), 71 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15 11:56:19 -07:00