Commit Graph

361566 Commits

Author SHA1 Message Date
Alex Williamson
d0f63acc2f igb: Fix null pointer dereference
The max_vfs= option has always been self limiting to the number of VFs
supported by the device.  fa44f2f1 added SR-IOV configuration via
sysfs, but in the process broke this self correction factor.  The
failing path is:

igb_probe
  igb_sw_init
    if (max_vfs > 7) {
        adapter->vfs_allocated_count = 7;
    ...
    igb_probe_vfs
    igb_enable_sriov(, max_vfs)
      if (num_vfs > 7) {
        err = -EPERM;
        ...

This leaves vfs_allocated_count = 7 and vf_data = NULL, so we bomb out
when igb_probe finally calls igb_reset.  It seems like a really bad
idea, and somewhat pointless, to set vfs_allocated_count separate from
vf_data, but limiting max_vfs is enough to avoid the null pointer.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-26 03:00:32 -07:00
Lior Levy
22c12752d1 igb: fix i350 anti spoofing config
Fix a problem in i350 where anti spoofing configuration was written into a
wrong register.

Signed-off-by: Lior Levy <lior.levy@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-26 02:47:50 -07:00
xunleer
a1f6c6b147 ixgbevf: don't release the soft entries
When the ixgbevf driver is opened the request to allocate MSIX irq
vectors may fail.  In that case the driver will call ixgbevf_down()
which will call ixgbevf_irq_disable() to clear the HW interrupt
registers and calls synchronize_irq() using the msix_entries pointer in
the adapter structure.  However, when the function to request the MSIX
irq vectors failed it had already freed the msix_entries which causes
an OOPs from using the NULL pointer in synchronize_irq().

The calls to pci_disable_msix() and to free the msix_entries memory
should not occur if device open fails.  Instead they should be called
during device driver removal to balance with the call to
pci_enable_msix() and the call to allocate msix_entries memory
during the device probe and driver load.

Signed-off-by: Li Xun <xunleer.li@huawei.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-26 02:31:48 -07:00
Hong Zhiguo
a79ca223e0 ipv6: fix bad free of addrconf_init_net
Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-25 13:55:38 -04:00
Paul Moore
ded34e0fe8 unix: fix a race condition in unix_release()
As reported by Jan, and others over the past few years, there is a
race condition caused by unix_release setting the sock->sk pointer
to NULL before properly marking the socket as dead/orphaned.  This
can cause a problem with the LSM hook security_unix_may_send() if
there is another socket attempting to write to this partially
released socket in between when sock->sk is set to NULL and it is
marked as dead/orphaned.  This patch fixes this by only setting
sock->sk to NULL after the socket has been marked as dead; I also
take the opportunity to make unix_release_sock() a void function
as it only ever returned 0/success.

Dave, I think this one should go on the -stable pile.

Special thanks to Jan for coming up with a reproducer for this
problem.

Reported-by: Jan Stancek <jan.stancek@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-25 13:11:48 -04:00
Yuchung Cheng
7ebe183c6d tcp: undo spurious timeout after SACK reneging
On SACK reneging the sender immediately retransmits and forces a
timeout but disables Eifel (undo). If the (buggy) receiver does not
drop any packet this can trigger a false slow-start retransmit storm
driven by the ACKs of the original packets. This can be detected with
undo and TCP timestamps.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24 17:27:28 -04:00
Kumar Amit Mehta
8fe7f99a9e bnx2x: fix assignment of signed expression to unsigned variable
fix for incorrect assignment of signed expression to unsigned variable.

Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24 17:27:28 -04:00
Hong zhi guo
9b46922e15 bridge: fix crash when set mac address of br interface
When I tried to set mac address of a bridge interface to a mac
address which already learned on this bridge, I got system hang.

The cause is straight forward: function br_fdb_change_mac_address
calls fdb_insert with NULL source nbp. Then an fdb lookup is
performed. If an fdb entry is found and it's local, it's OK. But
if it's not local, source is dereferenced for printk without NULL
check.

Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24 17:27:28 -04:00
Cong Wang
4a7df340ed 8021q: fix a potential use-after-free
vlan_vid_del() could possibly free ->vlan_info after a RCU grace
period, however, we may still refer to the freed memory area
by 'grp' pointer. Found by code inspection.

This patch moves vlan_vid_del() as behind as possible.

Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24 17:27:28 -04:00
Eric Dumazet
9979a55a83 net: remove a WARN_ON() in net_enable_timestamp()
The WARN_ON(in_interrupt()) in net_enable_timestamp() can get false
positive, in socket clone path, run from softirq context :

[ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80()
[ 3641.668811] Call Trace:
[ 3641.671254]  <IRQ>  [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0
[ 3641.677871]  [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20
[ 3641.683683]  [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80
[ 3641.689668]  [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450
[ 3641.695222]  [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170
[ 3641.701213]  [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820
[ 3641.707663]  [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670
[ 3641.713354]  [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0
[ 3641.719425]  [<ffffffff807af63a>] tcp_check_req+0x3da/0x530
[ 3641.724979]  [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80
[ 3641.730964]  [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0
[ 3641.736430]  [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0
[ 3641.741985]  [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0

Its safe at this point because the parent socket owns a reference
on the netstamp_needed, so we cant have a 0 -> 1 transition, which
requires to lock a mutex.

Instead of refining the check, lets remove it, as all known callers
are safe. If it ever changes in the future, static_key_slow_inc()
will complain anyway.

Reported-by: Laurent Chavey <chavey@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24 17:27:27 -04:00
Eric Dumazet
f4541d60a4 tcp: preserve ACK clocking in TSO
A long standing problem with TSO is the fact that tcp_tso_should_defer()
rearms the deferred timer, while it should not.

Current code leads to following bad bursty behavior :

20:11:24.484333 IP A > B: . 297161:316921(19760) ack 1 win 119
20:11:24.484337 IP B > A: . ack 263721 win 1117
20:11:24.485086 IP B > A: . ack 265241 win 1117
20:11:24.485925 IP B > A: . ack 266761 win 1117
20:11:24.486759 IP B > A: . ack 268281 win 1117
20:11:24.487594 IP B > A: . ack 269801 win 1117
20:11:24.488430 IP B > A: . ack 271321 win 1117
20:11:24.489267 IP B > A: . ack 272841 win 1117
20:11:24.490104 IP B > A: . ack 274361 win 1117
20:11:24.490939 IP B > A: . ack 275881 win 1117
20:11:24.491775 IP B > A: . ack 277401 win 1117
20:11:24.491784 IP A > B: . 316921:332881(15960) ack 1 win 119
20:11:24.492620 IP B > A: . ack 278921 win 1117
20:11:24.493448 IP B > A: . ack 280441 win 1117
20:11:24.494286 IP B > A: . ack 281961 win 1117
20:11:24.495122 IP B > A: . ack 283481 win 1117
20:11:24.495958 IP B > A: . ack 285001 win 1117
20:11:24.496791 IP B > A: . ack 286521 win 1117
20:11:24.497628 IP B > A: . ack 288041 win 1117
20:11:24.498459 IP B > A: . ack 289561 win 1117
20:11:24.499296 IP B > A: . ack 291081 win 1117
20:11:24.500133 IP B > A: . ack 292601 win 1117
20:11:24.500970 IP B > A: . ack 294121 win 1117
20:11:24.501388 IP B > A: . ack 295641 win 1117
20:11:24.501398 IP A > B: . 332881:351881(19000) ack 1 win 119

While the expected behavior is more like :

20:19:49.259620 IP A > B: . 197601:202161(4560) ack 1 win 119
20:19:49.260446 IP B > A: . ack 154281 win 1212
20:19:49.261282 IP B > A: . ack 155801 win 1212
20:19:49.262125 IP B > A: . ack 157321 win 1212
20:19:49.262136 IP A > B: . 202161:206721(4560) ack 1 win 119
20:19:49.262958 IP B > A: . ack 158841 win 1212
20:19:49.263795 IP B > A: . ack 160361 win 1212
20:19:49.264628 IP B > A: . ack 161881 win 1212
20:19:49.264637 IP A > B: . 206721:211281(4560) ack 1 win 119
20:19:49.265465 IP B > A: . ack 163401 win 1212
20:19:49.265886 IP B > A: . ack 164921 win 1212
20:19:49.266722 IP B > A: . ack 166441 win 1212
20:19:49.266732 IP A > B: . 211281:215841(4560) ack 1 win 119
20:19:49.267559 IP B > A: . ack 167961 win 1212
20:19:49.268394 IP B > A: . ack 169481 win 1212
20:19:49.269232 IP B > A: . ack 171001 win 1212
20:19:49.269241 IP A > B: . 215841:221161(5320) ack 1 win 119

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-22 10:34:03 -04:00
Andrey Vagin
ae5fc98728 net: fix *_DIAG_MAX constants
Follow the common pattern and define *_DIAG_MAX like:

        [...]
        __XXX_DIAG_MAX,
};

Because everyone is used to do:

        struct nlattr *attrs[XXX_DIAG_MAX+1];

        nla_parse([...], XXX_DIAG_MAX, [...]

Reported-by: Thomas Graf <tgraf@suug.ch>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 12:36:33 -04:00
David S. Miller
b37391e68c Merge branch 'mlx4'
Or Gerlitz says:

====================
Here's a batch of mlx4 driver fixes for 3.9, mostly SRIOV/Flow-steering
related. Series done against the net tree as of commit 5a3da1f
"inet: limit length of fragment queue hash table bucket lists
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 12:05:32 -04:00
Hadar Hen Zion
2c473ae7e5 net/mlx4_core: Disallow releasing VF QPs which have steering rules
VF QPs must not be released when they have steering rules attached to them.

For that end, introduce a reference count field to the QP object in the
SRIOV resource tracker which is incremented/decremented when steering rules
are attached/detached to it. QPs can be released by VF only when their
ref count is zero.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 12:05:08 -04:00
Hadar Hen Zion
1e3f7b324e net/mlx4_core: Always use 64 bit resource ID when doing lookup
One of the resource tracker code paths was wrongly using int and not u64
for resource tracking IDs, fix it.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 12:05:08 -04:00
Hadar Hen Zion
6efb5fac4d net/mlx4_en: Remove ethtool flow steering rules before releasing QPs
Fix the ethtool flow steering rules cleanup to be carried out before
releasing the RX QPs.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 12:05:08 -04:00
Hadar Hen Zion
80cb002116 net/mlx4_core: Fix wrong order of flow steering resources removal
On the resource tracker cleanup flow, the DMFS rules must be deleted before we
destroy the QPs, else the HW may attempt doing packet steering to non existent QPs.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 12:05:07 -04:00
Moshe Lazer
c101c81b52 net/mlx4_core: Fix wrong mask applied on EQ numbers in the wrapper
Currently the  mask is wrongly set in the MAP_EQ wrapper, fix that.
Without the fix any EQ number above 511 is mapped to one below 511.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 12:05:07 -04:00
Lothar Waßmann
ce16294fda net: ethernet: cpsw: fix erroneous condition in error check
The error check in cpsw_probe_dt() has an '&&' where an '||' is
meant to be. This causes a NULL pointer dereference when incomplet DT
data is passed to the driver ('phy_id' property for cpsw_emac1
missing).

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 11:57:42 -04:00
Wei Yongjun
cb0e51d806 lantiq_etop: use free_netdev(netdev) instead of kfree()
Freeing netdev without free_netdev() leads to net, tx leaks.
And it may lead to dereferencing freed pointer.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 11:50:10 -04:00
Masatake YAMATO
73214f5d9f thermal: shorten too long mcast group name
The original name is too long.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 17:56:58 -04:00
David S. Miller
f379fb991b Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
I present to you another batch of fixes intended for the 3.9 stream...

On the bluetooth bits, Gustavo says:

"I put together 3 fixes intended for 3.9, there are support for two
new devices and a NULL dereference fix in the SCO code."

Amitkumar Karwar fixes a command queueing race in mwifiex.

Bing Zhao provides a pair of mwifiex related to cleaning-up before
a shutdown.

Felix Fietkau provides an ath9k fix for a regression caused by an
earlier calibration fix, and another ath9k fix to avoid race conditions
that unnecessarily lead to chip resets.

Jussi Kivilinna prevents and skbuff leak in rtlwifi.

Stanislaw Gruszka corrects a length paramater for a DMA buffer mapping
operation in iwlegacy.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 15:19:32 -04:00
Fabio Estevam
9d73adf431 fec: Fix the build as module
Since commit ff43da86c6 (NET: FEC: dynamtic check DMA desc buff type) the
following build error happens when CONFIG_FEC=m

ERROR: "fec_ptp_init" [drivers/net/ethernet/freescale/fec.ko] undefined!
ERROR: "fec_ptp_ioctl" [drivers/net/ethernet/freescale/fec.ko] undefined!
ERROR: "fec_ptp_start_cyclecounter" [drivers/net/ethernet/freescale/fec.ko] undefined!

Fix it by exporting the required fec_ptp symbols.

Reported-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 14:45:30 -04:00
John W. Linville
b9d5319041 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-03-20 14:26:37 -04:00
Fabio Estevam
da2191e314 net: fec: Define indexes as 'unsigned int'
Fix the following warnings that happen when building with W=1 option:

drivers/net/ethernet/freescale/fec.c: In function 'fec_enet_free_buffers':
drivers/net/ethernet/freescale/fec.c:1337:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
drivers/net/ethernet/freescale/fec.c: In function 'fec_enet_alloc_buffers':
drivers/net/ethernet/freescale/fec.c:1361:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
drivers/net/ethernet/freescale/fec.c: In function 'fec_enet_init':
drivers/net/ethernet/freescale/fec.c:1631:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:28:59 -04:00
Kees Cook
896ee0eee6 net/irda: add missing error path release_sock call
This makes sure that release_sock is called for all error conditions in
irda_getsockopt.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Brad Spengler <spender@grsecurity.net>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:23:13 -04:00
Wei Yongjun
fa90b077d7 lpc_eth: fix error return code in lpc_eth_drv_probe()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:19:15 -04:00
Sergei Shtylyov
fc0c090040 sh_eth: check TSU registers ioremap() error
One must check the result of ioremap() -- in this case it prevents potential
kernel oops when initializing TSU registers further on...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:17:59 -04:00
Sergei Shtylyov
0582b7d15f sh_eth: fix bitbang memory leak
sh_mdio_init() allocates pointer to 'struct bb_info' but only stores it locally,
so that sh_mdio_release() can't free it on driver unload.  Add the pointer to
'struct bb_info' to 'struct sh_eth_private', so that sh_mdio_init() can save
'bitbang' variable for sh_mdio_release() to be able to free it later...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:17:59 -04:00
Martin Fuzzey
283951f95b ipconfig: Fix newline handling in log message.
When using ipconfig the logs currently look like:

Single name server:
[    3.467270] IP-Config: Complete:
[    3.470613]      device=eth0, hwaddr=ac🇩🇪48:00:00:01, ipaddr=172.16.42.2, mask=255.255.255.0, gw=172.16.42.1
[    3.480670]      host=infigo-1, domain=, nis-domain=(none)
[    3.486166]      bootserver=172.16.42.1, rootserver=172.16.42.1, rootpath=
[    3.492910]      nameserver0=172.16.42.1[    3.496853] ALSA device list:

Three name servers:
[    3.496949] IP-Config: Complete:
[    3.500293]      device=eth0, hwaddr=ac🇩🇪48:00:00:01, ipaddr=172.16.42.2, mask=255.255.255.0, gw=172.16.42.1
[    3.510367]      host=infigo-1, domain=, nis-domain=(none)
[    3.515864]      bootserver=172.16.42.1, rootserver=172.16.42.1, rootpath=
[    3.522635]      nameserver0=172.16.42.1, nameserver1=172.16.42.100
[    3.529149] , nameserver2=172.16.42.200

Fix newline handling for these cases

Signed-off-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:15:58 -04:00
Daniel Borkmann
8ed781668d flow_keys: include thoff into flow_keys for later usage
In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're
doing the work here anyway, also store thoff for a later usage, e.g. in
the BPF filter.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:14:36 -04:00
David S. Miller
e0ebcb80cb Merge branch 'l2tp'
Tom Parkin says:

====================
This l2tp bugfix patchset addresses a number of issues.

The first five patches in the series prevent l2tp sessions pinning an l2tp
tunnel open.  This occurs because the l2tp tunnel is torn down in the tunnel
socket destructor, but each session holds a tunnel socket reference which
prevents tunnels with sessions being deleted.  The solution I've implemented
here involves adding a .destroy hook to udp code, as discussed previously on
netdev[1].

The subsequent seven patches address futher bugs exposed by fixing the problem
above, or exposed through stress testing the implementation above.  Patch 11
(avoid deadlock in l2tp stats update) isn't directly related to tunnel/session
lifetimes, but it does prevent deadlocks on i386 kernels running on 64 bit
hardware.

This patchset has been tested on 32 and 64 bit preempt/non-preempt kernels,
using iproute2, openl2tp, and custom-made stress test code.

[1] http://comments.gmane.org/gmane.linux.network/259169
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:46 -04:00
Tom Parkin
f6e16b299b l2tp: unhash l2tp sessions on delete, not on free
If we postpone unhashing of l2tp sessions until the structure is freed, we
risk:

 1. further packets arriving and getting queued while the pseudowire is being
    closed down
 2. the recv path hitting "scheduling while atomic" errors in the case that
    recv drops the last reference to a session and calls l2tp_session_free
    while in atomic context

As such, l2tp sessions should be unhashed from l2tp_core data structures early
in the teardown process prior to calling pseudowire close.  For pseudowires
like l2tp_ppp which have multiple shutdown codepaths, provide an unhash hook.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:39 -04:00
Tom Parkin
7b7c0719cd l2tp: avoid deadlock in l2tp stats update
l2tp's u64_stats writers were incorrectly synchronised, making it possible to
deadlock a 64bit machine running a 32bit kernel simply by sending the l2tp
code netlink commands while passing data through l2tp sessions.

Previous discussion on netdev determined that alternative solutions such as
spinlock writer synchronisation or per-cpu data would bring unjustified
overhead, given that most users interested in high volume traffic will likely
be running 64bit kernels on 64bit hardware.

As such, this patch replaces l2tp's use of u64_stats with atomic_long_t,
thereby avoiding the deadlock.

Ref:
http://marc.info/?l=linux-netdev&m=134029167910731&w=2
http://marc.info/?l=linux-netdev&m=134079868111131&w=2

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:39 -04:00
Tom Parkin
cf2f5c886a l2tp: push all ppp pseudowire shutdown through .release handler
If userspace deletes a ppp pseudowire using the netlink API, either by
directly deleting the session or by deleting the tunnel that contains the
session, we need to tear down the corresponding pppox channel.

Rather than trying to manage two pppox unbind codepaths, switch the netlink
and l2tp_core session_close handlers to close via. the l2tp_ppp socket
.release handler.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:39 -04:00
Tom Parkin
4c6e2fd354 l2tp: purge session reorder queue on delete
Add calls to l2tp_session_queue_purge as a part of l2tp_tunnel_closeall
and l2tp_session_delete.  Pseudowire implementations which are deleted only
via. l2tp_core l2tp_session_delete calls can dispense with their own code for
flushing the reorder queue.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:39 -04:00
Tom Parkin
48f72f92b3 l2tp: add session reorder queue purge function to core
If an l2tp session is deleted, it is necessary to delete skbs in-flight
on the session's reorder queue before taking it down.

Rather than having each pseudowire implementation reaching into the
l2tp_session struct to handle this itself, provide a function in l2tp_core to
purge the session queue.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:39 -04:00
Tom Parkin
02d13ed5f9 l2tp: don't BUG_ON sk_socket being NULL
It is valid for an existing struct sock object to have a NULL sk_socket
pointer, so don't BUG_ON in l2tp_tunnel_del_work if that should occur.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:39 -04:00
Tom Parkin
8abbbe8ff5 l2tp: take a reference for kernel sockets in l2tp_tunnel_sock_lookup
When looking up the tunnel socket in struct l2tp_tunnel, hold a reference
whether the socket was created by the kernel or by userspace.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Tom Parkin
2b551c6e7d l2tp: close sessions before initiating tunnel delete
When a user deletes a tunnel using netlink, all the sessions in the tunnel
should also be deleted.  Since running sessions will pin the tunnel socket
with the references they hold, have the l2tp_tunnel_delete close all sessions
in a tunnel before finally closing the tunnel socket.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Tom Parkin
936063175a l2tp: close sessions in ip socket destroy callback
l2tp_core hooks UDP's .destroy handler to gain advance warning of a tunnel
socket being closed from userspace.  We need to do the same thing for
IP-encapsulation sockets.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Tom Parkin
e34f4c7050 l2tp: export l2tp_tunnel_closeall
l2tp_core internally uses l2tp_tunnel_closeall to close all sessions in a
tunnel when a UDP-encapsulation socket is destroyed.  We need to do something
similar for IP-encapsulation sockets.

Export l2tp_tunnel_closeall as a GPL symbol to enable l2tp_ip and l2tp_ip6 to
call it from their .destroy handlers.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Tom Parkin
9980d001ce l2tp: add udp encap socket destroy handler
L2TP sessions hold a reference to the tunnel socket to prevent it going away
while sessions are still active.  However, since tunnel destruction is handled
by the sock sk_destruct callback there is a catch-22: a tunnel with sessions
cannot be deleted since each session holds a reference to the tunnel socket.
If userspace closes a managed tunnel socket, or dies, the tunnel will persist
and it will be neccessary to individually delete the sessions using netlink
commands.  This is ugly.

To prevent this occuring, this patch leverages the udp encapsulation socket
destroy callback to gain early notification when the tunnel socket is closed.
This allows us to safely close the sessions running in the tunnel, dropping
the tunnel socket references in the process.  The tunnel socket is then
destroyed as normal, and the tunnel resources deallocated in sk_destruct.

While we're at it, ensure that l2tp_tunnel_closeall correctly drops session
references to allow the sessions to be deleted rather than leaking.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Tom Parkin
44046a593e udp: add encap_destroy callback
Users of udp encapsulation currently have an encap_rcv callback which they can
use to hook into the udp receive path.

In situations where a encapsulation user allocates resources associated with a
udp encap socket, it may be convenient to be able to also hook the proto
.destroy operation.  For example, if an encap user holds a reference to the
udp socket, the destroy hook might be used to relinquish this reference.

This patch adds a socket destroy hook into udp, which is set and enabled
in the same way as the existing encap_rcv hook.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Masatake YAMATO
f1e79e2080 genetlink: trigger BUG_ON if a group name is too long
Trigger BUG_ON if a group name is longer than GENL_NAMSIZ.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:05:51 -04:00
David S. Miller
90b2621fd4 Merge branch 'master' of git://1984.lsi.us.es/nf
Pablo Neira Ayuso says:

====================
The following patchset contains 7 Netfilter/IPVS fixes for 3.9-rc, they are:

* Restrict IPv6 stateless NPT targets to the mangle table. Many users are
  complaining that this target does not work in the nat table, which is the
  wrong table for it, from Florian Westphal.

* Fix possible use before initialization in the netns init path of several
  conntrack protocol trackers (introduced recently while improving conntrack
  netns support), from Gao Feng.

* Fix incorrect initialization of copy_range in nfnetlink_queue, spotted
  by Eric Dumazet during the NFWS2013, patch from myself.

* Fix wrong calculation of next SCTP chunk in IPVS, from Julian Anastasov.

* Remove rcu_read_lock section in IPVS while calling ipv4_update_pmtu
  not required anymore after change introduced in 3.7, again from Julian.

* Fix SYN looping in IPVS state sync if the backup is used a real server
  in DR/TUN modes, this required a new /proc entry to disable the director
  function when acting as backup, also from Julian.

* Remove leftover IP_NF_QUEUE Kconfig after ip_queue removal, noted by
  Paul Bolle.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 10:23:52 -04:00
Paul Bolle
3dd6664fac netfilter: remove unused "config IP_NF_QUEUE"
Kconfig symbol IP_NF_QUEUE is unused since commit
d16cf20e2f ("netfilter: remove ip_queue
support"). Let's remove it too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-03-20 00:11:43 +01:00
Linus Torvalds
10b38669d6 - Fix for a potential infinite loop which was introduced in 4d559a3bcb
- Fix for the return type of xfs_iomap_eof_prealloc_initial_size
   from a1e16c2666
 - Fix for a failed buffer readahead causing subsequent callers to
   fail incorrectly
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRSOIAAAoJENaLyazVq6ZODqQP/2m1iZVIA9CXFf5hS2QZgkc2
 MHq+QaQ1aaZlAIRCnZO4XrWoLw4tH7AmsHA7dVJVz/ZhVrJg4ahfdSS6qR5EGWFb
 I5uE8LD8ZhpIiW6mBytJ7g9ST6xnaeean2sMwa0BcVK3uF84nO/uBopntZVrVlZE
 sMuklZe8GfxDpF6SBxVGG+5+OeLXzFmf+s+xoCYN410uuzYoT8/jveFP6a5ARcmH
 xEcOJA2+3o2z4/fsdx/Euf6LnDMSyOsAFUJCtnmBdKUA5w9DrJJqGpDDPEkg9h6d
 /DTPYXEWx6+w4xoMnIf09oEdCSamBVTWcRFXtftN03VNrbRNtyVwAc8HUaSNmt0p
 I3P/b5NJ5guH7uK72jp61N2RP7D5KOqwkwR58Y1SJWuwcgatYuB3NM5UeUyJBILj
 ViZ4DsKGE6BCl8T3hwkN+mxSxB+o7O8AypjWdEviBXbVIG9CwOxr1IEatl3eyV5T
 8QsNFb0LJcWzl1+F/uUYe1Goeqxvzupt7omUaRONdMnac3uFIk0ARrdxXFgawIJ9
 lgeftBCmMkqqLZUACSfmfCYNwyupz3E6bYB7Azwx01qg7CzTPUfIL2SxqDYp2dup
 /s+R7HL4HOJ0FCzjCZxHHO/1jsWgu265dJdpaQw/UcIe2IuEFGr558deHEM62bDW
 rWCVHj5eY5NRGyzSwzqB
 =41Vk
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfs

Pull XFS fixes from Ben Myers:

 - Fix for a potential infinite loop which was introduced in commit
   4d559a3bcb ("xfs: limit speculative prealloc near ENOSPC
   thresholds")

 - Fix for the return type of xfs_iomap_eof_prealloc_initial_size from
   commit a1e16c2666 ("xfs: limit speculative prealloc size on sparse
   files")

 - Fix for a failed buffer readahead causing subsequent callers to fail
   incorrectly

* tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfs:
  xfs: ensure we capture IO errors correctly
  xfs: fix xfs_iomap_eof_prealloc_initial_size type
  xfs: fix potential infinite loop in xfs_iomap_prealloc_size()
2013-03-19 15:17:40 -07:00
Matthew Garrett
547b524636 PCI: Use ROM images from firmware only if no other ROM source available
Mantas Mikulėnas reported that his graphics hardware failed to
initialise after commit f9a37be0f0 ("x86: Use PCI setup data").

The aim of this commit was to ensure that ROM images were available on
some Apple systems that don't expose the GPU ROM via any other source.
In this case, UEFI appears to have provided a broken ROM image that we
were using even though there was a perfectly valid ROM available via
other sources.  The simplest way to handle this seems to be to just
re-order pci_map_rom() and leave any firmare-supplied ROM to last.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Tested-by: Mantas Mikulėnas <grawity@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-19 14:51:14 -07:00
Linus Torvalds
5c7c3361d1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Just some minor fixups, a sunsu console setup panic cure, and
  recognition of a Fujitsu sun4v cpu."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: remove unused "config BITS"
  sparc: delete "if !ULTRA_HAS_POPULATION_COUNT"
  sparc64: correctly recognize SPARC64-X chips
  sparc,leon: fix GRPCI2 device0 PCI config space access
  sunsu: Fix panic in case of nonexistent port at "console=ttySY" cmdline option
2013-03-19 14:47:11 -07:00