It was possible to hit a kernel panic on NULL pointer dereference in
dev_queue_xmit() when sending power save buffered frames to a STA that
woke up from sleep. This happened when the buffered frame was requeued
for transmission in ap_sta_ps_end(). In order to avoid the panic, copy
the skb->dev and skb->iif values from the first fragment to all other
fragments.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
All 802.11n PCI devices (Cardbus, PCI, mini-PCI) require
serialization of IO when on non-uniprocessor systems. PCI
Express devices do not require this.
This should fix our last remaining open ath9k kernel.org
bugzilla bug report:
http://bugzilla.kernel.org/show_bug.cgi?id=12110
A backport to older kernels is probably required and I can work on
that.
Cc: stable@kernel.org
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When they were part of the now defunct ieee80211 component, these
messages were only visible when special debugging settings were enabled.
Let's mirror that with a new lib80211 debugging Kconfig option.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
As my netpoll fix for net doesn't really work for net-next, we
need this update to move the checks into the right place. As it
stands we may pass freed skbs to netpoll_receive_skb.
This patch also introduces a netpoll_rx_on function to avoid GRO
completely if we're invoked through netpoll. This might seem
paranoid but as netpoll may have an external receive hook it's
better to be safe than sorry. I don't think we need this for
2.6.29 though since there's nothing immediately broken by it.
This patch also moves the GRO_* return values to netdevice.h since
VLAN needs them too (I tried to avoid this originally but alas
this seems to be the easiest way out). This fixes a bug in VLAN
where it continued to use the old return value 2 instead of the
correct GRO_DROP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
NEXTHDR_NONE doesn't have an IPv6 option header, so the first check
for the length will always fail and result in a confusing "too short"
message if debugging is enabled. With this patch, we check for
NEXTHDR_NONE before the length sanity checks are done.
Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
We currently use the negative value in the conntrack code to encode
the packet verdict in the error. As NF_DROP is equal to 0, inverting
NF_DROP makes no sense and, as a result, no packets are ever dropped.
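The encoding problem can be shown outside the kernel. The following is a minimal sketch, not the conntrack code itself: NF_DROP and NF_ACCEPT carry their real values, while handler_negated_verdict(), caller_strict() and caller_inclusive() are hypothetical names used only for illustration.

```c
/* Verdict values as defined in <linux/netfilter.h>. */
#define NF_DROP   0
#define NF_ACCEPT 1

/*
 * Hypothetical protocol handler: returns a positive verdict on success,
 * or a negated verdict to say "error, apply this verdict".
 */
static int handler_negated_verdict(int want_drop)
{
	return want_drop ? -NF_DROP : NF_ACCEPT;
}

/* Hypothetical caller that treats only strictly negative values as errors. */
static int caller_strict(int ret)
{
	if (ret < 0)
		return -ret;	/* decode the verdict */
	return NF_ACCEPT;	/* -NF_DROP == 0 ends up here */
}

/* Hypothetical caller that also treats 0 as an encoded verdict. */
static int caller_inclusive(int ret)
{
	if (ret <= 0)
		return -ret;	/* -0 == NF_DROP */
	return NF_ACCEPT;
}
```

With the strict check a requested drop comes back as NF_ACCEPT, because -NF_DROP is indistinguishable from 0. Either the check has to include 0 or the verdict must not be negated.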
Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes a possible crash due to the missing initialization
of the expectation class when nf_ct_expect_related() is called.
Reported-by: BORBELY Zoltan <bozo@andrews.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch skips the delivery of conntrack events if the packet
was dropped due to a race condition in the conntrack insertion.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
It's not too likely to happen, would basically require crafted
packets (must hit the max guard in tcp_bound_to_half_wnd()).
It seems that nothing that bad would happen as there's tcp_mems
and congestion window that prevent runaway at some point from
hurting all too much (I'm not that sure what all those zero
sized segments we would generate do though in write queue).
Preventing it regardless is certainly the best way to go.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The result is very unlikely to change often, so we hardly
need to divide again after doing that once for a
connection. Yet, if a divide still becomes necessary, we
detect that, do the right thing, and settle again for the
non-divide state. This takes the u16 space which was previously
taken by the plain xmit_size_goal.
This should take care of part of the tso vs non-tso difference
we found earlier.
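The caching idea can be sketched in user space roughly as follows. This is an illustration under assumptions, not the patch itself: struct goal_cache, size_goal() and the divided flag are hypothetical, and mss is assumed to be non-zero.

```c
#include <stdint.h>

/* Hypothetical per-connection cache: goal expressed in whole segments. */
struct goal_cache {
	uint16_t segs;
};

/*
 * Return the size goal rounded down to a multiple of mss.
 * The division is redone only when the cached segment count no longer
 * fits the current goal/mss pair; *divided reports whether it was needed.
 */
static uint32_t size_goal(struct goal_cache *c, uint32_t goal, uint32_t mss,
			  int *divided)
{
	uint32_t cached = (uint32_t)c->segs * mss;

	/* Cached value is still the largest multiple of mss <= goal. */
	if (cached <= goal && cached + mss > goal) {
		*divided = 0;
		return cached;
	}

	c->segs = (uint16_t)(goal / mss);
	*divided = 1;
	return (uint32_t)c->segs * mss;
}
```

The first call for a connection divides once; later calls with the same goal and mss only multiply and compare.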
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's very little need for most of the callsites to get
tp->xmit_goal_size updated. That will cost us divide as is,
so slice the function in two. Also, the only users of the
tp->xmit_goal_size are directly behind tcp_current_mss(),
so there's no need to store that variable into tcp_sock
at all! The drop of xmit_goal_size currently leaves 16-bit
hole and some reorganization would again be necessary to
change that (but I'm aiming to fill that hole with u16
xmit_goal_size_segs to cache the results of the remaining
divide to get that tso on regression).
Bring xmit_goal_size parts into tcp.c
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
It seems that no variables clash such that we couldn't do
the check just once later on. Therefore move it.
Also kill dead obvious comment, dead argument and add
unlikely since this mtu probe does not happen too often.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wow, it was quite tricky to merge that stream of negations
but I think I finally got it right:
check & replace_ts_recent:
(s32)(rcv_tsval - ts_recent) >= 0 => 0
(s32)(ts_recent - rcv_tsval) <= 0 => 0
discard:
(s32)(ts_recent - rcv_tsval) > TCP_PAWS_WINDOW => 1
(s32)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW => 0
I toggled the return values of tcp_paws_check around since
the old encoding added yet-another negation making tracking
of truth-values really complicated.
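For reference, the wraparound-safe timestamp comparison from the table above can be tried in isolation. This sketch covers only that comparison and none of the other conditions of the real check; paws_ts_acceptable() is a hypothetical name, and TCP_PAWS_WINDOW is given the value 1 that the kernel headers use.

```c
#include <stdint.h>

#define TCP_PAWS_WINDOW 1

/*
 * Returns 1 when the received timestamp is acceptable and 0 when the
 * segment would be discarded. The unsigned subtraction followed by a
 * cast to a signed type keeps the comparison correct across 32-bit
 * timestamp wraparound.
 */
static int paws_ts_acceptable(uint32_t ts_recent, uint32_t rcv_tsval)
{
	return (int32_t)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW;
}
```

A received timestamp that has wrapped past zero still compares as newer than a ts_recent just below the wrap point.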
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
I've already forgotten what this was necessary for; anyway,
it's no longer used (if it ever was).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the pure assignment case, the earlier zeroing is
still in effect.
David S. Miller raised concerns if the ifs are there to avoid
dirtying cachelines. I came to these conclusions:
> We'll be dirty it anyway (now that I check), the first "real" statement
> in tcp_rcv_established is:
>
> tp->rx_opt.saw_tstamp = 0;
>
> ...that'll land on the same dword. :-/
>
> I suppose the blocks are there just because they had more complexity
> inside when they had to calculate the eff_sacks too (maybe it would
> have been better to just remove them in that drop-patch so you would
> have had less head-ache :-)).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
It fails on the following systems:
- RTL8169sc/8110sc (XID 18000000)
reported by Tim Durack <tdurack@gmail.com> (x86)
- RTL8169sb/8110sb (XID 10000000)
reported by Mikael Pettersson <mikpe@it.uu.se> (ARM)
The patch appeared to work on x86 for the following systems:
RTL8169sb/8110sb 10000000 PCI (EXT)
RTL8110s 04000000 PCI (EXT)
RTL8102e 24a00000 PCI-E (LOM)
RTL8168c/8111c 3c2000c0 PCI-E (LOM)
RTL8168b/8111b 38000000 PCI-E (LOM)
RTL8168b/8111b 38000000 PCI-E (EXT)
The patch exposes two problems:
1) while not completely wrong, MAC addresses are not read correctly
from the EEPROM
2) the MAC address registers are not correctly set
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
It shortens the code and fixes the current pci_unmap leak with
padded skb reported by Dave Jones.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While looking for a possible cause of the bugzilla report on an HTB oops:
http://bugzilla.kernel.org/show_bug.cgi?id=12858
I found the code in htb_delete calling htb_destroy_class on zero
refcount is very misleading: it can suggest this is a common path, and
destroy is called under sch_tree_lock. Actually, this can never happen
like this because before deletion cops->get() is done, and after
delete a class is still used by tclass_notify. The class destroy is
always called from cops->put(), so without sch_tree_lock.
This doesn't mean much now (since 2.6.27) because all vulnerable calls
were moved from htb_destroy_class to htb_delete, but there was a bug
in older kernels. The same change is done for other classful scheds,
which, it seems, didn't have similar locking problems here.
Reported-by: m0sia <m0sia@m0sia.ru>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On x86_64, it's rather unfortunate that the "wait_queue_head_t wait"
field of "struct socket" spans two cache lines (assuming a 64-byte
cache line in current CPUs):
offsetof(struct socket, wait)=0x30
sizeof(wait_queue_head_t)=0x18
This might explain why Kenny Chang noticed that his multicast workload
was performing badly with 64-bit kernels, since more cache line ping-pongs
were involved.
This little patch moves the "wait" field next to "fasync_list" so that both
fields share a single cache line, to speed up sock_def_readable().
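The arithmetic behind the numbers above can be checked with a small helper. This is only a sketch; spans_cache_lines() is a hypothetical name, and the 64-byte line size is the assumption already stated in the text.

```c
#include <stddef.h>

/*
 * Returns 1 if the byte range [offset, offset + size) touches more than
 * one cache line of the given size, 0 otherwise. size must be non-zero.
 */
static int spans_cache_lines(size_t offset, size_t size, size_t line)
{
	return offset / line != (offset + size - 1) / line;
}
```

With offset 0x30 and size 0x18 the field ends at byte 0x47, which is past the 0x40 boundary, so spans_cache_lines(0x30, 0x18, 64) reports a split.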
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The L0s workaround should be moved into a pci quirk and so it is not
necessary in the driver. This update removes the L0s workaround from the
igb driver.
This was the second half of the PCI quirk patch that Matthew Wilcox did
not pick up when he picked up the quirk patch.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To mark all features and bugfixes submitted since 4.0.11.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables the load balancing capability of firmware
and hardware to spray traffic into different cpus through
separate rx msix interrupts.
The feature is being enabled for NX3031, NX2031 (old) will be
enabled later. This depends on msi-x and compatibility with
msi and legacy is maintained by enabling single rx ring.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o remove max_ prefix from ring sizes, since they don't really
represent max possible sizes.
o cleanup naming of rx ring types (normal, jumbo, lro).
o simplify logic to choose rx ring size, gig ports get half
rx ring of 10 gig ports.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Detach network interface on PCI suspend and recreate hardware
context after resume.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Documentation for the ixgbe driver in the kernel docs area is missing.
This adds that documentation.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanup a bit of whitespace, add some function header comments, and fix a
few comments around the driver.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Tx DMA unit should be disabled when bringing the device down. Also,
the KX4 device with 82599 supports WoL, so we should clear the Wake Up
Status (WUS) after a PCIe slot reset.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are possible times that a driver may fail to completely initialize,
due to a buggy platform or a buggy kernel. In those cases, we'd rather
fail gracefully instead of a panic. Add a few safety checks to some
critical paths to try and prevent a panic in these corner-case situations.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This cleans up the following pieces of the Rx initialization path:
- Enable the ECC memory fault interrupt in OTHER causes.
- Fix an 82598 initialization of RDRXCTL when depending on RSS and VMDq to
be enabled. We don't need these features enabled to safely set the MVMEN
bit to allow multiple SRRCTL register mappings into the RXDCTL registers.
- Fix the RSS initialization path to not stomp on DCB accidentally. When
configuring the MRQC (multiple Rx queue control) register, we want to make
sure we only OR in features as necessary, instead of full assignment.
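The difference between assigning and ORing a feature into a shared register value can be sketched as below. The bit names and helper functions are hypothetical and do not reflect the real MRQC layout.

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_DCB (1u << 0)
#define FEAT_RSS (1u << 1)

/* Full assignment: whatever was configured before is lost. */
static uint32_t enable_rss_assign(uint32_t reg)
{
	(void)reg;
	return FEAT_RSS;
}

/* OR in the feature: previously configured bits are preserved. */
static uint32_t enable_rss_or(uint32_t reg)
{
	return reg | FEAT_RSS;
}
```

Starting from a value that already has FEAT_DCB set, only the OR variant leaves that bit in place.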
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Tx accounting when cleaning during NAPI was not completely correct.
We should use the work_limit to determine when to finish cleaning, and
use the same to return the cleaned status. Without this, the NAPI clean
for this Tx can get stuck in a scheduling loop, which
can result in Tx not getting cleaned, ending with a Tx hang and device
reset.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Occasionally if the driver was loaded in a system that
didn't support MSI-X or MSI and was on a shared interrupt,
the driver would then panic in NAPI on the first shared
interrupt because we hadn't called napi_add yet.
Solution: call napi_add before calling request_irq
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The interrupt models using EITR have changed in 82599. The way the register
is laid out, the change is transparent to some of the existing code.
However, some of it isn't. This patch fixes all the cases where EITR
handling is different than 82598.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
82599 mistakenly enabled drop on Rx queues in the packet buffer. The
default mode should be store-and-forward from the FIFO.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rx_no_dma_resources counter reported by ethtool -S ethX is not
counting correctly. In 82599, the queue mappings for the counters need
to be mapped properly, and accounted for properly.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A purely cosmetic change. Report which physical layer is present, instead
of PHY unknown. 82599 added new PHY types for the SFP+ devices, and the
reporting code was not updated for them.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for 82576 copper adapter and necessary code to restrict wol for
quad port adapter to first port.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding device id to support 82576NS dual port copper
NIC.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects a typo that was doing a less than comparison instead of
a left shift due to the fact that I didn't get enough <'s in there.
This resolves an issue in which vlans were not functioning correctly.
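The effect of dropping one '<' is easy to see in isolation. The helpers below are hypothetical and are not the driver's actual expression; they only contrast a comparison with a shift.

```c
#include <stdint.h>

/* Typo: "1 < n" is a comparison, so the result is only ever 0 or 1. */
static uint32_t vlan_bit_typo(unsigned int vid)
{
	return 1 < (vid & 0x1F);
}

/* Intended: "1 << n" selects a single bit inside a 32-bit word. */
static uint32_t vlan_bit_fixed(unsigned int vid)
{
	return 1u << (vid & 0x1F);
}
```

For vid 5 the typo yields 1 while the shift yields 32, so a filter table written with the typo would have the wrong bit set for almost every VLAN id.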
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add PF to pool if adding a VLVF register value and the VFTA bit is
already set.
This patch addresses the unlikely situation that the PF adds a vlan
entry when the vlvf is full, and a vf later adds the vlan to the vlvf.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to support wol on the second port for situations such as when the
lan ports are on the motherboard itself.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If DCA is undefined then the adapter struct becomes unnecessary. To
resolve this issue the DCA calls can simply make a call to the adapter
struct through the rx_ring adapter struct member.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netif_running check in igb poll is a hold over from the use of fake
netdevs to use multiple queues with NAPI prior to 2.6.24. It is no longer
necessary to have the call there and it currently can cause errors if
work_done == budget.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the new DCA API, the driver should use dca3_get_tag() instead of
the obsolete dca_get_tag().
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove flash size check which made sense only for ancient
boards with 1MB flash. The check is based on values read
from specific locations and fails with firmware size changes.
This prevents the driver from getting the right MAC addresses.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I found the PPP subsystem to not work properly when connecting channels
with different speeds to the same bundle.
Problem Description:
As the "ppp_mp_explode" function fragments the sk_buff buffer evenly
among the PPP channels that are connected to a certain PPP unit to
make up a bundle, if we are transmitting using an upper layer protocol
that requires an Ack before sending the next packet (like TCP/IP for
example), we will have a bandwidth bottleneck on the slowest channel
of the bundle.
Let's clarify by an example. Let's consider a scenario where we have
two PPP links making up a bundle: a slow link (10KB/sec) and a fast
link (1000KB/sec) working at the best (full bandwidth). On the top we
have a TCP/IP stack sending a 1000 Bytes sk_buff buffer down to the
PPP subsystem. The "ppp_mp_explode" function will divide the buffer in
two fragments of 500B each (we are neglecting all the headers, crc,
flags, etc.). Before the TCP/IP stack sends out the next buffer, it
will have to wait for the ACK response from the remote peer, so it
will have to wait for both fragments to have been sent over the two
PPP links, received by the remote peer and reconstructed. The
resulting behaviour is that, rather than having a bundle working
@1010KB/sec (the sum of the channels' bandwidths), we'll have a bundle
working @20KB/sec (double the slowest channel's bandwidth).
Problem Solution:
The problem has been solved by redesigning the "ppp_mp_explode"
function in such a way to make it split the sk_buff buffer according
to the speeds of the underlying PPP channels (the speeds of the serial
interfaces respectively attached to the PPP channels). Referring to
the above example, the redesigned "ppp_mp_explode" function will now
divide the 1000 Bytes buffer into two fragments whose sizes are set
according to the speeds of the channels where they are going to be
sent on (e.g. 10 Bytes on the 10KB/sec channel and 990 Bytes on the
1000KB/sec channel). The reworked function provides the same
performance as the original one in optimal working conditions (i.e. a
bundle made up of PPP links all working at the same speed), while
greatly improving performance on bundles made up of channels
working at different speeds.
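The two splitting strategies can be compared with a small user-space sketch. It is not the reworked "ppp_mp_explode" code: split_even(), split_by_speed() and slowest_us() are hypothetical helpers, headers are ignored as in the example above, 1KB is taken as 1000 Bytes, and speeds are assumed to be non-zero. Integer rounding gives 9 and 991 Bytes here rather than the rounded 10 and 990 quoted above.

```c
#include <stddef.h>

/* Even split: every channel gets len / nch, the last one gets the remainder. */
static void split_even(size_t len, size_t nch, size_t *frag)
{
	size_t i;

	for (i = 0; i < nch; i++)
		frag[i] = len / nch;
	frag[nch - 1] += len % nch;
}

/* Proportional split: fragment sizes follow the channel speeds. */
static void split_by_speed(size_t len, const unsigned long *speed_kb,
			   size_t nch, size_t *frag)
{
	unsigned long long total = 0;
	size_t used = 0, i;

	for (i = 0; i < nch; i++)
		total += speed_kb[i];
	for (i = 0; i < nch; i++) {
		frag[i] = (size_t)((unsigned long long)len * speed_kb[i] / total);
		used += frag[i];
	}
	frag[nch - 1] += len - used;	/* rounding remainder */
}

/* Time in microseconds until the slowest fragment has been sent. */
static unsigned long slowest_us(const size_t *frag,
				const unsigned long *speed_kb, size_t nch)
{
	unsigned long worst = 0;
	size_t i;

	for (i = 0; i < nch; i++) {
		unsigned long t = (unsigned long)(frag[i] * 1000 / speed_kb[i]);

		if (t > worst)
			worst = t;
	}
	return worst;
}
```

For a 1000 Byte buffer on 10KB/sec and 1000KB/sec channels, the even split is finished only after the 500 Byte fragment has crossed the slow link, while the proportional split lets both fragments finish at nearly the same time.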
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb->len is an unsigned int, so the test in x25_rx_call_request() always
evaluates to true.
len in x25_sendmsg() is unsigned as well, so -ERRORS returned by x25_output()
are not noticed.
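Why a negative error goes unnoticed in an unsigned variable can be shown with a short sketch. fake_output() is a hypothetical stand-in and not x25_output(); the example assumes unsigned int is narrower than long long, as on common platforms.

```c
#include <errno.h>

/* Hypothetical stand-in: negative errno on failure, byte count on success. */
static int fake_output(int fail)
{
	return fail ? -EMSGSIZE : 128;
}

/* Result kept in a signed int: the error is visible. */
static int error_seen_signed(int fail)
{
	int rc = fake_output(fail);

	return rc < 0;
}

/* Result stored in an unsigned variable first: the error is lost. */
static int error_seen_unsigned(int fail)
{
	unsigned int len = (unsigned int)fake_output(fail);

	/*
	 * The negative value has wrapped to a large positive one, so a
	 * "less than zero" test can never fire. It is widened here only
	 * so that the comparison is not a compile-time tautology.
	 */
	return (long long)len < 0;
}
```

Keeping the return value in a signed variable until it has been checked avoids the problem.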
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>