When a user disables interrupt throttling with ethtool on 82599 devices,
the interrupt timer may not be re-enabled if hardware RSC is running. The
RSC completions in hardware don't complete before the next ITR event tries
to fire, so the ITR timer never gets re-armed. This patch increases the
amount of time between interrupts when throttling is disabled (rx-usecs =
0) when the hardware RSC deature is enabled.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A second set of feature flag bits was added, and the hardware RSC engine
flags were moved there. However, the code itself didn't make the move
completely to use the new bitmap.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our ndo_poll_controller callback is broken for anything but non-multiqueue
setups. This fixes that issue.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Choose saner defaults for xfrm[4|6] gc_thresh values on init
Currently, the xfrm[4|6] code has hard-coded initial gc_thresh values
(set to 1024). Given that the ipv4 and ipv6 routing caches are sized
dynamically at boot time, the static selections can be non-sensical.
This patch dynamically selects an appropriate gc threshold based on
the corresponding main routing table size, using the assumption that
we should in the worst case be able to handle as many connections as
the routing table can.
For ipv4, the maximum route cache size is 16 * the number of hash
buckets in the route cache. Given that xfrm4 starts garbage
collection at the gc_thresh and prevents new allocations at 2 *
gc_thresh, we set gc_thresh to half the maximum route cache size.
For ipv6, its a bit trickier. there is no maximum route cache size,
but the ipv6 dst_ops gc_thresh is statically set to 1024. It seems
sane to select a simmilar gc_thresh for the xfrm6 code that is half
the number of hash buckets in the v6 route cache times 16 (like the v4
code does).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parameter order for using mk_ic_value(count, time) was reversed,
the patch fixes this.
Signed-off-by: Jiajun Wu <b06378@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a socket is hashed in last slot of pppoe hash table (PPPOE_HASH_SIZE-1)
we report it many times (up to filling seq buffer)
(Only the last socket of last slot)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
start_code is 69 words, but the code always writes a multiple of 16 words,
so the last 11 words written are outside the array.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If arp_format_neigh_entry() can be called with n->dev->addr_len == 0, then a
write to hbuffer[-1] occurs.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if dev_alloc_skb() fails on the first iteration, a write to
cp->rx_ring[-1] occurs.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no reason for the arbitrary restriction that device must be
up to create a vlan. This patch was added to Vyatta kernel to resolve startup
ordering issues where vlan's are created but real device was disabled.
Note: the vlan already correctly inherits the operstate from real device; so
if vlan is created and real device is marked down, the vlan is marked
down.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the %pI4 format string instead of %d.%d.%d.%d and NIPQUAD.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some cases with 57711E, depending on the functions unload sequence, other
functions MAC address could have been used to wake the system as well. Make sure
to block all but the current function if WoL is required by changing the mode
to single function WoL.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Self test used to play with the management FIFO possibly while management was
running...
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to lack of configuration, if the BMC configures the chip to pass all
broadcast/multicast traffic to it, the host will not receive it. On top of
fixing it, also make sure that in promiscuous mode, the host will receive the
management traffic as well.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing to GFP_ATOMIC because the only caller in cnic/bnx2i may
be calling this function while holding spin_lock.
This problem was discovered by Mike Christie.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test on map4 should be a test on map6.
The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
expression *x;
identifier f;
constant char *C;
@@
x = \(kmalloc\|kcalloc\|kzalloc\)(...);
... when != x == NULL
when != x != NULL
when != (x || ...)
(
kfree(x)
|
f(...,C,...,x,...)
|
*f(...,x,...)
|
*x->f
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wrap dest IP hashing code with #ifdef CONFIG_INET,
this feature makes no sense without INET, but other
driver can still work.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan reported that his b43-based laptop hangs during suspend.
The problem turned out to be mac80211 asking the driver to
stop the hardware before removing interfaces, and interface
removal caused b43 to touch the hardware (while down, which
causes the hang).
This patch fixes mac80211 to do reorder these operations to
have them in the correct order -- first remove interfaces
and then stop the hardware. Some more code is necessary to
be able to do so in a race-free manner, in particular it is
necessary to not process frames received during quiescing.
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13337.
Reported-by: Jan Scholz <scholz@fias.uni-frankfurt.de>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Several arrays were read before checking whether the index was within
bounds. ARRAY_SIZE() should be used to determine the size of arrays.
rates->rates has an arraysize of 1, so calling get_common_rates()
with a rates_size of MAX_RATES (14) was causing reads out of bounds.
tmp_size can increment at most to (ARRAY_SIZE(lbs_bg_rates) - 1) *
(*rates_size - 1), so that should be the number of elements of tmp[].
A goto can be eliminated: ret was already set upon its declaration.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
reads bss->rates[j] before checking bounds of index, and should use
ARRAY_SIZE to determine the size of the array.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Even though reverse path filter was changed from simple boolean to
trinary control, the loose mode only works if both all and device are
configured because of this logic error.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tid is bounded (above) by the size of default_tid_to_tx_fifo (17 elements), but
the size of priv->stations[].tid[] is MAX_TID_COUNT (9) elements.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Incorrect limits leads to reads outside array bounds.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
SSID_rid has space for only 3 ssids.
txPowerLevels[i] is read before the bounds check for i
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
iwm_wdev_alloc() returns an ERR_PTR on failure and not null. It also
prints its own dev_err() message so I removed that as well.
Compile tested only. Sorry.
Found by smatch (http://repo.or.cz/w/smatch.git).
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We need to provide a reasonable minimum that will result in a
working setup if used. Set minimum to be 10 to provide for
4 standard TX queues + 1 command queue + 2 (unused) HCCA queues +
4 HT queues (one per AC).
We allow the user to change the number of queues used via a module
parameter and use this minimum value to check if it is valid. Without
this patch a user can select a value for the number of queues that
will result in a failing setup.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I had a problem on 4965 hardware (well, probably other hardware too,
but others don't survive my stress testing right now, unfortunately)
where the driver was sending invalid commands to the device, but no
such thing could be seen from the driver's point of view. I could
reproduce this fairly easily by sending multiple TCP streams with
iperf on different TIDs, though sometimes a single iperf stream was
sufficient. It even happened with a single core, but I have forced
preemption turned on.
The culprit was a queue overrun, where we advanced the queue's write
pointer over the read pointer. After careful analysis I've come to
the conclusion that the cause is a race condition between iwlwifi
and mac80211.
mac80211, of course, checks whether the queue is stopped, before
transmitting a frame. This effectively looks like this:
lock(queues)
if (stopped(queue)) {
unlock(queues)
return busy;
}
unlock(queues)
... <-- this place will be important
there is some more code here
drv_tx(frame)
The driver, on the other hand, can stop and start queues, which does
lock(queues)
mark_running/stopped(queue)
unlock(queues)
[if marked running: wake up tasklet to send pending frames]
Now, however, once the driver starts the queue, mac80211 can see that
and end up at the marked place above, at which point for some reason the
driver seems to stop the queue again (I don't understand that) and then
we end up transmitting while the queue is actually full.
Now, this shouldn't actually matter much, but for some reason I've seen
it happen multiple times in a row and the queue actually overflows, at
which point the queue bites itself in the tail and things go completely
wrong.
This patch fixes this by just dropping the packet should this have
happened, and making the lock in iwlwifi cover everything so iwlwifi
can't race against itself (dropping the lock there might make it more
likely, but it did seem to happen without that too).
Since we can't hold the lock across drv_tx() above, I see no way to fix
this in mac80211, but I also don't understand why I haven't seen this
before -- maybe I just never stress tested it this badly.
With this patch, the device has survived many minutes of simultanously
sending two iperf streams on different TIDs with combined throughput
of about 60 Mbps.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
str has already been tested. It seems that this test should be on the
recently returned value snr.
A simplified version of the semantic match that finds this problem is as
follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r exists@
local idexpression x;
expression E;
@@
if (x == NULL || ...) { ... when forall
return ...; }
... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
(
*x == NULL
|
*x != NULL
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Export garbage collector thresholds for xfrm[4|6]_dst_ops
Had a problem reported to me recently in which a high volume of ipsec
connections on a system began reporting ENOBUFS for new connections
eventually.
It seemed that after about 2000 connections we started being unable to
create more. A quick look revealed that the xfrm code used a dst_ops
structure that limited the gc_thresh value to 1024, and always
dropped route cache entries after 2x the gc_thresh.
It seems the most direct solution is to export the gc_thresh values in
the xfrm[4|6] dst_ops as sysctls, like the main routing table does, so
that higher volumes of connections can be supported. This patch has
been tested and allows the reporter to increase their ipsec connection
volume successfully.
Reported-by: Joe Nall <joe@nall.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
ipv4/xfrm4_policy.c | 18 ++++++++++++++++++
ipv6/xfrm6_policy.c | 18 ++++++++++++++++++
2 files changed, 36 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
entry was tested for NULL near the beginning of the function, followed by a
return, and there is no intervening modification of its value.
A simplified version of the semantic match that finds this problem is as
follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r exists@
local idexpression x;
expression E;
position p1,p2;
@@
if (x == NULL || ...) { ... when forall
return ...; }
... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
(
*x == NULL
|
*x != NULL
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
res has already been tested. It seems that this test should be on the
recently returned value mmio.
A simplified version of the semantic match that finds this problem is as
follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r exists@
local idexpression x;
expression E;
@@
if (x == NULL || ...) { ... when forall
return ...; }
... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
(
*x == NULL
|
*x != NULL
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a net device goes down or when the bnx2i driver is unloaded,
the code was not generating the ISCSI_KEVENT_IF_DOWN message
properly and this could cause the userspace driver to crash.
This is fixed by sending the message properly in the shutdown path.
cnic_uio_stop() is also added to send the message when bnx2i is
unregistering.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for ethtool -G to tune rx and tx ring sizes
per interface basis.
This is only supported for NX3031 based cards.
Signed-off-by: Amit Kumar Salecha <amit@netxen.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable vlan tx acceleration for NX3031 if firmware advertises
capability.
Signed-off-by: Amit Kumar Salecha <amit@netxen.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Request 1532 bytes skb data size for NX3031. NX2031 firmware
needs 1760 sized buffers.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move all net_device initialization into one function
netxen_setup_netdev().
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NX2031 firmware version will never be > 4.0.0, so replace
(adapter->fw_major < 4) checks with pci revision ID check.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o move all tso / checksum offload code into netxen_tso_check().
o optimize the tso header copy into simple loop.
o clean up unnecessary unions from cmd_desc_type0 struct.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o remove superfluous code to setup PCI dma watchdog for NX2031.
o disable dma watchdog completely for NX3031 (not required).
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initialize and configure interrupt coalesing defaults
in the firmware, so that these also reflect in "ethool -c".
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NX3031 hardware requires local IP addresses for packet
accumulation (LRO). IP address hashing is required to
distinguish a local TCP flow from others (forwarded or
guest).
This patch adds listener for IP and netdev events and
configures IP address in the firmware.
Signed-off-by: Amit Kumar Salecha <amit@netxen.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>