The X550 hardware will use more bits in the mask, so change
the prototypes to match. This larger mask will require changes
in callers which use the higher bits. Likewise since X550 will
use different semaphore mask values and will use the lan_id
value. So save these values in the ixgbe_phy_info struct.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since on X550 we use host interface commands to read,write and erase
some commands require more time to complete. So this adds a timeout
parameter to ixgbe_host_interface_command as wells as a return_data
parameter allowing us to return with any data.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The new X550 family of MAC's will have a larger RSS hash (16 -> 64).
It will also support individual VF to have their own independent RSS
hash key. This patch will enable this functionality
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Accessing the CIAA/D register can block access to the PCI config space.
This patch removes the read/write operations to the CIAA/D registers
and makes use of standard kernel functions for accessing the PCI config
space.
In addition it moves ixgbevf_check_for_bad_vf() into the watchdog subtask
which reduces the frequency of the checks.
CC: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change cleans up the tail writes for the ixgbe descriptor queues. The
current implementation had me confused as I wasn't sure if it was still
making use of the surprise remove logic or not.
It also adds the mmiowb which is needed on ia64, mips, and a couple other
architectures in order to synchronize the MMIO writes with the Tx queue
_xmit_lock spinlock.
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch cleans up the page reuse code getting it into a state where all
the workarounds needed are in place as well as cleaning up a few minor
oversights such as using __free_pages instead of put_page to drop a locally
allocated page.
It also cleans up how we clear the descriptor status bits. Previously they
were zeroed as a part of clearing the hdr_addr. However the hdr_addr is a
64 bit field and 64 bit writes can be a bit more expensive on on 32 bit
systems. Since we are no longer using the header split feature the upper
32 bits of the address no longer need to be cleared. As a result we can
just clear the status bits and leave the length and VLAN fields as-is which
should provide more information in debugging.
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
To allow brport device to return current brport flags set on port. Add
returned flags to nested IFLA_PROTINFO netlink msg built in dflt getlink.
With this change, netlink msg returned for bridge_getlink contains the port's
offloaded flag settings (the port's SELF settings).
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While working on a different issue, I noticed an annoying use
after free bug on my machine when unloading the ixgbe driver:
[ 8642.318797] ixgbe 0000:02:00.1: removed PHC on p2p2
[ 8642.742716] ixgbe 0000:02:00.1: complete
[ 8642.743784] BUG: unable to handle kernel paging request at ffff8807d3740a90
[ 8642.744828] IP: [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
[ 8642.745886] PGD 20c6067 PUD 81c1f6067 PMD 81c15a067 PTE 80000007d3740060
[ 8642.746956] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[ 8642.748039] Modules linked in: [...]
[ 8642.752929] CPU: 1 PID: 1225 Comm: rmmod Not tainted 3.18.0-rc2+ #49
[ 8642.754203] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 1.1b 11/01/2013
[ 8642.755505] task: ffff8807e34d3fe0 ti: ffff8807b7204000 task.ti: ffff8807b7204000
[ 8642.756831] RIP: 0010:[<ffffffffa01c77dc>] [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
[...]
[ 8642.774335] Stack:
[ 8642.775805] ffff8807ee824098 ffff8807ee824098 ffffffffa01f3000 ffff8807ee824000
[ 8642.777326] ffff8807b7207e18 ffffffff8137720f ffff8807ee824098 ffff8807ee824098
[ 8642.778848] ffffffffa01f3068 ffff8807ee8240f8 ffff8807b7207e38 ffffffff8144180f
[ 8642.780365] Call Trace:
[ 8642.781869] [<ffffffff8137720f>] pci_device_remove+0x3f/0xc0
[ 8642.783395] [<ffffffff8144180f>] __device_release_driver+0x7f/0xf0
[ 8642.784876] [<ffffffff814421f8>] driver_detach+0xb8/0xc0
[ 8642.786352] [<ffffffff814414a9>] bus_remove_driver+0x59/0xe0
[ 8642.787783] [<ffffffff814429d0>] driver_unregister+0x30/0x70
[ 8642.789202] [<ffffffff81375c65>] pci_unregister_driver+0x25/0xa0
[ 8642.790657] [<ffffffffa01eb38e>] ixgbe_exit_module+0x1c/0xc8e [ixgbe]
[ 8642.792064] [<ffffffff810f93a2>] SyS_delete_module+0x132/0x1c0
[ 8642.793450] [<ffffffff81012c61>] ? do_notify_resume+0x61/0xa0
[ 8642.794837] [<ffffffff816d2029>] system_call_fastpath+0x12/0x17
The issue is that test_and_set_bit() done on adapter->state is being
performed *after* the netdevice has been freed via free_netdev().
When netdev is being allocated on initialization time, it allocates
a private area, here struct ixgbe_adapter, that resides after the
net_device structure. In ixgbe_probe(), the device init routine,
we set up the adapter after alloc_etherdev_mq() on the private area
and add a reference for the pci_dev as well via pci_set_drvdata().
Both in the error path of ixgbe_probe(), but also on module unload
when ixgbe_remove() is being called, commit 41c62843eb ("ixgbe:
Fix rcu warnings induced by LER") accesses adapter after free_netdev().
The patch stores the result in a bool and thus fixes above oops on my
side.
Fixes: 41c62843eb ("ixgbe: Fix rcu warnings induced by LER")
Cc: stable <stable@vger.kernel.org>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IXGBE adapter seems to require that VLAN filtering be enabled if
VMDQ or SRIOV are enabled. When those functions are disabled,
VLAN filtering may be disabled in promiscuous mode.
Prior to commit a9b8943ee1 ("ixgbe: remove vlan_filter_disable
and enable functions")
The logic was correct. However, after the commit the logic
got reversed and VLAN filtered in now turned on when VMDQ/SRIOV
is disabled.
This patch changes the condition to enable hw vlan filtered
when VMDQ or SRIOV is enabled.
Fixes: a9b8943ee1 ("ixgbe: remove vlan_filter_disable and enable functions")
Cc: stable <stable@vger.kernel.org>
CC: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/chelsio/cxgb4vf/sge.c
drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
sge.c was overlapping two changes, one to use the new
__dev_alloc_page() in net-next, and one to use s->fl_pg_order in net.
ixgbe_phy.c was a set of overlapping whitespace changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Split off the setting of the RSS key into its own function. This
will help when we add support for X550 which can have different
RSS keys per pool.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch will add in the new MAC defines and fit it into the switch
cases throughout the driver. New functionality and enablement support will
be added in following patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move setting of drop enable to support function. This not only makes the
code more readable but is also prep for following patches that add
additional MAC support.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On topologies including few levels of PCIe switching X540 can run into an
unexpected completion error. We get around this by waiting after enabling
loopback a sufficient amount of time until Tx Data Fetch is sent. We then
poll the pending transaction bit to ensure we received the completion. Only
then do we go on to clear the buffers.
Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Following commands:
modprobe ixgbe
ifconfig ethX up
ethtool -s ethX advertise 0x020
can lead to "setup link failed with code -14" error due to the setup_link
call racing with the SFP detection routine in the watchdog.
This patch resolves this issue by protecting the setup_link call with check
for __IXGBE_IN_SFP_INIT.
Reported-by: Scott Harrison <scoharr2@cisco.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The check for vfinfo is not sufficient because it does not protect
against specifying vf that is outside of sriov_num_vfs range.
All of the ndo functions have a check for it except for
ixgbevf_ndo_set_spoofcheck().
The following patch is all we need to protect against this panic:
ip link set p96p1 vf 0 spoofchk off
BUG: unable to handle kernel NULL pointer dereference at 0000000000000052
IP: [<ffffffffa044a1c1>]
ixgbe_ndo_set_vf_spoofchk+0x51/0x150 [ixgbe]
Reported-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is illegal to use atomic_set(&page->_count, 2) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
loose a refcount increment.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch consolidates the logic behind dynamically setting TXDCTL.WTHRESH
depending on interrupt throttle rate (ITR) setting regardless of BQL.
Previously TXDCTL.WTHRESH was dynamically being set only with BQL being
enabled, but we have to set it regardless of BQL when ITR is low to avoid
Tx stalls/hangs.
CC: John Greene <jogreene@redhat.com>
Reported by: Masayuki Gouji <gouji.masayuki@jp.fujitsu.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes couple of wait loops on autoneg that are not needed.
During validation we noticed that the loops always time out, so there
should be no user impact.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Convert the normal packet completion path to dev_consume_skb_any() so
packet drop profiling via dropwatch or perf top -G -e skb_kfree_skb
is not cluttered with false hits.
Compile tested only. There is a dev_kfree_skb_any() in the routine
ixgbe_ptp_tx_hwtstamp() in ixgbe_ptp.c that looks like a conversion
candidate but I wasn't familiar enough with the code to pull the
trigger.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Again, we should not be directly using netif_printk, as we have our own
error print routines that we generate. In addition, instead of using an
early return we can just use the else block of this one line if
statement.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In this case, disabling DCB is not an error. We can still function, but
we just have to let the user know. In addition, since we call this
during probe before allocating our netdevice structure, we should use
e_dev_warn instead of e_warn.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Our calculated v_budget doesn't matter except if we allocate MSI-X
vectors. We shouldn't need to calculate this outside of the function, so
don't. Instead, only calculate it once we attempt to acquire MSI-X
vectors. This helps collocate all of the MSI-X vector code together.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We already have to kfree this value if we fail, and this is only part of
MSI-X mode, so we should simply allocate the value where we need it.
This is cleaner, and makes it a lot more obvious why we are freeing it
inside of ixgbe_acquire_msix_vectors.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Similar to how ixgbevf handles acquiring MSI-X vectors, we can return an
error code instead of relying on the flag being set. This makes it more
clear that we have failed to setup MSI-X mode, and also will make it
easier to consolidate MSI-X related code all into the single function.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The netif_printk relies on our netdevice structure to be registered
already. We may call ixgbe_acquire_msix_vectors prior to registering our
netdevice, so we should not use the netdevice specific printk.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If a hardware Tx timestamp is requested, an uninitialized
workqueue entry may be scheduled, especially on an 82598 adapter.
Add a check for a PTP clock to avoid that. Also only apply the
unlikely to the first term of the conditional. That will make the
rest of the checks be in the cold path.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Because bd_number is not useful anymore, so remove it from adapter struct, or
if keep it, we have to fix the boards driven counter bug in ixgbe_remove() and
ixgbe_probe() only for trivial debug purpose -- other output is enough.
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change addresses several issues in the current ixgbe implementation of
busy poll sockets.
First was the fact that it was possible for frames to be delivered out of
order if they were held in GRO. This is addressed by flushing the GRO buffers
before releasing the q_vector back to the idle state.
The other issue was the fact that we were having to take a spinlock on
changing the state to and from idle. To resolve this I have replaced the
state value with an atomic and use atomic_cmpxchg to change the value from
idle, and a simple atomic set to restore it back to idle after we have
acquired it. This allows us to only use a locked operation on acquiring the
vector without a need for a locked operation to release it.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change removes the Rx buffer allocation at the end of ixgbe_clean_rx_irq.
The reason for removing this is to avoid the extra latency introduced by the
MMIO write. This can amount to somewhere around an extra 100ns of latency and
one extra message worth of PCIe bus overhead.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch resolves warnings produced by ixgbe in W=2 kernel
builds. There are missing-field-initializers warnings and shadow
warnings. None of these point to any deeper problem, so just
resolve them so any new warnings get analyzed.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since we previously called ixgbe_set_num_queues just prior to attempting
to set our interrupt scheme, it may be non obvious why we have to call
it again inside the function. Add a comment which helps make it more
obvious that we are resetting features based on the fact that we do not
have MSI-X enabled, and cannot use the previous settings.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ixgbe initiates a reset of the interface on link loss with pending Tx work
in order to clear the rings.
This patch extends the pending Tx work check to the VF interfaces with the
same purpose.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that the behavior for FDB handling is consistent
between both the SR-IOV and non-SR-IOV cases. The main change here is that we
perform bounds checking on the number of SR-IOV addresses regardless of if
SR-IOV is enabled or not as we can only support a certain number of addresses
in the hardware.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When xmit_more mode is being used and the ring is about to
become full or the stack has stopped the ring, enforce a tail
pointer write to the hw. Otherwise, we could risk a TX hang.
Code suggested by Alexander Duyck.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>