linux

Author	SHA1	Message	Date
Jacob Keller	1f5c27e528	fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests Now that we have a working MAC/VLAN queue for handling MAC/VLAN messages from the netdev, replace the default handler for the VF<->PF messages. This new handler is very similar to the default code, but uses the MAC/VLAN queue instead of sending the message directly. Unfortunately we can't easily re-use the default code, so we'll just replace the entire function. This ensures that a VF requesting a large number of VLANs or MAC addresses does not start a reset cycle, as explained in the commit which introduced the message queue. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Ngai-mint Kwan <ngai-mint.kwan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:39:17 -07:00
Jacob Keller	fc9173682d	fm10k: introduce a message queue for MAC/VLAN messages Under some circumstances, when dealing with a large number of MAC address or VLAN updates at once, the fm10k driver, particularly the VFs, can overload the mailbox with too many messages at once. This results in a mailbox timeout, which causes the driver to initiate a reset. During the reset, we re-send all the same messages that originally caused the timeout. This results in a cycle of resets each triggering a future reset. To fix or avoid this, we introduce a workqueue item which monitors a queue of MAC and VLAN requests. These requests are queued to the end of the list, and we process them as a FIFO periodically. Initially we only handle requests for the netdev, but we do handle unicast MAC addresses, multicast MAC addresses, and update VLAN requests. A future patch will add support to use this queue for handling MAC update requests from the VF<->PF mailbox. The MAC/VLAN work item will keep checking to make sure that each request does not overflow the mailbox and cause a timeout. If it might, then the work item will reschedule itself a short time later. This avoids any reset cycle, since we never send the message if the mailbox is not ready. As an alternative, we tried increasing the mailbox message FIFO, but this just delays the problem and results in needless memory waste on the system. Our new message queue is dynamically allocated so only uses as much memory as it needs. Additionally, it need not be contiguous like the Tx and Rx FIFOs. Note that this patch chose to only create a queue for MAC and VLAN messages, since these are the only messages sent in a large enough volume to cause the reset loop. Other messages are very unlikely to overflow the mailbox Tx FIFO so easily. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:25:36 -07:00
Jacob Keller	8249c47c6b	fm10k: use generic PM hooks instead of legacy PCIe power hooks Replace the PCI specific legacy power management hooks with the new generic power management hooks which work properly for both suspend and hibernate. The new generic system is better and properly handles the lower level PCIe power management rather than forcing the driver to handle it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:19:33 -07:00
Jacob Keller	b4fcd43661	fm10k: use spinlock to implement mailbox lock Let's not re-invent the locking wheel. Remove our bitlock and use a proper spinlock instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:12:44 -07:00
Jacob Keller	0b40f45748	fm10k: prepare_for_reset() when we lose PCIe Link If we lose PCIe link, such as when an unannounced PFLR event occurs, or when a device is surprise removed, we currently detach the device and close the netdev. This unfortunately leaves a lot of things still active, such as the msix_mbx_pf IRQ, and Tx/Rx resources. This can cause problems because the register reads will return potentially invalid values which may result in unknown driver behavior. Begin the process of resetting using fm10k_prepare_for_reset(), much in the same way as the suspend and resume cycle does. This will attempt to shutdown as much as possible, in order to prevent possible issues. A naive implementation for this has issues, because there are now multiple flows calling the reset logic and setting a reset bit. This would cause problems, because the "re-attach" routine might call fm10k_handle_reset() prior to the reset actually finishing. Instead, we'll add state bits to indicate which flow actually initiated the reset. For the general reset flow, we'll assume that if someone else is resetting that we do not need to handle it at all, so it does not need its own state bit. For the suspend case, we will simply issue a warning indicating that we are attempting to recover from this case when resuming. For the detached subtask, we'll simply refuse to re-attach until we've actually initiated a reset as part of that flow. Finally, we'll stop attempting to manage the mailbox subtask when we're detached, since there's nothing we can do if we don't have a PCIe address. Overall this produces a much cleaner shutdown and recovery cycle for a PCIe surprise remove event. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:06:44 -07:00
David S. Miller	4efac6ff4d	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-02 This series contains updates to i40e and i40evf. Shannon Nelson fixes an issue where when a machine has more CPUs than queue pairs, the counting gets a "little funky" and turns off Flow Director. So to correct it, limit the number of LAN queues initially allocated to be sure there are some left for Flow Director and other features. Lihong cleans up dead code by removing a condition check which cannot ever be true. Christophe Jaillet fixes a potential NULL pointer dereference, which could happen if kzalloc() fails. Filip corrects the reporting of supported link modes, which was incorrect for some NICs. Added support for 'ethtool -m' command, which displays information about QSFP+ modules. Mariusz adds functions to read/write the LED registers to control the LEDs, instead of accessing the registers directly whenever the LEDs need to be controlled. Jake fixes a regression where we introduced a scheduling while atomic, so introduce a separate helper function which will manage its own need for the mac_filter_hash_lock. Also cleaned up the "PF" parameter in i40e_vc_disable_vf() since it is never used and is not needed. Fixed a rare case where it is possible that a reset does not occur when i40e_vc_disable_vf() is called, so modify i40e_reset_vf() to return a bool to indicate whether it reset or not so that i40e_vc_disable_vf() can wait until a reset actually occurs. Alan adds the ability for the VF to request more or less underlying allocated queues from the PF. Fixes the incorrect method for clearing the vf_states variable with a NULL assignment, when we should be using atomic bitops since we don't actually want to clear all the flags. 
Fixed a resource leak, where the PF driver fails to inform clients of a VF reset because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. Mitch converts i40evf_map_rings_to_vectors() to a void function since it cannot fail and allows us to clean up the checks for the function return value. Scott enables the driver(s) to pass traffic with VLAN tags using the 802.1ad Ethernet protocol. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 15:16:03 -07:00
Scott Peterson	ab243ec940	i40e: Stop dropping 802.1ad tags - eth proto 0x88a8 Enable i40e to pass traffic with VLAN tags using the 802.1ad ethernet protocol ID (0x88a8). This requires NIC firmware providing version 1.7 of the API. With older NIC firmware 802.1ad tagged packets will continue to be dropped. No VLAN offloads nor RSS are supported for 802.1ad VLANs. Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:36 -07:00
Alan Brady	c53d11f669	i40e: fix client notify of VF reset Currently there is a bug in which the PF driver fails to inform clients of a VF reset which then causes clients to leak resources. The bug exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. When a VF is first init we go through a reset to initialize variables and allocate resources but we don't want to inform clients of this first reset since the client isn't fully enabled yet so we set a state bit signifying we're in a "pre-enabled" client state. During the first reset we should be clearing the bit, allowing all following resets to notify the client of the reset when the bit is not set. This patch fixes the issue by negating the 'test_and_clear_bit' check to accurately reflect the behavior we want. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:36 -07:00
Alan Brady	41d0a4d0c8	i40e: fix handling of vf_states variable Currently we inappropriately clear the vf_states variable with a null assignment. This is problematic because we should be using atomic bitops on this variable and we don't actually want to clear all the flags. We should just clear the ones we know we want to clear. Additionally remove the I40E_VF_STATE_FCOEENA bit because it is no longer being used. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Mitch Williams	1b7b7596ae	i40e: make i40evf_map_rings_to_vectors void This function cannot fail, so why is it returning a value? And why are we checking it? Why shouldn't we just make it void? Why is this commit message made up of only questions? Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Alan Brady	5b36e8d04b	i40evf: Enable VF to request an alternate queue allocation Currently the VF gets a default number of allocated queues from HW on init and it could choose to enable or disable those allocated queues. This makes it such that the VF can request more or less underlying allocated queues from the PF. First the VF negotiates the number of queues it wants that can be supported by the PF and if successful asks for a reset. During reset the PF will reallocate the HW queues for the VF and will then remap the new queues. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	d43d60e5eb	i40e: ensure reset occurs when disabling VF It is possible although rare that we may not reset when i40e_vc_disable_vf() is called. This can lead to some weird circumstances with some values not being properly set. Modify i40e_reset_vf() to return a code indicating whether it reset or not. Now, i40e_vc_disable_vf() can wait until a reset actually occurs. If it fails to free up within a reasonable time frame we'll display a warning message. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	f18d20218a	i40e: make use of i40e_vc_disable_vf Replace i40e_vc_notify_vf_reset and i40e_reset_vf with a call to i40e_vc_disable_vf which does this exact thing. This matches similar code patterns throughout the driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	eeeddbb806	i40e: drop i40e_pf *pf from i40e_vc_disable_vf() It's never used, and the vf structure could get back to the PF if necessary. Let's just drop the extra unneeded parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	ba4e003d29	i40e: don't hold spinlock while resetting VF When we refactored handling of the PVID in commit `9af52f60b2` ("i40e: use (add\|rm)_vlan_all_mac helper functions when changing PVID") we introduced a scheduling while atomic regression. This occurred because we now held the spinlock across a call to i40e_reset_vf(), which results in a usleep_range() call that triggers a scheduling while atomic bug. This was rare as it only occurred if the user configured a VLAN on a VF and also attempted to reconfigure the VF from the host system with a port VLAN. We do need to hold the lock while calling i40e_is_vsi_in_vlan(), but we should not be holding it while we reset the VF. We'll fix this by introducing a separate helper function i40e_vsi_has_vlans which checks whether we have a PVID and whether the VSI has configured VLANs. This helper function will manage its own need for the mac_filter_hash_lock. Then, we can move the acquiring of the spinlock until after we reset the VF, which ensures that we do not sleep while holding the lock. Using a separate function like this makes the code more clear and is easier to read than attempting to release and re-acquire the spinlock when we reset the VF. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Mariusz Stachura	00f6c2f5e2	i40e: use admin queue for setting LEDs behavior Instead of accessing register directly, use newly added AQC in order to blink LEDs. Introduce and utilize a new flag to prevent excessive API version checking. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Filip Sadowski	9c0e5caf63	i40e: Add support for 'ethtool -m' This patch adds support for 'ethtool -m' command which displays information about (Q)SFP+ module plugged into NIC's cage. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Filip Sadowski	d60bcc7980	i40e: Fix reporting of supported link modes This patch fixes incorrect reporting of supported link modes on some NICs. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Christophe JAILLET	54902349ee	i40e: Fix a potential NULL pointer dereference If 'kzalloc()' fails, a NULL pointer will be dereferenced. Return an error code (-ENOMEM) instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Lihong Yang	5872866e16	i40e: remove logically dead code This patch removes the !vf condition check that cannot be true in i40e_ndo_set_vf_trust function Detected by CoverityScan, CID 1397531 Logically dead code Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Shannon Nelson	e50d5751c8	i40e: limit lan queue count in large CPU count machine When a machine has more CPUs than queue pairs, e.g. 512 cores, the counting gets a little funky and turns off Flow Director with the message: not enough queues for Flow Director. Flow Director feature is disabled This patch limits the number of lan queues initially allocated to be sure we have some left for FD and other features. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:34 -07:00
Jacob Keller	04914390f5	fm10k: prevent race condition of __FM10K_SERVICE_SCHED Although very unlikely, it is possible that cancel_work_sync() may stop the service_task before it actually started. In this case, the __FM10K_SERVICE_SCHED bit will never be cleared. This results in the service task being unable to reschedule in the future. Add a helper function which sets the service disable bit, waits for the service task to stop and clears the schedule bit, thus avoiding the race condition. We know the schedule bit is safe to clear because the cancel_work_sync() guarantees the service task is not running. Add a helper function also to restart the service task, for symmetry. This is not strictly needed but helps the mental model of how to stop and start the service task. This race could only happen in fm10k_suspend/fm10k_resume as this is the only place where the service task is actually restarted. Thus, suspend/resume testing would be ideal. However, note that the chance of this happening is very slim as the service event is scheduled for immediate execution, and you would have to trigger a suspend at almost the exact same time as the service task was scheduled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:10:54 -07:00
Jacob Keller	65b0a469e9	fm10k: move fm10k_prepare_for_reset and fm10k_handle_reset A future patch needs these functions defined earlier in the file. Move them above where they will be called. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:09:18 -07:00
Jacob Keller	dd5eede2b7	fm10k: avoid divide by zero in rare cases when device is resetting It is possible that under rare circumstances the device is undergoing a reset, such as when a PFLR occurs, and the device may be transmitting simultaneously. In this case, we might attempt to divide by zero when finding the proper r_idx. Instead, let's read the num_tx_queues once, and make sure it's non-zero. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:07:57 -07:00
Jacob Keller	d876c1583b	fm10k: don't loop while resetting VFs due to VFLR event We've always had a really weird looping construction for resetting VFs. We read the VFLRE register and reset the VF if the corresponding bit is set, which makes sense. However we loop continuously until we no longer have any bits left set. At first this makes sense, as a sort of "keep trying until we succeed" concept. Unfortunately this causes a problem if we happen to surprise remove while this code is executing, because in this case we'll always read all 1s for the VFLRE register. This results in a hard lockup on the CPU because the loop will never terminate. Because our own reset function will always clear the VFLR event register (except when we've lost PCIe link, obviously), there is no real reason to loop. In practice, we'll loop over once and find that no VFs are pending anymore. Let's just check once. Since we clear the notification when we reset, there's no benefit to the loop. Additionally, there shouldn't be a race as future VFLRE events should trigger an interrupt. Additionally, we didn't warn or do anything in the looped case anyways. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:06:30 -07:00
Jacob Keller	4abf01b43b	fm10k: simplify reading PFVFLRE register We're doing a really convoluted bitshift and read for the PFVFLRE register. Just reading the PFVFLRE(1), shifting it by 32, then reading PFVFLRE(0) should be sufficient. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:04:57 -07:00
Jacob Keller	8bac58be17	fm10k: avoid needless delay when loading driver When we load the driver, we set the last_reset to be in the future, which delays the initial driver reset. Additionally, the service task isn't scheduled to run automatically until the timer runs out. This causes a needless delay of the first reset to begin talking to the switch manager. We can avoid this by simply not setting last_reset and immediately scheduling the service task while in probe. This allows the device to wake up faster, and avoids this delay. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:57:42 -07:00
Jacob Keller	523a0b558d	fm10k: add missing fall through comment Newer versions of GCC starting with 7 now additionally warn when a case statement may fall through without an explicit comment mentioning it. Add such a comment to silence the warning, as this is expected. Unfortunately the comment must come directly before the next case statement, so we put it outside the #ifdef. Otherwise, the compiler cannot properly detect it and thus the warning is displayed regardless. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:54:00 -07:00
Jacob Keller	b94dd008c4	fm10k: avoid possible truncation of q_vector->name New versions of GCC since version 7 began warning about possible truncation of calls to snprintf. We can fix this and avoid false positives. First, we should pass the full buffer size to snprintf, because it guarantees a NUL character as part of its passed length, so passing len-1 is simply wasting a byte of possible storage. Second, if we make the ri and ti variables unsigned, the compiler is able to correctly reason that the value never gets larger than 256, so it doesn't need to warn about the full space required to print a signed integer. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:46:57 -07:00
Jacob Keller	375ce90eab	fm10k: fix typos on fall through comments Newer versions of GCC since version 7 now warn when a case statement may fall through without an explicit comment. "Fallthough" does not count as it is misspelled. Fix the typos for these comments to appease the new warnings. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:42:15 -07:00
Jacob Keller	5c66d1251d	fm10k: stop spurious link down messages when Tx FIFO is full In fm10k_get_host_state_generic, we check the mailbox tx_read() function to ensure that the mailbox is still open. This function also checks to make sure we have space to transmit another message. Unfortunately, if we just recently sent a bunch of messages (such as enabling hundreds of VLANs on a VF) this can result in a race where the watchdog task thinks the link went down just because we haven't had time to process all these messages yet. Instead, let's just check whether the mailbox is still open. This ensures that we don't race with the Tx FIFO, and we only link down once the mailbox is not open. This is safe, because if the FIFO fills up and we're unable to send a message for too long, we'll end up triggering the timeout detection which results in a reset. Additionally, since we still check to ensure the mailbox state is OPEN, we'll transition to link down whenever the mailbox closes as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:40:31 -07:00
Markus Elfring	95f49d4bde	fm10k: Use seq_putc() in fm10k_dbg_desc_break() Two single characters should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:28:57 -07:00
Jacob Keller	b52b7f7059	fm10k: reschedule service event if we stall the PF<->SM mailbox When we are handling PF<->VF mailbox messages, it is possible that the VF will send us so many messages that the PF<->SM FIFO will fill up. In this case, we stop the loop and wait until the service event is rescheduled. Normally this should happen due to an interrupt. But it is possible that we don't get another interrupt for a while and it isn't until the service timer actually reschedules us. Instead, simply reschedule immediately which will cause the service event to be run again as soon as we exit. This ensures that we promptly handle all of the PF<->VF messages with minimal delay, while still giving time for the SM mailbox to drain. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:25:47 -07:00
Jacob Keller	17a9180994	fm10k: ensure we process SM mbx when processing VF mbx When we process VF mailboxes, the driver is likely going to also queue up messages to the switch manager. This process merely queues up the FIFO, but doesn't actually begin the transmission process. Because we hold the mailbox lock during this VF processing, the PF<->SM mailbox is not getting processed at this time. Ensure that we actually process the PF<->SM mailbox in between each PF<->VF mailbox. This should ensure prompt transmission of the messages queued up after each VF message is received and handled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:24:48 -07:00
Mitch Williams	22b96551f2	i40e: refactor FW version checking The i40e driver now supports two different devices with two different firmware versions. So be smart about how we handle these. Move the FW version macros to the appropriate header file, and add a convenience macro that checks the version based on the device. Then use this macro to check whether or not the driver can use the new link info API. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:02 -07:00
Alan Brady	a3f5aa9073	i40e: Enable VF to negotiate number of allocated queues Currently the PF allocates a default number of queues for each VF and this number cannot be changed. This patch enables the VF to request a different number of queues allocated to it. This patch also adds a new virtchnl op and capability flag to facilitate this negotiation. After the PF receives a request message, it will set a requested number of queues for that VF. Then when the VF resets, its VSI will get a new number of queues allocated to it. This is a best effort request and since we only allocate a guaranteed default number, if the VF tries to ask for more than the guaranteed number, there may not be enough in HW to accommodate it unless other queues for other VFs are freed. It should also be noted that decreasing the number of queues allocated to a VF to below the default will NOT enable the allocation of more than 32 VFs per PF and will not free queues guaranteed to each VF by default. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:01 -07:00
Alan Brady	c97fc9b6a7	i40evf: fix ring to vector mapping The current implementation for mapping queues to vectors is broken because it attempts to map each Tx and Rx ring to its own vector, however we use combined queues so we should actually be mapping the Tx/Rx rings together on one vector. Also in the current implementation, in the case where we have more queues than vectors, we attempt to group the queues together into 'chunks' and map each 'chunk' of queues to a vector. Chunking them together would be more ideal if, and only if, we only had RSS because of the way the hashing algorithm works but in the case of a future patch that enables VF ADq, round robin assignment is better and still works with RSS. This patch resolves both those issues and simplifies the code needed to accomplish this. Instead of treating the case where we have more queues than vectors as special, if we notice our vector index is greater than vectors, reset the vector index to zero and continue mapping. This should ensure that in both cases, whether we have enough vectors for each queue or not, the queues get appropriately mapped. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:01 -07:00
Jacob Keller	b980c0634f	i40e: shutdown all IRQs and disable MSI-X when suspended On some platforms with a large number of CPUs, we will allocate many IRQ vectors. When hibernating, the system will attempt to migrate all of the vectors back to CPU0 when shutting down all the other CPUs. It is possible that we have so many vectors that it cannot re-assign them to CPU0. This is even more likely if we have many devices installed in one platform. The end result is failure to hibernate, as it is not possible to shutdown the CPUs. We can avoid this by disabling MSI-X and clearing our interrupt scheme when the device is suspended. A more ideal solution would be some method for the stack to properly handle this for all drivers, rather than on a case-by-case basis for each driver to fix itself. However, until this more ideal solution exists, we can do our part and shutdown our IRQs during suspend, which should allow systems with a large number of CPUs to safely suspend or hibernate. It may be worth investigating if we should shut down even further when we suspend as it may make the path cleaner, but this was the minimum fix for the hibernation issue mentioned here. Testing-hints: This affects systems with a large number of CPUs, and with multiple devices enabled. Without this change, those platforms are unable to hibernate at all. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:01 -07:00
Jacob Keller	5c49922880	i40e: prevent service task from running while we're suspended Although the service task does check the suspended status before running, it might already be part way through running when we go to suspend. Let's ensure that the service task is stopped and will not be restarted again until we finish resuming. This ensures that service task code does not cause strange interactions with the suspend/resume handlers. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:01 -07:00
Jacob Keller	401586c2b9	i40e: don't clear suspended state until we finish resuming When handling suspend and resume callbacks we want to make sure that (a) we don't suspend again if we're already suspended and (b) we don't resume again if we're already resuming. Lets make sure we test_and_set the __I40E_SUSPENDED bit in i40e_suspend which ensures that a suspend call when already suspended will exit early. Additionally, if __I40E_SUSPENDED is not set when we begin resuming, exit early as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:00 -07:00
Jacob Keller	0e5d3da400	i40e: use newer generic PM support instead of legacy PM callbacks Stop using the old legacy PM support, since we now have stable support for the newer generic PM callbacks. This has several advantages. First, we no longer have to manage our own pci_save_state() and power changes, as it's preferred to have the PCI stack do this. Second, these routines get called for both hibernate and suspend to ram, so we can have the driver properly handle all the suspend/resume flows that it needs to. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:00 -07:00
Jacob Keller	c17401a1dd	i40e: use separate state bit for miscellaneous IRQ setup We currently (mis)use the __I40E_RECOVERY_PENDING bit to determine when we should actually request a new IRQ in i40e_setup_misc_vector(). This led to a design mistake where we open-coded the re-setup of the miscellaneous vector in i40e_resume() instead of using the function provided. If we did not open-code this and instead tried to use the i40e_setup_misc_vector() function, it would lead to never reallocating the IRQ. This would lead to a second i40e_suspend() call failing to free the vector due to a NULL pointer dereference. A future patch is going to re-work how the i40e_suspend() and i40e_resume() flows work to clear all IRQ vectors, which would require us to use i40e_setup_misc_vector() directly. Since during this time the __I40E_RECOVERY_PENDING bit is set, we'll never re-allocate the vector. Rather than leaving the open-coded setup in i40e_resume() let's just fix the problem properly in i40e_setup_misc_vector(). Introduce a new state bit which indicates when the IRQ has been assigned, which will be set when i40e_setup_misc_vector is first called. This ultimately resolves the issue of re-requesting the vector, without overloading the __I40E_RECOVERY_PENDING state. This ensures that the suspend/resume cycle can use the setup function instead of open-coding the re-request during resume. Additionally, since the only callers of i40e_stop_misc_vector also want to free it, move this code directly into the function to avoid duplication. Due to the new functionality, rename it to i40e_free_misc_vector(). This lets us drop the extra calls to free and re-enable the vector during i40e_suspend() and i40e_resume(). We don't need to call i40e_setup_misc_vector() in i40e_resume() because it gets called by the i40e_rebuild() call. 
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:00 -07:00
Mitch Williams	905770fa3e	i40evf: lower message level We see this message regularly on VF reset or unload (which invokes a reset). It's essentially meaningless unless it's happening constantly. To prevent consternation, lower the log level to debug so it's not seen under normal circumstances. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:00 -07:00
Mariusz Stachura	0dc8692e91	i40e: fix for flow director counters not wrapping as expected An erratum affecting GLQF_PCNT causes it not to wrap as expected. This can cause an error in flow director statistics. This patch resets affected counters just after reading. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Mariusz Stachura	e04ea00217	i40e: relax warning message in case of version mismatch Fortville and Fort Park devices are often on different firmware release schedules. This change relaxes the minor version warning message, so it is only displayed for older FW warning version for old firmware Fortville 3 or earlier. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Sudheer Mogilappagari	3fded4663b	i40e: simplify member variable accesses This commit replaces usage of vsi->back in i40e_print_link_message() (which is actually a PF pointer) with temp variable. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Sudheer Mogilappagari	9a03449d3e	i40e: Fix link down message when interface is brought up i40e_print_link_message() is intended to compare new link state with current link state and print log message only if the new state is different from current state. However in current driver the new state does not get updated when link is going down because of the if condition. When an interface is brought down, vsi->state is set to I40E_VSI_DOWN in i40e_vsi_close() and later i40e_print_link_message() does not get invoked in i40e_link_event due to if condition. Hence link down message doesn't appear when link is going down. The down state is seen later during i40e_open() and old state gets printed. The actual link state doesn't get updated in i40e_close() or i40e_open() but when i40e_handle_link_event is called inside i40e_clean_adminq_subtask. This change allows i40e_print_link_message() to be called when interface is going down and keeps the state information updated. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Sudheer Mogilappagari	16badf758b	i40e: Fix unqualified module message while bringing link up In current driver, when ifconfig ethx up is done, the link state doesn't transition to UP inside i40e_open(). It changes after AQ command response is handled in i40e_handle_link_event(). When pf->hw.phy.link_info.link_info is DOWN inside i40e_open(), The state is transient and invalid. So log message gets printed based on incorrect info (i.e link_info and an_info). This commit removes check for unqualified module inside i40e_up_complete(). The existing check in i40e_handle_link_event() logs the error message based on correct link state information. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Jacob Keller	2b634bb068	i40e/i40evf: rename bytes_per_int to bytes_per_usec This value is not calculating bytes_per_int, which would actually just be bytes/ITR_COUNTDOWN_START, but rather it's calculating bytes/usecs. Rename the variable for clarity so that future developers understand what the value is actually calculating. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:58 -07:00
Daniel Borkmann	366a88fe2f	bpf, ixgbe: add meta data support Implement support for transferring XDP meta data into skb for ixgbe driver; before calling into the program, xdp.data_meta points to xdp.data, where on program return with pass verdict, we call into skb_metadata_set(). We implement this for the default ixgbe_build_skb() variant. For the ixgbe_construct_skb() that is used when legacy-rx buffer management mode is turned on via ethtool, I found that XDP gets 0 headroom, so neither xdp_adjust_head() nor xdp_adjust_meta() can be used with this. Just add a comment with explanation for this operating mode. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-26 13:36:44 -07:00
Daniel Borkmann	de8f3a83b0	bpf: add meta pointer for direct access This work enables generic transfer of metadata from XDP into skb. The basic idea is that we can make use of the fact that the resulting skb must be linear and already comes with a larger headroom for supporting bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work on a similar principle and introduce a small helper bpf_xdp_adjust_meta() for adjusting a new pointer called xdp->data_meta. Thus, the packet has a flexible and programmable room for meta data, followed by the actual packet data. struct xdp_buff is therefore laid out such that we first point to data_hard_start, then data_meta directly prepended to data followed by data_end marking the end of packet. bpf_xdp_adjust_head() takes into account whether we have meta data already prepended and if so, memmove()s this along with the given offset provided there's enough room. xdp->data_meta is optional and programs are not required to use it. The rationale is that when we process the packet in XDP (e.g. as DoS filter), we can push further meta data along with it for the XDP_PASS case, and give the guarantee that a clsact ingress BPF program on the same device can pick this up for further post-processing. Since we work with skb there, we can also set skb->mark, skb->priority or other skb meta data out of BPF, thus having this scratch space generic and programmable allows for more flexibility than defining a direct 1:1 transfer of potentially new XDP members into skb (it's also more efficient as we don't need to initialize/handle each of such new members). The facility also works together with GRO aggregation. The scratch space at the head of the packet can be a multiple of 4 bytes, up to 32 bytes large. Drivers not yet supporting xdp->data_meta can simply be set up with xdp->data_meta as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out, such that the subsequent match against xdp->data for later access is guaranteed to fail. 
The verifier treats xdp->data_meta/xdp->data the same way as we treat xdp->data/xdp->data_end pointer comparisons. The requirement for doing the compare against xdp->data is that it hasn't been modified from its original address we got from ctx access. It may have a range marking already from prior successful xdp->data/xdp->data_end pointer comparisons though. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-26 13:36:44 -07:00
Thomas Meyer	b6cd4b5895	e100: Cocci spatch "pool_zalloc-simple" Use _pool_zalloc rather than _pool_alloc followed by memset with 0. Found by coccinelle spatch "api/alloc/pool_zalloc-simple.cocci" Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 15:26:59 -07:00
Allen Pais	7d8fb3a774	drivers: net: i40evf: use setup_timer() helper. Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 11:44:44 -07:00
Allen Pais	4a9c07ed71	drivers: net: e1000e: use setup_timer() helper. Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 11:44:41 -07:00
Allen Pais	82a8c67451	drivers: net: ixgb: use setup_timer() helper. Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-21 11:44:40 -07:00
David S. Miller	66bed8465a	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2017-09-05 This series contains fixes for i40e only. These two patches fix an issue where our nvmupdate tool does not work on RHEL 7.4 and newer kernels, in fact, the use of the nvmupdate tool on newer kernels can cause the cards to be non-functional unless these patches are applied. Anjali reworks the locking around accessing the NVM so that NVM acquire timeouts do not occur which was causing the failed firmware updates. Jake correctly updates the wb_desc when reading the NVM through the AdminQ. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 20:03:40 -07:00
Jacob Keller	3c8f3e96af	i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq When introducing the functions to read the NVM through the AdminQ, we did not correctly mark the wb_desc. Fixes: `7073f46e44` ("i40e: Add AQ commands for NVM Update for X722", 2015-06-05) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-05 17:52:46 -07:00
Anjali Singhai Jain	09f79fd49d	i40e: avoid NVM acquire deadlock during NVM update X722 devices use the AdminQ to access the NVM, and this requires taking the AdminQ lock. Because of this, we lock the AdminQ during i40e_read_nvm(), which is also called in places where the lock is already held, such as the firmware update path which wants to lock once and then unlock when finished after performing several tasks. Although this should have only affected X722 devices, commit `96a39aed25` ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) added locking for all NVM reads, regardless of device family. This resulted in us accidentally causing NVM acquire timeouts on all devices, causing failed firmware updates which left the eeprom in a corrupt state. Create unsafe non-locked variants of i40e_read_nvm_word and i40e_read_nvm_buffer, __i40e_read_nvm_word and __i40e_read_nvm_buffer respectively. These variants will not take the NVM lock and are expected to only be called in places where the NVM lock is already held if needed. Since the only caller of i40e_read_nvm_buffer() was in such a path, remove it entirely in favor of the unsafe version. If necessary we can always add it back in the future. Additionally, we now need to hold the NVM lock in i40e_validate_checksum because the call to i40e_calc_nvm_checksum now assumes that the NVM lock is held. We can further move the call to read I40E_SR_SW_CHECKSUM_WORD up a bit so that we do not need to acquire the NVM lock twice. This should resolve firmware updates and also fix a potential race that could have caused the driver to report an invalid NVM checksum upon driver load. 
Reported-by: Stefan Assmann <sassmann@kpanic.de> Fixes: `96a39aed25` ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-05 17:48:22 -07:00
Jacob Keller	742c987575	i40e/i40evf: avoid dynamic ITR updates when polling or low packet rate The dynamic ITR algorithm depends on a calculation of usecs which assumes that the interrupts have been firing constantly at the interrupt throttle rate. This is not guaranteed because we could have a low packet rate, or have been polling in software. We'll estimate whether this is the case by using jiffies to determine if we've been too long. If the time difference of jiffies is larger we are guaranteed to have an incorrect calculation. If the time difference of jiffies is smaller we might have been polling some but the difference shouldn't affect the calculation too much. This ensures that we don't get stuck in BULK latency during certain rare situations where we receive bursts of packets that force us into NAPI polling. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:15:24 -07:00
Jacob Keller	0a2c7722be	i40e/i40evf: remove ULTRA latency mode Since commit `c56625d597` ("i40e/i40evf: change dynamic interrupt thresholds") a new higher latency ITR setting called I40E_ULTRA_LATENCY was added with a cryptic comment about how it was meant for adjusting Rx more aggressively when streaming small packets. This mode was attempting to calculate packets per second and then kick in when we have a huge number of small packets. Unfortunately, the ULTRA setting was kicking in for workloads it wasn't intended for including single-thread UDP_STREAM workloads. This wasn't caught for a variety of reasons. First, the ip_defrag routines were improved somewhat which makes the UDP_STREAM test still reasonable at 10GbE, even when dropped down to 8k interrupts a second. Additionally, some other obvious workloads appear to work fine, such as TCP_STREAM. The number 40k doesn't make sense for a number of reasons. First, we absolutely can do more than 40k packets per second. Second, we calculate the value inline in an integer, which sometimes can overflow resulting in using incorrect values. If we fix this overflow it makes it even more likely that we'll enter ULTRA mode which is the opposite of what we want. The ULTRA mode was added originally as a way to reduce CPU utilization during a small packet workload where we weren't keeping up anyways. It should never have been kicking in during these other workloads. Given the issues outlined above, let's remove the ULTRA latency mode. If necessary, a better solution to the CPU utilization issue for small packet workloads will be added in a future patch. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:12:15 -07:00
Jacob Keller	6d9777298b	i40e: invert logic for checking incorrect cpu vs irq affinity In commit `96db776a36` ("i40e/vf: fix interrupt affinity bug") we added some code to force exit of polling in case we did not have the correct CPU. This is important since it was possible for the IRQ affinity to be changed while the CPU is pegged at 100%. This can result in the polling routine being stuck on the wrong CPU until traffic finally stops. Unfortunately, the implementation, "if the CPU is correct, exit as normal, otherwise, fall-through to the end-polling exit" is incredibly confusing to reason about. In this case, the normal flow looks like the exception, while the exception actually occurs far away from the if statement and comment. We recently discovered and fixed a bug in this code because we were incorrectly initializing the affinity mask. Re-write the code so that the exceptional case is handled at the check, rather than having the logic be spread through the regular exit flow. This does end up with minor code duplication, but the resulting code is much easier to reason about. The new logic is identical, but inverted. If we are running on a CPU not in our affinity mask, we'll exit polling. However, the code flow is much easier to understand. Note that we don't actually have to check for MSI-X, because in the MSI case we'll only have one q_vector, but its default affinity mask should be correct as it includes all CPUs when it's initialized. Further, we could at some point add code to setup the notifier for the non-MSI-X case and enable this workaround for that case too, if desired, though there isn't much gain since it's unlikely to be the common case. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:10:48 -07:00
Jacob Keller	759dc4a7e6	i40e: initialize our affinity_mask based on cpu_possible_mask On older kernels a call to irq_set_affinity_hint does not guarantee that the IRQ affinity will be set. If nothing else on the system sets the IRQ affinity this can result in a bug in the i40e_napi_poll() routine where we notice that our interrupt fired on the "wrong" CPU according to our internal affinity_mask variable. This results in a bug where we continuously tell NAPI to stop polling to move the interrupt to a new CPU, but the CPU never changes because our affinity mask does not match the actual mask setup for the IRQ. The root problem is a mismatched affinity mask value. So let's initialize the value to cpu_possible_mask instead. This ensures that prior to the first time we get an IRQ affinity notification we'll have the mask set to include every possible CPU. We use cpu_possible_mask instead of cpu_online_mask since the former is almost certainly never going to change, while the latter might change after we've made a copy. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:09:03 -07:00
Jacob Keller	9254c0e34e	i40e: move enabling icr0 into i40e_update_enable_itr If we don't have MSI-X enabled, we handle interrupts on all icr0. This is a special case, so let's move the conditional into i40e_update_enable_itr() in order to make i40e_napi_poll easier to read. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:07:13 -07:00
Jacob Keller	ba4460d45a	i40e: remove workaround for resetting XPS Since commit `3ffa037d7f` ("i40e: Set XPS bit mask to zero in DCB mode") we've tried to reset the XPS settings by building a custom empty CPU mask. This workaround is not necessary because we're not really removing the XPS setting, but simply setting it so that no CPU is valid. Second, we shorten the code further by using zalloc_cpumask_var instead of a separate call to bitmap_zero(). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:06:02 -07:00
Carolyn Wyborny	19279235be	i40e: Fix for unused value issue found by static analysis This patch fixes an issue where an error return value is set, but without an immediate exit, the value can be overwritten by the following code execution. The condition at this point is not fatal, so remove the error assignment and comment the intent for future code maintainers Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:02:16 -07:00
Mariusz Stachura	68e49702a1	i40e: 25G FEC status improvements This patch improves the system log message. The log message will be expanded to include the FEC mode the FW requested before link was established. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:01:03 -07:00
Mariusz Stachura	8774370d26	i40e/i40evf: support for VF VLAN tag stripping control This patch gives VF capability to control VLAN tag stripping via ethtool. As rx-vlan-offload was fixed before, now the VF is able to change it using "ethtool --offload <IF> rxvlan on/off" settings. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 15:47:43 -07:00
Jacob Keller	8c9eb350aa	i40e: force VMDQ device name truncation In new versions of GCC since 7.x a new warning exists which warns when a string is truncated before all of the format can be completed. When we setup VMDQ netdev names we are copying a pre-existing interface name which could be up to 15 characters in length. Since we also add 4 bytes, v, the literal %, the d and a \0 null, we would overrun the available size unless snprintf truncated for us. The snprintf call will of course truncate on the end, so let's instead modify the code to force truncation of the copied netdev name by 4 characters, to create enough space for the 4 bytes we're adding. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 15:44:04 -07:00
Jacob Keller	696ac80aa1	i40evf: fix possible snprintf truncation of q_vector->name The q_vector names are based on the interface name with a driver prefix, the type of q_vector setup, and the queue number. We previously set the size of this variable to IFNAMSIZ + 9, which is incorrect, because we actually include a minimum of 14 characters extra beyond the interface name size. New versions of GCC since 7 include a new warning that detects this possible truncation and complains. We can fix this by increasing the size in case our interface name is too large to avoid truncation. We don't need to go beyond 14 because the compiler is smart enough to realize our values can never exceed size of 1. We do go up to 15 here because possible future changes may increase the number of queues beyond one digit. While we are here, also change some variables to be unsigned (since they are never negative) and stop using an extra unnecessary %s format specifier. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 15:43:58 -07:00
Akeem G Abodunrin	e53b382f3a	i40e: Use correct flag to enable egress traffic for unicast promisc Although we usually set true promiscuous mode for both multicast and unicast at the same time, it is possible to set them individually, so using the allmulti flag, which applies only to allmulticast, might cause unwanted behavior when mirroring egress traffic for unicast promiscuous mode in a VF. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 15:43:53 -07:00
Jacob Keller	b5d5504aa1	i40e: prevent snprintf format specifier truncation Increase the size of the prefix buffer so that it can hold enough characters for every possible input. Although 20 is enough for all expected inputs, it is possible for the values to be larger than expected, resulting in a possibly truncated string. Additionally, let's use sizeof(prefix) in order to ensure we use the correct size if we need to change the array length in the future. New versions of GCC starting at 7 now include warnings to prevent truncation unless you handle the return code. At most 27 bytes can be written here, so let's just increase the buffer size even if for all expected hw->bus.* values we only needed 20. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 15:43:41 -07:00
Mariusz Stachura	ed601f6601	i40e: Store the requested FEC information Store information about FEC modes, that were requested. It will be used in printing link status information function and this way there is no need to call admin queue there. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 15:43:34 -07:00
Sudheer Mogilappagari	167d52edc4	i40e: Update state variable for adminq subtask During NVM update, state machine gets into unrecoverable state because i40e_clean_adminq_subtask can get scheduled after the admin queue command but before other state variables are updated. This causes incorrect input to i40e_nvmupd_check_wait_event and state transitions don't happen. This fix updates the state variables so that adminq_subtask will have accurate information whenever it gets scheduled. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 15:42:53 -07:00
Sudheer Mogilappagari	2bf01935ec	i40e: synchronize nvmupdate command and adminq subtask During NVM update, state machine gets into unrecoverable state because i40e_clean_adminq_subtask can get scheduled after the admin queue command but before other state variables are updated. This causes incorrect input to i40e_nvmupd_check_wait_event and state transitions don't happen. This issue existed before but surfaced after commit `373149fc99` ("i40e: Decrease the scope of rtnl lock") This fix adds locking around admin queue command and update of state variables so that adminq_subtask will have accurate information whenever it gets scheduled. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:52:52 -07:00
Alan Brady	06b2decd92	i40e: prevent changing ITR if adaptive-rx/tx enabled Currently the driver allows the user to change (or even disable) interrupt moderation if adaptive-rx/tx is enabled when this should not be the case. Adaptive RX/TX will not respect the user's ITR settings so allowing the user to change it is weird. This bug would also allow the user to disable interrupt moderation with adaptive-rx/tx enabled which doesn't make much sense either. This patch makes it such that if adaptive-rx/tx is enabled, the user cannot make any manual adjustments to interrupt moderation. It also makes it so that if ITR is disabled but adaptive-rx/tx is then enabled, ITR will be re-enabled. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:52:46 -07:00
Jacob Keller	7e4d01e7d3	i40e: use cpumask_copy instead of direct assignment According to the header file cpumask.h, we shouldn't be directly copying a cpumask_t, since it's a bitmap and might not be copied correctly. Let's use the provided cpumask_copy() function instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:52:42 -07:00
Alan Brady	f0db789287	i40evf: use netdev variable in reset task If we're going to bother initializing a variable to reference it we might as well use it. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:52:38 -07:00
Stefan Assmann	fbb113f773	i40e/i40evf: rename vf_offload_flags to vf_cap_flags in struct virtchnl_vf_resource The current name of vf_offload_flags indicates that the bitmap is limited to offload related features. Make this more generic by renaming it to vf_cap_flags, which allows for other capabilities besides offloading to be added. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:52:29 -07:00
Jacob Keller	fcf6cfc8a6	i40e: move check for avoiding VID=0 filters into i40e_vsi_add_vlan In i40e_vsi_add_vlan we treat attempting to add VID=0 as an error, because it does not do what the caller might expect. We already special case VID=0 in i40e_vlan_rx_add_vid so that we avoid this error when adding the VLAN. This special casing is necessary so that we do not add the VLAN=0 filter since we don't want to stop receiving untagged traffic. Unfortunately, not all callers of i40e_vsi_add_vlan are aware of this, including when we add VLANs from a VF device. Rather than special casing every single caller of i40e_vsi_add_vlan, let's just move this check internally. This makes the code simpler because the caller does not need to be aware of how VLAN=0 is special, and we don't forget to add this check in new places. This fixes a harmless error message displaying when adding a VLAN from within a VF. The message was meaningless but there is no reason to confuse end users and system administrators, and this is now avoided. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:46:45 -07:00
Jacob Keller	841c950d67	i40e/i40evf: use cmpxchg64 when updating private flags in ethtool When a user gives an invalid command to change a private flag which is not supported, either because it is read-only, or the device is not capable of the feature, we simply ignore the request. A naive solution would simply be to report error codes when one of the flags was not supported. However, this causes problems because it makes the operation not atomic. If a user requests multiple private flags together at once we could end up changing one before failing at the second flag. We can do a bit better if we instead update a temporary copy of the flags variable in the loop, and then copy it into place after. If we aren't careful this has the pitfall of potentially silently overwriting any changes caused by other threads. Avoid this by using cmpxchg64 which will compare and swap the flags variable only if it currently matched the old value. We'll report -EAGAIN in the (hopefully rare!) case where the cmpxchg64 fails. This ensures that we can properly report when flags are not supported in an atomic fashion without the risk of overwriting other threads' changes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:46:34 -07:00
Anjali Singhai Jain	10a955ff62	i40e: Detect ATR HW Evict NVM issue and disable the feature This patch fixes a problem with the HW ATR eviction feature where the NVM setting was incorrect. This patch detects the issue on X720 adapters and disables the feature if the NVM setting is incorrect. Without this patch, HW ATR Evict feature does not work on broken NVMs and is not detected either. If the HW ATR Evict feature is disabled the SW Eviction feature will take effect. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:46:26 -07:00
Jacob Keller	28921a0c2f	i40e: remove workaround for Open Firmware MAC address Since commit `b499ffb0a2` ("i40e: Look up MAC address in Open Firmware or IDPROM"), we've had support for obtaining the MAC address from Open Firmware or IDPROM. This code relied on sending the Open Firmware address directly to the device firmware instead of relying on our MAC/VLAN filter list. Thus, a work around was introduced in commit `b1b15df592` ("i40e: Explicitly write platform-specific mac address after PF reset"). We refactored the Open Firmware address enablement code in the ill-named commit `41c4c2b50d` ("i40e: allow look-up of MAC address from Open Firmware or IDPROM"). Since this refactor, we no longer even set I40E_FLAG_PF_MAC. Further, we don't need this work around, because we actually store the MAC address as part of the MAC/VLAN filter hash. Thus, we will restore the address correctly upon reset. The refactor above failed to revert the workaround, so do that now. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:46:20 -07:00
Jacob Keller	d36e41dc78	i40e: separate hw_features from runtime changing flags The number of flags found in pf->flags has grown quite large, and there are a lot of different types of flags. Most of the flags are simply hardware features which are enabled on some firmware or some MAC types. Other flags are dynamic run-time flags which enable or disable certain features of the driver. Separate these two types of flags into pf->hw_features and pf->flags. The hw_features list will contain a set of features which are enabled at init time. This will not contain toggles or otherwise dynamically changing features. These flags should not need atomic protections, as they will be set once during init and then be essentially read only. Everything else will remain in the flags variable. These flags may be modified at any time during run time. A future patch may wish to convert these flags into set_bit/clear_bit/test_bit or similar approach to ensure atomic correctness. The I40E_FLAG_MFP_ENABLED flag may be a good fit for hw_features but currently is used by ethtool in the private flags settings, and thus has been left as part of flags. Additionally, I40E_FLAG_DCB_CAPABLE may be a good fit for the hw_features but this patch has not tried to untangle it yet. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:46:15 -07:00
Anjali Singhai Jain	5a433199bf	i40e: Fix a bug with VMDq RSS queue allocation The X722 pf flag setup should happen before the VMDq RSS queue count is initialized for VMDq VSI to get the right number of queues for RSS in case of X722 devices. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:46:06 -07:00
Sudheer Mogilappagari	fe2647ab0c	i40evf: prevent VF close returning before state transitions to DOWN Currently i40evf_close() can return before the state transitions to __I40EVF_DOWN because of the latency involved in processing and receiving the response from the PF driver and in scheduling the VF watchdog_task. Due to this inconsistency an immediate call to i40evf_open() fails because the state is still DOWN_PENDING. When a VF interface is in the up state and we try to add it as a slave, the bonding driver calls dev_close() and dev_open() within a short duration, resulting in dev_open returning an error. The ifenslave command needs to be run again for dev_open to succeed. This fix ensures that the watchdog timer is scheduled immediately after admin queue operations are scheduled in i40evf_down(). In addition a wait condition is added at the end of i40evf_close so that the function won't return while the state is still DOWN_PENDING. The timeout value is chosen after some profiling and includes some buffer. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:45:55 -07:00
Mitch Williams	1e3a5fd5c0	i40e/i40evf: adjust packet size to account for double VLANs Now that the kernel supports double VLAN tags, we should at least play nice. Adjust the max packet size to account for two VLAN tags, not just one. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-25 14:45:28 -07:00
Jesper Dangaard Brouer	2886447dc5	ixgbe: use return codes from ndo_xdp_xmit that are distinguishable For XDP_REDIRECT the use of return code -EINVAL is confusing, as it is used in three different cases. (1) When the index or ifindex lookup fails, and in the ixgbe driver (2) when link is down and (3) when XDP has not been enabled. The return code can be picked up by the tracepoint xdp:xdp_redirect for diagnosing why XDP_REDIRECT isn't working. Thus, there is a need for different return codes to tell the issues apart. I'm considering using a specific err-code scheme for XDP_REDIRECT instead of using these errno codes. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 11:59:37 -07:00
Jesper Dangaard Brouer	d2cee2e5d0	ixgbe: change ndo_xdp_xmit return code on xmit errors Use errno -ENOSPC ("No space left on device") when the XDP xmit has no space left on the TX ring buffer, instead of -ENOMEM. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-18 16:18:31 -07:00
Chris Mi	7f3b39dafc	net/sched: Fix the logic error to decide the ingress qdisc The offending commit used a newly added helper function. But the logic is wrong. Without this fix, the affected NICs can't do HW offload. Error -EOPNOTSUPP will be returned directly. Fixes: `a2e8da9378` ("net/sched: use newly added classid identity helpers") Signed-off-by: Chris Mi <chrism@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-18 10:29:04 -07:00
Jiri Pirko	a2e8da9378	net: sched: use newly added classid identity helpers Instead of checking handle, which does not have the inner class information and drivers wrongly assume clsact->egress as ingress, use the newly introduced classid identification helpers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-11 13:47:01 -07:00
David S. Miller	d5e7f827a6	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2017-08-08 This series contains updates to e1000e and igb/igbvf. Gangfeng Huang fixes an issue with receive network flow classification, where igb_nfc_filter_exit() was not being called in igb_down() which would cause the filter tables to "fill up" if a user were to change the adapter settings (such as speed) which requires a reset of the adapter. Cliff Spradlin fixes a timestamping issue, where igb was allowing requests for hardware timestamping even if it was not configured for hardware transmit timestamping. Corinna Vinschen removes the error message that there was an "unexpected SYS WRAP", when it is actually expected. So remove the message to not confuse users. Greg Edwards provides several patches for the mailbox interface between the PF and VF drivers. Added a mailbox unlock method to be used to unlock the PF/VF mailbox by the PF. Added a lock around the VF mailbox ops to prevent the VF from sending another message while the PF is still processing the previous message. Fixed a "scheduling while atomic" issue by changing msleep() to mdelay(). Sasha adds support for the next LOM generations i219 (v8 & v9) which will be available in the next Intel client platform IceLake. John Linville adds support for a Broadcom PHY to the igb driver, since there are designs out in the world which use the igb MAC and a third party PHY. This allows the driver to load and function as expected on these designs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-09 16:47:19 -07:00
David S. Miller	3118e6e19d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The UDP offload conflict is dealt with by simply taking what is in net-next where we have removed all of the UFO handling code entirely. The TCP conflict was a case of local variables in a function being removed from both net and net-next. In netvsc we had an assignment right next to where a missing set of u64 stats sync object inits were added. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-09 16:28:45 -07:00
John W Linville	eeb0149660	igb: support BCM54616 PHY The management port on an Edgecore AS7712-32 switch uses an igb MAC, but it uses a BCM54616 PHY. Without a patch like this, loading the igb module produces dmesg output like this: [ 3.439125] igb: Copyright (c) 2007-2014 Intel Corporation. [ 3.439866] igb: probe of 0000:00:14.0 failed with error -2 Signed-off-by: John W Linville <linville@tuxdriver.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 18:09:12 -07:00
Greg Edwards	d466124860	igbvf: convert msleep to mdelay in atomic context This fixes a "scheduling while atomic" splat seen with CONFIG_DEBUG_ATOMIC_SLEEP enabled. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 18:08:00 -07:00
Greg Edwards	0d3ee0d925	igbvf: after mailbox write, wait for reply Two of the VF mailbox commands were not waiting for a reply from the PF, which can result in a VF mailbox timeout in the VM for the next command. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 18:01:42 -07:00
Greg Edwards	32652c2ac2	igbvf: add lock around mailbox ops The PF driver assumes the VF will not send another mailbox message until the PF has written its reply to the previous message. If the VF does, that message will be silently dropped by the PF before it writes its reply to the mailbox. This results in a VF mailbox timeout for posted messages waiting for an ACK, and the VF is reset by the igbvf_watchdog_task in the VM. Add a lock around the VF mailbox ops to prevent the VF from sending another message while the PF is still processing the previous one. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:59:03 -07:00
Sasha Neftin	48f76b68f9	e1000e: Initial Support for IceLake i219 (8) and i219 (9) are the next LOM generations that will be available on the next Intel Client platform (IceLake). This patch provides the initial support for these devices. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:54:21 -07:00
Greg Edwards	46b3bb9b47	igb: do not drop PF mailbox lock after read of VF message When the PF receives a mailbox message from the VF, it grabs the mailbox lock, reads the VF message from the mailbox, ACKs the message and drops the lock. While the PF is performing the action for the VF message, nothing prevents another VF message from being posted to the mailbox. The current code handles this condition by just dropping any new VF messages without processing them. This results in a mailbox timeout in the VM for posted messages waiting for an ACK, and the VF is reset by the igbvf_watchdog_task in the VM. Given the right sequence of VF messages and mailbox timeouts, this condition can go on ad infinitum. Modify the PF mailbox read method to take an 'unlock' argument that optionally leaves the mailbox locked by the PF after reading the VF message. This ensures another VF message is not posted to the mailbox until after the PF has completed processing the VF message and written its reply. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:52:45 -07:00
Greg Edwards	1a6c4a3b1e	igb: expose mailbox unlock method Add a mailbox unlock method to e1000_mbx_operations, which will be used to unlock the PF/VF mailbox by the PF. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:50:46 -07:00
Greg Edwards	09fc97ba3e	igb: add argument names to mailbox op function declarations Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:48:10 -07:00
Corinna Vinschen	2643e6e902	igb: Remove incorrect "unexpected SYS WRAP" log message TSAUXC.DisableSystime is never set, so SYSTIM runs into a SYS WRAP every 1100 secs on 80580/i350/i354 (40 bit SYSTIM) and every 35000 secs on 80576 (45 bit SYSTIM). This wrap event sets the TSICR.SysWrap bit unconditionally. However, checking TSIM at interrupt time shows that this event does not actually cause the interrupt. Rather, it's just bycatch while the actual interrupt is caused by, for instance, TSICR.TXTS. The conclusion is that the SYS WRAP is actually expected, so the "unexpected SYS WRAP" message is entirely bogus and just helps to confuse users. Drop it. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:46:25 -07:00
Gustavo A R Silva	d75372a2da	e1000e: add check on e1e_wphy() return value Check return value from call to e1e_wphy(). This value is being checked during previous calls to function e1e_wphy() and it seems a check was missing here. Addresses-Coverity-ID: 1226905 Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:44:53 -07:00
Cliff Spradlin	26bd4e2db0	igb: protect TX timestamping from API misuse HW timestamping can only be requested for a packet if the NIC is first setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the igb driver still allowed TX packets to request HW timestamping. In this situation, the _IGB_PTP_TX_IN_PROGRESS flag was set and would never clear. This prevented any future HW timestamping requests to succeed. Fix this by checking that the NIC is configured for HW TX timestamping before accepting a HW TX timestamping request. Signed-off-by: Cliff Spradlin <cspradlin@google.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:43:18 -07:00
Gangfeng Huang	94221ae75c	igb: Fix error of RX network flow classification After adding an ethertype filter, if the user changes the adapter speed several times, the igb driver reports the error "ethtool -N: etype filters are all used". In the older patch, igb_nfc_filter_exit() and igb_nfc_filter_restore() are not paired: igb_nfc_filter_restore() is called in igb_up(), but igb_nfc_filter_exit() is called in __igb_close(). During a speed change only igb_nfc_filter_restore() is called, and each call takes up another position in the ethertype bitmap. Reproduce steps: Step 1: Add an etype filter with ethtool $ethtool -N eth0 flow-type ether proto 0x88F8 action 1 Step 2: Change the adapter speed to 100M/full duplex $ethtool -s eth0 speed 100 duplex full Step 3: Change the adapter speed to 1000M/full duplex $ethtool -s eth0 speed 1000 duplex full Repeat steps 2 and 3, then check the system log with dmesg; the error message appears, and adding a new ethertype filter also fails. The fix moves igb_nfc_filter_exit() from __igb_close() to igb_down() so that igb_nfc_filter_restore() and igb_nfc_filter_exit() are paired. Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-08 17:28:29 -07:00
Jiri Pirko	de4784ca03	net: sched: get rid of struct tc_to_netdev Get rid of struct tc_to_netdev which is now just unnecessary container and rather pass per-type structures down to drivers directly. Along with that, consolidate the naming of per-type structure variables in cls_*. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	38cf0426e5	net: sched: change return value of ndo_setup_tc for driver supporting mqprio only Change the return value from -EINVAL to -EOPNOTSUPP. The rest of the drivers have it like that, so be aligned. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	5fd9fc4e20	net: sched: push cls related args into cls_common structure As ndo_setup_tc is a generic offload op for the whole tc subsystem, it does not really make sense to have cls-specific args. So move them under the cls_common structure which is embedded in all cls structs. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	bc32afdb2b	ixgbe: push cls_u32 and mqprio setup_tc processing into separate functions Let __ixgbe_setup_tc be a splitter for specific setup_tc types and push out cls_u32 and mqprio specific codes into separate functions. Also change the return values so they are the same as in the rest of the drivers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:36 -07:00
Jiri Pirko	2572ac53c4	net: sched: make type an argument for ndo_setup_tc Since the type is always present, push it to be a separate argument to ndo_setup_tc. On the way, name the type enum and use it for arg type. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:35 -07:00
Jiri Pirko	3bcc0cec81	net: sched: change names of action number helpers to be aligned with the rest The rest of the helpers are named tcf_exts_*, so change the name of the action number helpers to be aligned. While at it, change to inline functions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-04 11:21:23 -07:00
Florian Fainelli	7c3a4626eb	ixgbe: Initialize 64-bit stats seqcounts On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: `4197aa7bb8` ("ixgbevf: provide 64 bit statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:07 -07:00
Florian Fainelli	7d6d067790	i40e: Initialize 64-bit statistics TX ring seqcount On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: `980e9b1186` ("i40e: Add support for 64 bit netstats") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:06 -07:00
David S. Miller	e27a879271	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-07-25 This series contains updates to i40e and i40evf only. Gustavo Silva fixes a variable assignment, where the incorrect variable was being used to store the error parameter. Carolyn provides a fix for a problem found in systems when entering S4 state, by ensuring that the misc vector's IRQ is disabled as well. Jake removes the single-threaded restriction on the module workqueue, which was causing issues with events such as CORER. Does some future proofing, by changing how the driver displays the UDP tunnel type. Paul adds a retry in releasing resources if the admin queue times out during the first attempt to release the resources. Jesse fixes up references to 32bit timespec, since there are a small set of errors on 32 bit, so we need to be using the right calls for dealing with timespec64 variables. Cleaned up code indentation and corrected an "if" conditional check, as well as making the code flow more clear. Cast or changed the types to remove warnings for comparing signed and unsigned types. Adds missing includes in i40evf, which were being used but were not being directly included. Daniel Borkmann fixes i40e to fill the XDP prog_id with the id just like other XDP enabled drivers, so that on dump we can retrieve the attached program based on the id and dump BPF insns, opcodes, etc back to user space. Tushar Dave adds le32_to_cpu while evaluating the hardware descriptor fields, since they are in little-endian format. Also removed unnecessary "__packed" from a couple of i40evf structures. Stefan Assmann fixes an issue when an administratively set MAC was set and should now be switched back to 00:00:00:00:00:00, the pf_set_mac flag is not being toggled back to false. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-26 16:58:45 -07:00
Stefan Assmann	2f1d86e44c	i40e: handle setting administratively set MAC address back to zero When an administratively set MAC was previously set and should now be switched back to 00:00:00:00:00:00 the pf_set_mac flag did not get toggled back to false. As a result VFs were still treated as if an administratively set MAC was present. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:21 -07:00
Tushar Dave	07c357f348	i40evf: remove unnecessary __packed This is similar to 'commit `9588397d24` ("i40e: remove unnecessary __packed")' to avoid unaligned access. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:21 -07:00
Tushar Dave	c969ef4ed9	i40evf: Use le32_to_cpu before evaluating HW desc fields i40e hardware descriptor fields are in little-endian format. Driver must use le32_to_cpu while evaluating these fields otherwise on big-endian arch we end up evaluating incorrect values, cause errors like: i40evf 0000:03:0a.0: Expected response 24 from PF, received 402653184 i40evf 0000:03:0a.1: Expected response 7 from PF, received 117440512 Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:21 -07:00
Daniel Borkmann	eb23039f6c	i40e: report BPF prog id during XDP_QUERY_PROG Fill the XDP prog_id with the id just like we do in other XDP enabled drivers such as ixgbe. This is needed so that on dump we can retrieve the attached program based on the id, and dump BPF insns, opcodes, etc back to user space. Only XDP driver missing this is currently i40e. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:21 -07:00
Jesse Brandeburg	8d9ee66ac0	i40evf: add some missing includes These includes were all being used in the driver, but weren't being directly included. Since the current advised method is to directly include anything that you need, this implements that. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:20 -07:00
Jacob Keller	d8b2c700a3	i40e: display correct UDP tunnel type name The i40e driver attempts to display the UDP tunnel name by doing a check against the type, where for non-zero types we use "vxlan" and for zero type we use "geneve". This is not future proof, because if new tunnel types get added, we'll incorrectly label them. It also depends on the value of UDP_TUNNEL_TYPE_GENEVE == 0, which is brittle. Instead, replace this with a function that can return a constant string depending on the type. For now we'll use "unknown" for types we don't know about, and we can expand this in the future if new types get added. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:20 -07:00
Jesse Brandeburg	b85c94b617	i40e/i40evf: remove mismatched type warnings Compiler reported several places where driver compared signed and unsigned types. Cast or change the types to remove the warnings. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:20 -07:00
Jesse Brandeburg	601a2e7ac5	i40e/i40evf: make IPv6 ATR code clearer This just reorders some local vars and makes the code flow clearer. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:20 -07:00
Jesse Brandeburg	4d433084dd	i40e: fix odd formatting and indent The compiler warned on an oddly indented bit of code, and when investigating that, noted that the functions themselves had an odd flow. The if condition was checked, and would exclude a call to AQ, but then the aq_ret would be checked unconditionally which just looks really weird, and is likely to cause objections. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:19 -07:00
Jesse Brandeburg	0ac30ce433	i40e: fix up 32 bit timespec references As it turns out there was only a small set of errors on 32 bit, and we just needed to be using the right calls for dealing with timespec64 variables. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:19 -07:00
Paul M Stillwell Jr	981e25c32b	i40e: Handle admin Q timeout when releasing NVM There are some rare cases where the release resource call will return an admin Q timeout. In these cases the code needs to try to release the resource again until it succeeds or it times out. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:19 -07:00
Jacob Keller	4d5957cbde	i40e: remove WQ_UNBOUND and the task limit of our workqueue During certain events such as a CORER, multiple devices will run a work task to handle some cleanup. This can cause issues due to a single-threaded workqueue which can mean that a device doesn't cleanup in time. Prevent this by removing the single-threaded restriction on the module workqueue. This avoids the need to add more complex yielding logic in our service task routine. This is also similar to what other drivers such as fm10k do. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:19 -07:00
Carolyn Wyborny	7c9ae7f053	i40e: Fix for trace found with S4 state This patch fixes a problem found in systems when entering S4 state. This patch fixes the problem by ensuring that the misc vector's IRQ is disabled as well. Without this patch a stack trace can be seen upon entering S4 state. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:19 -07:00
Gustavo A R Silva	db1a8f8e83	i40e: fix incorrect variable assignment Fix incorrect variable assignment. Based on line 1511: aq_ret = I40_ERR_PARAM; the correct variable to be used in this instance is aq_ret instead of ret. Also, variable ret is updated at line 1602 just before return, so assigning a value to this variable in this code block is useless. Addresses-Coverity-ID: 1397693 Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:18 -07:00
Tony Nguyen	7adbccbbb5	ixgbe: Disable flow control for XFI Flow control autonegotiation is not supported for XFI. Make sure that ixgbe_device_supports_autoneg_fc() returns false and hw->fc.disable_fc_autoneg is set to true to avoid running the fc_autoneg function for that device. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-25 16:31:51 -07:00
Tony Nguyen	ae84dbf7ff	ixgbe: Do not support flow control autonegotiation for X553 Flow control autonegotiation is not supported for fiber on X553. Add device ID checks in ixgbe_device_supports_autoneg_fc() to return the appropriate value. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-25 16:31:46 -07:00
Tony Nguyen	48301cf22f	ixgbe: Update NW_MNG_IF_SEL support for X553 The MAC register NW_MNG_IF_SEL fields have been redefined for X553. These changes impact the iXFI driver code flow. Since iXFI is only supported in X552, add MAC checks for iXFI flows. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-25 16:31:42 -07:00
Tony Nguyen	72f740b101	ixgbe: Enable LASI interrupts for X552 devices Enable LASI interrupts on X552 devices in order to receive notifications of link configurations of the external PHY and support the configuration of the internal iXFI link since iXFI does not support auto-negotiation. This is not required for X553 devices; add a check to avoid enabling LASI interrupts for X553 devices. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-25 16:31:38 -07:00
Tony Nguyen	0e1ff3061c	ixgbe: Ensure MAC filter was added before setting MACVLAN This patch adds a check to ensure that adding the MAC filter was successful before setting the MACVLAN. If it was unsuccessful, propagate the error. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-25 16:27:12 -07:00
John Fastabend	11393cc9b9	xdp: Add batching support to redirect map For performance reasons we want to avoid updating the tail pointer in the driver tx ring as much as possible. To accomplish this we add batching support to the redirect path in XDP. This adds another ndo op "xdp_flush" that is used to inform the driver that it should bump the tail pointer on the TX ring. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:48:06 -07:00
John Fastabend	5acaee0a89	xdp: add trace event for xdp redirect This adds a trace event for xdp redirect which may help when debugging XDP programs that use redirect bpf commands. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:48:06 -07:00
John Fastabend	6453073987	ixgbe: add initial support for xdp redirect There are optimizations we can add after the basic feature is enabled. But, for now keep the patch simple. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:48:06 -07:00
John Fastabend	90382dca61	ixgbe: NULL xdp_tx rings on resource cleanup tx_rings and rx_rings are cleaned up on close paths in ixgbe driver however, xdp_rings are not. Set the xdp_rings to NULL here so that we can use the pointer to indicate if the XDP rings are initialized. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:48:05 -07:00
Linus Torvalds	f263fbb8d6	pci-v4.13-changes Merge tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - add sysfs max_link_speed/width, current_link_speed/width (Wong Vee Khee) - make host bridge IRQ mapping much more generic (Matthew Minter, Lorenzo Pieralisi) - convert most drivers to pci_scan_root_bus_bridge() (Lorenzo Pieralisi) - mutex sriov_configure() (Jakub Kicinski) - mutex pci_error_handlers callbacks (Christoph Hellwig) - split ->reset_notify() into ->reset_prepare()/reset_done() (Christoph Hellwig) - support multiple PCIe portdrv interrupts for MSI as well as MSI-X (Gabriele Paoloni) - allocate MSI/MSI-X vector for Downstream Port Containment (Gabriele Paoloni) - fix MSI IRQ affinity pre/post/min_vecs issue (Michael Hernandez) - test INTx masking during enumeration, not at run-time (Piotr Gregor) - avoid using device_may_wakeup() for runtime PM (Rafael J. Wysocki) - restore the status of PCI devices across hibernation (Chen Yu) - keep parent resources that start at 0x0 (Ard Biesheuvel) - enable ECRC only if device supports it (Bjorn Helgaas) - restore PRI and PASID state after Function-Level Reset (CQ Tang) - skip DPC event if device is not present (Keith Busch) - check domain when matching SMBIOS info (Sujith Pandel) - mark Intel XXV710 NIC INTx masking as broken (Alex Williamson) - avoid AMD SB7xx EHCI USB wakeup defect (Kai-Heng Feng) - work around long-standing Macbook Pro poweroff issue (Bjorn Helgaas) - add Switchtec "running" status flag (Logan Gunthorpe) - fix dra7xx incorrect RW1C IRQ register usage (Arvind Yadav) - modify xilinx-nwl IRQ chip for legacy interrupts (Bharat Kumar Gogada) - move VMD SRCU cleanup after bus, child device removal (Jon Derrick) - add Faraday clock handling (Linus Walleij) - configure Rockchip MPS and reorganize (Shawn Lin) - limit Qualcomm TLP size to 2K (hardware issue) (Srinivas Kandagatla) - support Tegra MSI 64-bit addressing (Thierry Reding) - use Rockchip normal (not privileged) register bank (Shawn Lin) - add HiSilicon Kirin SoC PCIe controller driver (Xiaowei Song) - add Sigma Designs Tango SMP8759 PCIe controller driver (Marc Gonzalez) - add MediaTek PCIe host controller support (Ryder Lee) - add Qualcomm IPQ4019 support (John Crispin) - add HyperV vPCI protocol v1.2 support (Jork Loeser) - add i.MX6 regulator support (Quentin Schulz) * tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits) PCI: tango: Add Sigma Designs Tango SMP8759 PCIe host bridge support PCI: Add DT binding for Sigma Designs Tango PCIe controller PCI: rockchip: Use normal register bank for config accessors dt-bindings: PCI: Add documentation for MediaTek PCIe PCI: Remove __pci_dev_reset() and pci_dev_reset() PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done() PCI: xilinx: Make of_device_ids const PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts PCI: vmd: Move SRCU cleanup after bus, child device removal PCI: vmd: Correct comment: VMD domains start at 0x10000, not 0x1000 PCI: versatile: Add local struct device pointers PCI: tegra: Do not allocate MSI target memory PCI: tegra: Support MSI 64-bit addressing PCI: rockchip: Use local struct device pointer consistently PCI: rockchip: Check for clk_prepare_enable() errors during resume MAINTAINERS: Remove Wenrui Li as Rockchip PCIe driver maintainer PCI: rockchip: Configure RC's MPS setting PCI: rockchip: Reconfigure configuration space header type PCI: rockchip: Split out rockchip_pcie_cfg_configuration_accesses() PCI: rockchip: Move configuration accesses into rockchip_pcie_cfg_atu() ...	2017-07-08 15:51:57 -07:00
Christoph Hellwig	775755ed3c	PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done() The pci_error_handlers->reset_notify() method had a flag to indicate whether to prepare for or clean up after a reset. The prepare and done cases have no shared functionality whatsoever, so split them into separate methods. [bhelgaas: changelog, update locking comments] Link: http://lkml.kernel.org/r/20170601111039.8913-3-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-07-03 07:58:30 -05:00
Jacob Keller	dfc4ff6446	i40e: don't hold RTNL lock for the entire reset We recently refactored i40e_do_reset() and its friends to be able to hold the RTNL lock only for the portions that actually need to be protected. However, a separate refactoring added several new callers of these functions during the PCIe error recovery and suspend/resume cycles. When merging the changes together, it was not noticed that we could reduce the RTNL scope by letting the reset function handle the lock itself, as previously it was not possible. Fix this by replacing these call sites to indicate that the reset function should handle its own lock. This enables multiple PFs to reset or resume simultaneously without serializing the resets via the RTNL lock. The end result is that on systems with lots of PFs and VFs the resets don't stall waiting for each other to finish. It is probable that we can also do the same for i40e_do_reset_safe, but this author did not research that change carefully enough to be confident. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:12 -07:00
Catherine Sullivan	7642984b08	i40e: Handle PE_CRITERR properly with IWARP enabled When IWARP is enabled, we weren't clearing the PE_CRITERR, just logging it and removing it from the mask. We need to do a corer to reset the PE_CRITERR register, so set the bit for that as we handle the interrupt. We should also be checking for the error against the PFINT_ICR0 register, and only need to clear it in the value getting written to PFINT_ICR0_ENA. Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:12 -07:00
Shannon Nelson	2e5c26ea0d	i40e: clear only cause_ena bit When disabling interrupts, we should only be clearing the CAUSE_ENA bit, not clearing the whole register. Clearing the whole register sets the NEXTQ_IDX field to 0 instead of 0x7ff which can confuse the Firmware in some reset sequences. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:12 -07:00
Alan Brady	e588723986	i40e: fix disabling overflow promiscuous mode There exists a bug in which the driver does not correctly exit overflow promiscuous mode. This can occur if "too many" mac filters are added, putting the driver into overflow promiscuous mode, and the filters are then removed. When the failed filters are removed, the driver reports exiting overflow promiscuous mode which is correct, however traffic continues to be received as if in promiscuous mode still. The bug occurs because the conditional for toggling promiscuous mode was set to only execute when promiscuous mode was enabled and not when it was disabled as well. This patch fixes the conditional to correctly execute when promiscuous mode is toggled and not just enabled. Without this patch, the driver is unable to correctly exit overflow promiscuous mode. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:11 -07:00
Filip Sadowski	5bbb2e2045	i40e: Add support for OEM firmware version This patch adds support for OEM firmware version. If OEM specific adapter is detected ethtool reports OEM product version in firmware version string instead of etrack id. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:11 -07:00
Shannon Nelson	4fc8c67639	i40e: genericize the partition bandwidth control Partition bandwidth control is not in just one form of MFP (multi-function partitioning), so make the code more generic and be sure to nudge the Tx scheduler for all MFP. Copyright updated to 2017. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:11 -07:00
Carolyn Wyborny	83d14c595e	i40e: Add message for unsupported MFP mode This patch adds a check and message if the device is in MFP mode as changing RSS input set is not supported in MFP mode. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:11 -07:00
Greg Bowers	68fb13a767	i40e: Support firmware CEE DCB UP to TC map re-definition Changes parsing of FW 4.33 AQ command Get CEE DCBX OPER CFG (0x0A07). Change is required because FW now creates the oper_prio_tc nibbles reversed from those in the CEE Priority Group sub-TLV. This change will only apply to FW 4.33 as future FW versions will use a different function to parse the CEE data. Signed-off-by: Greg Bowers <gregory.j.bowers@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:10 -07:00
Sudheer Mogilappagari	1e99854715	i40e: Fix potential out of bound array access This is a fix for the static code analysis issue where dcbcfg->numapps could be greater than size of array (i.e. dcbcfg->app[I40E_DCBX_MAX_APPS]). The fix makes sure that the array is not accessed past the size of the array (i.e. I40E_DCBX_MAX_APPS). Copyright updated to 2017. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:10 -07:00
Jacob Keller	15d23b4c36	i40e: comment that udp_port must be in host byte order The firmware expects the port number passed when setting up the UDP tunnel configuration to be in Little Endian format. The i40e_aq_add_udp_tunnel command byte swaps the value from host order to Little Endian. Since commit `fe0b0cd97b` ("i40e: send correct port number to AdminQ when enabling UDP tunnels") we've correctly sent the value in host order. Let's also add a comment to the function explaining that it must be in host order, as the port numbers are commonly stored as Big Endian values. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:10 -07:00
Jacob Keller	59e331e36e	i40e: use dev_dbg instead of dev_info when warning about missing routine When searching for the vf_capability client routine, dev_info() was used, instead of the normal dev_dbg(). This causes the message to be displayed at standard log levels which can cause administrators to worry. Avoid this by using dev_dbg instead. Copyright updated to 2017. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:10 -07:00
Alice Michael	7c32b1e650	i40e/i40evf: update WOL and I40E_AQC_ADDR_VALID_MASK flags Update a few flags related to FW interactions. Copyright updated to 2017. Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:10 -07:00
Jacob Keller	65c7006f23	i40evf: assign num_active_queues inside i40evf_alloc_queues The variable num_active_queues represents the number of active queues we have for the device. We assign this pretty early in i40evf_init_subtask. Several code locations are written with loops over the tx_rings and rx_rings structures, which don't get allocated until i40evf_alloc_queues, and which get freed by i40evf_free_queues. These call sites were written under the assumption that tx_rings and rx_rings would always be allocated at least when num_active_queues is non-zero. Let's fix this by moving the assignment into the function where we allocate queues. We'll use a temporary variable for storage so that we don't assign the value in the adapter structure until after the rings have been set up. Finally, when we free the queues, we'll clear the value to ensure that we do not loop over the rings memory that no longer exists. This resolves a possible NULL pointer dereference in i40evf_get_ethtool_stats which could occur if the VF fails to recover from a reset, and then a user requests statistics. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:09 -07:00
Björn Töpel	74608d17fe	i40e: add support for XDP_TX action This patch adds proper XDP_TX action support. For each Tx ring, an additional XDP Tx ring is allocated and setup. This version does the DMA mapping in the fast-path, which will penalize performance for IOMMU enabled systems. Further, debugfs support is not wired up for the XDP Tx rings. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:09 -07:00
Björn Töpel	0c8493d90b	i40e: add XDP support for pass and drop actions This commit adds basic XDP support for i40e derived NICs. All XDP actions will end up in XDP_DROP. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:09 -07:00
Martin KaFai Lau	4792093edd	bpf: ixgbe: Report bpf_prog ID during XDP_QUERY_PROG Add support to ixgbe to report bpf_prog ID during XDP_QUERY_PROG. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:58:37 -04:00
Johannes Berg	4df864c1d9	networking: make skb_put & friends return void pointers It seems like a historic accident that these return unsigned char *, and in many places that means casts are required, more often than not. Make these functions (skb_put, __skb_put and pskb_put) return void * and remove all the casts across the tree, adding a (u8 *) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_put, __skb_put }; @@ - *(fn(SKB, LEN)) + *(u8 *)fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_put, __skb_put }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) which actually doesn't cover pskb_put since there are only three users overall. A handful of stragglers were converted manually, notably a macro in drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many instances in net/bluetooth/hci_sock.c. In the former file, I also had to fix one whitespace problem spatch introduced. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:48:39 -04:00
Johannes Berg	59ae1d127a	networking: introduce and use skb_put_data() A common pattern with skb_put() is to just want to memcpy() some data into the new space, introduce skb_put_data() for this. An spatch similar to the one for skb_put_zero() converts many of the places using it: @@ identifier p, p2; expression len, skb, data; type t, t2; @@ ( -p = skb_put(skb, len); +p = skb_put_data(skb, data, len); | -p = (t)skb_put(skb, len); +p = skb_put_data(skb, data, len); ) ( p2 = (t2)p; -memcpy(p2, data, len); | -memcpy(p, data, len); ) @@ type t, t2; identifier p, p2; expression skb, data; @@ t *p; ... ( -p = skb_put(skb, sizeof(t)); +p = skb_put_data(skb, data, sizeof(t)); | -p = (t *)skb_put(skb, sizeof(t)); +p = skb_put_data(skb, data, sizeof(t)); ) ( p2 = (t2)p; -memcpy(p2, data, sizeof(*p)); | -memcpy(p, data, sizeof(*p)); ) @@ expression skb, len, data; @@ -memcpy(skb_put(skb, len), data, len); +skb_put_data(skb, data, len); (again, manually post-processed to retain some comments) Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-16 11:48:37 -04:00
David S. Miller	0ddead90b2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-15 11:59:32 -04:00
Jia-Ju Bai	640f93cc6e	i40e: Fix a sleep-in-atomic bug The driver may sleep under a spin lock, and the function call path is: i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh) i40e_vsi_remove_pvid i40e_vlan_stripping_disable i40e_aq_update_vsi_params i40e_asq_send_command mutex_lock --> may sleep To fix it, the spin lock is released before "i40e_vsi_remove_pvid", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-14 23:45:22 -04:00
Jeff Mahoney	a09c0fc3f5	ixgbe: pci_set_drvdata must be called before register_netdev We call pci_set_drvdata immediately after calling register_netdev, which leaves a window where tasks writing to the sriov_numvfs sysfs attribute can sneak in and crash the kernel. register_netdev cleans up after itself so placing pci_set_drvdata immediately before it should preserve the intent of commit `0fb6a55cc3` ("ixgbe: fix crash on rmmod after probe fail"). Fixes: `0fb6a55cc3` ("ixgbe: fix crash on rmmod after probe fail") Signed-off-by: Jeff Mahoney <jeffm@suse.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 17:36:39 -07:00
Tony Nguyen	4ebdf8af30	ixgbe: Resolve cppcheck format string warning cppcheck warns that the format string is incorrect in the function ixgbe_get_strings(). Since the value cannot be negative, change the variable to unsigned which matches the format specifier. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 17:36:29 -07:00
Emil Tantilov	d28b194955	ixgbe: fix writes to PFQDE ixgbe_write_qde() was ignoring the qde parameter which resulted in PFQDE.HIDE_VLAN not being set for X550. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 16:11:48 -07:00
Tony Nguyen	adc2c83e2b	ixgbevf: Bump version number Update ixgbevf version number. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 16:11:48 -07:00
Tony Nguyen	01ec5525fc	ixgbe: Bump version number Update ixgbe version number. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 16:11:48 -07:00
Jacob Keller	622a2ef538	ixgbe: check for Tx timestamp timeouts during watchdog The ixgbe driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an ixgbe_ptp_tx_hang() function similar to the already existing ixgbe_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 16:11:48 -07:00
Jacob Keller	4cc74c01ef	ixgbe: add statistic indicating number of skipped Tx timestamps The ixgbe driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 16:11:48 -07:00
Jacob Keller	5fef124d9c	ixgbe: avoid permanent lock of *_PTP_TX_IN_PROGRESS The ixgbe driver uses a state bit lock to avoid handling more than one Tx timestamp request at once. This is required because hardware is limited to a single set of registers for Tx timestamps. The state bit lock is not properly cleaned up during ixgbe_xmit_frame_ring() if the transmit fails such as due to DMA or TSO failure. In some hardware this results in blocking timestamps until the service task times out. In other hardware this results in a permanent lock of the timestamp bit because we never receive an interrupt indicating the timestamp occurred, since indeed the packet was never transmitted. Fix this by checking for DMA and TSO errors in ixgbe_xmit_frame_ring() and properly cleaning up after ourselves when these occur. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 16:11:48 -07:00
Jacob Keller	aaebaf50b5	ixgbe: fix race condition with PTP_TX_IN_PROGRESS bits Hardware related to the ixgbe driver is limited to handling a single Tx timestamp request at a time. Thus, the driver ignores requests for Tx timestamp while waiting for the current request to finish. It uses a state bit lock which enforces that only one timestamp request is honored at a time. Unfortunately this suffers from a simple race condition. The bit lock is not cleared until after skb_tstamp_tx() is called notifying applications of a new Tx timestamp. Even a well behaved application sending only one packet at a time and waiting for a response can wake up and send a new packet before the bit lock is cleared. This results in needlessly dropping some Tx timestamp requests. We can fix this by unlocking the state bit as soon as we read the Timestamp register, as this is the first point at which it is safe to unlock. To avoid issues with the skb pointer, we'll use a copy of the pointer and set the global variable in the driver structure to NULL first. This ensures that the next timestamp request does not modify our local copy of the skb pointer. This ensures that well behaved applications do not accidentally race with the unlock bit. Obviously an application which sends multiple Tx timestamp requests at once will still only timestamp one packet at a time. Unfortunately there is nothing we can do about this. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-13 16:11:47 -07:00
Jacob Keller	6964e53f55	i40e: fix handling of HW ATR eviction A recent commit to refactor the driver and remove the hw_disabled_flags field accidentally introduced two regressions. First, we overwrote pf->flags which removed various key flags including the MSI-X settings. Additionally, it was intended that we now have two flags, HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done, and we accidentally were mis-using HW_ATR_EVICT_CAPABLE everywhere. This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely updates pf->flags instead of overwriting it. Without this patch we will have many problems including disabling MSI-X support, and we'll attempt to use HW ATR eviction on devices which do not support it. Fixes: `47994c119a` ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-12 18:53:02 -04:00
David S. Miller	3948b57bd5	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2017-06-07 This series contains a fix for e1000e and igb. Colin Ian King fixes sparse warnings in igb by making functions static. Chris Wilson provides a fix for a previous commit which is causing an issue during suspend "e1000e_pm_suspend()", where we need to run e1000e_pm_thaw() if __e1000_shutdown() is unsuccessful. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 14:41:19 -04:00
Jiri Pirko	a5fcf8a6c9	net: propagate tc filter chain index down the ndo_setup_tc call We need to push the chain index down to the drivers, so they have the information to which chain the rule belongs. For now, no driver supports multichain offload, so only chain 0 is supported. This is needed to prevent chain squashes during offload for now. Later this will be used to implement multichain offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 09:55:53 -04:00
Chris Wilson	833521ebc6	e1000e: Undo e1000e_pm_freeze if __e1000_shutdown fails An error during suspend (e1000e_pm_suspend), [ 429.994338] ACPI : EC: event blocked [ 429.994633] e1000e: EEE TX LPI TIMER: 00000011 [ 430.955451] pci_pm_suspend(): e1000e_pm_suspend+0x0/0x30 [e1000e] returns -2 [ 430.955454] dpm_run_callback(): pci_pm_suspend+0x0/0x140 returns -2 [ 430.955458] PM: Device 0000:00:19.0 failed to suspend async: error -2 [ 430.955581] PM: Some devices failed to suspend, or early wake event detected [ 430.957709] ACPI : EC: event unblocked led to complete failure: [ 432.585002] ------------[ cut here ]------------ [ 432.585013] WARNING: CPU: 3 PID: 8372 at kernel/irq/manage.c:1478 __free_irq+0x9f/0x280 [ 432.585015] Trying to free already-free IRQ 20 [ 432.585016] Modules linked in: cdc_ncm usbnet x86_pkg_temp_thermal intel_powerclamp coretemp mii crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep lpc_ich snd_hda_core snd_pcm mei_me mei sdhci_pci sdhci i915 mmc_core e1000e ptp pps_core prime_numbers [ 432.585042] CPU: 3 PID: 8372 Comm: kworker/u16:40 Tainted: G U 4.10.0-rc8-CI-Patchwork_3870+ #1 [ 432.585044] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 07/02/2012 [ 432.585050] Workqueue: events_unbound async_run_entry_fn [ 432.585051] Call Trace: [ 432.585058] dump_stack+0x67/0x92 [ 432.585062] __warn+0xc6/0xe0 [ 432.585065] warn_slowpath_fmt+0x4a/0x50 [ 432.585070] ? _raw_spin_lock_irqsave+0x49/0x60 [ 432.585072] __free_irq+0x9f/0x280 [ 432.585075] free_irq+0x34/0x80 [ 432.585089] e1000_free_irq+0x65/0x70 [e1000e] [ 432.585098] e1000e_pm_freeze+0x7a/0xb0 [e1000e] [ 432.585106] e1000e_pm_suspend+0x21/0x30 [e1000e] [ 432.585113] pci_pm_suspend+0x71/0x140 [ 432.585118] dpm_run_callback+0x6f/0x330 [ 432.585122] ? pci_pm_freeze+0xe0/0xe0 [ 432.585125] __device_suspend+0xea/0x330 [ 432.585128] async_suspend+0x1a/0x90 [ 432.585132] async_run_entry_fn+0x34/0x160 [ 432.585137] process_one_work+0x1f4/0x6d0 [ 432.585140] ? process_one_work+0x16e/0x6d0 [ 432.585143] worker_thread+0x49/0x4a0 [ 432.585145] kthread+0x107/0x140 [ 432.585148] ? process_one_work+0x6d0/0x6d0 [ 432.585150] ? kthread_create_on_node+0x40/0x40 [ 432.585154] ret_from_fork+0x2e/0x40 [ 432.585156] ---[ end trace 6712df7f8c4b9124 ]--- The unwind failure stems from commit `2800209994` ("e1000e: Refactor PM flows"), but it may be a later patch that introduced the non-recoverable behaviour. Fixes: `2800209994` ("e1000e: Refactor PM flows") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-07 20:45:58 -07:00
Colin Ian King	b476deab8f	igb: make a few local functions static Clean up a few sparse warnings; the following functions can be made static: drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_add_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_del_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_set_vf_mac_filter' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-07 19:05:42 -07:00
David S. Miller	216fe8f021	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Just some simple overlapping changes in marvell PHY driver and the DSA core code. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 22:20:08 -04:00
Björn Töpel	2aae918c7a	i40e/i40evf: proper update of the page_offset field In `f8b45b74cc` ("i40e/i40evf: Use build_skb to build frames") i40e_build_skb updates the page_offset field with an incorrect offset, which can lead to data corruption. This patch updates page_offset correctly, by properly setting truesize. Note that the bug only appears on architectures where PAGE_SIZE is 8192 or larger. Fixes: `f8b45b74cc` ("i40e/i40evf: Use build_skb to build frames") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 02:49:15 -07:00
Mauro S. M. Rodrigues	9e6c9c0f2c	i40e: Fix state flags for bit set and clean operations of PF Commit `0da36b9774` ("i40e: use DECLARE_BITMAP for state fields") changed the way i40e works with state flags, converting them to bitmaps using the kernel bitmap API. This change introduced a regression due to a mistaken substitution using __I40E_VSI_DOWN instead of __I40E_DOWN when testing the state of a PF in the i40e_reset_subtask() function. This caused a flood in the kernel log with the following message: [49.013] i40e 0002:01:00.0: bad reset request 0x00000020 Commit `d19cb64b92` ("i40e: separate PF and VSI state flags") also introduced some misuse of the VSI and PF flags, so both could be considered as the offenders. This patch simply fixes the flags where it makes sense by changing __I40E_VSI_DOWN to __I40E_DOWN. Fixes: `0da36b9774` ("i40e: use DECLARE_BITMAP for state fields") Fixes: `d19cb64b92` ("i40e: separate PF and VSI state flags") Reviewed-by: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 02:45:32 -07:00
Konstantin Khlebnikov	fd8e597ba4	e1000e: use disable_hardirq() also for MSIX vectors in e1000_netpoll() Replace disable_irq() which waits for threaded irq handlers with disable_hardirq() which waits only for hardirq part. Fixes: `3111912971` ("e1000: use disable_hardirq() for e1000_netpoll()") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:06:06 -07:00
Benjamin Poirier	24ad2a9209	e1000e: Don't return uninitialized stats Some statistics passed to ethtool are garbage because e1000e_get_stats64() doesn't write them, for example: tx_heartbeat_errors. This leaks kernel memory to userspace and confuses users. Do like ixgbe and use dev_get_stats() which first zeroes out rtnl_link_stats64. Fixes: `5944701df9` ("net: remove useless memset's in drivers get_stats64") Reported-by: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:05:13 -07:00
Benjamin Poirier	81e3f64a9b	igb: Remove useless argument Given that all callers of igb_update_stats() pass the same two arguments: (adapter, &adapter->stats64), the second argument can be removed. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:04:13 -07:00
Jacob Keller	e5f36ad14c	igb: check for Tx timestamp timeouts during watchdog The igb driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an igb_ptp_tx_hang() function similar to the already existing igb_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:03:17 -07:00
Jacob Keller	c3b8f85ec2	igb: add statistic indicating number of skipped Tx timestamps The igb driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:02:05 -07:00
Jacob Keller	cff5714145	e1000e: add statistic indicating number of skipped Tx timestamps The e1000e driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:01:27 -07:00
Jacob Keller	74344e32fc	igb: avoid permanent lock of *_PTP_TX_IN_PROGRESS The igb driver uses a state bit lock to avoid handling more than one Tx timestamp request at once. This is required because hardware is limited to a single set of registers for Tx timestamps. The state bit lock is not properly cleaned up during igb_xmit_frame_ring() if the transmit fails such as due to DMA or TSO failure. In some hardware this results in blocking timestamps until the service task times out. In other hardware this results in a permanent lock of the timestamp bit because we never receive an interrupt indicating the timestamp occurred, since indeed the packet was never transmitted. Fix this by checking for DMA and TSO errors in igb_xmit_frame_ring() and properly cleaning up after ourselves when these occur. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:53:48 -07:00
Jacob Keller	4ccdc013b0	igb: fix race condition with PTP_TX_IN_PROGRESS bits Hardware related to the igb driver has a limitation of only handling one Tx timestamp at a time. Thus, the driver uses a state bit lock to enforce that only one timestamp request is honored at a time. Unfortunately this suffers from a simple race condition. The bit lock is not cleared until after skb_tstamp_tx() is called notifying the stack of a new Tx timestamp. Even a well behaved application which sends only one timestamp request at once and waits for a response might wake up and send a new packet before the bit lock is cleared. This results in needlessly dropping some Tx timestamp requests. We can fix this by unlocking the state bit as soon as we read the Timestamp register, as this is the first point at which it is safe to unlock. To avoid issues with the skb pointer, we'll use a copy of the pointer and set the global variable in the driver structure to NULL first. This ensures that the next timestamp request does not modify our local copy of the skb pointer. This ensures that well behaved applications do not accidentally race with the unlock bit. Obviously an application which sends multiple Tx timestamp requests at once will still only timestamp one packet at a time. Unfortunately there is nothing we can do about this. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:53:07 -07:00
Jacob Keller	5012863b73	e1000e: fix race condition around skb_tstamp_tx() The e1000e driver and related hardware has a limitation on Tx PTP packets which requires we limit to timestamping a single packet at once. We do this by verifying that we never request a new Tx timestamp while we still have a tx_hwtstamp_skb pointer. Unfortunately the driver suffers from a race condition around this. The tx_hwtstamp_skb pointer is not set to NULL until after skb_tstamp_tx() is called. This function notifies the stack and applications of a new timestamp. Even a well behaved application that only sends a new request when the first one is finished might be woken up and possibly send a packet before we can free the timestamp in the driver again. The result is that we needlessly ignore some Tx timestamp requests in this corner case. Fix this by setting the tx_hwtstamp_skb pointer to NULL prior to calling skb_tstamp_tx() and using a temporary pointer to hold the timestamped skb until that function finishes. This ensures that the application is not woken up until the driver is ready to begin timestamping a new packet. This ensures that well behaved applications do not accidentally hit the race condition and have their Tx timestamps skipped. Obviously an application which sends multiple Tx timestamp requests at once will still only timestamp one packet at a time. Unfortunately there is nothing we can do about this. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:52:17 -07:00
Arnd Bergmann	000ba1f2eb	igb: mark PM functions as __maybe_unused The new wake function is only used by the suspend/resume handlers that are defined inside of an #ifdef, which can cause this harmless warning: drivers/net/ethernet/intel/igb/igb_main.c:7988:13: warning: 'igb_deliver_wake_packet' defined but not used [-Wunused-function] Removing the #ifdef and using a __maybe_unused annotation instead simplifies the code and avoids the warning. Fixes: `b90fa87635` ("igb: Enable reading of wake up packet") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:51:36 -07:00
Matwey V Kornilov	440aeca4b9	igb: Explicitly select page 0 at initialization The functions igb_read_phy_reg_gs40g/igb_write_phy_reg_gs40g (which were removed in `2a3cdea`) explicitly selected the required page at every phy_reg access. Currently, igb_get_phy_id_82575 relies on the fact that page 0 is already selected. The assumption is not fulfilled for my Lex 3I380CW motherboard with integrated dual i211 based gigabit ethernet. This leads to igb initialization failure and network interfaces are not working: igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k igb: Copyright (c) 2007-2014 Intel Corporation. igb: probe of 0000:01:00.0 failed with error -2 igb: probe of 0000:02:00.0 failed with error -2 In order to fix it, we explicitly select page 0 before the first access to the PHY registers. See also: https://bugzilla.suse.com/show_bug.cgi?id=1009911 See also: http://www.lex.com.tw/products/pdf/3I380A&3I380CW.pdf Fixes: `2a3cdea` ("igb: Remove GS40G specific defines/functions") Cc: <stable@vger.kernel.org> # 4.5+ Signed-off-by: Matwey V Kornilov <matwey@sai.msu.ru> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:49:20 -07:00
yuval.shaia@oracle.com	82c01a84d5	net/{mii, smsc}: Make mii_ethtool_get_link_ksettings and smc_netdev_get_ecmd return void Make the return value void since the functions never return a meaningful value. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 11:00:42 -04:00
Preethi Banala	abf709a1e7	i40evf: Add support for Adaptive Virtual Function Add device ID define and mac_type assignment needed for Adaptive Virtual Function (VF Base Mode Support). Also, update version to v3.0.0 in order to indicate clearly that this is the first driver supporting the AVF device ID. Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:30:02 -07:00
Jesse Brandeburg	735e35c56b	i40e/virtchnl: move function to virtchnl This moves a function that is needed for the virtchnl interface from the i40e PF driver over to the virtchnl.h file. It was manually verified that the function in question is unchanged except for the function name and function header, which explains the slight difference in the number of lines removed/added. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:22:53 -07:00
Jesse Brandeburg	ff3f4cc267	virtchnl: finish conversion to virtchnl interface This patch implements the complete version of the virtchnl.h file with final renames, and fixes the related code in i40e and i40evf. It also expands comments, and adds details on the usage of certain fields. In addition, due to the changes a couple of casts are needed to prevent errors found by sparse after renaming some fields. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:21:27 -07:00
Jesse Brandeburg	f0adc6e831	i40evf/virtchnl: whitespace cleanups This patch fixes up a bunch of whitespace issues introduced by the previous automated change of name from i40e to virtchnl. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:19:14 -07:00
Jesse Brandeburg	764430ce6f	i40e/virtchnl: refactor code for validate checks This change updates the arguments passed to the validate function and fixes the caller, as well as uses the new return values added to virtchnl.h One other minor tweak, remove a duplicate set to zero of valid_len. This is in preparation for moving the function to virtchnl.h. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:17:02 -07:00
Jesse Brandeburg	eedcfef85b	virtchnl: convert to new macros As part of the conversion, change the arguments to VF_IS_V1[01] macros and move them to virtchnl.h Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:15:21 -07:00
Jesse Brandeburg	260e93820a	virtchnl: move some code to core driver Before moving this function over to virtchnl.h, move some driver specific checks that had snuck into a fairly generic function, back into the caller of the function. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:13:00 -07:00
Jesse Brandeburg	310a2ad92e	virtchnl: rename i40e to generic virtchnl This morphs all the i40e and i40evf references to/in virtchnl.h to be generic, using only automated methods. Updates all the callers to use the new names. A followup patch provides separate clean ups for messy line conversions from these "automatic" changes, to make them more reviewable. Was executed with the following sed script: sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_client.c sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_prototype.h sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40e_common.c sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40e_prototype.h sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf.h sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf_client.c sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf_main.c sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c sed -i -f transform_script include/linux/avf/virtchnl.h transform_script: ----8<---- s/I40E_VIRTCHNL_SUPPORTED_QTYPES/SAVE_ME_SUPPORTED_QTYPES/g s/I40E_VIRTCHNL_VF_CAP/SAVE_ME_VF_CAP/g s/I40E_VIRTCHNL_/VIRTCHNL_/g s/i40e_virtchnl_/virtchnl_/g s/i40e_vfr_/virtchnl_vfr_/g s/I40E_VFR_/VIRTCHNL_VFR_/g s/VIRTCHNL_OP_ADD_ETHER_ADDRESS/VIRTCHNL_OP_ADD_ETH_ADDR/g s/VIRTCHNL_OP_DEL_ETHER_ADDRESS/VIRTCHNL_OP_DEL_ETH_ADDR/g s/VIRTCHNL_OP_FCOE/VIRTCHNL_OP_RSVD/g s/SAVE_ME_SUPPORTED_QTYPES/I40E_VIRTCHNL_SUPPORTED_QTYPES/g s/SAVE_ME_VF_CAP/I40E_VIRTCHNL_VF_CAP/g ----8<---- Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:08:53 -07:00
Jesse Brandeburg	55cdfd48f2	i40e: use new unified virtchnl header file This patch changes the i40e driver to start using the new virtchnl interface header file, and removes an already existing duplicate of the i40e_virtchnl.h file contained in the i40e directory. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:06:35 -07:00
Jesse Brandeburg	681bdf80cf	i40e/i40evf: create and use new unified header file This moves a header for i40evf to include/linux/avf/virtchnl.h. The directory name AVF is an acronym for the Intel(R) Adaptive Virtual Function. This first step creates the new file, which is a rename of drivers/net/ethernet/intel/i40evf/i40e_virtchnl.h to include/linux/avf/virtchnl.h, and should show up in git as a rename when using git log --follow. To keep things building after the move, the changes to the i40evf driver are made to point to the new include file location. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:04:42 -07:00
Jesse Brandeburg	3929080333	i40evf: drop i40e_type.h include This drops the i40e_type.h include in anticipation of the next patch which moves this file to a location where type.h doesn't exist, and all the places this file is included already include i40e_type.h before this file. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 13:59:28 -07:00
David S. Miller	c380e377a5	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2017-05-31 This series contains updates to ixgbe and ixgbevf only. Scott enables support for TSO & GSO for MPLS encapsulated packets for both ixgbe and ixgbevf. Liwei Song fixes an issue where seqcount/seqlock in ixgbe_get_stats64() are not initialized in time, so move the initialization into probe routine after the transmit and receive rings are initialized. Paul cleans up led_[on|off] for X550EM_X, since the firmware configures the PHY & MAC and we have no PHY access so LED on/off is not supported with this device. Emil provides several fixes, starting with enabling RSS on VF to VF traffic on the same PF. Fixed PHY identification, where the previous method was unreliable, so use a different register to ensure proper identification. Cleaned up the logic which could cause us to skip the link configuration, this skipping over the link configuration was leaving SFP+ PHY's in an unstable state, so always call setup_mac_link(). Added RS1 (rate select 1) support for ixgbe. Lastly, fixed incorrect logic in the setting up of SFP+ link speed. Mark fixes the thermal sensor event logic, where it was being executed when there really was no thermal event. So simplify the logic to only execute when there is a thermal event. Tony adds additional error checks and reporting when setting a VF MAC address to ensure that the MAC filter was successfully added. Also fixed possible truncation warnings, as well as implicit fallthrough warnings. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 17:55:10 -04:00
Emil Tantilov	d9c23ff80b	ixgbe: fix incorrect status check Check for ret_val instead of !ret_val to allow the rest of the code to execute and configure the speed properly. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:54:12 -07:00
Emil Tantilov	3ce5cb75f3	ixgbe: add missing configuration for rate select 1 Add RS1 configuration to ixgbe_set_soft_rate_select_speed() Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:52:41 -07:00
Emil Tantilov	08ed48e182	ixgbe: always call setup_mac_link for multispeed fiber Remove the logic which would previously skip the link configuration in the case where we are already at the requested speed in ixgbe_setup_mac_link_multispeed_fiber(). By exiting early we are skipping the link configuration and as such the driver may not always configure the PHY correctly for SFP+. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:50:41 -07:00
Emil Tantilov	410a494902	ixgbe: add write flush when configuring CS4223/7 Make sure the writes are processed immediately. Without the flush it is possible for operations on one port to spill over the other as the resource is shared. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:49:43 -07:00
Emil Tantilov	cc1de78c2a	ixgbe: correct CS4223/7 PHY identification Previous method was unreliable. Use a different register to differentiate between the SKUs. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:48:19 -07:00
Tony Nguyen	80666035c7	ixgbevf: Resolve warnings for -Wimplicit-fallthrough Additions to gcc 7 now warn whenever a switch statement falls through implicitly. This patch adds explicit fall through comments to address the following warnings: drivers/net/ethernet/intel/ixgbevf/vf.c: In function ‘ixgbevf_get_reta_locked’: drivers/net/ethernet/intel/ixgbevf/vf.c:336:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (hw->mac.type < ixgbe_mac_X550_vf) ^ drivers/net/ethernet/intel/ixgbevf/vf.c:338:2: note: here default: ^~~~~~~ drivers/net/ethernet/intel/ixgbevf/vf.c: In function ‘ixgbevf_get_rss_key_locked’: drivers/net/ethernet/intel/ixgbevf/vf.c:402:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (hw->mac.type < ixgbe_mac_X550_vf) ^ drivers/net/ethernet/intel/ixgbevf/vf.c:404:2: note: here default: ^~~~~~~ Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:46:44 -07:00
Tony Nguyen	31f5d9b1e8	ixgbevf: Resolve truncation warning for q_vector->name The following warning is now shown as a result of new checks added for gcc 7: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function ‘ixgbevf_open’: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:1363:13: warning: ‘%d’ directive output may be truncated writing between 1 and 10 bytes into a region of size between 3 and 18 [-Wformat-truncation=] "%s-%s-%d", netdev->name, "TxRx", ri++); ^~ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:1363:6: note: directive argument in the range [0, 2147483647] "%s-%s-%d", netdev->name, "TxRx", ri++); ^~~~~~~~~~ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:1362:4: note: ‘snprintf’ output between 8 and 32 bytes into a destination of size 24 snprintf(q_vector->name, sizeof(q_vector->name) - 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "%s-%s-%d", netdev->name, "TxRx", ri++); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Resolve this warning by making a couple of changes. - Don't reserve space for the null terminator. Since snprintf adds the null terminator automatically, there is no need for us to reserve a byte for it. - Change a couple variables that can never be negative from int to unsigned int. While we're making changes to the format string, move the constant strings into the format string instead of providing them as specifiers. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:45:14 -07:00
Tony Nguyen	93df9465c9	ixgbe: Resolve warnings for -Wimplicit-fallthrough This patch adds/changes fall through comments to address new warnings produced by gcc 7. Fixed formatting on a couple of comments in the function. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:43:47 -07:00
Tony Nguyen	e61e4c8b90	ixgbe: Resolve truncation warning for q_vector->name The following warning is now shown as a result of new checks added for gcc 7: drivers/net/ethernet/intel/ixgbe/ixgbe_main.c: In function ‘ixgbe_open’: drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3118:13: warning: ‘%d’ directive output may be truncated writing between 1 and 10 bytes into a region of size between 3 and 18 [-Wformat-truncation=] "%s-%s-%d", netdev->name, "TxRx", ri++); ^~ drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3118:6: note: directive argument in the range [0, 2147483647] "%s-%s-%d", netdev->name, "TxRx", ri++); ^~~~~~~~~~ drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3117:4: note: ‘snprintf’ output between 8 and 32 bytes into a destination of size 24 snprintf(q_vector->name, sizeof(q_vector->name) - 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "%s-%s-%d", netdev->name, "TxRx", ri++); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Resolve this warning by making a couple of changes. - Don't reserve space for the null terminator. Since snprintf adds the null terminator automatically, there is no need for us to reserve a byte for it. - Change a couple variables that can never be negative from int to unsigned int. While we're making changes to the format string, move the constant strings into the format string instead of providing them as specifiers. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:39:47 -07:00
Tony Nguyen	6af3d0faed	ixgbe: Add error checking to setting VF MAC Currently, when setting a VF MAC address there are no error checks to ensure that the MAC filter was successfully added. This patch adds additional error checks, reporting, and propagation of errors. It also will not set the MAC address unless adding the MAC filter was successful. With these changes, setting the mac address to zeros can no longer call ixgbe_set_vf_mac() as adding a zero MAC address filter is not valid. Instead directly delete the filter and, if successful, clear the MAC address. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:38:23 -07:00
Mark Rustad	22cb4fff3d	ixgbe: Correct thermal sensor event check The thermal sensor event logic is messed up, because it can execute the code when there is no thermal event. The current logic is that it will exit when !capable && !event whereas it really should exit when !capable || !event. For one thing, it means that the service task is doing too much work. It probably has some other symptoms as well. So, correct the logic, simplifying to only execute when there is a thermal event. The capable check is redundant. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:36:51 -07:00
Emil Tantilov	e6b41c8881	ixgbe: enable L3/L4 filtering for Tx switched packets This will ensure that VF-to-VF traffic on the same PF is filtered to allow RSS operation. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:32:48 -07:00
Paul Greenwalt	5e999fb43e	ixgbe: Remove MAC X550EM_X 1Gbase-t led_[on|off] support Since FW configures the PHY and MAC X550EM_X has no PHY access, led_[on|off] is not supported with the 1Gbase-t design. Removed MAC X550EM_X 1Gbase-t led_[on|off] support by setting function pointers to NULL and added NULL pointer checks. Also set init_led_link_act to NULL and added NULL pointer check. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:31:43 -07:00
Liwei Song	b09457e7a1	ixgbe: initialize u64_stats_sync structures early at ixgbe_probe Fix the following CallTrace: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 71 PID: 1 Comm: swapper/0 Not tainted 4.8.8-WR9.0.0.1_standard #11 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014 00200086 00200086 eb5e1ab8 c144dd70 00000000 00000000 eb5e1af8 c10af89a c1d23de4 eb5e1af8 00000009 eb5d8600 eb5d8638 eb5e1af8 c10b14d8 00000009 0000000a c1d32911 00000000 00000000 e44c826c eb5d8000 eb5e1b74 c10b214e Call Trace: [<c144dd70>] dump_stack+0x5f/0x8f [<c10af89a>] register_lock_class+0x25a/0x4c0 [<c10b14d8>] ? check_irq_usage+0x88/0xc0 [<c10b214e>] __lock_acquire+0x5e/0x17a0 [<c1abdb9b>] ? _raw_spin_unlock_irqrestore+0x3b/0x70 [<c10cf14a>] ? rcu_read_lock_sched_held+0x8a/0x90 [<c10b3c5f>] lock_acquire+0x9f/0x1f0 [<c1922dcf>] ? dev_get_stats+0x5f/0x110 [<c176e6b3>] ixgbe_get_stats64+0x113/0x320 [<c1922dcf>] ? dev_get_stats+0x5f/0x110 [<c1922dcf>] dev_get_stats+0x5f/0x110 [<c1ab5415>] rtnl_fill_stats+0x40/0x105 [<c193dd45>] rtnl_fill_ifinfo+0x4c5/0xd20 [<c11c5115>] ? __kmalloc_node_track_caller+0x1a5/0x410 [<c1917487>] ? __kmalloc_reserve.isra.42+0x27/0x80 [<c191754f>] ? __alloc_skb+0x6f/0x270 [<c1942291>] rtmsg_ifinfo_build_skb+0x71/0xd0 [<c194230a>] rtmsg_ifinfo.part.23+0x1a/0x50 [<c1923dad>] ? call_netdevice_notifiers_info+0x2d/0x60 [<c194236b>] rtmsg_ifinfo+0x2b/0x40 [<c192f997>] register_netdevice+0x3d7/0x4d0 [<c192faa7>] register_netdev+0x17/0x30 [<c177b83d>] ixgbe_probe+0x118d/0x1610 [<c1498202>] local_pci_probe+0x32/0x80 [<c1498172>] ? pci_match_device+0xd2/0x100 [<c14991e0>] pci_device_probe+0xc0/0x110 [<c1652cc5>] driver_probe_device+0x1c5/0x280 [<c1498172>] ? pci_match_device+0xd2/0x100 [<c1652e09>] __driver_attach+0x89/0x90 [<c1652d80>] ? 
driver_probe_device+0x280/0x280 [<c165114f>] bus_for_each_dev+0x4f/0x80 [<c165269e>] driver_attach+0x1e/0x20 [<c1652d80>] ? driver_probe_device+0x280/0x280 [<c1652317>] bus_add_driver+0x1a7/0x220 [<c1653a79>] driver_register+0x59/0xe0 [<c1f897b8>] ? igb_init_module+0x49/0x49 [<c1497b2a>] __pci_register_driver+0x4a/0x50 [<c1f8985d>] ixgbe_init_module+0xa5/0xc4 [<c1000485>] do_one_initcall+0x35/0x150 [<c107e818>] ? parameq+0x18/0x70 [<c1f395d8>] ? repair_env_string+0x12/0x51 [<c107ead0>] ? parse_args+0x260/0x3b0 [<c1074f73>] ? __usermodehelper_set_disable_depth+0x43/0x50 [<c1f39e90>] kernel_init_freeable+0x19b/0x267 [<c1f395c6>] ? set_debug_rodata+0xf/0xf [<c10b1e7b>] ? trace_hardirqs_on+0xb/0x10 [<c1abdc02>] ? _raw_spin_unlock_irq+0x32/0x50 [<c1085f0b>] ? finish_task_switch+0xab/0x1f0 [<c1085ec9>] ? finish_task_switch+0x69/0x1f0 [<c1ab6a30>] kernel_init+0x10/0x110 [<c108bd65>] ? schedule_tail+0x25/0x80 [<c1abe422>] ret_from_kernel_thread+0xe/0x24 [<c1ab6a20>] ? rest_init+0x130/0x130 This call trace occurred on a 32-bit kernel with CONFIG_PROVE_LOCKING enabled. It happens during the ixgbe driver probe stage: when ixgbe_get_stats64 is reached, the seqcount/seqlock is still not initialized. It was initialized in the TX/RX resources setup routine, but that is too late, so lockdep gives this warning. To fix this, move the u64_stats_init call to the driver probe stage, before the status of the seqcount is read and after the RX/TX rings have finished initializing. Signed-off-by: Liwei Song <liwei.song@windriver.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>	2017-05-31 04:28:30 -07:00
Scott Peterson	2a20525b26	ixgbe/ixgbevf: Enables TSO for MPLS encapsulated packets This patch advertises TSO & GSO features in netdev->mpls_features. In ixgbe(vf)_tso() where we set up segmentation offload, the IP header will be the inner network header when eth_p_mpls() indicates the Ethernet protocol is MPLS (UC or MC). Suggested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 04:16:28 -07:00
Christophe Jaillet	0a4ecc2c5e	i40e: Check for memory allocation failure If 'kzalloc' fails, a NULL pointer will be dereferenced. Return -ENOMEM instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:13:36 -07:00
Jacob Keller	0bc0706b46	i40e: check for Tx timestamp timeouts during watchdog The i40e driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an i40e_ptp_tx_hang() function similar to the already existing i40e_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:12:06 -07:00
Jacob Keller	6118955669	i40e: use pf data structure directly in i40e_ptp_rx_hang There's no reason to pass a vsi pointer if we already have the pf pointer in the only location where we call this function. Let's update the signature and directly pass the *pf data structure pointer. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:10:16 -07:00
Jacob Keller	2955faca04	i40e: add statistic indicating number of skipped Tx timestamps The i40e driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:09:14 -07:00
Jacob Keller	69077577af	i40e: avoid permanent lock of *_PTP_TX_IN_PROGRESS The i40e driver uses a bit lock to indicate when a Tx timestamp is in progress to avoid attempting to timestamp multiple packets at once. This is required because hardware only has registers to handle one request at a time. There is a corner case where we failed to clean up the bit lock after a failed transmit. This can potentially result in a state bit being locked forever. Add some cleanup code to i40e_xmit_frame_ring to check and make sure we clean up in case of these failures. We also modify i40e_tx_map to return an error code indicating DMA failure. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:08:09 -07:00
Jacob Keller	bbc4e7d273	i40e: fix race condition with PTP_TX_IN_PROGRESS bits Hardware related to the i40e driver has a limitation on Tx PTP packets. This requires us to limit the driver to timestamping a single packet at once. This is done using a state bitlock which enforces that only one timestamp request is honored at a time. Unfortunately this suffers from a race condition. The bit lock is not cleared until after skb_tstamp_tx() is called notifying applications of a new Tx timestamp. Even a well behaved application sending only one packet at a time and waiting for a response can wake up and send a new timestamped packet request before the bit lock is cleared. This results in needlessly dropping some Tx timestamp requests. We can fix this by unlocking the state bit as soon as we read the Timestamp register, as this is the first point at which it is safe to timestamp another packet. To avoid issues with the skb pointer, we'll use a copy of the pointer and set the global variable in the driver structure to NULL first. This ensures that the next timestamp request does not modify our local copy of the skb pointer. Now, a well behaved application which has at most one outstanding timestamp request will not accidentally race with the driver unlock bit. Obviously an application attempting to timestamp faster than one request at a time will have some timestamp requests skipped. Unfortunately there is nothing we can do about that. Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:03:33 -07:00
Jesse Brandeburg	9d68322e53	i40evf: disable unused flags The i40evf hardware doesn't have any way to ever report FCoE enabled so just force the code to always report FCoE is disabled, remove the unused defines, and mark the OP as reserved. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:00:46 -07:00
Jesse Brandeburg	155b0f6900	i40evf: fix merge error in older patch This patch restores a line that was dropped while merging; without it, the VF driver could not enable RSS as a negotiated feature. Fixes: `43a3d9ba34` ("i40evf: Allow PF driver to configure RSS") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 02:58:07 -07:00
Jesse Brandeburg	eb873fe4d3	i40evf: fix duplicate lines This removes two duplicate lines that snuck into the code somehow. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 02:56:13 -07:00
Miroslav Lichvar	74abc9b18f	net: ethernet: update drivers to make both SW and HW TX timestamps Some drivers were calling the skb_tx_timestamp() function only when a hardware timestamp was not requested. Now that applications can use the SOF_TIMESTAMPING_OPT_TX_SWHW option to request both software and hardware timestamps, the drivers need to be modified to unconditionally call skb_tx_timestamp(). CC: Richard Cochran <richardcochran@gmail.com> CC: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:37:32 -04:00
Miroslav Lichvar	e341257548	net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid filter and update drivers which can timestamp all packets, or which explicitly list unsupported filters instead of using a default case, to handle the filter. CC: Richard Cochran <richardcochran@gmail.com> CC: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:37:32 -04:00
Linus Torvalds	857f864014	pci-v4.12-changes Merge tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - add framework for supporting PCIe devices in Endpoint mode (Kishon Vijay Abraham I) - use non-postable PCI config space mappings when possible (Lorenzo Pieralisi) - clean up and unify mmap of PCI BARs (David Woodhouse) - export and unify Function Level Reset support (Christoph Hellwig) - avoid FLR for Intel 82579 NICs (Sasha Neftin) - add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig) - short-circuit config access failures for disconnected devices (Keith Busch) - remove D3 sleep delay when possible (Adrian Hunter) - freeze PME scan before suspending devices (Lukas Wunner) - stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava) - disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann) - add arch-specific alignment control to improve device passthrough by avoiding multiple BARs in a page (Yongji Xie) - add sysfs sriov_drivers_autoprobe to control VF driver binding (Bodong Wang) - allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas) - fix crashes 
when unbinding host controllers that don't support removal (Brian Norris) - add driver for MicroSemi Switchtec management interface (Logan Gunthorpe) - add driver for Faraday Technology FTPCI100 host bridge (Linus Walleij) - add i.MX7D support (Andrey Smirnov) - use generic MSI support for Aardvark (Thomas Petazzoni) - make Rockchip driver modular (Brian Norris) - advertise 128-byte Read Completion Boundary support for Rockchip (Shawn Lin) - advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin) - convert atomic_t to refcount_t in HV driver (Elena Reshetova) - add CPU IRQ affinity in HV driver (K. Y. Srinivasan) - fix PCI bus removal in HV driver (Long Li) - add support for ThunderX2 DMA alias topology (Jayachandran C) - add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki) - add ITE 8893 bridge DMA alias quirk (Jarod Wilson) - restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices (Manish Jaggi) * tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits) PCI: Don't allow unbinding host controllers that aren't prepared ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP MAINTAINERS: Add PCI Endpoint maintainer Documentation: PCI: Add userguide for PCI endpoint test function tools: PCI: Add sample test script to invoke pcitest tools: PCI: Add a userspace tool to test PCI endpoint Documentation: misc-devices: Add Documentation for pci-endpoint-test driver misc: Add host side PCI driver for PCI test function device PCI: Add device IDs for DRA74x and DRA72x dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access PCI: dwc: dra7xx: Workaround for errata id i870 dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode PCI: dwc: dra7xx: Add EP mode support PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently dt-bindings: PCI: Add DT bindings for PCI designware EP mode PCI: dwc: designware: Add EP mode support Documentation: PCI: Add 
binding documentation for pci-test endpoint function ixgbe: Use pcie_flr() instead of duplicating it IB/hfi1: Use pcie_flr() instead of duplicating it PCI: imx6: Fix spelling mistake: "contol" -> "control" ...	2017-05-08 19:03:25 -07:00
Stephen Boyd	ad61dd303a	scripts/spelling.txt: add regsiter -> register spelling mistake This typo is quite common. Fix it and add it to the spelling file so that checkpatch catches it earlier. Link: http://lkml.kernel.org/r/20170317011131.6881-2-sboyd@codeaurora.org Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-05-08 17:15:13 -07:00
David S. Miller	8dd5b698c2	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-04-30 This series contains updates to i40e and i40evf only. Jake provides the majority of the changes in this series, starting with the renaming of a flag to avoid confusion. Then renamed a variable to a more meaningful name to clarify what is actually being done and to reduce confusion. Amortizes the wait time when initializing or disabling lots of VFs by using i40e_reset_all_vfs() and i40e_vsi_stop_rings_no_wait(). Cleaned up an unnecessary delay: pci_disable_sriov() already has its own delay, so there is no need to add an additional delay when removing VFs. Avoid using the same name flags for both vsi->state and pf->state, to make code review easier and assist future work to use the correct state field when checking bits. Use DECLARE_BITMAP() to ensure that we always allocate enough space for flags. Replace hw_disabled_flags with the new _AUTO_DISABLED flags, which are more readable because we are not setting an *_ENABLED flag to disable the feature. Alex corrects an oversight where we were not reprogramming the ports after a reset, which was causing us to lose all of the receive tunnel offloads. Arnd Bergmann moves the declaration of a local variable to avoid a warning seen on architectures with larger pages about an unused variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-30 11:33:37 -04:00
Sasha Neftin	68fe1d5da5	e1000e: Add Support for 38.4MHZ frequency Add support for the 38.4MHz frequency, which is required for PTP on CannonLake. The SYSTIM frequency adjustment attributes for TIMINCA are get/set depending on the hardware clock frequency of the different adapter types. The 38.4MHz frequency is supported by CannonLake and is active once the time synchronisation mechanism is enabled. Changed the abbreviation from Hz to HZ to comply with checkpatch code style. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 05:22:32 -07:00
Sasha Neftin	c8744f44ae	e1000e: Add Support for CannonLake Propagate the CannonLake MAC type to driver functionality. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 05:18:30 -07:00
Sasha Neftin	3a3173b9c3	e1000e: Initial Support for CannonLake i219 (6) and i219 (7) are the next LOM generations that will be available on the next Intel Client platform (CannonLake). This patch provides the initial support for these devices. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 05:15:08 -07:00
Jarod Wilson	10ed1e0bd1	e1000e: fix PTP on e1000_pch_lpt variants I've got reports that the Intel I-218V NIC in Intel NUC5i5RYH systems used as a PTP slave experiences random ~10 hour clock jumps, which are resolved if the same workaround for the 82574 and 82583 is employed, so set the appropriate flag2 in e1000_pch_lpt_info too. Reported-by: Rupesh Patel <rupatel@redhat.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 05:12:07 -07:00
Arnd Bergmann	3dfc3eb581	i40evf: hide unused variable On architectures with larger pages, we get a warning about an unused variable: drivers/net/ethernet/intel/i40evf/i40evf_main.c: In function 'i40evf_configure_rx': drivers/net/ethernet/intel/i40evf/i40evf_main.c:690:21: error: unused variable 'netdev' [-Werror=unused-variable] This moves the declaration into the #ifdef to avoid the warning. Fixes: `dab86afdbb` ("i40e/i40evf: Change the way we limit the maximum frame size for Rx") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 05:01:35 -07:00
Jacob Keller	283aeafe6b	i40evf: allocate queues before we setup the interrupts and q_vectors This matches the ordering of how we free stuff during reset and remove. It also makes logical sense because we set the interrupts based on the number of queues. Currently this doesn't really matter in practice. However a future patch moves the assignment of num_active_queues into i40evf_alloc_queues, which is required by i40evf_set_interrupt_capability. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:57:42 -07:00
Jacob Keller	707636c648	i40evf: remove I40E_FLAG_FDIR_ATR_ENABLED The flag used by the common code and PF code is I40E_FLAG_FD_ATR_ENABLED, not FDIR. It turns out none of the txrx code actually shared with the VF driver actually checks the ATR flag. This is made even more obvious by the typo in the VF header file. Let's just remove the flag from the VF driver since it's not needed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:55:09 -07:00
Jacob Keller	47994c119a	i40e: remove hw_disabled_flags in favor of using separate flag bits The hw_disabled_flags field was added as a way of signifying that a feature was automatically or temporarily disabled. However, we actually only use this for FDir features. Replace its use with new _AUTO_DISABLED flags instead. This is more readable, because you aren't setting an *_ENABLED flag to disable the feature. Additionally, clean up a few areas where we used these bits. First, we don't really need to set the auto-disable flag for ATR if we're fully disabling the feature via ethtool. Second, we should always clear the auto-disable bits in case they somehow got set when the feature was disabled. However, avoid displaying a message that we've re-enabled the feature. Third, we shouldn't be re-enabling ATR in the SB ntuple add flow, because it might have been disabled due to space constraints. Instead, we should just wait for the fdir_check_and_reenable to be called by the watchdog. Overall, this change allows us to simplify some code by removing an extra field we didn't need, and the result should make it more clear as to what we're actually doing with these flags. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:53:58 -07:00
Jacob Keller	789f38ca70	i40evf: remove needless min_t() on num_online_cpus()*2 We already set pairs to the value of adapter->num_active_queues. This value is limited by vsi_res->num_queue_pairs and num_online_cpus(). This means that pairs by definition is already smaller than num_online_cpus()*2, so we don't even need to bother with this check. Let's just remove it and update the comment. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:50:18 -07:00
Jacob Keller	0da36b9774	i40e: use DECLARE_BITMAP for state fields Instead of assuming our flags fit within an unsigned long, use DECLARE_BITMAP which will ensure that we always allocate enough space. Additionally, use __I40E_STATE_SIZE__ markers as the last element of the enumeration so that the size of the BITMAP is compile-time assigned rather than programmer-time assigned. This ensures that potential future flag additions do not actually overrun the array. This is especially important as 32bit systems would only have 32bit longs instead of 64bit longs as we generally have assumed in the prior code. This change also removes a dereference of the state fields throughout the code, so it does have a bit of code churn. The conversions were automated using sed replacements with an alternation s/&(vsi->back\|vsi\|pf)->state/\1->state/ s/&adapter->vsi.state/adapter->vsi.state/ For debugfs, we modify the printing so that we can display chunks of the state value on new lines. This ensures that we can print the entire set of state values. Additionally, we now print them as 08lx to ensure that they display nicely. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:48:13 -07:00
Jacob Keller	d19cb64b92	i40e: separate PF and VSI state flags Avoid using the same named flags for both vsi->state and pf->state. This makes code review easier, as it is more likely that future authors will use the correct state field when checking bits. Previous commits already found issues with at least one check, and possibly others may be incorrect. This reduces confusion as it is more clear what each flag represents, and which flags are valid for which state field. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:42:30 -07:00
Jacob Keller	2318b4018a	i40e: remove unnecessary msleep() delay in i40e_free_vfs The delay was added because of a desire to ensure that the VF driver can finish up removing. However, pci_disable_sriov already has its own ssleep() call that will sleep for an entire second, so there is no reason to add extra delay on top of this by using msleep here. In practice, an msleep() won't have a huge impact on timing but there is no real value in keeping it, so let's just simplify the code and remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:19:55 -07:00
Jacob Keller	707d088af3	i40e: amortize wait time when disabling lots of VFs Just as we do in i40e_reset_all_vfs, save some time when freeing VFs by amortizing the wait time for stopping queues. We can use i40e_vsi_stop_rings_no_wait() to begin the process of stopping all the VF rings at once. Then, once we've started the process on each VF we can begin waiting for the VFs to stop. This helps reduce the total wait time by a large factor. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:18:17 -07:00
Alexander Duyck	1f190d9369	i40e: Reprogram port offloads after reset This patch corrects a major oversight in that we were not reprogramming the ports after a reset. As a result we completely lost all of the Rx tunnel offloads on receive including Rx checksum, RSS on inner headers, and ATR. The fix for this is pretty standard as all we needed to do is reset the filter bits to pending for all active filters and schedule the sync event. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:16:31 -07:00
Jacob Keller	27826fd5d3	i40e: rename index to port to avoid confusion The .index field of i40e_udp_port_config represents the udp port number. Rename this variable to port so that it is more obvious. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:13:49 -07:00
Jacob Keller	1b48437028	i40e: make use of i40e_reset_all_vfs when initializing new VFs When allocating a large number of VFs, the driver previously used i40e_reset_vf in a sequence. Just as when performing a normal reset, this accumulates a large amount of delay for handling all of the VFs in sequence. This delay is mainly due to a hardware requirement to wait after initiating a reset on the VF. We recently added a new function, i40e_reset_all_vfs() which can be used to amortize the delay time, by first triggering all VF resets, then waiting once, and finally cleaning up and allocating the VFs. This is almost as good as truly running the resets in parallel. In order to avoid sending a spurious reset message to a client interface, we have a check to see whether we've assigned pf->num_alloc_vfs yet. This was originally intended as a way to distinguish the "initialization" case from the regular reset case. Unfortunately, this means that we can't directly use i40e_reset_all_vfs yet. Let's avoid this check of pf->num_alloc_vfs by replacing it with a proper VSI state bit which we can use instead. This makes the intention much clearer and allows us to re-use the i40e_reset_all_vfs function directly. Change-ID: I694279b37eb6b5a91b6670182d0c15d10244fd6e Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:05:20 -07:00
Jacob Keller	6322e63c35	i40e: properly spell I40E_VF_STATE_* flags These flags represent the state of the VF at various times. Do not spell them as _STAT_ which can be confusing to readers who may think these refer to statistics. Change-ID: I6bc092cd472e8276896a1fd7498aced2084312df Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 03:58:53 -07:00
Tony Nguyen	e60ae00361	ixgbevf: Check for RSS key before setting value The RSS key is being repopulated every time the interface is brought up regardless of whether there is an existing value. If the user sets the RSS key and the interface is brought up (e.g. reset), the user specified RSS key will be overwritten. This patch changes the rss_key to a pointer so we can check to see if the key has been populated and preserve it accordingly. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:04 -07:00
Tony Nguyen	82fb670c5f	ixgbevf: Fix errors in retrieving RETA and RSS from PF Mailbox support for getting RETA and RSS is available for only 82599 and x540; a previous patch reversed the logic and these adapters were returning not supported. Also, the NACK check in ixgbevf_get_rss_key_locked() was checking for the command IXGBE_VF_GET_RETA instead of IXGBE_VF_GET_RSS_KEY. This patch corrects both issues by correcting the logic and checking for the right command. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:04 -07:00
Tony Nguyen	3dfbfc7ebb	ixgbe: Check for RSS key before setting value The RSS key is being repopulated every time the interface is brought up regardless of whether there is an existing value. If the user sets the RSS key and the interface is brought up (e.g. reset), the user specified RSS key will be overwritten. This patch changes the rss_key to a pointer so we can check to see if the key has been populated and preserve it accordingly. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:04 -07:00
Paul Greenwalt	8dc963e1cd	ixgbe: Add 1000Base-T device based on X550EM_X MAC Add support for new 1000Base-T device based on X550EM_X MAC type. All PHY operations are disabled as the PHY is controlled by FW. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:03 -07:00
Tony Nguyen	27bdc44cdb	ixgbe: Allow setting zero MAC address for VF Currently, there is no logic that allows a VF's MAC address to be removed from the RAR table. Allow the user to specify a zero MAC address in order to clear the VF's MAC address from the RAR table. This functionality is also utilized by libvirt when removing VFs. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:03 -07:00
Emil Tantilov	f87fc44770	ixgbevf: fix size of queue stats length IXGBEVF_QUEUE_STATS_LEN is based on ixgbevf_stats, not ixgbe_stats. This change fixes a bug where ethtool -S displayed some empty fields. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:03 -07:00
Emil Tantilov	e251ecf752	ixgbe: clean macvlan MAC filter table on VF reset Flush the macvlan filters on VF reset to avoid conflict with other VFs that may end up using the same MAC address. The main change here is the call to ixgbe_set_vf_macvlan() with index 0. Moved ixgbe_set_vf_macvlan() in front of ixgbe_vf_reset_event() to avoid adding a prototype. Reported-by: Sritej Kanakadandi Sritej Rama <skanakad@cisco.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:03 -07:00
John Fastabend	7379f97a4f	ixgbe: delay tail write to every 'n' packets Current XDP implementation hits the tail on every XDP_TX return code. This patch changes driver behavior to only hit the tail after packet processing is complete. With this patch I can run XDP drop programs @ 14+Mpps and XDP_TX programs are at ~13.5Mpps. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 20:01:02 -07:00
John Fastabend	33fdc82f08	ixgbe: add support for XDP_TX action A couple design choices were made here. First I use a new ring pointer structure xdp_ring[] in the adapter struct instead of pushing the newly allocated XDP TX rings into the tx_ring[] structure. This means we have to duplicate loops around rings in places we want to initialize both TX rings and XDP rings. But by making it explicit it is obvious when we are using XDP rings and when we are using TX rings. Further we don't have to do ring arithmetic which is error prone. As a proof point for doing this my first patches used only a single ring structure and introduced bugs in FCoE code and macvlan code paths. Second I am aware this is not the most optimized version of this code possible. I want to get baseline support in using the most readable format possible and then once this series is included I will optimize the TX path in another series of patches. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 19:58:07 -07:00
John Fastabend	9247080816	ixgbe: add XDP support for pass and drop actions Basic XDP drop support for ixgbe. Uses READ_ONCE/xchg semantics on XDP programs instead of RCU primitives as suggested by Daniel Borkmann and Alex Duyck. v2: fix the build issues seen w/ XDP when page sizes are larger than 4K and made minor fixes based on feedback from Jakub Kicinski Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-29 19:55:08 -07:00
Paul Greenwalt	6133406be1	ixgbe: Acquire PHY semaphore before device reset A recent firmware change fixed an issue to acquire the PHY semaphore before accessing PHY registers. This led to a case where SW can issue a device reset clearing the MDIO registers. This patch makes SW acquire the PHY semaphore before issuing a device reset. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-28 19:02:31 -07:00
Christoph Hellwig	63af8f7a82	ixgbe: Use pcie_flr() instead of duplicating it Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-27 11:46:50 -05:00
David S. Miller	9b5381637e	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2017-04-20 This series contains updates to e1000, e1000e, igb/vf and ixgb. Tobias Klauser cleans up e1000, ixgb and igbvf from having a local function or structure for netdev stats. Bernd Faust fixes an issue for 82579 devices, where the clock frequency was being incorrectly set for these devices. These devices only support 96MHz, so make sure they are set to use only that. Yury Kylulin extends the work Jake and Alex did for ixgbe in MAC filter handling into the igb driver. Kim Tatt Chuah enables igb to wake up by packet and to read the necessary Wake Up Status (WUS) and Wake Up Packet Memory (WUPM) registers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-24 11:53:53 -04:00
David S. Miller	072cec7797	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-04-19 This series contains updates to i40e and i40evf only, most notable being the addition of trace points for BPF programs. Tobias Klauser updates i40evf to use net_device stats struct instead of a local private copy. Preethi updates the VF driver to not enable receive checksum offload by default for tunneled packets. Alex fixes an issue he introduced when he converted the code over to using the length field to determine if a descriptor was done or not. Mitch adds the ability to dump additional information on the VFs, which is not available through 'ip link show' using debugfs. Scott adds trace points to the drivers so that BPF programs can be attached for feature testing and verification. Jingjing adds admin queue functions for Pipeline Personalization Profile commands. Jake does most of the heavy lifting in this series, starting with a reduction in the scope of the RTNL lock being held while resetting VFs to allow multiple PFs to reset in a timely manner. Factored out the direct queue modification so that we are able to re-use the code. Reduced the wait time for admin queue commands to complete, since we were waiting a minimum of a millisecond, when in practice the admin queue command is processed often much faster. Cleaned up code (flag) we never use. Made the code that resets all the VFs run in parallel instead of in the current serialized fashion, to help reduce the time it takes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-21 14:13:11 -04:00
Tobias Klauser	55c05dd029	igbvf: Use net_device_stats from struct net_device Instead of using a private copy of struct net_device_stats in struct igbvf_adapter, use stats from struct net_device. Also remove the now unnecessary .ndo_get_stats function. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:45 -07:00
Kim Tatt Chuah	b90fa87635	igb: Enable reading of wake up packet Currently, in igb_resume(), igb driver ignores the Wake Up Status (WUS) and Wake Up Packet Memory (WUPM) registers. This patch enables the igb driver to read the WUPM if the controller was woken by a wake up packet that is not more than 128 bytes long (maximum WUPM size), then pass it up the kernel network stack. Signed-off-by: Kim Tatt Chuah <kim.tatt.chuah@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:45 -07:00
Yury Kylulin	4827cc3779	igb/igbvf: Add VF MAC filter request capabilities Add functionality for the VF to request up to 3 additional MAC filters. This is done using the existing E1000_VF_SET_MAC_ADDR message, but with additional message info - E1000_VF_MAC_FILTER_CLR to clear all unicast MAC filters previously set for this VF and E1000_VF_MAC_FILTER_ADD to add a MAC filter. Additional filters can be added only if the administrator did not set the VF MAC explicitly and allowed the VF to change its default MAC. Due to the limited number of RAR entries, reserve at least 3 MAC filters for the PF. If SRIOV is supported by the NIC, after this change RAR entries starting from 1 to (RAR MAX ENTRIES - NUM SRIOV VFS) will be used for PF and VF MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:44 -07:00
Yury Kylulin	83c21335c8	igb: improve MAC filter handling Using the work which was done for ixgbe driver by Jacob Keller commit `5d7daa35b9` ("ixgbe: improve mac filter handling") and Alexander Duyck commit `0f079d2283` ("ixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses") and out-of-tree igb driver add functionality to manage (add and delete) MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:44 -07:00
Bernd Faust	5313eeccd2	e1000e: fix timing for 82579 Gigabit Ethernet controller After an upgrade to Linux kernel v4.x the hardware timestamps of the 82579 Gigabit Ethernet Controller are different than expected. The values that are being read are almost four times as big as before the kernel upgrade. The difference is that after the upgrade the driver sets the clock frequency to 25MHz, where before the upgrade it was set to 96MHz. Intel confirmed that the correct frequency for this network adapter is 96MHz. Signed-off-by: Bernd Faust <berndfaust@gmail.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:44 -07:00
Tobias Klauser	541c1662ba	ixgb: Omit private ndo_get_stats function ixgb_get_stats() just returns dev->stats so we can leave it out altogether and let dev_get_stats() do the job. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:18:45 -07:00
Tobias Klauser	14cf2ddf08	e1000: Omit private ndo_get_stats function e1000_get_stats() just returns dev->stats so we can leave it out altogether and let dev_get_stats() do the job. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:18:45 -07:00
Jacob Keller	3480756f2c	i40e: use i40e_stop_rings_no_wait to implement PORT_SUSPENDED state This state bit was added as a way for DCB to avoid having to wait for the queues to disable when handling LLDP events. The logic for this was buried deep within stop Tx and stop Rx queue code. First, let's rename it so that it does not appear to only affect Tx when in fact it modifies both Tx and Rx flow. Second we can move it up into the i40e_stop_rings() function, and we can simply re-use the i40e_stop_rings_no_wait() so that we don't have to bury the implementation as deep into the call stack. An alternative might be to remove the state bit and instead attempt to shut down everything directly in DCB flow. This, however, is not ideal because it creates yet another separate shutdown routine that we'd have to maintain. In the current implementation any changes will be made to both flows. Change-ID: I68e1ccb901af320862bca395e9c9746f08e8b17c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:49:17 -07:00
Jacob Keller	e4b433f4a7	i40e: reset all VFs in parallel when rebuilding PF When there are a lot of active VFs, it can take multiple seconds to finish resetting all of them during certain flows, which can cause some VFs to fail to wait long enough for the reset to occur. The user might see messages like "Never saw reset" or "Reset never finished" and the VF driver will stop functioning properly. The naive solution would be to simply increase the wait timer. We can get much more clever. Notice that i40e_reset_vf is run in a serialized fashion, and includes lots of delays. There are two prominent delays which take most of the time. First, when we begin resetting VFs, we have multiple 10ms delays which accrue because we reset each VF in a serial fashion. These delays accumulate to almost 4 seconds when handling the maximum number of VFs (128). Secondly, there is a massive 50ms delay for each time we disable queues on a VSI. This delay is necessary to allow HW to finish disabling queues before we restore functionality. However, just like with the first case, we are paying the cost for each VF, rather than disabling all VFs and waiting once. Both of these can be fixed, but required some previous refactoring to handle the special case. First, we will need the i40e_vsi_wait_queues_disabled function which was previously DCB specific. Second, we will need to implement our own i40e_vsi_stop_rings_no_wait function which will handle the stopping of rings without the delays. Finally, implement an i40e_reset_all_vfs function, which will first start the reset of all VFs, and pay the wait cost all at once, rather than serially waiting for each VF before we start processing the next one. After the VF has been reset, we'll disable all the VF queues, and then wait for them to disable. Again, we'll organize the flow such that we pay the wait cost only once. Finally, after we've disabled queues we'll go ahead and begin restoring VF functionality. The result is reducing the wait time by a large factor and ensuring that VFs do not timeout when waiting in the VF driver. Change-ID: Ia6e8cf8d98131b78aec89db78afb8d905c9b12be Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:45:07 -07:00
Jacob Keller	9dc2e41738	i40e: split some code in i40e_reset_vf into helpers A future patch is going to want to re-use some of the code in i40e_reset_vf, so lets break up the beginning and ending parts into their own helper functions. The first function will be used to initialize the reset on a VF, while the second function will be used to finalize the reset and restore functionality. Change-ID: I48df808b8bf09de3c2ed8c521f57b3f0ab9e5907 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:42:21 -07:00
Jacob Keller	1de81c2d07	i40e: remove I40E_FLAG_IN_NETPOLL entirely This flag was originally intended to be used to let some driver code know when we were running from netpoll. Ultimately this was not necessary and we never used it. Let's remove it Change-ID: I43b72483d91c1638071d2a7f389ab171ec5b796a Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:40:27 -07:00
Jacob Keller	9e3f23f44f	i40e: reduce wait time for adminq command completion When sending an adminq command, we wait for the command to complete in a loop. This loop waits for an entire millisecond, when in practice the adminq command is processed often much faster. Change the loop to use i40e_usec_delay instead, and wait for 50 usecs each time instead. This appears to be about the minimum time required, based on some manual observation and testing. The primary benefit of this change is reducing latency of various operations in the PF driver, especially when related to having a large number of VFs enabled. For example, on Linux, when instantiating 128 VFs, the time to finish the operation dropped from about 9 seconds down to under 6 seconds. Additionally, the time it takes to finish a PF reset with 128 VFs dropped from 5.1 seconds down to 0.7 seconds. As the examples above show, a significant portion of the delay is wasted waiting for adminq operations which have already finished. This patch shouldn't cause impact to functionality, as we still check and keep waiting until the command does get processed. The only expected change is an increase in CPU utilization as we now check for completion far more times. However, in practice the commands appear to generally be complete within the first delay window anyways. Change-ID: If8af8388e100da0a14eaf9e1af3afadf73a958cf Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:38:25 -07:00
Jacob Keller	e8d2f4c674	i40e: fix CONFIG_BUSY checks in i40e_set_settings function The check for I40E_CONFIG_BUSY state bit in the i40e_set_link_ksettings function is fishy. First we can notice a few things about the check here. First a similar check was introduced by commit 'c7d05ca89f8e ("i40e: driver ethtool core")' Later a commit introducing the link settings was added by commit 'bf9c71417f72 ("i40e: Implement set_settings for ethtool")' However, this second check was against vsi->state instead of pf->state, and also failed to set the bit, it only checks. That indicates the locking was not quite correct. The only other place that the state bit in vsi->state gets used is to protect the filter list. Since this code does not care about the mac filter list, and seems clear the original code should have set the pf->state bit. Fix these issues by using pf->state correctly, and by actually setting the bit so that we properly lock as expected. Since these checks occur while holding the rtnl_lock(), lets also add a timeout so that we don't potentially softlock the system. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:17:03 -07:00
Jacob Keller	c768e49064	i40e: factor out queue control from i40e_vsi_control_(tx\|rx) A future patch will need to be able to handle controlling queues without waiting until all VSIs are handled. Factor out the direct queue modification so that we can easily re-use this code. The result is also a bit easier to read since we don't embed multiple single-letter loop counters. Change-ID: Id923cbfa43127b1c24d8ed4f809b1012c736d9ac Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:13:56 -07:00
Jacob Keller	024b05f424	i40e: don't hold RTNL lock while waiting for VF reset to finish We made some effort to reduce the RTNL lock scope when resetting and rebuilding the PF. Unfortunately we still held the RTNL lock during the VF reset operation, which meant that multiple PFs could not reset in parallel due to the global lock. For now, further reduce the scope by not holding the RTNL lock while resetting VFs. This allows multiple PFs to reset in a timely manner. Change-ID: I2fbf823a0063f24dff67676cad09f0bbf83ee4ce Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:11:30 -07:00
Jingjing Wu	1d5c960c5e	i40e: new AQ commands Add admin queue functions for Pipeline Personalization Profile AQ commands: - Write Recipe Command buffer (Opcode: 0x0270) - Get Applied Profiles list (Opcode: 0x0271) Change-ID: I558b4145364140f624013af48d4bbf79d21ebb0d Signed-off-by: Jingjing Wu <jingjing.wu@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 17:04:06 -07:00
Scott Peterson	ed0980c440	i40e/i40evf: Add tracepoints This patch adds tracepoints to the i40e and i40evf drivers to which BPF programs can be attached for feature testing and verification. It's expected that an attached BPF program will identify and count or log some interesting subset of traffic. The bcc-tools package is helpful there for containing all the BPF arcana in a handy Python wrapper. Though you can make these tracepoints log trace messages, the messages themselves probably won't be very useful (other than to verify the tracepoint is being called while you're debugging your BPF program). The idea here is that tracepoints have such low performance cost when disabled that we can leave these in the upstream drivers. This may eventually enable the instrumentation of unmodified customer systems should the need arise to verify a NIC feature is working as expected. In general this enables one set of feature verification tools to be used on these drivers whether they're built with the kernel or separately. Users are advised against using these tracepoints for anything other than a diagnostic tool. They have a performance impact when enabled, and their exact placement and form may change as we see how well they work in practice for the purposes above. Change-ID: Id6014a7322c0e6d08068114dd20bd156f2f6435e Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 16:47:31 -07:00
Mitch Williams	3118025a07	i40e: dump VF information in debugfs Dump some internal state about VFs through debugfs. This provides information not available with 'ip link show'. To use, write "dump vf <id>" to the command file, or just "dump vf" to dump information on all of the VFs. Change-ID: Ibe32b7f4ae55d4358c0b903217475f708ada1ecd Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 16:38:23 -07:00
Alexander Duyck	0e626ff7cc	i40e: Fix support for flow director programming status This patch fixes an issue I introduced when I converted the code over to using the length field to determine if a descriptor was done or not. It turns out that we are also processing programming descriptors in the Rx path and need to have these processed even though the length field will be 0 on these packets. What will happen with a programming descriptor is that we will receive a descriptor that has the SPH bit set, and the header length and packet length fields cleared. To account for this we should be checking for the bit for split header being set even though we aren't actually using header split. This bit is set in the length field to indicate if a programming descriptor response is contained in the descriptor. Since we don't support header split we don't need to perform the extra checks of using a fixed value for the entire length field. In addition I am moving the function for checking if a filter is a programming status filter into the i40e_txrx.c file since there is no longer support for FCoE it doesn't make sense to keep this file in i40e.h. Change-ID: I12c359c3dc70adb9d6b92b27324bb2c7f04c1a06 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 16:36:18 -07:00
alice michael	53240e99db	i40e/i40evf: Remove VF Rx csum offload for tunneled packets Rx checksum offload for tunneled packets was never being negotiated or requested by VF. This capability was assumed by default and enabled in current hardware for VF. Going forward, this feature needs to be disabled or advanced ptypes should be negotiated with PF in the future. Change-ID: I9e54cfa8a90e03ab6956db4412f1e337ccd2c2e0 Signed-off-by: Preethi Banala <preethi.banala@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 16:32:53 -07:00
Tobias Klauser	4a0a3abfd9	i40evf: Use net_device_stats from struct net_device Instead of using a private copy of struct net_device_stats in struct i40evf_adapter, use stats from struct net_device. Also remove the now unnecessary .ndo_get_stats function. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 16:31:34 -07:00
Alexander Duyck	18a8cc9815	ixgbe: Fix output from ixgbe_dump I just found that when we had changed the Rx path to check for length instead of the DD bit we introduced an issue in ixgbe_dump since we were no longer clearing the status bits. To correct this I am updating ixgbe_dump to look for the length bits in the descriptor since that is what we are using in the Rx path. Fixes: `c3630cc40b` ("ixgbe: Use length to determine if descriptor is done") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 15:27:33 -07:00
Alexander Duyck	541ea69a90	ixgbe: Add support for maximum headroom when using build_skb This patch increases the headroom allocated when using build_skb on a system with 4K pages. Specifically the breakdown of headroom versus cache size is as follows (L1 cache size: headroom): 64: 192; 64 with NET_IP_ALIGN == 2: 194; 128: 128; 128 with NET_IP_ALIGN == 2: 130; 256: 512; 256 with NET_IP_ALIGN == 2: 258. I stopped at supporting only a cache line size of 256 as that was the largest cache size I could find supported in the kernel. With this we are guaranteeing at least 128 bytes of headroom to spare in the frame. This should be enough for us to insert a couple of IPv6 headers if needed which is likely enough room for anything XDP should need. I'm leaving the padding for systems with pages larger than 4K unmodified for now. XDP currently isn't really setup to work on those types of systems so we can cross that bridge when we get there. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:32:45 -07:00
Tony Nguyen	f4a6374ba4	ixgbe: add check for VETO bit when configuring link for KR We did not have a check in place for MMNGC.MNG_VETO when setting up link on X550EM_X KR devices which resulted in link loss for the BMC when loading the driver. This patch adds a check for ixgbe_check_reset_blocked() in setup_link() since in that case there is no PHY reset function. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:29:55 -07:00
Philippe Reynes	9668c93616	ixgbevf: use new api ethtool_{get\|set}_link_ksettings The ethtool api {get\|set}_settings is deprecated. We move this driver to new api {get\|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:28:31 -07:00
Don Skidmore	7ee814d7a6	ixgbe: Remove unused define Remove the Marvell 1145 PHY define as we have never had a device that supports it and have no plan to in the future. The existence of this define has caused confusion about whether or not this PHY was supported by ixgbe. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:26:53 -07:00
Emil Tantilov	5c11f00dda	ixgbe: do not use adapter->num_vfs when setting VFs via module parameter Avoid setting adapter->num_vfs early in the init code path when using the max_vfs module parameter by passing it to ixgbe_enable_sriov() as a function parameter. This fixes an issue where if we failed to allocate vfinfo in __ixgbe_enable_sriov() the driver will crash with NULL pointer in ixgbe_disable_sriov() when attempting to free the vfinfo struct based on adapter->num_vfs. Also it cleans up the assignment of adapter->num_vfs since now it will only be set in __ixgbe_enable_sriov() and cleared in ixgbe_disable_sriov(). Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:25:30 -07:00
Emil Tantilov	da614d042a	ixgbe: return early instead of wrap block in if statement Since we exit at the end of the block, we can save a level of indentation by performing an early return, and make the next several sections of code more legible, with fewer 80 character line breaks. Also moved allocating vfinfo at the beginning and the notification for enabling SRIOV at the end of the function when we know that it will succeed. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:24:04 -07:00
Emil Tantilov	2bc0972988	ixgbe: move num_vfs_macvlans allocation into separate function Move the code allocating memory for list of MAC addresses that the VFs can use for MACVLAN into its own function. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:11:54 -07:00
Emil Tantilov	5b9d3cfb6b	ixgbe: add default setup_link for x550em_a MAC type Add default setting for mac->ops.setup_link on x550em_a MAC types. This fixes a link issue on KR parts. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:10:03 -07:00
Don Skidmore	8e5c9c534f	ixgbe: list X553 backplane speeds correctly We forgot to indicate some of the supported speed on the X553 backplane. This patch attempts to correct for that. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:08:48 -07:00
Don Skidmore	18e01ee75f	ixgbe: Add X552 XFI backplane support This patch adds support for the X552 XFI backplane interface. The XFI backplane requires a custom tuned link. HW/FW owns the link config for the XFI backplane and SW must not interfere with it. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:06:54 -07:00
Don Skidmore	18bda0d93b	ixgbe: Complete support for X553 sgmii The initial patches supporting X553 sgmii forgot some details. This patch should cover those missing spots. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 13:05:55 -07:00
Tony Nguyen	5fbf5addad	ixgbe: Remove driver config for KX4 PHY The KX4 PHY is configured by the NVM. Currently, the driver is overwriting the config; remove the code associated with KX4 configuration. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>	2017-04-18 13:04:09 -07:00
Joe Perches	332f235836	ixgbe: Remove pr_cont uses As pr_cont output can be interleaved by other processes, using pr_cont should be avoided where possible. Miscellanea: - Use a temporary pointer to hold the next descriptions and consolidate the pr_cont uses - Use the temporary buffer to hold the 8 u32 register values and emit those in a single go - Coalesce formats and logging neatening around those changes - Fix a defective output for the rx ring entry description when also emitting rx_buffer_info data This reduces overall object size a tiny bit too. $ size drivers/net/ethernet/intel/ixgbe/.o text data bss dec hex filename 62167 728 12 62907 f5bb drivers/net/ethernet/intel/ixgbe/ixgbe_main.o.new 62273 728 12 63013 f625 drivers/net/ethernet/intel/ixgbe/ixgbe_main.o.old Signed-off-by: Joe Perches <joe@perches.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 12:59:44 -07:00
Usha Ketineni	b5d8acbb87	ixgbe: Avoid Tx hang by not allowing more than the number of VFs supported. When DCB is enabled, add checks to ensure creation of number of VF's is valid based on the traffic classes configured by the device. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Tested-by: Ronald Bynoe <ronald.j.bynoe@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-18 12:53:52 -07:00
Alexander Duyck	f8b45b74cc	i40e/i40evf: Use build_skb to build frames This patch is meant to improve the performance of the Rx path. Specifically by using build_skb we have several distinct advantages. In the case of small frames we were previously using a copy-break approach. This means that we were allocating a page fragment to use for skb->head, and were having to copy the packet into that region. Both of those calls are now avoided since we just build the skb around the data. In the case of large frames the gains are much more significant. Specifically we were having to allocate skb->head, and copy the headers as before. However in addition we were having to parse the header using eth_get_headlen which could be quite expensive. All of this is avoided by building the frame around the data. I have seen gains as high as 30% when using VXLAN for instance due to just header pulling overhead. Finally with all this in place it also sets us up to start looking at enabling XDP. Specifically we now have a path in which the data is in the page and the frame is built around it. So if we parse it with XDP before we call build_skb we can take care of any necessary processing there. Change-ID: Id4bdd618e94473d41f892417e5d8019639e421e3 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:51 -07:00
Alexander Duyck	ca9ec0888d	i40e/i40evf: Add support for padding start of frames This patch adds padding to the start of frames to make room for headroom for us to eventually start using build_skb. Right now we guarantee at least NET_SKB_PAD + NET_IP_ALIGN, however we allocate more space if more is available. For example on x86 the headroom should be 192 bytes. On systems that have too large of a cache line size to support storing 1.5K padding and shared info we default to using 3K buffers and reserve everything that isn't used for skb_shared_info or the data buffer for headroom. Change-ID: I33c641c9a1ea10cf7cc484c2d20985368d2d709a Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:51 -07:00
Alexander Duyck	98efd69493	i40e/i40evf: Add support for using order 1 pages with a 3K buffer There are situations where adding padding to the front and back of an Rx buffer will require that we add additional padding. Specifically if NET_IP_ALIGN is non-zero, or the MTU size is larger than 7.5K we would need to use 2K buffers which leaves us with no room for the padding. To preemptively address these cases I am adding support for 3K buffers to the Rx path so that we can provide the additional padding needed in the event of NET_IP_ALIGN being non-zero or a cache line being greater than 64. Change-ID: I938bc1ba611285428df39a613cd66f98e60b55c7 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:50 -07:00
Jacob Keller	33512191fe	i40e: clean up historic deprecated flag definitions Since an early commit a few flags have no longer been used. Remove these definitions to reduce code clutter. Change-ID: I3589be4622574e747013cd4dc403e18b039f4965 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:50 -07:00
Alice Michael	78786d4a59	i40e: remove I40E_FLAG_NEED_LINK_UPDATE The I40E_FLAG_NEED_LINK_UPDATE was never used. Remove the flag definitions. Change-ID: If59d0c6b4af85ca27281f3183c54b055adb439a4 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:50 -07:00
Jacob Keller	af26ce2dfb	i40e: remove extraneous loop in i40e_vsi_wait_queues_disabled We can simply check both Tx and Rx queues in a single loop, rather than repeating the loop twice. Change-ID: Ic06f26b0e3c2620e0e33c1a2999edda488e647ad Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:50 -07:00
Jacob Keller	41c4c2b50d	i40e: allow look-up of MAC address from Open Firmware or IDPROM Look up the MAC address from the eth_get_platform_mac_address() function first before checking what the firmware provides. We already handle the case of re-writing the MAC-VLAN filter, so there is no need to add extra code for this. However, update the comment where we do this to indicate that it does impact the Open Firmware MAC address case. Change-ID: I73e59fbe0b0e7e6f3ee9f5170d0bd3a4d5faf4db Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:49 -07:00
Alan Brady	17daabb5e8	i40e: Simplify i40e_detect_recover_hung_queue logic This patch greatly reduces the unneeded complexity in the i40e_detect_recover_hung_queue code path. The previous implementation set a 'hung bit' which would then get cleared while polling. If the detection routine was called a second time with the bit already set, we would issue a software interrupt. This patch makes it such that if interrupts are disabled and we have pending TX descriptors, we trigger a software interrupt since, in the worst case, queues are already clean and we have an extra interrupt. Additionally this patch removes the workaround for lost interrupts as calling napi_reschedule in this context can cause software interrupts to fire on the wrong CPU. Change-ID: Iae108582a3ceb6229ed1d22e4ed6e69cf97aad8d Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:49 -07:00
Maciej Sosin	373149fc99	i40e: Decrease the scope of rtnl lock Previously the rtnl lock was held during the whole reset procedure, which stopped other PFs from running their reset procedures. As a result, the reset was not handled properly and a host reset was the only way to recover. Change-ID: I23c0771c0303caaa7bd64badbf0c667e25142954 Signed-off-by: Maciej Sosin <maciej.sosin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:49 -07:00
Alexander Duyck	e8c5f7231c	i40e: Swap use of pf->flags and pf->hw_disabled_flags for ATR Eviction This is a minor cleanup so that we are always updating pf->flags when we make a change to the private flags instead of updating a mix of either pf->flags and/or pf->hw_disabled_flags. In addition I went through and cleaned out all the spots where we were using the X722 define in regards to this flag. Lastly since we changed the logic I went through and flushed out any redundancy and cleaned up the handling of the flags in the Tx path. Change-ID: I79ff95a7272bb2533251ff11ef91e89ccb80b610 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:49 -07:00
Jacob Keller	a346fb836c	i40e: update error message when trying to add invalid filters Re-word the error message displayed when adding a filter with an invalid flow type. Additionally, report a distinct error message when the IPv4 protocol is at fault. Change-ID: Iba3d85b87f8d383c97c8bdd180df34a6adf3ee67 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:48 -07:00
Mitch Williams	004eb614c4	i40e: only register client on iWarp-capable devices The client interface is only intended for use on devices that support iWarp. Only register with the client if this is the case. This fixes a panic when loading i40iw on X710 devices. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Reported-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:48 -07:00
Mitch Williams	921c467c6b	i40e: close client on remove and shutdown When the driver is removed or shut down, close any attached clients (i.e. i40iw). This prevents a panic seen sometimes on forced driver removal or system shutdown when iWarp is running. Change-ID: I4f6161e5a73ffbb2fd5883567b007310302bfcb5 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-06 20:22:28 -07:00
Mitch Williams	8090f6183c	i40e: register existing client on probe In some cases, a client (i40iw) may already be present when probe is called. Check for this, and add a client instance if necessary. Change-ID: I2009312694b7ad81f1023919e4c6c86181f21689 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-06 20:22:28 -07:00
Mitch Williams	295c0a5550	i40e: remove client instance on driver unload When the driver is unloaded, we need to remove the client instance, otherwise we leak memory. Change-ID: If1e7882ac1f6ce15d004722fafbe31afbe0adc9a Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-06 20:22:27 -07:00
Preethi Banala	bacd75cfac	i40e/i40evf: Add capability exchange for outer checksum This patch adds a capability negotiation between VF and PF using ENCAP/ ENCAP_CSUM offload flags in order for the VF to support outer checksum and TSO offloads for encapsulated packets. These capabilities were assumed by default and enabled in current hardware. Going forward, these features need to be negotiated with PF before advertising to the stack. Additionally, strip out the mac.type checks for X722 since outer checksums are enabled based on the ENCAP_CSUM offload negotiation flag and maintain consistency between drivers in how the features are configured. Change-ID: Ie380a6f57eca557a2bb575b66b12fae36d308920 Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-06 20:14:51 -07:00
David S. Miller	b404127879	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2017-04-05 This series contains updates to fm10k only. Phil Turnbull from Oracle fixes an issue where the argument provided to FM10K_REMOVED macro was not what was expected. Jake modifies the driver to replace the bitwise operators and defines with a BITMAP and enumeration values to avoid race conditions. Also future proof the driver so that developers do not have to remember to re-size the bitmaps when adding new values. Fixed the wording of a code comment to avoid stating that we return a value for a void function. Ngai-Mint makes sure that when configuring the receive ring, we make sure the receive queue is disabled. Fixed an issue where interfaces were resetting because the transmit mailbox FIFO was becoming full since the host was not ready, so ensure the host is ready before queueing up mailbox messages. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-06 13:53:14 -07:00
David S. Miller	6f14f443d3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Mostly simple cases of overlapping changes (adding code nearby, a function whose name changes, for example). Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-06 08:24:51 -07:00
Ngai-Mint Kwan	7d4fe0d123	fm10k: do not enqueue mailbox when host not ready Interfaces will reset whenever the TX mailbox FIFO has become full. This occurs more frequently whenever the IES API application is not running to process and clear the messages in the FIFO. Thus, this could lead to situations where the interface would enter an infinite reset loop. That is: if the interface is trying to synchronize a huge number of unicast and multicast entries with the IES API application, the TX mailbox FIFO will become full and the interface resets. Once the interface exits reset, it'll try to synchronize the unicast and multicast entries again. Ergo, this creates an infinite loop. Other actions such as multiple multicast mode or up/down transitions will fill the TX mailbox FIFO and induce the interface to reset. To correct these situations, check if the interface's "host_ready" flag is enabled before enqueuing any messages to the TX mailbox FIFO. This check will be conducted by a function call. Lastly, this issue mainly affects the PF and, thus, the VF is exempt. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-05 22:47:31 -07:00
Ngai-Mint Kwan	16b1889f8b	fm10k: disable receive queue when configuring ring Write to RXQCTL register to disable the receive queue when configuring the RX ring. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-05 22:47:31 -07:00
Jacob Keller	02957703ca	fm10k: update function header comment for fm10k_get_stats64 Re-word the comment to avoid stating that we return a value for this void function. Additionally, there is no need to mention older kernels, since this is the upstream kernel. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-05 22:47:31 -07:00
Jacob Keller	b4fd8ffc11	fm10k: allow service task to reschedule itself If some code path executes fm10k_service_event_schedule(), it is guaranteed that we only queue the service task once, since we use __FM10K_SERVICE_SCHED flag. Unfortunately this has a side effect that if a service request occurs while we are currently running the watchdog, it is possible that we will fail to notice the request and ignore it until the next time the request occurs. This can cause problems with pf/vf mailbox communication and other service event tasks. To avoid this, introduce a FM10K_SERVICE_REQUEST bit. When we successfully schedule (and set the _SCHED bit) the service task, we will clear this bit. However, if we are unable to currently schedule the service event, we just set the new SERVICE_REQUEST bit. Finally, after the service event completes, we will re-schedule if the request bit has been set. This should ensure that we do not miss any service event schedules, since we will re-schedule it once the currently running task finishes. This means that for each request, we will always schedule the service task to run at least once in full after the request came in. This will avoid timing issues that can occur with the service event scheduling. We do pay a cost in re-running many tasks, but all the service event tasks use either flags to avoid duplicate work, or are tolerant of being run multiple times. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-05 22:47:30 -07:00
Jacob Keller	4692955787	fm10k: future-proof state bitmaps using DECLARE_BITMAP This ensures that future programmers do not have to remember to re-size the bitmaps due to adding new values. Although this is unlikely for this driver, it may happen and it's best to prevent it from ever being an issue. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-05 22:47:30 -07:00
Jacob Keller	3ee7b3a3b9	fm10k: use a BITMAP for flags to avoid race conditions Replace bitwise operators and #defines with a BITMAP and enumeration values. This is similar to how we handle the "state" values as well. This has two distinct advantages over the old method. First, we ensure correctness of operations which are currently problematic due to race conditions. Suppose that two kernel threads are running, such as the watchdog and an ethtool ioctl, and both modify flags. We'll say that the watchdog is CPU A, and the ethtool ioctl is CPU B. CPU A sets FLAG_1, which can be seen as CPU A read FLAGS CPU A write FLAGS | FLAG_1 CPU B sets FLAG_2, which can be seen as CPU B read FLAGS CPU B write FLAGS | FLAG_2 However, "|=" and "&=" operators are not actually atomic. So this could be ordered like the following: CPU A read FLAGS -> variable CPU B read FLAGS -> variable CPU A write FLAGS (variable | FLAG_1) CPU B write FLAGS (variable | FLAG_2) Notice how the 2nd write from CPU B could actually undo the write from CPU A because it isn't guaranteed that the |= operation is atomic. In practice the race window for most flag writes is incredibly narrow so it is not easy to isolate issues. However, the more flags we have, the more likely they will cause problems. Additionally, if such a problem were to arise, it would be incredibly difficult to track down. Second, there is an additional advantage beyond code correctness. We can now automatically size the BITMAP if more flags were added, so that we do not need to remember that flags is u32 and thus if we added too many flags we would over-run the variable. This is not a likely occurrence for fm10k driver, but this patch can serve as an example for other drivers which have many more flags. This particular change does have a bit of trouble converting some of the idioms previously used with the #defines for flags. Specifically, when converting FM10K_FLAG_RSS_FIELD_IPV[46]_UDP flags. This whole operation was actually quite problematic, because we actually stored flags separately. This could more easily show the problem of the above re-ordering issue. This is really difficult to test whether atomics make a difference in practical scenarios, but you can ensure that basic functionality remains the same. This patch has a lot of code coverage, but most of it is relatively simple. While we are modifying these files, update their copyright year. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-05 22:47:30 -07:00
Phil Turnbull	540fca35e3	fm10k: correctly check if interface is removed FM10K_REMOVED expects a hardware address, not a 'struct fm10k_hw'. Fixes: `5cb8db4a4c` ("fm10k: Add support for VF") Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-05 22:47:30 -07:00
Florian Westphal	282ccf6efb	drivers: add explicit interrupt.h includes These files all use functions declared in interrupt.h, but currently rely on implicit inclusion of this file (via netns/xfrm.h). That won't work anymore when the flow cache is removed so include that header where needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-30 11:05:34 -07:00
Wyborny, Carolyn	d08a9f6cd1	i40e: fix for queue timing delays This patch adds a delay to Rx queue disables to accommodate HW needs. v2: Added missing check for disable only, additional details on the need for the ugly delay and fixed spacing on comment. Change-ID: I2864ca667ce5dcc2cc44f8718113b719742a46a1 Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:23:00 -07:00
Alexander Duyck	dab86afdbb	i40e/i40evf: Change the way we limit the maximum frame size for Rx This patch changes the way we handle the maximum frame size for the Rx path. Previously we were rounding up to 2K for a 1500 MTU and then bringing the max frame size down to MTU plus a fixed amount. With this patch applied what we now do is limit the maximum frame to 1.5K minus the value for NET_IP_ALIGN for standard MTU, and for any MTU greater than 1500 we allow up to the maximum frame size. This makes the behavior more consistent with the other drivers such as igb which had similar logic. In addition it reduces the test matrix for MTU since we only have two max frame sizes that are handled for Rx now. Change-ID: I23a9d3c857e7df04b0ef28c64df63e659c013f3f Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:07 -07:00
Alexander Duyck	c424d4a3dd	i40e/i40evf: Add legacy-rx private flag to allow fallback to old Rx flow This patch adds a control which will allow us to toggle into and out of the legacy Rx mode. The legacy Rx mode is what we currently do when performing Rx. As I make further changes what should happen is that the driver will fall back to the behavior for Rx as of this patch should the "legacy-rx" flag be set to on. Change-ID: I0342998849bbb31351cce05f6e182c99174e7751 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	fa2343e903	i40e/i40evf: Break i40e_fetch_rx_buffer up to allow for reuse of frag code This patch is meant to clean up the code in preparation for us adding support for build_skb. Specifically we deconstruct i40e_fetch_rx_buffer into several functions so that those functions can later be reused when we add a path for build_skb. Specifically with this change we split out the code for adding a page to an existing skb. Change-ID: Iab1efbab6b8b97cb60ab9fdd0be1d37a056a154d Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	a0cfc3130e	i40e/i40evf: Pull out code for cleaning up Rx buffers This patch pulls out the code responsible for handling buffer recycling and page counting and distributes it through several functions. This allows us to commonize the bits that handle either freeing or recycling the buffers. As far as the page count tracking one change to the logic is that pagecnt_bias is decremented as soon as we call i40e_get_rx_buffer. It is then the responsibility of the function that pulls the data to either increment the pagecnt_bias if the buffer can be recycled as-is, or to update page_offset so that we are pointing at the correct location for placement of the next buffer. Change-ID: Ibac576360cb7f0b1627f2a993d13c1a8a2bf60af Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	9a064128fc	i40e/i40evf: Pull code for grabbing and syncing rx_buffer from fetch_buffer This patch pulls the code responsible for fetching the Rx buffer and synchronizing DMA into a function, specifically called i40e_get_rx_buffer. The general idea is to allow for better code reuse by pulling this out of i40e_fetch_rx_buffer. We dropped a couple of prefetches since the time between the prefetch being called and the data being accessed was too small to be useful. Change-ID: I4885fce4b2637dbedc8e16431169d23d3d7e79b9 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	d57c0e08c7	i40e/i40evf: Use length to determine if descriptor is done This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. Change-ID: Iebdf9cdb36c454ef092df27199b92ad09c374231 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Jacob Keller	3a104f8df2	i40e: remove FDIR_REQUIRES_REINIT driver flag This flag hasn't been used since commit `1e1be8f622` ("i40e: ATR policy change to flush the table to clean stale ATR rules"). Let's simplify things and just remove it. Change-ID: I76279d84db8a2fd96f445b96aa413059f9256879 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Jacob Keller	d9eaf12e85	i40e: remove a useless goto statement The goto found here for when in MFP mode is pointless. It jumps to the end of a series of if blocks. However, right after this statement is a closing '}' for this if block, which will result in the program flow going to the exact same location as the goto statement indicates. Thus, regardless of whether we are in MFP mode, the program flow will resume from the same location. This arose due to various refactoring which did not notice that this goto became essentially a no-op. To properly understand this diff you will need to view a larger context than is given by default. Change-ID: I088f73c3831aa5c4e2281380c7a3ce605594300c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Christopher N Bednarz	1fca3265be	i40e: Check for new arq elements before leaving the adminq subtask loop Fix a case where we miss an arq element if a new one is added before we enable interrupts and exit the arq subtask loop. This occurs frequently with RDMA running on Windows VF and causes long delays that prevent SMB from establishing connections. Change-ID: I3e1c8b2b960c12857d9b8275bea2c1563674392e Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Paul M Stillwell Jr	6030308ef8	i40e: use register for XL722 control register read/write The XL722 doesn't support the AQ command to read/write the control register so enable it to bypass the check and use the direct read/write method. Change-ID: Iefecc737b57207485c90845af5989d5af518bf16 Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	aca955d831	i40e: Clean up handling of private flags This patch cleans up and addresses several issues in the way that i40e handles private flags. Previously the code was choosing fixed bits and trying to match them up with strings in a somewhat haphazard way. This resulted in the possibility for adding a new bit and causing a mismatch as the private flags are linear bits starting at 0, and the private flags in the driver were split up over a group specific to the PF and a group that was global. What this change does is define an array of structs used to represent the private flags. Contained within the structs are the bits necessary to know which flags to set and/or clear depending on the state of the bit. By doing this we can add new bits in the future with minimal overhead and avoid creating possible mis-matches should we need to remove a flag based on compile options. Change-ID: Ia3214ab04f0ab2f70354ac0997a135f1d01b0acd Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Preethi Banala	b1cb07db6e	i40evf: enforce descriptor write-back mechanism for VF The current driver mode is to use a write-back mechanism for the head register which indicates transmit completions. The VF driver needs to be able to work on hardware that exclusively uses descriptor write-back, so change the default driver mode of operation to descriptor write-back for VF. In our analysis, performance wasn't significantly different with either write-back method. Change-ID: Ia92e4ec77c2df8dc4515c71d53746d57d77759af Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Jacob Keller	7be147dc14	i40e: initialize params before notifying of l2_param_changes Probably due to some mis-merging fix a bug associated with commits `d7ce6422d6` ("i40e: don't check params until after checking for client instance", 2017-02-09) and 3140aa9a78c9 ("i40e: KISS the client interface", 2017-03-14) The first commit tried to move the initialization of the params structure so that we didn't bother doing this if we didn't have a client interface. You can already see that it looks fishy because of the indentation. The second commit refactors a bunch of the interface, and incorrectly drops the params initialization. I believe what occurred is that internally the two patches were re-ordered, and the merge conflicts as a result were performed incorrectly. Fix the use of an uninitialized variable by correctly initializing the params variable via i40e_client_get_params(). Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:44 -07:00
Colin Ian King	703ba88548	i40evf: dereference VSI after VSI has been null checked VSI is being dereferenced before the VSI null check; if VSI is null we end up with a null pointer dereference. Fix this by performing VSI dereference after the VSI null check. Also remove the need for using adapter by using vsi->back->cinst. Detected by CoverityScan, CID#1419696, CID#1419697 ("Dereference before null check") Fixes: `ed0e894de7` ("i40evf: add client interface") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:44 -07:00
Alexander Duyck	c76cb6ed54	i40e: Drop FCoE code that always evaluates to false or 0 Since FCoE isn't supported by the i40e products there isn't much point in carrying around code that will always evaluate to false. This patch goes through and strips out the code in several spots so that we don't go around carrying variables and/or code that is always going to evaluate to false or 0. Change-ID: I39d1d779c66c638b75525839db2b6208fdc809d7 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:44 -07:00
Alexander Duyck	9eed69a914	i40e: Drop FCoE code from core driver files Looking over the code for FCoE it looks like the Rx path has been broken at least since the last major Rx refactor almost a year ago. It seems like FCoE isn't supported for any of the Fortville/Fortpark hardware so there isn't much point in carrying the code around, especially if it is broken and untested. Change-ID: I892de8fa551cb129ce2361e738ff82ce55fa229e Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:43 -07:00
Alexander Duyck	a5b268e4b1	i40e/i40evf: Clean-up process_skb_fields This is a minor clean-up to make the i40e/i40evf process_skb_fields function look a little more like what we have in igb. The Rx checksum function called out a need for skb->protocol but I can't see where it actually needs it. I am assuming this is something that was likely refactored out some time ago as the Rx checksum code has gone through a few rewrites. Change-ID: I0b4668a34d90b61b66ded7c7c26e19a3e2d06251 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:43 -07:00
Bimmy Pujari	0a25b7311d	i40e: removed no longer needed delays Removed no longer needed delays. At preproduction stage those delays were needed but now these delays are not needed. Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:43 -07:00
Robert Konklewski	beff3e9d80	i40e: Fixed race conditions in VF reset First, this patch eliminates IOMMU DMAR Faults caused by VF hardware. This is done by enabling VF hardware only after VSI resources are freed. Otherwise, hardware could DMA into memory that is (or just has been) being freed. Then, the VF driver is activated only after VSI resources have been reallocated. That's because the VF driver can request resources immediately after it's activated. So they need to be ready at that point. The second race condition happens when the OS initiates a VF reset, and then before it's finished modifies VF's settings by changing its MAC, VLAN ID, bandwidth allocation, anti-spoof checking, etc. These functions needed to be blocked while VF is undergoing reset. Otherwise, they could operate on data structures that had just been freed or not yet fully initialized. Change-ID: I43ba5a7ae2c9a1cce3911611ffc4598ae33ae3ff Signed-off-by: Robert Konklewski <robertx.konklewski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:45:14 -07:00
Alexander Duyck	741b8b832a	i40e/i40evf: Fix use after free in Rx cleanup path We need to reset skb back to NULL when we have freed it in the Rx cleanup path. I found one spot where this wasn't occurring so this patch fixes it. Change-ID: Iaca68934200732cd4a63eb0bd83b539c95f8c4dd Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:45:14 -07:00
Harshitha Ramamurthy	f25571b576	i40e: fix configuration of RSS table with DCB There exists a bug in the driver where the calculation of the RSS size was not taking into account the number of traffic classes enabled. This patch factors in the traffic classes both in the initial configuration of the table as well as reconfiguration. Change-ID: I34dcd345ce52faf1d6b9614bea28d450cfd5f621 Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:45:14 -07:00
Alexander Duyck	1793668c3b	i40e/i40evf: Update code to better handle incrementing page count Update the driver code so that we do bulk updates of the page reference count instead of just incrementing it by one reference at a time. The advantage to doing this is that we cut down on atomic operations and this in turn should give us a slight improvement in cycles per packet. In addition if we eventually move this over to using build_skb the gains will be more noticeable. I also found and fixed a store forwarding stall from where we were assigning "new_buff = old_buff". By breaking it up into individual copies we can avoid this and as a result the performance is slightly improved. Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:45:13 -07:00
Alexander Duyck	13a8cd191a	i40e: Do not enable NAPI on q_vectors that have no rings When testing the epoll w/ busy poll code I found that I could get into a state where the i40e driver had q_vectors w/ active NAPI that had no rings. This was resulting in a divide by zero error. To correct it I am updating the driver code so that we only support NAPI on q_vectors that have 1 or more rings allocated to them. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:28:55 -07:00
Jeff Kirsher	9f47a48e6e	Revert "e1000e: driver trying to free already-free irq" This reverts commit `7e54d9d063`. After additional regression testing, several users are experiencing kernel panics during shutdown on e1000e devices. Reverting this change resolves the issue. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:43:11 -07:00
Jacob Keller	584a88709b	i40e: make use of hlist_for_each_entry_continue Replace a complex if->continue->else->break construction in i40e_next_filter. We can simply use hlist_for_each_entry_continue instead. This drops a lot of confusing code. The intention of the resulting code is much easier to understand, and it follows the more normal pattern for using hlist loops. We could have also used a break with a "return next" at the end of the function, instead of return NULL, but the current implementation is explicitly clear that when you reach the end of the loop you get a NULL value. The alternative construction is less clear since the reader would have to know that next is NULL at the end of the loop. Change-Id: Ife74ca451dd79d7f0d93c672bd42092d324d4a03 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	f223c8752a	i40e: add support for SCTPv4 FDir filters Enable FDir filters for SCTPv4 packets using the ethtool ntuple interface to enable filters. The ethtool API does not allow masking on the verification tag. Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	0e588de17f	i40e: implement support for flexible word payload Add support for flexible payloads passed via ethtool user-def field. This support is somewhat limited due to hardware design. The input set can only be programmed once per filter type, and the flexible offset is part of this filter input set. This means that the user cannot program both a regular and a flexible filter at the same time for a given flow type. Additionally, the user may not program two flexible filters of the same flow type with different offsets, although they are allowed to configure different values at that offset location. We support a single flexible word (2byte) value per protocol type, and we handle the FLX_PIT register using a list of flexible entries so that each flow type may be configured separately. Due to hardware implementation, the flexible data is offset from the start of the packet payload, and thus may not be part of the header data. For this reason, the offset provided by the user defined data is interpreted as a byte offset from the start of the matching payload. Previous implementations have tried to represent the offset as from the start of the frame, but this is not feasible because header sizes may change due to options. Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	e793095e8a	i40e: add parsing of flexible filter fields from userdef Add code to parse the user-def field into a data structure format. This code is intended to allow future extensions of the user-def field by keeping all code that actually reads and writes the field into a single location. This ensures that we do not litter the driver with references to the user-def field and minimizes the amount of bitwise operations we need to do on the data. Add code which parses the lower 32bits into a flexible word and its offset. This will be used in a future patch to enable flexible filters which can match on some arbitrary data in the packet payload. For now, we just return -EOPNOTSUPP when this is used. Add code to fill in the user-def field when reporting the filter back, even though we don't actually implement any user-def fields yet. Additionally, ensure that we mask the extended FLOW_EXT bit from the flow_type now that we will be accepting filters which have the FLOW_EXT bit set (and thus make use of the user-def field). Change-Id: I238845035c179380a347baa8db8223304f5f6dd7 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	43b15697a3	i40e: partition the ring_cookie to get VF index Do not use the user-def field for determining the VF target. Instead, similar to ixgbe, partition the ring_cookie value into 8bits of VF index, along with 32bits of queue number. This is better than using the user-def field, because it leaves the field open for extension in a future patch which will enable flexible data. Also, this matches with convention used by ixgbe and other drivers. Change-Id: Ie36745186d817216b12f0313b99ec95cb8a9130c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	9229e99334	i40e: allow changing input set for ntuple filters Add support to detect when we can update the input set for each flow type. Because the hardware only supports a single input set for all flows of that matching type, the driver shall only allow the input set to change if there are no other configured filters for that flow type. Thus, the first filter added for each flow type is allowed to change the input set, and all future filters must match the same input set. Display a diagnostic message whenever the filter input set changes, and a warning whenever a filter cannot be accepted because it does not match the configured input set. Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	3bcee1e653	i40e: restore default input set for each flow type Ensure that the default input set is correctly reprogrammed when cleaning up after disabling flow director support. This ensures that the programmed value will be in a clean state. Although we do not yet have support for SCTPv4 filters, a future patch will add support for this protocol, so we will correctly restore the SCTPv4 input set here as well. Note that strictly speaking the default hardware value for SCTP includes matching the verification tag. However, the ethtool API does not have support for specifying this value, so there is no reason to keep the verification field enabled. This patch is the next step on the way to enabling partial tuple filters which will be implemented in a following patch. Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	36777d9fa2	i40e: check current configured input set when adding ntuple filters Do not assume that hardware has been programmed with the default mask, but instead read the input set registers to determine what is currently programmed. This ensures that all programmed filters match exactly how the hardware will interpret them, avoiding confusion regarding filter behavior. This sets the initial ground-work for allowing custom input sets where some fields are disabled. A future patch will fully implement this feature. Instead of using bitwise negation, we'll just explicitly check for the correct value. htonl and htons are used to silence sparse warnings. The compiler should be able to handle the constant value and avoid actually performing a byteswap. Change-Id: I3d8db46cb28ea0afdaac8c5b31a2bfb90e3a4102 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	faa16e0f38	i40e: correctly honor the mask fields for ETHTOOL_SRXCLSRLINS The current implementation of .set_rxnfc does not properly read the mask field for filter entries. This results in incorrect driver behavior, as we do not reject filters which have masks set to ignore some fields. The current implementation simply assumes that every part of the tuple or "input set" is specified. This results in filters not behaving as expected, and not working correctly. As a first step in supporting some partial filters, add code which checks the mask fields and rejects any filters which do not have an acceptable mask. For now, we just assume that all fields must be set. This will get the driver one step towards allowing some partial filters. At a minimum, the ethtool commands which previously installed filters that would not function will now return a non-zero exit code indicating failure instead. We should now be meeting the minimum requirements of the .set_rxnfc API, by ensuring that all filters we program have a valid mask value for each field. Finally, add code to report the mask correctly so that the ethtool command properly reports the mask to the user. Note that the typecast to (__be16) when checking source and destination port masks is required because the ~ bitwise negation operator does not correctly handle variables other than integer size. Change-Id: Ia020149e07c87aa3fcec7b2283621b887ef0546f Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Philippe Reynes	f22913d0b5	ixgb: use new API ethtool_{get|set}_link_ksettings The ethtool API {get|set}_settings is deprecated. We move this driver to new API {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-21 15:53:19 -07:00
Philippe Reynes	1120ecd5aa	igbvf: use new API ethtool_{get|set}_link_ksettings The ethtool API {get|set}_settings is deprecated. We move this driver to new API {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-21 15:51:39 -07:00
Philippe Reynes	c19153008b	igb: use new API ethtool_{get|set}_link_ksettings The ethtool API {get|set}_settings is deprecated. We move this driver to new API {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-21 15:42:19 -07:00
Philippe Reynes	fb052fdd26	e1000e: use new API ethtool_{get|set}_link_ksettings The ethtool API {get|set}_settings is deprecated. We move this driver to new API {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-21 15:31:30 -07:00
Philippe Reynes	5add2f9a1b	e1000: use new API ethtool_{get|set}_link_ksettings The ethtool API {get|set}_settings is deprecated. We move this driver to new API {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-21 15:27:58 -07:00
David S. Miller	406910a8ba	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-03-20 This series contains updates to i40e and i40evf only. Philippe Reynes updates i40e and i40evf to use the new ethtool API for {get|set}_link_ksettings. Jake provides the remaining patches in the series, starting with a fix for i40e where the firmware expected the port numbers for the offloaded UDP tunnels in Little Endian format and we were sending them in Big Endian format, which caused the wrong port number to be put in the UDP tunnel list. Changed the driver to use __be32 values instead of arrays for (src|dst)_ip. Refactored the exit flow of i40e_add_fdir_ethtool() which removes the dependency on having a non-zero return value. Fixed a memory leak by running kfree() and returning immediately when we fail to add flow director filter. Fixed a potential issue where we could update the filter count without actually succeeding in adding a filter, by moving the ATR exit check to after we have sent the TCP/IPv4 filter to the ring successfully. Ensures that the fd_tcp_rule count is reset to 0, before we reprogram the filters so that we do not end up with a stale count which does not correctly reflect the number of programmed filters. Added a check whether we have TCP/IPv4 filters before re-enabling ATR after flushing and replaying FDIR filters. Added counters for each filter type in preparation for adding code to properly check the mask value. Fixed potential issues by explicitly checking the flow type at the start of i40e_add_fdir_ethtool(). To avoid possible memory leaks, we now unconditionally delete the old filter, even if it is identical to the new filter, which ensures we always update the filters as expected. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-21 14:35:21 -07:00
Jacob Keller	c6da525de7	i40e: always remove old filter when adding new FDir filter The previous code relied on i40e_match_fdir_input_set to determine whether to free the old filter. Change this code so that we simply unconditionally delete the old filter, even if it's identical to the new filter. This ensures that we don't leak any memory, and that we always update the filters as expected. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:22 -07:00
Jacob Keller	1ec8deac8c	i40e: explicitly fail on extended MAC field for ethtool_rx_flow_spec Although we will fail the filter later due to checking flow_type which will have a bogus invalid type, it is possible future refactoring will remove this hidden failure case. Avoid a possible issue in the future by explicitly checking the flow type at the start. Change-Id: Ia98eb26f7b93ccbe38c7141e8f203ef496fc6598 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:22 -07:00
Jacob Keller	097dbf5250	i40e: add counters for UDP/IPv4 and IPv4 filters In preparation for adding code to properly check the mask values, we will need to know the number of active filters for each type. Add counters for each filter type. Rename the already existing fd_tcp_rule to fd_tcp4_filter_cnt to match the style of other names. To avoid style warnings, avoid assigning multiple parameters at once, and fix up one other case where we did so previously. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:22 -07:00
Jacob Keller	510dd4609f	i40e: don't re-enable ATR when flushing filters if SB has TCP4/IPv4 rules When flushing and replaying FDIR filters, it is possible we would disable ATR, and then re-enable it even though we should have kept it disabled due to existing TCP/IPv4 filters. Fix this by checking whether we have TCP4/IPv4 filters before re-enabling. Alternatively, we could instead restore ATR and then replay filters, however, this would cause us to rapidly enable and then disable ATR in some cases. Change-ID: I076e4cc1e4409bce7f98f3c213295433a4ff43d8 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Avinash Dayanand <avinash.dayanand@intel.com> Reviewed-by: Alan Brady <alan.brady@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:21 -07:00
Jacob Keller	6d069425f0	i40e: reset fd_tcp_rule count when restoring filters Since we're about to reprogram the filters, we need to ensure that the fd_tcp_rule count is correctly reset to 0. Otherwise, we will keep a stale count that does not accurately reflect the number of programmed TCPv4 filters. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:21 -07:00
Jacob Keller	e122eb7482	i40e: remove redundant check for fd_tcp_rule when restoring filters i40e_fdir_filter_restore re-adds all existing filters, which already checks when adding a TCPv4 filter to disable ATR. We don't need to make the check twice, so remove this redundant code. Change-ID: Ia0b0690e23523915199d601494557def135c9d7f Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:21 -07:00
Jacob Keller	377cc24980	i40e: exit ATR mode only when adding TCP/IPv4 filter succeeds Move ATR exit check after we have sent the TCP/IPv4 filter to the ring successfully. This avoids an issue where we potentially update the filter count without actually succeeding in adding the filter. Now, we only increment the fd_tcp_rule after we've succeeded. Additionally, we will re-enable ATR mode only after deletion of the filter is actually posted to the FDIR ring. Change-ID: If5c1dea422081cc5e2de65618b01b4c3bf6bd586 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:20 -07:00
Jacob Keller	e5187ee3ee	i40e: return immediately when failing to add fdir filter Instead of setting err=true and checking this to determine when to free the raw_packet near the end of the function, simply kfree and return immediately. The resulting code is a bit cleaner and has one less variable. This also resolves a subtle bug in the ipv4 case which could fail to add the first filter and then never free the memory, resulting in a small memory leak. Change-ID: I7583aac033481dc794b4acaa14445059c8930ff1 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Avinash Dayanand <avinash.dayanand@intel.com> Reviewed-by: Alan Brady <alan.brady@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:20 -07:00
Jacob Keller	01016da1e5	i40e: rework exit flow of i40e_add_fdir_ethtool Refactor the exit flow of the i40e_add_fdir_ethtool function. Move the input_label to the end of the function, removing the dependency on having a non-zero return value. Add a comment explaining why it is ok not to free the fdir data structure, because the structure is now stored in the fdir_filter_list. Change-Id: I723342181d59cd0c9f3b31140c37961ba37bb242 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:20 -07:00
Jacob Keller	8ce43dce6f	i40e: don't use arrays for (src|dst)_ip The code originally included src_ip and dst_ip with enough space to support ipv6 filters. However, no actual support for ipv6 filters has been implemented. Thus, remove the arrays and just use __be32 values. Should ipv6 support be added in the future, we can replace these with a union that has sizes for both values. Change-Id: I1bc04032244a80eb6ebc8a4e6c723a4a665c1dd5 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:20 -07:00
Jacob Keller	fe0b0cd97b	i40e: send correct port number to AdminQ when enabling UDP tunnels The firmware expects the port numbers for offloaded UDP tunnels in Little Endian format. We accidentally sent the value in Big Endian format which obviously will cause the wrong port number to be put into the UDP tunnels list. This results in VxLAN and Geneve tunnel Rx offloads being essentially disabled, unless the port number happens to be identical after byte swapping. Note that i40e_aq_add_udp_tunnel() will byteswap the parameter from host order into Little Endian so we don't need to worry about passing strictly a __le16 value to the command. This patch essentially reverts `b3f5c7bc88` ("i40e: Fix for extra byte swap in tunnel setup", 2016-08-24), but in a way that makes the result much more clear to the reader. Fixes: `b3f5c7bc88` ("i40e: Fix for extra byte swap in tunnel setup", 2016-08-24) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Williams, Mitch A <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:19 -07:00
Philippe Reynes	48ce88022d	i40evf: use new api ethtool_{get|set}_link_ksettings The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:43:30 -07:00
Philippe Reynes	a7f909405b	i40e: use new api ethtool_{get|set}_link_ksettings The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 19:18:04 -07:00
Alexander Duyck	3a1eb6d10c	igb/ixgbe: Fix typo in igb_build_skb and/or ixgbe_build_skb code comment There was a typo that I had left in the code comments for the igb and ixgbe functions that enabled build_skb support. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:55:55 -07:00
Alexander Duyck	b1bb2eb0a0	igb: Re-add support for build_skb in igb This reverts commit `f9d40f6a99` ("igb: Revert support for build_skb in igb") and adds a few changes to update it to work with the latest version of igb. We are now able to revert the removal of this due to the fact that with the recent changes to the page count and the use of DMA_ATTR_SKIP_CPU_SYNC we can make the pages writable so we should not be invalidating the additional data added when we call build_skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	e014272672	igb: Break out Rx buffer page management At this point we have 2 to 3 paths that can be taken depending on what Rx modes are enabled. In order to better support that and improve the maintainability I am breaking out the common bits from those paths and making them into their own functions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	e3cdf68d4a	igb: Add support for padding packet With the size of the frame limited we can now write to an offset within the buffer instead of having to write at the very start of the buffer. The advantage to this is that it allows us to leave padding room for things like supporting XDP in the future. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	8649aaef40	igb: Add support for using order 1 pages to receive large frames This patch adds support for using 3K buffers in order 1 pages the same way we were using 2K buffers in 4K pages. We are reserving 1K of room for now to have space available for future headroom and tailroom when we enable build_skb support. One side effect of this patch is that we can end up using a larger buffer if jumbo frames is enabled. The impact shouldn't be too great, but it could hurt small packet performance for UDP workloads if jumbo frames is enabled as the truesize of frames will be larger. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	e08912985b	igb: Add support for ethtool private flag to allow use of legacy Rx Since there are potential drawbacks to the new Rx allocation approach I thought it best to add a "chicken bit" so that we can turn the feature off in the event that a problem is found. It also provides a means of validating the legacy Rx path in the event that we are forced to fall back. At some point in the future when we are convinced we don't need it anymore we might be able to drop the legacy-rx flag. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	3456fd5342	igb: Use page_address offset from page instead of masking virtual address Update the handling of page addresses so that we always refer to them using a void pointer, and try to use the consistent name of va indicating we are working with a virtual address. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	cb0ef1d1dc	igb: Only sync size of expected frame in ethtool testing We only need to sync the size of the frame that is read to test. We don't need to sync the entire Rx buffer. This way the testing is more consistent with how we handle things in the receive path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	cfbc871c21	igb: Limit maximum frame Rx based on MTU In order to support the use of build_skb going forward it will be necessary to place a maximum limit on the amount of data we can receive when jumbo frames is not enabled. In order to do this I am adding a new upper limit for receive based on the size of a 2K buffer minus padding. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	7cc6fd4c60	igb: Don't bother clearing Tx buffer_info in igb_clean_tx_ring In the case of the Tx rings we need to only clear the Tx buffer_info when we are resetting the rings. Ideally we do this when we configure the ring to bring it back up instead of when we are taking it down in order to avoid dirtying pages we don't need to. In addition we don't need to clear the Tx descriptor ring since we will fully repopulate it when we begin transmitting frames and next_to_watch can be cleared to prevent the ring from being cleaned beyond that point instead of needing to touch anything in the Tx descriptor ring. Finally with these changes we can avoid having to reset the skb member of the Tx buffer_info structure in the cleanup path since the skb will always be associated with the first buffer which has next_to_watch set. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	d2bead576e	igb: Clear Rx buffer_info in configure instead of clean This change makes it so that instead of going through the entire ring on Rx cleanup we only go through the region that was designated to be cleaned up and stop when we reach the region where new allocations should start. In addition we can avoid having to perform a memset on the Rx buffer_info structures until we are about to start using the ring again. By deferring this we can avoid dirtying the cache any more than we have to which can help to improve the time needed to bring the interface down and then back up again in a reset or suspend/resume cycle. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	7ec0116c91	igb: Use length to determine if descriptor is done This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. In addition I have updated the code so that we only reset the Rx descriptor length for descriptor zero when resetting a ring instead of having to do a memset with 0 over the entire ring. By doing this we can save some time on initialization. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	7bd1759282	igb: Add support for DMA_ATTR_WEAK_ORDERING Since we are already using DMA attributes in igb for Rx there is no reason why we can't also apply DMA_ATTR_WEAK_ORDERING which is needed on some platforms to improve performance. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:43 -07:00
Amritha Nambiar	56f36acd21	mqprio: Modify mqprio to pass user parameters via ndo_setup_tc. The configurable priority to traffic class mapping and the user specified queue ranges are used to configure the traffic class, overriding the hardware defaults when the 'hw' option is set to 0. However, when the 'hw' option is non-zero, the hardware QOS defaults are used. This patch makes it so that we can pass the data the user provided to ndo_setup_tc. This allows us to pull in the queue configuration if the user requested it as well as any additional hardware offload type requested by using a value other than 1 for the hw value. Finally it also provides a means for the device driver to return the level supported for the offload type via the qopt->hw value. Previously we were just always assuming the value to be 1, in the future values beyond just 1 may be supported. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-15 15:20:27 -07:00
Harshitha Ramamurthy	b77ac97593	i40e: rename auto_disable_flags to hw_disabled_flags A previous commit introduced a field that tracks the features that are disabled due to HW resource limitations as opposed to the features disabled by the user. This patch changes the name of the field to make it more readable since it might get confusing when looking at code containing both the flags field and the auto_disable_features field together. Change-ID: Idcc9888659698f6fe3ccff17c8c3f09b5026f708 Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 02:02:02 -07:00
Bimmy Pujari	15990832cd	i40e/i40evf: Change version from 1.6.27 to 2.1.7 Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 02:01:27 -07:00
Mitch Williams	4dbc566139	i40e: Allow untrusted VFs to have more filters Our original filter limit of 8 was based on behavior that we saw from Linux VMs. Now we're running Other Operating Systems under KVM and we see that they commonly use more MAC filters. Since it seems weird to require people to enable trusted VFs just to boot their OS, bump the number of filters allowed by default. Change-ID: I76b2dcb2ad6017e39231ad3096c3fb6f065eef5e Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 02:00:52 -07:00
Alexander Duyck	59605bc096	i40e/i40evf: Add support for mapping pages with DMA attributes This patch adds support for DMA_ATTR_SKIP_CPU_SYNC and DMA_ATTR_WEAK_ORDERING. By enabling both of these for the Rx path we are able to see performance improvements on architectures that implement either one due to the fact that page mapping and unmapping only has to sync what is actually being used instead of the entire buffer. In addition, enabling the weak ordering attribute allows a performance improvement for architectures that can associate a memory ordering with a DMA buffer such as Sparc. Change-ID: If176824e8231c5b24b8a5d55b339a6026738fc75 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 02:00:14 -07:00
Filip Sadowski	3954b39102	i40e: Clarify steps in MAC/VLAN filters initialization routine This patch clarifies the reason for removal of automatically firmware-generated filter and explicit addition of filter which accepts frames with any VLAN id. Change-ID: Iabf180b6d61c4d8a36d3bcf8457c377a6f2aca0e Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 01:59:35 -07:00
Lihong Yang	26f77e53cf	i40e: fix RSS queues only operating on PF0 This patch fixes the issue that RSS offloading only works on PF0 by using the direct register writing of the hash keys for the VFs instead of using the admin queue command to do so. Change-ID: Ia02cda7dbaa23def342e8786097a2c03db6f580b Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 01:58:57 -07:00
Lihong Yang	c271dd6c39	i40e: fix ethtool to get EEPROM data from X722 interface Currently ethtool -e will error out with a X722 interface as its EEPROM has a scope limit at offset 0x5B9FFF. This patch fixes the issue by setting the EEPROM length to the scope limit to avoid NVM read failure beyond that. Change-ID: I0b7d4dd6c7f2a57cace438af5dffa0f44c229372 Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 01:58:21 -07:00
Jacob Keller	c0cf70a6fc	i40e: don't add more vectors to num_lan_msix than number of CPUs This is a solution to avoid adding too many queues to num_lan_msix. A recent refactor of queue pairs accidentally added all remaining vectors to the num_lan_msix which can cause performance issues, due to enabling more queues than the number of CPU cores. This patch removes the old calculation, and replaces it with a simple algorithm. 1) add queue pairs up to num_online_cpus(), but capped at half of total vectors 2) then add alternative features such as flow director and similar 3) finally, add the remaining vectors back to queue pairs, but capped such that the total number of queue pairs does not exceed num_online_cpus(). Change-ID: I668abf67d5011a1248866daba8885f4ff00cb8d9 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 01:57:44 -07:00
Mitch Williams	0ef2d5afb1	i40e: KISS the client interface (KISS is Keep It Simple, Stupid. Or is it?) The client interface is vastly overengineered for what it needs to do. It was originally designed to support multiple clients on multiple netdevs, possibly even with multiple drivers. None of this happened, and now we know that there will only ever be one client for i40e (i40iw) and one for i40evf (i40iwvf). So, time for some KISS. Since i40e and i40evf are a Dynasty, we'll simplify this one to match the VF interface. First, be a Destroyer and remove all of the lists and locks required to support multiple clients. Keep one static around to keep track of one client, and track the client instances for each netdev in the driver's pf (or adapter) struct. Now it's Almost Human. Since we already know the client type is iWarp, get rid of any checks for this. Same for VSI type - it's always going to be the same type, so it's just a Parasite. While we're at it, fix up some comments. This makes the function headers actually match the functions. These changes reduce code complexity, simplify maintenance, squash some lurking timing bugs, and allow us to Rock and Roll All Nite. Change-ID: I1ea79948ad73b8685272451440a34507f9a9012e Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 01:57:08 -07:00
Mitch Williams	ed0e894de7	i40evf: add client interface In preparation for upcoming RDMA-capable hardware, add a client interface to the VF driver. This is a slightly-simplified version of the PF client interface, with the names changed to protect the innocent. Due to the nature of the VF<->PF interactions, the client interface sometimes needs to call back into itself to pass messages. Because of this, we can't use the coarse-grained locking like the PF's client interface uses. Instead, we handle all client interactions in a separate thread so the watchdog can still run and process virtual channel messages. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 01:31:38 -07:00
Shannon Nelson	d60be2ca9c	i40e: fix up recent proxy and wol bits for X722_SUPPORT Some opcodes added & reordered to be in numerical order with the rest of the opcodes. This patch adds admin queue structs to support Wake on LAN feature for X722. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-14 17:52:32 -07:00
Aaron Salter	96a39aed25	i40e: Acquire NVM lock before reads on all devices Acquire NVM lock before reads on all devices. Previously, locks were only used for X722 and later. Fixes an issue where simultaneous X710 NVM accesses were interfering with each other. Change-ID: If570bb7acf958cef58725ec2a2011cead6f80638 Signed-off-by: Aaron Salter <aaron.k.salter@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-14 17:52:32 -07:00
Philippe Reynes	8704f21c84	net: intel: ixgbe: use new api ethtool_{get|set}_link_ksettings The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 19:28:07 -07:00
Alexander Duyck	c74042f3b3	ixgbe: Limit use of 2K buffers on architectures with 256B or larger cache lines On architectures that have a cache line size larger than 64 Bytes we start running into issues where the amount of headroom for the frame starts shrinking. The size of skb_shared_info on a system with a 64B L1 cache line size is 320. This increases to 384 with a 128B cache line, and 512 with a 256B cache line. In addition the NET_SKB_PAD value increases as well consistent with the cache line size. As a result when we get to a 256B cache line as seen on the s390 we end up with 768 bytes used by padding and shared info leaving us with only 1280 bytes to use for data storage. On architectures such as this we should default to using 3K Rx buffers out of a 8K page instead of trying to do 1.5K buffers out of a 4K page. To take all of this into account I have added one small check so that we compare the max_frame to the amount of actual data we can store. This was already occurring for igb, but I had overlooked it for ixgbe as it doesn't have strict limits for 82599 once we enable jumbo frames. By adding this check we will automatically enable 3K Rx buffers as soon as the maximum frame size we can handle drops below the standard Ethernet MTU. I also went through and fixed one small typo that I found where I had left an IGB in a variable name due to a copy/paste error. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-02 18:03:39 -08:00
Paolo Abeni	d3aa9c9f21	ixgbe: update the rss key on h/w, when ethtool ask for it Currently ixgbe_set_rxfh() updates the rss_key copy in the driver memory, but does not push the new value into the h/w. This commit adds a new helper for the latter operation and calls it in ixgbe_set_rxfh(), so that the h/w rss key value can be really updated via ethtool. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-02 17:19:17 -08:00
Masahiro Yamada	9a284e5c9e	scripts/spelling.txt: add "overwritting" pattern and fix typo instances Fix typos and add the following to the scripts/spelling.txt: overwritting||overwriting Link: http://lkml.kernel.org/r/1481573103-11329-29-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Masahiro Yamada	a6ab4eff19	scripts/spelling.txt: add "applys" pattern and fix typo instances Fix typos and add the following to the scripts/spelling.txt: applys||applies The "applyes" in drivers/video/fbdev/aty/radeon_monitor.c is a different pattern but it was fixed in this commit. The "This functions" in the same line was fixed as well. Link: http://lkml.kernel.org/r/1481573103-11329-24-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Masahiro Yamada	b564d62e67	scripts/spelling.txt: add "varible" pattern and fix typo instances Fix typos and add the following to the scripts/spelling.txt: varible||variable While we are here, tidy up the comment blocks that fit in a single line for drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c and net/sctp/transport.c. Link: http://lkml.kernel.org/r/1481573103-11329-11-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Benjamin Poirier	83a0c6e589	i40e: Invoke softirqs after napi_reschedule The following message is logged from time to time when using i40e: NOHZ: local_softirq_pending 08 i40e may schedule napi from a workqueue. Afterwards, softirqs are not run in a deterministic time frame. The problem is the same as what was described in commit `ec13ee8014` ("virtio_net: invoke softirqs after __napi_schedule") and this patch applies the same fix to i40e. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Carolyn Wyborny	ee847d9351	i40e: remove duplicate device id from PCI table Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Jacob Keller	b9c015d421	i40e: mark the value passed to csum_replace_by_diff as __wsum Fix, or rather, avoid a sparse warning caused by the fact that csum_replace_by_diff expects to receive a __wsum value. Since the calculation appears to work, simply typecast the passed paylen value to __wsum to avoid the warning. This seems pretty fishy since __wsum was obviously annotated as a separate type on purpose, so this throws the entire calculation into question. Since it currently appears to behave as expected, the typecast is probably safe. Change-ID: I4fdc5cddd589abc16098176e8a61127e761488f4 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Harshitha Ramamurthy	ae13670824	i40e: Error handling for link event There exists an intermittent bug which causes the 'Link Detected' field reported by the 'ethtool <iface>' command to be 'Yes' when in fact, there is no link. This patch fixes the problem by enabling temporary link polling when i40e_get_link_status returns an error. This causes the driver to remember that an admin queue command failed and polls, until the function returns with a success. Change-Id: I64c69b008db4017b8729f3fc27b8f65c8fe2eaa0 Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Jacob Keller	5cb259016b	i40e: properly convert le16 value to CPU format This ensures that the pvid which is stored in __le16 format is converted to the CPU format. This will fix comparison issues on Big Endian platforms. Change-ID: I92c80d1315dc2a0f9f095d5a0c48d461beb052ed Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Jacob Keller	2ae0bf5014	i40e: convert to cpu from le16 to generate switch_id correctly On Big Endian platforms we would incorrectly calculate the wrong switch id since we did not properly convert the le16 value into CPU format. Caught by sparse. Change-ID: I69a2f9fa064a0a91691f7d0e6fcc206adceb8e36 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Alan Brady	773d4023ba	i40e: refactor AQ CMD buffer debug printing This patch refactors the '%ph' printk format specifier to instead use the print_hex_dump function, as recommended by the '%ph' documentation. This produces better/more standardized output. Change-ID: Id56700b4e8abc40ff8c04bc8379e7df04cb4d6fd Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Carolyn Wyborny	3c234c4709	i40e: Fix Adaptive ITR enabling This patch fixes a bug introduced with the addition of the per queue ITR feature support in ethtool. With that addition, there were functions added which converted the ITR settings to binary values. The IS_ENABLED macros that run on those values check whether a bit is set or not and with the value being binary, the bit check always returned ITR disabled which prevents any updating of the ITR rate. This patch fixes the problem by changing the functions to return the current ITR value instead and renaming it to better reflect its function. These functions now provide a value which will be accurately assessed and update the ITR as intended. Change-ID: I14f1d088d052e27f652aaa3113e186415ddea1fc Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Mitch Williams	51f3826266	i40evf: add comment Add a comment to reduce confusion. Change-ID: I3d5819c0f3f5174680442ae54398a073d4a61f4f Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:35 -08:00
Mitch Williams	8a68badd12	i40evf: free rings in remove function When the i40evf_remove() calls netdev close, the device doesn't actually close - it schedules the work for the watchdog to perform. Since we're stopping the watchdog, this work doesn't get done. However, we're resetting the part, so we can free resources after the reset request has gone through. This plugs a memory leak. Change-ID: Id5335dcaf76ce00d2a4c3d26e9faf711d7f051cf Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:35 -08:00
Jacob Keller	03aa268b14	i40e: remove unnecessary call to i40e_update_link_info This call is made just prior to running i40e_link_event. In i40e_link_event, we set hw->phy.get_link_info to true just prior to calling i40e_get_link_status, which conveniently runs i40e_update_link_info for us. Thus, we are running i40e_update_link_info twice, which seems like something we don't need to do... Change-ID: I36467a570f44b7546d218c99e134ff97c2709315 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:35 -08:00
Joshua Hay	1d68005db4	i40e: enable mc magic pkt wakeup during power down This patch adds a call to the mac_address_write admin q function during power down to update the PRTPM_SAH/SAL registers with the MC_MAG_EN bit thus enabling multicast magic packet wakeup. A FW workaround is needed to write the multicast magic wake up enable bit in the PRTPM_SAH register. The FW expects the mac address write admin q cmd to be called first with one of the WRITE_TYPE_LAA flags and then with the multicast relevant flags. *Note: This solution only works for X722 devices currently. A PFR will clear the previously mentioned bit by default, but X722 has support for a WOL_PRESERVE_ON_PFR flag which prevents the bit from being cleared. Once other devices support this flag, this solution should work as well. Change-ID: I51bd5b8535bd9051c2676e27c999c1657f786827 Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:35 -08:00
Alan Brady	a410c821c0	i40e: fix disable overflow promiscuous mode There exists a bug in which the driver is unable to exit overflow promiscuous mode after having added "too many" mac filters. It is expected that after triggering overflow promiscuous, removing the failed/extra filters should then disable overflow promiscuous mode. The bug exists because we were intentionally skipping the sync_vsi_filter path in cases where we were removing failed filters since they shouldn't have been added to the firmware in the first place, however we still need to go through the sync_vsi_filter code path to determine whether or not it is ok to exit overflow promiscuous mode. This patch fixes the bug by making sure we go through the sync_vsi_filter path in cases of failed filters. Change-ID: I634d249ca3e5fa50729553137c295e73e7722143 Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:35 -08:00
Alexander Duyck	ffed21bcee	ixgbe: Don't bother clearing buffer memory for descriptor rings This patch makes it so that we don't need to bother with clearing the memory out for the descriptor rings. The general idea is to only free buffers associated with buffers in use which are located between the next_to_clean and next_to_use or next_to_alloc values. Everything outside of those regions can be safely ignored since they should have no buffers associated with them. The advantage to doing things this way is that it should speed up bring-up and tear-down of the rings. Specifically we can avoid the 512 or more cycles required to memset the rings in tear-down. In the bring-up phase we then clear the memory as a part of initialization. The general idea is that the clearing in initialization can act as a prefetch of sorts for the buffer info structures so they are in the local CPU when we go to populate them. This should help to improve overall time needed to perform a suspend/resume. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	6f429223b3	ixgbe: Add support for build_skb This patch adds build_skb support to the Rx path. There are several advantages to this change. 1. It avoids the memcpy and skb->head allocation for small packets which improves performance by about 5% in my tests. 2. It avoids the memcpy, skb->head allocation, and eth_get_headlen for larger packets improving performance by about 10% in my tests. 3. For VXLAN packets it allows the full header to be in skb->data which improves the performance by as much as 30% in some of my tests. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	2ccdf26ff6	ixgbe: Add private flag to control buffer mode Since there are potential drawbacks to the new Rx allocation approach I thought it best to add a "chicken bit" so that we can turn the feature off in the event that a problem is found. It also provides a means of validating the legacy Rx path in the event that we are forced to fall back. At some point in the future when we are convinced we don't need it anymore we might be able to drop the legacy-rx flag. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	2de6aa3a66	ixgbe: Add support for padding packet This patch adds support for providing a buffer with headroom and tailroom to allow for shared info, NET_SKB_PAD, and NET_IP_ALIGN. With this combined with the DMA changes we can start using build_skb to build frames around an incoming Rx buffer instead of having to memcpy the headers. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	3fd218767f	ixgbe: Break out Rx buffer page management We are going to be expanding the number of Rx paths in the driver. Instead of duplicating all that code I am pulling it apart into separate functions so that we don't have so much code duplication. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	c3630cc40b	ixgbe: Use length to determine if descriptor is done This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	4f4542bfb3	ixgbe: Make use of order 1 pages and 3K buffers independent of FCoE In order to support build_skb with jumbo frames it will be necessary to use 3K buffers for the Rx path with 8K pages backing them. This is needed on architectures that implement 4K pages because we can't support 2K buffers plus padding in a 4K page. In the case of systems that support page sizes larger than 4K the 3K attribute will only be applied to FCoE as we can fall back to using just 2K buffers and adding the padding. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	1b56cf49f5	ixgbe: Update code to better handle incrementing page count Batch the page count updates instead of doing them one at a time. By doing this we can improve the overall performance as the atomic increment operations can be expensive due to the fact that on x86 they are locked operations which can cause stalls. By doing bulk updates we can consolidate the stall which should help to improve the overall receive performance. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	f3213d9321	ixgbe: Update driver to make use of DMA attributes in Rx path This patch adds support for DMA_ATTR_SKIP_CPU_SYNC and DMA_ATTR_WEAK_ORDERING. By enabling both of these for the Rx path we are able to see performance improvements on architectures that implement either one due to the fact that page mapping and unmapping only has to sync what is actually being used instead of the entire buffer. In addition, enabling the weak ordering attribute allows a performance improvement for architectures that can associate a memory ordering with a DMA buffer such as Sparc. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	f215af8cae	ixgbe: Only DMA sync frame length On some platforms, syncing a buffer for DMA is expensive. Rather than sync the whole 2K receive buffer, only synchronise the length of the frame, which will typically be the MTU, or a much smaller TCP ACK. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	af43da0dba	ixgbe: Add function for checking to see if we can reuse page This patch consolidates the code for the ixgbe driver so that it is more in line with what is already in igb. The general idea is to just consolidate functions that represent logical steps in the Rx process so we can later update them more easily. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Mark Rustad	1733284d02	ixgbe: Update version to reflect added functionality Update the driver version to reflect the new devices that it supports. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Stephen Hemminger	3f40c74cce	ixgbe: prefix Data Center Bridge ops struct Since dcbnl_ops is global, it should be prefixed by ixgbe_ Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Tony Nguyen	1dc0eb75a8	ixgbe: Support 2.5Gb and 5Gb speed Though not advertised through ethtool, if the link partner advertises a 2.5Gb or 5Gb connection, and the adapter supports it, allow the speed to be used. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Henry Tieman	b7eaf8f16e	i40e: Save more link abilities when using ethtool Ethtool support needs to save more PHY information. The added information includes FEC capabilities and 25G link types. Without this change it is possible to lose 25G or FEC settings by using ethtool. Change-ID: Ie42255b1e901ffbf9583b8c46466a54894114280 Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:02 -08:00
Jacob Keller	671889e674	i40e: avoid race condition when sending filters to firmware for addition Refactor how we add new filters to firmware to avoid a race condition that can occur due to removing filters from the hash temporarily. To understand the race condition, suppose that you have a number of MAC filters, but have not yet added any VLANs. Now, add two VLANs in rapid succession. A possible resulting flow would look something like the following: (1) lock hash for add VLAN (2) add the new MAC/VLAN combos for each current MAC filter (3) unlock hash (4) lock hash for filter sync (5) notice that we have a VLAN, so prepare to update all MAC filters with VLAN=-1 to be VLAN=0. (6) move NEW and REMOVE filters to temporary list (7) unlock hash (8) lock hash for add VLAN (9) add new MAC/VLAN combos. Notice that no MAC filters are currently in the hash list, so we don't add any VLANs <--- BUG! (10) unlock hash (11) sync the temporary lists to firmware (12) lock hash for post-sync (13) move the temporary elements back to the main list .... Because we take filters out of the main hash into temporary lists, we introduce a narrow window where it is possible that other callers to the list will not see some of the filters which were previously added but have not yet been finalized. This results in sometimes dropping VLAN additions, and could also result in failing to add a MAC address on the newly added VLAN. One obvious way to avoid this race condition would be to lock the entire firmware process. Unfortunately this does not work because adminq firmware commands take a mutex which results in a sleep while atomic BUG(). So, we can't use the simplest approach. An alternative approach is to simply not remove the filters from the hash list while adding. Instead, add an i40e_new_mac_filter structure which we will use to track added filters. This avoids the need to remove the filter from the hash list. 
We'll store a pointer to the original i40e_mac_filter, along with our own copy of the state. We won't update the state directly, so as to avoid race with other code that may modify the state while under the lock. We are safe to read f->macaddr and f->vlan since these only change in two locations. The first is on filter creation, which must have already occurred. The second is inside i40e_correct_vlan_filters which was previously run after creation of this object and can't be run again until after. Thus, we should be safe to read the MAC address and VLAN while outside the lock. We also aren't going to run into a use-after-free issue because the only place where we free filters is when they are marked FAILED or when we remove them inside the sync subtask. Since the subtask has its own critical flag to prevent duplicate runs, we know this won't happen. We also know that the only location to transition a filter from NEW to FAILED is inside the subtask also, so we aren't worried about that either. Use the wrapper i40e_new_mac_filter for additions, and once we've finalized the addition to firmware, we will update the filter state inside a lock, and then free the wrapper structure. In order to avoid a possible race condition with filter deletion, we won't update the original filter state unless it is still I40E_FILTER_NEW when we finish the firmware sync. This approach is more complex, but avoids race conditions related to filters being temporarily removed from the list. We do not need the same behavior for deletion because we always unconditionally removed the filters from the list regardless of the firmware status. Change-Id: I14b74bc2301f8e69433fbe77ebca532db20c5317 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Jacob Keller	d88d40b01c	i40e: allow i40e_update_filter_state to skip broadcast filters Fix a bug where we modified the mac_filter_hash while outside a lock, when handling addition of broadcast filters. Normally, we add filters to firmware by batching the additions into lists and issuing 1 update for every few filters. Broadcast filters are handled differently, by instead setting the broadcast promiscuous mode flags. In order to make sure the 1<->1 mapping of filters in our addition array lined up with filters in the hlist tmp_add_list, we had to remove the filter and move it back to the main hash. However, we didn't do this under lock, which could cause consistency problems for the list. Fix this by updating i40e_update_filter_state logic so that it knows to avoid broadcast filters. This ensures that we don't have to remove the filter separately, and can put it back using the normal flow. Change-ID: Id288fade80b3e3a9a54b68cc249188cb95147518 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Jacob Keller	e6e3fc2bd3	i40e: don't warn every time we clear an Rx timestamp register The intent of this message was to indicate to a user that we might have missed a timestamp event for a valid packet. The original method of detecting the missed events relied on waiting until all 4 registers were filled. A recent commit d55458c0cd7a5 ("i40e: replace PTP Rx timestamp hang logic") replaced this logic with a much better detection scheme that could detect a stalled Rx timestamp register even when other registers were still functional. The new logic means that a message will be displayed almost as soon as a timestamp for a dropped frame occurs. This new logic highlights that the hardware will attempt to timestamp frames which it later decides to drop. The most prominent example is when a multicast PTP frame is received on a multicast address that we are not subscribed to. Because the hardware initiates the Rx timestamp as soon as possible, it will latch an RXTIME register, but then drop the packet. This results in users being confused by the message as they are not expecting to see dropped timestamp messages unless their application also indicates that timestamps were missing. Resolve this by reducing the severity and frequency of the displayed message. We now only print the message if 3 or 4 of the RXTIME registers are stalled and get cleared within the same watchdog event. This ensures that the common case does not constantly display the message. Additionally, since the message is likely not as meaningful to most users, reduce the message to a dev_dbg instead of a dev_warn. Users can still get a count of the number of timestamps dropped by reading the ethtool statistics value, if necessary. Change-ID: I35494442226a444c418dfb4f91a3070d06c8435c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Henry Tieman	3e03d7ccf4	i40e: Save link FEC info from link up event Store the FEC status bits from the link up event into the hw_link_info structure. Change-ID: I9a7b256f6dfb0dce89c2f503075d0d383526832e Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Sudheer Mogilappagari	b3f028fc8a	i40e: Add bus number info to i40e_bus_info struct Currently i40e_bus_info has PCI device and function info only and log messages print device number as bus number. Added field to provide bus number info and modified log statements to print bus, device and function information. Change-ID: I811617cee2714cc0d6bade8d369f57040990756f Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Mitch Williams	3bb83baf9a	i40e: Clean up dead code The function i40e_client_prepare() can never return an error. So make it void and quit checking its return value. Change-ID: I9ff311e2324dde329eb68648efb2c94aaff856db Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Bimmy Pujari	cfffef76e7	i40e/i40evf : Changed version from 1.6.25 to 1.6.27 Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Jacob Keller	a158aeaf5b	i40e: update comment explaining where FDIR buffers are freed The original comment implies that the only location where the raw_packet buffer will be freed is in i40e_clean_tx_ring() which is incorrect. In fact this isn't even the normal case. Update the comment explaining where the memory is freed. Change-ID: Ie0defc35ed1c3af183f81fdc60b6d783707a5595 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Scott Peterson	9b37c93731	i40e/i40evf: eliminate i40e_pull_tail() Reorganize the i40e_pull_tail() logic, doing it in i40e_add_rx_frag() where it's cheaper. The igb driver does this the same way. Also renames i40e_page_is_reserved() to reflect what it actually tests. Change-ID: Icd9cc507aae1fcdc02308b3a09034111b4c24071 Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Scott Peterson	e72e56597b	i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ring This patch reduces the size of struct i40e_rx_buffer by one pointer, and makes the i40e driver a little more consistent with the igb driver in terms of packets that span buffers. We do this by moving the skb field from struct i40e_rx_buffer to struct i40e_ring. We pass the skb we already have (or NULL if we don't) to i40e_fetch_rx_buffer(), which skips the skb allocation if we already have one for this packet. Change-ID: I4ad48a531844494ba0c5d8e1a62209a057f661b0 Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Scott Peterson	7987dcd7b9	i40e/i40evf: Limit DMA sync of RX buffers to actual packet size On packet RX, we perform a DMA sync for CPU before passing the packet up. Here we limit that sync to the actual length of the incoming packet, rather than always syncing the entire buffer. Change-ID: I626aaf6c37275a8ce9e81efcaa773f327b331487 Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Mitch Williams	e5f77f4a2f	i40evf: track outstanding client request The iWarp client cannot continue until this operation has been completed by the PF driver. Sleep (with timeout) until the reply from the PF driver has been received. Change-ID: I5dc41b857bba32d0218b7ce167b5da122dadf349 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:00 -08:00
Jacob Keller	d7ce6422d6	i40e: don't check params until after checking for client instance We can avoid the minor bit of work by calling check params after we check for the client instance, since we're about to return early in cases where we do not have a client. Change-ID: I56f8ea2ba48d4f571fa331c9ace50819a022fa1c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:00 -08:00
David S. Miller	a076d1bdc6	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-02-03 This series contains updates to i40e/i40evf only. Jake fixes up the driver to not call i40e_vsi_kill_vlan() or i40e_vsi_add_vlan() when the PVID is set or when the VID is less than 1. Cleaned up a check which really is not needed since there is no real reason why we cannot just call i40e_del_mac_all_vlan() directly. Renamed functions to better reflect their actual purpose and how they function in a more clear manner. Bimmy cleans up unused/deprecated macros. Mitch cleans up unused device ids which were intended for use when running Linux VF drivers under Hyper-V, but found to be not needed. Then cleaned up a function that is no longer needed since the client open and close functions were refactored. Adds a sleep without timeout until the reply from the PF driver has been received since the iWARP client cannot continue until the operation has been completed. Tushar Dave fixes an issue seen on SPARC where the use of the 'packed' directive was causing kernel unaligned errors. Alex does a refactor to pull some data off of the stack and store it in the transmit buffer info section of the transmit ring. Alan fixes a bug which was caused by passing a bad register value to the firmware, by refactoring the macro INTRL_USEC_TO_REG into a static inline function. Also added feedback to the user as to the actual interrupt rate limit being used when it differs from the requested limit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-04 12:13:27 -05:00
Eric Dumazet	508aac6dee	ixgbevf: get rid of custom busy polling code In linux-4.5, busy polling was implemented in core NAPI stack, meaning that all custom implementation can be removed from drivers. Not only do we remove lots of code, we also remove one lock operation in fast path, and allow GRO to do its job. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 17:17:53 -05:00
Eric Dumazet	3ffc1af576	ixgbe: get rid of custom busy polling code In linux-4.5, busy polling was implemented in core NAPI stack, meaning that all custom implementation can be removed from drivers. Not only do we remove lots of code, we also remove one lock operation in fast path, and allow GRO to do its job. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 17:17:52 -05:00

4896 Commits