Commit Graph

428 Commits

Author SHA1 Message Date
Cao jin
629823b872 igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr
When running as guest, under certain condition, it will oops as following.
writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
is NULL. While other register access won't oops kernel because they use
wr32/rd32 which have a defense against NULL pointer.

    [  141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
    error received: id=0101
    [  141.225523] igb 0000:01:00.1: PCIe Bus Error:
    severity=Uncorrected (Fatal), type=Unaccessible,
    id=0101(Unregistered Agent ID)
    [  141.299442] igb 0000:01:00.1: broadcast error_detected message
    [  141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
    detached
    [  141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
    detached
    [  143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
    [  143.465994] igb 0000:01:00.1: broadcast slot_reset message
    [  143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
    [  144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
    [  145.312078] igb 0000:01:00.1: broadcast resume message
    [  145.322211] BUG: unable to handle kernel paging request at
    0000000000003818
    [  145.361275] IP: [<ffffffffa02fd38d>]
    igb_configure_tx_ring+0x14d/0x280 [igb]
    [  145.400048] PGD 0
    [  145.438007] Oops: 0002 [#1] SMP

A similar issue & solution could be found at:
    http://patchwork.ozlabs.org/patch/689592/

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-06 02:18:50 -08:00
Alexander Duyck
bd4171a5d4 igb: update code to better handle incrementing page count
Update the driver code so that we do bulk updates of the page reference
count instead of just incrementing it by one reference at a time.  The
advantage to doing this is that we cut down on atomic operations and
this in turn should give us a slight improvement in cycles per packet.
In addition if we eventually move this over to using build_skb the gains
will be more noticeable.

Link: http://lkml.kernel.org/r/20161110113616.76501.17072.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Keguang Zhang <keguang.zhang@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
5be5955425 igb: update driver to make use of DMA_ATTR_SKIP_CPU_SYNC
The ARM architecture provides a mechanism for deferring cache line
invalidation in the case of map/unmap.  This patch makes use of this
mechanism to avoid unnecessary synchronization.

A secondary effect of this change is that the portion of the page that
has been synchronized for use by the CPU should be writable and could be
passed up the stack (at least on ARM).

The last bit that occurred to me is that on architectures where the
sync_for_cpu call invalidates cache lines we were prefetching and then
invalidating the first 128 bytes of the packet.  To avoid that I have
moved the sync up to before we perform the prefetch and allocate the
skbuff so that we can actually make use of it.

Link: http://lkml.kernel.org/r/20161110113611.76501.98897.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Keguang Zhang <keguang.zhang@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
David S. Miller
2745529ac7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Couple conflicts resolved here:

1) In the MACB driver, a bug fix to properly initialize the
   RX tail pointer properly overlapped with some changes
   to support variable sized rings.

2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
   overlapping with a reorganization of the driver to support
   ACPI, OF, as well as PCI variants of the chip.

3) In 'net' we had several probe error path bug fixes to the
   stmmac driver, meanwhile a lot of this code was cleaned up
   and reorganized in 'net-next'.

4) The cls_flower classifier obtained a helper function in
   'net-next' called __fl_delete() and this overlapped with
   Daniel Borkamann's bug fix to use RCU for object destruction
   in 'net'.  It also overlapped with Jiri's change to guard
   the rhashtable_remove_fast() call with a check against
   tc_skip_sw().

5) In mlx4, a revert bug fix in 'net' overlapped with some
   unrelated changes in 'net-next'.

6) In geneve, a stale header pointer after pskb_expand_head()
   bug fix in 'net' overlapped with a large reorganization of
   the same code in 'net-next'.  Since the 'net-next' code no
   longer had the bug in question, there was nothing to do
   other than to simply take the 'net-next' hunks.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 12:29:53 -05:00
Alexander Duyck
516165a1e2 igb/igbvf: Don't use lco_csum to compute IPv4 checksum
In the case of IPIP and SIT tunnel frames the outer transport header
offset is actually set to the same offset as the inner transport header.
This results in the lco_csum call not doing any checksum computation over
the inner IPv4/v6 header data.

In order to account for that I am updating the code so that we determine
the location to start the checksum ourselves based on the location of the
IPv4 header and the length.

Fixes: e10715d3e9 ("igb/igbvf: Add support for GSO partial")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-01 15:41:35 -05:00
Jarod Wilson
91c527a556 ethernet/intel: use core min/max MTU checking
e100: min_mtu 68, max_mtu 1500
- remove e100_change_mtu entirely, is identical to old eth_change_mtu,
  and no longer serves a purpose. No need to set min_mtu or max_mtu
  explicitly, as ether_setup() will already set them to 68 and 1500.

e1000: min_mtu 46, max_mtu 16110

e1000e: min_mtu 68, max_mtu varies based on adapter

fm10k: min_mtu 68, max_mtu 15342
- remove fm10k_change_mtu entirely, does nothing now

i40e: min_mtu 68, max_mtu 9706

i40evf: min_mtu 68, max_mtu 9706

igb: min_mtu 68, max_mtu 9216
- There are two different "max" frame sizes claimed and both checked in
  the driver, the larger value wasn't relevant though, so I've set max_mtu
  to the smaller of the two values here to retain identical behavior.

igbvf: min_mtu 68, max_mtu 9216
- Same issue as igb duplicated

ixgb: min_mtu 68, max_mtu 16114
- Also remove pointless old == new check, as that's done in dev_set_mtu

ixgbe: min_mtu 68, max_mtu 9710

ixgbevf: min_mtu 68, max_mtu dependent on hardware/firmware
- Some hw can only handle up to max_mtu 1504 on a vf, others 9710

CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 11:34:18 -04:00
Todd Fujinaka
0742337c19 igb: bump version to igb-5.4.0
Bump igb version to match other igb drivers.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-09-27 19:00:52 -07:00
Moshe Shemesh
79aab093a0 net: Update API for VF vlan protocol 802.1ad support
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving
the ability for user-space application to specify it for the VF, as an
option to support 802.1ad.
We adjusted IP Link tool to support this option.

For future use cases, the new UAPI supports multiple vlans. For now we
limit the list size to a single vlan in kernel.
Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older versions of IP Link tool.

Add a vlan protocol parameter to the ndo_set_vf_vlan callback.
We kept 802.1Q as the drivers' default vlan protocol.
Suitable ip link tool command examples:
  Set vf vlan protocol 802.1ad:
    ip link set eth0 vf 1 vlan 100 proto 802.1ad
  Set vf to VST (802.1Q) mode:
    ip link set eth0 vf 1 vlan 100 proto 802.1Q
  Or by omitting the new parameter
    ip link set eth0 vf 1 vlan 100

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-24 08:01:26 -04:00
Gangfeng Huang
0e71def252 igb: add support of RX network flow classification
This patch is meant to allow for RX network flow classification to insert
and remove Rx filter by ethtool. Ethtool interface has it's own rules
manager

Show all filters:
$ ethtool -n eth0
4 RX rings available
Total 2 rules

Signed-off-by: Ruhao Gao <ruhao.gao@ni.com>
Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-08-18 22:27:47 -07:00
Linus Torvalds
c8d0267efd PCI changes for the v4.8 merge window:
Enumeration
     Move ecam.h to linux/include/pci-ecam.h (Jayachandran C)
     Add parent device field to ECAM struct pci_config_window (Jayachandran C)
     Add generic MCFG table handling (Tomasz Nowicki)
     Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki)
     Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki)
 
   Resource management
     Add devm_request_pci_bus_resources() (Bjorn Helgaas)
     Unify pci_resource_to_user() declarations (Bjorn Helgaas)
     Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas)
     Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas)
     Make PCI I/O space optional on ARM32 (Bjorn Helgaas)
     Ignore write combining when mapping I/O port space (Bjorn Helgaas)
     Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas)
     Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas)
     Support I/O resources when parsing host bridge resources (Jayachandran C)
     Add helpers to request/release memory and I/O regions (Johannes Thumshirn)
     Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn)
     Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5))
     Add generic pci_bus_claim_resources() (Lorenzo Pieralisi)
     Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
     Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi)
     Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya)
     Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu)
 
   PCI device hotplug
     Allow additional bus numbers for hotplug bridges (Keith Busch)
     Ignore interrupts during D3cold (Lukas Wunner)
 
   Power management
     Enforce type casting for pci_power_t (Andy Shevchenko)
     Don't clear d3cold_allowed for PCIe ports (Mika Westerberg)
     Put PCIe ports into D3 during suspend (Mika Westerberg)
     Power on bridges before scanning new devices (Mika Westerberg)
     Runtime resume bridge before rescan (Mika Westerberg)
     Add runtime PM support for PCIe ports (Mika Westerberg)
     Remove redundant check of pcie_set_clkpm (Shawn Lin)
 
   Virtualization
     Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra)
     Add DMA alias quirk for Adaptec 3805 (Alex Williamson)
     Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake)
     Add ACS quirk for Solarflare SFC9220 (Edward Cree)
 
   MSI
     Fix PCI_MSI dependencies (Arnd Bergmann)
     Add pci_msix_desc_addr() helper (Christoph Hellwig)
     Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig)
     Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig)
     Provide sensible IRQ vector alloc/free routines (Christoph Hellwig)
     Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig)
 
   Error Handling
     Bind DPC to Root Ports as well as Downstream Ports (Keith Busch)
     Remove DPC tristate module option (Keith Busch)
     Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg)
 
   Generic host bridge driver
     Select IRQ_DOMAIN (Arnd Bergmann)
     Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
 
   ACPI host bridge driver
     Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki)
     Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki)
     Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki)
     Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki)
 
   Altera host bridge driver
     Check link status before retrain link (Ley Foon Tan)
     Poll for link up status after retraining the link (Ley Foon Tan)
 
   Axis ARTPEC-6 host bridge driver
     Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann)
     Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel)
     Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel)
 
   Intel VMD host bridge driver
     Use lock save/restore in interrupt enable path (Jon Derrick)
     Select device dma ops to override (Keith Busch)
     Initialize list item in IRQ disable (Keith Busch)
     Use x86_vector_domain as parent domain (Keith Busch)
     Separate MSI and MSI-X vector sharing (Keith Busch)
 
   Marvell Aardvark host bridge driver
     Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni)
     Add Aardvark PCI host controller driver (Thomas Petazzoni)
     Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni)
 
   Microsoft Hyper-V host bridge driver
     Fix interrupt cleanup path (Cathy Avery)
     Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
     Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
 
   NVIDIA Tegra host bridge driver
     Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren)
     Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren)
     Use lower-case hex consistently for register definitions (Thierry Reding)
     Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding)
     Stop setting pcibios_min_mem (Thierry Reding)
 
   Renesas R-Car host bridge driver
     Drop gen2 dummy I/O port region (Bjorn Helgaas)
 
   TI DRA7xx host bridge driver
     Fix return value in case of error (Christophe JAILLET)
 
   Xilinx AXI host bridge driver
     Fix return value in case of error (Christophe JAILLET)
 
   Miscellaneous
     Make bus_attr_resource_alignment static (Ben Dooks)
     Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks)
     MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven)
     Make host bridge drivers explicitly non-modular (Paul Gortmaker)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXoNRtAAoJEFmIoMA60/r8LMkP/3kiNh21QFS6RZGOaDft5/Py
 n14Zo0w51avspxoI3iyDlBd5q/SssMqi+2c6Ko/fh2D2xMxJgmQOjdMDrIGARxGA
 qEHk/5IoXquY2/GcptmCk3ap66cJ6kTovS4OPrb73m3fPuknFwFwdzExq22XHbnI
 crPya6xwQxPLc54VpY/TsgW8E+EKZd/3FW9wuzzNHXrXmTILyhBQzQAA0K470GMx
 wEXU6kc3M/XhRuF1zjV9/O+H/xguwfnbTpZLvd2NAF6uXKZoRytEHHtNnVqu1hoe
 UPpDS2xq32pMNbGxGqBetCdIbkY/hWOufmckHI7Yu2OfXBYyHBYMG2je1+nMPkOV
 WiFhhrchGt5KnEMUwXPS4ROqnSZVpZBl1Fd4s10GhUYkoE2HNKJXta398H9FR1jj
 4NEVSi4mSX/+CkaoIN3lXYiaf9P0wv4Wppve4Scr30+VnLjJhm7Vw5La7v12oo6x
 otrJ/g98AkmnbuUdLeWBUS/+TOcdPjZYbw52rqBsbOOjFm51Zcj6D7kf5WcTypQy
 HzbvygSVabcioWehUG1uudC8pdJmQlUGx1aES/iu+mZEae4cuUFALu6hDBD9IYnZ
 5JdwjVzI0UItEwT3rQt3t4xiAqHADQ0NAVNJVCeREdoy/YQpSoTWGXIpyqCZ1yCm
 aBykjRsxbKQXlhVeIxuc
 =NVxu
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Highlights:

   - ARM64 support for ACPI host bridges

   - new drivers for Axis ARTPEC-6 and Marvell Aardvark

   - new pci_alloc_irq_vectors() interface for MSI-X, MSI, legacy INTx

   - pci_resource_to_user() cleanup (more to come)

  Detailed summary:

  Enumeration:
   - Move ecam.h to linux/include/pci-ecam.h (Jayachandran C)
   - Add parent device field to ECAM struct pci_config_window (Jayachandran C)
   - Add generic MCFG table handling (Tomasz Nowicki)
   - Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki)
   - Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki)

  Resource management:
   - Add devm_request_pci_bus_resources() (Bjorn Helgaas)
   - Unify pci_resource_to_user() declarations (Bjorn Helgaas)
   - Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas)
   - Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas)
   - Make PCI I/O space optional on ARM32 (Bjorn Helgaas)
   - Ignore write combining when mapping I/O port space (Bjorn Helgaas)
   - Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas)
   - Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas)
   - Support I/O resources when parsing host bridge resources (Jayachandran C)
   - Add helpers to request/release memory and I/O regions (Johannes Thumshirn)
   - Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn)
   - Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5))
   - Add generic pci_bus_claim_resources() (Lorenzo Pieralisi)
   - Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)
   - Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi)
   - Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya)
   - Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu)

  PCI device hotplug:
   - Allow additional bus numbers for hotplug bridges (Keith Busch)
   - Ignore interrupts during D3cold (Lukas Wunner)

  Power management:
   - Enforce type casting for pci_power_t (Andy Shevchenko)
   - Don't clear d3cold_allowed for PCIe ports (Mika Westerberg)
   - Put PCIe ports into D3 during suspend (Mika Westerberg)
   - Power on bridges before scanning new devices (Mika Westerberg)
   - Runtime resume bridge before rescan (Mika Westerberg)
   - Add runtime PM support for PCIe ports (Mika Westerberg)
   - Remove redundant check of pcie_set_clkpm (Shawn Lin)

  Virtualization:
   - Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra)
   - Add DMA alias quirk for Adaptec 3805 (Alex Williamson)
   - Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake)
   - Add ACS quirk for Solarflare SFC9220 (Edward Cree)

  MSI:
   - Fix PCI_MSI dependencies (Arnd Bergmann)
   - Add pci_msix_desc_addr() helper (Christoph Hellwig)
   - Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig)
   - Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig)
   - Provide sensible IRQ vector alloc/free routines (Christoph Hellwig)
   - Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig)

  Error Handling:
   - Bind DPC to Root Ports as well as Downstream Ports (Keith Busch)
   - Remove DPC tristate module option (Keith Busch)
   - Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg)

  Generic host bridge driver:
   - Select IRQ_DOMAIN (Arnd Bergmann)
   - Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi)

  ACPI host bridge driver:
   - Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki)
   - Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki)
   - Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki)
   - Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki)

  Altera host bridge driver:
   - Check link status before retrain link (Ley Foon Tan)
   - Poll for link up status after retraining the link (Ley Foon Tan)

  Axis ARTPEC-6 host bridge driver:
   - Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann)
   - Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel)
   - Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel)

  Intel VMD host bridge driver:
   - Use lock save/restore in interrupt enable path (Jon Derrick)
   - Select device dma ops to override (Keith Busch)
   - Initialize list item in IRQ disable (Keith Busch)
   - Use x86_vector_domain as parent domain (Keith Busch)
   - Separate MSI and MSI-X vector sharing (Keith Busch)

  Marvell Aardvark host bridge driver:
   - Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni)
   - Add Aardvark PCI host controller driver (Thomas Petazzoni)
   - Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni)

  Microsoft Hyper-V host bridge driver:
   - Fix interrupt cleanup path (Cathy Avery)
   - Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov)
   - Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov)

  NVIDIA Tegra host bridge driver:
   - Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren)
   - Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren)
   - Use lower-case hex consistently for register definitions (Thierry Reding)
   - Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding)
   - Stop setting pcibios_min_mem (Thierry Reding)

  Renesas R-Car host bridge driver:
   - Drop gen2 dummy I/O port region (Bjorn Helgaas)

  TI DRA7xx host bridge driver:
   - Fix return value in case of error (Christophe JAILLET)

  Xilinx AXI host bridge driver:
   - Fix return value in case of error (Christophe JAILLET)

  Miscellaneous:
   - Make bus_attr_resource_alignment static (Ben Dooks)
   - Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks)
   - MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven)
   - Make host bridge drivers explicitly non-modular (Paul Gortmaker)"

* tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (125 commits)
  PCI: xgene: Make explicitly non-modular
  PCI: thunder-pem: Make explicitly non-modular
  PCI: thunder-ecam: Make explicitly non-modular
  PCI: tegra: Make explicitly non-modular
  PCI: rcar-gen2: Make explicitly non-modular
  PCI: rcar: Make explicitly non-modular
  PCI: mvebu: Make explicitly non-modular
  PCI: layerscape: Make explicitly non-modular
  PCI: keystone: Make explicitly non-modular
  PCI: hisi: Make explicitly non-modular
  PCI: generic: Make explicitly non-modular
  PCI: designware-plat: Make it explicitly non-modular
  PCI: artpec6: Make explicitly non-modular
  PCI: armada8k: Make explicitly non-modular
  PCI: artpec: Add PCI_MSI_IRQ_DOMAIN dependency
  PCI: Add ACS quirk for Solarflare SFC9220
  arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700
  PCI: aardvark: Add Aardvark PCI host controller driver
  dt-bindings: add DT binding for the Aardvark PCIe controller
  PCI: tegra: Program PADS_REFCLK_CFG* registers with per-SoC values
  ...
2016-08-02 17:12:29 -04:00
Andrew Lunn
64f2525ca4 igb: Only DMA sync frame length
On some platforms, syncing a buffer for DMA is expensive. Rather than
sync the whole 2K receive buffer, only synchronise the length of the
frame, which will typically be the MTU, or a much smaller TCP ACK.

For an IMX6Q, this gives around 6% increased TCP receive performance,
which is cache operations bound and reduces CPU load for TCP transmit.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-06-29 13:59:24 -07:00
Jacob Keller
8646f7b4cd igb: call igb_ptp_suspend during suspend/resume cycle
Properly stop the extra workqueue items and ensure that we resume
cleanly. This is better than using igb_ptp_init and igb_ptp_stop since
these functions destroy the PHC device, which will cause other problems
if we do so. Since igb_ptp_reset now re-schedules the work-queue item we
don't need an equivalent igb_ptp_resume in the resume workflow.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-06-29 11:14:31 -07:00
Jacob Keller
4f3ce71bb8 igb: re-use igb_ptp_reset in igb_ptp_init
Modify igb_ptp_init to take advantage of igb_ptp_reset, and remove
duplicated work that was occurring in both igb_ptp_reset and
igb_ptp_init.

In total, resetting the TSAUXC register, and resetting the system time
both happen in igb_ptp_reset already. igb_ptp_reset now also takes care
of starting the delayed work item for overflow checks, as well.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-06-29 10:56:11 -07:00
Johannes Thumshirn
56d766d64c ethernet/intel: Use pci_(request|release)_mem_regions
Now that we do have pci_request_mem_regions() and pci_release_mem_regions()
at hand, use it in the Intel ethernet drivers.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: David S. Miller <davem@davemloft.net>
2016-06-23 11:48:58 -05:00
Alexander Duyck
bf2d1df395 intel: Add support for IPv6 IP-in-IP offload
This patch adds support for offloading IPXIP6 type packets that represent
either IPv4 or IPv6 encapsulated inside of an IPv6 outer IP header.  In
addition with this change we should also be able to support FOU
encapsulated traffic with outer IPv6 headers.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20 19:25:52 -04:00
Tom Herbert
7e13318daa net: define gso types for IPx over IPv4 and IPv6
This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
NETIF_F_GSO_IPXIP6. These are used to described IP in IP
tunnel and what the outer protocol is. The inner protocol
can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
are removed (these are both instances of SKB_GSO_IPXIP4).
SKB_GSO_IPXIP6 will be used when support for GSO with IP
encapsulation over IPv6 is added.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20 18:03:15 -04:00
Alexander Duyck
e10715d3e9 igb/igbvf: Add support for GSO partial
This patch adds support for partial GSO segmentation in the case of
tunnels.  Specifically with this change the driver an perform segmentation
as long as the frame either has IPv6 inner headers, or we are allowed to
mangle the IP IDs on the inner header.  This is needed because we will not
be modifying any fields from the start of the start of the outer transport
header to the start of the inner transport header as we are treating them
like they are just a block of IP options.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-13 15:26:37 -07:00
Jacob Keller
8008f68cb8 igb: make igb_update_pf_vlvf static
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-13 14:39:59 -07:00
Jacob Keller
a51d8c217b igb: use BIT() macro or unsigned prefix
For bitshifts, we should make use of the BIT macro when possible, and
ensure that other bitshifts are marked as unsigned. This helps prevent
signed bitshift errors, and ensures similar style.

Make use of GENMASK and the unsigned postfix where BIT() isn't
appropriate.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-13 14:39:47 -07:00
Florian Westphal
4d0e965732 drivers: replace dev->trans_start accesses with dev_trans_start
a trans_start struct member exists twice:
- in struct net_device (legacy)
- in struct netdev_queue

Instead of open-coding dev->trans_start usage to obtain the current
trans_start value, use dev_trans_start() instead.

This is not exactly the same, as dev_trans_start also considers
the trans_start values of the netdev queues owned by the device
and provides the most recent one.

For legacy devices this doesn't matter as dev_trans_start can cope
with netdev trans_start values of 0 (they are ignored).

This is a prerequisite to eventual removal of dev->trans_start.

Cc: linux-rdma@vger.kernel.org
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:16:47 -04:00
Arika Chen
d99e366fc9 Revert "igb: Fix a deadlock in igb_sriov_reinit"
This reverts commit 3eb14ea8d9 ("igb: Fix a deadlock in
igb_sriov_reinit")
It is the same as commit f468adc944 ("igb: missing rtnl_unlock in
igb_sriov_reinit()")
There is no rtnl_lock() in igb_resume before, rtnl_unlock will cause a
deadlock.

Signed-off-by: Arika Chen <arika.chen@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-06 21:02:11 -07:00
John Holland
806ffb1d50 igb: allow setting MAC address on i211 using a device tree blob
The Intel i211 LOM PCIe Ethernet controllers' iNVM operates as an OTP
and has no external EEPROM interface [1]. The following allows the
driver to pickup the MAC address from a device tree blob when CONFIG_OF
has been enabled.

[1]
http://www.intel.com/content/www/us/en/embedded/products/networking/i211-ethernet-controller-datasheet.html

Signed-off-by: John Holland <jotihojr@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-06 12:38:36 -07:00
Alexander Duyck
7f0ba84560 igb: Add support for bulk Tx cleanup & cleanup boolean logic
This patch enables bulk free in Tx cleanup for igb and cleans up the
boolean logic in the polling routines for igb in the hopes of avoiding
any mix-ups similar to what occurred with i40e and i40evf.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-06 12:26:36 -07:00
Alexander Duyck
415cd2a645 igb: Fix sparse warning about passing __beXX into leXX_to_cpup
We were casting the addr as __beXX and then passing it into le32_to_cpu
because the device expects the MAC address to be in network order even
though the register set is little endian.  Instead of casting it as __beXX
we can just cast it as __leXX in order to maintain consistency since the
region of memory is already in little endian order as far as we are
concerned.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-06 12:16:07 -07:00
Linus Torvalds
1200b6809d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support more Realtek wireless chips, from Jes Sorenson.

   2) New BPF types for per-cpu hash and arrap maps, from Alexei
      Starovoitov.

   3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

   4) Allow the use of SO_REUSEPORT in order to do per-thread processing
   of incoming TCP/UDP connections.  The muxing can be done using a
   BPF program which hashes the incoming packet.  From Craig Gallek.

   5) Add a multiplexer for TCP streams, to provide a messaged based
      interface.  BPF programs can be used to determine the message
      boundaries.  From Tom Herbert.

   6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

   7) Avoid factorial complexity when taking down an inetdev interface
      with lots of configured addresses.  We were doing things like
      traversing the entire address less for each address removed, and
      flushing the entire netfilter conntrack table for every address as
      well.

   8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

   9) Allow offloading u32 classifiers to hardware, and implement for
      ixgbe, from John Fastabend.

  10) Allow configuring IRQ coalescing parameters on a per-queue basis,
      from Kan Liang.

  11) Extend ethtool so that larger link mode masks can be supported.
      From David Decotigny.

  12) Introduce devlink, which can be used to configure port link types
      (ethernet vs Infiniband, etc.), port splitting, and switch device
      level attributes as a whole.  From Jiri Pirko.

  13) Hardware offload support for flower classifiers, from Amir Vadai.

  14) Add "Local Checksum Offload".  Basically, for a tunneled packet
      the checksum of the outer header is 'constant' (because with the
      checksum field filled into the inner protocol header, the payload
      of the outer frame checksums to 'zero'), and we can take advantage
      of that in various ways.  From Edward Cree"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  net/mlx4_core: Fix backward compatibility on VFs
  phy: mdio-thunder: Fix some Kconfig typos
  lan78xx: add ndo_get_stats64
  lan78xx: handle statistics counter rollover
  RDS: TCP: Remove unused constant
  RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
  net: smc911x: convert pxa dma to dmaengine
  team: remove duplicate set of flag IFF_MULTICAST
  bonding: remove duplicate set of flag IFF_MULTICAST
  net: fix a comment typo
  ethernet: micrel: fix some error codes
  ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
  bpf, dst: add and use dst_tclassid helper
  bpf: make skb->tc_classid also readable
  net: mvneta: bm: clarify dependencies
  cls_bpf: reset class and reuse major in da
  ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
  ldmvsw: Add ldmvsw.c driver code
  ...
2016-03-19 10:05:34 -07:00
Joonsoo Kim
fe896d1878 mm: introduce page reference manipulation functions
The success of CMA allocation largely depends on the success of
migration and key factor of it is page reference count.  Until now, page
reference is manipulated by direct calling atomic functions so we cannot
follow up who and where manipulate it.  Then, it is hard to find actual
reason of CMA allocation failure.  CMA allocation should be guaranteed
to succeed so finding offending place is really important.

In this patch, call sites where page reference is manipulated are
converted to introduced wrapper function.  This is preparation step to
add tracepoint to each page reference manipulation function.  With this
facility, we can easily find reason of CMA allocation failure.  There is
no functional change in this patch.

In addition, this patch also converts reference read sites.  It will
help a second step that renames page._count to something else and
prevents later attempt to direct access to it (Suggested by Andrew).

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Stefan Assmann
46eafa59e1 igb: call ndo_stop() instead of dev_close() when running offline selftest
Calling dev_close() causes IFF_UP to be cleared which will remove the
interfaces routes and some addresses. That's probably not what the user
intended when running the offline selftest. Besides this does not happen
if the interface is brought down before the test, so the current
behaviour is inconsistent.
Instead call the net_device_ops ndo_stop function directly and avoid
touching IFF_UP at all.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-24 15:54:39 -08:00
Corinna Vinschen
030f9f5264 igb: Fix VLAN tag stripping on Intel i350
Problem: When switching off VLAN offloading on an i350, the VLAN
interface gets unusable.  For testing, set up a VLAN on an i350
and some remote machine, e.g.:

  $ ip link add link eth0 name eth0.42 type vlan id 42
  $ ip addr add 192.168.42.1/24 dev eth0.42
  $ ip link set dev eth0.42 up

Offloading is switched on by default:

  $ ethtool -k eth0 | grep vlan-offload
  rx-vlan-offload: on
  tx-vlan-offload: on

  $ ping -c 3 -I eth0.42 192.168.42.2
  [...works as usual...]

Now switch off VLAN offloading and try again:

  $ ethtool -K eth0 rxvlan off
  Actual changes:
  rx-vlan-offload: off
  tx-vlan-offload: off [requested on]
  $ ping -c 3 -I eth0.42 192.168.42.2
  PING 192.168.42.2 (192.168.42.2) from 192.168.42.1 eth0.42: 56(84) bytes of da
ta.

  --- 192.168.42.2 ping statistics ---
  3 packets transmitted, 0 received, 100% packet loss, time 1999ms

I can only reproduce it on an i350, the above works fine on a 82580.

While inspecting the igb source, I came across the code in igb_set_vmolr
which sets the E1000_VMOLR_STRVLAN/E1000_DVMOLR_STRVLAN flags once and
for all, and in all of the igb code there's no other place where the
STRVLAN is set or cleared.  Thus, VLAN stripping is enabled in igb
unconditionally, independently of the offloading setting.

I compared that to the latest Intel igb-5.3.3.5 driver from
http://sourceforge.net/projects/e1000/ which in fact sets and clears the
STRVLAN flag independently from igb_set_vmolr in its own function
igb_set_vf_vlan_strip, depending on the vlan settings.

So I included the STRVLAN handling from the igb-5.3.3.5 driver into our
current igb driver and tested the above scenario again.  This time ping
still works after switching off VLAN offloading.

Tested on i350, with and without addtional VFs, as well as on 82580
successfully.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-24 15:50:40 -08:00
Alexander Duyck
6e03370088 igb: Add support for generic Tx checksums
This patch adds support for generic Tx checksums to the igb driver.  It
turns out this is actually pretty easy after going over the datasheet as we
were doing a number of steps we didn't need to.

In order to perform a Tx checksum for an L4 header we need to fill in the
following fields in the Tx descriptor:
  MACLEN (maximum of 127), retrieved from:
		skb_network_offset()
  IPLEN  (maximum of 511), retrieved from:
		skb_checksum_start_offset() - skb_network_offset()
  TUCMD.L4T indicates offset and if checksum or crc32c, based on:
		skb->csum_offset

The added advantage to doing this is that we can support inner checksum
offloads for tunnels and MPLS while still being able to transparently
insert VLAN tags.

I also took the opportunity to clean-up many of the feature flag
configuration bits to make them a bit more consistent between drivers.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-24 15:32:42 -08:00
Todd Fujinaka
c883de9fd7 igb: rename igb define to be more generic
E1000_MRQC_ENABLE_RSS_4Q enables 4 and 8 queues depending on the part
so rename to be generic.

Similarly, E1000_MRQC_ENABLE_VMDQ_RSS_2Q has no numeric meaning so
rename to be more generic.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-24 15:29:16 -08:00
Todd Fujinaka
5e350b9260 igb: enable WoL for OEM devices regardless of EEPROM setting
Override EEPROM settings for specific OEM devices.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-24 15:14:32 -08:00
Takuma Ueba
b72f3f7200 igb: When GbE link up, wait for Remote receiver status condition
I210 device IPv6 autoconf test sometimes fails,
because DAD NS for link-local is not transmitted.
This packet is silently dropped.
This problem is seen only GbE environment.

igb_watchdog_task link up detection continues to the following process.
The following cases are observed:
1.PHY 1000BASE-T Status Register Remote receiver status bit is NG.
(NG status becomes OK after about 200 - 700ms)
2.In this case, the transfer packet is silently dropped.

1000BASE-T Status register
[Expected]: 0x3800 or 0x7800
[problem occurred]: 0x2800 or 0x6800
Frequency of occurrence: approx 1/10 - 1/40 observed

In order to avoid this problem,
wait until 1000BASE-T Status register "Remote receiver status OK"

After applying this patch, at least 400 runs succeed with no problems.

Signed-off-by: Takuma Ueba <t.ueba11@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-24 14:56:42 -08:00
Alexander Duyck
bf456abb9b igb: Add workaround for VLAN tag stripping on 82576
There was a workaround partially implemented for the 82576 that is needed
in order for VLAN tag stripping to function correctly.  The original code
had side effects that would make it so the workaround was active on all
MACs.  I have updated the code so that the workaround is enabled, but
limited to the 82576, or activated if we exceed the available unicast
addresses.

The workaround has a side effect of mirroring all of the traffic outgoing
from the VFs back to the PF.  As such it is not recommended to use the
82576 in promiscuous mode as it will take a performance hit, though this is
now consistent with the performance as seen on the out-of-tree igb driver.

I also limited the scope of the UTA bits all being set to only when the
VMOLR register is enabled.  This should limit the effects of the UTA
register so that we don't pick up any excess traffic unless promiscuous
mode has been enabled on the PF, whereas before the PF would have ended up
in something equivalent to unicast promiscuous mode with VLAN filtering
otherwise.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:52:33 -08:00
Alexander Duyck
268f9d33a9 igb: Enable use of "bridge fdb add" to set unicast table entries
This change makes it so that we can use the bridge utility to add a FDB
entry for the PF to an igb port.  By doing this we can enable the VFs to
talk to virtual ports residing on top of the PF.

In addition this should also address issues with MACVLANs trying to reside
on top of the PF as well as they would have had similar issues when added
to the PF with SR-IOV enabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:49:25 -08:00
Alexander Duyck
9c2f186e45 igb: Drop unnecessary checks in transmit path
This patch drops several checks that we dropped from ixgbe some ago.  It
should not be possible for us to be called with either of the conditional
statements returning true so we can just drop them from the hot-path.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:46:15 -08:00
Alexander Duyck
16903caa33 igb: Add support for VLAN promiscuous with SR-IOV and NTUPLE
This change fixes things so that we can fully support SR-IOV or the
recently added NTUPLE filtering while allowing support for VLAN promiscuous
mode.  By making this change we are able to support possible scenarios such
as SR-IOV with the PF connected to a Linux bridge hosting other VMs.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:43:06 -08:00
Alexander Duyck
a15d92598a igb: Clean-up configuration of VF port VLANs
This patch is meant to clean-up the configuration of the VF port based VLAN
configuration.  The original logic was a bit muddled and had some
undesirable side effects such as VLANs being either completely stripped
from the port or VLANs being left when they shouldn't be.  The idea behind
this code is to avoid any events such as spurious spoof notifications when
we are removing one VLAN tag and replacing it with another.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:39:57 -08:00
Alexander Duyck
8b77c6b20f igb: Merge VLVF configuration into igb_vfta_set
This change makes it so that we can merge the configuration of the VLVF
registers into the setting of the VFTA register.  By doing this we simplify
the logic and make use of similar functionality that we have already added
for ixgbe making it easier to maintain both drivers.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:36:52 -08:00
Alexander Duyck
5982a5565a igb: Always enable VLAN 0 even if 8021q is not loaded
This patch makes it so that we always add VLAN 0.  This is important as we
need to guarantee the PF can receive untagged frames in the case of SR-IOV
being enabled but VLAN filtering not being enabled in the kernel.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:33:42 -08:00
Alexander Duyck
d3836f8e25 igb: Do not factor VLANs into RLPML calculation
The RLPML registers already take the size of VLAN headers into account when
determining the maximum packet length.  This is called out in EAS documents
for several parts including the 82576 and the i350.  As such we can drop
the addition of size to the value programmed into the RLPML registers.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:30:33 -08:00
Alexander Duyck
45693bcb00 igb: Allow asymmetric configuration of MTU versus Rx frame size
Since the igb driver is using page based receive there is no point in
limiting the Rx capabilities of the device.  The driver can receive 9K
jumbo frames at all times.  The only changes needed due to MTU changes are
updates for the FIFO sizes and flow-control watermarks.

Update the maximum frame size to reflect the 9.5K limitation of the
hardware, and replace all instances of max_frame_size with
MAX_JUMBO_FRAME_SIZE when referring to an Rx FIFO or frame.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:27:21 -08:00
Alexander Duyck
c3278587e7 igb: clean up code for setting MAC address
Drop a bunch of hand written byte swapping code in favor of just doing the
byte swapping ourselves.  The registers are little endian registers storing
a big endian value so if we read the MAC address array as little endian
then we will get the CPU registers into the proper layout.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:21:05 -08:00
Shota Suzuki
37a5d163fb igb: Unpair the queues when changing the number of queues
By the commit 72ddef0506 ("igb: Fix oops caused by missing queue
pairing"), the IGB_FLAG_QUEUE_PAIRS flag can now be set when changing the
number of queues by "ethtool -L", but it is never cleared unless the igb
driver is reloaded.
This patch clears it if queue pairing becomes unnecessary as a result of
"ethtool -L".

Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:14:50 -08:00
Shota Suzuki
ceb2775998 igb: Remove unnecessary flag setting in igb_set_flag_queue_pairs()
If VFs are enabled (max_vfs >= 1), both max_rss_queues and
adapter->rss_queues are set to 2 in the case of e1000_82576.
In this case, IGB_FLAG_QUEUE_PAIRS is always set in the default block as a
result of fall-through, thus setting it in the e1000_82576 block is not
necessary.

Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-15 16:09:39 -08:00
Tom Herbert
53692b1de4 sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC
The SCTP checksum is really a CRC and is very different from the
standards 1's complement checksum that serves as the checksum
for IP protocols. This offload interface is also very different.
Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
differences. The term CSUM should be reserved in the stack to refer
to the standard 1's complement IP checksum.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:49:58 -05:00
Jarod Wilson
7b06a69095 igb: improve handling of disconnected adapters
Clean up array_rd32 so that it uses igb_rd32 the same as rd32, per the
suggestion of Alexander Duyck, and use io_addr in more places, so that
we don't have the need to call E1000_REMOVED (which simply looks for a
null hw_addr) nearly as much.

Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12 23:51:26 -08:00
Jan Beulich
be06998f96 igb: fix NULL derefs due to skipped SR-IOV enabling
The combined effect of commits 6423fc3416 ("igb: do not re-init SR-IOV
during probe") and ceee3450b3 ("igb: make sure SR-IOV init uses the
right number of queues") causes VFs no longer getting set up, leading
to NULL pointer dereferences due to the adapter's ->vf_data being NULL
while ->vfs_allocated_count is non-zero. The first commit not only
neglected the side effect of igb_sriov_reinit() that the second commit
tried to account for, but also that of setting IGB_FLAG_HAS_MSIX,
without which igb_enable_sriov() is effectively a no-op. Calling
igb_{,re}set_interrupt_capability() as done here seems to address this,
but I'm not sure whether this is better than sinply reverting the other
two commits.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12 23:19:37 -08:00
Jarod Wilson
73bf8048d7 igb: don't unmap NULL hw_addr
I've got a startech thunderbolt dock someone loaned me, which among other
things, has the following device in it:

08:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)

This hotplugs just fine (kernel 4.2.0 plus a patch or two here):

[  863.020315] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.2.18-k
[  863.020316] igb: Copyright (c) 2007-2014 Intel Corporation.
[  863.028657] igb 0000:08:00.0: enabling device (0000 -> 0002)
[  863.062089] igb 0000:08:00.0: added PHC on eth0
[  863.062090] igb 0000:08:00.0: Intel(R) Gigabit Ethernet Network Connection
[  863.062091] igb 0000:08:00.0: eth0: (PCIe:2.5Gb/s:Width x1) e8:ea:6a:00:1b:2a
[  863.062194] igb 0000:08:00.0: eth0: PBA No: 000200-000
[  863.062196] igb 0000:08:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[  863.064889] igb 0000:08:00.0 enp8s0: renamed from eth0

But disconnecting it is another story:

[ 1002.807932] igb 0000:08:00.0: removed PHC on enp8s0
[ 1002.807944] igb 0000:08:00.0 enp8s0: PCIe link lost, device now detached
[ 1003.341141] ------------[ cut here ]------------
[ 1003.341148] WARNING: CPU: 0 PID: 199 at lib/iomap.c:43 bad_io_access+0x38/0x40()
[ 1003.341149] Bad IO access at port 0x0 ()
[ 1003.342767] Modules linked in: snd_usb_audio snd_usbmidi_lib snd_rawmidi igb dca firewire_ohci firewire_core crc_itu_t rfcomm ctr ccm arc4 iwlmvm mac80211 fuse xt_CHECKSUM ipt_MASQUERADE
nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter bnep dm_mirror dm_region_hash dm_log dm_mod coretemp x86_pkg_temp_thermal intel_powerclamp kvm_intel snd_hda_codec_hdmi kvm
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drbg
[ 1003.342793]  ansi_cprng aesni_intel hp_wmi aes_x86_64 iTCO_wdt lrw iTCO_vendor_support ppdev gf128mul sparse_keymap glue_helper ablk_helper cryptd snd_hda_codec_realtek snd_hda_codec_generic
microcode snd_hda_intel uvcvideo iwlwifi snd_hda_codec videobuf2_vmalloc videobuf2_memops snd_hda_core videobuf2_core snd_hwdep btusb v4l2_common btrtl snd_seq btbcm btintel videodev cfg80211
snd_seq_device rtsx_pci_ms bluetooth pcspkr input_leds i2c_i801 media parport_pc memstick rfkill sg lpc_ich snd_pcm 8250_fintek parport joydev snd_timer snd soundcore hp_accel ie31200_edac
mei_me lis3lv02d edac_core input_polldev mei hp_wireless shpchp tpm_infineon sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables autofs4 xfs libcrc32c sd_mod sr_mod cdrom
rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci
[ 1003.342822]  nouveau ahci libahci mxm_wmi e1000e xhci_pci hwmon ptp drm_kms_helper pps_core xhci_hcd ttm wmi video ipv6
[ 1003.342839] CPU: 0 PID: 199 Comm: kworker/0:2 Not tainted 4.2.0-2.el7_UNSUPPORTED.x86_64 #1
[ 1003.342840] Hardware name: Hewlett-Packard HP ZBook 15 G2/2253, BIOS M70 Ver. 01.07 02/26/2015
[ 1003.342843] Workqueue: pciehp-3 pciehp_power_thread
[ 1003.342844]  ffffffff81a90655 ffff8804866d3b48 ffffffff8164763a 0000000000000000
[ 1003.342846]  ffff8804866d3b98 ffff8804866d3b88 ffffffff8107134a ffff8804866d3b88
[ 1003.342847]  ffff880486f46000 ffff88046c8a8000 ffff880486f46840 ffff88046c8a8098
[ 1003.342848] Call Trace:
[ 1003.342852]  [<ffffffff8164763a>] dump_stack+0x45/0x57
[ 1003.342855]  [<ffffffff8107134a>] warn_slowpath_common+0x8a/0xc0
[ 1003.342857]  [<ffffffff810713c6>] warn_slowpath_fmt+0x46/0x50
[ 1003.342859]  [<ffffffff8133719e>] ? pci_disable_msix+0x3e/0x50
[ 1003.342860]  [<ffffffff812f6328>] bad_io_access+0x38/0x40
[ 1003.342861]  [<ffffffff812f6567>] pci_iounmap+0x27/0x40
[ 1003.342865]  [<ffffffffa0b728d7>] igb_remove+0xc7/0x160 [igb]
[ 1003.342867]  [<ffffffff8132189f>] pci_device_remove+0x3f/0xc0
[ 1003.342869]  [<ffffffff81433426>] __device_release_driver+0x96/0x130
[ 1003.342870]  [<ffffffff814334e3>] device_release_driver+0x23/0x30
[ 1003.342871]  [<ffffffff8131b404>] pci_stop_bus_device+0x94/0xa0
[ 1003.342872]  [<ffffffff8131b3ad>] pci_stop_bus_device+0x3d/0xa0
[ 1003.342873]  [<ffffffff8131b3ad>] pci_stop_bus_device+0x3d/0xa0
[ 1003.342874]  [<ffffffff8131b516>] pci_stop_and_remove_bus_device+0x16/0x30
[ 1003.342876]  [<ffffffff81333f5b>] pciehp_unconfigure_device+0x9b/0x180
[ 1003.342877]  [<ffffffff81333a73>] pciehp_disable_slot+0x43/0xb0
[ 1003.342878]  [<ffffffff81333b6d>] pciehp_power_thread+0x8d/0xb0
[ 1003.342885]  [<ffffffff810881b2>] process_one_work+0x152/0x3d0
[ 1003.342886]  [<ffffffff8108854a>] worker_thread+0x11a/0x460
[ 1003.342887]  [<ffffffff81088430>] ? process_one_work+0x3d0/0x3d0
[ 1003.342890]  [<ffffffff8108ddd9>] kthread+0xc9/0xe0
[ 1003.342891]  [<ffffffff8108dd10>] ? kthread_create_on_node+0x180/0x180
[ 1003.342893]  [<ffffffff8164e29f>] ret_from_fork+0x3f/0x70
[ 1003.342894]  [<ffffffff8108dd10>] ? kthread_create_on_node+0x180/0x180
[ 1003.342895] ---[ end trace 65a77e06d5aa9358 ]---

Upon looking at the igb driver, I see that igb_rd32() attempted to read from
hw_addr and failed, so it set hw->hw_addr to NULL and spit out the message
in the log output above, "PCIe link lost, device now detached".

Well, now that hw_addr is NULL, the attempt to call pci_iounmap is obviously
not going to go well. As suggested by Mark Rustad, do something similar to
what ixgbe does, and save a copy of hw_addr as adapter->io_addr, so we can
still call pci_iounmap on it on teardown. Additionally, for consistency,
make the pci_iomap call assignment directly to io_addr, so map and unmap
match.

Signed-off-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12 23:09:54 -08:00
Jesse Brandeburg
32b3e08fff drivers/net/intel: use napi_complete_done()
As per Eric Dumazet's previous patches:
(see commit (24d2e4a507) - tg3: use napi_complete_done())

Quoting verbatim:
Using napi_complete_done() instead of napi_complete() allows
us to use /sys/class/net/ethX/gro_flush_timeout

GRO layer can aggregate more packets if the flush is delayed a bit,
without having to set too big coalescing parameters that impact
latencies.
</end quote>

Tested
configuration: low latency via ethtool -C ethx adaptive-rx off
				rx-usecs 10 adaptive-tx off tx-usecs 15
workload: streaming rx using netperf TCP_MAERTS

igb:
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
...
Interim result:  941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589

Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
Local  Remote  Local  Remote  Xfered   Per                 Per
Recv   Send    Recv   Send             Recv (avg)          Send (avg)
    8       8      0       0 1176930056  1475.36    797726   16384.00  71905

MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
...
Interim result:  941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763

Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
Local  Remote  Local  Remote  Xfered   Per                 Per
Recv   Send    Recv   Send             Recv (avg)          Send (avg)
    8       8      0       0 1175182320  50476.00     23282   16384.00  71816

i40e:
Hard to test because the traffic is incoming so fast (24Gb/s) that GRO
always receives 87kB, even at the highest interrupt rate.

Other drivers were only compile tested.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:33:46 -07:00
Arnd Bergmann
40c9b0796d net: igb: avoid using timespec
We want to deprecate the use of 'struct timespec' on 32-bit
architectures, as it is will overflow in 2038. The igb
driver uses it to read the current time, and can simply
be changed to use ktime_get_real_ts64() instead.

Because of hardware limitations, there is still an overflow
in year 2106, which we cannot really avoid, but this documents
the overflow.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05 03:16:42 -07:00
Stefan Assmann
cbfe360a15 igb: assume MSI-X interrupts during initialization
In igb_sw_init() the sequence of calls was changed from
igb_init_queue_configuration()
igb_init_interrupt_scheme()
igb_probe_vfs()
to
igb_probe_vfs()
igb_init_queue_configuration()
igb_init_interrupt_scheme()

This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set
during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not
get enabled properly and we run into a NULL pointer if the max_vfs
module parameter is specified (adapter->vf_data does not get allocated,
crash on accessing the structure).

[    7.419348] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[    7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb]
[    7.419370] PGD 0
[    7.419373] Oops: 0002 [#1] SMP
[    7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio
[    7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153
[    7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 1.6.0 03/07/2013
[...]
[    7.419431] Call Trace:
[    7.419442]  [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb]
[    7.419447]  [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0

Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling
igb_probe_vfs(). The real interrupt capabilities will be checked during
igb_init_interrupt_scheme() so this is safe to do.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-28 17:48:34 -07:00
David S. Miller
0d36938bb8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-27 21:45:31 -07:00
Michal Hocko
2f064f3485 mm: make page pfmemalloc check more robust
Commit c48a11c7ad ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():

        if (page->pfmemalloc && !page->mapping)
                skb->pfmemalloc = true;

It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone.  Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.com>
Debugged-by: Jiri Bohac <jbohac@suse.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-21 14:30:10 -07:00
Todd Fujinaka
ceee3450b3 igb: make sure SR-IOV init uses the right number of queues
Recent changes to igb_probe_vfs() could lead to the PF holding onto all
of the queues. Reorder igb_probe_vfs() to be before
gb_init_queue_configuration() and add some more error checking.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:07 -07:00
Jia-Ju Bai
42ad1a03b4 igb: Fix a memory leak in igb_probe
In error handling code of igb_probe, the memory adapter->shadow_vfta
allocated by kcalloc in igb_sw_init is not freed. So when register_netdev
or igb_init_i2c is failed, a memory leak will occur.
This patch adds kfree to fix it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:06 -07:00
Jia-Ju Bai
3eb14ea8d9 igb: Fix a deadlock in igb_sriov_reinit
When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock
acquired by rtnl_lock() is not released, which causes a deadlock.
This patch adds rtnl_unlock() in error handling to fix it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:05 -07:00
Alex Williamson
c23d92b80e igb: Teardown SR-IOV before unregister_netdev()
When the .remove() callback for a PF is called, SR-IOV support for the
device is disabled, which requires unbinding and removing the VFs.
The VFs may be in-use either by the host kernel or userspace, such as
assigned to a VM through vfio-pci.  In this latter case, the VFs may
be removed either by shutting down the VM or hot-unplugging the
devices from the VM.  Unfortunately in the case of a Windows 2012 R2
guest, hot-unplug is broken due to the ordering of the PF driver
teardown.  Disabling SR-IOV prior to unregister_netdev() avoids this
issue.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:05 -07:00
Stefan Assmann
6423fc3416 igb: do not re-init SR-IOV during probe
During driver probing the following code path is triggered.
igb_probe
->igb_sw_init
  ->igb_probe_vfs
    ->igb_pci_enable_sriov
      ->igb_sriov_reinit

Doing the SR-IOV re-init is not necessary during probing since we're
starting from scratch. Here we can call igb_enable_sriov() right away.

Running igb_sriov_reinit() during igb_probe() also seems to cause
occasional packet loss on some onboard 82576 NICs. Reproduced on
Dell and HP servers with onboard 82576 NICs.
Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:04 -07:00
Vasily Averin
f468adc944 igb: missing rtnl_unlock in igb_sriov_reinit()
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:03 -07:00
Shota Suzuki
72ddef0506 igb: Fix oops caused by missing queue pairing
When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
set if adapter->rss_queues exceeds half of max_rss_queues in
igb_init_queue_configuration().
On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
queues exceeds half of max_combined in igb_set_channels() when changing
the number of queues by "ethtool -L".
In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
of adapter->msix_entries[], an overflow can occur in
igb_set_interrupt_capability(), which in turn leads to an oops.

Fix this problem as follows:
 - When changing the number of queues by "ethtool -L", set
   IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
 - When increasing the size of q_vector, reallocate it appropriately.
   (With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)

Another possible way to fix this problem is to cap the queues at its
initial number, which is the number of the initial online cpus. But this
is not the optimal way because we cannot increase queues when another
cpu becomes online.

Note that before commit cd14ef54d2 ("igb: Change to use statically
allocated array for MSIx entries"), this problem did not cause oops
but just made the number of queues become 1 because of entering msi_only
mode in igb_set_interrupt_capability().

Fixes: 907b783579 ("igb: Add ethtool support to configure number of channels")
CC: stable <stable@vger.kernel.org>
Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:03 -07:00
Todd Fujinaka
6fb469023c igb: bump version to igb-5.3.0
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-07-23 05:10:50 -07:00
Alexander Duyck
f56e7bba22 igb: Pull timestamp from fragment before adding it to skb
This change makes it so that we pull the timestamp from the fragment before
we add it to the skb.  By doing this we can avoid a possible issue in which
the fragment can possibly be less than IGB_RX_HDR_LEN due to the timestamp
being pulled after the copybreak check.

While making this change I realized we could also pull the rest of the
igb_pull_tail function into igb_add_rx_frag since in the case of igb,
unlike ixgbe, we are able to unmap the entire buffer before calling
add_rx_frag so merging the two allows for sharing of code between the two
merged functions.

Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-07-17 19:59:06 -07:00
Todd Fujinaka
73cd63598d igb: bump version of igb to 5.2.18
Bump version of igb to igb-5.2.18

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-26 02:36:38 -07:00
David S. Miller
b04096ff33 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Four minor merge conflicts:

1) qca_spi.c renamed the local variable used for the SPI device
   from spi_device to spi, meanwhile the spi_set_drvdata() call
   got moved further up in the probe function.

2) Two changes were both adding new members to codel params
   structure, and thus we had overlapping changes to the
   initializer function.

3) 'net' was making a fix to sk_release_kernel() which is
   completely removed in 'net-next'.

4) In net_namespace.c, the rtnl_net_fill() call for GET operations
   had the command value fixed, meanwhile 'net-next' adjusted the
   argument signature a bit.

This also matches example merge resolutions posted by Stephen
Rothwell over the past two days.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 14:31:43 -04:00
Alexander Duyck
2ee52ad496 igb: Don't use NETDEV_FRAG_PAGE_MAX_SIZE in descriptor calculation
This change updates igb so that it will correctly perform the descriptor
count calculation.  Previously it was taking NETDEV_FRAG_PAGE_MAX_SIZE
into account with isn't really correct since a different value is used to
determine the size of the pages used for TCP.  That is actually determined
by SKB_FRAG_PAGE_ORDER.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 10:39:26 -04:00
Toshiaki Makita
2439fc4d71 igb: Fix NULL assignment to incorrect variable in igb_reset_q_vector
adapter->tx_ring is set to NULL where rx_ring should be.

Fixes: 5536d2102a ("igb: Combine q_vector and ring allocation into a single function")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-05-07 05:11:29 -07:00
Toshiaki Makita
c0a06ee185 igb: Fix oops on changing number of rings
When changing the number of rings by ethtool -L, q_vectors are reused,
which causes oops because of uninitialized pointers.

- When an rx is reused as a tx, q_vector->rx.ring is not set to NULL, which
  misleads igb_poll() to determine that it has an rx ring although it
  actually points to the tx ring.
- When a tx is reused as an rx, q_vector->rx.ring->skb
  (q_vector->ring[0].skb) has a value that was used as tx_stats before.

Fix these problems by zeroing it out on reuseing it.

Fixes: 02ef6e1d0b ("igb: Fix queue allocation method to accommodate changing during runtime")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-05-07 05:08:32 -07:00
Todd Fujinaka
8cfb879d1b igb: simplify and clean up igb_enable_mas()
igb_enable_mas() should only be called for the 82575 and has no clear
return so changing it to void. Also simplify the odd conditional
expression.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-05-04 01:17:47 -07:00
Toshiaki Makita
1abbc98a8a igb: Enable TSO for stacked vlan
As datasheets for igb (I210, I350, 82576, etc.) say, maclen can be from
14 to 127, which is enough for reasonable number of vlan tags.
My netperf test showed I350's TSO works pretty fine with multiple vlans.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29 13:33:25 -07:00
Todd Fujinaka
f28ea083a3 igb: use netif_carrier_off earlier when bringing if down
Use netif_carrier_off() first, since that will prevent the stack from
queuing more packets to this IF. This operation is fast, and should
behave much nicer when trying to bring down an interface under load.

Reported-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-20 17:45:12 -07:00
Alexander Graf
6ddbc4cf1f igb: Indicate failure on vf reset for empty mac address
Commit 5ac6f91d changed the igb driver to expose a zero (empty) mac
address to the VF on reset rather than a random one.

However, that behavioral change also requires igbvf driver changes
which can be hard especially when we want to talk to proprietary
guest OSs.

Looking at the code previous to the commit in Linux that made igbvf
work with empty mac addresses (8d56b6d), we can see that on reset
failure the driver will try to generate a new mac address with both
the old and the new code.

Furthermore, ixgbe does send reset failure when it detects an empty
mac address (35055928c).

So I think it's safe to make igb behave the same. With this patch I
can successfully run a Windows 8.1 guest with an empty mac address
and an assigned igbvf device that has no mac address set by the host.

If anyone is aware of a guest driver that chokes on NACK returns of
VF RESET commands, please speak up.

Signed-off-by: Alexander Graf <agraf@suse.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-22 18:10:23 -08:00
Richard Cochran
720db4ffd0 igb: enable auxiliary PHC functions for the i210
The i210 device offers a number of special PTP Hardware Clock features on
the Software Defined Pins (SDPs). This patch adds support for two of the
possible functions, namely time stamping external events, and periodic
output signals.

The assignment of PHC functions to the four SDP can be freely chosen by
the user.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-22 18:10:19 -08:00
Richard Cochran
00c65578b4 igb: enable internal PPS for the i210
The i210 device can produce an interrupt on the full second. This
patch allows using this interrupt to generate an internal PPS event
for adjusting the kernel system time.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-22 18:10:19 -08:00
Richard Cochran
61d7f75f45 igb: refactor time sync interrupt handling
The code that handles the time sync interrupt is repeated in three
different places. This patch refactors the identical code blocks into
a single helper function.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-22 18:10:18 -08:00
Alexander Duyck
95dd44b4f3 igb: Clean-up page reuse code
This patch cleans up the page reuse code getting it into a state where all
the workarounds needed are in place as well as cleaning up a few minor
oversights such as using __free_pages instead of put_page to drop a locally
allocated page.

It also cleans up how we clear the descriptor status bits.  Previously they
were zeroed as a part of clearing the hdr_addr.  However the hdr_addr is a
64 bit field and 64 bit writes can be a bit more expensive on on 32 bit
systems.  Since we are no longer using the header split feature the upper
32 bits of the address no longer need to be cleared.  As a result we can
just clear the status bits and leave the length and VLAN fields as-is which
should provide more information in debugging.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-01-22 18:10:17 -08:00
Jiri Pirko
df8a39defa net: rename vlan_tx_* helpers since "tx" is misleading there
The same macros are used for rx as well. So rename it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 17:51:08 -05:00
Alexander Duyck
124b74c18e fm10k/igb/ixgbe: Use dma_rmb on Rx descriptor reads
This change makes it so that dma_rmb is used when reading the Rx
descriptor.  The advantage of dma_rmb is that it allows for a much
lower cost barrier on x86, powerpc, arm, and arm64 architectures than a
traditional memory barrier when dealing with reads that only have to
synchronize to coherent memory.

In addition I have updated the code so that it just checks to see if any
bits have been set instead of just the DD bit since the DD bit will always
be set as a part of a descriptor write-back so we just need to check for a
non-zero value being present at that memory location rather than just
checking for any specific bit.  This allows the code itself to appear much
cleaner and allows the compiler more room to optimize.

Cc: Matthew Vick <matthew.vick@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 21:15:06 -05:00
Linus Torvalds
70e71ca0af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers.  Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets.  From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet.  This tries to resolve the conflicting goals between the
    desired handling of bulk vs.  RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU.  From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
2014-12-11 14:27:06 -08:00
Alexander Duyck
67fd893ee0 ethernet/intel: Use napi_alloc_skb
This change replaces calls to netdev_alloc_skb_ip_align with
napi_alloc_skb.  The advantage of napi_alloc_skb is currently the fact that
the page allocation doesn't make use of any irq disable calls.

There are few spots where I couldn't replace the calls as the buffer
allocation routine is called as a part of init which is outside of the
softirq context.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 13:31:57 -05:00
Alexander Duyck
a94d9e224e ethernet/intel: Use eth_skb_pad and skb_put_padto helpers
Update the Intel Ethernet drivers to use eth_skb_pad() and skb_put_padto
instead of doing their own implementations of the function.

Also this cleans up two other spots where skb_pad was called but the length
and tail pointers were being manipulated directly instead of just having
the padding length added via __skb_put.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08 20:47:42 -05:00
Rafael J. Wysocki
e3d857e1ae Merge branch 'pm-runtime'
* pm-runtime: (25 commits)
  i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c
  dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME
  MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  hwrandom / exynos / PM: Use CONFIG_PM in #ifdef
  block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  USB / PM: Drop CONFIG_PM_RUNTIME from the USB core
  PM: Merge the SET*_RUNTIME_PM_OPS() macros
  PM / Kconfig: Do not select PM directly from Kconfig files
  PCI / PM: Drop CONFIG_PM_RUNTIME from the PCI core
  ...
2014-12-08 20:00:44 +01:00
Rafael J. Wysocki
d61c81cb68 e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME
After commit b2b49ccbdd (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks
depending on CONFIG_PM_RUNTIME within #ifdef blocks depending on
CONFIG_PM may be dropped now.

Do that in the e1000e and igb network drivers.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-05 03:06:53 +01:00
David S. Miller
60b7379dc5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-11-29 20:47:48 -08:00
Carolyn Wyborny
17a402a007 igb: Fixes needed for surprise removal support
This patch adds some checks in order to prevent panic's on surprise
removal of devices during S0, S3, S4.  Without this patch, Thunderbolt
type device removal will panic the system.

Signed-off-by: Yanir Lubetkin <yanirx.lubetkin@intel.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23 14:26:12 -05:00
Eric Dumazet
eb31f8493e igb: use netdev_rss_key_fill() helper
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-16 15:59:12 -05:00
Alexander Duyck
42b17f0955 fm10k/igb/ixgbe: Replace __skb_alloc_page with dev_alloc_page
The Intel drivers were pretty much just using the plain vanilla GFP flags
in their calls to __skb_alloc_page so this change makes it so that they use
dev_alloc_page which just uses GFP_ATOMIC for the gfp_flags value.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12 00:00:14 -05:00
Roman Gushchin
bc16e47f03 igb: don't reuse pages with pfmemalloc flag
Incoming packet is dropped silently by sk_filter(), if the skb was
allocated from pfmemalloc reserves and the corresponding socket is
not marked with the SOCK_MEMALLOC flag.

Igb driver allocates pages for DMA with __skb_alloc_page(), which
calls alloc_pages_node() with the __GFP_MEMALLOC flag. So, in case
of OOM condition, igb can get pages with pfmemalloc flag set.

If an incoming packet hits the pfmemalloc page and is large enough
(small packets are copying into the memory, allocated with
netdev_alloc_skb_ip_align(), so they are not affected), it will be
dropped.

This behavior is ok under high memory pressure, but the problem is
that the igb driver reuses these mapped pages. So, packets are still
dropping even if all memory issues are gone and there is a plenty
of free memory.

In my case, some TCP sessions hang on a small percentage (< 0.1%)
of machines days after OOMs.

Fix this by avoiding reuse of such pages.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Tested-by: Aaron Brown "aaron.f.brown@intel.com"
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-30 04:56:52 -07:00
Eric Dumazet
00cd5adb03 igb: fix race accessing page->_count
This is illegal to use atomic_set(&page->_count, 2) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
loose a refcount increment.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-10 15:37:28 -04:00
Todd Fujinaka
b5d130c4d6 igb: bump version to 5.2.15
Bump version

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02 03:14:34 -07:00
Rick Jones
a81fb04941 i40e/igb: Convert to dev_consume_skb_any()
Convert two more Intel NIC drivers to dev_consume_skb_any() to help
make dropped packet profiling sane.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02 02:36:59 -07:00
Bernhard Kaindl
7acf631889 igb: remove blocking phy read from inside spinlock
Remove a source of latency spikes (in my case up to 10ms) by not calling
code that uses mdelay() for feeding a phy statistic (rx errors for idle
symbols - not data -> idle_errors) while being called with a spinlock held.

As idle_errors isn't read, this patch only removes unused code and data.

Later, more complicated changes may be applied to address the spinlock and
allow for some PHY diagnostics by harvesting this PHY stats register fully.

This patch is designed to fix the issue and be safe for longterm/stable.

For the Intel e1000e driver, the same change was applied in 2008 with
commit 23033fad5b ("e1000e: remove phy read from inside spinlock").

The mdelay is triggered by HW/SW semaphores, thus it depends on the HW.

I've HW that triggers it even when idle. Others may trigger it only e.g.
when Ethernet ports aquire or loose the link or on ifconfig up / down.
We've noticed this first from delays in frame rx/tx due to the mdelay().

Example command for checking if the issue is triggered: cyclictest -Smp1
(Look for occasional "Max:" values > 4000 or use -b 4000 to stop if greater)

It was observed with I350 ports connected to other I350 ports, but not
if driver and EEPROM was modified to run the I350 in EEPROM-less mode.

phy_stats.idle_errors and .receive_errors (isn't touched) occupy 64 not
used bits in the adapter struct: Their allocation may be removed as well.

Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Todd Fujinaka <todd.fujinaka@intel.com>
Fixes: 12dcd86b75 ("igb: fix stats handling") (this added the spin_lock)
Signed-off-by: Bernhard Kaindl <bk-linux@use.startmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02 02:30:39 -07:00
Todd Fujinaka
c4c112f158 igb: add flags to set eee advertisement mode
Change e1000_set_eee and e1000_set_eee_i35(0|4) to allow
changes in the advertised EEE speeds from ethtool. Adds two boolean
flags to e1000_set_eee_i35(0|4) to pass in advertised speed data.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-06 05:00:39 -07:00
Alexander Duyck
24cd23d3d2 igb: use new eth_get_headlen interface
Update igb to drop the igb_get_headlen function in favor of eth_get_headlen.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by:  Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:47:02 -07:00
David S. Miller
6f19e12f62 igb: flush when in xmit_more mode and under descriptor pressure
Mirror the changes made to ixgbe in commit 2367a17390
("ixgbe: flush when in xmit_more mode and under descriptor pressure")

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28 01:39:31 -07:00
David S. Miller
0b725a2ca6 net: Remove ndo_xmit_flush netdev operation, use signalling instead.
As reported by Jesper Dangaard Brouer, for high packet rates the
overhead of having another indirect call in the TX path is
non-trivial.

There is the indirect call itself, and then there is all of the
reloading of the state to refetch the tail pointer value and
then write the device register.

Move to a more passive scheme, which requires very light modifications
to the device drivers.

The signal is a new skb->xmit_more value, if it is non-zero it means
that more SKBs are pending to be transmitted on the same queue as the
current SKB.  And therefore, the driver may elide the tail pointer
update.

Right now skb->xmit_more is always zero.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25 16:29:42 -07:00
David S. Miller
c1ebf46c1f igb: Support netdev_ops->ndo_xmit_flush()
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-24 23:02:45 -07:00
Todd Fujinaka
bf22a6bd0f igb: bump igb version to 5.2.13
Bump version number.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-07-24 03:11:19 -07:00
Carolyn Wyborny
1516f0a649 igb: Add message when malformed packets detected by hw
This patch adds a check and prints the error cause register value when
the hardware detects a malformed packet.  This is a very unlikely
scenario but has been seen occasionally, so printing the message to
assist the user.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-07-24 03:00:37 -07:00
David S. Miller
1a98c69af1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 14:09:34 -07:00
Stefan Assmann
76252723e8 igb: do a reset on SR-IOV re-init if device is down
To properly re-initialize SR-IOV it is necessary to reset the device
even if it is already down. Not doing this may result in Tx unit hangs.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-10 12:45:24 -07:00
Todd Fujinaka
948264879b igb: Workaround for i210 Errata 25: Slow System Clock
On some devices, the internal PLL circuit occasionally provides the
wrong clock frequency after power up. The probability of failure is less
than one failure per 1000 power cycles. When the failure occurs, the
internal clock frequency is around 1/20 of the correct frequency.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-10 01:48:28 -07:00
Todd Fujinaka
aec653c43b igb: bring link up when PHY is powered up
Call igb_setup_link() when the PHY is powered up.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Reported-by: Jeff Westfahl <jeff.westfahl@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-07-01 02:39:54 -07:00
Todd Fujinaka
23d87824de igb: unhide invariant returns
Return a 0 directly rather than a constant.

Reported-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:45:48 -07:00
Todd Fujinaka
27dff8b2f6 igb: add defaults for i210 TX/RX PBSIZE
Set the defaults on probe for the packet buffer size registers for the
i210.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-03 23:57:53 -07:00
David S. Miller
0c3592b821 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to igb, igbvf, ixgbe, i40e and i40evf.

Jacob provides eight patches to cleanup the ixgbe driver to resolve various
checkpatch.pl warnings/errors as well as minor coding style issues.

Stephen Hemminger and I provide simple cleanups of void functions which
had useless return statements at the end of the function which are not
needed.

v2: Dropped Emil's patch "ixgbe: fix the detection of SFP+ capable interfaces"
    while I wait for his updated patch to be validated.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23 16:28:18 -04:00
Sucheta Chakraborty
ed616689a3 net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool.
o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed
  to have a bandwidth of at least this value.
  max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth
  of up to this value.

o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced
  which takes 4 arguments:
  netdev, VF number, min_tx_rate, max_tx_rate

o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler.

o Drivers that currently implement ndo_set_vf_tx_rate should now call
  ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth
  greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet
  implemented by driver.

o If user enters only one of either min_tx_rate or max_tx_rate, then,
  userland should read back the other value from driver and set both
  for IFLA_VF_RATE.
  Drivers that have not yet implemented IFLA_VF_RATE should always
  return min_tx_rate as 0 when read from ip tool.

o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then
  IFLA_VF_RATE should override.

o Idea is to have consistent display of rate values to user.

o Usage example: -

  ./ip link set p4p1 vf 0 rate 900

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a

  ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps,
    min_tx_rate 200Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a

  ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5, tx rate 600 (Mbps), max_tx_rate 600Mbps,
    min_tx_rate 200Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23 15:04:02 -04:00
Stephen Hemminger
41457f64da i40e,igb,ixgbe: remove usless return statements
Remove cases where useless bare return is left at end of function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-05-23 05:28:46 -07:00
Carolyn Wyborny
a1f6347328 igb: Change memcpy to struct assignment
This patch fixes issue found by updated coccicheck.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:41:17 -07:00
Carolyn Wyborny
be28b63506 igb: Cleanups to remove unneeded extern declaration
This patch fixes WARNING:AVOID_EXTERNS found by checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:41:16 -07:00
Carolyn Wyborny
cd1631cee3 igb: Cleanups to replace deprecated DEFINE_PCI_DEVICE_TABLE
This patch changes implementation to remove use of DEFINE_PCI_DEVICE_TABLE.
This patch fixes WARNING:DEFINE_PCI_DEVICE_TABLE

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:41:16 -07:00
Carolyn Wyborny
6dd6d2b783 igb: Cleanups to fix static initialization
This patch fixes ERROR:INITIALISED_STATIC from checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:41:16 -07:00
Carolyn Wyborny
0d451e7956 igb: Cleanups to fix msleep warnings
This patch fixes WARNING:MSLEEP found by checkpatch check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:41:15 -07:00
Carolyn Wyborny
c502ea2ea8 igb: Cleanups to fix line length warnings
This patch fixes WARNING:LONG_LINE found with checkpatch check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:26:30 -07:00
Carolyn Wyborny
da1f1dfeb3 igb: Cleanups to remove return parentheses
This patch fixes ERROR:RETURN_PARENTHESES from checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:26:29 -07:00
Carolyn Wyborny
b26141d47a igb: Cleanups to fix missing break in switch statements
This patch fixes WARNING:MISSING_BREAK found with checkpatch check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:26:28 -07:00
Carolyn Wyborny
81ad807b26 igb: Cleanups to fix assignment in if error
This patch fixes ERROR:ASSIGN_IN_IF found with checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:26:28 -07:00
Carolyn Wyborny
e52c0f960c igb: Cleanups to change comment style on license headers
This patch fixes WARNING:NETWORKING_BLOCK_COMMENT_STYLE from checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24 17:26:27 -07:00
David S. Miller
4366004d77 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/igb/e1000_mac.c
	net/core/filter.c

Both conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24 13:19:00 -04:00
Carolyn Wyborny
9005df3861 igb: Cleanups to fix incorrect indentation
This patch fixes WARNING:LEADING_SPACE, WARNING:SPACING, ERROR:SPACING,
WARNING:SPACE_BEFORE_TAB and ERROR_CODE_INDENT from checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23 01:43:26 -07:00
Carolyn Wyborny
d34a15abfe igb: Cleanups to fix braces location warnings
This patch fixes WARNING:BRACES and ERROR:OPEN_BRACE from
checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23 01:30:15 -07:00
Carolyn Wyborny
c75c4edfc3 igb: Cleanups for messaging
This patch fixes WARNING:PREFER_PR_LEVEL and WARNING:SPLIT_STRING
from checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23 01:22:19 -07:00
Todd Fujinaka
e66c083aab igb: fix stats for i210 rx_fifo_errors
RQDPC on i210/i211 is R/W not ReadClear. Clear after reading.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-18 18:24:53 -07:00
Jakub Kicinski
5499a968d4 igb: fix last_rx_timestamp usage
last_rx_timestamp should be updated only when rx time stamp is
read. Also it's only used with NICs that have per-interface time
stamping resources so it can be moved to adapter structure and
set in igb_ptp_rx_rgtstamp().

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-11 05:58:08 -07:00
Francois Romieu
06c14e5adb igb: remove open-coded skb_cow_head
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-11 05:58:08 -07:00
Peter Senna Tschudin
75009b3a88 INTEL-IGB: Convert iounmap to pci_iounmap
Use pci_iounmap instead of iounmap when the virtual mapping was done
with pci_iomap. A simplified version of the semantic patch that finds this
issue is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
expression addr;
@@
addr = pci_iomap(...)

@rr@
expression r.addr;
@@
* iounmap(addr)
// </smpl>

Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-31 15:48:03 -07:00
Jakub Kicinski
ed4420a3b4 igb: fix race conditions on queuing skb for HW time stamp
igb has a single set of TX time stamping resources per NIC.
Use a simple bit lock to avoid race conditions and leaking skbs
when multiple TX rings try to claim time stamping.

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-28 06:54:02 -07:00
Jakub Kicinski
afc835d1bd igb: never generate both software and hardware timestamps
skb_tx_timestamp() does not report software time stamp
if SKBTX_IN_PROGRESS is set. According to timestamping.txt
software time stamps are a fallback and should not be
generated if hardware time stamp is provided.

Move call to skb_tx_timestamp() after setting
SKBTX_IN_PROGRESS.

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-28 06:54:02 -07:00
Christoph Paasch
b709323d24 igb: Unset IGB_FLAG_HAS_MSIX-flag when falling back to msi-only
Prior to cd14ef54d2 (igb: Change to use statically allocated array for
MSIx entries), having msix_entries different from NULL was an indicator
that MSIX is enabled.
In igb_set_interrupt_capabiliy we may fall back to MSI-only. Prior to
the above patch msix_entries was set to NULL by
igb_reset_interrupt_capability.

However, now we are checking the flag for IGB_FLAG_HAS_MSIX and so the
stack gets completly confused:

[   42.659791] ------------[ cut here ]------------
[   42.715032] WARNING: CPU: 7 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x15c/0x1fb()
[   42.848263] NETDEV WATCHDOG: eth0 (igb): transmit queue 0 timed out
[   42.923253] Modules linked in:
[   42.959875] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.14.0-rc2-mptcp #437
[   43.043184] Hardware name: HP ProLiant DL165 G7, BIOS O37 01/26/2011
[   43.119215]  0000000000000108 ffff88023fdc3da8 ffffffff81487847 0000000000000108
[   43.208165]  ffff88023fdc3df8 ffff88023fdc3de8 ffffffff81034e7d ffff88023fdc3dd8
[   43.297120]  ffffffff813fff10 ffff880236018000 ffff880236b178c0 0000000000000008
[   43.386071] Call Trace:
[   43.415303]  <IRQ>  [<ffffffff81487847>] dump_stack+0x49/0x62
[   43.484174]  [<ffffffff81034e7d>] warn_slowpath_common+0x77/0x91
[   43.556049]  [<ffffffff813fff10>] ? dev_watchdog+0x15c/0x1fb
[   43.623759]  [<ffffffff81034f2b>] warn_slowpath_fmt+0x41/0x43
[   43.692511]  [<ffffffff813fff10>] dev_watchdog+0x15c/0x1fb
[   43.758141]  [<ffffffff813ffdb4>] ? __netdev_watchdog_up+0x64/0x64
[   43.832091]  [<ffffffff8103cd04>] call_timer_fn+0x17/0x6f
[   43.896682]  [<ffffffff8103cebe>] run_timer_softirq+0x162/0x1a2
[   43.967511]  [<ffffffff81038520>] __do_softirq+0xcd/0x1cc
[   44.032104]  [<ffffffff81038689>] irq_exit+0x3a/0x48
[   44.091492]  [<ffffffff81026d43>] smp_apic_timer_interrupt+0x43/0x50
[   44.167525]  [<ffffffff8148c24a>] apic_timer_interrupt+0x6a/0x70
[   44.239392]  <EOI>  [<ffffffff8100992c>] ? default_idle+0x6/0x8
[   44.310343]  [<ffffffff81009b31>] arch_cpu_idle+0x13/0x18
[   44.374934]  [<ffffffff81066126>] cpu_startup_entry+0xa7/0x101
[   44.444724]  [<ffffffff81025660>] start_secondary+0x1b2/0x1b7
[   44.513472] ---[ end trace a5a075fd4e7f854f ]---
[   44.568753] igb 0000:04:00.0 eth0: Reset adapter
[   46.206945] random: nonblocking pool is initialized
[   46.465670] irq 44: nobody cared (try booting with the "irqpoll" option)
[   46.545862] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G        W    3.14.0-rc2-mptcp #437
[   46.640610] Hardware name: HP ProLiant DL165 G7, BIOS O37 01/26/2011
[   46.716641]  ffff8802363f8c84 ffff88023fdc3e38 ffffffff81487847 00000000a03cdb6d
[   46.805598]  ffff8802363f8c00 ffff88023fdc3e68 ffffffff81068489 0000007f81825400
[   46.894539]  ffff8802363f8c00 0000000000000000 0000000000000000 ffff88023fdc3ea8
[   46.983484] Call Trace:
[   47.012714]  <IRQ>  [<ffffffff81487847>] dump_stack+0x49/0x62
[   47.081585]  [<ffffffff81068489>] __report_bad_irq+0x35/0xc1
[   47.149295]  [<ffffffff81068683>] note_interrupt+0x16e/0x1ea
[   47.217006]  [<ffffffff8106679e>] handle_irq_event_percpu+0x116/0x12e
[   47.294075]  [<ffffffff810667e9>] handle_irq_event+0x33/0x4f
[   47.361787]  [<ffffffff81068c95>] handle_fasteoi_irq+0x83/0xd1
[   47.431577]  [<ffffffff81003d5b>] handle_irq+0x1f/0x28
[   47.493047]  [<ffffffff81003567>] do_IRQ+0x4e/0xd4
[   47.550358]  [<ffffffff8148b06a>] common_interrupt+0x6a/0x6a
[   47.618066]  <EOI>  [<ffffffff8100992c>] ? default_idle+0x6/0x8
[   47.689016]  [<ffffffff81009b31>] arch_cpu_idle+0x13/0x18
[   47.753605]  [<ffffffff81066126>] cpu_startup_entry+0xa7/0x101
[   47.823397]  [<ffffffff81025660>] start_secondary+0x1b2/0x1b7
[   47.892146] handlers:
[   47.919301] [<ffffffff812fbd7d>] igb_intr

So, this patch unsets the flag to indicate that we are not using MSIX.
This patch does exactly this: Unsetting the flag when falling back to MSI.

Fixes: cd14ef54d2 (igb: Change to use statically allocated array for MSIx entries)
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-21 05:05:54 -07:00
Christoph Paasch
cb06d10232 igb: Fix Null-pointer dereference in igb_reset_q_vector
When igb_set_interrupt_capability() calls
igb_reset_interrupt_capability() (e.g., because CONFIG_PCI_MSI is unset),
num_q_vectors has been set but no vector has yet been allocated.

igb_reset_interrupt_capability() will then call igb_reset_q_vector,
which assumes that the vector is allocated. As this is not the case, we
are accessing a NULL-pointer.

This patch fixes it by checking that q_vector is indeed different from
NULL.

Fixes: 02ef6e1d0b (igb: Fix queue allocation method to accommodate changing during runtime)
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-21 03:48:34 -07:00
Fujinaka, Todd
22a8b29159 igb: add register rd/wr for surprise removal
Add initial register rd/wr for surprise removal (LER).

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-21 03:00:49 -07:00
Jacob Keller
6ab5f7b298 igb: implement SIOCGHWTSTAMP ioctl
This patch adds support for the SIOCGHWTSTAMP ioctl which enables user
processes to read the current hwtstamp_config settings
non-destructively. Previously a process had to be privileged and could
only set values, it couldn't return what is currently set without
possibly overwriting the value.

This patch adds support for this new operation into igb by keeping a
shadow copy of the config in the adapter structure, which is returned
upon request.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-21 02:51:24 -07:00
Joe Perches
7c4d16ffb7 igb: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-19 22:18:48 -07:00
Stefan Assmann
dc1edc67fe igb: enable VLAN stripping for VMs with i350
For i350 VLAN stripping for VMs is not enabled in the VMOLR register but in
the DVMOLR register. Making the changes accordingly. It's not necessary to
unset the E1000_VMOLR_STRVLAN bit on i350 as the hardware will simply ignore
it.

Without this change if a VLAN is configured for a VF assigned to a guest
via (i.e.)
ip link set p1p1 vf 0 vlan 10
the VLAN tag will not be stripped from packets going into the VM. Which they
should be because the VM itself is not aware of the VLAN at all.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-19 21:38:54 -07:00
Fernando Luis Vazquez Cao
406d49656f igb: remove references to long gone command line parameters
Command line parameters QueuePairs, Node, EEE, DMAC and InterruptThrottleRate
do not exist these days. Remove all references to them in the Documentation
folder and update code comments.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-18 13:25:41 -04:00
Eric W. Biederman
57ba34c9b0 igb: Don't receive packets when the napi budget == 0
Processing any incoming packets with a with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:52:47 -04:00
Eric W. Biederman
57a7744e09 net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq
Replace the bh safe variant with the hard irq safe variant.

We need a hard irq safe variant to deal with netpoll transmitting
packets from hard irq context, and we need it in most if not all of
the places using the bh safe variant.

Except on 32bit uni-processor the code is exactly the same so don't
bother with a bh variant, just have a hard irq safe variant that
everyone can use.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:41:36 -04:00
Jeff Kirsher
b936136da2 igb: Fix code comment
Recently added code comment was missing a space that is needed.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-14 16:29:14 -07:00
Carolyn Wyborny
f4c01e965f igb: Fix for devices using ethtool for EEE settings
This patch fixes a problem where using ethtool for EEE setting was not
working correctly.  This patch also fixes a problem where
the function that checks for EEE status on i354 devices was not being
called and was causing warnings with static analysis tools.

Reported-by: Rashika Kheria <rashika.kheria@gmail.com>
Reported-by: Josh Triplett <josh@joshtriplett.org>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-12 18:58:39 -07:00
Tom Herbert
42bdf083fe net: igb calls skb_set_hash
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.

Signed-off-by: Tom Herbert <therbert@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-12 18:58:35 -07:00
Carolyn Wyborny
74cfb2e1f2 igb: Update license text to remove FSF address and update copyright.
This patch updates the license text to remove address of Free Software
Foundation and refer  users to www.gnu.org instead. This patch also updates
the copyright dates in appropriate igb driver files.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:54:52 -05:00
Alexander Gordeev
479d02dfad igb: Use pci_enable_msix_range() instead of pci_enable_msix()
As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18 15:33:31 -05:00
Carolyn Wyborny
cd14ef54d2 igb: Change to use statically allocated array for MSIx entries
This patch changes how the driver initializes MSIx and checks
for MSIx configuration.  This change makes it easier to reconfigure the
device when queue changes happen at runtime using ethtool's set_channels
feature.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-12-17 22:43:14 -08:00
Carolyn Wyborny
02ef6e1d0b igb: Fix queue allocation method to accommodate changing during runtime
When changing number of queues using ethtool's set_channels during runtime,
a queue allocation could fail, which can leave the device in a down state.
In order to preserve the usability of the device in this scenario, this patch
changes the driver to allocate the  number of queues only if they have not
been allocated already. The first allocation is then done for the max number
of queues, which is the default queues for this driver.   With this change,
queue quantity changes are not subject to queue allocation failures.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-12-17 22:43:12 -08:00
Carolyn Wyborny
56cec24916 igb: Add new feature Media Auto Sense for 82580 devices only
This patch adds support for the hardware feature Media Auto Sense.  This
feature requires a custom EEPROM image provided by our customer support
team.  The feature allows hardware designed with dual PHY's, fiber and
copper to be used with either media without additional EEPROM changes.
Fiber is preferred and driver will swap and configure for fiber media if
sensed by the device at any time. Device will swap back to copper if it
is the only media detected.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-12-10 01:27:35 -08:00
Aaron Sierra
89dbefb213 igb: Support ports mapped in 64-bit PCI space
This patch resolves an issue with 64-bit PCI addresses being truncated
because the return values of pci_resource_start() and pci_resource_end()
were being cast to unsigned long.

Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-12-10 01:27:34 -08:00
Carolyn Wyborny
2bdfc4e271 igb: Add media switching feature for i354 PHY's
This patch adds a new feature which is supported in some PHY's on some i354
devices.  This feature is Auto Media Detect and allows which ever media is
detected first by the PHY to be the media used and configured by the
device.  This is a media swapping feature that is wholly contained in the
Marvell PHY.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-12-10 01:27:34 -08:00
Linus Torvalds
5e30025a31 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking changes from Ingo Molnar:
 "The biggest changes:

   - add lockdep support for seqcount/seqlocks structures, this
     unearthed both bugs and required extra annotation.

   - move the various kernel locking primitives to the new
     kernel/locking/ directory"

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  block: Use u64_stats_init() to initialize seqcounts
  locking/lockdep: Mark __lockdep_count_forward_deps() as static
  lockdep/proc: Fix lock-time avg computation
  locking/doc: Update references to kernel/mutex.c
  ipv6: Fix possible ipv6 seqlock deadlock
  cpuset: Fix potential deadlock w/ set_mems_allowed
  seqcount: Add lockdep functionality to seqcount/seqlock structures
  net: Explicitly initialize u64_stats_sync structures for lockdep
  locking: Move the percpu-rwsem code to kernel/locking/
  locking: Move the lglocks code to kernel/locking/
  locking: Move the rwsem code to kernel/locking/
  locking: Move the rtmutex code to kernel/locking/
  locking: Move the semaphore core to kernel/locking/
  locking: Move the spinlock code to kernel/locking/
  locking: Move the lockdep code to kernel/locking/
  locking: Move the mutex code to kernel/locking/
  hung_task debugging: Add tracepoint to report the hang
  x86/locking/kconfig: Update paravirt spinlock Kconfig description
  lockstat: Report avg wait and hold times
  lockdep, x86/alternatives: Drop ancient lockdep fixup message
  ...
2013-11-14 16:30:30 +09:00
Linus Torvalds
8ceafbfa91 Merge branch 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm
Pull DMA mask updates from Russell King:
 "This series cleans up the handling of DMA masks in a lot of drivers,
  fixing some bugs as we go.

  Some of the more serious errors include:
   - drivers which only set their coherent DMA mask if the attempt to
     set the streaming mask fails.
   - drivers which test for a NULL dma mask pointer, and then set the
     dma mask pointer to a location in their module .data section -
     which will cause problems if the module is reloaded.

  To counter these, I have introduced two helper functions:
   - dma_set_mask_and_coherent() takes care of setting both the
     streaming and coherent masks at the same time, with the correct
     error handling as specified by the API.
   - dma_coerce_mask_and_coherent() which resolves the problem of
     drivers forcefully setting DMA masks.  This is more a marker for
     future work to further clean these locations up - the code which
     creates the devices really should be initialising these, but to fix
     that in one go along with this change could potentially be very
     disruptive.

  The last thing this series does is prise away some of Linux's addition
  to "DMA addresses are physical addresses and RAM always starts at
  zero".  We have ARM LPAE systems where all system memory is above 4GB
  physical, hence having DMA masks interpreted by (eg) the block layers
  as describing physical addresses in the range 0..DMAMASK fails on
  these platforms.  Santosh Shilimkar addresses this in this series; the
  patches were copied to the appropriate people multiple times but were
  ignored.

  Fixing this also gets rid of some ARM weirdness in the setup of the
  max*pfn variables, and brings ARM into line with every other Linux
  architecture as far as those go"

* 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm: (52 commits)
  ARM: 7805/1: mm: change max*pfn to include the physical offset of memory
  ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: 7795/1: mm: dma-mapping: Add dma_max_pfn(dev) helper function
  ARM: 7794/1: block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit()
  ARM: DMA-API: better handing of DMA masks for coherent allocations
  ARM: 7857/1: dma: imx-sdma: setup dma mask
  DMA-API: firmware/google/gsmi.c: avoid direct access to DMA masks
  DMA-API: dcdbas: update DMA mask handing
  DMA-API: dma: edma.c: no need to explicitly initialize DMA masks
  DMA-API: usb: musb: use platform_device_register_full() to avoid directly messing with dma masks
  DMA-API: crypto: remove last references to 'static struct device *dev'
  DMA-API: crypto: fix ixp4xx crypto platform device support
  DMA-API: others: use dma_set_coherent_mask()
  DMA-API: staging: use dma_set_coherent_mask()
  DMA-API: usb: use new dma_coerce_mask_and_coherent()
  DMA-API: usb: use dma_set_coherent_mask()
  DMA-API: parport: parport_pc.c: use dma_coerce_mask_and_coherent()
  DMA-API: net: octeon: use dma_coerce_mask_and_coherent()
  DMA-API: net: nxp/lpc_eth: use dma_coerce_mask_and_coherent()
  ...
2013-11-14 07:55:21 +09:00
John Stultz
827da44c61 net: Explicitly initialize u64_stats_sync structures for lockdep
In order to enable lockdep on seqcount/seqlock structures, we
must explicitly initialize any locks.

The u64_stats_sync structure, uses a seqcount, and thus we need
to introduce a u64_stats_init() function and use it to initialize
the structure.

This unfortunately adds a lot of fairly trivial initialization code
to a number of drivers. But the benefit of ensuring correctness makes
this worth while.

Because these changes are required for lockdep to be enabled, and the
changes are quite trivial, I've not yet split this patch out into 30-some
separate patches, as I figured it would be better to get the various
maintainers thoughts on how to best merge this change along with
the seqcount lockdep enablement.

Feedback would be appreciated!

Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Roger Luethi <rl@hellgate.ch>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Simon Horman <horms@verge.net.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:40:25 +01:00
Stefan Assmann
781798a11e igb: fix driver reload with VF assigned to guest
commit fa44f2f185 broke reloading of igb, when
VFs are assigned to a guest, in several ways.
1. on module load adapter->vf_data does not get properly allocated,
resulting in a null pointer exception when accessing adapter->vf_data in
igb_reset() on module reload.
 modprobe -r igb ; modprobe igb max_vfs=7
[  215.215837] igb 0000:01:00.1: removed PHC on eth1
[  216.932072] igb 0000:01:00.1: IOV Disabled
[  216.937038] igb 0000:01:00.0: removed PHC on eth0
[  217.127032] igb 0000:01:00.0: Cannot deallocate SR-IOV virtual functions while they are assigned - VFs will not be deallocated
[  217.146178] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.0.5-k
[  217.154050] igb: Copyright (c) 2007-2013 Intel Corporation.
[  217.160688] igb 0000:01:00.0: Enabling SR-IOV VFs using the module parameter is deprecated - please use the pci sysfs interface.
[  217.173703] igb 0000:01:00.0: irq 103 for MSI/MSI-X
[  217.179227] igb 0000:01:00.0: irq 104 for MSI/MSI-X
[  217.184735] igb 0000:01:00.0: irq 105 for MSI/MSI-X
[  217.220082] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  217.228846] IP: [<ffffffffa007c5e5>] igb_reset+0xc5/0x4b0 [igb]
[  217.235472] PGD 3607ec067 PUD 36170b067 PMD 0
[  217.240461] Oops: 0002 [#1] SMP
[  217.244085] Modules linked in: igb(+) igbvf mptsas mptscsih mptbase scsi_transport_sas [last unloaded: igb]
[  217.255040] CPU: 4 PID: 4833 Comm: modprobe Not tainted 3.11.0+ #46
[...]
[  217.390007]  [<ffffffffa007fab2>] igb_probe+0x892/0xfd0 [igb]
[  217.396422]  [<ffffffff81470b3e>] local_pci_probe+0x1e/0x40
[  217.402641]  [<ffffffff81472029>] pci_device_probe+0xf9/0x110
[...]
2. A follow up issue, pci_enable_sriov() should only be called if no VFs were
still allocated on module unload. Otherwise pci_enable_sriov() gets called
multiple times in a row rendering the NIC unusable until reset.
3. simply calling igb_enable_sriov() in igb_probe_vfs() is not enough as the
interrupts need to be re-setup. Switching that to igb_pci_enable_sriov().

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-10-24 05:53:46 -07:00