35849 Commits

Author SHA1 Message Date
Vinicius Costa Gomes
3cda505a67 igc: Fix PTP initialization
Right now, igc_ptp_reset() is called from igc_reset(), which is called
from igc_probe() before igc_ptp_init() has a chance to run. It is
detected as an attempt to use a spinlock without registering its key
first. See log below.

To avoid this problem, simplify the initialization: igc_ptp_init() is
only called from igc_probe(), and igc_ptp_reset() is only called from
igc_reset().

[    2.736332] INFO: trying to register non-static key.
[    2.736902] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1f.3/sound/card0/input10
[    2.737513] the code is fine but needs lockdep annotation.
[    2.737513] turning off the locking correctness validator.
[    2.737515] CPU: 8 PID: 239 Comm: systemd-udevd Tainted: G            E     5.8.0-rc7+ #13
[    2.737515] Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS ULTRA/Z390 AORUS ULTRA-CF, BIOS F7 03/14/2019
[    2.737516] Call Trace:
[    2.737521]  dump_stack+0x78/0xa0
[    2.737524]  register_lock_class+0x6b1/0x6f0
[    2.737526]  ? lockdep_hardirqs_on_prepare+0xca/0x160
[    2.739177]  ? _raw_spin_unlock_irq+0x24/0x50
[    2.739179]  ? trace_hardirqs_on+0x1c/0xf0
[    2.740820]  __lock_acquire+0x56/0x1ff0
[    2.740823]  ? __schedule+0x30c/0x970
[    2.740825]  lock_acquire+0x97/0x3e0
[    2.740830]  ? igc_ptp_reset+0x35/0xf0 [igc]
[    2.740833]  ? schedule_hrtimeout_range_clock+0xb7/0x120
[    2.742507]  _raw_spin_lock_irqsave+0x3a/0x50
[    2.742512]  ? igc_ptp_reset+0x35/0xf0 [igc]
[    2.742515]  igc_ptp_reset+0x35/0xf0 [igc]
[    2.742519]  igc_reset+0x96/0xd0 [igc]
[    2.744148]  igc_probe+0x68f/0x7d0 [igc]
[    2.745796]  local_pci_probe+0x3d/0x70
[    2.745799]  pci_device_probe+0xd1/0x190
[    2.745802]  really_probe+0x15a/0x3f0
[    2.759936]  driver_probe_device+0xe1/0x150
[    2.759937]  device_driver_attach+0xa8/0xb0
[    2.761786]  __driver_attach+0x89/0x150
[    2.761786]  ? device_driver_attach+0xb0/0xb0
[    2.761787]  ? device_driver_attach+0xb0/0xb0
[    2.761788]  bus_for_each_dev+0x66/0x90
[    2.765012]  bus_add_driver+0x12e/0x1f0
[    2.765716]  driver_register+0x8b/0xe0
[    2.766418]  ? 0xffffffffc0230000
[    2.767119]  do_one_initcall+0x5a/0x310
[    2.767826]  ? kmem_cache_alloc_trace+0xe9/0x200
[    2.768528]  do_init_module+0x5c/0x260
[    2.769206]  __do_sys_finit_module+0x93/0xe0
[    2.770048]  do_syscall_64+0x46/0xa0
[    2.770716]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.771396] RIP: 0033:0x7f83534589e0
[    2.772073] Code: 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 2e 2e 2e 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 80 24 0d 00 f7 d8 64 89 01 48
[    2.772074] RSP: 002b:00007ffd31d0ed18 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    2.774854] RAX: ffffffffffffffda RBX: 000055d52816aba0 RCX: 00007f83534589e0
[    2.774855] RDX: 0000000000000000 RSI: 00007f83535b982f RDI: 0000000000000006
[    2.774855] RBP: 00007ffd31d0ed60 R08: 0000000000000000 R09: 00007ffd31d0ed30
[    2.774856] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000000
[    2.774856] R13: 0000000000020000 R14: 00007f83535b982f R15: 000055d527f5e120
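
The ordering problem described in this commit can be modelled in user space. The sketch below is not the igc driver code: `struct ptp_state`, `ptp_init()` and `ptp_reset()` are hypothetical stand-ins, and a boolean flag stands in for the lock setup that the real init path performs. It only illustrates why a reset that runs before init has nothing valid to lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the adapter's PTP state. */
struct ptp_state {
    bool lock_ready;   /* models the lock having been initialised */
    int  resets;       /* number of successful resets */
};

/* Models the init step: the only place the lock is set up. */
static void ptp_init(struct ptp_state *p)
{
    assert(p != NULL);
    p->lock_ready = true;
    p->resets = 0;
}

/* Models the reset step: it takes the lock, so init must have run.
 * Returns 0 on success, -1 if the lock was never initialised. */
static int ptp_reset(struct ptp_state *p)
{
    assert(p != NULL);
    if (!p->lock_ready)
        return -1;
    p->resets++;
    return 0;
}
```

Calling `ptp_reset()` on a zeroed state returns -1, and the same call succeeds once `ptp_init()` has run, which mirrors the ordering the commit enforces between probe and reset.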

Fixes: 5f2958052c ("igc: Add basic skeleton for PTP")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-14 09:41:20 -07:00
Linus Torvalds
a1d21081a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Some merge window fallout, some longer term fixes:

   1) Handle headroom properly in lapbether and x25_asy drivers, from
      Xie He.

   2) Fetch MAC address from correct r8152 device node, from Thierry
      Reding.

   3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg,
      from Rouven Czerwinski.

   4) Correct fdputs in socket layer, from Miaohe Lin.

   5) Revert troublesome sockptr_t optimization, from Christoph Hellwig.

   6) Fix TCP TFO key reading on big endian, from Jason Baron.

   7) Missing CAP_NET_RAW check in nfc, from Qingyu Li.

   8) Fix inet fastreuse optimization with tproxy sockets, from Tim
      Froidcoeur.

   9) Fix 64-bit divide in new SFC driver, from Edward Cree.

  10) Add a tracepoint for prandom_u32 so that we can more easily
      perform usage analysis. From Eric Dumazet.

  11) Fix rwlock imbalance in AF_PACKET, from John Ogness"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  net: openvswitch: introduce common code for flushing flows
  af_packet: TPACKET_V3: fix fill status rwlock imbalance
  random32: add a tracepoint for prandom_u32()
  Revert "ipv4: tunnel: fix compilation on ARCH=um"
  net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus
  net: ethernet: stmmac: Disable hardware multicast filter
  net: stmmac: dwmac1000: provide multicast filter fallback
  ipv4: tunnel: fix compilation on ARCH=um
  vsock: fix potential null pointer dereference in vsock_poll()
  sfc: fix ef100 design-param checking
  net: initialize fastreuse on inet_inherit_port
  net: refactor bind_bucket fastreuse into helper
  net: phy: marvell10g: fix null pointer dereference
  net: Fix potential memory leak in proto_register()
  net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
  ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc()
  net/nfc/rawsock.c: add CAP_NET_RAW check.
  hinic: fix strncpy output truncated compile warnings
  drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check
  net/tls: Fix kmap usage
  ...
2020-08-13 20:03:11 -07:00
Jonathan McDowell
df43dd526e net: ethernet: stmmac: Disable hardware multicast filter
The IPQ806x does not appear to have a functional multicast ethernet
address filter. This was observed as a failure to correctly receive IPv6
packets on a LAN to the all stations address. Checking the vendor driver
shows that it does not attempt to enable the multicast filter and
instead falls back to receiving all multicast packets, internally
setting ALLMULTI.

Use the new fallback support in the dwmac1000 driver to correctly
achieve the same with the mainline IPQ806x driver. Confirmed to fix IPv6
functionality on an RB3011 router.

Cc: stable@vger.kernel.org
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12 13:12:52 -07:00
Jonathan McDowell
592d751c1e net: stmmac: dwmac1000: provide multicast filter fallback
If we don't have a hardware multicast filter available then instead of
silently failing to listen for the requested ethernet broadcast
addresses fall back to receiving all multicast packets, in a similar
fashion to other drivers with no multicast filter.

Cc: stable@vger.kernel.org
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12 13:12:52 -07:00
Edward Cree
41077c9902 sfc: fix ef100 design-param checking
The handling of the RXQ/TXQ size granularity design-params had two
 problems: it had a 64-bit divide that didn't build on 32-bit platforms,
 and it could divide by zero if the NIC supplied 0 as the value of the
 design-param.  Fix both by checking for 0 and for a granularity bigger
 than our min-size; if the granularity <= EFX_MIN_DMAQ_SIZE then it fits
 in 32 bits, so we can cast it to u32 for the divide.
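
The check described above can be sketched in plain C. This is an illustration, not the sfc code: `check_granularity()` is a hypothetical name, the value given to `EFX_MIN_DMAQ_SIZE` is assumed for the example, and the sketch assumes the divide is used to test that the minimum queue size is a multiple of the granularity.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the driver defines its own constant. */
#define EFX_MIN_DMAQ_SIZE 512u

/* Returns 0 if the granularity is usable, -1 otherwise. */
static int check_granularity(uint64_t granularity)
{
    uint32_t g;

    if (granularity == 0)                  /* would divide by zero */
        return -1;
    if (granularity > EFX_MIN_DMAQ_SIZE)   /* larger than our minimum size */
        return -1;

    /* granularity <= EFX_MIN_DMAQ_SIZE, so the narrowing cast is lossless
     * and the divide below is a 32-bit operation. */
    assert(granularity <= UINT32_MAX);
    g = (uint32_t)granularity;

    return (EFX_MIN_DMAQ_SIZE % g) ? -1 : 0;
}
```

Ordering the range checks before the cast is the point of the design: the 64-bit value is never used as a divisor, so no 64-bit divide helper is needed on 32-bit builds.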

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12 12:54:03 -07:00
Wang Hai
50caa777a3 net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
Fix the missing clk_disable_unprepare() before return
from emac_clks_phase1_init() in the error handling case.
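
The pattern behind this fix, undoing an earlier enable when a later step fails, can be sketched as follows. The `fake_clk` type and its helpers are hypothetical user-space stand-ins, not the clk API or the emac driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fake clock; `fail` forces the enable step to fail. */
struct fake_clk {
    bool enabled;
    bool fail;
};

static int fake_clk_enable(struct fake_clk *c)
{
    assert(c != NULL);
    if (c->fail)
        return -1;
    c->enabled = true;
    return 0;
}

static void fake_clk_disable(struct fake_clk *c)
{
    assert(c != NULL);
    c->enabled = false;
}

/* Enable two clocks; if the second fails, undo the first before returning. */
static int clks_init(struct fake_clk *a, struct fake_clk *b)
{
    int ret = fake_clk_enable(a);

    if (ret)
        return ret;

    ret = fake_clk_enable(b);
    if (ret)
        fake_clk_disable(a);   /* the kind of cleanup the fix adds */

    return ret;
}
```

Without the cleanup call, a failure on the second clock would leave the first one enabled with no owner left to disable it.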

Fixes: b9b17debc6 ("net: emac: emac gigabit ethernet controller driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Timur Tabi <timur@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-11 10:36:47 -07:00
Xu Wang
e71642009c ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-11 10:36:08 -07:00
Luo bin
1dab5877e8 hinic: fix strncpy output truncated compile warnings
Fix the compile warnings about 'strncpy' output being truncated before
the terminating nul when copying N bytes from a string of the same length.
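
The warning arises because copying exactly as many bytes as the source string holds leaves no room for the terminating nul. A minimal sketch of a copy that always terminates is shown below; `copy_bounded()` is a hypothetical helper for illustration and is not the change made in hinic.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes),
 * truncating if needed and always writing a terminating nul. */
static void copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL && dst_size > 0);

    n = strlen(src);
    if (n >= dst_size)
        n = dst_size - 1;   /* reserve one byte for the terminator */

    memcpy(dst, src, n);
    dst[n] = '\0';
}
```

With a 4-byte destination, copying "abcdef" yields "abc"; the string is shortened, but it is always a valid C string.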

Signed-off-by: Luo bin <luobin9@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-11 10:33:14 -07:00
Linus Torvalds
049eb096da pci-v5.9-changes

Merge tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:
   - Fix pci_cfg_wait queue locking problem (Bjorn Helgaas)
   - Convert PCIe capability PCIBIOS errors to errno (Bolarinwa Olayemi
     Saheed)
   - Align PCIe capability and PCI accessor return values (Bolarinwa
     Olayemi Saheed)
   - Fix pci_create_slot() reference count leak (Qiushi Wu)
   - Announce device after early fixups (Tiezhu Yang)

  PCI device hotplug:
   - Make rpadlpar functions static (Wei Yongjun)

  Driver binding:
   - Add device even if driver attach failed (Rajat Jain)

  Virtualization:
   - xen: Remove redundant initialization of irq (Colin Ian King)

  IOMMU:
   - Add pci_pri_supported() to check device or associated PF (Ashok Raj)
   - Release IVRS table in AMD ACS quirk (Hanjun Guo)
   - Mark AMD Navi10 GPU rev 0x00 ATS as broken (Kai-Heng Feng)
   - Treat "external-facing" devices themselves as internal (Rajat Jain)

  MSI:
   - Forward MSI-X error code in pci_alloc_irq_vectors_affinity() (Piotr
     Stankiewicz)

  Error handling:
   - Clear PCIe Device Status errors only if OS owns AER (Jonathan
     Cameron)
   - Log correctable errors as warning, not error (Matt Jolly)
   - Use 'pci_channel_state_t' instead of 'enum pci_channel_state' (Luc
     Van Oostenryck)

  Peer-to-peer DMA:
   - Allow P2PDMA on AMD Zen and newer CPUs (Logan Gunthorpe)

  ASPM:
   - Add missing newline in sysfs 'policy' (Xiongfeng Wang)

  Native PCIe controllers:
   - Convert to devm_platform_ioremap_resource_byname() (Dejin Zheng)
   - Convert to devm_platform_ioremap_resource() (Dejin Zheng)
   - Remove duplicate error message from devm_pci_remap_cfg_resource()
     callers (Dejin Zheng)
   - Fix runtime PM imbalance on error (Dinghao Liu)
   - Remove dev_err() when handing an error from platform_get_irq()
     (Krzysztof Wilczyński)
   - Use pci_host_bridge.windows list directly instead of splicing in a
     temporary list for cadence, mvebu, host-common (Rob Herring)
   - Use pci_host_probe() instead of open-coding all the pieces for
     altera, brcmstb, iproc, mobiveil, rcar, rockchip, tegra, v3,
     versatile, xgene, xilinx, xilinx-nwl (Rob Herring)
   - Default host bridge parent device to the platform device (Rob
     Herring)
   - Use pci_is_root_bus() instead of tracking root bus number
     separately in aardvark, designware (imx6, keystone,
     designware-host), mobiveil, xilinx-nwl, xilinx, rockchip, rcar (Rob
     Herring)
   - Set host bridge bus number in pci_scan_root_bus_bridge() instead of
     each driver for aardvark, designware-host, host-common, mediatek,
     rcar, tegra, v3-semi (Rob Herring)
   - Move DT resource setup into devm_pci_alloc_host_bridge() (Rob
     Herring)
   - Set bridge map_irq and swizzle_irq to default functions; drivers
     that don't support legacy IRQs (iproc) need to undo this (Rob
     Herring)

  ARM Versatile PCIe controller driver:
   - Drop flag PCI_ENABLE_PROC_DOMAINS (Rob Herring)

  Cadence PCIe controller driver:
   - Use "dma-ranges" instead of "cdns,no-bar-match-nbits" property
     (Kishon Vijay Abraham I)
   - Remove "mem" from reg binding (Kishon Vijay Abraham I)
   - Fix cdns_pcie_{host|ep}_setup() error path (Kishon Vijay Abraham I)
   - Convert all r/w accessors to perform only 32-bit accesses (Kishon
     Vijay Abraham I)
   - Add support to start link and verify link status (Kishon Vijay
     Abraham I)
   - Allow pci_host_bridge to have custom pci_ops (Kishon Vijay Abraham I)
   - Add new *ops* for CPU addr fixup (Kishon Vijay Abraham I)
   - Fix updating Vendor ID and Subsystem Vendor ID register (Kishon
     Vijay Abraham I)
   - Use bridge resources for outbound window setup (Rob Herring)
   - Remove private bus number and range storage (Rob Herring)

  Cadence PCIe endpoint driver:
   - Add MSI-X support (Alan Douglas)

  HiSilicon PCIe controller driver:
   - Remove non-ECAM HiSilicon hip05/hip06 driver (Rob Herring)

  Intel VMD host bridge driver:
   - Use Shadow MEMBAR registers for QEMU/KVM guests (Jon Derrick)

  Loongson PCIe controller driver:
   - Use DECLARE_PCI_FIXUP_EARLY for bridge_class_quirk() (Tiezhu Yang)

  Marvell Aardvark PCIe controller driver:
   - Indicate error in 'val' when config read fails (Pali Rohár)
   - Don't touch PCIe registers if no card connected (Pali Rohár)

  Marvell MVEBU PCIe controller driver:
   - Setup BAR0 in order to fix MSI (Shmuel Hazan)

  Microsoft Hyper-V host bridge driver:
   - Fix a timing issue which causes kdump to fail occasionally (Wei Hu)
   - Make some functions static (Wei Yongjun)

  NVIDIA Tegra PCIe controller driver:
   - Revert tegra124 raw_violation_fixup (Nicolas Chauvet)
   - Remove PLL power supplies (Thierry Reding)

  Qualcomm PCIe controller driver:
   - Change duplicate PCI reset to phy reset (Abhishek Sahu)
   - Add missing ipq806x clocks in PCIe driver (Ansuel Smith)
   - Add missing reset for ipq806x (Ansuel Smith)
   - Add ext reset (Ansuel Smith)
   - Use bulk clk API and assert on error (Ansuel Smith)
   - Add support for tx term offset for rev 2.1.0 (Ansuel Smith)
   - Define some PARF params needed for ipq8064 SoC (Ansuel Smith)
   - Add ipq8064 rev2 variant (Ansuel Smith)
   - Support PCI speed set for ipq806x (Sham Muthayyan)

  Renesas R-Car PCIe controller driver:
   - Use devm_pci_alloc_host_bridge() (Rob Herring)
   - Use struct pci_host_bridge.windows list directly (Rob Herring)
   - Convert rcar-gen2 to use modern host bridge probe functions (Rob
     Herring)

  TI J721E PCIe driver:
   - Add TI J721E PCIe host and endpoint driver (Kishon Vijay Abraham I)

  Xilinx Versal CPM PCIe controller driver:
   - Add Versal CPM Root Port driver and YAML schema (Bharat Kumar
     Gogada)

  MicroSemi Switchtec management driver:
   - Add missing __iomem and __user tags to fix sparse warnings (Logan
     Gunthorpe)

  Miscellaneous:
   - Replace http:// links with https:// (Alexander A. Klimov)
   - Replace lkml.org, spinics, gmane with lore.kernel.org (Bjorn
     Helgaas)
   - Remove unused pci_lost_interrupt() (Heiner Kallweit)
   - Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h (Huacai Chen)
   - Fix kerneldoc warnings (Krzysztof Kozlowski)"

* tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits)
  PCI: Fix kerneldoc warnings
  PCI: xilinx-cpm: Add Versal CPM Root Port driver
  PCI: xilinx-cpm: Add YAML schemas for Versal CPM Root Port
  PCI: Set bridge map_irq and swizzle_irq to default functions
  PCI: Move DT resource setup into devm_pci_alloc_host_bridge()
  PCI: rcar-gen2: Convert to use modern host bridge probe functions
  PCI: Remove dev_err() when handing an error from platform_get_irq()
  MAINTAINERS: Add Kishon Vijay Abraham I for TI J721E SoC PCIe
  misc: pci_endpoint_test: Add J721E in pci_device_id table
  PCI: j721e: Add TI J721E PCIe driver
  PCI: switchtec: Add missing __iomem tag to fix sparse warnings
  PCI: switchtec: Add missing __iomem and __user tags to fix sparse warnings
  PCI: rpadlpar: Make functions static
  PCI/P2PDMA: Allow P2PDMA on AMD Zen and newer CPUs
  PCI: Release IVRS table in AMD ACS quirk
  PCI: Announce device after early fixups
  PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
  PCI: Remove unused pci_lost_interrupt()
  dt-bindings: PCI: Add EP mode dt-bindings for TI's J721E SoC
  dt-bindings: PCI: Add host mode dt-bindings for TI's J721E SoC
  ...
2020-08-07 18:48:15 -07:00
Linus Torvalds
81e11336d9 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few MM hotfixes

 - kthread, tools, scripts, ntfs and ocfs2

 - some of MM

Subsystems affected by this patch series: kthread, tools, scripts, ntfs,
ocfs2 and mm (hotfixes, pagealloc, slab-generic, slab, slub, kcsan,
debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, mincore,
sparsemem, vmalloc, kasan, pagealloc, hugetlb and vmscan).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits)
  mm: vmscan: consistent update to pgrefill
  mm/vmscan.c: fix typo
  khugepaged: khugepaged_test_exit() check mmget_still_valid()
  khugepaged: retract_page_tables() remember to test exit
  khugepaged: collapse_pte_mapped_thp() protect the pmd lock
  khugepaged: collapse_pte_mapped_thp() flush the right range
  mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
  mm: thp: replace HTTP links with HTTPS ones
  mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
  mm/page_alloc.c: skip setting nodemask when we are in interrupt
  mm/page_alloc: fallbacks at most has 3 elements
  mm/page_alloc: silence a KASAN false positive
  mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()
  mm/page_alloc.c: simplify pageblock bitmap access
  mm/page_alloc.c: extract the common part in pfn_to_bitidx()
  mm/page_alloc.c: replace the definition of NR_MIGRATETYPE_BITS with PB_migratetype_bits
  mm/shuffle: remove dynamic reconfiguration
  mm/memory_hotplug: document why shuffle_zone() is relevant
  mm/page_alloc: remove nr_free_pagecache_pages()
  mm: remove vm_total_pages
  ...
2020-08-07 11:39:33 -07:00
Waiman Long
453431a549 mm, treewide: rename kzfree() to kfree_sensitive()
As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.
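
A user-space sketch of the idea follows. It is an illustration under stated assumptions, not the kernel implementation: `zero_explicit()` and `free_sensitive()` are hypothetical names, a volatile pointer is used here to keep the stores from being elided, and the caller passes the buffer size explicitly, whereas the kernel API takes only the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Zero len bytes through a volatile pointer so the compiler cannot
 * drop the stores as dead writes to memory that is about to be freed. */
static void zero_explicit(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    while (len--)
        *p++ = 0;
}

/* Hypothetical analogue of kfree_sensitive(): clear, then free. */
static void free_sensitive(void *buf, size_t len)
{
    if (buf == NULL)
        return;
    zero_explicit(buf, len);
    free(buf);
}
```

The name states the intent (the memory held something sensitive) rather than the mechanism (zeroing), which is the argument quoted above for dropping the "z".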

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/kzfree/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.

[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:22 -07:00
Linus Torvalds
96e3f3c16b - Add support to enable/disable the thermal zones resulting on core code and
drivers cleanup (Andrzej Pietrasiewicz)
 
 - Add generic netlink support for userspace notifications: events, temperature
   and discovery commands (Daniel Lezcano)
 
 - Fix redundant initialization for a ret variable (Colin Ian King)
 
 - Remove the clock cooling code as it is used nowhere (Amit Kucheria)
 
 - Add the rcar_gen3_thermal's r8a774e1 support (Marian-Cristian Rotariu)
 
 - Replace all references to thermal.txt in the documentation to the
   corresponding yaml files (Amit Kucheria)
 
 - Add maintainer entry for the IPA (Lukasz Luba)
 
 - Add support for MSM8939 for the tsens (Shawn Guo)
 
 - Update power allocator and devfreq cooling to SPDX licensing (Lukasz Luba)
 
 - Add Cannon Lake Low Power PCH support (Sumeet Pawnikar)
 
 - Add tsensor support for V2 mediatek thermal system (Henry Yen)
 
 - Fix thermal zone lookup by ID for the core code (Thierry Reding)

Merge tag 'thermal-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal updates from Daniel Lezcano:

 - Add support to enable/disable the thermal zones resulting on core
   code and drivers cleanup (Andrzej Pietrasiewicz)

 - Add generic netlink support for userspace notifications: events,
   temperature and discovery commands (Daniel Lezcano)

 - Fix redundant initialization for a ret variable (Colin Ian King)

 - Remove the clock cooling code as it is used nowhere (Amit Kucheria)

 - Add the rcar_gen3_thermal's r8a774e1 support (Marian-Cristian
   Rotariu)

 - Replace all references to thermal.txt in the documentation to the
   corresponding yaml files (Amit Kucheria)

 - Add maintainer entry for the IPA (Lukasz Luba)

 - Add support for MSM8939 for the tsens (Shawn Guo)

 - Update power allocator and devfreq cooling to SPDX licensing (Lukasz
   Luba)

 - Add Cannon Lake Low Power PCH support (Sumeet Pawnikar)

 - Add tsensor support for V2 mediatek thermal system (Henry Yen)

 - Fix thermal zone lookup by ID for the core code (Thierry Reding)

* tag 'thermal-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (40 commits)
  thermal: intel: intel_pch_thermal: Add Cannon Lake Low Power PCH support
  thermal: mediatek: Add tsensor support for V2 thermal system
  thermal: mediatek: Prepare to add support for other platforms
  thermal: Update power allocator and devfreq cooling to SPDX licensing
  MAINTAINERS: update entry to thermal governors file name prefixing
  thermal: core: Add thermal zone enable/disable notification
  thermal: qcom: tsens-v0_1: Add support for MSM8939
  dt-bindings: tsens: qcom: Document MSM8939 compatible
  thermal: core: Fix thermal zone lookup by ID
  thermal: int340x: processor_thermal: fix: update Jasper Lake PCI id
  thermal: imx8mm: Support module autoloading
  thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor()
  MAINTAINERS: Add maintenance information for IPA
  thermal: rcar_gen3_thermal: Do not shadow thcode variable
  dt-bindings: thermal: Get rid of thermal.txt and replace references
  thermal: core: Move initialization after core initcall
  thermal: netlink: Improve the initcall ordering
  net: genetlink: Move initialization to core_initcall
  thermal: rcar_gen3_thermal: Add r8a774e1 support
  thermal/drivers/clock_cooling: Remove clock_cooling code
  ...
2020-08-06 18:10:55 -07:00
Linus Torvalds
d7806bbd22 RDMA 5.9 merge window pull request
Smaller set of RDMA updates. A smaller number of 'big topics' with the
 majority of changes being driver updates.
 
 - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re
 
 - Removal of dead or redundant code across the drivers
 
 - RAW resource tracker dumps to include a device specific data blob for
   device objects to aid device debugging
 
 - Further advance the IOCTL interface, remove the ability to turn it off.
   Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands
 
 - Remove stubs related to devices with no pkey table
 
 - A shared CQ scheme to allow multiple ULPs to share the CQ rings of a
   device to give higher performance
 
 - Several more static checker, syzkaller and rare crashers fixed

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A quiet cycle after the larger 5.8 effort. Substantially cleanup and
  driver work, with a few smaller features this time.

   - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re

   - Removal of dead or redundant code across the drivers

   - RAW resource tracker dumps to include a device specific data blob
     for device objects to aid device debugging

   - Further advance the IOCTL interface, remove the ability to turn it
     off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands

   - Remove stubs related to devices with no pkey table

   - A shared CQ scheme to allow multiple ULPs to share the CQ rings of
     a device to give higher performance

   - Several more static checker, syzkaller and rare crashers fixed"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits)
  RDMA/mlx5: Fix flow destination setting for RDMA TX flow table
  RDMA/rxe: Remove pkey table
  RDMA/umem: Add a schedule point in ib_umem_get()
  RDMA/hns: Fix the unneeded process when getting a general type of CQE error
  RDMA/hns: Fix error during modify qp RTS2RTS
  RDMA/hns: Delete unnecessary memset when allocating VF resource
  RDMA/hns: Remove redundant parameters in set_rc_wqe()
  RDMA/hns: Remove support for HIP08_A
  RDMA/hns: Refactor hns_roce_v2_set_hem()
  RDMA/hns: Remove redundant hardware opcode definitions
  RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP
  RDMA/include: Replace license text with SPDX tags
  RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
  RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting
  RDMA/cma: Execute rdma_cm destruction from a handler properly
  RDMA/cma: Remove unneeded locking for req paths
  RDMA/cma: Using the standard locking pattern when delivering the removal event
  RDMA/cma: Simplify DEVICE_REMOVAL for internal_id
  RDMA/efa: Add EFA 0xefa1 PCI ID
  RDMA/efa: User/kernel compatibility handshake mechanism
  ...
2020-08-06 16:43:36 -07:00
Colin Ian King
8912fd6a61 net: hns3: fix spelling mistake "could'nt" -> "couldn't"
There is a spelling mistake in a dev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-06 12:05:40 -07:00
Linus Torvalds
47ec5303d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Support 6GHz band in ath11k driver, from Rajkumar Manoharan.

 2) Support UDP segmentation in core TSO code, from Eric Dumazet.

 3) Allow flashing different flash images in cxgb4 driver, from Vishal
    Kulkarni.

 4) Add drop frames counter and flow status to tc flower offloading,
    from Po Liu.

 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.

 6) Various new indirect call avoidance, from Eric Dumazet and Brian
    Vazquez.

 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
    Yonghong Song.

 8) Support querying and setting hardware address of a port function via
    devlink, use this in mlx5, from Parav Pandit.

 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.

10) Switch qca8k driver over to phylink, from Jonathan McDowell.

11) In bpftool, show list of processes holding BPF FD references to
    maps, programs, links, and btf objects. From Andrii Nakryiko.

12) Several conversions over to generic power management, from Vaibhav
    Gupta.

13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
    Yakunin.

14) Various https url conversions, from Alexander A. Klimov.

15) Timestamping and PHC support for mscc PHY driver, from Antoine
    Tenart.

16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.

17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.

18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.

19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several
    drivers. From Luc Van Oostenryck.

20) XDP support for xen-netfront, from Denis Kirjanov.

21) Support receive buffer autotuning in MPTCP, from Florian Westphal.

22) Support EF100 chip in sfc driver, from Edward Cree.

23) Add XDP support to mvpp2 driver, from Matteo Croce.

24) Support MPTCP in sock_diag, from Paolo Abeni.

25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
    infrastructure, from Jakub Kicinski.

26) Several pci_ --> dma_ API conversions, from Christophe JAILLET.

27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.

28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.

29) Refactor a lot of networking socket option handling code in order to
    avoid set_fs() calls, from Christoph Hellwig.

30) Add rfc4884 support to icmp code, from Willem de Bruijn.

31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.

32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.

33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.

34) Support TCP syncookies in MPTCP, from Florian Westphal.

35) Fix several tricky cases of PMTU handling wrt. bridging, from Stefano
    Brivio.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
  net: thunderx: initialize VF's mailbox mutex before first usage
  usb: hso: remove bogus check for EINPROGRESS
  usb: hso: no complaint about kmalloc failure
  hso: fix bailout in error case of probe
  ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
  selftests/net: relax cpu affinity requirement in msg_zerocopy test
  mptcp: be careful on subflow creation
  selftests: rtnetlink: make kci_test_encap() return sub-test result
  selftests: rtnetlink: correct the final return value for the test
  net: dsa: sja1105: use detected device id instead of DT one on mismatch
  tipc: set ub->ifindex for local ipv6 address
  ipv6: add ipv6_dev_find()
  net: openvswitch: silence suspicious RCU usage warning
  Revert "vxlan: fix tos value before xmit"
  ptp: only allow phase values lower than 1 period
  farsync: switch from 'pci_' to 'dma_' API
  wan: wanxl: switch from 'pci_' to 'dma_' API
  hv_netvsc: do not use VF device if link is down
  dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
  net: macb: Properly handle phylink on at91sam9x
  ...
2020-08-05 20:13:21 -07:00
Dean Nelson
c1055b76ad net: thunderx: initialize VF's mailbox mutex before first usage
A VF's mailbox mutex is not getting initialized by nicvf_probe() until after
it is first used. And such usage is resulting in...

[   28.270927] ------------[ cut here ]------------
[   28.270934] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[   28.270980] WARNING: CPU: 9 PID: 675 at kernel/locking/mutex.c:938 __mutex_lock+0xdac/0x12f0
[   28.270985] Modules linked in: ast(+) nicvf(+) i2c_algo_bit drm_vram_helper drm_ttm_helper ttm nicpf(+) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ixgbe(+) sg thunder_bgx mdio i2c_thunderx mdio_thunder thunder_xcv mdio_cavium dm_mirror dm_region_hash dm_log dm_mod
[   28.271064] CPU: 9 PID: 675 Comm: systemd-udevd Not tainted 4.18.0+ #1
[   28.271070] Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS F02 08/06/2019
[   28.271078] pstate: 60000005 (nZCv daif -PAN -UAO)
[   28.271086] pc : __mutex_lock+0xdac/0x12f0
[   28.271092] lr : __mutex_lock+0xdac/0x12f0
[   28.271097] sp : ffff800d42146fb0
[   28.271103] x29: ffff800d42146fb0 x28: 0000000000000000
[   28.271113] x27: ffff800d24361180 x26: dfff200000000000
[   28.271122] x25: 0000000000000000 x24: 0000000000000002
[   28.271132] x23: ffff20001597cc80 x22: ffff2000139e9848
[   28.271141] x21: 0000000000000000 x20: 1ffff001a8428e0c
[   28.271151] x19: ffff200015d5d000 x18: 1ffff001ae0f2184
[   28.271160] x17: 0000000000000000 x16: 0000000000000000
[   28.271170] x15: ffff800d70790c38 x14: ffff20001597c000
[   28.271179] x13: ffff20001597cc80 x12: ffff040002b2f779
[   28.271189] x11: 1fffe40002b2f778 x10: ffff040002b2f778
[   28.271199] x9 : 0000000000000000 x8 : 00000000f1f1f1f1
[   28.271208] x7 : 00000000f2f2f2f2 x6 : 0000000000000000
[   28.271217] x5 : 1ffff001ae0f2186 x4 : 1fffe400027eb03c
[   28.271227] x3 : dfff200000000000 x2 : ffff1001a8428dbe
[   28.271237] x1 : c87fdfac7ea11d00 x0 : 0000000000000000
[   28.271246] Call trace:
[   28.271254]  __mutex_lock+0xdac/0x12f0
[   28.271261]  mutex_lock_nested+0x3c/0x50
[   28.271297]  nicvf_send_msg_to_pf+0x40/0x3a0 [nicvf]
[   28.271316]  nicvf_register_misc_interrupt+0x20c/0x328 [nicvf]
[   28.271334]  nicvf_probe+0x508/0xda0 [nicvf]
[   28.271344]  local_pci_probe+0xc4/0x180
[   28.271352]  pci_device_probe+0x3ec/0x528
[   28.271363]  driver_probe_device+0x21c/0xb98
[   28.271371]  device_driver_attach+0xe8/0x120
[   28.271379]  __driver_attach+0xe0/0x2a0
[   28.271386]  bus_for_each_dev+0x118/0x190
[   28.271394]  driver_attach+0x48/0x60
[   28.271401]  bus_add_driver+0x328/0x558
[   28.271409]  driver_register+0x148/0x398
[   28.271416]  __pci_register_driver+0x14c/0x1b0
[   28.271437]  nicvf_init_module+0x54/0x10000 [nicvf]
[   28.271447]  do_one_initcall+0x18c/0xc18
[   28.271457]  do_init_module+0x18c/0x618
[   28.271464]  load_module+0x2bc0/0x4088
[   28.271472]  __se_sys_finit_module+0x110/0x188
[   28.271479]  __arm64_sys_finit_module+0x70/0xa0
[   28.271490]  el0_svc_handler+0x15c/0x380
[   28.271496]  el0_svc+0x8/0xc
[   28.271502] irq event stamp: 52649
[   28.271513] hardirqs last  enabled at (52649): [<ffff200011b4d790>] _raw_spin_unlock_irqrestore+0xc0/0xd8
[   28.271522] hardirqs last disabled at (52648): [<ffff200011b4d3c4>] _raw_spin_lock_irqsave+0x3c/0xf0
[   28.271530] softirqs last  enabled at (52330): [<ffff200010082af4>] __do_softirq+0xacc/0x117c
[   28.271540] softirqs last disabled at (52313): [<ffff20001019b354>] irq_exit+0x3cc/0x500
[   28.271545] ---[ end trace a9b90324c8a0d4ee ]---

This problem is resolved by moving the call to mutex_init() up earlier
in nicvf_probe().
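
The ordering problem can be modelled in userspace. The sketch below is an illustration only, not the driver's code: `model_mutex`, `model_dev`, `mbx_lock` and the two `probe_*` functions are hypothetical names, and the `magic` field merely imitates the self-pointer that the `DEBUG_LOCKS_WARN_ON(lock->magic != lock)` check in the trace above compares against.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a debug-enabled mutex. */
struct model_mutex {
	const struct model_mutex *magic;	/* points to itself once initialised */
	int locked;
};

/* Hypothetical device with a mailbox lock. */
struct model_dev {
	struct model_mutex mbx_lock;
};

static void model_mutex_init(struct model_mutex *m)
{
	m->magic = m;
	m->locked = 0;
}

/* Returns 0 on success, -1 if the lock was never initialised. */
static int model_mutex_lock(struct model_mutex *m)
{
	if (m->magic != m)
		return -1;
	m->locked = 1;
	return 0;
}

static void model_mutex_unlock(struct model_mutex *m)
{
	assert(m->locked);
	m->locked = 0;
}

/* First user of the lock, reached from the probe path. */
static int model_send_msg(struct model_dev *dev)
{
	if (model_mutex_lock(&dev->mbx_lock))
		return -1;
	model_mutex_unlock(&dev->mbx_lock);
	return 0;
}

/* Problem ordering: the lock is used before it is initialised. */
int probe_init_late(void)
{
	struct model_dev dev;
	int ret;

	memset(&dev, 0, sizeof(dev));
	ret = model_send_msg(&dev);		/* first use */
	model_mutex_init(&dev.mbx_lock);	/* too late */
	return ret;
}

/* Fixed ordering: initialise the lock before anything can take it. */
int probe_init_early(void)
{
	struct model_dev dev;

	memset(&dev, 0, sizeof(dev));
	model_mutex_init(&dev.mbx_lock);
	return model_send_msg(&dev);
}
```

In this model, `probe_init_late()` reports the failed sanity check and `probe_init_early()` succeeds.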

Fixes: 609ea65c65 ("net: thunderx: add mutex to protect mailbox from concurrent calls for same VF")
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05 17:50:34 -07:00
Bjorn Helgaas
0caa17f5f2 Merge branch 'pci/misc'
- Convert PCIe capability PCIBIOS errors to errno (Bolarinwa Olayemi
  Saheed)

- Align PCIe capability and PCI accessor return values (Bolarinwa Olayemi
  Saheed)

- Replace http:// links with https:// (Alexander A. Klimov)

- Replace lkml.org, spinics, gmane with lore.kernel.org (Bjorn Helgaas)

- Update panic message to mention kzalloc(), not kmalloc() (Liao Pingfang)

- Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h (Huacai Chen)

- Remove unused pci_lost_interrupt() (Heiner Kallweit)

* pci/misc:
  PCI: Remove unused pci_lost_interrupt()
  PCI: Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h
  PCI: Fix error in panic message
  PCI: Replace lkml.org, spinics, gmane with lore.kernel.org
  PCI: Replace http:// links with https://
  PCI: Align PCIe capability and PCI accessor return values
  PCI: Convert PCIe capability PCIBIOS errors to errno
2020-08-05 18:24:16 -05:00
YueHaibing
02afa9c66b dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
Fix smatch warning:

drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2419
 alloc_channel() warn: passing zero to 'ERR_PTR'

setup_dpcon() should return ERR_PTR(err) instead of zero in error
handling case.
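
A userspace sketch of why returning zero on the error path loses the error code. The helpers `err_ptr()`, `ptr_err()` and `is_err()` are simplified models of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); `setup_returns_null()` and `setup_returns_err_ptr()` are hypothetical and are not the driver's setup_dpcon().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (model of ERR_PTR()). */
static inline void *err_ptr(long error)
{
	assert(error <= 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the encoded value (model of PTR_ERR()). */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the top MAX_ERRNO addresses (model of IS_ERR()). */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int object;	/* stand-in for a successfully set up object */

/* Error path returns zero: the caller cannot recover 'err'. */
void *setup_returns_null(int err)
{
	if (err)
		return NULL;
	return &object;
}

/* Error path returns an encoded error pointer. */
void *setup_returns_err_ptr(int err)
{
	if (err)
		return err_ptr(err);
	return &object;
}
```

With `-ENOMEM` as input, `ptr_err()` on the first variant's result is 0, so a caller that forwards `err_ptr(ptr_err(p))` ends up passing zero, which is the pattern smatch flags. The second variant preserves `-ENOMEM`.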

Fixes: d7f5a9d89a ("dpaa2-eth: defer probe on object allocate")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 16:10:24 -07:00
Stefan Roese
f7ba7dbf4f net: macb: Properly handle phylink on at91sam9x
I just recently noticed that ethernet does not work anymore since v5.5
on the GARDENA smart Gateway, which is based on the AT91SAM9G25.
Debugging showed that the "GEM bits" in the NCFGR register are now
unconditionally accessed, which is incorrect for the !macb_is_gem()
case.

This patch adds the macb_is_gem() checks back to the code
(in macb_mac_config() & macb_mac_link_up()), so that the GEM register
bits are not accessed in this case any more.

Fixes: 7897b071ac ("net: macb: convert to phylink")
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Reto Schneider <reto.schneider@husqvarnagroup.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 16:04:17 -07:00
Linus Torvalds
99ea1521a0 Remove uninitialized_var() macro for v5.9-rc1
Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull uninitialized_var() macro removal from Kees Cook:
 "This is long overdue, and has hidden too many bugs over the years. The
  series has several "by hand" fixes, and then a trivial treewide
  replacement.

   - Clean up non-trivial uses of uninitialized_var()

   - Update documentation and checkpatch for uninitialized_var() removal

   - Treewide removal of uninitialized_var()"

* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  compiler: Remove uninitialized_var() macro
  treewide: Remove uninitialized_var() usage
  checkpatch: Remove awareness of uninitialized_var() macro
  mm/debug_vm_pgtable: Remove uninitialized_var() usage
  f2fs: Eliminate usage of uninitialized_var() macro
  media: sur40: Remove uninitialized_var() usage
  KVM: PPC: Book3S PR: Remove uninitialized_var() usage
  clk: spear: Remove uninitialized_var() usage
  clk: st: Remove uninitialized_var() usage
  spi: davinci: Remove uninitialized_var() usage
  ide: Remove uninitialized_var() usage
  rtlwifi: rtl8192cu: Remove uninitialized_var() usage
  b43: Remove uninitialized_var() usage
  drbd: Remove uninitialized_var() usage
  x86/mm/numa: Remove uninitialized_var() usage
  docs: deprecated.rst: Add uninitialized_var()
2020-08-04 13:49:43 -07:00
Xin Long
bab9693a9a net: thunderx: use spin_lock_bh in nicvf_set_rx_mode_task()
A deadlock was triggered in the thunderx driver:

        CPU0                    CPU1
        ----                    ----
   [01] lock(&(&nic->rx_mode_wq_lock)->rlock);
                           [11] lock(&(&mc->mca_lock)->rlock);
                           [12] lock(&(&nic->rx_mode_wq_lock)->rlock);
   [02] <Interrupt> lock(&(&mc->mca_lock)->rlock);

The path for each is:

  [01] worker_thread() -> process_one_work() -> nicvf_set_rx_mode_task()
  [02] mld_ifc_timer_expire()
  [11] ipv6_add_dev() -> ipv6_dev_mc_inc() -> igmp6_group_added() ->
  [12] dev_mc_add() -> __dev_set_rx_mode() -> nicvf_set_rx_mode()

To fix it, bottom halves must be disabled at [01], so that the timer at
[02] cannot fire until rx_mode_wq_lock is released. So switch from
spin_lock() to spin_lock_bh().

Thanks to Paolo for helping with this.

v1->v2:
  - post to netdev.

Reported-by: Rafael P. <rparrazo@redhat.com>
Tested-by: Dean Nelson <dnelson@redhat.com>
Fixes: 469998c861 ("net: thunderx: prevent concurrent data re-writing by nicvf_set_rx_mode")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 13:03:36 -07:00
Joe Perches
93f4ddd64b via-velocity: Use more typical logging styles
Use netdev_<level> in place of VELOCITY_PRT.
Use pr_<level> in place of printk(KERN_<LEVEL>.

Miscellanea:

o Add pr_fmt to prefix pr_<level> output with "via-velocity: "
o Remove now unused functions and macros
o Realign some logging lines
o Remove devname where pr_<level> is also used

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 12:54:49 -07:00
Luo bin
c8c29ec3c5 hinic: add check for mailbox msg from VF
The PF should check that a cmd from a VF is supported and that its
content is valid before passing it to hw.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 12:17:06 -07:00
Luo bin
088c5f0d1a hinic: add generating mailbox random index support
Add support for generating a random mailbox ID for each VF, to ensure
that mailbox messages received by the PF are from the correct VF.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 12:17:06 -07:00
David S. Miller
da7955405f sfc: Fix build with CONFIG_RFS_ACCEL disabled.
drivers/net/ethernet/sfc/ef100_nic.c:835:3: error: 'const struct efx_nic_type' has no member named 'filter_rfs_expire_one'
     835 |  .filter_rfs_expire_one = efx_mcdi_filter_rfs_expire_one,
         |   ^~~~~~~~~~~~~~~~~~~~~
>> drivers/net/ethernet/sfc/ef100_nic.c:835:27: error: initialization of 'void (*)(struct efx_nic *, u32)' {aka 'void (*)(struct efx_nic *, unsigned int)'} from incompatible pointer type 'bool (*)(struct efx_nic *, u32,  unsigned int)' {aka '_Bool (*)(struct efx_nic *, unsigned int,  unsigned int)'} [-Werror=incompatible-pointer-types]
     835 |  .filter_rfs_expire_one = efx_mcdi_filter_rfs_expire_one,
         |                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:29:39 -07:00
David S. Miller
2e7199bd77 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-08-04

The following pull-request contains BPF updates for your *net-next* tree.

We've added 73 non-merge commits during the last 9 day(s) which contain
a total of 135 files changed, 4603 insertions(+), 1013 deletions(-).

The main changes are:

1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF
   syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko.

2) Add BPF iterator for map elements and to iterate all BPF programs for efficient
   in-kernel inspection, from Yonghong Song and Alexei Starovoitov.

3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid
   unwinder errors, from Song Liu.

4) Allow cgroup local storage map to be shared between programs on the same
   cgroup. Also extend BPF selftests with coverage, from YiFei Zhu.

5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM
   load instructions, from Jean-Philippe Brucker.

6) Follow-up fixes on BPF socket lookup in combination with reuseport group
   handling. Also add related BPF selftests, from Jakub Sitnicki.

7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for
   socket create/release as well as bind functions, from Stanislav Fomichev.

8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct
   xdp_statistics, from Peilin Ye.

9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime.

10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6}
    fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin.

11) Fix a bpftool segfault due to missing program type name and make it more robust
    to prevent them in future gaps, from Quentin Monnet.

12) Consolidate cgroup helper functions across selftests and fix a v6 localhost
    resolver issue, from John Fastabend.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:27:40 -07:00
David S. Miller
76769c38b4 mlx5-updates-2020-08-03
Merge tag 'mlx5-updates-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-08-03

This patchset introduces some updates to mlx5 driver.

1) Jakub converts mlx5 to use the new udp tunnel infrastructure.
   Starting with a hack to allow drivers to request a static configuration
   of the default vxlan port, and then a patch that converts mlx5.

2) Parav implements change_carrier ndo for VF eswitch representors,
   to speed up link state control of representor netdevices.

3) Alex Vesker makes a simple update to software steering to fix an issue
   with the push vlan action sequence.

4) Leon removes a redundant dump stack on error flow.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:24:30 -07:00
Edward Cree
d61592a112 sfc_ef100: add nic-type for VFs, and bind to them
We don't yet have a .sriov_configure() to create them, though.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
ef2c57b956 sfc_ef100: read pf_index at probe time
We'll need it later, for VF representors.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
43c3df0d56 sfc_ef100: functions for selftests
Self-tests for event and interrupt reception and NVRAM.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
b593b6f1b4 sfc_ef100: statistics gathering
MAC stats work much the same as on EF10, with a periodic DMA to a region
 specified via an MCDI.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
b780feac36 sfc_ef100: plumb in fini_dmaq
Bring down the TX and RX queues at ifdown, so that we can then fini the
 EVQs (otherwise the MC would return EBUSY because they're still in use).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
8e57daf706 sfc_ef100: RX path for EF100
Includes RSS spreading.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
a9dc3d5612 sfc_ef100: RX filter table management and related gubbins
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
d19a537218 sfc_ef100: TX path for EF100 NICs
Includes checksum offload and TSO, so declare those in our netdev features.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
adcfc3482f sfc_ef100: read Design Parameters at probe time
Several parts of the EF100 architecture are parameterised (to allow
 varying capabilities on FPGAs according to resource constraints), and
 these parameters are exposed to the driver through a TLV-encoded
 region of the BAR.
For the most part we either don't care about these values at all or
 just need to sanity-check them against the driver's assumptions, but
 there are a number of TSO limits which we record so that we will be
 able to check against them in the TX path when handling GSO skbs.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
4496363bec sfc_ef100: fail the probe if NIC uses unsol_ev credits
In the future, EF100 is planned to have a credit-based scheme for
 handling unsolicited events, which drivers will need to use in order
 to function correctly.  However, current EF100 hardware does not yet
 generate unsolicited events and the credit scheme has not yet been
 implemented in firmware.  To prevent compatibility problems later if
 the current driver is used with future firmware which does implement
 it, we check for the corresponding capability flag (which that
 future firmware will set), and if found, we refuse to probe.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
8e737145e8 sfc_ef100: check firmware version at start-of-day
Early in EF100 development there was a different format of event
 descriptor; if the NIC is somehow running the very old firmware
 which will use that format, fail the probe.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Jiafei Pan
215602a8d2 enetc: use napi_schedule to be compatible with PREEMPT_RT
The driver calls napi_schedule_irqoff() from a context where, in RT,
hardirqs are not disabled, since the IRQ handler is force-threaded.

In the call path of this function, __raise_softirq_irqoff() is modifying
its per-CPU mask of pending softirqs that must be processed, using
or_softirq_pending(). The or_softirq_pending() function is not atomic,
but since interrupts are supposed to be disabled, nobody should be
preempting it, and the operation should be safe.

Nonetheless, when running with hardirqs on, as in the PREEMPT_RT case,
it isn't safe, and the pending softirqs mask can get corrupted,
resulting in softirqs being lost and never processed.
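
The lost update described above can be replayed deterministically. This is a single-threaded sketch, not kernel code: `or_interleaved()` and `or_serialized()` are hypothetical names, and the second context (for example an interrupt) is simulated by placing its update between the task's load and store.

```c
#include <assert.h>

/*
 * Non-atomic OR, with a second context running between the load and
 * the store. The second context's bit is overwritten and lost.
 */
unsigned int or_interleaved(unsigned int pending, unsigned int task_bit,
			    unsigned int irq_bit)
{
	unsigned int tmp;

	tmp = pending;			/* task: load */
	pending |= irq_bit;		/* other context: full read-modify-write */
	pending = tmp | task_bit;	/* task: store stale value plus its bit */
	return pending;
}

/*
 * Same two updates, but the task's read-modify-write completes before
 * the other context runs, as when interrupts are really disabled.
 */
unsigned int or_serialized(unsigned int pending, unsigned int task_bit,
			   unsigned int irq_bit)
{
	pending |= task_bit;
	pending |= irq_bit;
	return pending;
}

/* Self-check with two distinct bits. */
void or_demo(void)
{
	assert(or_interleaved(0, 0x1, 0x8) == 0x1);	/* 0x8 was lost */
	assert(or_serialized(0, 0x1, 0x8) == 0x9);	/* both bits kept */
}
```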

To have common code that works with PREEMPT_RT and with mainline Linux,
we can use plain napi_schedule() instead. The difference is that
napi_schedule() (via __napi_schedule) also calls local_irq_save, which
disables hardirqs if they aren't already. But, since they already are
disabled in non-RT, this means that in practice we don't see any
measurable difference in throughput or latency with this patch.

Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:21:30 -07:00
Jiafei Pan
6c33ae1ad5 dpaa2-eth: use napi_schedule to be compatible with PREEMPT_RT
The driver calls napi_schedule_irqoff() from a context where, in RT,
hardirqs are not disabled, since the IRQ handler is force-threaded.

In the call path of this function, __raise_softirq_irqoff() is modifying
its per-CPU mask of pending softirqs that must be processed, using
or_softirq_pending(). The or_softirq_pending() function is not atomic,
but since interrupts are supposed to be disabled, nobody should be
preempting it, and the operation should be safe.

Nonetheless, when running with hardirqs on, as in the PREEMPT_RT case,
it isn't safe, and the pending softirqs mask can get corrupted,
resulting in softirqs being lost and never processed.

To have common code that works with PREEMPT_RT and with mainline Linux,
we can use plain napi_schedule() instead. The difference is that
napi_schedule() (via __napi_schedule) also calls local_irq_save, which
disables hardirqs if they aren't already. But, since they already are
disabled in non-RT, this means that in practice we don't see any
measurable difference in throughput or latency with this patch.

Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:21:30 -07:00
Rahul Lakkireddy
59b328cf56 cxgb4: add TC-MATCHALL IPv6 support
Matching IPv6 traffic requires allocating separate slots
in TCAM. So, fetch additional slots to insert IPv6 rules. Also, fetch
the cumulative stats of all the slots occupied by the Matchall rule.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:17:08 -07:00
Petr Machata
54a9238589 mlxsw: spectrum_qdisc: Offload action trap for qevents
When offloading action trap on a qevent, pass to_dev of NULL to the SPAN
module to trigger the mirror to the CPU port. Query the buffer drops
policer and use it for policing of the trapped traffic.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
6687e953f4 mlxsw: spectrum_trap: Add early_drop trap
As previously explained, packets that are dropped due to buffer related
reasons (e.g., tail drop, early drop) can be mirrored to the CPU port.
These packets are then trapped with one of the "mirror session" traps
and their CQE includes the reason for which the packet was mirrored.

Register a new trap, early_drop, with devlink and initialize the
corresponding Rx listener with the appropriate mirror reason. Return an
error if the user tries to change the trap's action, as this is not
supported.

Since Spectrum-1 does not support these traps, the above is only done
for Spectrum-2 onwards.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
869c7be940 mlxsw: spectrum_trap: Allow for per-ASIC traps initialization
Subsequent patches will need to register different traps for Spectrum-1
and Spectrum-2 onwards.

Enable that by invoking a per-ASIC operation during traps
initialization.

Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
36d1fd687d mlxsw: spectrum_trap: Allow for per-ASIC trap groups initialization
Subsequent patches will need to register different trap groups for
Spectrum-1 and Spectrum-2 onwards.

Enable that by invoking a per-ASIC operation during trap groups
initialization.

Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Petr Machata
928345c08b mlxsw: spectrum_span: On policer_id_base_ref_count, use dec_and_test
When unsetting policer base, the SPAN code currently uses refcount_dec().
However that function splats when the counter reaches zero, because
reaching zero without actually testing is in general indicative of a
missing cleanup. There is no cleanup to be done here, but nonetheless, use
refcount_dec_and_test() as required.
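
The difference between the two calls can be sketched with a toy counter. `model_refcount` and its helpers are hypothetical and only imitate the behaviour described above; the `warnings` field stands in for the kernel's splat.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reference counter; 'warnings' counts simulated splats. */
struct model_refcount {
	unsigned int count;
	unsigned int warnings;
};

/* Model of refcount_dec(): reaching zero here is treated as suspicious. */
void model_refcount_dec(struct model_refcount *r)
{
	assert(r->count > 0);
	if (--r->count == 0)
		r->warnings++;
}

/*
 * Model of refcount_dec_and_test(): reaching zero is expected and is
 * reported to the caller, who may then clean up (or do nothing).
 */
bool model_refcount_dec_and_test(struct model_refcount *r)
{
	assert(r->count > 0);
	return --r->count == 0;
}
```

Dropping the last reference with `model_refcount_dec()` bumps `warnings`, while `model_refcount_dec_and_test()` simply returns true, even if the caller has no cleanup to perform.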

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
76ba292cc7 mlxsw: spectrum_trap: Use 'size_t' for array sizes
Use 'size_t' instead of 'u64' for array sizes, as this is the correct
type to use for expressions involving sizeof().

Suggested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
c88e11e047 devlink: Pass extack when setting trap's action and group's parameters
A later patch will refuse to set the action of certain traps in mlxsw
and also to change the policer binding of certain groups. Pass extack so
that failure could be communicated clearly to user space.

Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Jisheng Zhang
01f4d47a5b net: stmmac: fix failed to suspend if phy based WOL is enabled
With the latest net-next tree, testing suspend/resume after enabling
WOL fails with the error below:

[  487.086365] dpm_run_callback(): mdio_bus_suspend+0x0/0x30 returns -16
[  487.086375] PM: Device stmmac-0:00 failed to suspend: error -16

-16 is -EBUSY. This happens because I did not enable wakeup on the
correct device when implementing the PHY-based WOL feature. To be
honest, I caught the issue while implementing PHY-based WOL and fixed it
locally, but forgot to amend the PHY-based WOL patch. Today, I found the
issue again while testing the net-next tree.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 17:59:39 -07:00
Florinel Iordache
3207f715c3 fsl/fman: fix eth hash table allocation
Fix memory allocation for the ethernet address hash table.
The code wrongly allocated an array for the eth hash table, but the
object being allocated is the main structure of the eth hash table
(struct eth_hash_t), which itself contains a number of elements.

Fixes: 57ba4c9b56 ("fsl/fman: Add FMan MAC support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:20:15 -07:00
Florinel Iordache
cc5d229a12 fsl/fman: check dereferencing null pointer
Add a safety check to avoid dereferencing a null pointer.

Fixes: 57ba4c9b56 ("fsl/fman: Add FMan MAC support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:20:15 -07:00
Florinel Iordache
cc79fd8f55 fsl/fman: fix unreachable code
The parameter 'priority' is incorrectly forced to zero which ultimately
induces logically dead code in the subsequent lines.

Fixes: 57ba4c9b56 ("fsl/fman: Add FMan MAC support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:20:14 -07:00
Florinel Iordache
0572054617 fsl/fman: fix dereference null return value
Check before using returned value to avoid dereferencing null pointer.

Fixes: 18a6c85fcc ("fsl/fman: Add FMan Port Support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:20:14 -07:00
Florinel Iordache
99f47abd9f fsl/fman: use 32-bit unsigned integer
Potentially overflowing expression (ts_freq << 16 and intgr << 16)
declared as type u32 (32-bit unsigned) is evaluated using 32-bit
arithmetic and then used in a context that expects an expression of
type u64 (64-bit unsigned) which ultimately is used as 16-bit
unsigned by typecasting to u16. Fixed by using an unsigned 32-bit
integer since the value is truncated anyway in the end.
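
A small sketch of the arithmetic issue, assuming a platform where `int` is 32 bits wide. The function names are hypothetical and the input value is arbitrary; this is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The shift is evaluated in 32-bit arithmetic, so bits shifted past
 * bit 31 are discarded before the result is widened to 64 bits.
 */
uint64_t shift_in_32_bits(uint32_t v)
{
	return v << 16;
}

/* Widening first keeps every bit of the shifted value. */
uint64_t shift_in_64_bits(uint32_t v)
{
	return (uint64_t)v << 16;
}

/* Self-check with a value whose upper half is non-zero. */
void shift_demo(void)
{
	uint32_t v = 0x00012345;

	assert(shift_in_32_bits(v) == 0x23450000u);
	assert(shift_in_64_bits(v) == 0x123450000ull);
}
```

The two results differ only above bit 31, which is why a later truncation to 16 bits makes the wider type unnecessary.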

Fixes: 414fd46e77 ("fsl/fman: Add FMan support")
Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:20:14 -07:00
Christophe JAILLET
c23cf402d0 net: spider_net: Remove a useless memset
Avoid a memset after a call to 'dma_alloc_coherent()'.
This is useless since
commit 518a2f1925 ("dma-mapping: zero memory returned from dma_alloc_*")

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:01:11 -07:00
Christophe JAILLET
36f28f7687 net: spider_net: Fix the size used in a 'dma_free_coherent()' call
Update the size used in 'dma_free_coherent()' in order to match the one
used in the corresponding 'dma_alloc_coherent()', in
'spider_net_init_chain()'.

Fixes: d4ed8f8d1f ("Spidernet DMA coalescing")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:01:11 -07:00
Christophe JAILLET
edab74e9cb net: sgi: ioc3-eth: Fix the size used in some 'dma_free_coherent()' calls
Update the size used in 'dma_free_coherent()' in order to match the one
used in the corresponding 'dma_alloc_coherent()'.

Fixes: 369a782af0 ("net: sgi: ioc3-eth: ensure tx ring is 16k aligned.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:00:26 -07:00
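Note on 36f28f7687 and edab74e9cb: both fixes make the size passed at free time match the size used at allocation time. A minimal user-space sketch of one way to keep the two in step, assuming a hypothetical ring structure that records its allocation size (calloc/free stand in for dma_alloc_coherent()/dma_free_coherent(), which cannot be called outside the kernel):

```c
#include <stdlib.h>

/* Hypothetical descriptor ring that remembers how much was allocated. */
struct ring {
	void *mem;
	size_t alloc_size;
};

/* Allocate n_desc descriptors and record the exact size used. */
int ring_alloc(struct ring *r, size_t n_desc, size_t desc_size)
{
	r->alloc_size = n_desc * desc_size;
	r->mem = calloc(1, r->alloc_size);
	return r->mem ? 0 : -1;
}

/* Free using the recorded size instead of recomputing it.
 * Returns the size that a size-taking free routine would receive. */
size_t ring_free(struct ring *r)
{
	size_t size = r->alloc_size;

	free(r->mem);
	r->mem = NULL;
	r->alloc_size = 0;
	return size;
}
```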
Tianjia Zhang
aa027850a2 liquidio: Fix wrong return value in cn23xx_get_pf_num()
On an error exit path, a negative error code should be returned
instead of a positive return value.

Fixes: 0c45d7fe12 ("liquidio: fix use of pf in pass-through mode in a virtual machine")
Cc: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:57:31 -07:00
Tianjia Zhang
bace287c55 net/enetc: Fix wrong return value in enetc_psfp_parse_clsflower()
In the case of an invalid rule, the positive value EINVAL is returned here.
This looks like a typo: a negative error code should be returned instead.

Cc: Po Liu <Po.Liu@nxp.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:57:07 -07:00
Tianjia Zhang
0470a48880 net: ethernet: aquantia: Fix wrong return value
In function hw_atl_a0_hw_multicast_list_set(), when an invalid
request is encountered, a negative error code should be returned.

Fixes: bab6de8fd1 ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions")
Cc: David VomLehn <vomlehn@texas.net>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:57:02 -07:00
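Note on the three "Fix wrong return value" commits above: kernel callers usually test for failure with "ret < 0", so a positive errno value slips through as success. The sketch below uses hypothetical function names and is not taken from any of the three drivers.

```c
#include <errno.h>

/* Hypothetical lookup: a negative errno on failure, a non-negative
 * value on success. */
int get_pf_num_sketch(int raw)
{
	if (raw < 0)
		return -EINVAL;	/* returning plain EINVAL would look like success */
	return raw & 0x7;
}

/* Typical caller check that a positive error code would defeat. */
int caller_saw_error(int raw)
{
	int ret = get_pf_num_sketch(raw);

	return ret < 0;
}
```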
David S. Miller
ac6d1835ca Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2020-08-01

This series contains updates to the ice driver only.

Wei Yongjun marks power management functions with __maybe_unused.

Nick disables VLAN pruning in promiscuous mode and renames grst_delay to
grst_timeout.

Kiran modifies the check for linearization and corrects the vsi_id mask
value.

Vignesh replaces flow profile locks with RSS profile locks for RSS
rule removal, destroys the flow profile lock when clearing the XLT table,
and clears extraction sequence entries.

Jesse adds some statistics and removes an unreported one.

Brett allows for 2 queue configuration for VFs.

Surabhi adds a check for failed allocation of an extraction sequence
table.

Tony updates the PTYPE lookup table and makes other trivial fixes.

Victor extends profile ID locks to be held until all references are
completed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:43:59 -07:00
Miaohe Lin
8340303670 net: qed: use eth_zero_addr() to clear mac address
Use eth_zero_addr() to clear mac address instead of memset().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:38:31 -07:00
Miaohe Lin
7ad9c26f75 net: qede: use eth_zero_addr() to clear mac address
Use eth_zero_addr() to clear mac address instead of memset().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:38:31 -07:00
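Note on 8340303670 and 7ad9c26f75: eth_zero_addr() is a kernel helper, so the sketch below uses a hypothetical user-space stand-in to show what the conversion amounts to: a named helper that clears exactly six bytes instead of an open-coded memset() at every call site.

```c
#include <string.h>

#define MAC_LEN 6	/* same value as the kernel's ETH_ALEN */

/* Hypothetical stand-in for eth_zero_addr(): clear a 6-byte MAC address. */
void zero_mac(unsigned char *addr)
{
	memset(addr, 0, MAC_LEN);
}

/* Returns 1 if every byte of the address is zero, 0 otherwise. */
int mac_is_zero(const unsigned char *addr)
{
	int i;

	for (i = 0; i < MAC_LEN; i++)
		if (addr[i])
			return 0;
	return 1;
}
```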
Rahul Lakkireddy
29b3705fac cxgb4: fix extracting IP addresses in TC-FLOWER rules
commit c8729cac2a ("cxgb4: add ethtool n-tuple filter insertion")
has removed checking control key for determining IP address types
for TC-FLOWER rules, which causes all the rules being inserted to
hardware to become IPv6 rule type always. So, add back the check
to select the correct IP address type to extract and hence fix the
correct rule type being inserted to hardware.

Also, ethtool_rx_flow_key doesn't have any control key and instead
directly sets the IPv4/IPv6 address keys. So, explicitly set the
IP address type for ethtool n-tuple filters to reuse the same code.

Fixes: c8729cac2a ("cxgb4: add ethtool n-tuple filter insertion")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:33:12 -07:00
Rahul Lakkireddy
fd4ec07631 cxgb4: fix check for running offline ethtool selftest
The flag indicating the selftest to run is a bitmask. So, fix the
check. Also, the selftests will fail if adapter initialization has
not been completed yet. So, add appropriate check and bail sooner.

Fixes: 7235ffae3d ("cxgb4: add loopback ethtool self-test")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:32:52 -07:00
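Note on fd4ec07631: when a flags word is a bitmask, comparing it for equality misses the case where other bits are set as well. A minimal sketch, assuming local flag names modeled on the ethtool test flags (the macros and functions here are hypothetical, not the driver's):

```c
/* Local stand-ins for two self-test flag bits. */
#define TEST_FL_OFFLINE		(1u << 0)
#define TEST_FL_EXTERNAL_LB	(1u << 2)

/* Equality check: wrong as soon as any other flag is also set. */
int offline_requested_eq(unsigned int flags)
{
	return flags == TEST_FL_OFFLINE;
}

/* Bitmask check: tests only the bit of interest. */
int offline_requested_mask(unsigned int flags)
{
	return (flags & TEST_FL_OFFLINE) != 0;
}
```

With both bits set, the equality form reports that no offline test was requested, while the mask form reports it correctly.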
Shannon Nelson
fe8c30b508 ionic: separate interrupt for Tx and Rx
Add the capability to split the Tx queues onto their own
interrupts with their own napi contexts.  This gives the
opportunity for more direct control of Tx interrupt
handling, such as CPU affinity and interrupt coalescing,
useful for some traffic loads.

v2: use ethtool -L, not a vendor specific priv-flag
v3: simplify logging, drop unnecessary "no-change" tests

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:32:02 -07:00
Shannon Nelson
b14e4e95f9 ionic: tx separate servicing
We give the tx clean path its own budget and service routine in
order to give a little more leeway to be more aggressive, and
in preparation for coming changes.  We've found this gives us
a little better performance in some packet processing scenarios
without hurting other scenarios.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:32:02 -07:00
Shannon Nelson
155f15ad67 ionic: use fewer firmware doorbells on rx fill
We really don't need to hit the Rx queue doorbell so many times,
we can wait to the end and cause a little less thrash.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:32:02 -07:00
Lorenzo Bianconi
d6526926de net: mvpp2: fix memory leak in mvpp2_rx
Release skb memory in mvpp2_rx() if the mvpp2_rx_refill() routine fails.

Fixes: b501585467 ("net: mvpp2: fix refilling BM pools in RX path")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:14:46 -07:00
Leon Romanovsky
6c4e9bcfb4 net/mlx5: Delete extra dump stack that gives nothing
The WARN_*() macros are intended to catch situations that are impossible
from the SW point of view. They give little when the HW<->SW interface
is out of sync.

Such an out-of-sync scenario can be caused by SW errors that are not part
of this flow, or by HW errors, where a stack dump won't help
either.

This specific WARN_ON() is useless because mlx5_core code is prepared
to handle such situations and will unfold everything correctly while
providing enough information to the users to understand why FS is not
working.

WARNING: CPU: 0 PID: 3222 at drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:825 connect_fts_in_prio.isra.20+0x1dd/0x260 linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:825
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 3222 Comm: syz-executor861 Not tainted 5.5.0-rc6+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack linux/lib/dump_stack.c:77 [inline]
 dump_stack+0x94/0xce linux/lib/dump_stack.c:118
 panic+0x234/0x56f linux/kernel/panic.c:221
 __warn+0x1cc/0x1e1 linux/kernel/panic.c:582
 report_bug+0x200/0x310 linux/lib/bug.c:195
 fixup_bug.part.11+0x32/0x80 linux/arch/x86/kernel/traps.c:174
 fixup_bug linux/arch/x86/kernel/traps.c:273 [inline]
 do_error_trap+0xd3/0x100 linux/arch/x86/kernel/traps.c:267
 do_invalid_op+0x31/0x40 linux/arch/x86/kernel/traps.c:286
 invalid_op+0x1e/0x30 linux/arch/x86/entry/entry_64.S:1027
RIP: 0010:connect_fts_in_prio.isra.20+0x1dd/0x260
linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:825
Code: 00 00 48 c7 c2 60 8c 31 84 48 c7 c6 00 81 31 84 48 8b 38 e8 3c a8
cb ff 41 83 fd 01 8b 04 24 0f 8e 29 ff ff ff e8 83 7b bc fe <0f> 0b 8b
04 24 e9 1a ff ff ff 89 04 24 e8 c1 20 e0 fe 8b 04 24 eb
RSP: 0018:ffffc90004bb7858 EFLAGS: 00010293
RAX: ffff88805de98e80 RBX: 0000000000000c96 RCX: ffffffff827a853d
RDX: 0000000000000000 RSI: 0000000000000000 RDI: fffff52000976efa
RBP: 0000000000000007 R08: ffffed100da060e3 R09: ffffed100da060e3
R10: 0000000000000001 R11: ffffed100da060e2 R12: dffffc0000000000
R13: 0000000000000002 R14: ffff8880683a1a10 R15: ffffed100d07bc1c
 connect_prev_fts linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:844 [inline]
 connect_flow_table linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:975 [inline]
 __mlx5_create_flow_table+0x8f8/0x1710 linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:1064
 mlx5_create_flow_table linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:1094 [inline]
 mlx5_create_auto_grouped_flow_table+0xe1/0x210 linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:1136
 _get_prio linux/drivers/infiniband/hw/mlx5/main.c:3286 [inline]
 get_flow_table+0x2ea/0x760 linux/drivers/infiniband/hw/mlx5/main.c:3376
 mlx5_ib_create_flow+0x331/0x11c0 linux/drivers/infiniband/hw/mlx5/main.c:3896
 ib_uverbs_ex_create_flow+0x13e8/0x1b40 linux/drivers/infiniband/core/uverbs_cmd.c:3311
 ib_uverbs_write+0xaa5/0xdf0 linux/drivers/infiniband/core/uverbs_main.c:769
 __vfs_write+0x7c/0x100 linux/fs/read_write.c:494
 vfs_write+0x168/0x4a0 linux/fs/read_write.c:558
 ksys_write+0xc8/0x200 linux/fs/read_write.c:611
 do_syscall_64+0x9c/0x390 linux/arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45a059
Code: 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fcc17564c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fcc17564ca0 RCX: 000000000045a059
RDX: 0000000000000030 RSI: 00000000200003c0 RDI: 0000000000000005
RBP: 0000000000000007 R08: 0000000000000002 R09: 0000000000003131
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e636c
R13: 0000000000000000 R14: 00000000006e6360 R15: 00007ffdcbdaf6a0
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 1 seconds..

Fixes: f90edfd279 ("net/mlx5_core: Connect flow tables")
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-08-03 10:13:58 -07:00
Jakub Kicinski
18a2b7f969 net/mlx5: convert to new udp_tunnel infrastructure
Allocate nic_info dynamically - n_entries is not constant.

Attach the tunnel offload info only to the uplink representor.
We expect the "main" netdev to be unregistered in switchdev
mode, and there to be only one uplink representor.

Drop the udp_tunnel_drop_rx_info() call, it was not there until
commit b3c2ed21c0 ("net/mlx5e: Fix VXLAN configuration restore after function reload")
so the device doesn't need it, and core should handle reloads and
reset just fine.

v2:
 - don't drop the ndos on reprs, and register info on uplink repr.
v4:
 - Move netdev tunnel structure handling to en_main.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-08-03 10:13:56 -07:00
Alex Vesker
b206490940 net/mlx5: DR, Change push vlan action sequence
The DR TX state machine supports the following order:
modify header, push vlan and encapsulation.
Instead fs_dr would pass:
push vlan, modify header and encapsulation.

The above caused rule creation to fail with an "invalid action
sequence provided" error.

Fixes: 6a48faeeca ("net/mlx5: Add direct rule fs_cmd implementation")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-08-03 10:13:51 -07:00
Parav Pandit
45d252ca80 net/mlx5e: Enable users to change VF/PF representors carrier state
Currently PF and VF representor netdevice carrier is always controlled
by controlling the representor netdevice device state as up/down.

Representor netdevice state change undergoes one or more txq/rxq
destroy/create commands to firmware, skb and its rx buffer allocation,
health reporters creation and more.

Due to this limitation users do not have the ability to just change
the carrier of the non uplink representors without modifying the
device state.

In one use case, when the eswitch physical port carrier goes down or up,
the user needs to update the VF link state to match the physical port
carrier.

Example of updating VF representor carrier state:
$ ip link set enp0s8f0npf0vf0 carrier off
$ ip link set enp0s8f0npf0vf0 carrier on

This enhancement results in a VF link state change, which is
represented by the VF representor netdevice carrier.

This enables users to modify the representor carrier without modifying
the representor netdevice state.

A simple test is run using [1] to calculate the time difference between
updating carrier vs updating device state (to update just the carrier)
with one VF to simulate 255 VFs.

Time taken to update the carrier using device up/down:
$ time ./calculate.sh dev enp0s8f0npf0vf0
real    0m30.913s
user    0m0.200s
sys     0m11.168s

Time taken to update just the carrier using carrier iproute2 command:
$ time ./calculate.sh carrier enp0s8f0npf0vf0
real    0m2.142s
user    0m0.160s
sys     0m2.021s

The test shows that it is better to use the carrier on/off user interface
to notify the VF of a link up/down event than the device up/down interface,
because the carrier user interface delivers the same event about 14 times
faster (30.913s versus 2.142s).

[1] https://github.com/paravmellanox/myscripts/blob/master/calculate_carrier_time.sh

Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-08-03 10:13:49 -07:00
David S. Miller
bd0b33b248 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Resolved kernel/bpf/btf.c using instructions from merge commit
69138b34a7

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-02 01:02:12 -07:00
Tony Nguyen
7dbc63f0a5 ice: Misc minor fixes
This is a collection of minor fixes including typos, white space, and
style. No functional changes.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2020-08-01 08:44:04 -07:00
Victor Raj
6a2c2b2c1b ice: adjust profile ID map locks
The profile ID map lock should be held until the caller has finished
all references to the profile entries.

The current code releases the lock right after the match search.
This caused a driver issue when profile map entries were
referenced after they had been freed in another thread once the lock
was released.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:44:04 -07:00
Tony Nguyen
eddbee9b94 ice: update PTYPE lookup table
Update the PTYPE lookup table to reflect values that can be set by the
hardware.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2020-08-01 08:44:04 -07:00
Nick Nunley
68d210a609 ice: Disable VLAN pruning in promiscuous mode
Disable VLAN pruning when entering promiscuous mode, and re-enable it
when exiting.

Without this, VLAN-over-bridge topologies created on the device won't be
functional unless rx-vlan-filter is explicitly disabled with ethtool.

Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:44:04 -07:00
Surabhi Boob
bcc46cb8a0 ice: Graceful error handling in HW table calloc failure
In ice_init_hw_tbls(), if the devm_kcalloc() for es->written fails, catch
that error and bail out gracefully instead of continuing with a NULL
pointer.

Fixes: 32d63fa1e9 ("ice: Initialize DDP package structures")
Signed-off-by: Surabhi Boob <surabhi.boob@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:44:04 -07:00
Kiran Patil
0a37abfa01 ice: port fix for chk_linearlize
This is a port of commit 248de22e63 ("i40e/i40evf: Account for frags
split over multiple descriptors in check linearize")

While testing read/write workloads with a larger IO size (128K),
tx_timeout was observed, and whenever it happened it was due to
tx_linearize.

Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:44:04 -07:00
Brett Creeley
f34f55557a ice: Allow 2 queue pairs per VF on SR-IOV initialization
Currently VFs are only allowed to get 16, 4, and 1 queue pair by
default, which require 17, 5, and 2 MSI-X vectors respectively. This
is because each VF needs a MSI-X per data queue and a MSI-X for its
other interrupt. The calculation is based on the number of VFs created,
MSI-X available, and queue pairs available at the time of VF creation.

Unfortunately the values above exclude 2 queue pairs when only 3 MSI-X
are available to each VF based on resource constraints. The current
calculation would default to 2 MSI-X and 1 queue pair. This is a waste
of resources, so fix this by allowing 2 queue pairs per VF when there
are between 2 and 5 MSI-X available per VF.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:44:04 -07:00
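Note on f34f55557a: the message says each VF needs one MSI-X per queue pair plus one for its other interrupt. A minimal sketch of that selection rule follows; the function name and the table are hypothetical, and the real driver also accounts for the number of VFs and the queue pairs available, which this sketch ignores.

```c
/* Pick the largest allowed queue-pair count that fits the MSI-X budget.
 * Each queue pair needs one vector, plus one for the other interrupt. */
int queue_pairs_for_msix(int msix_per_vf)
{
	/* Allowed defaults, largest first; 2 is the value the commit adds. */
	static const int candidates[] = { 16, 4, 2, 1 };
	unsigned int i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++)
		if (candidates[i] + 1 <= msix_per_vf)
			return candidates[i];
	return 0;	/* not enough vectors for even one queue pair */
}
```

Without the entry for 2, a budget of 3 vectors would fall through to 1 queue pair and leave a vector unused, which is the waste the commit describes.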
Vignesh Sridhar
ec1d1d2302 ice: Clear and free XLT entries on reset
This fix has been added to address memory leak issues resulting from
triggering a sudden driver reset which does not allow us to follow our
normal removal flows for SW XLT entries for advanced features.

- Adding call to destroy flow profile locks when clearing SW XLT tables.

- Extraction sequence entries were not correctly cleared previously
which could cause ownership conflicts for repeated reset-replay calls.

Fixes: 31ad4e4ee1 ("ice: Allocate flow profile")
Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:44:04 -07:00
Jesse Brandeburg
a8fffd7ae9 ice: add useful statistics
Display and count some useful hot-path statistics. The usefulness is as
follows:

- tx_restart: use to determine if the transmit ring size is too small or
  if the transmit interrupt rate is too low.
- rx_gro_dropped: use to count drops from GRO layer, which previously were
  completely uncounted when occurring.
- tx_busy: use to determine when the driver is miscounting number of
  descriptors needed for an skb.
- tx_timeout: as in our other drivers, count the number of times we've reset
  due to timeout because the kernel only prints a warning once per netdev.

Several of these were already counted but not displayed.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:44:04 -07:00
Jesse Brandeburg
a4c493fea5 ice: remove page_reuse statistic
The page reuse statistic wasn't even being displayed to the user, even
though the driver counted it. Don't waste the struct space and hot-path
cycles since the driver doesn't display it.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:43:59 -07:00
Vignesh Sridhar
cdedbab92d ice: Fix RSS profile locks
Replace flow profile locks with RSS profile locks in the function that
removes all RSS rules for a given VSI. This aligns the locks used
for RSS rule addition to a VSI and for removal during VSI teardown, and
avoids a race condition across repeated iterations of these operations.
In the function that gets RSS rules for a given VSI and protocol header,
replace the pointer reference to the RSS entry with a copy of the hash
value to ensure thread safety.

Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:22:30 -07:00
Kiran Patil
f07d134d37 ice: fix the vsi_id mask to be 10 bit for set_rss_lut
set_rss_lut can fail due to an incorrect vsi_id mask. vsi_id is 10 bits
wide, but the mask was 0x1FF whereas it should be 0x3FF.

For vsi_num >= 512, FW set_rss_lut can fail with return code
EACCESS (VSI ownership issue) because software was providing an
incorrect vsi_num (dropping the 10th bit due to the incorrect mask) for
the set_rss_lut admin command.

Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:20:10 -07:00
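Note on f07d134d37: the effect of the two masks can be checked directly. The macro and function names below are hypothetical; only the mask values 0x1FF and 0x3FF come from the commit message.

```c
#include <stdint.h>

#define VSI_MASK_9BIT	0x1FFu	/* old mask: keeps bits 0..8 only */
#define VSI_MASK_10BIT	0x3FFu	/* fixed mask: keeps bits 0..9 */

/* Apply a mask to a VSI number, as done when building the command. */
uint16_t mask_vsi(uint16_t vsi_num, uint16_t mask)
{
	return (uint16_t)(vsi_num & mask);
}
```

With the 9-bit mask, VSI number 512 becomes 0, so the firmware is asked about a VSI the caller does not own.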
Nick Nunley
585cdabdfd ice: rename misleading grst_delay variable
The grst_delay variable in ice_check_reset contains the maximum time
(in 100 msec units) that the driver will wait for a reset event to
transition to the Device Active state. The value is the sum of three
separate components:
1) The maximum time it may take for the firmware to process its
outstanding command before handling the reset request.
2) The value in RSTCTL.GRSTDEL (the delay firmware inserts between first
seeing the driver reset request and the actual hardware assertion).
3) The maximum expected reset processing time in hardware.

Referring to this total time as "grst_delay" is misleading and
potentially confusing to someone checking the code and cross-referencing
the hardware specification.

Fix this by renaming the variable to "grst_timeout", which is more
descriptive of its actual use.

Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:17:40 -07:00
Wei Yongjun
65c72291f7 ice: mark PM functions as __maybe_unused
In certain configurations without power management support, the
following warnings happen:

drivers/net/ethernet/intel/ice/ice_main.c:4214:12: warning:
 'ice_resume' defined but not used [-Wunused-function]
 4214 | static int ice_resume(struct device *dev)
      |            ^~~~~~~~~~
drivers/net/ethernet/intel/ice/ice_main.c:4150:12: warning:
 'ice_suspend' defined but not used [-Wunused-function]
 4150 | static int ice_suspend(struct device *dev)
      |            ^~~~~~~~~~~

Mark these functions as __maybe_unused to make it clear to the
compiler that this is going to happen based on the configuration,
which is the standard for these types of functions.

Fixes: 769c500dcc ("ice: Add advanced power mgmt for WoL")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-08-01 08:15:56 -07:00
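Note on 65c72291f7: in the kernel, __maybe_unused expands to the compiler's "unused" attribute. The user-space sketch below assumes GCC or Clang and uses hypothetical names (MAYBE_UNUSED, HAVE_PM_SKETCH, resume_sketch) to show how the attribute silences -Wunused-function when a configuration option compiles out the only caller.

```c
/* Local equivalent of the kernel's __maybe_unused (GCC/Clang attribute). */
#define MAYBE_UNUSED __attribute__((__unused__))

/* Only referenced when HAVE_PM_SKETCH is defined; the attribute keeps
 * the compiler quiet in builds where it is not. */
static int MAYBE_UNUSED resume_sketch(void)
{
	return 0;
}

/* Returns 1 when the power-management path is compiled in and works. */
int pm_supported(void)
{
#ifdef HAVE_PM_SKETCH
	return resume_sketch() == 0;
#else
	return 0;
#endif
}
```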
David S. Miller
e535d87d8b mlx5-fixes-2020-07-30
Merge tag 'mlx5-fixes-2020-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2020-07-30

This small patchset introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.18:
 ('net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq')

For -stable v5.7:
 ('net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 17:05:54 -07:00
David S. Miller
c6886957d2 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2020-07-30

This series contains updates to e100, e1000, e1000e, igb, igbvf, ixgbe,
ixgbevf, iavf, and driver documentation.

Vaibhav Gupta converts legacy .suspend() and .resume() to generic PM
callbacks for e100, igbvf, ixgbe, ixgbevf, and iavf.

Suraj Upadhyay replaces 1 byte memsets with assignments for e1000,
e1000e, igb, and ixgbe.

Alexander Klimov replaces http links with https.

Miaohe Lin replaces uses of memset to clear MAC addresses with
eth_zero_addr().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:59:13 -07:00
David S. Miller
dc096288d5 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2020-07-30

This series contains updates to the e1000e and igb drivers.

Aaron Ma allows PHY initialization to continue if ULP disable failed for
e1000e.

Francesco Ruggeri fixes race conditions in igb reset that could cause panics.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:51:58 -07:00
Andy Shevchenko
26b4b2d99c qede: Use %pM format specifier for MAC addresses
Convert to %pM instead of using custom code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:48:23 -07:00
Andy Shevchenko
b03c3bacf5 qed: Use %pM format specifier for MAC addresses
Convert to %pM instead of using custom code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:47:47 -07:00
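Note on 26b4b2d99c and b03c3bacf5: %pM is a kernel printk extension and is not available in user-space printf. The sketch below is a hypothetical user-space approximation of the kind of hand-written formatting that %pM replaces, producing six colon-separated lowercase hex bytes.

```c
#include <stdio.h>

/* Format a 6-byte MAC address as "xx:xx:xx:xx:xx:xx".
 * Returns the number of characters snprintf() would have written. */
int format_mac(char *out, size_t len, const unsigned char mac[6])
{
	return snprintf(out, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

In kernel code the whole helper collapses to passing the address pointer with %pM in the format string.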
Xin Xiong
e692139e6a net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq
The function invokes bpf_prog_inc(), which increases the reference
count of a bpf_prog object "rq->xdp_prog" if the object isn't NULL.

The refcount leak issues take place in two error handling paths. When
either mlx5_wq_ll_create() or mlx5_wq_cyc_create() fails, the function
simply returns the error code and forgets to drop the reference count
increased earlier, causing a reference count leak of "rq->xdp_prog".

Fix this issue by jumping to the error handling path err_rq_wq_destroy
when either function fails.

Fixes: 422d4c401e ("net/mlx5e: RX, Split WQ objects for different RQ types")
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-30 18:53:55 -07:00
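Note on e692139e6a: the leak pattern is a reference taken early and an error path that returns without dropping it. A minimal user-space sketch, with a hypothetical refcounted struct standing in for the bpf_prog and a hypothetical create_wq_sketch() standing in for the work queue creation calls:

```c
/* Hypothetical refcounted object standing in for a bpf_prog. */
struct prog {
	int refcnt;
};

/* Stand-in for the work queue creation step; fails on request. */
int create_wq_sketch(int fail)
{
	return fail ? -1 : 0;
}

/* Take a reference, then set up the queue; on failure, unwind through
 * a label that drops the reference instead of returning directly. */
int alloc_rq_sketch(struct prog *p, int wq_fails)
{
	int err;

	if (p)
		p->refcnt++;	/* reference taken before the step that can fail */

	err = create_wq_sketch(wq_fails);
	if (err)
		goto err_put_prog;	/* a bare "return err" here would leak */

	return 0;

err_put_prog:
	if (p)
		p->refcnt--;
	return err;
}
```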
Jianbo Liu
6f7bbad18e net/mlx5e: E-Switch, Specify flow_source for rule with no in_port
The flow_source must be specified, even for a rule without a matching
source vport, because some actions are only allowed in uplink.
Otherwise, the rule can't be offloaded and a firmware syndrome occurs.

Fixes: 6fb0701a9c ("net/mlx5: E-Switch, Add support for offloading rules with no in_port")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-30 18:53:53 -07:00
Jianbo Liu
0faddfe6b7 net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring
The modified flow_context fields in FTE must be indicated in
modify_enable bitmask. Previously, the misc bit in modify_enable is
always set as source vport must be set for each rule. So, when parsing
vxlan/gre/geneve/qinq rules, this bit is not set because those are all
from the same misc fields that source vport fields are located at, and
we don't need to set the indicator twice.

After adding per vport tables for mirroring, misc bit is not set, then
firmware syndrome happens. To fix it, set the bit wherever misc fields
are changed. This also makes it unnecessary to check misc fields and set
the misc bit accordingly in metadata matching, so here remove it.

Besides, flow_source must be specified for uplink because firmware
will check it and some actions are only allowed for packets received
from uplink.

Fixes: 96e326878f ("net/mlx5e: Eswitch, Use per vport tables for mirroring")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-30 18:53:50 -07:00
Jianbo Liu
01cefbbe2c net/mlx5e: CT: Support restore ipv6 tunnel
Currently the driver restores only IPv4 tunnel headers.
Add support for restoring IPv6 tunnel header.

Fixes: b8ce903709 ("net/mlx5e: Restore tunnel metadata on miss")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-30 18:53:47 -07:00
Wang Hai
85496a2922 net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe()
Fix the missing clk_disable_unprepare() before return
from gemini_ethernet_port_probe() in the error handling case.

Fixes: 4d5ae32f5e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:45:13 -07:00
Wang Hai
bd69058f50 net: ll_temac: Use devm_platform_ioremap_resource_byname()
platform_get_resource() may fail and return NULL, so we had better
check its return value to avoid a NULL pointer dereference a bit later
in the code. Fix it to use devm_platform_ioremap_resource_byname()
instead of calling platform_get_resource_byname() and devm_ioremap().

Fixes: 8425c41d1e ("net: ll_temac: Extend support to non-device-tree platforms")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:44:28 -07:00
Vaibhav Gupta
04db64652e tlan: use generic power management
Drivers using legacy power management .suspend()/.resume() callbacks
have to manage PCI states and device's PM states themselves. They also
need to take care of standard configuration registers.

Switch to generic power management framework using a single
"struct dev_pm_ops" variable to take the unnecessary load from the driver.
This also avoids the need for the driver to directly call most of the PCI
helper functions and device power state control functions, as through
the generic framework PCI Core takes care of the necessary operations,
and drivers are required to do only device-specific jobs.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:43:02 -07:00
Vaibhav Gupta
7fa8bb48a4 sis900: use generic power management
Drivers using legacy power management .suspend()/.resume() callbacks
have to manage PCI states and device's PM states themselves. They also
need to take care of standard configuration registers.

Switch to generic power management framework using a single
"struct dev_pm_ops" variable to take the unnecessary load from the driver.
This also avoids the need for the driver to directly call most of the PCI
helper functions and device power state control functions, as through
the generic framework PCI Core takes care of the necessary operations,
and drivers are required to do only device-specific jobs.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:43:01 -07:00
Vaibhav Gupta
bfc6c183cb sc92031: use generic power management
Drivers using legacy power management .suspend()/.resume() callbacks
have to manage PCI states and device's PM states themselves. They also
need to take care of standard configuration registers.

Switch to generic power management framework using a single
"struct dev_pm_ops" variable to take the unnecessary load from the driver.
This also avoids the need for the driver to directly call most of the PCI
helper functions and device power state control functions, as through
the generic framework PCI Core takes care of the necessary operations,
and drivers are required to do only device-specific jobs.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:43:01 -07:00
Li Heng
bbf1b94a73 bnxt_en: Remove superfluous memset()
Fixes coccicheck warning:

./drivers/net/ethernet/broadcom/bnxt/bnxt.c:3730:19-37: WARNING:
dma_alloc_coherent use in stats -> hw_stats already zeroes out
memory,  so memset is not needed

dma_alloc_coherent use in status already zeroes out memory,
so memset is not needed

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Li Heng <liheng40@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:41:05 -07:00
Wang Hai
1e51f9358a liquidio: Replace vmalloc with kmalloc in octeon_register_dispatch_fn()
struct octeon_dispatch is small, so it is better to allocate it with
kmalloc instead of vmalloc.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:40:11 -07:00
Gustavo A. R. Silva
10470c0d7e mlxsw: spectrum_cnt: Use flex_array_size() helper in memcpy()
Make use of the flex_array_size() helper to calculate the size of a
flexible array member within an enclosing structure.

This helper offers defense-in-depth against potential integer
overflows, while at the same time making it explicitly clear that
we are dealing with a flexible array member.

Also, remove unnecessary pointer identifier sub_pool.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:38:23 -07:00
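The overflow protection described in the flex_array_size() commit above can be sketched in userspace. This is only a model, not the mlxsw code: flex_bytes(), struct sub_pool_table and table_create() are hypothetical names, and the saturate-to-SIZE_MAX behaviour is an assumption modeled on how the kernel's size helpers report overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: a header followed by a flexible array member. */
struct sub_pool_table {
	unsigned int count;
	unsigned int entries[];	/* flexible array member */
};

/*
 * Userspace stand-in (assumption) for the kernel helper: returns
 * count * elem_size, saturating to SIZE_MAX instead of wrapping.
 */
static size_t flex_bytes(size_t elem_size, size_t count)
{
	if (elem_size != 0 && count > SIZE_MAX / elem_size)
		return SIZE_MAX;
	return elem_size * count;
}

/* Copy 'count' entries into a newly allocated table; NULL on failure. */
static struct sub_pool_table *table_create(const unsigned int *src,
					   size_t count)
{
	size_t bytes = flex_bytes(sizeof(src[0]), count);
	struct sub_pool_table *t;

	if (bytes > SIZE_MAX - sizeof(*t))
		return NULL;	/* header + array would overflow size_t */

	t = malloc(sizeof(*t) + bytes);
	if (!t)
		return NULL;

	t->count = (unsigned int)count;
	memcpy(t->entries, src, bytes);	/* same size used for alloc and copy */
	return t;
}
```

Because a saturated size can never pass the allocation check, an overflowing count is rejected instead of producing a short buffer followed by a long copy.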
Shannon Nelson
59929fbb45 ionic: unlock queue mutex in error path
On an error return, jump to the unlock at the end to be sure
to unlock the queue_lock mutex.

Fixes: 0925e9db4d ("ionic: use mutex to protect queue operations")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 17:37:16 -07:00
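The error-path fix in the ionic commit above follows a common pattern: every exit taken while the lock is held goes through a single unlock label. A minimal userspace sketch, assuming a toy single-threaded lock in place of the kernel mutex (toy_mutex, toy_queue and queue_resize() are hypothetical and not taken from the driver):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy lock standing in for a kernel mutex; it only tracks ownership. */
struct toy_mutex {
	bool held;
};

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/* Hypothetical queue protected by queue_lock. */
struct toy_queue {
	struct toy_mutex queue_lock;
	int depth;
};

static int queue_resize(struct toy_queue *q, int new_depth)
{
	int err = 0;

	toy_lock(&q->queue_lock);

	if (new_depth <= 0) {
		err = -EINVAL;
		goto out_unlock;	/* a bare 'return' here would leak the lock */
	}

	q->depth = new_depth;

out_unlock:
	toy_unlock(&q->queue_lock);
	return err;
}
```

After a failed call the lock is released and the queue is unchanged, so a later caller can still take queue_lock.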
Landen Chao
555a893303 net: ethernet: mtk_eth_soc: fix MTU warnings
In recent kernel versions there are warnings about incorrect MTU size
like these:

eth0: mtu greater than device maximum
mtk_soc_eth 1b100000.ethernet eth0: error -22 setting MTU to include DSA overhead

Fixes: bfcb813203 ("net: dsa: configure the MTU for switch ports")
Fixes: 72579e14a1 ("net: dsa: don't fail to probe if we couldn't set the MTU")
Fixes: 7a4c53bee3 ("net: report invalid mtu value via netlink extack")
Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 16:56:30 -07:00
Lu Wei
366228ed01 net: nixge: fix potential memory leak in nixge_probe()
If some processes in nixge_probe() fail, free_netdev(dev)
needs to be called to avoid a memory leak.

Fixes: 87ab207981 ("net: nixge: Separate ctrl and dma resources")
Fixes: abcd3d6fc6 ("net: nixge: Fix error path for obtaining mac address")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 16:55:39 -07:00
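The leak described in the nixge commit above comes from returning early after the net device has been allocated. A userspace sketch of the corrected shape, assuming toy allocation helpers (toy_alloc_netdev(), toy_free_netdev() and toy_probe() are hypothetical; live_devs exists only so the leak is observable):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_devs;	/* allocations that have not been freed yet */

struct toy_netdev {
	int irq;
};

static struct toy_netdev *toy_alloc_netdev(void)
{
	struct toy_netdev *dev = calloc(1, sizeof(*dev));

	if (dev)
		live_devs++;
	return dev;
}

static void toy_free_netdev(struct toy_netdev *dev)
{
	free(dev);
	live_devs--;
}

/* Probe succeeds only for a non-negative IRQ number. */
static int toy_probe(int irq, struct toy_netdev **out)
{
	struct toy_netdev *dev = toy_alloc_netdev();
	int err;

	if (!dev)
		return -ENOMEM;

	if (irq < 0) {
		err = -EINVAL;
		goto free_netdev;	/* every later failure must free dev */
	}

	dev->irq = irq;
	*out = dev;
	return 0;

free_netdev:
	toy_free_netdev(dev);
	return err;
}
```

With the unwind label in place, a failed probe leaves live_devs at zero.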
YueHaibing
b04e55d641 sfc_ef100: remove duplicated include from ef100_netdev.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30 16:53:07 -07:00
Miaohe Lin
8698fb64cc igb: use eth_zero_addr() to clear mac address
Use eth_zero_addr() to clear mac address instead of memset().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 11:41:45 -07:00
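For readers unfamiliar with the helper named in the commit above, the sketch below is a userspace stand-in for eth_zero_addr(), which clears the six octets of a MAC address. The stand-in definition and the mac_is_zero() check are written for illustration and are not copied from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* octets in an Ethernet MAC address */

/* Userspace stand-in for the kernel helper of the same name. */
static inline void eth_zero_addr(uint8_t *addr)
{
	memset(addr, 0x00, ETH_ALEN);
}

/* Hypothetical helper: true when all six octets are zero. */
static inline int mac_is_zero(const uint8_t *addr)
{
	static const uint8_t zero[ETH_ALEN];

	return memcmp(addr, zero, ETH_ALEN) == 0;
}
```

Compared with an open-coded memset(), the helper states the intent and keeps the address length in one place.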
Miaohe Lin
935f73bd51 ixgbe: use eth_zero_addr() to clear mac address
Use eth_zero_addr() to clear mac address instead of memset().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 11:40:08 -07:00
Suraj Upadhyay
7ba068d128 ixgbe: Remove unnecessary usages of memset
Replace memsets of 1 byte with simple assignment.
Issue found with checkpatch.pl

Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 10:51:26 -07:00
Suraj Upadhyay
90105264a6 igb: Remove unnecessary usages of memset
Replace memsets of 1 byte with simple assignment.
Issue found with checkpatch.pl

Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 10:51:25 -07:00
Suraj Upadhyay
c5b369651b e1000e: Remove unnecessary usages of memset
Replace memsets of 1 byte with simple assignments.
Issue found with checkpatch.pl

Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 10:51:25 -07:00
Suraj Upadhyay
4b6bafb9e1 e1000: Remove unnecessary usages of memset
Replace memsets of 1 byte with simple assignments.
Issue reported by checkpatch.pl.

Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 10:51:25 -07:00
Vaibhav Gupta
69a74aef8a e100: use generic power management
With legacy PM hooks, it was the responsibility of a driver to manage PCI
states and also the device's power state. The generic approach is to let
PCI core handle the work.

e100_suspend() calls __e100_shutdown() to perform intermediate tasks.
__e100_shutdown() calls pci_save_state() which is not recommended.

e100_suspend() also calls __e100_power_off() which is calling PCI helper
functions, pci_prepare_to_sleep(), pci_set_power_state(), along with
pci_wake_from_d3(...,false). Hence, the function call is removed and wol is
disabled as earlier using device_wakeup_disable().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 10:50:55 -07:00
Francesco Ruggeri
024a8168b7 igb: reinit_locked() should be called with rtnl_lock
We observed two panics involving races with igb_reset_task.
The first panic is caused by this race condition:

	kworker			reboot -f

	igb_reset_task
	igb_reinit_locked
	igb_down
	napi_synchronize
				__igb_shutdown
				igb_clear_interrupt_scheme
				igb_free_q_vectors
				igb_free_q_vector
				adapter->q_vector[v_idx] = NULL;
	napi_disable
	Panics trying to access
	adapter->q_vector[v_idx].napi_state

The second panic (a divide error) is caused by this race:

kworker		reboot -f	tx packet

igb_reset_task
		__igb_shutdown
		rtnl_lock()
		...
		igb_clear_interrupt_scheme
		igb_free_q_vectors
		adapter->num_tx_queues = 0
		...
		rtnl_unlock()
rtnl_lock()
igb_reinit_locked
igb_down
igb_up
netif_tx_start_all_queues
				dev_hard_start_xmit
				igb_xmit_frame
				igb_tx_queue_mapping
				Panics on
				r_idx % adapter->num_tx_queues

This commit applies to igb_reset_task the same changes that
were applied to ixgbe in commit 2f90b8657e ("ixgbe: this patch
adds support for DCB to the kernel and ixgbe driver"),
commit 8f4c5c9fb8 ("ixgbe: reinit_locked() should be called with
rtnl_lock") and commit 88adce4ea8 ("ixgbe: fix possible race in
reset subtask").

Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 10:05:14 -07:00
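Both races in the igb commit above come from the reset task running against a shutdown that has already torn the adapter down. A single-threaded userspace model of the guarded reset task follows. The commit message only says the change mirrors the cited ixgbe commits, so the "bail out if down or resetting" check is an assumption about their shape; toy_adapter and every toy_* function are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy adapter state; rtnl_held models ownership of the RTNL lock. */
struct toy_adapter {
	bool rtnl_held;
	bool down;
	bool resetting;
	unsigned int num_tx_queues;
	unsigned int reinit_count;
};

static void toy_rtnl_lock(struct toy_adapter *a)
{
	assert(!a->rtnl_held);
	a->rtnl_held = true;
}

static void toy_rtnl_unlock(struct toy_adapter *a)
{
	assert(a->rtnl_held);
	a->rtnl_held = false;
}

/* Models the shutdown path: frees the queues while holding the lock. */
static void toy_shutdown(struct toy_adapter *a)
{
	toy_rtnl_lock(a);
	a->down = true;
	a->num_tx_queues = 0;
	toy_rtnl_unlock(a);
}

/* Models the reset task: reinit only under the lock and only if still up. */
static void toy_reset_task(struct toy_adapter *a)
{
	toy_rtnl_lock(a);

	if (a->down || a->resetting) {
		/* Assumed guard: the adapter is gone, do not bring it up. */
		toy_rtnl_unlock(a);
		return;
	}

	a->resetting = true;
	a->reinit_count++;	/* stands in for the down/up sequence */
	a->resetting = false;

	toy_rtnl_unlock(a);
}
```

In this model a reset that runs after shutdown returns without touching the queues, so nothing restarts transmission while num_tx_queues is 0.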
Aaron Ma
1050242fa6 e1000e: continue to init PHY even when failed to disable ULP
After commit e086ba2fcc ("e1000e: disable s0ix entry and exit flows
for ME systems"), ThinkPad P14s always fails to disable ULP by ME.
Commit 0c80cdbf33 ("e1000e: Warn if disabling ULP failed") then
breaks out of PHY init:

error log:
[   42.364753] e1000e 0000:00:1f.6 enp0s31f6: Failed to disable ULP
[   42.524626] e1000e 0000:00:1f.6 enp0s31f6: PHY Wakeup cause - Unicast Packet
[   42.822476] e1000e 0000:00:1f.6 enp0s31f6: Hardware Error

When s0ix is disabled, E1000_FWSM_ULP_CFG_DONE will never be 1.
If PHY init continues as it did before, the device works as before.
The iperf test results are good too.

Fixes: 0c80cdbf33 ("e1000e: Warn if disabling ULP failed")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 10:04:54 -07:00
Vaibhav Gupta
bac6631728 ixgbevf: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and for taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

The driver was invoking PCI helper functions like pci_save/restore_state(),
and pci_enable/disable_device(), which is not recommended.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 08:44:17 -07:00
Vaibhav Gupta
6f82b25587 ixgbe: use generic power management
With legacy PM hooks, it was the responsibility of a driver to manage PCI
states and also the device's power state. The generic approach is to let
PCI core handle the work.

ixgbe_suspend() calls __ixgbe_shutdown() to perform intermediate tasks.
__ixgbe_shutdown() modifies the value of "wake" (device should be wakeup
enabled or not), responsible for controlling the flow of legacy PM.

Since PCI core has no idea about the value of "wake", new code for generic
PM may produce unexpected results. Thus, use "device_set_wakeup_enable()"
to wakeup-enable the device accordingly.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 08:39:41 -07:00
Vaibhav Gupta
e9c971bdab igbvf: use generic power management
Remove legacy PM callbacks and use generic operations. With legacy code,
drivers were responsible for handling PCI PM operations like
pci_save_state(). In generic code, all these are handled by PCI core.

The generic suspend() and resume() are called at the same point the legacy
ones were called. Thus, it does not affect the normal functioning of the
driver.

__maybe_unused attribute is used with .resume() but not with .suspend(), as
.suspend() is called by .shutdown().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 08:36:08 -07:00
Vaibhav Gupta
bc5cbd73eb iavf: use generic power management
With the support of generic PM callbacks, drivers no longer need to use
legacy .suspend() and .resume() in which they had to maintain PCI state
changes and the device's power state themselves. The required operations are
done by PCI core.

PCI drivers are not expected to invoke PCI helper functions like
pci_save/restore_state(), pci_enable/disable_device(),
pci_set_power_state(), etc. Their tasks are completed by PCI core itself.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-30 08:32:03 -07:00
Thomas Falcon
27a2145d6f ibmvnic: Fix IRQ mapping disposal in error path
RX queue IRQ mappings are disposed in both the TX IRQ and RX IRQ
error paths. Fix this and dispose of TX IRQ mappings correctly in
case of an error.

Fixes: ea22d51a78 ("ibmvnic: simplify and improve driver probe function")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 15:35:55 -07:00
David S. Miller
a41cf09b8e Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2020-07-29

This series contains updates to the ice driver only.

Dave works around LFC settings not being preserved through link events.
Fixes link issues with GLOBR reset and handling of multiple link events.

Nick restores VF MSI-X after PCI reset.

Kiran corrects the error code returned in ice_aq_sw_rules if the rule
does not exist.

Paul prevents overwriting of user set descriptors.

Tarun adds masking before accessing rate limiting profile types and
corrects queue bandwidth configuration.

Victor modifies Tx queue scheduler distribution to spread more evenly
across queue group nodes.

Krzysztof sets need_wakeup flag for Tx AF_XDP.

Brett allows VLANs in safe mode.

Marcin cleans up VSIs on probe failure.

Bruce reduces the scope of a variable.

Ben removes a FW workaround.

Tony fixes an unused parameter warning.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 13:15:30 -07:00
Jisheng Zhang
5ba2254b04 net: mvneta: fix comment about phylink_speed_down
mvneta has switched to phylink, so the comment should look
like "We may have called phylink_speed_down before".

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 12:17:24 -07:00
Ido Schimmel
5515c3448d mlxsw: spectrum_router: Fix use-after-free in router init / de-init
Several notifiers are registered as part of router initialization.
Since some of these notifiers are registered before the end of the
initialization, it is possible for them to access uninitialized or freed
memory when processing notifications [1].

Additionally, some of these notifiers queue work items on a workqueue.
If these work items are executed after the router was de-initialized,
they will access freed memory.

Fix both problems by moving the registration of the notifiers to the end
of the router initialization and flush the work queue after they are
unregistered.

[1]
BUG: KASAN: use-after-free in __mutex_lock_common kernel/locking/mutex.c:938 [inline]
BUG: KASAN: use-after-free in __mutex_lock+0xeea/0x1340 kernel/locking/mutex.c:1103
Read of size 8 at addr ffff888038c3a6e0 by task kworker/u4:1/61

CPU: 1 PID: 61 Comm: kworker/u4:1 Not tainted 5.8.0-rc2+ #36
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Workqueue: mlxsw_core_ordered mlxsw_sp_inet6addr_event_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 __mutex_lock_common kernel/locking/mutex.c:938 [inline]
 __mutex_lock+0xeea/0x1340 kernel/locking/mutex.c:1103
 mlxsw_sp_inet6addr_event_work+0xb3/0x1b0 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:7123
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

Allocated by task 1298:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 mlxsw_sp_router_init+0xb2/0x1d20 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:8074
 mlxsw_sp_init+0xbd8/0x3ac0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2932
 __mlxsw_core_bus_device_register+0x657/0x10d0 drivers/net/ethernet/mellanox/mlxsw/core.c:1375
 mlxsw_core_bus_device_register drivers/net/ethernet/mellanox/mlxsw/core.c:1436 [inline]
 mlxsw_devlink_core_bus_device_reload_up+0xcd/0x150 drivers/net/ethernet/mellanox/mlxsw/core.c:1133
 devlink_reload net/core/devlink.c:2959 [inline]
 devlink_reload+0x281/0x3b0 net/core/devlink.c:2944
 devlink_nl_cmd_reload+0x2f1/0x7c0 net/core/devlink.c:2987
 genl_family_rcv_msg_doit net/netlink/genetlink.c:691 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:736 [inline]
 genl_rcv_msg+0x611/0x9d0 net/netlink/genetlink.c:753
 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0x150/0x190 net/socket.c:672
 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
 ___sys_sendmsg+0xff/0x170 net/socket.c:2417
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 1348:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kfree+0xe6/0x320 mm/slub.c:4063
 mlxsw_sp_fini+0x340/0x4e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3132
 mlxsw_core_bus_device_unregister+0x16c/0x6d0 drivers/net/ethernet/mellanox/mlxsw/core.c:1474
 mlxsw_devlink_core_bus_device_reload_down+0x8e/0xc0 drivers/net/ethernet/mellanox/mlxsw/core.c:1123
 devlink_reload+0xc6/0x3b0 net/core/devlink.c:2952
 devlink_nl_cmd_reload+0x2f1/0x7c0 net/core/devlink.c:2987
 genl_family_rcv_msg_doit net/netlink/genetlink.c:691 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:736 [inline]
 genl_rcv_msg+0x611/0x9d0 net/netlink/genetlink.c:753
 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0x150/0x190 net/socket.c:672
 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
 ___sys_sendmsg+0xff/0x170 net/socket.c:2417
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff888038c3a000
 which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1760 bytes inside of
 2048-byte region [ffff888038c3a000, ffff888038c3a800)
The buggy address belongs to the page:
page:ffffea0000e30e00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea0000e30e00 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c40c000
raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888038c3a580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888038c3a600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888038c3a680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                       ^
 ffff888038c3a700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888038c3a780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 965fa8e600 ("mlxsw: spectrum_router: Make RIF deletion more robust")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 12:16:21 -07:00
Ido Schimmel
3c8ce24b03 mlxsw: core: Free EMAD transactions using kfree_rcu()
The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is
managed using RCU. They are freed using kfree_rcu() once the transaction
ends.

However, in case the transaction failed it is freed immediately after being
removed from the active transactions list. This is problematic because it is
still possible for a different CPU to dereference the transaction from an RCU
read-side critical section while traversing the active transaction list in
mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered
[1].

Fix this by freeing the transaction after a grace period by calling
kfree_rcu().

[1]
BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881

CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ #44
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
 mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:387 [inline]
 __irq_exit_rcu kernel/softirq.c:417 [inline]
 irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429
 sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191
Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d
RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286
RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9
RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1
R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000
 __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x473dbd
Code: Bad RIP value.
RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040
RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c
R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80

Allocated by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

Freed by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kfree+0xe6/0x320 mm/slub.c:4052
 mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88800b796400
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 232 bytes inside of
 512-byte region [ffff88800b796400, ffff88800b796600)
The buggy address belongs to the page:
page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                          ^
 ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: caf7297e7a ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 12:16:21 -07:00
Ido Schimmel
7d8e8f3433 mlxsw: core: Increase scope of RCU read-side critical section
The lifetime of the Rx listener item ('rxl_item') is managed using RCU,
but is dereferenced outside of RCU read-side critical section, which can
lead to a use-after-free.

Fix this by increasing the scope of the RCU read-side critical section.

Fixes: 93c1edb27f ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 12:16:21 -07:00
Ido Schimmel
ec4f5b3617 mlxsw: spectrum: Use different trap group for externally routed packets
Cited commit mistakenly removed the trap group for externally routed
packets (e.g., via the management interface) and grouped locally routed
and externally routed packet traps under the same group, thereby
subjecting them to the same policer.

This can result in problems, for example, when FRR is restarted and
suddenly all transient traffic is trapped to the CPU because of a
default route through the management interface. Locally routed packets
required to re-establish a BGP connection will never reach the CPU and
the routing tables will not be re-populated.

Fix this by using a different trap group for externally routed packets.

Fixes: 8110668ecd ("mlxsw: spectrum_trap: Register layer 3 control traps")
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 12:16:21 -07:00
Ido Schimmel
89ab533135 mlxsw: spectrum_router: Allow programming link-local host routes
Cited commit added the ability to program link-local prefix routes to
the ASIC so that relevant packets are routed and trapped correctly.

However, host routes were not included in the change and thus not
programmed to the ASIC. This can result in packets being trapped via an
external route trap instead of a local route trap as in IPv4.

Fix this by programming all the link-local routes to the ASIC.

Fixes: 10d3757fcb ("mlxsw: spectrum_router: Allow programming link-local prefix routes")
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29 12:16:21 -07:00
Tony Nguyen
6221595fc5 ice: fix unused parameter warning
Depending on PAGE_SIZE, the following unused parameter warning can be
reported:

drivers/net/ethernet/intel/ice/ice_txrx.c: In function ‘ice_rx_frame_truesize’:
drivers/net/ethernet/intel/ice/ice_txrx.c:513:21: warning: unused parameter ‘size’ [-Wunused-parameter]
        unsigned int size)

The 'size' variable is used only when PAGE_SIZE >= 8192. Add __maybe_unused
to remove the warning.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2020-07-29 08:38:56 -07:00
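The warning fixed by the commit above appears when a parameter is referenced in only one branch of a compile-time conditional. A userspace sketch, assuming GCC or Clang: toy_maybe_unused stands in for the kernel's __maybe_unused, and TOY_PAGE_SIZE, the 64-byte rounding and rx_frame_truesize() are hypothetical rather than the ice driver's arithmetic.

```c
#include <assert.h>

#ifndef TOY_PAGE_SIZE
#define TOY_PAGE_SIZE 4096	/* assumed page size for this sketch */
#endif

/* Stand-in for the kernel's __maybe_unused (GCC/Clang attribute). */
#define toy_maybe_unused __attribute__((__unused__))

/*
 * 'size' is referenced only in the large-page branch, so without the
 * attribute a small-page build warns under -Wunused-parameter.
 */
static unsigned int rx_frame_truesize(unsigned int buf_len,
				      unsigned int toy_maybe_unused size)
{
#if TOY_PAGE_SIZE < 8192
	return buf_len;			/* fixed-size buffer, 'size' unused */
#else
	(void)buf_len;			/* not needed with large pages */
	return (size + 63u) & ~63u;	/* hypothetical 64-byte rounding */
#endif
}
```

The attribute keeps one signature for both configurations instead of wrapping the parameter itself in preprocessor conditionals.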
Ben Shelton
7dfff9ffe8 ice: disable no longer needed workaround for FW logging
For the FW logging info AQ command, we currently set the ICE_AQ_FLAG_RD
in order to work around a FW issue. This issue has been fixed so remove the
workaround.

Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:56 -07:00
Bruce Allan
e923f04d66 ice: reduce scope of variable
The scope of the macro local variable 'i' can be reduced.  Do so to keep
static analysis tools from complaining.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Marcin Szycik
78116e979d ice: cleanup VSI on probe fail
As part of ice_setup_pf_sw() a PF VSI is setup; release the VSI in case of
failure.

Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Brett Creeley
cd1f56f429 ice: Allow all VLANs in safe mode
Currently the PF VSI's context parameters are left in a bad state when
going into safe mode. This is causing VLAN traffic to not pass. Fix this
by configuring the PF VSI to allow all VLAN tagged traffic.

Also, remove redundant comment explaining the safe mode flow in
ice_probe().

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Krzysztof Kazimierczak
682dfedcee ice: need_wakeup flag might not be set for Tx
This is a port of i40e commit 705639572e ("i40e: need_wakeup flag might
not be set for Tx").

Quoting the original commit message:

"The need_wakeup flag for Tx might not be set for AF_XDP sockets that
are only used to send packets. This happens if there is at least one
outstanding packet that has not been completed by the hardware and we
get that corresponding completion (which will not generate an interrupt
since interrupts are disabled in the napi poll loop) between the time we
stopped processing the Tx completions and interrupts are enabled again.
In this case, the need_wakeup flag will have been cleared at the end of
the Tx completion processing as we believe we will get an interrupt from
the outstanding completion at a later point in time. But if this
completion interrupt occurs before interrupts are enable, we lose it and
should at that point really have set the need_wakeup flag since there
are no more outstanding completions that can generate an interrupt to
continue the processing. When this happens, user space will see a Tx
queue need_wakeup of 0 and skip issuing a syscall, which means will
never get into the Tx processing again and we have a deadlock."

As a result, packet processing stops. This patch fixes the issue by
always setting the need_wakeup flag at the end of interrupt processing.
This ensures that the deadlock cannot happen.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Victor Raj
4043818c13 ice: distribute Tx queues evenly
Distribute the Tx queues evenly across all queue groups. This gives
the queues a more equal share of the bandwidth when all of them are
in use.

In the previous algorithm, the next queue group node was picked only
after the previous one had been filled with the maximum number of children.
For example: if VSI is configured with 9 queues, the first 8 queues
will be assigned to queue group 1 and the 9th queue will be assigned to
queue group 2.

The 2 queue groups split the bandwidth between them equally (50:50).
The first queue group node shares its 50% of the bandwidth among all of
its children (8 queues), while the second queue group node gives its
entire 50% to its only child.

The new algorithm will fix this issue.
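
The arithmetic in the 9-queue example can be checked with a small sketch.
This is not the driver code: the function names are hypothetical, the
per-group capacity of 8 is taken from the example above, and the "even"
split shown here (group sizes differing by at most one) is only an
assumption about what an even distribution looks like.

```python
from fractions import Fraction

MAX_CHILDREN = 8  # assumed per-group capacity, taken from the 9-queue example


def fill_first(num_queues, num_groups, max_children=MAX_CHILDREN):
    """Previous behaviour: fill each group to capacity before using the next."""
    counts = []
    remaining = num_queues
    for _ in range(num_groups):
        take = min(remaining, max_children)
        counts.append(take)
        remaining -= take
    return counts


def spread_evenly(num_queues, num_groups):
    """Assumed even distribution: group sizes differ by at most one."""
    base, extra = divmod(num_queues, num_groups)
    return [base + (1 if i < extra else 0) for i in range(num_groups)]


def per_queue_share(counts):
    """Share of total bandwidth one queue gets in each group.

    Groups that hold at least one queue split the bandwidth equally, and
    each group divides its share equally among its queues.
    """
    active = [c for c in counts if c > 0]
    group_share = Fraction(1, len(active))
    return [group_share / c if c else Fraction(0) for c in counts]
```

With 9 queues and 2 groups, fill_first() yields groups of 8 and 1, so a
queue in the first group gets 1/16 of the bandwidth while the lone queue
in the second group gets 1/2. spread_evenly() yields groups of 5 and 4,
giving 1/10 and 1/8 per queue.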

Signed-off-by: Victor Raj <victor.raj@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Tarun Singh
984824a210 ice: Adjust scheduler default BW weight
By default the queues are configured in legacy mode. The default
BW settings for legacy/advanced modes are different. The existing
code was using the advanced mode default value of 1, which was
incorrect and caused unbalanced BW sharing among siblings.
Apply the recommended default value instead.

Signed-off-by: Tarun Singh <tarun.k.singh@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Tarun Singh
b3b93d6ce1 ice: Add RL profile bit mask check
Mask bits before accessing the profile type field.

Signed-off-by: Tarun Singh <tarun.k.singh@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Paul M Stillwell Jr
a02016de00 ice: fix overwriting TX/RX descriptor values when rebuilding VSI
If a user sets the value of the TX or RX descriptors to some non-default
value using 'ethtool -G' then we need to not overwrite the values when
we rebuild the VSI. The VSI rebuild could happen as a result of a user
setting the number of queues via the 'ethtool -L' command. Fix this by
checking to see if the value we have stored is non-zero and if it is
then don't change the value.
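
The check described above can be sketched as follows. This is an
illustration only: the function name and the default of 256 are
hypothetical and are not taken from the driver.

```python
DEFAULT_DESC = 256  # hypothetical default descriptor count, for illustration


def desc_count_on_rebuild(stored, default=DEFAULT_DESC):
    """Pick the descriptor count to use when the VSI is rebuilt.

    A non-zero stored value means the user already chose a ring size
    (for example with 'ethtool -G'), so it is kept. Only an unset (zero)
    value falls back to the default.
    """
    return stored if stored else default
```

A rebuild triggered by 'ethtool -L' would then keep a stored value of 512
and only apply the default when nothing has been stored yet.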

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Kiran Patil
ca1fdb885e ice: return correct error code from ice_aq_sw_rules
Return ICE_ERR_DOES_NOT_EXIST if the admin command error code is
ICE_AQ_RC_ENOENT (does not exist). ice_aq_sw_rules is used when a switch
rule is added, deleted, or updated. When deleting or updating a switch
rule, the admin command can return ICE_AQ_RC_ENOENT if no such rule
exists. Return ICE_ERR_DOES_NOT_EXIST from ice_aq_sw_rules in that case,
so that the caller can decide how to handle it.

Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Nick Nunley
a54a0b24f4 ice: restore VF MSI-X state during PCI reset
During a PCI FLR the MSI-X Enable flag in the VF PCI MSI-X capability
register will be cleared. This can lead to issues when a VF is
assigned to a VM because in these cases the VF driver receives no
indication of the PF PCI error/reset and additionally it is incapable
of restoring the cleared flag in the hypervisor configuration space
without fully reinitializing the driver interrupt functionality.

Since the VF driver is unable to easily resolve this condition on its own,
restore the VF MSI-X flag during the PF PCI reset handling.

Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:55 -07:00
Dave Ertman
0ce6c34a8f ice: fix link event handling timing
When the driver experiences a link event (especially link up)
there can be multiple events generated. Some of these are
link fault events and still have a state of DOWN set.  The problem
happens when the link comes UP during the PF driver handling
one of the LINK DOWN events.  The status of the link is updated
and is now seen as UP, so when the actual LINK UP event comes,
the port information has already been updated to be seen as UP,
even though none of the UP activities have been completed.

After the link information has been updated in the link
handler and evaluated for MEDIA PRESENT, if the state
of the link has been changed to UP, treat the DOWN event
as an UP event since the link is now UP.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:54 -07:00
Dave Ertman
b767ca650f ice: Fix link broken after GLOBR reset
After a GLOBR, the link was broken so that a link
up situation was being seen as a link down.

The problem was that the rebuild process was updating
the port_info link status without doing any of the
other things that need to be done when link changes.

This was causing the port_info struct to have current
"UP" information so that any further UP interrupts
were skipped as redundant.

The rebuild flow should *not* be updating the port_info
struct link information, so eliminate this and leave
it to the link event handling code.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:54 -07:00
Dave Ertman
7d9c9b791f ice: Implement LFC workaround
There is a bug where the LFC settings are not being preserved
through a link event.  The registers in question are the ones
that are touched (and restored) when a set_local_mib AQ command
is performed.

On a link-up event, make sure that a set_local_mib is being
performed.

Move the function ice_aq_set_lldp_mib() from the DCB specific
ice_dcb.c to ice_common.c so that the driver always has access
to this AQ command.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29 08:38:54 -07:00
Jisheng Zhang
77b2898394 net: stmmac: Speed down the PHY if WoL to save energy
When WoL is enabled and the machine is powered off, the PHY remains
waiting for wakeup events at max speed, which is a waste of energy.

Slow down the PHY speed before stopping the ethernet if WoL is enabled.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:48:20 -07:00
Jisheng Zhang
1d8e5b0f3f net: stmmac: Support WOL with phy
Currently, the stmmac driver WOL implementation relies on MAC's PMT
feature. We have a case: the MAC HW doesn't enable PMT, instead, we
rely on the phy to support WOL. Implement the support for this case.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:48:20 -07:00
Jisheng Zhang
e8377e7a29 net: stmmac: only call pmt() during suspend/resume if HW enables PMT
This is to prepare WOL support with phy. Compared with WOL
implementation which relies on the MAC's PMT features, in phy
supported WOL case, device_may_wakeup() may also be true, but we
should not call mac's pmt() function if HW doesn't enable PMT.

And during resume, we should call phylink_start() if PMT is disabled.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:48:20 -07:00
Jisheng Zhang
2f45f7a13e net: stmmac: Move device_can_wakeup() check earlier in set_wol
If !device_can_wakeup(), there is no need to check further. Also return
-EOPNOTSUPP rather than -EINVAL if !device_can_wakeup().
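
The reordered check can be sketched as below. This is not the stmmac
code: the function name and parameters are hypothetical, and returning
-EINVAL for an unsupported wake mode is an assumption made for the
illustration.

```python
import errno


def set_wol(can_wakeup, requested, supported):
    """Validate a Wake-on-LAN request.

    can_wakeup: result of the device_can_wakeup() style check.
    requested:  bitmask of wake modes asked for (hypothetical encoding).
    supported:  bitmask of wake modes the hardware offers.
    """
    # Checked first: a device that cannot wake the system supports no
    # wake mode at all, so the operation itself is unsupported.
    if not can_wakeup:
        return -errno.EOPNOTSUPP
    # Assumption: a request for modes outside the supported set is an
    # invalid argument.
    if requested & ~supported:
        return -errno.EINVAL
    return 0
```

The point of the ordering is that callers can tell "WoL is not available
on this device" apart from "this particular request was malformed".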

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:48:20 -07:00
Jisheng Zhang
1057d685c6 net: stmmac: Remove WAKE_MAGIC if HW shows no pmt_magic_frame
Remove WAKE_MAGIC from supported modes if the HW capability register
shows no support for pmt_magic_frame.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:48:19 -07:00
Luo bin
90f86b8a36 hinic: add log in exception handling processes
Improve the error messages when functions return failure and dump
relevant registers in some exception handling processes.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:22:03 -07:00
Luo bin
c15850c709 hinic: add support to handle hw abnormal event
Add support for handling abnormal hw events such as hardware failure,
cable unplugged, and link error.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:22:02 -07:00
Subbaraya Sundeep
ed543f5c6a octeontx2-pf: Unregister netdev at driver remove
Add unregister_netdev to the driver remove
function. Generally unregister_netdev is called
after disabling all the device interrupts, but here
it is called before disabling device mailbox
interrupts. The reason is that the VF needs the
mailbox interrupt to communicate with its PF to
clean up its resources during otx2_stop.
otx2_stop disables packet I/O and queue interrupts
first, then uses the mailbox interrupt to ask the
PF to free VF resources. Hence this patch
calls unregister_netdev just before
disabling mailbox interrupts.

Fixes: 3184fb5ba9 ("octeontx2-vf: Virtual function driver support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:14:48 -07:00
Subbaraya Sundeep
c0376f473c octeontx2-pf: cancel reset_task work
During driver exit, cancel the queued
reset_task work in the VF driver.

Fixes: 3184fb5ba9 ("octeontx2-vf: Virtual function driver support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:14:48 -07:00
Subbaraya Sundeep
948a66338f octeontx2-pf: Fix reset_task bugs
Two bugs exist in the reset_task code of the
PF driver. One is the missing protection
against the network stack's ndo_open and ndo_close.
The other is the missing cancel_work.
This patch fixes both problems.

Fixes: 4ff7d1488a ("octeontx2-pf: Error handling support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:14:48 -07:00
Jakub Kicinski
3cab8c6552 mlx4: disable device on shutdown
It appears that not disabling a PCI device on .shutdown may lead to
a Hardware Error with particular (perhaps buggy) BIOS versions:

    mlx4_en: eth0: Close port called
    mlx4_en 0000:04:00.0: removed PHC
    reboot: Restarting system
    {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
    {1}[Hardware Error]: event severity: fatal
    {1}[Hardware Error]:  Error 0, type: fatal
    {1}[Hardware Error]:   section_type: PCIe error
    {1}[Hardware Error]:   port_type: 4, root port
    {1}[Hardware Error]:   version: 1.16
    {1}[Hardware Error]:   command: 0x4010, status: 0x0143
    {1}[Hardware Error]:   device_id: 0000:00:02.2
    {1}[Hardware Error]:   slot: 0
    {1}[Hardware Error]:   secondary_bus: 0x04
    {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f06
    {1}[Hardware Error]:   class_code: 000604
    {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
    {1}[Hardware Error]:   aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00000000
    {1}[Hardware Error]:   aer_uncor_severity: 0x00062030
    {1}[Hardware Error]:   TLP Header: 40000018 040000ff 791f4080 00000000
[hw error repeats]
    Kernel panic - not syncing: Fatal hardware error!
    CPU: 0 PID: 2189 Comm: reboot Kdump: loaded Not tainted 5.6.x-blabla #1
    Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 05/05/2017

Fix the mlx4 driver.

This is a very similar problem to what had been fixed in:
commit 0d98ba8d70 ("scsi: hpsa: disable device during shutdown")
to address https://bugzilla.kernel.org/show_bug.cgi?id=199779.

Fixes: 2ba5fbd62b ("net/mlx4_core: Handle AER flow properly")
Reported-by: Jake Lawrence <lawja@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:10:56 -07:00
Jacob Keller
d69ea414c9 ice: implement device flash update via devlink
Use the newly added pldmfw library to implement device flash update for
the Intel ice networking device driver. This support uses the devlink
flash update interface.

The main parts of the flash include the Option ROM, the netlist module,
and the main NVM data. The PLDM firmware file contains modules for each
of these components.

Using the pldmfw library, the provided firmware file will be scanned for
the three major components, "fw.undi" for the Option ROM, "fw.mgmt" for
the main NVM module containing the primary device firmware, and
"fw.netlist" containing the netlist module.

The flash is separated into two banks, the active bank containing the
running firmware, and the inactive bank which we use for update. Each
module is updated in a staged process. First, the inactive bank is
erased, preparing the device for update. Second, the contents of the
component are copied to the inactive portion of the flash. After all
components are updated, the driver signals the device to switch the
active bank during the next EMP reset (which would usually occur during
the next reboot).

Although the firmware AdminQ interface does report an immediate status
for each command, the NVM erase and NVM write commands receive status
asynchronously. The driver must not continue writing until previous
erase and write commands have finished. The real status of the NVM
commands is returned over the receive AdminQ. Implement a simple
interface that uses a wait queue so that the main update thread can
sleep until the completion status is reported by firmware. For erasing
the inactive banks, this can take quite a while in practice.

To help visualize the process to the devlink application and other
applications based on the devlink netlink interface, status is reported
via the devlink_flash_update_status_notify. While we do report status
after each 4k block when writing, there is no real status we can report
during erasing. We simply must wait for the complete module erasure to
finish.
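
The staged flow described above can be sketched as a small simulation.
This is not the ice driver: FakeDevice, flash_update, and the notify
callback are hypothetical names, and each FakeDevice call simply returns
when done, standing in for the wait on the asynchronous completion status.

```python
BLOCK_SIZE = 4096  # status is reported after each 4k block when writing


class FakeDevice:
    """Hypothetical stand-in for the hardware.

    Each method returns only once the operation is complete, mimicking
    the update thread sleeping until firmware reports completion.
    """

    def __init__(self):
        self.ops = []

    def erase(self, module):
        self.ops.append(("erase", module))

    def write(self, module, offset, block):
        self.ops.append(("write", module, offset, len(block)))

    def switch_bank(self):
        self.ops.append(("switch_bank",))


def flash_update(components, device, notify):
    """Update every component in the inactive bank, then request the switch.

    components: mapping of module name (e.g. "fw.mgmt") to image bytes.
    notify:     callback(module, bytes_done, bytes_total) for progress.
    """
    for module, image in components.items():
        # Erase first; there is no intermediate progress to report here.
        device.erase(module)
        # Write block by block, never issuing the next write before the
        # previous one has completed.
        for offset in range(0, len(image), BLOCK_SIZE):
            block = image[offset:offset + BLOCK_SIZE]
            device.write(module, offset, block)
            notify(module, offset + len(block), len(image))
    # Only after all components are written is the bank switch requested.
    device.switch_bank()
```

For a 10000-byte image this produces one erase, three writes (4096, 4096,
and 1808 bytes), three progress notifications, and a single bank switch
request at the end.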

With this implementation, basic flash update for the ice hardware is
supported.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:07:06 -07:00
Jacob Keller
2ab560a78e ice: add flags indicating pending update of firmware module
After a flash update, the pending status of the update can be determined
from the device capabilities.

Read the appropriate device capability and store whether there is
a pending update awaiting a reboot.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:07:06 -07:00
Cudzilo, Szymon T
544cd2ac13 ice: Add AdminQ commands for FW update
Add structures, identifiers, and helper functions for several AdminQ
commands related to performing a firmware update for the ice hardware.
These will be used in future code for implementing the devlink
.flash_update handler.

Signed-off-by: Cudzilo, Szymon T <szymon.t.cudzilo@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:07:06 -07:00
Jacek Naczyk
de9b277ee0 ice: Add support for unified NVM update flow capability
Extends function parsing response from Discover Device
Capability AQC to check if the device supports unified NVM update flow.

Signed-off-by: Jacek Naczyk <jacek.naczyk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:07:06 -07:00
René van Dorst
19016d93bf net: ethernet: mtk_eth_soc: Always call mtk_gmac0_rgmii_adjust() for mt7623
Modify mtk_gmac0_rgmii_adjust() so it can always be called.
mtk_gmac0_rgmii_adjust() sets-up the TRGMII clocks.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 17:04:30 -07:00
David S. Miller
b5cd55b334 mlx5-fixes-2020-07-28
Merge tag 'mlx5-fixes-2020-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes-2020-07-28

This series introduces some fixes to mlx5 driver.
v1->v2:
 - Drop the "Hold reference on mirred devices" patch, until Or's
   comments are addressed.
 - Improve "Modify uplink state" patch commit message per Or's request.

Please pull and let me know if there is any problem.

For -Stable:

For -stable v4.9
 ('net/mlx5e: Fix error path of device attach')

For -stable v4.15
 ('net/mlx5: Verify Hardware supports requested ptp function on a given
pin')

For -stable v5.3
 ('net/mlx5e: Modify uplink state on interface up/down')

For -stable v5.4
 ('net/mlx5e: Fix kernel crash when setting vf VLANID on a VF dev')
 ('net/mlx5: E-switch, Destroy TSAR when fail to enable the mode')

For -stable v5.5
 ('net/mlx5: E-switch, Destroy TSAR after reload interface')

For -stable v5.7
 ('net/mlx5: Fix a bug of using ptp channel index as pin index')
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 16:55:13 -07:00
Vadim Pasternak
f152b41ba6 mlxsw: core: Add support for temperature thresholds reading for QSFP-DD transceivers
Allow reading the temperature thresholds of QSFP-DD transceivers for
hardware monitoring and thermal control.

For this type, the thresholds are located in page 02h according to the
"Module and Lane Thresholds" description from Common Management
Interface Specification.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 13:28:02 -07:00
Vadim Pasternak
6af496adcb mlxsw: core: Add ethtool support for QSFP-DD transceivers
The Quad Small Form Factor Pluggable Double Density (QSFP-DD) hardware
specification defines a form factor that supports up to 400 Gbps in
aggregate over an 8x50-Gbps electrical interface. The QSFP-DD supports
both optical and copper interfaces.

Implementation is based on Common Management Interface Specification;
Rev 4.0 May 8, 2019. Table 8-2 "Identifier and Status Summary (Lower
Page)" from this spec defines "Id and Status" fields located at offsets
00h - 02h. Bit 2 at offset 02h ("Flat_mem") specifies QSFP EEPROM memory
mode, which could be "upper memory flat" or "paged". Flat memory mode is
coded "1", and indicates that only page 00h is implemented in EEPROM.
Paged memory is coded "0" and indicates that pages 00h, 01h, 02h, 10h
and 11h are implemented. Pages 10h and 11h are currently not supported
by the driver.

"Flat" memory mode is used for the passive copper transceivers. For this
type only page 00h (256 bytes) is available. "Paged" memory is used for
the optical transceivers. For this type pages 00h (256 bytes), 01h (128
bytes) and 02h (128 bytes) are available. Upper page 01h contains static
advertising field, while upper page 02h contains the module-defined
thresholds and lane-specific monitors.

Extend enumerator 'mlxsw_reg_mcia_eeprom_module_info_id' with additional
field 'MLXSW_REG_MCIA_EEPROM_MODULE_INFO_TYPE_ID'. This field is used to
indicate for QSFP-DD transceiver type which memory mode is to be used.

Expose a 256-byte buffer for QSFP-DD passive copper transceivers and a
512-byte buffer for optical ones.
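
The memory-mode decision described above can be sketched as follows. The
offsets, the bit position, and the page sizes come from the text; the
function and constant names are hypothetical and this is not the mlxsw
implementation.

```python
FLAT_MEM_OFFSET = 0x02    # "Id and Status" byte holding the Flat_mem bit
FLAT_MEM_BIT = 1 << 2     # bit 2: 1 = flat memory, 0 = paged memory
PAGE_00H_LEN = 256        # page 00h
UPPER_PAGE_LEN = 128      # pages 01h and 02h, 128 bytes each


def qsfp_dd_eeprom_len(lower_page):
    """Return the EEPROM length to expose for a QSFP-DD module.

    lower_page: at least the first three bytes of the lower page.
    """
    flat = bool(lower_page[FLAT_MEM_OFFSET] & FLAT_MEM_BIT)
    if flat:
        # Passive copper: only page 00h is implemented.
        return PAGE_00H_LEN
    # Optical: page 00h plus upper pages 01h and 02h.
    return PAGE_00H_LEN + 2 * UPPER_PAGE_LEN
```

A module whose byte at offset 02h has bit 2 set is reported with 256
bytes, and one with the bit clear with 512 bytes. Pages 10h and 11h are
left out, matching the note that they are not currently supported.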

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 13:28:02 -07:00
Alaa Hleihel
350a63249d net/mlx5e: Fix kernel crash when setting vf VLANID on a VF dev
After the cited commit, function 'mlx5_eswitch_set_vport_vlan' started
to acquire esw->state_lock.
However, esw is not defined for VF devices, hence attempting to set vf
VLANID on a VF dev will cause a kernel panic.

Fix it by moving up the (redundant) esw validation from function
'__mlx5_eswitch_set_vport_vlan' since the rest of the callers now have
and use a valid esw.

For example with vf device eth4:
 # ip link set dev eth4 vf 0 vlan 0

Trace of the panic:
 [  411.409842] BUG: unable to handle page fault for address: 00000000000011b8
 [  411.449745] #PF: supervisor read access in kernel mode
 [  411.452348] #PF: error_code(0x0000) - not-present page
 [  411.454938] PGD 80000004189c9067 P4D 80000004189c9067 PUD 41899a067 PMD 0
 [  411.458382] Oops: 0000 [#1] SMP PTI
 [  411.460268] CPU: 4 PID: 5711 Comm: ip Not tainted 5.8.0-rc4_for_upstream_min_debug_2020_07_08_22_04 #1
 [  411.462447] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 [  411.464158] RIP: 0010:__mutex_lock+0x4e/0x940
 [  411.464928] Code: fd 41 54 49 89 f4 41 52 53 89 d3 48 83 ec 70 44 8b 1d ee 03 b0 01 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 45 85 db 75 0a <48> 3b 7f 60 0f 85 7e 05 00 00 49 8d 45 68 41 56 41 b8 01 00 00 00
 [  411.467678] RSP: 0018:ffff88841fcd74b0 EFLAGS: 00010246
 [  411.468562] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
 [  411.469715] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000001158
 [  411.470812] RBP: ffff88841fcd7550 R08: ffffffffa00fa1ce R09: 0000000000000000
 [  411.471835] R10: ffff88841fcd7570 R11: 0000000000000000 R12: 0000000000000002
 [  411.472862] R13: 0000000000001158 R14: ffffffffa00fa1ce R15: 0000000000000000
 [  411.474004] FS:  00007faee7ca6b80(0000) GS:ffff88846fc00000(0000) knlGS:0000000000000000
 [  411.475237] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [  411.476129] CR2: 00000000000011b8 CR3: 000000041909c006 CR4: 0000000000360ea0
 [  411.477260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [  411.478340] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [  411.479332] Call Trace:
 [  411.479760]  ? __nla_validate_parse.part.6+0x57/0x8f0
 [  411.482825]  ? mlx5_eswitch_set_vport_vlan+0x3e/0xa0 [mlx5_core]
 [  411.483804]  mlx5_eswitch_set_vport_vlan+0x3e/0xa0 [mlx5_core]
 [  411.484733]  mlx5e_set_vf_vlan+0x41/0x50 [mlx5_core]
 [  411.485545]  do_setlink+0x613/0x1000
 [  411.486165]  __rtnl_newlink+0x53d/0x8c0
 [  411.486791]  ? mark_held_locks+0x49/0x70
 [  411.487429]  ? __lock_acquire+0x8fe/0x1eb0
 [  411.488085]  ? rcu_read_lock_sched_held+0x52/0x60
 [  411.488998]  ? kmem_cache_alloc_trace+0x16d/0x2d0
 [  411.489759]  rtnl_newlink+0x47/0x70
 [  411.490357]  rtnetlink_rcv_msg+0x24e/0x450
 [  411.490978]  ? netlink_deliver_tap+0x92/0x3d0
 [  411.491631]  ? validate_linkmsg+0x330/0x330
 [  411.492262]  netlink_rcv_skb+0x47/0x110
 [  411.492852]  netlink_unicast+0x1ac/0x270
 [  411.493551]  netlink_sendmsg+0x336/0x450
 [  411.494209]  sock_sendmsg+0x30/0x40
 [  411.494779]  ____sys_sendmsg+0x1dd/0x1f0
 [  411.495378]  ? copy_msghdr_from_user+0x5c/0x90
 [  411.496082]  ___sys_sendmsg+0x87/0xd0
 [  411.496683]  ? lock_acquire+0xb9/0x3a0
 [  411.497322]  ? lru_cache_add+0x5/0x170
 [  411.497944]  ? find_held_lock+0x2d/0x90
 [  411.498568]  ? handle_mm_fault+0xe46/0x18c0
 [  411.499205]  ? __sys_sendmsg+0x51/0x90
 [  411.499784]  __sys_sendmsg+0x51/0x90
 [  411.500341]  do_syscall_64+0x59/0x2e0
 [  411.500938]  ? asm_exc_page_fault+0x8/0x30
 [  411.501609]  ? rcu_read_lock_sched_held+0x52/0x60
 [  411.502350]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 [  411.503093] RIP: 0033:0x7faee73b85a7
 [  411.503654] Code: Bad RIP value.

Fixes: 0e18134f4f ("net/mlx5e: Eswitch, use state_lock to synchronize vlan change")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:53 -07:00
Ron Diskin
7d0314b11c net/mlx5e: Modify uplink state on interface up/down
When setting the PF interface up/down, notify the firmware to update
uplink state via MODIFY_VPORT_STATE, when E-Switch is enabled.

This behavior will prevent sending traffic out on uplink port when PF is
down, such as sending traffic from a VF interface which is still up.
Currently when calling mlx5e_open/close(), the driver only sends PAOS
command to notify the firmware to set the physical port state to
up/down, however, it is not sufficient. When VF is in "auto" state, it
follows the uplink state, which was not updated on mlx5e_open/close()
before this patch.

When switchdev mode is enabled and uplink representor is first enabled,
set the uplink port state value back to its FW default "AUTO".

Fixes: 63bfd399de ("net/mlx5e: Send PAOS command on interface up/down")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:51 -07:00
Eran Ben Elisha
ed56d749c3 net/mlx5: Query PPS pin operational status before registering it
In a special configuration, a ConnectX6-Dx pin pps-out might be activated
when driver is loaded. Fix the driver to always read the operational pin
mode when registering it, and advertise it accordingly.

Fixes: ee7f12205a ("net/mlx5e: Implement 1PPS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:48 -07:00
Raed Salem
21083309ca net/mlx5e: Fix slab-out-of-bounds in mlx5e_rep_is_lag_netdev
mlx5e_rep_is_lag_netdev is used as the first check in the netdev events
handler for bond devices of non-uplink representors. This handler can get
any netdevice under the same network namespace as the mlx5e netdevice. The
current code treats the netdev as an mlx5e netdev and only verifies this
later, which causes the following KASAN trace:
[15402.744990] ==================================================================
[15402.746942] BUG: KASAN: slab-out-of-bounds in mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core]
[15402.749009] Read of size 8 at addr ffff880391f3f6b0 by task ovs-vswitchd/5347

[15402.752065] CPU: 7 PID: 5347 Comm: ovs-vswitchd Kdump: loaded Tainted: G    B      O     --------- -t - 4.18.0-g3dcc204d291d-dirty #1
[15402.755349] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[15402.757600] Call Trace:
[15402.758968]  dump_stack+0x71/0xab
[15402.760427]  print_address_description+0x6a/0x270
[15402.761969]  kasan_report+0x179/0x2d0
[15402.763445]  ? mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core]
[15402.765121]  mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core]
[15402.766782]  mlx5e_rep_esw_bond_netevent+0x129/0x620 [mlx5_core]

Fix by deferring the violating access until after the netdev verification check.

Fixes: 7e51891a23 ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:45 -07:00
Eran Ben Elisha
071995c877 net/mlx5: Verify Hardware supports requested ptp function on a given pin
Fix a bug where driver did not verify Hardware pin capabilities for
PTP functions.

Fixes: ee7f12205a ("net/mlx5e: Implement 1PPS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:43 -07:00
Eran Ben Elisha
88c8cf92db net/mlx5: Fix a bug of using ptp channel index as pin index
In the PTP mlx5_ptp_enable(on=0) flow, the driver mistakenly used the
channel index as the pin index.

After the ptp patch marked in the Fixes tag was introduced, the driver can
freely call ptp_find_pin() as part of the .enable() callback.

Fix the driver's mlx5_ptp_enable(on=0) flow to always use ptp_find_pin(), so
that it uses the correct pin index.

In addition, when initializing the pins, always set the channel to zero. As
all pins can be attached to all channels, let ptp_set_pinfunc() move
them between the channels.

For stable branches, this fix should be applied only on kernels that include
both patches in the Fixes tags. Otherwise, mlx5_ptp_enable(on=0) will be stuck
on pincfg_mux.

Fixes: 62582a7ee7 ("ptp: Avoid deadlocks in the programmable pin code.")
Fixes: ee7f12205a ("net/mlx5e: Implement 1PPS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:40 -07:00
Maor Dickman
0e2e7aa57b net/mlx5e: Fix missing cleanup of ethtool steering during rep rx cleanup
The cited commit added initialization of ethtool steering during
representor rx initialization without cleaning it up in representor
rx cleanup. This may cause stale ethtool flows to remain after
moving back from switchdev mode to legacy mode.

Fix by calling ethtool steering cleanup during rep rx cleanup.

Fixes: 6783e8b29f ("net/mlx5e: Init ethtool steering for representors")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:37 -07:00
Aya Levin
5cd39b6e9a net/mlx5e: Fix error path of device attach
On failure to attach the netdev, fix the rollback by re-setting the
device's state back to MLX5E_STATE_DESTROYING.

Failing to attach doesn't stop statistics polling via .ndo_get_stats64.
In this case, although the device is not attached, it falsely continues
to query the firmware for counters. Setting the device's state back to
MLX5E_STATE_DESTROYING prevents the firmware counters query.

Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:35 -07:00
Maor Gottlieb
59f8f7c84c net/mlx5: Fix forward to next namespace
The steering tree is as follows (nic RX as example):
                   ---------
                   |root_ns|
                   ---------
                        |
        --------------------------------
        |               |              |
   ----------      ----------      ---------
   |p(prio)0|      |   p1   |      |   pn  |
   ----------      ----------      ---------
        |               |
 ----------------  ---------------
 |ns(e.g bypass)|  |ns(e.g. lag) |
 ----------------  ---------------
  |     |    |
----  ----  ----
|p0|  |p1|  |pn|
----  ----  ----
 |
----
|FT|
----

find_next_chained_ft(prio) returns the first flow table in the next
priority. If prio is the parent of a flow table, it returns the first
flow table in the next priority in the same namespace; else, if prio
is the parent of a namespace, it should return the first flow table
in the next namespace. Currently, if the user requests to forward to
the next namespace, the code calls find_next_chained_ft() with the prio
of the next namespace and not the prio of the namespace itself.

Fixes: 9254f8ed15 ("net/mlx5: Add support in forward to namespace")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:32 -07:00
Parav Pandit
0c2600c619 net/mlx5: E-switch, Destroy TSAR after reload interface
When eswitch offloads is enabled, TSAR is created before reloading
the interfaces.
However when eswitch offloads mode is disabled, TSAR is disabled before
reloading the interfaces.

To keep the eswitch enable/disable sequences mirrored, destroy TSAR
after reloading the interfaces.

Fixes: 1bd27b11c1 ("net/mlx5: Introduce E-switch QoS management")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:30 -07:00
Parav Pandit
2b8e9c7c3f net/mlx5: E-switch, Destroy TSAR when fail to enable the mode
When either esw_legacy_enable() or esw_offloads_enable() fails, the
code fails to destroy the created TSAR.

Hence, add the missing call to destroy the TSAR.

Fixes: 610090ebce ("net/mlx5: E-switch, Initialize TSAR Qos hardware block before its user vports")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 12:55:27 -07:00
Guojia Liao
b7b5d25bdd net: hns3: fix for VLAN config when reset failed
When the device is resetting or the reset has failed, the firmware is
unable to handle the mailbox. VLAN should not be configured in this case.

Fixes: fe4144d47e ("net: hns3: sync VLAN filter entries when kill VLAN ID failed")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 12:54:48 -07:00
Guojia Liao
efe3fa45f7 net: hns3: fix aRFS FD rules leftover after add a user FD rule
When the user has created an FD rule, all the aRFS rules should be cleared.
The HNS3 process flow is as below:
1. get the spin lock of fd_rule_list
2. clear all aRFS rules
3. release the lock
4. get the spin lock of fd_rule_list
5. create a rule
6. release the lock

There is a short period of time between step 3 and step 4, during which
new aRFS FD rules may be created if the driver is receiving packets.
So refactor the fd_rule_lock to fix it.
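
The race window above closes once clearing and creating happen inside a
single critical section. The following is only a userspace sketch of that
idea, not the hns3 code: struct fd_table, add_arfs_rule() and
add_user_rule() are hypothetical names, a C11 atomic_flag stands in for the
kernel spin lock, and the sketch assumes that aRFS rules are refused once a
user rule exists.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RULES 16

enum rule_type { RULE_ARFS, RULE_USER };

/* Hypothetical stand-in for the driver's rule list and its spin lock. */
struct fd_table {
	atomic_flag lock;
	enum rule_type rules[MAX_RULES];
	size_t count;
	bool user_rule_present;
};

static void fd_table_init(struct fd_table *t)
{
	atomic_flag_clear(&t->lock);
	t->count = 0;
	t->user_rule_present = false;
}

static void table_lock(struct fd_table *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void table_unlock(struct fd_table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* RX path (assumption): aRFS rules are accepted only while no user rule exists. */
static bool add_arfs_rule(struct fd_table *t)
{
	bool ok = false;

	table_lock(t);
	if (!t->user_rule_present && t->count < MAX_RULES) {
		t->rules[t->count++] = RULE_ARFS;
		ok = true;
	}
	table_unlock(t);
	return ok;
}

/* Control path: clear the aRFS rules and add the user rule under ONE lock hold. */
static bool add_user_rule(struct fd_table *t)
{
	size_t kept = 0;
	bool ok = false;

	table_lock(t);
	for (size_t i = 0; i < t->count; i++)
		if (t->rules[i] != RULE_ARFS)
			t->rules[kept++] = t->rules[i];
	t->count = kept;
	if (t->count < MAX_RULES) {
		t->rules[t->count++] = RULE_USER;
		t->user_rule_present = true;
		ok = true;
	}
	table_unlock(t);
	return ok;
}
```

Because the lock is never dropped between the clear and the insert, the RX
path cannot slip a new aRFS rule in between the two steps.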

Fixes: 4412288757 ("net: hns3: refine the flow director handle")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 12:54:48 -07:00
Jian Shen
a6f7bfdc78 net: hns3: add reset check for VF updating port based VLAN
Currently hclgevf_update_port_base_vlan_info() may be called when the
VF is resetting, which may cause hns3_nic_net_open() to be called
twice unexpectedly.

So fix it by adding a reset check for it, and extend the critical
region for rtnl_lock in hclgevf_update_port_base_vlan_info().

Fixes: 92f11ea177 ("net: hns3: fix set port based VLAN issue for VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 12:54:48 -07:00
Yonglong Liu
a7e90ee596 net: hns3: fix a TX timeout issue
When the queue depth and queue parameters are modified, there is
a low probability that TX timeout occurs. The two operations cause
the link to be down or up when the watchdog is still working. All
queues are stopped when the link is down. After the carrier is on,
all queues are woken up. If the watchdog detects the link between
the carrier on and wakeup queues, a false TX timeout occurs.

So fix this issue by modifying the sequence of carrier on and queue
wakeup, which is symmetrical to the link down action.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 12:54:48 -07:00
Yunsheng Lin
cfdaeba5dd net: hns3: fix desc filling bug when skb is expanded or lineared
The linear and frag data parts may be changed when the skb is expanded
or linearized in skb_cow_head() or skb_checksum_help(), which are called
by hns3_fill_skb_desc(), so the linear length returned by skb_headlen()
before calling hns3_fill_skb_desc() is unreliable.

Move hns3_fill_skb_desc() before the call to skb_headlen() to fix
this bug.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28 12:54:48 -07:00
Julia Lawall
22f9d2f4ee net/mlx5: drop unnecessary list_empty
list_for_each_entry is able to handle an empty list.
The only effect of avoiding the loop is not initializing the
index variable.
Drop list_empty tests in cases where these variables are not
used.

Note that list_for_each_entry is defined in terms of list_first_entry,
which indicates that it should not be used on an empty list.  But in
list_for_each_entry, the element obtained by list_first_entry is not
really accessed, only the address of its list_head field is compared
to the address of the list head, so the list_first_entry is safe.
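
The argument above can be checked with a small userspace model. This is a
simplified sketch rather than the kernel's <linux/list.h>: the macro here
takes the entry type explicitly instead of using typeof, and struct item
and sum_items() are hypothetical. As in the kernel, the pointer computed
for an empty list is never dereferenced; only the address of its list_head
member is compared against the head.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified iterator: the entry type is passed explicitly. */
#define list_for_each_entry(pos, head, type, member)			\
	for (pos = container_of((head)->next, type, member);		\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, type, member))

struct item {
	int value;
	struct list_head node;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* No list_empty() guard: on an empty list the loop body runs zero times. */
static int sum_items(struct list_head *head)
{
	struct item *it;
	int sum = 0;

	list_for_each_entry(it, head, struct item, node)
		sum += it->value;
	return sum;
}
```

On an empty list head->next points back at head, so the loop condition is
false on the first check and sum_items() returns 0 without touching any
entry.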

The semantic patch that makes this change is as follows (with another
variant for the no brace case): (http://coccinelle.lip6.fr/)

<smpl>
@@
expression x,e;
iterator name list_for_each_entry;
statement S;
identifier i;
@@

-if (!(list_empty(x))) {
   list_for_each_entry(i,x,...) S
- }
 ... when != i
? i = e
</smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:57 -07:00
Gustavo A. R. Silva
c8b838d108 net/mlx5: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings where applicable.
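
For readers unfamiliar with the pseudo-keyword, here is a userspace sketch
of how such a macro can be defined and used. It assumes a compiler that
provides __has_attribute (recent GCC or Clang); header_bytes() and its
constants are hypothetical and only serve to show the intentional
fall-through.

```c
/* Use the statement attribute when the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)  /* fallthrough */
#endif

/* Hypothetical helper: each level adds its own bytes plus all lower levels. */
static int header_bytes(int level)
{
	int bytes = 0;

	switch (level) {
	case 3:
		bytes += 20;
		fallthrough;
	case 2:
		bytes += 8;
		fallthrough;
	case 1:
		bytes += 14;
		break;
	default:
		break;
	}
	return bytes;
}
```

Unlike a comment, the attribute is visible to the compiler, so
-Wimplicit-fallthrough can tell intended fall-through apart from a missing
break.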

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:55 -07:00
Alex Vesker
ffdc8ec0b7 net/mlx5: DR, Reduce print level for matcher print
There is no need to print on each unsuccessful matcher
ip_version combination since it probably will happen when
trying to create all the possible combinations.
On a real failure we have a print in the calling function.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:52 -07:00
Aya Levin
17347d5430 net/mlx5e: Add support for PCI relaxed ordering
The concept of Relaxed Ordering in the PCI Express environment allows
switches in the path between the Requester and Completer to reorder some
transactions just received before others that were previously enqueued.

In ETH driver, there is no question of write integrity since each memory
segment is written only once per cycle. In addition, the driver doesn't
access the memory shared with the hardware until the corresponding CQE
arrives indicating all PCI transactions are done.

Running a single TCP stream over ConnectX-4 LX with an ARM CPU on the
remote NUMA node shows a 300% improvement in bandwidth.

With relaxed ordering turned off: BW:10 [GB/s]
With relaxed ordering turned on: BW:40 [GB/s]

The driver turns relaxed ordering on according to the firmware
capabilities and the return value of pcie_relaxed_ordering_enabled().

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:49 -07:00
Tariq Toukan
5d0b847694 net/mlx5e: Use indirect call wrappers for RX post WQEs functions
Use the indirect call wrapper API macros for declaration and scope
of the RX post WQEs functions.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:47 -07:00
Tariq Toukan
b307f7f163 net/mlx5e: Move exposure of datapath function to txrx header
Move them from the generic header file "en.h", to the
datapath header file "txrx.h".

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:44 -07:00
Tariq Toukan
5adf4c475a net/mlx5e: RX, Re-work initializaiton of RX function pointers
Instead of exposing the RQ datapath handlers (from en_rx.c) so that
they are set in the control path (in en_main.c), wrap this logic
in a single function in en_rx.c and expose it alone.

Every profile will now have a pointer to the new mlx5e_rx_handlers
structure, instead of directly pointing to the previously-exposed
RQ handlers.

This significantly improves locality and modularity of the driver,
and allows many functions in en_rx.c to become static.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:41 -07:00
Parav Pandit
123f0f53dd net/mlx5e: Link non uplink representors to PCI device
Currently PF and VF representors are exposed as virtual devices.
They are not linked to their parent PCI device the way the uplink
representor is linked.
Due to this, PF and VF representors cannot benefit from the
systemd defined naming scheme. This requires special handling
by the users.

Hence, link the PF and VF representors to their parent PCI device
similar to existing uplink representor netdevice.

Example:
udevadm output before linking to PCI device:
$ udevadm test-builtin net_id  /sys/class/net/eth6
Load module index
Network interface NamePolicy= disabled on kernel command line, ignoring.
Parsed configuration file /usr/lib/systemd/network/99-default.link
Created link configuration context.
Using default interface naming scheme 'v243'.
ID_NET_NAMING_SCHEME=v243
Unload module index
Unloaded link configuration context.

udevadm output after linking to PCI device:
$ udevadm test-builtin net_id /sys/class/net/eth6
Load module index
Network interface NamePolicy= disabled on kernel command line, ignoring.
Parsed configuration file /usr/lib/systemd/network/99-default.link
Created link configuration context.
Using default interface naming scheme 'v243'.
ID_NET_NAMING_SCHEME=v243
ID_NET_NAME_PATH=enp0s8f0npf0vf0
Unload module index
Unloaded link configuration context.

In the past there was some concern, raised in thread [1], over seeing
10,000 lines of output. It is not applicable here, as ndo ops for VF
handling are not exposed for all the 100 representors of mlx5 devices.

Additionally, alternative device naming [2], which overcomes the short
device name limit, is also part of the latest systemd release, v245.

[1] https://marc.info/?l=linux-netdev&m=152657949117904&w=2
[2] https://lwn.net/Articles/814068/

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:39 -07:00
Parav Pandit
8d6bd3c339 net/mlx5: E-switch, Use eswitch total_vports
Currently the steering table and rx group initialization helper
routines work on the total_vports passed as an input parameter.

Both eswitch helpers work on the mlx5_eswitch and thereby have access
to esw->total_vports. Hence use it directly instead of passing it
via function input arguments.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:36 -07:00
Parav Pandit
0da3c12dd6 net/mlx5: E-switch, Reuse total_vports and avoid duplicate nvports
Total e-switch vports are already stored in mlx5_eswitch total_vports.
Avoid copy of it in nvports and reuse existing total_vports calculation.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:34 -07:00
Parav Pandit
8b95bda47c net/mlx5: E-switch, Consider maximum vf vports for steering init
When eswitch is enabled, VFs might not be enabled. Hence, consider
maximum number of VFs.
This further closes the gap between handling VF vports between ECPF and
PF.

Fixes: ea2128fd63 ("net/mlx5: E-switch, Reduce dependency on num_vfs during mode set")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:31 -07:00
Avihu Hagag
c1a0969ee8 net/mlx5: Add function ID to reclaim pages debug log
Add function ID to reclaim pages debug log for better user visibility.

Signed-off-by: Avihu Hagag <avihuh@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:29 -07:00
Eran Ben Elisha
d6945242f4 net/mlx5: Hold pages RB tree per VF
On each page request event, FW requests to allocate or release pages for a
single function. The driver maintains a FW pages object per function, so
there is no need to hold one global page data-base. Instead, have a page
data-base per function, which will improve the performance of the release
flow in all cases, especially for "release all pages".

As the range of function IDs is large and not sequential, use xarray to
store a per function ID page data-base, where the function ID is the key.

Upon first allocation of a page to a function ID, create the page
data-base per function. This data-base will be released only at pagealloc
mechanism cleanup.

NIC: ConnectX-4 Lx
CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
Test case: 32 VFs, measure release pages on one VF as part of FLR
Before: 0.021 Sec
After:  0.014 Sec

The improvement depends on the number of VFs and their memory
utilization. The time measurements above were taken on an idle system.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28 02:37:26 -07:00
Gustavo A. R. Silva
5e619d73e6 net/mlx4: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1].

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 13:14:10 -07:00
David S. Miller
a02d26fe48 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2020-07-27

This series contains updates to igc driver only.

Sasha cleans up double definitions, unneeded and non applicable
registers, and removes unused fields in structs. Ensures the Receive
Descriptor Minimum Threshold Count is cleared and fixes a static checker
error.

v2: Remove fields from hw_stats in patches that removed their uses.
Reworded patch descriptions for patches 1, 2, and 4.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 13:11:57 -07:00
Edward Cree
1c74884387 sfc_ef100: implement ndo_get_phys_port_{id,name}
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:56 -07:00
Edward Cree
29ec1b27e7 sfc_ef100: read device MAC address at probe time
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:56 -07:00
Edward Cree
99a23c1168 sfc_ef100: probe the PHY and configure the MAC
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:56 -07:00
Edward Cree
4e5675bbab sfc_ef100: actually perform resets
In ef100_reset(), make the MCDI call to do the reset.
Also, do a reset at start-of-day during probe, to put the function in
 a clean state.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:56 -07:00
Edward Cree
d802b0ae65 sfc_ef100: extend ef100_check_caps to cover datapath_caps3
MC_CMD_GET_CAPABILITIES now has a third word of flags; extend the
 efx_has_cap() machinery to cover it.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
f65731207d sfc_ef100: read datapath caps, implement check_caps
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
5e4ef67346 sfc_ef100: process events for MCDI completions
Currently RX and TX-completion events are unhandled, as neither the RX
 nor the TX path has been implemented yet.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
965b549f3c sfc_ef100: implement ndo_open/close and EVQ probing
Channels are probed, but actual event handling is still stubbed out.

Stub implementation of check_caps is needed because ptp.c will call into
 it from efx_ptp_use_mac_tx_timestamps() to decide if it wants TXQs.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
2200e6d92e sfc_ef100: implement MCDI transport
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
35a36af88f sfc_ef100: don't call efx_reset_down()/up() on EF100
We handle everything ourselves in ef100_reset(), rather than relying on
 the generic down/up routines.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
aa86a75fed sfc_ef100: PHY probe stub
We can't actually do the MCDI to probe it fully until we have working
 MCDI, which comes later, but we need efx->phy_data to be allocated so
 that when we get MCDI events the link-state change handler doesn't
 NULL-dereference.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
c027f2a72a sfc_ef100: reset-handling stub
We don't actually do the efx_mcdi_reset() because we don't have MCDI yet.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
51b35a454e sfc: skeleton EF100 PF driver
No TX or RX path, no MCDI, not even an ifup/down handler.
Besides stubs, the bulk of the patch deals with reading the Xilinx
 extended PCIe capability, which tells us where to find our BAR.

Though in the same module, EF100 has its own struct pci_driver,
 which is named sfc_ef100.

A small number of additional nic_type methods are added; those in the
 TX (tx_enqueue) and RX (rx_packet) paths are called through indirect
 call wrappers to minimise the performance impact.
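
As background, an indirect call wrapper compares the function pointer with
the most likely target and, on a match, makes a direct call instead. The
sketch below is a userspace approximation, not the kernel header: it relies
on the GCC/Clang __builtin_expect builtin, and struct nic_type_model,
model_rx_packet(), other_rx_packet() and deliver() are hypothetical names.

```c
#define likely(x) __builtin_expect(!!(x), 1)

/* Direct call when the pointer matches the expected target, else indirect. */
#define INDIRECT_CALL_1(f, f1, ...) \
	(likely((f) == (f1)) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

/* Hypothetical per-NIC method table with one datapath hook. */
struct nic_type_model {
	int (*rx_packet)(int len);
};

/* Hypothetical handlers: one strips a 4-byte trailer, the other does not. */
static int model_rx_packet(int len)
{
	return len - 4;
}

static int other_rx_packet(int len)
{
	return len;
}

static int deliver(const struct nic_type_model *type, int len)
{
	/* model_rx_packet is the expected (fast path) target. */
	return INDIRECT_CALL_1(type->rx_packet, model_rx_packet, len);
}
```

Either way the same handler runs; the wrapper only changes how the call is
made on the expected path.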

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
61060c5dc5 sfc_ef100: register accesses on EF100
EF100 adds a few new valid addresses for efx_writed_page(), as well as
 a Function Control Window in the BAR whose location is variable.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
adf72ee3f7 sfc_ef100: add EF100 register definitions
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Edward Cree
0ccf267e34 sfc: remove efx_ethtool_nway_reset()
An MDIO-based n-way restart does not make sense for any of the NICs
 supported by this driver, nor for the coming EF100.
Unlike on Falcon (which was already split off into a separate driver),
 the PHY on all of Siena, EF10 and EF100 is managed by MC firmware.
While Siena can talk to the PHY over MDIO, doing so for anything other
 than debugging purposes (mdio_mii_ioctl) is likely to confuse the
 firmware.
(According to the SFC firmware team, this support was originally added
 to the Siena driver early in the development of that product, before
 it was decided to have firmware manage the PHY.)

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:26:55 -07:00
Alexander Lobakin
1775da47c3 qed: fix the allocation of the chains with an external PBL
Dan reports static checker warning:

"The patch 9b6ee3cf95: "qed: sanitize PBL chains allocation" from Jul
23, 2020, leads to the following static checker warning:

	drivers/net/ethernet/qlogic/qed/qed_chain.c:299 qed_chain_alloc_pbl()
	error: uninitialized symbol 'pbl_virt'.

drivers/net/ethernet/qlogic/qed/qed_chain.c
   249  static int qed_chain_alloc_pbl(struct qed_dev *cdev, struct qed_chain *chain)
   250  {
   251          struct device *dev = &cdev->pdev->dev;
   252          struct addr_tbl_entry *addr_tbl;
   253          dma_addr_t phys, pbl_phys;
   254          __le64 *pbl_virt;
                ^^^^^^^^^^^^^^^^
[...]
   271          if (chain->b_external_pbl)
   272                  goto alloc_pages;
                        ^^^^^^^^^^^^^^^^ uninitialized
[...]
   298                  /* Fill the PBL table with the physical address of the page */
   299                  pbl_virt[i] = cpu_to_le64(phys);
                        ^^^^^^^^^^^
[...]
"

This issue was introduced with commit c3a321b06a ("qed: simplify
initialization of the chains with an external PBL"), when
chain->pbl_sp.table_virt initialization was moved up to
qed_chain_init_params().
Fix it by initializing pbl_virt with an already filled chain struct field.

Fixes: c3a321b06a ("qed: simplify initialization of the chains with an external PBL")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:17:14 -07:00
laurent brando
5fd82200d8 net: mscc: ocelot: fix hardware timestamp dequeue logic
The next hw timestamp should be snapshotted to the read registers
only once the current timestamp has been read.
If none of the pending skbs matches the current HW timestamp,
just gracefully flush the available timestamp by reading it.

Signed-off-by: laurent brando <laurent.brando@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 12:04:40 -07:00
Vasundhara Volam
b5d600b027 bnxt_en: Add support for 'ethtool -d'
Add support to dump PXP registers and PCIe statistics.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
a0c30621c2 bnxt_en: Switch over to use the 64-bit software accumulated counters.
Now we can report all the full 64-bit CPU endian software accumulated
counters instead of the hw counters, some of which may be less than
64-bit wide.  Define the necessary macros to access the software
counters.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
fea6b33355 bnxt_en: Accumulate all counters.
Now that we have the infrastructure in place, add the new function
bnxt_accumulate_all_stats() to periodically accumulate and check for
counter rollover of all ring stats and port stats.

A chip bug was also discovered that could cause some ring counters to
become 0 during DMA.  Work around it by ignoring zeros on the affected
chips.

Some older firmware will reset port counters during ifdown.  We need
to check for that and free the accumulated port counters during ifdown
to prevent bogus counter overflow detection during ifup.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
531d1d269c bnxt_en: Retrieve hardware masks for port counters.
If supported by newer firmware, make the firmware call to query all
the port counter masks.  If not supported, assume 40-bit port
counter masks.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
d752d0536c bnxt_en: Retrieve hardware counter masks from firmware if available.
Newer firmware has a new call HWRM_FUNC_QSTATS_EXT to retrieve the
masks of all ring counters.  Make this call when supported to
initialize the hardware masks of all ring counters.  If the call
is not available, assume 48-bit ring counter masks on P5 chips.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
a37120b22e bnxt_en: Allocate additional memory for all statistics blocks.
Some of these DMAed hardware counters are not full 64-bit counters and
so we need to accumulate them as they overflow.  Allocate copies of these
DMA statistics memory blocks with the same size for accumulation.  The
hardware counter widths are also counter specific so we allocate
memory for masks that correspond to each counter.
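
To illustrate why both an accumulated copy and a per-counter mask are
needed, here is a minimal sketch of widening a narrow hardware counter into
a 64-bit software counter. It is not the bnxt_en implementation: struct
sw_counter and sw_counter_update() are hypothetical, and the sketch assumes
the hardware counter rolls over at most once between two samples.

```c
#include <stdint.h>

/* Hypothetical software view of one hardware counter. */
struct sw_counter {
	uint64_t total;		/* full 64-bit accumulated value */
	uint64_t last_hw;	/* previous raw sample, already masked */
};

/*
 * Fold a new raw sample into the 64-bit total. 'mask' has one bit set for
 * every valid bit of the hardware counter, e.g. (1ULL << 40) - 1 for a
 * 40-bit counter. The masked difference is the increment since the last
 * sample even when the hardware value has wrapped once.
 */
static void sw_counter_update(struct sw_counter *c, uint64_t hw, uint64_t mask)
{
	hw &= mask;
	c->total += (hw - c->last_hw) & mask;
	c->last_hw = hw;
}
```

For a 40-bit counter sampled at 2^40 - 2 and then at 3, the second update
adds 5, so the total reaches 2^40 + 3 even though the raw value went down.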

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
177a6cde47 bnxt_en: Refactor statistics code and structures.
The driver manages multiple statistics structures of different sizes.
They are all allocated, freed, and handled practically the same.  Define
a new bnxt_stats_mem structure and common allocation and free functions
for all statistics memory blocks.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
24c93443fe bnxt_en: Use macros to define port statistics size and offset.
The port statistics structures have hard coded padding and offset.
Define macros to make this look cleaner.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Michael Chan
bfc6e5fbcb bnxt_en: Update firmware interface to 1.10.1.54.
Main changes are 200G support and fixing the definitions of discard and
error counters to match the hardware definitions.

Because the HWRM_PORT_PHY_QCFG message size has now exceeded the max.
encapsulated response message size of 96 bytes from the PF to the VF,
we now need to cap this message to 96 bytes for forwarding.  The forwarded
response only needs to contain the basic link status and speed information
and can be capped without adding the new information.

v2: Fix bnxt_re compile error.

Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Vasundhara Volam
dfe64de974 bnxt_en: Remove PCIe non-counters from ethtool statistics
Remove PCIe non-counters display from ethtool statistics, as
they are not simple counters but a register dump.  The next few
patches will add logic to detect counter roll-over and it won't
work with these PCIe non-counters.

There will be a follow up patch to get PCIe information via
ethtool register dump.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:32 -07:00
Julia Lawall
d21a06d5d8 sfc: drop unnecessary list_empty
list_for_each_safe is able to handle an empty list.
The only effect of avoiding the loop is not initializing the
index variable.
Drop list_empty tests in cases where these variables are not
used.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

<smpl>
@@
expression x,e;
iterator name list_for_each_safe;
statement S;
identifier i,j;
@@

-if (!(list_empty(x))) {
   list_for_each_safe(i,j,x) S
- }
 ... when != i
     when != j
(
  i = e;
|
? j = e;
)
</smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 10:09:57 -07:00
Sasha Neftin
360d749e0c igc: Fix static checker warning
drivers/net/ethernet/intel/igc/igc_mac.c:424 igc_check_for_copper_link()
error: uninitialized symbol 'link'.
This patch fixes the warning by initializing the 'link' symbol.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 707abf0695 ("igc: Add initial LTR support")
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:32 -07:00
Sasha Neftin
db02bee2ec igc: Clean up the hw_stats structure
Remove ictxptc, ictxatc, cbtmpc, cbrdpc, cbrmpc and htcbdpc fields from
the hw_stats structure. According to the i225 device
specification, these fields are not in use.
This patch cleans up the driver code.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:32 -07:00
Sasha Neftin
4a9e9b8fee igc: Clean up the mac_info structure
The collision_delta, tx_packet_delta, txcw, adaptive_ifs and
has_fwsm fields are not in use.
This patch cleans up the driver code.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:32 -07:00
Sasha Neftin
643e5c2e8c igc: Remove ledctl_ fields from the mac_info structure
LED control is currently not implemented.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:31 -07:00
Sasha Neftin
94a5181f4b igc: Fix registers definition
IGC_ICTXPTC and IGC_ICTXATC are already defined elsewhere, remove this
double definition. Also, remove unneeded registers as they are not
applicable to i225 devices.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:31 -07:00
Sasha Neftin
ed6ab19adf igc: Remove unneeded ICTXQMTC register
The Tx Queue Min Threshold Count register is not applicable to the i225
device. This patch removes it.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:31 -07:00
Sasha Neftin
60f7bb8241 igc: Add Receive Descriptor Minimum Threshold Count to clear HW counters
The statistics of this register are being tracked, however, the register
was inadvertently missed when implementing igc_clear_hw_cntrs_base().
The register is clear on read, so add it to the function so that the
register is cleared when requested so the tracked count is accurate.
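
A rough sketch of the clear-on-read behaviour described above, using a
simulated register file. The register names and the sim_regs array are
hypothetical and exist only for this illustration; they are not the driver's
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices, not the driver's real offsets. */
enum {
	REG_RX_NO_BUF,
	REG_RX_DESC_MIN_THRESH,
	REG_COUNT
};

/* Simulated device register file. */
static uint32_t sim_regs[REG_COUNT];

/* Clear-on-read: a read returns the current count and resets it to zero. */
static uint32_t reg_read(int reg)
{
	uint32_t val;

	assert(reg >= 0 && reg < REG_COUNT);
	val = sim_regs[reg];
	sim_regs[reg] = 0;
	return val;
}

/* Clearing the counters means reading every clear-on-read register and
 * discarding the value. A register left out of this loop would keep its
 * stale count. */
static void clear_hw_counters(void)
{
	int reg;

	for (reg = 0; reg < REG_COUNT; reg++)
		(void)reg_read(reg);
}
```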

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:31 -07:00
Sasha Neftin
d9f0c8e457 igc: Remove unneeded variable
Though we are populating and tracking ictxqec, the value is not being used
for anything so remove it altogether and save the register read.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-27 08:49:31 -07:00
Andrii Nakryiko
e8407fdeb9 bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commands
Now that BPF program/link management is centralized in generic net_device
code, kernel code never queries program id from drivers, so
XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary.

This patch removes all the implementations of those commands in the kernel,
along with xdp_attachment_query().

This patch was compile-tested on allyesconfig.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200722064603.3350758-10-andriin@fb.com
2020-07-25 20:37:02 -07:00
David S. Miller
a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Wang Hai
9b964f1654 net: hix5hd2_gmac: Remove unneeded cast from memory allocation
Remove casting the values returned by memory allocation function.

Coccinelle emits WARNING:

./drivers/net/ethernet/hisilicon/hix5hd2_gmac.c:1027:9-23: WARNING:
 casting value returned by memory allocation function to (struct sg_desc *) is useless.

This issue was detected by using the Coccinelle software.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 17:28:51 -07:00
David S. Miller
aab99b62b4 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2020-07-23

This series contains updates to ice driver only.

Jake refactors ice_discover_caps() to reduce the number of AdminQ calls
made. Splits ice_parse_caps() to separate functions to update function
and device capabilities separately to allow for updating outside of
initialization.

Akeem adds power management support.

Paul G refactors FC and FEC code to aid in restoring of PHY settings
on media insertion. Implements lenient mode and link override support.
Adds link debug info and formats existing debug info to be more
readable. Adds support to check and report additional autoneg
capabilities. Implements the capability to detect media cage in order to
differentiate AUI types as Direct Attach or backplane.

Bruce implements Total Port Shutdown for devices that support it.

Lev renames the low_power_ctrl field to low_power_ctrl_an to be more
descriptive of the field.

Doug reports AOC types as media type fiber.

Paul S adds code to handle 1G SGMII PHY type.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 16:39:28 -07:00
Paul M Stillwell Jr
c2b352262a ice: add 1G SGMII PHY type
There isn't a case for 1G SGMII in ice_get_media_type() so add
the handling for it.

Also handle the special case where some direct attach
cables may report that they support 1G SGMII, but
that is erroneous since SGMII is supposed to be a
backplane media type (between a MAC and a PHY). If
the driver doesn't handle this special case then a
user could see the 'Port' in ethtool change from
'Direct attach Copper' to 'Backplane' when they have
forced the speed to 1G, but the cable hasn't changed.

Lastly, change ice_aq_get_phy_caps() to save the
module_type info if the function was called with
ICE_AQC_REPORT_TOPO_CAP. This call uses the media
information to populate the module_type. If no
media is present then the values in module_type
will be 0.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:36:14 -07:00
Doug Dziggel
c1eb3b6b68 ice: Report AOC PHY Types as Fiber
Report AOC types as fiber instead of unknown.

Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:34:43 -07:00
Paul Greenwalt
8ea1da593b ice: add AQC get link topology handle support
Add AQC get link topology handle support. This is needed to determine
Direct Attach (DA) or backplane media type for PHY types that support
either. Get link topology handle cage node type request can be used to
determine if a cage is present or not. If a cage is present for PHY
types that supports both DA and backplane media type, then the media
type is DA, else the media type is backplane.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:33:26 -07:00
Lev Faerman
bdeff9718a ice: Rename low_power_ctrl
Rename the low_power_ctrl field to low_power_ctrl_an to be properly
descriptive of it being an autoneg field.

Signed-off-by: Lev Faerman <lev.faerman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:31:30 -07:00
Paul Greenwalt
5ee30564c8 ice: update reporting of autoneg capabilities
Firmware now reports AN28, AN32, and AN73. Add a helper and check these new
values and report PHY autoneg capability.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:29:46 -07:00
Paul Greenwalt
55df52a0bc ice: add ice_aq_get_phy_caps() debug logs
Add debug logs for ice_aq_get_phy_caps(), and format
ice_aq_set_phy_cfg() and ice_aq_get_link_info() debug logs to make them
more readable.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:27:46 -07:00
Bruce Allan
b4e813dd04 ice: support Total Port Shutdown on devices that support it
When the Port Disable bit is set in the Link Default Override Mask TLV PFA
module in the NVM, Total Port Shutdown mode is supported and enabled.  In
this mode, the driver should act as if the link-down-on-close ethtool
private flag is always enabled and dis-allow any change to that flag.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:26:09 -07:00
Paul Greenwalt
ea78ce4dab ice: add link lenient and default override support
Adds functions to check for link override firmware support and get
the override settings for a port. The previously supported/default link
mode was strict mode.

In strict mode link is configured based on get PHY capabilities PHY types
with media.

Lenient mode is now the default link mode. In lenient mode the link is
configured based on get PHY capabilities PHY types without media. This
allows the user to configure link that the media does not report. Limit the
minimum supported link mode to 25G for devices that support 100G, and 1G
for devices that support less than 100G.
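
The minimum-speed rule above can be sketched as a small helper. The function
name and the use of plain Gb/s integers are assumptions made for this
example; the driver works with PHY type bitmaps rather than numbers.

```c
#include <assert.h>

/* Hypothetical helper: given the highest speed a device supports (Gb/s),
 * return the lowest link speed offered in lenient mode. */
static unsigned int lenient_min_speed_gbps(unsigned int max_speed_gbps)
{
	assert(max_speed_gbps > 0);

	/* 100G-capable devices bottom out at 25G, everything else at 1G. */
	return max_speed_gbps >= 100 ? 25 : 1;
}
```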

Default override is only supported in lenient mode. If default override
is supported and enabled, then default override values are used for
configuring speed and FEC. Default override provide persistent link
settings in the NVM.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Evan Swanson <evan.swanson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:22:31 -07:00
Paul Greenwalt
1a3571b593 ice: restore PHY settings on media insertion
After the transition from no media to media, FW will clear the
set-phy-cfg data set by the user. Save initial PHY settings and any
settings later requested by the user and use that data to restore PHY
settings on media insertion. Since PHY configuration is now being stored,
replace calls that were calling FW to get the configuration with the saved
copy.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:15:28 -07:00
Paul Greenwalt
61cf42e71a ice: move auto FEC checks into ice_cfg_phy_fec()
The call to ice_cfg_phy_fec() requires the caller to perform certain
actions before calling it. Instead of imposing these preconditions move
the operations into the function and perform them ourselves.

Also, fix some style issues in nearby touched code.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:05:49 -07:00
Paul Greenwalt
2ffb60856a ice: refactor FC functions
Create a helper function for configuring requested flow control so that it
can be utilized by other functions looking to configure flow control
settings. Utilize the existing helper ice_copy_phy_caps_to_cfg() to copy a
PHY capability to configuration instead of duplicating the code for it.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 15:03:58 -07:00
Akeem G Abodunrin
769c500dcc ice: Add advanced power mgmt for WoL
Add callbacks needed to support advanced power management for Wake on LAN.
Also make ice_pf_state_is_nominal function available for all configurations
not just CONFIG_PCI_IOV.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 14:59:20 -07:00
Jacob Keller
81aed6475d ice: split ice_discover_caps into two functions
Using the new ice_aq_list_caps and ice_parse_(dev|func)_caps functions,
replace ice_discover_caps with two functions that each take a pointer to
the dev_caps and func_caps structures respectively.

This makes the side effect of updating the hw->dev_caps and
hw->func_caps obvious from reading the implementation of the function.
Additionally, it opens the way for enabling reading of device
capabilities outside of the initialization flow. By passing in
a pointer, another caller will be able to read the capabilities without
modifying the HW capabilities structures.

As there are no other callers, it is safe to now remove
ice_aq_discover_caps and ice_parse_caps.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 14:48:41 -07:00
Jacob Keller
595b13e228 ice: split ice_parse_caps into separate functions
The ice_parse_caps function is used to convert the capability block data
coming from firmware into a structured format used by other parts of the
code.

The current implementation directly updates the hw->func_caps and
hw->dev_caps structures. It is directly called from within
ice_aq_discover_caps. This causes the discover_caps function to have the
side effect of modifying the HW capability structures, which is not
intuitive.

Split this function into ice_parse_dev_caps and ice_parse_func_caps.
These functions will take a pointer to the dev_caps and func_caps
respectively. Also create an ice_parse_common_caps for sharing the
capability logic that is common to device and function.

Doing so enables a future refactor to allow reading and parsing
capabilities into a local caps structure instead of modifying the
members of the HW structure directly.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 14:46:33 -07:00
Jacob Keller
1082b360e3 ice: refactor ice_discover_caps to avoid need to retry
The ice_discover_caps function is used to read the device and function
capabilities, updating the hardware capabilities structures with
relevant data.

The exact number of capabilities returned by the hardware is unknown
ahead of time. The AdminQ command will report the total number of
capabilities in the return buffer.

The current implementation involves requesting capabilities once,
reading the returned size, and then re-requesting with that size.

This isn't really necessary. The firmware interface has a maximum size
of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than
ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities.

Avoid the retry loop by simply allocating a buffer of size
ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The
extra allocation isn't a big deal, as it will be released after we
finish parsing the capabilities.
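
The sizing argument can be sketched as follows. The 4096-byte limit and the
element layout below are assumptions chosen for illustration (hypothetical
names AQ_MAX_BUF_LEN and struct caps_elem); the real values live in the
driver's AdminQ headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define AQ_MAX_BUF_LEN 4096	/* assumed maximum AdminQ buffer size */

/* Hypothetical capability element; stands in for the firmware format. */
struct caps_elem {
	uint16_t cap;
	uint8_t major_ver;
	uint8_t minor_ver;
	uint32_t number;
	uint32_t logical_id;
	uint32_t phys_id;
	uint64_t rsvd1;
	uint64_t rsvd2;
};

/* Upper bound on elements one response can carry: buffer / element size. */
static size_t max_caps_per_response(void)
{
	return AQ_MAX_BUF_LEN / sizeof(struct caps_elem);
}

/* Allocate the worst-case buffer once, so no size-probing retry is needed.
 * The caller frees it after parsing. */
static struct caps_elem *alloc_caps_buf(void)
{
	assert(max_caps_per_response() > 0);
	return calloc(1, AQ_MAX_BUF_LEN);
}
```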

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23 14:16:02 -07:00
Vishal Kulkarni
7235ffae3d cxgb4: add loopback ethtool self-test
In this test, a loopback packet is created and sent on the default queue.
The packet travels as far as the Multi Port Switch (MPS), just before
the MAC, and based on the specified channel number it either
goes out on the wire on one of the physical ports or is looped
back to the Rx path by the MPS. In this case, we're specifying the loopback
channel instead of a physical port, so the packet gets looped
back to the Rx path instead of getting transmitted on the wire.

v3:
- Modify commit message to include test details.
v2:
- Add only loopback self-test.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 11:59:26 -07:00
Miaohe Lin
8bf9d8eabb cxgb4: use eth_zero_addr() to clear mac address
Use eth_zero_addr() to clear the MAC address instead of memset().
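
For readers outside the kernel tree, a userspace stand-in shows what the
helper amounts to. The two functions below are re-implemented for this
sketch under the kernel's names; they are an approximation, not the
<linux/etherdevice.h> code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* octets in an Ethernet address */

/* Userspace stand-in: zero all six octets of a MAC address. */
static void eth_zero_addr(uint8_t *addr)
{
	assert(addr != NULL);
	memset(addr, 0x00, ETH_ALEN);
}

/* Userspace stand-in: return nonzero if every octet is zero. */
static int is_zero_ether_addr(const uint8_t *addr)
{
	uint8_t acc = 0;
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		acc |= addr[i];
	return acc == 0;
}
```

The named helper states the intent and fixes the length, so a call site
cannot pass the wrong size by accident.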

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 11:49:12 -07:00
Jakub Kicinski
205a55f4e6 sfc: convert to new udp_tunnel infrastructure
Check MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_TRUSTED, before setting
the info, which will hopefully protect us from -EPERM errors
the previous code was gracefully ignoring. Ed reports this
is not the 100% correct bit, but it's the best approximation
we have. Shared code reports the port information back to user
space, so we really want to know what was added and what failed.
Ignoring -EPERM is not an option.

The driver does not call udp_tunnel_get_rx_info(), so its own
management of table state is not really all that problematic,
we can leave it be. This allows the driver to continue with its
copious table syncing, and matching the ports to TX frames,
which it will reportedly do one day.

Leave the feature checking in the callbacks, as the device may
remove the capabilities on reset.

Inline the loop from __efx_ef10_udp_tnl_lookup_port() into
efx_ef10_udp_tnl_has_port(), since it's the only caller now.

With new infra this driver gains port replace - when space frees
up in a full table a new port will be selected for offload.
Plus efx will no longer sleep in an atomic context.

v2:
 - amend the commit message about TRUSTED not being 100%
 - add TUNNEL_ENCAP_UDP_PORT_ENTRY_INVALID to mark unused
   entries

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-By: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 11:36:06 -07:00
Navid Emamdoost
e6827d1abd cxgb4: add missing release on skb in uld_send()
In the implementation of uld_send(), the skb is consumed on all
execution paths except one. Release skb when returning NET_XMIT_DROP.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 20:04:17 -07:00
Alexander Lobakin
d1b25b79e1 qede: add .ndo_xdp_xmit() and XDP_REDIRECT support
Add XDP_REDIRECT case handling and the corresponding NDO to support
redirecting XDP frames. This also includes registering driver memory
model (currently order-0 page mode) in BPF subsystem.
The total number of XDP queues is usually 1:1 with Rx ones.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
4c2bacbea1 qede: refactor XDP Tx processing
Current XDP Tx logic is suboptimal and can't be reused for XDP_REDIRECT
path.
Make qede_xdp_{tx_int,xmit}() more universal and effective in general to
allow future expanding.

Misc: use unlikely() hints where appropriate and replace "fallthrough"
comments with pseudo-keywords.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
f285ad5726 qede: reformat net_device_ops declarations
Correct the indentation of net_device_ops declarations for fancier look.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
f35535f73c qede: reformat several structures in "qede.h"
Make the file more readable and easier for adding new fields.

Misc: use IFNAMSIZ and netdev_name() instead of sizeof_field()
and direct net_device::name dereferencing.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
155065866b qed: add support for different page sizes for chains
Extend current infrastructure to store chain page size in a struct
and use it in all functions instead of fixed QED_CHAIN_PAGE_SIZE.
Its value remains the default one, but can be overridden in
qed_chain_init_params before chain allocation.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
b6db3f71c9 qed: simplify chain allocation with init params struct
To simplify qed_chain_alloc() prototype and call sites, introduce struct
qed_chain_init_params to specify chain params, and pass a pointer to
filled struct to the actual qed_chain_alloc() instead of a long list
of separate arguments.
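
A minimal sketch of the parameter-struct pattern, with made-up field names.
struct chain_init_params and chain_alloc() here are illustrative stand-ins,
not the qed definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block replacing a long argument list. */
struct chain_init_params {
	int mode;
	int intended_use;
	size_t elem_size;
	uint32_t num_elems;
};

/* Hypothetical chain object that records what it was created with. */
struct chain {
	int mode;
	int intended_use;
	size_t elem_size;
	uint32_t num_elems;
};

/* One pointer instead of four scalars: call sites name the fields they
 * set, and adding a field does not change the prototype. */
static int chain_alloc(struct chain *chain,
		       const struct chain_init_params *params)
{
	assert(chain != NULL);
	if (!params || !params->elem_size || !params->num_elems)
		return -1;

	chain->mode = params->mode;
	chain->intended_use = params->intended_use;
	chain->elem_size = params->elem_size;
	chain->num_elems = params->num_elems;
	return 0;
}
```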

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
c3a321b06a qed: simplify initialization of the chains with an external PBL
Fill PBL table parameters for chains with external PBL data earlier on
qed_chain_init_params() rather than on allocation itself. This simplifies
allocation code and allows extending struct ext_pbl for other chain types.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
5e776d8016 qed: move chain initialization inlines next to allocation functions
qed_chain_init*() are used in one file/place on "cold" path only, so they
can be uninlined and moved next to the call sites.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
9b6ee3cf95 qed: sanitize PBL chains allocation
PBL chain elements are actually DMA addresses stored in __le64, but
currently their size is hardcoded to 8, and DMA addresses are assigned
via cast to variable-sized dma_addr_t without any bitwise conversions.
Change the type of pbl_virt array to match the actual one, add a new
field to store the size of allocated DMA memory and sanitize elements
assignment.

Misc: give more logical names to the members of qed_chain::pbl_sp embedded
struct.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
96ca4c50c7 qed: prevent possible double-frees of the chains
Zero-initialize chain on qed_chain_free(), so it couldn't be freed
twice and provoke undefined behaviour.
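
The idea can be sketched in userspace with malloc()/free(). The struct chain
below is a hypothetical two-field stand-in, not the qed structure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chain holding one allocation. */
struct chain {
	void *mem;
	size_t size;
};

static int chain_alloc(struct chain *chain, size_t size)
{
	assert(chain != NULL);
	chain->mem = malloc(size);
	if (!chain->mem)
		return -1;
	chain->size = size;
	return 0;
}

/* Free the memory, then zero the whole struct. A second call sees
 * mem == NULL, and free(NULL) is defined to do nothing. */
static void chain_free(struct chain *chain)
{
	assert(chain != NULL);
	free(chain->mem);
	memset(chain, 0, sizeof(*chain));
}
```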

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
a08c9b2c7c qed: move chain methods to a separate file
Move chain allocation/freeing functions to a new file to not mix it with
hardware-related code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Alexander Lobakin
bdaf98f6d5 qed: reformat Makefile
List one entry per line and sort them alphabetically to simplify the
addition of the new ones.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:19:03 -07:00
Egor Pomozov
901f3cc163 net: atlantic: fix PTP on AQC10X
This patch fixes PTP on AQC10X.
PTP support on AQC10X requires FW involvement and FW configures the
TPS data arb mode itself.
So we must make sure driver doesn't touch TPS data arb mode on AQC10x
if PTP is enabled. Otherwise, there are no timestamps even though
packets are flowing.

Fixes: 2deac71ac4 ("net: atlantic: QoS implementation: min_rate")
Signed-off-by: Egor Pomozov <epomozov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:15:07 -07:00
Gustavo A. R. Silva
f1fa27f590 net: qed_hsi.h: Avoid the use of one-element array
One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type '__le32 reserved1'[2], since it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86
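
As a sanity check on this kind of change, the sketch below compares a
struct before and after the replacement. Both structs are hypothetical
examples with uint32_t standing in for __le32; they are not taken from
qed_hsi.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before: placeholder declared as a one-element array. */
struct hdr_before {
	uint32_t data;
	uint32_t reserved1[1];
};

/* After: the same placeholder as a plain value. */
struct hdr_after {
	uint32_t data;
	uint32_t reserved1;
};

/* The conversion must not move or resize anything. */
_Static_assert(sizeof(struct hdr_before) == sizeof(struct hdr_after),
	       "size must be unchanged");
_Static_assert(offsetof(struct hdr_before, reserved1) ==
	       offsetof(struct hdr_after, reserved1),
	       "offset must be unchanged");

/* Runtime mirror of the compile-time checks above. */
static int same_layout(void)
{
	return sizeof(struct hdr_before) == sizeof(struct hdr_after) &&
	       offsetof(struct hdr_before, reserved1) ==
	       offsetof(struct hdr_after, reserved1);
}
```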

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/qed_hsi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:12:51 -07:00
Gustavo A. R. Silva
6fcf9affd1 bna: bfi.h: Avoid the use of one-element array
One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u8 rsvd'[2], since it seems this is
just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/bfi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:12:11 -07:00
Gustavo A. R. Silva
7ec3e95e7a tg3: Avoid the use of one-element array
One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u32 reserved2'[2], since it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/tg3-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:11:07 -07:00
Colin Ian King
4b1debbe63 ionic: fix memory leak of object 'lid'
Currently, when netdev fails to allocate, the error return path
fails to free the allocated object 'lid'.  Fix this by setting
err to the return error code and jumping to a new label that
performs the kfree of lid before returning.
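
The pattern reads roughly like the userspace sketch below. The function
name, the label and the netdev_ok flag are invented for illustration, and
lid is assumed to be needed only during setup.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical identity blob, needed only while setting up. */
struct lif_identity {
	int mtu;
};

/* netdev_ok simulates whether the later allocation succeeds. */
static int lif_setup(int netdev_ok)
{
	struct lif_identity *lid;
	int err = 0;

	assert(netdev_ok == 0 || netdev_ok == 1);
	lid = calloc(1, sizeof(*lid));
	if (!lid)
		return -ENOMEM;

	if (!netdev_ok) {
		/* Record the error and jump to the cleanup label instead
		 * of returning directly, which would leak lid. */
		err = -ENOMEM;
		goto err_out_free_lid;
	}

	/* ... use lid to configure the device ... */

err_out_free_lid:
	free(lid);
	return err;
}
```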

Addresses-Coverity: ("Resource leak")
Fixes: 4b03b27349 ("ionic: get MTU from lif identity")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 18:10:09 -07:00
Colin Ian King
bb809a047e lan743x: remove redundant initialization of variable current_head_index
The variable current_head_index is being initialized with a value that
is never read and it is being updated later with a new value.  Replace
the initialization of -1 with the latter assignment.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 17:55:35 -07:00
Claudiu Manoil
26cb7085c8 enetc: Remove the mdio bus on PF probe bailout
For ENETC ports that register an external MDIO bus,
the bus doesn't get removed on the error bailout path
of enetc_pf_probe().

This issue became much more visible after recent:
commit 07095c025a ("net: enetc: Use DT protocol information to set up the ports")
Before this commit, one could make probing fail on the error
path only by having register_netdev() fail, which is unlikely.
But after this commit, because it moved the enetc_of_phy_get()
call up in the probing sequence, now we can trigger an mdiobus_free()
bug just by forcing enetc_alloc_msix() to return error, i.e. with the
'pci=nomsi' kernel bootarg (since ENETC relies on MSI support to work),
as the calltrace below shows:

kernel BUG at /home/eiz/work/enetc/net/drivers/net/phy/mdio_bus.c:648!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
Hardware name: LS1028A RDB Board (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--)
pc : mdiobus_free+0x50/0x58
lr : devm_mdiobus_free+0x14/0x20
[...]
Call trace:
 mdiobus_free+0x50/0x58
 devm_mdiobus_free+0x14/0x20
 release_nodes+0x138/0x228
 devres_release_all+0x38/0x60
 really_probe+0x1c8/0x368
 driver_probe_device+0x5c/0xc0
 device_driver_attach+0x74/0x80
 __driver_attach+0x8c/0xd8
 bus_for_each_dev+0x7c/0xd8
 driver_attach+0x24/0x30
 bus_add_driver+0x154/0x200
 driver_register+0x64/0x120
 __pci_register_driver+0x44/0x50
 enetc_pf_driver_init+0x24/0x30
 do_one_initcall+0x60/0x1c0
 kernel_init_freeable+0x1fc/0x274
 kernel_init+0x14/0x110
 ret_from_fork+0x10/0x34

Fixes: ebfcb23d62 ("enetc: Add ENETC PF level external MDIO support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 17:51:13 -07:00
Claudiu Manoil
c6dd6488ac enetc: Remove the imdio bus on PF probe bailout
enetc_imdio_remove() is missing from the enetc_pf_probe()
bailout path. Not surprisingly because enetc_setup_serdes()
is registering the imdio bus for internal purposes, and it's
not obvious that enetc_imdio_remove() currently performs the
teardown of enetc_setup_serdes().
To fix this, define enetc_teardown_serdes() to wrap
enetc_imdio_remove() (improve code maintenance) and call it
on bailout and remove paths.

Fixes: 975d183ef0 ("net: enetc: Initialize SerDes for SGMII and USXGMII protocols")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 17:32:07 -07:00
Wang Hai
7979a7d2ab net: qed: Remove unneeded cast from memory allocation
Remove casting the values returned by memory allocation function.

Coccinelle emits WARNING: casting value returned by memory allocation
function to (struct roce_destroy_qp_req_output_params *) is useless.

This issue was detected by using the Coccinelle software.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 17:28:54 -07:00
Vladimir Oltean
8bb849d67f net: mscc: ocelot: fix non-initialized CPU port on VSC7514
The VSC7514 is marketed as a 10-port switch, however it has 11 physical
ports (0->10) in the block diagram:
https://www.microsemi.com/product-directory/ethernet-switches/3992-vsc7514
(also in the device tree at arch/mips/boot/dts/mscc/ocelot.dtsi)

Additionally, by architecture it has one more entry in the analyzer
block, situated right after the physical ports, for the CPU port module.
This is not a physical port, it only represents a channel for frame
injection and extraction. That entry for the CPU port is at index 11 in
the analyzer.

When the register groups for QSYS_SWITCH_PORT_MODE, SYS_PORT_MODE and
SYS_PAUSE_CFG are declared to be replicated 11 times, the 11th entry in
the array of regfields is not initialized, so the CPU port module is not
initialized either.

The documentation of QSYS_SWITCH_PORT_MODE for VSC7514 also says that
this register group is replicated 12 times, so this patch is simply
reflecting that and not introducing any further inconsistency.
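
The off-by-one can be pictured with a small user-space sketch. It is
only an illustration under the numbers given above (11 physical ports,
CPU port module at index 11); the SKETCH_* names and the helper are
hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Numbers taken from the commit message; the names are hypothetical. */
#define SKETCH_NUM_PHYS_PORTS	11u			/* ports 0..10 */
#define SKETCH_CPU_PORT		SKETCH_NUM_PHYS_PORTS	/* analyzer index 11 */

/*
 * A register group replicated 'replicas' times only provides entries
 * for indices 0..replicas-1, so a port is covered only if its index
 * is below the replication count.
 */
static bool port_is_covered(unsigned int port, unsigned int replicas)
{
	return port < replicas;
}
```

With a replication count of 11 the call port_is_covered(SKETCH_CPU_PORT, 11)
is false, while a count of 12 covers the CPU port module as well.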

Fixes: 886e1387c7 ("net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields")
Fixes: 541132f096 ("net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield")
Reported-by: Bryan Whitehead <bryan.whitehead@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 13:02:09 -07:00
Shannon Nelson
1b897e7d8d ionic: interface file updates
Add some new interface values and update a few more descriptions.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:36:34 -07:00
Shannon Nelson
6a6014e2fb ionic: rearrange reset and bus-master control
We can prevent potential incorrect DMA access attempts from the
NIC by enabling bus-master after the reset, and by disabling
bus-master earlier in cleanup.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:36:34 -07:00
Shannon Nelson
3fbc9bb6ca ionic: update eid test for overflow
Fix up our comparison to better handle a potential (but largely
unlikely) wrap around.
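
A wrap-tolerant comparison of this kind can be sketched as below. This
is a user-space illustration, not the driver code: the helper name and
the assumption of a 64-bit event ID are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when 'eid' is newer than
 * 'last_eid', even if the counter has wrapped around. The unsigned
 * subtraction is well defined modulo 2^64; reading the difference as
 * a signed value treats anything less than half the counter range
 * ahead of 'last_eid' as newer. The unsigned-to-signed conversion is
 * implementation-defined in C, and wraps on common compilers.
 */
static bool eid_is_newer(uint64_t eid, uint64_t last_eid)
{
	return (int64_t)(eid - last_eid) > 0;
}
```

A plain 'eid > last_eid' test would reject eid 1 arriving after
last_eid UINT64_MAX, whereas the signed-difference form accepts it.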

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:36:34 -07:00
Shannon Nelson
4471b1c13a ionic: remove unused ionic_coal_hw_to_usec
Clean up some unused code.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:36:34 -07:00
Shannon Nelson
c8768e7321 ionic: set netdev default name
If the host system's udev fails to set a new name for the
network port, there is no NETDEV_CHANGENAME event to trigger
the driver to send the name down to the firmware.  It is safe
to set the lif name multiple times, so we add a call early on
to set the default netdev name to be sure the FW has something
to use in its internal debug logging.  Then when udev gets
around to changing it we can update it to the actual name the
system will be using.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:36:34 -07:00
Shannon Nelson
4b03b27349 ionic: get MTU from lif identity
Change from using hardcoded MTU limits and instead use the
firmware defined limits. The value from the LIF attributes is
the frame size, so we take off the header size to convert to
MTU size.
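
The conversion amounts to a single subtraction, sketched here in user
space. Which headers are subtracted is not stated above; the sketch
assumes a 14-byte Ethernet header plus one 4-byte VLAN tag, and the
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed header overhead: Ethernet header plus one VLAN tag. */
#define SKETCH_ETH_HLEN		14u
#define SKETCH_VLAN_HLEN	4u

/* Hypothetical helper: firmware-reported frame size to MTU. */
static uint32_t frame_size_to_mtu(uint32_t frame_size)
{
	const uint32_t hdr = SKETCH_ETH_HLEN + SKETCH_VLAN_HLEN;

	assert(frame_size > hdr);	/* a frame must hold its headers */
	return frame_size - hdr;
}
```

Under these assumptions a 1518-byte frame limit corresponds to an MTU
of 1500.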

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:36:34 -07:00
Murali Karicheri
2c4dc31486 net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
Currently the driver supports taprio offload, which is a tc feature
offloaded to cpsw hardware. So the driver has to set the hw feature flag
NETIF_F_HW_TC in the net device to be compliant. This patch adds the flag.

Fixes: 8127224c27 ("ethernet: ti: am65-cpsw-qos: add TAPRIO offload support")
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:32:34 -07:00
Wang Hai
1264d7fa3a net: ethernet: ave: Fix error returns in ave_init
When regmap_update_bits failed in ave_init(), calls of the functions
reset_control_assert() and clk_disable_unprepare() were missed.
Add goto out_reset_assert to do this.
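
The shape of such an error path can be sketched in user space as
follows. The struct, the function and the simulated return value are
hypothetical stand-ins; only the label name and the order of the
cleanup calls follow the description above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bookkeeping standing in for the clock and reset state. */
struct sketch_res {
	int clk_enabled;
	int reset_deasserted;
};

/* 'regmap_ret' simulates the result of regmap_update_bits(). */
static int sketch_init(struct sketch_res *r, int regmap_ret)
{
	int ret;

	r->clk_enabled = 1;		/* clock prepared and enabled */
	r->reset_deasserted = 1;	/* reset line deasserted */

	ret = regmap_ret;
	if (ret)
		goto out_reset_assert;	/* previously a bare return */

	return 0;

out_reset_assert:
	r->reset_deasserted = 0;	/* stands in for reset_control_assert() */
	r->clk_enabled = 0;		/* stands in for clk_disable_unprepare() */
	return ret;
}
```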

Fixes: 57878f2f46 ("net: ethernet: ave: add support for phy-mode setting of system controller")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:31:46 -07:00
Ioana Ciornei
3657cdaf03 dpaa2-eth: add support for TBF offload
React to TC_SETUP_QDISC_TBF and configure the egress shaper as
appropriate with the maximum rate and burst size requested by the user.
TBF can only be offloaded on DPAA2 when it's the root qdisc, i.e. it's a
per port shaper.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 16:24:04 -07:00
Ioana Ciornei
39344a8962 dpaa2-eth: add API for Tx shaping
Add the necessary API (dpni_set_tx_shaping) for configuring the rate and
burst size of a per port shaper in DPAA2.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 16:24:04 -07:00
Ioana Ciornei
e3ec13be57 dpaa2-eth: move the mqprio setup into a separate function
Move the setup done for MQPRIO into a separate function so that
with the addition of another offload we do not crowd
dpaa2_eth_setup_tc(). After this restructuring it's easier to see what
is supported in terms of Qdisc offloading.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 16:24:04 -07:00
Heiner Kallweit
3fc364c052 r8169: allow to enable ASPM on RTL8125A
For most chip versions this has been added already. Allow also for
RTL8125A to enable ASPM.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 16:12:19 -07:00
Alexander Lobakin
eb61c2d699 qed: suppress false-positives interrupt error messages on HW init
It was found that qed_pglueb_rbc_attn_handler() can produce a lot of
false-positive error detections on driver load/reload (especially after
crashes/recoveries) and spam the kernel log:

[    4.958275] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d00ff0
[ 2079.146764] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[ 2116.374631] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[ 2135.250564] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[...]

Reduce the logging level of two false-positive prone error messages from
notice to verbose on initialization (only) to not mix it with real error
attentions while debugging.

Fixes: 666db4862f ("qed: Revise load sequence to avoid PCI errors")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 16:07:34 -07:00
Alexander Lobakin
1ea999039f qed: suppress "don't support RoCE & iWARP" flooding on HW init
Change the verbosity of the "don't support RoCE & iWARP simultaneously"
warning to debug level to stop flooding on driver/hardware initialization:

[    4.783230] qede 01:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth0]
[    4.810020] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    4.861186] qede 01:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth1]
[    4.893311] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    5.181713] qede a1:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth2]
[    5.224740] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    5.276449] qede a1:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth3]
[    5.318671] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    5.369548] qede a1:00.02: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth4]
[    5.411645] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only

Fixes: e0a8f9de16 ("qed: Add iWARP enablement support")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 16:07:34 -07:00
Arthur Kiyanovski
0e3a3f6dac net: ena: support new LLQ acceleration mode
New devices add a new hardware acceleration engine, which adds some
restrictions to the driver.
Metadata descriptor must be present for each packet and the maximum
burst size between two doorbells is now limited to a number
advertised by the device.

This patch adds:
1. A handshake protocol between the driver and the device, so the
device will enable the accelerated queues only when both sides
support it.

2. The driver support for the new acceleration engine:
2.1. Send metadata descriptor for each Tx packet.
2.2. Limit the number of packets sent between doorbells.(*)

(*) A previous driver implementation of this feature was committed in
commit 05d62ca218 ("net: ena: add handling of llq max tx burst size")
however the design of the interface between the driver and device
changed since then. This change is reflected in this commit.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Arthur Kiyanovski
c29efeae37 net: ena: move llq configuration from ena_probe to ena_device_init()
When the ENA device resets to recover from some error state, all LLQ
configuration values are reset to their defaults, because LLQ is
initialized only once during ena_probe().

Changes in this commit:
1. Move the LLQ configuration process into ena_init_device()
which is called from both ena_probe() and ena_restore_device(). This
way, LLQ setup configurations that are different from the default
values will survive resets.

2. Extract the LLQ bar mapping to ena_map_llq_bar(),
and call once in the lifetime of the driver from ena_probe(),
since there is no need to unmap and map the LLQ bar again every reset.

3. Map the LLQ bar if it exists, regardless if initialization of LLQ
placement policy (ENA_ADMIN_PLACEMENT_POLICY_DEV) succeeded
or not. Initialization might fail the first time, falling back to the
ENA_ADMIN_PLACEMENT_POLICY_HOST placement policy, but later succeed
after device reset, in which case the LLQ bar needs to be mapped
already.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Arthur Kiyanovski
0ee60edf46 net: ena: enable support of rss hash key and function changes
Add the rss_configurable_function_key bit to driver_supported_feature.

This bit tells the device that the driver in question supports the
retrieving and updating of RSS function and hash key, and therefore
the device should allow RSS function and key manipulation.

This commit turns on device support for hash key and RSS function
management. Without this commit this feature is turned off at the
device and appears to the user as unsupported.

This commit concludes the following series of already merged commits:
commit 0af3c4e2ea ("net: ena: changes to RSS hash key allocation")
commit c1bd17e51c ("net: ena: change default RSS hash function to Toeplitz")
commit f66c2ea3b1 ("net: ena: allow setting the hash function without changing the key")
commit e9a1de378d ("net: ena: fix error returning in ena_com_get_hash_function()")
commit 80f8443fcd ("net: ena: avoid unnecessary admin command when RSS function set fails")
commit 6a4f7dc82d ("net: ena: rss: do not allocate key when not supported")
commit 0d1c3de7b8 ("net: ena: fix incorrect default RSS key")

The above commits represent the last part of the implementation of
this feature, and with them merged the feature can be enabled
in the device.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Arthur Kiyanovski
0f505c604e net: ena: add support for traffic mirroring
Add support for traffic mirroring, where the hardware reads the
buffer from the instance memory directly.

Traffic Mirroring needs access to the rx buffers in the instance.
To have this access, this patch:
1. Changes the code to map and unmap the rx buffers bidirectionally.
2. Enables the relevant bit in driver_supported_features to indicate
   to the FW that this driver supports traffic mirroring.

Rx completion is not generated until mirroring is done to avoid
the situation where the driver changes the buffer before it is
mirrored.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Arthur Kiyanovski
0dcec68651 net: ena: cosmetic: change ena_com_stats_admin stats to u64
The size of the admin statistics in ena_com_stats_admin is changed
from 32bit to 64bit so to align with the sizes of the other statistics
in the driver (i.e. rx_stats, tx_stats and ena_stats_dev).

This is done as part of an effort to create a unified API to read
statistics.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Arthur Kiyanovski
79890d3f3c net: ena: cosmetic: satisfy gcc warning
gcc 4.8 reports a warning when initializing with = {0}.
Dropping the "0" from the braces fixes the issue.
This fix is not ANSI compatible but is allowed by gcc.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Arthur Kiyanovski
866032ab4d net: ena: add reserved PCI device ID
Add a reserved PCI device ID to the driver's table
Used for internal testing purposes.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Arthur Kiyanovski
1e5ae35072 net: ena: avoid unnecessary rearming of interrupt vector when busy-polling
For an overview of the race created by this patch, go to the
"synchronization" label.

In napi busy-poll mode, the kernel invokes the napi handler of the
device repeatedly to poll the NIC's receive queues. This process
repeats until a timeout, specific for each connection, is up.
By polling packets in busy-poll mode the user may gain lower latency
and higher throughput (since the kernel no longer waits for interrupts
to poll the queues) at the expense of CPU usage.

Upon completing a napi routine, the driver checks whether
the routine was called by an interrupt handler. If so, the driver
re-enables interrupts for the device. This is needed since an
interrupt routine invocation disables future invocations until
explicitly re-enabled.

The driver avoids re-enabling the interrupts if they were not disabled
in the first place (e.g. if driver in busy mode).
Originally, the driver checked whether interrupt re-enabling is needed
by reading the 'ena_napi->unmask_interrupt' variable. This atomic
variable was set upon interrupt and cleared after re-enabling it.

In the 4.10 Linux version, the 'napi_complete_done' call was changed
so that it returns 'false' when device should not re-enable
interrupts, and 'true' otherwise. The change includes reading the
"NAPIF_STATE_IN_BUSY_POLL" flag to check if the napi call is in
busy-poll mode, and if so, return 'false'.
The driver was changed to re-enable interrupts according to this
routine's return value.
The Linux community rejected the use of the
'ena_napi->unmask_interrupt' variable to determine whether
unmasking is needed, and urged using the napi_complete_done()
return value solely.
See https://lore.kernel.org/patchwork/patch/741149/ for more details

As explained, a busy-poll session exists for a specified timeout
value, after which it exits the busy-poll mode and re-enters it later.
This leads to many invocations of the napi handler where
napi_complete_done() falsely indicates that interrupts should be
re-enabled.
This creates a bug in which the interrupts are re-enabled
unnecessarily.
To reproduce this bug:
    1) echo 50 | sudo tee /proc/sys/net/core/busy_poll
    2) echo 50 | sudo tee /proc/sys/net/core/busy_read
    3) Add counters that check whether
    'ena_unmask_interrupt(tx_ring, rx_ring);'
    is called without disabling the interrupts in the first
    place (i.e. without calling the interrupt routine
    ena_intr_msix_io())

Steps 1+2 enable busy-poll as the default mode for new connections.

The busy poll routine rearms the interrupts after every session by
design, and so we need to add an extra check that the interrupts were
masked in the first place.

synchronization:
This patch introduces a race between the interrupt handler
ena_intr_msix_io() and the napi routine ena_io_poll().
Some macros and instruction were added to prevent this race from leaving
the interrupts masked. The following specifies the different race
scenarios in this patch:

1) interrupt handler and napi routine run sequentially
    i) interrupt handler is called, sets 'interrupts_masked' flag and
	successfully schedules the napi handler via softirq.

    In this scenario the napi routine might not see the flag change
    for several reasons:
	a) The flag is stored in a register by the compiler. For this
	case the WRITE_ONCE macro is used, which prevents this.
	b) The compiler might reorder the instruction. For this the
	smp_wmb() instruction was used which implies a compiler memory
	barrier.
	c) On archs with weak consistency model (like ARM64) the napi
	routine might be scheduled and start running before the flag
	STORE instruction is committed to cache/memory. To ensure this
	doesn't happen, the smp_wmb() instruction was added. It ensures
	that the flag set instruction is committed before scheduling
	napi.

    ii) compiler reorders the flag's value check in the 'if' with
    the flag set in the napi routine.

    This scenario is prevented by smp_rmb() call after the flag check.

2) interrupt handler and napi routine run in parallel (can happen when
busy poll routine invokes the napi handler)

    i) interrupt handler sets the flag in one core, while the napi
    routine reads it in another core.

    This scenario also is divided into two cases:
	a) napi_complete_done() doesn't finish running, in which case
	napi_sched() would just set NAPIF_STATE_MISSED and the napi
	routine would reschedule itself without changing the flag's value.

	b) napi_complete_done() finishes running. In this case the
	napi routine might override the flag's value.
	This doesn't present any risk since it later unmasks the
	interrupt vector.
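
The extra check and its ordering can be approximated in user space
with C11 atomics. This is an analogy, not the driver code: release and
acquire ordering stand in for WRITE_ONCE/smp_wmb() and the
READ_ONCE/smp_rmb() pairing, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_napi {
	atomic_bool interrupts_masked;
	int unmask_count;	/* how often the vector was rearmed */
};

static void sketch_napi_init(struct sketch_napi *n)
{
	atomic_init(&n->interrupts_masked, false);
	n->unmask_count = 0;
}

/* Interrupt side: publish the flag before the poll routine is scheduled. */
static void sketch_intr(struct sketch_napi *n)
{
	atomic_store_explicit(&n->interrupts_masked, true,
			      memory_order_release);
	/* scheduling of the poll routine would follow here */
}

/*
 * Poll side: 'complete_done' stands in for the return value of
 * napi_complete_done(). The vector is rearmed only if the interrupt
 * handler actually masked it in the first place.
 */
static bool sketch_poll_done(struct sketch_napi *n, bool complete_done)
{
	if (complete_done &&
	    atomic_load_explicit(&n->interrupts_masked,
				 memory_order_acquire)) {
		atomic_store_explicit(&n->interrupts_masked, false,
				      memory_order_release);
		n->unmask_count++;	/* stands in for the unmask write */
		return true;
	}
	return false;
}
```

In this sketch a poll that completes without a preceding interrupt,
as happens after a busy-poll session, leaves unmask_count unchanged.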

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:59:04 -07:00
Yuval Basson
d4eae993fc qed: Fix ILT and XRCD bitmap memory leaks
- Free ILT lines used for XRC-SRQ's contexts.
- Free XRCD bitmap

Fixes: b8204ad878 ("qed: changes to ILT to support XRC")
Fixes: 7bfb399eca ("qed: Add XRC to RoCE")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Basson <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:50:53 -07:00
Jian Shen
fac24df7b9 net: hns3: fix return value error when query MAC link status fail
Currently, the PF queries the MAC link status every second by calling
hclge_get_mac_link_status(). The function returns the error code
when it fails to send the cmdq command to firmware. This is incorrect,
because the return value is used as the MAC link status, where
0 means link down and non-zero means link up. So fix it.
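
One way to keep the two meanings apart is to return the error code and
pass the link state through a separate output parameter, as in this
user-space sketch. The function names, the query callback and the
choice to report "down" on failure are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical firmware query: returns 0 or a negative errno. */
typedef int (*sketch_query_fn)(int *raw_status);

static int sketch_query_up(int *raw_status)
{
	*raw_status = 1;
	return 0;
}

static int sketch_query_fail(int *raw_status)
{
	(void)raw_status;
	return -EIO;
}

/*
 * Returns 0 on success and a negative errno on failure. The link
 * state goes to '*link_up', so an error such as -EIO (non-zero) can
 * no longer be mistaken for "link up".
 */
static int sketch_get_link_status(sketch_query_fn query, int *link_up)
{
	int raw = 0;
	int ret = query(&raw);

	if (ret) {
		*link_up = 0;	/* assumption: unknown state counts as down */
		return ret;
	}

	*link_up = !!raw;
	return 0;
}
```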

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:49:17 -07:00
Yunsheng Lin
8ceca59fb3 net: hns3: fix error handling for desc filling
The content of the TX desc is automatically cleared by the HW
when the HW has sent out the packet to the wire. When desc filling
fails in hns3_nic_net_xmit(), it will call hns3_clear_desc() to do
the error handling, which misses zeroing the TX desc and
checking whether an unmapping is needed.

So add the zeroing and checking in hns3_clear_desc() to avoid the
above problem. Also add DESC_TYPE_UNKNOWN to indicate the info in
desc_cb is not valid, because hns3_nic_reclaim_desc() may treat
the desc_cb->type of zero as packet and add to the sent pkt
statistics accordingly.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:49:17 -07:00
Yunsheng Lin
48ae74c9d8 net: hns3: fix for not calculating TX BD send size correctly
With GRO and fraglist support, the SKB can be aggregated to
a total size of 65535, and when that SKB is forwarded through
a bridge, the size of the SKB may be pushed to exceed the size
of 65535 when br_dev_queue_push_xmit() is called.

The max send size of a BD supported by the HW is 65535. When an SKB
with a headlen of over 65535 is sent to the driver, the driver
needs to use multiple BDs to send the linear data, and the send size
of the last BD is calculated incorrectly by the driver, which
uses the '&' operation; this causes a TX error.

Use '%' operation to fix this problem.
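
The difference between the two operations shows up as soon as the
linear data exceeds one BD. The user-space sketch below assumes a
0xffff mask and a "zero remainder means a full BD" convention; those
details and all names are illustrative, only the 65535 limit comes
from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_BD_SIZE	65535u	/* per-BD limit from the text */

/* Number of BDs needed for 'size' bytes of linear data. */
static uint32_t bd_count(uint32_t size)
{
	assert(size > 0);
	return (size + SKETCH_MAX_BD_SIZE - 1) / SKETCH_MAX_BD_SIZE;
}

/* Masking yields size mod 65536, the wrong modulus for 65535-byte BDs. */
static uint32_t last_bd_size_mask(uint32_t size)
{
	uint32_t last = size & 0xffffu;

	return last ? last : SKETCH_MAX_BD_SIZE;
}

/* The remainder operation uses the real per-BD limit. */
static uint32_t last_bd_size_mod(uint32_t size)
{
	uint32_t last = size % SKETCH_MAX_BD_SIZE;

	return last ? last : SKETCH_MAX_BD_SIZE;
}

/* Bytes covered by all full BDs plus a last BD of 'last' bytes. */
static uint32_t total_sent(uint32_t size, uint32_t last)
{
	return (bd_count(size) - 1) * SKETCH_MAX_BD_SIZE + last;
}
```

For a 65536-byte buffer the remainder form gives a 1-byte last BD and
a total of 65536 bytes, while the masked form yields a total that no
longer matches the buffer size.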

Fixes: 3fe13ed95d ("net: hns3: avoid mult + div op in critical data path")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:49:17 -07:00
Yunsheng Lin
0ec3b6a7c0 net: hns3: fix for not unmapping TX buffer correctly
When a big TX buffer is sent using multi BD, the driver maps the
whole TX buffer, and unmaps it using info in desc_cb corresponding
to each BD, but only the info in the desc_cb of first BD is correct,
other info in desc_cb is wrong, which causes TX unmapping problem
when SMMU is on.

Only set the mapping and freeing info in the desc_cb of the first BD to
fix this problem, because the TX buffer only needs to be unmapped and
freed once.

Fixes: 1e8a7977d09f ("net: hns3: add handling for big TX fragment")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huzhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:49:17 -07:00
Claudiu Manoil
ae0e6a5d16 enetc: Add adaptive interrupt coalescing
Use the generic dynamic interrupt moderation (dim)
framework to implement adaptive interrupt coalescing
on Rx.  With the per-packet interrupt scheme, a high
interrupt rate has been noted for moderate traffic flows
leading to high CPU utilization.  The 'dim' scheme
implemented by the current patch addresses this issue
improving CPU utilization while using minimal coalescing
time thresholds in order to preserve a good latency.
On the Tx side use an optimal time threshold value by
default.  This value has been optimized for Tx TCP
streams at a rate of around 85kpps on a 1G link,
at which rate half of the Tx ring size (128) gets filled
in 1500 usecs.  Scaling this down to 2.5G links yields
the current value of 600 usecs, which is conservative
and gives good enough results for 1G links too (see
next).
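
The arithmetic behind these numbers can be checked with a short
user-space sketch. The helper names are hypothetical, and the linear
scaling by link speed is simply the rule the paragraph above applies.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to fill 'descs' Tx descriptors at 'pps' packets/s. */
static uint64_t fill_time_usecs(uint64_t descs, uint64_t pps)
{
	assert(pps > 0);
	return descs * 1000000ull / pps;
}

/* Scale a threshold measured at 'from_mbps' to a link of 'to_mbps'. */
static uint64_t scale_threshold_usecs(uint64_t usecs, uint64_t from_mbps,
				      uint64_t to_mbps)
{
	assert(to_mbps > 0);
	return usecs * from_mbps / to_mbps;
}
```

Here fill_time_usecs(128, 85000) evaluates to 1505, i.e. roughly the
1500 usecs quoted above, and scale_threshold_usecs(1500, 1000, 2500)
evaluates to 600.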

Below are some measurement results for before and after
this patch (and related dependencies), for a system with
2 ARM Cortex-A72 CPUs @ 1.3 GHz (32 KB L1 data cache),
using 60-sec long netperf TCP stream tests @ 1 Gbit link
(maximum throughput):

1) 1 Rx TCP flow, both Rx and Tx processed by the same NAPI
thread on the same CPU:
	CPU utilization		int rate (ints/sec)
Before:	50%-60% (over 50%)		92k
After:  13%-22%				3.5k-12k
Comment:  Major CPU utilization improvement for a single flow
	  Rx TCP flow (i.e. netperf -t TCP_MAERTS) on a single
	  CPU. Usually settles under 16% for longer tests.

2) 4 Rx TCP flows + 4 Tx TCP flows (+ pings to check the latency):
	Total CPU utilization	Total int rate (ints/sec)
Before:	~80% (spikes to 90%)		~100k
After:   60% (more steady)		  ~4k
Comment:  Important improvement for this load test, while the
	  ping test outcome does not show any notable
	  difference compared to before.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:38:30 -07:00
Claudiu Manoil
915710812b enetc: Add interrupt coalescing support
Enable programming of the interrupt coalescing registers
and allow manual configuration of the coalescing time
thresholds via ethtool.  Packet thresholds have been fixed
to predetermined values as there's no point in making them
run-time configurable, also anticipating the dynamic interrupt
moderation (DIM) algorithm which uses fixed packet thresholds
as well.  If the interface is up when the operation mode of
traffic interrupt events is changed by the user (i.e. switching
from default per-packet interrupts to coalesced interrupts),
the traffic needs to be paused in the process.
This patch also prepares the ground for introducing DIM on Rx.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:38:30 -07:00
Claudiu Manoil
058d9cfa60 enetc: Drop redundant ____cacheline_aligned_in_smp
'struct enetc_bdr' is already '____cacheline_aligned_in_smp'.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:38:30 -07:00
Claudiu Manoil
12460a0abe enetc: Fix interrupt coalescing register naming
Interrupt coalescing registers naming in the current revision
of the Ref Man (RM) is ICR, deprecating the ICIR name used
in earlier (draft) versions of the RM.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:38:30 -07:00
Claudiu Manoil
bbb96dc7fa enetc: Factor out the traffic start/stop procedures
A reliable traffic pause (and reconfiguration) procedure
is needed to be able to safely make h/w configuration
changes during run-time, like changing the mode in which the
interrupts are operating (i.e. with or without coalescing),
as opposed to making on-the-fly register updates that
may be subject to h/w or s/w concurrency issues.
To this end, the code responsible for the run-time device
configurations that basically start resp. stop the traffic
flow through the device has been extracted from the
enetc_open/_close procedures to the separate standalone
enetc_start/_stop procedures. Traffic stop should be as
graceful as possible: it lets the executing napi threads
finish while the interrupts stay disabled.  But since
the napi thread will try to re-enable interrupts by clearing
the device's unmask register, the enable_irq/ disable_irq
API has been used to avoid this potential concurrency issue
and make the traffic pause procedure more reliable.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:38:30 -07:00
Claudiu Manoil
02293dd4b7 enetc: Refine buffer descriptor ring sizes
It's time to differentiate between Rx and Tx ring sizes.
Not only Tx rings are processed differently than Rx rings,
but their default number also differs - i.e. up to 8 Tx rings
per device (8 traffic classes) vs. 2 Rx rings (one per CPU).
So let's set Tx rings sizes to half the size of the Rx rings
for now, to be conservative.
The default ring sizes were decreased as well (to the next
lower power of 2), to reduce the memory footprint, buffering
etc., since the measurements I've made so far show that the
rings are very unlikely to get full.
This change also anticipates the introduction of the
dynamic interrupt moderation (dim) algorithm which operates
on maximum packet thresholds of 256 packets for Rx and 128
packets for Tx.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:38:30 -07:00
Yoshihiro Shimoda
015c5d5e6a net: ethernet: ravb: exit if re-initialization fails in tx timeout
According to the report of [1], this driver can cause
the following error in ravb_tx_timeout_work().

ravb e6800000.ethernet ethernet: failed to switch device to config mode

This error means that the hardware could not change the state
from "Operation" to "Configuration" while some tx and/or rx queues
are operating. After that, ravb_config() in ravb_dmac_init() will fail,
and then no descriptors will be allocated anymore, so that a NULL
pointer dereference happens after that in ravb_start_xmit().

To fix the issue, ravb_tx_timeout_work() should check
the return values of ravb_stop_dma() and ravb_dmac_init().
If ravb_stop_dma() fails, ravb_tx_timeout_work() re-enables TX and RX
and just exits. If ravb_dmac_init() fails, it just exits.

[1]
https://lore.kernel.org/linux-renesas-soc/20200518045452.2390-1-dirk.behme@de.bosch.com/

Reported-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 15:34:07 -07:00
Bixuan Cui
db44c60c45 net: neterion: vxge: reduce stack usage in VXGE_COMPLETE_VPATH_TX
Fix the warning: [-Werror=-Wframe-larger-than=]

drivers/net/ethernet/neterion/vxge/vxge-main.c:
In function'VXGE_COMPLETE_VPATH_TX.isra.37':
drivers/net/ethernet/neterion/vxge/vxge-main.c:119:1:
warning: the frame size of 1056 bytes is larger than 1024 bytes

Dropping NR_SKB_COMPLETED to 16 is appropriate and won't
have much impact on performance and functionality.

Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:38:27 -07:00
Huang Guobin
befc113c56 net: ag71xx: add missed clk_disable_unprepare in error path of probe
The ag71xx_mdio_probe() forgets to call clk_disable_unprepare() when
of_reset_control_get_exclusive() failed. Add the missed call to fix it.

Fixes: d51b6ce441 ("net: ethernet: add ag71xx driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:37:38 -07:00
Christophe JAILLET
405e30e23c net/fealnx: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated, GFP_KERNEL can be used because it is called from
the probe function (i.e. 'fealnx_init_one()') and no lock is taken.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)
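
To show why the GFP_ placeholder in the script has to be filled in by hand, here is a userspace mock of the allocation rule. Everything in it is a stand-in defined for this sketch (the types, the flag values, and the calloc-backed bodies); only the argument order follows the script above. It also assumes that the old wrapper hard-coded an atomic allocation flag, which is why a caller in a sleepable context such as probe can switch to GFP_KERNEL after the conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins, NOT the kernel definitions. */
typedef unsigned long long dma_addr_t;
typedef unsigned int gfp_t;
#define GFP_ATOMIC 0x1u   /* made-up values for the mock */
#define GFP_KERNEL 0x2u

struct device  { gfp_t last_gfp; };      /* records the flag for inspection */
struct pci_dev { struct device dev; };

static void *dma_alloc_coherent(struct device *dev, size_t size,
				dma_addr_t *handle, gfp_t gfp)
{
	void *p;

	assert(size > 0);
	p = calloc(1, size);
	dev->last_gfp = gfp;
	*handle = (dma_addr_t)(uintptr_t)p;  /* pretend bus address */
	return p;
}

static void dma_free_coherent(struct device *dev, size_t size,
			      void *cpu_addr, dma_addr_t handle)
{
	(void)dev; (void)size; (void)handle;
	free(cpu_addr);
}

/* Old-style wrapper: the caller has no way to choose the flag (assumption). */
static void *pci_alloc_consistent(struct pci_dev *pdev, size_t size,
				  dma_addr_t *handle)
{
	return dma_alloc_coherent(&pdev->dev, size, handle, GFP_ATOMIC);
}
```

After the conversion, the call site passes `&pdev->dev` and picks the flag itself, for example `dma_alloc_coherent(&pdev->dev, size, &handle, GFP_KERNEL)` where sleeping is allowed.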

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:32:49 -07:00
Shannon Nelson
0925e9db4d ionic: use mutex to protect queue operations
The ionic_wait_on_bit_lock() was an open-coded mutex knock-off
used only for protecting the queue reset operations, and there
was no reason not to use the real thing.  We can use the lock
more correctly and to better protect the queue stop and start
operations from cross threading.  We can also remove a useless
and expensive bit operation from the Rx path.

This fixes a case found where the link_status_check from a link
flap could run into an MTU change and cause a crash.

Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
bdff46665e ionic: keep rss hash after fw update
Make sure the RSS hash key is kept across a fw update by not
de-initing it when an update is happening.

Fixes: c672412f61 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
cc4428c4de ionic: update filter id after replay
When we replay the rx filters after a fw-upgrade we get new
filter_id values from the FW, which we need to save and update
in our local filter list.  This allows us to delete the filters
with the correct filter_id when we're done.

Fixes: 7e4d47596b ("ionic: replay filters after fw upgrade")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
cbec2153a9 ionic: fix up filter locks and debug msgs
Add in a couple of forgotten spinlocks and fix up some of
the debug messages around filter management.

Fixes: c1e329ebec ("ionic: Add management of rx filters")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
f85ae16f92 ionic: use offset for ethtool regs data
Use an offset to write the second half of the regs data into the
second half of the buffer instead of overwriting the first half.
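
The bug class is easy to reproduce in userspace. The sketch below uses a hypothetical `dump_regs()` helper, not the ionic ethtool code, to copy two register blocks back to back; the running offset is what keeps the second copy from landing on top of the first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy two register blocks back to back into `out`.
 * Returns the number of bytes written.
 */
static size_t dump_regs(unsigned char *out, size_t out_len,
			const unsigned char *blk_a, size_t a_len,
			const unsigned char *blk_b, size_t b_len)
{
	size_t offset = 0;

	assert(a_len + b_len <= out_len);

	memcpy(out + offset, blk_a, a_len);
	offset += a_len;

	/* Without "+ offset" this copy would overwrite the first block. */
	memcpy(out + offset, blk_b, b_len);
	offset += b_len;

	return offset;
}
```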

Fixes: 4d03e00a21 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Mark Starovoytov
8dcf2ad39f net: atlantic: add hwmon getter for MAC temperature
This patch adds the possibility to obtain MAC temperature via hwmon.
On A1 there are two separate temperature sensors.
On A2 there's only one temperature sensor, which is used for reporting
both MAC and PHY temperature.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:39 -07:00
Dmitry Bogdanov
a89df867ce net: atlantic: A0 ntuple filters
This patch adds support for ntuple filters on A0.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Nikita Danilov
b98ffe6fa4 net: atlantic: use intermediate variable to improve readability a bit
This patch syncs up hw_atl_a0.c with an out-of-tree driver, where an
intermediate variable was introduced in a couple of functions to
improve the code readability a bit.

Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Mark Starovoytov
88bc9cf143 net: atlantic: use U32_MAX in aq_hw_utils.c
This patch replaces magic constant ~0U usage with U32_MAX in aq_hw_utils.c

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Pavel Belous
1e41b3fee7 net: atlantic: add support for 64-bit reads/writes
This patch adds support for 64-bit reads/writes where applicable, e.g.
A2 supports them.

Signed-off-by: Pavel Belous <pbelous@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Igor Russkikh
8bd6071085 net: atlantic: enable ipv6 support for TCP LSO and UDP GSO
This patch enables ipv6 support for TCP LSO and UDP GSO.
The code itself (aq_nic_map_skb) was ready for this after the UDP GSO
feature was added, but the corresponding NETIF_F_TSO6 wasn't enabled.

We have now tested both TCP and UDP v6 GSO and can enable them safely.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Pavel Belous
14b539a349 net: atlantic: PTP statistics
This patch adds PTP ring statistics. Before that
these were missing from overall stats, hindering debugging
and analysis.

Signed-off-by: Pavel Belous <pbelous@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Dmitry Bogdanov
aa7e17a3e3 net: atlantic: additional per-queue stats
This patch adds additional per-queue stats, which could
be useful for debugging and diagnostics.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Mark Starovoytov
d7d8bb9286 net: atlantic: use u64_stats_update_* to protect access to 64-bit stats
This patch adds u64_stats_update_* usage to protect access to 64-bit stats,
where necessary.

This is necessary for per-ring stats, because they are updated by the
driver directly, so there is a possibility for a partial read.

Other stats require no additional protection, e.g.:
 * all MACSec stats are fetched directly from HW (under semaphore);
 * nic/ndev stats (aq_stats_s) are fetched directly from FW (under mutex).
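
The partial-read problem and the retry protocol that guards against it can be sketched with a sequence counter. This is a single-writer userspace illustration of the idea only: `struct ring_stats`, `stats_update()` and `stats_read()` are hypothetical names, and the plain 64-bit fields would need atomic accesses to be race-free under the C11 memory model, so it is not a drop-in replacement for the kernel helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct ring_stats {
	atomic_uint seq;     /* odd while an update is in progress */
	uint64_t packets;
	uint64_t bytes;
};

/* Writer side: bracket the update with two counter increments. */
static void stats_update(struct ring_stats *s, uint64_t bytes)
{
	/* Single writer: no other update may be in progress. */
	assert((atomic_load(&s->seq) & 1u) == 0);

	atomic_fetch_add(&s->seq, 1);   /* begin: counter becomes odd */
	s->packets += 1;
	s->bytes += bytes;
	atomic_fetch_add(&s->seq, 1);   /* end: counter becomes even again */
}

/* Reader side: retry until a snapshot was taken with no update in between. */
static void stats_read(struct ring_stats *s, uint64_t *packets, uint64_t *bytes)
{
	unsigned int start;

	do {
		start = atomic_load(&s->seq);
		*packets = s->packets;
		*bytes = s->bytes;
	} while ((start & 1u) || start != atomic_load(&s->seq));
}
```

A reader that overlaps an update sees either an odd counter or a changed one and simply tries again, so it never returns a half-updated 64-bit value.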

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Mark Starovoytov
508f2e3dce net: atlantic: split rx and tx per-queue stats
This patch splits rx and tx per-queue stats.
This change simplifies the follow-up introduction of PTP stats and
u64_stats_update_* usage.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Mark Starovoytov
b772112c5a net: atlantic: make _get_sw_stats return count as return value
This patch changes aq_vec_get_sw_stats() to return count as a return
value (which was unused) instead of an out parameter.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Mark Starovoytov
3624aa3c25 net: atlantic: use simple assignment in _get_stats and _get_sw_stats
This patch replaces addition assignment operator with a simple assignment
in aq_vec_get_stats() and aq_vec_get_sw_stats(), because it is
sufficient in both cases and this change simplifies the introduction of
u64_stats_update_* in these functions.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Mark Starovoytov
519f0cefb4 net: atlantic: move FRAC_PER_NS to aq_hw.h
This patch moves FRAC_PER_NS to aq_hw.h so that it can be used in both
hw_atl (A1) and hw_atl2 (A2) in the future.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:07:38 -07:00
Vaibhav Gupta
0c17ac5424 ethernet: myri10ge: use generic power management
Drivers using legacy PM have to manage PCI states and device's PM states
themselves. They also need to take care of configuration registers.

With the improved generic PM support, the PCI core takes care of the
above-mentioned device-independent jobs.

This driver makes use of PCI helper functions like
pci_save/restore_state(), pci_enable/disable_device(),
pci_set_power_state() and pci_set_master() to do required operations. In
generic mode, they are no longer needed.

Change function parameter in both .suspend() and .resume() to
"struct device*" type. Use to_pci_dev() and dev_get_drvdata() to get
"struct pci_dev*" variable and drv data.
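
The pointer arithmetic behind recovering the enclosing PCI device from a generic device pointer can be shown in userspace. The structures and the `*_sketch` helpers below are simplified stand-ins written for this illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: a generic device embedded in a bus-specific one. */
struct device  { void *driver_data; };
struct pci_dev { int irq; struct device dev; };

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct pci_dev *to_pci_dev_sketch(struct device *dev)
{
	return container_of_sketch(dev, struct pci_dev, dev);
}

static void *get_drvdata_sketch(struct device *dev)
{
	return dev->driver_data;
}

/* Hypothetical driver-private state. */
struct fake_priv { int suspended; };

/* A suspend callback that only receives the generic device pointer. */
static int fake_suspend(struct device *dev)
{
	struct fake_priv *priv = get_drvdata_sketch(dev);

	assert(priv);
	priv->suspended = 1;
	return 0;
}
```

Because the callback takes the generic pointer, the same signature works for any bus, and the driver converts back only when it needs bus-specific fields.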

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:01:07 -07:00
Alexander Lobakin
99785a87fc qed: add support for the extended speed and FEC modes
Add all necessary code (NVM parsing, MFW and Ethtool reports etc.) to
support extended speed and FEC modes.
These new modes are supported by the new board revisions and newer
MFW versions.

Misc: correct port type for MEDIA_KR.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
097818fcf8 qed: populate supported link modes maps on module init
Simplify and lighten qed_set_link() by declaring static link modes maps
and populating them on module init. This way we save plenty of text size
at the low expense of __ro_after_init and __initconst data (the latter
will be purged after module init is done).

Misc: sanitize exit callback.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
98e675ec5a qed: add missing loopback modes
These modes are relevant only for several boards, but may be reported by
MFW as well as the others.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
a396818c08 qed: add support for new port modes
These ports ship on new board revisions and are supported by newer
firmware versions.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
e9a5eb8564 qed: remove unused qed_hw_info::port_mode and QED_PORT_MODE
Struct field qed_hw_info::port_mode isn't used anywhere in the code, so
can be safely removed to prevent possible dead code addition.
Also remove the enumeration QED_PORT_MODE orphaned after this deletion.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
5d4193c641 qed: reformat several structures a bit
Reformat a few nvm_cfg* structures (and partly qed_dev) prior to adding
new fields and definitions.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
9bdca14a0e qede: introduce support for FEC control
Add Ethtool callbacks for querying and setting FEC parameters if it's
supported by the underlying qed module and MFW version running on the
device.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
460761570b qede: format qede{,_vf}_ethtool_ops
Prior to adding new callbacks, format qede ethtool_ops structs to make
declarations more fancy and readable.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
ae7e69379f qed: add support for Forward Error Correction
Add all necessary routines for reading supported FEC modes from NVM and
querying FEC control to the MFW (if the running version supports it).

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
37237b5b71 qed: reformat several structures a bit
Prior to adding new fields and bitfields, reformat the related
structures according to the Linux style (spaces to tabs,
lowercase hex, indentation etc.).

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
3c41486e46 qed: use transceiver data to fill link partner's advertising speeds
Currently qed driver does not take into consideration transceiver's
capabilities when generating link partner's speed advertisement. This
leads to e.g. incorrect ethtool link info on 10GbaseT modules.
Use transceiver info not only for advertisement and support arrays, but
also for link partner's abilities to fix it.

Misc: fix a couple of comments nearby.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:44 -07:00
Alexander Lobakin
9228b7c1f4 qed: add support for multi-rate transceivers
Set the corresponding advertised and supported link modes according
to the detected transceiver type and device capabilities.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:43 -07:00
Alexander Lobakin
d47839f31e qed: reformat public_port::transceiver_data a bit
Prior to adding new bitfields, reformat the existing ones from spaces
to tabs, and unify all hex values to lowercase.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:43 -07:00
Alexander Lobakin
1d4e4ecccb qede: populate supported link modes maps on module init
Simplify and lighten qede_set_link_ksettings() by declaring static link
modes maps and populating them on module init. This way we save plenty
of text size at the low expense of __ro_after_init and __initconst data
(the latter will be purged after module init is done).

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:43 -07:00
Alexander Lobakin
bdb5d8ec47 qed, qede, qedf: convert link mode from u32 to ETHTOOL_LINK_MODE
The qed driver has already run out of the 32 bits used to store link
modes, which prevents adding support for more speeds.
Convert the custom link mode to the generic Ethtool bitmap and definitions
(convenient Phylink shorthands are used for elegance and readability).
This allows us to drop all conversions/mappings between the driver
and Ethtool.
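
The limitation is easy to see in a userspace sketch: a single 32-bit word cannot hold a mode number of 32 or above, while an array of words can. `MODE_COUNT`, `struct link_modes`, `mode_set()` and `mode_test()` are hypothetical names chosen for this illustration and do not reproduce the Ethtool bitmap helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MODE_COUNT    92   /* hypothetical number of link modes, more than 32 */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MODE_WORDS    ((MODE_COUNT + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per link mode, spread over as many words as needed. */
struct link_modes { unsigned long bits[MODE_WORDS]; };

static void mode_set(struct link_modes *m, unsigned int mode)
{
	assert(mode < MODE_COUNT);
	m->bits[mode / BITS_PER_WORD] |= 1UL << (mode % BITS_PER_WORD);
}

static bool mode_test(const struct link_modes *m, unsigned int mode)
{
	assert(mode < MODE_COUNT);
	return (m->bits[mode / BITS_PER_WORD] >> (mode % BITS_PER_WORD)) & 1UL;
}
```

Adding a new mode then only means raising the mode count, with no change to the storage type or to the code that tests bits.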

This involves changes in qede and qedf as well, as they used definitions
from shared "qed_if.h".

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:59:43 -07:00
Liu Jian
5dbaeb87f2 mlxsw: destroy workqueue when trap_register in mlxsw_emad_init
When mlxsw_core_trap_register fails in mlxsw_emad_init,
destroy_workqueue() should be called to destroy mlxsw_core->emad_wq.

Fixes: d965465b60 ("mlxsw: core: Fix possible deadlock")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:55:23 -07:00
Liu Jian
6790711f8a dpaa_eth: Fix one possible memleak in dpaa_eth_probe
When dma_coerce_mask_and_coherent() fails, the allocated netdev needs to be freed.

Fixes: 060ad66f97 ("dpaa_eth: change DMA device")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:54:08 -07:00
Christophe JAILLET
256ca7449f sis: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'epic_init_one()' (sis190.c), GFP_KERNEL can be
used because this is a net_device_ops' 'ndo_open' function. This function
is protected by the rtnl_lock() semaphore. So only a mutex is used and no
spin_lock is acquired.

When memory is allocated in 'sis900_probe()' (sis900.c), GFP_KERNEL can be
used because it is a probe function and no spin_lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:50:07 -07:00
Christophe JAILLET
0b0edb993c r6040: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'r6040_open()', GFP_KERNEL can be used because
this is a net_device_ops' 'ndo_open' function. This function is protected
by the rtnl_lock() semaphore. So only a mutex is used and no spin_lock is
acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:49:50 -07:00
Christophe JAILLET
73e283dfbf net: packetengines: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'hamachi_init_one()' (hamachi.c), GFP_KERNEL
can be used because it is a probe function and no lock is acquired.

When memory is allocated in 'yellowfin_init_one()' (yellowfin.c),
GFP_KERNEL can be used because it is a probe function and no lock is
acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:48:23 -07:00
Zhang Changzhong
cebd2cac90 net: fs_enet: remove redundant null check
clk_prepare_enable() and clk_disable_unprepare() already check for a
NULL clock parameter, so the additional checks are unnecessary.
Remove them.

Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:42:17 -07:00
Zhang Changzhong
53a92889ec net: bcmgenet: add missed clk_disable_unprepare in bcmgenet_probe
The driver forgets to call clk_disable_unprepare() in the error path after
a successful call to clk_prepare_enable().

Fix this by jumping to err_clk_disable once clk_prepare_enable() has succeeded.

Fixes: c80d36ff63 ("net: bcmgenet: Use devm_clk_get_optional() to get the clocks")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:03:45 -07:00
Nicolas Ferre
9d45c8e890 net: macb: Add WoL interrupt support for MACB type of Ethernet controller
Handle the Wake-on-Lan interrupt for the Cadence MACB Ethernet
controller.
As we do for the GEM version, we handle the WoL interrupt in a
specialized interrupt handler for the MACB version that is installed
just between suspend() and resume() calls.

Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:01:45 -07:00
Nicolas Ferre
558e35ccfe net: macb: WoL support for GEM type of Ethernet controller
Adapt the Wake-on-Lan feature to the Cadence GEM Ethernet controller.
This controller has a different register layout and cannot be handled by
the previous code.
We completely disable interrupts on all queues except queue 0.
Handling of the WoL interrupt is done in another interrupt handler,
chosen depending on the controller version used and installed just between
suspend() and resume() calls.
This lowers pressure on the generic interrupt hot path by
removing the need for two tests on each IRQ: the first to figure out
the controller revision, the second to check whether the WoL bit
is set.

Queue management in the suspend()/resume() functions is inspired by an RFC
patch by Harini Katakam <harinik@xilinx.com>, thanks!

Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:01:45 -07:00
Wang Hai
d89d8d4db4 net: ena: Fix using plain integer as NULL pointer in ena_init_napi_in_range
Fix sparse build warning:

drivers/net/ethernet/amazon/ena/ena_netdev.c:2193:34: warning:
 Using plain integer as NULL pointer

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Suggested-by: Joe Perches <joe@perches.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 16:59:22 -07:00
Zhang Changzhong
24a63fe6d4 net: bcmgenet: fix error returns in bcmgenet_probe()
The driver forgets to call clk_disable_unprepare() in the error path after
a successful call to clk_prepare_enable().

Fix this by jumping to err_clk_disable once clk_prepare_enable() has succeeded.

Fixes: 99d55638d4 ("net: bcmgenet: enable NETIF_F_HIGHDMA flag")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 16:57:53 -07:00
Xu Wang
74b5afea3b net: hns: use eth_broadcast_addr() to assign broadcast address
This patch uses eth_broadcast_addr() to assign the broadcast address
instead of memset().
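
For readers unfamiliar with the helper, the equivalence can be sketched in userspace: the broadcast address is six 0xff bytes, so a named helper is just a self-documenting wrapper around the memset() it replaces. `eth_broadcast_addr_sketch()` is a stand-in written for this illustration, not the kernel helper itself.

```c
#include <assert.h>
#include <string.h>

#define ETH_ALEN 6   /* length of an Ethernet MAC address in bytes */

/* Fill `addr` with the Ethernet broadcast address ff:ff:ff:ff:ff:ff. */
static void eth_broadcast_addr_sketch(unsigned char *addr)
{
	assert(addr);
	memset(addr, 0xff, ETH_ALEN);
}
```

The named call states the intent at the call site, where an open-coded memset() leaves the reader to recognise the pattern.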

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 16:55:34 -07:00
Xu Wang
88a3c45482 net: vxge-main: Remove unnecessary cast in kfree()
Remove unnecessary casts in the argument to kfree.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 16:45:09 -07:00
Vladimir Oltean
ecf9f9b77c net: mscc: ocelot: add support for PTP waveform configuration
For PPS output (perout period is 1.000000000), accept the new "phase"
parameter from the periodic output request structure.

For both PPS and freeform output, accept the new "on" argument for
specifying the duty cycle of the generated signal. Preserve the old
defaults for this "on" time: 1 us for PPS, and half the period for
freeform output.

Also preserve the old behavior that accepted the "phase" via the "start"
argument.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 19:22:57 -07:00
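The default "on" times named in the commit above (1 us for PPS, half the period for freeform output) can be sketched as a small selection function. The function name and the convention that `on_ns == 0` means "not specified by the caller" are assumptions of this sketch, not the driver's interface.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000ULL
#define NSEC_PER_USEC	1000ULL

/*
 * Pick the "on" time of the generated signal.
 * on_ns == 0 is treated as "caller gave no duty cycle" (sketch convention).
 */
static uint64_t sketch_on_time_ns(uint64_t period_ns, uint64_t on_ns)
{
	assert(period_ns > 0);

	if (on_ns)
		return on_ns;		/* explicit duty cycle wins */
	if (period_ns == NSEC_PER_SEC)
		return NSEC_PER_USEC;	/* PPS: keep the old 1 us pulse */
	return period_ns / 2;		/* freeform: keep the old 50% duty */
}
```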
Lorenzo Bianconi
c7a3a8cd9d net: mvneta: move rxq->left_size on the stack
Allocate rxq->left_size on mvneta_rx_swbm stack since it is used just
in sw bm napi_poll

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:28:34 -07:00
Lorenzo Bianconi
89f4a198c9 net: mvneta: get rid of skb in mvneta_rx_queue
Remove skb pointer in mvneta_rx_queue data structure since it is no
longer used

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:28:34 -07:00
Lorenzo Bianconi
7d1643ebce net: mvneta: drop all fragments in XDP_DROP
Release all consumed pages if the eBPF program returns XDP_DROP for XDP
multi-buffers

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:28:34 -07:00
Lorenzo Bianconi
afda408b61 net: mvneta: move mvneta_run_xdp after descriptors processing
Move mvneta_run_xdp routine after all descriptor processing. This is a
preliminary patch to enable multi-buffers and JUMBO frames support for
XDP

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:28:33 -07:00
Lorenzo Bianconi
ca0e014609 net: mvneta: move skb build after descriptors processing
Move skb build after all descriptors processing. This is a preliminary
patch to enable multi-buffers and JUMBO frames support for XDP.
Introduce mvneta_xdp_put_buff routine to release all pages used by a
XDP multi-buffer

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:28:33 -07:00
Alex Marginean
07095c025a net: enetc: Use DT protocol information to set up the ports
Use DT information rather than in-band information from bootloader to
set up MAC for XGMII. For RGMII use the DT indication in addition to
RGMII defaults in hardware.
However, this implies that PHY connection information needs to be
extracted before netdevice creation, when the ENETC Port MAC is
being configured.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Michael Walle
975d183ef0 net: enetc: Initialize SerDes for SGMII and USXGMII protocols
ENETC has ethernet MACs capable of SGMII, 2500BaseX and USXGMII. But in
order to use these protocols some SerDes configurations need to be
performed. The SerDes is configurable via an internal PCS PHY which is
connected to an internal MDIO bus at address 0.

This patch basically removes the dependency on bootloader regarding
SerDes initialization.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Vadim Pasternak
9b8737788a mlxsw: core: Fix wrong SFP EEPROM reading for upper pages 1-3
Fix wrong reading of upper pages for SFP EEPROM. According to "Memory
Organization" figure in SFF-8472 spec: When reading upper pages 1, 2 and
3 the offset should be set relative to zero and I2C high address 0x51
[1010001X (A2h)] is to be used.

Fixes: a45bfb5a50 ("mlxsw: core: Extend QSFP EEPROM size for ethtool")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:07:26 -07:00
Armin Wolf
a050d82f5b ne2k-pci: Use netif_msg_init to initialize msg_enable bits
Use netif_msg_init() to process param settings.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:04:06 -07:00
Dmitry Bogdanov
0044b1e147 net: atlantic: add support for FW 4.x
This patch adds support for FW 4.x, which is about to get into the
production for some products.
4.x is mostly compatible with 3.x, save for soft reset, which requires
the acquisition of 2 additional semaphores.
Other differences (e.g. absence of PTP support) are handled via
capabilities.

Note: 4.x targets specific products only. 3.x is still the main firmware
branch, which should be used by most users (at least for now).

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:00:54 -07:00
Mark Starovoytov
b567edbfc8 net: atlantic: align return value of ver_match function with function name
This patch aligns the return value of hw_atl_utils_ver_match function with
its name.
Change the return type to bool, because it's better aligned with the actual
usage. Return true when the version matches, false otherwise.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:00:54 -07:00
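The naming point made above, that a function called "match" should return true on a match, can be shown with a short sketch. The comparison rule used here (only the top byte, taken to be the major version, must agree) and all names are assumptions for illustration; the real hw_atl_utils_ver_match() may compare more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: the major version sits in the top byte. */
#define SKETCH_VER_MAJOR_MASK 0xff000000u

/* Returns true when the versions match, so call sites read naturally. */
static bool sketch_ver_match(uint32_t ver_expected, uint32_t ver_actual)
{
	return ((ver_expected ^ ver_actual) & SKETCH_VER_MAJOR_MASK) == 0;
}
```

With a bool return, a caller can write `if (sketch_ver_match(expected, actual))` without having to remember that zero meant success.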
Mark Einon
11f3c1f583 net: ethernet: et131x: Remove redundant register read
Following the removal of an unused variable assignment (remove
unused variable 'pm_csr') the associated register read can also go,
as the read also occurs in the subsequent et1310_in_phy_coma()
call.

Signed-off-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:48:15 -07:00
Zhang Changzhong
eacc43d2c3 net: ethernet: et131x: Remove unused variable 'pm_csr'
GCC reports the following warnings:

drivers/net/ethernet/agere/et131x.c:953:6: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
  953 |  u32 pm_csr;
      |      ^~~~~~
drivers/net/ethernet/agere/et131x.c:1002:6: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
 1002 |  u32 pm_csr;
      |      ^~~~~~
drivers/net/ethernet/agere/et131x.c:3446:8: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
 3446 |    u32 pm_csr;
      |        ^~~~~~

After commit 38df6492eb ("et131x: Add PCIe gigabit ethernet driver
et131x to drivers/net"), 'pm_csr' is never used in these functions,
so remove it to avoid the build warnings.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:43:40 -07:00
Zhang Changzhong
5686b10978 net: bna: Remove unused variable 't'
GCC reports the following warning:

drivers/net/ethernet/brocade/bna/bfa_ioc.c:1538:6: warning:
 variable 't' set but not used [-Wunused-but-set-variable]
 1538 |  u32 t;
      |      ^

After commit c107ba171f ("bna: Firmware Patch Simplification"),
't' is never used, so removing it to avoid build warning.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:42:28 -07:00
Alexandre Belloni
2ccb0161a0 net: macb: use phy_interface_mode_is_rgmii everywhere
There is one RGMII check not using the phy_interface_mode_is_rgmii()
helper. This prevents the driver from configuring the MAC properly when
using a phy-mode that is not just rgmii, e.g. rgmii-rxid. This became an
issue on sama5d3 xplained since the ksz9031 driver is handling phy-mode
properly and the phy-mode has to be set to rgmii-rxid.

Fixes: bcf3440c6d ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:32:35 -07:00
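The difference between an exact "rgmii" comparison and a helper that covers every RGMII variant, as described above, can be sketched as follows. The enum is a reduced, hypothetical copy made for this example (the assumption is that the four RGMII variants are contiguous, as they are in the kernel's PHY mode list), and the `sketch_*` names are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, hypothetical mode list; the RGMII variants are contiguous. */
enum sketch_phy_mode {
	SKETCH_MODE_MII,
	SKETCH_MODE_RGMII,
	SKETCH_MODE_RGMII_ID,
	SKETCH_MODE_RGMII_RXID,
	SKETCH_MODE_RGMII_TXID,
	SKETCH_MODE_SGMII,
};

/* Helper-style check: true for rgmii, rgmii-id, rgmii-rxid, rgmii-txid. */
static bool sketch_mode_is_rgmii(enum sketch_phy_mode mode)
{
	return mode >= SKETCH_MODE_RGMII && mode <= SKETCH_MODE_RGMII_TXID;
}

/* The kind of check being replaced: only plain "rgmii" is recognised. */
static bool sketch_exact_rgmii_check(enum sketch_phy_mode mode)
{
	return mode == SKETCH_MODE_RGMII;
}
```

For `rgmii-rxid` the exact comparison returns false while the range check returns true, which is the case that left the MAC misconfigured.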
Jakub Kicinski
18c7015cc6 net: bnxt: don't complain if TC flower can't be supported
The fact that NETIF_F_HW_TC is not set should be a sufficient
indication to the user that TC offloads are not supported.
No need to bother users of older firmware versions with
pointless warnings on every boot.

Also, since the support is optional, bnxt_init_tc() should not
return an error in case FW is old, similarly to how error
is not returned when CONFIG_BNXT_FLOWER_OFFLOAD is not set.

With that we can add an error message to the caller, to warn
about actual unexpected failures.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:26:20 -07:00
Nikita Danilov
23e500e887 net: atlantic: disable PTP on AQC111, AQC112
This patch disables PTP on AQC111 and AQC112 due to a known HW issue,
which can cause datapath issues.

Ideally PTP block should have been disabled via PHY provisioning, but
unfortunately many units have been shipped with enabled PTP block.
Thus, we have to work around this in the driver.

Fixes: dbcd6806af ("net: aquantia: add support for Phy access")
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:24:32 -07:00
David S. Miller
d44a919a5c mlx5-updates-2020-07-16
Fixes:
 1) Fix build break when CONFIG_XPS is not set
 2) Fix missing switch_id for representors
 
 Updates:
 1) IPsec XFRM RX offloads from Raed and Huy.
   - Added IPSec RX steering flow tables to NIC RX
   - Refactoring of the existing FPGA IPSec, to add support
     for ConnectX IPsec.
   - RX data path handling for IPSec traffic
   - Synchronize offloading device ESN with xfrm received SN
 
 2) Parav allows E-Switch to switch to switchdev mode directly without
    the need to go through legacy mode first.
 
 3) From Tariq, Misc updates including:
    3.1) indirect calls for RX and XDP handlers
    3.2) Make MLX5_EN_TLS non-prompt as it should always be enabled when
         TLS and MLX5_EN are selected.

Merge tag 'mlx5-updates-2020-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-07-16

Fixes:
1) Fix build break when CONFIG_XPS is not set
2) Fix missing switch_id for representors

Updates:
1) IPsec XFRM RX offloads from Raed and Huy.
  - Added IPSec RX steering flow tables to NIC RX
  - Refactoring of the existing FPGA IPSec, to add support
    for ConnectX IPsec.
  - RX data path handling for IPSec traffic
  - Synchronize offloading device ESN with xfrm received SN

2) Parav allows E-Switch to switch to switchdev mode directly without
   the need to go through legacy mode first.

3) From Tariq, Misc updates including:
   3.1) indirect calls for RX and XDP handlers
   3.2) Make MLX5_EN_TLS non-prompt as it should always be enabled when
        TLS and MLX5_EN are selected.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 13:04:17 -07:00
Doug Berger
a8c64542b4 net: bcmgenet: restore HFB filters on resume
The Hardware Filter Block RAM may not be preserved when the GENET
block is reset during a deep sleep, so it is not sufficient to
only backup and restore the enables.

This commit clears out the HFB block and reprograms the rxnfc
rules when the system resumes from a suspended state. To support
this the bcmgenet_hfb_create_rxnfc_filter() function is modified
to access the register space directly so that it can't fail due
to memory allocation issues.

Fixes: f50932cca6 ("net: bcmgenet: add WAKE_FILTER support")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 13:00:27 -07:00
Doug Berger
3d653adb4b net: bcmgenet: test RBUF_ACPI_EN when resuming
When the GENET driver resumes from deep sleep the UMAC_CMD
register may not be accessible and therefore should not be
accessed from bcmgenet_wol_power_up_cfg() if the GENET has
been reset.

This commit adds a check of the RBUF_ACPI_EN flag when Wake
on Filter is enabled. A clear flag indicates that the GENET
hardware must have been reset so the remainder of the
hardware programming is bypassed.

Fixes: f50932cca6 ("net: bcmgenet: add WAKE_FILTER support")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 13:00:27 -07:00
Doug Berger
2f11f0df84 net: bcmgenet: test MPD_EN when resuming
When the GENET driver resumes from deep sleep the UMAC_CMD
register may not be accessible and therefore should not be
accessed from bcmgenet_wol_power_up_cfg() if the GENET has
been reset.

This commit adds a check of the MPD_EN flag when Wake on
Magic Packet is enabled. A clear flag indicates that the
GENET hardware must have been reset so the remainder of the
hardware programming is bypassed.

Fixes: 1a1d5106c1 ("net: bcmgenet: move clk_wol management to bcmgenet_wol")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 13:00:27 -07:00
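The resume-time check described above can be sketched in userspace as follows. The struct, the bit position of the flag, and the return convention are all hypothetical; the sketch only shows the idea that a cleared enable flag is taken as evidence of a hardware reset, so the remaining programming is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MPD_EN (1u << 0)	/* hypothetical bit position */

struct sketch_genet {
	uint32_t mpd_ctrl;	/* stand-in for the register holding the flag */
	bool reprogrammed;	/* set when the normal power-up path ran */
};

/* Returns 0 if the wake-up state survived, -1 if a reset was detected. */
static int sketch_wol_power_up(struct sketch_genet *priv)
{
	assert(priv != NULL);

	if (!(priv->mpd_ctrl & SKETCH_MPD_EN))
		return -1;	/* flag lost: block was reset, bypass the rest */

	priv->mpd_ctrl &= ~SKETCH_MPD_EN;	/* leave magic-packet mode */
	priv->reprogrammed = true;
	return 0;
}
```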
Christophe JAILLET
721dab2b56 net: alteon: Avoid some useless memset
Avoid a memset after a call to 'dma_alloc_coherent()'.
This is useless since
commit 518a2f1925 ("dma-mapping: zero memory returned from dma_alloc_*")

Replace a kmalloc+memset with a corresponding kzalloc.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:57:59 -07:00
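The kmalloc+memset to kzalloc replacement mentioned above has a direct userspace analogue, shown here with malloc()/memset() versus calloc(). The `sketch_ring` type and both function names are hypothetical; the point is only that the zeroing allocator gives the same result in a single call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure standing in for a driver-private object. */
struct sketch_ring {
	unsigned int head;
	unsigned int tail;
	void *slots[8];
};

/* Two-step form: allocate, then clear by hand. */
static struct sketch_ring *sketch_ring_alloc_two_step(void)
{
	struct sketch_ring *ring = malloc(sizeof(*ring));

	if (ring)
		memset(ring, 0, sizeof(*ring));
	return ring;
}

/* One-step form: the allocator returns zeroed memory. */
static struct sketch_ring *sketch_ring_alloc_zeroed(void)
{
	return calloc(1, sizeof(struct sketch_ring));
}
```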
Christophe JAILLET
f4079e5d72 net: alteon: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'ace_allocate_descriptors()' and
'ace_init()' GFP_KERNEL can be used because both functions are called from
the probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:57:21 -07:00
Christophe JAILLET
8d4f62ca19 net: sungem: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'gem_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:56:40 -07:00
Christophe JAILLET
dcc82bb072 net: sun: cassini: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'cas_tx_tiny_alloc()', GFP_KERNEL can be used
because a few lines below in its only caller, 'cas_alloc_rxds()', is also
called. This function makes an explicit use of GFP_KERNEL.

When memory is allocated in 'cas_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:52:56 -07:00
Wang Hai
bca9749b1a net: smc91x: Fix possible memory leak in smc_drv_probe()
If try_toggle_control_gpio() failed in smc_drv_probe(), free_netdev(ndev)
should be called to free the ndev created earlier. Otherwise, a memleak
will occur.

Fixes: 7d2911c438 ("net: smc91x: Fix gpios for device tree based booting")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:44:42 -07:00
Eli Britstein
54b154ecfb net/mlx5e: CT: Map 128 bits labels to 32 bit map ID
The 128 bits ct_label field is matched using a 32 bit hardware register.
As such, only the lower 32 bits of ct_label field are offloaded. Change
this logic to support setting and matching higher bits too.
Map the 128 bits data to a unique 32 bits ID. Matching is done as exact
match of the mapping ID of key & mask.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Maor Dickman <maord@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:37:00 -07:00
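The idea of mapping a 128-bit value to a unique 32-bit ID, as described above, can be sketched with a small lookup-or-insert table. The fixed-size array, the linear search, and all names are simplifications assumed for this example; the driver uses its own mapping infrastructure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LABEL_WORDS	4	/* 4 x 32 bits = 128 bits */
#define MAP_CAPACITY	16

struct label_map {
	uint32_t labels[MAP_CAPACITY][LABEL_WORDS];
	uint32_t used;
};

/*
 * Return the ID for a 128-bit label, adding it if it is new.
 * IDs start at 1; 0 means the table is full.
 */
static uint32_t label_map_get_id(struct label_map *map,
				 const uint32_t label[LABEL_WORDS])
{
	uint32_t i;

	assert(map != NULL && label != NULL);

	for (i = 0; i < map->used; i++)
		if (memcmp(map->labels[i], label, sizeof(map->labels[i])) == 0)
			return i + 1;	/* known label: same ID as before */

	if (map->used == MAP_CAPACITY)
		return 0;

	memcpy(map->labels[map->used], label, sizeof(map->labels[0]));
	return ++map->used;
}
```

Two labels that differ only in their upper bits get different IDs, so a 32-bit exact match on the ID can distinguish them.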
Tariq Toukan
0bdc89b39d net/mlx5e: Do not request completion on every single UMR WQE
UMR WQEs are posted in bulks, and HW is notified once per bulk.
Reduce the number of completions by requesting one only for
the last WQE of the bulk.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:58 -07:00
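A minimal sketch of the policy above: when a bulk of work requests is prepared, only the last one asks for a completion. The `sketch_wqe` type and the function name are hypothetical and carry none of the real WQE layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work request carrying only the flag of interest. */
struct sketch_wqe {
	bool request_completion;
};

/* Flag a bulk for posting; returns how many completions it will raise. */
static size_t sketch_prepare_bulk(struct sketch_wqe *wqes, size_t count)
{
	size_t i, completions = 0;

	assert(wqes != NULL || count == 0);

	for (i = 0; i < count; i++) {
		/* Only the final WQE of the bulk requests a completion. */
		wqes[i].request_completion = (i + 1 == count);
		if (wqes[i].request_completion)
			completions++;
	}
	return completions;
}
```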
Tariq Toukan
2901a5c618 net/mlx5e: RX, Avoid indirect call in representor CQE handling
Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call
when/if CONFIG_RETPOLINE=y.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:56 -07:00
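The technique behind the helper named above can be approximated in userspace: compare the function pointer against the likely targets and call the match directly, falling back to the indirect call otherwise. The `SKETCH_*` macros below are simplified stand-ins written for this example (no branch hints, no config dependency), and the handlers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: direct call when the pointer is a known target. */
#define SKETCH_INDIRECT_CALL_1(f, f1, ...) \
	((f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))
#define SKETCH_INDIRECT_CALL_2(f, f2, f1, ...) \
	((f) == (f2) ? f2(__VA_ARGS__) : \
	 SKETCH_INDIRECT_CALL_1(f, f1, __VA_ARGS__))

typedef int (*sketch_handler_t)(int);

/* Hypothetical handlers standing in for the two common CQE handlers. */
static int sketch_handle_a(int x) { return x + 1; }
static int sketch_handle_b(int x) { return x * 2; }
/* A third handler that is not on the fast-path list. */
static int sketch_handle_other(int x) { return -x; }

static int sketch_dispatch(sketch_handler_t handler, int x)
{
	assert(handler != NULL);
	return SKETCH_INDIRECT_CALL_2(handler, sketch_handle_a,
				      sketch_handle_b, x);
}
```

The result is the same whichever branch is taken; the gain is that the two expected targets are reached through direct calls, which avoids the retpoline cost of an indirect one.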
Tariq Toukan
93761ca17e net/mlx5e: XDP, Avoid indirect call in TX flow
Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call
when/if CONFIG_RETPOLINE=y.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:54 -07:00
Raed Salem
7ed92f97a1 net/mlx5e: IPsec: Add Connect-X IPsec ESN update offload support
Synchronize offloading device ESN with xfrm received SN
by updating an existing IPsec HW context with the new SN.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:51 -07:00
Raed Salem
b2ac7541e3 net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload
On the receive flow, inspect received packets for an IPsec offload
indication using the CQE. For IPsec offloaded packets, propagate the
offload status and stack handle to the stack for further processing.

Supported statuses:
- Offload ok.
- Authentication failure.
- Bad trailer indication.

Connect-X IPsec does not use mlx5e_ipsec_handle_rx_cqe.

With RX-only offload we see a bandwidth gain. Below is the iperf3
performance report on two 24-core servers (Intel(R) Xeon(R)
CPU E5-2620 v3 @ 2.40GHz) with ConnectX6-DX.
We use one thread per IPsec tunnel.

----------------------------------------------------------------------
Mode           |  Num tunnel | BW     | Send CPU util | Recv CPU util
               |             | (Gbps) | (Average %)   | (Average %)
----------------------------------------------------------------------
Crypto offload | 1           | 4.6    | 4.2           | 14.5
----------------------------------------------------------------------
Crypto offload | 24          | 38     | 73            | 63
----------------------------------------------------------------------
Non-offload    | 1           | 4      | 4             | 13
----------------------------------------------------------------------
Non-offload    | 24          | 23     | 52            | 67

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:49 -07:00
Huy Nguyen
5e46634529 net/mlx5e: IPsec: Add IPsec steering in local NIC RX
Introduce decrypt FT, the RX error FT and the default rules.

The IPsec RX decrypt flow table is pointed to by the TTC
(Traffic Type Classifier) ESP steering rules.
The decrypt flow table has two flow groups. The first flow group
keeps the decrypt steering rule programmed via the "ip xfrm s" interface.
The second flow group has a default rule to forward all non-offloaded
ESP packets to the TTC ESP default RSS TIR.

The RX error flow table is the destination of the decrypt steering rules
in the IPsec RX decrypt flow table. It has a fixed rule with a single
copy action that copies ipsec_syndrome to metadata_regB[0:6]. The IPsec
syndrome is used to filter out non-IPsec packets and to return the IPsec
crypto offload status in the Rx flow. The destination of the RX error flow
table is the TTC ESP default RSS TIR.

All the FTs (decrypt FT and error FT) are created only when IPsec SAs
are added. If there are no IPsec SAs, the FTs are removed.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:48 -07:00
Raed Salem
2d64663cd5 net/mlx5: IPsec: Add HW crypto offload support
This patch adds support for Connect-X IPsec crypto offload
by implementing the IPsec acceleration layer needed routines,
which delegates IPsec offloads to Connect-X routines.

In Connect-X IPsec, a Security Association (SA) is added or deleted
via allocating a HW context of an encryption/decryption key and
a HW context of a matching SA (IPsec object).
The Security Policy (SP) is added or deleted by creating matching Tx/Rx
steering rules with an action of encryption/decryption respectively,
executed using the previously allocated SA HW context.

When new xfrm state (SA) is added:
- Use a separate crypto key HW context.
- Create a separate IPsec context in HW to include the SA properties:
 - aes-gcm salt.
 - ICV properties (ICV length, implicit IV).
 - on supported devices also update ESN.
 - associate the allocated crypto key with this IPsec context.

Introduce a new compilation flag MLX5_IPSEC for it.

Downstream patches will implement the Rx/Tx steering
and will add the ESN update.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:44 -07:00
Raed Salem
9a6ad1ad71 net/mlx5: Accel, Add core IPsec support for the Connect-X family
This sets the base for downstream patches to support
the new IPsec implementation of the Connect-X family.

The following modifications are made:
- Remove accel layer dependency from MLX5_FPGA_IPSEC.
- Introduce accel_ipsec_ops, each IPsec device will
  have to support these ops.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:42 -07:00
Parav Pandit
ea2128fd63 net/mlx5: E-switch, Reduce dependency on num_vfs during mode set
Currently only ECPF allows enabling eswitch when SR-IOV is disabled.

Allow the PF to enable eswitch as well when SR-IOV is disabled.
Load VF vports when eswitch is already enabled.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:40 -07:00
Parav Pandit
3d5f41ca01 net/mlx5: E-switch, Avoid function change handler for non ECPF
For a non-ECPF eswitch manager function, vports are already
enabled/disabled when the eswitch is enabled/disabled respectively.
Simplify the function change handler for such an eswitch manager function.

Therefore, ECPF is the only one that retains the PF/VF function change
handler.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:38 -07:00
Tariq Toukan
e21feb88f7 net/mlx5: Make MLX5_EN_TLS non-prompt
TLS runs only over Eth, and the Eth driver is the only user of
the core TLS functionality.
There is no point in having the core functionality without its use
in the Eth driver.
Hence, let both TLS core implementations depend on MLX5_CORE_EN,
and select MLX5_EN_TLS.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-16 16:36:36 -07:00
Saeed Mahameed
8b5ec43d73 net/mlx5e: Fix build break when CONFIG_XPS is not set
mlx5e_accel_sk_get_rxq is only used in ktls_rx.c, which already
depends on XPS to be compiled. Move it from the generic en_accel.h
header to be local to ktls_rx.c to fix the build break below:

In file included from
../drivers/net/ethernet/mellanox/mlx5/core/en_main.c:49:0:
../drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h:
In function ‘mlx5e_accel_sk_get_rxq’:
../drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h:153:12:
error: implicit declaration of function ‘sk_rx_queue_get’ ...
  int rxq = sk_rx_queue_get(sk);
            ^~~~~~~~~~~~~~~

Fixes: 1182f36593 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
2020-07-16 16:36:33 -07:00
Parav Pandit
1315971fea net/mlx5e: Fix missing switch_id for representors
The commit cited in the Fixes tag failed to set the switch id of the PF
and VF ports. As a result, flows cannot be offloaded; a simple command
like the one below fails with the following error.

tc filter add dev ens2f0np0 parent ffff: prio 1 flower \
 dst_mac 00:00:00:00:00:00/00:00:00:00:00:00 skip_sw \
 action mirred egress redirect dev ens2f0np0pf0vf0

Error: mlx5_core: devices are not on same switch HW, can't offload forwarding.

Hence, fix it by setting the switch id for each PF and VF representor
port, as was done before the cited commit.

Fixes: 71ad8d55f8 ("devlink: Replace devlink_port_attrs_set parameters with a struct")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-07-16 16:36:31 -07:00
Vladimir Oltean
89e35f66d5 net: mscc: ocelot: rethink Kconfig dependencies again
Having the users of MSCC_OCELOT_SWITCH_LIB depend on REGMAP_MMIO was a
bad idea, since that symbol is not user-selectable. So we should have
kept a 'select REGMAP_MMIO'.

When we do that, we run into 2 more problems:

- By depending on GENERIC_PHY, we are causing a recursive dependency.
  But it looks like GENERIC_PHY has no other dependencies, and other
  drivers select it, so we can select it too:

drivers/of/Kconfig:69:error: recursive dependency detected!
drivers/of/Kconfig:69:  symbol OF_IRQ depends on IRQ_DOMAIN
kernel/irq/Kconfig:68:  symbol IRQ_DOMAIN is selected by REGMAP
drivers/base/regmap/Kconfig:7:  symbol REGMAP default is visible depending on REGMAP_MMIO
drivers/base/regmap/Kconfig:39: symbol REGMAP_MMIO is selected by MSCC_OCELOT_SWITCH_LIB
drivers/net/ethernet/mscc/Kconfig:15:   symbol MSCC_OCELOT_SWITCH_LIB is selected by MSCC_OCELOT_SWITCH
drivers/net/ethernet/mscc/Kconfig:22:   symbol MSCC_OCELOT_SWITCH depends on GENERIC_PHY
drivers/phy/Kconfig:8:  symbol GENERIC_PHY is selected by PHY_BCM_NS_USB3
drivers/phy/broadcom/Kconfig:41:        symbol PHY_BCM_NS_USB3 depends on MDIO_BUS
drivers/net/phy/Kconfig:13:     symbol MDIO_BUS depends on MDIO_DEVICE
drivers/net/phy/Kconfig:6:      symbol MDIO_DEVICE is selected by PHYLIB
drivers/net/phy/Kconfig:254:    symbol PHYLIB is selected by ARC_EMAC_CORE
drivers/net/ethernet/arc/Kconfig:19:    symbol ARC_EMAC_CORE is selected by ARC_EMAC
drivers/net/ethernet/arc/Kconfig:25:    symbol ARC_EMAC depends on OF_IRQ

- By depending on PHYLIB, we are causing a recursive dependency. PHYLIB
  only has a single dependency, "depends on NETDEVICES", which we are
  already depending on, so we can again hack our way into conformance by
  turning the PHYLIB dependency into a select.

drivers/of/Kconfig:69:error: recursive dependency detected!
drivers/of/Kconfig:69:  symbol OF_IRQ depends on IRQ_DOMAIN
kernel/irq/Kconfig:68:  symbol IRQ_DOMAIN is selected by REGMAP
drivers/base/regmap/Kconfig:7:  symbol REGMAP default is visible depending on REGMAP_MMIO
drivers/base/regmap/Kconfig:39: symbol REGMAP_MMIO is selected by MSCC_OCELOT_SWITCH_LIB
drivers/net/ethernet/mscc/Kconfig:15:   symbol MSCC_OCELOT_SWITCH_LIB is selected by MSCC_OCELOT_SWITCH
drivers/net/ethernet/mscc/Kconfig:22:   symbol MSCC_OCELOT_SWITCH depends on PHYLIB
drivers/net/phy/Kconfig:254:    symbol PHYLIB is selected by ARC_EMAC_CORE
drivers/net/ethernet/arc/Kconfig:19:    symbol ARC_EMAC_CORE is selected by ARC_EMAC
drivers/net/ethernet/arc/Kconfig:25:    symbol ARC_EMAC depends on OF_IRQ

Fixes: f4d0323bae ("net: mscc: ocelot: convert MSCC_OCELOT_SWITCH into a library")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-16 12:46:00 -07:00
Kees Cook
3f649ab728 treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:35:15 -07:00
Sergey Organov
31bb1a560b net: fec: replace snprintf() with strlcpy() in fec_ptp_init()
There is no need to use snprintf() on a constant string, nor was using
a magic constant in the fixed code a good idea.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-16 11:32:03 -07:00
Sergey Organov
2b80308886 net: fec: get rid of redundant code in fec_ptp_set()
Code of the form "if(x) x = 0" replaced with "x = 0".

Code of the form "if(x == a) x = a" removed.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-16 11:31:58 -07:00
Sergey Organov
199560343e net: fec: initialize clock with 0 rather than current kernel time
Initializing with 0 makes it much easier to identify time stamps from
an otherwise uninitialized clock.

Initialization of PTP clock with current kernel time makes little sense as
PTP time scale differs from UTC time scale that kernel time represents.
It only leads to confusion when no actual PTP initialization happens, as
these time scales differ in a small integer number of seconds (37 at the
time of writing.)
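
To make the time-scale mismatch concrete, here is a minimal C sketch. It
assumes the 37 second offset quoted above; the macro and function names are
hypothetical and are not part of the fec driver.

```c
#include <stdint.h>

/*
 * Offset between the PTP time scale and UTC, in seconds.
 * 37 is the value quoted in the commit message ("at the time of
 * writing"); it changes whenever a leap second is introduced.
 */
#define PTP_UTC_OFFSET_SEC 37

/* Hypothetical helper: map a UTC seconds count onto the PTP time scale. */
int64_t ptp_from_utc_seconds(int64_t utc_sec)
{
	return utc_sec + PTP_UTC_OFFSET_SEC;
}
```

A clock seeded with kernel (UTC) time is therefore off by this amount from a
properly synchronized PTP clock, which is easy to mistake for a valid time.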

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-16 11:31:54 -07:00
Sergey Organov
e53a57e56f net: fec: enable to use PPS feature without time stamping
PPS feature could be useful even when hardware time stamping
of network packets is not in use, so remove offending check
for this condition from fec_ptp_enable_pps().

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-16 11:31:43 -07:00
Sergey Organov
340746398b net: fec: fix hardware time stamping by external devices
Fix support for external PTP-aware devices such as DSA or PTP PHY:

Make sure we never time stamp tx packets when hardware time stamping
is disabled.

Check for PTP PHY being in use and then pass ioctls related to time
stamping of Ethernet packets to the PTP PHY rather than handle them
ourselves. In addition, disable our own hardware time stamping in this
case.

Fixes: 6605b730c0 ("FEC: Add time stamping code and a PTP hardware clock")
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-16 11:20:38 -07:00
Michael Guralnik
4c2573e1f6 net/mlx5: Enable count action for rules with allow action
Enable the creation of rules with allow and count actions.
This enables using counters on egress flow tables.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-15 22:21:29 -07:00
Eli Cohen
1dcb6c36a5 net/mlx5: Support setting access rights of dma addresses
mlx5_fill_page_frag_array() is used to populate dma addresses to
resources that require it, such as QPs, RQs etc. When the resource is
used, PA list permissions are ignored. For resources that use MTT list,
the user is required to provide the access rights. Subsequent patches
use resources that require MTT lists, so modify API and implementation
to support that.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-15 22:21:29 -07:00
Ido Schimmel
af11e818a7 mlxsw: spectrum_acl: Offload FLOW_ACTION_POLICE
Offload action police when used with a flower classifier. The number of
dropped packets is read from the policer and reported to tc.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
deee0abc70 mlxsw: core_acl_flex_actions: Add police action
Add core functionality required to support police action in the policy
engine.

The utilized hardware policers are stored in a hash table keyed by the
flow action index. This allows policer sharing between multiple ACL
rules.
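
The sharing scheme can be sketched in plain C as a small table keyed by the
flow action index, with a reference count per entry. This is only an
illustration: the type names, the fixed table size and the linear probing
are assumptions, not the mlxsw implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define POLICER_TABLE_SIZE 16	/* hypothetical capacity */

struct policer_entry {
	int in_use;
	uint32_t fa_index;	/* key: flow action index */
	unsigned int refcount;	/* number of ACL rules sharing this policer */
};

struct policer_table {
	struct policer_entry slots[POLICER_TABLE_SIZE];
};

/* Probe every slot starting at the hashed position; NULL if not found. */
struct policer_entry *policer_lookup(struct policer_table *t, uint32_t fa_index)
{
	for (size_t i = 0; i < POLICER_TABLE_SIZE; i++) {
		struct policer_entry *e =
			&t->slots[(fa_index + i) % POLICER_TABLE_SIZE];

		if (e->in_use && e->fa_index == fa_index)
			return e;
	}
	return NULL;
}

/* Reuse an existing policer for this index, or claim a free slot. */
struct policer_entry *policer_get(struct policer_table *t, uint32_t fa_index)
{
	struct policer_entry *e = policer_lookup(t, fa_index);

	if (e) {
		e->refcount++;
		return e;
	}
	for (size_t i = 0; i < POLICER_TABLE_SIZE; i++) {
		e = &t->slots[(fa_index + i) % POLICER_TABLE_SIZE];
		if (!e->in_use) {
			e->in_use = 1;
			e->fa_index = fa_index;
			e->refcount = 1;
			return e;
		}
	}
	return NULL;	/* table full */
}

/* Drop one reference; the slot is released when the last user is gone. */
void policer_put(struct policer_entry *e)
{
	if (--e->refcount == 0)
		e->in_use = 0;
}
```

Two rules that reference the same flow action index get the same entry back
from policer_get(), and the entry disappears only after both have called
policer_put().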

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
d25b8f6ebc mlxsw: core_acl_flex_actions: Work around hardware limitation
In the policy engine, each ACL rule points to an action block where the
ACL actions are stored. Each action block consists of one or more action
sets. Each action set holds one or more individual actions, up to a
maximum queried from the device. For example:

                        Action set #1               Action set #2

+----------+          +--------------+            +--------------+
| ACL rule +---------->  Action #1   |      +----->  Action #4   |
+----------+          +--------------+      |     +--------------+
                      |  Action #2   |      |     |  Action #5   |
                      +--------------+      |     +--------------+
                      |  Action #3   +------+     |              |
                      +--------------+            +--------------+

                      <---------+ Action block +----------------->

The hardware has a limitation that prevents a policing action
(MLXSW_AFA_POLCNT_CODE when used with a policer, not a counter) from
being configured in the same action set with a trap action (i.e.,
MLXSW_AFA_TRAP_CODE or MLXSW_AFA_TRAPWU_CODE). Note that the latter are
used to implement multiple actions: 'trap', 'mirred', 'drop'.

Work around this limitation by teaching mlxsw_afa_block_append_action()
to create a new action set not only when there is no more room left in
the current set, but also when there is a conflict between previously
mentioned actions.
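
The append rule described above can be sketched as follows. This is a
simplified model, not the code of mlxsw_afa_block_append_action(): the
structure, the enum and the per-set maximum of 3 are assumptions (the real
maximum is queried from the device).

```c
#include <stdbool.h>

#define MAX_ACTIONS_PER_SET 3	/* hypothetical; queried from the device in reality */

enum action_kind { ACTION_OTHER, ACTION_POLICE, ACTION_TRAP };

struct action_block {
	int num_sets;		/* action sets allocated so far */
	int cur_count;		/* actions stored in the current set */
	bool cur_has_police;
	bool cur_has_trap;
};

void block_init(struct action_block *b)
{
	b->num_sets = 1;
	b->cur_count = 0;
	b->cur_has_police = false;
	b->cur_has_trap = false;
}

/* A policing action and a trap action must not share an action set. */
bool block_conflicts(const struct action_block *b, enum action_kind kind)
{
	return (kind == ACTION_POLICE && b->cur_has_trap) ||
	       (kind == ACTION_TRAP && b->cur_has_police);
}

void block_append_action(struct action_block *b, enum action_kind kind)
{
	/* Open a new set when the current one is full or would conflict. */
	if (b->cur_count == MAX_ACTIONS_PER_SET || block_conflicts(b, kind)) {
		b->num_sets++;
		b->cur_count = 0;
		b->cur_has_police = false;
		b->cur_has_trap = false;
	}
	b->cur_count++;
	if (kind == ACTION_POLICE)
		b->cur_has_police = true;
	if (kind == ACTION_TRAP)
		b->cur_has_trap = true;
}
```

Appending a police action followed by a trap action yields two action sets
even though the first set still has room.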

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
bf038f0372 mlxsw: spectrum_policer: Add devlink resource support
Expose via devlink-resource the maximum number of single-rate policers
and their current occupancy. Example:

$ devlink resource show pci/0000:01:00.0
...
  name global_policers size 1000 unit entry dpipe_tables none
    resources:
      name single_rate_policers size 968 occ 0 unit entry dpipe_tables none

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
8d3fbae70d mlxsw: spectrum_policer: Add policer core
Add common code to handle all policer-related functionality in mlxsw.
Currently, only policers for policy engines are supported, but in the
future more policer families will be added, such as CPU (trap) policers
and storm control policers.

The API allows different modules to add / delete policers and read their
drop counter.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
1b744fc9f8 mlxsw: resources: Add resource identifier for global policers
Add a resource identifier for maximum global policers so that it could
be later used to query the information from firmware.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:09:59 -07:00
Ido Schimmel
fbf0f5d185 mlxsw: reg: Add policer bandwidth limits
Add policer bandwidth limits for both rate and burst size so that they
could be enforced by a later patch.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:09:59 -07:00
Luo bin
5e126e7c4e hinic: add firmware update support
Add support for updating firmware through the devlink flashing API.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:06:44 -07:00
Alexander A. Klimov
e63a228284 net: sundance: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 17:41:34 -07:00
Ioana Ciornei
841eb4012c dpaa2-eth: check fsl_mc_get_endpoint for IS_ERR_OR_NULL()
The fsl_mc_get_endpoint() function can return an error or directly a
NULL pointer in case the peer device is not under the root DPRC
container. Handle this case as well; otherwise it would lead to a NULL
pointer dereference when trying to access the peer fsl_mc_device.
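
The pattern of a lookup that can return a valid pointer, NULL, or an encoded
error can be sketched in userspace C. The lowercase helpers below mimic the
kernel's ERR_PTR(), IS_ERR() and IS_ERR_OR_NULL(); get_endpoint() and
connect_endpoint() are hypothetical stand-ins, not the dpaa2-eth code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *err_ptr(long error)
{
	return (void *)error;
}

static inline bool is_err(const void *ptr)
{
	/* Error codes live in the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}

struct device {
	int id;
};

/* Hypothetical lookup with the three possible outcomes. */
struct device *get_endpoint(int scenario)
{
	static struct device peer = { .id = 1 };

	switch (scenario) {
	case 0:
		return &peer;			/* peer found */
	case 1:
		return NULL;			/* peer not reachable */
	default:
		return err_ptr(-ENODEV);	/* lookup failed */
	}
}

/* Returns the peer id, or -1 when there is no usable peer. */
int connect_endpoint(int scenario)
{
	struct device *dev = get_endpoint(scenario);

	if (is_err_or_null(dev))
		return -1;
	return dev->id;	/* safe: dev is neither NULL nor an error value */
}
```

Checking only for an error pointer would let the NULL case fall through to
the dev->id access.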

Fixes: 7194792308 ("dpaa2-eth: add MAC/PHY support through phylink")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 17:31:52 -07:00
Jakub Kicinski
78c6bc2bdf qlcnic: convert to new udp_tunnel_nic infra
Straightforward conversion to new infra, 1 VxLAN port, handler
may sleep.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
8cd160a294 qede: convert to new udp_tunnel_nic infra
Convert to new infra. Looks like this driver was not doing
ref counting, and sleeping in the callback.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
f7529b4ba3 fm10k: convert to new udp_tunnel_nic infra
Straightforward conversion to new infra. Driver restores info
after close/open cycle by calling its internal restore function
so just use that, no need for udp_tunnel_nic_reset_ntf() here.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
6a8c1a75e5 liquidio_vf: convert to new udp_tunnel_nic infra
Carbon copy of the previous change.

This driver is just a super thin FW interface, but Derek let us
know the table has 1024 entries.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
3fcd2ba10f liquidio: convert to new udp_tunnel_nic infra
This driver is just a super thin FW interface, but Derek let us
know the table has 1024 entries.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
fc9a7def5d enic: convert to new udp_tunnel_nic infra
Convert to new infra, now the refcounting will be correct,
and driver gets port replay of other ports when offloaded
port gets removed.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
ad166a8ec2 cxgb4: convert to new udp_tunnel_nic infra
Convert to new infra, this driver is very simple. The check of
adapter->rawf_cnt in cxgb_udp_tunnel_unset_port() is kept from
the old port deletion function but it's dodgy since nothing ever
updates that member once it's set during init. Also, the .set_port
callback always adds the raw mac filter.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
085c5c42e3 bnx2x: convert to new udp_tunnel_nic infra
Fairly straightforward conversion - no need to keep track
of the use count, and replay when ports get removed, also
callbacks can just sleep.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
4df587ab87 xgbe: convert to new udp_tunnel_nic infra
Make use of the new udp_tunnel_nic infra. Don't clear the features
when VxLAN port is not present to make all drivers behave the same.
Driver will now (until we address the problem in the core) leave
the RX UDP tunnel feature always on, since this is what most drivers
do.

Remove the list of VxLAN ports, just program the one core told us to.

The driver seems to want to clear the VxLAN ports on close but it
doesn't seem to flush the port list properly so it'd get wrong
use counts after close/open. Again since it calls its own open
handler we need the reset notification hack.

v2:
 - fix kbuild warning

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
b5c5f8d062 xgbe: switch to more generic VxLAN detection
Instead of looping through the list of ports just check
if the geometry of the packet is correct for VxLAN.
HW most likely doesn't care about the exact port, anyway,
since only first port is actually offloaded, and this way
we won't have to maintain the port list at all.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
8f0545d232 be2net: convert to new udp_tunnel_nic infra
Convert be2net to new udp_tunnel_nic infra. NIC only takes one VxLAN
port. Remove the port tracking using a list. The warning in
be_work_del_vxlan_port() looked suspicious - like the driver expected
ports to be removed in order of addition.

be2net unregisters ports when going down and re-registers them (for
skyhawk) when coming up, but it never checks if the device is up
in the add_port / del_port callbacks. Make it use
UDP_TUNNEL_NIC_INFO_OPEN_ONLY. Sadly this driver calls its own
open/close functions directly so the udp_tunnel_nic_reset_ntf()
workaround is needed.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Jakub Kicinski
641ca08547 nfp: convert to new udp_tunnel_nic infra
NFP conversion is pretty straightforward. We want to be able
to sleep, and only get callbacks when the device is open.

NFP did not ask for port replay when ports were removed, now
new infra will provide this feature for free.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:04:27 -07:00
Laurence Oberman
1d61e21852 qed: Disable "MFW indication via attention" SPAM every 5 minutes
This is likely caused by firmware, but it's starting to annoy customers.
Change the message level to verbose to prevent the spam.
Note that this seems to only show up with ISCSI enabled on the HBA via the
qedi driver.

Signed-off-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 15:15:44 -07:00
Christophe JAILLET
81adcd65b6 ksz884x: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'ksz_alloc_desc()', GFP_KERNEL can be used
because a few lines below, GFP_KERNEL is also used in the
'ksz_alloc_soft_desc()' calls.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 15:10:29 -07:00
Heiner Kallweit
0439297be9 r8169: add support for RTL8125B
Add support for RTL8125B rev.b. In my tests 2.5Gbps worked well
w/o firmware, however for a stable link at 1Gbps firmware revision
0.0.2 is needed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 15:07:16 -07:00
Ido Schimmel
6a8c101e07 mlxsw: core: Use mirror reason during Rx listener lookup
The Rx listener abstraction allows the switch driver (e.g.,
mlxsw_spectrum) to register a function that is called when a packet is
received (trapped) for a specific reason.

Up until now, the Rx listener lookup was solely based on the trap
identifier. However, when a packet is mirrored to the CPU the trap
identifier merely indicates that the packet was mirrored, but not why it
was mirrored. This makes it impossible for the switch driver to register
different Rx listeners for different mirror reasons.

Solve this by allowing the switch driver to register a Rx listener with
a mirror reason and by extending the Rx listener lookup to take the
mirror reason into account.
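
A minimal sketch of a lookup keyed on both fields is shown below. The
structure layout, the handler_id field and the convention that 0 means "no
mirror reason" are assumptions made for illustration; the mlxsw core uses its
own listener structures.

```c
#include <stddef.h>
#include <stdint.h>

struct rx_listener {
	uint16_t trap_id;
	uint8_t mirror_reason;	/* assumed: 0 when the packet was not mirrored */
	int handler_id;		/* stands in for the callback to invoke */
};

/* Match on the trap identifier and the mirror reason together. */
const struct rx_listener *rx_listener_find(const struct rx_listener *tbl,
					   size_t n, uint16_t trap_id,
					   uint8_t mirror_reason)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].trap_id == trap_id &&
		    tbl[i].mirror_reason == mirror_reason)
			return &tbl[i];
	}
	return NULL;
}
```

With the trap identifier alone, two listeners registered for the same
"mirrored" trap would be indistinguishable; adding the mirror reason to the
key lets each one be resolved separately.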

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:50 -07:00
Ido Schimmel
eacc86ec51 mlxsw: pci: Retrieve mirror reason from CQE during receive
In case the mirror reason is valid, retrieve it into the Rx information
so that it could be used during listener lookup in a later patch.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:50 -07:00
Ido Schimmel
a76423a144 mlxsw: pci: Add mirror reason field to CQEv2
The Completion Queue Element version 2 (CQEv2) includes a field called
'mirror_reason' which indicates why the packet was mirrored to the CPU.

Add the field so that it can be used by a later patch.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Ido Schimmel
0cc32c5b5c mlxsw: trap: Add trap identifiers for mirrored packets
Packets that are mirrored to the CPU port are trapped with one of eight
trap identifiers. Add them.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Amit Cohen
47e4b1620e mlxsw: reg: Increase trap identifier to 10 bits
The trap identifier was increased to 10 bits in new versions of the
Programmer's Reference Manual (PRM).

Increase it accordingly in the Host PacKet Trap (HPKT) register and in
the Completion Queue Element (CQE).

This is significant for subsequent patches that will introduce trap
identifiers which utilize the extended range.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Ido Schimmel
4039504e6a mlxsw: spectrum_span: Allow setting policer on a SPAN agent
When mirroring packets to the CPU port the mirrored packets are trapped
to the CPU. However, unlike other traps, it is not possible to set a
policer on the associated trap group. Instead, the policer needs to be
set on the SPAN agent.

Moreover, the policer ID must be within a specified range: From a
configurable (even) base ID to this base plus the maximum number of SPAN
agents.
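
The range constraint can be expressed as a small check. This sketch assumes
the upper bound is exclusive, which the text above does not state; the
function and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when policer_id may be bound to a SPAN agent, given an even
 * base ID and the maximum number of SPAN agents.
 * Assumption: valid IDs are base .. base + max_span_agents - 1.
 */
bool span_policer_id_valid(uint16_t policer_id, uint16_t base,
			   uint16_t max_span_agents)
{
	if (base % 2 != 0)
		return false;	/* the base ID must be even */

	return policer_id >= base &&
	       policer_id < (uint32_t)base + max_span_agents;
}
```

For example, with a base of 10 and 8 SPAN agents, IDs 10 through 17 pass the
check under this assumption.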

While the immediate use case is to set the policer on a SPAN agent that
mirrors to the CPU port, a policer can be set on any SPAN agent.
Therefore, the operation is implemented for all SPAN agent types.

Extend the SPAN agent request API to allow passing the desired policer
ID that should be bound to the SPAN agent. Return an error for
Spectrum-1, as it does not support policer setting on a SPAN agent.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Ido Schimmel
a120ecc3c5 mlxsw: spectrum_span: Allow passing parameters to SPAN agents
Currently, the only parameter of a SPAN agent is the netdev which
the SPAN agent should mirror to.

The next patch will add the ability to request a SPAN agent that mirrors
to a specific netdev and has a specific policer ID bound to it. This is
required when mirroring packets to the CPU port.

Therefore, encapsulate the sole parameter to mlxsw_sp_span_agent_get()
in a structure, so that it could later be extended with policer
information.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Ido Schimmel
fa8c08b8fc mlxsw: spectrum_span: Add support for mirroring towards CPU port
The Spectrum-2 and Spectrum-3 ASICs are able to mirror packets towards
the CPU. These packets are then trapped like any other packet, but with
a special packet trap and additional metadata such as why the packet was
mirrored.

The ability to mirror packets towards the CPU will be utilized by a
subsequent patch set that will mirror packets that were dropped by the
ASIC for various buffer-related reasons, such as tail-drop and
early-drop.

Add mirroring towards the CPU as a new SPAN agent type and re-use the
functions that mirror to a physical port where possible.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Ido Schimmel
6edc8beab4 mlxsw: spectrum_span: Do not dereference destination netdev
Currently, the destination netdev to which we mirror must be a valid
netdev. However, this is going to change with the introduction of
mirroring towards the CPU port, as the CPU port does not have a backing
netdev.

Avoid dereferencing the destination netdev when it is not clear if it is
valid or not.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Ido Schimmel
f4a626e2ca mlxsw: spectrum_span: Add driver private info to parms_set() callback
The parms_set() callback is supposed to fill in the parameters for the
SPAN agent, such as the destination port and encapsulation info, if any.

When mirroring to the CPU port we cannot resolve the destination port
(the CPU port) without access to the driver private info.

Pass the driver private info to parms_set() callback so that it could be
used later on to resolve the CPU port.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Ido Schimmel
34e4ace56f mlxsw: spectrum_span: Add per-ASIC SPAN agent operations
The various SPAN agent types differ in their mirror targets (i.e.,
physical port netdev vs. VLAN netdev) and the encapsulation headers that
they need to encapsulate the mirrored packets with.

The Spectrum-2 and Spectrum-3 ASICs support a SPAN agent type that is
able to mirror towards the CPU, whereas the Spectrum-1 ASIC does not.

Prepare for the addition of this new SPAN agent type by splitting the
SPAN agent operations to be per-ASIC.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Amit Cohen
95c68833fa mlxsw: reg: add mirroring_pid_base to MOGCR register
Allow setting mirroring_pid_base using MOGCR register.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Amit Cohen
ef8d57e6b7 mlxsw: reg: Add session_id and pid to MPAT register
Allow setting session_id and pid as part of port analyzer
configurations.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Maxim Kochetkov
ff021f22ea gianfar: Use random MAC address when none is given
If there is no valid MAC address in the device tree,
use a random MAC address.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:16:37 -07:00
Christophe JAILLET
8331bbe9ea net: neterion: vxge: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below. No GFP_
flag needs to be corrected.
It has been compile tested.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:07:49 -07:00
Christophe JAILLET
fb059b26bc net: neterion: s2io: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'init_shared_mem()' GFP_KERNEL can be used
because this flag is already used to allocate some memory in this function.

While at it, update some debug messages to match the new function names.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:07:49 -07:00
Christophe JAILLET
a3b7b49388 lan743x: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'lan743x_tx_ring_cleanup()' and
'lan743x_rx_ring_init()', GFP_KERNEL can be used because this flag is
already used to allocate some memory in these functions.

While at it, remove a useless (void *) cast in the first hunk so that
the code is more consistent.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 13:47:56 -07:00
Christophe JAILLET
da6e8ace56 pcnet32: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'pcnet32_realloc_tx_ring()' and
'pcnet32_realloc_rx_ring()', GFP_ATOMIC must be used because a spin_lock is
held.
The call chain is:
   pcnet32_set_ringparam
   ** spin_lock_irqsave(&lp->lock, flags);
   --> pcnet32_realloc_tx_ring
   --> pcnet32_realloc_rx_ring
   ** spin_unlock_irqrestore(&lp->lock, flags);

When memory is allocated in 'pcnet32_probe1()' and 'pcnet32_alloc_ring()',
GFP_KERNEL can be used.

While at it, update a few comments and pr_err messages to be more in line
with the new function names.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:54:31 -07:00
Christophe JAILLET
428f09c2b7 amd8111e: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'amd8111e_init_ring()', GFP_ATOMIC must be used
because a spin_lock is held.
One of the call chains is:
   amd8111e_open
   ** spin_lock_irq(&lp->lock);
   --> amd8111e_restart
      --> amd8111e_init_ring
   ** spin_unlock_irq(&lp->lock);

The rest of the patch is produced by coccinelle with a few adjustments to
please checkpatch.pl.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:49:28 -07:00
Alexander A. Klimov
d788a0b512 net: jme: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:45:04 -07:00
Alexander A. Klimov
a7d0278235 net: ethernet: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:42:45 -07:00
Maxim Kochetkov
aa92d836d5 net: mscc: ocelot: extend watermark encoding function
The ocelot_wm_encode function deals with setting thresholds for pause
frame start and stop. In Ocelot and Felix the register layout is the
same, but for Seville, it isn't. The easiest way to accommodate Seville
hardware configuration is to introduce a function pointer for setting
this up.
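
A minimal sketch of the function-pointer approach, assuming a hypothetical `struct wm_ops` and two illustrative encodings. The bit positions below are assumptions chosen for the example (a flag bit that switches the value to units of 16) and are not claimed to be the real register layouts of any of these switches.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical per-switch operations; only the encoder is shown. */
struct wm_ops {
	uint16_t (*wm_enc)(uint16_t value);
};

/* Illustrative layout A: bit 8 flags "value is in units of 16". */
static uint16_t wm_enc_layout_a(uint16_t value)
{
	if (value >= BIT(8))
		return BIT(8) | (value / 16);
	return value;
}

/* Illustrative layout B: same scheme, but the flag sits at bit 9. */
static uint16_t wm_enc_layout_b(uint16_t value)
{
	if (value >= BIT(9))
		return BIT(9) | (value / 16);
	return value;
}

static const struct wm_ops layout_a_ops = { .wm_enc = wm_enc_layout_a };
static const struct wm_ops layout_b_ops = { .wm_enc = wm_enc_layout_b };

/* Common code stays layout-agnostic and goes through the pointer. */
static uint16_t pause_threshold_encode(const struct wm_ops *ops,
				       uint16_t cells)
{
	return ops->wm_enc(cells);
}
```

The common threshold code then calls `pause_threshold_encode()` with whichever ops table the switch driver registered, instead of hard-coding one layout.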

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:02 -07:00
Maxim Kochetkov
541132f096 net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield
Seville has a different bitwise layout than Ocelot and Felix.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:02 -07:00
Vladimir Oltean
b39648079d net: mscc: ocelot: disable flow control on NPI interface
The Ocelot switches do not support flow control on Ethernet interfaces
where a DSA tag must be added. If pause frames are enabled, they will be
encapsulated in the DSA tag just like regular frames, and the DSA master
will not recognize them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:02 -07:00
Vladimir Oltean
e8e6e73db1 net: mscc: ocelot: split writes to pause frame enable bit and to thresholds
We don't want ocelot_port_set_maxlen to enable pause frame TX, just to
adjust the pause thresholds.

Move the unconditional enabling of pause TX to ocelot_init_port. There
is no good place to put such a setting because it shouldn't be
unconditional. But at the moment it is, and we're not changing that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00
Vladimir Oltean
886e1387c7 net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields
Currently Felix and Ocelot share the same bit layout in these per-port
registers, but Seville does not. So we need reg_fields for that.

Actually since these are per-port registers, we need to also specify the
number of ports, and register size per port, and use the regmap API for
multiple ports.

There's a more subtle point to be made about the other 2 register
fields:
- QSYS_SWITCH_PORT_MODE_SCH_NEXT_CFG
- QSYS_SWITCH_PORT_MODE_INGRESS_DROP_MODE
which we are not writing any longer, for 2 reasons:
- Using the previous API (ocelot_write_rix), we were only writing 1 for
  Felix and Ocelot, which was their hardware-default value, and which
  there was no intention of changing.
- In the case of SCH_NEXT_CFG, in fact Seville does not have this
  register field at all, and therefore, if we want to have common code
  we would be required to not write to it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00
Maxim Kochetkov
2789658fa3 soc: mscc: ocelot: add MII registers description
Add the register definitions for the MSCC MIIM MDIO controller in
preparation for seville_vsc9959.c to create its accessors for the
internal MDIO bus.

Since we've introduced elements to ocelot_regfields that are not
instantiated by felix and ocelot, we need to define the size of the
regfields arrays explicitly, otherwise ocelot_regfields_init, which
iterates up to REGFIELD_MAX, will fault on the undefined regfield
entries (if we're lucky).
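
A minimal sketch of the sizing issue, using a local stand-in `struct reg_field` and a hypothetical three-entry enum. It shows that an array with designated initializers is only as long as its highest designated index unless the size is given explicitly, in which case the missing entries are zero-filled.

```c
#include <stddef.h>

/* Hypothetical regfield IDs; REGFIELD_NEW is not set by the tables below. */
enum regfield {
	REGFIELD_A,
	REGFIELD_B,
	REGFIELD_NEW,
	REGFIELD_MAX,
};

/* Local stand-in for a register field description. */
struct reg_field {
	unsigned int reg;
	unsigned int lsb;
	unsigned int msb;
};

/* Implicit size: two elements, so index REGFIELD_NEW is out of bounds. */
static const struct reg_field implicit_fields[] = {
	[REGFIELD_A] = { 0x10, 0, 3 },
	[REGFIELD_B] = { 0x14, 4, 7 },
};

/* Explicit size: REGFIELD_MAX elements, unset entries are all zeroes. */
static const struct reg_field explicit_fields[REGFIELD_MAX] = {
	[REGFIELD_A] = { 0x10, 0, 3 },
	[REGFIELD_B] = { 0x14, 4, 7 },
};

/* Walk the full range, as an init loop would, counting populated entries. */
static size_t regfields_count_defined(const struct reg_field *fields)
{
	size_t n = 0;

	for (size_t i = 0; i < REGFIELD_MAX; i++)
		if (fields[i].reg != 0)
			n++;
	return n;
}
```

Passing `explicit_fields` to the loop is safe; passing `implicit_fields` would read past the end of the array, which is the fault the explicit size avoids.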

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00
Vladimir Oltean
91c724cfc0 net: mscc: ocelot: convert port registers to regmap
At the moment, there are some minimal register differences between
VSC7514 Ocelot and VSC9959 Felix. To be precise, the PCS1G registers are
missing from Felix because it was integrated with an NXP PCS.

But with VSC9953 Seville (not yet introduced), the register differences
are more pronounced.  The MAC registers are located at different offsets
within the DEV_GMII target. So we need to refactor the driver to keep a
regmap even for per-port registers. The callers of the ocelot_port_readl
and ocelot_port_writel were kept unchanged, only the implementation is
now more generic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00
Petr Machata
f6668eac22 mlxsw: spectrum_qdisc: Offload mirroring on RED qevent early_drop
The RED qevents early_drop and mark can be offloaded under the following
fairly strict conditions:

- At most one filter is configured at the qevent block
- The protocol is "any"
- The classifier is matchall
- The action is trap, sample, or mirror with the same conditions as
  with other SPAN offloads
- The hw_counters type is none
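
The conditions above can be sketched as a single validation function. Every type and name below is hypothetical, and the extra SPAN-specific conditions on the mirror action are deliberately not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a filter on a qevent block. */
enum cls_kind { CLS_MATCHALL, CLS_FLOWER };
enum act_kind { ACT_TRAP, ACT_SAMPLE, ACT_MIRROR, ACT_DROP };
enum hw_stats { HW_STATS_NONE, HW_STATS_IMMEDIATE };

struct filter {
	bool proto_any;
	enum cls_kind cls;
	enum act_kind act;
	enum hw_stats hw_stats;
};

/* Return true if the block's filters satisfy the listed conditions. */
static bool qevent_block_offloadable(const struct filter *filters, size_t n)
{
	if (n > 1)			/* at most one filter */
		return false;

	for (size_t i = 0; i < n; i++) {
		const struct filter *f = &filters[i];

		if (!f->proto_any)		/* protocol must be "any" */
			return false;
		if (f->cls != CLS_MATCHALL)	/* classifier must be matchall */
			return false;
		if (f->act != ACT_TRAP && f->act != ACT_SAMPLE &&
		    f->act != ACT_MIRROR)	/* trap, sample or mirror only */
			return false;
		if (f->hw_stats != HW_STATS_NONE) /* no hardware counters */
			return false;
	}
	return true;
}
```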

In this patchset, implement offload of mirror for early_drop qevent.
The ECN trigger is currently not implemented in the FW and therefore
the mark qevent is not supported.

The qevent notifications look exactly like regular block binding
notifications with a binder type that identifies them as qevents.
Therefore the details of processing this binding are fairly similar
to the matchall offload.

struct flow_block_offload.sch points at the qdisc in question. Use it to
figure out if the qdisc is offloaded at all and what TC it configures.
Bounce bindings on not-offloaded qdiscs.

Individual bindings are kept in a list so that several qevents can share
the same block and all binding points get configured as the configured
filters change.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Petr Machata
f7a439cbf1 mlxsw: spectrum_flow: Promote binder-type dispatch to spectrum.c
Two RED qevents have been introduced recently. From the point of view of a
driver, qevents are simply blocks with unusual binder types. However they
need to be handled by different logic than ACL-like flows.

Thus rename mlxsw_sp_setup_tc_block() to mlxsw_sp_setup_tc_block_clsact()
and move the binder-type dispatch from there to spectrum.c into a new
function of the original name. The new dispatcher is easier to extend with
new binder types.
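
A minimal sketch of such a dispatcher, with hypothetical binder-type names and stub handlers standing in for the real per-type setup functions. It only illustrates the shape: one switch at the top level, one handler per family of binder types, and an error for anything unknown.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical binder types; names are illustrative only. */
enum binder_type {
	BINDER_CLSACT_INGRESS,
	BINDER_CLSACT_EGRESS,
	BINDER_RED_EARLY_DROP,
	BINDER_OTHER,
};

/* Stub for the ACL-like (clsact) path. */
static int setup_tc_block_clsact(bool ingress)
{
	(void)ingress;
	return 0;
}

/* Stub for a qevent path, handled by different logic. */
static int setup_tc_block_qevent_early_drop(void)
{
	return 0;
}

/* Top-level dispatch on binder type; new types add one case each. */
static int setup_tc_block(enum binder_type type)
{
	switch (type) {
	case BINDER_CLSACT_INGRESS:
		return setup_tc_block_clsact(true);
	case BINDER_CLSACT_EGRESS:
		return setup_tc_block_clsact(false);
	case BINDER_RED_EARLY_DROP:
		return setup_tc_block_qevent_early_drop();
	default:
		return -EOPNOTSUPP;
	}
}
```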

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Petr Machata
b50f60a0c4 mlxsw: spectrum_matchall: Publish matchall data structures
A following patch introduces offloading of filters attached to blocks bound
to the RED tail_drop qevent. The only classifier that mlxsw will permit in
this role is matchall. mlxsw currently offloads matchall filters used with
clsact qdisc. The data structures used for that offload will come in handy
for the qevent offload as well. Publish them in spectrum.h.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Petr Machata
d928f82198 mlxsw: spectrum_flow: Drop an unused field
The field "dev" in struct mlxsw_sp_flow_block_binding is not used. Drop it.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Petr Machata
2c4950ea10 mlxsw: spectrum_flow: Convert a goto to a return
No clean-up is performed at the target label of this goto. Convert it to a
direct return.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Ido Schimmel
2bafb216e1 mlxsw: spectrum_span: Add APIs to enable / disable global mirroring triggers
While the binding of global mirroring triggers to a SPAN agent is
global, packets are only mirrored if they belong to a port and TC on
which the trigger was enabled. This allows, for example, mirroring
packets that were tail-dropped on a specific netdev.

Implement the operations that allow enabling / disabling a global
mirroring trigger on a specific port and TC.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Ido Schimmel
ab8c06b7b4 mlxsw: spectrum_span: Add support for global mirroring triggers
Global mirroring triggers are triggers that are only keyed by their
trigger, as opposed to per-port triggers, which are keyed by their
trigger and port.
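
The difference in keying can be sketched as a comparison function. The enum values and the `struct trigger_key` layout are hypothetical; the set of global triggers is taken from the description in this message (tail drop, early drop, ECN mark).

```c
#include <stdbool.h>

/* Hypothetical trigger identifiers. */
enum trigger {
	TRIG_INGRESS,		/* per-port */
	TRIG_EGRESS,		/* per-port */
	TRIG_TAIL_DROP,		/* global */
	TRIG_EARLY_DROP,	/* global */
	TRIG_ECN,		/* global */
};

struct trigger_key {
	enum trigger trigger;
	int port;		/* ignored for global triggers */
};

static bool trigger_is_global(enum trigger t)
{
	switch (t) {
	case TRIG_TAIL_DROP:
	case TRIG_EARLY_DROP:
	case TRIG_ECN:
		return true;
	default:
		return false;
	}
}

/* Global triggers match on trigger alone; per-port ones also on port. */
static bool trigger_key_equal(const struct trigger_key *a,
			      const struct trigger_key *b)
{
	if (a->trigger != b->trigger)
		return false;
	if (trigger_is_global(a->trigger))
		return true;
	return a->port == b->port;
}
```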

Such triggers allow mirroring packets that were tail/early dropped or
ECN marked to a SPAN agent.

Implement the previously added trigger operations for these global
triggers. Since such triggers are only supported from Spectrum-2
onwards, have the Spectrum-1 operations return an error.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Ido Schimmel
08a3641f26 mlxsw: spectrum_span: Prepare for global mirroring triggers
Currently, a SPAN agent can only be bound to a per-port trigger where
the trigger is either an incoming packet (INGRESS) or an outgoing packet
(EGRESS) to / from the port.

The subsequent patch will introduce the concept of global mirroring
triggers. The binding / unbinding of global triggers is different than
that of per-port triggers. Such triggers also need to be enabled /
disabled on a per-{port, TC} basis and are only supported from
Spectrum-2 onwards.

Add trigger operations that allow us to abstract these differences. Only
implement the operations for per-port triggers. Next patch will
implement the operations for global triggers.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:21 -07:00
Ido Schimmel
4bafb85ae2 mlxsw: spectrum_span: Move SPAN operations out of global file
The per-ASIC SPAN operations are relevant to the SPAN module and
therefore should be implemented there and not in the main driver file.
Move them.

These operations will be extended later on.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:21 -07:00
Amit Cohen
c0e3969b07 mlxsw: reg: Add Monitoring Port Analyzer Global Register
This register is used for global port analyzer configurations.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:21 -07:00
Amit Cohen
951b84d4ae mlxsw: reg: Add Monitoring Mirror Trigger Enable Register
This register is used to configure the mirror enable for different
mirror reasons.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:21 -07:00
Petr Machata
c40f4e50b6 net: sched: Pass qdisc reference in struct flow_block_offload
Previously, shared blocks were only relevant for the pseudo-qdiscs ingress
and clsact. Recently, a qevent facility was introduced, which allows
binding blocks to well-defined slots of a qdisc instance. RED in particular
got two qevents: early_drop and mark. Drivers that wish to offload these
blocks will be sent the usual notification, and need to know which qdisc it
is related to.

To that end, extend flow_block_offload with a "sch" pointer, and initialize
as appropriate. This prompts changes in the indirect block facility, which
now tracks the scheduler in addition to the netdevice. Update signatures of
several functions similarly.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:21 -07:00
Michael Chan
27640ce68d bnxt_en: Fix completion ring sizing with TPA enabled.
The current completion ring sizing formula is wrong with TPA enabled.
The formula assumes that the number of TPA completions is bounded by the
RX ring size, but that's not true.  TPA_START completions are immediately
recycled so they are not bounded by the RX ring size.  We must add
bp->max_tpa to the worst case maximum RX and TPA completions.

The completion ring can overflow because of this mistake.  When this
happens, the hardware disables the completion ring, causing RX and TX
traffic on that ring to stall.  This issue is generally exposed only
when the RX ring size is set very small.

Fix the formula by adding bp->max_tpa to the number of RX completions
if TPA is enabled.
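
A deliberately simplified sketch of the adjustment, assuming a hypothetical helper that receives the worst-case RX and TX completion counts as plain numbers. It is not the driver's actual sizing formula; it only shows the extra `max_tpa` term being added on the RX side when TPA is enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: worst-case number of completion ring entries.
 * TPA_START completions are not bounded by the RX ring size, so the
 * maximum number of concurrent TPA aggregations is added on top.
 */
static uint32_t cp_ring_entries(uint32_t rx_completions,
				uint32_t tx_completions,
				uint32_t max_tpa, bool tpa_enabled)
{
	if (tpa_enabled)
		rx_completions += max_tpa;

	return rx_completions + tx_completions;
}
```

With a very small RX ring the `max_tpa` term dominates, which is consistent with the overflow only showing up at small RX ring sizes.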

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12 15:29:01 -07:00
Vasundhara Volam
ca0c753815 bnxt_en: Init ethtool link settings after reading updated PHY configuration.
In a shared port PHY configuration, an async event is received when any of
the ports modifies the configuration. Ethtool link settings should be
initialised after reading the updated PHY configuration from firmware.

Fixes: b1613e78e9 ("bnxt_en: Add async. event logic for PHY configuration changes.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12 15:29:01 -07:00
Vasundhara Volam
163e9ef636 bnxt_en: Fix race when modifying pause settings.
The driver was modified to not rely on rtnl lock to protect link
settings about 2 years ago.  The pause setting was missed when
making that change.  Fix it by acquiring link_lock mutex before
calling bnxt_hwrm_set_pause().

Fixes: e2dc9b6e38 ("bnxt_en: Don't use rtnl lock to protect link change logic in workqueue.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12 15:29:01 -07:00
Christophe JAILLET
c86768cf5c net: sky2: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'sky2_alloc_buffers()', GFP_KERNEL can be used
because some other memory allocations in the same function already use this
flag.

When memory is allocated in 'sky2_probe()', GFP_KERNEL can be used
because another memory allocation in the same function already uses this
flag.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)
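
As an illustration of what the allocation rules above do to a call site, here is a small user-space sketch. The types and the dma_alloc_coherent()/dma_free_coherent() bodies are minimal stubs written for this example (they are not the kernel definitions), and demo_alloc_ring()/demo_free_ring() are hypothetical helpers. The point is only the shape of the converted call: the struct device is passed instead of the struct pci_dev, and the GFP flag becomes explicit.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal user-space stand-ins, NOT the kernel definitions. */
typedef unsigned long long dma_addr_t;
typedef unsigned int gfp_t;
#define GFP_KERNEL 0u

struct device { int unused; };
struct pci_dev { struct device dev; };

/* Stub allocator: hands back zeroed heap memory and a fake bus address. */
static void *dma_alloc_coherent(struct device *dev, size_t size,
				dma_addr_t *handle, gfp_t gfp)
{
	void *p = calloc(1, size);

	(void)dev;
	(void)gfp;
	*handle = (dma_addr_t)(size_t)p;
	return p;
}

static void dma_free_coherent(struct device *dev, size_t size,
			      void *cpu_addr, dma_addr_t handle)
{
	(void)dev;
	(void)size;
	(void)handle;
	free(cpu_addr);
}

/* Before: ring = pci_alloc_consistent(pdev, size, &dma);
 * After:  pass &pdev->dev and choose the GFP flag by hand.
 */
static void *demo_alloc_ring(struct pci_dev *pdev, size_t size,
			     dma_addr_t *dma)
{
	return dma_alloc_coherent(&pdev->dev, size, dma, GFP_KERNEL);
}

/* Before: pci_free_consistent(pdev, size, ring, dma); */
static void demo_free_ring(struct pci_dev *pdev, size_t size,
			   void *ring, dma_addr_t dma)
{
	dma_free_coherent(&pdev->dev, size, ring, dma);
}
```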

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12 15:25:52 -07:00
Christophe JAILLET
6d905436a2 net: skge: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'skge_up()', GFP_KERNEL can be used because
some other memory allocations done a few lines below in 'skge_ring_alloc()'
already use this flag.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12 15:25:52 -07:00
Andrew Lunn
5919305351 net: fec: Set max MTU size to allow the MTU to be changed
The FEC allocates 2K buffers, but loses some of it due to
alignment. It can however support an MTU bigger than the default. This
is particularly interesting when used in combination with Ethernet
switches supporting DSA, which have extra headers. The DSA core will
try to increase the MTU to support these extra headers. If the max
size defaults to that of standard Ethernet we get a warning. By
setting the max to what the driver actually supports, we avoid this
warning.
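
A rough sketch of the arithmetic behind such a limit is given below. The 14-byte Ethernet header and 4-byte FCS are standard values; the 64 bytes lost to alignment and the helper name demo_max_mtu() are assumptions made for illustration and are not taken from the driver.

```c
/* Standard Ethernet framing overhead. */
#define DEMO_ETH_HLEN		14	/* dst MAC + src MAC + EtherType */
#define DEMO_ETH_FCS_LEN	4	/* frame check sequence */

/* Assumed for illustration only. */
#define DEMO_BUF_SIZE		2048	/* the "2K" receive buffer */
#define DEMO_ALIGN_LOSS		64	/* bytes assumed lost to alignment */

/* Largest payload that still fits once alignment loss and the
 * Ethernet header and FCS have been subtracted from the buffer.
 */
static int demo_max_mtu(int buf_size, int align_loss)
{
	int max_frame = buf_size - align_loss;

	return max_frame - DEMO_ETH_HLEN - DEMO_ETH_FCS_LEN;
}
```

With these assumed numbers, demo_max_mtu(DEMO_BUF_SIZE, DEMO_ALIGN_LOSS) leaves room above the standard 1500-byte MTU, which is what lets DSA add its extra headers.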

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12 15:22:14 -07:00
David S. Miller
71930d6102 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
All conflicts seemed rather trivial, with some guidance from
Saeed Mahameed on the tc_ct.c one.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-11 00:46:00 -07:00
Ido Schimmel
c4317b1167 mlxsw: pci: Fix use-after-free in case of failed devlink reload
In case devlink reload failed, it is possible to trigger a
use-after-free when querying the kernel for device info via 'devlink dev
info' [1].

This happens because as part of the reload error path the PCI command
interface is de-initialized and its mailboxes are freed. When the
devlink '->info_get()' callback is invoked the device is queried via the
command interface and the freed mailboxes are accessed.

Fix this by initializing the command interface once during probe and not
during every reload.

This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
and also allows user space to query the running firmware version (for
example) from the device after a failed reload.
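
The lifetime change described above can be modelled in user space roughly as follows. Everything here is hypothetical (struct demo_dev, the demo_*() functions, the 4096-byte mailbox and the "fw-version" string); the sketch only shows that when the mailbox is owned by probe/remove, a failed reload leaves it valid for a later info query.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a device with a command mailbox. */
struct demo_dev {
	char *mbox;	/* allocated once at probe, freed at remove */
	int running;	/* whether the upper layers are initialised */
};

static int demo_probe(struct demo_dev *d)
{
	d->mbox = calloc(1, 4096);
	if (!d->mbox)
		return -1;
	d->running = 1;
	return 0;
}

/* Reload tears down and re-creates only the upper layers. When the
 * re-init fails (fail != 0), the mailbox is left untouched.
 */
static int demo_reload(struct demo_dev *d, int fail)
{
	d->running = 0;
	if (fail)
		return -1;
	d->running = 1;
	return 0;
}

/* Models an info query that goes through the mailbox. */
static int demo_info_get(struct demo_dev *d, char *out, size_t len)
{
	if (!d->mbox || len == 0)
		return -1;
	memcpy(d->mbox, "fw-version", 11);	/* 10 chars + NUL */
	strncpy(out, d->mbox, len - 1);
	out[len - 1] = '\0';
	return 0;
}

static void demo_remove(struct demo_dev *d)
{
	free(d->mbox);
	d->mbox = NULL;
	d->running = 0;
}
```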

[1]
BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355

CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
 memcpy+0x39/0x60 mm/kasan/common.c:106
 memcpy include/linux/string.h:406 [inline]
 mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
 mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
 mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
 mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
 mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
 mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
 mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
 devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
 devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
 genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
 netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
 __netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
 genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
 genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
 genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0x150/0x190 net/socket.c:672
 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
 ___sys_sendmsg+0xff/0x170 net/socket.c:2417
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: a9c8336f65 ("mlxsw: core: Add support for devlink info command")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:33:34 -07:00
Ido Schimmel
d9d5420273 mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()
We should not trigger a warning when a memory allocation fails. Remove
the WARN_ON().

The warning is constantly triggered by syzkaller when it is injecting
faults:

[ 2230.758664] FAULT_INJECTION: forcing a failure.
[ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
[ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
...
[ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
[ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
[ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
[ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014

Fixes: 3057224e01 ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:33:34 -07:00
Vladyslav Tarasiuk
b7e93bb6b1 net/mlx5e: Move devlink-health rx and tx reporters to devlink port
Utilize new devlink-health port reporters API to move rx and tx
reporters from device to port.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:32:02 -07:00
Vladyslav Tarasiuk
4d54d3251e net/mlx5e: Move devlink port register and unregister calls
Register devlink ports upon NIC init. TX and RX health reporters handle
errors which may occur early on at driver initialization. And because
these reporters are to be moved to port context, they require devlink
ports to be already registered.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:32:02 -07:00
Nicolas Ferre
6c8f85cac9 net: macb: fix call to pm_runtime in the suspend/resume functions
The calls to pm_runtime_force_suspend/resume() functions are only
relevant if the device is not configured to act as a WoL wakeup source.
Add the device_may_wakeup() test before calling them.
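
A minimal model of that guard is sketched below. struct demo_dev and its fields are hypothetical; may_wakeup stands for what device_may_wakeup() would report, and force_suspended records whether the force-suspend path was taken.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant device state. */
struct demo_dev {
	bool may_wakeup;	/* what device_may_wakeup() would report */
	bool force_suspended;	/* runtime PM force-suspend was applied */
};

static int demo_suspend(struct demo_dev *d)
{
	/* Only power the device down when it is not armed as a
	 * WoL wakeup source.
	 */
	if (!d->may_wakeup)
		d->force_suspended = true;
	return 0;
}

static int demo_resume(struct demo_dev *d)
{
	/* Mirror the suspend path: undo only what suspend did. */
	if (!d->may_wakeup)
		d->force_suspended = false;
	return 0;
}
```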

Fixes: 3e2a5e1539 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:29:38 -07:00
Nicolas Ferre
64febc5e56 net: macb: fix macb_suspend() by removing call to netif_carrier_off()
As we now use the phylink call to phylink_stop() in the non-WoL path,
there is no need for this call to netif_carrier_off() anymore. It can
disturb the underlying phylink FSM.

Fixes: 7897b071ac ("net: macb: convert to phylink")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:29:37 -07:00
Nicolas Ferre
253fe09435 net: macb: fix macb_get/set_wol() when moving to phylink
Keep previous function goals and integrate phylink actions to them.

phylink_ethtool_get_wol() is not enough to figure out if the Ethernet driver
supports Wake-on-Lan.
Initialization of "supported" and "wolopts" members is done in phylink
function, no need to keep them in calling function.

phylink_ethtool_set_wol() return value is considered and determines
if the MAC has to handle WoL or not. The case where the PHY doesn't
implement WoL leads to the MAC configuring it to provide this feature.

Fixes: 7897b071ac ("net: macb: convert to phylink")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:29:37 -07:00
Nicolas Ferre
ced4799d06 net: macb: mark device wake capable when "magic-packet" property present
Change the way the "magic-packet" DT property is handled in the
macb_probe() function, matching DT binding documentation.
Now we mark the device as "wakeup capable" instead of calling the
device_init_wakeup() function that would enable the wakeup source.

For Ethernet WoL, enabling the wakeup_source is done by
using ethtool and associated macb_set_wol() function that
already calls device_set_wakeup_enable() for this purpose.
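
The difference between "wakeup capable" and "wakeup enabled" can be illustrated with a small user-space model. The demo_*() helpers below only imitate the roles of device_set_wakeup_capable(), device_set_wakeup_enable() and device_may_wakeup(); the struct and the -1 error value are assumptions of this sketch.

```c
#include <stdbool.h>

/* Hypothetical model of a device's wakeup state. */
struct demo_wakeup {
	bool capable;	/* hardware is able to wake the system */
	bool enabled;	/* user asked for wakeup, e.g. through ethtool */
};

/* Probe time: mark the capability only, do not enable anything. */
static void demo_set_wakeup_capable(struct demo_wakeup *w, bool capable)
{
	w->capable = capable;
	if (!capable)
		w->enabled = false;
}

/* Later, from the WoL configuration path: enable or disable. */
static int demo_set_wakeup_enable(struct demo_wakeup *w, bool enable)
{
	if (!w->capable)
		return -1;
	w->enabled = enable;
	return 0;
}

/* The device may wake the system only if both flags are set. */
static bool demo_may_wakeup(const struct demo_wakeup *w)
{
	return w->capable && w->enabled;
}
```

In this model a device that is merely capable does not report demo_may_wakeup() until WoL has been enabled, so the suspend path is free to power it down.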

That would reduce power consumption by cutting more clocks if
"magic-packet" property is set but WoL is not configured by ethtool.

Fixes: 3e2a5e1539 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:29:37 -07:00
Nicolas Ferre
515a10a701 net: macb: fix wakeup test in runtime suspend/resume routines
Use the proper struct device pointer to check if the wakeup flag
and wakeup source are set.
Use the one passed by function call which is equivalent to
&bp->dev->dev.parent.

This prevents a spurious interrupt from being triggered when the
Wake-on-Lan feature is used.

Fixes: d54f89af6c ("net: macb: Add pm runtime support")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:29:37 -07:00
Davide Caratti
c8b1d74360 bnxt_en: fix NULL dereference in case SR-IOV configuration fails
We need to set 'active_vfs' back to 0, if something goes wrong during the
allocation of SR-IOV resources: otherwise, further VF configurations will
wrongly assume that bp->pf.vf[x] are valid memory locations, and commands
like the ones in the following sequence:

 # echo 2 >/sys/bus/pci/devices/${ADDR}/sriov_numvfs
 # ip link set dev ens1f0np0 up
 # ip link set dev ens1f0np0 vf 0 trust on

will cause a kernel crash similar to this:

 bnxt_en 0000:3b:00.0: not enough MMIO resources for SR-IOV
 BUG: kernel NULL pointer dereference, address: 0000000000000014
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] SMP PTI
 CPU: 43 PID: 2059 Comm: ip Tainted: G          I       5.8.0-rc2.upstream+ #871
 Hardware name: Dell Inc. PowerEdge R740/08D89F, BIOS 2.2.11 06/13/2019
 RIP: 0010:bnxt_set_vf_trust+0x5b/0x110 [bnxt_en]
 Code: 44 24 58 31 c0 e8 f5 fb ff ff 85 c0 0f 85 b6 00 00 00 48 8d 1c 5b 41 89 c6 b9 0b 00 00 00 48 c1 e3 04 49 03 9c 24 f0 0e 00 00 <8b> 43 14 89 c2 83 c8 10 83 e2 ef 45 84 ed 49 89 e5 0f 44 c2 4c 89
 RSP: 0018:ffffac6246a1f570 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000b
 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98b28f538900
 RBP: ffff98b28f538900 R08: 0000000000000000 R09: 0000000000000008
 R10: ffffffffb9515be0 R11: ffffac6246a1f678 R12: ffff98b28f538000
 R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffc05451e0
 FS:  00007fde0f688800(0000) GS:ffff98baffd40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000014 CR3: 000000104bb0a003 CR4: 00000000007606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  do_setlink+0x994/0xfe0
  __rtnl_newlink+0x544/0x8d0
  rtnl_newlink+0x47/0x70
  rtnetlink_rcv_msg+0x29f/0x350
  netlink_rcv_skb+0x4a/0x110
  netlink_unicast+0x21d/0x300
  netlink_sendmsg+0x329/0x450
  sock_sendmsg+0x5b/0x60
  ____sys_sendmsg+0x204/0x280
  ___sys_sendmsg+0x88/0xd0
  __sys_sendmsg+0x5e/0xa0
  do_syscall_64+0x47/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
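
The fix described at the top of this message can be sketched in user space as below. The structures and the demo_*() functions are hypothetical and much simpler than the driver; the fail argument simulates the SR-IOV resource allocation failure.

```c
#include <stdlib.h>

struct demo_vf { int trusted; };

/* Hypothetical model of the PF state. */
struct demo_pf {
	int active_vfs;
	struct demo_vf *vf;
};

/* fail != 0 simulates a resource allocation failure. */
static int demo_sriov_enable(struct demo_pf *pf, int num_vfs, int fail)
{
	pf->active_vfs = num_vfs;
	pf->vf = fail ? NULL : calloc((size_t)num_vfs, sizeof(*pf->vf));
	if (!pf->vf) {
		/* Without this reset, later calls would believe that
		 * pf->vf[x] points at valid memory.
		 */
		pf->active_vfs = 0;
		return -1;
	}
	return 0;
}

static int demo_set_vf_trust(struct demo_pf *pf, int vf_id, int trusted)
{
	/* With active_vfs == 0 every VF index is rejected here. */
	if (vf_id < 0 || vf_id >= pf->active_vfs)
		return -1;
	pf->vf[vf_id].trusted = trusted;
	return 0;
}
```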

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Fei Liu <feliu@redhat.com>
CC: Jonathan Toppins <jtoppins@redhat.com>
CC: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:20:03 -07:00
David S. Miller
d6c7fc0c8c mlx5-updates-2020-07-09
mlx5 connection tracking offloads updates:
 
 1)  Restore CT state from lookup in zone instead of tupleid
 
     On a miss, use this zone + 5 tuple taken from the skb to look up the CT
     entry and restore it, instead of the driver allocated tuple id.
 
     This improves flow insertion rate by avoiding the allocation of a header
     rewrite context to maintain the tupleid.
 
 2) Re-use modify header HW objects for identical modify actions.
 
 3) Expand tunnel register mappings
    Reg_c1 is 32 bits wide. Before this patchset, 24 bits were allocated
    for the tuple_id,  6 bits for tunnel mapping and 2 bits for tunnel
    options mappings.
 
    Restoring the ct state from zone lookup instead of tuple id requires
    reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
    mappings.
 
    Expand tunnel and tunnel options register mappings to 12 bits each.
 
 4) Trivial cleanup and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl8H16UACgkQSD+KveBX
 +j7yTwf/eza7ftn9Jq1f6yyTM9qQZ64oC0cboDZQ3EyJtY++frWzo4bNbHFbQ26Y
 EDjRGqG0Hiby95dgTrGtRzf9PQuDwWfdNavLKyV1D//cPeTDYpHkwKVF4sozfd5Q
 g1RB6rySvYfx8BKALaJBclYlRoiVevLoIEfuMSrmstR1/tBCvmMLiB0p1VsLIS0+
 XBDEezO4rqDyNJwuznMYIX44w8Xa4IzIb9/YwEubMPs52WjktXAmTPTChcO8cu/9
 4VLsTkFKUDlm3TDXg99Lpk8L+0dfo7dUcHsqaoXMs5eER6kw8bjK/f7muSSIiIcd
 Nnba/UaU+FYzA4EF98xQD0bFQJNrmQ==
 =NMLY
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-07-09

This series provides updates to mlx5 CT (connection tracking) offloads
For more information please see tag log below.

Please pull and let me know if there is any problem.

The following conflict is expected when net is merged into net-next:
to resolve just use the hunks from net-next.

<<<<<<< HEAD (net-next)
	mlx5_tc_ct_del_ft_entry(ct_priv, entry);
	kfree(entry);
======= (net)
	mlx5_tc_ct_entry_del_rules(ct_priv, entry);
	kfree(entry);
>>>>>>> b1a7d5bdfe54c98eca46e2c997d4e3b1484a49af

mlx5 connection tracking offloads updates:

1)  Restore CT state from lookup in zone instead of tupleid

    On a miss, use this zone + 5 tuple taken from the skb to look up the CT
    entry and restore it, instead of the driver allocated tuple id.

    This improves flow insertion rate by avoiding the allocation of a header
    rewrite context to maintain the tupleid.

2) Re-use modify header HW objects for identical modify actions.

3) Expand tunnel register mappings
   Reg_c1 is 32 bits wide. Before this patchset, 24 bits were allocated
   for the tuple_id,  6 bits for tunnel mapping and 2 bits for tunnel
   options mappings.

   Restoring the ct state from zone lookup instead of tuple id requires
   reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
   mappings.

   Expand tunnel and tunnel options register mappings to 12 bits each.

4) Trivial cleanup and fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 14:10:45 -07:00
Jakub Kicinski
fb6f8970bd mlx4: convert to new udp_tunnel_nic infra
Convert to new infra, make use of the ability to sleep in the callback.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 13:54:00 -07:00
Jakub Kicinski
442a35a5a7 bnxt: convert to new udp_tunnel_nic infra
Convert to new infra, taking advantage of sleeping in callbacks.

v2:
 - use bp->*_fw_dst_port_id != INVALID_HW_RING_ID as indication
   that the offload is active.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 13:54:00 -07:00
Jakub Kicinski
dc221851ff ixgbe: convert to new udp_tunnel_nic infra
Make use of new common udp_tunnel_nic infra. ixgbe supports
IPv4 only, and only single VxLAN and Geneve ports (one each).

v2:
 - split out the RXCSUM feature handling to separate change;
 - declare structs separately;
 - use ti.type instead of assuming table 0 is VxLAN;
 - move setting netdev->udp_tunnel_nic_info to its own switch.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 13:54:00 -07:00
Jakub Kicinski
abc0c78c0a ixgbe: don't clear UDP tunnel ports when RXCSUM is disabled
It appears the clearing of UDP tunnel ports when RXCSUM
is disabled is unnecessary. Driver will not pay attention
to checksum bits if RXCSUM is not set, so we can let
the hardware parse the packets.

Note that the UDP tunnel port NDO handlers don't pay attention
to the state of RXCSUM, so the ports could have been re-programmed,
anyway.

This cleanup simplifies later conversion patch.

v2:
 - break this out of the following patch.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10 13:54:00 -07:00
Roi Dayan
bbe1124944 net/mlx5e: CT: Fix releasing ft entries
Before this commit, on ft flush, ft entries were not removed
from the ct_tuple hashtables. Fix it.

Fixes: ac991b48d4 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:17 -07:00
Saeed Mahameed
de96d5732a net/mlx5e: CT: Remove unused function param
"flow" parameter is not used in __mlx5_tc_ct_flow_offload_clear(),
remove it.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-07-09 19:51:16 -07:00
Saeed Mahameed
2acc4551d4 net/mlx5e: CT: Return err_ptr from internal functions
Instead of having to deal with converting between int and ERR_PTR for
return values in mlx5_tc_ct_flow_offload(), make the internal helper
functions return a ptr to mlx5_flow_handle instead of passing it as
output param, this will also avoid gcc confusion and false alarms,
thus we remove the redundant ERR_PTR rule initialization.
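
The calling convention described above can be illustrated in user space. The demo_err_ptr(), demo_ptr_err() and demo_is_err() helpers below are simplified re-creations of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() written for this sketch; struct demo_handle and demo_offload() are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified user-space re-creations of the kernel helpers. */
#define DEMO_MAX_ERRNO 4095

static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int demo_is_err(const void *p)
{
	/* Error codes occupy the last DEMO_MAX_ERRNO addresses. */
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_handle { int id; };

/* Returns the handle directly, or an encoded errno on failure,
 * instead of returning an int and filling an output parameter.
 */
static struct demo_handle *demo_offload(struct demo_handle *slot, int ok)
{
	if (!ok)
		return demo_err_ptr(-ENOMEM);
	slot->id = 1;
	return slot;
}
```

The caller then needs a single demo_is_err() check and no pre-initialised output variable.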

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-07-09 19:51:16 -07:00
Paul Blakey
d12f4521d3 net/mlx5e: CT: Expand tunnel register mappings
Reg_c1 is 32 bits wide. Originally, 24 bits were allocated for the tuple_id,
6 bits for tunnel mapping and 2 bits for tunnel options mappings.

Restoring the ct state from zone lookup instead of tuple id requires
reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel
mappings.

Expand tunnel and tunnel options register mappings to 12 bits each.
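
The resulting split of the 32-bit register (8 + 12 + 12 bits) can be sketched as below. The field order, with the zone mapping in the low bits, and all DEMO_*/demo_*() names are assumptions made for illustration; the message only gives the field widths.

```c
#include <stdint.h>

/* Field widths from the description; the field order is assumed. */
#define DEMO_ZONE_BITS		8
#define DEMO_TUN_BITS		12
#define DEMO_TUN_OPTS_BITS	12

#define DEMO_ZONE_SHIFT		0
#define DEMO_TUN_SHIFT		(DEMO_ZONE_SHIFT + DEMO_ZONE_BITS)
#define DEMO_TUN_OPTS_SHIFT	(DEMO_TUN_SHIFT + DEMO_TUN_BITS)

#define DEMO_MASK(bits)		((1u << (bits)) - 1u)

/* Pack the three mappings into one 32-bit register value. */
static uint32_t demo_pack(uint32_t zone, uint32_t tun, uint32_t opts)
{
	return ((zone & DEMO_MASK(DEMO_ZONE_BITS)) << DEMO_ZONE_SHIFT) |
	       ((tun & DEMO_MASK(DEMO_TUN_BITS)) << DEMO_TUN_SHIFT) |
	       ((opts & DEMO_MASK(DEMO_TUN_OPTS_BITS)) << DEMO_TUN_OPTS_SHIFT);
}

static uint32_t demo_zone(uint32_t reg)
{
	return (reg >> DEMO_ZONE_SHIFT) & DEMO_MASK(DEMO_ZONE_BITS);
}

static uint32_t demo_tun(uint32_t reg)
{
	return (reg >> DEMO_TUN_SHIFT) & DEMO_MASK(DEMO_TUN_BITS);
}

static uint32_t demo_tun_opts(uint32_t reg)
{
	return (reg >> DEMO_TUN_OPTS_SHIFT) & DEMO_MASK(DEMO_TUN_OPTS_BITS);
}
```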

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:16 -07:00
Paul Blakey
8f5b3c3ec1 net/mlx5e: CT: Use mapping for zone restore register
Use a single byte mapping for zone restore register (zone matching
remains 16 bit).

This makes room for using the freed 8 bits on register C1 for
mapping more tunnels and tunnel options.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:15 -07:00
Paul Blakey
6702d39355 net/mlx5e: CT: Re-use tuple modify headers for identical modify actions
After removing the tupleid register which changed per tuple,
tuple modify headers set the ct_state, zone, mark, and label registers.
For non-natted tuples going through the same tc rules path, their values
will be the same, and all their modify headers will be the same.

Re-use tuple modify header when possible, by adding each new modify
header to a hashtable, and looking up identical ones before creating
a new one.
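
The lookup-before-create idea can be sketched with a small chained hash table in user space. All names here are hypothetical, the hash is a simple FNV-1a-like mix chosen for brevity, and the reference counting is an assumption of this sketch about how shared entries would be released.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BUCKETS		16
#define DEMO_MAX_ACTIONS	4

/* Hypothetical shared modify header: a short list of action words. */
struct demo_mod_hdr {
	uint32_t actions[DEMO_MAX_ACTIONS];
	int num_actions;
	int refcnt;
	struct demo_mod_hdr *next;
};

struct demo_tbl {
	struct demo_mod_hdr *buckets[DEMO_BUCKETS];
	int created;	/* how many headers were really allocated */
};

static unsigned int demo_hash(const uint32_t *a, int n)
{
	unsigned int h = 2166136261u;
	int i;

	for (i = 0; i < n; i++) {
		h ^= a[i];
		h *= 16777619u;
	}
	return h % DEMO_BUCKETS;
}

/* Return an existing identical header if there is one, else create it. */
static struct demo_mod_hdr *demo_get(struct demo_tbl *t,
				     const uint32_t *a, int n)
{
	struct demo_mod_hdr *e;
	unsigned int b;

	if (n < 0 || n > DEMO_MAX_ACTIONS)
		return NULL;
	b = demo_hash(a, n);
	for (e = t->buckets[b]; e; e = e->next) {
		if (e->num_actions == n &&
		    !memcmp(e->actions, a, (size_t)n * sizeof(*a))) {
			e->refcnt++;
			return e;
		}
	}
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	memcpy(e->actions, a, (size_t)n * sizeof(*a));
	e->num_actions = n;
	e->refcnt = 1;
	e->next = t->buckets[b];
	t->buckets[b] = e;
	t->created++;
	return e;
}

/* Drop one reference; unlink and free the header on the last put. */
static void demo_put(struct demo_tbl *t, struct demo_mod_hdr *e)
{
	struct demo_mod_hdr **pp;

	if (--e->refcnt)
		return;
	pp = &t->buckets[demo_hash(e->actions, e->num_actions)];
	while (*pp != e)
		pp = &(*pp)->next;
	*pp = e->next;
	free(e);
}
```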

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:15 -07:00
Paul Blakey
b2fdf3d047 net/mlx5e: Export sharing of mod headers to a new file
Refactor sharing of mod headers to new file and while there,
remove spin lock and flows list, as this is only used for warn on.

Use the generic API in the next patch to re-use tuple modify headers
for identical modify actions.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:15 -07:00
Paul Blakey
a8eb919ba6 net/mlx5e: CT: Restore ct state from lookup in zone instead of tupleid
Remove tupleid, and replace it with zone_restore, which is the zone an
established tuple sets after match. On miss, use this zone + tuple
taken from the skb to look up the ct entry and restore it.

This improves flow insertion rate by avoiding the allocation of a header
rewrite context.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:14 -07:00
Paul Blakey
7e36feeb04 net/mlx5e: CT: Don't offload tuple rewrites for established tuples
Next patches will remove the tupleid register that is used
to restore the ct state on miss, and instead use the tuple on
the missed packet to look up which state to restore.
Disable tuple rewrites after connection tracking.

For tuple rewrites, inject a ct_state=-trk match so it won't
change the tuple for established flows (+trk) that passed connection
tracking, and instead miss to software.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:14 -07:00
Oz Shlomo
3d486ec4fa net/mlx5e: Use netdev_info instead of pr_info
The next patch will pass the mlx5e_priv struct to the
modify_header_match_supported method. Use this opportunity to refactor
the existing pr_info call to a netdev_info call.

Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:14 -07:00
Paul Blakey
a7c119bd82 net/mlx5e: CT: Allow header rewrite of 5-tuple and ct clear action
With ct clear we don't jump to the ct tables, so header rewrite
of 5-tuple can be done in place (and not moved to after the CT action).

Check for ct clear action, and if so, allow 5-tuple header
rewrite.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:13 -07:00
Paul Blakey
bc562be967 net/mlx5e: CT: Save ct entries tuples in hashtables
Save original tuple and natted tuple in two new hashtables.

This is a pre-step for restoring ct state after hw miss by performing a
5-tuple lookup on the hash tables.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:13 -07:00
Parav Pandit
e9716afdca net/mlx5: E-switch, When eswitch is unsupported, return -EOPNOTSUPP
When eswitch is unsupported, currently -EPERM error code is returned
instead of -EOPNOTSUPP.

Due to this, the VF device's devlink virtual port is not enumerated, because
the port_function_get() callback returned -EPERM instead of -EOPNOTSUPP.

Hence, return the error code -EOPNOTSUPP when eswitch is unsupported.
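
Why the exact error code matters can be shown with a small model. The caller behaviour below (treating -EOPNOTSUPP as "skip this optional attribute" and any other error as fatal) is an assumption made for illustration, and both functions are hypothetical.

```c
#include <errno.h>

/* Models the callback: 'fixed' selects the corrected error code. */
static int demo_port_function_get(int eswitch_supported, int fixed)
{
	if (!eswitch_supported)
		return fixed ? -EOPNOTSUPP : -EPERM;
	return 0;
}

/* Assumed caller: "not supported" means the optional attribute is
 * skipped and enumeration continues; any other error aborts it.
 */
static int demo_enumerate_port(int eswitch_supported, int fixed)
{
	int err = demo_port_function_get(eswitch_supported, fixed);

	if (err == -EOPNOTSUPP)
		return 0;
	return err;
}
```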

Fixes: bd93975353 ("net/mlx5: E-switch, Introduce and use eswitch support check helper")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:51:12 -07:00
Eli Britstein
eb32b3f53d net/mlx5e: CT: Fix memory leak in cleanup
CT entries are deleted via a workqueue from netfilter. If removing the
module before that, the rules are cleaned by the driver itself, but the
memory entries for them are not freed. Fix that.

Fixes: ac991b48d4 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:07 -07:00
Eran Ben Elisha
88b3d5c90e net/mlx5e: Fix port buffers cell size value
Device unit for port buffers size, xoff_threshold and xon_threshold is
cells. Fix a bug in driver where cell unit size was hard-coded to
128 bytes. This hard-coded value is buggy, as it is wrong for some hardware
versions.

Make the driver read the cell size from the SBCAM register and translate
bytes to cell units accordingly.

In order to fix the bug, this patch exposes SBCAM (Shared buffer
capabilities mask) layout and defines.

If SBCAM.cap_cell_size is valid, use it for all bytes to cells
calculations. If not valid, fallback to 128.

Cell size does not change on the fly per device. Instead of issuing SBCAM
access reg command every time such translation is needed, cache it in
mlx5e_dcbx as part of mlx5e_dcbnl_initialize(). Pass dcbx.port_buff_cell_sz
as a param to every function that needs bytes to cells translation.
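
The translation described above can be sketched as follows. The 128-byte fallback comes from the text; the function names, the way the capability is passed in, and the truncating division are assumptions of this sketch.

```c
#include <stdint.h>

#define DEMO_DEFAULT_CELL_SIZE 128u	/* fallback named in the text */

/* cap_valid and cap_cell_size model what the SBCAM query reports.
 * The result would be computed once and cached.
 */
static uint32_t demo_cell_size(int cap_valid, uint32_t cap_cell_size)
{
	if (cap_valid && cap_cell_size)
		return cap_cell_size;
	return DEMO_DEFAULT_CELL_SIZE;
}

/* Bytes to cells using the cached cell size. Truncating division
 * is an assumption of this sketch.
 */
static uint32_t demo_bytes_to_cells(uint32_t bytes, uint32_t cell_size)
{
	return bytes / cell_size;
}
```

The same byte count maps to a different number of cells depending on the cell size, which is why a hard-coded 128 gives wrong thresholds on hardware that uses another value.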

While fixing the bug, move MLX5E_BUFFER_CELL_SHIFT macro to
en_dcbnl.c, as it is only used by that file.

Fixes: 0696d60853 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:07 -07:00
Aya Levin
6a1cf4e443 net/mlx5e: Fix 50G per lane indication
Some released FW versions mistakenly don't set the capability that 50G
per lane link-modes are supported for VFs (ptys_extended_ethernet
capability bit). When the capability is unset, read
PTYS.ext_eth_proto_capability (always reliable).
If PTYS.ext_eth_proto_capability is valid (has a non-zero value)
conclude that the HCA supports 50G per lane. Otherwise, conclude that
the HCA doesn't support 50G per lane.
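
The decision described above can be sketched as a small predicate. This is
a user-space illustration only; the function and parameter names are
hypothetical and simply stand for the capability bit and the value read
from PTYS.ext_eth_proto_capability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * cap_bit_set: the ptys_extended_ethernet capability bit.
 * ext_eth_proto_capability: value read from PTYS; non-zero means valid.
 */
static bool supports_50g_per_lane(bool cap_bit_set,
				  uint32_t ext_eth_proto_capability)
{
	if (cap_bit_set)
		return true;
	/* Capability bit unset: fall back to the PTYS field. */
	return ext_eth_proto_capability != 0;
}

/* The three cases from the description. */
static void example_50g(void)
{
	assert(supports_50g_per_lane(true, 0));
	assert(supports_50g_per_lane(false, 0x1));
	assert(!supports_50g_per_lane(false, 0));
}
```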

Fixes: a08b4ed137 ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:06 -07:00
Aya Levin
f4aebbfb56 net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash
After function reload, CPU mapping used by aRFS RX is broken, leading to
a kernel panic. Fix by moving initialization of rx_cpu_rmap from
netdev_init to netdev_attach. IRQ table is re-allocated on mlx5_load,
but netdev is not re-initialized.

Trace of the panic:
[ 22.055672] general protection fault, probably for non-canonical address 0x785634120000ff1c: 0000 [#1] SMP PTI
[ 22.065010] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.7.0-rc2-for-upstream-perf-2020-04-21_16-34-03-31 #1
[ 22.067967] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 22.071174] RIP: 0010:get_rps_cpu+0x267/0x300
[ 22.075692] RSP: 0018:ffffc90000244d60 EFLAGS: 00010202
[ 22.076888] RAX: ffff888459b0e400 RBX: 0000000000000000 RCX:0000000000000007
[ 22.078364] RDX: 0000000000008884 RSI: ffff888467cb5b00 RDI:0000000000000000
[ 22.079815] RBP: 00000000ff342b27 R08: 0000000000000007 R09:0000000000000003
[ 22.081289] R10: ffffffffffffffff R11: 00000000000070cc R12:ffff888454900000
[ 22.082767] R13: ffffc90000e5a950 R14: ffffc90000244dc0 R15:0000000000000007
[ 22.084190] FS: 0000000000000000(0000) GS:ffff88846fc80000(0000)knlGS:0000000000000000
[ 22.086161] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 22.087427] CR2: ffffffffffffffff CR3: 0000000464426003 CR4:0000000000760ee0
[ 22.088888] DR0: 0000000000000000 DR1: 0000000000000000 DR2:0000000000000000
[ 22.090336] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
[ 22.091764] PKRU: 55555554
[ 22.092618] Call Trace:
[ 22.093442] <IRQ>
[ 22.094211] ? kvm_clock_get_cycles+0xd/0x10
[ 22.095272] netif_receive_skb_list_internal+0x258/0x2a0
[ 22.096460] gro_normal_list.part.137+0x19/0x40
[ 22.097547] napi_complete_done+0xc6/0x110
[ 22.098685] mlx5e_napi_poll+0x190/0x670 [mlx5_core]
[ 22.099859] net_rx_action+0x2a0/0x400
[ 22.100848] __do_softirq+0xd8/0x2a8
[ 22.101829] irq_exit+0xa5/0xb0
[ 22.102750] do_IRQ+0x52/0xd0
[ 22.103654] common_interrupt+0xf/0xf
[ 22.104641] </IRQ>

Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:06 -07:00
Aya Levin
b3c2ed21c0 net/mlx5e: Fix VXLAN configuration restore after function reload
When detaching netdev, remove vxlan port configuration using
udp_tunnel_drop_rx_info. During function reload, configuration will be
restored using udp_tunnel_get_rx_info. This ensures sync between
firmware and driver. Use udp_tunnel_get_rx_info even if its physical
interface is down.

Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:06 -07:00
Vlad Buslov
c1aea9e176 net/mlx5e: Fix usage of rcu-protected pointer
In mlx5e_configure_flower() flow pointer is protected by rcu read lock.
However, after cited commit the pointer is being used outside of rcu read
block. Extend the block to protect all pointer accesses.

Fixes: 553f932838 ("net/mlx5e: Support tc block sharing for representors")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:05 -07:00
Vlad Buslov
2fb15e72c0 net/mxl5e: Verify that rpriv is not NULL
In helper function is_flow_rule_duplicate_allowed() verify that rpriv
pointer is not NULL before dereferencing it. This can happen when the device
is in NIC mode and leads to the following crash:

[90444.046419] BUG: kernel NULL pointer dereference, address: 0000000000000000
[90444.048149] #PF: supervisor read access in kernel mode
[90444.049781] #PF: error_code(0x0000) - not-present page
[90444.051386] PGD 80000003d35a4067 P4D 80000003d35a4067 PUD 3d35a3067 PMD 0
[90444.053051] Oops: 0000 [#1] SMP PTI
[90444.054683] CPU: 16 PID: 31736 Comm: tc Not tainted 5.8.0-rc1+ #1157
[90444.056340] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[90444.058079] RIP: 0010:mlx5e_configure_flower+0x3aa/0x9b0 [mlx5_core]
[90444.059753] Code: 24 50 49 8b 95 08 02 00 00 48 b8 00 08 00 00 04 00 00 00 48 21 c2 48 39 c2 74 0a 41 f6 85 0d 02 00 00 20 74 16 48 8b 44 24 20 <48> 8b 00 66 83 78 20 ff 74 07 4d 89 aa e0 00 00 00 48 83 7d 28 00
[90444.063232] RSP: 0018:ffffabe9c61ff768 EFLAGS: 00010246
[90444.065014] RAX: 0000000000000000 RBX: ffff9b13c4c91e80 RCX: 00000000000093fa
[90444.066784] RDX: 0000000400000800 RSI: 0000000000000000 RDI: 000000000002d5e0
[90444.068533] RBP: ffff9b174d308468 R08: 0000000000000000 R09: ffff9b17d63003f0
[90444.070285] R10: ffff9b17ea288600 R11: 0000000000000000 R12: ffffabe9c61ff878
[90444.072032] R13: ffff9b174d300000 R14: ffffabe9c61ffbb8 R15: ffff9b174d300880
[90444.073760] FS:  00007f3c23775480(0000) GS:ffff9b13efc80000(0000) knlGS:0000000000000000
[90444.075492] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[90444.077266] CR2: 0000000000000000 CR3: 00000003e2a60002 CR4: 00000000001606e0
[90444.079024] Call Trace:
[90444.080753]  tc_setup_cb_add+0xca/0x1e0
[90444.082415]  fl_hw_replace_filter+0x15f/0x1f0 [cls_flower]
[90444.084119]  fl_change+0xa59/0x13dc [cls_flower]
[90444.085772]  ? wait_for_completion+0xa8/0xf0
[90444.087364]  tc_new_tfilter+0x3f5/0xa60
[90444.088960]  rtnetlink_rcv_msg+0xeb/0x360
[90444.090514]  ? __d_lookup_done+0x76/0xe0
[90444.092034]  ? proc_alloc_inode+0x16/0x70
[90444.093560]  ? prep_new_page+0x8c/0xf0
[90444.095048]  ? _cond_resched+0x15/0x30
[90444.096483]  ? rtnl_calcit.isra.0+0x110/0x110
[90444.097907]  netlink_rcv_skb+0x49/0x110
[90444.099289]  netlink_unicast+0x191/0x230
[90444.100629]  netlink_sendmsg+0x243/0x480
[90444.101984]  sock_sendmsg+0x5e/0x60
[90444.103305]  ____sys_sendmsg+0x1f3/0x260
[90444.104597]  ? copy_msghdr_from_user+0x5c/0x90
[90444.105916]  ? __mod_lruvec_state+0x3c/0xe0
[90444.107210]  ___sys_sendmsg+0x81/0xc0
[90444.108484]  ? do_filp_open+0xa5/0x100
[90444.109732]  ? handle_mm_fault+0x117b/0x1e00
[90444.110970]  ? __check_object_size+0x46/0x147
[90444.112205]  ? __check_object_size+0x136/0x147
[90444.113402]  __sys_sendmsg+0x59/0xa0
[90444.114587]  do_syscall_64+0x4d/0x90
[90444.115782]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[90444.116953] RIP: 0033:0x7f3c2393b7b8
[90444.118101] Code: Bad RIP value.
[90444.119240] RSP: 002b:00007ffc6ad8e6c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[90444.120408] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3c2393b7b8
[90444.121583] RDX: 0000000000000000 RSI: 00007ffc6ad8e740 RDI: 0000000000000003
[90444.122750] RBP: 000000005eea0c3a R08: 0000000000000001 R09: 00007ffc6ad8e68c
[90444.123928] R10: 0000000000404fa8 R11: 0000000000000246 R12: 0000000000000001
[90444.125073] R13: 0000000000000000 R14: 00007ffc6ad92a00 R15: 00000000004866a0
[90444.126221] Modules linked in: act_skbedit act_tunnel_key act_mirred bonding vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mlxfw kvm act_ct nf_flow_table nf_nat nf_conntrack irqbypass crct10dif_pclmul nf_defrag_ipv6 igb ipmi_ssif libcrc32c crc32_pclmul crc32c_intel ipmi_si nf_defrag_ipv4 ptp ghash_clmulni_intel mei_me ses iTCO_wdt i2c_i801 pps_core ioatdma iTCO_vendor_support joydev mei enclosure intel_cstate i2c_smbus wmi dca ipmi_devintf intel_uncore lpc_ich ipmi_msghandler pcspkr acpi_pad acpi_power_meter ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
[90444.136253] CR2: 0000000000000000
[90444.137621] ---[ end trace 924af62aa2b151bd ]---

Fixes: 553f932838 ("net/mlx5e: Support tc block sharing for representors")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:05 -07:00
Vu Pham
01f3d5db4a net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode
Refactoring of the eswitch ingress ACL code accidentally inserted an extra
memset to zero, which clears the vlan and/or qos setting in legacy mode.

Fixes: 07bab95026 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:05 -07:00
Eran Ben Elisha
47afbdd2fa net/mlx5: Fix eeprom support for SFP module
Fix eeprom SFP query support by setting i2c_addr, offset and page number
correctly. Unlike QSFP modules, SFP eeprom params are as follows:
- i2c_addr is 0x50 for offset 0 - 255 and 0x51 for offset 256 - 511.
- Page number is always zero.
- Page offset is always relative to zero.
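
The three rules above can be sketched as follows. This is a user-space
illustration, not the driver function; the struct and helper names are
hypothetical, and rebasing the offset by 256 for the 0x51 range is an
assumption about what "relative to zero" means here.

```c
#include <assert.h>
#include <stdint.h>

struct sfp_eeprom_params {
	uint8_t  i2c_addr;	/* 0x50 or 0x51 */
	uint8_t  page;		/* always 0 for SFP */
	uint16_t offset;	/* offset within the selected i2c address */
};

/* Map a flat SFP eeprom offset (0..511) to i2c address, page and offset. */
static struct sfp_eeprom_params sfp_params_for_offset(uint16_t offset)
{
	struct sfp_eeprom_params p;

	assert(offset < 512);
	p.page = 0;
	if (offset < 256) {
		p.i2c_addr = 0x50;
		p.offset = offset;
	} else {
		p.i2c_addr = 0x51;
		p.offset = (uint16_t)(offset - 256);
	}
	return p;
}
```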

As part of eeprom query, query the module ID (SFP / QSFP*) via helper
function to set the params accordingly.

In addition, change mlx5_qsfp_eeprom_page() input type to be u16 to avoid
unnecessary casting.

Fixes: a708fb7b1f ("net/mlx5e: ethtool, Add support for EEPROM high pages query")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09 19:27:04 -07:00
Huacai Chen
2575b2f3ee PCI: Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h
Instead of duplicating the PCI_VENDOR_ID_REDHAT definition everywhere, move
it to include/linux/pci_ids.h.

[bhelgaas: also update MDPY_PCI_VENDOR_ID]
Link: https://lore.kernel.org/r/1594195170-11119-1-git-send-email-chenhc@lemote.com
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2020-07-09 17:00:47 -05:00
Danielle Ratson
82901ad169 devlink: Move input checks from driver to devlink
Currently, all the input checks are done in driver.

After adding the split capability to devlink port, move the checks to
devlink.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:30 -07:00
Danielle Ratson
a0f49b5486 devlink: Add a new devlink port split ability attribute and pass to netlink
Add a new attribute that indicates the split ability of devlink port.

Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:30 -07:00
Danielle Ratson
1b604efb6c mlxsw: Set port split ability attribute in driver
Currently, port attributes like flavour, port number and whether the port
was split are set when initializing a port.

Set the split ability of the port as well, based on port_mapping->width
field and split attribute of devlink port in spectrum, so that it could be
easily passed to devlink in the next patch.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:29 -07:00
Danielle Ratson
a21cf0a833 devlink: Add a new devlink port lanes attribute and pass to netlink
Add a new devlink port attribute that indicates the port's number of lanes.

Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.

The attribute is not passed to user space in case the number of lanes is
invalid (0).

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:29 -07:00
Danielle Ratson
622d3e9201 mlxsw: Set number of port lanes attribute in driver
Currently, port attributes like flavour, port number and whether the
port was split are set when initializing a port.

Set the number of lanes of the port as well so that it could be easily
passed to devlink in the next patch.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:29 -07:00
Danielle Ratson
71ad8d55f8 devlink: Replace devlink_port_attrs_set parameters with a struct
Currently, devlink_port_attrs_set accepts a long list of parameters,
most of which are devlink port attributes.

Use the devlink_port_attrs struct to replace the relevant parameters.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:29 -07:00
Colin Ian King
e3cbdaf146 net: systemport: fix double shift of a vlan_tci by VLAN_PRIO_SHIFT
Currently the u16 skb->vlan_tci is being right shifted twice by
VLAN_PRIO_SHIFT, once in the macro skb_vlan_tag_get_prio and explicitly
by VLAN_PRIO_SHIFT afterwards. The combined shift amount is larger than
the u16 so the end result is always zero.  Remove the second explicit
shift as this is extraneous.
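
A user-space sketch of why the second shift always yields zero. The mask
and shift values match the kernel's VLAN_PRIO_MASK and VLAN_PRIO_SHIFT;
the two helper functions are hypothetical stand-ins for the accessor macro
and the buggy call site.

```c
#include <assert.h>
#include <stdint.h>

#define VLAN_PRIO_MASK	0xe000u
#define VLAN_PRIO_SHIFT	13

/* What the accessor does: mask the PCP bits, then shift once. */
static uint16_t vlan_prio(uint16_t vlan_tci)
{
	return (uint16_t)((vlan_tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT);
}

/* The buggy pattern: shift the already 3-bit value by 13 again. */
static uint16_t vlan_prio_double_shift(uint16_t vlan_tci)
{
	return (uint16_t)(vlan_prio(vlan_tci) >> VLAN_PRIO_SHIFT);
}

static void example_prio(void)
{
	/* TCI 0xa005: priority 5, VLAN ID 5. */
	assert(vlan_prio(0xa005) == 5);
	assert(vlan_prio_double_shift(0xa005) == 0);
}
```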

Fixes: 6e9fdb60d3 ("net: systemport: Add support for VLAN transmit acceleration")
Addresses-Coverity: ("Operands don't affect result")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:06:56 -07:00
Xu Wang
5ca670e58d net: enetc: use eth_broadcast_addr() to assign broadcast
This patch uses eth_broadcast_addr() to assign the broadcast address
instead of memset().

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 12:33:36 -07:00
Sudarsana Reddy Kalluru
13cf8aab74 qed: Populate nvm-file attributes while reading nvm config partition.
NVM config file address will be modified when the MBI image is upgraded.
Driver would return stale config values if user reads the nvm-config
(via ethtool -d) in this state. The fix is to re-populate nvm attribute
info while reading the nvm config values/partition.

Changes from previous version:
-------------------------------
v3: Corrected the formatting in 'Fixes' tag.
v2: Added 'Fixes' tag.

Fixes: 1ac4329a1c ("qed: Add configuration information to register dump and debug data")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 12:30:25 -07:00
Rahul Lakkireddy
76c4d85c92 cxgb4: fix all-mask IP address comparison
Convert the all-mask IP address to big endian for the comparison instead.

Fixes: f286dd8eaa ("cxgb4: use correct type for all-mask IP address comparison")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:43:00 -07:00
Meir Lichtinger
12fdafb817 net/mlx5: Added support for 100Gbps per lane link modes
This patch exposes new link modes using 100Gbps per lane, including 100G,
200G and 400G modes.

Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:30:42 -07:00
Edwin Peer
1da63ddd0e bnxt_en: allow firmware to disable VLAN offloads
Bare-metal use cases require giving firmware and the embedded
application processor control over VLAN offloads. The driver should
not attempt to override or utilize this feature in such scenarios
since it will not work as expected.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:14 -07:00
Edwin Peer
a196e96bb6 bnxt_en: clean up VLAN feature bit handling
The hardware VLAN offload feature on our NIC does not have separate
knobs for handling customer and service tags on RX. Either offloading
of both must be enabled or both must be disabled. Introduce definitions
for the combined feature set in order to clean up the code and make
this constraint more clear. Technically these features can be separately
enabled on TX, however, since the default is to turn both on, the
combined TX feature set is also introduced for code consistency.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:14 -07:00
Michael Chan
bd3191b5d8 bnxt_en: Implement ethtool -X to set indirection table.
With the new infrastructure in place, we can now support the setting of
the indirection table from ethtool.

When changing channels, in a rare case that firmware cannot reserve the
rings that were promised, we will still try to keep the RSS map and only
revert to default when absolutely necessary.

v4: Revert RSS map to default during ring change only when absolutely
    necessary.

v3: Add warning messages when firmware cannot reserve the requested RX
    rings, and when the RSS table entries have to change to default.

v2: When changing channels, if the RSS table size changes and RSS map
    is non-default, return error.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:13 -07:00
Michael Chan
adc38ac667 bnxt_en: Return correct RSS indirection table entries to ethtool -x.
Now that we have the logical indirection table, we can return these
proper logical indices directly to ethtool -x instead of the physical
IDs.

Reported-by: Jakub Kicinski <kicinski@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:13 -07:00
Michael Chan
f33a305d09 bnxt_en: Fill HW RSS table from the RSS logical indirection table.
Now that we have the logical table, we can fill the HW RSS table
using the logical table's entries and converting them to the HW
specific format.  Re-initialize the logical table to standard
distribution if the number of RX rings changes during ring reservation.

v4: Use bnxt_get_rxfh_indir_size() to get the RSS table size.

v2: Use ALIGN() to roundup the RSS table size.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:13 -07:00
Michael Chan
f9f6a3fbb5 bnxt_en: Add helper function to return the number of RSS contexts.
On some chips, this varies based on the number of RX rings.  Add this
helper function and refactor the existing code to use it.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:13 -07:00
Michael Chan
1667cbf6a4 bnxt_en: Add logical RSS indirection table structure.
The driver currently does not keep track of the logical RSS indirection
table.  The hardware RSS table is set up with standard default ring
distribution when initializing the chip.  This makes it difficult to
support user specified indirection table entries.  As a first step, add
the logical table in the main bnxt structure and allocate it according
to chip specific table size.  Add a function that sets up default
RSS distribution based on the number of RX rings.

v4: Use bnxt_get_rxfh_indir_size() for the current RSS table size.

v2: Use kmalloc_array() since we init. all entries afterwards.
    Use ALIGN() to roundup the RSS table size.
    Use ethtool_rxfh_indir_default() to init. the default entries.
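
Sketch of the default distribution in user space. rxfh_indir_default()
below uses the same arithmetic as the kernel's
ethtool_rxfh_indir_default(); the table type and the fill helper are
hypothetical and do not reflect the bnxt structure layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same arithmetic as ethtool_rxfh_indir_default(): round-robin over rings. */
static uint32_t rxfh_indir_default(uint32_t index, uint32_t n_rx_rings)
{
	return index % n_rx_rings;
}

/* Fill a logical table of tbl_size entries with the default spread. */
static void fill_default_rss_table(uint16_t *tbl, size_t tbl_size,
				   uint32_t n_rx_rings)
{
	size_t i;

	assert(n_rx_rings != 0);
	for (i = 0; i < tbl_size; i++)
		tbl[i] = (uint16_t)rxfh_indir_default((uint32_t)i, n_rx_rings);
}
```

With 3 RX rings, an 8-entry table holds 0 1 2 0 1 2 0 1.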

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:13 -07:00
Michael Chan
b73c1d08a0 bnxt_en: Fix up bnxt_get_rxfh_indir_size().
Fix up bnxt_get_rxfh_indir_size() to return the proper current RSS
table size for P5 chips.  Change it to non-static so that bnxt.c
can use it to get the table size.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:13 -07:00
Michael Chan
34370d2435 bnxt_en: Set up the chip specific RSS table size.
Currently, we allocate one page for the hardware DMA RSS indirection
table.  While the size is currently big enough for all chips, future
chip variations may support bigger sizes, so it is better to calculate
and store the chip specific size and allocate accordingly.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:21:13 -07:00
Dmitry Bogdanov
a42e6aee7f net: atlantic: fix ip dst and ipv6 address filters
This patch fixes ip dst and ipv6 address filters.
There were 2 mistakes in the code, which led to the issue:
* invalid register was used for ipv4 dst address;
* incorrect write order of dwords for ipv6 addresses.

Fixes: 23e7a718a4 ("net: aquantia: add rx-flow filter definitions")
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 12:29:33 -07:00
Alexander A. Klimov
5d75c04306 Replace HTTP links with HTTPS ones: ATMEL MACB ETHERNET DRIVER
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 10:35:09 -07:00
Shannon Nelson
086c18f245 ionic: centralize queue reset code
The queue reset pattern is used in a couple different places,
only slightly different from each other, and could cause
issues if one gets changed and the other didn't.  This puts
them together so that only one version is needed, yet each
can have slightly different effects by passing in a pointer
to a work function to do whatever configuration twiddling is
needed in the middle of the reset.

This specifically addresses issues seen where under loops
of changing ring size or queue count parameters we could
occasionally bump into the netdev watchdog.
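
A minimal sketch of the pattern, assuming a simplified device model. All
names here (struct lif_sketch, queue_reset(), set_ring_size()) are
hypothetical and only illustrate "one reset routine, caller-supplied work
function in the middle"; they are not the ionic API.

```c
#include <assert.h>
#include <stddef.h>

struct lif_sketch {
	int running;
	unsigned int ring_size;
	unsigned int nqueues;
};

typedef int (*reset_cb_t)(struct lif_sketch *lif, void *arg);

/* Single reset path: stop, run the caller's reconfiguration, restart. */
static int queue_reset(struct lif_sketch *lif, reset_cb_t cb, void *arg)
{
	int was_running = lif->running;
	int err = 0;

	lif->running = 0;		/* stop the queues */
	if (cb)
		err = cb(lif, arg);	/* caller-specific twiddling */
	lif->running = was_running;	/* restore the previous state */
	return err;
}

/* Example work function: change the ring size while queues are stopped. */
static int set_ring_size(struct lif_sketch *lif, void *arg)
{
	assert(!lif->running);
	lif->ring_size = *(unsigned int *)arg;
	return 0;
}
```

A second caller that changes the queue count would pass a different work
function and reuse queue_reset() unchanged.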

v2: added more commit message commentary

Fixes: 4d03e00a21 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:50:31 -07:00
Alexander A. Klimov
1fd52137d3 Replace HTTP links with HTTPS ones: GRETH 10/100/1G Ethernet MAC device driver
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If both the HTTP and HTTPS versions
          return 200 OK and serve the same content:
            Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:44:27 -07:00
Alexander Lobakin
da3287111a net: qed: fix buffer overflow on ethtool -d
When generating debug dump, driver firstly collects all data in binary
form, and then performs per-feature formatting to human-readable if it
is supported.

For ethtool -d, this is roughly incorrect for two reasons. First of all,
drivers should always provide only original raw dumps to Ethtool without
any changes.
The second, and more critical, is that Ethtool's output buffer size is
strictly determined by ethtool_ops::get_regs_len(), and all data *must*
fit in it. The current version of driver always returns the size of raw
data, but the size of the formatted buffer exceeds it in most cases.
This leads to out-of-bound writes and memory corruption.

Address both issues by adding an option to return original, non-formatted
debug data, and using it for Ethtool case.

v2:
 - Expand commit message to make it more clear;
 - No functional changes.

Fixes: c965db4446 ("qed: Add support for debug data collection")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:42:31 -07:00
Colin Ian King
2291bde8c0 bnx2x: fix spelling mistake "occurd" -> "occurred"
There are spelling mistakes in various literal strings. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:41:54 -07:00
Sebastian Andrzej Siewior
f0b594dfa4 net/mlx5e: Do not include rwlock.h directly
rwlock.h should not be included directly. Instead linux/spinlock.h
should be included. Including it directly will break the RT build.

Fixes: 549c243e4e ("net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:28:51 -07:00
Tobias Waldekranz
7cdaa4cc4b net: ethernet: fec: prevent tx starvation under high rx load
In the ISR, we poll the event register for the queues in need of
service and then enter polled mode. After this point, the event
register will never be read again until we exit polled mode.

In a scenario where a UDP flow is routed back out through the same
interface, i.e. "router-on-a-stick" we'll typically only see an rx
queue event initially. Once we start to process the incoming flow
we'll be locked in polled mode, but we'll never clean the tx rings since
that event is never caught.

Eventually the netdev watchdog will trip, causing all buffers to be
dropped and then the process starts over again.

Rework the NAPI poll to keep trying to consume the entire budget as
long as new events are coming in, making sure to service all rx/tx
queues, in priority order, on each pass.
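
A simplified user-space model of the reworked loop, not the fec code. The
fake device, its fields and every function name are hypothetical; queue
priorities are not modelled. The point is only that the event register is
re-read on each pass, so a tx event raised while rx is being processed
still gets serviced in the same poll.

```c
#include <assert.h>

#define EV_RX 0x1u
#define EV_TX 0x2u

struct fake_dev {
	unsigned int events;	/* stands in for the event register */
	int rx_pending;
	int tx_pending;
	int tx_cleaned;
};

/* Read and acknowledge all pending events. */
static unsigned int read_events(struct fake_dev *d)
{
	unsigned int ev = d->events;

	d->events = 0;
	return ev;
}

/* Router-on-a-stick: every received packet is queued for transmit. */
static int rx_clean(struct fake_dev *d, int budget)
{
	int n = d->rx_pending < budget ? d->rx_pending : budget;

	d->rx_pending -= n;
	d->tx_pending += n;
	if (n)
		d->events |= EV_TX;
	if (d->rx_pending)
		d->events |= EV_RX;
	return n;
}

static void tx_clean(struct fake_dev *d)
{
	d->tx_cleaned += d->tx_pending;
	d->tx_pending = 0;
}

/* Keep polling while budget remains and new events keep arriving. */
static int poll_sketch(struct fake_dev *d, int budget)
{
	unsigned int ev;
	int done = 0;

	assert(budget > 0);
	while (done < budget && (ev = read_events(d)) != 0) {
		if (ev & EV_RX)
			done += rx_clean(d, budget - done);
		if (ev & EV_TX)
			tx_clean(d);
	}
	return done;
}
```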

Fixes: 4d494cdc92 ("net: fec: change data structure to support multiqueue")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Fugang Duan <fugang.duan@nxp.com>
Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:25:05 -07:00
Tom Rix
28b18e4eb5 net: sky2: initialize return of gm_phy_read
clang static analysis flags this garbage return

drivers/net/ethernet/marvell/sky2.c:208:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
        return v;
        ^~~~~~~~

static inline u16 gm_phy_read( ...
{
	u16 v;
	__gm_phy_read(hw, port, reg, &v);
	return v;
}

__gm_phy_read can return without setting v.

So handle similar to skge.c's gm_phy_read, initialize v.
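
User-space sketch of the fix pattern. phy_read_raw() is a hypothetical
stand-in for __gm_phy_read() that can fail without writing its output; the
value 0x1234 is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the raw read: on failure *val is left untouched. */
static int phy_read_raw(int fail, uint16_t *val)
{
	if (fail)
		return -1;
	*val = 0x1234;
	return 0;
}

static uint16_t phy_read(int fail)
{
	uint16_t v = 0;	/* defined result even if the raw read bails out */

	phy_read_raw(fail, &v);
	return v;
}

static void example_phy_read(void)
{
	assert(phy_read(0) == 0x1234);
	assert(phy_read(1) == 0);
}
```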

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:23:53 -07:00
Vaibhav Gupta
53fff2bfb3 smsc9420: use generic power management
Drivers should not use legacy power management as they have to manage power
states and related operations, for the device, themselves. This driver was
handling them with the help of PCI helper functions.

With generic PM, all essentials will be handled by the PCI core. Driver
needs to do only device-specific operations.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:20:49 -07:00
Vaibhav Gupta
622594f2ad epic100: use generic power management
Drivers should not use legacy power management as they have to manage power
states and related operations, for the device, themselves.

With generic PM, all essentials will be handled by the PCI core. Driver
needs to do only device-specific operations.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 15:20:48 -07:00
Luc Van Oostenryck
16d79cd4e2 PCI: Use 'pci_channel_state_t' instead of 'enum pci_channel_state'
The method struct pci_error_handlers.error_detected() is defined and
documented as taking an 'enum pci_channel_state' for the second argument,
but most drivers use 'pci_channel_state_t' instead.

This 'pci_channel_state_t' is not a typedef for the enum but a typedef for
a bitwise type in order to have better/stricter typechecking.

Consolidate everything by using 'pci_channel_state_t' in the method's
definition, in the related helpers and in the drivers.

Enforce use of 'pci_channel_state_t' by replacing 'enum pci_channel_state'
with an anonymous 'enum'.

Note: Currently, from a typechecking point of view this patch changes
nothing because only the constants defined by the enum are bitwise, not the
enum itself (sparse doesn't have the notion of 'bitwise enum'). This may
change in some not too far future, hence the patch.

[bhelgaas: squash in
  https://lore.kernel.org/r/20200702162651.49526-3-luc.vanoostenryck@gmail.com
  https://lore.kernel.org/r/20200702162651.49526-4-luc.vanoostenryck@gmail.com]
Link: https://lore.kernel.org/r/20200702162651.49526-2-luc.vanoostenryck@gmail.com
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-07-07 17:11:52 -05:00
Matteo Croce
4e48978cd2 mvpp2: fix pointer check
priv->page_pool is an array, so comparing against it will always return true.
Do a meaningful check by checking priv->page_pool[0] instead.
While at it, clear the page_pool pointers on deallocation, or when an
allocation error happens during init.
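
Sketch of the difference, in user space. The struct below is a
hypothetical reduction of the driver's private data; NPOOLS and the helper
names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NPOOLS 4

struct page_pool;	/* opaque for this sketch */

struct priv_sketch {
	struct page_pool *page_pool[NPOOLS];
};

/*
 * "if (priv->page_pool)" would test the address of the embedded array,
 * which is never NULL. Testing the first element is a real check.
 */
static int has_page_pool(const struct priv_sketch *priv)
{
	assert(priv != NULL);
	return priv->page_pool[0] != NULL;
}

/* Clear the pointers on teardown so the check above stays meaningful. */
static void clear_page_pools(struct priv_sketch *priv)
{
	size_t i;

	for (i = 0; i < NPOOLS; i++)
		priv->page_pool[i] = NULL;
}
```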

Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: c2d6fe6163 ("mvpp2: XDP TX support")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 13:03:21 -07:00
Wei Yongjun
847d97e013 sun/cassini: mark cas_resume() as __maybe_unused
In certain configurations without power management support, gcc reports
the following warning:

drivers/net/ethernet/sun/cassini.c:5206:12: warning:
 'cas_resume' defined but not used [-Wunused-function]
 5206 | static int cas_resume(struct device *dev_d)
      |            ^~~~~~~~~~

Mark cas_resume() as __maybe_unused to make it clear.
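
User-space sketch of the annotation, for GCC or clang. MAYBE_UNUSED is a
local stand-in for the kernel's __maybe_unused, and CONFIG_PM_DEMO,
struct device_sketch and demo_resume() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local equivalent of the kernel's __maybe_unused. */
#define MAYBE_UNUSED __attribute__((__unused__))

struct device_sketch {
	int resumed;
};

/*
 * Only referenced when PM support is compiled in. The attribute keeps
 * -Wunused-function quiet in the other configurations.
 */
static int MAYBE_UNUSED demo_resume(struct device_sketch *dev)
{
	assert(dev != NULL);
	dev->resumed = 1;
	return 0;
}

#ifdef CONFIG_PM_DEMO
static int (*const resume_op)(struct device_sketch *) = demo_resume;
#endif
```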

Fixes: f193f4ebde ("sun/cassini: use generic power management")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 12:59:37 -07:00
Vaibhav Gupta
86fc3f7074 sun/niu: add __maybe_unused attribute to PM functions
The upgraded .suspend() and .resume() throw a
"defined but not used [-Wunused-function]" warning in certain
configurations.

Mark them with "__maybe_unused" attribute.

Compile-tested only.

Fixes: b0db0cc2f6 ("sun/niu: use generic power management")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 12:56:40 -07:00
Andrew Lunn
791e5f61ae net: phy: mdio-octeon: Cleanup module loading dependencies
To ensure that the octeon MDIO driver has been loaded, the Cavium
ethernet drivers reference a dummy symbol in the MDIO driver. This
forces it to be loaded first. And this symbol has not been cleanly
implemented, resulting in warnings when building with W=1 C=1.

Since device tree is being used, and a phandle points to the PHY on
the MDIO bus, we can make use of deferred probing. If the PHY fails to
connect, it should be because the MDIO bus driver has not loaded
yet. Return -EPROBE_DEFER so it will be tried again later.
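
The deferred-probe idea can be modelled with a toy retry loop. This is only
a sketch: eth_probe(), probe_with_retry() and mdio_bus_ready are
hypothetical, and the real driver core re-queues deferred devices rather
than looping like this.

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517        /* same value the kernel uses */

static bool mdio_bus_ready;     /* hypothetical: set once the bus driver loads */

/* Hypothetical probe: defer instead of failing hard when the PHY is missing. */
static int eth_probe(void)
{
	if (!mdio_bus_ready)
		return -EPROBE_DEFER;
	return 0;
}

/* Toy driver core: retry a deferred probe after other drivers have loaded. */
static int probe_with_retry(int max_tries)
{
	int ret = -EPROBE_DEFER;

	for (int i = 0; i < max_tries && ret == -EPROBE_DEFER; i++) {
		ret = eth_probe();
		if (ret == -EPROBE_DEFER)
			mdio_bus_ready = true;  /* simulate the MDIO driver loading */
	}
	return ret;
}
```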

Additionally, add a MODULE_SOFTDEP() to give user space a hint as to
what order it should load the modules.

v2:
s/octoen/octeon/
Add MODULE_SOFTDEP()

Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 12:47:11 -07:00
Florian Fainelli
6e9fdb60d3 net: systemport: Add support for VLAN transmit acceleration
SYSTEMPORT is capable of performing VLAN transmit acceleration. Support
that by configuring it appropriately, providing the VLAN ID and PCP/DEI
where necessary.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07 12:39:53 -07:00
Alexander Lobakin
fd0816628a net: qede: fix BE vs CPU comparison
Flow Dissector's keys are mostly Network / Big Endian. U{16,32}_MAX are
the same in either byte order, but let's make sparse happy by
wrapping them in no-op conversions.
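
A small userspace check of the claim, using htons()/htonl() in place of the
kernel's cpu_to_be16()/cpu_to_be32(). The helper names are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/*
 * An all-ones mask reads the same in either byte order, so the conversion
 * does not change the value. In the kernel its purpose is to give the
 * constant the big-endian type that sparse expects.
 */
static int mask_is_full16(uint16_t be_mask)
{
	return be_mask == htons(UINT16_MAX);
}

static int mask_is_full32(uint32_t be_mask)
{
	return be_mask == htonl(UINT32_MAX);
}
```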

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:56 -07:00
Alexander Lobakin
50089be6bf net: qede: fix kernel-doc for qede_ptp_adjfreq()
One of the function arguments was renamed some time ago, but this
wasn't reflected in its kernel-doc comment.
Also add the description for return values.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:56 -07:00
Alexander Lobakin
5ab903418a net: qed: sanitize BE/LE data processing
Current code assumes that both host and device operate in Little Endian
in lots of places. While this is true for the x86 platform, this doesn't
mean we should not care about this.

This commit addresses all parts of the code that were pointed out by sparse
checker. All operations with restricted (__be*/__le*) types are now
protected with explicit from/to CPU conversions, even if they're noops on
common setups.
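
What an explicit conversion buys can be shown with byte-wise helpers that
give the same result on any host. demo_le32_to_cpu() and demo_cpu_to_le32()
are hypothetical userspace equivalents of the kernel helpers, not the
kernel implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interpret a stored little-endian 32-bit value as a host integer. */
static uint32_t demo_le32_to_cpu(uint32_t le)
{
	uint8_t b[4];

	memcpy(b, &le, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Store a host integer so that its bytes in memory are little-endian. */
static uint32_t demo_cpu_to_le32(uint32_t cpu)
{
	uint8_t b[4] = {
		(uint8_t)cpu, (uint8_t)(cpu >> 8),
		(uint8_t)(cpu >> 16), (uint8_t)(cpu >> 24),
	};
	uint32_t le;

	memcpy(&le, b, sizeof(le));
	return le;
}
```

On a little-endian host both helpers return their argument unchanged, which
is why such conversions are no-ops on common setups.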

I'm sure there are more such places, but this implies a deeper code
investigation, and is a subject for future work.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:56 -07:00
Alexander Lobakin
a0f3266f4b net: qed: use ptr shortcuts to dedup field accessing in some parts
Use intermediate pointers instead of multiple dereferencing to
simplify and beautify parts of code that will be addressed in
the next commit.
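
The pattern, sketched with hypothetical structures (none of these names
come from the qed driver):

```c
#include <assert.h>

struct params { int mtu; int vlan; };
struct vport  { struct params params; };
struct hwfn   { struct vport vport; };

static void set_defaults(struct hwfn *hwfn)
{
	/* Walk the member chain once, then reuse the short name. */
	struct params *p = &hwfn->vport.params;

	p->mtu = 1500;
	p->vlan = 0;
}
```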

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:55 -07:00
Alexander Lobakin
1451e467a3 net: qed: improve indentation of some parts of code
To not mix functional and stylistic changes, correct indentation
of code that will be modified in the subsequent commits.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:55 -07:00
Alexander Lobakin
71e11a3f5e net: qed: address kernel-doc warnings
Get rid of the kernel-doc warnings when building with W=1+ by
rewriting the problematic doc comments according to the
recommended format and style.

Note that this only fixes problems found in C source files,
headers aren't in scope for now.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:55 -07:00
Alexander Lobakin
365cd2cee0 net: qed: correct qed_hw_err_notify() prototype
Change the prototype of qed_hw_err_notify() as follows:
* constify "fmt" argument according to printk() declarations;
* annotate it with __cold attribute to move the function out of
  the line;
* annotate it with __printf() attribute.

This eliminates W=1+ warning:

drivers/net/ethernet/qlogic/qed/qed_hw.c: In function
‘qed_hw_err_notify’:
drivers/net/ethernet/qlogic/qed/qed_hw.c:851:3: warning: function
‘qed_hw_err_notify’ might be a candidate for ‘gnu_printf’ format
attribute [-Wsuggest-attribute=format]
 len = vsnprintf(buf, QED_HW_ERR_MAX_STR_SIZE, fmt, vl);
 ^~~

as well as saves some code size:

add/remove: 0/0 grow/shrink: 2/4 up/down: 40/-125 (-85)
Function                                     old     new   delta
qed_dmae_execute_command                    1680    1711     +31
qed_spq_post                                1104    1113      +9
qed_int_sp_dpc                              3554    3545      -9
qed_mcp_cmd_and_union                       1896    1876     -20
qed_hw_err_notify                            395     352     -43
qed_mcp_handle_events                       2630    2577     -53
Total: Before=368645, After=368560, chg -0.02%

__printf() will also be helpful with catching bad format strings
and arguments.
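
A userspace sketch of such a prototype, assuming GCC or Clang. The function
name, buffer and size below are hypothetical; the attributes are the
compiler-level forms behind the kernel's __printf() and __cold macros.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ERR_MAX_STR_SIZE 256         /* hypothetical buffer size */

static char demo_last_err[DEMO_ERR_MAX_STR_SIZE];

/*
 * format(printf, 1, 2): argument 1 is the format string and the variadic
 * arguments start at position 2, so the compiler can check them.
 * cold: hints that this is an unlikely (error) path.
 */
static int demo_err_notify(const char *fmt, ...)
	__attribute__((format(printf, 1, 2), cold));

static int demo_err_notify(const char *fmt, ...)
{
	va_list vl;
	int len;

	va_start(vl, fmt);
	len = vsnprintf(demo_last_err, sizeof(demo_last_err), fmt, vl);
	va_end(vl);
	return len;
}
```

With the format attribute in place, a call such as
demo_err_notify("%s", 42) draws a -Wformat warning at compile time.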

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:55 -07:00
Alexander Lobakin
c6b7314d53 net: qed: cleanup global structs declarations
Fix several sparse warnings by moving structs declarations into
the corresponding header files:

drivers/net/ethernet/qlogic/qed/qed_dcbx.c:2402:32: warning:
symbol 'qed_dcbnl_ops_pass' was not declared. Should it be static?

drivers/net/ethernet/qlogic/qed/qed_ll2.c:2754:26: warning: symbol
'qed_ll2_ops_pass' was not declared. Should it be static?

drivers/net/ethernet/qlogic/qed/qed_ptp.c:449:30: warning: symbol
'qed_ptp_ops_pass' was not declared. Should it be static?

drivers/net/ethernet/qlogic/qed/qed_sriov.c:5265:29: warning:
symbol 'qed_iov_ops_pass' was not declared. Should it be static?

(some of them were declared twice in different header files)

Also make qed_hw_err_type_descr[] const while at it.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:55 -07:00
Alexander Lobakin
0dfda108bf net: qed: move static iro_arr[] out of header file
Static variables (and functions, unless they're inline) should not
be declared in header files.
Move the static array iro_arr[] from "qed_hsi.h" to the sole place
where it's used, "qed_init_ops.c". This eliminates lots of warnings
(42 of them actually) against W=1+:

In file included from drivers/net/ethernet/qlogic/qed/qed.h:51:0,
                 from drivers/net/ethernet/qlogic/qed/qed_ooo.c:40:
drivers/net/ethernet/qlogic/qed/qed_hsi.h:4421:18: warning: 'iro_arr'
defined but not used [-Wunused-const-variable=]
 static const u32 iro_arr[] = {
                  ^~~~~~~

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:18:55 -07:00
Ioana Ciornei
0fe665d42f dpaa2-eth: fix draining of S/G cache
On link down, the draining of the S/G cache should be done on all
_possible_ CPUs, not just the ones that are online at that moment.
Fix this by changing the iterator.
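
The difference between the two iterators can be modelled with bitmasks.
This is a toy model only: the masks, the per-CPU counter array and
drain_caches() are hypothetical and stand in for the kernel's CPU masks and
per-CPU S/G cache.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4

static int cache_count[DEMO_NR_CPUS];       /* buffers cached per CPU */
static const unsigned int possible_mask = 0xF; /* CPUs 0-3 may exist */
static const unsigned int online_mask = 0x5;   /* only CPUs 0 and 2 are up */

/* Drain the cache of every CPU whose bit is set in the mask. */
static void drain_caches(unsigned int mask)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			cache_count[cpu] = 0;
}
```

Draining with online_mask leaves the buffers of CPUs 1 and 3 behind if they
filled their caches before going offline; draining with possible_mask does
not.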

Fixes: d70446ee1f ("dpaa2-eth: send a scatter-gather FD instead of realloc-ing")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 13:00:43 -07:00
Tang Bin
bc0c3ae40a net/amd: Remove needless assignment and the extra brank lines
The assignment 'err = -ENODEV;' in au1000_probe() is
duplicated, so remove the redundant one. Also remove the
extra blank lines in the file au1000_eth.c.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:58:40 -07:00
Yonglong Liu
a066562113 net: hns3: fix use-after-free when doing self test
Enable promisc mode of the PF, set the VF link state to enable, and
run iperf on the VF, then do a self test of the PF. The self test
will occasionally fail, and may cause a use-after-free
problem.

[   87.142126] selftest:000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   87.159722] ==================================================================
[   87.174187] BUG: KASAN: use-after-free in hex_dump_to_buffer+0x140/0x608
[   87.187600] Read of size 1 at addr ffff003b22828000 by task ethtool/1186
[   87.201012]
[   87.203978] CPU: 7 PID: 1186 Comm: ethtool Not tainted 5.5.0-rc4-gfd51c473-dirty #4
[   87.219306] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B160.01 01/15/2020
[   87.238292] Call trace:
[   87.243173]  dump_backtrace+0x0/0x280
[   87.250491]  show_stack+0x24/0x30
[   87.257114]  dump_stack+0xe8/0x140
[   87.263911]  print_address_description.isra.8+0x70/0x380
[   87.274538]  __kasan_report+0x12c/0x230
[   87.282203]  kasan_report+0xc/0x18
[   87.288999]  __asan_load1+0x60/0x68
[   87.295969]  hex_dump_to_buffer+0x140/0x608
[   87.304332]  print_hex_dump+0x140/0x1e0
[   87.312000]  hns3_lb_check_skb_data+0x168/0x170
[   87.321060]  hns3_clean_rx_ring+0xa94/0xfe0
[   87.329422]  hns3_self_test+0x708/0x8c0

The length of packet sent by the selftest process is only
128 + 14 bytes, and the min buffer size of a BD is 256 bytes,
and the receive process will make sure the packet sent by
the selftest process is in the linear part, so only check
the linear part in hns3_lb_check_skb_data().

So fix this use-after-free by using skb_headlen() to dump
skb->data instead of skb->len.
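
The length arithmetic can be sketched with a stand-in structure. struct
demo_skb and the helpers are hypothetical; demo_headlen() mirrors what
skb_headlen() computes (len minus data_len).

```c
#include <assert.h>

/*
 * Minimal stand-in for struct sk_buff: len counts all bytes of the packet,
 * data_len only those held in paged fragments, which ->data does not cover.
 */
struct demo_skb {
	unsigned int len;
	unsigned int data_len;
	const unsigned char *data;
};

/* Bytes available in the linear area that ->data points to. */
static unsigned int demo_headlen(const struct demo_skb *skb)
{
	return skb->len - skb->data_len;
}

/* Touch only the bytes that ->data really covers. */
static unsigned int demo_sum_linear(const struct demo_skb *skb)
{
	unsigned int sum = 0;

	for (unsigned int i = 0; i < demo_headlen(skb); i++)
		sum += skb->data[i];
	return sum;
}
```

Looping up to skb->len instead would read past the linear buffer whenever
data_len is non-zero.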

Fixes: c39c4d98dc ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:33:28 -07:00
Huazhong Tan
e22b5e728b net: hns3: add a missing uninit debugfs when unload driver
When unloading the driver, if flag HNS3_NIC_STATE_INITED has
already been cleared, the debugfs will not be uninitialized, so fix it.

Fixes: b2292360bb ("net: hns3: Add debugfs framework registration")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:33:28 -07:00
Huazhong Tan
cddd564892 net: hns3: fix for mishandle of asserting VF reset fail
When asserting VF reset fails, flag HCLGEVF_STATE_CMD_DISABLE
and the handshake status should not be set; otherwise the retry will
fail. So add a check for asserting VF reset and return
directly when it fails.

Fixes: ef5f8e507e ("net: hns3: stop handling command queue while resetting VF")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:33:28 -07:00
Huazhong Tan
bb3d866882 net: hns3: check reset pending after FLR prepare
If there is a PF reset pending before FLR prepare, FLR's
preparatory work will not fail, but the FLR rebuild procedure
will fail because of this pending reset. So this pending PF reset
should be handled in the FLR preparation.

Fixes: 8627bdedc4 ("net: hns3: refactor the precedure of PF FLR")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:33:28 -07:00
Vaibhav Gupta
f193f4ebde sun/cassini: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:24:15 -07:00
Vaibhav Gupta
b0db0cc2f6 sun/niu: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

The driver was calling pci_save/restore_state(), which is no longer needed.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:24:15 -07:00
Vaibhav Gupta
d4ce70b3b6 sun/sungem: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

In this driver:
gem_suspend() calls gem_do_stop() which in turn invokes
pci_disable_device(). As the PCI helper function is not called at the
end/start of the function body, breaking the function in two parts
may change its behavior.

The only other function invoking gem_do_stop() is gem_close(). Hence,
gem_close() and gem_suspend() can do the required end steps on their own.

The same case is with gem_resume(). Both gem_resume() and gem_open()
invoke gem_do_start(). Again, make the caller functions do the required
steps on their own.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-06 12:24:15 -07:00
Taehee Yoo
2fb2799a2a net: rmnet: do not allow to add multiple bridge interfaces
rmnet can have only two bridge interfaces.
One of them is a link interface and the other is added by
the master operation.
The rmnet interface shouldn't allow adding additional
bridge interfaces by the master operation.
But there is no code to deny additional interfaces,
so an interface leak occurs.

Test commands:
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add dummy2 type dummy
    ip link add rmnet0 link dummy0 type rmnet mux_id 1
    ip link set dummy1 master rmnet0
    ip link set dummy2 master rmnet0
    ip link del rmnet0

In the above test commands, dummy0 was attached to rmnet0 in VND mode.
Then, dummy1 was attached to rmnet0 in BRIDGE mode.
At this point, dummy0's mode is switched from VND to BRIDGE automatically.
Then, dummy2 is attached to rmnet0 in BRIDGE mode.
At this point, rmnet0 should deny this operation,
but it doesn't.
As a result, the splat below occurs when the rmnet0 interface is deleted.

Splat looks like:
[  186.684787][    C2] WARNING: CPU: 2 PID: 1009 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
[  186.684788][    C2] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_x
[  186.684805][    C2] CPU: 2 PID: 1009 Comm: ip Not tainted 5.8.0-rc1+ #621
[  186.684807][    C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  186.684808][    C2] RIP: 0010:rollback_registered_many+0x986/0xcf0
[  186.684811][    C2] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 5
[  186.684812][    C2] RSP: 0018:ffff8880cd9472e0 EFLAGS: 00010287
[  186.684815][    C2] RAX: ffff8880cc56da58 RBX: ffff8880ab21c000 RCX: ffffffff9329d323
[  186.684816][    C2] RDX: 1ffffffff2be6410 RSI: 0000000000000008 RDI: ffffffff95f32080
[  186.684818][    C2] RBP: dffffc0000000000 R08: fffffbfff2be6411 R09: fffffbfff2be6411
[  186.684819][    C2] R10: ffffffff95f32087 R11: 0000000000000001 R12: ffff8880cd947480
[  186.684820][    C2] R13: ffff8880ab21c0b8 R14: ffff8880cd947400 R15: ffff8880cdf10640
[  186.684822][    C2] FS:  00007f00843890c0(0000) GS:ffff8880d4e00000(0000) knlGS:0000000000000000
[  186.684823][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  186.684825][    C2] CR2: 000055b8ab1077b8 CR3: 00000000ab612006 CR4: 00000000000606e0
[  186.684826][    C2] Call Trace:
[  186.684827][    C2]  ? lockdep_hardirqs_on_prepare+0x379/0x540
[  186.684829][    C2]  ? netif_set_real_num_tx_queues+0x780/0x780
[  186.684830][    C2]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[  186.684831][    C2]  ? __kasan_slab_free+0x126/0x150
[  186.684832][    C2]  ? kfree+0xdc/0x320
[  186.684834][    C2]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[  186.684835][    C2]  unregister_netdevice_many.part.135+0x13/0x1b0
[  186.684836][    C2]  rtnl_delete_link+0xbc/0x100
[ ... ]
[  238.440071][ T1009] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1

Fixes: 037f9cdf72 ("net: rmnet: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 18:04:55 -07:00
Taehee Yoo
2a762e9e8c net: rmnet: fix lower interface leak
rmnet has two types of lower interface: VND and BRIDGE.
Each lower interface can have only one type, either VND or BRIDGE.
But there is a case in which one lower interface is used as both types.
Due to this unexpected behavior, a lower interface leak occurs.

Test commands:
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add rmnet0 link dummy0 type rmnet mux_id 1
    ip link set dummy1 master rmnet0
    ip link add rmnet1 link dummy1 type rmnet mux_id 2
    ip link del rmnet0

dummy1 was attached as a BRIDGE interface of rmnet0.
Then, it was also attached as a VND interface of rmnet1.
This is unexpected behavior and there is no code for handling this case.
As a result, the splat below occurs when the rmnet0 interface is deleted.

Splat looks like:
[   53.254112][    C1] WARNING: CPU: 1 PID: 1192 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
[   53.254117][    C1] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nfx
[   53.254182][    C1] CPU: 1 PID: 1192 Comm: ip Not tainted 5.8.0-rc1+ #620
[   53.254188][    C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   53.254192][    C1] RIP: 0010:rollback_registered_many+0x986/0xcf0
[   53.254200][    C1] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 0f 0b e5
[   53.254205][    C1] RSP: 0018:ffff888050a5f2e0 EFLAGS: 00010287
[   53.254214][    C1] RAX: ffff88805756d658 RBX: ffff88804d99c000 RCX: ffffffff8329d323
[   53.254219][    C1] RDX: 1ffffffff0be6410 RSI: 0000000000000008 RDI: ffffffff85f32080
[   53.254223][    C1] RBP: dffffc0000000000 R08: fffffbfff0be6411 R09: fffffbfff0be6411
[   53.254228][    C1] R10: ffffffff85f32087 R11: 0000000000000001 R12: ffff888050a5f480
[   53.254233][    C1] R13: ffff88804d99c0b8 R14: ffff888050a5f400 R15: ffff8880548ebe40
[   53.254238][    C1] FS:  00007f6b86b370c0(0000) GS:ffff88806c200000(0000) knlGS:0000000000000000
[   53.254243][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.254248][    C1] CR2: 0000562c62438758 CR3: 000000003f600005 CR4: 00000000000606e0
[   53.254253][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   53.254257][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   53.254261][    C1] Call Trace:
[   53.254266][    C1]  ? lockdep_hardirqs_on_prepare+0x379/0x540
[   53.254270][    C1]  ? netif_set_real_num_tx_queues+0x780/0x780
[   53.254275][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[   53.254279][    C1]  ? __kasan_slab_free+0x126/0x150
[   53.254283][    C1]  ? kfree+0xdc/0x320
[   53.254288][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[   53.254293][    C1]  unregister_netdevice_many.part.135+0x13/0x1b0
[   53.254297][    C1]  rtnl_delete_link+0xbc/0x100
[   53.254301][    C1]  ? rtnl_af_register+0xc0/0xc0
[   53.254305][    C1]  rtnl_dellink+0x2dc/0x840
[   53.254309][    C1]  ? find_held_lock+0x39/0x1d0
[   53.254314][    C1]  ? valid_fdb_dump_strict+0x620/0x620
[   53.254318][    C1]  ? rtnetlink_rcv_msg+0x457/0x890
[   53.254322][    C1]  ? lock_contended+0xd20/0xd20
[   53.254326][    C1]  rtnetlink_rcv_msg+0x4a8/0x890
[ ... ]
[   73.813696][ T1192] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1

Fixes: 037f9cdf72 ("net: rmnet: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 18:04:55 -07:00
Vaibhav Gupta
7ada9a5e48 qlcninc: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

.suspend() calls __qlcnic_shutdown, which then calls qlcnic_82xx_shutdown;
.resume()  calls __qlcnic_resume,   which then calls qlcnic_82xx_resume;

Both ...82xx..() functions are defined in
drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c and are used only in
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c.

Hence upgrade them and remove PCI function calls, like pci_save_state() and
pci_enable_wake(), inside them.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 18:02:06 -07:00
Vaibhav Gupta
063ad9bcc2 netxen_nic: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

In this driver:
netxen_nic_resume() calls netxen_nic_attach_func() which then invokes PCI
helper functions like pci_enable_device(), pci_set_power_state() and
pci_restore_state(). Other function:
 - netxen_io_slot_reset()
also calls netxen_nic_attach_func().

Also, netxen_io_slot_reset() returns a specific value based on the return
value of netxen_nic_attach_func() as a whole. Thus, we cannot simply move
some piece of code from netxen_nic_attach_func() to it.

Hence, define a new function netxen_nic_attach_late_func() to do the tasks
which have to be done after PCI helper functions have done their job.

Now, netxen_nic_attach_func() invokes netxen_nic_attach_late_func(), thus
netxen_io_slot_reset() behaves normally.
And, netxen_nic_resume() calls netxen_nic_attach_late_func() to avoid PCI
helper functions calls.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 18:02:06 -07:00
Luo bin
6dbb89014d hinic: fix sending mailbox timeout in aeq event work
When sending a mailbox message in the work of an aeq event, another aeq
event will be triggered. Because the previous aeq work has not exited and
only one work item can be executed at a time in the same workqueue, the
mailbox sending function will fail with a timeout. We create and use
another workqueue to fix this.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 17:53:16 -07:00
Sudarsana Reddy Kalluru
a466657071 bnx2x: Perform Idlechk dump during the debug collection.
The patch adds driver changes to perform Idlechk dump during the debug
data collection.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 17:51:07 -07:00
Sudarsana Reddy Kalluru
cdf711f20b bnx2x: Add support for idlechk tests.
This patch populates a database of idlechk tests (registers and
predicates) and performs the idlechk using this data.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 17:51:07 -07:00
Sudarsana Reddy Kalluru
4365f35b12 bnx2x: Add Idlechk related register definitions.
The patch adds register definitions required for Idlechk implementation.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 17:51:07 -07:00
Sven Auhagen
39b9631524 mvpp2: xdp ethtool stats
Add ethtool statistics for XDP.

Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-03 14:46:24 -07:00
Matteo Croce
c2d6fe6163 mvpp2: XDP TX support
Add the transmit part of XDP support, which includes:
- support for XDP_TX in mvpp2_xdp()
- .ndo_xdp_xmit hook for AF_XDP and XDP_REDIRECT with mvpp2 as destination

mvpp2_xdp_submit_frame() is a generic function which is called by
mvpp2_xdp_xmit_back() when doing XDP_TX, and by mvpp2_xdp_xmit when
doing AF_XDP or XDP_REDIRECT target.

The buffer allocation has been reworked to be able to map the buffers
as DMA_FROM_DEVICE or DMA_BIDIRECTIONAL depending if native XDP is
in use or not.

Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-03 14:46:24 -07:00
Matteo Croce
07dd0a7aae mvpp2: add basic XDP support
Add XDP native support.
For now, only the XDP_DROP, XDP_PASS and XDP_REDIRECT
verdicts are supported.

Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-03 14:46:24 -07:00
Matteo Croce
b27db2274b mvpp2: use page_pool allocator
Use the page_pool API for memory management.
This is a prerequisite for native XDP support.

Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-03 14:46:24 -07:00
Matteo Croce
136bcd8425 mvpp2: refactor BM pool init percpu code
In mvpp2_swf_bm_pool_init_percpu(), a reference to a struct
mvpp2_bm_pool is obtained traversing multiple structs, when a
local variable already points to the same object.

Fix it and, while at it, give the variable a meaningful name.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-03 14:46:24 -07:00
Florian Fainelli
47ff6154fd net: bcmgenet: Allow changing carrier from user-space
The GENET driver interfaces with internal MoCA interface as well as
external MoCA chips like the BCM6802/6803 through a fixed link
interface. It is desirable for the mocad user-space daemon to be able to
control the carrier state based upon out of band messages that it
receives from the MoCA chip.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-03 12:34:28 -07:00
Michael Guralnik
4dca650991 net/mlx5: Enable QP number request when creating IPoIB underlay QP
If in the process of creating the underlay QP for an IPoIB interface
the user has set the address and specifically the 1st-3rd bytes
representing the QP number, use the requested QP number when creating
the underlay QP.

For a user to be able to request a QP number on QP creation, the MKEY_BY_NAME
NVCONFIG should be set, as mkey_by_name and qp_by_name are coupled in FW.
This requires the driver to query the mkey_by_name max cap during
initialization and set the current cap if it was enabled in FW.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-07-03 18:38:01 +03:00
Aya Levin
e620556427 net/mlx5e: Enhance TX timeout recovery
When handling a TX timeout, if the TX reporter was not able to recover
from the error, reopen the channels. If a channel reopen was attempted, do
not loop over the TX queues looking for timeouts.

With that, the reporters state and separation will better
expose the driver's state.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:19 -07:00
Aya Levin
b84921129b net/mlx5e: Enhance ICOSQ data on RX reporter's diagnose
When the RQ is in striding RQ mode, it uses the ICOSQ as a helper queue.
In this mode, RX reporter dumps more info about the ICOSQ and its
related CQ.

$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
    RQ:
      type: 2 stride size: 2048 size: 8
      CQ:
        stride size: 64 size: 1024
RQs:
    channel ix: 0 rqn: 2413 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7
    CQ:
      cqn: 1032 HW status: 0 ci: 0 size: 1024
    EQ:
      eqn: 7 irqn: 42 vecidx: 1 ci: 93 size: 2048
    ICOSQ:
      sqn: 2411 HW state: 1 cc: 74 pc: 74 WQE size: 128
      CQ:
        cqn: 1029 cc: 8 size: 128
    channel ix: 1 rqn: 2418 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7
    CQ:
      cqn: 1036 HW status: 0 ci: 0 size: 1024
    EQ:
      eqn: 8 irqn: 43 vecidx: 2 ci: 2 size: 2048
    ICOSQ:
      sqn: 2416 HW state: 1 cc: 74 pc: 74 WQE size: 128
      CQ:
        cqn: 1033 cc: 8 size: 128

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:19 -07:00
Aya Levin
56837c2ae1 net/mlx5e: Add EQ info to TX/RX reporter's diagnose
Enhance TX/RX reporter's diagnose to include info about the
corresponding EQ.

$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
    RQ:
      type: 2 stride size: 2048 size: 8
      CQ:
        stride size: 64 size: 1024
RQs:
    channel ix: 0 rqn: 1713 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
     CQ:
       cqn: 1032 HW status: 0 ci: 0 size: 1024
     EQ:
       eqn: 7 irqn: 42 vecidx: 1 ci: 93 size: 2048
     channel ix: 1 rqn: 1718 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
     CQ:
       cqn: 1036 HW status: 0 ci: 0 size: 1024
     EQ:
       eqn: 8 irqn: 43 vecidx: 2 ci: 2 size: 2048

$ devlink health diagnose pci/0000:00:0b.0 reporter tx
Common Config:
    SQ:
      stride size: 64 size: 1024
      CQ:
        stride size: 64 size: 1024
SQs:
   channel ix: 0 tc: 0 txq ix: 0 sqn: 1712 HW state: 1 stopped: false cc: 91 pc: 91
   CQ:
     cqn: 1030 HW status: 0 ci: 91 size: 1024
   EQ:
     eqn: 7 irqn: 42 vecidx: 1 ci: 93 size: 2048
   channel ix: 1 tc: 0 txq ix: 1 sqn: 1717 HW state: 1 stopped: false cc: 0 pc: 0
   CQ:
     cqn: 1034 HW status: 0 ci: 0 size: 1024
   EQ:
     eqn: 8 irqn: 43 vecidx: 2 ci: 2 size: 2048

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:19 -07:00
Aya Levin
3c9d1699b8 net/mlx5e: Enhance CQ data on diagnose output
Add CQ's consumer index and size to the CQ's diagnose output retrieved on
RX/TX reporter diagnose.

$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
    RQ:
      type: 2 stride size: 2048 size: 8
      CQ:
        stride size: 64 size: 1024
RQs:
    channel ix: 0 rqn: 2413 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
    CQ:
      cqn: 1032 HW status: 0 ci: 0 size: 1024
    channel ix: 1 rqn: 2418 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
    CQ:
      cqn: 1036 HW status: 0 ci: 0 size: 1024

$ devlink health diagnose pci/0000:00:0b.0 reporter tx
Common Config:
    SQ:
      stride size: 64 size: 1024
      CQ:
        stride size: 64 size: 1024
SQs:
    channel ix: 0 tc: 0 txq ix: 0 sqn: 2412 HW state: 1 stopped: false cc: 0 pc: 0
    CQ:
      cqn: 1030 HW status: 0 ci: 0 size: 1024
    channel ix: 1 tc: 0 txq ix: 1 sqn: 2417 HW state: 1 stopped: false cc: 5 pc: 5
    CQ:
      cqn: 1034 HW status: 0 ci: 5 size: 1024

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:18 -07:00
Aya Levin
d5cbedd7fc net/mlx5e: Rename reporter's helpers
Change prefix to match resident file:
%s/mlx5e_reporter_cq_diagnose/mlx5e_health_cq_diag_fmsg
%s/mlx5e_reporter_cq_common_diagnose/mlx5e_health_cq_common_diag_fmsg
%s/mlx5e_reporter_named_obj_nest_start/mlx5e_health_fmsg_named_obj_nest_start
%s/mlx5e_reporter_named_obj_nest_end/mlx5e_health_fmsg_named_obj_nest_end

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:18 -07:00
Aya Levin
de6c6ab7e8 net/mlx5e: Add helper to get the RQ WQE counter
Add a helper which retrieves the RQ's WQE counter. Use this helper in
the RX reporter diagnose callback.

$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
  RQ:
     type: 2 stride size: 2048 size: 8
     CQ:
      stride size: 64 size: 1024
RQs:
   channel ix: 0 rqn: 2113 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
   CQ:
    cqn: 1032 HW status: 0
   channel ix: 1 rqn: 2118 HW state: 1 SW state: 5 WQE counter: 7 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
   CQ:
    cqn: 1036 HW status: 0

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:18 -07:00
Aya Levin
fc42d0de16 net/mlx5e: Add helper to get RQ WQE's head
Add helper which retrieves the RQ WQE's head. Use this helper in RX
reporter diagnose callback.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:17 -07:00
Aya Levin
5d95c81660 net/mlx5e: Move RQ helpers to txrx.h
Use txrx.h to contain helper functions regarding TX/RX. In the coming
patches, I will add more RQ helpers.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:17 -07:00
Aya Levin
4537f524b4 net/mlx5e: Align RX/TX reporters diagnose output format
Change the hierarchy of the RX reporter 'Common config' in the diagnose
output to match the 'Common config' of the TX reporter which reflects
that CQ is a helper to the traffic queues.

Before:
$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
    RQ:
      type: 2 stride size: 2048 size: 8
    CQ:
      stride size: 64 size: 1024
    RQs:
    ...

After:
$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
    RQ:
      type: 2 stride size: 2048 size: 8
      CQ:
        stride size: 64 size: 1024
    RQs:
    ...

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:17 -07:00
Aya Levin
b9961af7b8 net/mlx5e: Remove redundant RQ state query
When a CQE error is received, the driver inspects the syndrome given by the
firmware. RQ recovery is initiated only as a result of a fatal syndrome,
i.e. a syndrome which sets the RQ into an error state. Hence there is no need
to query the RQ state at the beginning of the recovery process. Add additional
debug prints before recovering.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:16 -07:00
Aya Levin
e74e28aee1 net/mlx5e: Add a flush timeout define
During queue's recovery, driver waits for flush. The flush timeout is
set to 2 seconds. Add a define for this value for the benefit of RX and
TX reporters.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:16 -07:00
Eran Ben Elisha
b3ea4c4fdc net/mlx5e: Change reporters create functions to return void
Failure to create devlink health reporters is not fatal for mlx5e instance load.
In case of error in reporter's creation, the return value is ignored.
Change all reporters creation functions to return void.

In addition, with this change, a failure in creating a reporter will not
prevent the driver from trying to create the next reporter in the list.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-02 21:05:15 -07:00
Edward Cree
b6d02dd2ff sfc_ef100: helper function to set default RSS table of given size
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
b3007dfd5b sfc_ef100: NVRAM selftest support code
We have yet another new scheme for NVRAM, and a corresponding new MCDI.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
39c965f4e6 sfc_ef100: populate BUFFER_SIZE_BYTES in INIT_RXQ
The QDMA subsystem on EF100 needs this information.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
805d22bf92 sfc_ef100: add EF100 to NIC-revision enumeration
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
bcacac7a8c sfc: get drvinfo driver name from outside the common code
Since ethtool_common.o will be built into both sfc and sfc_ef100 drivers,
 it can't use KBUILD_MODNAME directly.  Instead, make it reference a
 string provided by the individual driver code.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
31f4cbd401 sfc: initialise RSS context ID to 'no RSS context' in efx_init_struct()
Previously this was only happening in ef10-specific code.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
d700fe014e sfc: commonise efx_fini_dmaq
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
965470ee76 sfc: factor out efx_mcdi_filter_table_down() from _remove()
_down() merely removes all our filters and VLANs, it doesn't free
 efx->filter_state itself.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
79de6e7cb8 sfc: don't call tx_limit_len if NIC type doesn't have one
EF100 doesn't need to split up large DMAs.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
a81dcd85a7 sfc: assign TXQs without gaps
Since we only allocate VIs for the number of TXQs we actually need, we
 cannot naively use "channel * TXQ_TYPES + txq" for the TXQ number, as
 this has gaps (when efx->tx_queues_per_channel < EFX_TXQ_TYPES) and
 thus overruns the driver's VI allocations, causing the firmware to
 reject the MC_CMD_INIT_TXQ based on INSTANCE.
Thus, we distinguish INSTANCE (stored in tx_queue->queue) from LABEL
 (tx_queue->label); the former is allocated starting from 0 in
 efx_set_channels(), while the latter is simply the txq type (index in
 channel->tx_queue array).
To simplify things, rather than changing tx_queues_per_channel after
 setting up TXQs, make Siena always probe its HIGHPRI queues at start
 of day, rather than deferring it until tc mqprio enables them.
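
The gap problem described above can be illustrated with a minimal standalone
sketch. The constant and function names here are hypothetical and are not taken
from the sfc driver; the packed formula assumes every channel has the same
number of TXQs, which makes it equivalent to handing out instances from a
counter starting at 0.

```c
/* Hypothetical stand-in for EFX_TXQ_TYPES; illustration only. */
#define DEMO_TXQ_TYPES 4

/* Naive numbering: leaves gaps when txqs_per_channel < DEMO_TXQ_TYPES. */
static unsigned int demo_naive_instance(unsigned int channel,
					unsigned int txq_type)
{
	return channel * DEMO_TXQ_TYPES + txq_type;
}

/* Gap-free numbering: instances are consecutive, starting from 0. */
static unsigned int demo_packed_instance(unsigned int channel,
					 unsigned int txq_type,
					 unsigned int txqs_per_channel)
{
	return channel * txqs_per_channel + txq_type;
}
```

With two channels and two TXQs per channel, only four instances (0..3) are
needed; the naive scheme numbers the last queue 5, while the packed scheme
numbers it 3.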

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
69a704962e sfc: commonise netif_set_real_num[tr]x_queues calls
While we're at it, also check them for failure.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
f9cac93e5b sfc: make tx_queues_per_channel variable at runtime
Siena needs four TX queues (csum * highpri), EF10 needs two (csum),
 and EF100 only needs one (as checksumming is controlled entirely by
 the transmit descriptor).  Rather than having various bits of ad-hoc
 code to decide which queues to set up etc., put the knowledge of how
 many TXQs a channel has in one place.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
67e6398e2e sfc: move modparam 'rss_cpus' out of common channel code
Instead of exposing this old module parameter on the new driver (thus
 having to keep it forever after for compatibility), let's confine it
 to the old one; if we find later that we need the feature, we ought
 to support it properly, with ethtool set-channels.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:40 -07:00
Edward Cree
e4ff323210 sfc: move modparam 'interrupt_mode' out of common channel code
EF100 only supports MSI-X, so there's no need for the new driver to
 expose this old module parameter.
Since it's now visible to the linker, we have to rename it internally
 to efx_interrupt_mode to avoid symbol collisions in non-modular
 builds.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:39 -07:00
Edward Cree
bc32442176 sfc: remove max_interrupt_mode
All NICs supported by this driver are capable of MSI-X interrupts (only
 Falcon A1 wasn't, and that's now hived off into its own driver), so no
 need for a nic-type parameter.  Besides, the code that checked it was
 buggy anyway (the following assignment that checked min_interrupt_mode
 overrode it).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:39 -07:00
Edward Cree
af3c38d3fb sfc: support setting MTU even if not privileged to configure MAC fully
Unprivileged functions (such as VFs) may set their MTU by use of the
 'control' field of MC_CMD_SET_MAC_EXT, as used in efx_mcdi_set_mtu().
If calling efx_ef10_mac_reconfigure() from efx_change_mtu(), and the
 NIC supports the above (SET_MAC_ENHANCED capability), use it rather
 than efx_mcdi_set_mac().

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:47:39 -07:00
Wei Yongjun
4e1a691168 mlx4: Mark PM functions as __maybe_unused
In certain configurations without power management support, the
following warnings happen:

drivers/net/ethernet/mellanox/mlx4/main.c:4388:12:
 warning: 'mlx4_resume' defined but not used [-Wunused-function]
 4388 | static int mlx4_resume(struct device *dev_d)
      |            ^~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx4/main.c:4373:12: warning:
 'mlx4_suspend' defined but not used [-Wunused-function]
 4373 | static int mlx4_suspend(struct device *dev_d)
      |            ^~~~~~~~~~~~

Mark these functions as __maybe_unused to make it clear to the
compiler that this is going to happen based on the configuration,
which is the standard for these types of functions.
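
As a rough userspace illustration of the annotation (a sketch, not the mlx4
code): the macro definition below is an assumption that mirrors how the kernel
maps __maybe_unused onto the GCC/Clang "unused" attribute, and demo_suspend()
and demo_resume() are hypothetical names.

```c
/* Assumption: mirrors the kernel's definition for GCC/Clang. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

struct device; /* opaque stand-in for the kernel's struct device */

/*
 * Without the attribute, -Wunused-function warns whenever the only user of
 * these callbacks is compiled out by the configuration.
 */
static int __maybe_unused demo_suspend(struct device *dev)
{
	(void)dev;
	return 0;
}

static int __maybe_unused demo_resume(struct device *dev)
{
	(void)dev;
	return 0;
}
```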

Fixes: 0e3e206a3e ("mlx4: use generic power management")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:24:17 -07:00
Wei Yongjun
ffa76e38b7 ksz884x: mark pcidev_suspend() as __maybe_unused
In certain configurations without power management support, gcc reports
the following warning:

drivers/net/ethernet/micrel/ksz884x.c:7182:12: warning:
 'pcidev_suspend' defined but not used [-Wunused-function]
 7182 | static int pcidev_suspend(struct device *dev_d)
      |            ^~~~~~~~~~~~~~

Mark pcidev_suspend() as __maybe_unused to make it clear.

Fixes: 64120615d1 ("ksz884x: use generic power management")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:23:32 -07:00
Claudiu Beznea
8932b5a533 net: macb: remove is_udp variable
Remove is_udp variable that is used in only one place and use
ip_hdr(skb)->protocol == IPPROTO_UDP check instead.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:22:00 -07:00
Claudiu Beznea
580d395cb9 net: macb: do not initialize queue variable
Do not initialize queue variable. It is already initialized in for loops.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:22:00 -07:00
Claudiu Beznea
b7ab39b359 net: macb: use hweight32() to count set bits in queue_mask
Use hweight32() to count set bits in queue_mask.
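
For readers unfamiliar with hweight32(): it returns the number of set bits in a
32-bit word. A minimal userspace stand-in, under the hypothetical name
demo_popcount32(), could look like this (the kernel's own implementation
differs):

```c
#include <stdint.h>

/* Userspace stand-in for hweight32(): count the set bits in w. */
static unsigned int demo_popcount32(uint32_t w)
{
	unsigned int count = 0;

	while (w) {
		w &= w - 1; /* clear the lowest set bit */
		count++;
	}
	return count;
}
```

A queue mask of 0xF would therefore describe four queues.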

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:22:00 -07:00
Claudiu Beznea
fec371f624 net: macb: do not set again bit 0 of queue_mask
Bit 0 of queue_mask is set at the beginning of
macb_probe_queues() function. Do not set it again after reading
DCFG6 but instead use "|=" operator.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:22:00 -07:00
David S. Miller
d8c8a96ce5 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2020-07-01

This series contains updates to all Intel drivers, but a majority of the
changes are to the i40e driver.

Jeff converts 'fall through' comments to the 'fallthrough;' keyword for
all Intel drivers. Removed unnecessary delay in the ixgbe ethtool
diagnostics test.

Arkadiusz implements Total Port Shutdown for i40e. This is the revised
patch based on Jakub's feedback from an earlier submission of this
patch, where additional code comments and description were needed to
describe the functionality.

Wei Yongjun fixes return error code for iavf_init_get_resources().

Magnus optimizes XDP code in i40e; starting with AF_XDP zero-copy
transmit completion path. Then by only executing a division when
necessary in the napi_poll data path. Move the check for transmit ring
full outside the send loop to increase performance.

Ciara adds XDP ring statistics to i40e and the ability to dump these
statistics and descriptors.

Tony fixes reporting iavf statistics.

Radoslaw adds support for 2.5 and 5 Gbps by implementing the newer ethtool
ksettings API in ixgbe.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 17:41:56 -07:00
David S. Miller
11a20c7152 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2020-07-01

This series contains updates to the ice driver only.

Jacob implements a devlink region for device capabilities.

Bruce removes structs containing only one-element arrays that are either
unused or only used for indexing. Instead, use pointer arithmetic or
other indexing to access the elements. Converts "C struct hack"
variable-length types to the preferred C99 flexible array member.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 17:25:00 -07:00
Bruce Allan
66486d8943 ice: replace single-element array used for C struct hack
Convert the pre-C90-extension "C struct hack" method (using a single-
element array at the end of a structure for implementing variable-length
types) to the preferred use of C99 flexible array member.
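
The difference between the two idioms can be sketched as follows. The struct
and function names are hypothetical and are not taken from the ice driver.

```c
#include <stdlib.h>

/* Old "struct hack": a one-element array stands in for a variable tail. */
struct demo_old_buf {
	unsigned short count;
	unsigned int elem[1];
};

/* C99 flexible array member: the tail contributes nothing to sizeof. */
struct demo_new_buf {
	unsigned short count;
	unsigned int elem[];
};

/* Allocate a buffer with room for n trailing elements (zero-initialized). */
static struct demo_new_buf *demo_new_buf_alloc(unsigned short n)
{
	struct demo_new_buf *b;

	b = calloc(1, sizeof(*b) + (size_t)n * sizeof(b->elem[0]));
	if (b)
		b->count = n;
	return b;
}
```

With the old layout the allocation size has to be computed with an error-prone
"n - 1" correction, because sizeof already includes one element.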

Additional code cleanups were done near areas affected by this change.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 16:35:23 -07:00
Bruce Allan
b3c3890489 ice: avoid unnecessary single-member variable-length structs
There are a number of structures that consist of a one-element array as the
only struct member.  Some of those are unused so remove them. Others are
used to index into a buffer/array consisting of a variable number of a
different data or structure type.  Those are unnecessary since we can use
simple pointer arithmetic or index directly into the buffer to access
individual elements of the buffer/array.

Additional code cleanups were done near areas affected by this change.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 16:33:29 -07:00
Jacob Keller
8d7aab3515 ice: implement snapshot for device capabilities
Add a new devlink region used for capturing a snapshot of the device
capabilities buffer which is reported by the firmware over the AdminQ.
This information can be useful in debugging driver and firmware
interactions.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 15:41:45 -07:00
Radoslaw Tyl
a296d665ea ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support
Add full support for the new version of the ethtool API. The new API allows
use of the 2500base-T and 5000base-T supported and advertised link speed modes.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 15:04:54 -07:00
Jeff Kirsher
bb0967c04e ixgbe: Cleanup unneeded delay in ethtool test
There is a 4 seconds delay in ixgbe_diag_test() that is holding up other
ioctls such as SIOCGIFCONF that Oracle database applications use.
One of Oracle's product runs "ethtool -t ethX online" periodically for
system monitoring and that is impacting database applications that use
SIOCGIFCONF at that same time.

This 4 second delay was needed in our early 1GbE parts to give the PHY
time to recover from a reset.  This code was carried forward to the 10 GbE
driver even though it was not needed for the supported PHYs in the ixgbe driver.

CC: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
CC: Jack Vogel <jack.vogel@oracle.com>
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:47:58 -07:00
Tony Nguyen
9358076642 iavf: Fix updating statistics
Commit bac8486116 ("iavf: Refactor the watchdog state machine") inverted
the logic for when to update statistics. Statistics should be updated when
no other commands are pending, instead they were only requested when a
command was processed. iavf_request_stats() would see a pending request
and not request statistics to be updated. This caused statistics to never
be updated; fix the logic.

Fixes: bac8486116 ("iavf: Refactor the watchdog state machine")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2020-07-01 14:45:59 -07:00
Ciara Loftus
44ea803e2f i40e: introduce new dump desc XDP command
Interfaces already exist for dumping Rx and Tx descriptor information.
Introduce another for doing the same for XDP descriptors.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:44:17 -07:00
Ciara Loftus
890c402c7b i40e: add XDP ring statistics to dump VSI debug output
Prior to this, only the Rx and Tx ring statistics were dumped. The XDP
ring statistics are now dumped as well.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:40:07 -07:00
Ciara Loftus
e2968260e1 i40e: add XDP ring statistics to VSI stats
Prior to this, only Rx and Tx ring statistics were accounted for.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:35:54 -07:00
Magnus Karlsson
1fd972ebe5 i40e: move check of full Tx ring to outside of send loop
Move the check if the HW Tx ring is full to outside the send
loop. Currently it is checked for every single descriptor that we
send. Instead, tell the send loop to only process a maximum number of
packets equal to the number of available slots in the Tx ring. This
way, we can remove the check inside the send loop and gain some
performance.
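
The idea can be sketched in plain C as below. This is a simplified model with
hypothetical names (demo_ring, demo_unused, demo_send), not the i40e code: the
budget is clamped once to the free slots in the ring, so the loop body needs no
per-descriptor "ring full" test.

```c
struct demo_ring {
	unsigned int count;         /* number of descriptors in the ring */
	unsigned int next_to_use;   /* producer index */
	unsigned int next_to_clean; /* consumer index */
};

/* Free descriptors; one slot stays empty to tell "full" from "empty". */
static unsigned int demo_unused(const struct demo_ring *r)
{
	unsigned int ntc = r->next_to_clean, ntu = r->next_to_use;

	return ((ntc > ntu) ? 0 : r->count) + ntc - ntu - 1;
}

/* Send up to budget packets; returns how many descriptors were used. */
static unsigned int demo_send(struct demo_ring *r, unsigned int budget)
{
	unsigned int space = demo_unused(r);
	unsigned int sent = 0;

	if (budget > space)
		budget = space; /* single check, outside the loop */

	while (sent < budget) {
		/* ... fill the descriptor at r->next_to_use here ... */
		r->next_to_use = (r->next_to_use + 1) % r->count;
		sent++;
	}
	return sent;
}
```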

Suggested-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:31:41 -07:00
Magnus Karlsson
4b5539c01d i40e: eliminate division in napi_poll data path
Eliminate a division in the napi_poll data path. This division is
executed even though it is only needed in the rare case when there are
not enough interrupt lines so they have to be shared between queue
pairs. Instead, just test for this case and only execute the division
if needed. The code has been lifted from the ice driver.
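
A minimal sketch of the pattern, using a hypothetical helper name and plain
ints rather than the driver's structures: the division only runs when several
ring pairs share one interrupt vector.

```c
/* Budget each ring may consume in one poll; divide only when sharing. */
static int demo_budget_per_ring(int budget, int num_ringpairs)
{
	if (num_ringpairs > 1) {
		int per_ring = budget / num_ringpairs;

		/* Never hand a ring a zero budget. */
		return per_ring > 0 ? per_ring : 1;
	}

	/* Common case: one ring pair per vector, no division needed. */
	return budget;
}
```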

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:27:11 -07:00
Magnus Karlsson
5574ff7b7b i40e: optimize AF_XDP Tx completion path
Improve the performance of the AF_XDP zero-copy Tx completion
path. When there are no XDP buffers being sent using XDP_TX or
XDP_REDIRECT, we do not have to go through the SW ring to clean up any
entries since the AF_XDP path does not use these. In these cases, just
fast forward the next-to-use counter and skip going through the SW
ring. The limit on the maximum number of entries to complete is also
removed since the algorithm is now O(1). To simplify the code path, the
maximum number of entries to complete for the XDP path is therefore
also increased from 256 to 512 (the default number of Tx HW
descriptors). This should be fine since the completion in the XDP path
is faster than in the SKB path that has 256 as the maximum number.

This patch provides around 4% throughput improvement for the l2fwd
application in xdpsock on my machine.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:24:14 -07:00
Wei Yongjun
753f3884f2 iavf: fix error return code in iavf_init_get_resources()
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
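
The bug class can be sketched as follows. The function names and the
failure-injection parameter are hypothetical and exist only so the error path
can be exercised; this is not the iavf code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator wrapper; 'fail' simulates memory exhaustion. */
static void *demo_alloc(size_t len, int fail)
{
	return fail ? NULL : malloc(len);
}

/* Returns 0 on success or a negative errno value on failure. */
static int demo_init_get_resources(int simulate_alloc_failure)
{
	int err = 0;
	void *buf;

	buf = demo_alloc(64, simulate_alloc_failure);
	if (!buf) {
		/* Omitting this assignment would return 0 on failure. */
		err = -ENOMEM;
		goto err_out;
	}

	/* ... use buf ... */
	free(buf);
	return 0;

err_out:
	return err;
}
```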

Fixes: b66c7bc1cd ("iavf: Refactor init state machine")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:18:54 -07:00
Arkadiusz Kubalewski
d5ec9e2ce4 i40e: Add support for a new feature Total Port Shutdown
After OS requests to down a link on a physical network port, the
traffic is no longer being processed but the physical link with
a link partner is still established.

Currently there is a feature (Link down on close) which allows
to physically bring the link down (after OS request).

With this patch new feature with similar capability is introduced:
TOTAL_PORT_SHUTDOWN
Allows to physically disable the link on the NIC's port.
If enabled, (after link down request from the OS)
no link, traffic or led activity is possible on that port.

If I40E_FLAG_TOTAL_PORT_SHUTDOWN is enabled, the
I40E_FLAG_LINK_DOWN_ON_CLOSE_ENABLED must be explicitly forced to
true and cannot be disabled at that time.
The functionalities are exclusive in terms of configuration, but
they also have similar behavior (allowing to disable physical link
of the port), with following differences:
- LINK_DOWN_ON_CLOSE_ENABLED is configurable at host OS run-time
  and is supported by whole family of 7xx Intel Ethernet Controllers
- TOTAL_PORT_SHUTDOWN may be enabled only before OS loads (in BIOS)
  only if motherboard's BIOS and NIC's FW has support of it
- when LINK_DOWN_ON_CLOSE_ENABLED is used, the link is being brought
  down by sending phy_type=0 to NIC's FW
- when TOTAL_PORT_SHUTDOWN is used, phy_type is not altered, instead
  the link is being brought down by clearing bit
  (I40E_AQ_PHY_ENABLE_LINK) in abilities field of
  i40e_aq_set_phy_config structure

Introduced changes:
- new private flag I40E_FLAG_TOTAL_PORT_SHUTDOWN for handling the
  feature
- probe of NVM if the feature was enabled at driver's port
  initialization
- special handling on link-down procedure to let FW physically
  shutdown the port if the feature was enabled

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:17:16 -07:00
Jeff Kirsher
5463fce643 ethernet/intel: Convert fallthrough code comments
Convert all the remaining 'fall through' code comments to the newer
'fallthrough;' keyword.
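
For illustration, a userspace sketch of the keyword in use. The macro
definition is an assumption modelled on how the kernel maps 'fallthrough' to
the compiler's fallthrough attribute when available, and demo_weight() is a
hypothetical function.

```c
/* Assumption: modelled on the kernel's pseudo-keyword definition. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

static int demo_weight(int level)
{
	int w = 0;

	switch (level) {
	case 2:
		w += 10;
		fallthrough; /* previously: a "fall through" comment */
	case 1:
		w += 1;
		break;
	default:
		break;
	}
	return w;
}
```

Unlike a comment, the statement form is checked by the compiler, so
-Wimplicit-fallthrough can flag any unannotated case.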

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 13:47:43 -07:00
Vaibhav Gupta
40c1b1ee55 natsemi: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable_device(), which is not recommended. Hence, they are removed.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
4c2ad1263b vxge: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Use "struct dev_pm_ops" variable to bind the callbacks.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
64120615d1 ksz884x: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable_wake(), pci_save/restore_state() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
0e3e206a3e mlx4: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Use "struct dev_pm_ops" variable to bind the callbacks.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
e9a7f8c586 benet: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_state() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
78cad4cec6 sundance: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_state() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
1c2e4839ec liquidio: use generic power management
Drivers should not use legacy power management as they have to manage power
states and related operations, for the device, themselves. This driver was
handling them with the help of PCI helper functions.

With generic PM, all essentials will be handled by the PCI core. Driver
needs to do only device-specific operations.

The driver defined empty-body .suspend() and .resume() callbacks earlier.
They can now be defined as NULL and bound with the "struct dev_pm_ops" variable.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
817a89ae10 ena_netdev: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
a7c48c7211 starfire: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_save/restore_state() and pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
33b7a252c8 ne2k-pci: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_state() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
7b46681cf4 typhoon: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

In this driver:
typhoon_resume() calls typhoon_wakeup() which then calls PCI helper
functions pci_set_power_state() and pci_restore_state(). The only other
function, using typhoon_wakeup() is typhoon_open().

Thus remove the pci_*() calls from typhoon_wakeup() and place them in
typhoon_open(), maintaining the order, to retain the normal behavior of
the function.

Now, typhoon_suspend() calls typhoon_sleep() which then calls PCI helper
functions pci_enable_wake(), pci_disable_device() and
pci_set_power_state(). Other functions:
 - typhoon_open()
 - typhoon_close()
 - typhoon_init_one()
also invoke typhoon_sleep(). Thus, in this case, the PCI helper function
calls cannot simply be moved.

Hence, define a new function typhoon_sleep_early() which will do all the
operations, which typhoon_sleep() was doing before calling PCI helper
functions. Now typhoon_sleep() will call typhoon_sleep_early() to do
those tasks, hence, the behavior for _open(), _close() and _init_one() remains
unchanged. And typhoon_suspend() only requires typhoon_sleep_early().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
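
As a rough illustration of the typhoon split described above, the sketch below models the idea in user space. All names (demo_nic, demo_sleep_early(), demo_sleep(), demo_nic_suspend()) are hypothetical stand-ins, and the counters merely stand in for device-specific work and explicit PCI helper calls; this is not the driver code.

```c
/* Hypothetical model: counters stand in for device and PCI side effects. */
struct demo_nic {
	int device_quiesced;	/* device-specific sleep steps performed */
	int pci_calls;		/* explicit PCI helper calls made by the driver */
};

/* Device-specific part only; this is all the suspend callback needs. */
static int demo_sleep_early(struct demo_nic *nic)
{
	nic->device_quiesced++;
	return 0;
}

/* Full sleep path, still used by open/close/init: early steps + PCI helpers. */
static int demo_sleep(struct demo_nic *nic)
{
	int err = demo_sleep_early(nic);

	if (err < 0)
		return err;
	nic->pci_calls++;	/* stands in for the explicit PCI helper calls */
	return 0;
}

/* Generic-PM style suspend callback: no explicit PCI helper calls here. */
static int demo_nic_suspend(struct demo_nic *nic)
{
	return demo_sleep_early(nic);
}
```

Keeping the full path as a thin wrapper around the early part means existing callers see the same behavior, while the suspend path drops only the PCI-specific steps.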
Hulk Robot
4f195d2803 qed: Make symbol 'qed_hw_err_type_descr' static
Fix sparse build warning:

drivers/net/ethernet/qlogic/qed/qed_main.c:2480:6: warning:
 symbol 'qed_hw_err_type_descr' was not declared. Should it be static?

Signed-off-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:46:11 -07:00
Luo bin
d3c54f7f18 hinic: fix passing non negative value to ERR_PTR
get_dev_cap and set_resources_state functions may return a positive
value because of hardware failure, and the positive return value
can not be passed to ERR_PTR directly.

Fixes: 7dd29ee128 ("hinic: add sriov feature support")
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:14:04 -07:00
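
Why a positive value must not reach ERR_PTR() can be seen in a small user-space sketch. The helpers below are simplified re-implementations that follow the kernel's convention of reserving the top 4095 addresses for encoded errors; hw_status_to_errno() and its choice of -EIO are assumptions made for illustration, not the hinic code.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified user-space stand-ins for the kernel's pointer-error helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Only the top MAX_ERRNO addresses are treated as encoded errors. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Hypothetical helper: map a return code that may be positive (a hardware
 * status) onto a negative errno before it is handed to ERR_PTR().
 */
static inline long hw_status_to_errno(long ret)
{
	if (ret > 0)
		return -EIO;	/* assumption: report hardware failure as -EIO */
	return ret;
}
```

With these definitions, ERR_PTR(5) yields a small non-NULL pointer that IS_ERR() does not recognize, so a caller would treat it as a valid object.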
Rahul Lakkireddy
696c278fdf cxgb4: add main VI to mirror VI config replication
When mirror VI is enabled, replicate various VI config params
enabled on main VI to mirror VI. These include replicating MTU,
promiscuous mode, all-multicast mode, and enabled netdev Rx
feature offloads.

v3:
- Replace mirror VI refcount_t with normal u32 variable.
- Add back calling cxgb4_port_mirror_start() in cxgb_open(), which
  was there in v1, but got missed in v2 during refactoring.

v2:
- Simplify the replication code by refactoring t4_set_rxmode()
  to handle mirror VI, instead of duplicating the t4_set_rxmode()
  calls in multiple places.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:34:34 -07:00
Rahul Lakkireddy
2b465ed00f cxgb4: add support for mirror Rxqs
When mirror VI is enabled, allocate the mirror Rxqs and setup the
mirror VI RSS table. The mirror Rxqs are allocated/freed when
the mirror VI is created/destroyed or when underlying port is
brought up/down, respectively.

v3:
- Replace mirror VI refcount_t with normal u32 variable.

v2:
- Use mutex to protect all mirror VI data, instead of just
  mirror Rxqs.
- Remove the un-needed mirror Rxq mutex.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:34:34 -07:00
Rahul Lakkireddy
fd2261d8ed cxgb4: add mirror action to TC-MATCHALL offload
Add mirror Virtual Interface (VI) support to receive all ingress
mirror traffic from the underlying device. The mirror VI is
created dynamically, if the TC-MATCHALL rule has a corresponding
mirror action. Also request MSI-X vectors needed for the mirror VI
Rxqs. If no vectors are available, then disable mirror VI support.

v3:
- Replace mirror VI refcount_t with normal u32 variable.

v2:
- Add mutex to protect all mirror VI data, instead of just
  mirror Rxqs.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:34:34 -07:00
Nathan Chancellor
75603a3112 pcnet32: Mark PM functions as __maybe_unused
In certain configurations without power management support, the
following warnings happen:

../drivers/net/ethernet/amd/pcnet32.c:2928:12: warning:
'pcnet32_pm_resume' defined but not used [-Wunused-function]
 2928 | static int pcnet32_pm_resume(struct device *device_d)
      |            ^~~~~~~~~~~~~~~~~
../drivers/net/ethernet/amd/pcnet32.c:2916:12: warning:
'pcnet32_pm_suspend' defined but not used [-Wunused-function]
 2916 | static int pcnet32_pm_suspend(struct device *device_d)
      |            ^~~~~~~~~~~~~~~~~~

Mark these functions as __maybe_unused to make it clear to the compiler
that this is going to happen based on the configuration, which is the
standard for these types of functions.

Fixes: a86688fbef ("pcnet32: Convert to generic power management")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:17:54 -07:00
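
The warning pattern above can be reproduced outside the kernel. The following is a minimal sketch, assuming a GCC-compatible compiler; DEMO_PM_SLEEP, struct demo_pm_ops and the demo_pm_*() callbacks are hypothetical stand-ins for the configuration option, struct dev_pm_ops and the driver callbacks.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's __maybe_unused annotation. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* Hypothetical, simplified stand-in for struct dev_pm_ops. */
struct demo_pm_ops {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
};

/*
 * Without the annotation, building with DEMO_PM_SLEEP undefined leaves
 * these static functions unreferenced and triggers -Wunused-function.
 */
static int __maybe_unused demo_pm_suspend(void *dev)
{
	(void)dev;
	return 0;
}

static int __maybe_unused demo_pm_resume(void *dev)
{
	(void)dev;
	return 0;
}

#ifdef DEMO_PM_SLEEP
/* With sleep support enabled, the callbacks are wired into the ops table. */
static const struct demo_pm_ops demo_pm_ops = {
	.suspend = demo_pm_suspend,
	.resume = demo_pm_resume,
};
#else
/* Without it, the table is empty and nothing references the callbacks. */
static const struct demo_pm_ops demo_pm_ops = { NULL, NULL };
#endif
```

The annotation keeps the function bodies compiled and type-checked in every configuration, which is the usual argument for it over wrapping the functions in #ifdef.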
Nathan Chancellor
0adcd2981d amd8111e: Mark PM functions as __maybe_unused
In certain configurations without power management support, the
following warnings happen:

../drivers/net/ethernet/amd/amd8111e.c:1623:12: warning:
'amd8111e_resume' defined but not used [-Wunused-function]
 1623 | static int amd8111e_resume(struct device *dev_d)
      |            ^~~~~~~~~~~~~~~
../drivers/net/ethernet/amd/amd8111e.c:1584:12: warning:
'amd8111e_suspend' defined but not used [-Wunused-function]
 1584 | static int amd8111e_suspend(struct device *dev_d)
      |            ^~~~~~~~~~~~~~~~

Mark these functions as __maybe_unused to make it clear to the compiler
that this is going to happen based on the configuration, which is the
standard for these types of functions.

Fixes: 2caf751fe0 ("amd8111e: Convert to generic power management")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:17:53 -07:00
Bartosz Golaszewski
9ed0a3fac0 net: ethernet: mtk-star-emac: use devm_of_mdiobus_register()
Shrink the code by using the managed variant of of_mdiobus_register().

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:57:34 -07:00
Bartosz Golaszewski
ac3a68d566 net: phy: don't abuse devres in devm_mdiobus_register()
We currently have two managed helpers for mdiobus - devm_mdiobus_alloc()
and devm_mdiobus_register(). The idea behind devres is that the release
callback releases whatever resource the devm function allocates. In the
mdiobus case however there's no devres associated with the device by
devm_mdiobus_register(). Instead the release callback for
devm_mdiobus_alloc(): _devm_mdiobus_free() unregisters the device if
it is marked as managed.

This all seems wrong. The managed structure shouldn't need to know or
care about whether it's managed or not - and this is the case now for
struct mii_bus. The devres wrapper should be opaque to the managed
resource.

This changeset makes devm_mdiobus_alloc() and devm_mdiobus_register()
conform to common devres standards: devm_mdiobus_alloc() allocates a
devres structure and registers a callback that will call mdiobus_free().
__devm_mdiobus_register() allocates another devres and registers a
callback that will unregister the bus.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:57:34 -07:00
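
The devres convention the commit above relies on (one release callback per managed action, with the managed object unaware of being managed) can be sketched in user space as follows. This is a toy model under stated assumptions: demo_device, demo_devres_add(), demo_devres_release_all() and demo_bus are hypothetical names and not the kernel devres API.

```c
#include <stdlib.h>

/* Hypothetical, minimal model of a devres list; not the kernel API. */
struct demo_devres {
	void (*release)(void *data);
	void *data;
	struct demo_devres *next;
};

struct demo_device {
	struct demo_devres *head;	/* most recently added entry first */
};

/* Record a release callback; returns 0 on success, -1 on allocation failure. */
static int demo_devres_add(struct demo_device *dev,
			   void (*release)(void *data), void *data)
{
	struct demo_devres *dr = malloc(sizeof(*dr));

	if (!dr)
		return -1;
	dr->release = release;
	dr->data = data;
	dr->next = dev->head;
	dev->head = dr;
	return 0;
}

/* Run every callback, newest first, as on detach or probe failure. */
static void demo_devres_release_all(struct demo_device *dev)
{
	while (dev->head) {
		struct demo_devres *dr = dev->head;

		dev->head = dr->next;
		dr->release(dr->data);
		free(dr);
	}
}

/* Hypothetical bus object: it carries no "am I managed?" flag. */
struct demo_bus {
	int registered;
	int freed;
};

static void demo_bus_unregister(void *data)
{
	((struct demo_bus *)data)->registered = 0;
}

static void demo_bus_free(void *data)
{
	((struct demo_bus *)data)->freed = 1;
}
```

Registering the "free" callback at allocation time and the "unregister" callback at registration time means teardown unregisters the bus before freeing it, without the bus structure tracking anything itself.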
Bartosz Golaszewski
d10d607f50 net: ethernet: ixgbe: don't call devm_mdiobus_free()
The idea behind devres is that the release callbacks are called if
probe fails. As we now check the return value of ixgbe_mii_bus_init(),
we can drop the call to devm_mdiobus_free() in the error path as the release
callback will be called automatically.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:57:34 -07:00
Bartosz Golaszewski
09ef193fef net: ethernet: ixgbe: check the return value of ixgbe_mii_bus_init()
This function may fail. Check its return value and propagate the error
code.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:57:34 -07:00
Nirranjan Kirubaharan
e0cdac65ba cxgb4vf: configure ports accessible by the VF
Find ports accessible by the VF, based on the index of the
mac address stored for the VF in the adapter. If no mac address
is stored for the VF, use the port mask provided by firmware.

Signed-off-by: Nirranjan Kirubaharan <nirranjan@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:54:25 -07:00
Li Heng
8a259e6b73 net: cxgb4: fix return error value in t4_prep_fw
t4_prep_fw jumps to the bye label with a positive return value when
something goes wrong, so adap_init0 cannot free its resources.
Fix it to return a negative value.

Fixes: 16e47624e7 ("cxgb4: Add new scheme to update T4/T5 firmware")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Li Heng <liheng40@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:53:25 -07:00
Alexander Lobakin
c4fad2a532 net: qede: update copyright years
Set the actual copyright holder and years in all qede source files.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:51:40 -07:00
Alexander Lobakin
7268f33e55 net: qede: convert to SPDX License Identifiers
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Remove all the boilerplates in the existing code and replace it with the
correct SPDX tag.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:51:40 -07:00
Alexander Lobakin
090efe00ab net: qede: correct existing SPDX tags
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Correct already existing but wrong SPDX tags to match the actual
license.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:51:40 -07:00
Alexander Lobakin
663eacd899 net: qed: update copyright years
Set the actual copyright holder and years in all qed source files.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:51:40 -07:00
Alexander Lobakin
1f4d4ed6ac net: qed: convert to SPDX License Identifiers
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Remove all the boilerplates in the existing code and replace it with the
correct SPDX tag.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:51:39 -07:00
Alexander Lobakin
ab81e23cf7 net: qed: correct existing SPDX tags
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Correct already existing but wrong SPDX tags to match the actual
license.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:51:39 -07:00
Colin Ian King
5831b33362 net/mlx5e: fix memory leak of tls
The error return path when create_singlethread_workqueue fails currently
does not kfree tls and leads to a memory leak. Fix this by kfree'ing
tls before returning -ENOMEM.

Addresses-Coverity: ("Resource leak")
Fixes: 1182f36593 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:38:47 -07:00
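
The shape of the leak fix above can be sketched in user space. The names below (demo_tls, demo_wq, demo_create_wq(), demo_tls_init()) are hypothetical, and malloc/free stand in for the kernel allocators; the point is only the error path that releases the outer object before returning -ENOMEM.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the TLS context and its workqueue. */
struct demo_wq { int unused; };
struct demo_tls { struct demo_wq *rx_wq; };

/* Stub allocator: fails when asked to, to exercise the error path. */
static struct demo_wq *demo_create_wq(int should_fail)
{
	return should_fail ? NULL : malloc(sizeof(struct demo_wq));
}

static int demo_tls_init(struct demo_tls **out, int wq_should_fail)
{
	struct demo_tls *tls = calloc(1, sizeof(*tls));

	if (!tls)
		return -ENOMEM;

	tls->rx_wq = demo_create_wq(wq_should_fail);
	if (!tls->rx_wq) {
		free(tls);	/* release tls before bailing out */
		return -ENOMEM;
	}

	*out = tls;
	return 0;
}

static void demo_tls_cleanup(struct demo_tls *tls)
{
	free(tls->rx_wq);
	free(tls);
}
```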
Edward Cree
c72ae701ee sfc: don't call tx_remove if there isn't one
EF100 won't have an efx->type->tx_remove method, because there's
 nothing for it to do.  So make the call conditional.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
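
A conditional method call of the kind described above might look like the sketch below. The structures and function names are hypothetical, trimmed-down stand-ins for the per-NIC-type method table, not the sfc definitions.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down model of a per-NIC-type method table. */
struct demo_tx_queue { int removed; };

struct demo_nic_type {
	void (*tx_remove)(struct demo_tx_queue *txq);	/* may be NULL */
};

static void demo_legacy_tx_remove(struct demo_tx_queue *txq)
{
	txq->removed = 1;
}

/* Common code: only dispatch when the NIC type provides the method. */
static void demo_remove_tx_queue(const struct demo_nic_type *type,
				 struct demo_tx_queue *txq)
{
	if (type->tx_remove)
		type->tx_remove(txq);
}
```

The alternative would be an empty stub per NIC type; the NULL check keeps the method table free of do-nothing functions.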
Edward Cree
f07cb4128a sfc: commonise initialisation of efx->vport_id
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
d4adc5162b sfc: commonise efx->[rt]xq_entries initialisation
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
937aa3ae4d sfc: initialise max_[tx_]channels in efx_init_channels()
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
20e1026cbe sfc: move definition of EFX_MC_STATS_GENERATION_INVALID
Saves a whole #include from nic.c.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
e7a256858f sfc: factor out efx_tx_tso_header_length() and understand encapsulation
ef100 will need to check this against NIC limits.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
93841000ed sfc: remove duplicate declaration of efx_enqueue_skb_tso()
Define it in nic_common.h, even though the ef100 driver will have a
 different implementation backing it (actually a WARN_ON_ONCE as it
 should never get called by ef100.  But it needs to still exist because
 common TX path code references it).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
740acc15c8 sfc: commonise TSO fallback code
ef100 will need this if it gets GSO skbs it can't handle (e.g. too long
 header length).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
80a0074e6a sfc: commonise efx_sync_rx_buffer()
The ef100 RX path will also need to DMA-sync RX buffers.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
f7e55550a3 sfc: commonise some MAC configuration code
Refactor it a little as we go, and introduce efx_mcdi_set_mtu() which we
 will later use for ef100 to change MTU without touching other MAC settings.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
2d73515a1c sfc: commonise miscellaneous efx functions
Various left-over bits and pieces from efx.c that are needed by ef100.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
2c6c1e3cfd sfc: add missing licence info to mcdi_filters.c
Both the licence notice and the SPDX tag were missing from this file.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
272e53aa5c sfc: commonise MCDI MAC stats handling
Most of it was already declared in mcdi_port_common.h, so just move the
 implementations to mcdi_port_common.c.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Edward Cree
83d00531cb sfc: move NIC-specific mcdi_port declarations out of common header
These functions are implemented in mcdi_port.c, which will not be linked
 into the EF100 driver; thus their prototypes should not be visible in
 common header files.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:09:09 -07:00
Russell King
f2ca673d2c net: mvneta: fix use of state->speed
When support for short preambles was added, it incorrectly keyed its
decision off state->speed instead of state->interface.  state->speed
is not guaranteed to be correct for in-band modes, which can lead to
short preambles being unexpectedly disabled.

Fix this by keying off the interface mode, which is the only way that
mvneta can operate at 2.5Gbps.

Fixes: da58a931f2 ("net: mvneta: Add support for 2500Mbps SGMII")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:01:12 -07:00
Luo bin
9d9f95a940 hinic: remove unused but set variable
remove unused but set variable to avoid auto build test WARNING

Signed-off-by: Luo bin <luobin9@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 12:42:07 -07:00
David S. Miller
e25974ae9d Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-06-29

This series contains updates to only the igc driver.

Sasha added Energy Efficient Ethernet (EEE) support and Latency Tolerance
Reporting (LTR) support for the igc driver. Added Low Power Idle (LPI)
counters and cleaned up unused TCP segmentation counters. Removed
igc_power_down_link() and call igc_power_down_phy_copper_base()
directly. Removed unneeded copper media check.

Andre cleaned up timestamping by removing un-supported features and
duplicate code for i225. Fixed the timestamp check on the proper flag
instead of the skb for pending transmit timestamps. Refactored
igc_ptp_set_timestamp_mode() to simplify the flow.

v2: Removed the log message in patch 1 as suggested by David Miller.
    Note: The locking issue Jakub Kicinski saw in patch 5, currently
    exists in the current net-next tree, so Andre will resolve the
    locking issue in a follow-on patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 12:34:35 -07:00
Sasha Neftin
f637471d33 igc: Remove checking media type during MAC initialization
The i225 device supports only copper mode.
There is no point to check media type in the
igc_config_fc_after_link_up() method.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:22:03 -07:00
Sasha Neftin
2b374e3738 igc: Remove unneeded check for copper media type
The PHY of the i225 device supports only copper mode.
There is no point to check media type in the
igc_power_up_link() method.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:56 -07:00
Sasha Neftin
a0beb3c1b1 igc: Refactor the igc_power_down_link()
Currently, igc_power_down_link() does nothing more than call
igc_power_down_phy_copper_base().
We can just call igc_power_down_phy_copper_base() directly.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:53 -07:00
Sasha Neftin
725fa16d36 igc: Remove TCP segmentation TX fail counter
The TCP segmentation TX context fail counter is not
applicable to i225 devices.
This patch removes this counter.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:48 -07:00
Sasha Neftin
900d1e8b34 igc: Add LPI counters
Add EEE TX LPI and EEE RX LPI counters. An EEE TX LPI event
occurs when the transmitter enters the EEE (IEEE 802.3az) LPI
state. An EEE RX LPI event occurs when the receiver detects
link partner entry into the EEE (IEEE 802.3az) LPI state.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:41 -07:00
Andre Guedes
1cbedabffd igc: Fix Rx timestamp disabling
When Rx timestamping is enabled, we set the timestamp bit in SRRCTL
register for each queue, but we don't clear it when disabling. This
patch fixes igc_ptp_disable_rx_timestamp() accordingly.

Also, this patch gets rid of igc_ptp_enable_tstamp_rxqueue() and
igc_ptp_enable_tstamp_all_rxqueues() and moves their logic into
igc_ptp_enable_rx_timestamp() to keep the enable and disable
helpers symmetric.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:38 -07:00
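
The symmetric enable/disable pair described above can be illustrated with a user-space sketch. The register array, queue count and bit position below are assumptions made for illustration (the real SRRCTL layout is defined by the hardware documentation and driver headers), and the demo_*() helpers are hypothetical.

```c
#include <stdint.h>

#define DEMO_NUM_RX_QUEUES	4
/* Hypothetical bit position, chosen only for this example. */
#define DEMO_SRRCTL_TIMESTAMP	(1u << 30)

/* Stand-in for the per-queue SRRCTL registers. */
static uint32_t demo_srrctl[DEMO_NUM_RX_QUEUES];

/* Set the timestamp bit on every queue, preserving the other bits. */
static void demo_enable_rx_timestamp(void)
{
	for (int i = 0; i < DEMO_NUM_RX_QUEUES; i++)
		demo_srrctl[i] |= DEMO_SRRCTL_TIMESTAMP;
}

/* Mirror image of the enable helper: clear the bit on every queue. */
static void demo_disable_rx_timestamp(void)
{
	for (int i = 0; i < DEMO_NUM_RX_QUEUES; i++)
		demo_srrctl[i] &= ~DEMO_SRRCTL_TIMESTAMP;
}
```

Keeping both helpers as read-modify-write loops over the same queues makes it easy to check by inspection that disabling undoes exactly what enabling did.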
Andre Guedes
3df7fd799b igc: Refactor igc_ptp_set_timestamp_mode()
Current igc_ptp_set_timestamp_mode() logic is a bit tangled since it
handles many different hardware configurations in one single place,
making it harder to follow. This patch untangles that code by breaking
it into helper functions.

Quick note about the hw->mac.type check which was removed in this
refactoring: this check is not really needed since igc_i225 is the only
type supported by the IGC driver.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:35 -07:00
Andre Guedes
3b44d4c10c igc: Remove UDP filter setup in PTP code
As implemented in igc_ethtool_get_ts_info(), igc only supports
HWTSTAMP_FILTER_ALL so any HWTSTAMP_FILTER_* option the user may set
falls back to HWTSTAMP_FILTER_ALL.

HWTSTAMP_FILTER_ALL is implemented via Rx Time Sync Control (TSYNCRXCTL)
configuration which timestamps all incoming packets. Configuring a
UDP filter, in addition to TSYNCRXCTL, doesn't add much so this patch
removes that code. It also takes this opportunity to remove some
non-applicable comments.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:29 -07:00
Andre Guedes
1801f8d929 igc: Check __IGC_PTP_TX_IN_PROGRESS instead of ptp_tx_skb
The __IGC_PTP_TX_IN_PROGRESS flag indicates we have a pending Tx
timestamp. In some places, instead of checking that flag, we check
adapter->ptp_tx_skb. This patch fixes those places to use the flag.

Quick note about igc_ptp_tx_hwtstamp() change: when that function is
called, adapter->ptp_tx_skb is expected to be valid always so we
WARN_ON_ONCE() in case it is not.

Quick note about igc_ptp_suspend() change: when suspending, we don't
really need to check if there is a pending timestamp. We can simply
clear it unconditionally.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:24 -07:00
Andre Guedes
29b821fe97 igc: Remove duplicate code in Tx timestamp handling
The functions igc_ptp_tx_hang() and igc_ptp_tx_work() have duplicate
code which handles Tx timestamp timeouts. This patch does a trivial
refactoring by moving that code to its own function and reusing it.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:19 -07:00
Andre Guedes
3a66abe903 igc: Clean up Rx timestamping logic
Unlike I210, I225 doesn't report Rx timestamps via the TS bit
Rx descriptor + RXSTMPL/RXSTMPH registers mechanism. Rx timestamps are
reported in the packet buffer only, which is implemented by
igc_ptp_rx_pktstamp(). So this patch removes igc_ptp_rx_rgtstamp() and
all code related to it, copied from igb driver.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:16 -07:00
Sasha Neftin
707abf0695 igc: Add initial LTR support
The LTR message on PCIe informs the downstream PCIe port of the
system of the requested latency within which PCIe must become
active.
This patch provides the LTR parameters recommended by the i225
specification.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 18:21:09 -07:00
Amit Cohen
60f30cd6c2 mlxsw: spectrum_ethtool: Add link extended state
Implement .get_down_ext_state() as part of ethtool_ops.
Query link down reason from PDDR register and convert it to ethtool
link_ext_state.

In case that more information than common link_ext_state is provided,
fill link_ext_substate also with the appropriate value.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:45:02 -07:00
Amit Cohen
1bd06938df mlxsw: reg: Port Diagnostics Database Register
The PDDR register allows reading the PHY debug database.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:45:02 -07:00
Amit Cohen
2be5c8a963 mlxsw: spectrum_ethtool: Move mlxsw_sp_port_type_speed_ops structs
Move mlxsw_sp1_port_type_speed_ops and mlxsw_sp2_port_type_speed_ops
with the relevant code from spectrum.c to spectrum_ethtool.c.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:45:02 -07:00
Amit Cohen
614d509aa1 mlxsw: Move ethtool_ops to spectrum_ethtool.c
Add spectrum_ethtool.c file for ethtool code.
Move ethtool_ops and the relevant code from spectrum.c to
spectrum_ethtool.c.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:45:02 -07:00
Amit Cohen
a2af44b64c mlxsw: spectrum_dcb: Rename mlxsw_sp_port_headroom_set()
mlxsw_sp_port_headroom_set() is defined twice - in spectrum.c and in
spectrum_dcb.c, with different arguments and different implementations
but the same name.

Rename mlxsw_sp_port_headroom_set() to mlxsw_sp_port_headroom_ets_set()
in order to allow using the second function in several files, and not
only as static function in spectrum.c.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:45:02 -07:00
Sasha Neftin
93ec439abe igc: Add initial EEE support
IEEE802.3az-2010 Energy Efficient Ethernet has been
approved as standard (September 2010) and the driver
can enable and disable it via ethtool.
Disable the feature by default on parts which support it.
Add enable/disable eee options.
tx-lpi, tx-timer and advertise are not supported yet.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-29 17:43:38 -07:00
Ioana Ciornei
4c96c0ac16 dpaa2-eth: add software counter for Tx frames converted to S/G
With the previous commit, in case of insufficient SKB headroom on the Tx
path, instead of reallocating the SKB we now send an S/G frame descriptor.
Export the number of occurrences of this case as a per-CPU counter (in
debugfs) and a total number in the ethtool statistics - 'tx converted sg
frames'.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:42:48 -07:00
Ioana Ciornei
d70446ee1f dpaa2-eth: send a scatter-gather FD instead of realloc-ing
Instead of realloc-ing the skb on the Tx path when the provided headroom
is smaller than the HW requirements, create a Scatter/Gather frame
descriptor with only one entry.

Remove the '[drv] tx realloc frames' counter exposed previously through
ethtool since it is no longer used.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:42:48 -07:00
Edward Cree
4d9c0a2d64 sfc: extend common GRO interface to support CHECKSUM_COMPLETE
EF100 will use CHECKSUM_COMPLETE, but will also make use of
 efx_rx_packet_gro(), thus needs to be able to pass the checksum value
 into that function.
Drivers for older NICs pass in a csum of 0 to get the old semantics (use
 the RX flags for CHECKSUM_UNNECESSARY marking).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:49 -07:00
Edward Cree
28abe8251b sfc: commonise ARFS handling
EF100 will use the same approach to ARFS as EF10.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:49 -07:00
Edward Cree
850b722756 sfc: commonise drain event handling
Avoids a call from generic MCDI code into ef10.c.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:49 -07:00
Edward Cree
21ea21252e sfc: commonise PCI error handlers
EF100 will use the same mechanisms for PCI error recovery.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:49 -07:00
Edward Cree
66a65128d4 sfc: track which BAR is mapped
EF100 needs to map multiple BARs (sequentially, not concurrently) in
 order to read the Function Control Window during probe.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:49 -07:00
Edward Cree
53e1f21abd sfc: commonise FC advertising
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
5671dd5565 sfc: commonise other ethtool bits
A few more ethtool handlers which EF100 will share.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
cdec457b7a sfc: commonise ethtool NFC and RXFH/RSS functions
EF100 will share EF10's model of filtering, hashing and spreading.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
bdccfd2d4e sfc: commonise ethtool link handling functions
Link speeds, FEC, and autonegotiation are all things EF100 will share.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
9043f48fd3 sfc: split up nic.h
The new nic_common.h contains the inlines for NIC-type function dispatch,
 declarations for NIC-generic functions in nic.c, and other similar
 NIC-generic functionality.  Retained in nic.h are NIC-specific declarations
 such as the siena and ef10 nic_data structs and various farch functions.

The EF100 driver will thus include nic_common.h but not nic.h.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
d3142c193d sfc: refactor EF10 stats handling
Separate the generation-count handling from the format conversion, to
 make it easier to re-use both for EF100.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
de5f32e2b6 sfc: don't try to create more channels than we can have VIs
Calculate efx->max_vis at probe time, and check against it in
 efx_allocate_msix_channels() when considering whether to create XDP TX
 channels.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
08f9912ef0 sfc: extend bitfield macros up to POPULATE_DWORD_13
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
6d9b5dcd29 sfc: determine flag word automatically in efx_has_cap()
Now that we have an _OFST definition for each individual flag bit,
 callers of efx_has_cap() don't need to specify which flag word it's
 in; we can just use the flag name directly in MCDI_CAPABILITY_OFST.
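
A minimal user-space sketch of the idea, not the sfc macros themselves: the
names CAP_*_OFST, CAP_*_LBN, cap_word() and has_cap() are hypothetical. Once
every flag carries both the byte offset of its containing word and its bit
number, a token-pasting macro can pick the right word from the flag name alone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag definitions: _OFST is the byte offset of the 32-bit
 * flag word holding the flag, _LBN is the bit number inside that word. */
#define CAP_FOO_OFST 0
#define CAP_FOO_LBN  3
#define CAP_BAR_OFST 4
#define CAP_BAR_LBN  1

/* Read one host-endian 32-bit word from the capability buffer. */
static inline uint32_t cap_word(const uint8_t *caps, size_t ofst)
{
	uint32_t w;

	memcpy(&w, caps + ofst, sizeof(w));
	return w;
}

/* The caller names only the flag; the word is derived from its _OFST. */
#define has_cap(caps, flag) \
	((cap_word((caps), CAP_##flag##_OFST) >> CAP_##flag##_LBN) & 1u)
```

With this shape, a caller writes has_cap(buf, FOO) and never mentions which
flag word FOO lives in.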

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Edward Cree
0dc95084c3 sfc: update MCDI protocol headers
The script used to generate these now includes _OFST definitions for
 flags, to identify the containing flag word.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:37:48 -07:00
Po Liu
5f035af76e net:qos: police action offloading parameter 'burst' change to the original value
Since 'tcfp_burst' is stored with the TICK factor applied, drivers always
need to convert it back to the original value. This patch moves that
generic calculation out of the drivers, so that 'burst' is restored to its
original value before it is offloaded to the device driver.
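
A rough sketch of what such a recovery could look like, under a loudly stated
assumption: that the stored burst is the time, in nanoseconds, needed to send
the burst at the configured rate. The helper name and the constant below are
hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/* Assumption: burst_ns is a transmission time and rate_bytes_ps is the
 * policing rate in bytes per second. The burst in bytes is then
 * rate * time, with the time converted from nanoseconds to seconds. */
static inline uint32_t burst_bytes_sketch(uint64_t burst_ns,
					  uint64_t rate_bytes_ps)
{
	return (uint32_t)(burst_ns * rate_bytes_ps / NSEC_PER_SEC_SKETCH);
}
```

Doing this once before offload means each driver receives a byte count and
does not have to repeat the conversion.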

Signed-off-by: Po Liu <po.liu@nxp.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:33:42 -07:00
David S. Miller
1078029172 mlx5-tls-2020-06-26
Merge tag 'mlx5-tls-2020-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-tls-2020-06-26

1) Improve hardware layouts and structure for kTLS support

2) Generalize ICOSQ (Internal Channel Operations Send Queue)
Due to the asynchronous nature of adding new kTLS flows and handling
HW asynchronous kTLS resync requests, the XSK ICOSQ was extended to
support generic async operations, such as kTLS add flow and resync, in
addition to the existing XSK usages.

3) kTLS hardware flow steering and classification:
The driver already has the means to classify TCP ipv4/6 flows and send them
to the corresponding RSS HW engine. As reflected in patches 3 through 5,
the series adds a steering layer that hooks into the driver's TCP
classifiers and matches on well-known kTLS connections. On a match, traffic
is redirected to the kTLS decryption engine; otherwise, traffic continues
to flow normally to the TCP RSS engine.

4) kTLS add flow RX HW offload support
New offload contexts post their static/progress params WQEs
(Work Queue Element) to communicate the newly added kTLS contexts
over the per-channel async ICOSQ.

The Channel/RQ is selected according to the socket's rxq index.

A new TLS-RX workqueue is used to allow asynchronous addition of
steering rules, out of the NAPI context.
It will also be used in a downstream patch in the resync procedure.

Feature is OFF by default. Can be turned on by:
$ ethtool -K <if> tls-hw-rx-offload on

5) Added mlx5 kTLS sw stats and new counters are documented in
Documentation/networking/tls-offload.rst
rx_tls_ctx - number of TLS RX HW offload contexts added to device for
decryption.

rx_tls_ooo - number of RX packets which were part of a TLS stream
but did not arrive in the expected order and triggered the resync
procedure.

rx_tls_del - number of TLS RX HW offload contexts deleted from device
(connection has finished).

rx_tls_err - number of RX packets which were part of a TLS stream
but were not decrypted due to unexpected error in the state machine.

6) Asynchronous RX resync

a. The NIC driver indicates that it would like to resync on some TLS
record within the received packet (P), but the driver does not
know (yet) which of the TLS records within the packet.
At this stage, the NIC driver will query the device to find the exact
TCP sequence for resync (tcpsn), however, the driver does not wait
for the device to provide the response.

b. Eventually, the device responds, and the driver provides the tcpsn
within the resync packet to kTLS. Now, kTLS can check the tcpsn against
any processed TLS records within packet P, and also against any record
that is processed in the future within packet P.

The asynchronous resync path simplifies the device driver, as it can
save bits on the packet completion (32-bit TCP sequence), and pass this
information on an asynchronous command instead.
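
The two-step flow in (a) and (b) can be sketched as a small state machine.
This is an illustrative user-space model only: the struct and every function
name are hypothetical, and nothing here reflects the real kTLS or mlx5 APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection resync state. */
struct resync_sketch {
	bool query_pending;  /* step a: query sent, answer not yet received */
	bool tcpsn_valid;    /* step b: the device has reported a sequence */
	uint32_t tcpsn;      /* TCP sequence number to resync on */
};

/* Step a: ask the device and return at once, without waiting. */
static inline void resync_request_sketch(struct resync_sketch *rs)
{
	rs->query_pending = true;
	rs->tcpsn_valid = false;
}

/* Step b: the device answers some time later. */
static inline void resync_response_sketch(struct resync_sketch *rs,
					  uint32_t tcpsn)
{
	if (!rs->query_pending)
		return;  /* unexpected answer, ignore it */
	rs->query_pending = false;
	rs->tcpsn_valid = true;
	rs->tcpsn = tcpsn;
}

/* Checked for each TLS record start; true when it matches the tcpsn. */
static inline bool resync_match_sketch(const struct resync_sketch *rs,
				       uint32_t record_seq)
{
	return rs->tcpsn_valid && rs->tcpsn == record_seq;
}
```

The point of the split is that the request path never blocks; matching only
starts to succeed once the response has arrived.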

Performance:
    CPU: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz, 24 cores, HT off
    NIC: ConnectX-6 Dx 100GbE dual port

    Goodput (app-layer throughput) comparison:
    +---------------+-------+-------+---------+
    | # connections |   1   |   4   |    8    |
    +---------------+-------+-------+---------+
    | SW (Gbps)     |  7.26 | 24.70 |   50.30 |
    +---------------+-------+-------+---------+
    | HW (Gbps)     | 18.50 | 64.30 |   92.90 |
    +---------------+-------+-------+---------+
    | Speedup       | 2.55x | 2.56x | 1.85x * |
    +---------------+-------+-------+---------+

    * After linerate is reached, diff is observed in CPU util
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:18:40 -07:00
Grygorii Strashko
38389aa6ba net: ethernet: ti: am65-cpsw-nuss: enable am65x sr2.0 support
The AM65x SR2.0 MCU CPSW has fixed errata i2027 "CPSW: CPSW Does Not
Support CPPI Receive Checksum (Host to Ethernet) Offload Feature". This
erratum is also fixed for the J721E SoC.

Use SOC bus data for K3 SoC identification and apply i2027 errata w/a only
for the AM65x SR1.0 SoC.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:06:19 -07:00
Grygorii Strashko
3d0fda901c net: ethernet: ti: am65-cpsw-ethtool: configured critical setting only when no running netdevs
Ensure that critical setting can only be configured when there are no
running netdevs - all ports are down.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:06:19 -07:00
Grygorii Strashko
7d58d3ebe4 net: ethernet: ti: am65-cpsw-ethtool: skip hw cfg when change p0-rx-ptype-rrobin
Skip HW configuration when p0-rx-ptype-rrobin is changed, as it will be done
by .ndo_open().

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:06:19 -07:00
Grygorii Strashko
d6d0aeafb3 net: ethernet: ti: am65-cpsw-nuss: fix ports mac sl initialization
The MAC SL has to be initialized for each port; otherwise
am65_cpsw_nuss_slave_disable_unused() will crash for disabled ports.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:06:19 -07:00
Grygorii Strashko
5182404806 net: ethernet: ti: am65-cpsw: move to pf_p0_rx_ptype_rrobin init in probe
pf_p0_rx_ptype_rrobin is a global parameter, so move its initialization
into probe.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:06:19 -07:00
Grygorii Strashko
7bcffde021 net: ethernet: ti: am65-cpsw-nuss: restore vlan configuration while down/up
The vlan configuration is not restored after interface down/up sequence.

Steps to check:
 # ip link add link eth0 name eth0.100 type vlan id 100
 # ifconfig eth0 down
 # ifconfig eth0 up

This patch fixes it, restoring vlan ALE entries on .ndo_open().

Fixes: 93a7653031 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:06:18 -07:00
Andrzej Pietrasiewicz
5d7bd8aa7c thermal: Simplify or eliminate unnecessary set_mode() methods
Setting polling_delay is now done at thermal_core level (by not polling
DISABLED devices), so no need to repeat this code.

int340x: Checking for an impossible enum value is unnecessary.
acpi/thermal: It only prints debug messages.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-11-andrzej.p@collabora.com
2020-06-29 20:26:39 +02:00
Andrzej Pietrasiewicz
bbcf90c064 thermal: Explicitly enable non-changing thermal zone devices
Some thermal zone devices never change their state, so they should
always be enabled.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-9-andrzej.p@collabora.com
2020-06-29 20:26:37 +02:00
Andrzej Pietrasiewicz
7f4957be0d thermal: Use mode helpers in drivers
Use thermal_zone_device_{en|dis}able() and thermal_zone_device_is_enabled().

Consequently, all set_mode() implementations in drivers:

- can stop modifying tzd's "mode" member,
- shall stop taking tzd's lock, as it is taken in the helpers
- shall stop calling thermal_zone_device_update() as it is called in the
helpers
- can assume they are called when the mode truly changes, so checks to
verify that can be dropped

Not providing set_mode() by a driver no longer prevents the core from
being able to set tzd's mode, so the relevant check in mode_store() is
removed.

Other comments:

- acpi/thermal.c: tz->thermal_zone->mode will be updated only after we
return from set_mode(), so use function parameter in thermal_set_mode()
instead, no need to call acpi_thermal_check() in set_mode()
- thermal/imx_thermal.c: regmap writes and mode assignment are done in
thermal_zone_device_{en|dis}able() and set_mode() callback
- thermal/intel/intel_quark_dts_thermal.c: soc_dts_{en|dis}able() are a
part of set_mode() callback, so they don't need to modify tzd->mode, and
don't need to fall back to the opposite mode if unsuccessful, as the return
value will be propagated to thermal_zone_device_{en|dis}able() and
ultimately tzd's member will not be changed in thermal_zone_device_set_mode().
- thermal/of-thermal.c: no need to set zone->mode to DISABLED in
of_parse_thermal_zones() as a tzd is kzalloc'ed so mode is DISABLED anyway
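
The division of labour described above can be sketched in user space. This is
a simplified model under stated assumptions, not the thermal core code: all
names ending in _sketch are hypothetical, and an integer stands in for the
zone's lock.

```c
#include <assert.h>
#include <stddef.h>

enum mode_sketch { MODE_DISABLED = 0, MODE_ENABLED };

struct zone_sketch;

struct zone_ops_sketch {
	/* Optional driver hook; invoked only when the mode really changes. */
	int (*set_mode)(struct zone_sketch *z, enum mode_sketch mode);
};

struct zone_sketch {
	enum mode_sketch mode;              /* a zeroed zone starts DISABLED */
	const struct zone_ops_sketch *ops;
	int locked;                         /* stand-in for the zone's lock */
};

/* Helper owned by the core: it takes the lock, filters out no-op requests,
 * and commits the new mode only if the driver hook succeeded. */
static inline int zone_set_mode_sketch(struct zone_sketch *z,
				       enum mode_sketch mode)
{
	int ret = 0;

	z->locked++;                        /* lock taken here, not in drivers */
	if (mode != z->mode) {
		if (z->ops && z->ops->set_mode)
			ret = z->ops->set_mode(z, mode);
		if (!ret)
			z->mode = mode;     /* unchanged when the hook fails */
	}
	z->locked--;
	return ret;                         /* the hook's error is propagated */
}

/* A driver hook that always fails, used to show the error path. */
static inline int failing_set_mode_sketch(struct zone_sketch *z,
					  enum mode_sketch mode)
{
	(void)z;
	(void)mode;
	return -1;
}

static const struct zone_ops_sketch failing_ops_sketch = {
	.set_mode = failing_set_mode_sketch,
};
```

A zone without a set_mode hook can still be switched, and a failing hook
leaves the stored mode untouched, which is why drivers no longer need their
own fallback logic.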

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-8-andrzej.p@collabora.com
2020-06-29 20:26:36 +02:00
Andrzej Pietrasiewicz
1ee14820fd thermal: remove get_mode() operation of drivers
get_mode() is now redundant, as the state is stored in struct
thermal_zone_device.

Consequently the "mode" attribute in sysfs can always be visible, because
it is always possible to get the mode from struct tzd.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-6-andrzej.p@collabora.com
2020-06-29 20:26:35 +02:00
Andrzej Pietrasiewicz
5a3506657f thermal: Store device mode in struct thermal_zone_device
Prepare for eliminating get_mode().

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-5-andrzej.p@collabora.com
2020-06-29 20:26:34 +02:00
Geliang Tang
b8483ecaf7 liquidio: use list_empty_careful in lio_list_delete_head
Use list_empty_careful() instead of open-coding.
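
For readers unfamiliar with the helper, here is a user-space sketch of the
check it performs and of a delete-head routine built on it. The real helper
lives in <linux/list.h>; the struct and the *_sketch functions below are
simplified stand-ins written for illustration, not the liquidio code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

static inline void init_list_head_sketch(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline void list_add_tail_sketch(struct list_head *n,
					struct list_head *head)
{
	n->next = head;
	n->prev = head->prev;
	head->prev->next = n;
	head->prev = n;
}

/* "Careful" emptiness test: both links must point back at the head. */
static inline bool list_empty_careful_sketch(const struct list_head *head)
{
	const struct list_head *next = head->next;

	return next == head && next == head->prev;
}

/* Unlink and return the first node, or NULL when the list is empty. */
static inline struct list_head *delete_head_sketch(struct list_head *root)
{
	struct list_head *node;

	if (list_empty_careful_sketch(root))
		return NULL;
	node = root->next;
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node;
	node->prev = node;
	return node;
}
```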

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 21:46:33 -07:00
Armin Wolf
ac6a86a539 8390: Fix coding-style issues
Fix some coding-style issues, including one which
made the function pointers in the struct ei_device
hard to understand.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 21:44:49 -07:00
Vladimir Oltean
836e0e5558 net: mscc: ocelot: remove EXPORT_SYMBOL from ocelot_net.c
Now that all net_device operations are bundled together inside
mscc_ocelot.ko and no longer part of the common library, there's no
reason to export these symbols.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 21:40:21 -07:00
Heiner Kallweit
cdafdc29ef r8169: sync support for RTL8401 with vendor driver
So far RTL8401 was treated like an RTL8101e, meaning we relied on the BIOS
to configure MAC and PHY properly. Make RTL8401 a separate chip version
and copy MAC / PHY config from r8101 vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:56:38 -07:00
Heiner Kallweit
93c09ca6b1 r8169: merge handling of RTL8101e and RTL8100e
Chip versions 13, 14, 15 are treated the same by the driver, therefore
let's merge them.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:56:38 -07:00
Luc Van Oostenryck
2a78478439 cxgb4vf: fix t4vf_eth_xmit()'s return type
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.
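
The shape of the fix can be sketched in user space. The types below are
simplified stand-ins declared only so the example compiles outside the kernel
(the real ones come from <linux/netdevice.h>), and example_xmit() is a
hypothetical transmit routine, not code from this driver.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types. */
enum netdev_tx { NETDEV_TX_OK = 0x00, NETDEV_TX_BUSY = 0x10 };
typedef enum netdev_tx netdev_tx_t;

struct sk_buff;
struct net_device;

struct net_device_ops {
	netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb,
				      struct net_device *dev);
};

/* Declared with the enum typedef rather than 'int', so the definition
 * matches the callback's prototype exactly. */
static inline netdev_tx_t example_xmit(struct sk_buff *skb,
				       struct net_device *dev)
{
	(void)skb;
	(void)dev;
	return NETDEV_TX_OK;
}

static const struct net_device_ops example_ops = {
	.ndo_start_xmit = example_xmit,
};
```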

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:52:53 -07:00
Luc Van Oostenryck
673d8eb6cf net: dwc-xlgmac: fix xlgmac_xmit()'s return type
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:52:53 -07:00
Luc Van Oostenryck
4e516a35eb net: pch_gbe: fix pch_gbe_xmit_frame()'s return type
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:52:53 -07:00
Luc Van Oostenryck
737ce1e986 net: nfp: fix nfp_net_tx()'s return type
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:52:53 -07:00
Luc Van Oostenryck
f649c35551 net: nb8800: fix nb8800_xmit()'s return type
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:52:53 -07:00
Luc Van Oostenryck
de37b0a58a net: arc_emac: fix arc_emac_tx()'s return type
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:52:53 -07:00
Luc Van Oostenryck
92c5e11507 net: aquantia: fix aq_ndev_start_xmit()'s return type
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:52:53 -07:00
Luo bin
2ac84cd160 hinic: add support to get eeprom information
add support to get eeprom information from the plug-in module
with ethtool -m cmd.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:40:58 -07:00
Luo bin
07afcc7ab4 hinic: add support to identify physical device
add support to identify physical device by flashing an LED
attached to it with ethtool -p cmd.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:40:58 -07:00
Luo bin
4aa218a4fe hinic: add self test support
add support to execute internal and external loopback tests with
ethtool -t cmd.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:40:58 -07:00
Luo bin
a0337c0dee hinic: add support to set and get irq coalesce
add support to set TX/RX irq coalesce params with ethtool -C and
get these params with ethtool -c.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:40:58 -07:00
Luo bin
ea256222a4 hinic: add support to set and get pause params
add support to set pause params with ethtool -A and get pause
params with ethtool -a. Also remove set_link_ksettings ops for VF
and enable pause by default.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:40:58 -07:00
Tariq Toukan
a29074367b net/mlx5e: kTLS, Improve rx handler function call
Prior to this patch, the mlx5e tls rx handler was called unconditionally on
all rx frames, and the decision whether a frame is a valid tls record
was made inside that function.  A function call can be expensive, especially
at regular rx packet rates.  To avoid this, check the tls validity before
jumping into the tls rx handler.
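
The pattern can be sketched as a cheap inline test placed in front of the
out-of-line handler. Every name below is hypothetical, and the single flags
field stands in for whatever the completion entry really carries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical completion entry; non-zero flags mark a TLS frame. */
struct cqe_sketch {
	unsigned int tls_flags;
};

/* Counts how often the expensive path is entered. */
static int slow_path_calls_sketch;

/* Out-of-line handler: only worth calling for TLS frames. */
static void tls_rx_slow_path_sketch(const struct cqe_sketch *cqe)
{
	(void)cqe;
	slow_path_calls_sketch++;
}

/* Cheap check that can be inlined into the hot receive path. */
static inline bool cqe_is_tls_sketch(const struct cqe_sketch *cqe)
{
	return cqe->tls_flags != 0;
}

/* Non-TLS frames never pay for the function call. */
static inline void handle_rx_sketch(const struct cqe_sketch *cqe)
{
	if (cqe_is_tls_sketch(cqe))
		tls_rx_slow_path_sketch(cqe);
}
```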

While at it, split between kTLS device offload rx handler and FPGA tls rx
handler using a similar method.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
2020-06-27 14:00:25 -07:00
Tariq Toukan
ed9a7c53b8 net/mlx5e: kTLS, Cleanup redundant capability check
All callers of mlx5e_ktls_build_netdev() check capability
before the call.
Remove the repeated check in the function.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:25 -07:00
Tariq Toukan
c5607360ec net/mlx5e: Increase Async ICO SQ size
Resync communication with HW for kTLS RX is done via the
async ICOSQs.
kTLS RX resync requests might come in bursts. To improve the
success chances for such bursts, use a larger ICOSQ.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:24 -07:00
Tariq Toukan
76c1e1ac2a net/mlx5e: kTLS, Add kTLS RX stats
Add global and per-channel ethtool SW stats for the device
offload.
Document the new counters in tls-offload.rst.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:23 -07:00
Tariq Toukan
0419d8c9d8 net/mlx5e: kTLS, Add kTLS RX resync support
Implement the RX resync procedure, using the TLS async resync API.

The HW offload of TLS decryption in RX side might get out-of-sync
due to out-of-order reception of packets.
This requires SW intervention to update the HW context and get it
back in-sync.

Performance:
CPU: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz, 24 cores, HT off
NIC: ConnectX-6 Dx 100GbE dual port

Goodput (app-layer throughput) comparison:
+---------------+-------+-------+---------+
| # connections |   1   |   4   |    8    |
+---------------+-------+-------+---------+
| SW (Gbps)     |  7.26 | 24.70 |   50.30 |
+---------------+-------+-------+---------+
| HW (Gbps)     | 18.50 | 64.30 |   92.90 |
+---------------+-------+-------+---------+
| Speedup       | 2.55x | 2.56x | 1.85x * |
+---------------+-------+-------+---------+

* After linerate is reached, diff is observed in CPU util.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:23 -07:00
Tariq Toukan
1182f36593 net/mlx5e: kTLS, Add kTLS RX HW offload support
Implement driver support for the kTLS RX HW offload feature.
Resync support is added in a downstream patch.

New offload contexts post their static/progress params WQEs
over the per-channel async ICOSQ, protected under a spin-lock.
The Channel/RQ is selected according to the socket's rxq index.

Feature is OFF by default. Can be turned on by:
$ ethtool -K <if> tls-hw-rx-offload on

A new TLS-RX workqueue is used to allow asynchronous addition of
steering rules, out of the NAPI context.
It will be also used in a downstream patch in the resync procedure.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:21 -07:00
Tariq Toukan
df8d866770 net/mlx5e: kTLS, Use kernel API to extract private offload context
Modify the implementation of the private kTLS TX HW offload context
getter and setter, so it uses the kernel API functions, instead of
a local shadow structure.
A single BUILD_BUG_ON check is sufficient, remove the duplicate.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:20 -07:00
Tariq Toukan
7d0d0d86ec net/mlx5e: kTLS, Improve TLS feature modularity
Better separate the code into c/h files, so that kTLS internals
are exposed to the corresponding non-accel flow as follows:
- Necessary datapath functions are exposed via ktls_txrx.h.
- Necessary caps and configuration functions are exposed via ktls.h,
  which became very small.

In addition, kTLS internal code sharing is done via ktls_utils.h,
which is not exposed to any non-accel file.

Add explicit WQE structures for the TLS static and progress
params, breaking the union of the static with UMR, and the progress
with PSV.

Generalize the API as a preparation for TLS RX offload support.

Move kTLS TX-specific code to the proper file.
Remove the inline tag for functions in C files, let the compiler decide.
Use kzalloc/kfree for the priv_tx context.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
2020-06-27 14:00:20 -07:00
Tariq Toukan
5229a96e59 net/mlx5e: Accel, Expose flow steering API for rules add/del
Given a socket, the function extracts the TCP/IP{4,6} ntuple
and adds a rule to steering.
Another function gets the rule and deletes it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
2020-06-27 14:00:19 -07:00
Boris Pismenny
c062d52ac2 net/mlx5e: Receive flow steering framework for accelerated TCP flows
The framework allows creating flow tables to steer incoming traffic of
TCP sockets to the acceleration TIRs.
This is used in downstream patches for TLS, and will be used in the
future for other offloads.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:18 -07:00
Saeed Mahameed
b8922a73ec net/mlx5e: API to manipulate TTC rules destinations
Store the default destinations of the on-load generated TTC
(Traffic Type Classifier) rules in the ttc rules table.

Introduce TTC API functions to manipulate/restore and get the TTC rule
destination and use these API functions in arfs implementation.

This will allow a better decoupling between TTC implementation and its
users.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
2020-06-27 14:00:18 -07:00
Tariq Toukan
c293ac927f net/mlx5e: Refactor build channel params
Take the CQ params into their respective RQ/SQ params.
Split the params build of the different ICOSQs (sync and async),
as they require different init values.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:17 -07:00
Tariq Toukan
8d94b590f1 net/mlx5e: Turn XSK ICOSQ into a general asynchronous one
There is an upcoming demand (in downstream patches) for
an ICOSQ to be populated out of the NAPI context, asynchronously.

There is already an existing one serving XSK-related use case.
In this patch, promote this ICOSQ to serve as general async ICOSQ,
to be used for XSK and non-XSK flows.

As part of this, the reg_umr bit of the SQ context is now set
(if capable), as the general async ICOSQ should support possible
posts of UMR WQEs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:16 -07:00
Saeed Mahameed
e396eccf0f Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: kTLS, Improve TLS params layout structures
  net/mlx5: Avoid eswitch header inclusion in fs core layer
  net/mlx5: Avoid RDMA file inclusion in core driver
  net/mlx5: Add support in query QP, CQ and MKEY segments
  net/mlx5: Export resource dump interface

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:13 -07:00
Tariq Toukan
2d1b69ed65 net/mlx5: kTLS, Improve TLS params layout structures
Add explicit WQE segment structures for the TLS static and progress
params.
According to the HW spec, TISN is not part of the progress params context,
so take it out.
Rename the control segment tisn field as it could hold either a TIS or
a TIR number.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 13:50:46 -07:00
Parav Pandit
188f0f988b net/mlx5: Avoid eswitch header inclusion in fs core layer
Flow steering core layer is independent of the eswitch layer.
Hence avoid fs_core dependency on eswitch.

Fixes: 328edb499f ("net/mlx5: Split FDB fast path prio to multiple namespaces")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 13:50:46 -07:00
Igor Russkikh
4378b882bf net: atlantic: put ptp code under IS_REACHABLE check
A1 requires additional processing for both egress and ingress to support
PTP.
And it makes sense to get rid of this processing altogether (via ifdef),
if PTP clock is disabled globally.

This patch puts the PTP code under the corresponding IS_REACHABLE check.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:51 -07:00
Mark Starovoytov
8664240e30 net: atlantic: add alignment checks in hw_atl2_utils_fw.c
This patch adds alignment checks in all the helper macros in
hw_atl2_utils_fw.c
These alignment checks are compile-time, so runtime is not affected.

All these helper macros assume the length to be aligned (multiple of 4).
If it's not aligned, then there might be issues, e.g. stack corruption.
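
As an illustration of the compile-time check described above, here is a
minimal userspace sketch. The macro name COPY_DWORDS and struct fw_request
are hypothetical and are not taken from hw_atl2_utils_fw.c; the sketch only
shows how a length that is not a multiple of 4 can be rejected at build time
with no runtime cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper macro: copy LEN bytes that are treated as 32-bit words.
 * The static assertion rejects, at compile time, any LEN that is not a
 * multiple of 4, so nothing is checked at runtime. */
#define COPY_DWORDS(dst, src, len) \
	do { \
		static_assert(((len) % 4) == 0, \
			      "length must be a multiple of 4"); \
		memcpy((dst), (src), (len)); \
	} while (0)

/* Illustrative 8-byte structure (two 32-bit words). */
struct fw_request {
	uint32_t opcode;
	uint32_t value;
};

/* Copy the request through the checked macro and return one field. */
static uint32_t read_value(const struct fw_request *req)
{
	struct fw_request copy;

	COPY_DWORDS(&copy, req, sizeof(copy));
	return copy.value;
}
```

Passing a structure whose size is not a multiple of 4 to COPY_DWORDS would
fail to compile instead of silently copying a partial word.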

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:51 -07:00
Dmitry Bezrukov
6ec99221d7 net: atlantic: missing space in a comment in aq_nic.h
This patch adds a missing space in the comment in aq_nic.h.

Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:50 -07:00
Mark Starovoytov
586616cbd4 net: atlantic: fix typo in aq_ring_tx_clean
This patch fixes a typo in aq_ring_tx_clean.
stats is a union, so the typo doesn't cause any issues, but it's a typo
nonetheless.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:50 -07:00
Mark Starovoytov
ab3518acac net: atlantic: make aq_pci_func_init static
This patch makes aq_pci_func_init() static, because it's not used anywhere
outside the file itself.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:50 -07:00
Mark Starovoytov
e35df21865 net: atlantic: Replace ENOTSUPP usage to EOPNOTSUPP
This patch replaces ENOTSUPP (where it was used by mistake) with
EOPNOTSUPP.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:50 -07:00
Nikita Danilov
e39b8ffeb9 net: atlantic: fix variable type in aq_ethtool_get_pauseparam
This patch fixes the type of a variable that is assigned from an enum;
as such, it should have been int, not u32.

Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:50 -07:00
Mark Starovoytov
3a8b445469 net: atlantic: MACSec offload statistics checkpatch fix
This patch fixes a checkpatch warning.

Fixes: aec0f1aac5 ("net: atlantic: MACSec offload statistics implementation")

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:32:50 -07:00
Jakub Kicinski
132db93572 docs: networking: reorganize driver documentation again
Organize driver documentation by device type. Most documents
have fairly verbose yet uninformative names, so let users
first select a well defined device type, and then search for
a particular driver.

While at it rename the section from Vendor drivers to
Hardware drivers. This seems more accurate, besides people
sometimes refer to out-of-tree drivers as vendor drivers.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 16:08:44 -07:00
Claudiu Manoil
0574e2000f enetc: Fix tx rings bitmap iteration range, irq handling
The rings bitmap of an interrupt vector encodes
which of the device's rings were assigned to that
interrupt vector.
Hence the iteration range of the tx rings bitmap
(for_each_set_bit()) should be the total number of
Tx rings of that netdevice instead of the number of
rings assigned to the interrupt vector.
Since there are 2 cores, and one interrupt vector for
each core, the number of rings assigned to an interrupt
vector is half the number of available rings.
The impact of this error is that the upper half of the
tx rings could still generate interrupts during napi
polling.
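
The effect of the iteration range can be sketched in plain C. This is a
userspace stand-in, not the driver code: count_rings() is a hypothetical
helper that mimics walking the set bits of a bitmap up to a given size, and
the ring numbers (8 Tx rings, with one vector owning rings 4 to 7) are
assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a for_each_set_bit() walk:
 * count the set bits of `map` in the range [0, nbits). */
static unsigned int count_rings(uint32_t map, unsigned int nbits)
{
	unsigned int bit, n = 0;

	for (bit = 0; bit < nbits && bit < 32; bit++)
		if (map & (UINT32_C(1) << bit))
			n++;
	return n;
}
```

With a bitmap of 0xF0 (rings 4 to 7), count_rings(0xF0, 4) visits no ring,
because the range stops at the per-vector ring count, while
count_rings(0xF0, 8) visits all four, because the range is the total number
of Tx rings.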

Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 15:02:30 -07:00
Daniel González Cabanelas
61b5cc20c8 net: mvneta: speed down the PHY, if WoL used, to save energy
Some PHYs connected to this ethernet hardware support the WoL feature.
But when WoL is enabled and the machine is powered off, the PHY remains
waiting for a magic packet at max speed (i.e. 1Gbps), which is a waste of
energy.

Slow down the PHY speed before stopping the ethernet if WoL is enabled,
and save some energy while the machine is powered off or sleeping.

Tested using an Armada 370 based board (LS421DE) equipped with a Marvell
88E1518 PHY.

Signed-off-by: Daniel González Cabanelas <dgcbueu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 13:36:33 -07:00
David S. Miller
b0f46a9754 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2020-06-25

This series contains updates to i40e driver and removes the individual
driver versions from all of the Intel wired LAN drivers.

Shiraz moves the client header so that it can easily be shared between
the i40e LAN driver and i40iw RDMA driver.

Jesse cleans up the unused defines, since they are just dead weight.

Alek reduces the unreasonably long wait time for a PF reset after reboot
by using jiffies to limit the maximum wait time for the PF reset to
succeed.  Added additional logging to let the user know when the driver
transitions into recovery mode.  Adds new device support for our 5 Gbps
NICs.

Todd adds a check to see if MFS is set after warm reboot and notifies
the user when MFS is set to anything lower than the default value.

Arkadiusz fixes a possible race condition in which we could sleep on a
mutex while holding a spin-lock, i.e. while in atomic context.

v2: removed code comments that were no longer applicable in patch 2 of
    the series.  Also removed 'inline' from patch 4 and patch 8 of the
    series.  Also re-arranged code to be able to remove the forward
    function declarations.  Dropped patch 9 of the series, while the
    author works on cleaning up the commit message.
v3: Updated patch 8 description to answer Jakub's questions
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 12:22:34 -07:00
Shannon Nelson
fa48494cce ionic: update the queue count on open
Let the network stack know the real number of queues that
we are using.

v2: added error checking

Fixes: 49d3b49367 ("ionic: disable the queues on link down")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 12:19:51 -07:00
Martin Blumenstingl
52660c0ec9 net: stmmac: dwmac-meson8b: use clk_parent_data for clock registration
Simplify meson8b_init_rgmii_tx_clk() by using struct clk_parent_data to
initialize the clock parents. No functional changes intended.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 12:17:29 -07:00
Vaibhav Gupta
4ced637bd2 bnx2x: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

The driver was also calling bnx2x_set_power_state() to set the power state
of the device by changing the values of the device's registers. This is no
longer needed.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Acked-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26 12:14:09 -07:00
Aleksandr Loktionov
37d318d780 i40e: Remove scheduling while atomic possibility
On some occasions a task held a spinlock (mac_filter_hash_lock) while
being rescheduled due to the admin queue mutex_lock: struct i40e_spinlock
asq_spinlock later expands to struct mutex spinlock.  Move
i40e_aq_set_vsi_multicast_promiscuous(),
i40e_aq_set_vsi_unicast_promiscuous(),
i40e_aq_set_vsi_mc_promisc_on_vlan(), and
i40e_aq_set_vsi_uc_promisc_on_vlan() outside of atomic context.  Without
this patch there is a race condition which might result in scheduling
while in atomic context.  The race is between a thread that holds
mac_filter_hash_lock while trying to acquire the admin queue mutex and a
thread that already holds that mutex.  The thread holding the spinlock
fails to acquire the mutex, which causes it to sleep.

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-25 22:37:03 -07:00
Aleksandr Loktionov
3dbdd6c2f7 i40e: Add support for 5Gbps cards
Make it possible for the i40e driver to bind to the new v710 for 5GBASE-T
NICs.

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-25 22:25:13 -07:00
Jeff Kirsher
34a2a3b83e net/intel: remove driver versions from Intel drivers
As with other networking drivers, remove the unnecessary driver version
from the Intel drivers. The ethtool driver information and module version
will then report the kernel version instead.

For ixgbe, i40e and ice drivers, the driver passes the driver version to
the firmware to confirm that we are up and running.  So we now pass the
value of UTS_RELEASE to the firmware.  This adminq call is required per
the HAS document.  The Device then sends an indication to the BMC that the
PF driver is present. This is done using Host NC Driver Status Indication
in NC-SI Get Link command or via the Host Network Controller Driver Status
Change AEN.

What the BMC may do with this information is implementation-dependent, but
this is a standard NC-SI 1.1 command we honor per the HAS.

CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Alek Loktionov <aleksandr.loktionov@intel.com>
CC: Kevin Liedtke <kevin.d.liedtke@intel.com>
CC: Aaron Rowden <aaron.f.rowden@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
2020-06-25 22:25:13 -07:00
Todd Fujinaka
3a2c6ced90 i40e: Add a check to see if MFS is set
A customer was chain-booting to provision his systems and one of the
steps was setting MFS. MFS isn't cleared by normal warm reboots
(clearing requires a GLOBR) and there was no indication of why Jumbo
Frame receives were failing.

Add a warning if MFS is set to anything lower than the default.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-25 22:25:13 -07:00
Piotr Kwapulinski
fffeeddfcf i40e: detect and log info about pre-recovery mode
Detect and log information about pre-recovery mode when firmware
transitions to recovery mode.
When firmware transitions to recovery mode, it stores the number
of unexpected EMP resets in one of its registers. An EMP reset count
in the range 0x21 to 0x2A indicates that the FW is transitioning
to recovery mode. Use these values to emit a log entry about the
transition process. Previously, pre-recovery mode might not have been
detected, and there was no log entry when the NIC was in pre-recovery mode.
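
The range test described above amounts to a simple bounds check. The sketch
below is a userspace illustration; the macro and function names are
hypothetical and only the 0x21 and 0x2A bounds come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bounds come from the commit message. */
#define EMP_RESETS_PRE_RECOVERY_MIN 0x21u
#define EMP_RESETS_PRE_RECOVERY_MAX 0x2Au

/* Return true if the EMP reset count indicates pre-recovery mode. */
static bool fw_in_pre_recovery(uint32_t emp_reset_count)
{
	return emp_reset_count >= EMP_RESETS_PRE_RECOVERY_MIN &&
	       emp_reset_count <= EMP_RESETS_PRE_RECOVERY_MAX;
}
```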

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-25 22:25:13 -07:00
Piotr Kwapulinski
91c534b5e3 i40e: make PF wait reset loop reliable
Use jiffies to limit the maximum waiting time for PF reset to succeed.
The previous wait loop was unreliable: it required an unreasonably long time
to wait for PF reset after reboot when the NIC was about to enter
recovery mode.
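
A deadline-bounded wait loop of the kind described above can be sketched in
userspace as follows. This is not the i40e code: the tick source is passed in
as a callback in place of jiffies, and all names (wait_for_reset, fake_now,
fake_done) are hypothetical. The point is that the loop ends once a fixed
amount of time has passed, regardless of how long each poll takes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t (*clock_fn)(void);  /* returns a monotonic tick count */
typedef bool (*done_fn)(void);       /* returns true once the reset finished */

/* Poll `done` until it succeeds or `timeout` ticks have elapsed.
 * Returns 0 on success and -1 on timeout. */
static int wait_for_reset(clock_fn now, done_fn done, uint64_t timeout)
{
	uint64_t deadline = now() + timeout;

	for (;;) {
		if (done())
			return 0;
		if (now() >= deadline)
			return -1;
	}
}

/* Fake clock and fake device, used to exercise the loop deterministically. */
static uint64_t fake_ticks;
static uint64_t ready_at;

static uint64_t fake_now(void)
{
	return fake_ticks++;  /* every read of the clock advances time */
}

static bool fake_done(void)
{
	return fake_ticks >= ready_at;
}
```

With ready_at set to 5 and a timeout of 100 ticks the call returns 0; with
ready_at set to 1000 and a timeout of 10 ticks it gives up and returns -1.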

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-25 22:25:13 -07:00
Jesse Brandeburg
3c98f9ee6b i40e: remove unused defines
Remove all the unused defines as they are just dead weight.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-25 22:25:13 -07:00
Shiraz Saleem
fe21b6c3a6 i40e: Move client header location
Move i40e_client.h to include/linux/net/intel/*
since it's shared between i40iw and i40e.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-25 22:25:13 -07:00
David S. Miller
7bed145516 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor overlapping changes in xfrm_device.c, between the double
ESP trailing bug fix setting the XFRM_INIT flag and the changes
in net-next preparing for bonding encryption support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 19:29:51 -07:00
Jason A. Donenfeld
93ab48a97a hns: do not cast return value of napi_gro_receive to null
Basically no drivers care about the return value here, and there's no
__must_check that would make casting to void sensible, so remove it.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:16:21 -07:00
Jason A. Donenfeld
e5e7d8052f socionext: account for napi_gro_receive never returning GRO_DROP
The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands.

Fixes: 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:16:21 -07:00
Ioana Ciornei
05e190467d dpaa2-eth: fix misspelled function parameters in dpni_[set/get]_taildrop
Two of the function parameters (qtype and index) were misspelled in the
associated descriptions of dpni_[set/get]_taildrop which led to sparse
warnings. Fix this by using the exact same names as present in the
function definition.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:03:39 -07:00
Ioana Ciornei
cef5820b7f dpaa2-eth: fix recursive header include
The dpaa2-eth.h header file includes dpaa2-eth-trace.h which includes
back dpaa2-eth leading to a recursion in the include path. Fix this by
removing the include of dpaa2-eth.h in the trace header.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:03:39 -07:00
Ioana Ciornei
0e5ad75b02 dpaa2-eth: fix condition for number of buffer acquire retries
We should keep retrying to acquire buffers through the software portals
as long as the function returns -EBUSY and the number of retries is
__below__ DPAA2_ETH_SWP_BUSY_RETRIES.
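
The intended loop condition can be sketched as follows. This is a userspace
illustration, not the dpaa2-eth code: SWP_BUSY_RETRIES stands in for
DPAA2_ETH_SWP_BUSY_RETRIES with an arbitrary value, and acquire_with_retries()
and fake_acquire() are hypothetical helpers.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative value; stands in for DPAA2_ETH_SWP_BUSY_RETRIES. */
#define SWP_BUSY_RETRIES 3

typedef int (*acquire_fn)(void);

/* Keep calling `acquire` while it reports -EBUSY and the number of
 * retries is still below the limit. */
static int acquire_with_retries(acquire_fn acquire)
{
	int retries = 0;
	int err;

	do {
		err = acquire();
	} while (err == -EBUSY && retries++ < SWP_BUSY_RETRIES);

	return err;
}

/* Fake portal: busy for the first `busy_for` calls, then returns 7
 * (a made-up count of acquired buffers). */
static int calls;
static int busy_for;

static int fake_acquire(void)
{
	return calls++ < busy_for ? -EBUSY : 7;
}
```

If the portal stays busy, the function is called at most
SWP_BUSY_RETRIES + 1 times and the last -EBUSY is returned to the caller.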

Fixes: ef17bd7cc0 ("dpaa2-eth: Avoid unbounded while loops")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:03:39 -07:00
Ioana Ciornei
37fbbdda63 dpaa2-eth: check the result of skb_to_sgvec()
Before passing the result of skb_to_sgvec() to dma_map_sg() check if any
error was returned.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:03:39 -07:00
Ioana Radulescu
0da1e28f97 dpaa2-eth: trim debugfs FQ stats
With the addition of multiple traffic classes support, the number
of available frame queues grew significantly, overly inflating the
debugfs FQ statistics entry. Update it to only show the queues
which are actually in use (i.e. have a non-zero frame counter).

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 16:03:39 -07:00
Claudiu Beznea
33fdef24c9 net: macb: free resources on failure path of at91ether_open()
DMA buffers were not freed on the failure path of at91ether_open().
Along with the changes for freeing the DMA buffers, the enable/disable
interrupt instructions were moved to the at91ether_start()/at91ether_stop()
functions, and the operations in at91ether_stop() were put in
reverse order (compared with how they are done in at91ether_start()).
Before this patch, the operation order on the interface open path
was as follows:
1/ alloc DMA buffers
2/ enable tx, rx
3/ enable interrupts
and the order on interface close path was as follows:
1/ disable tx, rx
2/ disable interrupts
3/ free dma buffers.
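
The failure-path cleanup described above follows the usual goto-unwind
pattern. The sketch below is a userspace analogy, not the macb driver: malloc
stands in for the DMA buffer allocation, and struct dev_state, dev_start() and
dev_open() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

struct dev_state {
	void *dma_buf;   /* stands in for the DMA buffers */
	int started;
	int fail_start;  /* test knob: force the start step to fail */
};

/* Enable tx/rx and interrupts; may fail. */
static int dev_start(struct dev_state *d)
{
	if (d->fail_start)
		return -1;
	d->started = 1;
	return 0;
}

/* Open path: allocate buffers, then start the device.
 * On failure, undo the steps already taken, in reverse order. */
static int dev_open(struct dev_state *d)
{
	int err;

	d->dma_buf = malloc(64);
	if (!d->dma_buf)
		return -1;

	err = dev_start(d);
	if (err)
		goto free_buf;

	return 0;

free_buf:
	free(d->dma_buf);
	d->dma_buf = NULL;
	return err;
}
```

When dev_start() fails, dev_open() returns the error with dma_buf already
released, so the caller has nothing left to clean up.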

Fixes: 7897b071ac ("net: macb: convert to phylink")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 15:59:23 -07:00
Claudiu Beznea
0eaf228d57 net: macb: call pm_runtime_put_sync on failure path
Call pm_runtime_put_sync() on failure path of at91ether_open.

Fixes: e6a41c23df ("net: macb: ensure interface is not suspended on at91rm9200")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-25 15:59:23 -07:00
Saeed Mahameed
efbb974d8e net/mlx5e: vxlan: Return bool instead of opaque ptr in port_lookup()
struct mlx5_vxlan_port is not exposed to the outside callers, it is
redundant to return a pointer to it from mlx5_vxlan_port_lookup(), to be
only used as a boolean, so just return a boolean.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-25 12:41:46 -07:00
Saeed Mahameed
7a64ca862a net/mlx5e: vxlan: Use RCU for vxlan table lookup
Remove the spinlock protecting the vxlan table and use RCU instead.
This will improve performance as it will eliminate contention on data
path cores.

Fixes: b3f63c3d5e ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
2020-06-25 12:41:43 -07:00
Vlad Buslov
185901ceeb net/mlx5e: Move TC-specific function definitions into MLX5_CLS_ACT
en_tc.h header file declares several TC-specific functions in
CONFIG_MLX5_ESWITCH block even though those functions are only compiled
when CONFIG_MLX5_CLS_ACT is set, which is a recent change. Move them to
proper block.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Maor Dickman <maord@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-25 12:41:40 -07:00
Alaa Hleihel
8fab0175aa net/mlx5e: Move including net/arp.h from en_rep.c to rep/neigh.c
After the cited commit, the header net/arp.h is no longer used in en_rep.c.
So, move it to the new file rep/neigh.c that uses it now.

Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-25 12:41:38 -07:00
Maxim Mikityanskiy
d39c9885b6 net/mlx5e: Remove unused mlx5e_xsk_first_unused_channel
mlx5e_xsk_first_unused_channel is a leftover from old versions of the
first XSK commit, and it was never used. Remove it.

Fixes: db05815b36 ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-25 12:41:35 -07:00
Denis Efremov
360000b26e net/mlx5: Use kfree(ft->g) in arfs_create_groups()
Use kfree() instead of kvfree() on ft->g in arfs_create_groups() because
the memory is allocated with kcalloc().

Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-25 12:41:32 -07:00
Hu Haowen
39797f1c53 net/mlx5: FWTrace: Add missing space
Missing space at the end of a comment line, add it.

Signed-off-by: Hu Haowen <xianfengting221@163.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-25 12:41:30 -07:00
Parav Pandit
04dfa7057b net/mlx5: Avoid eswitch header inclusion in fs core layer
Flow steering core layer is independent of the eswitch layer.
Hence avoid fs_core dependency on eswitch.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-25 12:41:28 -07:00
Po Liu
d621d7703d net: enetc add tc flower offload flow metering policing action
Flow metering entries in IEEE 802.1Qci are an optional function of the
flow filtering module. Flow metering is a two-rate, two-bucket,
three-color marker for policing frames. This patch only enables one rate
and one bucket, in color-blind mode. Flow metering instances are as
specified in the algorithm in MEF 10.3 and in Bandwidth Profile
Parameters. They are:

a) Flow meter instance identifier. An integer value identifying the flow
meter instance. The patch uses the police 'index' as this value.
b) Committed Information Rate (CIR), in bits per second. This patch uses
'rate_bytes_ps' to represent this value.
c) Committed Burst Size (CBS), in octets. This patch uses 'burst' to
represent this value.
d) Excess Information Rate (EIR), in bits per second.
e) Excess Burst Size per Bandwidth Profile Flow (EBS), in octets.
There are also some other parameters. This patch disables EIR/EBS by
default and uses color-blind mode.
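
To make the CIR/CBS parameters above concrete, here is a generic single-rate,
single-bucket, color-blind token bucket in userspace C. It is a textbook
sketch under stated assumptions, not the ENETC hardware algorithm or the
driver code: struct meter and the meter_*() functions are hypothetical, and
only the parameter names 'rate_bytes_ps' and 'burst' come from the commit
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct meter {
	uint64_t cir_bps;    /* committed information rate, bits per second */
	uint64_t cbs_bytes;  /* committed burst size, octets */
	uint64_t tokens;     /* current bucket level, octets */
};

/* rate_bytes_ps is in bytes per second, so CIR in bits per second is
 * eight times that value. The bucket starts full. */
static void meter_init(struct meter *m, uint64_t rate_bytes_ps, uint64_t burst)
{
	m->cir_bps = rate_bytes_ps * 8;
	m->cbs_bytes = burst;
	m->tokens = burst;
}

/* Add tokens for `elapsed_us` microseconds, capped at the burst size.
 * Kernel code on 32-bit targets would use div_u64() for the 64-bit
 * division; plain '/' is fine in userspace. Overflow for very large
 * inputs is ignored in this sketch. */
static void meter_refill(struct meter *m, uint64_t elapsed_us)
{
	uint64_t add = m->cir_bps / 8 * elapsed_us / 1000000;

	m->tokens += add;
	if (m->tokens > m->cbs_bytes)
		m->tokens = m->cbs_bytes;
}

/* Color-blind check: a frame conforms if enough tokens are available. */
static bool meter_conforms(struct meter *m, uint32_t frame_len)
{
	if (frame_len > m->tokens)
		return false;
	m->tokens -= frame_len;
	return true;
}
```

With a rate of 1000 bytes per second and a burst of 1500 octets, a first
1000-byte frame conforms, a second one sent immediately does not, and after
half a second of refill a 1000-byte frame conforms again.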

v1->v2 changes:
- Use div_u64() for division instead of '/', which triggered this report:

All errors (new ones prefixed by >>):

   ld: drivers/net/ethernet/freescale/enetc/enetc_qos.o: in function `enetc_flowmeter_hw_set':
>> enetc_qos.c:(.text+0x66): undefined reference to `__udivdi3'

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 22:05:01 -07:00
Po Liu
89d1f09669 net: enetc: add support max frame size for tc flower offload
Based on the tc flower offload police action, add max frame size filtering
via the parameter 'mtu'. A tc flower device driver working with the IEEE
802.1Qci stream filter can implement max frame size filtering. Add it to the
current hardware tc flower stream filter driver.

Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 22:04:26 -07:00
Doug Berger
20d1f2d1b0 net: bcmgenet: use hardware padding of runt frames
When commit 474ea9cafc ("net: bcmgenet: correctly pad short
packets") added the call to skb_padto() it should have been
located before the nr_frags parameter was read since that value
could be changed when padding packets with lengths between 55
and 59 bytes (inclusive).

The use of a stale nr_frags value can cause corruption of the
pad data when tx-scatter-gather is enabled. This corruption of
the pad can cause invalid checksum computation when hardware
offload of tx-checksum is also enabled.

Since the original reason for the padding was corrected by
commit 7dd399130e ("net: bcmgenet: fix skb_len in
bcmgenet_xmit_single()") we can remove the software padding all
together and make use of hardware padding of short frames as
long as the hardware also always appends the FCS value to the
frame.

Fixes: 474ea9cafc ("net: bcmgenet: correctly pad short packets")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 21:51:03 -07:00
Doug Berger
d966d2efb6 net: bcmgenet: use __be16 for htons(ETH_P_IP)
The 16-bit value that holds a short in network byte order should
be declared as a restricted big endian type to allow type checks
to succeed during assignment.

Fixes: 3e37095228 ("net: bcmgenet: add support for ethtool rxnfc flows")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 21:51:03 -07:00
Doug Berger
673bafd5b8 net: bcmgenet: re-remove bcmgenet_hfb_add_filter
This function was originally removed by Baoyou Xie in
commit e2072600a2 ("net: bcmgenet: remove unused function in
bcmgenet.c") to prevent a build warning.

Some of the functions removed by Baoyou Xie are now used for
WAKE_FILTER support so his commit was reverted, but this function
is still unused and the kbuild test robot dutifully reported the
warning.

This commit once again removes the remaining unused hfb functions.

Fixes: 14da1510fe ("Revert "net: bcmgenet: remove unused function in bcmgenet.c"")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 21:51:03 -07:00
Colin Ian King
a512438608 qed: add missing error test for DBG_STATUS_NO_MATCHING_FRAMING_MODE
The error DBG_STATUS_NO_MATCHING_FRAMING_MODE was added to the enum
dbg_status; however, there is a missing corresponding entry for
this in the array s_status_str. This causes an out-of-bounds read when
indexing into the last entry of s_status_str.  Fix this by adding in
the missing entry.
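
This class of bug, an enum that grows while its string table does not, can be
caught at build time. The sketch below is a shortened, hypothetical example:
the enum, the strings and status_to_str() are made up for illustration and
are not the qed definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, shortened status enum. */
enum example_status {
	STATUS_OK,
	STATUS_NO_MATCHING_FRAMING_MODE,
	MAX_STATUS
};

/* One string per status, in enum order. */
static const char * const status_str[] = {
	"Operation completed successfully",
	"No matching framing mode",
};

/* Fails to compile if a status is added without its string. */
static_assert(sizeof(status_str) / sizeof(status_str[0]) == MAX_STATUS,
	      "status_str must have one entry per status");

/* Bounds-checked lookup. */
static const char *status_to_str(enum example_status s)
{
	return (unsigned int)s < MAX_STATUS ? status_str[s] : "Invalid status";
}
```

Adding a new enumerator before MAX_STATUS without extending status_str would
now break the build rather than read past the end of the array.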

Addresses-Coverity: ("Out-of-bounds read").
Fixes: 2d22bc8354 ("qed: FW 8.42.2.0 debug features")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 14:57:24 -07:00
Sascha Hauer
41c2b6b4f0 net: ethernet: mvneta: Add back interface mode validation
When writing the serdes configuration register was moved to
mvneta_config_interface() the whole code block was removed from
mvneta_port_power_up() in the assumption that its only purpose was to
write the serdes configuration register. As mentioned by Russell King
its purpose was also to check for valid interface modes early so that
later in the driver we do not have to care for unexpected interface
modes.
Add back the test to let the driver bail out early on unhandled
interface modes.

Fixes: b4748553f5 ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 14:51:42 -07:00
Sascha Hauer
d3d239dcb8 net: ethernet: mvneta: Do not error out in non serdes modes
In mvneta_config_interface() the RGMII modes are caught by the default
case which is an error return. The RGMII modes are valid modes for the
driver, so instead of returning an error add a break statement to return
successfully.
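
The shape of the fix can be sketched in userspace C. The enum, the function
and the register values below are hypothetical placeholders, not the mvneta
definitions; the sketch only shows valid modes that need no serdes
programming getting their own case with a break, so that they no longer fall
into the error-returning default.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical interface modes. */
enum iface_mode {
	MODE_QSGMII,
	MODE_SGMII,
	MODE_RGMII,
	MODE_RGMII_ID,
	MODE_UNKNOWN
};

/* Pick a serdes configuration for the mode; values are placeholders. */
static int config_interface(enum iface_mode mode, unsigned int *serdes_cfg)
{
	switch (mode) {
	case MODE_QSGMII:
		*serdes_cfg = 1;
		break;
	case MODE_SGMII:
		*serdes_cfg = 2;
		break;
	case MODE_RGMII:
	case MODE_RGMII_ID:
		/* Valid modes, but no serdes programming is needed. */
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
```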

This avoids this warning for non comphy SoCs which use RGMII, like
SolidRun Clearfog:

WARNING: CPU: 0 PID: 268 at drivers/net/ethernet/marvell/mvneta.c:3512 mvneta_start_dev+0x220/0x23c

Fixes: b4748553f5 ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 14:51:42 -07:00
YueHaibing
147373d968 lan743x: Remove duplicated include from lan743x_main.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 14:49:38 -07:00
Rahul Lakkireddy
f35d2117e2 cxgb4: move device dump arrays in header to C file
Move all arrays related to device dump in header file to C file.
Also, move the function that shares the arrays to the same C file.

Fixes following warnings reported by make W=1 in several places:
cudbg_entity.h:513:18: warning: 't6_hma_ireg_array' defined but not
used [-Wunused-const-variable=]
  513 | static const u32 t6_hma_ireg_array[][IREG_NUM_ELEM] = {

Fixes: a7975a2f9a ("cxgb4: collect register dump")
Fixes: 17b332f480 ("cxgb4: add support to read serial flash")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:55:44 -07:00
Rahul Lakkireddy
5fff701c83 cxgb4: always sync access when flashing PHY firmware
Access to on-chip memory for flashing PHY firmware must always
be synchronized. So, ensure the callers take on-chip memory lock.

Also fixes following sparse warning:
sge.c:1641:26: warning: context imbalance in 't4_load_phy_fw' -
different lock contexts for basic block

Fixes: 01b6961410 ("cxgb4: Add PHY firmware support for T420-BT cards")
Fixes: 4ee339e1e9 ("cxgb4: add support to flash PHY image")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:55:44 -07:00
Jeremy Linton
0f183fd151 net/fsl: enable extended scanning in xgmac_mdio
Since we know the xgmac hardware always has a c45
compliant bus, let's try scanning for c22 capable
PHYs first. If we fail to find any, then it will
fall back to c45 automatically.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:35:15 -07:00
Calvin Johnson
229f4bb475 net/fsl: acpize xgmac_mdio
Add ACPI support for xgmac MDIO bus registration while maintaining
the existing DT support.

The function mdiobus_register(), called inside of_mdiobus_register(), brings
up all the PHYs on the mdio bus and attaches them to the bus.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:35:15 -07:00
Vaibhav Gupta
6c3cb945ed tulip: uli526x: use generic power management
With the support of generic PM callbacks, drivers no longer need to use
legacy .suspend() and .resume() in which they had to maintain PCI states
changes and device's power state themselves.

Legacy PM involves usage of PCI helper functions like pci_enable_wake()
which is no longer recommended.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:33:15 -07:00
Vaibhav Gupta
77eb16e9b2 tulip: tulip_core: use generic power management
With the support of generic PM callbacks, drivers no longer need to use
legacy .suspend() and .resume() in which they had to maintain PCI
states changes and device's power state themselves.

Earlier, .suspend() and .resume() were invoking pci_disable_device()
and pci_enable_device() respectively to manage the device's power state.
The driver also invoked pci_save/restore_state() and pci_set_power_state().
With generic PM, it is no longer needed. The driver is expected to just
implement driver-specific operations and leave power transitions to PCI
core.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:33:15 -07:00
Vaibhav Gupta
8cfa989ae3 tulip: de2104x: use generic power management
With the support of generic PM callbacks, drivers no longer need to use
legacy .suspend() and .resume() in which they had to maintain PCI states
changes and device's power state themselves.

Earlier, .suspend() and .resume() were invoking pci_disable_device()
and pci_enable_device() respectively to manage the device's power state.
With generic PM, it is no longer needed. The driver is expected to just
implement driver-specific operations and leave power transitions to PCI
core.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:33:15 -07:00
Vaibhav Gupta
fc9aebfbdb tulip: windbond-840: use generic power management
With stable support of generic PM callbacks, drivers no longer need to use
legacy .suspend() and .resume() in which they had to maintain PCI states
changes and device's power state themselves.

Earlier, .resume() was invoking pci_enable_device(). Drivers should not
call PCI legacy helper functions, hence, it was removed. This should not
change the behavior of the device as this function is called by PCI core
if somehow pm_ops is not able to bind with the driver, else, required tasks
are managed by the core itself.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:33:15 -07:00
Vaibhav Gupta
f906d0f9cd tulip: dmfe: use generic power management
With legacy PM hooks, it was the responsibility of a driver to manage PCI
states and also the device's power state. The generic approach is to let the
PCI core handle the work.

The legacy suspend() and resume() were making use of
pci_read/write_config_dword() to enable/disable wol. Driver editing
configuration registers of a device is not recommended. Thus replace them
all with device_wakeup_enable/disable().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:33:15 -07:00
Vaibhav Gupta
c6f0fb5dfe amd-xgbe: Convert to generic power management
Use dev_pm_ops structure to call generic suspend() and resume() callbacks.

Drivers should avoid saving device registers and/or changing power states
using PCI helper functions. With the generic approach, all of this is handled
by PCI core.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:31:18 -07:00
Vaibhav Gupta
2caf751fe0 amd8111e: Convert to generic power management
Drivers should not save device registers and/or change the power state of
the device. As per the generic PM approach, these are handled by PCI core.

The driver should support dev_pm_ops callbacks and bind them to pci_driver.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:31:18 -07:00
Vaibhav Gupta
a86688fbef pcnet32: Convert to generic power management
Remove legacy PM callbacks and use generic operations. With legacy code,
drivers were responsible for handling PCI PM operations like
pci_save_state(). In generic code, all these are handled by PCI core.

The generic suspend() and resume() are called at the same point the legacy
ones were called. Thus, it does not affect the normal functioning of the
driver.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:31:18 -07:00
Gaurav Singh
5777cbba79 xirc2ps_cs: remove dev null check from do_reset().
dev cannot be NULL here since it has already been dereferenced
earlier. Remove the redundant NULL check.

Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:18:16 -07:00
Vasundhara Volam
c55e28a8b4 bnxt_en: Read VPD info only for PFs
Virtual functions do not have VPD information. This patch calls
bnxt_read_vpd_info() only for PFs, which avoids an unnecessary
error log.

Fixes: a0d0fd70fe ("bnxt_en: Read partno and serialno of the board from VPD")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:13:58 -07:00
Michael Chan
c2dec363fe bnxt_en: Fix statistics counters issue during ifdown with older firmware.
On older firmware, the hardware statistics are not cleared when the
driver frees the hardware stats contexts during ifdown.  The driver
expects these stats to be cleared and saves a copy before freeing
the stats contexts.  During the next ifup, the driver will likely
allocate the same hardware stats contexts and this will cause a big
increase in the counters as the old counters are added back to the
saved counters.

We fix it by making an additional firmware call to clear the counters
before freeing the hw stats contexts when the firmware is the older
20.x firmware.

Fixes: b8875ca356 ("bnxt_en: Save ring statistics before reset.")
Reported-by: Jakub Kicinski <kicinski@fb.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Jakub Kicinski <kicinski@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:13:58 -07:00
Michael Chan
fed7edd181 bnxt_en: Do not enable legacy TX push on older firmware.
Older firmware may not support legacy TX push properly and may not
be disabling it.  So we check certain firmware versions that may
have this problem and disable legacy TX push unconditionally.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:13:58 -07:00
Michael Chan
d0ad2ea2bc bnxt_en: Store the running firmware version code.
We currently only store the firmware version as a string for ethtool
and devlink info.  Store it also as a version code.  The next 2
patches will need to check the firmware major version to determine
some workarounds.

We also use the 16-bit firmware version fields if the firmware is newer
and provides the 16-bit fields.
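
A minimal sketch of what a numeric version code buys over a string, assuming
the four 16-bit version fields are packed most-significant-first into one
64-bit value. The names fw_ver_code(), fw_ver_major() and fw_is_20x() and the
exact field layout are illustrative assumptions, not taken from the driver:

```c
#include <assert.h>	/* assert() is used in the usage note below */
#include <stdint.h>

/*
 * Hypothetical layout: major, minor, build and reserved/patch fields,
 * 16 bits each, packed so that a plain integer comparison orders
 * versions the same way a field-by-field comparison would.
 */
static uint64_t fw_ver_code(uint16_t maj, uint16_t min,
			    uint16_t bld, uint16_t rsv)
{
	return ((uint64_t)maj << 48) | ((uint64_t)min << 32) |
	       ((uint64_t)bld << 16) | (uint64_t)rsv;
}

/* Extract the major version from a packed code. */
static unsigned int fw_ver_major(uint64_t code)
{
	return (unsigned int)(code >> 48);
}

/* Hypothetical workaround gate: true only for 20.x firmware. */
static int fw_is_20x(uint64_t code)
{
	return fw_ver_major(code) == 20;
}
```

With such a code, a check like assert(fw_ver_code(20, 8, 0, 0) <
fw_ver_code(21, 0, 0, 0)) holds without any string parsing.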

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 20:13:58 -07:00
Heiner Kallweit
4640338c36 r8169: rename RTL8125 to RTL8125A
Realtek added new members to the RTL8125 chip family, therefore rename
RTL8125 to RTL8125A. This matches the chip naming used in the r8125
vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:25:16 -07:00
Jarod Wilson
bdfd2d1fa7 bonding/xfrm: use real_dev instead of slave_dev
Rather than requiring every hw crypto capable NIC driver to do a check for
slave_dev being set, set real_dev in the xfrm layer at xso init time, and
then override it in the bonding driver as needed. Then NIC drivers can
always use real_dev, and at the same time, we eliminate the use of a
variable name that probably shouldn't have been used in the first place,
particularly given recent current events.

CC: Boris Pismenny <borisp@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Leon Romanovsky <leon@kernel.org>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
Suggested-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:19:55 -07:00
Rahul Lakkireddy
20bb0c8f2c cxgb4vf: update kernel-doc line comments
Update several kernel-doc line comments to fix warnings reported by
make W=1.

Fixes following class of warnings reported by make W=1 in several
places:
cxgb4vf_main.c:275: warning: Function parameter or member 'persistent'
not described in 'cxgb4vf_change_mac'
cxgb4vf_main.c:275: warning: Excess function parameter 'persist'
description in 'cxgb4vf_change_mac'

Fixes: 16f8bd4be7 ("cxgb4vf: Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device communication code")
Fixes: c6e0d91464 ("cxgb4vf: Add T4 Virtual Function Scatter-Gather Engine DMA code")
Fixes: e0a8b34a9c ("cxgb4vf: Add and initialize some sge params for VF driver")
Fixes: c3168cabe1 ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities")
Fixes: 0e23daeb64 ("drivers/net: chelsio/cxgb*: Convert timers to use timer_setup()")
Fixes: 3f8cfd0d95 ("cxgb4/cxgb4vf: Program hash region for {t4/t4vf}_change_mac()")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:48 -07:00
Rahul Lakkireddy
29bbf5d7f5 cxgb4: update kernel-doc line comments
Update several kernel-doc line comments to fix warnings reported by
make W=1.

Fixes following class of warnings reported by make W=1 in several
places:
l2t.c:616: warning: Cannot understand  * @dev: net_device pointer
t4_hw.c:3175: warning: Function parameter or member 'adap' not
described in 't4_get_exprom_version'
t4_hw.c:3175: warning: Excess function parameter 'adapter' description
in 't4_get_exprom_version'

Fixes: 56d36be4dd ("cxgb4: Add HW and FW support code")
Fixes: fd3a47900b ("cxgb4: Add packet queues and packet DMA code")
Fixes: 26f7cbc0a5 ("cxgb4: Don't attempt to upgrade T4 firmware when cxgb4 will end up as a slave")
Fixes: 793dad94e7 ("RDMA/cxgb4: Fix bug for active and passive LE hash collision path")
Fixes: ba3f8cd55f ("cxgb4: Add support in cxgb4 to get expansion rom version via ethtool")
Fixes: f7502659ce ("cxgb4: Add API to alloc l2t entry; also update existing ones")
Fixes: ddc7740d9a ("cxgb4: Decode link down reason code obtained from firmware")
Fixes: 193c4c2845 ("cxgb4: Update T6 Buffer Group and Channel Mappings")
Fixes: 8f46d46715 ("cxgb4: Use Firmware params to get buffer-group map")
Fixes: a456950445 ("cxgb4: time stamping interface for PTP")
Fixes: 9c33e4208b ("cxgb4: Add PTP Hardware Clock (PHC) support")
Fixes: c3168cabe1 ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities")
Fixes: 5ccf9d0496 ("cxgb4: update API for TP indirect register access")
Fixes: 3bdb376e69 ("cxgb4: introduce SMT ops to prepare for SMAC rewrite support")
Fixes: 736c3b9447 ("cxgb4: collect egress and ingress SGE queue contexts")
Fixes: f56ec6766d ("cxgb4: Add support for ethtool i2c dump")
Fixes: 9d5fd927d2 ("cxgb4/cxgb4vf: add support for ndo_set_vf_vlan")
Fixes: 98f3697f8d ("cxgb4: add tc flower match support for tunnel VNI")
Fixes: 02d805dc5f ("cxgb4: use new fw interface to get the VIN and smt index")
Fixes: 3f8cfd0d95 ("cxgb4/cxgb4vf: Program hash region for {t4/t4vf}_change_mac()")
Fixes: d429005fdf ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Fixes: 0e395b3cb1 ("cxgb4: add FLOWC based QoS offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:48 -07:00
Rahul Lakkireddy
00e31cfc89 cxgb4: fix set but unused variable when DCB is disabled
Remove the set but unused variable when DCB is disabled. Instead,
do the calculation directly inline.

Fixes following warning in make W=1:
cxgb4_main.c: In function 'cfg_queues':
cxgb4_main.c:5380:29: warning: variable 'n1g' set but not used
[-Wunused-but-set-variable]
  u32 i, n10g = 0, qidx = 0, n1g = 0;
                             ^

Fixes: 116ca924ae ("cxgb4: fix checks for max queues to allocate")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:48 -07:00
Rahul Lakkireddy
bab3bcf3e9 cxgb4: move DCB version extern to header file
Move the DCB version string array extern to header file.

Fixes following sparse warning:
cxgb4_dcb.c:13:12: warning: symbol 'dcb_ver_array' was not declared.
Should it be static?

Fixes: ebddd97afb ("cxgb4: add support to display DCB info")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:48 -07:00
Rahul Lakkireddy
2f6670165d cxgb4: remove cast when saving IPv4 partial checksum
The checksum field in IPv4 header is in __sum16 and ip_fast_csum()
also returns __sum16. So, no need to cast it to u16.

Fixes following sparse warning:
sge.c:1539:47: warning: cast from restricted __sum16
sge.c:1539:44: warning: incorrect type in assignment (different base types)
sge.c:1539:44:    expected restricted __sum16 [usertype] check
sge.c:1539:44:    got unsigned short [usertype]

Fixes: d0a1299c6b ("cxgb4: add support for vxlan segmentation offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:48 -07:00
Rahul Lakkireddy
1992ded5d1 cxgb4: fix SGE queue dump destination buffer context
The data in the destination buffer is expected to be parsed in big
endian. So, use the right context.

Fixes following sparse warning:
cudbg_lib.c:2041:44: warning: incorrect type in assignment (different
base types)
cudbg_lib.c:2041:44:    expected unsigned long long [usertype]
cudbg_lib.c:2041:44:    got restricted __be64 [usertype]

Fixes: 736c3b9447 ("cxgb4: collect egress and ingress SGE queue contexts")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:48 -07:00
Rahul Lakkireddy
f286dd8eaa cxgb4: use correct type for all-mask IP address comparison
Use correct type to check for all-mask exact match IP addresses.

Fixes following sparse warnings due to big endian value checks
against 0xffffffff in is_addr_all_mask():
cxgb4_filter.c:977:25: warning: restricted __be32 degrades to integer
cxgb4_filter.c:983:37: warning: restricted __be32 degrades to integer
cxgb4_filter.c:984:37: warning: restricted __be32 degrades to integer
cxgb4_filter.c:985:37: warning: restricted __be32 degrades to integer
cxgb4_filter.c:986:37: warning: restricted __be32 degrades to integer
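
As a rough user-space illustration of the type-correct comparison, assuming a
plain typedef as a stand-in for the kernel's __be32 (sparse's "restricted"
checking is not reproduced here). The helper names are hypothetical. The
all-ones pattern is the same in either byte order, so the conversion changes
the type of the constant rather than the result:

```c
#include <arpa/inet.h>	/* htonl() */
#include <assert.h>	/* assert() for ad-hoc checks by the caller */
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for __be32: a 32-bit value kept in network byte order. */
typedef uint32_t be32_t;

/* Exact-match IPv4 mask: compare against a constant of the same type. */
static bool ipv4_is_all_mask(be32_t mask)
{
	return mask == htonl(0xffffffffu);
}

/* Exact-match IPv6 mask: all four 32-bit words must be all-ones. */
static bool ipv6_is_all_mask(const be32_t mask[4])
{
	for (int i = 0; i < 4; i++)
		if (mask[i] != htonl(0xffffffffu))
			return false;
	return true;
}
```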

Fixes: 3eb8b62d5a ("cxgb4: add support to create hash-filters via tc-flower offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:48 -07:00
Rahul Lakkireddy
63b53b0b99 cxgb4: fix endian conversions for L4 ports in filters
The source and destination L4 ports in filter offload need to be
in CPU endian. They will finally be converted to Big Endian after
all operations are done and before giving them to hardware. The
L4 ports for NAT are expected to be passed as a byte stream TCB.
So, treat them as such.

Fixes following sparse warnings in several places:
cxgb4_tc_flower.c:159:33: warning: cast from restricted __be16
cxgb4_tc_flower.c:159:33: warning: incorrect type in argument 1 (different
base types)
cxgb4_tc_flower.c:159:33:    expected unsigned short [usertype] val
cxgb4_tc_flower.c:159:33:    got restricted __be16 [usertype] dst

Fixes: dca4faeb81 ("cxgb4: Add LE hash collision bug fix path in LLD driver")
Fixes: 62488e4b53 ("cxgb4: add basic tc flower offload support")
Fixes: 557ccbf9df ("cxgb4: add tc flower support for L3/L4 rewrite")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:47 -07:00
Rahul Lakkireddy
27f78cb245 cxgb4: parse TC-U32 key values and masks natively
TC-U32 passes all key values and masks in __be32 format. The parser
already expects this, so pass the values and masks to the parser
natively as __be32.

Fixes following sparse warnings in several places:
cxgb4_tc_u32.c:57:21: warning: incorrect type in assignment (different base
types)
cxgb4_tc_u32.c:57:21:    expected unsigned int [usertype] val
cxgb4_tc_u32.c:57:21:    got restricted __be32 [usertype] val
cxgb4_tc_u32_parse.h:48:24: warning: cast to restricted __be32

Fixes: 2e8aad7bf2 ("cxgb4: add parser to translate u32 filters to internal spec")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:47 -07:00
Rahul Lakkireddy
589b1c9c16 cxgb4: use unaligned conversion for fetching timestamp
Use get_unaligned_be64() to fetch the timestamp needed for ns_to_ktime()
conversion.
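
For readers unfamiliar with the helper, the sketch below shows the general
idea of an unaligned big-endian load in portable C: assemble the value byte by
byte instead of casting the pointer. load_be64_unaligned() is a hypothetical
user-space name, not the kernel implementation:

```c
#include <assert.h>	/* assert() for ad-hoc checks by the caller */
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 64-bit big-endian value from an arbitrarily aligned address.
 * Only byte accesses are used, so there is no misaligned load and no
 * dependence on host endianness.
 */
static uint64_t load_be64_unaligned(const void *p)
{
	const unsigned char *b = p;
	uint64_t v = 0;

	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | b[i];	/* most significant byte first */
	return v;
}
```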

Fixes following sparse warning:
sge.c:3282:43: warning: cast to restricted __be64

Fixes: a456950445 ("cxgb4: time stamping interface for PTP")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:47 -07:00
Rahul Lakkireddy
030c98824d cxgb4: move PTP lock and unlock to caller in Tx path
Check whether PTP is enabled at the caller and perform
locking/unlocking there.

Fixes following sparse warning:
sge.c:1641:26: warning: context imbalance in 'cxgb4_eth_xmit' -
different lock contexts for basic block

Fixes: a456950445 ("cxgb4: time stamping interface for PTP")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:47 -07:00
Rahul Lakkireddy
11d8cd5c9f cxgb4: move handling L2T ARP failures to caller
Move code handling L2T ARP failures to the only caller.

Fixes following sparse warning:
skbuff.h:2091:29: warning: context imbalance in
'handle_failed_resolution' - unexpected unlock

Fixes: 749cb5fe48 ("cxgb4: Replace arpq_head/arpq_tail with SKB double link-list code")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:18:47 -07:00
Petr Machata
34639fa383 mlxsw: Enforce firmware version for Spectrum-3
In a fashion similar to the other Spectrum systems, enforce a specific
firmware version for Spectrum-3 so that the driver and firmware are
always in sync with regard to new features.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:14:13 -07:00
Petr Machata
69c8a8c543 mlxsw: Bump firmware version to XX.2007.1168
This version comes with fixes to the following problems, among others:

- Wrong shaper configuration on Spectrum-1
- Bogus temperature reading on Spectrum-2
- Problems in setting egress buffer size after MTU change on Spectrum-2

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:14:13 -07:00
Masanari Iida
13fdc4193c mlxsw: spectrum_dcb: Fix a spelling typo in spectrum_dcb.c
This patch fixes a spelling typo in spectrum_dcb.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:03:54 -07:00
Alexander Lobakin
10f468ea5c net: qed: fix "maybe uninitialized" warning
Variable 'abs_ppfid' in qed_dev.c:qed_llh_add_mac_filter() always gets
printed, but is initialized only under 'ref_cnt == 1' condition. This
results in:

In file included from ./include/linux/kernel.h:15:0,
                 from ./include/asm-generic/bug.h:19,
                 from ./arch/x86/include/asm/bug.h:86,
                 from ./include/linux/bug.h:5,
                 from ./include/linux/io.h:11,
                 from drivers/net/ethernet/qlogic/qed/qed_dev.c:35:
drivers/net/ethernet/qlogic/qed/qed_dev.c: In function 'qed_llh_add_mac_filter':
./include/linux/printk.h:358:2: warning: 'abs_ppfid' may be used uninitialized
in this function [-Wmaybe-uninitialized]
  printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
  ^~~~~~
drivers/net/ethernet/qlogic/qed/qed_dev.c:983:17: note: 'abs_ppfid' was declared
here
  u8 filter_idx, abs_ppfid;
                 ^~~~~~~~~

...under W=1+.

Fix this by initializing it with zero.

Fixes: 79284adeb9 ("qed: Add llh ppfid interface and 100g support for offload protocols")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Alexander Lobakin
c221dd1831 net: qed: reset ILT block sizes before recomputing to fix crashes
Sizes of all ILT blocks must be reset before ILT recomputing when
disabling clients, or memory allocation may exceed ILT shadow array
and provoke system crashes.

Fixes: 1408cc1fa4 ("qed: Introduce VFs")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Alexander Lobakin
ec6c80590b net: qede: fix use-after-free on recovery and AER handling
Set edev->cdev pointer to NULL after calling remove() callback to avoid
using an already freed object.

Fixes: ccc67ef50b ("qede: Error recovery process")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Alexander Lobakin
1c85f394c2 net: qede: fix PTP initialization on recovery
Currently PTP cyclecounter and timecounter are initialized only on
the first probing and are cleaned up during removal. This means that
PTP becomes non-functional after device recovery.
Fix this by unconditional PTP initialization on probing and clearing
Tx pending bit on exiting.

Fixes: ccc67ef50b ("qede: Error recovery process")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Alexander Lobakin
d434d02f7e net: qed: fix excessive QM ILT lines consumption
This is likely a copy'n'paste mistake. The amount of ILT lines to
reserve for a single VF was being multiplied by the total VFs count.
This led to a hugely redundant reservation and could exhaust the
available lines.

Fixes: 1408cc1fa4 ("qed: Introduce VFs")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Alexander Lobakin
ccd7c7ce16 net: qed: fix NVMe login fails over VFs
25ms sleep cycles in waiting for PF response are excessive and may lead
to different timeout failures.

Start to wait with short udelays, and in most cases polling will end
here. If the time was not sufficient, switch to msleeps.
usleep_range() may go far beyond 100us depending on platform and tick
configuration, hence atomic udelays for consistency.

Also add explicit DMA barriers since 'done' always comes from a shared
request-response DMA pool, and note that in the comment nearby.
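
The two-phase wait described above can be sketched in plain C as follows. This
is an illustration only: the delay hooks stand in for udelay() and msleep(),
the counters exist purely to observe the behaviour, all names are
hypothetical, and the DMA barriers mentioned above are not modelled:

```c
#include <assert.h>	/* assert() for ad-hoc checks by the caller */
#include <stdbool.h>

typedef void (*delay_fn)(void *ctx);

/* Bookkeeping for the stub delay hooks below. */
struct poll_counters {
	int short_delays;
	int long_sleeps;
};

static void count_short(void *ctx)
{
	((struct poll_counters *)ctx)->short_delays++;
}

static void count_long(void *ctx)
{
	((struct poll_counters *)ctx)->long_sleeps++;
}

/*
 * Poll *done with short busy-wait delays first; only if that budget is
 * exhausted, fall back to longer sleeping delays.
 */
static bool poll_two_phase(const volatile int *done,
			   int fast_tries, delay_fn short_delay,
			   int slow_tries, delay_fn long_sleep, void *ctx)
{
	for (int i = 0; i < fast_tries; i++) {
		if (*done)
			return true;
		short_delay(ctx);
	}
	for (int i = 0; i < slow_tries; i++) {
		if (*done)
			return true;
		long_sleep(ctx);
	}
	return *done != 0;	/* final check after the last sleep */
}
```

When the response arrives quickly, the function returns during the first loop
and the sleeping hook is never called.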

Fixes: 1408cc1fa4 ("qed: Introduce VFs")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Alexander Lobakin
4079c7f7a2 net: qede: stop adding events on an already destroyed workqueue
Set rdma_wq pointer to NULL after destroying the workqueue and check
for it when adding new events to fix crashes on driver unload.

Fixes: cee9fbd8e2 ("qede: Add qedr framework")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Alexander Lobakin
31333c1a25 net: qed: fix async event callbacks unregistering
qed_spq_unregister_async_cb() should be called before
qed_rdma_info_free() to avoid crash-spawning uses-after-free.
Instead of calling it from each subsystem exit code, do it in one place
on PF down.

Fixes: 291d57f67d ("qed: Fix rdma_info structure allocation")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23 15:01:16 -07:00
Maor Gottlieb
608ca553c9 net/mlx5: Add support in query QP, CQ and MKEY segments
Introduce new resource dump segments - PRM_QUERY_QP,
PRM_QUERY_CQ and PRM_QUERY_MKEY. These segments contain the resource
dump in PRM query format.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-06-23 17:26:10 +03:00
Maor Gottlieb
d63cc24933 net/mlx5: Export resource dump interface
Export some of the resource dump API. mlx5_ib driver will use
it in downstream patches.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-06-23 17:26:10 +03:00
Dmitry Bogdanov
ecab78703f net: atlantic: A2: phy loopback support
This patch adds the phy loopback support on A2.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:10:22 -07:00
Dmitry Bogdanov
2b53b04de3 net: atlantic: A2: report link partner capabilities
This patch adds link partner capabilities reporting support on A2.
In particular, the following capabilities are available for reporting:
* link rate;
* EEE;
* flow control.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:10:22 -07:00
Igor Russkikh
3e168de529 net: atlantic: A2: flow control support
This patch adds flow control support on A2.

Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:10:22 -07:00
Nikita Danilov
ce6a690ccc net: atlantic: A2: EEE support
This patch adds EEE support on A2.

Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Co-developed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:10:22 -07:00
Nikita Danilov
e61b28686b net: atlantic: remove baseX usage
This patch removes 2.5G baseX wrong usage/reporting, since it shouldn't have
been mixed with baseT.

Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:10:22 -07:00
Igor Russkikh
071a02046c net: atlantic: A2: half duplex support
This patch adds support for 10M/100M/1G half duplex rates, which are
supported by A2 in addition to full duplex rates supported by A1.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 21:10:22 -07:00
Russell King
75674e3159 net: mtk_eth_soc: use resolved link config in mac_link_up()
Convert the mtk_eth_soc driver to use the finalised link parameters in
mac_link_up() rather than the parameters in mac_config().

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:45:53 -07:00
Vladimir Oltean
9403c158b8 net: mscc: ocelot: support IPv4, IPv6 and plain Ethernet mdb entries
The current procedure for installing a multicast address is hardcoded
for IPv4. But, in the ocelot hardware, there are 3 different procedures
for IPv4, IPv6 and for regular L2 multicast.

For IPv6 (33-33-xx-xx-xx-xx), it's the same as for IPv4
(01-00-5e-xx-xx-xx), except that the destination port mask is stuffed
into the first 2 bytes of the MAC address instead of the first 3 bytes.

For plain Ethernet multicast, there's no port-in-address stuffing going
on, instead the DEST_IDX (pointer to PGID) is used there, just as for
unicast. So we have to use one of the nonreserved multicast PGIDs that
the hardware has allocated for this purpose.

This patch classifies the type of multicast address based on its first
bytes, then redirects to one of the 3 different hardware procedures.
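
A minimal sketch of such a prefix-based classification, using only the two
prefixes named above. The enum and function names are hypothetical and are not
the identifiers used in the driver:

```c
#include <assert.h>	/* assert() for ad-hoc checks by the caller */
#include <stdint.h>

/* Hypothetical labels for the three hardware procedures. */
enum mc_kind {
	MC_KIND_IPV4,	/* 01-00-5e-xx-xx-xx */
	MC_KIND_IPV6,	/* 33-33-xx-xx-xx-xx */
	MC_KIND_L2,	/* any other multicast address */
};

/* Classify a 6-byte multicast MAC address by its leading bytes. */
static enum mc_kind classify_multicast(const uint8_t addr[6])
{
	if (addr[0] == 0x01 && addr[1] == 0x00 && addr[2] == 0x5e)
		return MC_KIND_IPV4;
	if (addr[0] == 0x33 && addr[1] == 0x33)
		return MC_KIND_IPV6;
	return MC_KIND_L2;
}
```

Under this scheme the PTP address 01-1b-19-00-00-00 falls through to the plain
L2 case, since it matches neither prefix.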

Note that this gives us a much better way of redirecting PTP frames
sent to 01-1b-19-00-00-00 to the CPU. Previously, Yangbo Lu tried to add
a trapping rule for PTP EtherType but got a lot of pushback:

https://patchwork.ozlabs.org/project/netdev/patch/20190813025214.18601-5-yangbo.lu@nxp.com/

But right now, that isn't needed at all. The application stack (ptp4l)
does this for the PTP multicast addresses it's interested in (which are
configurable, and include 01-1b-19-00-00-00):

	memset(&mreq, 0, sizeof(mreq));
	mreq.mr_ifindex = index;
	mreq.mr_type = PACKET_MR_MULTICAST;
	mreq.mr_alen = MAC_LEN;
	memcpy(mreq.mr_address, addr1, MAC_LEN);

	err1 = setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mreq,
			  sizeof(mreq));

In the kernel, this translates into a dev_mc_add on the switch network
interfaces, and our drivers know that it means they should translate it
into a host MDB address (make the CPU port be the destination).
Previously, this was broken because all mdb addresses were treated as
IPv4 (which 01-1b-19-00-00-00 obviously is not).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:41:05 -07:00
Vladimir Oltean
96b029b004 net: mscc: ocelot: introduce macros for iterating over PGIDs
The current iterators are impossible to understand at first glance
without switching back and forth between the definitions and their
actual use in the for loops.

So introduce some convenience names to help readability.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:41:05 -07:00
Vladimir Oltean
209edf95da net: dsa: felix: call port mdb operations from ocelot
This adds the mdb hooks in felix and exports the mdb functions from
ocelot.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:41:05 -07:00
Vladimir Oltean
471beb11c4 net: mscc: ocelot: make the NPI port a proper target for FDB and MDB
When used in DSA mode (as seen in Felix), the DEST_IDX in the MAC table
should point to the PGID for the CPU port (PGID_CPU) and not for the
Ethernet port where the CPU queues are redirected to (also known as Node
Processor Interface - NPI).

Because for Felix this distinction shouldn't really matter (from DSA
perspective, the NPI port _is_ the CPU port), make the ocelot library
act upon the CPU port when NPI mode is enabled. This has no effect for
the mscc_ocelot driver for VSC7514, because that does not use NPI (and
ocelot->npi is -1).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:41:05 -07:00
Vladimir Oltean
0897ecf753 net: mscc: ocelot: fix encoding destination ports into multicast IPv4 address
The ocelot hardware designers have made some hacks to support multicast
IPv4 and IPv6 addresses. Normally, the MAC table matches on MAC
addresses and the destination ports are selected through the DEST_IDX
field of the respective MAC table entry. The DEST_IDX points to a Port
Group ID (PGID) which contains the bit mask of ports that frames should
be forwarded to. But there aren't a lot of PGIDs (only 80 or so) and
there are clearly many more IP multicast addresses than that, so the
PGID mechanism doesn't scale and something else was done.
Since the first portion of the MAC address is known, the hack they did
was to use a single PGID for _flooding_ unknown IPv4 multicast
(PGID_MCIPV4 == 62), but for known IP multicast, embed the destination
ports into the first 3 bytes of the MAC address recorded in the MAC
table.

The VSC7514 datasheet explains it like this:

    3.9.1.5 IPv4 Multicast Entries

    MAC table entries with the ENTRY_TYPE = 2 settings are interpreted
    as IPv4 multicast entries.
    IPv4 multicasts entries match IPv4 frames, which are classified to
    the specified VID, and which have DMAC = 0x01005Exxxxxx, where
    xxxxxx is the lower 24 bits of the MAC address in the entry.
    Instead of a lookup in the destination mask table (PGID), the
    destination set is programmed as part of the entry MAC address. This
    is shown in the following table.

    Table 78: IPv4 Multicast Destination Mask

        Destination Ports            Record Bit Field
        ---------------------------------------------
        Ports 10-0                   MAC[34-24]

    Example: All IPv4 multicast frames in VLAN 12 with MAC 01005E112233 are
    to be forwarded to ports 3, 8, and 9. This is done by inserting the
    following entry in the MAC table entry:
    VALID = 1
    VID = 12
    MAC = 0x000308112233
    ENTRY_TYPE = 2
    DEST_IDX = 0

But this procedure is not at all what's going on in the driver. In fact,
the code that embeds the ports into the MAC address looks like it hasn't
actually been tested. This patch applies the procedure described in the
datasheet.
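
The datasheet procedure quoted above can be sketched as follows. This is
only an illustration of the bit layout from Table 78, not the driver code;
the macro and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; only the bit positions come from the datasheet. */
#define IPV4_MC_LOW_MASK	0xffffffULL	/* lower 24 bits kept from the DMAC */
#define IPV4_MC_PORT_SHIFT	24		/* ports 10-0 live in MAC[34-24] */
#define IPV4_MC_PORT_MASK	0x7ffULL	/* 11 destination ports */

/* Build the MAC field of an ENTRY_TYPE = 2 (IPv4 multicast) entry. */
static uint64_t ipv4_mc_entry_mac(uint64_t dmac, uint32_t port_mask)
{
	assert((port_mask & ~IPV4_MC_PORT_MASK) == 0);

	return ((uint64_t)port_mask << IPV4_MC_PORT_SHIFT) |
	       (dmac & IPV4_MC_LOW_MASK);
}
```

For the datasheet example (DMAC 01005E112233, ports 3, 8 and 9), the port
mask is 0x308 and the resulting MAC field is 0x000308112233.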

Since there are many other fixes to be made around multicast forwarding
until it works properly, there is no real reason for this patch to be
backported to stable trees, or considered a real fix of something that
should have worked.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 20:41:05 -07:00
Petr Machata
ce10d7d4ad mlxsw: spectrum_acl: Support FLOW_ACTION_MANGLE for TCP, UDP ports
Spectrum-2 supports an ACL action L4_PORT, which allows TCP and UDP source
and destination port number change. Offload suitable mangles to this
action.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:32:11 -07:00
Petr Machata
faad0525c0 mlxsw: core_acl_flex_actions: Add L4_PORT_ACTION
Add fields related to L4_PORT_ACTION, which is used for changing of TCP and
UDP port numbers.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:32:11 -07:00
Petr Machata
3cc9a15a0b mlxsw: spectrum: Split handling of pedit mangle by chip type
Certain ACL actions are only available on some Spectrum revisions. In
particular, L4_PORT_ACTION is not available on Spectrum-1. Introduce a
new ops struct intended to hold these differences, mlxsw_sp_rulei_ops.
Prime it with a sole member, act_mangle_field, meant for handling of
pedit mangles.

Create two ops structures, one for Spectrum-1, the other for Spectrum-2
and above. Add callbacks for act_mangle_field and dispatch to the common
handler.

Invoke mlxsw_sp_rulei_ops.act_mangle_field from the field mangler
instead of calling the common handler directly.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:32:11 -07:00
Ido Schimmel
f3fe412b0a mlxsw: spectrum: Do not rely on machine endianness
The second commit cited below performed a cast of 'u32 buffsize' to
'(u16 *)' when calling mlxsw_sp_port_headroom_8x_adjust():

mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, (u16 *) &buffsize);

Colin noted that this will behave differently on big endian
architectures compared to little endian architectures.

Fix this by following Colin's suggestion and have the function accept
and return 'u32' instead of passing the current size by reference.
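
A minimal userspace sketch of the problem and of the by-value fix. The
helper names are hypothetical, and the doubling for 8x ports is an
assumption made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * What a 16-bit access through a (u16 *) alias of a u32 sees: the first
 * two bytes in memory. That is the low half on little endian and the
 * high half on big endian.
 */
static uint16_t first_two_bytes(uint32_t v)
{
	uint16_t half;

	memcpy(&half, &v, sizeof(half));
	return half;
}

/* Hypothetical by-value variant: no pointer cast, so no endianness issue. */
static uint32_t headroom_8x_adjust(uint32_t size_cells, int is_8x)
{
	assert(size_cells <= UINT32_MAX / 2);

	return is_8x ? 2 * size_cells : size_cells;
}
```

Passing and returning the value keeps the arithmetic on the full 32-bit
quantity regardless of byte order.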

Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Fixes: 60833d54d5 ("mlxsw: spectrum: Adjust headroom buffers for 8x ports")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Suggested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:29:51 -07:00
Heiner Kallweit
288302dab3 r8169: improve rtl8169_runtime_resume
Simplify rtl8169_runtime_resume() by calling rtl8169_resume().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
06a14ab852 r8169: remove driver-specific mutex
Now that the critical sections are protected with RTNL lock, we don't
need a separate mutex any longer.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
abe5fc42f9 r8169: use RTNL to protect critical sections
Most relevant ops (open, close, ethtool ops) are protected with RTNL
lock by net core. Make sure that such ops can't be interrupted by
e.g. (runtime-)suspending by taking the RTNL lock in suspend ops
and the PCI error handler.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
567ca57faa r8169: add rtl8169_up
Factor out bringing device up to a new function rtl8169_up(), similar
to rtl8169_down() for bringing the device down.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
ec2f204bdd r8169: remove no longer needed checks for device being runtime-active
Because the netdevice is now marked as detached when the parent is not
accessible, we can remove quite a few checks.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:01 -07:00
Heiner Kallweit
476c4f5de3 r8169: mark device as not present when in PCI D3
Mark the netdevice as detached whenever we go into PCI D3hot.
This allows us to remove some checks, e.g. from ethtool ops, because
dev_ethtool() checks for netif_device_present() in the beginning.

In this context move waking up the queue out of rtl_reset_work()
because in cases where netif_device_attach() is called afterwards
the queue should be woken up by the latter function only.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:01 -07:00
Martin Blumenstingl
a4f63342d0 net: stmmac: dwmac-meson8b: add a compatible string for G12A SoCs
Amlogic Meson G12A, G12B and SM1 have the same (at least as far as we
know at the time of writing) PRG_ETHERNET glue register implementation.
This implementation however is slightly different from AXG as it now has
an undocumented "auto cali idx val" register in PRG_ETH1[17:16] which
seems to be related to RGMII Ethernet.

Add a new compatible string for G12A SoCs so the logic for this new
register can be implemented in the future.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:24:10 -07:00
Vasundhara Volam
9bf88b9fc8 bnxt_en: Add board.serial_number field to info_get cb
Add board.serial_number field info to info_get cb via devlink,
if driver can fetch the information from the device.

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:15:05 -07:00
Jarod Wilson
bf3a058de5 mlx5: become aware of when running as a bonding slave
I've been unable to get my hands on suitable supported hardware to date,
but I believe this ought to be all that is needed to enable the mlx5
driver to also work with bonding active-backup crypto offload passthru.

CC: Boris Pismenny <borisp@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Leon Romanovsky <leon@kernel.org>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:38:57 -07:00
Jarod Wilson
0dea9ea97e ixgbe_ipsec: become aware of when running as a bonding slave
Slave devices in a bond doing hardware encryption also need to be aware
that they're slaves, so we operate on the slave instead of the bonding
master to do the actual hardware encryption offload bits.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Acked-by: Jeff Kirsher <Jeffrey.t.kirsher@intel.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:38:57 -07:00
Parav Pandit
330077d14d net/mlx5: E-switch, Supporting setting devlink port function mac address
Enable user to set mac address of the PCI PF and VF port function.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
1094795ce4 net/mlx5: Split mac address setting function for using state_lock
Refactor mac address setting function to let caller hold the necessary
state_lock mutex, so that a subsequent patch can use this helper routine.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
f099fde16d net/mlx5: E-switch, Support querying port function mac address
Support querying mac address of the eswitch devlink port function.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
443bf36eb5 net/mlx5: Move helper to eswitch layer
To use port number to port index conversion at eswitch level, move it to
eswitch header.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
bd93975353 net/mlx5: E-switch, Introduce and use eswitch support check helper
Introduce a helper routine to get esw from a devlink device and use it
at eswitch callbacks and in subsequent patch.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
fa997825eb net/mlx5: Constify mac address pointer
Since none of the functions need to modify the input mac address,
constify them.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Russell King
63d78cc976 net: mvpp2: set xlg flow control in mvpp2_mac_link_up()
Set the flow control settings in mvpp2_mac_link_up() for 10G links
just as we do for 1G and slower links. This is now the preferred
location.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 21:38:26 -07:00
Russell King
bd45f644a8 net: mvpp2: add register modification helper
Add a helper to read-modify-write a register, and use it in the phylink
helpers.
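
A read-modify-write helper of this kind might look roughly like the
sketch below. It operates on a plain memory word instead of an MMIO
register, and the name reg_modify() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clear the bits in 'mask', then set the bits in 'set'. The write is
 * skipped when the value would not change.
 */
static void reg_modify(volatile uint32_t *reg, uint32_t mask, uint32_t set)
{
	uint32_t old, new_val;

	/* 'set' must only touch bits covered by 'mask'. */
	assert((set & ~mask) == 0);

	old = *reg;
	new_val = (old & ~mask) | set;
	if (old != new_val)
		*reg = new_val;
}
```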

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 21:38:26 -07:00
Russell King
6c2b49eb96 net: mvpp2: add mvpp2_phylink_to_port() helper
Add a helper to convert the struct phylink_config pointer passed in
from phylink to the driver's internal struct mvpp2_port.
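
Such a conversion is commonly done with container_of(). The sketch below
uses stand-in structures and a userspace container_of(); the struct
layouts and the helper name are assumptions, not the mvpp2 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for struct phylink_config and struct mvpp2_port. */
struct phylink_cfg {
	int type;
};

struct port {
	int id;
	struct phylink_cfg cfg;	/* embedded, so its address identifies the port */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing port from a pointer to its embedded config. */
static struct port *phylink_to_port(struct phylink_cfg *cfg)
{
	assert(cfg != NULL);

	return container_of(cfg, struct port, cfg);
}
```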

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 21:38:26 -07:00
Russell King
a9a3320227 net: mvpp2: add port support helpers
The mvpp2 code has tests scattered amongst the code to determine
whether the port supports the XLG, and whether the port supports
RGMII mode.

Rather than having these tests scattered, provide a couple of helper
functions, so that future additions can ensure that they get these
tests correct.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 21:38:26 -07:00
Heiner Kallweit
89fbd26cca r8169: fix firmware not resetting tp->ocp_base
Typically the firmware takes care that tp->ocp_base is reset to its
default value. That's not the case (at least) for RTL8117.
As a result subsequent PHY access reads/writes the wrong page and
the link is broken. Fix this by resetting tp->ocp_base explicitly.

Fixes: 229c1e0dfd ("r8169: load firmware for RTL8168fp/RTL8117")
Reported-by: Aaron Ma <mapengyu@gmail.com>
Tested-by: Aaron Ma <mapengyu@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:31:26 -07:00
Dany Madden
8b40eb7350 ibmvnic: continue to init in CRQ reset returns H_CLOSED
Continue the reset path when partner adapter is not ready or H_CLOSED is
returned from reset crq. This patch allows the CRQ init to proceed to
establish a valid CRQ for traffic to flow after reset.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:28:41 -07:00
Shannon Nelson
b59eabd23e ionic: tame the watchdog timer on reconfig
Even with moving netif_tx_disable() to an earlier point when
taking down the queues for a reconfiguration, we still end
up with the occasional netdev watchdog Tx Timeout complaint.
The old method of using netif_trans_update() works fine for
queue 0, but has no effect on the remaining queues.  Using
netif_device_detach() allows us to signal to the watchdog to
ignore us for the moment.

Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:26:33 -07:00
Vladimir Oltean
c73b0ad36e net: mscc: ocelot: unexpose ocelot_vcap_policer_{add,del}
Remove the function prototypes from ocelot_police.h and make these
functions static. We need to move them above their callers. Note that
moving the implementations to ocelot_police.c is not trivially possible
due to dependency on is2_entry_set() which is static to ocelot_vcap.c.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
aae4e500e1 net: mscc: ocelot: generalize the "ACE/ACL" names
Access Control Lists (and their respective Access Control Entries) are
specifically entries in the VCAP IS2, the security enforcement block,
according to the documentation.
Let's rename the structures and functions to something more generic, so
that VCAP IS1 structures (which would otherwise have to be called
Ingress Classification Entries) can reuse the same code without
confusion.

Some renaming that was done:

struct ocelot_ace_rule -> struct ocelot_vcap_filter
struct ocelot_acl_block -> struct ocelot_vcap_block
enum ocelot_ace_type -> enum ocelot_vcap_key_type
struct ocelot_ace_vlan -> struct ocelot_vcap_key_vlan
enum ocelot_ace_action -> enum ocelot_vcap_action
struct ocelot_ace_stats -> struct ocelot_vcap_stats
struct ocelot_ace_frame_* -> struct ocelot_vcap_key_*

No functional change is intended.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
3c83654f24 net: mscc: ocelot: rename ocelot_ace.{c, h} to ocelot_vcap.{c,h}
Access Control Lists (and their respective Access Control Entries) are
specifically entries in the VCAP IS2, the security enforcement block,
according to the documentation.

Let's rename the files that deal with generic operations on the VCAP
TCAM, so that VCAP IS1 and ES0 can reuse the same code without
confusion.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
9c90eea310 net: mscc: ocelot: move net_device related functions to ocelot_net.c
The ocelot hardware library shouldn't contain too much net_device
specific code, since it is shared with DSA which abstracts that
structure away. So move as much of this code as possible into the
mscc_ocelot driver and outside of the common library.

We're making an exception for MDB and LAG code. That is not yet exported
to DSA, but when it is, most of the code that's already in ocelot.c
will remain there. So, there's no point in moving code to ocelot_net.c
just to move it back later.

We could have moved all net_device code to ocelot_vsc7514.c directly,
but let's operate under the assumption that if a new switchdev ocelot
driver gets added, it'll define its SoC-specific stuff in a new
ocelot_vsc*.c file and it'll reuse the rest of the code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
d9feb90499 net: mscc: ocelot: move ocelot_regs.c into ocelot_vsc7514.c
ocelot_regs.c actually shouldn't be part of the common library. It
describes the register map of the VSC7514 switch. The way ocelot
switches work, they'll have highly optimized register maps, so another
SoC will likely have the same registers but laid out completely
differently in memory (so there's little room for reusing this structure).
So move it to ocelot_vsc7514.c instead.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
14addfb635 net: mscc: ocelot: rename MSCC_OCELOT_SWITCH_OCELOT to MSCC_OCELOT_SWITCH
Putting 'ocelot' in the config's name twice just to say that 'it's the
ocelot driver running on the ocelot SoC' is a bit confusing. Instead,
it's just the ocelot driver. Now that we've renamed the previous symbol
that was holding the MSCC_OCELOT_SWITCH_OCELOT into *_LIB (because
that's what it is), we're free to use this name for the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
f4d0323bae net: mscc: ocelot: convert MSCC_OCELOT_SWITCH into a library
Hide the CONFIG_MSCC_OCELOT_SWITCH option from users. It is meant to be
only a hardware library which is selected by the drivers that use it
(ocelot, felix).

Since it is "selected" from Kconfig, all its dependencies are manually
transferred to the driver that selects it. This is because "select" in
Kconfig language is a bit of a mess, and doesn't handle dependencies of
selected options quite right.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
56583862b8 net: mscc: ocelot: rename module to mscc_ocelot
mscc_ocelot is a slightly better name for a module than ocelot_board or
ocelot_vsc7514 is, so let's use that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
589aa6e7c9 net: mscc: ocelot: rename ocelot_board.c to ocelot_vsc7514.c
To follow the model of felix and seville where we have one
platform-specific file, rename this file to the actual SoC it serves.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
ff4b0bc623 net: mscc: ocelot: access EtherType using __be16
Get rid of sparse "cast to restricted __be16" warnings.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Vladimir Oltean
7eb5c96a7c net: mscc: ocelot: use plain int when interacting with TCAM tables
sparse is rightfully complaining about the fact that:

warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
   26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
      |                            ^
note: in expansion of macro ‘GENMASK_INPUT_CHECK’
   39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
      |   ^~~~~~~~~~~~~~~~~~~
note: in expansion of macro ‘GENMASK’
  127 |   mask = GENMASK(width, 0);
      |          ^~~~~~~

So replace the variables that go into GENMASK with plain, signed integer
types.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-20 17:25:23 -07:00
Thomas Falcon
5948378b26 ibmveth: Fix max MTU limit
The max MTU limit defined for ibmveth is not accounting for
virtual ethernet buffer overhead, which is twenty-two additional
bytes set aside for the ethernet header and eight additional bytes
of an opaque handle reserved for use by the hypervisor. Update the
max MTU to reflect this overhead.
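
The arithmetic can be sketched as below. The two overhead values come
from the description above; the macro names, the function name and the
buffer size used in the usage note are hypothetical.

```c
#include <assert.h>

#define VETH_ETH_HDR_OH	22	/* bytes set aside for the ethernet header */
#define VETH_HANDLE_OH	8	/* opaque handle reserved for the hypervisor */
#define VETH_BUF_OH	(VETH_ETH_HDR_OH + VETH_HANDLE_OH)

/* Largest MTU that still fits in a receive buffer of the given size. */
static unsigned int veth_max_mtu(unsigned int max_buf_size)
{
	assert(max_buf_size > VETH_BUF_OH);

	return max_buf_size - VETH_BUF_OH;
}
```

For an assumed 65536-byte buffer this yields a max MTU of 65506.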

Fixes: d894be57ca ("ethernet: use net core MTU range checking in more drivers")
Fixes: 110447f826 ("ethernet: fix min/max MTU typos")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 20:21:46 -07:00
wenxu
a1db217861 net: flow_offload: fix flow_indr_dev_unregister path
If the representor is removed, then identify the indirect flow_blocks
that need to be removed by the release callback and the port representor
structure. To identify the port representor structure, a new
indr.cb_priv field needs to be introduced. The flow_block also needs to
be removed from the driver list from the cleanup path.

Fixes: 1fac52da59 ("net: flow_offload: consolidate indirect flow_block infrastructure")

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 20:12:58 -07:00
wenxu
66f1939a1b flow_offload: use flow_indr_block_cb_alloc/remove function
Prepare to fix the bug in the next patch. Use the
flow_indr_block_cb_alloc/remove functions and remove
__flow_block_indr_binding.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 20:12:58 -07:00
Claudiu Manoil
9deba33f1b enetc: Fix HW_VLAN_CTAG_TX|RX toggling
VLAN tag insertion/extraction offload is correctly
activated at probe time but deactivation of this feature
(i.e. via ethtool) is broken.  Toggling works only for
Tx/Rx ring 0 of a PF, and is ignored for the other rings,
including the VF rings.
To fix this, the existing VLAN offload toggling code
was extended to all the rings assigned to a netdevice,
instead of the default ring 0 (likely a leftover from the
early validation days of this feature).  And the code was
moved to the common set_features() function to fix toggling
for the VF driver too.

Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 20:00:33 -07:00
Claudiu Beznea
faa620876b net: macb: undo operations in case of failure
Undo previously done operation in case macb_phylink_connect()
fails. Since macb_reset_hw() is the 1st undo operation the
napi_exit label was renamed to reset_hw.

Fixes: 7897b071ac ("net: macb: convert to phylink")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 19:59:14 -07:00
Gustavo A. R. Silva
a422d5ff6d cxgb4: Use struct_size() helper
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

This code was detected with the help of Coccinelle, then audited and
fixed manually.
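
For readers unfamiliar with the helper, here is a userspace approximation
of what a struct_size()-style calculation does for a structure with a
flexible array member. The structure and function names are hypothetical;
the saturate-to-SIZE_MAX behaviour mirrors the kernel helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct filter_table {
	size_t count;
	uint32_t entries[];	/* flexible array member */
};

/* Header plus 'count' entries, saturating to SIZE_MAX on overflow. */
static size_t table_size(size_t count)
{
	size_t elem = sizeof(uint32_t);

	if (count > (SIZE_MAX - sizeof(struct filter_table)) / elem)
		return SIZE_MAX;
	return sizeof(struct filter_table) + count * elem;
}

static struct filter_table *table_alloc(size_t count)
{
	size_t bytes = table_size(count);
	struct filter_table *t;

	if (bytes == SIZE_MAX)
		return NULL;	/* refuse sizes that overflowed */

	t = calloc(1, bytes);
	if (t)
		t->count = count;
	return t;
}
```

An open-coded "sizeof(*t) + count * sizeof(t->entries[0])" would silently
wrap for a huge count; the saturating form makes the allocation fail
instead.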

Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:42:08 -07:00
Gustavo A. R. Silva
f362b70bd6 ethernet: ti: am65-cpsw-qos: Use struct_size() in devm_kzalloc()
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes. Also, remove unnecessary
variable _size_.

This code was detected with the help of Coccinelle, then audited and
fixed manually.

Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:42:08 -07:00
Vishal Kulkarni
4dababa232 cxgb4: add action to steer flows to specific Rxq
Add support for queue action to steer Rx traffic
hitting the flows to specified Rxq.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:17:32 -07:00
Vishal Kulkarni
27ee299364 cxgb4: add support to fetch ethtool n-tuple filters
Add support to fetch the requested ethtool n-tuple filters by
translating them from hardware spec to ethtool n-tuple spec.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:17:32 -07:00
Vishal Kulkarni
db43b30cd8 cxgb4: add ethtool n-tuple filter deletion
Add support to delete ethtool n-tuple filter. Fetch the appropriate
filter region (HPFILTER, HASH, NORMAL) in which the filter exists,
and delete it from the respective region, accordingly.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:17:31 -07:00
Vishal Kulkarni
c8729cac2a cxgb4: add ethtool n-tuple filter insertion
Add support to parse and insert ethtool n-tuple filters.
Translate n-tuple spec to flow spec and use the existing tc-flower
offload infra to insert ethtool n-tuple filters.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:17:31 -07:00
Vishal Kulkarni
d915c299f1 cxgb4: add skeleton for ethtool n-tuple filters
Allocate and manage resources required for ethtool n-tuple filters.
Also fetch the HASH filter region size and calculate nhash entries.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:17:31 -07:00
Flavio Suligoi
6564cfefb0 net: ethernet: oki-semi: pch_gbe: fix spelling mistake
Fix typo: "Triger" --> "Trigger"

Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:09:26 -07:00
Flavio Suligoi
24f5aa53af net: ethernet: neterion: vxge: fix spelling mistake
Fix typo: "trigered" --> "triggered"

Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 13:06:26 -07:00
Po Liu
4b61d3e8d3 net: qos offload add flow status with dropped count
This patch adds a dropped-frames counter to tc flower offloading.
Reporting h/w dropped frames is necessary for some actions.
Actions such as police, and the upcoming stream gate action, can drop
frames, and users need to see that count. The status update shows how
many packets matched the filter and how many of those packets were
dropped.

v2: Changes
 - Update commit comments as suggested by Jiri Pirko.

Signed-off-by: Po Liu <Po.Liu@nxp.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19 12:53:30 -07:00
Björn Töpel
3995ecbabc i40e: fix crash when Rx descriptor count is changed
When the AF_XDP buffer allocator was introduced, the Rx SW ring
"rx_bi" allocation was moved from i40e_setup_rx_descriptors()
function, and was instead done in the i40e_configure_rx_ring()
function.

This broke the ethtool set_ringparam() hook for changing the Rx
descriptor count, which was relying on i40e_setup_rx_descriptors() to
handle the allocation.

Fix this by adding an explicit i40e_alloc_rx_bi() call to
i40e_set_ringparam().

Fixes: be1222b585 ("i40e: Separate kernel allocated rx_bi rings from AF_XDP rings")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-18 22:37:25 -07:00
Ciara Loftus
b1d95cc239 ice: protect ring accesses with WRITE_ONCE
The READ_ONCE macro is used when reading rings prior to accessing the
statistics pointer. The corresponding WRITE_ONCE usage when allocating and
freeing the rings to ensure protected access was not in place. Introduce
this.
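
The pairing can be sketched in userspace as below. The macros are
simplified volatile-cast approximations of the kernel ones (they rely on
the GCC/Clang __typeof__ extension), and the structures and function
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified approximations of the kernel macros. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

struct ring {
	unsigned long packets;
};

struct vsi {
	struct ring *rx_ring;
};

/* Writer side: publish or clear the ring pointer with a single store. */
static void vsi_set_ring(struct vsi *vsi, struct ring *ring)
{
	assert(vsi != NULL);
	WRITE_ONCE(vsi->rx_ring, ring);
}

/* Reader side: load the pointer once, then only use the local copy. */
static unsigned long vsi_rx_packets(struct vsi *vsi)
{
	struct ring *ring;

	assert(vsi != NULL);
	ring = READ_ONCE(vsi->rx_ring);
	return ring ? ring->packets : 0;
}
```

Loading the pointer once avoids checking one value and dereferencing
another if the ring is freed in between.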

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-18 22:35:34 -07:00
Ciara Loftus
d59e267912 i40e: protect ring accesses with READ- and WRITE_ONCE
READ_ONCE should be used when reading rings prior to accessing the
statistics pointer. Introduce this as well as the corresponding WRITE_ONCE
usage when allocating and freeing the rings, to ensure protected access.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-18 22:31:21 -07:00
Ciara Loftus
f140ad9fe2 ixgbe: protect ring accesses with READ- and WRITE_ONCE
READ_ONCE should be used when reading rings prior to accessing the
statistics pointer. Introduce this as well as the corresponding WRITE_ONCE
usage when allocating and freeing the rings, to ensure protected access.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-18 22:30:04 -07:00
Vishal Kulkarni
17b332f480 cxgb4: add support to read serial flash
This patch adds support to dump flash memory via
ethtool --get-dump

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:49:55 -07:00
Vishal Kulkarni
d5002c9a3d cxgb4: add support to flash boot cfg image
Update set_flash to flash boot cfg image to flash region

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:49:55 -07:00
Vishal Kulkarni
550883558f cxgb4: add support to flash boot image
Update set_flash to flash boot image to flash region

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:49:55 -07:00
Vishal Kulkarni
4ee339e1e9 cxgb4: add support to flash PHY image
Update set_flash to flash PHY image to flash region

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:49:55 -07:00
Vishal Kulkarni
3893c905b5 cxgb4: update set_flash to flash different images
Chelsio adapter contains different flash regions and each
region is used by different binary files. This patch adds
support to flash images like PHY firmware, boot and boot config
using ethtool -f N.

The N value mapping is as follows.
N = 0 : Parse image and decide which region to flash
N = 1 : Firmware
N = 2 : PHY firmware
N = 3 : boot image
N = 4 : boot cfg
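
Illustrative sketch (not taken from the patch): the mapping above could be expressed as an enum plus a lookup table. Only the numeric values 0 to 4 come from the commit message; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the values mirror the documented N mapping. */
enum flash_region {
    FLASH_REGION_AUTO     = 0, /* parse image and decide which region to flash */
    FLASH_REGION_FW       = 1,
    FLASH_REGION_PHY_FW   = 2,
    FLASH_REGION_BOOT     = 3,
    FLASH_REGION_BOOT_CFG = 4,
};

/* Returns a short label for N, or NULL if N is outside the documented range. */
static const char *flash_region_label(unsigned int n)
{
    static const char *const labels[] = {
        [FLASH_REGION_AUTO]     = "auto-detect from image",
        [FLASH_REGION_FW]       = "Firmware",
        [FLASH_REGION_PHY_FW]   = "PHY firmware",
        [FLASH_REGION_BOOT]     = "boot image",
        [FLASH_REGION_BOOT_CFG] = "boot cfg",
    };

    if (n >= sizeof(labels) / sizeof(labels[0]))
        return NULL;
    assert(labels[n] != NULL);  /* every documented slot is filled */
    return labels[n];
}
```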

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:49:55 -07:00
Eric Dumazet
761b331cb6 net: tso: cache transport header length
Add tlen field into struct tso_t, and change tso_start()
to return skb_transport_offset(skb) + tso->tlen

This removes from callers the need to use tcp_hdrlen(skb) and
will ease UDP segmentation offload addition.

v2: calls tso_start() earlier in otx2_sq_append_tso() [Jakub]

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:46:23 -07:00
Eric Dumazet
393415203f octeontx2-af: change (struct qmem)->entry_sz from u8 to u16
We need to increase TSO_HEADER_SIZE from 128 to 256.

Since otx2_sq_init() calls qmem_alloc() with TSO_HEADER_SIZE,
we need to change (struct qmem)->entry_sz to avoid truncation to 0.
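
Illustrative sketch (not taken from the patch): storing 256 in an 8-bit field keeps only the low 8 bits, which is the truncation to 0 mentioned above. The two structs are hypothetical stand-ins for the field before and after the change.

```c
#include <assert.h>
#include <stdint.h>

#define TSO_HEADER_SIZE 256   /* new value from the commit message (was 128) */

/* Hypothetical stand-ins for the entry size field before and after. */
struct qmem_u8  { uint8_t  entry_sz; };
struct qmem_u16 { uint16_t entry_sz; };

/* What an 8-bit field actually holds after storing sz. */
static unsigned int stored_size_u8(unsigned int sz)
{
    struct qmem_u8 q = { .entry_sz = (uint8_t)sz };  /* low 8 bits only */

    return q.entry_sz;
}

/* What a 16-bit field holds after storing sz. */
static unsigned int stored_size_u16(unsigned int sz)
{
    struct qmem_u16 q;

    assert(sz <= UINT16_MAX);
    q.entry_sz = (uint16_t)sz;
    return q.entry_sz;
}
```

With these helpers, `stored_size_u8(128)` still yields 128, while `stored_size_u8(TSO_HEADER_SIZE)` yields 0 and `stored_size_u16(TSO_HEADER_SIZE)` yields 256.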

Fixes: 7a37245ef2 ("octeontx2-af: NPA block admin queue init")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:46:23 -07:00
Barry Song
c2a2e1270a net: hns3: streaming dma buffer sync between cpu and device
Right now they are empty functions for our SoC since hardware can keep
cache coherent, but it is still good to align with streaming DMA APIs
as device drivers should not make an assumption of SoC.

Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:43:10 -07:00
Barry Song
e99a308da3 net: hns3: replace disable_irq by IRQ_NOAUTOEN flag
disable_irq() after request_irq() is still risky, as there is a chance an irq
can arrive after request_irq() and before disable_irq().
This should be done with the IRQ_NOAUTOEN flag instead.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:43:10 -07:00
Barry Song
4d2cad3212 net: hns3: rename buffer-related functions
This is for improving the readability.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:43:10 -07:00
Barry Song
cb0e3e6115 net: hns3: pointer type of buffer should be void
Move the type of buffer address from unsigned char to void

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:43:10 -07:00
Barry Song
674a135746 net: hns3: remove unnecessary devm_kfree
Since we are using device-managed functions, it is unnecessary
to free in probe.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:43:10 -07:00
Tim Harvey
c90834cd47 lan743x: allow mac address to come from dt
If a valid mac address is present in dt, use that before using
CSR's or a random mac address.

Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:40:18 -07:00
Heiner Kallweit
51f6291b04 r8169: allow setting irq coalescing if link is down
So far we can not configure irq coalescing when link is down. Allow the
user to do this, and assume that he wants to configure irq coalescing
for highest speed. Otherwise the irq rate is low enough anyway.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Heiner Kallweit
9f0b54cd16 r8169: move switching optional clock on/off to pll power functions
Relevant chip clocks are disabled in rtl_pll_power_down(), therefore
move calling clk_disable_unprepare() there. Similar for enabling the
clock.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Heiner Kallweit
a2ee847242 r8169: move updating counters to rtl8169_down
Counters are updated whenever we go down, therefore move the call to
rtl8169_down().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Heiner Kallweit
0c28a63a47 r8169: move napi_disable call and rename rtl8169_hw_reset
rtl8169_hw_reset() meanwhile does more than a hw reset, therefore rename
it to rtl8169_cleanup(). In addition move calling napi_disable() to this
function.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Heiner Kallweit
7190aeece9 r8169: replace synchronize_rcu with synchronize_net
rtl8169_hw_reset() may be called under RTNL lock, therefore switch to
synchronize_net().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Heiner Kallweit
e9882208ae r8169: improve setting WoL on runtime-resume
In the following scenario WoL isn't configured properly:
- Driver is loaded, interface isn't brought up within 10s, so driver
  runtime-suspends.
- WoL is set.
- Interface is brought up, stored WoL setting isn't applied.

It has always been like that, but the scenario seems to be quite
theoretical as I haven't seen any bug report yet. Therefore treat
the change as an improvement.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Heiner Kallweit
27248d57c8 r8169: remove unused constant RsvdMask
Since 9d3679fe0f ("r8169: inline rtl8169_make_unusable_by_asic")
this constant isn't used any longer, so remove it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Heiner Kallweit
a38b7fbfea r8169: add info for DASH being enabled
In case of problems it facilitates the bug analysis if we know whether
DASH is active. Therefore emit a message in probe if this is the case.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:38:36 -07:00
Gustavo A. R. Silva
1260e772dd enetc: Use struct_size() helper in kzalloc()
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
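
Illustrative sketch (not taken from the patch): a user-space approximation of what a struct_size()-style helper computes for a struct with a trailing flexible array. The struct and function names are hypothetical, and the overflow checks assume the GCC/Clang `__builtin_*_overflow` builtins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct msg {
    size_t count;
    uint32_t items[];   /* flexible array member */
};

/* Bytes needed for a struct msg with n items, or SIZE_MAX on overflow. */
static size_t msg_size(size_t n)
{
    size_t bytes;

    if (__builtin_mul_overflow(n, sizeof(uint32_t), &bytes) ||
        __builtin_add_overflow(bytes, sizeof(struct msg), &bytes))
        return SIZE_MAX;
    return bytes;
}

/* Zeroed allocation sized by msg_size(); NULL if the size overflowed. */
static struct msg *msg_alloc(size_t n)
{
    size_t bytes = msg_size(n);
    struct msg *m;

    if (bytes == SIZE_MAX)
        return NULL;
    m = calloc(1, bytes);
    if (m)
        m->count = n;
    return m;
}
```

The open-coded form, `sizeof(*m) + n * sizeof(uint32_t)`, silently wraps for large n; the helper turns that case into an allocation failure instead.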

This code was detected with the help of Coccinelle, and audited and
fixed manually.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:36:07 -07:00
David Christensen
3a2656a211 tg3: driver sleeps indefinitely when EEH errors exceed eeh_max_freezes
The driver function tg3_io_error_detected() calls napi_disable twice,
without an intervening napi_enable, when the number of EEH errors exceeds
eeh_max_freezes, resulting in an indefinite sleep while holding rtnl_lock.

Add check for pcierr_recovery which skips code already executed for the
"Frozen" state.

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:35:04 -07:00
Gustavo A. R. Silva
4e638025f2 net: stmmac: selftests: Use struct_size() helper in kzalloc()
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

This code was detected with the help of Coccinelle, and audited and
fixed manually.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 20:19:20 -07:00
Sascha Hauer
1a642ca7f3 net: ethernet: mvneta: Add 2500BaseX support for SoCs without comphy
The older SoCs like Armada XP support a 2500BaseX mode in the datasheets
referred to as DR-SGMII (Double rated SGMII) or HS-SGMII (High Speed
SGMII). This is an upclocked 1000BaseX mode, thus
PHY_INTERFACE_MODE_2500BASEX is the appropriate mode define for it.
Adding support for it merely means writing the correct magic value into
the MVNETA_SERDES_CFG register.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 19:59:49 -07:00
Sascha Hauer
b4748553f5 net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy
The MVNETA_SERDES_CFG register is only available on older SoCs like the
Armada XP. On newer SoCs like the Armada 38x the fields are moved to
comphy. This patch moves the writes to this register next to the comphy
initialization, so that depending on the SoC either comphy or
MVNETA_SERDES_CFG is configured.
With this we no longer write to the MVNETA_SERDES_CFG on SoCs where it
doesn't exist.

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-18 19:59:49 -07:00
Shannon Nelson
ef7232da6b ionic: export features for vlans to use
Set up vlan_features for use by any vlans above us.

Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-17 15:11:29 -07:00
Shannon Nelson
3103b6feb4 ionic: no link check while resetting queues
If the driver is busy resetting queues after a change in
MTU or queue parameters, don't bother checking the link,
wait until the next watchdog cycle.

Fixes: 987c0871e8 ("ionic: check for linkup in watchdog")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-17 15:07:01 -07:00
Gustavo A. R. Silva
682591f7a6 liquidio: Replace vmalloc_node + memset with vzalloc_node and use array_size
Use vzalloc/vzalloc_node instead of vmalloc/vmalloc_node and memset.

Also, notice that vzalloc_node() function has no 2-factor argument form
to calculate the size for the allocation, so multiplication factors need
to be wrapped in array_size().

This issue was found with the help of Coccinelle, and audited and fixed
manually.

Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-17 15:04:03 -07:00
David S. Miller
c9f66b43ee Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2020-06-16

This series contains fixes to e1000 and e1000e.

Chen fixes an e1000e issue where systems could be woken via WoL, even
though the user has disabled the wakeup bit via sysfs.

Vaibhav Gupta updates the e1000 driver to clean up the legacy Power
Management hooks.

Arnd Bergmann cleans up the inconsistent use of CONFIG_PM_SLEEP
preprocessor tags, which also resolves the compiler warnings about the
possibility of unused structure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-16 16:16:24 -07:00
Arnd Bergmann
880e6269fd e1000e: fix unused-function warning
The CONFIG_PM_SLEEP #ifdef checks in this file are inconsistent,
leading to a warning about sometimes unused function:

drivers/net/ethernet/intel/e1000e/netdev.c:137:13: error: unused function 'e1000e_check_me' [-Werror,-Wunused-function]

Rather than adding more #ifdefs, just remove them completely
and mark the PM functions as __maybe_unused to let the compiler
work it out on its own.
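
Illustrative sketch (not taken from the patch): the idea can be mimicked in user space with the GCC/Clang `unused` attribute. `MAYBE_UNUSED`, `DEMO_PM_SLEEP` and the `demo_*` names are hypothetical; the single remaining #ifdef stands in for the place where the callbacks are, or are not, referenced.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's __maybe_unused (assumption). */
#define MAYBE_UNUSED __attribute__((unused))

struct pm_ops {
    int (*suspend)(void);
    int (*resume)(void);
};

/* Always compiled; no warning even when nothing references them. */
static MAYBE_UNUSED int demo_suspend(void) { return 0; }
static MAYBE_UNUSED int demo_resume(void)  { return 0; }

#ifdef DEMO_PM_SLEEP   /* hypothetical config switch */
static const struct pm_ops demo_pm_ops = { demo_suspend, demo_resume };
#define DEMO_PM_OPS (&demo_pm_ops)
#else
#define DEMO_PM_OPS ((const struct pm_ops *)NULL)
#endif
```

Because the callbacks carry the attribute, they need no #ifdef of their own, and the compiler is free to drop them when the ops table is not built.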

Fixes: e086ba2fcc ("e1000e: disable s0ix entry and exit flows for ME systems")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-16 15:42:08 -07:00
Vaibhav Gupta
eb6779d4c5 e1000: use generic power management
With legacy PM hooks, it was the responsibility of a driver to manage PCI
states and also the device's power state. The generic approach is to let PCI
core handle the work.

e1000_suspend() calls __e1000_shutdown() to perform intermediate tasks.
__e1000_shutdown() modifies the value of "wake" (device should be wakeup
enabled or not), responsible for controlling the flow of legacy PM.

Since the PCI core has no idea about the value of "wake", new code for generic
PM may produce unexpected results. Thus, use "device_set_wakeup_enable()"
to wakeup-enable the device accordingly.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-16 15:38:11 -07:00
Chen Yu
6bf6be1127 e1000e: Do not wake up the system via WOL if device wakeup is disabled
Currently the system will be woken up via WOL(Wake On LAN) even if the
device wakeup ability has been disabled via sysfs:
 cat /sys/devices/pci0000:00/0000:00:1f.6/power/wakeup
 disabled

The system should not be woken up if the user has explicitly
disabled the wake up ability for this device.

This patch clears the WOL ability of this network device if the
user has disabled the wake up ability in sysfs.

Fixes: bc7f75fa97 ("[E1000E]: New pci-express e1000 driver")
Reported-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-16 15:35:48 -07:00
Tim Harvey
ea12fe9dee lan743x: add MODULE_DEVICE_TABLE for module loading alias
Without a MODULE_DEVICE_TABLE the attributes are missing that create
an alias for auto-loading the module in userspace via hotplug.

Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-16 14:01:14 -07:00
Ido Schimmel
60833d54d5 mlxsw: spectrum: Adjust headroom buffers for 8x ports
The port's headroom buffers are used to store packets while they
traverse the device's pipeline and also to store packets that are egress
mirrored.

On Spectrum-3, ports with eight lanes use two headroom buffers between
which the configured headroom size is split.

In order to prevent packet loss, multiply the calculated headroom size
by two for 8x ports.

Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-16 13:46:27 -07:00
Sven Auhagen
807eaf9968 mvpp2: remove module bugfix
The remove function does not destroy all
BM Pools when per cpu pool is active.

When reloading the mvpp2 as a module the BM Pools
are still active in hardware and due to the bug
have twice the size now (old + new).

This eventually leads to a kernel crash.

v2:
* add Fixes tag

Fixes: 7d04b0b13b ("mvpp2: percpu buffers")
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-16 13:41:16 -07:00
Sven Auhagen
cc970925fe mvpp2: ethtool rxtx stats fix
The ethtool rx and tx queue statistics are reporting wrong values.
Fix reading out the correct ones.

Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:43:49 -07:00
Aditya Pakki
58d0c864e1 rocker: fix incorrect error handling in dma_rings_init
In rocker_dma_rings_init, the goto blocks in case of errors
caused by the functions rocker_dma_cmd_ring_waits_alloc() and
rocker_dma_ring_create() are incorrect. The patch fixes the
order to be consistent with cleanup in rocker_dma_rings_fini().
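
Illustrative sketch (not taken from the patch): the usual goto-based unwinding pattern, where each error label releases only what was acquired before the failing step, in reverse order, matching the fini routine. All names and the bitmask bookkeeping are hypothetical.

```c
#include <assert.h>

/* Hypothetical bookkeeping: one bit per resource currently held. */
static unsigned int held;

static int step_acquire(unsigned int bit, int fail)
{
    if (fail)
        return -1;
    held |= bit;
    return 0;
}

static void step_release(unsigned int bit)
{
    assert(held & bit);   /* releasing something never acquired is a bug */
    held &= ~bit;
}

/* fail_at selects which step (1..3) should fail; 0 means success. */
static int rings_init(int fail_at)
{
    int err;

    err = step_acquire(1u << 0, fail_at == 1);
    if (err)
        return err;
    err = step_acquire(1u << 1, fail_at == 2);
    if (err)
        goto err_release_first;
    err = step_acquire(1u << 2, fail_at == 3);
    if (err)
        goto err_release_second;
    return 0;

err_release_second:
    step_release(1u << 1);
err_release_first:
    step_release(1u << 0);
    return err;
}

/* Full teardown: the exact reverse of the init order. */
static void rings_fini(void)
{
    step_release(1u << 2);
    step_release(1u << 1);
    step_release(1u << 0);
}
```

A label that jumps to the wrong rung either leaks an earlier resource or releases one that was never acquired, which the assert in `step_release()` would catch here.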

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:37:36 -07:00
Bartosz Golaszewski
adaff6d906 net: ethernet: mtk-star-emac: simplify interrupt handling
During development we tried to make the interrupt handling as fine-grained
as possible with TX and RX interrupts being disabled/enabled independently
and the counter registers reset from workqueue context.

Unfortunately after thorough testing of current mainline, we noticed the
driver has become unstable under heavy load. While this is hard to
reproduce, it's quite consistent in the driver's current form.

This patch proposes to go back to the previous approach of doing all
processing in napi context with all interrupts masked in order to make the
driver usable in mainline linux. This doesn't impact the performance on
pumpkin boards at all and it's in line with what many ethernet drivers do
in mainline linux anyway.

At the same time we're adding a FIXME comment about the need to improve
the interrupt handling.

Fixes: 8c7bd5a454 ("net: ethernet: mtk-star-emac: new driver")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:30:58 -07:00
Vasundhara Volam
e000940473 bnxt_en: Return from timer if interface is not in open state.
This will avoid many unnecessary error logs when driver or firmware is
in reset.

Fixes: 230d1f0de7 ("bnxt_en: Handle firmware reset.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:28:33 -07:00
Michael Chan
6e2f83884c bnxt_en: Fix AER reset logic on 57500 chips.
AER reset should follow the same steps as suspend/resume.  We need to
free context memory during AER reset and allocate new context memory
during recovery by calling bnxt_hwrm_func_qcaps().  We also need
to call bnxt_reenable_sriov() to restore the VFs.

Fixes: bae361c54f ("bnxt_en: Improve AER slot reset.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:28:33 -07:00
Michael Chan
59ae210173 bnxt_en: Re-enable SRIOV during resume.
If VFs are enabled, we need to re-configure them during resume because
firmware has been reset while resuming.  Otherwise, the VFs won't
work after resume.

Fixes: c16d4ee0e3 ("bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:28:33 -07:00
Michael Chan
2084ccf625 bnxt_en: Simplify bnxt_resume().
The separate steps we do in bnxt_resume() can be done more simply by
calling bnxt_hwrm_func_qcaps().  This change will add an extra
__bnxt_hwrm_func_qcaps() call which is needed anyway on older
firmware.

Fixes: f9b69d7f62 ("bnxt_en: Fix suspend/resume path on 57500 chips")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:28:33 -07:00
Zekun Shen
e89df5c432 net: alx: fix race condition in alx_remove
A race condition exists during termination. The path is
alx_stop and then alx_remove. An alx_schedule_link_check could be called
before alx_stop by the interrupt handler and invoke alx_link_check later.
alx_stop frees the napis, and alx_remove cancels any pending work.
If any of the work is scheduled before termination and invoked before
alx_remove, a null-ptr-deref occurs because both expect alx->napis[i].

This patch fixes the race condition by moving the cancel_work_sync calls
before alx_free_napis inside alx_stop. Because the interrupt handler can call
alx_schedule_link_check again, alx_free_irq is moved before the
cancel_work_sync calls too.

Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:20:14 -07:00
Thomas Falcon
dff515a3e7 ibmvnic: Harden device login requests
The VNIC driver's "login" command sequence is the final step
in the driver's initialization process with device firmware,
confirming the available device queue resources to be utilized
by the driver. Under high system load, firmware may not respond
to the request in a timely manner or may abort the request. In
such cases, the driver should reattempt the login command
sequence. In case of a device error, the number of retries
is bounded.
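
Illustrative sketch (not taken from the patch): a bounded retry loop around a login attempt. The result codes, the retry limit and the fake firmware callback are all hypothetical, and this sketch bounds every kind of failure rather than modelling the driver's exact policy.

```c
#include <assert.h>
#include <stddef.h>

enum login_result { LOGIN_OK, LOGIN_TIMEOUT, LOGIN_ABORTED, LOGIN_DEVICE_ERROR };

#define MAX_LOGIN_RETRIES 10   /* hypothetical bound */

typedef enum login_result (*login_fn)(void *ctx);

/* Returns 0 on success, -1 once the retry budget is exhausted. */
static int login_with_retries(login_fn try_login, void *ctx)
{
    int attempt;

    assert(try_login != NULL);
    /* One initial attempt plus up to MAX_LOGIN_RETRIES retries. */
    for (attempt = 0; attempt <= MAX_LOGIN_RETRIES; attempt++) {
        if (try_login(ctx) == LOGIN_OK)
            return 0;
    }
    return -1;
}

/* Fake firmware for the demo: times out *ctx times, then succeeds. */
static enum login_result flaky_login(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return LOGIN_TIMEOUT;
    }
    return LOGIN_OK;
}
```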

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:18:13 -07:00
Charles Keepax
939a5bf7c9 net: macb: Only disable NAPI on the actual error path
A recent change added a NAPI disable to macb_open. This was
intended to happen only on the error path but accidentally applies
to all paths. This causes NAPI to be disabled on the success path, which
leads to the network no longer functioning.
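
Illustrative sketch (not taken from the patch): the success path has to return before the cleanup label, otherwise it falls through into the error handling. The function and variable names are hypothetical stand-ins.

```c
#include <assert.h>

static int napi_enabled;   /* stand-in for the per-queue NAPI state */

/* setup_fails simulates a failure in the step after NAPI is enabled. */
static int demo_open(int setup_fails)
{
    int err;

    napi_enabled = 1;
    err = setup_fails ? -1 : 0;
    if (err)
        goto napi_exit;

    return 0;   /* success: must not fall through to the label below */

napi_exit:
    napi_enabled = 0;
    return err;
}
```

Without the early `return 0`, every call would end with `napi_enabled == 0`, which corresponds to the failure described above.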

Fixes: 014406babc ("net: cadence: macb: disable NAPI on error")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:10:34 -07:00
Wang Qing
0acb47a3a0 qlcnic: Use kobj_to_dev() instead
Use kobj_to_dev() instead of container_of()

Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:05:50 -07:00
Colin Ian King
35ed87add7 net: axienet: fix spelling mistake in comment "Exteneded" -> "extended"
There is a spelling mistake in a comment. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 13:02:03 -07:00
Linus Torvalds
96144c58ab Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix cfg80211 deadlock, from Johannes Berg.

 2) RXRPC fails to send notifications, from David Howells.

 3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from
    Geliang Tang.

 4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu.

 5) The ucc_geth driver needs __netdev_watchdog_up exported, from
    Valentin Longchamp.

 6) Fix hashtable memory leak in dccp, from Wang Hai.

 7) Fix how nexthops are marked as FDB nexthops, from David Ahern.

 8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni.

 9) Fix crashes in tipc_disc_rcv(), from Tuong Lien.

10) Fix link speed reporting in iavf driver, from Brett Creeley.

11) When a channel is used for XSK and then reused again later for XSK,
    we forget to clear out the relevant data structures in mlx5 which
    causes all kinds of problems. Fix from Maxim Mikityanskiy.

12) Fix memory leak in genetlink, from Cong Wang.

13) Disallow sockmap attachments to UDP sockets, it simply won't work.
    From Lorenz Bauer.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
  net: ethernet: ti: ale: fix allmulti for nu type ale
  net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
  net: atm: Remove the error message according to the atomic context
  bpf: Undo internal BPF_PROBE_MEM in BPF insns dump
  libbpf: Support pre-initializing .bss global variables
  tools/bpftool: Fix skeleton codegen
  bpf: Fix memlock accounting for sock_hash
  bpf: sockmap: Don't attach programs to UDP sockets
  bpf: tcp: Recv() should return 0 when the peer socket is closed
  ibmvnic: Flush existing work items before device removal
  genetlink: clean up family attributes allocations
  net: ipa: header pad field only valid for AP->modem endpoint
  net: ipa: program upper nibbles of sequencer type
  net: ipa: fix modem LAN RX endpoint id
  net: ipa: program metadata mask differently
  ionic: add pcie_print_link_status
  rxrpc: Fix race between incoming ACK parser and retransmitter
  net/mlx5: E-Switch, Fix some error pointer dereferences
  net/mlx5: Don't fail driver on failure to create debugfs
  net/mlx5e: CT: Fix ipv6 nat header rewrite actions
  ...
2020-06-13 16:27:13 -07:00
Grygorii Strashko
bc139119a1 net: ethernet: ti: ale: fix allmulti for nu type ale
On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS allmulti setting does not allow
unregistered mcast packets to pass.

This happens, because ALE VLAN entries on these SoCs do not contain port
masks for reg/unreg mcast packets, but instead store indexes of
ALE_VLAN_MASK_MUXx_REG registers which are intended to store port masks for
reg/unreg mcast packets.
This path was missed by commit 9d1f644727 ("net: ethernet: ti: ale: fix
seeing unreg mcast packets with promisc and allmulti disabled").

Hence, fix it by taking into account ALE type in cpsw_ale_set_allmulti().

Fixes: 9d1f644727 ("net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-13 15:37:17 -07:00
Grygorii Strashko
2074f9eaa5 net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
The ALE parameters structure is created on stack, so it has to be reset
before passing to cpsw_ale_create() to avoid garbage values.
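
Illustrative sketch (not taken from the patch): a struct declared on the stack holds indeterminate values until it is initialised, so zeroing it before setting individual fields keeps the unset members well defined. The struct layout and names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_params {          /* hypothetical subset of fields */
    unsigned int ports;
    unsigned int ageout;
    int flags;
    const char *dev_id;
};

/* Builds a parameter block on the stack and hands back a copy. */
static struct demo_params make_params(void)
{
    struct demo_params p = { 0 };   /* every member starts at zero / NULL */

    p.ports = 2;                    /* only the fields we care about are set */
    return p;
}
```

An equivalent form for an already-declared variable is `memset(&p, 0, sizeof(p))` from `<string.h>`.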

Fixes: 93a7653031 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-13 15:35:08 -07:00
Masahiro Yamada
a7f7f6248d treewide: replace '---help---' in Kconfig files with 'help'
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following command:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-14 01:57:21 +09:00
Thomas Falcon
6954a9e419 ibmvnic: Flush existing work items before device removal
Ensure that all scheduled work items have completed before continuing
with device removal and after further event scheduling has been
halted. This patch fixes a bug where a scheduled driver reset event
is processed following device removal.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-12 14:11:02 -07:00
Shannon Nelson
c25cba3689 ionic: add pcie_print_link_status
Print the PCIe link information for our device.

Fixes: 77f972a707 ("ionic: remove support for mgmt device")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-11 18:26:19 -07:00
David S. Miller
3049f0fd3b Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2020-06-11

This series contains fixes to the iavf driver.

Brett fixes the supported link speeds in the iavf driver, which was only
able to report speeds that the i40e driver supported and was missing the
speeds supported by the ice driver.  In addition, fix how 2.5 and 5.0
GbE speeds are reported.

Alek fixes an enum comparison that was comparing two different enums that
may have different values, so update the comparison to use matching
enums.

Paul increases the time to complete a reset to allow for 128 VFs to
complete a reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-11 18:25:20 -07:00
Dan Carpenter
09a9297574 net/mlx5: E-Switch, Fix some error pointer dereferences
We can't leave "counter" set to an error pointer.  Otherwise either it
will lead to an error pointer dereference later in the function or it
leads to an error pointer dereference when we call mlx5_fc_destroy().
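
Illustrative sketch (not taken from the patch): user-space approximations of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers, and a setup routine that resets the field to NULL on failure so later code never sees an error pointer. The `demo_*` names and structs are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel helpers (assumption). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct demo_counter { int id; };
struct demo_acl { struct demo_counter *counter; };

static struct demo_counter the_counter = { 1 };

/* Returns a valid pointer, or an encoded errno on failure. */
static struct demo_counter *demo_counter_create(int fail)
{
    return fail ? ERR_PTR(-ENOMEM) : &the_counter;
}

/* On failure the field is reset to NULL before returning the error. */
static int demo_acl_setup(struct demo_acl *acl, int fail)
{
    assert(acl != NULL);
    acl->counter = demo_counter_create(fail);
    if (IS_ERR(acl->counter)) {
        int err = (int)PTR_ERR(acl->counter);

        acl->counter = NULL;
        return err;
    }
    return 0;
}
```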

Fixes: 07bab95026 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:38:08 -07:00
Leon Romanovsky
17e73d47cd net/mlx5: Don't fail driver on failure to create debugfs
Clang warns:

drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:6: warning: variable
'err' is used uninitialized whenever 'if' condition is true
[-Wsometimes-uninitialized]
        if (!priv->dbg_root) {
            ^~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1303:9: note:
uninitialized use occurs here
        return err;
               ^~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:2: note: remove the
'if' if its condition is always false
        if (!priv->dbg_root) {
        ^~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1259:9: note: initialize
the variable 'err' to silence this warning
        int err;
               ^
                = 0
1 warning generated.

The check of returned value of debugfs_create_dir() is wrong because
by the design debugfs failures should never fail the driver and the
check itself was wrong too. The kernel compiled without CONFIG_DEBUG_FS
will return ERR_PTR(-ENODEV) and not NULL as expected.

Fixes: 11f3b84d70 ("net/mlx5: Split mdev init and pci init")
Link: https://github.com/ClangBuiltLinux/linux/issues/1042
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:38:06 -07:00
Oz Shlomo
0d156f2ded net/mlx5e: CT: Fix ipv6 nat header rewrite actions
Set the ipv6 word fields according to the hardware definitions.

Fixes: ac991b48d4 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:38:04 -07:00
Parav Pandit
98f91c4576 net/mlx5: Fix devlink objects and devlink device unregister sequence
Currently, the following problems exist.

1. devlink device is registered by mlx5_load_one(). But it is
not unregistered by mlx5_unload_one(). This is incorrect.

2. Above issue leads to,
When mlx5 PCI device is removed, currently devlink device is
unregistered before devlink ports are unregistered in below ladder
diagram.

remove_one()
  mlx5_devlink_unregister()
    [..]
    devlink_unregister() <- ports are still registered!
  mlx5_unload_one()
    mlx5_unregister_device()
      mlx5_remove_device()
        mlx5e_remove()
          mlx5e_devlink_port_unregister()
            devlink_port_unregister()

3. The conditions checked when registering and unregistering the device
are not symmetric in these routines either.

Hence, fix the sequence by making the load and unload routines symmetric
and correctly ordered,
i.e.
(a) register devlink device followed by registering devlink ports
(b) unregister devlink ports followed by devlink device

Do this based on boot and cleanup flags instead of different
conditions.
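
The ordering in (a) and (b) can be sketched as follows. All function names here are hypothetical stand-ins that only record the call order; they are not the mlx5 or devlink functions.

```c
#include <string.h>

/* Records the order of (un)registration calls so it can be inspected. */
static char trace[128];

static void note(const char *step)
{
	strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

/* Hypothetical stand-ins for the devlink device and port calls. */
static void device_register(void)   { note("dev+"); }
static void ports_register(void)    { note("ports+"); }
static void ports_unregister(void)  { note("ports-"); }
static void device_unregister(void) { note("dev-"); }

/* (a) register the device first, then its ports. */
static void load_one(void)
{
	device_register();
	ports_register();
}

/* (b) exact mirror image: ports first, then the device. */
static void unload_one(void)
{
	ports_unregister();
	device_unregister();
}
```

Calling load_one() and then unload_one() leaves "dev+;ports+;ports-;dev-;" in trace, so the device is never unregistered while its ports are still registered.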

Fixes: c6acd629ee ("net/mlx5e: Add support for devlink-port in non-representors mode")
Fixes: f60f315d33 ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:38:02 -07:00
Parav Pandit
60904cd349 net/mlx5: Disable reload while removing the device
While unregistration is in progress, the user might be reloading the
interface.
The reload can race with unregistration in the flow below, where
unregistration uses resources that the reload flow is disabling.

Hence, disable the devlink reloading first when removing the device.

     CPU0                                   CPU1
     ----                                   ----
local_pci_remove()                  devlink_mutex
  remove_one()                       devlink_nl_cmd_reload()
    mlx5_unregister_device()           devlink_reload()
                                       ops->reload_down()
                                         mlx5_unload_one()

Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:38:00 -07:00
Aya Levin
5f1572e617 net/mlx5e: Fix ethtool hfunc configuration change
Changing RX hash function requires rearranging of RQT internal indexes,
the user isn't exposed to such changes and these changes do not affect
the user configured indirection table. Rebuild RQ table on hfunc change.

Fixes: bdfc028de1 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:37:57 -07:00
Maxim Mikityanskiy
36d45fb9d2 net/mlx5e: Fix repeated XSK usage on one channel
After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened a second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.

This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.
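
A minimal sketch of the idea, assuming a hypothetical per-channel state struct (the real mlx5e structures and field names differ):

```c
#include <string.h>

/* Hypothetical per-channel XSK state; the real mlx5e structs differ. */
struct xsk_state {
	int rq_head;
	int cq_errors;
	int active;
};

struct channel {
	struct xsk_state xsk;
};

static void xsk_open(struct channel *c)
{
	/* Relies on the state being zeroed; only marks the socket active. */
	c->xsk.active = 1;
}

static void xsk_use(struct channel *c)
{
	/* Arbitrary activity that leaves non-zero values behind. */
	c->xsk.rq_head += 7;
}

static void xsk_close(struct channel *c)
{
	/* Explicitly zero the state so nothing stale survives to the next open. */
	memset(&c->xsk, 0, sizeof(c->xsk));
}
```

Without the memset() in xsk_close(), a second xsk_open() on the same channel would start with the rq_head left over from the first session.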

Fixes: db05815b36 ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:37:55 -07:00
Denis Efremov
47a357de2b net/mlx5: DR, Fix freeing in dr_create_rc_qp()
Variable "in" in dr_create_rc_qp() is allocated with kvzalloc() and
should be freed with kvfree().

Fixes: 297cccebdc ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:37:53 -07:00
Shay Drory
b6e0b6bebe net/mlx5: Fix fatal error handling during device load
Currently, if a fatal error occurs during mlx5_load_one(), we cannot
enter the error state until mlx5_load_one() has finished. This can take
several minutes, because the commands it issues can't be processed due
to the fatal error and each one has to time out.
Fix it by setting dev->state to MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.

Fixes: c1d4d2e92a ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:37:51 -07:00
Shay Drory
42ea9f1b5c net/mlx5: drain health workqueue in case of driver load error
If there is work in the health WQ when we tear down the driver in the
driver load error flow, the health work will try to read dev->iseg,
which was already unmapped in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().

Trace of the error:
BUG: unable to handle page fault for address: ffffb5b141c18014
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 1fe95d067 P4D 1fe95d067 PUD 1fe95e067 PMD 1b7823067 PTE 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 6755 Comm: kworker/u128:2 Not tainted 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? mlx5_health_try_recover+0x4d/0x270 [mlx5_core]
 mlx5_fw_fatal_reporter_recover+0x16/0x20 [mlx5_core]
 devlink_health_reporter_recover+0x1c/0x50
 devlink_health_report+0xfb/0x240
 mlx5_fw_fatal_reporter_err_work+0x65/0xd0 [mlx5_core]
 process_one_work+0x1fb/0x4e0
 ? process_one_work+0x16b/0x4e0
 worker_thread+0x4f/0x3d0
 kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0
 ? kthread_cancel_delayed_work_sync+0x20/0x20
 ret_from_fork+0x1f/0x30
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache 8021q garp mrp stp llc ipmi_devintf ipmi_msghandler rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core mlx5_core sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 mlxfw crypto_simd cryptd glue_helper input_leds hyperv_fb intel_rapl_perf joydev serio_raw pci_hyperv pci_hyperv_mini mac_hid hv_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 hv_utils hid_generic hv_storvsc ptp hid_hyperv hid hv_netvsc hyperv_keyboard pps_core scsi_transport_fc psmouse hv_vmbus i2c_piix4 floppy pata_acpi
CR2: ffffb5b141c18014
---[ end trace b12c5503157cad24 ]---
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38
in_atomic(): 0, irqs_disabled(): 1, pid: 6755, name: kworker/u128:2
INFO: lockdep is turned off.
CPU: 3 PID: 6755 Comm: kworker/u128:2 Tainted: G      D           5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
 dump_stack+0x63/0x88
 ___might_sleep+0x10a/0x130
 __might_sleep+0x4a/0x80
 exit_signals+0x33/0x230
 ? blocking_notifier_call_chain+0x16/0x20
 do_exit+0xb1/0xc30
 ? kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0

Fixes: 52c368dc3d ("net/mlx5: Move health and page alloc init to mdev_init")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:37:48 -07:00
Paul Greenwalt
8e3e4b9da7 iavf: increase reset complete wait time
With an increased number of VFs, it's possible to encounter the following
issue during reset.

    iavf b8d4:00:02.0: Hardware reset detected
    iavf b8d4:00:02.0: Reset never finished (0)
    iavf b8d4:00:02.0: Reset task did not complete, VF disabled

Increase the reset complete wait count to allow for 128 VFs to complete
reset.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-11 15:13:16 -07:00
Brett Creeley
18c012d922 iavf: Fix reporting 2.5 Gb and 5Gb speeds
Commit 4ae4916b56 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb
speeds") added the ability for the PF to report 2.5 and 5Gb speeds,
however, the iavf driver does not recognize those speeds as the values were
not added there. Add the proper enums and values so that iavf can properly
deal with those speeds.

Fixes: 4ae4916b56 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb speeds")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-11 15:11:42 -07:00
Aleksandr Loktionov
5071bda294 iavf: use appropriate enum for comparison
adapter->link_speed has type enum virtchnl_link_speed but our comparisons
are against enum iavf_aq_link_speed. Though they are, currently, the same
values, change the comparison to the matching enum virtchnl_link_speed
since that may not always be the case.

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-11 14:53:50 -07:00
Brett Creeley
e0ef26fbe2 iavf: fix speed reporting over virtchnl
Link speeds are communicated over virtchnl using an enum
virtchnl_link_speed. Currently, the highest link speed is 40Gbps which
leaves us unable to reflect some speeds that an ice VF is capable of.
This causes link speed to be misreported on the iavf driver.

Allow for communicating link speeds using Mbps so that the proper speed can
be reported for an ice VF. Moving away from the enum allows us to
communicate future speed changes without requiring a new enum to be added.

In order to support communicating link speeds over virtchnl in Mbps the
following functionality was added:
    - Added u32 link_speed_mbps in the iavf_adapter structure.
    - Added the macro ADV_LINK_SUPPORT(_a) to determine if the VF
      driver supports communicating link speeds in Mbps.
    - Added the function iavf_get_vpe_link_status() to fill the
      correct link_status in the event_data union based on the
      ADV_LINK_SUPPORT(_a) macro.
    - Added the function iavf_set_adapter_link_speed_from_vpe()
      to determine whether or not to fill the u32 link_speed_mbps or
      enum virtchnl_link_speed link_speed field in the iavf_adapter
      structure based on the ADV_LINK_SUPPORT(_a) macro.
    - Do not free vf_res in iavf_init_get_resources() as vf_res will be
      accessed in iavf_get_link_ksettings(); memset to 0 instead. This
      memory is subsequently freed in iavf_remove().
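
The selection between the two fields can be sketched as below. The enum, struct, and function names are hypothetical and trimmed down for illustration; the real virtchnl values and the iavf structures are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down enum; not the real virtchnl values. */
enum link_speed_enum {
	SPEED_ENUM_UNKNOWN = 0,
	SPEED_ENUM_10GB,
	SPEED_ENUM_40GB, /* highest speed this enum can express */
};

struct adapter {
	bool adv_link_support;           /* negotiated capability */
	enum link_speed_enum link_speed; /* legacy field */
	uint32_t link_speed_mbps;        /* new field, in Mbps */
};

struct link_event {
	enum link_speed_enum speed_enum;
	uint32_t speed_mbps;
};

/* Fill whichever field matches the negotiated capability. */
static void set_link_speed_from_event(struct adapter *a,
				      const struct link_event *ev)
{
	if (a->adv_link_support)
		a->link_speed_mbps = ev->speed_mbps;
	else
		a->link_speed = ev->speed_enum;
}

/* Report the speed in Mbps regardless of how it was communicated. */
static uint32_t reported_mbps(const struct adapter *a)
{
	if (a->adv_link_support)
		return a->link_speed_mbps;

	switch (a->link_speed) {
	case SPEED_ENUM_10GB:
		return 10000;
	case SPEED_ENUM_40GB:
		return 40000;
	default:
		return 0;
	}
}
```

In this model a 100000 Mbps link can only be reported when adv_link_support is set, because the enum has no value above 40 Gbps.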

Fixes: 7c710869d6 ("ice: Add handlers for VF netdevice operations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-06-11 14:53:50 -07:00
Shannon Nelson
77f972a707 ionic: remove support for mgmt device
We no longer support the mgmt device in the ionic driver,
so remove the device id and related code.

Fixes: b3f064e974 ("ionic: add support for device id 0x1004")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-11 12:43:29 -07:00
Xu Wang
9334d5ba32 drivers: dpaa2: Use devm_kcalloc() in setup_dpni()
The allocation size is computed with a multiplication, which indicates
that an array is being allocated.
Thus use the corresponding function "devm_kcalloc".
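
One benefit of an array-style allocator is that the count-times-size multiplication can be checked for overflow in one place. A minimal userspace sketch of that check, using a hypothetical helper built on calloc() rather than the kernel API:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate n zeroed elements of the given size,
 * refusing requests whose total size would overflow size_t.
 */
static void *checked_array_alloc(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL; /* n * size would wrap around */
	return calloc(n, size);
}
```

An open-coded malloc(n * size) would silently allocate a too-small buffer in the overflow case; the helper returns NULL instead.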

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-11 12:41:12 -07:00
Corentin Labbe
014406babc net: cadence: macb: disable NAPI on error
When the PHY is not working, the macb driver crashes on a second
attempt to set it up.
[   78.545994] macb e000b000.ethernet eth0: Could not attach PHY (-19)
ifconfig: SIOCSIFFLAGS: No such device
[   78.655457] ------------[ cut here ]------------
[   78.656014] kernel BUG at /linux-next/include/linux/netdevice.h:521!
[   78.656504] Internal error: Oops - BUG: 0 [#1] SMP ARM
[   78.657079] Modules linked in:
[   78.657795] CPU: 0 PID: 122 Comm: ifconfig Not tainted 5.7.0-next-20200609 #1
[   78.658202] Hardware name: Xilinx Zynq Platform
[   78.659632] PC is at macb_open+0x220/0x294
[   78.660160] LR is at 0x0
[   78.660373] pc : [<c0b0a634>]    lr : [<00000000>]    psr: 60000013
[   78.660716] sp : c89ffd70  ip : c8a28800  fp : c199bac0
[   78.661040] r10: 00000000  r9 : c8838540  r8 : c8838568
[   78.661362] r7 : 00000001  r6 : c8838000  r5 : c883c000  r4 : 00000000
[   78.661724] r3 : 00000010  r2 : 00000000  r1 : 00000000  r0 : 00000000
[   78.662187] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   78.662635] Control: 10c5387d  Table: 08b64059  DAC: 00000051
[   78.663035] Process ifconfig (pid: 122, stack limit = 0x(ptrval))
[   78.663476] Stack: (0xc89ffd70 to 0xc8a00000)
[   78.664121] fd60:                                     00000000 c89fe000 c8838000 c89fe000
[   78.664866] fd80: 00000000 c11ff9ac c8838028 00000000 00000000 c0de6f2c 00000001 c1804eec
[   78.665579] fda0: c19b8178 c8838000 00000000 ca760866 c8838000 00000001 00001043 c89fe000
[   78.666355] fdc0: 00001002 c0de72f4 c89fe000 c0de8dc0 00008914 c89fe000 c199bac0 ca760866
[   78.667111] fde0: c89ffddc c8838000 00001002 00000000 c8838138 c881010c 00008914 c0de7364
[   78.667862] fe00: 00000000 c89ffe70 c89fe000 ffffffff c881010c c0e8bd48 00000003 00000000
[   78.668601] fe20: c8838000 c8810100 39c1118f 00039c11 c89a0960 00001043 00000000 000a26d0
[   78.669343] fe40: b6f43000 ca760866 c89a0960 00000051 befe6c50 00008914 c8b2a3c0 befe6c50
[   78.670086] fe60: 00000003 ee610500 00000000 c0e8ef58 30687465 00000000 00000000 00000000
[   78.670865] fe80: 00001043 00000000 000a26d0 b6f43000 c89a0600 ee40ae7c c8870d00 c0ddabf4
[   78.671593] fea0: c89ffeec c0ddabf4 c89ffeec c199bac0 00008913 c0ddac48 c89ffeec c89fe000
[   78.672324] fec0: befe6c50 ca760866 befe6c50 00008914 c89fe000 befe6c50 c8b2a3c0 c0dc00e4
[   78.673088] fee0: c89a0480 00000201 00000cc0 30687465 00000000 00000000 00000000 00001002
[   78.673822] ff00: 00000000 000a26d0 b6f43000 ca760866 00008914 c8b2a3c0 000a0ec4 c8b2a3c0
[   78.674576] ff20: befe6c50 c04b21bc 000d5004 00000817 c89a0480 c0315f94 00000000 00000003
[   78.675415] ff40: c19a2bc8 c8a3cc00 c89fe000 00000255 00000000 00000000 00000000 000d5000
[   78.676182] ff60: 000f6000 c180b2a0 00000817 c0315e64 000d5004 c89fffb0 b6ec0c30 ca760866
[   78.676928] ff80: 00000000 000b609b befe6c50 000a0ec4 00000036 c03002c4 c89fe000 00000036
[   78.677673] ffa0: 00000000 c03000c0 000b609b befe6c50 00000003 00008914 befe6c50 000b609b
[   78.678415] ffc0: 000b609b befe6c50 000a0ec4 00000036 befe6e0c befe6f1a 000d5150 00000000
[   78.679154] ffe0: 000d41e4 befe6bf4 00019648 b6e4509c 20000010 00000003 00000000 00000000
[   78.681059] [<c0b0a634>] (macb_open) from [<c0de6f2c>] (__dev_open+0xd0/0x154)
[   78.681571] [<c0de6f2c>] (__dev_open) from [<c0de72f4>] (__dev_change_flags+0x16c/0x1c4)
[   78.682015] [<c0de72f4>] (__dev_change_flags) from [<c0de7364>] (dev_change_flags+0x18/0x48)
[   78.682493] [<c0de7364>] (dev_change_flags) from [<c0e8bd48>] (devinet_ioctl+0x5e4/0x75c)
[   78.682945] [<c0e8bd48>] (devinet_ioctl) from [<c0e8ef58>] (inet_ioctl+0x1f0/0x3b4)
[   78.683381] [<c0e8ef58>] (inet_ioctl) from [<c0dc00e4>] (sock_ioctl+0x39c/0x664)
[   78.683818] [<c0dc00e4>] (sock_ioctl) from [<c04b21bc>] (ksys_ioctl+0x2d8/0x9c0)
[   78.684343] [<c04b21bc>] (ksys_ioctl) from [<c03000c0>] (ret_fast_syscall+0x0/0x54)
[   78.684789] Exception stack(0xc89fffa8 to 0xc89ffff0)
[   78.685346] ffa0:                   000b609b befe6c50 00000003 00008914 befe6c50 000b609b
[   78.686106] ffc0: 000b609b befe6c50 000a0ec4 00000036 befe6e0c befe6f1a 000d5150 00000000
[   78.686710] ffe0: 000d41e4 befe6bf4 00019648 b6e4509c
[   78.687582] Code: 9a000003 e5983078 e3130001 1affffef (e7f001f2)
[   78.688788] ---[ end trace e3f2f6ab69754eae ]---

This is due to NAPI being left enabled if macb_phylink_connect() fails.
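
The shape of such a fix can be sketched with a goto-based error path. Everything below is a hypothetical userspace model (struct dev and the *_stub functions are invented for illustration), not the macb code:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model: napi_enabled mirrors the polling state. */
struct dev {
	bool napi_enabled;
	bool phy_ok;
};

static void napi_enable_stub(struct dev *d)  { d->napi_enabled = true; }
static void napi_disable_stub(struct dev *d) { d->napi_enabled = false; }

/* Fails with -ENODEV (-19, as in the log above) when the PHY is absent. */
static int phy_connect_stub(const struct dev *d)
{
	return d->phy_ok ? 0 : -ENODEV;
}

static int dev_open(struct dev *d)
{
	int err;

	napi_enable_stub(d);

	err = phy_connect_stub(d);
	if (err)
		goto napi_exit;

	return 0;

napi_exit:
	/* Undo the enable so a later open starts from a clean state. */
	napi_disable_stub(d);
	return err;
}
```

With the napi_exit label in place, a failed dev_open() leaves napi_enabled false, so a second attempt does not try to enable something that is already enabled.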

Fixes: 7897b071ac ("net: macb: convert to phylink")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-10 16:20:07 -07:00
Lorenzo Bianconi
62a502cc91 net: mvneta: do not redirect frames during reconfiguration
Disable frame injection in the mvneta_xdp_xmit routine during hw
re-configuration in order to avoid hardware hangs.

Fixes: b0a43db908 ("net: mvneta: add XDP_TX support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-09 14:29:15 -07:00
Shannon Nelson
976ee3b211 ionic: wait on queue start until after IFF_UP
The netif_running() test looks at __LINK_STATE_START which
gets set before ndo_open() is called, so there is a window of
time between that and when the queues are actually ready to
be run.  If ionic_check_link_status() notices that the link is
up very soon after netif_running() becomes true, it might try
to run the queues before they are ready, causing all manner of
potential issues.  Since the netdev->flags IFF_UP isn't set
until after ndo_open() returns, we can wait for that before
we allow ionic_check_link_status() to start the queues.

On the way back to close, __LINK_STATE_START is cleared before
calling ndo_stop(), and IFF_UP is cleared after.  Both of
these need to be true in order to safely stop the queues
from ionic_check_link_status().
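
The resulting condition can be sketched as a single predicate. The struct and its fields are a hypothetical model of __LINK_STATE_START and IFF_UP, not the netdev structures themselves:

```c
#include <stdbool.h>

/* Hypothetical flags modelling __LINK_STATE_START and IFF_UP. */
struct netdev_model {
	bool link_state_start; /* set before ndo_open(), cleared before ndo_stop() */
	bool iff_up;           /* set after ndo_open() returns, cleared after ndo_stop() */
};

/* The link-check path may start or stop queues only when both flags are set. */
static bool may_touch_queues(const struct netdev_model *n)
{
	return n->link_state_start && n->iff_up;
}
```

During open, the predicate stays false until iff_up is set; during close, it turns false as soon as link_state_start is cleared.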

Fixes: 49d3b49367 ("ionic: disable the queues on link down")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-09 13:19:27 -07:00
Mike Rapoport
65fddcfca8 mm: reorder includes after introduction of linux/pgtable.h
The replacement of <asm/pgtable.h> with <linux/pgtable.h> left the include
of the latter in the middle of the asm includes.  Fix this up with the aid
of the below script and manual adjustments here and there.

	import sys
	import re

	if len(sys.argv) != 3:
	    print "USAGE: %s <file> <header>" % (sys.argv[0])
	    sys.exit(1)

	hdr_to_move="#include <linux/%s>" % sys.argv[2]
	moved = False
	in_hdrs = False

	with open(sys.argv[1], "r") as f:
	    lines = f.readlines()
	    for _line in lines:
		line = _line.rstrip('\n')
		if line == hdr_to_move:
		    continue
		if line.startswith("#include <linux/"):
		    in_hdrs = True
		elif not moved and in_hdrs:
		    moved = True
		    print hdr_to_move
		print line

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport
ca5999fde0 mm: introduce include/linux/pgtable.h
The include/linux/pgtable.h is going to be the home of generic page table
manipulation functions.

Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
make the latter include asm/pgtable.h.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport
e31cf2f4ca mm: don't include asm/pgtable.h if linux/mm.h is already included
Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once.  For
instance, we have 31 definitions of pgd_offset() for 25 supported
architectures.

Most of these definitions are actually identical and typically it boils
down to, e.g.

static inline unsigned long pmd_index(unsigned long address)
{
        return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}

These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
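
As a worked example of the index computation, here is a userspace sketch. The constants are assumptions chosen to match x86-64 with 4 KiB pages (PMD_SHIFT of 21, 512 entries per PMD); other architectures and configurations use different values.

```c
#include <stdint.h>

/* Assumed values: x86-64 with 4 KiB pages. Other configurations differ. */
#define PMD_SHIFT    21
#define PTRS_PER_PMD 512

/* Same arithmetic as above, taking a 64-bit address for portability. */
static inline unsigned long pmd_index(uint64_t address)
{
	return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}
```

For address 0x40200000, the shift yields 0x201 (513) and the mask with 511 leaves index 1; for 0x3FE00000 the shift yields 0x1FF, so the index is 511, the last slot of the table.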

For architectures that really need a custom version there is always the
possibility to override the generic version with the usual ifdefs magic.

These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.

This patch (of 12):

The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g.  pte_alloc() and
pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.

The include statements in such cases are removed with a simple loop:

	for f in $(git grep -l "include <linux/mm.h>") ; do
		sed -i -e '/include <asm\/pgtable.h>/ d' $f
	done

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Linus Torvalds
af7b480103 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 - Fix the build with certain Kconfig combinations for the Chelsio
   inline TLS device, from Rohit Maheshwar and Vinay Kumar Yadavi.

 - Fix leak in genetlink, from Cong Lang.

 - Fix out of bounds packet header accesses in seg6, from Ahmed
   Abdelsalam.

 - Two XDP fixes in the ENA driver, from Sameeh Jubran.

 - Use rwsem in device rename instead of a seqcount because this code
   can sleep, from Ahmed S. Darwish.

 - Fix WoL regressions in r8169, from Heiner Kallweit.

 - Fix qed crashes in kdump mode, from Alok Prasad.

 - Fix the callbacks used for certain thermal zones in mlxsw, from Vadim
   Pasternak.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
  net: dsa: lantiq_gswip: fix and improve the unsupported interface error
  mlxsw: core: Use different get_trend() callbacks for different thermal zones
  net: dp83869: Reset return variable if PHY strap is read
  rhashtable: Drop raw RCU deref in nested_table_free
  cxgb4: Use kfree() instead kvfree() where appropriate
  net: qed: fixes crash while running driver in kdump kernel
  vsock/vmci: make vmci_vsock_transport_cb() static
  net: ethtool: Fix comment mentioning typo in IS_ENABLED()
  net: phy: mscc: fix Serdes configuration in vsc8584_config_init
  net: mscc: Fix OF_MDIO config check
  net: marvell: Fix OF_MDIO config check
  net: dp83867: Fix OF_MDIO config check
  net: dp83869: Fix OF_MDIO config check
  net: ethernet: mvneta: fix MVNETA_SKB_HEADROOM alignment
  ethtool: linkinfo: remove an unnecessary NULL check
  net/xdp: use shift instead of 64 bit division
  crypto/chtls:Fix compile error when CONFIG_IPV6 is disabled
  inet_connection_sock: clear inet_num out of destroy helper
  yam: fix possible memory leak in yam_init_driver
  lan743x: Use correct MAC_CR configuration for 1 GBit speed
  ...
2020-06-07 17:27:45 -07:00
Vadim Pasternak
2dc2f76005 mlxsw: core: Use different get_trend() callbacks for different thermal zones
The driver registers three different types of thermal zones: For the
ASIC itself, for port modules and for gearboxes.

Currently, all three types use the same get_trend() callback which does
not work correctly for the ASIC thermal zone. The callback assumes that
the device data is of type 'struct mlxsw_thermal_module', whereas for
the ASIC thermal zone 'struct mlxsw_thermal' is passed as device data.

Fix this by using one get_trend() callback for the ASIC thermal zone and
another for the other two types.
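
The type mismatch and its fix can be sketched as follows. The structs and callbacks are hypothetical simplifications; the real mlxsw structures and the thermal framework's callback signature are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical, simplified shapes of the two device-data types. */
struct thermal {
	int asic_trend;
};

struct thermal_module {
	struct thermal *parent;
	int module_trend;
};

/* Each callback casts devdata to the type its zone was registered with. */
static int asic_trend_cb(void *devdata)
{
	struct thermal *t = devdata;

	return t->asic_trend;
}

static int module_trend_cb(void *devdata)
{
	struct thermal_module *m = devdata;

	return m->module_trend;
}

/* A zone pairs the opaque device data with the matching callback. */
struct zone {
	int (*get_trend)(void *devdata);
	void *devdata;
};
```

Registering the ASIC zone with module_trend_cb() would reinterpret a struct thermal as a struct thermal_module, which is the kind of mismatch the commit describes; pairing each zone with its own callback avoids it.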

Fixes: 6f73862fab ("mlxsw: core: Add the hottest thermal zone detection")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-07 16:59:43 -07:00
Linus Torvalds
818dbde78e SCSI misc on 20200605
This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
 target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
 of other minor updates.  There are no major core changes in this
 series apart from a refactoring in scsi_lib.c.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXtq5QyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXyGAQCipTWx
 7kHKHZBCVTU133bADt3+SstLrAm8PKZEXMnP9wEAzu4QkkW8URxEDRrpu7qk5gbA
 9M/KyqvfRtTH7+BSK7M=
 =J6aO
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
  target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
  of other minor updates.

  There are no major core changes in this series apart from a
  refactoring in scsi_lib.c"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
  scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes
  scsi: cxgb3i: Fix some leaks in init_act_open()
  scsi: ibmvscsi: Make some functions static
  scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim
  scsi: ufs: Fix WriteBooster flush during runtime suspend
  scsi: ufs: Fix index of attributes query for WriteBooster feature
  scsi: ufs: Allow WriteBooster on UFS 2.2 devices
  scsi: ufs: Remove unnecessary memset for dev_info
  scsi: ufs-qcom: Fix scheduling while atomic issue
  scsi: mpt3sas: Fix reply queue count in non RDPQ mode
  scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
  scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd()
  scsi: vhost: Notify TCM about the maximum sg entries supported per command
  scsi: qla2xxx: Remove return value from qla_nvme_ls()
  scsi: qla2xxx: Remove an unused function
  scsi: iscsi: Register sysfs for iscsi workqueue
  scsi: scsi_debug: Parser tables and code interaction
  scsi: core: Refactor scsi_mq_setup_tags function
  scsi: core: Fix incorrect usage of shost_for_each_device
  scsi: qla2xxx: Fix endianness annotations in source files
  ...
2020-06-05 15:11:50 -07:00
Linus Torvalds
242b233198 RDMA 5.8 merge window pull request
A few large, long discussed works this time. The RNBD block driver has
 been posted for nearly two years now, and the removal of FMR has been a
 recurring discussion theme for a long time. The usual smattering of
 features and bug fixes.
 
 - Various small driver bugs fixes in rxe, mlx5, hfi1, and efa
 
 - Continuing driver cleanups in bnxt_re, hns
 
 - Big cleanup of mlx5 QP creation flows
 
 - More consistent use of src port and flow label when LAG is used and a
   mlx5 implementation
 
 - Additional set of cleanups for IB CM
 
 - 'RNBD' network block driver and target. This is a network block RDMA
   device specific to ionos's cloud environment. It brings strong multipath
   and resiliency capabilities.
 
 - Accelerated IPoIB for HFI1
 
 - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple async fds
 
 - Support for exchanging the new IBTA defined ECE data during RDMA CM
   exchanges
 
 - Removal of the very old and insecure FMR interface from all ULPs and
   drivers. FRWR has been preferred for at least a decade now.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl7X/IwACgkQOG33FX4g
 mxp2uw/+MI2S/aXqEBvZfTT8yrkAwqYezS0VeTDnwH/T6UlTMDhHVN/2Ji3tbbX3
 FEKT1i2mnAL5RqUAL1lr9g4sG/bVozrpN46Ws5Lu9dTbIPLKTNPWDuLFQDUShKY7
 OyMI/bRx6anGnsOy20iiBqnrQbrrZj5TECgnmrkAl62QFdcl7aBWe/yYjy4CT11N
 ub+aBXBREN1F1pc0HIjd2tI+8gnZc+mNm1LVVDRH9Capun/pI26qDNh7e6QwGyIo
 n8ItraC8znLwv/nsUoTE7/JRcsTEe6vJI26PQmczZfNJs/4O65G7fZg0eSBseZYi
 qKf7Uwtb3qW0R7jRUMEgFY4DKXVAA0G2ph40HXBuzOSsqlT6HqYMO2wgG8pJkrTc
 qAjoSJGzfAHIsjxzxKI8wKuufCddjCm30VWWU7EKeriI6h1J0uPVqKkQMfYBTkik
 696eZSBycAVgwayOng3XaehiTxOL7qGMTjUpDjUR6UscbiPG919vP+QsbIUuBXdb
 YoddBQJdyGJiaCXv32ciJjo9bjPRRi/bII7Q5qzCNI2mi4ZVbudF4ffzyQvdHtNJ
 nGnpRXoPi7kMvUrKTMPWkFjj0R5/UsPszsA51zbxPydfgBe0Dlc2PrrIG8dlzYAp
 wbV0Lec+iJucKlt7EZtrjz1xOiOOaQt/5/cW1bWqL+wk2t6gAuY=
 =9zTe
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A more active cycle than most of the recent past, with a few large,
  long discussed works this time.

  The RNBD block driver has been posted for nearly two years now, and
  flowing through RDMA due to it also introducing a new ULP.

  The removal of FMR has been a recurring discussion theme for a long
  time.

  And the usual smattering of features and bug fixes.

  Summary:

   - Various small driver bugs fixes in rxe, mlx5, hfi1, and efa

   - Continuing driver cleanups in bnxt_re, hns

   - Big cleanup of mlx5 QP creation flows

   - More consistent use of src port and flow label when LAG is used and
     a mlx5 implementation

   - Additional set of cleanups for IB CM

   - 'RNBD' network block driver and target. This is a network block
     RDMA device specific to ionos's cloud environment. It brings strong
     multipath and resiliency capabilities.

   - Accelerated IPoIB for HFI1

   - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple
     async fds

   - Support for exchanging the new IBTA defined ECE data during RDMA CM
     exchanges

   - Removal of the very old and insecure FMR interface from all ULPs
     and drivers. FRWR has been preferred for at least a decade now"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits)
  RDMA/cm: Spurious WARNING triggered in cm_destroy_id()
  RDMA/mlx5: Return ECE DC support
  RDMA/mlx5: Don't rely on FW to set zeros in ECE response
  RDMA/mlx5: Return an error if copy_to_user fails
  IB/hfi1: Use free_netdev() in hfi1_netdev_free()
  RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr()
  RDMA/core: Move and rename trace_cm_id_create()
  IB/hfi1: Fix hfi1_netdev_rx_init() error handling
  RDMA: Remove 'max_map_per_fmr'
  RDMA: Remove 'max_fmr'
  RDMA/core: Remove FMR device ops
  RDMA/rdmavt: Remove FMR memory registration
  RDMA/mthca: Remove FMR support for memory registration
  RDMA/mlx4: Remove FMR support for memory registration
  RDMA/i40iw: Remove FMR leftovers
  RDMA/bnxt_re: Remove FMR leftovers
  RDMA/mlx5: Remove FMR leftovers
  RDMA/core: Remove FMR pool API
  RDMA/rds: Remove FMR support for memory registration
  RDMA/srp: Remove support for FMR memory registration
  ...
2020-06-05 14:05:57 -07:00
Denis Efremov
7f89cc07d2 cxgb4: Use kfree() instead kvfree() where appropriate
Use kfree(buf) in blocked_fl_read() because the memory is allocated with
kzalloc(). Use kfree(t) in blocked_fl_write() because the memory is
allocated with kcalloc().

Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-05 13:22:49 -07:00
Alok Prasad
6da95b52b8 net: qed: fixes crash while running driver in kdump kernel
This fixes a crash introduced by recent is_kdump_kernel() check.
The source of the crash is that kdump kernel can be loaded on a
system with already created VFs. But for such VFs, it will follow
a logic path of PF and eventually crash.

Thus, we partially revert the previous changes and instead use
is_kdump_kernel() at a single point in PF init, where we disable
SRIOV explicitly.

Fixes: 37d4f8a6b4 ("net: qed: Disable SRIOV functionality inside kdump kernel")
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-05 13:20:16 -07:00
Alexander Lobakin
e224372076 net: ethernet: mvneta: fix MVNETA_SKB_HEADROOM alignment
Commit ca23cb0bc5 ("mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zero")
added headroom alignment check against 8.
However (if we imagine that NET_SKB_PAD or XDP_PACKET_HEADROOM is not
aligned to cacheline size), it actually aligns headroom down, while
skb/xdp_buff headroom should be *at least* equal to one of the values
(depending on XDP prog presence).
So, fix the check to align the value up. This satisfies both
hardware/driver and network stack requirements.
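
The difference between the two roundings can be sketched in plain C. The
helper names below are hypothetical and the code is not the driver's
macro; it only shows why clearing the last 3 bits can shrink the
headroom while rounding up cannot.

```c
#include <assert.h>

/* Round x down to a multiple of 8 (clears the last 3 bits). */
static unsigned int align_down_8(unsigned int x)
{
	return x & ~0x7u;
}

/* Round x up to a multiple of 8; the result is never smaller than x. */
static unsigned int align_up_8(unsigned int x)
{
	return (x + 7u) & ~0x7u;
}

/* Hypothetical helper: headroom that is 8-aligned and covers 'wanted'. */
static unsigned int skb_headroom_sketch(unsigned int wanted)
{
	unsigned int headroom = align_up_8(wanted);

	assert(headroom >= wanted && headroom % 8 == 0);
	return headroom;
}
```

For an unaligned request of 100 bytes, align_down_8() yields 96 (less
than requested) and align_up_8() yields 104. Already aligned values such
as 256 are left unchanged by both.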

Fixes: ca23cb0bc5 ("mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zero")
Signed-off-by: Alexander Lobakin <bloodyreaper@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-05 13:13:11 -07:00
Linus Torvalds
7ae77150d9 powerpc updates for 5.8
- Support for userspace to send requests directly to the on-chip GZIP
    accelerator on Power9.
 
  - Rework of our lockless page table walking (__find_linux_pte()) to make it
    safe against parallel page table manipulations without relying on an IPI for
    serialisation.
 
  - A series of fixes & enhancements to make our machine check handling more
    robust.
 
  - Lots of plumbing to add support for "prefixed" (64-bit) instructions on
    Power10.
 
  - Support for using huge pages for the linear mapping on 8xx (32-bit).
 
  - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver.
 
  - Removal of some obsolete 40x platforms and associated cruft.
 
  - Initial support for booting on Power10.
 
  - Lots of other small features, cleanups & fixes.
 
 Thanks to:
   Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov,
   Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le
   Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy,
   Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand,
   George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni,
   Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo
   Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael
   Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao,
   Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram
   Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher
   Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram
   Sang, Xiongfeng Wang.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl7aYZ8THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgPiKD/9zNCuZLFMAFrIdbm0HlYA2RGYZFT75
 GUHsqYyei1pxA7PgM3KwJiXELVODsBv0eQbgNh1tbecKrxPRegN/cywd1KLjPZ7I
 v5/qweQP8MvR0RhzjbhvUcO0jq/f8u2LbJr5mUfVzjU6tAvrvcWo3oZqDElsekCS
 kgyOH3r1vZ2PLTMiGFhb0gWi2iqc+6BHU1AFCGPCMjB1Vu5d5+54VvZ/6lllGsOF
 yg9CBXmmVvQ+Bn6tH4zdEB78FYxnAIwBqlbmL79i5ca+HQJ0Sw6HuPRy9XYq35p6
 2EiXS4Wrgp7i7+1TN3HO362u5Onb8TSyQU7NS6yCFPoJ6JQxcJMBIw6mHhnXOPuZ
 CrjgcdwUMjx8uDoKmX1Epbfuex2w+AysW+4yBHPFiSgl3klKC3D0wi95mR485w2F
 rN8uzJtrDeFKcYZJG7IoB/cgFCCPKGf9HaXr8q0S/jBKMffx91ul3cfzlfdIXOCw
 FDNw/+ZX7UD6ddFEG12ZTO+vdL8yf1uCRT/DIZwUiDMIA0+M6F4nc7j3lfyZfoO1
 65f9UlhoLxScq7VH2fKH4UtZatO9cPID2z1CmiY4UbUIPtFDepSuYClgLF+Duf4b
 rkfxhKU0+Ja1zNH5XNc+L+Bc5/W4lFiJXz02dYIjtHoUpWkc1aToOETVwzggYFNM
 G3PXIBOI0jRgRw==
 =o0WU
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Support for userspace to send requests directly to the on-chip GZIP
   accelerator on Power9.

 - Rework of our lockless page table walking (__find_linux_pte()) to
   make it safe against parallel page table manipulations without
   relying on an IPI for serialisation.

 - A series of fixes & enhancements to make our machine check handling
   more robust.

 - Lots of plumbing to add support for "prefixed" (64-bit) instructions
   on Power10.

 - Support for using huge pages for the linear mapping on 8xx (32-bit).

 - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound
   driver.

 - Removal of some obsolete 40x platforms and associated cruft.

 - Initial support for booting on Power10.

 - Lots of other small features, cleanups & fixes.

Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan,
Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent
Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe
JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F.,
Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A.
R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley,
Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan
Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal
Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin,
Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai,
Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher
Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler,
Wolfram Sang, Xiongfeng Wang.

* tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits)
  powerpc/pseries: Make vio and ibmebus initcalls pseries specific
  cxl: Remove dead Kconfig options
  powerpc: Add POWER10 architected mode
  powerpc/dt_cpu_ftrs: Add MMA feature
  powerpc/dt_cpu_ftrs: Enable Prefixed Instructions
  powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected
  powerpc: Add support for ISA v3.1
  powerpc: Add new HWCAP bits
  powerpc/64s: Don't set FSCR bits in INIT_THREAD
  powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
  powerpc/64s: Don't let DT CPU features set FSCR_DSCR
  powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
  powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
  powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel
  powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations
  powerpc/module_64: Consolidate ftrace code
  powerpc/32: Disable KASAN with pages bigger than 16k
  powerpc/uaccess: Don't set KUEP by default on book3s/32
  powerpc/uaccess: Don't set KUAP by default on book3s/32
  powerpc/8xx: Reduce time spent in allow_user_access() and friends
  ...
2020-06-05 12:39:30 -07:00
Roelof Berg
7cdee28c4e lan743x: Use correct MAC_CR configuration for 1 GBit speed
Corrected the MAC_CR configuration bits for 1 GBit operation. The data
sheet allows MAC_CR(2:1) to be 10 and also 11 for 1 GBit/s speed, but
only 10 works correctly.
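
As a rough illustration, forcing a two-bit field at positions 2:1 to the
binary value 10 looks like the sketch below. The macro and function
names are hypothetical and are not taken from the driver; only the bit
positions come from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two speed-configuration bits MAC_CR(2:1). */
#define CFG_H_SKETCH (UINT32_C(1) << 2)
#define CFG_L_SKETCH (UINT32_C(1) << 1)

/* Set MAC_CR(2:1) to binary 10, leaving every other bit untouched. */
static uint32_t mac_cr_set_1g_sketch(uint32_t mac_cr)
{
	mac_cr &= ~CFG_L_SKETCH;	/* low bit cleared */
	mac_cr |= CFG_H_SKETCH;		/* high bit set */

	assert((mac_cr & (CFG_H_SKETCH | CFG_L_SKETCH)) == CFG_H_SKETCH);
	return mac_cr;
}
```

Clearing the low bit explicitly matters when the register previously
held 11, the other encoding the data sheet lists for 1 GBit/s.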

Devices tested:
Microchip Lan7431, fixed-phy mode
Microchip Lan7430, normal phy mode

Fixes: 6f197fb638 ("lan743x: Added fixed link and RGMII support")
Signed-off-by: Roelof Berg <rberg@berg-solutions.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-04 15:57:03 -07:00
Valentin Longchamp
09820ce88b net: ethernet: freescale: remove unneeded include for ucc_geth
net/sch_generic.h does not need to be included, remove it.

Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-04 15:55:58 -07:00
Heiner Kallweit
1200684814 r8169: fix failing WoL
The referenced change added an extra hw reset to rtl8169_net_suspend(),
which makes WoL fail on a few chip versions. Therefore skip the extra
reset if we're going down and WoL is enabled.
In rtl_shutdown(), rtl8169_hw_reset() is already called by
rtl8169_net_suspend() if needed, so avoid issues by removing the extra
call. The fix was tested on a system with RTL8168g.

Meanwhile rtl8169_hw_reset() does more than a hw reset and should be
renamed. But that's net-next material.

Fixes: 8ac8e8c64b ("r8169: make rtl8169_down central chip quiesce function")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-04 15:51:54 -07:00
Dan Carpenter
f6c1fb0a76 net: ethernet: dwmac: Fix an error code in imx_dwmac_probe()
The code returns PTR_ERR(NULL), which is zero, i.e. success.  We should
return -ENOMEM instead.
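
A userspace sketch of the pitfall follows. ptr_err_sketch() is a
stand-in that mimics the kernel's PTR_ERR() by reinterpreting the
pointer as a long, and the probe functions are hypothetical
simplifications of the pattern, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for PTR_ERR(): the pointer value reinterpreted as a long. */
static long ptr_err_sketch(const void *ptr)
{
	return (long)ptr;
}

/* Buggy pattern: a NULL allocation result turns into 0, i.e. success. */
static long probe_buggy_sketch(const void *alloc_result)
{
	if (!alloc_result)
		return ptr_err_sketch(alloc_result);
	return 0;
}

/* Fixed pattern: a NULL allocation result is reported as -ENOMEM. */
static long probe_fixed_sketch(const void *alloc_result)
{
	if (!alloc_result)
		return -ENOMEM;
	return 0;
}
```

With a NULL argument the buggy variant returns 0 and the caller carries
on as if the probe had succeeded, while the fixed variant returns
-ENOMEM.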

Fixes: 94abdad697 ("net: ethernet: dwmac: add ethernet glue logic for NXP imx8 chip")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-04 15:51:25 -07:00
Sameeh Jubran
3921a81c31 net: ena: xdp: update napi budget for DROP and ABORTED
This patch fixes two issues with XDP:

1. If the XDP verdict is XDP_ABORTED we break the loop, which results in
   us handling one buffer per napi cycle instead of the total budget
   (usually 64). To overcome this simply change the xdp_verdict check to
   != XDP_PASS. When the verdict is XDP_PASS, the skb is not expected to
   be NULL.

2. Update the residual budget for XDP_DROP and XDP_ABORTED, since
   packets are handled in these cases.
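
The intended loop behaviour can be modelled roughly as below. This is a
simplified, hypothetical model (the enum, the function and the uniform
handling of all non-PASS verdicts are assumptions), not the ena RX path
itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts standing in for XDP_PASS, XDP_TX, XDP_DROP, XDP_ABORTED. */
enum xdp_verdict_sketch { SK_PASS, SK_TX, SK_DROP, SK_ABORTED };

/*
 * Process up to 'budget' buffers. Returns the amount of budget used and
 * stores the number of buffers passed up the stack in '*passed'.
 */
static int rx_poll_sketch(const enum xdp_verdict_sketch *verdicts, int count,
			  int budget, int *passed)
{
	int res_budget = budget;
	int i;

	assert(verdicts != NULL && passed != NULL);
	*passed = 0;

	for (i = 0; i < count && res_budget > 0; i++) {
		if (verdicts[i] != SK_PASS) {
			/* Buffer consumed by XDP: charge it and keep polling. */
			res_budget--;
			continue;
		}
		/* SK_PASS: an skb would be built and handed to the stack. */
		(*passed)++;
		res_budget--;
	}

	return budget - res_budget;
}
```

In this model an aborted buffer no longer ends the poll early, and
dropped or aborted buffers count against the budget like any other.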

Fixes: 548c4940b9 ("net: ena: Implement XDP_TX action")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-04 15:43:01 -07:00
Sameeh Jubran
cd07ecccba net: ena: xdp: XDP_TX: fix memory leak
When sending at a very high packet rate, the XDP tx queues can fill up
and start dropping packets. In this case we don't free the pages, which
results in the ena driver draining the system memory.

Fix:
Simply free the pages when necessary.

Fixes: 548c4940b9 ("net: ena: Implement XDP_TX action")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-04 15:43:01 -07:00
Rohit Maheshwari
ef1c75593e crypto/chcr: error seen if CONFIG_CHELSIO_TLS_DEVICE isn't set
cxgb4_uld_in_use() is used only by cxgb4_ktls_det_feature() which
is under CONFIG_CHELSIO_TLS_DEVICE macro.

Fixes: a3ac249a1a ("cxgb4/chcr: Enable ktls settings at run time")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-04 15:31:47 -07:00
Linus Torvalds
cb8e59cc87 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

 2) Add GSO partial support to igc, from Sasha Neftin.

 3) Several cleanups and improvements to r8169 from Heiner Kallweit.

 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

 5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

 6) Support GRO via gro_cells in DSA layer, from Alexander Lobakin.

 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

 8) Add sriov and vf support to hinic, from Luo bin.

 9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

18) Several RISCV bpf jit optimizations, from Luke Nelson.

19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

21) Add BPF iterators, from Yonghong Song.

22) Add cable test infrastructure, including ethtool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

25) Add CAP_BPF, from Alexei Starovoitov.

26) Support terse dumps in the packet scheduler, from Vlad Buslov.

27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

28) Add devm_register_netdev(), from Bartosz Golaszewski.

29) Minimize qdisc resets, from Cong Wang.

30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
  selftests: net: ip_defrag: ignore EPERM
  net_failover: fixed rollback in net_failover_open()
  Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
  Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
  vmxnet3: allow rx flow hash ops only when rss is enabled
  hinic: add set_channels ethtool_ops support
  selftests/bpf: Add a default $(CXX) value
  tools/bpf: Don't use $(COMPILE.c)
  bpf, selftests: Use bpf_probe_read_kernel
  s390/bpf: Use bcr 0,%0 as tail call nop filler
  s390/bpf: Maintain 8-byte stack alignment
  selftests/bpf: Fix verifier test
  selftests/bpf: Fix sample_cnt shared between two threads
  bpf, selftests: Adapt cls_redirect to call csum_level helper
  bpf: Add csum_level helper for fixing up csum levels
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
  crypto/chtls: IPv6 support for inline TLS
  Crypto/chcr: Fixes a coccinile check error
  Crypto/chcr: Fixes compilations warnings
  ...
2020-06-03 16:27:18 -07:00
Jason Gunthorpe
649392bf75 RDMA: Remove 'max_fmr'
Now that FMR support is gone, this attribute can be deleted from all
places.

Link: https://lore.kernel.org/r/12-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-02 20:32:54 -03:00
Max Gurtovoy
1f55b7ab90 RDMA/mlx4: Remove FMR support for memory registration
HCAs that are driven by the mlx4 driver support the FRWR method to
register memory. Remove the ancient and unsafe FMR method.

Link: https://lore.kernel.org/r/8-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-02 20:32:54 -03:00
Luo bin
2eed5a8b61 hinic: add set_channels ethtool_ops support
add support to change TX/RX queue number with "ethtool -L combined".

V5 -> V6: remove check for carrier in hinic_xmit_frame
V4 -> V5: change time zone in patch header
V3 -> V4: update date in patch header
V2 -> V3: remove check for zero channels->combined_count
V1 -> V2: update commit message("ethtool -L" to "ethtool -L combined")
V0 -> V1: remove check for channels->tx_count/rx_count/other_count

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-02 15:10:24 -07:00
Linus Torvalds
355ba37d75 Power management updates for 5.8-rc1
- Rework the system-wide PM driver flags to make them easier to
    understand and use and update their documentation (Rafael Wysocki,
    Alan Stern).
 
  - Allow cpuidle governors to be switched at run time regardless of
    the kernel configuration and update the related documentation
    accordingly (Hanjun Guo).
 
  - Improve the resume device handling in the user space hibernation
    interface code (Domenico Andreoli).
 
  - Document the intel-speed-select sysfs interface (Srinivas
    Pandruvada).
 
  - Make the ACPI code handling suspend to idle print more debug
    messages to help diagnose issues with it (Rafael Wysocki).
 
  - Fix a helper routine in the cpufreq core and correct a typo in
    the struct cpufreq_driver kerneldoc comment (Rafael Wysocki, Wang
    Wenhu).
 
  - Update cpufreq drivers:
 
    * Make the intel_pstate driver start in the passive mode by
      default on systems without HWP (Rafael Wysocki).
 
    * Add i.MX7ULP support to the imx-cpufreq-dt driver and add
      i.MX7ULP to the cpufreq-dt-platdev blacklist (Peng Fan).
 
    * Convert the qoriq cpufreq driver to a platform one, make the
      platform code create a suitable device object for it and add
      platform dependencies to it (Mian Yousaf Kaukab, Geert
      Uytterhoeven).
 
    * Fix wrong compatible binding in the qcom driver (Ansuel Smith).
 
    * Build the omap driver by default for ARCH_OMAP2PLUS (Anders
      Roxell).
 
    * Add r8a7742 SoC support to the dt cpufreq driver (Lad Prabhakar).
 
  - Update cpuidle core and drivers:
 
    * Fix three reference count leaks in error code paths in the
      cpuidle core (Qiushi Wu).
 
    * Convert Qualcomm SPM to a generic cpuidle driver (Stephan
      Gerhold).
 
    * Fix up the execution order when entering a domain idle state in
      the PSCI driver (Ulf Hansson).
 
  - Fix a reference counting issue related to clock management and
    clean up two oddities in the PM-runtime framework (Rafael Wysocki,
    Andy Shevchenko).
 
  - Add ElkhartLake support to the Intel RAPL power capping driver
    and remove an unused local MSR definition from it (Jacob Pan,
    Sumeet Pawnikar).
 
  - Update devfreq core and drivers:
 
    * Replace strncpy() with strscpy() in the devfreq core and use
      lockdep asserts instead of manual checks for a locked mutex in
      it (Dmitry Osipenko, Krzysztof Kozlowski).
 
    * Add a generic imx bus scaling driver and make it register an
      interconnect device (Leonard Crestez, Gustavo A. R. Silva).
 
    * Make the cpufreq notifier in the tegra30 driver take boosting
      into account and delete an unuseful error message from that
      driver (Dmitry Osipenko, Markus Elfring).
 
  - Remove unneeded semicolon from the cpupower code (Zou Wei).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl7VGjwSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx46gP/jGAXlddFEQswi6qUT3Cff0A9mb8CdcX
 dyKrjX4xxo/wtBIAwSN4achxrgse//ayo2dYTzWRDd31W9Azbv+5F+46XsDRz4hL
 pH29u/E66NMtFWnHCmt78NEJn0FzSa0YBC43ZzwFwKktCK9skYIpGN2z6iuXUBSX
 Q5GHqop3zvDsdKQFBGL62xvUw/AmOTPG7ohIZvqWBN2mbOqEqMcoFHT+aUF/NbLj
 +i14dvTH767eDZGRVASmXWQyljjaRWm+SIw4+m8zT1D1Y3d5IFObuMN+9RQl1Tif
 BYjkgJ2oDDMhCJLW7TBuJB+g7exiyaSQds3nMr2ZR+eZbJipICjU4eehNEKIUopU
 DM17tHQfnwZfS/7YbCx3vYQwLkNq37AJyXS9uqCAIFM+0n4xN4/mIVmgWYISLDTs
 1v9olFxtwMRNpjGGQWPJAO7ebB8Zz9qhQv7pIkSQEfwp93/SzvlVf4vvruTeFN9J
 qqG60cDumXWAm+s43eQHJNn5nOd5ocWv0FBpo/cxqKbzxFVWwdB42Cm0SY+rK2ID
 uHdnc2DJcK2c78UVbz3Cmk4272foJt2zxchqjFXXAZPLrOsFfzmti4B28VxGxjmP
 LG3MhH5sdbF4yl/1aSC1Bnrt+PV9Lus6ut/VKhjwIpw8cqiXgpwSbMoDoaBd9UMQ
 ubGz2rplGAtB
 =APdj
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These rework the system-wide PM driver flags, make runtime switching
  of cpuidle governors easier, improve the user space hibernation
  interface code, add intel-speed-select interface documentation, add
  more debug messages to the ACPI code handling suspend to idle, update
  the cpufreq core and drivers, fix a minor issue in the cpuidle core
  and update two cpuidle drivers, improve the PM-runtime framework,
  update the Intel RAPL power capping driver, update devfreq core and
  drivers, and clean up the cpupower utility.

  Specifics:

   - Rework the system-wide PM driver flags to make them easier to
     understand and use and update their documentation (Rafael Wysocki,
     Alan Stern).

   - Allow cpuidle governors to be switched at run time regardless of
     the kernel configuration and update the related documentation
     accordingly (Hanjun Guo).

   - Improve the resume device handling in the user space hibernation
     interface code (Domenico Andreoli).

   - Document the intel-speed-select sysfs interface (Srinivas
     Pandruvada).

   - Make the ACPI code handling suspend to idle print more debug
     messages to help diagnose issues with it (Rafael Wysocki).

   - Fix a helper routine in the cpufreq core and correct a typo in the
     struct cpufreq_driver kerneldoc comment (Rafael Wysocki, Wang
     Wenhu).

   - Update cpufreq drivers:

      - Make the intel_pstate driver start in the passive mode by
        default on systems without HWP (Rafael Wysocki).

      - Add i.MX7ULP support to the imx-cpufreq-dt driver and add
        i.MX7ULP to the cpufreq-dt-platdev blacklist (Peng Fan).

      - Convert the qoriq cpufreq driver to a platform one, make the
        platform code create a suitable device object for it and add
        platform dependencies to it (Mian Yousaf Kaukab, Geert
        Uytterhoeven).

      - Fix wrong compatible binding in the qcom driver (Ansuel Smith).

      - Build the omap driver by default for ARCH_OMAP2PLUS (Anders
        Roxell).

      - Add r8a7742 SoC support to the dt cpufreq driver (Lad
        Prabhakar).

   - Update cpuidle core and drivers:

      - Fix three reference count leaks in error code paths in the
        cpuidle core (Qiushi Wu).

      - Convert Qualcomm SPM to a generic cpuidle driver (Stephan
        Gerhold).

      - Fix up the execution order when entering a domain idle state in
        the PSCI driver (Ulf Hansson).

   - Fix a reference counting issue related to clock management and
     clean up two oddities in the PM-runtime framework (Rafael Wysocki,
     Andy Shevchenko).

   - Add ElkhartLake support to the Intel RAPL power capping driver and
     remove an unused local MSR definition from it (Jacob Pan, Sumeet
     Pawnikar).

   - Update devfreq core and drivers:

      - Replace strncpy() with strscpy() in the devfreq core and use
        lockdep asserts instead of manual checks for a locked mutex in
        it (Dmitry Osipenko, Krzysztof Kozlowski).

      - Add a generic imx bus scaling driver and make it register an
        interconnect device (Leonard Crestez, Gustavo A. R. Silva).

      - Make the cpufreq notifier in the tegra30 driver take boosting
        into account and delete an unuseful error message from that
        driver (Dmitry Osipenko, Markus Elfring).

   - Remove unneeded semicolon from the cpupower code (Zou Wei)"

* tag 'pm-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (51 commits)
  cpuidle: Fix three reference count leaks
  PM: runtime: Replace pm_runtime_callbacks_present()
  PM / devfreq: Use lockdep asserts instead of manual checks for locked mutex
  PM / devfreq: imx-bus: Fix inconsistent IS_ERR and PTR_ERR
  PM / devfreq: Replace strncpy with strscpy
  PM / devfreq: imx: Register interconnect device
  PM / devfreq: Add generic imx bus scaling driver
  PM / devfreq: tegra30: Delete an error message in tegra_devfreq_probe()
  PM / devfreq: tegra30: Make CPUFreq notifier to take into account boosting
  PM: hibernate: Restrict writes to the resume device
  PM: runtime: clk: Fix clk_pm_runtime_get() error path
  cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver
  ACPI: EC: PM: s2idle: Extend GPE dispatching debug message
  ACPI: PM: s2idle: Print type of wakeup debug messages
  powercap: RAPL: remove unused local MSR define
  PM: runtime: Make clear what we do when conditions are wrong in rpm_suspend()
  Documentation: admin-guide: pm: Document intel-speed-select
  PM: hibernate: Split off snapshot dev option
  PM: hibernate: Incorporate concurrency handling
  Documentation: ABI: make current_governer_ro as a candidate for removal
  ...
2020-06-02 13:17:23 -07:00
David S. Miller
9a25c1df24 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-06-01

The following pull-request contains BPF updates for your *net-next* tree.

We've added 55 non-merge commits during the last 1 day(s) which contain
a total of 91 files changed, 4986 insertions(+), 463 deletions(-).

The main changes are:

1) Add rx_queue_mapping to bpf_sock from Amritha.

2) Add BPF ring buffer, from Andrii.

3) Attach and run programs through devmap, from David.

4) Allow SO_BINDTODEVICE opt in bpf_setsockopt, from Ferenc.

5) link based flow_dissector, from Jakub.

6) Use tracing helpers for lsm programs, from Jiri.

7) Several sk_msg fixes and extensions, from John.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 15:53:08 -07:00
Jules Irenge
efd7ed0f5f sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
Sparse reports a warning at efx_ef10_try_update_nic_stats_vf()
warning: context imbalance in efx_ef10_try_update_nic_stats_vf()
	- unexpected unlock
The root cause is the missing annotation at
efx_ef10_try_update_nic_stats_vf().
Add the missing __must_hold(&efx->stats_lock) annotation.
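
For readers unfamiliar with the annotation, the sketch below shows the
general shape. The macro mirrors how the kernel defines __must_hold()
for sparse (reproduced from memory, so treat it as an assumption); the
toy lock, the struct and all function names are hypothetical.

```c
#include <assert.h>

/* Only sparse (__CHECKER__) sees the context; compilers see nothing. */
#ifdef __CHECKER__
# define must_hold_sketch(x) __attribute__((context(x, 1, 1)))
#else
# define must_hold_sketch(x)
#endif

/* Toy lock state standing in for a spinlock-protected stats block. */
struct stats_sketch {
	int lock_held;		/* 1 while the toy lock is held */
	unsigned long updates;
};

static void lock_sketch(struct stats_sketch *s)
{
	assert(!s->lock_held);
	s->lock_held = 1;
}

static void unlock_sketch(struct stats_sketch *s)
{
	assert(s->lock_held);
	s->lock_held = 0;
}

/*
 * Entered and left with the lock held, but drops it in between.
 * Without the annotation a checker only sees an unlock with no
 * matching lock earlier in the function.
 */
static int try_update_sketch(struct stats_sketch *s)
	must_hold_sketch(&s->lock_held)
{
	unlock_sketch(s);
	s->updates++;		/* work done without the lock */
	lock_sketch(s);
	return 0;
}
```

The annotation changes nothing at run time; it only tells the checker
that the caller holds the lock on entry and on exit.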

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 15:52:06 -07:00
Rohit Maheshwari
a3ac249a1a cxgb4/chcr: Enable ktls settings at run time
The current design enables the ktls setting from the start, which is
not efficient. Now the feature is enabled only when the user requests
TLS offload on an interface.

v1->v2:
- taking ULD module refcount till any single connection exists.
- taking rtnl_lock() before clearing tls_devops.

v2->v3:
- cxgb4 is now registering to tlsdev_ops.
- module refcount inc/dec in chcr.
- refcount is only for connections.
- removed new code from cxgb_set_feature().

v3->v4:
- fixed warning message.

Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 15:48:18 -07:00
Lorenzo Bianconi
1b698fa5d8 xdp: Rename convert_to_xdp_frame in xdp_convert_buff_to_frame
In order to use the standard 'xdp' prefix, rename the convert_to_xdp_frame
utility routine to xdp_convert_buff_to_frame and replace all the
occurrences.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/6344f739be0d1a08ab2b9607584c4d5478c8c083.1590698295.git.lorenzo@kernel.org
2020-06-01 15:02:53 -07:00
David S. Miller
2a2e01e7b1 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-05-31

This series contains updates to the ice driver only.

Brett modifies the driver to allow users to clear a VF's
administratively set MAC address on the PF.  Fixes the driver to
recognize an existing VLAN tag when DMAC/SMAC is enabled in a packet.
Fixes an issue so that VFs are reset after any VF port VLAN
modifications are made on the PF.  Made sure the register QRXFLXP_CNTXT
is cleared before writing a new value to ensure the previous value is
not passed forward.  Updates the PF to allow the VF to request a reset
as soon as it has been initialized.  Fixes an issue to ensure when a VSI
is created, it uses the current coalesce value, not the default value.

Paul allows untrusted VFs to add 16 filters.

Dan increases the timeout needed after a PFR to allow ample time for
package download.

Chinh adjusts the define value for the number of PHY speeds we currently
support.  Changes the driver to ignore EMODE error when configuring the
PHY.

Jesse fixes an issue which was preventing a user from configuring the
interface before bringing it up.

Henry fixes the logic for adding back perfect flows after flow director
filter does a deletion.

Bruce fixes line wrappings to make it more consistent.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:09:08 -07:00
Ioana Ciornei
07beb1651a dpaa2-eth: Keep congestion group taildrop enabled when PFC on
Leave congestion group taildrop enabled for all traffic classes
when PFC is enabled. Notification threshold is low enough such
that it will be hit first and this also ensures that FQs on
traffic classes which are not PFC enabled won't drain the buffer
pool.

FQ taildrop threshold is kept disabled as long as any form of
flow control is on. Since FQ taildrop works with bytes, not number
of frames, we can't guarantee it will not interfere with the
congestion notification mechanism for all frame sizes.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:04:32 -07:00
Ioana Ciornei
f395b69f40 dpaa2-eth: Add PFC support through DCB ops
Add support in dpaa2-eth for PFC (Priority Flow Control)
through the DCB ops.

Instruct the hardware to respond to received PFC frames.
Current firmware doesn't allow us to selectively enable PFC
on the Rx side for some priorities only, so we will react to
all incoming PFC frames (and stop transmitting on the traffic
classes specified in the frame).

Also, configure the hardware to generate PFC frames based on Rx
congestion notifications. When a certain number of frames accumulate in
the ingress queues corresponding to a traffic class, priority flow
control frames are generated for that TC.

The number of PFC traffic classes available can be queried through
lldptool. Also, which of those traffic classes have PFC enabled is also
controlled through the same dcbnl_rtnl_ops callbacks.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:04:32 -07:00
Ioana Radulescu
3f8b826d70 dpaa2-eth: Update FQ taildrop threshold and buffer pool count
Now that we have congestion group taildrop configured at all
times, we can afford to increase the frame queue taildrop
threshold; this will ensure a better response when receiving
bursts of large-sized frames.

Also decouple the buffer pool count from the Rx FQ taildrop
threshold, as the above change would increase it too much. Instead,
keep the old count as a hardcoded value.

With the new limits, we try to ensure that:
* we allow enough leeway for large frame bursts (by buffering
enough of them in queues to avoid heavy dropping in case of
bursty traffic, but when overall ingress bandwidth is manageable)
* allow pending frames to be evenly spread between ingress FQs,
regardless of frame size
* avoid dropping frames due to the buffer pool being empty; this
is not a bad behaviour per se, but system overall response is
more linear and predictable when frames are dropped at frame
queue/group level.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:04:32 -07:00
Ioana Radulescu
2c8d1c8d7d dpaa2-eth: Add congestion group taildrop
The increase in number of ingress frame queues means we now risk
depleting the buffer pool before the FQ taildrop kicks in.

Congestion group taildrop allows us to control the number of frames that
can accumulate on a group of Rx frame queues belonging to the same
traffic class.  This setting coexists with the frame queue based
taildrop: whichever limit gets hit first triggers the frame drop.
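
The "whichever limit gets hit first" rule can be pictured with a small model. This is a sketch in Python, not driver or firmware code: the names `RxQueueState` and `taildrop` and the limit values used with them are hypothetical. It only assumes what the surrounding commits state, namely that FQ taildrop counts bytes and congestion group taildrop counts frames.

```python
from dataclasses import dataclass


@dataclass
class RxQueueState:
    fq_bytes: int    # bytes currently held by this frame queue
    cg_frames: int   # frames currently held by the whole congestion group


def taildrop(state, frame_len, fq_limit_bytes, cg_limit_frames):
    """Return True if the incoming frame would be dropped.

    Either limit alone is enough to trigger the drop.
    """
    over_fq = state.fq_bytes + frame_len > fq_limit_bytes
    over_cg = state.cg_frames + 1 > cg_limit_frames
    return over_fq or over_cg
```

With many small frames the frame-count limit of the group tends to trip first; with a few large frames the byte limit of a single queue does.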

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:04:32 -07:00
Ioana Radulescu
ad054f2654 dpaa2-eth: Add helper functions
Add convenient helper functions that determine whether Rx/Tx pause
frames are enabled based on link state flags received from firmware.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:04:32 -07:00
Ioana Radulescu
6aa90fe2d9 dpaa2-eth: Distribute ingress frames based on VLAN prio
Configure static ingress classification based on VLAN PCP field.
If the DPNI doesn't have enough traffic classes to accommodate all
priority levels, the lowest ones end up on TC 0 (default on miss).
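
The fallback to TC 0 can be illustrated with a short model. This is a sketch in Python rather than driver code; the function name `pcp_to_tc` is hypothetical, and the top-down assignment (PCP 7 to the highest traffic class) is an assumption made for illustration, chosen because it leaves the lowest priorities on TC 0 as the message describes.

```python
def pcp_to_tc(pcp: int, num_tcs: int) -> int:
    """Map a VLAN PCP value (0..7) to an Rx traffic class.

    Assumption: PCP 7 goes to the highest TC and each lower PCP goes one
    TC lower. PCP values left without a TC of their own miss the
    classification and fall back to TC 0.
    """
    if not 0 <= pcp <= 7:
        raise ValueError("PCP is a 3-bit field")
    if not 1 <= num_tcs <= 8:
        raise ValueError("expected between 1 and 8 traffic classes")
    tc = pcp - (8 - num_tcs)
    return tc if tc > 0 else 0


if __name__ == "__main__":
    for n in (8, 4):
        print(n, [pcp_to_tc(p, n) for p in range(8)])
```

With 8 traffic classes every priority gets its own class; with 4, priorities 0 through 4 all share TC 0.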

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:04:32 -07:00
Ioana Radulescu
685e39eaf4 dpaa2-eth: Add support for Rx traffic classes
The firmware reserves for each DPNI a number of RX frame queues
equal to the number of configured flows x number of configured
traffic classes.

Current driver configuration directs all incoming traffic to
FQs corresponding to TC0, leaving all other priority levels unused.

Start adding support for multiple ingress traffic classes, by
configuring the FQs associated with all priority levels, not just
TC0. All settings that are per-TC, such as those related to
hashing and flow steering, are also updated.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 12:04:32 -07:00
Roelof Berg
6f197fb638 lan743x: Added fixed link and RGMII support
Microchip lan7431 is frequently connected to a phy. However, it
can also be directly connected to a MII remote peer without
any phy in between. For supporting such a phyless hardware setup
in Linux we utilized phylib, which supports a fixed-link
configuration via the device tree. And we added support for
defining the connection type R/GMII in the device tree.

New behavior:
-------------
. The automatic speed and duplex detection of the lan743x silicon
  between mac and phy is disabled. Instead phylib is used like in
  other typical Linux drivers. Using phylib allows fixed-link
  parameters to be specified in the device tree.

. The device tree entry phy-connection-type is supported now with
  the modes RGMII or (G)MII (default).

Development state:
------------------
. Tested with fixed-phy configurations. Not yet tested in normal
  configurations with phy. Microchip kindly offered testing
  as soon as the Corona measures allow this.

. All review findings of Andrew Lunn are included

Example:
--------
&pcie {
	status = "okay";

	host@0 {
		reg = <0 0 0 0 0>;

		#address-cells = <3>;
		#size-cells = <2>;

		ethernet@0 {
			compatible = "weyland-yutani,noscom1", "microchip,lan743x";
			status = "okay";
			reg = <0 0 0 0 0>;
			phy-connection-type = "rgmii";

			fixed-link {
				speed = <100>;
				full-duplex;
			};
		};
	};
};

Signed-off-by: Roelof Berg <rberg@berg-solutions.de>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:51:17 -07:00
Ido Schimmel
88e2774961 mlxsw: spectrum_trap: Register ACL control traps
In a similar fashion to other control traps, register ACL control traps
with devlink.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:49:23 -07:00
Ido Schimmel
8110668ecd mlxsw: spectrum_trap: Register layer 3 control traps
In a similar fashion to layer 2 control traps, register layer 3 control
traps with devlink.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:49:23 -07:00
Ido Schimmel
39c10350cf mlxsw: spectrum_trap: Register layer 2 control traps
In a similar fashion to other traps, register layer 2 control traps with
devlink.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:49:23 -07:00
Ido Schimmel
45b1c87313 mlxsw: spectrum_trap: Factor out common Rx listener function
We currently have an Rx listener function for exception traps that marks
received skbs with 'offload_fwd_mark' and injects them to the kernel's
Rx path. The marking is done because all these exceptions occur during
L3 forwarding, after the packets were potentially flooded at L2.

A subsequent patch will add support for control traps. Packets received
via some of these control traps need different handling:

1. Packets might not need to be marked with 'offload_fwd_mark'. For
   example, if packet was trapped before L2 forwarding

2. Packets might not need to be injected to the kernel's Rx path. For
   example, sampled packets are reported to user space via the psample
   module

Factor out a common Rx listener function that only reports trapped
packets to devlink. Call it from mlxsw_sp_rx_no_mark_listener() and
mlxsw_sp_rx_mark_listener() that will inject the packets to the kernel's
Rx path, without and with the marking, respectively.
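
The layering described above can be sketched as follows. This is a Python model, not the driver's C code: packets are plain dictionaries, `events` is a list standing in for side effects, and all function names are hypothetical stand-ins for the listeners named in the message.

```python
def report_trap(packet, events):
    """Common part: only report the trapped packet to devlink."""
    events.append(("devlink_report", packet["id"]))


def inject_to_rx_path(packet, events):
    """Stand-in for handing the packet to the kernel's Rx path."""
    events.append(("rx_inject", packet["id"], packet["offload_fwd_mark"]))


def rx_no_mark_listener(packet, events):
    # Report, then inject without touching the forwarding mark.
    report_trap(packet, events)
    inject_to_rx_path(packet, events)


def rx_mark_listener(packet, events):
    # Set the mark first, then reuse the unmarked path.
    packet["offload_fwd_mark"] = True
    rx_no_mark_listener(packet, events)
```

A trap whose packets must not reach the Rx path at all could call only the common reporting function, which is the flexibility the refactoring is after.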

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:49:23 -07:00
Ido Schimmel
1e292f5c11 mlxsw: spectrum_trap: Move layer 3 exceptions to exceptions trap group
The layer 3 exceptions are still subject to the same trap policer, so
nothing changes, but user space can choose to assign a different one.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:49:23 -07:00
Liu Xiang
a74d19ba7c net: fec: disable correct clk in the err path of fec_enet_clk_enable
When enabling clk_ref fails, clk_ptp should be disabled rather than
clk_ref itself.
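
The general rule behind this fix is that an error path undoes only the steps that already succeeded, never the one that failed. A minimal sketch in Python, assuming hypothetical `enable` and `disable` callbacks and modelling only the two clocks named in the message:

```python
def enable_in_order(clocks, enable, disable):
    """Enable clocks in order; on failure, disable the ones already enabled.

    The clock whose enable() failed is not disabled, because it was
    never successfully enabled.
    """
    enabled = []
    for clk in clocks:
        try:
            enable(clk)
        except RuntimeError:
            for earlier in reversed(enabled):
                disable(earlier)
            raise
        enabled.append(clk)
    return enabled
```

Called with `["clk_ptp", "clk_ref"]` and an `enable` that fails on `clk_ref`, the only clock disabled is `clk_ptp`.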

Signed-off-by: Liu Xiang <liuxiang_1999@126.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:45:40 -07:00
Alexander Sverdlin
0c34bb598c net: octeon: mgmt: Repair filling of RX ring
The removal of mips_swiotlb_ops exposed a problem in octeon_mgmt Ethernet
driver. mips_swiotlb_ops had an mb() after most of the operations and the
removal of the ops broke the receive functionality of the driver.
My code inspection has shown no other places except
octeon_mgmt_rx_fill_ring() where an explicit barrier would be obviously
missing. The latter function however has to make sure that "ringing the
bell" doesn't happen before RX ring entry is really written.

The patch has been successfully tested on Octeon II.

Fixes: a999933db9 ("MIPS: remove mips_swiotlb_ops")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:45:20 -07:00
Pablo Neira Ayuso
e445e30cf7 bnxt_tc: update indirect block support
Register ndo callback via flow_indr_dev_register() and
flow_indr_dev_unregister().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:41:50 -07:00
Pablo Neira Ayuso
50c1b1c938 nfp: update indirect block support
Register ndo callback via flow_indr_dev_register() and
flow_indr_dev_unregister().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:41:50 -07:00
Pablo Neira Ayuso
9eabd18871 mlx5: update indirect block support
Register ndo callback via flow_indr_dev_register() and
flow_indr_dev_unregister().

No need for mlx5e_rep_indr_clean_block_privs() since flow_block_cb_free()
already releases the internal mapping via ->release callback, which in
this case is mlx5e_rep_indr_tc_block_unbind().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:41:50 -07:00
Bartosz Golaszewski
240f1ae40c net: ethernet: mtk-star-emac: use regmap bitops
Shrink the code visually by replacing regmap_update_bits() with
appropriate regmap bit operations where applicable.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:35:18 -07:00
Dan Carpenter
bda6752f3d cxgb4: cleanup error code in setup_sge_queues_uld()
The caller doesn't care about the error codes; it only checks for zero
vs non-zero.  Still, it's better to preserve the negative error codes
from alloc_uld_rxqs() instead of changing it to 1.  We can also return
directly if there is a failure.
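
The difference between the two return conventions can be shown with a small model. This is a Python sketch with hypothetical function names, not the cxgb4 code; `alloc_rxqs` stands in for an allocation helper that returns 0 or a negative errno value.

```python
import errno


def alloc_rxqs(fail):
    """Stand-in allocation helper: 0 on success, negative errno on failure."""
    return -errno.ENOMEM if fail else 0


def setup_queues_before(fail):
    ret = alloc_rxqs(fail)
    # The specific error is collapsed into a generic non-zero value.
    return 1 if ret else 0


def setup_queues_after(fail):
    ret = alloc_rxqs(fail)
    if ret:
        return ret  # return directly, keeping the negative error code
    return 0
```

A caller that only tests for non-zero sees no difference, but the second form keeps the reason for the failure available to any caller that wants it.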

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:32:59 -07:00
Rafael J. Wysocki
be6018a44c Merge branches 'pm-core' and 'pm-sleep'
* pm-core:
  PM: runtime: Replace pm_runtime_callbacks_present()
  PM: runtime: clk: Fix clk_pm_runtime_get() error path
  PM: runtime: Make clear what we do when conditions are wrong in rpm_suspend()

* pm-sleep:
  PM: hibernate: Restrict writes to the resume device
  PM: hibernate: Split off snapshot dev option
  PM: hibernate: Incorporate concurrency handling
  PM: sleep: Helpful edits for devices.rst documentation
  Documentation: PM: sleep: Update driver flags documentation
  PM: sleep: core: Rename DPM_FLAG_LEAVE_SUSPENDED
  PM: sleep: core: Rename DPM_FLAG_NEVER_SKIP
  PM: sleep: core: Rename dev_pm_smart_suspend_and_suspended()
  PM: sleep: core: Rename dev_pm_may_skip_resume()
  PM: sleep: core: Rework the power.may_skip_resume handling
  PM: sleep: core: Do not skip callbacks in the resume phase
  PM: sleep: core: Fold functions into their callers
  PM: sleep: core: Simplify the SMART_SUSPEND flag handling
2020-06-01 15:19:08 +02:00
David S. Miller
1806c13dc2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
xdp_umem.c had overlapping changes between the 64-bit math fix
for the calculation of npgs and the removal of the zerocopy
memory type which got rid of the chunk_size_nohdr member.

The mlx5 Kconfig conflict is a case where we just take the
net-next copy of the Kconfig entry dependency as it takes on
the ESWITCH dependency by one level of indirection which is
what the 'net' conflicting change is trying to ensure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-31 17:48:46 -07:00
Chinh T Cao
b5e19a642b ice: Ignore EMODE when setting PHY config
When setting the PHY cfg (CQ cmd 0x0601), if the firmware responds
with an EMODE error, software will ignore the error as it simply
means that manageability (ex: BMC) is in control of the link and that
the new setting may not be applied.

Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 04:01:16 -07:00
Henry Tieman
d5329be990 ice: fix aRFS after flow director delete
The logic was missing for adding back perfect flows after flow director
filter delete. The code now adds perfect flows into the HW tables after
filter delete.

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:58:12 -07:00
Brett Creeley
a039f6fcba ice: Use coalesce values from q_vector 0 when increasing q_vectors
Currently when a VSI is built (i.e. reset, set channels, etc.)
the coalesce settings will be preserved in most cases. However, when the
number of q_vectors is increased the settings for the new q_vectors
will be set to the driver defaults of AIM on, Rx/Tx ITR 50, and INTRL 0.
This is causing issues with how the ethtool layer gets the current
coalesce settings since it only uses q_vector 0. So, assume that the user
set the coalesce settings globally (i.e. ethtool -C eth0) and use q_vector
0's settings for all of the new q_vectors.
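
The behaviour after the change can be modelled in a few lines. This is a Python sketch, not the ice driver: the `Coalesce` class and `resize_q_vectors` function are hypothetical, and the default values are the ones quoted in the message (AIM on, Rx/Tx ITR 50, INTRL 0).

```python
from dataclasses import dataclass, replace


@dataclass
class Coalesce:
    adaptive: bool = True   # AIM on
    rx_itr: int = 50
    tx_itr: int = 50
    intrl: int = 0


def resize_q_vectors(current, new_count):
    """Return the coalesce settings for new_count q_vectors.

    Existing vectors keep their settings; added vectors copy the
    settings of q_vector 0 instead of taking the driver defaults.
    """
    if not current:
        return [Coalesce() for _ in range(new_count)]
    kept = current[:new_count]
    added = [replace(current[0]) for _ in range(new_count - len(kept))]
    return kept + added
```

Because ethtool reads only q_vector 0, copying its values keeps the reported settings and the settings of the new vectors consistent.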

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:56:40 -07:00
Paul M Stillwell Jr
1a9c561aa3 ice: fix PCI device serial number to be lowercase values
Commit ceb2f00707 ("ice: Use pci_get_dsn()") changed the code to
use a new function to get the Device Serial Number. It also changed
the case of the filename for loading a package on a specific NIC
from lowercase to uppercase. Change the filename back to
lowercase since that is what we specified.

Fixes: ceb2f00707 ("ice: Use pci_get_dsn()")
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:55:07 -07:00
Bruce Allan
ebb462dc21 ice: fix function signature style format
Where possible, cuddle multiple lines of function signatures to be
consistent throughout the code.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:52:25 -07:00
Brett Creeley
7dcc0fb8f6 ice: Allow VF to request reset as soon as it's initialized
A VF driver has the ability to request reset via VIRTCHNL_OP_RESET_VF.
This is a required step in VF driver load. Currently, the PF is only
allowing a VF to request reset using this method after the VF has
already communicated resources via VIRTCHNL_OP_GET_VF_RESOURCES.
However, this is incorrect because the VF can request reset before
requesting resources. Fix this by allowing the VF to request a reset
once it has been initialized.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:50:37 -07:00
Jesse Brandeburg
765dd7a182 ice: Fix inability to set channels when down
Currently the driver prevents a user from doing
modprobe ice
ethtool -L eth0 combined 5
ip link set eth0 up

The ethtool command fails, because the driver is checking to see if the
interface is down before allowing the get_channels to proceed (even for
a set_channels).

Remove this check and allow the user to configure the interface
before bringing it up, which is a much better usability case.

Fixes: 87324e747f ("ice: Implement ethtool ops for channels")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:49:00 -07:00
Brett Creeley
401ce33b32 ice: Always clear QRXFLXP_CNTXT before writing new value
Always clear the previous value in QRXFLXP_CNTXT before writing a new
value. This will make it so re-used queues will not accidentally take the
previously configured settings.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:47:35 -07:00
Brett Creeley
cf0bf41dd6 ice: Reset VF for all port VLAN changes from host
Currently the PF is modifying the VF's port VLAN on the fly when
configured via iproute. This is okay for most cases, but if the VF
already has guest VLANs configured the PF has to remove all of those
filters so only VLAN tagged traffic that matches the port VLAN will
pass. Instead of adding functionality to track which guest VLANs have
been added, just reset the VF each time port VLAN parameters are
modified.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:43:00 -07:00
Chinh T Cao
bff185e240 ice: Update ICE_PHY_TYPE_HIGH_MAX_INDEX value
Currently, we support only 5 PHY_SPEEDs for phy_type_high, so
adjust the value of ICE_PHY_TYPE_HIGH_MAX_INDEX to 5.

Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:41:29 -07:00
Dan Nowlin
c9a12d6d20 ice: Increase timeout after PFR
To allow for resets during package download, increase the timeout period
after performing a PFR. The time waited is the global config lock
timeout plus the normal PFSWR timeout.

Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:39:45 -07:00
Brett Creeley
2bb19d6e07 ice: Fix transmit for all software offloaded VLANs
Currently the driver does not recognize when there is an 802.1AD VLAN
tag right after the dmac/smac (outermost VLAN tag). If any DCB map is
applied and/or DCB is enabled this is causing the hardware to insert a
VLAN 0 tag after the 802.1AD VLAN tag that is already in the packet.
Fix this by preventing VLAN tag 0 from being added when any VLAN is
already present after dmac/smac (software offloaded) or skb (hardware
offloaded).
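
The condition described above reduces to a simple predicate. This is a Python sketch under stated assumptions, not the driver's transmit path: the function and parameter names are hypothetical, and the two VLAN flags stand for "a tag is already present after dmac/smac" and "a tag is carried in the skb".

```python
def needs_vlan0_tag(dcb_active, vlan_after_mac, vlan_in_skb):
    """Return True if a VLAN 0 tag should be inserted on transmit.

    A VLAN 0 tag is only added when DCB is in use and the packet does
    not already carry a VLAN tag, whether software or hardware offloaded.
    """
    return dcb_active and not (vlan_after_mac or vlan_in_skb)
```

An 802.1AD tag placed right after the MAC addresses counts as an existing tag here, so no second tag is added behind it.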

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:38:20 -07:00
Paul Greenwalt
c1636a6e8a ice: support adding 16 unicast/multicast filter on untrusted VF
Allow untrusted VF to add 16 unicast/multicast filters. VF uses 1 filter
for the default/perm_addr/LAA MAC, 1 for broadcast, and 16 additional
unicast/multicast filters.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:36:06 -07:00
Brett Creeley
f109603a4b ice: allow host to clear administratively set VF MAC
Currently a user is not allowed to clear a VF's administratively set MAC
on the PF. Fix this by allowing an all zero MAC address via "ip link set
${pf_eth} vf ${vf_id} mac 00:00:00:00:00:00".

An example use case for this would be issuing a "virsh shutdown"
command on a VM. The call to iproute mentioned above is part of this flow.
Without this change the driver incorrectly rejects clearing the VF's
administratively set MAC and prints unhelpful log messages.

Also, improve the comments surrounding this change.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:34:15 -07:00
Geert Uytterhoeven
9b23203c32 ravb: Mask PHY mode to avoid inserting delays twice
Until recently, the Micrel KSZ9031 PHY driver ignored any PHY mode
("RGMII-*ID") settings, but used the hardware defaults, augmented by
explicit configuration of individual skew values using the "*-skew-ps"
DT properties.  The lack of PHY mode support was compensated by the
EtherAVB MAC driver, which configures TX and/or RX internal delay
itself, based on the PHY mode.

However, now the KSZ9031 driver has gained PHY mode support, delays may
be configured twice, causing regressions.  E.g. on the Renesas
Salvator-X board with R-Car M3-W ES1.0, TX performance dropped from ca.
400 Mbps to 0.1-0.3 Mbps, as measured by nuttcp.

As internal delay configuration supported by the KSZ9031 PHY is too
limited for some use cases, the ability to configure MAC internal delay
is deemed useful and necessary.  Hence a proper fix would involve
splitting internal delay configuration in two parts, one for the PHY,
and one for the MAC.  However, this would require adding new DT
properties, thus breaking DTB backwards-compatibility.

Hence fix the regression in a backwards-compatible way, by letting
the EtherAVB driver mask the PHY mode when it has inserted a delay, to
avoid the PHY driver adding a second delay.  This also fixes messages
like:

    Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: *-skew-ps values should be used only with phy-mode = "rgmii"

as the PHY no longer sees the original RGMII-*ID mode.
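
The masking idea can be sketched as a small function. This is a Python model and not the EtherAVB driver: the function name is hypothetical, and the split of delays per mode (TX delay for "rgmii-id" and "rgmii-txid", RX delay for "rgmii-id" and "rgmii-rxid") is an assumption based on the usual meaning of those phy-mode values.

```python
def split_rgmii_delays(phy_mode):
    """Return (mac_tx_delay, mac_rx_delay, mode_seen_by_phy).

    When the MAC inserts a delay itself, the PHY is handed plain
    "rgmii" so that it does not add a second delay on top.
    """
    mac_tx = phy_mode in ("rgmii-id", "rgmii-txid")
    mac_rx = phy_mode in ("rgmii-id", "rgmii-rxid")
    seen_by_phy = "rgmii" if (mac_tx or mac_rx) else phy_mode
    return mac_tx, mac_rx, seen_by_phy
```

For example, `split_rgmii_delays("rgmii-txid")` yields a MAC TX delay only and presents "rgmii" to the PHY, which is also why the "*-skew-ps" warning quoted above would no longer be printed.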

Solving the issue by splitting configuration in two parts can be handled
in future patches, and would require retaining a backwards-compatibility
mode anyway.

Fixes: bcf3440c6d ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:50:37 -07:00
David S. Miller
d9f0d6605f Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-05-28

This series contains updates to the ice driver only.

Anirudh (Ani) adds a poll for reset completion before proceeding with
driver initialization when the DDP package fails to load and the firmware
issues a core reset.

Jake cleans up unnecessary code, since ice_set_dflt_vsi_ctx() performs a
memset to clear the info from the context structures.  Fixed a potential
double free during probe unrolling after a failure.  Also fixed a
potential NULL pointer dereference upon register_netdev() failure.

Tony makes two functions static which are not called outside of their
file.

Brett refactors the ice_ena_vf_mappings(), which was doing the VF's MSIx
and queue mapping in one function which was hard to digest.  So create a
new function to handle the enabling MSIx mappings and another function
to handle the enabling of queue mappings.  Simplify the code flow in
ice_sriov_configure().  Created a helper function for clearing
VPGEN_VFRTRIG register, as this needs to be done on reset to notify the
VF that we are done resetting it.  Fixed the initialization/creation and
reset flows, which was unnecessarily complicated, so separate the two
flows into their own functions.  Renamed VF initialization functions to
make it more clear what they do and why.  Added functionality to set the
VF trust mode bit on reset.  Added helper functions to rebuild the VLAN
and MAC configurations when resetting a VF.  Refactored how the VF reset
is handled to prevent VF reset timeouts.

Paul cleaned up code not needed during a CORER/GLOBR reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:44:50 -07:00
Heiner Kallweit
67ee63ef2b r8169: improve handling power management ops
Simplify handling the power management callbacks.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:36:55 -07:00
Heiner Kallweit
8ac8e8c64b r8169: make rtl8169_down central chip quiesce function
Functionality for quiescing the chip is spread across different
functions currently. Move it to rtl8169_down().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:36:55 -07:00
Heiner Kallweit
bac75d8565 r8169: move some calls to rtl8169_hw_reset
Move calls that are needed before and after calling rtl8169_hw_reset()
into this function. This requires moving the function within the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:36:55 -07:00
Heiner Kallweit
9fdd50c579 r8169: don't reset tx ring indexes in rtl8169_tx_clear
In places where the indexes have to be reset, we call
rtl8169_init_ring_indexes() anyway after rtl8169_tx_clear().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:36:55 -07:00
Heiner Kallweit
01bd753d03 r8169: enable WAKE_PHY as only WoL source when runtime-suspending
We go to runtime-suspend a few seconds after cable removal. As the cable
is removed, "physical link up" is the only meaningful WoL source.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:36:55 -07:00
Heiner Kallweit
27dc36aefc r8169: change driver data type
Change driver private data type to struct rtl8169_private * to avoid
some overhead.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:36:55 -07:00
David S. Miller
4300c7e7fe mlx5-cleanup-2020-05-29

Merge tag 'mlx5-cleanup-2020-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-cleanup-2020-05-29

Accumulated cleanup patches and sparse warning fixes for mlx5 driver.

1) sync with mlx5-next branch

2) Eli Cohen declares mpls_entry_encode() helper in mpls.h as suggested
by Jakub Kicinski and David Ahern, and uses it in mlx5

3) Jesper fixes xdp data_meta setup in mlx5

4) Many sparse and build warnings cleanup
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:53:57 -07:00
Huazhong Tan
996aade998 net: hns3: remove some unused codes in hns3_nic_set_features()
NETIF_F_HW_VLAN_CTAG_FILTER is not set in netdev->hw_features for
the HNS3 driver, so the handler of NETIF_F_HW_VLAN_CTAG_FILTER
in hns3_nic_set_features() is never called. Remove it.

Reported-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:41:22 -07:00
Huazhong Tan
2adb8187e5 net: hns3: fix two coding style issues in hclgevf_main.c
Remove a redundant blank line in hclgevf_cmd_set_promisc_mode(),
and fix a reverse xmas tree coding style issue in
hclgevf_set_rss_tc_mode().

Reported-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:41:22 -07:00
Huazhong Tan
ec4d939220 net: hns3: fix an incorrect comment for num_tqps in struct hclgevf_dev
struct hclgevf_dev represents a VF device, and its field num_tqps
indicates the number of the VF's task queue pairs, so the comment
is incorrect. Replace 'PF' with 'VF'.

Reported-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:41:22 -07:00
Huazhong Tan
fc68aed156 net: hns3: remove two unused macros in hclgevf_cmd.c
Macros hclgevf_ring_to_dma_dir and hclgevf_is_csq are defined in
hclgevf_cmd.c but not used, so remove them.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:41:22 -07:00
Huazhong Tan
d62805087e net: hns3: remove an unused macro hclge_is_csq
Macro hclge_is_csq defined in hclge_cmd.c is not used,
so remove it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:41:22 -07:00
Huazhong Tan
1f4982ef56 net: hns3: fix a print format issue in hclge_mac_mdio_config()
Use %d to print int variable 'ret' in hclge_mac_mdio_config().

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:41:22 -07:00
Saeed Mahameed
eb24387183 net/mlx5e: Make mlx5e_dcbnl_ops static
Fix sparse warning:
drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c:988:29:
error: symbol 'mlx5e_dcbnl_ops' was not declared. Should it be static?

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 21:20:23 -07:00
Saeed Mahameed
58ff18e12c net/mlx5e: en_tc: Fix cast to restricted __be32 warning
Fixes sparse warnings:
warning: cast to restricted __be32
warning: restricted __be32 degrades to integer

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
2020-05-29 21:20:22 -07:00
Saeed Mahameed
c51323ee7a net/mlx5e: en_tc: Fix incorrect type in initializer warnings
Fix some trivial warnings of the type:
warning: incorrect type in initializer (different base types)

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
2020-05-29 21:20:22 -07:00
Saeed Mahameed
aee3e9c457 net/mlx5: Accel: fpga tls fix cast to __be64 and incorrect argument types
tls handle and rcd_sn are actually big endian and not in host format.
Fix that.

Fix the following sparse warnings:
drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c:177:21:
warning: cast to restricted __be64

drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c:178:52:
warning: incorrect type in argument 2 (different base types)
    expected unsigned int [usertype] handle
    got restricted __be32 [usertype] handle

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 21:20:22 -07:00
Saeed Mahameed
2553f421f4 net/mlx5: cmd: Fix memset with byte count warning
Fix sparse warning:
drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1949:15:
warning: memset with byte count of 271720

mlx5_cmd_stats array is too big to be held inline in mlx5_cmd.
Allocate it separately.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 21:20:21 -07:00
Saeed Mahameed
9ff2e92c46 net/mlx5: DR: Fix incorrect type in return expression
dr_ste_crc32_calc() calculates crc32 and should return it in HW format.
It is being used to calculate a u32 index, hence we force the return value
to u32 to avoid the sparse warning:

drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c:115:16:
warning: incorrect type in return expression (different base types)
    expected unsigned int
    got restricted __be32 [usertype]

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
2020-05-29 21:20:21 -07:00
Saeed Mahameed
c2ba2c2287 net/mlx5: DR: Fix cast to restricted __be32
raw_ip actual type is __be32 and not u32.
Fix that and get rid of the warning.

drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c:906:31:
warning: cast to restricted __be32

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
2020-05-29 21:20:21 -07:00
Saeed Mahameed
618f88c4c4 net/mlx5: DR: Fix incorrect type in argument
HW spec objects should receive a void ptr to work on, the MLX5_SET/GET
macro will know how to handle it.

No need to provide explicit or wrong pointer type in this case.

warning: incorrect type in argument 1 (different base types)
    expected unsigned long long const [usertype] *sw_action
    got restricted __be64 [usertype] *[assigned] sw_action

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
2020-05-29 21:20:21 -07:00
Eli Cohen
f7e3ac424a net/mlx5e: Use generic API to build MPLS label
Make use of generic API mpls_entry_encode() to build mpls label and get
rid of local function.
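
For readers unfamiliar with the encoding, here is a minimal userspace
sketch of how an MPLS label stack entry is packed, following the field
layout in RFC 3032. The function name mpls_entry_sketch() and the LS_*
macros are hypothetical stand-ins, not the kernel helper itself, and
the sketch returns the value in host order, whereas the entry is
big-endian on the wire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit offsets of the fields in a 32-bit label stack entry (RFC 3032). */
#define LS_LABEL_SHIFT 12	/* 20-bit label */
#define LS_TC_SHIFT     9	/* 3-bit traffic class */
#define LS_S_SHIFT      8	/* 1-bit bottom-of-stack flag */
#define LS_TTL_SHIFT    0	/* 8-bit time to live */

/* Hypothetical helper: pack the four fields into one host-order word. */
static uint32_t mpls_entry_sketch(uint32_t label, unsigned int ttl,
				  unsigned int tc, bool bos)
{
	/* Reject values that would spill into a neighbouring field. */
	assert(label < (1u << 20) && tc < 8 && ttl < 256);

	return (label << LS_LABEL_SHIFT) |
	       ((uint32_t)tc << LS_TC_SHIFT) |
	       ((uint32_t)(bos ? 1 : 0) << LS_S_SHIFT) |
	       ((uint32_t)ttl << LS_TTL_SHIFT);
}
```

A caller would still need to convert the result to big-endian before
placing it in a packet header.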

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 21:20:20 -07:00
Arnd Bergmann
e1167e1611 net/mlx5: reduce stack usage in qp_read_field
Moving the mlx5_ifc_query_qp_out_bits structure on the stack was a bit
excessive and now causes the compiler to complain on 32-bit architectures:

drivers/net/ethernet/mellanox/mlx5/core/debugfs.c: In function 'qp_read_field':
drivers/net/ethernet/mellanox/mlx5/core/debugfs.c:274:1: error: the frame size of 1104 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

Revert the previous patch partially to use dynamic allocation as
the code did before. Unfortunately there is no good error handling
in case the allocation fails.

Fixes: 57a6c5e992 ("net/mlx5: Replace hand written QP context struct with automatic getters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 21:20:20 -07:00
Nathan Chancellor
2861904697 net/mlx5e: Don't use err uninitialized in mlx5e_attach_decap
Clang warns:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:3712:6: warning:
variable 'err' is used uninitialized whenever 'if' condition is false
[-Wsometimes-uninitialized]
        if (IS_ERR(d->pkt_reformat)) {
            ^~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:3718:6: note:
uninitialized use occurs here
        if (err)
            ^~~
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:3712:2: note: remove the
'if' if its condition is always true
        if (IS_ERR(d->pkt_reformat)) {
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:3670:9: note: initialize
the variable 'err' to silence this warning
        int err;
               ^
                = 0
1 warning generated.

It is not wrong, err is only ever initialized in if statements but this
one is not in one. Initialize err to 0 to fix this.
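
The pattern the warning describes can be reduced to the sketch below.
attach_sketch() and its two parameters are hypothetical and only mimic
the control flow: err is assigned solely inside conditional branches,
so it must start at 0 for the final check to be well defined.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a function whose error code is only
 * assigned inside conditional branches. */
static int attach_sketch(bool step1_fails, bool step2_fails)
{
	int err = 0;	/* without "= 0", err is indeterminate when no branch runs */

	if (step1_fails)
		err = -1;
	else if (step2_fails)
		err = -2;

	/* This read is what the compiler flags when err has no initializer. */
	if (err)
		return err;

	return 0;
}
```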

Fixes: 14e6b038af ("net/mlx5e: Add support for hw decapsulation of MPLS over UDP")
Link: https://github.com/ClangBuiltLinux/linux/issues/1037
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 21:20:20 -07:00
Saeed Mahameed
2950d1d64f net/mlx5: Kconfig: Fix spelling typo
"mdoe"->"mode"

Fixes: d956873f90 ("net/mlx5e: Introduce kconfig var for TC support")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
2020-05-29 21:20:19 -07:00
Jesper Dangaard Brouer
56e2287b41 mlx5: fix xdp data_meta setup in mlx5e_fill_xdp_buff
The helper function xdp_set_data_meta_invalid() must be called after
setting xdp->data as it depends on it.

The bug was introduced in the cited patch below, and causes the kernel
to crash when using BPF helper bpf_xdp_adjust_head() on mlx5 driver.
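
The ordering dependency can be sketched in userspace as follows. The
struct and function names are simplified, hypothetical stand-ins for
the kernel's struct xdp_buff and its helpers; the sketch assumes the
"invalid" marker is encoded as data_meta pointing one byte past data,
which is why data has to be assigned first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-in for struct xdp_buff. */
struct xdp_buff_sketch {
	unsigned char *data;
	unsigned char *data_meta;
};

/* "No metadata" is expressed relative to data, so data must be valid. */
static void set_data_meta_invalid(struct xdp_buff_sketch *xdp)
{
	assert(xdp->data != NULL);	/* calling this too early is the bug */
	xdp->data_meta = xdp->data + 1;
}

static bool data_meta_unsupported(const struct xdp_buff_sketch *xdp)
{
	return xdp->data_meta > xdp->data;
}

/* Correct order: assign data first, then derive data_meta from it. */
static void fill_buff(struct xdp_buff_sketch *xdp, unsigned char *va,
		      size_t headroom)
{
	xdp->data = va + headroom;
	set_data_meta_invalid(xdp);
}
```

Swapping the two statements in fill_buff() would derive data_meta from
a stale data pointer, which matches the failure described above.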

Fixes: 39d6443c8d ("mlx5, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
Reported-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 21:20:19 -07:00
Thomas Falcon
784688993e drivers/net/ibmvnic: Update VNIC protocol version reporting
VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.
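
As an illustration of why the swap matters, the sketch below decodes a
big-endian field from its wire bytes and contrasts it with a
little-endian reinterpretation of the same bytes. The 16-bit width and
both function names are assumptions made for the example; the kernel
itself would use its own byte-order helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a big-endian 16-bit value; works on any host byte order. */
static uint16_t be16_decode(const uint8_t *wire)
{
	assert(wire != NULL);
	return (uint16_t)(((uint16_t)wire[0] << 8) | wire[1]);
}

/* What a little-endian host sees if it logs the raw field unswapped. */
static uint16_t raw_reinterpret_le(const uint8_t *wire)
{
	assert(wire != NULL);
	return (uint16_t)(((uint16_t)wire[1] << 8) | wire[0]);
}
```

With the wire bytes 0x00 0x01, the decoded version is 1, while the
unswapped reading is 256.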

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 17:20:59 -07:00
Louis Peens
f0b37fa613 nfp: flower: fix incorrect flag assignment
A previous refactoring missed some locations where the flags were renamed
but not moved from the previous flower_ext_feats to the new flower_en_feats
variable. This led to the FLOW_MERGE and LAG features not being enabled.

Fixes: e09303d3c4 ("nfp: flower: renaming of feature bits")
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 17:08:19 -07:00
Fugang Duan
94abdad697 net: ethernet: dwmac: add ethernet glue logic for NXP imx8 chip
NXP imx8 family like imx8mp/imx8dxl chips support Synopsys MAC 5.10a IP.
This patch adds settings for NXP imx8 glue layer:
- clocks
- dwmac address width
- phy interface mode selection
- adjust rgmii txclk rate

v2:
- adjust code sequences in order to have reverse christmas
  tree local variable ordering.

Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 17:01:26 -07:00
Fugang Duan
139df98bdf stmmac: platform: add "snps, dwmac-5.10a" IP compatible string
Add "snps,dwmac-5.10a" compatible string for the 5.10a version, which
avoids having to define some plat data in the glue layer.

Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 17:01:26 -07:00
Saeed Mahameed
971ae1ed03 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
  net/mlx5: Add ability to read and write ECE options
  net/mlx5: Add support for RDMA TX FT headers modifying
  net/mlx5: Move iseg access helper routines close to mlx5_core driver
  net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits
  net/mlx5: Add support in forward to namespace
  {IB/net}/mlx5: Simplify don't trap code
  net/mlx5: Replace zero-length array with flexible-array

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 14:38:57 -07:00
Pablo Neira Ayuso
a683012a8e net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()
The driver reports EINVAL to userspace through netlink on invalid meta
match. This is confusing since EINVAL is usually reserved for malformed
netlink messages. Replace it with more meaningful codes.

Fixes: 6d65bc64e2 ("net/mlx5e: Add mlx5e_flower_parse_meta support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 13:07:54 -07:00
Vlad Buslov
cb9a0641b5 net/mlx5e: Fix MLX5_TC_CT dependencies
Change MLX5_TC_CT config dependencies to include MLX5_ESWITCH instead of
MLX5_CORE_EN && NET_SWITCHDEV, which are already required by MLX5_ESWITCH.
Without this change mlx5 fails to compile if user disables MLX5_ESWITCH
without also manually disabling MLX5_TC_CT.

Fixes: 4c3844d9e9 ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 13:07:54 -07:00
Tal Gilboa
ebeaf084ad net/mlx5e: Properly set default values when disabling adaptive moderation
Add a call to mlx5e_reset_rx/tx_moderation() when enabling/disabling
adaptive moderation, in order to select the proper default values.

In order to do so, we separate the logic of selecting the moderation values
and setting moderation mode (CQE/EQE based).

Fixes: 0088cbbc4b ("net/mlx5e: Enable CQE based moderation on TX CQ")
Fixes: 9908aa2929 ("net/mlx5e: CQE based moderation")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 13:07:53 -07:00
Aya Levin
b623603bbb net/mlx5e: Fix arch depending casting issue in FEC
Change type of active_fec to u32 to match the type expected by
mlx5e_get_fec_mode. Copy active_fec and configured_fec values to
unsigned long before performing bitwise manipulations.
Take the same approach when configuring FEC over 50G link modes: copy
the policy into an unsigned long and only then perform bitwise
operations.

Fixes: 2132b71f78 ("net/mlx5e: Advertise globaly supported FEC modes")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 13:07:53 -07:00
Maor Dickman
20300aafa7 net/mlx5e: Remove warning "devices are not on same switch HW"
On tunnel decap rule insertion, the indirect mechanism will attempt to
offload the rule on all uplink representors which will trigger the
"devices are not on same switch HW, can't offload forwarding" message
for the uplink which isn't on the same switch HW as the VF representor.

The above flow is valid and shouldn't cause a warning message.
Fix this by removing the warning and reporting this flow only via extack.

Fixes: 321348475d ("net/mlx5e: Fix allowed tc redirect merged eswitch offload cases")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 13:07:53 -07:00
Roi Dayan
0a2a6f498f net/mlx5e: Fix stats update for matchall classifier
It's bytes, packets, lastused.

Fixes: fcb64c0f56 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 13:07:52 -07:00
Mark Bloch
8fc3e29be9 net/mlx5: Fix crash upon suspend/resume
Currently a Linux system with the mlx5 NIC always crashes upon
hibernation - suspend/resume.

Add basic callbacks so the NIC could be suspended and resumed.

Fixes: 9603b61de1 ("mlx5: Move pci device handling from mlx5_ib to mlx5_core")
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-29 13:07:52 -07:00
Bartosz Golaszewski
09b547a799 net: ethernet: mtk-star-emac: remove unused variable
The desc pointer is set but not used. Remove it.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 8c7bd5a454 ("net: ethernet: mtk-star-emac: new driver")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 12:42:04 -07:00
Hari
6a3faa4d7e e1000: Fix typo in the comment
A comment contained a doubled "the". Change it to a single "the".

Signed-off-by: Hari <harichandrakanthan@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:36:41 -07:00
Sasha Neftin
480b7a5a3f igc: Fix wrong register name
According to the i225 datasheet, this register address is used by
the Host Transmit Discarded Packet by MAC counter, not by the
Carrier Extension Error counter, which does not apply to this device.
This patch fixes the wrong definition.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:33:53 -07:00
Sasha Neftin
e2d0f2031e igc: Remove Sequence Error Counter
According to the i225 datasheet, the sequence error counter is not
applicable to the i225 device.
This patch removes this counter.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:32:04 -07:00
Sasha Neftin
51c657b42f igc: Add Receive Error Counter
The receive error counter reflects the total number of non-filtered
packets received with errors. This includes: CRC error,
symbol error, Rx data error and carrier extend error.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:30:06 -07:00
Sasha Neftin
758b51e1e7 igc: Remove symbol error counter
According to the i225 datasheet, the symbol error counter is not
applicable to the i225 device.
This patch removes this counter.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:26:57 -07:00
Jason Yan
3f6023f77a i40e: Make i40e_shutdown_adminq() return void
Fix the following coccicheck warning:

drivers/net/ethernet/intel/i40e/i40e_adminq.c:699:13-21: Unneeded
variable: "ret_code". Return "0" on line 710

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:19:47 -07:00
Punit Agrawal
d601afcae2 e1000e: Relax condition to trigger reset for ME workaround
It's an error if the value of the RX/TX tail descriptor does not match
what was written. The error condition is true regardless of the duration
of the interference from ME, but the driver only performs the reset if
E1000_ICH_FWSM_PCIM2PCI_COUNT (2000) iterations of 50us delay have
elapsed. The extra condition can lead to inconsistency between the
actual state of the hardware and the state expected by the driver.

Fix this by dropping the check for number of delay iterations.

While at it, also make __ew32_prepare() static as it's not used
anywhere else.

CC: stable <stable@vger.kernel.org>
Signed-off-by: Punit Agrawal <punit1.agrawal@toshiba.co.jp>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:18:04 -07:00
Andre Guedes
e087d3bbc4 igc: Fix IGC_MAX_RXNFC_RULES
IGC supports a total of 32 rules. 16 MAC address based, 8 VLAN priority
based, and 8 Ethertype based. This patch fixes IGC_MAX_RXNFC_RULES
accordingly.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:15:38 -07:00
Andre Guedes
3d3e9b6b6a igc: Reject NFC rules with multiple matches
The way Rx queue assignment based on mac address, Ethertype and VLAN
priority filtering operates in I225 doesn't allow us to properly support
NFC rules with multiple matches.

Consider the following example which assigns to queue 2 frames matching
the address MACADDR *and* Ethertype ETYPE.

$ ethtool -N eth0 flow-type ether dst <MACADDR> proto <ETYPE> queue 2

When such a rule is applied, we have 2 unwanted behaviors:

    1) Any frame matching MACADDR will be assigned to queue 2,
       regardless of the ETYPE value.

    2) Any accepted frame that has Ethertype equal to ETYPE, no matter
       the mac address, will be assigned to queue 2 as well.

In current code, multiple-match filters are accepted by the driver, even
though it doesn't support them properly. This patch adds a check for
multiple-match rules in igc_ethtool_is_nfc_rule_valid() so they are
rejected.
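
One way such a check could be expressed is shown below. This is a
sketch under assumptions: the MATCH_* flag names and
rule_has_single_match() are invented for illustration and are not
taken from the driver. It relies on the fact that a bitmask with
exactly one bit set becomes zero when ANDed with itself minus one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical match criteria, one bit each. */
#define MATCH_SRC_MAC   (1u << 0)
#define MATCH_DST_MAC   (1u << 1)
#define MATCH_ETYPE     (1u << 2)
#define MATCH_VLAN_PRIO (1u << 3)

/* Accept a rule only if exactly one match criterion is requested. */
static bool rule_has_single_match(uint32_t match_flags)
{
	/* x & (x - 1) clears the lowest set bit; a zero result means
	 * at most one bit was set. The != 0 test rules out "no match". */
	return match_flags != 0 &&
	       (match_flags & (match_flags - 1)) == 0;
}
```

Under this sketch, the ethtool example above (destination MAC plus
Ethertype) would set two bits and be rejected.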

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:13:11 -07:00
Sasha Neftin
2c3076f5ed igc: Remove unused flags
Transmit underrun, late and excess collision flags are not in use.
This patch removes these flags.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:05:20 -07:00
Jason Yan
49c65e95f3 igb: make igb_set_fc_watermarks() return void
This function always returns 0 now, so we can make it return void to
simplify the code. This fixes the following coccicheck warning:

drivers/net/ethernet/intel/igb/e1000_mac.c:728:5-12: Unneeded variable:
"ret_val". Return "0" on line 751

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:04:01 -07:00
YueHaibing
f2d9f29412 ixgbe: Remove unused inline function ixgbe_irq_disable_queues
commit b5f69ccf67 ("ixgbe: avoid bringing rings up/down as macvlans are added/removed")
left behind this, remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:01:39 -07:00
Jason Yan
c2d77e598b ixgbe: Use true, false for bool variable in __ixgbe_enable_sriov()
Fix the following coccicheck warning:

drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c:105:2-38: WARNING:
Assignment of 0/1 to bool variable

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 20:00:02 -07:00
Jason Yan
85c41c5b16 ixgbe: Remove conversion to bool in ixgbe_device_supports_autoneg_fc()
No need to convert '==' expression to bool. This fixes the following
coccicheck warning:

drivers/net/ethernet/intel/ixgbe/ixgbe_common.c:68:11-16: WARNING:
conversion to bool not needed here

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 19:58:34 -07:00
Xie XiuQi
3b70683fc4 ixgbe: fix signed-integer-overflow warning
UBSAN reports this warning; fix it by adding an unsigned suffix.

UBSAN: signed-integer-overflow in
drivers/net/ethernet/intel/ixgbe/ixgbe_common.c:2246:26
65535 * 65537 cannot be represented in type 'int'
CPU: 21 PID: 7 Comm: kworker/u256:0 Not tainted 5.7.0-rc3-debug+ #39
Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 03/27/2020
Workqueue: ixgbe ixgbe_service_task [ixgbe]
Call trace:
 dump_backtrace+0x0/0x3f0
 show_stack+0x28/0x38
 dump_stack+0x154/0x1e4
 ubsan_epilogue+0x18/0x60
 handle_overflow+0xf8/0x148
 __ubsan_handle_mul_overflow+0x34/0x48
 ixgbe_fc_enable_generic+0x4d0/0x590 [ixgbe]
 ixgbe_service_task+0xc20/0x1f78 [ixgbe]
 process_one_work+0x8f0/0xf18
 worker_thread+0x430/0x6d0
 kthread+0x218/0x238
 ret_from_fork+0x10/0x18
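
The arithmetic behind the report can be sketched as follows. The
function name replicate_pause_time() is hypothetical; the sketch
assumes a 16-bit value that is copied into both halves of a 32-bit
word by multiplying with 0x00010001, and a 32-bit int.

```c
#include <stdint.h>

/* Hypothetical helper: duplicate a 16-bit value into both halves of a
 * 32-bit word. */
static uint32_t replicate_pause_time(uint16_t pause_time)
{
	/* A uint16_t operand is promoted to int. With a plain 0x00010001
	 * constant (also int), 65535 * 65537 exceeds INT_MAX, which is
	 * undefined behaviour. The U suffix makes the whole multiplication
	 * unsigned, where the result 0xFFFFFFFF is representable. */
	return pause_time * 0x00010001U;
}
```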

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 19:56:40 -07:00
Jesper Dangaard Brouer
e92c0e0235 i40e: trivial fixup of comments in i40e_xsk.c
The comment above i40e_run_xdp_zc() was clearly copy-pasted from
function i40e_xsk_umem_setup, which is just above.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 19:55:17 -07:00
Takashi Iwai
c28481a88c i40e: Use scnprintf() for avoiding potential buffer overflow
Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit.  Fix it by replacing with scnprintf().
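
The difference can be illustrated with a userspace approximation.
scnprintf_sketch() below is a hypothetical stand-in that mimics the
documented return-value semantics of the kernel helper (number of
characters actually stored, excluding the terminating NUL); it is not
the kernel implementation. join_numbers() is likewise invented to show
the append-in-a-loop pattern.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Return the number of characters actually stored in buf, never the
 * would-be length that snprintf() reports on truncation. */
static size_t scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	return (size_t)n < size ? (size_t)n : size - 1;
}

/* Append several values; len can never exceed size - 1, so the
 * remaining-space computation size - len cannot wrap around. */
static size_t join_numbers(char *buf, size_t size, const int *vals,
			   size_t count)
{
	size_t len = 0;

	for (size_t i = 0; i < count; i++)
		len += scnprintf_sketch(buf + len, size - len, "%d,", vals[i]);

	return len;
}
```

Had the loop accumulated snprintf() return values instead, len could
grow past size after the first truncation and the next call would
write outside the buffer.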

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 19:51:51 -07:00
Huazhong Tan
ead38a8537 net: hns3: print out speed info when parsing speed fails
When calling hclge_parse_speed() fails, printing out the speed is
helpful for debugging.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:04 -07:00
Huazhong Tan
7c6643cac0 net: hns3: remove some unused fields in struct hclge_dev
Remove some fields in struct hclge_dev which have not been used.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:04 -07:00
Huazhong Tan
9cee2e8d30 net: hns3: remove two duplicated register macros in hclgevf_main.h
HCLGEVF_CMDQ_INTR_SRC_REG and HCLGEVF_CMDQ_INTR_STS_REG are the same
as HCLGEVF_VECTOR0_CMDQ_SRC_REG and HCLGEVF_VECTOR0_CMDQ_STAT_REG.
Replace the former with the latter, and rename macro
HCLGEVF_VECTOR0_CMDQ_STAT_REG since 'stat' is not an abbreviation of
'state'.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
4828b5766a net: hns3: remove unused struct hnae3_unic_private_info
Since field .uinfo in struct hnae3_handle is never used,
remove it and its structure definition.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
c496299e06 net: hns3; remove unused HNAE3_RESTORE_CLIENT in enum hnae3_reset_notify_type
Remove HNAE3_RESTORE_CLIENT which is not needed now.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
5e86178dce net: hns3: remove some unused fields in struct hns3_nic_priv
Remove some fields which are defined in struct hns3_nic_priv
but not used, and remove the related definitions of struct
hns3_udp_tunnel and enum hns3_udp_tnl_type.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
fb9e44d63d net: hns3: modify an incorrect type in struct hclgevf_cfg_gro_status_cmd
Modify field .gro_en in struct hclgevf_cfg_gro_status_cmd to u8
according to the UM, otherwise, it will overwrite the reserved
byte which may be used for other purpose.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
639d84d0c4 net: hns3: modify an incorrect type in struct hclge_cfg_gro_status_cmd
Modify field .gro_en in struct hclge_cfg_gro_status_cmd to u8
according to the UM, otherwise, it will overwrite the reserved
byte which may be used for other purpose.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
5caa039f32 net: hns3: refactor hclge_query_bd_num_cmd_send()
In order to improve code maintainability and readability, rewrite
the process of BDs' initialization in hclge_query_bd_num_cmd_send().

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
9f5a981606 net: hns3: refactor hclge_config_tso()
Since parameters 'tso_mss_min' and 'tso_mss_max' only indicate
the minimum and maximum MSS, the hnae3_set_field() calls are
meaningless, remove them and change the type of these two
parameters to u16.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
9516352150 net: hns3: add a missing mutex destroy in hclge_init_ad_dev()
Add a mutex destroy call in hclge_init_ae_dev() when it fails.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Huazhong Tan
2421ee2477 net: hns3: remove an unnecessary 'goto' in hclge_init_ae_dev()
Remove the redundant 'goto' and return -ENOMEM directly, when
allocating memory for 'hdev' fails in hclge_init_ae_dev().

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:39:03 -07:00
Marek Vasut
72628da6d6 net: ks8851: Remove ks8851_mll.c
The ks8851_mll.c is replaced by ks8851_par.c, which is using common code
from ks8851.c, just like ks8851_spi.c. Remove this old ad-hoc driver.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
797047f875 net: ks8851: Implement Parallel bus operations
Implement accessors for KS8851-16MLL/MLLI/MLLU parallel bus variant of
the KS8851. This is based off the ks8851_mll.c, which is a driver for
exactly the same hardware, however the ks8851.c code is much higher
quality. Hence, this patch pulls out the relevant information from the
ks8851_mll.c on how to access the bus, but uses the common ks8851.c
code. To make this patch reviewable, instead of rewriting ks8851_mll.c,
ks8851_mll.c is removed in a separate subsequent patch.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
b07f987a8d net: ks8851: Separate SPI operations into separate file
Pull all the SPI bus specific code into a separate file, so that it is
not mixed with the common code. Rename ks8851.c to ks8851_common.c. The
ks8851_common.c is linked with ks8851_spi.c now, so it can call the
accessors in the ks8851_spi.c without any pointer indirection.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
7a552c850c net: ks8851: Implement register, FIFO, lock accessor callbacks
The register and FIFO accessors are bus specific, so is locking.
Implement callbacks so that each variant of the KS8851 can implement
matching accessors and locking, and use the rest of the common code.
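
The general shape of such a callback table can be sketched as below.
Everything here is hypothetical: struct ks_bus_ops, struct ks_sketch,
the member names and the in-memory "fake" bus are invented for
illustration and should not be read as the driver's actual layout.

```c
#include <stdint.h>

struct ks_sketch;

/* Hypothetical per-bus operations supplied by each driver variant. */
struct ks_bus_ops {
	void     (*lock)(struct ks_sketch *ks);
	void     (*unlock)(struct ks_sketch *ks);
	uint16_t (*rdreg16)(struct ks_sketch *ks, unsigned int reg);
	void     (*wrreg16)(struct ks_sketch *ks, unsigned int reg,
			    uint16_t val);
};

struct ks_sketch {
	const struct ks_bus_ops *ops;
	uint16_t regs[16];	/* fake register file standing in for hardware */
	int lock_depth;
};

/* Common code: a bus-agnostic read-modify-write built on the callbacks. */
static void ks_set_bits(struct ks_sketch *ks, unsigned int reg, uint16_t bits)
{
	ks->ops->lock(ks);
	ks->ops->wrreg16(ks, reg, ks->ops->rdreg16(ks, reg) | bits);
	ks->ops->unlock(ks);
}

/* One fake bus implementation backed by the in-memory register file. */
static void fake_lock(struct ks_sketch *ks)   { ks->lock_depth++; }
static void fake_unlock(struct ks_sketch *ks) { ks->lock_depth--; }

static uint16_t fake_rdreg16(struct ks_sketch *ks, unsigned int reg)
{
	return ks->regs[reg % 16];
}

static void fake_wrreg16(struct ks_sketch *ks, unsigned int reg, uint16_t val)
{
	ks->regs[reg % 16] = val;
}

static const struct ks_bus_ops fake_bus_ops = {
	.lock = fake_lock,
	.unlock = fake_unlock,
	.rdreg16 = fake_rdreg16,
	.wrreg16 = fake_wrreg16,
};
```

A second bus variant would provide its own ops table while reusing
ks_set_bits() and the rest of the common code unchanged.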

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
d2a1c643a0 net: ks8851: Permit overridding interrupt enable register
The parallel bus variant does not need to use the TX interrupt at all
as it writes the TX FIFO directly within .ndo_start_xmit. Permit the
drivers to configure the interrupt enable bits.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
144ad36c3d net: ks8851: Factor out TX work flush function
While the SPI version of the KS8851 requires a TX worker thread to pump
data via SPI, the parallel bus version can write data into the TX FIFO
directly in .ndo_start_xmit, as the parallel bus access is much faster
and does not sleep. Factor out this TX work flush part, so it can be
overridden by the parallel bus driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
24be72632c net: ks8851: Split out SPI specific code from probe() and remove()
Factor out common code into ks8851_probe_common() and
ks8851_remove_common() to permit both SPI and parallel
bus driver variants to use the common code path for
both probing and removal.

There should be no functional change.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
d48b7634c6 net: ks8851: Split out SPI specific entries in struct ks8851_net
Add a new struct ks8851_net_spi, which embeds the original
struct ks8851_net and contains the entries specific only to
the SPI variant of KS8851.
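
A minimal sketch of the embedding pattern, assuming a container_of()-style helper is used to get from the common struct back to the bus-specific wrapper. All demo_* names and fields are hypothetical and do not reflect the driver's real members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical common state shared by all bus variants. */
struct demo_net {
	unsigned int rc_ier;	/* illustrative field only */
};

/* Hypothetical SPI-only wrapper that embeds the common struct by value. */
struct demo_net_spi {
	struct demo_net ks;	/* common part */
	int spi_chip_select;	/* stand-in for SPI-only members */
};

/* Userspace equivalent of the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the SPI wrapper from a pointer to the embedded common struct. */
static struct demo_net_spi *to_demo_net_spi(struct demo_net *ks)
{
	assert(ks != NULL);
	return demo_container_of(ks, struct demo_net_spi, ks);
}

/* SPI-specific code can be handed the common pointer and still reach its data. */
static int demo_spi_chip_select(struct demo_net *ks)
{
	return to_demo_net_spi(ks)->spi_chip_select;
}
```

Common code only ever sees struct demo_net, so it needs no knowledge of the SPI-only members.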

There should be no functional change.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
18a3df7309 net: ks8851: Factor out SKB receive function
Factor out this netif_rx_ni(), so it could be overridden by the parallel
bus variant of the KS8851 driver.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
2272602005 net: ks8851: Factor out bus lock handling
Pull out bus access locking code into separate functions. This is done
in preparation for unifying the driver with the parallel bus one. The
parallel bus driver does not need heavy mutex locking of the bus and
works better with spinlocks, so prepare these locking functions to
be overridden later.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
aa39bf6730 net: ks8851: Use 16-bit read of RXFC register
The RXFC register is the only one being read using 8-bit accessors.
To make it easier to support the 16-bit accesses used by the parallel
bus variant of KS8851, use 16-bit accessor to read RXFC register as
well as neighboring RXFCTR register.

Remove ks8851_rdreg8() as it is not used anywhere anymore.

There should be no functional change.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
88cfedd0d7 net: ks8851: Use 16-bit writes to program MAC address
On the SPI variant of KS8851, the MAC address can be programmed with
either 8/16/32-bit writes. To make it easier to support the 16-bit
parallel option of KS8851 too, switch both the MAC address programming
and readout to 16-bit operations.

Remove ks8851_wrreg8() as it is not used anywhere anymore.

There should be no functional change.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
806f66495e net: ks8851: Remove ks8851_rdreg32()
The ks8851_rdreg32() is used only in one place, to read two registers
using a single read. To make it easier to support 16-bit accesses via
parallel bus later on, replace this single read with two 16-bit reads
from each of the registers and drop the ks8851_rdreg32() altogether.
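
As a rough illustration of the replacement, the sketch below reads two neighboring 16-bit registers with two separate accesses. The register array and the demo_* helpers are hypothetical; they only model 16-bit registers at even byte offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file: 16-bit registers at even byte offsets. */
static uint16_t demo_regs[8];

/* Single 16-bit register read, the only access width used here. */
static uint16_t demo_rdreg16(unsigned int reg)
{
	assert(reg % 2 == 0 && reg / 2 < 8);
	return demo_regs[reg / 2];
}

/* Read two neighboring registers separately instead of as one 32-bit word. */
static void demo_read_pair(unsigned int reg, uint16_t *first, uint16_t *second)
{
	*first = demo_rdreg16(reg);
	*second = demo_rdreg16(reg + 2);
}
```

The cost is one extra bus transaction per pair, which is the performance trade-off the message goes on to discuss.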

If this has noticeable performance impact on the SPI variant of KS8851,
then we should consider using regmap to abstract the SPI and parallel
bus options and in case of SPI, permit regmap to merge register reads
of neighboring registers into single, longer, read.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
2c5b0a86ac net: ks8851: Use dev_{get,set}_drvdata()
Replace spi_{get,set}_drvdata() with dev_{get,set}_drvdata(), which
works for both SPI and platform drivers. This is done in preparation
for unifying the KS8851 SPI and parallel bus drivers.

There should be no functional change.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
b6948e1b7b net: ks8851: Use devm_alloc_etherdev()
Use device managed version of alloc_etherdev() to simplify the code.
No functional change intended.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
848fc0ce6c net: ks8851: Pass device node into ks8851_init_mac()
Since the driver probe function already has a struct device *dev pointer
and can easily derive of_node pointer from it, pass the of_node pointer as
a parameter to ks8851_init_mac() to avoid fishing it out from ks->spidev.
This is the only reference to spidev in the function, so get rid of it.
This is done in preparation for unifying the KS8851 SPI and parallel bus
drivers.

No functional change.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:04 -07:00
Marek Vasut
2f3271c952 net: ks8851: Replace dev_err() with netdev_err() in IRQ handler
Use netdev_err() instead of dev_err() to avoid accessing the spidev->dev
in the interrupt handler. This is the only place which uses the spidev
in this function, so replace it with netdev_err() to get rid of it. This
is done in preparation for unifying the KS8851 SPI and parallel drivers.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:03 -07:00
Marek Vasut
bfd1e0eb08 net: ks8851: Rename ndev to netdev in probe
Rename ndev variable to netdev for the sake of consistency.

No functional change.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:03 -07:00
Marek Vasut
d320692d9f net: ks8851: Factor out spi->dev in probe()/remove()
Pull out the spi->dev into one common place in the function instead of
having it repeated over and over again. This is done in preparation for
unifying ks8851 and ks8851-mll drivers. No functional change.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 16:30:03 -07:00
Brett Creeley
3726cce258 ice: Refactor VF VSI release and setup functions
Currently when a VF VSI calls ice_vsi_release() and ice_vsi_setup() it
subsequently clears/sets the VF cached variables for lan_vsi_idx and
lan_vsi_num. This works fine, but can be improved by handling this in
the VF specific VSI release and setup functions.

Also, when a VF VSI is set up, too many parameters are passed that can be
derived from the VF. Fix this by only calling VF VSI setup with the bare
minimum parameters.

Also, add functionality to invalidate a VF's VSI when it's released
and/or setup fails. This will make it so a VF VSI cannot be accessed via
its cached vsi_idx/vsi_num in these cases.

Finally when a VF's VSI is invalidated set the lan_vsi_idx and
lan_vsi_num to ICE_NO_VSI to clearly show that there is no valid VSI
associated with this VF.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 16:25:19 -07:00
Brett Creeley
12bb018c53 ice: Refactor VF reset
Currently VF VSIs are being reset twice during a PFR or greater. This
makes reset, specifically resetting all VFs, take too long, which in
turn causes various issues with VF drivers not being able to gracefully
handle the VF reset timeout. Fix this by refactoring how VF reset is
handled for the case mentioned previously and for the VFR/VFLR case.

The refactor was done by doing the following:

1. Removing the call to ice_vsi_rebuild_by_type for
   ICE_VSI_VF VSI, which was causing the initial VSI rebuild.

2. Adding functions for pre/post VSI rebuild functions that can be called
   in both the reset all VFs case and reset individual VF case.

3. Adding VSI rebuild functions that are specific for the reset all VFs
   case and adding functions that are specific for the reset individual
   VF case.

4. Calling the pre-rebuild function, then the specific VSI rebuild
   function based on the reset type, and then calling the post-rebuild
   function to handle VF resets.

This patch series makes some assumptions about how VSIs are handled by
FW during reset:

1. During a PFR or greater all VSI in FW will be cleared.
2. During a VFR/VFLR the VSI rebuild responsibility is in the hands of
   the PF software.
3. There is code in the ice_reset_all_vfs() case to amortize operations
   if possible. This was left intact.
4. PF software should not be replaying VSI based filters other than
   those that are host configured, PF software configured, or the VF's
   default/LAA MAC. This is the VF driver's job after it has been reset.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 16:24:02 -07:00
Paul Greenwalt
a58e1d8174 ice: remove VM/VF disable command on CORER/GLOBR reset
Remove VM/VF disable AQC (opcode 0x0C31) when resetting all VFs.
This is not required for CORER/GLOBR reset.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 16:11:26 -07:00
Brett Creeley
350e822cd5 ice: Add functions to rebuild host VLAN/MAC config for a VF
When resetting a VF the VLAN and MAC filter configurations need to be
replayed. Add helper functions for this purpose.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 16:10:02 -07:00
Brett Creeley
eb2af3ee94 ice: Add function to set trust mode bit on reset
As the title says, use a function to set trust mode bit on reset.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 16:07:21 -07:00
Brett Creeley
a06325a090 ice: Renaming and simplification in VF init path
Some function names weren't very clear and some portions of VF creation
could be moved into functions for clarity. Fix this by renaming some
functions and moving pieces of code into clearly named functions.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 16:03:51 -07:00
Brett Creeley
916c7fdf5e ice: Separate VF VSI initialization/creation from reset flow
Currently the same flow is used for VF VSI initialization/creation and VF
VSI reset. This makes the initialization/creation flow unnecessarily
complicated. Fix this by separating the initialization/creation of the
VF VSI from the reset flow.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 16:01:39 -07:00
Brett Creeley
cfcee02b6c ice: Add helper function for clearing VPGEN_VFRTRIG
Create a helper function for clearing VPGEN_VFRTRIG as this needs to be
done on reset to notify the VF that we are done resetting it. Also, it
needs to be done on SR-IOV initialization/creation in case it was left
in a bad state after SR-IOV tear down.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:59:18 -07:00
Brett Creeley
02337f1f59 ice: Simplify ice_sriov_configure
Add a new function for checking if SR-IOV can be configured based on
the PF and/or device's state/capabilities. Also, simplify the flow in
ice_sriov_configure().

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:57:40 -07:00
Brett Creeley
ac3716134a ice: Refactor ice_ena_vf_mappings to split MSIX and queue mappings
Currently ice_ena_vf_mappings() does all of the VF's MSIX and queue
mapping in one function. This makes it hard to digest. Fix this by
creating a new function for enabling MSIX mappings and one for enabling
queue mappings.

Also, rename some variables in the functions for clarity.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:55:54 -07:00
Tony Nguyen
d3112cd1ab ice: Declare functions static
ice_get_pfa_module_tlv() and ice_read_sr_word() are not being called
outside of their file. Declare them as static.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:53:21 -07:00
Jacob Keller
c2b313b783 ice: fix kernel BUG if register_netdev fails
If register_netdev() fails, the driver will attempt to cleanup the
q_vectors and inadvertently trigger a kernel BUG due to a NULL pointer
dereference.

This occurs because cleaning up q_vectors attempts to call
netif_napi_del on napi_structs which were never initialized.

Resolve this by releasing the netdev in ice_cfg_netdev and setting
vsi->netdev to NULL. This ensures that after ice_cfg_netdev fails the
state is rewound to match as if ice_cfg_netdev was never called.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:51:28 -07:00
Jacob Keller
bc3a024101 ice: fix potential double free in probe unrolling
If ice_init_interrupt_scheme fails, ice_probe will jump to clearing up
the interrupts. This can lead to some static analysis tools such as the
compiler sanitizers complaining about double free problems.

Since ice_init_interrupt_scheme already unrolls internally on failure,
there is no need to call ice_clear_interrupt_scheme when it fails. Add
a new unroll label and use that instead.
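
The unroll-label idea can be sketched with goto-based cleanup in userspace C. The demo_* names, the fail_* knobs and the malloc()-backed resources are hypothetical and only model the control flow, not the ice driver itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resources; names are illustrative only. */
struct demo_dev {
	void *hw;
	void *irqs;
	int fail_irq_init;	/* knob: force the interrupt step to fail */
	int fail_late;		/* knob: force a later step to fail */
	int irq_clear_calls;	/* counts explicit interrupt cleanups */
};

/* Cleans up after itself on failure, so callers must not clean up again. */
static int demo_init_interrupts(struct demo_dev *d)
{
	d->irqs = malloc(64);
	if (!d->irqs)
		return -1;
	if (d->fail_irq_init) {
		free(d->irqs);	/* internal unroll */
		d->irqs = NULL;
		return -1;
	}
	return 0;
}

static void demo_clear_interrupts(struct demo_dev *d)
{
	d->irq_clear_calls++;
	free(d->irqs);
	d->irqs = NULL;
}

static int demo_probe(struct demo_dev *d)
{
	int err;

	assert(d != NULL);

	d->hw = malloc(64);
	if (!d->hw)
		return -1;

	err = demo_init_interrupts(d);
	if (err)
		goto err_free_hw;	/* already unrolled: skip interrupt cleanup */

	if (d->fail_late) {
		err = -1;
		goto err_clear_irqs;	/* interrupts exist here, release them */
	}

	return 0;

err_clear_irqs:
	demo_clear_interrupts(d);
err_free_hw:
	free(d->hw);
	d->hw = NULL;
	return err;
}
```

Each label undoes exactly the steps that completed before the failure, so a step that unrolls itself never has its cleanup run a second time.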

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:50:01 -07:00
Jacob Keller
072064a43e ice: cleanup VSI context initialization
Remove an unnecessary copy of vsi->info into ctxt->info in ice_vsi_init.
This line is essentially a no-op because ice_set_dflt_vsi_ctx performs
a memset to clear the info from the context structure.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:48:39 -07:00
Anirudh Venkataramanan
9918f2d22f ice: Poll for reset completion when DDP load fails
There are certain cases where the DDP load fails and the FW issues a
core reset. For these cases, wait for reset to complete before
proceeding with the rest of the driver init.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28 15:40:12 -07:00
Arnd Bergmann
b113cabd43 sfc: avoid an unused-variable warning
'nic_data' is no longer used outside of the #ifdef block
in efx_ef10_set_mac_address:

drivers/net/ethernet/sfc/ef10.c:3231:28: error: unused variable 'nic_data' [-Werror,-Wunused-variable]
        struct efx_ef10_nic_data *nic_data = efx->nic_data;

Move the variable into a local scope.

Fixes: dfcabb0788 ("sfc: move vport_id to struct efx_nic")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 12:49:06 -07:00
David S. Miller
62c027883c Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-05-27

This series contains updates to the ice driver only.

Jesse fixes a number of issues, starting with fixing the remaining
signed versus unsigned comparison issues.  Cleaned up an unused code
define.  Fixed the implementation of the manage MAC write command, to
simplify it by using a simple array to represent the MAC address when
writing it.

Paul fixes the setting of the VF default LAN address, by removing a
check that assumed that the address had been deleted and zeroed.

Surabhi prevents a memory leak on filter management initialization
failures and during queue initialization and buffer allocation failures.

Brett adds additional receive error counters that are reported by
ethtool.  Fixed the enabling and disabling of VLAN stripping when the
PVID has been set.

Evan fixes a race condition between the firmware and software, which can
occur between the admin queue setup and the first command sent.

Marta fixes the driver so that when XDP transmit rings are destroyed,
the XDP transmit queues are destroyed as well.  Updates the statistics
when XDP transmit programs are loaded and packets are sent.  Changes the
number of XDP transmit queues to match the number of receive queues,
instead of matching the number of transmit queues.

Bruce avoids undefined behavior by not writing the 8-bit element
init_q_state with the associated internal-to-hardware field which is
122-bits.

Anirudh (Ani) refactors the receive checksum checks.

Krzysztof notifies the user if the fill queue is not long enough to
prepare all buffers before packet processing starts and allocates the
buffers during the NAPI poll.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 11:17:20 -07:00
David S. Miller
1eba1110f0 mlx5-updates-2020-05-26
Updates highlights:
 
 1) From Vu Pham (8): Support VM traffic failover with bonded VF
 representors and e-switch egress/ingress ACLs
 
 This series introduces support for a Virtual Machine running I/O
 traffic over the direct/fast VF path and failing over to the slower
 paravirtualized path using the following features:
 
      __________________________________
     |  VM      _________________        |
     |          |FAILOVER device |       |
     |          |________________|       |
     |                  |                |
     |              ____|_____           |
     |              |         |          |
     |       ______ |___  ____|_______   |
     |       |  VF PT  |  |VIRTIO-NET |  |
     |       | device  |  | device    |  |
     |       |_________|  |___________|  |
     |___________|______________|________|
                 |              |
                 | HYPERVISOR   |
                 |          ____|______
                 |         |  macvtap  |
                 |         |virtio BE  |
                 |         |___________|
                 |               |
                 |           ____|_____
                 |           |host VF  |
                 |           |_________|
                 |               |
            _____|______    _____|_____
            |  PT VF    |  |  host VF  |
            |representor|  |representor|
            |___________|  |___________|
                 \               /
                  \             /
                   \           /
                    \         /                     _________________
                     \_______/                     |                |
                  _______|________                 |    V-SWITCH    |
                 |VF representors |________________|      (OVS)     |
                 |      bond      |                |________________|
                 |________________|                        |
                                                   ________|________
                                                  |    Uplink       |
                                                  |  representor    |
                                                  |_________________|
 
 Summary:
 --------
 Problem statement:
 ------------------
 Currently, in the above topology, when a netfailover device is
 configured using VFs and eswitch VF representors, and traffic fails
 over to the stand-by VF, which is exposed to the guest VM using a
 macvtap device, the eswitch fails to switch the traffic to the stand-by
 VF representor. This occurs because there is no knowledge at eswitch
 level of the stand-by representor device.
 
 Solution:
 ---------
 Using the standard bonding driver, a bond netdevice is created over the
 VF representor devices and is used for offloading tc rules.
 Two VF representors are bonded together, one for the passthrough VF
 device and another one for the stand-by VF device.
 With this solution, the mlx5 driver listens to the failover events
 occurring at the bond device level and fails traffic over to whichever
 VF representor of the bond is active.
 
 a. VM with a netfailover device made of a VF pass-thru (PT) device and a
    virtio-net paravirtualized device with the same MAC address, to
    handle failover traffic at VM level.
 
 b. Host bond in active-standby mode, with the lower devices being the VM
    VF PT representor and the representor of the 2nd VF, to handle
    failover traffic at Hypervisor/V-Switch OVS level.
    - During the steady state (fast datapath): set the bond active
      device to be the VM PT VF representor.
    - During failover: apply bond failover to the second VF representor
      device which connects to the VM non-accelerated path.
 
 c. E-Switch ingress/egress ACL tables to support failover traffic at
    E-Switch level
    I. E-Switch egress ACL with forward-to-vport rule:
      - By default, the eswitch vport egress ACL forwards packets to its
        counterpart NIC vport.
      - During port failover, the egress ACL forward-to-vport rule is
        added to the e-switch vport of the passive/inactive slave VF
        representor to forward packets to the other e-switch vport, i.e.
        the active slave representor's e-switch vport, to handle egress
        "failover" traffic.
      - A lower change netdev event is used to detect that a representor
        is a lower dev (slave) of the bond and has become active; the
        egress ACL forward-to-vport rule is then added for all other
        slave netdevs to forward to this representor's vport.
      - An upper change netdev event is used to detect a representor
        unslaving from the bond device, in order to delete its vport's
        egress ACL forward-to-vport rule.
 
    II. E-Switch ingress ACL metadata reg_c for match
      - Bonded representors' vports sharing a tc block have the same
        root ingress ACL table and a unique metadata for match.
      - Traffic from both representors' vports is tagged with the same
        unique metadata reg_c.
      - An upper change netdev event is used to detect a representor
        enslaving/unslaving from the bond device, in order to set up the
        shared root ingress ACL and unique metadata.
 
 2) From Alex Vesker (2): Split RX and TX lock for parallel rule insertion in
 software steering
 
 3) From Eli Britstein (2): Optimize performance for IPv4/IPv6 ethertype
 matching by using the HW ip_version register rather than parsing eth
 frames for ethertype.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7PEFAACgkQSD+KveBX
 +j4Z5Af+NYwihYZpQYBBN00K7Wu10XZ65u5MbGSDmzpdN62w0kKfjsJ70bb9aiws
 h8LC7lspdMLRMMn9pWwFKshyF6RoSD9Ku3ZYhUbtj+hJLElAd9IwGt6pPKr8hPDd
 9h+ZcBkacdhNwWKf7CKThic0c/0PLdVyzRysHxcQWKSMPCTdgiL5Z3PQHA0TM6J3
 6Excs2z7kSuuyyxQ1cyWCaqSz4rqCrYyd8Ws4HOPhXgSbX14Q3mtMsBDayx2gHNW
 rdVbaNN6s2o0TxbrCwd0AaNP3UWcnjNqu1ohxgJiSe8y+MHMoB0OMoO+6vQJnwNI
 bzpZEioswV1zdgK3qNmXqbHOiHRSVQ==
 =xM1D
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-05-26

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 11:04:12 -07:00
Krzysztof Kazimierczak
3f0d97cdfe ice: Check UMEM FQ size when allocating bufs
If a UMEM is present on a queue when an interface/queue pair is being
enabled, the driver will try to prepare the Rx buffers in advance to
improve performance. However, if fill queue is shorter than HW Rx ring,
the driver will report failure after getting the last address from the
fill queue.

This still lets the driver process the packets correctly during the NAPI
poll, but leads to a constant NAPI rescheduling. Not allocating the
buffers in advance would result in a potential performance decrease.

Commit d57d76428a ("xsk: Add API to check for available entries in FQ")
provides an API that lets drivers check the number of addresses that the
fill queue holds.

Notify the user if fill queue is not long enough to prepare all buffers
before packet processing starts, and allocate the buffers during the
NAPI poll. If the fill queue size is sufficient, prepare Rx buffers in
advance.
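
The decision described above can be sketched in plain C. This is only an illustration under stated assumptions: `plan_rx_prefill()` and `struct rx_prefill_plan` are hypothetical names, not part of the ice driver or the xsk API, and the fill queue occupancy is passed in as a plain number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcome of the pre-allocation decision for one queue pair (hypothetical). */
struct rx_prefill_plan {
	bool prefill;   /* prepare all Rx buffers before packet processing */
	bool warn_user; /* fill queue is too short: notify the user */
};

/*
 * fq_entries:  number of addresses currently held by the fill queue
 * rx_ring_len: number of descriptors in the HW Rx ring
 */
static struct rx_prefill_plan plan_rx_prefill(uint32_t fq_entries,
					      uint32_t rx_ring_len)
{
	struct rx_prefill_plan plan = { .prefill = true, .warn_user = false };

	if (fq_entries < rx_ring_len) {
		/* Defer allocation to the NAPI poll instead of reporting failure. */
		plan.prefill = false;
		plan.warn_user = true;
	}
	return plan;
}
```

With a 512-entry fill queue and a 1024-entry Rx ring, the sketch defers allocation and flags a warning; with equal sizes it prepares the buffers in advance.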

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 18:13:59 -07:00
Alex Vesker
ed03a418ab net/mlx5: DR, Split RX and TX lock for parallel insertion
Change the locking flow to support RX and TX locks. Splitting
the single lock in two allows inserting rules in parallel
for the RX and TX parts of the FDB.

Locking the dr_domain will be done by locking the RX domain
and the TX domain locks; this is mostly used for control operations
on the dr_domain. When inserting rules for RX or TX, the single
nic_domain RX or TX lock will be used. Splitting the lock is safe since
RX and TX domains are logically separated from each other, and shared
objects such as the send-ring and memory pool are protected by locks.
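
A minimal user-space sketch of this locking scheme, assuming pthread mutexes in place of the kernel locks. All type and function names here are hypothetical and are not the mlx5 software steering API.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the RX or TX half of a steering domain. */
struct nic_domain_sketch {
	pthread_mutex_t lock;
	int rule_count;
};

struct dr_domain_sketch {
	struct nic_domain_sketch rx;
	struct nic_domain_sketch tx;
};

static void domain_init(struct dr_domain_sketch *dmn)
{
	pthread_mutex_init(&dmn->rx.lock, NULL);
	pthread_mutex_init(&dmn->tx.lock, NULL);
	dmn->rx.rule_count = 0;
	dmn->tx.rule_count = 0;
}

/* Control operations take both locks, always in the same order (RX, TX). */
static void domain_lock(struct dr_domain_sketch *dmn)
{
	pthread_mutex_lock(&dmn->rx.lock);
	pthread_mutex_lock(&dmn->tx.lock);
}

static void domain_unlock(struct dr_domain_sketch *dmn)
{
	pthread_mutex_unlock(&dmn->tx.lock);
	pthread_mutex_unlock(&dmn->rx.lock);
}

/* Rule insertion only takes the lock of the side it touches. */
static void insert_rule(struct nic_domain_sketch *nic)
{
	pthread_mutex_lock(&nic->lock);
	nic->rule_count++;
	pthread_mutex_unlock(&nic->lock);
}

/* Example of a control operation that needs a consistent view of both sides. */
static int domain_total_rules(struct dr_domain_sketch *dmn)
{
	int total;

	domain_lock(dmn);
	total = dmn->rx.rule_count + dmn->tx.rule_count;
	domain_unlock(dmn);
	return total;
}
```

Taking the two locks in a fixed order in `domain_lock()` is what keeps a control operation from deadlocking against another one; RX and TX inserters never contend with each other.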

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:52 -07:00
Alex Vesker
cedb28191f net/mlx5: DR, Add a spinlock to protect the send ring
Adding this lock will allow writing steering entries without
locking the dr_domain and allow parallel insertion.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:51 -07:00
Eli Britstein
fca533041a net/mlx5e: Optimize performance for IPv4/IPv6 ethertype
The HW is optimized for IPv4/IPv6. For such cases, pending capability,
avoid matching on ethertype, and use ip_version field instead.
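
As a rough illustration of the selection logic (not the driver's code), the sketch below picks the match field from the ethertype and a capability flag. The names `set_ethertype_match()`, `struct l3_match` and the `SKETCH_*` constants are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ETH_IPV4 0x0800 /* IPv4 ethertype */
#define SKETCH_ETH_IPV6 0x86DD /* IPv6 ethertype */

enum l3_match_field { MATCH_ON_ETHERTYPE, MATCH_ON_IP_VERSION };

struct l3_match {
	enum l3_match_field field;
	uint16_t value; /* ethertype, or 4/6 when matching on ip_version */
};

/* hw_ip_version_cap models the "pending capability" condition. */
static struct l3_match set_ethertype_match(uint16_t ethertype,
					   bool hw_ip_version_cap)
{
	struct l3_match m = { MATCH_ON_ETHERTYPE, ethertype };

	if (hw_ip_version_cap && ethertype == SKETCH_ETH_IPV4) {
		m.field = MATCH_ON_IP_VERSION;
		m.value = 4;
	} else if (hw_ip_version_cap && ethertype == SKETCH_ETH_IPV6) {
		m.field = MATCH_ON_IP_VERSION;
		m.value = 6;
	}
	return m;
}
```

Any other ethertype, or hardware without the capability, falls back to the plain ethertype match.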

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:51 -07:00
Eli Britstein
4a5d5d7392 net/mlx5e: Helper function to set ethertype
Set ethertype match in a helper function as a pre-step towards
optimizing it.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:50 -07:00
Parav Pandit
810cbb2554 net/mlx5: Add missing mutex destroy
Add mutex destroy calls to balance with mutex_init() done in the init
path.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:50 -07:00
Vu Pham
9728366f53 net/mlx5e: Use change upper event to setup representors' bond_metadata
Use change upper event to detect slave representor from
enslaving/unslaving to/from lag device.

On enslaving event, call mlx5_enslave_rep() API to create, add
this slave representor shadow entry to the slaves list of
bond_metadata structure representing master lag device and use
its metadata to setup ingress acl metadata header.

On unslaving event, resetting the vport of unslaved representor
to use its default ingress/egress acls and rx rules with its
default_metadata.

The last slave will free the shared bond_metadata and its
unique metadata.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:50 -07:00
Vu Pham
88e96e533c net/mlx5e: Slave representors sharing unique metadata for match
Bonded slave representors' vports must share a unique metadata
for match.

On enslaving event of slave representor to lag device, allocate
new unique "bond_metadata" for match if this is the first slave.
The subsequent enslaved representors will share the same unique
"bond_metadata".

On unslaving event of slave representor, reset the slave
representor's vport to use its own default metadata.

Replace ingress acl and rx rules of the slave representors' vports
using new vport->bond_metadata.
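
The enslave/unslave lifecycle above can be sketched in user-space C as follows. This is a simplified model with hypothetical names (`enslave()`, `unslave()`, `struct bond_md_sketch`); the real driver also replaces ingress acl and rx rules, which is omitted here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Shared state for one lag device (hypothetical). */
struct bond_md_sketch {
	uint32_t metadata; /* unique metadata shared by all slaves */
	int nslaves;
};

struct vport_sketch {
	uint32_t default_metadata;
	struct bond_md_sketch *bond; /* NULL while standalone */
};

/* Metadata a vport currently uses for match. */
static uint32_t vport_match_metadata(const struct vport_sketch *v)
{
	return v->bond ? v->bond->metadata : v->default_metadata;
}

/* The first slave allocates the shared metadata; later slaves reuse it. */
static int enslave(struct bond_md_sketch **bondp, struct vport_sketch *v,
		   uint32_t new_metadata)
{
	if (!*bondp) {
		*bondp = calloc(1, sizeof(**bondp));
		if (!*bondp)
			return -1;
		(*bondp)->metadata = new_metadata;
	}
	(*bondp)->nslaves++;
	v->bond = *bondp;
	return 0;
}

/* Unslaving resets the vport to its default; the last slave frees the bond. */
static void unslave(struct bond_md_sketch **bondp, struct vport_sketch *v)
{
	if (!v->bond)
		return;
	v->bond = NULL;
	if (--(*bondp)->nslaves == 0) {
		free(*bondp);
		*bondp = NULL;
	}
}
```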

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Vu Pham
133dcfc577 net/mlx5: E-Switch, Alloc and free unique metadata for match
Introduce infrastructure to create unique metadata for match
for vport without depending on vport_num. Vport uses its
default metadata for match in standalone configuration but
will share a different unique "bond_metadata" for match with
other vports in bond configuration.

Using ida to generate unique metadata for match for vports
in default and bond configurations.

Introduce APIs to generate, free metadata for match.
Introduce APIs to set vport's bond_metadata and replace its
ingress acl rules with bond_metadata.
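
To illustrate what ID-based allocation buys over deriving metadata from vport_num, here is a tiny bitmap allocator standing in for the kernel's ida. It is a user-space sketch; `struct md_pool`, `md_alloc()`, `md_free()` and the pool size are hypothetical.

```c
#include <stdint.h>

#define MD_ID_MAX 32 /* hypothetical pool size */

struct md_pool {
	uint32_t used; /* bit i set means id i is allocated */
};

/* Return the lowest free id, or -1 when the pool is exhausted. */
static int md_alloc(struct md_pool *p)
{
	for (int id = 0; id < MD_ID_MAX; id++) {
		if (!(p->used & (UINT32_C(1) << id))) {
			p->used |= UINT32_C(1) << id;
			return id;
		}
	}
	return -1;
}

/* Release an id so it can be handed out again. */
static void md_free(struct md_pool *p, int id)
{
	if (id >= 0 && id < MD_ID_MAX)
		p->used &= ~(UINT32_C(1) << id);
}
```

Because ids are handed out independently of vport numbers, a default metadata and a bond metadata can come from the same pool without colliding.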

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Vu Pham
d97555e145 net/mlx5e: Add bond_metadata and its slave entries
Adding bond_metadata and its slave entries to represent a lag device
and its slave VF representors. The bond_metadata structure includes a
unique metadata shared by the slave VF representors, and a list of
slave entries for those representors.

On enslaving event, create a bond_metadata structure representing
the upper lag device of this slave representor if it has not been
created yet. Create and add entry for the slave representor to the
slaves list.

On unslaving event, free the slave entry of the slave representor.
On the last unslave event, free the bond_metadata structure and its
resources.

Introduce APIs to create and remove bond_metadata and its resources,
enslave and unslave VF representor slave entries.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Or Gerlitz
d34eb2fcd0 net/mlx5e: Offload flow rules to active lower representor
When a bond device is created over one or more non uplink representors,
and when a flow rule is offloaded to such bond device, offload a rule
to the active lower device.

Assuming that this is active-backup lag, the rules should be offloaded
to the active lower device which is the representor of the direct
path (not the failover).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:48 -07:00
Vu Pham
553f932838 net/mlx5e: Support tc block sharing for representors
Currently offloading a rule over a tc block shared by multiple
representors fails because an e-switch global hashtable to keep
the mapping from tc cookies to mlx5e flow instances is used, and
tc block sharing offloads the same rule/cookie multiple times,
each time for different representor sharing the tc block.

Change the implementation and behavior: acknowledge and return
success if the same rule/cookie is offloaded again to another slave
representor sharing the tc block, by setting, checking and comparing
the netdev that added the rule first.
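
A simplified model of this behavior, using a small array instead of the e-switch hashtable. The names are hypothetical, and returning -EEXIST when the same netdev offloads the same cookie twice is an assumption of this sketch, not something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FLOWS 16

struct flow_entry {
	uint64_t cookie; /* tc cookie identifying the rule */
	int orig_dev;    /* netdev (by index) that added the rule first */
	int in_use;
};

struct flow_table {
	struct flow_entry e[MAX_FLOWS];
};

static int offload_flow(struct flow_table *t, uint64_t cookie, int dev)
{
	struct flow_entry *free_slot = NULL;

	for (size_t i = 0; i < MAX_FLOWS; i++) {
		struct flow_entry *e = &t->e[i];

		if (e->in_use && e->cookie == cookie) {
			/*
			 * Same rule replayed for another representor sharing
			 * the tc block: acknowledge without adding it again.
			 */
			return e->orig_dev == dev ? -EEXIST : 0;
		}
		if (!e->in_use && !free_slot)
			free_slot = e;
	}
	if (!free_slot)
		return -ENOSPC;

	free_slot->cookie = cookie;
	free_slot->orig_dev = dev;
	free_slot->in_use = 1;
	return 0;
}

/* Number of distinct rules actually stored. */
static int flow_count(const struct flow_table *t)
{
	int n = 0;

	for (size_t i = 0; i < MAX_FLOWS; i++)
		n += t->e[i].in_use;
	return n;
}
```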

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:48 -07:00
Or Gerlitz
7e51891a23 net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule
Register a notifier block to handle netdev events for bond device
of non-uplink representors to support eswitch vports bonding.

When a non-uplink representor is a lower dev (slave) of bond and
becomes active, adding egress acl forward-to-vport rule of all slave
netdevs (active + standby) to forward to this representor's vport. Use
change lower netdev event to do this.

Use change upper event to detect slave representor unslaved from lag
device to delete its vport egress acl forward rule if any.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
bf773dc0e6 net/mlx5: E-Switch, Introduce APIs to enable egress acl forward-to-vport rule
By default, e-switch vport's egress acl just forwards packets to its
counterpart NIC vport using the existing egress acl table.

During port failover in a bonding scenario where two VF representors
are bonded, the egress acl forward-to-vport rule will be added to
the existing egress acl table of the e-switch vport of the passive/inactive
slave representor to forward packets to the other NIC vport, i.e. the active
slave representor's NIC vport, to handle egress "failover" traffic.

Enable egress acl and have APIs to create and destroy egress acl
forward-to-vport rule and group.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
07bab95026 net/mlx5: E-Switch, Refactor eswitch ingress acl codes
Restructure the eswitch ingress acl codes into eswitch directory
and different files:
. Acl ingress helper functions to acl_helper.c/h
. Acl ingress functions used in offloads mode to acl_ingress_ofld.c
. Acl ingress functions used in legacy mode to acl_ingress_lgy.c

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
ea651a86d4 net/mlx5: E-Switch, Refactor eswitch egress acl codes
Refactor the egress acl codes so that offloads and legacy modes
can configure specifically their own needs of egress acl table,
groups and rules. While at it, restructure the eswitch egress
acl codes into eswitch directory and different files:
. Acl egress helper functions to acl_helper.c/h
. Acl egress functions used in offloads mode to acl_egress_ofld.c
. Acl egress functions used in legacy mode to acl_egress_lgy.c

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:46 -07:00
Anirudh Venkataramanan
13f90b393f ice: Refactor Rx checksum checks
We don't need both rx_status and rx_error parameters, as the latter is
a subset of the former. Remove rx_error completely and check the right bit
in rx_status.

Rename rx_status to rx_status0, and rx_status_err1 to
rx_status1. This naming more closely reflects the specification.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 18:00:35 -07:00
Bruce Allan
7e34786a74 ice: avoid undefined behavior
When writing the driver's struct ice_tlan_ctx structure, do not write the
8-bit element int_q_state with the associated internal-to-hardware field
which is 122-bits, otherwise the helper function ice_write_byte() will use
undefined behavior when setting the mask used for that write.  This should
not cause any functional change and will avoid use of undefined behavior.
Also, update a comment to highlight this structure element is not written.
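
The hazard is that building a mask with a shift count at least as large as the operand's width is undefined in C. The sketch below shows one way to guard a byte-sized mask computation; `byte_field_mask()` is a hypothetical helper, and the actual fix described above avoids the write altogether rather than adding a guard.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the mask for a field that must fit in an 8-bit element.
 * Returns false (and leaves *mask untouched) when the width cannot be
 * represented, instead of shifting by an out-of-range amount.
 */
static bool byte_field_mask(unsigned int width, uint8_t *mask)
{
	if (width == 0 || width > 8)
		return false;

	/* width <= 8 here, so the shift on a 32-bit unsigned is well defined. */
	*mask = (uint8_t)((1u << width) - 1);
	return true;
}
```

A 122-bit width is rejected up front, while widths 1 through 8 produce the expected low-bit masks.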

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:58:21 -07:00
Marta Plantykow
ae15e0ba1b ice: Change number of XDP Tx queues to match number of Rx queues
In the current implementation, the number of XDP Tx queues is the same as
the number of transmit queues, which is not always true. This
patch changes this number to match the number of receive queues.
XDP programs are running on Rx rings, so what we actually need to
provide is the XDP Tx ring per each Rx ring so that the whole XDP
ecosystem is functional, e.g. if the result of XDP prog is XDP_TX
then you have the need to access the XDP Tx ring.

Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:55:56 -07:00
Marta Plantykow
49d358e0e7 ice: Add XDP Tx to VSI ring stats
When XDP Tx program is loaded and packets are sent from
interface, VSI statistics are not updated. This patch adds
packets sent on Tx XDP ring to VSI ring stats.

Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:54:16 -07:00
Marta Plantykow
c8f135c6ee ice: Change number of XDP TxQ to 0 when destroying rings
When XDP Tx rings are destroyed, the number of XDP Tx queues
is not updated. This patch sets this number to 0.

Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:49:56 -07:00
Evan Swanson
b5c7f857e5 ice: Handle critical FW error during admin queue initialization
A race condition between FW and SW can occur between admin queue setup and
the first command sent. A link event may occur and FW attempts to notify a
non-existent queue. FW will set the critical error bit and disable the
queue. When this happens retry queue setup.
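
The retry idea can be sketched generically. Everything below is hypothetical (`init_with_retry()`, the callback type, and the `flaky_init()` stub that simulates the FW disabling the queue); it only shows the control flow, not the ice admin queue code.

```c
#include <stdbool.h>

/* Returns true when queue setup and the first command succeeded. */
typedef bool (*adminq_init_fn)(void *ctx);

/*
 * Try setup up to max_attempts times.
 * Returns the attempt number that succeeded, or -1 if all attempts failed.
 */
static int init_with_retry(adminq_init_fn init, void *ctx, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (init(ctx))
			return attempt;
	}
	return -1;
}

/* Test stub: fails while *ctx (an int counter) is still positive. */
static bool flaky_init(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return false;
	}
	return true;
}
```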

Signed-off-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:48:23 -07:00
Brett Creeley
1960827570 ice: Don't allow VLAN stripping change when pvid set
Currently, if the PVID is set in the VLAN handling section of the VSI
context the driver still allows VLAN stripping to be enabled/disabled.
VLAN stripping should only be modifiable when the PVID is not set. Fix
this by preventing VLAN stripping modification when PVID is set.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:46:00 -07:00
Brett Creeley
4f1fe43c92 ice: Add more Rx errors to netdev's rx_error counter
Currently we are only including illegal_bytes and rx_crc_errors in the
PF netdev's rx_error counter. There are many more causes of Rx errors
that the device supports and reports via Ethtool. Accumulate all Rx
errors in the PF netdev's rx_error counter.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:44:06 -07:00
Surabhi Boob
68d2707837 ice: Fix for memory leaks and modify ICE_FREE_CQ_BUFS
Handle memory leaks during control queue initialization and
buffer allocation failures. The macro ICE_FREE_CQ_BUFS is modified to
re-use for this fix.

Signed-off-by: Surabhi Boob <surabhi.boob@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:32:50 -07:00
Surabhi Boob
1aaef2bc4e ice: Fix memory leak
Handle memory leak on filter management initialization failure.

Signed-off-by: Surabhi Boob <surabhi.boob@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:11:29 -07:00
Jesse Brandeburg
5df42c8267 ice: fix MAC write command
The manage MAC write command was implemented in an overly complex way
that actually didn't work, as it wasn't symmetric to the manage MAC
read command, and was feeding bytes out of order to the firmware. Fix
the implementation by just using a simple array to represent the MAC
address when it is being written via firmware command.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:06:44 -07:00
Paul Greenwalt
bf8987df8a ice: set VF default LAN address
Remove the is_zero_ether_addr() check when setting the VF default LAN
address. This check assumed that the address had been deleted and zeroed
before calling ice_vc_add_mac_addr(). Now the default LAN address will be
set to the last unicast MAC address added by the VF.

The default LAN address is reported by the PF via ndo_get_vf_config.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:05:02 -07:00
Jesse Brandeburg
f0cbbb9c6e ice: remove unused macro
The driver had an unused define that can be removed.  Found by
compiler -Werror=unused-macros check.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:03:40 -07:00
Jesse Brandeburg
22bef5e78f ice: fix signed vs unsigned comparisons
Fix the remaining signed vs unsigned issues, which appear
when compiling with -Werror=sign-compare.

Many of these are because there is an external interface that is passing
an int to us (which we can't change) but that we (rightfully) store
and compare against as an unsigned in our data structures.
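
A small example of the pattern: when an int is compared directly against an unsigned value, the int is converted to unsigned first, so a negative input becomes a very large number. Checking the sign explicitly before casting makes the intent clear. `vf_id_valid()` is a hypothetical helper written for this illustration.

```c
#include <stdbool.h>

/*
 * vf_id arrives as an int from an external interface; num_vfs is stored
 * as unsigned. Reject negatives first, then compare in the unsigned domain.
 */
static bool vf_id_valid(int vf_id, unsigned int num_vfs)
{
	if (vf_id < 0)
		return false;

	return (unsigned int)vf_id < num_vfs;
}
```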

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:02:47 -07:00
Huazhong Tan
6f45a9bdd2 net: hns3: add a print for initializing CMDQ when reset pending
When initializing CMDQ fails because a reset is pending,
there is no hint for debugging, so add a log for it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 14:56:08 -07:00
Yufeng Mo
01952206e1 net: hns3: remove unnecessary MAC enable in app loopback
Packets will not pass through MAC during app loopback.
Therefore, it is meaningless to enable MAC while doing
app loopback. This patch removes this unnecessary action.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 14:56:08 -07:00
Yufeng Mo
60c800c64d net: hns3: change the order of reinitializing RoCE and NIC client during reset
The HNS RDMA driver will support VF device later, whose
re-initialization should be done after PF's. This patch
changes the order of hclge_reset_prepare_up() and
hclge_notify_roce_client(), so that PF's RoCE client
will be reinitialized before VF's.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 14:56:08 -07:00
Guangbin Huang
4cd5beaa89 net: hns3: add a resetting check in hclgevf_init_nic_client_instance()
To prevent initializing the VF NIC client while in the reset handling
state, this patch adds a resetting check in
hclgevf_init_nic_client_instance().

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 14:56:08 -07:00
Antoine Tenart
b2e118f638 net: mscc: allow offloading timestamping operations to the PHY
This patch adds support for offloading timestamping operations not only
to the Ocelot switch (as already supported) but to compatible PHYs.
When both the PHY and the Ocelot switch support timestamping operations,
the PHY implementation is chosen as the timestamp will happen closer to
the medium.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 14:54:31 -07:00
Antoine Tenart
7ff4f3f315 net: mscc: use the PHY MII ioctl interface when possible
Allow ioctl to be implemented by the PHY, when a PHY is attached to the
Ocelot switch. In case the ioctl is a request to set or get the hardware
timestamp, use the Ocelot switch implementation for now.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 14:54:31 -07:00
Jason Gunthorpe
e4fdf7625b Merge branch 'mellanox/mlx5-next' into rdma.git for/next
From the mlx5-next branch at
  git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Required for dependencies in following patches

* branch 'mellanox/mlx5-next':
  net/mlx5: Add ability to read and write ECE options
  net/mlx5: Add support for RDMA TX FT headers modifying
  net/mlx5: Move iseg access helper routines close to mlx5_core driver
  net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 16:01:17 -03:00
Arnd Bergmann
f99c0646ef mtk-star-emac: mark PM functions as __maybe_unused
Without CONFIG_PM, the compiler warns about two unused functions:

drivers/net/ethernet/mediatek/mtk_star_emac.c:1472:12: error: unused function 'mtk_star_suspend' [-Werror,-Wunused-function]
drivers/net/ethernet/mediatek/mtk_star_emac.c:1488:12: error: unused function 'mtk_star_resume' [-Werror,-Wunused-function]

Mark these as __maybe_unused.
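
For readers unfamiliar with the annotation: the kernel's __maybe_unused expands to the compiler's unused attribute, which silences the unused-function diagnostic when the only references are compiled out. The user-space sketch below uses a stand-in macro and hypothetical function names; `SKETCH_CONFIG_PM` plays the role of CONFIG_PM and is assumed to be undefined.

```c
/* User-space stand-in for the kernel's __maybe_unused annotation. */
#define SKETCH_MAYBE_UNUSED __attribute__((__unused__))

/* Without the attribute, these would warn when SKETCH_CONFIG_PM is unset. */
static int SKETCH_MAYBE_UNUSED sketch_suspend(void)
{
	return 0;
}

static int SKETCH_MAYBE_UNUSED sketch_resume(void)
{
	return 0;
}

/* Number of PM callbacks actually referenced in this build. */
static int sketch_pm_callbacks(void)
{
#ifdef SKETCH_CONFIG_PM
	return (sketch_suspend() == 0) + (sketch_resume() == 0);
#else
	return 0;
#endif
}
```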

Fixes: 8c7bd5a454 ("net: ethernet: mtk-star-emac: new driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:32:39 -07:00
Bartosz Golaszewski
f96e9641e9 net: ethernet: mtk-star-emac: fix error path in RX handling
The dma_addr field in desc_data must not be overwritten until after the
new skb is mapped. Currently we do replace it with uninitialized value
in error path. This change fixes it by moving the assignment before the
label to which we jump after mapping or allocation errors.

Fixes: 8c7bd5a454 ("net: ethernet: mtk-star-emac: new driver")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com> # build
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:24:31 -07:00
Colin Ian King
7cf4eda481 mlxsw: spectrum_router: remove redundant initialization of pointer br_dev
The pointer br_dev is being initialized with a value that is never read
and is being updated with a new value later on. The initialization
is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:23:27 -07:00
Heinrich Kuhn
5b186cd60f nfp: flower: fix used time of merge flow statistics
Prior to this change the correct value for the used counter is calculated
but not stored and therefore not propagated to user-space. In at least the
OVS use case, this results in active flows being removed from the hardware
datapath, which causes both unnecessary flow tear-down and setup, and
packet processing on the host.

This patch addresses the problem by saving the calculated used value
which allows the value to propagate to user-space.

Found by inspection.

Fixes: aa6ce2ea0c ("nfp: flower: support stats update for merge flows")
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:18:57 -07:00
Armin Wolf
53c0ec4f4d ne2k-pci: Fix various coding-style issues and improve printk() usage
Fix a ton of minor checkpatch errors/warnings, remove version
printing at module init/when device is found, and use MODULE_VERSION
instead. Also modify the RTL8029 PCI string to include the compatible
RTL8029AS nic.
The only major issue remaining is the missing SPDX tag, but since the
exact version of the GPL is not stated anywhere inside the file, it's
impossible to add such a tag at the moment.
But maybe it is possible, since 8390.h states Donald Becker's 8390
drivers are licensed under GPL 2.2 only (= GPL-2.0-only ?).
The kernel module containing this patch compiles and runs without
problems on a RTL8029AS-based NE2000 clone card with kernel 5.7.0-rc6.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 23:19:28 -07:00
Ido Schimmel
10d3757fcb mlxsw: spectrum_router: Allow programming link-local prefix routes
The device has a trap for IPv6 packets that need be routed and have a
unicast link-local destination IP (i.e., fe80::/10). This allows mlxsw
to ignore link-local routes, as the packets will be trapped to the CPU
in any case.

However, since link-local routes are not programmed, it is possible for
routed packets to hit the default route which might also be programmed
to trap packets. This means that packets with a link-local destination
IP might be trapped for the wrong reason.

To overcome this, allow programming link-local prefix routes (usually
one fe80::/64 per-table), so that the packets will be forwarded until
reaching the link-local trap.
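
For reference, membership in fe80::/10 depends only on the first ten bits of the address. A minimal check in C, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the 16-byte IPv6 address falls inside fe80::/10. */
static bool ipv6_is_link_local_ucast(const uint8_t addr[16])
{
	/* First byte must be 0xfe, top two bits of the second byte must be 10. */
	return addr[0] == 0xfe && (addr[1] & 0xc0) == 0x80;
}
```

Note that the /10 covers fe80:: through febf::, which is wider than the fe80::/64 prefix route that is usually programmed per table.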

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
9785b92b44 mlxsw: spectrum: Add packet traps for BFD packets
Bidirectional Forwarding Detection (BFD) provides "low-overhead,
short-duration detection of failures in the path between adjacent
forwarding engines" (RFC 5880).

This is accomplished by exchanging BFD packets between the two
forwarding engines. Up until now these packets were trapped via the
general local delivery (i.e., IP2ME) trap which also traps a lot of
other packets that are not as time-sensitive as BFD packets.

Expose dedicated traps for BFD packets so that user space could
configure a dedicated policer for them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
dacc4e3acf mlxsw: spectrum: Treat IPv6 link-local SIP as an exception
IPv6 packets that need to be forwarded and have a link-local source IP are
dropped by the kernel and an ICMPv6 "Destination unreachable" is sent to
the sending host.

As such, change the trap group of such packets so that they do not
interfere with IPv6 management packets. In the future this trap will be
exposed as an exception via devlink-trap.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
1260e083d4 mlxsw: spectrum: Share one group for all locally delivered packets
Routed IP packets with the Router Alert option need to be trapped to
the CPU as they might need to be locally delivered to raw sockets with
the IP_ROUTER_ALERT / IPV6_ROUTER_ALERT socket option.

Move them to the same group with other packets that might need to be
trapped following route lookup.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
500769bebe mlxsw: reg: Move all trap groups under the same enum
After the previous patch the split is no longer necessary and all the
trap groups can be moved under the same enum.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
b87bde80da mlxsw: spectrum_trap: Do not hard code "thin" policer identifier
As explained in commit e612523041 ("mlxsw: spectrum_trap: Introduce
dummy group with thin policer"), the purpose of the "thin" policer is to
pass as few packets as possible to the CPU.

The identifier of this policer is currently set according to the maximum
number of used trap groups, but this is fragile: On Spectrum-1 the
maximum number of policers is less than the maximum number of trap
groups, which might result in an invalid policer identifier in case the
number of used trap groups grows beyond the policer limit.

Solve this by dynamically allocating the policer identifier.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
03cb0ce0dd mlxsw: switchx2: Move SwitchX-2 trap groups out of main enum
The number of Spectrum trap groups is not infinite, but two identifiers
are occupied by SwitchX-2 specific trap groups. Free these identifiers
by moving them out of the main enum.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
025b7de7f4 mlxsw: spectrum: Reduce priority of locally delivered packets
To align with recent recommended values. Will be configurable by future
patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
1e3cd58942 mlxsw: spectrum: Use same trap group for local routes and link-local destination
Packets with an IPv6 link-local destination (i.e., fe80::/10) should not
be forwarded and are therefore trapped to the CPU for local delivery.
Since these packets are trapped for the same logical reason as packets
hitting local routes, associate both traps with the same group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
d322309d72 mlxsw: spectrum: Use separate trap group for FID miss
When a packet enters the device it is classified to a filtering
identifier (FID) based on the ingress port and VLAN. The FID miss trap
is used to trap packets for which a FID could not be found.

In mlxsw this trap should only be triggered when a port is enslaved to
an OVS bridge and a matching ACL rule could not be found, so as to
trigger learning.

These packets are therefore completely unrelated to packets hitting
local routes and should be in a different group. Move them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
954eef2677 mlxsw: spectrum: Use same trap group for various IPv6 packets
Group these various IPv6 packets (e.g., router solicitations, router
advertisement) together and subject them to the same policer.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
412df3d1bb mlxsw: spectrum: Rename IPv6 ND trap group
The IPv6 Neighbour Discovery (ND) group will be used for various IPv6
packets, not all of which fall under the definition of ND, so rename it
to "IPV6" which is more appropriate.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
761bc42fbe mlxsw: spectrum: Use same switch case for identical groups
Trap groups that use the same policer settings can share the same switch
case.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
3c2d8a046a mlxsw: spectrum: Use dedicated trap group for ACL trap
Packets that are trapped via tc's trap action are currently subject to
the same policer as packets hitting local routes. The latter are
critical to the correct functioning of the control plane, while the
former are mainly used for traffic inspection.

Split the ACL trap to a separate group with its own policer. Use a
higher priority for these traps than for traps using mirror action
(e.g., ARP, IGMP). Otherwise, packets matching both traps will not be
forwarded in hardware (because of trap action) and also not forwarded in
software because they will be marked with 'offload_fwd_mark'.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Edwin Peer
2a5a8800fa bnxt_en: fix firmware message length endianness
The explicit mask and shift is not the appropriate way to parse fields
out of a little endian struct. The length field is internally __le16
and the strategy employed only happens to work on little endian machines
because the offset used is actually incorrect (length is at offset 6).

Also remove the related and no longer used definitions from bnxt.h.
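
As a rough user-space illustration of the point above, a little-endian 16-bit field can be read byte by byte so that the result does not depend on host endianness. The helper names and the buffer layout are hypothetical; only "length is at offset 6" comes from the text, and the driver itself would use the kernel's own accessors such as le16_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only "length is at offset 6" is taken from the commit text;
 * the rest of the layout is an assumption for illustration. */
#define RESP_LEN_OFFSET 6

/* Assemble a little-endian 16-bit value byte by byte, so the result
 * is identical on little- and big-endian hosts. */
static uint16_t read_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Return the length field of a raw response buffer (hypothetical helper). */
uint16_t resp_len(const uint8_t *resp, size_t size)
{
	assert(resp != NULL && size >= RESP_LEN_OFFSET + 2);
	return read_le16(resp + RESP_LEN_OFFSET);
}
```

A mask-and-shift on a wider host-order word only yields the same value when the host byte order happens to line up with the wire layout, which is the accident the commit describes.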

Fixes: 845adfe40c ("bnxt_en: Improve valid bit checking in firmware response message.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:30:48 -07:00
Vasundhara Volam
95ec1f470b bnxt_en: Fix return code to "flash_device".
When NVRAM directory is not found, return the error code
properly as per firmware command failure instead of the hardcode
-ENOBUFS.

Fixes: 3a707bed13 ("bnxt_en: Return -EAGAIN if fw command returns BUSY")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:30:48 -07:00
Michael Chan
b8056e8434 bnxt_en: Fix accumulation of bp->net_stats_prev.
We have logic to maintain network counters across resets by storing
the counters in bp->net_stats_prev before reset.  But not all resets
will clear the counters.  Certain resets that don't need to change
the number of rings do not clear the counters.  The current logic
accumulates the counters before all resets, causing big jumps in
the counters after some resets, such as ethtool -G.

Fix it by only accumulating the counters during reset if the irq_re_init
parameter is set.  The parameter signifies that all rings and interrupts
will be reset and that means that the counters will also be reset.
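
The accumulation rule can be sketched with a reduced model. The struct and function names below are hypothetical and are not the driver's; only the idea of saving counters into a "prev" copy when irq_re_init is set comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced model of a device's counters. */
struct stats {
	uint64_t rx_packets;
	uint64_t tx_packets;
};

struct dev_model {
	struct stats hw;   /* live ring counters */
	struct stats prev; /* totals saved across full resets */
};

/* Close the device before a reset. Counters are folded into "prev"
 * only when the rings (and therefore the counters) will be reset. */
void model_close(struct dev_model *d, bool irq_re_init)
{
	assert(d != NULL);
	if (!irq_re_init)
		return; /* rings survive, hw counters keep counting */

	d->prev.rx_packets += d->hw.rx_packets;
	d->prev.tx_packets += d->hw.tx_packets;
	d->hw = (struct stats){0};
}

/* Value reported to user space: saved totals plus live counters. */
uint64_t model_rx_total(const struct dev_model *d)
{
	return d->prev.rx_packets + d->hw.rx_packets;
}
```

Accumulating unconditionally would add the live counters to "prev" while they keep counting, so the reported total would roughly double after a light reset.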

Reported-by: Vijayendra Suman <vijayendra.suman@oracle.com>
Fixes: b8875ca356 ("bnxt_en: Save ring statistics before reset.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:30:47 -07:00
Heiner Kallweit
12b1bc75cd r8169: improve rtl_remove_one
Don't call netif_napi_del() manually, free_netdev() does this for us.
In addition reorder calls to match reverse order of calls in probe().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:27:18 -07:00
Fugang Duan
8a448bf832 net: ethernet: fec: move GPR register offset and bit into DT
The commit da722186f6 (net: fec: set GPR bit on suspend by DT
configuration) set the GPR register offset and bit in driver for
wake on lan feature.

But it introduces two issues here:
- one SOC has two instances, they have different bit
- different SOCs may have different offset and bit

So to support wake-on-lan feature on other i.MX platforms, it should
configure the GPR register offset and bit from DT.

So the patch is to improve the commit da722186f6 (net: fec: set GPR
bit on suspend by DT configuration) to support multiple ethernet
instances on i.MX series.

v2:
 * switch back to store the quirks bitmask in driver_data
v3:
 * suggested by Sascha Hauer, use a struct fec_devinfo for
   abstracting differences between different hardware variants,
   it can give more freedom to describe the differences.

Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:21:43 -07:00
Fugang Duan
f2fb6b6275 net: stmmac: enable timestamp snapshot for required PTP packets in dwmac v5.10a
For rx filter 'HWTSTAMP_FILTER_PTP_V2_EVENT', it should be
PTP v2/802.AS1, any layer, any kind of event packet, but HW only
takes a timestamp snapshot for the following PTP messages: sync,
Pdelay_req, Pdelay_resp.

This causes the following issue when testing the E2E case:
ptp4l[2479.534]: port 1: received DELAY_REQ without timestamp
ptp4l[2481.423]: port 1: received DELAY_REQ without timestamp
ptp4l[2481.758]: port 1: received DELAY_REQ without timestamp
ptp4l[2483.524]: port 1: received DELAY_REQ without timestamp
ptp4l[2484.233]: port 1: received DELAY_REQ without timestamp
ptp4l[2485.750]: port 1: received DELAY_REQ without timestamp
ptp4l[2486.888]: port 1: received DELAY_REQ without timestamp
ptp4l[2487.265]: port 1: received DELAY_REQ without timestamp
ptp4l[2487.316]: port 1: received DELAY_REQ without timestamp

Timestamp snapshot dependency on register bits in received path:
SNAPTYPSEL TSMSTRENA TSEVNTENA 	PTP_Messages
01         x         0          SYNC, Follow_Up, Delay_Req,
                                Delay_Resp, Pdelay_Req, Pdelay_Resp,
                                Pdelay_Resp_Follow_Up
01         0         1          SYNC, Pdelay_Req, Pdelay_Resp

For dwmac v5.10a, enable all events by setting register
DWC_EQOS_TIME_STAMPING[SNAPTYPSEL] to 2’b01 and clearing bit [TSEVNTENA]
to 0’b0, which supports all required events.
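
A minimal sketch of that register update follows. The bit positions and macro names are assumptions chosen for illustration and should be checked against the hardware databook; only the field names SNAPTYPSEL and TSEVNTENA and the target values come from the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define TCR_TSEVNTENA        (1u << 14)
#define TCR_SNAPTYPSEL_SHIFT 16
#define TCR_SNAPTYPSEL_MASK  (3u << TCR_SNAPTYPSEL_SHIFT)

/* Return a control value with SNAPTYPSEL = 01 and TSEVNTENA = 0,
 * leaving every other bit of the input untouched. */
uint32_t tcr_enable_all_events(uint32_t tcr)
{
	tcr &= ~TCR_SNAPTYPSEL_MASK;
	tcr |= 1u << TCR_SNAPTYPSEL_SHIFT;
	tcr &= ~TCR_TSEVNTENA;

	assert((tcr & TCR_TSEVNTENA) == 0);
	return tcr;
}
```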

Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 16:27:41 -07:00
Guillaume Nault
58cff782cc flow_dissector: Parse multiple MPLS Label Stack Entries
The current MPLS dissector only parses the first MPLS Label Stack
Entry (second LSE can be parsed too, but only to set a key_id).

This patch adds the possibility to parse several LSEs by making
__skb_flow_dissect_mpls() return FLOW_DISSECT_RET_PROTO_AGAIN as long
as the Bottom Of Stack bit hasn't been seen, up to a maximum of
FLOW_DIS_MPLS_MAX entries.

FLOW_DIS_MPLS_MAX is arbitrarily set to 7. This should be enough for
many practical purposes, without wasting too much space.

To record the parsed values, flow_dissector_key_mpls is modified to
store an array of stack entries, instead of just the values of the
first one. A bit field, "used_lses", is also added to keep track of
the LSEs that have been set. The objective is to avoid defining a
new FLOW_DISSECTOR_KEY_MPLS_XX for each level of the MPLS stack.
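
The parsing loop described above can be sketched in user space as follows. The struct and function names are hypothetical and do not match the kernel's flow_dissector_key_mpls layout; the limit of 7 entries and the "used_lses" bit field come from the text, and the 32-bit entry layout (20-bit label, 3-bit TC, 1-bit Bottom Of Stack, 8-bit TTL) is the standard MPLS encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MPLS_MAX_LSE 7 /* mirrors FLOW_DIS_MPLS_MAX from the text */

struct lse {
	uint32_t label;
	uint8_t tc;
	uint8_t bos;
	uint8_t ttl;
};

struct mpls_key {
	struct lse ls[MPLS_MAX_LSE];
	uint8_t used_lses; /* bit i set => ls[i] holds a parsed entry */
};

/* Read a big-endian (network order) 32-bit word. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse entries until Bottom Of Stack is seen, the buffer runs out,
 * or MPLS_MAX_LSE entries were recorded. Returns the entry count. */
size_t parse_mpls(const uint8_t *data, size_t len, struct mpls_key *key)
{
	size_t n = 0;

	assert(data != NULL && key != NULL);
	key->used_lses = 0;

	while (n < MPLS_MAX_LSE && (n + 1) * 4 <= len) {
		uint32_t w = read_be32(data + n * 4);
		struct lse *e = &key->ls[n];

		e->label = w >> 12;
		e->tc = (w >> 9) & 0x7;
		e->bos = (w >> 8) & 0x1;
		e->ttl = w & 0xff;
		key->used_lses |= (uint8_t)(1u << n);
		n++;

		if (e->bos)
			break;
	}
	return n;
}
```

A single bit mask keeps the key compact: a consumer can test whether level i was parsed without needing a separate key type per stack depth.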

TC flower is adapted for the new struct flow_dissector_key_mpls layout.
Matching on several MPLS Label Stack Entries will be added in the next
patch.

The NFP and MLX5 drivers are also adapted: nfp_flower_compile_mac() and
mlx5's parse_tunnel() now verify that the rule only uses the first LSE
and fail if it doesn't.

Finally, the behaviour of the FLOW_DISSECTOR_KEY_MPLS_ENTROPY key is
slightly modified. Instead of recording the first Entropy Label, it
now records the last one. This shouldn't have any consequences since
there doesn't seem to be any user of FLOW_DISSECTOR_KEY_MPLS_ENTROPY
in the tree. We'd probably better do a hash of all parsed MPLS labels
instead (excluding reserved labels) anyway. That'd give better entropy
and would probably also simplify the code. But that's not the purpose
of this patch, so I'm keeping that as a future possible improvement.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 15:22:58 -07:00
Yuval Basson
ff937b916e qed: Add EDPM mode type for user-fw compatibility
In older FW versions the completion flag was treated as the ack flag in
edpm messages. Expose the FW option of setting which mode the QP is in
by adding a flag to the qedr <-> qed API.

Flag is added for backward compatibility with libqedr.
This flag will be set by qedr after determining whether the libqedr is
using the updated version.

Fixes: f109394033 ("qed: Add support for QP verbs")
Signed-off-by: Yuval Basson <yuval.bason@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 15:15:40 -07:00
Heiner Kallweit
d05890c5ae r8169: sync RTL8168f/RTL8411 hw config with vendor driver
Sync hw config for RTL8168f/RTL8411 with r8168 vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 18:21:10 -07:00
Heiner Kallweit
33b00ca1da r8169: sync RTL8168evl hw config with vendor driver
Sync hw config for RTL8168evl with r8168 vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 18:21:09 -07:00
Heiner Kallweit
ee1350f94e r8169: sync RTL8168h hw config with vendor driver
Sync hw config for RTL8168h with r8168 vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 18:21:09 -07:00
Heiner Kallweit
d29d5ff9da r8169: sync RTL8168g hw config with vendor driver
Sync hw config for RTL8168g with r8168 vendor driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 18:21:09 -07:00
Qiushi Wu
15c9738589 qlcnic: fix missing release in qlcnic_83xx_interrupt_test.
In function qlcnic_83xx_interrupt_test(), the resources allocated by
qlcnic_83xx_diag_alloc_res() are not released by
qlcnic_83xx_diag_free_res() when the subsequent call to
qlcnic_alloc_mbx_args() fails. Fix this issue by adding
a jump target "fail_mbx_args", and jump to this new target
when qlcnic_alloc_mbx_args() fails.
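
The shape of the fix can be sketched with stand-in functions. Everything except the label name "fail_mbx_args" is hypothetical: the helpers below only count outstanding resources so the unwinding order is visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static int diag_refs; /* outstanding diag resources (stand-in state) */

static int diag_alloc(void)
{
	diag_refs++;
	return 0;
}

static void diag_free(void)
{
	assert(diag_refs > 0);
	diag_refs--;
}

/* Stand-in for the mailbox allocation; can be forced to fail. */
static int mbx_alloc(bool fail)
{
	return fail ? -ENOMEM : 0;
}

static void mbx_free(void)
{
}

int interrupt_test(bool mbx_fails)
{
	int ret = diag_alloc();

	if (ret)
		return ret;

	ret = mbx_alloc(mbx_fails);
	if (ret)
		goto fail_mbx_args; /* still must undo diag_alloc() */

	/* ... the actual test would run here ... */
	mbx_free();

fail_mbx_args:
	diag_free();
	return ret;
}

int diag_outstanding(void)
{
	return diag_refs;
}
```

Returning directly from the failed mailbox allocation would skip diag_free(), which is the leak the commit describes.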

Fixes: b6b4316c8b ("qlcnic: Handle qlcnic_alloc_mbx_args() failure")
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 18:06:09 -07:00
Vladimir Oltean
5d14c304bf dpaa_eth: fix usage as DSA master, try 3
The dpaa-eth driver probes on compatible string for the MAC node, and
the fman/mac.c driver allocates a dpaa-ethernet platform device that
triggers the probing of the dpaa-eth net device driver.

All of this is fine, but the problem is that the struct device of the
dpaa_eth net_device is 2 parents away from the MAC which can be
referenced via of_node. So of_find_net_device_by_node can't find it, and
DSA switches won't be able to probe on top of FMan ports.

It would be a bit silly to modify a core function
(of_find_net_device_by_node) to look for dev->parent->parent->of_node
just for one driver. We're just 1 step away from implementing full
recursion.

Actually there have already been at least 2 previous attempts to make
this work:
- Commit a1a50c8e4c ("fsl/man: Inherit parent device and of_node")
- One or more of the patches in "[v3,0/6] adapt DPAA drivers for DSA":
  https://patchwork.ozlabs.org/project/netdev/cover/1508178970-28945-1-git-send-email-madalin.bucur@nxp.com/
  (I couldn't really figure out which one was supposed to solve the
  problem and how).

Point being, it looks like this is still pretty much a problem today.
On T1040, the /sys/class/net/eth0 symlink currently points to

../../devices/platform/ffe000000.soc/ffe400000.fman/ffe4e6000.ethernet/dpaa-ethernet.0/net/eth0

which pretty much illustrates the problem. The closest of_node we've got
is the "fsl,fman-memac" at /soc@ffe000000/fman@400000/ethernet@e6000,
which is what we'd like to be able to reference from DSA as host port.

For of_find_net_device_by_node to find the eth0 port, we would need the
parent of the eth0 net_device to not be the "dpaa-ethernet" platform
device, but to point 1 level higher, aka the "fsl,fman-memac" node
directly. The new sysfs path would look like this:

../../devices/platform/ffe000000.soc/ffe400000.fman/ffe4e6000.ethernet/net/eth0

And this is exactly what SET_NETDEV_DEV does. It sets the parent of the
net_device. The new parent has an of_node associated with it, and
of_dev_node_match already checks for the of_node of the device or of its
parent.
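
A reduced model of that lookup is sketched below. The structs and the match function are simplified stand-ins, not the kernel's definitions; they only mirror the statement that the match considers the of_node of the device or of its direct parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the device tree node and struct device. */
struct node {
	const char *name;
};

struct device {
	struct device *parent;
	const struct node *of_node;
};

/* Match on the device's own node or on its direct parent's node. */
bool dev_node_match(const struct device *dev, const struct node *np)
{
	assert(dev != NULL && np != NULL);
	if (dev->of_node == np)
		return true;
	return dev->parent != NULL && dev->parent->of_node == np;
}
```

With the net_device parented to the intermediate platform device, the MAC node is two levels up and the match fails; once the parent points at the MAC device directly, the same check succeeds.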

Fixes: a1a50c8e4c ("fsl/man: Inherit parent device and of_node")
Fixes: c6e26ea8c8 ("dpaa_eth: change device used")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 17:56:53 -07:00
Eric Dumazet
880f8f99d1 bnx2x: allow bnx2x_bsc_read() to schedule
bnx2x_warpcore_read_sfp_module_eeprom() can call bnx2x_bsc_read()
three times before giving up.

This causes latency blips of at least 31 ms (58 ms being reported
by our teams)

Convert the long lasting loops of udelay() to usleep_range() ones,
and break the loops on precise time tracking.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <aelior@marvell.com>
Cc: Sudarsana Kalluru <skalluru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 17:52:48 -07:00
Sven Auhagen
ca23cb0bc5 mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zero
For XDP the MVNETA_SKB_HEADROOM is used as an offset for
the received data.
The MVNETA manual states that the last 3 bits are assumed to be 0.

This is currently the case but let's make it explicit in the definition
to prevent future problems.
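
One way to make such a guarantee explicit is to round the headroom up to a multiple of 8. This is a sketch under assumptions: the macro and function names are hypothetical, and the input values stand in for whatever constants the driver actually combines.

```c
#include <assert.h>

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))
#define MAX_OF(a, b)   ((a) > (b) ? (a) : (b))

/* Pick the larger of two headroom requirements and force the
 * low 3 bits of the result to zero. */
unsigned int rx_headroom(unsigned int pad, unsigned int xdp_headroom)
{
	unsigned int h = ALIGN_UP(MAX_OF(pad, xdp_headroom), 8u);

	assert((h & 7u) == 0);
	return h;
}
```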

Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 17:50:01 -07:00
Ido Schimmel
154388e112 mlxsw: spectrum: Fix spelling mistake in trap's name
Fix incorrect spelling of "advertisement".

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
ce3c3bf0bf mlxsw: spectrum: Use dedicated trap group for sampled packets
The rate with which packets are sampled is determined by user space, so
there is no need to associate such packets with a policer.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
b33f5d9fb7 mlxsw: spectrum: Use same trap group for IPv6 ND and ARP packets
Both packet types are needed for the same reason (neighbour discovery),
so associate them with the same trap group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
32446438cc mlxsw: spectrum: Rename ARP trap group
The ARP trap group will be used for IPv6 ND traps in the next patch, so
rename it to "NEIGH_DISCOVERY" which is more appropriate.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
d88f8cc158 mlxsw: spectrum_trap: Remove unnecessary field
Now that traffic class (TC) and priority are set to the same value,
there is no need to store both. Remove the first.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
5047d819f5 mlxsw: spectrum: Align TC and trap priority
The traffic class (TC) attribute of packet traps determines through which
TC a packet trap will be scheduled through the CPU port.

The priority attribute determines which trap will be triggered in case
several packet traps match a packet.

We try to configure these attributes to the same value for all packet
traps as there is little reason not to.

Some packet traps did not use the same value, so rectify that now.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
e0d848477a mlxsw: spectrum_buffers: Assign non-zero quotas to TC 0 of the CPU port
As explained in commit 9ffcc3725f ("mlxsw: spectrum: Allow packets to
be trapped from any PG"), incoming packets can be admitted to the shared
buffer and forwarded / trapped, if:

(Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres &&
 Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres)
||
(Ingress{Port}.Usage < Min || Ingress{Port,PG} < Min ||
 Egress{Port}.Usage < Min || Egress{Port,TC}.Usage < Min)
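
The condition above can be written as a small predicate. The struct and function names are hypothetical; the logic is a direct transcription of the two clauses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Usage and quotas of one shared-buffer pool (hypothetical model). */
struct pool {
	unsigned int usage;
	unsigned int thres;
	unsigned int min;
};

static bool below_thres(const struct pool *p)
{
	return p->usage < p->thres;
}

static bool below_min(const struct pool *p)
{
	return p->usage < p->min;
}

/* Admit if all four pools are below their threshold, or if any
 * one of them is below its minimum quota. */
bool admit(const struct pool *in_port, const struct pool *in_pg,
	   const struct pool *eg_port, const struct pool *eg_tc)
{
	assert(in_port && in_pg && eg_port && eg_tc);
	return (below_thres(in_port) && below_thres(in_pg) &&
		below_thres(eg_port) && below_thres(eg_tc)) ||
	       (below_min(in_port) || below_min(in_pg) ||
		below_min(eg_port) || below_min(eg_tc));
}
```

With both quotas of the egress TC at 0, "usage < 0" can never hold for an unsigned usage, so the first clause always fails and admission depends entirely on some other pool being below its minimum.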

Trapped packets are scheduled to transmission through the CPU port.
Currently, the minimum and maximum quotas of traffic class (TC) 0 of the
CPU port are 0, which means it is not usable.

Assign non-zero quotas to TC 0 of the CPU port, so that it could be
utilized by subsequent patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
938e6d0b76 mlxsw: spectrum: Change default rate and priority of DHCP packets
Reduce the default acceptable rate of DHCP packets to 128 packets per
second and reduce their priority. This is reasonable given the Spectrum
ASICs are limited to 128 ports at the moment.

These are only the default values. Users will be able to modify them via
devlink-trap.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
0ecb947412 mlxsw: spectrum: Trap IPv4 DHCP packets in router
Currently, IPv4 DHCP packets are trapped during L2 forwarding, which
means that packets might be trapped unnecessarily. Instead, only trap
the DHCP packets that reach the router. Either because they were flooded
to the router port or forwarded to it by the FDB. This is consistent
with the corresponding IPv6 trap.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
99129069b7 mlxsw: spectrum: Use same trap group for MLD and IGMP packets
Both packet types are needed for the same reason (multicast snooping),
so associate them with the same trap group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
debb7af686 mlxsw: spectrum: Rename IGMP trap group
The IGMP trap group will be used for MLD traps in the next patch, so
rename it to "MC_SNOOPING" which is more appropriate.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
David S. Miller
13209a8f73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The MSCC bug fix in 'net' had to be slightly adjusted because the
register accesses are done slightly differently in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 13:47:27 -07:00
Bartosz Golaszewski
9250dccc11 net: ethernet: mtk_star_emac: use devm_register_netdev()
Use the new devres variant of register_netdev() in the mtk-star-emac
driver and shrink the code by a couple lines.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:56:17 -07:00
Heiner Kallweit
787c0c04f4 r8169: remove mask argument from r8168ep_ocp_read
Remove the mask argument as it's not used by r8168ep_ocp_read().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:54:35 -07:00
Heiner Kallweit
a15aaa038b r8169: remove mask argument from r8168dp_ocp_read
All callers read the full 32bit value, therefore the mask argument can
be removed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:54:35 -07:00
Heiner Kallweit
54113ded67 r8169: remove mask argument from rtl_w0w1_eri
rtl_eri_read() returns the full 32bit value, therefore there's no
benefit in writing back parts of it only. Handle it like the vendor
driver and always write the full 32 bits. Omitting the mask argument
avoids some overhead and makes the code more readable.
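
The resulting read-modify-write can be sketched against a plain variable instead of a hardware register. The function name and signature are hypothetical and do not reproduce the driver's helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the full 32-bit value, clear the "clear" bits, set the "set"
 * bits, and write the full 32-bit value back. */
void reg_w0w1(uint32_t *reg, uint32_t set, uint32_t clear)
{
	uint32_t val;

	assert(reg != NULL);
	val = *reg;
	*reg = (val & ~clear) | set;
}
```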

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:54:35 -07:00
Dinghao Liu
539d39ad0c net: smsc911x: Fix runtime PM imbalance on error
Remove runtime PM usage counter decrement when the
increment function has not been called to keep the
counter balanced.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:52:17 -07:00
David S. Miller
2b1a7f741a Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-05-22

This series contains updates to virtchnl and the ice driver.

Geert Uytterhoeven fixes a data structure alignment issue in the
virtchnl structures.

Henry adds Flow Director support which allows for the redirection on
ntuple rules over six patches.  Initially Henry adds the initial
infrastructure for Flow Director, and then later adds IPv4 and IPv6
support, as well as being able to display the ntuple rules.

Bret adds Accelerated Receive Flow Steering (aRFS) support which is used
to steer receive flows to a specific queue.  Fixes a transmit timeout
when the VF link transitions from up/down/up because the transmit and
receive queue interrupts are not enabled as part of VF's link up.  Fixed
an issue when the default VF LAN address is changed and after reset the
PF will attempt to add the new MAC, which fails because it already
exists. This causes the VF to be disabled completely until it is removed
and enabled via sysfs.

Anirudh (Ani) makes a fix where the ice driver needs to call set_mac_cfg
to enable jumbo frames, so ensure it gets called during initialization
and after reset.  Fix bad register reads during a register dump in
ethtool by removing the bad registers.

Paul fixes an issue where the receive Malicious Driver Detection (MDD)
auto reset message was not being logged because it occurred after the VF
reset.

Victor adds a check for compatibility between the Dynamic Device
Personalization (DDP) package and the NIC firmware to ensure that
everything aligns.

Jesse fixes an administrative queue string call with the appropriate
error reporting variable.  Also fixed the loop variables that are
comparing or assigning signed against unsigned values.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:51:26 -07:00
David S. Miller
098205f3c6 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-05-22

This series contains updates to e1000e, igc and igb.

Many of the patches in this series are fixes, but many of the igc fixes
are based on the recent filter rule handling Andre has been working on,
which will not backport to earlier/stable kernels.  The remaining fixes
for e1000e and igb have CC'd stable where applicable.

Andre continues with his refactoring of the filter rule code to help with
reducing the complexity, in multiple patches.  Fix the inconsistent size
of a struct field.  Fixed an issue where filter rules stay active in the
hardware, even after it was deleted, so make sure to disable the filter
rule before deleting.  Fixed an issue with NFC rules which were dropping
valid multicast MAC address.  Fixed how the NFC rules are restored after
the NIC is reset or brought up, so that they are restored in the same order
they were initially setup in.  Fix a potential memory leak when the
driver is unloaded and the NFC rules are not flushed from memory
properly.  Fixed how NFC rule validation handles when a request to
overwrite an existing rule.  Changed the locking around the NFC rule API
calls from spin_locks to mutex locks to avoid unnecessary busy waiting
on lock contention.

Sasha cleans up more unused code in the igc driver.

Kai-Heng Feng from Canonical provides three fixes, first has igb report
the speed and duplex as unknown when in runtime suspend.  Fixed e1000e
to pass up the error when disabling ULP mode.  Fixed e1000e performance
by disabling TSO by default for certain MACs.

Vitaly disables S0ix entry and exit flows for ME systems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:47:41 -07:00
David S. Miller
e3181e9a72 mlx5-fixes-2020-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7IbksACgkQSD+KveBX
 +j5T8Af/XT6b23VlSn2Km4tg8WQNDRJLdq1s6fTS5SGcyc0awxfH07cvYvJ26kKW
 kmdDNijkVbd0ma2UxHiiD3vmE8Vs85gZ6BDNyl485x/cH3zFzAm54R5fZdnK5JgN
 YNgdFP0MOwPtAdDtxLH+r8aOyNKncIOmCZrMNnxVgI+IytG1L5QLnS6GeQy2zyIx
 9F/9sihta2z567IstGu2wvmgviSHVk/zV9yqn/orD9tV6oFvvrBQMlEt8l27b1tA
 4bajbHIyc1WmfQ+wg56eXATdbqCQ2YYfMjhchiCfFv5DhnMnPi5bV0PNR9Rq0CYw
 05xpF16/85uvDbTizsgGNZ1Pb1nGsQ==
 =oFWF
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2020-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2020-05-22

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.13
   ('net/mlx5: Add command entry handling completion')

For -stable v5.2
   ('net/mlx5: Fix error flow in case of function_setup failure')
   ('net/mlx5: Fix memory leak in mlx5_events_init')

For -stable v5.3
   ('net/mlx5e: Update netdev txq on completions during closure')
   ('net/mlx5e: kTLS, Destroy key object after destroying the TIS')
   ('net/mlx5e: Fix inner tirs handling')

For -stable v5.6
   ('net/mlx5: Fix cleaning unmanaged flow tables')
   ('net/mlx5: Fix a race when moving command interface to events mode')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:39:45 -07:00
David S. Miller
46c54f9500 mlx5-updates-2020-05-22
This series includes two updates and one cleanup patch
 
 1) Tang Bim, clean-up with IS_ERR() usage
 
 2) Vlad introduces a new mlx5 kconfig flag for TC support
 
    This is required due to the high volume of current and upcoming
    development in the eswitch and representors areas where some of the
    feature are TC based such as the downstream patches of MPLSoUDP and
    the following representor bonding support for VF live migration and
    uplink representor dynamic loading.
    For this Vlad kept TC specific code in tc.c and rep/tc.c and
    organized non TC code in representors specific files.
 
 3) Eli Cohen adds support for MPLS over UDP encap and decap TC offloads.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7IZFEACgkQSD+KveBX
 +j7IEQf/RFv633bWTlL63fEJjViRv1rjfkbyaXrGVL3gzr/Er01DeAPR22CNOlC3
 bu1jHLKqVn0Mg0g5g2B4/H/7JoFbMBRTy4MXpM5VrQCIqwMuXG4zhWuoUj7ncQ5w
 kXHAU6DUuZRn8/x1JLQOHDRTzKhav7ldT+nvvoKEMrad/DEMGz+bq67xh4l8nfi+
 ktSFAO0UFi9ysb25CMfdqIqAL0J5nAJ7DNhw5x7IvtwUxNxate7HtBaBhBgZ9NWv
 jYf8R3p+7JdgvVW18pZhmjbaBqaApXcZrC7rI07PR6rCOAHfToX6miR8gUtpIEno
 itQkzYt9UF2dgNwMmxoJLqnUNiy/Cg==
 =wkSR
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-05-22

This series includes two updates and one cleanup patch

1) Tang Bim, clean-up with IS_ERR() usage

2) Vlad introduces a new mlx5 kconfig flag for TC support

   This is required due to the high volume of current and upcoming
   development in the eswitch and representors areas where some of the
   feature are TC based such as the downstream patches of MPLSoUDP and
   the following representor bonding support for VF live migration and
   uplink representor dynamic loading.
   For this Vlad kept TC specific code in tc.c and rep/tc.c and
   organized non TC code in representors specific files.

3) Eli Cohen adds support for MPLS over UDP encap and decap TC offloads.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:37:00 -07:00
Qiushi Wu
febfd9d3c7 net/mlx4_core: fix a memory leak bug.
In function mlx4_opreq_action(), pointer "mailbox" is not released
when mlx4_cmd_box() returns an error, causing a memory leak bug.
Fix this issue by going to the "out" label, where
mlx4_free_cmd_mailbox() frees this pointer.

Fixes: fe6f700d6c ("net/mlx4_core: Respond to operation request by firmware")
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:34:37 -07:00
Grygorii Strashko
4c64b83d03 net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
vlan_for_each() is required to be called with rtnl_lock taken, otherwise
ASSERT_RTNL() warning will be triggered - which happens now during System
resume from suspend:
  cpsw_suspend()
  |- cpsw_ndo_stop()
    |- __hw_addr_ref_unsync_dev()
      |- cpsw_purge_all_mc()
         |- vlan_for_each()
            |- ASSERT_RTNL();

Hence, fix it by surrounding cpsw_ndo_stop() by rtnl_lock/unlock() calls.

Fixes: 15180eca56 ("net: ethernet: ti: cpsw: fix vlan mcast")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:33:20 -07:00
Leon Yu
31096c3e8b net: stmmac: don't attach interface until resume finishes
Commit 14b41a2959 ("net: stmmac: Delete txtimer in suspend") was the
first attempt to fix a race between mod_timer() and setup_timer()
during stmmac_resume(). However the issue still exists as the commit
only addressed half of the issue.

Same race can still happen as stmmac_resume() re-attaches interface
way too early - even before hardware is fully initialized.  Worse,
doing so allows network traffic to restart and stmmac_tx_timer_arm()
being called in the middle of stmmac_resume(), which re-init tx timers
in stmmac_init_coalesce().  timer_list will be corrupted and system
crashes as a result of race between mod_timer() and setup_timer().

  systemd--1995    2.... 552950018us : stmmac_suspend: 4994
  ksoftirq-9       0..s2 553123133us : stmmac_tx_timer_arm: 2276
  systemd--1995    0.... 553127896us : stmmac_resume: 5101
  systemd--320     7...2 553132752us : stmmac_tx_timer_arm: 2276
  (sd-exec-1999    5...2 553135204us : stmmac_tx_timer_arm: 2276
  ---------------------------------
  pc : run_timer_softirq+0x468/0x5e0
  lr : run_timer_softirq+0x570/0x5e0
  Call trace:
   run_timer_softirq+0x468/0x5e0
   __do_softirq+0x124/0x398
   irq_exit+0xd8/0xe0
   __handle_domain_irq+0x6c/0xc0
   gic_handle_irq+0x60/0xb0
   el1_irq+0xb8/0x180
   arch_cpu_idle+0x38/0x230
   default_idle_call+0x24/0x3c
   do_idle+0x1e0/0x2b8
   cpu_startup_entry+0x28/0x48
   secondary_start_kernel+0x1b4/0x208

Fix this by deferring netif_device_attach() to the end of
stmmac_resume().

Signed-off-by: Leon Yu <leoyu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:30:27 -07:00
Tiezhu Yang
ef24d6c3d6 net: Fix return value about devm_platform_ioremap_resource()
When calling devm_platform_ioremap_resource(), we should use IS_ERR()
to check the return value and return PTR_ERR() on failure.
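
The pattern can be sketched in user space as follows. ERR_PTR(),
PTR_ERR() and IS_ERR() below are simplified stand-ins for the kernel's
<linux/err.h> helpers, and fake_ioremap()/fake_probe() are hypothetical
names used only for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapping helper: returns an error pointer, never NULL. */
static void *fake_ioremap(int fail)
{
	static int regs;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&regs;
}

/* Hypothetical probe: a NULL check here would miss the failure. */
static int fake_probe(int fail)
{
	void *base = fake_ioremap(fail);

	if (IS_ERR(base))
		return (int)PTR_ERR(base);
	return 0;
}
```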

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:28:25 -07:00
Jesse Brandeburg
c1e0883012 ice: cleanup unsigned loops
Fix loop variables that are comparing or assigning signed against
unsigned values, mostly by declaring loop counters as unsigned.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:27:31 -07:00
Jesse Brandeburg
9d68a79c3b ice: fix usage of incorrect variable
The driver was using rq_last_status where it should have been
using sq_last_status. Fix the string to be using the correct
error reporting variable.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:26:02 -07:00
Anirudh Venkataramanan
1fba4a8a92 ice: Fix bad register reads
The "ethtool -d" handler reads registers in the ice_regs_dump_list array
and returns read values back to the userspace.

The register offsets PFINT0_ITR* are not valid as per the specification
and reading these causes an "unable to handle kernel paging request" bug
in the driver. Remove these registers from ice_regs_dump_list.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:24:29 -07:00
Victor Raj
b827291958 ice: check for compatibility between DDP package and firmware
Require the Dynamic Device Personalization (DDP) file to have the same
major version number and the same or older minor number than the firmware
version major and minor, respectively.
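
A minimal sketch of the rule stated above, assuming a simple
major/minor pair. struct pkg_ver and ddp_compatible() are hypothetical
names and not the driver's actual types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical version pair used only for this illustration. */
struct pkg_ver {
	uint8_t major;
	uint8_t minor;
};

/* Compatible when majors match and the package minor is not newer. */
static bool ddp_compatible(struct pkg_ver pkg, struct pkg_ver fw)
{
	return pkg.major == fw.major && pkg.minor <= fw.minor;
}
```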

Check the OS and NVM package versions before downloading the package.
If the OS package version is not compatible with NVM then return an
appropriate error.

Split the 32-byte segment name into a 28-byte segment name and
a 4-byte Track-ID. Older packages will still work with this change
because no package has a name that will take up more than 28 bytes;
in this case the Track-ID will be 0.
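
The split can be pictured roughly as below. The structure and helper
names are hypothetical, and reading the Track-ID in host byte order is
an assumption made for the sketch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEG_FIELD_LEN 32	/* original field size */
#define SEG_NAME_LEN  28	/* name portion after the split */

/* Hypothetical view of the 32-byte field after the split. */
struct seg_id {
	char name[SEG_NAME_LEN];
	uint32_t track_id;
};

/* Read the trailing 4 bytes of a raw field as the Track-ID. */
static uint32_t seg_track_id(const unsigned char raw[SEG_FIELD_LEN])
{
	uint32_t id;

	memcpy(&id, raw + SEG_NAME_LEN, sizeof(id));
	return id;
}
```

An older package whose name is shorter than 28 bytes leaves the
trailing bytes zeroed, so seg_track_id() yields 0 for it.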

Note that the driver will store the segment name as 32-bytes in the
ice_hw structure, in order to normalize the length of the various
package name strings that it uses.

Also add section ID and structure for the segment metadata section.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:22:50 -07:00
Brett Creeley
47ebc7b024 ice: Check if unicast MAC exists before setting VF MAC
Currently if a unicast MAC is set via ndo_set_vf_mac, the PF driver will
set the VF's dflt_lan_addr.addr once some basic checks have passed. The
VF is then reset. During reset the PF driver will attempt to program the
VF's MAC from the dflt_lan_addr.addr field. This fails when the MAC
already exists on the PF's switch.

This is causing the VF to be completely disabled until removing/enabling
any VFs via sysfs.

Fix this by checking if the unicast MAC exists before triggering a VF
reset directly in ndo_set_vf_mac. Also, add a check if the unicast MAC
is set to the same value as before and return 0 if that is the case.
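
The ordering of the checks can be sketched as follows. This is a
user-space model, not the driver code: struct vf_state, mac_on_switch()
and set_vf_mac() are hypothetical, and -EEXIST is an errno chosen for
illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

struct vf_state {
	uint8_t mac[ETH_ALEN];
	int resets;	/* counts simulated VF resets */
};

/* Hypothetical lookup over MACs already programmed on the switch. */
static bool mac_on_switch(const uint8_t (*tbl)[ETH_ALEN], int n,
			  const uint8_t *mac)
{
	for (int i = 0; i < n; i++)
		if (!memcmp(tbl[i], mac, ETH_ALEN))
			return true;
	return false;
}

static int set_vf_mac(struct vf_state *vf, const uint8_t (*tbl)[ETH_ALEN],
		      int n, const uint8_t *mac)
{
	if (!memcmp(vf->mac, mac, ETH_ALEN))
		return 0;		/* unchanged: no reset needed */
	if (mac_on_switch(tbl, n, mac))
		return -EEXIST;		/* reject before any reset */
	memcpy(vf->mac, mac, ETH_ALEN);
	vf->resets++;			/* reset only after checks pass */
	return 0;
}
```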

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:20:22 -07:00
Brett Creeley
4dc926d3a5 ice: Fix Tx timeout when link is toggled on a VF's interface
Currently if the iavf is loaded and a VF link transitions from up to
down to up again a Tx timeout will be triggered. This happens because
Tx/Rx queue interrupts are only enabled when receiving the
VIRTCHNL_OP_CONFIG_MAP_IRQ message, which happens on reset or initial
iavf driver load, but not when bringing link up. This is problematic
because they are disabled on the VIRTCHNL_OP_DISABLE_QUEUES message,
which is part of bringing a VF's link down. However, they are not
enabled on the VIRTCHNL_OP_ENABLE_QUEUES message, which is part of
bringing a VF's link up.

Fix this by re-enabling the VF's Rx and Tx queue interrupts when they
were previously configured. This is done by first checking to make
sure the previous value in QINT_[R|T]QCTL.MSIX_INDX is not 0, which
is used to represent the OICR in the VF's interrupt space. If the
MSIX_INDX is non-zero then enable the interrupt by setting the
QINT_[R|T]CTL.CAUSE_ENA bit to 1.
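
As a rough illustration of that check, with a register value modeled
as a plain integer. The mask and bit position below are placeholders,
not the hardware's actual field layout:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder field definitions; the real layout may differ. */
#define DEMO_MSIX_INDX_M  0x7FFu
#define DEMO_CAUSE_ENA_M  (1u << 30)

/* Re-enable the cause only if the queue was mapped to a non-zero vector. */
static uint32_t demo_reenable_queue_irq(uint32_t qctl)
{
	if (qctl & DEMO_MSIX_INDX_M)
		qctl |= DEMO_CAUSE_ENA_M;
	return qctl;
}
```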

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:10:58 -07:00
Paul Greenwalt
7438a3b094 ice: print Rx MDD auto reset message before VF reset
Rx MDD auto reset message was not being logged because logging occurred
after the VF reset and the VF MDD data was reinitialized.

Log the Rx MDD auto reset message before triggering the VF reset.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:07:06 -07:00
Anirudh Venkataramanan
4244910568 ice: Call ice_aq_set_mac_cfg
As per the specification, the driver needs to call set_mac_cfg
(opcode 0x0603) to be able to exercise jumbo frames. Call the
function during initialization and the post reset rebuild flow.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:05:25 -07:00
Brett Creeley
28bf26724f ice: Implement aRFS
Enable accelerated Receive Flow Steering (aRFS). It is used to steer Rx
flows to a specific queue. This functionality is triggered by the network
stack through ndo_rx_flow_steer and requires Flow Director (ntuple on) to
function.

The fltr_info is used to add/remove/update flow rules in the HW, the
fltr_state is used to determine what to do with the filter with respect
to HW and/or SW, and the flow_id is used in co-ordination with the
network stack.

The work for aRFS is split into two paths: the ndo_rx_flow_steer
operation and the ice_service_task. The former is where the kernel hands
us an Rx SKB among other items to setup aRFS and the latter is where
the driver adds/updates/removes filter rules from HW and updates filter
state.

In the Rx path the following things can happen:
        1. New aRFS entries are added to the hash table and the state is
           set to ICE_ARFS_INACTIVE so the filter can be updated in HW
           by the ice_service_task path.
        2. aRFS entries have their Rx Queue updated if we receive a
           pre-existing flow_id and the filter state is ICE_ARFS_ACTIVE.
           The state is set to ICE_ARFS_INACTIVE so the filter can be
           updated in HW by the ice_service_task path.
        3. aRFS entries marked as ICE_ARFS_TODEL are deleted

In the ice_service_task path the following things can happen:
        1. New aRFS entries marked as ICE_ARFS_INACTIVE are added or
           updated in HW and their state is updated to ICE_ARFS_ACTIVE.
        2. aRFS entries are deleted from HW and their state is updated
           to ICE_ARFS_TODEL.
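
The state transitions described in the two lists above can be
summarized with a small model. The enum and function names are
simplified stand-ins, and the sketch ignores hashing, locking and the
HW programming itself:

```c
#include <assert.h>
#include <stdbool.h>

enum arfs_state { ARFS_INACTIVE, ARFS_ACTIVE, ARFS_TODEL };

/* Rx path: an active flow that moved queues must be reprogrammed. */
static enum arfs_state rx_path_requeue(enum arfs_state s)
{
	return s == ARFS_ACTIVE ? ARFS_INACTIVE : s;
}

/* Service task: program pending entries into HW. */
static enum arfs_state service_task_program(enum arfs_state s)
{
	return s == ARFS_INACTIVE ? ARFS_ACTIVE : s;
}

/* Service task: remove an entry from HW and mark it for deletion. */
static enum arfs_state service_task_remove(enum arfs_state s)
{
	return s == ARFS_ACTIVE ? ARFS_TODEL : s;
}

/* Rx path: only entries marked for deletion may be freed. */
static bool rx_path_can_free(enum arfs_state s)
{
	return s == ARFS_TODEL;
}
```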

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 22:02:34 -07:00
Henry Tieman
83af003951 ice: Restore filters following reset
Following a reset, Flow Director filters are cleared from the hardware.
Rebuild the filters using the software structures containing the filter
rules.

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 21:46:51 -07:00
Henry Tieman
2c57ffcb19 ice: Enable flex-bytes support
Flex-bytes allows for packet matching based on an offset and value. This
is supported via the ethtool user-def option.  It is specified by providing
an offset followed by a 2 byte match value. Offset is measured from the
start of the MAC address.

The following restrictions apply to flex-bytes. The specified offset must
be an even number and be smaller than 0x1fe.

Example usage:

ethtool -N eth0 flow-type tcp4 src-ip 192.168.0.55 dst-ip 172.16.0.55 \
src-port 12 dst-port 13 user-def 0x10ffff action 32
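
A sketch of how such a user-def value could be decoded and validated.
The bit layout (offset in the upper 16 bits, match value in the lower
16 bits) is inferred from the example above and is an assumption, as is
the helper name:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_FLEX_MAX_OFFS 0x1fe

/* Split a user-def value into offset and 2-byte match value. */
static int demo_parse_user_def(uint32_t user_def, uint16_t *offset,
			       uint16_t *value)
{
	uint16_t off = (uint16_t)(user_def >> 16);

	/* Offset must be even and smaller than 0x1fe. */
	if ((off & 1) || off >= DEMO_FLEX_MAX_OFFS)
		return -EINVAL;

	*offset = off;
	*value = (uint16_t)(user_def & 0xffff);
	return 0;
}
```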

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 21:44:48 -07:00
Henry Tieman
165d80d6ad ice: Support IPv6 Flow Director filters
Extend supported filters to allow for IPv6 filters.

Supported fields are: src-ip, dst-ip, src-port, and dst-port
Supported flow-types are: tcp6, udp6, sctp6, ip6

Example usage:

ethtool -N eth0 flow-type tcp6 src-port 12 dst-port 13 \
src-ip fce0::1:34 dst-ip fce0::1:35 action 32

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 21:42:20 -07:00
Henry Tieman
cac2a27cd9 ice: Support IPv4 Flow Director filters
Support the addition and deletion of IPv4 filters.

Supported fields are: src-ip, dst-ip, src-port, and dst-port
Supported flow-types are: tcp4, udp4, sctp4, ip4

Example usage:

ethtool -N eth0 flow-type tcp4 src-ip 192.168.0.55 dst-ip 172.16.0.55 \
src-port 16 dst-port 12 action 32

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 21:36:27 -07:00
Henry Tieman
4ab956462f ice: Support displaying ntuple rules
Add functionality for ethtool --show-ntuple, allowing for filters to be
displayed when set functionality is added. Add statistics related to
Flow Director matches and status.

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 21:30:23 -07:00
Henry Tieman
148beb6120 ice: Initialize Flow Director resources
Flow Director allows for redirection based on ntuple rules. Rules are
programmed using the ethtool set-ntuple interface. Supported actions are
redirect to queue and drop.

Setup the initial framework to process Flow Director filters. Create and
allocate resources to manage and program filters to the hardware. Filters
are processed via a sideband interface; a control VSI is created to manage
communication and process requests through the sideband. Upon allocation of
resources, update the hardware tables to accept perfect filters.

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 21:26:37 -07:00
David S. Miller
a152b85984 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-05-23

The following pull-request contains BPF updates for your *net-next* tree.

We've added 50 non-merge commits during the last 8 day(s) which contain
a total of 109 files changed, 2776 insertions(+), 2887 deletions(-).

The main changes are:

1) Add a new AF_XDP buffer allocation API to the core in order to help
   lowering the bar for drivers adopting AF_XDP support. i40e, ice, ixgbe
   as well as mlx5 have been moved over to the new API and also gained a
   small improvement in performance, from Björn Töpel and Magnus Karlsson.

2) Add getpeername()/getsockname() attach types for BPF sock_addr programs
   in order to allow for e.g. reverse translation of load-balancer backend
   to service address/port tuple from a connected peer, from Daniel Borkmann.

3) Improve the BPF verifier is_branch_taken() logic to evaluate pointers
   being non-NULL, e.g. if after an initial test another non-NULL test on
   that pointer follows in a given path, then it can be pruned right away,
   from John Fastabend.

4) Larger rework of BPF sockmap selftests to make output easier to understand
   and to reduce overall runtime as well as adding new BPF kTLS selftests
   that run in combination with sockmap, also from John Fastabend.

5) Batch of misc updates to BPF selftests including fixing up test_align
   to match verifier output again and moving it under test_progs, allowing
   bpf_iter selftest to compile on machines with older vmlinux.h, and
   updating config options for lirc and v6 segment routing helpers, from
   Stanislav Fomichev, Andrii Nakryiko and Alan Maguire.

6) Conversion of BPF tracing samples outdated internal BPF loader to use
   libbpf API instead, from Daniel T. Lee.

7) Follow-up to BPF kernel test infrastructure in order to fix a flake in
   the XDP selftests, from Jesper Dangaard Brouer.

8) Minor improvements to libbpf's internal hashmap implementation, from
   Ian Rogers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 18:30:34 -07:00
Vitaly Lifshits
e086ba2fcc e1000e: disable s0ix entry and exit flows for ME systems
Since ME systems do not support SLP_S0 in S0ix state, and S0ix entry
and exit flows may cause errors on them, it is best to avoid using
the e1000e_s0ix_entry_flow and e1000e_s0ix_exit_flow functions.

This is done by creating a struct of all devices that come with ME
and by checking whether the current device has ME.

Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Kai-Heng Feng
f29801030a e1000e: Disable TSO for buffer overrun workaround
Commit b10effb92e ("e1000e: fix buffer overrun while the I219 is
processing DMA transactions") imposes roughly 30% performance penalty.

The commit log states that "Disabling TSO eliminates performance loss
for TCP traffic without a noticeable impact on CPU performance", so
let's disable TSO by default to regain the loss.

CC: stable <stable@vger.kernel.org>
Fixes: b10effb92e ("e1000e: fix buffer overrun while the I219 is processing DMA transactions")
BugLink: https://bugs.launchpad.net/bugs/1802691
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Kai-Heng Feng
0c80cdbf33 e1000e: Warn if disabling ULP failed
The hardware may stop working if the driver fails to disable ULP mode.

Take the return value of e1000_disable_ulp_lpt_lp() into account, and
pass up the error if it fails.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Kai-Heng Feng
165ae7a8fe igb: Report speed and duplex as unknown when device is runtime suspended
igb device gets runtime suspended when there's no link partner. We can't
get correct speed under that state:
$ cat /sys/class/net/enp3s0/speed
1000

In addition to that, an error can also be spotted in dmesg:
[  385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost

Since device can only be runtime suspended when there's no link partner,
we can skip reading register and let the following logic set speed and
duplex with correct status.

The more generic approach would be to wrap get_link_ksettings() with
begin() and complete() callbacks. However, for this particular issue,
begin() calls igb_runtime_resume(), which tries to rtnl_lock() while
the lock is already held by the upper ethtool layer.

So let's take this approach until the igb_runtime_resume() no longer
needs to hold rtnl_lock.

CC: stable <stable@vger.kernel.org>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Sasha Neftin
14ec06b02e igc: Remove unused descriptor's flags
The Enable Tidv register, Report Packet Sent, Report Status and
Ethernet CRC flags are not in use.
This patch cleans up these flags.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Andre Guedes
5c739e77ca igc: Remove igc_nfc_rule_exit()
During igc_down(), we call igc_nfc_rule_exit() which traverses the NFC
rule list disabling filters one by one. Later on in the igc_down() flow
we issue a hardware reset which also clears all filters.  Since we
already reset the hardware, we don't actually need to disable each
filter manually. In order to simplify the code, this patch removes
igc_nfc_rule_exit() altogether.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Andre Guedes
42fc5dc042 igc: Change adapter->nfc_rule_lock to mutex
This patch changes adapter->nfc_rule_lock type from spin_lock to mutex
so we avoid unnecessary busy waiting on lock contention.

A closer look at the execution context of NFC rule API users shows that
all of them run in process context. The API users are: ethtool ops,
igc_configure(), called when interface is brought up by user or reset
workqueue thread, igc_down(), called when interface is brought down,
and igc_remove(), called when driver is unloaded.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Andre Guedes
acda576f72 igc: Change return type from igc_disable_nfc_rule()
None of igc_disable_nfc_rule() callers actually check its return
value. A closer look at why this function would fail shows that the
only situation is when we try to delete an Ethertype or MAC filter that
doesn't exist.

That situation is very unlikely so we can change igc_del_etype_filter()
and igc_del_mac_filter() logic to "if the filter doesn't exist, we are
done", and keep the logic in igc_disable_nfc_rule() callers simple.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:52 -07:00
Andre Guedes
1894df0ccb igc: Fix NFC rule validation
If we try to overwrite an existing rule with the same filter but
different action, we get EEXIST error as shown below.

$ ethtool -N eth0 flow-type ether dst <MACADDR> action 1 loc 10
$ ethtool -N eth0 flow-type ether dst <MACADDR> action 2 loc 10
rmgr: Cannot insert RX class rule: File exists

The second command is expected to overwrite the previous rule in location
10 and succeed.

This patch fixes igc_ethtool_check_nfc_rule() so it also checks the
rule's location. In case they match, the rule under evaluation should not
be considered invalid.
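
The intended check can be sketched like this. It is a simplified model
with hypothetical names (struct demo_rule, demo_check_rule()) and an
array in place of the adapter's rule list:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct demo_rule {
	uint8_t dst[6];		/* filter: destination MAC */
	uint32_t action;
	uint32_t location;
};

/* A duplicate filter is only an error if it lives at another location. */
static int demo_check_rule(const struct demo_rule *list, int n,
			   const struct demo_rule *rule)
{
	for (int i = 0; i < n; i++) {
		if (!memcmp(list[i].dst, rule->dst, sizeof(rule->dst)) &&
		    list[i].location != rule->location)
			return -EEXIST;
	}
	return 0;
}
```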

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
e256ec83fa igc: Fix NFC rules leak when driver is unloaded
If we have NFC rules in adapter->nfc_rule_list when the IGC driver
is unloaded, all rules are leaked. This patch fixes the issue by
introducing the helper igc_flush_nfc_rules() and calling it in
igc_remove(). It also updates igc_set_features() so it reuses the
new helper instead of re-implementing it.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
36fa21520f igc: Refactor igc_ethtool_update_nfc_rule()
Current implementation of igc_ethtool_update_nfc_rule() is a bit
convoluted since it handles too many things: rule lookup, deletion
and addition. This patch breaks it into three functions so we simplify
the code and improve code reuse.

Code related to rule lookup is refactored out to a new function called
igc_get_nfc_rule().

Code related to rule addition is refactored out to a new function called
igc_add_nfc_rule(). This function enables the rule in hardware and adds
it to the adapter's list.

Code related to rule deletion is refactored out to a new function called
igc_del_nfc_rule(). This function disables the rule in hardware, removes
it from adapter's list, and deletes it.

As a byproduct of this refactoring, igc_enable_nfc_rule() and
igc_disable_nfc_rule() are moved to igc_main.c since they are not used
in igc_ethtool.c anymore, and igc_restore_nfc_rules() and
igc_nfc_rule_exit() are moved around to avoid forward declaration.

Also, since this patch already touches igc_ethtool_get_nfc_rule(), it
takes the opportunity to remove the 'match_flags' check. Empty flags
are not allowed to be added so no need to check that.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
d957c6010a igc: Fix NFC rules restoration
When the network interface is brought up, the driver re-enables the NFC
rules previously configured. However, this is done in the reverse of the
order in which the rules were added, so hardware filters are configured
differently.

For example, consider the following rules:

$ ethtool -N eth0 flow-type ether dst 00:00:00:00:00:AA queue 0
$ ethtool -N eth0 flow-type ether dst 00:00:00:00:00:BB queue 1
$ ethtool -N eth0 flow-type ether dst 00:00:00:00:00:CC queue 2
$ ethtool -N eth0 flow-type ether dst 00:00:00:00:00:DD queue 3

RAL/RAH registers are configured so filter index 1 has address ending
with AA, filter index 2 has address ending in BB, and so on.

If we bring the interface down and up again, RAL/RAH registers are
configured so filter index 1 has address ending in DD, filter index 2
has CC, and so on. IOW, in the reverse of the order we had before
bringing the interface down.

This issue can be fixed by traversing adapter->nfc_rule_list
backwards when restoring the rules. Since hlist doesn't support
backwards traversal, this patch replaces it by list_head and fixes
igc_restore_nfc_rules() accordingly.
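
The effect of the traversal direction can be shown with a toy model.
It assumes the stored list holds rules newest-first and uses an array
in place of list_head; restore_rules() is a hypothetical helper and
slots are zero-based here:

```c
#include <assert.h>
#include <stddef.h>

/*
 * rules[] models the stored list, with index 0 holding the most
 * recently added rule. Values are the last byte of each rule's
 * destination MAC address.
 */
static void restore_rules(const unsigned char *rules, size_t n,
			  unsigned char *hw_filters)
{
	size_t slot = 0;

	/* Walk from tail to head so the oldest rule is programmed first. */
	for (size_t i = n; i-- > 0; )
		hw_filters[slot++] = rules[i];
}
```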

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
39707c16e6 igc: Fix NFC rules with multicast addresses
Multicast MAC addresses are valid addresses for NFC rules but
igc_add_mac_filter() is currently rejecting them. In fact, the I225
controller doesn't impose any constraint on the address value so this
patch gets rid of the address validation check in MAC filter APIs.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
4bdf89e85e igc: Fix NFC rule overwrite cases
When the 'loc' argument is passed in ethtool, the input rule overwrites
any rule present in that location. In this situation we must disable the
old rule otherwise it is left enabled in hardware. This patch fixes
the issue by always calling igc_disable_nfc_rule() when deleting the
old rule, no matter the value of 'input' argument.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
b500350a36 igc: Fix locking issue when retrieving NFC rules
Access to NFC rules stored in adapter->nfc_rule_list is protected by
adapter->nfc_rule_lock. The functions igc_ethtool_get_nfc_rule()
and igc_ethtool_get_nfc_rules() fail to hold the lock while
accessing rule objects.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
d3ba9e6f61 igc: Fix 'sw_idx' type in struct igc_nfc_rule
The 'sw_idx' field from 'struct igc_nfc_rule' is u16 type but it is
assigned a u32 value in igc_ethtool_init_nfc_rule(). This patch changes
'sw_idx' type to u32 so they match. Also, it makes more sense to call
this field 'location' since it holds the NFC rule location.
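
For illustration, the kind of silent truncation a u16 field allows
when it is assigned a u32 value. demo_store_u16() is a hypothetical
helper written only to show the narrowing conversion:

```c
#include <assert.h>
#include <stdint.h>

/* Storing a 32-bit location in a 16-bit field drops the upper bits. */
static uint16_t demo_store_u16(uint32_t location)
{
	return (uint16_t)location;
}
```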

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Andre Guedes
16fdc16c6b igc: Refactor igc_ethtool_add_nfc_rule()
Current implementation of igc_ethtool_add_nfc_rule() is quite long and a
bit convoluted so this patch does a code refactoring to improve the
code.

Code related to NFC rule object initialization is refactored out to the
local helper function igc_ethtool_init_nfc_rule(). Likewise, code
related to NFC rule validation is refactored out to another local
helper, igc_ethtool_is_nfc_rule_valid().

RX_CLS_FLOW_DISC check is removed since it is redundant. The macro is
defined as the max value fsp->ring_cookie can have, so checking if
fsp->ring_cookie >= adapter->num_rx_queues is already sufficient.
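
A small sketch of why the range check alone covers the discard value.
RX_CLS_FLOW_DISC is reproduced here with the value from the ethtool
UAPI header, while demo_validate_ring_cookie() and the -EINVAL return
are illustrative only:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define RX_CLS_FLOW_DISC 0xffffffffffffffffULL

/* The largest u64 value is always >= any queue count. */
static int demo_validate_ring_cookie(uint64_t ring_cookie,
				     unsigned int num_rx_queues)
{
	if (ring_cookie >= num_rx_queues)
		return -EINVAL;
	return 0;
}
```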

Finally, some log messages are improved or added, and obvious comments
are removed.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 18:21:51 -07:00
Shay Drory
4f7400d5cb net/mlx5: Fix error flow in case of function_setup failure
Currently, if an error occurs during mlx5_function_setup(), we
keep dev->state as DEVICE_STATE_UP.
Fix it by adding a goto label.

Fixes: e161105e58 ("net/mlx5: Function setup/teardown procedures")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:58 -07:00
Roi Dayan
d37bd5e81e net/mlx5e: CT: Correctly get flow rule
The correct way is to use the flow_cls_offload_flow_rule() wrapper
instead of f->rule directly.

Fixes: 4c3844d9e9 ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:56 -07:00
Moshe Shemesh
5e911e2c06 net/mlx5e: Update netdev txq on completions during closure
On sq closure when we free its descriptors, we should also update netdev
txq on completions which would not arrive. Otherwise if we reopen sqs
and attach them back, for example on fw fatal recovery flow, we may get
tx timeout.

Fixes: 29429f3300 ("net/mlx5e: Timeout if SQ doesn't flush during close")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:54 -07:00
Roi Dayan
9ca415399d net/mlx5: Annotate mutex destroy for root ns
Invoke mutex_destroy() to catch any errors.

Fixes: 2cc43b494a ("net/mlx5_core: Managing root flow table")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:52 -07:00
Roi Dayan
6eb7a268a9 net/mlx5: Don't maintain a case of del_sw_func being null
Add del_sw_func cb for root ns. Now there is no need to
maintain a case of del_sw_func being null when freeing the node.

Fixes: 2cc43b494a ("net/mlx5_core: Managing root flow table")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:50 -07:00
Roi Dayan
aee37f3d94 net/mlx5: Fix cleaning unmanaged flow tables
Unmanaged flow tables don't have a parent and tree_put_node()
assumes there is always a parent if cleaning is needed. Fix that.

Fixes: 5281a0c909 ("net/mlx5: fs_core: Introduce unmanaged flow tables")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:48 -07:00
Moshe Shemesh
df14ad1ecc net/mlx5: Fix memory leak in mlx5_events_init
Fix memory leak in mlx5_events_init(), in case
create_single_thread_workqueue() fails, events
struct should be freed.
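
The error path can be modeled in user space as below. All names are
hypothetical, and the allocation counter exists only so the demo can
show that nothing is left allocated on failure:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_allocs;	/* outstanding allocations in this demo */

static void *demo_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

struct demo_events {
	void *wq;
};

static int demo_events_init(struct demo_events **out, int wq_fails)
{
	struct demo_events *ev = demo_alloc(sizeof(*ev));

	if (!ev)
		return -ENOMEM;

	ev->wq = wq_fails ? NULL : demo_alloc(1);
	if (!ev->wq) {
		demo_free(ev);	/* without this, ev would leak */
		return -ENOMEM;
	}

	*out = ev;
	return 0;
}
```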

Fixes: 5d3c537f90 ("net/mlx5: Handle event of power detection in the PCIE slot")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:46 -07:00
Roi Dayan
a16b8e0dcf net/mlx5e: Fix inner tirs handling
In the cited commit inner_tirs argument was added to create and destroy
inner tirs, and no indication was added to mlx5e_modify_tirs_hash()
function. In order to have a consistent handling, use
inner_indir_tir[0].tirn in tirs destroy/modify function as an indication
to whether inner tirs are created.
Inner tirs are not created for representors and before this commit,
a call to mlx5e_modify_tirs_hash() was sending HW commands to
modify non-existent inner tirs.

Fixes: 46dc933cee ("net/mlx5e: Provide explicit directive if to create inner indirect tirs")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:44 -07:00
Tariq Toukan
16736e11f4 net/mlx5e: kTLS, Destroy key object after destroying the TIS
The TLS TIS object contains the dek/key ID.
By destroying the key first, the TIS would contain an invalid
non-existing key ID.
Reverse the destroy order; this also achieves the desired asymmetry
between the destroy and the create flows.

Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:42 -07:00
Maor Dickman
321348475d net/mlx5e: Fix allowed tc redirect merged eswitch offload cases
After changing the parent_id to be the same for both NICs of same
The cited commit wrongly allows offload of tc redirect flows from
VF to uplink and vice versa when devices are on different eswitches;
these cases aren't supported by HW.

Disallow the above offloads when devices are on different eswitches
and VF LAG is not configured.

Fixes: f6dc1264f1 ("net/mlx5e: Disallow tc redirect offload cases we don't support")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:40 -07:00
Eran Ben Elisha
f7936ddd35 net/mlx5: Avoid processing commands before cmdif is ready
While the driver is reloading during the recovery flow, it can't accept new
commands until the command interface is up again. Otherwise we may hit a
NULL pointer dereference when accessing uninitialized command structures.

Add cmdif state to avoid processing commands while cmdif is not ready.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:38 -07:00
Eran Ben Elisha
d43b7007db net/mlx5: Fix a race when moving command interface to events mode
After the driver creates (via FW command) an EQ for commands, the driver
will be informed of new command completions by EQE. However, due to a race
in the driver's internal command mode metadata update, some new commands
will still be mishandled by the driver as if we were in polling mode. Such
commands can get two non-forced completions, leading to access of an
already freed command entry.

The CREATE_EQ command, which maps the EQ to the command queue, must be
posted to the command queue while it is empty, and no other command should
be posted.

Add a SW mechanism so that once the CREATE_EQ command is about to be
executed, all other commands return an error without being sent to the FW.
Allow sending other commands only after successfully changing the driver's
internal command mode metadata.
We can safely return an error to all other commands while creating the
command EQ, as any other commands at this point would be sent from the
user/application during driver load. The application can rerun them later,
after the driver's load has finished.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:36 -07:00
Moshe Shemesh
17d00e839d net/mlx5: Add command entry handling completion
When the FW response to commands is very slow and all command entries in
use are waiting for completion, we can have a race where commands time
out before they get out of the queue and are handled. A timeout
completion on an uninitialized command releases the command's
buffers before they are accessed for initialization, and we then get a NULL
pointer exception while trying to access them. It may also release the
buffers of another command, since the timeout completion can occur before
an entry index is even allocated for this command.
Add entry handling completion to avoid this race.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:34 -07:00
Eli Cohen
582234b465 net/mlx5e: Support pedit on mpls over UDP decap
Allow modifying Ethernet headers while decapsulating MPLS over UDP
packets. This is implemented using the same reformat object used for
decapsulation.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:23 -07:00
Eli Cohen
14e6b038af net/mlx5e: Add support for hw decapsulation of MPLS over UDP
MPLS over UDP is supported in hardware by using a packet reformat object
with reformat type equal to L3_TUNNEL_TO_L2, which both decapsulates the
outer L3, L4 and MPLS headers, and allows for setting the L2 headers of
the resulting decapsulated packet. For the hardware to operate
correctly, the configuration of the firmware must have
FLEX_PARSER_PROFILE_ENABLE = 1.

Example tc rule:
  tc filter add dev bareudp0 protocol all prio 1 root flower enc_dst_port \
      6635 enc_src_ip 8.8.8.23 action mpls pop protocol ip pipe \
      action pedit ex munge eth dst set 00:11:22:33:44:21 pipe action \
      mirred egress redirect dev enp59s0f0_0

We use pedit to set the correct destination MAC.

For MPLS over UDP decapsulation to take place, the driver logic requires
the following:

1. flower filter added on bareudp device.
2. action mpls pop
3. zero or more pedit munge actions
4. one redirect action

Current implementation supports only IPv4 and no VLAN.

tc filter show output looks like this:
   filter protocol all pref 1 flower chain 0
   filter protocol all pref 1 flower chain 0 handle 0x1
     enc_src_ip 8.8.8.24
     enc_dst_port 6635
     in_hw in_hw_count 1
            action order 1: mpls  pop protocol ip pipe
             index 2 ref 1 bind 1

            action order 2:  pedit action pipe keys 2
             index 1 ref 1 bind 1
             key #0  at eth+0: val 00112233 mask 00000000
             key #1  at eth+4: val 44210000 mask 0000ffff

            action order 3: mirred (Egress Redirect to device enp59s0f0_0) stolen
            index 2 ref 1 bind 1

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:21 -07:00
Eli Cohen
72046a91d1 net/mlx5e: Allow to match on mpls parameters
Support matching on MPLS over UDP parameters using misc2 section of
match parameters.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:19 -07:00
Eli Cohen
f828ca6a2f net/mlx5e: Add support for hw encapsulation of MPLS over UDP
MPLS over UDP is supported by adding a rule on a representor net device
which does tunnel_key set, push mpls and forward to a bareudp device. At
the hardware level we use a packet_reformat_context object to do the
encapsulation of the packet.

The resulting packet looks as follows (left side transmitted first):
outer L2 | outer IP | UDP | MPLS | inner L3 and data |

Example usage:
  tc filter add dev $rep0 protocol ip prio 1 root flower skip_sw  \
     action tunnel_key set src_ip 8.8.8.21 dst_ip 8.8.8.24 id 555 \
     dst_port 6635 tos 4 ttl 6 csum action mpls push protocol 0x8847 \
     label 555 tc 3 action mirred egress redirect dev bareudp0

This is how the filter is shown with tc filter show:
tc filter show dev enp59s0f0_0 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  skip_sw
  in_hw in_hw_count 1
        action order 1: tunnel_key  set
        src_ip 8.8.8.21
        dst_ip 8.8.8.24
        key_id 555
        dst_port 6635
        csum
        tos 0x4
        ttl 6 pipe
         index 1 ref 1 bind 1

        action order 2: mpls  push protocol mpls_uc label 555 tc 3 ttl 255 pipe
         index 1 ref 1 bind 1

        action order 3: mirred (Egress Redirect to device bareudp0) stolen
        index 1 ref 1 bind 1

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:18 -07:00
Vlad Buslov
d956873f90 net/mlx5e: Introduce kconfig var for TC support
In order to improve code maintainability and readability, introduce new
CONFIG_MLX5_CLS_ACT kconfig variable to control compilation of TC hardware
offloads implementation. This allows distinguishing between features that
require TC support (MPLSoUDP, etc.) and features that just rely on
representor functionality (rep_bond for live migration, etc.).

Modify rep_tc.h, rep_neigh.h, en_tc.h and chains.h files to provide stubs
for functions that are called from generic code.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:14 -07:00
Vlad Buslov
e2394a61d2 net/mlx5e: Move TC-specific code from en_main.c to en_tc.c
As a preparation for introducing new kconfig option that controls
compilation of all TC offloads code in mlx5, extract TC-specific code from
en_main.c to en_tc.c. This allows easily compiling out the code by
only including new source in make file when corresponding kconfig is
enabled instead of adding multiple ifdef blocks to en_main.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:12 -07:00
Vlad Buslov
549c243e4e net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c
As a preparation for introducing new kconfig option that controls
compilation of all TC offloads code in mlx5, extract neigh-specific code
from en_rep.c to standalone file. This allows easily compiling out the code
by only including new source in make file when corresponding kconfig is
enabled instead of adding multiple ifdef blocks to en_rep.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:10 -07:00
Vlad Buslov
768c3667e6 net/mlx5e: Extract TC-specific code from en_rep.c to rep/tc.c
As a preparation for introducing new kconfig option that controls
compilation of all TC offloads code in mlx5, extract TC-specific code from
en_rep.c to standalone file. This allows easily compiling out the code by
only including new source in make file when corresponding kconfig is
enabled instead of adding multiple ifdef blocks to en_rep.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:08 -07:00
Tang Bin
2639324a8f net/mlx5e: Use IS_ERR() to check and simplify code
Use IS_ERR() and PTR_ERR() instead of PTR_ERR_OR_ZERO() to
simplify the code and avoid redundant checks.
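
As a minimal userspace sketch of the pattern (not the driver change itself),
the helpers below mimic the kernel's error-pointer convention. demo_create()
and demo_setup() are hypothetical names introduced only for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers: the top
 * MAX_ERRNO addresses are reserved to carry negative errno values. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical constructor: returns an error pointer instead of NULL. */
static void *demo_create(int fail)
{
	static int object;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&object;
}

/* A single IS_ERR() test selects the error path; PTR_ERR() runs only there. */
static int demo_setup(int fail)
{
	void *obj = demo_create(fail);

	if (IS_ERR(obj))
		return (int)PTR_ERR(obj);

	/* ... obj is valid and can be used here ... */
	return 0;
}
```

With PTR_ERR_OR_ZERO() the caller typically stores the result and then tests
it again; checking IS_ERR() directly removes that second test.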

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:07 -07:00
Qiushi Wu
5a73015398 net: sun: fix missing release regions in cas_init_one().
In cas_init_one(), "pdev" is requested by "pci_request_regions", but it
was not released after a call of the function "pci_write_config_byte"
failed. Thus replace the jump target "err_write_cacheline" by
"err_out_free_res".

Fixes: 1f26dac320 ("[NET]: Add Sun Cassini driver.")
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:19:46 -07:00
Vladimir Oltean
bf655ba212 net: mscc: ocelot: fix address ageing time (again)
ocelot_set_ageing_time has 2 callers:
 - felix_set_ageing_time: from drivers/net/dsa/ocelot/felix.c
 - ocelot_port_attr_ageing_set: from drivers/net/ethernet/mscc/ocelot.c

The issue described in the fixed commit below actually happened for the
felix_set_ageing_time code path only, since ocelot_port_attr_ageing_set
was already dividing by 1000. So to make both paths symmetrical (and to
fix addresses getting aged way too fast on Ocelot), stop dividing by
1000 at caller side altogether.
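
A rough sketch of the double conversion, assuming the shared helper takes
milliseconds and converts to seconds itself. All demo_* names are
hypothetical; the real functions also program hardware, which is omitted
here.

```c
#include <stdint.h>

/* Stand-in for the shared helper: takes milliseconds, yields the number
 * of seconds that would be programmed. */
static uint32_t demo_set_ageing_time(uint32_t msecs)
{
	return msecs / 1000;
}

/* Caller that converts before calling: the value is divided twice. */
static uint32_t demo_caller_double_divide(uint32_t msecs)
{
	return demo_set_ageing_time(msecs / 1000);
}

/* Caller after the fix: milliseconds are passed through untouched. */
static uint32_t demo_caller_fixed(uint32_t msecs)
{
	return demo_set_ageing_time(msecs);
}
```

For a 300000 ms ageing time, the fixed caller yields 300 seconds, while the
double-dividing caller yields 0, which is why addresses aged far too fast.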

Fixes: c0d7eccbc7 ("net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:15:21 -07:00
Heiner Kallweit
561535b0f2 r8169: fix OCP access on RTL8117
According to the r8168 vendor driver, DASHv3 chips like RTL8168fp/RTL8117
need special addressing for OCP access.
The fix is compile-tested only due to missing test hardware.

Fixes: 1287723aa1 ("r8169: add support for RTL8117")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:14:01 -07:00
David S. Miller
593532668f Revert "net: mvneta: speed down the PHY, if WoL used, to save energy"
This reverts commit 5e3768a436.

On request from Russell King, this is a layering violation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:09:42 -07:00
Jiri Pirko
4340f42f20 mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload fails
If reload fails, mlxsw_sp->ports contains a pointer to freed
memory (freed either by reload_down() or by the reload_up() error path).
Fix this by initializing the pointer to NULL and checking it before
dereferencing in the split/unsplit/type_set call paths.

Fixes: 24cc68ad6c ("mlxsw: core: Add support for reload")
Reported-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:08:14 -07:00
Jonathan McDowell
a96ac8a004 net: ethernet: stmmac: Enable interface clocks on probe for IPQ806x
The ipq806x_gmac_probe() function enables the PTP clock but not the
appropriate interface clocks. This means that if the bootloader hasn't
done so, attempting to bring up the interface will fail with an error
like:

[   59.028131] ipq806x-gmac-dwmac 37600000.ethernet: Failed to reset the dma
[   59.028196] ipq806x-gmac-dwmac 37600000.ethernet eth1: stmmac_hw_setup: DMA engine initialization failed
[   59.034056] ipq806x-gmac-dwmac 37600000.ethernet eth1: stmmac_open: Hw setup failed

This patch, a slightly cleaned up version of one posted by Sergey
Sergeev in:

https://forum.openwrt.org/t/support-for-mikrotik-rb3011uias-rm/4064/257

correctly enables the clock; we have already configured the source just
before this.

Tested on a MikroTik RB3011.

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:07:02 -07:00
Potnuri Bharat Teja
93a09e7457 cxgb4: add adapter hotplug support for ULDs
Upon adapter hotplug, cxgb4 registers ULD devices for all the ULDs that
are already loaded, ensuring that ULD's can enumerate the hotplugged
adapter without reloading the ULD.

Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:04:01 -07:00
Edward Cree
060b6381ef net: flow_offload: simplify hw stats check handling
Make FLOW_ACTION_HW_STATS_DONT_CARE be all bits, rather than none, so that
 drivers and __flow_action_hw_stats_check can use simple bitwise checks.

Pre-fill all actions with DONT_CARE in flow_rule_alloc(), rather than
 relying on implicit semantics of zero from kzalloc, so that callers which
 don't configure action stats themselves (i.e. netfilter) get the correct
 behaviour by default.

Only the kernel's internal API semantics change; the TC uAPI is unaffected.
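
A minimal sketch of why an all-bits DONT_CARE simplifies the check. The
DEMO_* names and bit values are assumptions made for this example, not a
copy of the kernel enum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit layout (assumed for this sketch). */
enum demo_hw_stats {
	DEMO_HW_STATS_IMMEDIATE = 1u << 0,
	DEMO_HW_STATS_DELAYED   = 1u << 1,
	DEMO_HW_STATS_DISABLED  = 1u << 2,
	/* "Don't care" is every defined bit rather than zero. */
	DEMO_HW_STATS_DONT_CARE = (1u << 3) - 1,
};

/* A request is acceptable if the driver offers at least one requested
 * stats type; no special case is needed for "don't care". */
static bool demo_stats_ok(uint8_t requested, uint8_t supported)
{
	return (requested & supported) != 0;
}
```

With DONT_CARE encoded as zero, the same bitwise test would reject every
driver, so each caller needed an extra branch for that value.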

v4: move DONT_CARE setting to flow_rule_alloc() for robustness and simplicity.

v3: set DONT_CARE in nft and ct offload.

v2: rebased on net-next, removed RFC tags.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 15:52:08 -07:00
Valentin Longchamp
79dde73cf9 net/ethernet/freescale: rework quiesce/activate for ucc_geth
ugeth_quiesce/activate are used to halt the controller when there is a
link change that requires reconfiguring the mac.

The previous implementation called netif_device_detach(). This however
causes the initial activation of the netdevice to fail precisely because
it's detached. For details, see [1].

A possible workaround was the revert of commit
net: linkwatch: add check for netdevice being present to linkwatch_do_dev
However, the check introduced in the above commit is correct and shall be
kept.

The netif_device_detach() is thus replaced with
netif_tx_stop_all_queues(), which prevents any transmission. This allows
performing the mac config change required by the link change without
detaching the corresponding netdevice, and thus without preventing its
initial activation.

[1] https://lists.openwall.net/netdev/2020/01/08/201

Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
Acked-by: Matteo Ghidoni <matteo.ghidoni@ch.abb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 15:50:34 -07:00
Russell King
3138a07ce2 net: mvpp2: fix RX hashing for non-10G ports
When rxhash is enabled on any ethernet port except the first in each CP
block, traffic flow is prevented.  The analysis is below:

I've been investigating this afternoon, and what I've found, comparing
a kernel without 895586d5dc and with 895586d5dc applied is:

- The table programmed into the hardware via mvpp22_rss_fill_table()
  appears to be identical with or without the commit.

- When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports
  that c2.attr[0] and c2.attr[2] are written back containing:

   - with 895586d5dc, failing:    00200000 40000000
   - without 895586d5dc, working: 04000000 40000000

- When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as:

   04000000 00000000

The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the
first value is the queue number, which comprises two fields. The high
5 bits are 24:29 and the low three are 21:23 inclusive. This comes
from:

       c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) |
                     MVPP22_CLS_C2_ATTR0_QLOW(ql);

So, the working case gives eth2 a queue id of 4.0, or 32 as per
port->first_rxq, and the non-working case a queue id of 0.1, or 1.
The allocation of queue IDs seems to be in mvpp2_port_probe():

        if (priv->hw_version == MVPP21)
                port->first_rxq = port->id * port->nrxqs;
        else
                port->first_rxq = port->id * priv->max_port_rxqs;

Where:

        if (priv->hw_version == MVPP21)
                priv->max_port_rxqs = 8;
        else
                priv->max_port_rxqs = 32;

Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1
(eth2) be 32. It seems the idea is that the first 32 queues belong to
port 0, the second 32 queues belong to port 1, etc.
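
To make the queue arithmetic above concrete, here is a small sketch that
packs and unpacks the attr[0] queue fields and computes first_rxq. The
demo_* names are hypothetical; the 5-bit and 3-bit widths and the shifts of
24 and 21 are taken from the description above.

```c
#include <stdint.h>

#define DEMO_QHIGH_SHIFT 24    /* queue-high: 5 bits starting at bit 24 */
#define DEMO_QHIGH_MASK  0x1fu
#define DEMO_QLOW_SHIFT  21    /* queue-low: 3 bits starting at bit 21 */
#define DEMO_QLOW_MASK   0x7u

/* Pack a queue number into the attr[0] layout. */
static uint32_t demo_attr0_from_queue(uint32_t queue)
{
	uint32_t qh = (queue >> 3) & DEMO_QHIGH_MASK;
	uint32_t ql = queue & DEMO_QLOW_MASK;

	return (qh << DEMO_QHIGH_SHIFT) | (ql << DEMO_QLOW_SHIFT);
}

/* Recover the queue number (high.low) from an attr[0] value. */
static uint32_t demo_queue_from_attr0(uint32_t attr0)
{
	uint32_t qh = (attr0 >> DEMO_QHIGH_SHIFT) & DEMO_QHIGH_MASK;
	uint32_t ql = (attr0 >> DEMO_QLOW_SHIFT) & DEMO_QLOW_MASK;

	return (qh << 3) | ql;
}

/* First RX queue of a port, as in the probe snippet quoted above. */
static uint32_t demo_first_rxq(uint32_t port_id, uint32_t max_port_rxqs)
{
	return port_id * max_port_rxqs;
}
```

Decoding 04000000 gives queue 32 (4.0), which matches demo_first_rxq(1, 32),
and decoding 00200000 gives queue 1 (0.1).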

mvpp2_rss_port_c2_enable() gets the queue number from its parameter,
'ctx', which comes from mvpp22_rss_ctx(port, 0). This returns
port->rss_ctx[0].

mvpp22_rss_context_create() is responsible for allocating that, which
it does by looking for an unallocated priv->rss_tables[] pointer. This
table is shared amongst all ports on the CP silicon.

When we write the tables in mvpp22_rss_fill_table(), the RSS table
entry is defined by:

                u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) |
                          MVPP22_RSS_INDEX_TABLE_ENTRY(i);

where rss_ctx is the context ID (queue number) and i is the index in
the table.

If we look at what is written:

- The first table to be written has "sel" values of 00000000..0000001f,
  containing values 0..3. This appears to be for eth1. This is table 0,
  RX queue number 0.
- The second table has "sel" values of 00000100..0000011f, and appears
  to be for eth2.  These contain values 0x20..0x23. This is table 1,
  RX queue number 0.
- The third table has "sel" values of 00000200..0000021f, and appears
  to be for eth3.  These contain values 0x40..0x43. This is table 2,
  RX queue number 0.
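
The "sel" ranges listed above are consistent with the table index sitting at
bit 8 and the entry index in the low bits. A sketch under that assumption
(demo_rss_sel() is a hypothetical helper, and the shift of 8 is inferred
from the values quoted above):

```c
#include <stdint.h>

/* Compose the RSS index selector from a table (context) number and an
 * entry index within that table. */
static uint32_t demo_rss_sel(uint32_t rss_ctx, uint32_t entry)
{
	return (rss_ctx << 8) | entry;
}
```

For example, table 1 spans demo_rss_sel(1, 0) = 0x100 through
demo_rss_sel(1, 0x1f) = 0x11f, matching the second range in the list.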

How do queue numbers translate to the RSS table?  There is another
table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE field
of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE
register. Before 895586d5dc, it was:

       mvpp2_write(priv, MVPP22_RSS_INDEX,
                   MVPP22_RSS_INDEX_QUEUE(port->first_rxq));
       mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE,
                   MVPP22_RSS_TABLE_POINTER(port->id));

and after:

       mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx));
       mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(ctx));

Before the commit, for eth2, that would've contained '32' for the
index and '1' for the table pointer - mapping queue 32 to table 1.
Remember that this is queue-high.queue-low of 4.0.

After the commit, we appear to map queue 1 to table 1. That again
looks fine on the face of it.

Section 9.3.1 of the A8040 manual seems to indicate the reason that the
queue number is separated. queue-low seems to always come from the
classifier, whereas queue-high can be from the ingress physical port
number or the classifier depending on the MVPP2_CLS_SWFWD_PCTRL_REG.

We set the port bit in MVPP2_CLS_SWFWD_PCTRL_REG, meaning that queue-high
comes from the MVPP2_CLS_SWFWD_P2HQ_REG() register... and this seems to
be where our bug comes from.

mvpp2_cls_oversize_rxq_set() sets this up as:

        mvpp2_write(port->priv, MVPP2_CLS_SWFWD_P2HQ_REG(port->id),
                    (port->first_rxq >> MVPP2_CLS_OVERSIZE_RXQ_LOW_BITS));

        val = mvpp2_read(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG);
        val |= MVPP2_CLS_SWFWD_PCTRL_MASK(port->id);
        mvpp2_write(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG, val);

Setting the MVPP2_CLS_SWFWD_PCTRL_MASK bit means that the queue-high
for eth2 is _always_ 4, so only queues 32 through 39 inclusive are
available to eth2. Yet, we're trying to tell the classifier to set
queue-high, which will be ignored, to zero. Hence, the queue-high
field (MVPP22_CLS_C2_ATTR0_QHIGH()) from the classifier will be
ignored.

This means we end up directing traffic from eth2 not to queue 1, but
to queue 33, and then we tell it to look up queue 33 in the RSS table.
However, RSS table has not been programmed for queue 33, and so it ends
up (presumably) dropping the packets.
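
The redirection from queue 1 to queue 33 can be reproduced with a short
sketch. demo_effective_queue() is a hypothetical model of the behaviour
described above, assuming a 3-bit queue-low field and a per-port queue-high
of first_rxq >> 3.

```c
#include <stdint.h>

/* Model of the queue selection: when the per-port bit is set, queue-high
 * comes from the port register and the classifier's queue-high is
 * ignored; queue-low always comes from the classifier. */
static uint32_t demo_effective_queue(int port_forces_qhigh,
				     uint32_t port_qhigh,
				     uint32_t cls_queue)
{
	uint32_t ql = cls_queue & 0x7u;
	uint32_t qh = port_forces_qhigh ? port_qhigh
					: ((cls_queue >> 3) & 0x1fu);

	return (qh << 3) | ql;
}
```

For eth2, port_qhigh is 32 >> 3 = 4. With the bit set, a classifier queue of
1 becomes (4 << 3) | 1 = 33; with the bit cleared it stays 1.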

It seems that mvpp22_rss_context_create() doesn't take account of the
fact that the upper 5 bits of the queue ID can't actually be changed
due to the settings in mvpp2_cls_oversize_rxq_set(), _or_ it seems that
mvpp2_cls_oversize_rxq_set() has been missed in this commit. Either
way, these two functions mutually disagree with what queue number
should be used.

Looking deeper into what mvpp2_cls_oversize_rxq_set() and the MTU
validation is doing, it seems that MVPP2_CLS_SWFWD_P2HQ_REG() is used
for over-sized packets attempting to egress through this port. With
the classifier having had RSS enabled and directing eth2 traffic to
queue 1, we may still have packets appearing on queue 32 for this port.

However, the only way we may end up with over-sized packets attempting
to egress through eth2 - is if the A8040 forwards frames between its
ports. From what I can see, we don't support that feature, and the
kernel restricts the egress packet size to the MTU. In any case, if we
were to attempt to transmit an oversized packet, we have no support in
the kernel to deal with that appearing in the port's receive queue.

So, this patch attempts to solve the issue by clearing the
MVPP2_CLS_SWFWD_PCTRL_MASK() bit, allowing MVPP22_CLS_C2_ATTR0_QHIGH()
from the classifier to define the queue-high field of the queue number.

My testing seems to confirm my findings above - clearing this bit
means that if I enable rxhash on eth2, the interface can then pass
traffic, as we are now directing traffic to RX queue 1 rather than
queue 33. Traffic still seems to work with rxhash off as well.

Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Fixes: 895586d5dc ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 15:31:43 -07:00
Bartosz Golaszewski
8c7bd5a454 net: ethernet: mtk-star-emac: new driver
This adds the driver for the MediaTek STAR Ethernet MAC currently used
on the MT8* SoC family. For now we only support full-duplex.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:20:11 -07:00
Bartosz Golaszewski
22f076a279 net: ethernet: mediatek: remove unnecessary spaces from Makefile
The Makefile formatting in the kernel tree usually doesn't use tabs,
so remove them before we add a second driver.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:20:11 -07:00
Bartosz Golaszewski
d3d6974bc5 net: ethernet: mediatek: rename Kconfig prompt
We'll soon be adding a second MediaTek Ethernet driver, so modify the
Kconfig prompt.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:20:11 -07:00
Arthur Kiyanovski
4bb7f4cf60 net: ena: reduce driver load time
This commit reduces the driver load time by using usec resolution
instead of msec when polling for hardware state change.

Also add back-off mechanism to handle cases where minimal sleep
time is not enough.
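
One common way to implement such a back-off is to double the sleep time on
each retry up to a cap. The sketch below shows only that idea;
demo_backoff_us() and its parameters are hypothetical and are not taken
from the driver.

```c
#include <stdint.h>

/* Delay for the given retry: base_us doubled once per attempt, capped at
 * max_us. The shift is clamped so it cannot overflow. */
static uint32_t demo_backoff_us(uint32_t base_us, uint32_t attempt,
				uint32_t max_us)
{
	uint64_t delay;

	if (attempt > 31)
		attempt = 31;

	delay = (uint64_t)base_us << attempt;

	return delay > max_us ? max_us : (uint32_t)delay;
}
```

With a 100 us base and a 5000 us cap, attempts 0, 3 and 10 give 100, 800 and
5000 us respectively.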

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
b0ae3ac484 net: ena: cosmetic: minor code changes
1. Use BIT macro instead of shift operator for code clarity
2. Replace multiple flag assignments to a single assignment of multiple
   flags in ena_com_add_single_rx_desc()
3. Move ENA_HASH_KEY_SIZE from ena_netdev.h to ena_com.h

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
6d0862e0ec net: ena: cosmetic: fix spacing issues
1. Add leading and trailing spaces to several comments for better
   readability
2. Make tabs and spaces uniform in enum defines in ena_admin_defs.h

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
0a39a35f3f net: ena: cosmetic: code reorderings
1. Reorder sanity checks in get_comp_ctxt() to make more sense
2. Reorder variables in ena_com_fill_hash_function() and
   ena_calc_io_queue_size() in reverse christmas tree.
3. Move around member initializations.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
f302044747 net: ena: cosmetic: remove unnecessary code
1. Remove unused definition of DRV_MODULE_VERSION
2. Remove {} from single line-of-code ifs
3. Remove unnecessary comments from ena_get/set_coalesce()
4. Remove unnecessary extra spaces and newlines

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
46143e5888 net: ena: cosmetic: fix line break issues
1. Join unnecessarily broken short lines in ena_com.c ena_netdev.c
2. Fix Indentations of broken lines

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
13830937cc net: ena: cosmetic: fix spelling and grammar mistakes in comments
fix spelling and grammar mistakes in comments in ena_com.h,
ena_com.c and ena_netdev.c

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
ba6f6b4191 net: ena: cosmetic: set queue sizes to u32 for consistency
Make all variables that convey the number and size of queues
u32, for consistency with the API between the driver and device via
ena_admin_defs.h:ena_admin_get_feat_resp.max_queue_ext fields. Current
code sometimes uses int and there are multiple assignments between these
variables with different types.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
95d0fcb570 net: ena: cosmetic: rename ena_update_tx/rx_rings_intr_moderation()
Rename ena_update_tx/rx_rings_intr_moderation() to
ena_update_tx/rx_rings_nonadaptive_intr_moderation()
to distinguish between adaptive and non-adaptive interrupt moderation.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
da447b3b54 net: ena: simplify ena_com_update_intr_delay_resolution()
Initialize prev_intr_delay_resolution with ena_dev->intr_delay_resolution
unconditionally, since it is initialized with
ENA_DEFAULT_INTR_DELAY_RESOLUTION in ena_probe(). This approach makes much
more sense than handling errors of not initializing it.

Also added unlikely to if condition.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
adb3fb3889 net: ena: fix ena_com_comp_status_to_errno() return value
Default return value should be -EINVAL since the input
in this case was unexpected.
Also remove the now redundant check in the beginning
of the function.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
f391503b7a net: ena: use explicit variable size for clarity
Use u64 instead of unsigned long long for clarity

Signed-off-by: Shai Brandes <shaibran@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
7cfe9a5593 net: ena: rename ena_com_free_desc to make API more uniform
Rename ena_com_free_desc to ena_com_free_q_entries to match
the LLQ mode.

In non-LLQ mode, an entry in an IO ring corresponds to
a descriptor. In LLQ mode an entry may correspond to several
descriptors (per LLQ definition).

Signed-off-by: Igor Chauskin <igorch@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Arthur Kiyanovski
68f236df93 net: ena: add support for the rx offset feature
Newer ENA devices can write data to rx buffers with an offset
from the beginning of the buffer.

This commit adds support for this feature in the driver.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:12:48 -07:00
Mark Starovoytov
40f05e5b0d net: atlantic: proper rss_ctrl1 (54c0) initialization
This patch fixes an inconsistency between code and spec, which
was found while working on the QoS implementation.

When 8TCs are used, 2 is the maximum supported number of index bits.
In a 4TC mode, we do support 3, but we shouldn't really use the bytes,
which are intended for the 8TC mode.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:29 -07:00
Mark Starovoytov
2deac71ac4 net: atlantic: QoS implementation: min_rate
This patch adds support for mqprio min_rate limiters.

A2 HW supports Weighted Strict Priority (WSP) arbitration for Tx Descriptor
Queue scheduling among TCs, which can be used for min_rate shaping.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:29 -07:00
Mark Starovoytov
b64f2ac995 net: atlantic: change the order of arguments for TC weight/credit setters
This patch changes the order of arguments for TC weight/credit setter
functions.
Having the "value to be set" on the right is slightly more robust in
the sense that it's more natural for humans, so it's a bit more
error-proof this way.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:29 -07:00
Mark Starovoytov
5479e8436f net: atlantic: always use random TC-queue mapping for TX on A2.
This patch changes the TC-queue mapping mechanism used on A2.
Configure the A2 HW in such a way that we can keep queue index mapping
exactly as it was on A1.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
Mark Starovoytov
14ef766b13 net: atlantic: automatically downgrade the number of queues if necessary
This patch adds support for automatic queue number downgrade.

On A2: this is a must have, because only TC0/TC1 support more than 4Q.
Other TCs support 4Qs maximum.
Thus, on A2 we must downgrade the number of queues per TC to 4, if more
than 2 TCs are requested.

On A1: this allows using 8TCs even on systems with cpu count >= 8, when
we have 8 queues by default.
We will just automatically switch to 8TCx4Q mode in this case.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
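The A2 downgrade rule from 14ef766b13 can be sketched as follows. This is a minimal Python model under stated assumptions, not the driver code: the function name `queues_per_tc` and both constants are hypothetical, and only the rule given in the message is modeled (more than 4 queues per TC is possible only while no more than 2 TCs are requested).

```python
# Hypothetical model of the A2 queue downgrade rule described in the
# commit message. None of these names come from the atlantic driver.

A2_WIDE_TCS = 2          # per the message: only TC0/TC1 support more than 4 queues
A2_NARROW_TC_QUEUES = 4  # per the message: other TCs support 4 queues maximum


def queues_per_tc(requested_queues: int, num_tcs: int) -> int:
    """Return the per-TC queue count after the automatic downgrade."""
    if num_tcs > A2_WIDE_TCS and requested_queues > A2_NARROW_TC_QUEUES:
        # More than 2 TCs requested: fall back to 4 queues per TC.
        return A2_NARROW_TC_QUEUES
    return requested_queues
```

With 8 queues requested, two TCs keep all 8 queues each, while four TCs are downgraded to 4 queues each.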
Mark Starovoytov
7327699f35 net: atlantic: QoS implementation: max_rate
This patch adds initial support for mqprio rate limiters (max_rate only).

Atlantic HW supports Rate-Shaping for time-sensitive traffic at per
Traffic Class (TC) granularity.
Target rate is defined by:
* nominal link rate (always 10G);
* rate factor (ratio between nominal rate and max allowed).

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
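The rate factor named in 7327699f35 is a plain ratio, which a short worked example makes concrete. This is a sketch in Python, assuming rates are expressed in Mbit/s; the function name `rate_factor` is hypothetical, and how the hardware encodes the factor in its registers is not stated in the message and is not modeled here.

```python
# Hypothetical illustration of "rate factor = nominal rate / max allowed".

NOMINAL_LINK_RATE_MBPS = 10_000  # the message states the nominal rate is always 10G


def rate_factor(max_rate_mbps: float) -> float:
    """Ratio between the nominal link rate and the maximum allowed rate."""
    if max_rate_mbps <= 0:
        raise ValueError("max_rate must be positive")
    return NOMINAL_LINK_RATE_MBPS / max_rate_mbps
```

For example, limiting a TC to 2.5G gives a factor of 4, and leaving it at the full 10G gives a factor of 1.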
Mark Starovoytov
b9e989262a net: atlantic: make TCVEC2RING accept nic_cfg
This patch updates TCVEC2RING to accept nic_cfg, which is needed to be able
to use it from hw_atl.
The name is updated to reflect the changes.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
Mark Starovoytov
4272ba8b11 net: atlantic: per-TC queue statistics
This patch adds support for per-TC queue statistics.

By default (single TC), the output is the same as it used to be, e.g.:
     Queue[0] InPackets: 2
     Queue[0] OutPackets: 8
     Queue[0] Restarts: 0
     Queue[0] InJumboPackets: 0
     Queue[0] InLroPackets: 0
     Queue[0] InErrors: 0

If several TCs are enabled, then each queue statistics line is prefixed
with TC number, e.g.:
     TC0 Queue[0] InPackets: 6
     TC0 Queue[0] OutPackets: 11
Queue numbering is end-to-end, so:
     TC1 Queue[4] InPackets: 0
     TC1 Queue[4] OutPackets: 22

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
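The labelling scheme from 4272ba8b11 can be sketched as below. This is a Python illustration only: `queue_index` and `stat_label` are hypothetical helpers, and the sketch assumes every TC owns the same number of queues (4 in the message's example, where TC1 starts at Queue[4]).

```python
# Hypothetical helpers reproducing the statistics labels shown in the
# commit message. They are not taken from the atlantic driver.
from typing import Optional


def queue_index(tc: int, queue_in_tc: int, queues_per_tc: int) -> int:
    """End-to-end queue number: queues of TC n follow those of TC n-1."""
    return tc * queues_per_tc + queue_in_tc


def stat_label(queue: int, name: str, tc: Optional[int] = None) -> str:
    """Build one label; the TC prefix appears only when several TCs are enabled."""
    prefix = f"TC{tc} " if tc is not None else ""
    return f"{prefix}Queue[{queue}] {name}"
```

Passing `tc=None` models the single-TC default, which keeps the old output format unchanged.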
Dmitry Bezrukov
a83fe6b6ad net: atlantic: QoS implementation: multi-TC support
This patch adds multi-TC support.

PTP is automatically disabled when the user enables more than 2 TCs,
otherwise traffic on TC2 won't quite work, because it's reserved for PTP.

Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Co-developed-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
Dmitry Bezrukov
0aa7bc3ee4 net: atlantic: changes for multi-TC support
This patch contains the following changes:
* add cfg->is_ptp (used for PTP enable/disable switch, which
  is described in more detail below);
* add cfg->tc_mode (A1 supports 2 HW modes only);
* setup queue to TC mapping based on TC mode on A2;
* remove hw_tx_tc_mode_get / hw_rx_tc_mode_get hw_ops.

In the first generation of our hardware (A1), a whole traffic class is
consumed for PTP handling in FW (FW uses it to send the ptp data and to
send back timestamps).
The 'is_ptp' flag introduced in this patch will be used to automatically
disable PTP when a conflicting configuration is detected, e.g. when
multiple TCs are enabled.

Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Co-developed-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
Dmitry Bezrukov
593dd0fc20 net: atlantic: move PTP TC initialization to a separate function
This patch moves the PTP TC initialization into a separate function.

Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Co-developed-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
Dmitry Bezrukov
8ce8427169 net: atlantic: changes for multi-TC support
This patch contains the following changes:
* access cfg via aq_nic_get_cfg() in aq_nic_start() and aq_nic_map_skb();
* call aq_nic_get_dev() just once in aq_nic_map_skb();
* move ring allocation/deallocation out of aq_vec_alloc()/aq_vec_free();
* add the missing aq_nic_deinit() in atl_resume_common();
* rename 'tcs' field to 'tcs_max' in aq_hw_caps_s to differentiate it from
  the 'tcs' field in aq_nic_cfg_s, which is used for the current number of
  TCs;
* update _TC_MAX defines to the actual number of supported TCs;
* move tx_tc_mode register defines slightly higher (just to keep the order
  of definitions);
* separate variables for TX/RX buff_size in hw_atl*_hw_qos_set();
* use AQ_HW_*_TC instead of hardcoded magic numbers;
* actually use the 'ret' value in aq_mdo_add_secy();

Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Co-developed-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:08:28 -07:00
David S. Miller
59b8d27705 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-05-21

This series contains updates to ice driver only.  Several of the changes
are fixes, which could be backported to stable, of which, only one was
marked for stable because of the memory leak potential.

Jake exposes the information in the flash memory used for link
management, which is called the netlist module.

Henry and Tony add support for tunnel offloads.

Brett adds promiscuous support in VFs, which is based on VF trust and
the new vf-true-promisc flag.

Avinash fixes an issue where a transmit timeout for a queue that belongs
to a PFC enabled TC is not a true transmit timeout, but occurs because
PFC is in action.

Dave fixes the check for contiguous TCs to allow for various UP2TC
mapping configurations.  Also fixed an issue where changing the pause
parameters would cause multiple link drops in succession, which in
turn caused the firmware to not generate a link interrupt for the driver
to respond to.

Anirudh (Ani) fixed a potential race condition in probe/open due to a
bit being cleared too early.

Lihong updates an error message to make it more meaningful instead of
just printing out the numerical value of the status/error code.  Also
fixed an incorrect return value if deleting a filter does not find a
match to delete or when adding a filter that already exists.

Karol fixes casting issues and precision loss in the driver.

Jesse makes the sign usage more consistent in the driver by making sure
all instances of vf_id are unsigned, since it can never be negative.

Eric fixes a potential memory leak in ice_add_prof_id_vsig() where
resources were not cleaned up properly when an error occurs.

Michal, to help organize the filtering code in the driver, refactors the
code into a separate file and adds functions to prepare the filter
information.

Bruce cleaned up a conditional statement that always resulted in true
and provided a comment to make it more obvious.  Also cleaned up
redundant code checks.

Tony helps with potential namespace issues by prepending the driver name
to an 'ice' specific function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:05:05 -07:00
David S. Miller
7b1b843a1e Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-05-21

This series contains updates to igc and e1000.

Andre cleans up code that was left over from the igb driver that handled
MAC address filters based on the source address, which is not currently
supported.  Simplifies the MAC address filtering code and prepare the
igc driver for future source address support.  Updated the MAC address
filter internal APIs to support filters based on source address.  Added
support for Network Flow Classification (NFC) rules based on source MAC
address.  Cleaned up the 'cookie' field which is not used anywhere in
the code and cleaned up a wrapper function that was not needed.
Simplified the filtering code for readability and aligned the ethtool
functions, so that function names were consistent.

Alex provides a fix for e1000 to resolve a deadlock issue when NAPI is
being disabled.

Sasha does additional cleanup of the igc driver of dead code that is not
used or needed.

v2: Fix the function header comment in patch 3 of the series, based on
    the feedback from Jakub Kicinski.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 13:48:50 -07:00
Tony Nguyen
5757cc7c8b ice: Rename build_ctob to ice_build_ctob
To make the function easier to identify as being part of the ice driver,
prepend ice to the function name.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Bruce Allan
c522d1f686 ice: remove unnecessary backslash
Self-explanatory.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Bruce Allan
86a2e00d20 ice: remove unnecessary check
The variable status cannot be zero due to a prior check of it; remove this
check.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Bruce Allan
92ace4824c ice: remove unnecessary expression that is always true
The else conditional expression is always true due to the if conditional
expression; remove it and add a comment to make it obvious still.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Lihong Yang
757976ab16 ice: Fix check for removing/adding mac filters
In function ice_set_mac_address, we remove the old dev_addr before
adding the new MAC. While removing and adding the MAC, there is no
need to return an error if the check finds that the to-be-removed
dev_addr does not exist in the MAC filter list or that the to-be-added
MAC already exists; keep going or return success accordingly.

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
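The tolerant remove-then-add flow described in 757976ab16 can be modeled as below. This is a Python sketch, not the driver's code: the status strings and the helpers `remove_filter`, `add_filter` and `set_mac_address` are hypothetical stand-ins, and the filter list is modeled as a plain set.

```python
# Hypothetical model: a missing old address or an already present new
# address is not treated as a failure when changing the MAC.

OK = "ok"
DOES_NOT_EXIST = "does-not-exist"
ALREADY_EXISTS = "already-exists"


def remove_filter(filters: set, mac: str) -> str:
    if mac not in filters:
        return DOES_NOT_EXIST
    filters.remove(mac)
    return OK


def add_filter(filters: set, mac: str) -> str:
    if mac in filters:
        return ALREADY_EXISTS
    filters.add(mac)
    return OK


def set_mac_address(filters: set, old_mac: str, new_mac: str) -> str:
    status = remove_filter(filters, old_mac)
    if status not in (OK, DOES_NOT_EXIST):
        return status            # a real failure: give up
    status = add_filter(filters, new_mac)
    if status not in (OK, ALREADY_EXISTS):
        return status            # a real failure: give up
    return OK                    # benign mismatches count as success
```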
Michal Swiatkowski
1b8f15b64a ice: refactor filter functions
Move filter functions to separate file.

Add functions that prepare suitable ice_fltr_info struct
depending on the filter type and add this struct to earlier created
list:
- ice_fltr_add_mac_to_list
- ice_fltr_add_vlan_to_list
- ice_fltr_add_eth_to_list
These functions are used in adding and removing filters.

Create wrappers for functions mentioned above that alloc list,
add suitable ice_fltr_info to it and call add or remove function.
- ice_fltr_prepare_mac
- ice_fltr_prepare_mac_and_broadcast
- ice_fltr_prepare_vlan
- ice_fltr_prepare_eth

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Eric Joyner
857a4f0e9f ice: Fix resource leak on early exit from function
Memory allocated in the ice_add_prof_id_vsig() function wasn't being
properly freed if an error occurred inside the for-loop in the function.

In particular, 'p' wasn't being freed if an error occurred before it was
added to the resource list at the end of the for-loop.

Signed-off-by: Eric Joyner <eric.joyner@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Jesse Brandeburg
53bb66983f ice: cleanup vf_id signedness
The vf_id variable is dealt with in the code in inconsistent
ways of sign usage, preventing compilation with -Werror=sign-compare.
Fix this problem in the code by always treating vf_id as unsigned, since
there are no valid values of vf_id that are negative.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Karol Kolacinski
88865fc4bb ice: Fix casting issues
Change min() macros to min_t() which has compare type specified and it
helps avoid precision loss.

In some cases there was precision loss during calls or assignments.
Some fields in structs were unnecessarily large and gave multiple
warnings.

There were also some minor type differences which are now fixed as well as
some cases where a simple cast was needed.

Callers were passing data that is a u16 to
ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8.
Fix that by changing the function to take a u16.

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
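The precision loss described in 88865fc4bb is the usual effect of narrowing an unsigned integer. The Python sketch below models C's u8 and u16 conversions with bit masks; `as_u8` and `as_u16` are hypothetical helpers and the value 300 is only an example, not a figure from the driver.

```python
# Model of C unsigned narrowing: converting to an N-bit unsigned type
# keeps only the low N bits of the value.


def as_u8(value: int) -> int:
    """What a C u8 parameter would hold after receiving `value`."""
    return value & 0xFF


def as_u16(value: int) -> int:
    """What a C u16 parameter would hold after receiving `value`."""
    return value & 0xFFFF


bw_alloc = 300                  # fits in a u16, does not fit in a u8
truncated = as_u8(bw_alloc)     # silently becomes 44 (300 - 256)
preserved = as_u16(bw_alloc)    # stays 300 once the parameter is a u16
```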
Lihong Yang
0fee35774d ice: Provide more meaningful error message
When printing the ice status or AQ error codes, instead of printing out the
numerical value, provide the description of the error code. This provides
more info about the issue than a number.

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Anirudh Venkataramanan
de75135b5c ice: Fix probe/open race condition
As soon as the driver registers the PF netdev, userspace utilities
like NetworkManager try to bring up the associated interface. When
this happens, the driver may not have finished initializing fully,
resulting in a bunch of errors in the interface up flow.

The driver already has a mechanism to indicate if it's not up yet,
by setting the __ICE_DOWN bit in pf->state, but this bit gets
cleared too early in the current flow. So clear this bit only when
the driver is fully up. Also check for the same bit in the ice_open
flow, and return -EBUSY if the bit is set.

Also in ice_open, replace references of vsi->back with a local
variable.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
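The ordering fix in de75135b5c can be illustrated with a small state model. This is a Python sketch under stated assumptions: the class `PfState` and the functions `finish_probe` and `open_interface` are hypothetical, and a boolean stands in for the __ICE_DOWN bit in pf->state.

```python
# Hypothetical model: the "down" flag stays set until initialization has
# completed, and the open path refuses to run while it is set.
import errno


class PfState:
    def __init__(self) -> None:
        self.down = True  # models __ICE_DOWN: set while the driver is not fully up


def finish_probe(pf: PfState) -> None:
    """Last step of probe: only now is the down flag cleared."""
    pf.down = False


def open_interface(pf: PfState) -> int:
    """Model of the open path: bail out with -EBUSY if the driver is not up yet."""
    if pf.down:
        return -errno.EBUSY
    return 0
```

A userspace tool that tries to bring the interface up before `finish_probe()` has run gets -EBUSY and can retry later, instead of racing with initialization.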
Dave Ertman
46a316500e ice: only drop link once when setting pauseparams
Currently, the ice driver is setting a PHY configuration,
which causes a link drop, and then additionally it calls
for a nway_reset, which restarts auto-negotiation on the
link, which also causes a link drop.  These two link
events in such close timing are causing the FW to not be
able to generate a link interrupt for the driver to
respond to.

Remove the unnecessary auto-negotiation restart from the
set pauseparams flow.  Also remove error path that
would have performed an ice_down/ice_up as that is
also unnecessary.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Dave Ertman
891540024b ice: Fix check for contiguous TCs
The current implementation for contiguous TC check
is assuming that the UPs will be mapped to TCs in
a linear progressing fashion.  This is obviously
not always true.

Change the check to allow for various UP2TC mapping
configurations.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
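One way to check TC contiguity without assuming a linear UP2TC mapping, as 891540024b calls for, is to collect the set of TCs in use first and then look for gaps. The Python sketch below shows the idea only; it is not claimed to be the driver's implementation. `tcs_contiguous` and `MAX_TC` are hypothetical names, and the sketch assumes 8 user priorities and that the used TCs must start at TC0.

```python
# Hypothetical contiguity check over a UP -> TC table.

MAX_TC = 8  # assumption: TC numbers range from 0 to 7


def tcs_contiguous(up2tc: list) -> bool:
    """True when the TCs referenced by the table form the range 0..n with no gap."""
    used = 0
    for tc in up2tc:
        used |= 1 << tc          # bitmap of TCs that any UP maps to

    found_gap = False
    for tc in range(MAX_TC):
        if used & (1 << tc):
            if found_gap:
                return False     # a used TC after an unused one: not contiguous
        else:
            found_gap = True
    return True
```

The order in which UPs reference the TCs no longer matters: `[0, 2, 1, 1, 0, 0, 0, 0]` passes, while `[0, 2, 2, 0, 0, 0, 0, 0]` fails because TC1 is skipped.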
Avinash JD
610ed0e93e ice: Don't reset and rebuild for Tx timeout on PFC enabled queue
When there's a Tx timeout for a queue which belongs to a PFC enabled TC,
then it's not because the queue is hung but because PFC is in action.

In PFC, peer sends a pause frame for a specified period of time when its
buffer threshold is exceeded (due to congestion). Netdev on the other
hand checks if ACK is received within a specified time for a TX packet, if
not, it'll invoke the tx_timeout routine.

Signed-off-by: Avinash JD <avinash.dayanand@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:03 -07:00
Brett Creeley
01b5e89aab ice: Add VF promiscuous support
Implement promiscuous support for VF VSIs. Behaviour of promiscuous support
is based on VF trust as well as the newly introduced vf-true-promisc flag.

A trusted VF with vf-true-promisc disabled will be the default VSI, which
means that all traffic without a matching destination MAC address in the
device's internal switch will be forwarded to this VF VSI.

A trusted VF with vf-true-promisc enabled will go into "true promiscuous
mode". This amounts to the VF receiving all ingress and egress traffic
that hits the device's internal switch.

An untrusted VF will only receive traffic destined for that VF.

The vf-true-promisc-support flag cannot be toggled while any VF is in
promiscuous mode. This flag should be set prior to loading the iavf driver
or spawning VF(s).

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:03 -07:00
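The three cases listed in 01b5e89aab reduce to a small decision table. The Python sketch below encodes it; the function `vf_promisc_mode` and the returned mode names are hypothetical labels chosen for illustration, and the sketch covers only a VF that has requested promiscuous mode.

```python
# Hypothetical decision table for a VF that requests promiscuous mode.


def vf_promisc_mode(trusted: bool, true_promisc_flag: bool) -> str:
    if not trusted:
        # Untrusted VF: only traffic destined for that VF.
        return "own-traffic-only"
    if true_promisc_flag:
        # Trusted VF, flag enabled: all ingress and egress traffic that
        # hits the device's internal switch.
        return "true-promiscuous"
    # Trusted VF, flag disabled: the VF becomes the default VSI and gets
    # traffic with no matching destination MAC in the internal switch.
    return "default-vsi"
```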
Tony Nguyen
a4e82a81f5 ice: Add support for tunnel offloads
Create a boost TCAM entry for each tunnel port in order to get a tunnel
PTYPE. Update netdev feature flags and implement the appropriate logic to
get and set values for hardware offloads.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:03 -07:00
Jacob Keller
f45a645fa6 ice: report netlist version in .info_get
The flash memory for the ice hardware contains a block of information
used for link management called the Netlist module.

As this essentially represents another section of firmware, add its
version information to the output of the driver's .info_get handler.

This includes both a version and the first few bytes of a hash of the
module contents.

  fw.netlist -> the version information extracted from the netlist module
  fw.netlist.build -> first 4 bytes of the hash of the contents, similar
                      to fw.mgmt.build

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:03 -07:00
Björn Töpel
39d6443c8d mlx5, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL
Use the new MEM_TYPE_XSK_BUFF_POOL API in lieu of MEM_TYPE_ZERO_COPY in
mlx5e. It allows dropping a lot of code from the driver (which is now
common in AF_XDP core and was related to XSK RX frame allocation, DMA
mapping, etc.) and slightly improve performance (RX +0.8 Mpps, TX +0.4
Mpps).

rfc->v1: Put back the sanity check for XSK params, use XSK API to get
         the total headroom size. (Maxim)

v1->v2: Fix DMA address handling, set XDP metadata to invalid. (Maxim)

v2->v3: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use xsk_buff
        API for DMA sync on TX, add performance numbers. (Maxim)

v3->v4: Remove unused variable num_xsk_frames. (Jakub)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-12-bjorn.topel@gmail.com
2020-05-21 17:31:27 -07:00
Björn Töpel
7117132b22 ixgbe, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs.

v1->v2: Fixed xdp_buff data_end update. (Björn)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-11-bjorn.topel@gmail.com
2020-05-21 17:31:27 -07:00
Björn Töpel
175fc43067 ice, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs.

v4->v5: Fixed "warning: Excess function parameter 'alloc' description
        in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function
        parameter 'xdp' description in
        'ice_construct_skb_zc'". (Jakub)

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-10-bjorn.topel@gmail.com
2020-05-21 17:31:26 -07:00
Björn Töpel
3b4f0b66c2 i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs. The AF_XDP zero-copy rx_bi ring is now simply a struct xdp_buff
pointer.

v4->v5: Fixed "warning: Excess function parameter 'bi' description in
        'i40e_construct_skb_zc'". (Jakub)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-9-bjorn.topel@gmail.com
2020-05-21 17:31:26 -07:00
Björn Töpel
be1222b585 i40e: Separate kernel allocated rx_bi rings from AF_XDP rings
Continuing the path to support MEM_TYPE_XSK_BUFF_POOL, the AF_XDP
zero-copy/sk_buff rx_bi rings are now separate. Functions to properly
allocate the different rings are added as well.

v3->v4: Made i40e_fd_handle_status() static. (kbuild test robot)
v4->v5: Fix kdoc for i40e_clean_programming_status(). (Jakub)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-8-bjorn.topel@gmail.com
2020-05-21 17:31:26 -07:00
Björn Töpel
e1675f9736 i40e: Refactor rx_bi accesses
As a first step to migrate i40e to the new MEM_TYPE_XSK_BUFF_POOL
APIs, code that accesses the rx_bi (SW/shadow ring) is refactored to
use an accessor function.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-7-bjorn.topel@gmail.com
2020-05-21 17:31:26 -07:00
Magnus Karlsson
a71506a4fd xsk: Move driver interface to xdp_sock_drv.h
Move the AF_XDP zero-copy driver interface to its own include file
called xdp_sock_drv.h. This, hopefully, will make it more clear for
NIC driver implementors to know what functions to use for zero-copy
support.

v4->v5: Fix -Wmissing-prototypes by including header file. (Jakub)

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com
2020-05-21 17:31:26 -07:00
Tang Bin
a7654211d0 net: sgi: ioc3-eth: Fix return value check in ioc3eth_probe()
In the function devm_platform_ioremap_resource(), if getting the resource
fails, the return value is ERR_PTR(), not NULL. Thus the NULL check must be
replaced by IS_ERR(), or else it may result in crashes if a critical
error path is encountered.

Fixes: 0ce5ebd24d ("mfd: ioc3: Add driver for SGI IOC3 chip")
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:26:54 -07:00
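Why a NULL test misses this kind of failure can be shown with a small model of the kernel's error-pointer encoding. This is a Python sketch, not kernel code: it assumes 64-bit pointers, and `err_ptr` and `is_err` are stand-ins that mirror the semantics of ERR_PTR() and IS_ERR(), where error codes occupy the top 4095 values of the address range.

```python
# Model of error pointers: a negative errno is stored in the pointer
# value itself, so a failed call returns a non-NULL "pointer".
import errno

POINTER_BITS = 64                    # assumption: 64-bit kernel
MAX_ERRNO = 4095                     # size of the error range checked by IS_ERR()
PTR_MASK = (1 << POINTER_BITS) - 1


def err_ptr(err: int) -> int:
    """Model of ERR_PTR(): encode a negative errno as a pointer value."""
    return err & PTR_MASK


def is_err(ptr: int) -> bool:
    """Model of IS_ERR(): true when ptr lies in the top MAX_ERRNO values."""
    return ptr >= ((-MAX_ERRNO) & PTR_MASK)


failed = err_ptr(-errno.ENOMEM)
null_check_catches_it = (failed == 0)   # False: the NULL test lets the error through
is_err_catches_it = is_err(failed)      # True
```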
Wei Yongjun
1401cf600d net: ethernet: ti: am65-cpsw-nuss: fix error handling of am65_cpsw_nuss_probe
Convert to using IS_ERR() instead of NULL test for cpsw_ale_create()
error handling. Also fix to return negative error code from this error
handling case instead of 0.

Fixes: 93a7653031 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:14:18 -07:00
Wei Yongjun
3469660d1b net: ethernet: ti: fix some return value check of cpsw_ale_create()
cpsw_ale_create() can return both NULL and ERR_PTR(), but all of
the callers only check for NULL in their error handling. This patch
converts it to only return ERR_PTR() in all error cases, and the callers
to use IS_ERR() instead of a NULL test.

Fixes: 4b41d34367 ("net: ethernet: ti: cpsw: allow untagged traffic on host port")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:14:18 -07:00
Yuval Basson
7bfb399eca qed: Add XRC to RoCE
Add support for XRC-SRQ's and XRC-QP's for upper layer driver.

We maintain separate bitmaps for resource management for srq and
xrc-srq. However, the range in FW is one: the xrc-srq's come first
and then the srq's follow. Therefore we maintain a srq-id offset.

v2: perform cleanups if XRC bitmaps allocation fails.

Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:08:25 -07:00
Yuval Basson
b8204ad878 qed: changes to ILT to support XRC
First ILT page for TSDM client is allocated for XRC-SRQ's.
For regular SRQ's skip first ILT page that is reserved for
XRC-SRQ's.

Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:08:25 -07:00
Andre Guedes
c983e32719 igc: Change byte order in struct igc_nfc_filter
Every time we access the 'etype' and 'vlan_tci' fields from struct
igc_nfc_filter to enable or disable filters in hardware we have to
convert them from big endian to host order so it makes more sense to
simply have these fields in host order.

The byte order conversion should take place in
igc_ethtool_get_nfc_rule() and igc_ethtool_add_nfc_rule(), which are
called by .get_rxnfc and .set_rxnfc ethtool ops, since ethtool subsystem
is the one who deals with them in big endian order.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:28 -07:00
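The idea in c983e32719, converting once at the ethtool boundary and keeping host order internally, can be sketched as follows. This is a Python illustration, not driver code: `be16_to_host`, `host_to_be16`, `rule_from_ethtool` and `rule_to_ethtool` are hypothetical names, a dict stands in for struct igc_nfc_filter, and 0x0800 (the IPv4 EtherType) is only an example value.

```python
# Hypothetical model: 16-bit fields arrive in big endian order, are
# converted once on the way in, and are converted back on the way out.


def be16_to_host(raw: bytes) -> int:
    """Decode a 2-byte big endian field into a plain integer."""
    if len(raw) != 2:
        raise ValueError("expected exactly 2 bytes")
    return int.from_bytes(raw, byteorder="big")


def host_to_be16(value: int) -> bytes:
    """Encode an integer as a 2-byte big endian field."""
    return value.to_bytes(2, byteorder="big")


def rule_from_ethtool(etype_be: bytes, vlan_tci_be: bytes) -> dict:
    """Boundary on the way in (models the .set_rxnfc side)."""
    return {"etype": be16_to_host(etype_be), "vlan_tci": be16_to_host(vlan_tci_be)}


def rule_to_ethtool(rule: dict) -> tuple:
    """Boundary on the way out (models the .get_rxnfc side)."""
    return host_to_be16(rule["etype"]), host_to_be16(rule["vlan_tci"])
```

Code that programs the hardware can then compare `rule["etype"]` with ordinary integers, with no conversion at each access.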
Andre Guedes
97700bc86d igc: Align terms used in NFC support code
The Network Flow Classification (NFC) support code from IGC driver uses
terms such as 'rule', 'filter', 'entry', 'input' interchangeably when
referring to NFC rules, making it harder to follow the code. This patch
renames IGC's internal APIs, structs, and variables so we stick with the
term 'rule' since this is the term used in ethtool APIs. It also removes
some not applicable comments along the way. No functionality is changed
by this patch.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:23 -07:00
Andre Guedes
7df76bd191 igc: Add 'igc_ethtool_' prefix to functions in igc_ethtool.c
This patch adds the prefix 'igc_ethtool_' to all functions defined in
igc_ethtool.c so they align with the name convention already followed by
other parts of the driver (e.g. igc_tsn, igc_ptp). Also, this avoids
some name clashing with functions added to igc_main.c by upcoming
patches in this series. No functionality is changed by this patch, just
function renaming.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:19 -07:00
Andre Guedes
876ea04db7 igc: Early return in igc_get_ethtool_nfc_entry()
This patch re-writes the second half of igc_ethtool_get_nfc_entry() to
follow the 'return early' pattern seen in other parts of the driver and
removes some duplicate comments.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:15 -07:00
Andre Guedes
8b9c23cdf0 igc: Cleanup _get|set_rxnfc ethtool ops
This patch does a trivial change in igc_ethtool_get_rxnfc() and
igc_ethtool_set_rxnfc() to simplify their logic.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:11 -07:00
Andre Guedes
4d0710c241 igc: Get rid of igc_max_channels()
The local function igc_max_channels() is a pointless wrapper around
igc_get_max_rss_queues(). This patch removes it and updates the callers
accordingly. It also does some cleanup on igc_get_max_rss_queues().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:07 -07:00
Andre Guedes
8e34cad167 igc: Remove unused field from igc_nfc_filter
The 'cookie' field is not used anywhere in the code so this patch
removes it from struct igc_nfc_filter.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:00 -07:00
Sasha Neftin
281380a6fd igc: Remove per queue good transmited counter register
The per-queue good transmitted packet counter is not applicable to the
i225 device. This patch cleans up this register.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:56 -07:00
Sasha Neftin
d1fe569f51 igc: Remove header redirection register
The header redirection missed packet counter is not applicable to the
i225 device. This patch cleans up this register.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:52 -07:00
Sasha Neftin
3b5fc88f78 igc: Remove obsolete circuit breaker registers
Some of the circuit breaker registers are obsolete
and not applicable to the i225 device.
This patch cleans up these registers.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:48 -07:00
Alexander Duyck
49ee3c2ab5 e1000: Do not perform reset in reset_task if we are already down
We are seeing a deadlock in e1000 down when NAPI is being disabled. Looking
over the kernel function trace of the system it appears that the interface
is being closed and then a reset is hitting which deadlocks the interface
as the NAPI interface is already disabled.

To prevent this from happening I am disabling the reset task when
__E1000_DOWN is already set. In addition code has been added so that we set
the __E1000_DOWN while holding the __E1000_RESET flag in e1000_close in
order to guarantee that the reset task will not run after we have started
the close call.
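
The flag handling described above can be pictured with a minimal, single-threaded sketch. The names FLAG_DOWN, FLAG_RESETTING, struct adapter, reset_task() and close_interface() are hypothetical stand-ins, not the driver's code; the real driver uses atomic bit operations on __E1000_DOWN and __E1000_RESET and runs the reset from a work item.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the __E1000_DOWN / __E1000_RESET state bits. */
#define FLAG_DOWN      (1u << 0)
#define FLAG_RESETTING (1u << 1)

struct adapter {
	unsigned int flags;
	unsigned int resets_done;
};

/* Deferred reset work: skip the reset if the interface is already down. */
static bool reset_task(struct adapter *a)
{
	if (a->flags & FLAG_DOWN)
		return false;

	a->flags |= FLAG_RESETTING;
	a->resets_done++;		/* stands in for the real reinit */
	a->flags &= ~FLAG_RESETTING;
	return true;
}

/* Close path: mark the interface down while holding the reset flag,
 * so that a reset scheduled later sees FLAG_DOWN and backs off.
 */
static void close_interface(struct adapter *a)
{
	a->flags |= FLAG_RESETTING;
	a->flags |= FLAG_DOWN;
	a->flags &= ~FLAG_RESETTING;
}
```

In this model a reset requested after close_interface() returns is a no-op, which is the property the commit message relies on to avoid touching NAPI after it has been disabled.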

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Tested-by: Maxim Zhukov <mussitantesmortem@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:42 -07:00
Andre Guedes
8eb2449d83 igc: Enable NFC rules based source MAC address
This patch adds support for Network Flow Classification (NFC) rules
based on source MAC address. Note that the controller doesn't support
rules with both source and destination addresses set, so this special
case is checked in igc_add_ethtool_nfc_entry().
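
As a rough illustration of the special case mentioned above, the check could look like the sketch below. struct nfc_rule and validate_rule() are hypothetical names, and the choice of -EOPNOTSUPP as the error code is an assumption, not taken from this commit.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of an NFC rule's MAC match fields. */
struct nfc_rule {
	bool match_src_mac;
	bool match_dst_mac;
};

/* Reject rules that ask for both source and destination MAC matching,
 * since the controller cannot express that combination.
 */
static int validate_rule(const struct nfc_rule *r)
{
	if (r->match_src_mac && r->match_dst_mac)
		return -EOPNOTSUPP;	/* assumed error code */
	return 0;
}
```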

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:37 -07:00
Andre Guedes
750433d0aa igc: Add support for source address filters in core
This patch extends MAC address filter internal APIs igc_add_mac_filter()
and igc_del_mac_filter(), as well as local helpers, to support filters
based on source address.

A new parameter 'type' is added to the APIs to indicate if the filter
type is source or destination. In case it is source type, the RAH
register is configured accordingly in igc_set_mac_filter_hw().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:30 -07:00
Jason Gunthorpe
eafd47fc20 Linux 5.7-rc6

Merge tag 'v5.7-rc6' into rdma.git for-next

Linux 5.7-rc6

Conflict in drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
resolved by deleting dr_cq_event, matching how netdev resolved it.

Required for dependencies in the following patches.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21 17:08:27 -03:00
Andre Guedes
d66358cae2 igc: Remove mac_table from igc_adapter
In igc_adapter we keep a sort of shadow copy of RAL and RAH registers.
There is not much benefit in keeping it, at the cost of maintainability,
since adding/removing MAC address filters is not hot path, and we
already keep filters information in adapter->nfc_filter_list for cleanup
and restoration purposes.

So in order to simplify the MAC address filtering code and prepare it
for source address support, this patch removes the mac_table from
igc_adapter.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-20 22:28:49 -07:00
Andre Guedes
1c3739cb6e igc: Remove IGC_MAC_STATE_SRC_ADDR flag
MAC address filters based on source address are not currently supported
by the IGC driver. Despite that, the driver has some dangling code
to handle it, inherited from IGB driver. This patch removes that code to
prepare for a follow up patch that adds proper source MAC address filter
support.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-20 22:23:30 -07:00
David S. Miller
de1b99ef2a Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-05-19

This series contains updates to igc only.

Sasha cleans up the igc driver code that is not used or needed.

Vitaly cleans up driver code that was used to support Virtualization on
a device that is not supported by igc, so remove the dead code.

Andre renames a few macros to align with register and field names
described in the data sheet.  Also adds the VLAN Priority Queue Filter
and EType Queue Filter registers to the list of registers dumped by
igc_get_regs().  Added additional debug messages and updated return codes
for unsupported features.  Refactored the VLAN priority filtering code to
move the core logic into igc_main.c.  Cleaned up duplicate code and
useless code.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20 19:27:57 -07:00
Sasha Neftin
e5264212eb igc: Remove unused registers
The Tx data FIFO Head/Tail, Saved and Packet Count registers are
not applicable to the i225 LAN controller.
This patch removes these registers.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 16:00:00 -07:00
Sasha Neftin
551555a761 igc: Remove unused IGC_ICS_DRSTA define
The device reset assert define for the interrupt cause register is
not in use for the i225 device.
This patch removes this define.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:58:31 -07:00
Andre Guedes
81e330619e igc: Dump ETQF registers
This patch adds the EType Queue Filter (ETQF) registers to the list of
registers dumped by igc_get_regs().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:56:41 -07:00
Andre Guedes
aa7ca7266f igc: Refactor ethertype filtering code
The whole ethertype filtering code is implemented in igc_ethtool.c and
mixes logic from ethtool and core parts. This patch refactors it so core
logic is moved to igc_main.c, aligning the ethertype filtering code
organization with the rest of the filtering code from the driver (MAC
address and VLAN priority).

Besides moving code to igc_main.c, this patch also does some minor
improvements to the code. Below are some highlights.

In case all filters are already in use and the user tries to add another
filter, we return -ENOSPC instead of -EINVAL so a more meaningful error
code is provided. This also aligns with the behavior implemented in MAC
address filtering code.
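
A minimal sketch of the error-code behavior described above, assuming a simple table of filter slots. struct etype_filter and add_etype_filter() are hypothetical; only MAX_ETYPE_FILTER (8 on the I225, per the related fix) and the -ENOSPC return come from these commit messages.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ETYPE_FILTER 8	/* the I225 has 8 ethertype filters */

/* Hypothetical software view of one ethertype filter slot. */
struct etype_filter {
	bool in_use;
	uint16_t etype;
};

/* Claim the first free slot and return its index, or -ENOSPC when
 * every slot is already taken.
 */
static int add_etype_filter(struct etype_filter *tbl, uint16_t etype)
{
	for (int i = 0; i < MAX_ETYPE_FILTER; i++) {
		if (!tbl[i].in_use) {
			tbl[i].in_use = true;
			tbl[i].etype = etype;
			return i;
		}
	}
	return -ENOSPC;
}
```

Returning -ENOSPC rather than -EINVAL lets the caller tell "the table is full" apart from "the request was malformed".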

With this code refactoring, 'etype_bitmap' array in struct igc_adapter
and 'etype_reg_index' in struct igc_nfc_filter are not needed anymore
and are removed.

Log messages are added to help debugging the ethertype filtering code.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:46:35 -07:00
Andre Guedes
b4d48d96ea igc: Fix MAX_ETYPE_FILTER value
The I225 controller has 8 ethertype filters, not 4. This patch fixes the
MAX_ETYPE_FILTER macro accordingly.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:40:43 -07:00
Andre Guedes
1664ef3e62 igc: Remove ethertype filter in PTP code
The driver only supports hardware timestamping for all incoming
traffic (HWTSTAMP_FILTER_ALL) which is enabled via Rx Time Sync
Control (TSYNCRXCTL) register already. Therefore, the ethertype
filter set in igc_ptp_set_timestamp_mode() is useless so this
patch removes it.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:35:42 -07:00
Andre Guedes
09a2b50a49 igc: Remove duplicated IGC_RXPBS macro
This patch removes the IGC_RXPBS macro defined in line 233 since it is
already defined in line 18 with exactly the same value.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:33:11 -07:00
Vaibhav Gupta
a1eae9f677 realtek/8139cp: use generic power management
compile-tested only

With legacy PM hooks, it was the responsibility
of a driver to manage PCI states and also
device's power state. The generic approach is
to let PCI core handle the work.

The suspend callback enables/disables PCI wake
on the basis of "cp->wol_enabled" variable
which is unknown to PCI core. To make use of it,
call device_set_wakeup_enable().

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-19 15:32:24 -07:00
Vaibhav Gupta
6ad70c7686 realtek/8139too: use generic power management
compile-tested only

With legacy PM hooks, it was the responsibility
of a driver to manage PCI states and also
device's power state. The generic approach is
to let PCI core handle the work.

PCI core passes "struct device*" as an argument
to the .suspend() and .resume() callbacks. As
these callbacks work with "struct net_device*",
extract it from "struct device*" using
dev_get_drvdata().

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-19 15:32:24 -07:00
Andre Guedes
12ddee68d0 igc: Refactor VLAN priority filtering code
The whole VLAN priority filtering code is implemented in igc_ethtool.c
and mixes logic from ethtool and core parts. This patch refactors it so
core logic is moved to igc_main.c, aligning the VLAN priority filtering
code organization with the MAC address filtering code.

This patch also takes the opportunity to add some log messages to ease
debugging.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:29:17 -07:00
Louis Peens
465957c257 nfp: flower: inform firmware of flower features
For backwards compatibility it may be required for the firmware to
disable certain features depending on the features supported by
the host. Combine the host feature bits and firmware feature bits
and write this back to the firmware.
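
One plausible reading of "combine" here is a bitwise intersection of the two feature masks, so that only features known to both sides stay enabled. The sketch below assumes that interpretation; negotiate_features() is a hypothetical helper, not the driver's function.

```c
#include <stdint.h>

/* Assumed negotiation rule: keep only the feature bits that both the
 * host driver and the firmware advertise.
 */
static inline uint64_t negotiate_features(uint64_t host_features,
					  uint64_t fw_features)
{
	return host_features & fw_features;
}
```

The resulting mask would then be written back so the firmware can disable anything the host did not advertise.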

Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-19 15:27:21 -07:00
Louis Peens
e09303d3c4 nfp: flower: renaming of feature bits
Clean up name aliasing. Some features get enabled using a slightly
different method, but the bitmap for these were stored in the same
field. Rename their #defines and move the bitmap to a new variable.

Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-19 15:27:21 -07:00
Andre Guedes
2e4f1716f3 igc: Return -EOPNOTSUPP when VLAN mask doesn't match
The I225 controller supports Rx queue assignment based on VLAN priority
only. Other Tag Control Information (TCI) are valid, but not supported
by the driver. So this patch changes the return code from
igc_add_ethtool_nfc_entry() to -EOPNOTSUPP in order to provide more meaningful
information on why the function failed.

It also adds a debug message to give the user a hint about what went
wrong with the NFC setup.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:18:33 -07:00
Andre Guedes
fbee4760ec igc: Dump VLANPQF register
This patch adds the VLAN Priority Queue Filter Register (VLANPQF) to the
list of registers dumped by igc_get_regs().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:16:13 -07:00
Andre Guedes
bbfaa141d2 igc: Rename IGC_VLAPQF macro
This patch renames the IGC_VLAPQF macro to IGC_VLANPQF as well as
related macros so they match the register name and fields described in
the data sheet.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:12:17 -07:00
Sasha Neftin
65b9ee1b92 igc: Clean up obsolete NVM defines
The packet buffer allocation, reserved word and pointer guard defines
are not applicable to i225 parts.
This patch removes these obsolete defines.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:03:20 -07:00
Vitaly Lifshits
3c215fb18e igc: remove IGC_REMOVED function
igc driver has leftovers from the previous device that supported
Virtualization. This can be found in the function IGC_REMOVED which
became obsolete, and can be removed.

Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 15:01:48 -07:00
Sasha Neftin
472abd3240 igc: Remove PCIe Control register
The GCR (PCIe Control) register is not in use and should be removed.
This patch cleans up this register.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-19 14:23:54 -07:00
Jeremy Kerr
ef01cee2ee net: bmac: Fix read of MAC address from ROM
In bmac_get_station_address, we're reading two bytes at a time from ROM,
but we do that six times, resulting in 12 bytes of reads and writes. This
means we will write off the end of the six-byte destination buffer.

This change fixes the for-loop to only read/write six bytes.
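
The loop-bound problem can be sketched as follows. read_rom_word() and get_station_address() are hypothetical stand-ins for the driver's helpers, and the low-byte-first ordering is an assumption; any bit reversal the real hardware needs is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define MAC_ADDR_LEN 6

/* Hypothetical stand-in for the ROM access: one 16-bit word per index. */
static uint16_t read_rom_word(const uint16_t *rom, size_t word_index)
{
	return rom[word_index];
}

/* Copy a six-byte MAC address out of a word-addressed ROM. Each read
 * yields two bytes, so only MAC_ADDR_LEN / 2 iterations are needed;
 * looping MAC_ADDR_LEN times would write 12 bytes into a 6-byte buffer.
 */
static void get_station_address(const uint16_t *rom, uint8_t *ea)
{
	for (size_t i = 0; i < MAC_ADDR_LEN / 2; i++) {
		uint16_t data = read_rom_word(rom, i);

		ea[2 * i]     = (uint8_t)(data & 0xff);	/* assumed byte order */
		ea[2 * i + 1] = (uint8_t)(data >> 8);
	}
}
```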

Based on a proposed fix from Finn Thain <fthain@telegraphics.com.au>.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Reported-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Stan Johnson <userm57@yahoo.com>
Reported-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-19 12:03:37 -07:00
David S. Miller
fa14b9b0c0 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-05-18

This series contains updates to igc driver only.

Sasha adds ECN support for TSO by adding the NETIF_F_TSO_ECN flag, which
aligns with other Intel drivers.  Also cleaned up defines that are not
supported or used in the igc driver.

Andre does most of the changes with updating the log messages for igc
driver.

Vitaly adds support for EEPROM, register and link ethtool
self-tests.

v2: Fixed up the added ethtool self-tests based on feedback from the
    community.  Dropped the four patches that removed '\n' from log
    messages.
v3: Reverted the debug message changes in patch 2 for messages in
    igc_probe, also made reg_test[] static in patch 3 based on community
    feedback
v4: Updated the patch description for patch 2, which referred to changes
    that no longer existed in the patch
v5: Scrubbed patches 4-7 patch description, which also referred to
    changes that no longer existed in the patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-19 12:02:29 -07:00
Heiner Kallweit
5cdfe83066 r8169: work around an irq coalescing related tx timeout
In [0] a user reported reproducible tx timeouts on RTL8168f unless
PktCntrDisable is set and irq coalescing is enabled.
Realtek told me that they are not aware of any related hw issue on
this chip version, therefore root cause is still unknown. It's not
clear whether the issue affects one or more chip versions in general,
or whether issue is specific to reporter's system.
Due to this level of uncertainty, and due to the fact that I'm aware
of this one report only, let's apply the workaround on net-next only.
After this change setting irq coalescing via ethtool can reliably
avoid the issue on the affected system.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=207205

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-18 17:46:16 -07:00
Heiner Kallweit
e2e5fb8d2f r8169: improve rtl8169_mark_to_asic
Let the compiler decide about inlining, and as confirmed by Eric it's
better to use WRITE_ONCE here to ensure that the descriptor ownership
is transferred to NIC immediately.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-18 17:46:15 -07:00
Heiner Kallweit
588c7e5cc0 r8169: make rtl_rx better readable
Avoid the goto from the rx error handling branch into the else branch,
and in general avoid having the main rx work in the else branch.
In addition ensure proper reverse xmas tree order of variables in the
for loop.

No functional change intended.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-18 17:46:15 -07:00
Andy Shevchenko
35e43c392b net: seeq: Use %pM format specifier for MAC addresses
Convert to %pM instead of using custom code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-18 17:43:13 -07:00
Andy Shevchenko
0992b49023 cxgb4: Use %pM format specifier for MAC addresses
Convert to %pM instead of using custom code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-18 17:43:13 -07:00
Sasha Neftin
5ddb2747ae igc: Remove unneeded register
The flow control status register is not applicable to i225 parts,
so clean up the unneeded define.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 17:16:43 -07:00
Sasha Neftin
3494480ad5 igc: Remove unneeded definition
The PHY_FORCE_LIMIT definition is not in use and can be removed;
i225 parts support the auto-negotiation mechanism.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 17:16:39 -07:00
Andre Guedes
faf82d5bb1 igc: Use netdev log helpers in igc_base.c
This patch converts one pr_debug() call to hw_dbg() in order to keep log
output aligned with the rest of the driver. hw_dbg() is actually a macro
defined in igc_hw.h that expands to netdev_dbg().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 17:16:33 -07:00
Andre Guedes
5c32bac98c igc: Use netdev log helpers in igc_dump.c
In igc_dump.c we print log messages using dev_* and pr_* helpers,
generating inconsistent output with the rest of the driver. Since this
is a network device driver, we should preferably use netdev_* helpers
because they append the interface name to the message, helping to make
sense of the logs.

This patch converts all dev_* and pr_* calls to netdev_*.

Quick note about igc_rings_dump(): This function is always called with
valid adapter->netdev so there is no need to check it.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 17:16:26 -07:00
Andre Guedes
916a3c6507 igc: Use netdev log helpers in igc_ptp.c
In igc_ptp.c we print log messages using dev_* helpers, generating
inconsistent output with the rest of the driver. Since this is a network
device driver, we should preferably use netdev_* helpers because they
append the interface name to the message, helping to make sense of
the logs.

This patch converts all dev_* calls to netdev_*.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 17:16:21 -07:00
Andre Guedes
95f96a9f2d igc: Use netdev log helpers in igc_ethtool.c
In igc_ethtool.c we print log messages using dev_* helpers, generating
inconsistent output with the rest of the driver. Since this is a network
device driver, we should preferably use netdev_* helpers because they
append the interface name to the message, helping to make sense of
the logs.

This patch converts all dev_* calls to netdev_*.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 17:16:01 -07:00
Vitaly Lifshits
f026d8ca29 igc: add support to eeprom, registers and link self-tests
Introduce igc_diag.c and igc_diag.h, which hold the
diagnostics functionality of the igc driver. For the time being
these files are used only by the ethtool self-test callbacks,
which means that the EEPROM, register and link self-tests for
ethtool are implemented.

Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 15:06:28 -07:00
Andre Guedes
25f06eff75 igc: Use netdev log helpers in igc_main.c
In igc_main.c we print log messages using both dev_* and netdev_*
helpers, generating inconsistent output. Since this is a network device
driver, we should preferably use netdev_* helpers because they append
the interface name to the message, helping to make sense of the logs.

This patch converts all dev_* calls to netdev_*. There are only two
exceptions:
  1) calls within igc_probe (net_device has not been registered yet)
  2) calls in igc_init_module (module initialization).

It also takes this opportunity to improve some messages.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 15:06:28 -07:00
Sasha Neftin
8e8204a4f3 igc: Add ECN support for TSO
Align with other Intel drivers and add ECN support for TSO.

Add NETIF_F_TSO_ECN flag

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-18 15:06:28 -07:00
Michael Guralnik
ecf814e0e1 net/mlx5: Add support for RDMA TX FT headers modifying
Support adding header modifying actions to the RDMA TX flow table.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-18 09:21:46 -07:00
Parav Pandit
555af0c3fa net/mlx5: Move iseg access helper routines close to mlx5_core driver
Only mlx5_core driver handles fw initialization check and command
interface revision check.
Hence move them inside the mlx5_core driver where it is used.
This avoids exposing these helpers to all mlx5 drivers.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-18 09:21:46 -07:00
Raed Salem
356d411c26 net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits
Remove the "metadata_reg_b" field and all uses of this field in code
to match the device specification. As this field is not in use in SW
steering it is safe to remove it.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-18 09:21:46 -07:00
Markus Elfring
7b27b95a89 net/ps3_gelic_net: Remove duplicate error message
Remove an extra message for a memory allocation failure in
function gelic_descr_prepare_rx().

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ba4bea4da97308c804fd3a0fae3773dde27b20ce.1589049250.git.geoff@infradead.org
2020-05-19 00:10:35 +10:00
Ido Schimmel
200b7cca0b mlxsw: spectrum_trap: Store all trap data in one array
Each trap registered with devlink is mapped to one or more Rx listeners.
These listeners allow the switch driver (e.g., mlxsw_spectrum) to
register a function that is called when a packet is received (trapped)
for a specific reason.

Currently, three arrays are used to describe the mapping between the
logical devlink traps and the Rx listeners.

Instead, get rid of these arrays and store all the information in one
array that is easier to validate and extend with more per-trap
information.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Ido Schimmel
b14a40dbde mlxsw: spectrum_trap: Store all trap group data in one array
Use one array to store all the information about all the trap groups
instead of hard coding it in code. This will be used in future patches
to disable certain functionality (e.g., policer binding) on a trap group
basis.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Ido Schimmel
cc678f4dbc mlxsw: spectrum_trap: Store all trap policer data in one array
Instead of maintaining an array of policers and a linked list, only
maintain an array.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Ido Schimmel
85d4ec5925 mlxsw: spectrum_trap: Move struct definition out of header file
'struct mlxsw_sp_trap_policer_item' is only used in one file, so move it
there.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Heiner Kallweit
13f15b59ad r8169: remove remaining call to mdiobus_unregister
After having switched to devm_mdiobus_register() also this remaining
call to mdiobus_unregister() can be removed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 15:20:34 -07:00
Jakub Kicinski
4df6ff2a99 nfp: don't check lack of RX/TX channels
Core will now perform this check.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 13:56:30 -07:00
Ioana Ciornei
74a1c05916 dpaa2-eth: add bulking to XDP_TX
Add driver level bulking to the XDP_TX action.

An array of frame descriptors is held for each Tx frame queue and
populated accordingly when the action returned by the XDP program is
XDP_TX. The frames will be actually enqueued only when the array is
filled. At the end of the NAPI cycle a flush on the queued frames is
performed in order to enqueue the remaining FDs.
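
The bulking scheme described above can be modeled with a small sketch. BULK_SIZE, struct frame_desc, struct tx_bulk, bulk_add() and bulk_flush() are hypothetical names, and the counter enqueued_total merely stands in for the hardware enqueue; none of this is the driver's actual code.

```c
#include <stddef.h>

#define BULK_SIZE 8	/* hypothetical per-queue array size */

struct frame_desc {
	int id;
};

struct tx_bulk {
	struct frame_desc fds[BULK_SIZE];
	size_t num;		/* descriptors currently held */
	size_t enqueued_total;	/* stands in for the hardware enqueue */
};

/* Hand every held descriptor to the "hardware" and empty the array.
 * Also called once at the end of a NAPI cycle to push out leftovers.
 */
static void bulk_flush(struct tx_bulk *b)
{
	b->enqueued_total += b->num;
	b->num = 0;
}

/* Queue one descriptor for XDP_TX; enqueue only when the array fills. */
static void bulk_add(struct tx_bulk *b, struct frame_desc fd)
{
	b->fds[b->num++] = fd;
	if (b->num == BULK_SIZE)
		bulk_flush(b);
}
```

Batching this way trades a little latency for fewer enqueue operations, and the end-of-cycle flush keeps frames from lingering in a partly filled array.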

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 13:45:25 -07:00
David S. Miller
ea6119aa67 mlx5-updates-2020-05-15

Merge tag 'mlx5-updates-2020-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-05-15

mlx5 core and mlx5e (netdev) updates:

1) Two fixes for release all FW pages support.
2) Improvement in calculating the send queue stop room on tx
3) Flow steering auto-groups creation improvements
4) TC offload fix for Connection tracking with NAT action
5) IPoIB support for self loopback to allow communication between ipoib
pkey child interfaces on the same host.
6) DCBNL cleanup to avoid #ifdef DCBNL all over the main mlx5e code
7) Small and trivial code cleanup
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 16:36:46 -07:00
Nathan Chancellor
2ea46dc686 ethernet: ti: am65-cpts: Add missing inline qualifier to stub functions
When building with Clang:

In file included from drivers/net/ethernet/ti/am65-cpsw-ethtool.c:15:
drivers/net/ethernet/ti/am65-cpts.h:58:12: warning: unused function
'am65_cpts_ns_gettime' [-Wunused-function]
static s64 am65_cpts_ns_gettime(struct am65_cpts *cpts)
           ^
drivers/net/ethernet/ti/am65-cpts.h:63:12: warning: unused function
'am65_cpts_estf_enable' [-Wunused-function]
static int am65_cpts_estf_enable(struct am65_cpts *cpts,
           ^
drivers/net/ethernet/ti/am65-cpts.h:69:13: warning: unused function
'am65_cpts_estf_disable' [-Wunused-function]
static void am65_cpts_estf_disable(struct am65_cpts *cpts, int idx)
            ^
3 warnings generated.

These functions need to be marked as inline, which adds __maybe_unused,
to avoid these warnings, which is the pattern for stub functions.
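
The stub pattern referred to above looks roughly like the sketch below. HAVE_CPTS and cpts_ns_gettime() are hypothetical names chosen for illustration; in the kernel the switch would be a CONFIG_ option and the header would declare the driver's real functions.

```c
/* Hypothetical feature switch; in the kernel this would be a CONFIG_ option. */
#ifdef HAVE_CPTS
long long cpts_ns_gettime(void *cpts);
#else
/* With 'static inline' the compiler does not warn when a file includes
 * this header but never calls the stub; a plain 'static' function would
 * trigger -Wunused-function in every such file.
 */
static inline long long cpts_ns_gettime(void *cpts)
{
	(void)cpts;	/* unused in the stub */
	return 0;
}
#endif
```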

Fixes: ec008fa2a9 ("ethernet: ti: am65-cpts: add routines to support taprio offload")
Link: https://github.com/ClangBuiltLinux/linux/issues/1026
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 16:32:27 -07:00
Tariq Toukan
3f3ab178c7 net/mlx5e: Take DCBNL-related definitions into dedicated files
Take DCBNL-related definitions out of the common en.h header and
use a dedicated header file for exposing them.
Some need not be exposed; use them locally in the .c file.
Use stubs to eliminate use of CONFIG_MLX5_CORE_EN_DCB in the
generic control flows.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:36 -07:00
Maxim Mikityanskiy
5ffb4d858b net/mlx5e: Calculate SQ stop room in a robust way
Currently, different formulas are used to estimate the space that may be
taken by WQEs in the SQ during a single packet transmit. This space is
called stop room, and it's checked in the end of packet transmit to find
out if the next packet could overflow the SQ. If it could, the driver
tells the kernel to stop sending next packets.

Many factors affect the stop room:

1. Padding with NOPs to avoid WQEs spanning over page boundaries.

2. Enabled and disabled offloads (TLS, upcoming MPWQE).

3. The maximum size of a WQE.

The padding is performed before every WQE if it doesn't fit the current
page.

The current formula assumes that only one padding will be required per
packet, and it doesn't take into account that the WQEs posted during the
transmission of a single packet might exceed the page size in very rare
circumstances. For example, to hit this condition with 4096-byte pages,
TLS offload will have to interrupt an almost-full MPWQE session, be in
the resync flow and try to transmit a near-maximum amount of data.

To avoid SQ overflows in such rare cases after MPWQE is added, this
patch introduces a more robust formula to estimate the stop room. The
new formula uses the fact that a WQE of size X will not require more
than X-1 WQEBBs of padding. More exact estimations are possible, but
they result in much more complex and error-prone code for little gain.
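
The per-WQE bound stated above can be sketched as follows. This is a minimal illustration, assuming sizes are counted in WQEBBs; the function name is hypothetical and this is not presented as the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case room, in WQEBBs, that a single WQE of wqe_size WQEBBs may
 * take: up to (wqe_size - 1) WQEBBs of NOP padding to reach the next
 * page, plus the WQE itself. Hypothetical helper name.
 */
static inline uint16_t example_stop_room_for_wqe(uint16_t wqe_size)
{
	assert(wqe_size > 0);
	return (uint16_t)((wqe_size - 1) + wqe_size);
}
```

For example, a WQE of 4 WQEBBs needs at most 3 WQEBBs of padding, so 7 WQEBBs are reserved for it.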

Before this patch, the TLS stop room included space for both INNOVA and
ConnectX TLS offloads that couldn't run at the same time anyway, so this
patch accounts only for the active one.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:34 -07:00
Erez Shitrit
8b46d424a7 net/mlx5e: IPoIB, Drop multicast packets that this interface sent
After enabling loopback packets for IPoIB, we need to drop the packets
that this HCA has replicated and that came back to the same interface
that sent them.

Fixes: 4c6c615e3f ("net/mlx5e: IPoIB, Add PKEY child interface nic profile")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:32 -07:00
Erez Shitrit
80639b199c net/mlx5e: IPoIB, Enable loopback packets for IPoIB interfaces
Enable loopback of unicast and multicast traffic for IPoIB enhanced
mode.
This will allow interfaces with the same pkey to communicate with
each other, e.g. cloned interfaces located in different namespaces.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:30 -07:00
Roi Dayan
9102d836d2 net/mlx5e: CT: Fix offload with CT action after CT NAT action
A chain of rules may perform the CT action again after CT NAT.
Before this fix, matching breaks, as we get into the CT table
after the NAT changes, and not the CT NAT table.
Fix this by adding pre ct and pre ct nat tables to skip ct/ct_nat
tables and go straight to post_ct table if ct/nat was already done.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:27 -07:00
Eran Ben Elisha
90bf1c8dbd net/mlx5: Move internal timer read function to clock library
Move mlx5_read_internal_timer() into lib/clock.c file as it is being
used there. As such, make this function a static one.

In addition, rearrange headers include to support function move.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:25 -07:00
Paul Blakey
49c0355d30 net/mlx5: Wait for inactive autogroups
Currently, if one thread tries to add an entry to an autogrouped table
with no free matching group, while another thread is in the process of
creating a new matching autogroup, it doesn't wait for the new group
creation, and creates an unnecessary new autogroup.

Instead of skipping inactive groups, wait on the write lock of those groups.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:22 -07:00
Parav Pandit
41798df9bf net/mlx5: Drain wq first during PCI device removal
mlx5_unload_one() is done with cleanup = true only once.

So instead of draining the health wq inside the if(), do it directly
during PCI device removal.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:20 -07:00
Parav Pandit
4162f58b47 net/mlx5: Have single error unwinding path
Having multiple error unwinding paths is error prone.
Let's have just one error unwinding path.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:17 -07:00
Eran Ben Elisha
e7f860e210 net/mlx5: Fix a bug of releasing wrong chunks on > 4K page size systems
On systems with page size larger than 4K, a fwp object has a few 4K chunks.
Fix a bug in fwp free flow where the chunk address was dropped and
fwp->addr was used instead (first chunk address). This caused a wrong
update of fwp->bitmask which later can cause errors in re-alloc fwp
chunk flow.

In order to fix this, refactor the release flow:
- Free 4k: Releases a specific 4k chunk inside the fwp, defined by
  starting address.
- Free fwp: Unconditionally release the whole fwp and its resources.
Free addr will call free fwp if all chunks were released, in order to
share code.
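
A minimal sketch of the "free 4k" idea, assuming a per-page bitmask of free chunks. The struct layout and all example_* names are hypothetical and only illustrate why the chunk address (and not the page's first address) must select the bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_CHUNK_SHIFT 12 /* 4K chunks */

struct example_fwp {
	uint64_t addr;      /* address of the first chunk in the page */
	uint32_t nchunks;   /* number of 4K chunks in the page (< 32) */
	uint32_t free_mask; /* bit n set => chunk n is free */
};

/*
 * Mark the chunk starting at chunk_addr as free. Returns true when every
 * chunk is free, in which case the caller would release the whole fwp.
 * Using fwp->addr here instead of chunk_addr would always select bit 0,
 * which is the kind of bug the commit message describes.
 */
static inline bool example_free_4k(struct example_fwp *fwp, uint64_t chunk_addr)
{
	uint32_t n, full_mask;

	assert(fwp->nchunks > 0 && fwp->nchunks < 32);
	assert(chunk_addr >= fwp->addr);
	n = (uint32_t)((chunk_addr - fwp->addr) >> EXAMPLE_CHUNK_SHIFT);
	assert(n < fwp->nchunks);

	full_mask = (1U << fwp->nchunks) - 1;
	fwp->free_mask |= 1U << n;
	return fwp->free_mask == full_mask;
}
```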

In addition, fix npages to count for all released chunks correctly.

Fixes: c6168161f6 ("net/mlx5: Add support for release all pages event")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:15 -07:00
Eran Ben Elisha
2726cd4a29 net/mlx5: Dedicate fw page to the requesting function
The cited patch assumes that all chunks in a fw page belong to the same
function, thus the driver must dedicate fw page to the requesting
function, which is actually what was intended in the original fw pages
allocator design, hence the fwp->func_id !

Up until the cited patch everything worked ok, but now "release all pages"
is broken on systems with page_size > 4k.

Fix this by dedicating fw page to the requesting function id via adding a
func_id parameter to alloc_4k() function.

Fixes: c6168161f6 ("net/mlx5: Add support for release all pages event")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:12 -07:00
David S. Miller
da07f52d3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Move the bpf verifier trace check into the new switch statement in
HEAD.

Resolve the overlapping changes in hinic, where bug fixes overlap
the addition of VF support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 13:48:59 -07:00
Rahul Lakkireddy
5148e5950c cxgb4: add EOTID tracking and software context dump
Rework and add support for dumping EOTID software context used by
TC-MQPRIO. Also track number of EOTIDs in use.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:54:07 -07:00
Rahul Lakkireddy
4bccfc036a cxgb4: tune burst buffer size for TC-MQPRIO offload
For each traffic class, firmware handles up to 4 * MTU amount of data
per burst cycle. Under heavy load, this small buffer size is a
bottleneck when buffering large TSO packets in <= 1500 MTU case.
Increase the burst buffer size to 8 * MTU when supported.

Also, keep the driver's traffic class configuration API similar to
the firmware API counterpart.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:54:07 -07:00
Rahul Lakkireddy
4f1d97262d cxgb4: improve credits recovery in TC-MQPRIO Tx path
Request a credit update for every half of the credits consumed,
including the current request. Also, avoid re-trying to post packets
when there are no credits left. The credit update reply via interrupt
will eventually restore the credits and will invoke the Tx path again.
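
The credit accounting described above might look roughly like the sketch below. All names and the struct layout are hypothetical; it only illustrates "request an update once half of the credits are consumed, counting the current request" and "do not retry when out of credits".

```c
#include <stdint.h>

struct example_eotxq {
	uint32_t max_credits;  /* total credits granted to this queue */
	uint32_t credits;      /* credits currently available */
	uint32_t since_update; /* credits consumed since the last update request */
};

enum example_tx_result {
	EXAMPLE_TX_NO_CREDITS,          /* nothing posted; wait for the credit reply */
	EXAMPLE_TX_SENT,                /* posted, no credit update requested */
	EXAMPLE_TX_SENT_WITH_UPDATE_REQ /* posted, credit update requested */
};

static inline enum example_tx_result
example_tx_post(struct example_eotxq *q, uint32_t need)
{
	if (need > q->credits)
		return EXAMPLE_TX_NO_CREDITS; /* no retry loop here */

	q->credits -= need;
	q->since_update += need; /* includes the current request */

	if (q->since_update >= q->max_credits / 2) {
		q->since_update = 0;
		return EXAMPLE_TX_SENT_WITH_UPDATE_REQ;
	}
	return EXAMPLE_TX_SENT;
}
```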

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:54:07 -07:00
Ioana Ciornei
efa6a7d075 dpaa2-eth: properly handle buffer size restrictions
Depending on the WRIOP version, the buffer size on the RX path must be a
multiple of 64 or 256. Handle this restriction properly by aligning down
the buffer size to the necessary value. Also, use the new buffer size
dynamically computed instead of the compile time one.
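
Aligning a size down to a power-of-two multiple, as described above, can be sketched like this. The helper name is hypothetical; the 64 and 256 alignments are the ones named in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Round size down to a multiple of align; align must be a power of two. */
static inline uint32_t example_align_down(uint32_t size, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return size & ~(align - 1);
}
```

For instance, a 3000-byte buffer becomes 2944 bytes with a 64-byte restriction and 2816 bytes with a 256-byte restriction.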

Fixes: 27c874867c ("dpaa2-eth: Use a single page per Rx buffer")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:30:47 -07:00
Jesper Dangaard Brouer
d628ee4fef mlx5: Rx queue setup time determine frame_sz for XDP
The mlx5 driver has multiple memory models, which also change
according to whether an XDP bpf_prog is attached.

The 'rx_striding_rq' setting is adjusted via ethtool priv-flags e.g.:
 # ethtool --set-priv-flags mlx5p2 rx_striding_rq off

In the general case with 4K page_size and regular MTU packets, the
frame_sz is 2048, and 4096 when XDP is enabled, in both modes.

The info on the given frame size is stored differently depending on the
RQ-mode and encoded in the wqe/mpwqe union in struct mlx5e_rq.
In rx striding mode rq->mpwqe.log_stride_sz is either 11 or 12, which
corresponds to 2048 or 4096 (MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ).
In non-striding mode (MLX5_WQ_TYPE_CYCLIC) the frag_stride is stored
in rq->wqe.info.arr[0].frag_stride, for the first fragment, which is
what the XDP case cares about.

To reduce the effect on the fast-path, this patch determines the
frame_sz at setup time, to avoid determining the memory model at
runtime. The variable is named frame0_sz to make it clear that this is
only the frame size of the first fragment.
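
A minimal sketch of caching the first-fragment frame size at setup time. The struct below is a simplified, hypothetical stand-in for the real RQ structure; only the 11/12 to 2048/4096 relation and the use of the first fragment's stride come from the text above.

```c
#include <stdint.h>

enum example_wq_type { EXAMPLE_WQ_CYCLIC, EXAMPLE_WQ_STRIDING };

/* Simplified, hypothetical stand-in for the RQ structure. */
struct example_rq {
	enum example_wq_type wq_type;
	union {
		struct { uint32_t frag_stride; } wqe;    /* non-striding mode */
		struct { uint8_t log_stride_sz; } mpwqe; /* striding mode */
	};
	uint32_t frame0_sz; /* cached once at setup time */
};

/* Called at RQ setup, so the fast path only reads rq->frame0_sz. */
static inline void example_rq_setup_frame0_sz(struct example_rq *rq)
{
	if (rq->wq_type == EXAMPLE_WQ_STRIDING)
		rq->frame0_sz = 1U << rq->mpwqe.log_stride_sz; /* 11 -> 2048, 12 -> 4096 */
	else
		rq->frame0_sz = rq->wqe.frag_stride;
}
```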

This mlx5 driver does a DMA-sync on XDP_TX action, but grow is safe
as it has done a DMA-map on the entire PAGE_SIZE. The driver also
already does a XDP length check against sq->hw_mtu on the possible
XDP xmit paths mlx5e_xmit_xdp_frame() + mlx5e_xmit_xdp_frame_mpwqe().

V3+4: Change variable name first_frame_sz to frame0_sz

V2: Fix that frag_size need to be recalc before creating SKB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Link: https://lore.kernel.org/bpf/158945348021.97035.12295039384250022883.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
2a637c5b1a xdp: For Intel AF_XDP drivers add XDP frame_sz
Intel drivers implement native AF_XDP zerocopy in separate C-files,
that have their own invocation of bpf_prog_run_xdp(). The setup of
xdp_buff is also handled separately from the normal code path.

This patch updates XDP frame_sz for AF_XDP zerocopy drivers i40e, ice
and ixgbe, as the code changes needed are very similar.  Introduce a
helper function xsk_umem_xdp_frame_sz() for calculating frame size.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/158945347511.97035.8536753731329475655.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
d4ecdbf7aa ice: Add XDP frame size to driver
This driver uses different memory models depending on PAGE_SIZE at
compile time. For PAGE_SIZE 4K it uses page splitting, meaning that for
normal MTU the frame size is 2048 bytes (and headroom 192 bytes). For
larger MTUs the driver still uses page splitting, by allocating
order-1 pages (8192 bytes) for RX frames. For PAGE_SIZE larger than
4K, the driver instead advances its rx_buffer->page_offset with the
frame size "truesize".

For XDP frame size calculations, this means that in PAGE_SIZE larger
than 4K mode the frame_sz changes on a per packet basis. For the page
split 4K PAGE_SIZE mode, xdp.frame_sz is more constant and can be
updated once outside the main NAPI loop.
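
The two memory models can be contrasted with a small sketch. This is an illustration under stated assumptions, not the driver's code: the helper name and parameters are hypothetical, the page size is passed at runtime instead of being a compile-time constant, and a 64-byte alignment is assumed for the per-packet truesize.

```c
#include <stddef.h>

/* Round x up to a multiple of 64 bytes (assumed alignment). */
#define EXAMPLE_ALIGN64(x) (((size_t)(x) + 63) & ~(size_t)63)

/*
 * Frame size to report in xdp.frame_sz:
 *  - 4K pages: page-split model, each frame owns half of the RX page
 *    (rx_pg_size is 4096 for normal MTU, 8192 for order-1 pages), so the
 *    value is constant for the ring.
 *  - larger pages: per-packet truesize made of headroom + data, plus
 *    tailroom, so the value changes per packet.
 */
static inline size_t example_rx_frame_truesize(size_t page_size, size_t rx_pg_size,
					       size_t headroom, size_t pkt_len,
					       size_t tailroom)
{
	if (page_size <= 4096)
		return rx_pg_size / 2;
	return EXAMPLE_ALIGN64(headroom + pkt_len) + EXAMPLE_ALIGN64(tailroom);
}
```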

The default setting in the driver uses build_skb(), which provides
the necessary headroom and tailroom for XDP-redirect in RX-frame
(in both modes).

There is one complication, which is legacy-rx mode (configurable via
ethtool priv-flags). There is zero headroom in this mode, although
headroom is a requirement for XDP-redirect to work. The conversion to
xdp_frame (convert_to_xdp_frame) will detect this insufficient space,
and the xdp_do_redirect() call will fail. This is deemed acceptable, as
it allows other XDP actions to still work in legacy-mode. In
legacy-mode + larger PAGE_SIZE due to lacking tailroom, we also
accept that xdp_adjust_tail shrink doesn't work.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Link: https://lore.kernel.org/bpf/158945347002.97035.328088795813704587.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
24104024ce i40e: Add XDP frame size to driver
This driver uses different memory models depending on PAGE_SIZE at
compile time. For PAGE_SIZE 4K it uses page splitting, meaning that for
normal MTU the frame size is 2048 bytes (and headroom 192 bytes). For
larger MTUs the driver still uses page splitting, by allocating
order-1 pages (8192 bytes) for RX frames. For PAGE_SIZE larger than
4K, the driver instead advances its rx_buffer->page_offset with the
frame size "truesize".

For XDP frame size calculations, this means that in PAGE_SIZE larger
than 4K mode the frame_sz changes on a per packet basis. For the page
split 4K PAGE_SIZE mode, xdp.frame_sz is more constant and can be
updated once outside the main NAPI loop.

The default setting in the driver uses build_skb(), which provides
the necessary headroom and tailroom for XDP-redirect in RX-frame
(in both modes).

There is one complication, which is legacy-rx mode (configurable via
ethtool priv-flags). There is zero headroom in this mode, although
headroom is a requirement for XDP-redirect to work. The conversion to
xdp_frame (convert_to_xdp_frame) will detect this insufficient space,
and the xdp_do_redirect() call will fail. This is deemed acceptable, as
it allows other XDP actions to still work in legacy-mode. In
legacy-mode + larger PAGE_SIZE due to lacking tailroom, we also
accept that xdp_adjust_tail shrink doesn't work.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Link: https://lore.kernel.org/bpf/158945346494.97035.12809400414566061815.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
81f3c6283c ixgbevf: Add XDP frame size to VF driver
This patch mirrors the changes to ixgbe in previous patch.

This VF driver doesn't support XDP_REDIRECT, but correct tailroom is
still necessary for BPF-helper xdp_adjust_tail.  In legacy-mode +
larger PAGE_SIZE, due to lacking tailroom, we accept that
xdp_adjust_tail shrink doesn't work.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Link: https://lore.kernel.org/bpf/158945345984.97035.13518286183248025173.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
cf02512899 ixgbe: Add XDP frame size to driver
This driver uses different memory models depending on PAGE_SIZE at
compile time. For PAGE_SIZE 4K it uses page splitting, meaning that for
normal MTU the frame size is 2048 bytes (and headroom 192 bytes). For
larger MTUs the driver still uses page splitting, by allocating
order-1 pages (8192 bytes) for RX frames. For PAGE_SIZE larger than
4K, the driver instead advances its rx_buffer->page_offset with the
frame size "truesize".

For XDP frame size calculations, this means that in PAGE_SIZE larger
than 4K mode the frame_sz changes on a per packet basis. For the page
split 4K PAGE_SIZE mode, xdp.frame_sz is more constant and can be
updated once outside the main NAPI loop.

The default setting in the driver uses build_skb(), which provides
the necessary headroom and tailroom for XDP-redirect in RX-frame
(in both modes).

There is one complication, which is legacy-rx mode (configurable via
ethtool priv-flags). There is zero headroom in this mode, although
headroom is a requirement for XDP-redirect to work. The conversion to
xdp_frame (convert_to_xdp_frame) will detect this insufficient space,
and the xdp_do_redirect() call will fail. This is deemed acceptable, as
it allows other XDP actions to still work in legacy-mode. In
legacy-mode + larger PAGE_SIZE due to lacking tailroom, we also
accept that xdp_adjust_tail shrink doesn't work.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Link: https://lore.kernel.org/bpf/158945345455.97035.14334355929030628741.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
88eb0ee17b ixgbe: Fix XDP redirect on archs with PAGE_SIZE above 4K
The ixgbe driver has another memory model when compiled on archs with
PAGE_SIZE above 4096 bytes. In this mode it doesn't split the page in
two halves, but instead increments rx_buffer->page_offset by truesize of
packet (which includes headroom and tailroom for skb_shared_info).

This is done correctly in ixgbe_build_skb(), but in ixgbe_rx_buffer_flip
which is currently only called on XDP_TX and XDP_REDIRECT, it forgets
to add the tailroom for skb_shared_info. This breaks XDP_REDIRECT, for
veth and cpumap.  Fix by adding size of skb_shared_info tailroom.

Maintainers notice: This fix has been queued to Jeff.

Fixes: 6453073987 ("ixgbe: add initial support for xdp redirect")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Link: https://lore.kernel.org/bpf/158945344946.97035.17031588499266605743.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
fa6540b8ef nfp: Add XDP frame size to netronome driver
The netronome nfp driver uses PAGE_SIZE when xdp_prog is set, but
xdp.data_hard_start begins at offset NFP_NET_RX_BUF_HEADROOM.
Thus, adjust for this when setting xdp.frame_sz, as it counts
from data_hard_start.

When doing XDP_TX this driver is smart and instead of a full DMA-map
does a DMA-sync with the packet length. As xdp_adjust_tail can now
grow packet length, add checks to make sure that grow size is within
the DMA-mapped size.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/158945342911.97035.11214251236208648808.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
c8145b263d net: thunderx: Add XDP frame size
To help reviewers these are the defines related to RCV_FRAG_LEN

 #define DMA_BUFFER_LEN	1536 /* In multiples of 128bytes */
 #define RCV_FRAG_LEN	(SKB_DATA_ALIGN(DMA_BUFFER_LEN + NET_SKB_PAD) + \
			 SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: Robert Richter <rrichter@marvell.com>
Link: https://lore.kernel.org/bpf/158945342402.97035.12649844447148990032.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
d201ea9ebc mlx4: Add XDP frame size and adjust max XDP MTU
The mlx4 driver's size of memory backing the RX packet is stored in
frag_stride. For XDP mode this will be PAGE_SIZE (normally 4096).
For normal mode frag_stride is 2048.

Also adjust MLX4_EN_MAX_XDP_MTU to take tailroom into account.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Link: https://lore.kernel.org/bpf/158945341893.97035.2688142527052329942.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
08fc1cfd2d ena: Add XDP frame size to amazon NIC driver
Frame size ENA_PAGE_SIZE is limited to 16K on systems with larger
PAGE_SIZE than 16K. Change ENA_XDP_MAX_MTU to also take into account
the reserved tailroom.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Sameeh Jubran <sameehj@amazon.com>
Cc: Arthur Kiyanovski <akiyano@amazon.com>
Link: https://lore.kernel.org/bpf/158945341384.97035.907403694833419456.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
c88c35181d net: ethernet: ti: Add XDP frame size to driver cpsw
The driver code cpsw.c and cpsw_new.c both use page_pool
with default order-0 pages for their RX-pages.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/bpf/158945340875.97035.752144756428532878.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
bc1c5745d7 qlogic/qede: Add XDP frame size to driver
The qede driver uses a full page when XDP is enabled. The driver's value
in rx_buf_seg_size (struct qede_rx_queue) will be PAGE_SIZE when an
XDP bpf_prog is attached.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Ariel Elior <aelior@marvell.com>
Cc: GR-everest-linux-l2@marvell.com
Link: https://lore.kernel.org/bpf/158945340366.97035.7764939691580349618.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jesper Dangaard Brouer
4a9b052a59 dpaa2-eth: Add XDP frame size
The dpaa2-eth driver reserves some headroom used for the hardware and
software annotation area in RX/TX buffers. Thus, xdp.data_hard_start
doesn't start at a page boundary.

When XDP is configured the area reserved via dpaa2_fd_get_offset(fd) is
448 bytes of which XDP has reserved 256 bytes. As frame_sz is
calculated as an offset from xdp_buff.data_hard_start, an adjustment
from the full PAGE_SIZE == DPAA2_ETH_RX_BUF_RAW_SIZE is needed.
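
The adjustment can be worked through with the numbers above. This is a sketch with hypothetical names, assuming a 4096-byte raw buffer: data_hard_start sits (448 - 256) = 192 bytes into the buffer, so frame_sz is the remainder of the buffer from that point.

```c
#include <assert.h>
#include <stddef.h>

/*
 * frame_sz counts from data_hard_start, which lies xdp_headroom bytes
 * before the packet data. The packet data starts fd_offset bytes into
 * the raw buffer, so the bytes in front of data_hard_start are excluded.
 */
static inline size_t example_dpaa2_frame_sz(size_t raw_buf_size, size_t fd_offset,
					    size_t xdp_headroom)
{
	assert(fd_offset >= xdp_headroom && fd_offset <= raw_buf_size);
	return raw_buf_size - (fd_offset - xdp_headroom);
}
```

With a 4096-byte buffer, an offset of 448 and an XDP headroom of 256, this yields 4096 - 192 = 3904 bytes.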

When doing XDP_REDIRECT, the driver doesn't need this reserved headroom
any longer and allows xdp_do_redirect() to use it. This is an advantage
for the driver's own ndo_xdp_xmit, as it uses part of this headroom for
itself.  The patch also adjusts frame_sz in this case.

The driver cannot support XDP data_meta, because it uses the headroom
just before xdp.data for struct dpaa2_eth_swa (DPAA2_ETH_SWA_SIZE=64),
when transmitting the packet. When transmitting an xdp_frame in
dpaa2_eth_xdp_xmit_frame (called via ndo_xdp_xmit), it uses this area
to store a pointer to xdp_frame and dma_size, which is used in TX
completion (free_tx_fd) to return frame via xdp_return_frame().

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Link: https://lore.kernel.org/bpf/158945339348.97035.8562488847066908856.stgit@firesoul
2020-05-14 21:21:54 -07:00
Ilias Apalodimas
495de55f70 net: netsec: Add support for XDP frame size
This driver takes advantage of page_pool PP_FLAG_DMA_SYNC_DEV that
can help reduce the number of cache-lines that need to be flushed
when doing DMA sync for_device. Because xdp_adjust_tail can grow the
area accessible to the CPU (which it can possibly write into), the max
sync length *after* bpf_prog_run_xdp() needs to be taken into account.

For XDP_TX action the driver is smart and does DMA-sync. When growing
tail this is still safe, because page_pool has DMA-mapped the entire
page size.
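
A minimal sketch of the "max sync length after the XDP program ran" idea. The struct is a simplified, hypothetical stand-in for xdp_buff, and lengths are measured from data_hard_start for simplicity.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for struct xdp_buff. */
struct example_xdp_buff {
	unsigned char *data_hard_start;
	unsigned char *data;
	unsigned char *data_end;
};

/*
 * rx_len is the length (from data_hard_start) that the device wrote.
 * After the XDP program ran, the tail may have grown, so the CPU may have
 * touched more bytes than that. Sync the larger of the two.
 */
static inline size_t example_sync_len_after_xdp(const struct example_xdp_buff *xdp,
						size_t rx_len)
{
	size_t touched = (size_t)(xdp->data_end - xdp->data_hard_start);

	return touched > rx_len ? touched : rx_len;
}
```

A shrunk tail still syncs the original length, while a grown tail extends the sync range.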

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/bpf/158945336295.97035.15034759661036971024.stgit@firesoul
2020-05-14 21:21:54 -07:00
Jesper Dangaard Brouer
494f44d54e mvneta: Add XDP frame size to driver
This marvell driver mvneta uses PAGE_SIZE frames, which makes it
really easy to convert.  Driver updates rxq and now frame_sz
once per NAPI call.

This driver takes advantage of page_pool PP_FLAG_DMA_SYNC_DEV that
can help reduce the number of cache-lines that need to be flushed
when doing DMA sync for_device. Because xdp_adjust_tail can grow the
area accessible to the CPU (which it can possibly write into), the max
sync length *after* bpf_prog_run_xdp() needs to be taken into account.

For XDP_TX action the driver is smart and does DMA-sync. When growing
tail this is still safe, because page_pool has DMA-mapped the entire
page size.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: thomas.petazzoni@bootlin.com
Link: https://lore.kernel.org/bpf/158945335786.97035.12714388304493736747.stgit@firesoul
2020-05-14 21:21:54 -07:00
Jesper Dangaard Brouer
983e434518 sfc: Add XDP frame size
This driver uses RX page-split when possible. It was recently fixed
in commit 86e85bf698 ("sfc: fix XDP-redirect in this driver") to
add needed tailroom for XDP-redirect.

After the fix efx->rx_page_buf_step is the frame size, with enough
head and tail-room for XDP-redirect.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158945335278.97035.14611425333184621652.stgit@firesoul
2020-05-14 21:21:54 -07:00
Jesper Dangaard Brouer
63fe91ab3d bnxt: Add XDP frame size to driver
This driver uses full PAGE_SIZE pages when XDP is enabled.

In the XDP case the driver uses __bnxt_alloc_rx_page, which does a full
page DMA-map. Thus, xdp_adjust_tail grow is DMA compliant for the XDP_TX
action that does DMA-sync.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Link: https://lore.kernel.org/bpf/158945334769.97035.13437970179897613984.stgit@firesoul
2020-05-14 21:21:54 -07:00
Heiner Kallweit
9b65d2ffe8 r8169: don't include linux/moduleparam.h
93882c6f21 ("r8169: switch from netif_xxx message functions to
netdev_xxx") removed the last module parameter from the driver,
therefore there's no need any longer to include linux/moduleparam.h.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 18:03:01 -07:00
Heiner Kallweit
aa443b3f8f r8169: remove not needed checks in rtl8169_set_eee
After 9de5d235b6 ("net: phy: fix aneg restart in phy_ethtool_set_eee")
we don't need the check for aneg being enabled any longer, and as
discussed with Russell configuring the EEE advertisement should be
supported even if we're in a half-duplex mode currently.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 18:03:01 -07:00
Luo bin
bcab67822d hinic: add set_ringparam ethtool_ops support
Add support for changing the TX/RX queue depth with ethtool -G.

Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 17:38:19 -07:00
Ivan Khoronzhuk
8127224c27 ethernet: ti: am65-cpsw-qos: add TAPRIO offload support
AM65 CPSW h/w supports Enhanced Scheduled Traffic (EST – defined
in P802.1Qbv/D2.2 that later got included in IEEE 802.1Q-2018)
configuration. EST allows express queue traffic to be scheduled
(placed) on the wire at specific repeatable time intervals. In
the Linux kernel, EST configuration is done through the tc command,
and the taprio scheduler in the net core implements a software-only
scheduler (SCH_TAPRIO). If the NIC is capable of EST configuration,
the user indicates "flag 2" in the command, which is then parsed by
the taprio scheduler in the net core and indicates that the command
is to be offloaded to h/w. taprio then offloads the command to the
driver by calling the ndo_setup_tc() ndo op. This patch implements
ndo_setup_tc() to offload EST configuration to CPSW h/w.

Currently the driver supports only the SetGateStates operation. EST
operates on a repeating time interval generated by the CPTS EST
function generator. Each Ethernet port has a global EST fetch
RAM that can be configured as 2 buffers, each of 64 locations,
or one large buffer of 128 locations. In the 2 buffer configuration,
a ping pong mechanism is used to hold the active schedule (oper)
in one buffer and the new (admin) command in the other. Each 22-bit
fetch command consists of a 14-bit fetch count (14 MSBs) and an
8-bit priority fetch allow (8 LSBs) that will be applied for the
fetch count time in wireside clocks. The driver processes each
sched-entry in the offload command and updates the fetch RAM.
The driver configures the duration in the sched-entry into the fetch
count and the gate mask into the priority fetch bits of the RAM. It
then configures the CPTS EST function generator to activate the
schedule. Currently the driver supports only the 2 buffer
configuration, which means it supports a max cycle time of ~8 msec.
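
The 22-bit command layout above can be sketched as a simple bit-packing helper. The example_* names are hypothetical. The cycle-time helper additionally assumes that one fetch count lasts 8 ns on a 1000 Mb link; that figure is an assumption chosen because it reproduces the ~8 msec limit stated above, not something the text spells out.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_FETCH_ALLOW_BITS 8U      /* 8 LSBs: priority fetch allow */
#define EXAMPLE_FETCH_COUNT_MAX  0x3FFFU /* 14 MSBs: fetch count */

/* Pack one 22-bit fetch command: count in the 14 MSBs, gate mask in the 8 LSBs. */
static inline uint32_t example_est_fetch_cmd(uint32_t fetch_count, uint8_t gate_mask)
{
	assert(fetch_count <= EXAMPLE_FETCH_COUNT_MAX);
	return (fetch_count << EXAMPLE_FETCH_ALLOW_BITS) | gate_mask;
}

/*
 * Longest schedule, in ns, that fits in one buffer of 'entries' commands
 * when each fetch count lasts ns_per_count (assumed: 8 ns at 1000 Mb).
 */
static inline uint64_t example_est_max_cycle_ns(uint32_t entries, uint32_t ns_per_count)
{
	return (uint64_t)entries * EXAMPLE_FETCH_COUNT_MAX * ns_per_count;
}
```

Under that assumption, 64 entries give 64 * 16383 * 8 ns = 8388096 ns, in line with the ~8 msec limit for the 2 buffer configuration.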

CPSW supports a configurable number of priority queues (up to 8)
and needs to be switched to this mode from the default round
robin mode before EST can be offloaded. The user configures
these through ethtool commands (-L for changing the number of
queues and --set-priv-flags to disable round robin mode).
The driver doesn't enable EST if the pf_p0_rx_ptype_rrobin private
flag is set. The flag is common for all ports, and so can't be just
overridden by taprio configuration w/o user involvement.
The command fails if pf_p0_rx_ptype_rrobin is already set in the
driver.

Scheds (commands) configuration depends on interface speed, so the
driver translates the duration to the fetch count based on
link speed. Each schedule can be constructed with several
command entries in fetch RAM depending on the interval. For example,
if each sched has a timer interval < ~130us on a 1000 Mb link, then
each sched consumes one command and has a 1:1 mapping. When the
Ethernet link goes down, the driver purges the configuration if the
link is down for more than 1 second.

The patch updates the timer and scheds memory only when it's really
needed, and skips cases that would require the user to stop the timer,
by configuring only the scheds memory.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 17:33:30 -07:00
Ivan Khoronzhuk
ec008fa2a9 ethernet: ti: am65-cpts: add routines to support taprio offload
TAPRIO/EST offload support in CPSW2G requires the EST scheduler
function to be enabled in CPTS. So this patch adds a function to
set the cycle time for the EST scheduler.  It also adds a function for
getting the time in ns of the PHC clock for taprio qdisc configuration,
mostly to verify if a timer update is needed or to get the actual
state of the oper/admin schedule.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 17:33:29 -07:00
Igor Russkikh
ebf64bf4df net: qed: introduce critical hardware error handler
MCP may signal the driver about a generic critical failure.
The driver has to collect mdump information (get_retain),
push it to the logs and trigger a generic notification on the
"hardware attention" event.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
3e99c21110 net: qed: introduce critical fan failure handler
Fan failure is reported by firmware; the driver reacts to this error via
the newly introduced notification path. It will collect a dump and shut
down the device to prevent physical breakage.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Denis Bolotin
5144e9f439 net: qede: Implement ndo_tx_timeout
Upon TX timeout detection we disable the carrier and print TX queue
info. We then raise a hw error condition and trigger the
service task to handle this.

This handler will capture extra debug info and then optionally
trigger the recovery procedure to try to restore function.

Signed-off-by: Denis Bolotin <dbolotin@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
7d9acd87bd net: qede: optional hw recovery procedure
Driver has an ability to initiate a recovery process as a reaction to
detected errors. But the codepath (recovery_process) was disabled and
never active.

Here we add an ethtool private flag to allow the user to activate the
recovery procedure.

We still do not enable this by default though, since in some configurations
this is not desirable. E.g. this may impact other PFs/VFs.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
936c7ba4dd net: qed: attention clearing properties
On different hardware events we have to respond differently;
on some hardware indications the hw attention (error condition)
should be cleared by the driver to continue normal functioning.

Here we introduce attention clear flags, and put them on some
important events (in aeu_descs).

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
ca352f0075 net: qed: cleanup debug related declarations
Probably legacy code had double declarations of some fields.
Clean this up by removing the copy and fixing references.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
d8d6c5a7be net: qed: critical err reporting to management firmware
On various critical errors, notification handler should also report
the err information into the management firmware.

MFW can interact with server/motherboard backend agents - these are
used by server manufacturers to monitor server HW health.

Thus, it is important for the driver to report any faulty conditions.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
2ec276d5b2 net: qed: invoke err notify on critical areas
In a number of critical places not only debug trace should be printed,
but the appropriate hw error condition should be raised and error
handling/recovery should start.

Introduce our new qed_hw_err_notify invocation in these places to
record and indicate critical error conditions in hardware.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
a8736ea83b net: qede: add hw err scheduled handler
qede (ethernet level driver) registers a callback handler.
This handler maintains eth dev state flags/bits to track error processing.

It implements in place processing part for nonsleeping context (WARN_ON
trigger), and a deferred (delayed work) part which triggers recovery
process for recoverable errors.

In later patches this atomic handler will come with more meat.

We introduce err_flags on the ethdevice structure; it is used to record
error handling properties.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Igor Russkikh
d639836ab3 net: qed: adding hw_err states and handling
Here we introduce qed device error tracking flags and error types.

qed_hw_err_notify is an entry point to report errors.
It'll notify higher level drivers (qede/qedr/etc) to handle and recover
the error.

The list of possible errors comes from hardware interfaces, but could be
extended in future.

Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:25:46 -07:00
Huazhong Tan
5c6cfd309f net: hns3: remove unnecessary frag list checking in hns3_nic_net_xmit()
The skb_has_frag_list() in hns3_nic_net_xmit() is redundant, since
skb_walk_frags() includes this check implicitly.

Reported-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:18:10 -07:00
Huazhong Tan
bd13f7e129 net: hns3: remove some unused macros
There are some macros defined in hns3_enet.h, but not used
anywhere.

Reported-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:18:10 -07:00
Huazhong Tan
cb25a6072b net: hns3: modify an incorrect error log in hclge_mbx_handler()
When handling HCLGE_MBX_GET_LINK_STATUS, PF will return the link
status to the VF, so the error log of hclge_get_link_info() is
incorrect.

Reported-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:18:09 -07:00
Huazhong Tan
727f514bd6 net: hns3: remove a duplicated printing in hclge_configure()
Since hclge_get_cfg() already prints an error, hclge_configure()
should not print an error when the call to hclge_get_cfg() fails.

Reported-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:18:09 -07:00
Huazhong Tan
96b8e87838 net: hns3: modify some incorrect spelling
This patch modifies some incorrect spelling.

Reported-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:18:09 -07:00
Vinod Koul
fd4a517738 net: stmmac: fix num_por initialization
Driver missed initializing num_por which is one of the por values that
driver configures to hardware. In order to get these values, add a new
structure ethqos_emac_driver_data which holds por and num_por values
and populate that in driver probe.

Fixes: a7c30e62d4 ("net: stmmac: Add driver for Qualcomm ethqos")
Reported-by: Rahul Ankushrao Kawadgave <rahulak@qti.qualcomm.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 12:48:15 -07:00
Daniel González Cabanelas
5e3768a436 net: mvneta: speed down the PHY, if WoL used, to save energy
Some PHYs connected to this ethernet hardware support the WoL feature.
But when WoL is enabled and the machine is powered off, the PHY remains
waiting for a magic packet at max speed (i.e. 1Gbps), which is a waste of
energy.

Slow down the PHY speed before stopping the ethernet if WoL is enabled,
and save some energy while the machine is powered off or sleeping.

Tested using an Armada 370 based board (LS421DE) equipped with a Marvell
88E1518 PHY.

Signed-off-by: Daniel González Cabanelas <dgcbueu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 15:21:11 -07:00
Colin Ian King
6545be8280 sfc: fix dereference of table before it is null checked
Currently pointer table is being dereferenced on a null check of
table->must_restore_filters before it is being null checked, leading
to a potential null pointer dereference issue.  Fix this by null
checking table before dereferencing it when checking for a null
table->must_restore_filters.
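
The ordering problem can be illustrated with a small model. This is a sketch
of the general pattern only, not the sfc code: FilterTable and both helper
functions are hypothetical stand-ins, and None plays the role of a NULL
pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterTable:
    # Hypothetical stand-in for the driver's filter table structure.
    must_restore_filters: bool = False


def restore_needed_unsafe(table: Optional[FilterTable]) -> bool:
    # Reads a field before checking the reference itself; fails on None.
    if not table.must_restore_filters:
        return False
    if table is None:
        return False
    return True


def restore_needed(table: Optional[FilterTable]) -> bool:
    # Check the reference first, then the field.
    if table is None:
        return False
    return table.must_restore_filters
```

With the checks in the second order, a missing table is reported as "nothing
to restore" instead of faulting.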

Addresses-Coverity: ("Dereference before null check")
Fixes: e4fe938cff ("sfc: move 'must restore' flags out of ef10-specific nic_data")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 15:20:00 -07:00
Florian Fainelli
99addbe31f net: broadcom: Select BROADCOM_PHY for BCMGENET
The GENET controller on the Raspberry Pi 4 (2711) is typically
interfaced with an external Broadcom PHY via an RGMII electrical
interface. To make sure that delays are properly configured at the PHY
side, ensure that the dedicated Broadcom PHY driver
(CONFIG_BROADCOM_PHY) is enabled.

Fixes: 402482a6a7 ("net: bcmgenet: Clear ID_MODE_DIS in EXT_RGMII_OOB_CTRL when not needed")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 12:49:07 -07:00
Martin Blumenstingl
9308c47640 net: stmmac: dwmac-meson8b: add support for the RX delay configuration
Configure the PRG_ETH0_ADJ_* bits to enable or disable the RX delay
based on the various RGMII PHY modes. For now the only supported RX
delay settings are:
- disabled, use for example for phy-mode "rgmii-id"
- 0ns - this is treated identical to "disabled", used for example on
  boards where the PHY provides 2ns TX delay and the PCB trace length
  already adds 2ns RX delay
- 2ns - for whenever the PHY cannot add the RX delay and the traces on
  the PCB don't add any RX delay

Disabling the RX delay (in case u-boot enables it, which is the case
for example on Meson8b Odroid-C1) simply means that PRG_ETH0_ADJ_ENABLE,
PRG_ETH0_ADJ_SETUP, PRG_ETH0_ADJ_DELAY and PRG_ETH0_ADJ_SKEW should be
disabled (just disabling PRG_ETH0_ADJ_ENABLE may be enough, since that
disables the whole re-timing logic - but I find it makes more sense to
clear the other bits as well since they depend on that setting).

u-boot on Odroid-C1 uses the following steps to enable a 2ns RX delay:
- enabling the timing adjustment clock
- enabling the timing adjustment logic by setting PRG_ETH0_ADJ_ENABLE
- setting the PRG_ETH0_ADJ_SETUP bit

The documentation for the PRG_ETH0_ADJ_DELAY and PRG_ETH0_ADJ_SKEW
registers indicates that we can even set different RX delays. However,
I could not find out how this works exactly, so for now we only support
a 2ns RX delay using the exact same way that Odroid-C1's u-boot does.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 12:23:14 -07:00
Martin Blumenstingl
a54dc4a490 net: stmmac: dwmac-meson8b: Make the clock enabling code re-usable
The timing adjustment clock will need similar logic as the RGMII clock:
It has to be enabled in the driver conditionally and when the driver is
unloaded it should be disabled again. Extract the existing code for the
RGMII clock into a new function so it can be re-used.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 12:23:14 -07:00
Martin Blumenstingl
e4227bff80 net: stmmac: dwmac-meson8b: Fetch the "timing-adjustment" clock
The PRG_ETHERNET registers have a built-in timing adjustment circuit
which can provide the RX delay in RGMII mode. This is driven by an
external (to this IP, but internal to the SoC) clock input. Fetch this
clock as optional (even though it's there on all supported SoCs) since
we just learned about it and existing .dtbs don't specify it.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 12:23:14 -07:00
Martin Blumenstingl
c92d1d2311 net: stmmac: dwmac-meson8b: Add the PRG_ETH0_ADJ_* bits
The PRG_ETH0_ADJ_* are used for applying the RGMII RX delay. The public
datasheets only have very limited description for these registers, but
Jianxin Pan provided more detailed documentation from an (unnamed)
Amlogic engineer. Add the PRG_ETH0_ADJ_* bits along with the improved
description.

Suggested-by: Jianxin Pan <jianxin.pan@amlogic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 12:23:14 -07:00
Martin Blumenstingl
889df20305 net: stmmac: dwmac-meson8b: Move the documentation for the TX delay
Move the documentation for the TX delay above the PRG_ETH0_TXDLY_MASK
definition. Future commits will add more registers also with
documentation above their register bit definitions. Move the existing
comment so it will be consistent with the upcoming changes.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 12:23:13 -07:00
Martin Blumenstingl
3649abe432 net: stmmac: dwmac-meson8b: use FIELD_PREP instead of open-coding it
Use FIELD_PREP() to shift a value to the correct offset based on a
bitmask instead of open-coding the logic.
No functional changes.
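
As an illustration of what "shift a value to the correct offset based on a
bitmask" means, here is a small model of the idea. It is not the kernel
macro itself: field_prep() is a hypothetical helper, EXAMPLE_MASK is an
arbitrary two-bit field chosen for the example, and the range check is done
at run time here.

```python
def field_prep(mask: int, value: int) -> int:
    """Place value into the bit field described by a contiguous mask."""
    if mask == 0:
        raise ValueError("mask must be non-zero")
    # The position of the lowest set bit of the mask is the shift amount.
    shift = (mask & -mask).bit_length() - 1
    shifted = value << shift
    if shifted & ~mask:
        raise ValueError("value does not fit in the field")
    return shifted


# Arbitrary example field occupying bits 6:5 (an assumption, for illustration).
EXAMPLE_MASK = 0b1100000

# Open-coded form: the shift (5) must be kept in sync with the mask by hand.
open_coded = (2 << 5) & EXAMPLE_MASK
derived = field_prep(EXAMPLE_MASK, 2)
```

Both forms give the same register value; deriving the shift from the mask
removes the separately maintained shift constant.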

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13 12:23:13 -07:00
Maor Gottlieb
9254f8ed15 net/mlx5: Add support in forward to namespace
Currently, fs_core supports rules that forward the traffic
to continue matching in the next priority. Now we add support
for forwarding the traffic to continue matching in the next namespace.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-05-13 18:56:31 +03:00
Maor Gottlieb
14c129e301 {IB/net}/mlx5: Simplify don't trap code
The fs_core already supports creation of rules with multiple
actions/destinations. Refactor fs_core to handle the case
when don't trap rule is created with destination. Adapt the
calling code in the driver.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-05-13 18:56:18 +03:00
Edward Cree
1b0cde4091 sfc: siena_check_caps() can be static
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:47:40 -07:00
Edward Cree
527c1e615b sfc: actually wire up siena_check_caps()
Assign it to siena_a0_nic_type.check_caps function pointer.

Fixes: be904b8552 ("sfc: make capability checking a nic_type function")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:47:40 -07:00
Clay McClure
92db978f0d net: ethernet: ti: Remove TI_CPTS_MOD workaround
My recent commit b6d49cab44 ("net: Make PTP-specific drivers depend on
PTP_1588_CLOCK") exposes a missing dependency in defconfigs that select
TI_CPTS without selecting PTP_1588_CLOCK, leading to linker errors of the
form:

drivers/net/ethernet/ti/cpsw.o: in function `cpsw_ndo_stop':
cpsw.c:(.text+0x680): undefined reference to `cpts_unregister'
 ...

That's because TI_CPTS_MOD (which is the symbol gating the _compilation_ of
cpts.c) now depends on PTP_1588_CLOCK, and so is not enabled in these
configurations, but TI_CPTS (which is the symbol gating _calls_ to the cpts
functions) _is_ enabled. So we end up compiling calls to functions that
don't exist, resulting in the linker errors.

This patch fixes build errors and restores previous behavior by:
 - ensuring PTP_1588_CLOCK=y in TI specific configs so that CPTS will be built
 - removing TI_CPTS_MOD and, instead, adding dependencies on CPTS in
   TI_CPSW/TI_KEYSTONE_NETCP/TI_CPSW_SWITCHDEV as below:

   config TI_CPSW_SWITCHDEV
   ...
    depends on TI_CPTS || !TI_CPTS

   which will ensure proper dependencies PTP_1588_CLOCK -> TI_CPTS ->
TI_CPSW/TI_KEYSTONE_NETCP/TI_CPSW_SWITCHDEV and build type selection.

Note. For NFS boot + CPTS all of above configs have to be built-in.
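
The effect of "depends on TI_CPTS || !TI_CPTS" follows from Kconfig's
tristate arithmetic, where n, m and y behave as 0, 1 and 2, "!" yields
2 minus the operand and "||" yields the maximum. A small model of that
evaluation (helper names are hypothetical):

```python
# Kconfig tristate values as integers.
N, M, Y = 0, 1, 2


def t_not(x: int) -> int:
    # Kconfig "!": 2 - x, so !m stays m.
    return Y - x


def t_or(a: int, b: int) -> int:
    # Kconfig "||": maximum of both operands.
    return max(a, b)


def upper_limit_for_dependent(ti_cpts: int) -> int:
    """Highest value a symbol with 'depends on X || !X' may take."""
    return t_or(ti_cpts, t_not(ti_cpts))
```

The expression never evaluates to n, so the dependent driver can always be
enabled; it is only capped at m when TI_CPTS itself is m, which prevents a
built-in driver from calling into a modular cpts.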

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dan Murphy <dmurphy@ti.com>
Cc: Tony Lindgren <tony@atomide.com>
Fixes: b6d49cab44 ("net: Make PTP-specific drivers depend on PTP_1588_CLOCK")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Clay McClure <clay@daemons.net>
[grygorii.strashko@ti.com: rewording, add deps cpsw/netcp from cpts, drop IS_REACHABLE]
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:33:27 -07:00
Shannon Nelson
f64e0c5698 ionic: add more ethtool stats
Add hardware port stats and a few more driver collected
statistics to the ethtool stats output.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
c06107cabe ionic: more ionic name tweaks
Fix up a few more local names that need an "ionic" prefix.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
36ac2c5092 ionic: ionic_intr_free parameter change
Change the ionic_intr_free parameter from struct ionic_lif to
struct ionic since that's what it actually cares about.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
5c78431125 ionic: reset device at probe
Once we're talking to the device, tell it to reset to
be sure we've got a fresh, clean environment.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
62ba8766f7 ionic: shorter dev cmd wait time
Shorten our msleep time while polling for the dev command
request to finish.  Yes, checkpatch.pl complains that the
msleep might actually go longer - that won't hurt, but we'll
take the shorter time if we can get it.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
cba155d591 ionic: add support for more xcvr types
Add a couple more SFP and QSFP transceiver types to our
ethtool get link ksettings.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
a836c35229 ionic: protect vf calls from fw reset
When going into a firmware upgrade cycle, we set the device as
not present to keep some user commands from trying to change
the driver while we're only half there.  Unfortunately, the
ndo_vf_* calls don't check netif_device_present() so we need
to add a check in the callbacks.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
c4e7a75a09 ionic: updates to ionic FW api description
Lots of comment cleanup for better documentation, a few new
fields added, and a few minor mistakes fixed up.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
5b3f3f2a71 ionic: support longer tx sg lists
The version 1 Tx queues can use longer SG lists than the
original version 0 queues, but we need to check to see if the
firmware supports the v1 Tx queues.  This implements the queue
type query for all queue types, and uses the information to
set up for using the longer Tx SG lists.

Because the Tx SG list can be longer, we need to limit the
max ring length to be sure we stay inside the boundaries of a
DMA allocation max size, so we lower the max Tx ring size.

The driver sets its highest known version in the Q_IDENTITY
command, and the FW returns the highest version that it knows,
bounded by the driver's version.  The negotiated version number
is later used in the Q_INIT commands.
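
The negotiation described above amounts to taking the lower of the two
versions. A minimal sketch, with a hypothetical function name and plain
integers standing in for the Q_IDENTITY exchange:

```python
def negotiate_queue_version(driver_max: int, fw_max: int) -> int:
    """Version both sides can use: the firmware's highest known version,
    bounded by the highest version the driver advertised."""
    return min(driver_max, fw_max)
```

A driver that knows v1 talking to firmware that only knows v0 ends up on
v0, and newer firmware never pushes the driver past what it advertised.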

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:19:30 -07:00
Shannon Nelson
ddc5911b9b ionic: call ionic_port_init after fw-upgrade
Since the fw has been re-inited, we need to refresh the port
information dma address so we can see fresh port information.
Let's call ionic_port_init again, and tweak it to allow for
a call to simply refresh the existing dma address.

Fixes: c672412f61 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:12:34 -07:00
Shannon Nelson
f20a4d4041 ionic: leave netdev mac alone after fw-upgrade
When running in a bond setup, or some other potential
configurations, the netdev mac may have been changed from
the default device mac.  Since the userland doesn't know
about the changes going on under the covers in a fw-upgrade
it doesn't know to re-push the mac filter.  The driver
needs to leave the netdev mac filter alone when rebuilding
after the fw-upgrade.

Fixes: c672412f61 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:12:34 -07:00
Edward Cree
9b46132cff sfc: make firmware-variant printing a nic_type function
Instead of having efx_mcdi_print_fwver() look at efx_nic_rev and
 conditionally poke around inside ef10-specific nic_data, add a new
 efx->type->print_additional_fwver() method to do this work.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11 13:31:49 -07:00
Edward Cree
ed02112cff sfc: make filter table probe caller responsible for adding VLANs
By making the caller of efx_mcdi_filter_table_probe() loop over the
 vlan_list calling efx_mcdi_filter_add_vlan(), instead of doing it in
 efx_mcdi_filter_table_probe(), the latter avoids looking in ef10-
 specific nic_data.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11 13:31:49 -07:00
Edward Cree
dbf2c66906 sfc: move rx_rss_context_exclusive into struct efx_mcdi_filter_table
It's both set and used solely by mcdi_filters.c, so there's no reason
 for it to be in ef10-specific nic_data.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11 13:31:49 -07:00
Edward Cree
fd14e5fd13 sfc: rework handling of (firmware) multicast chaining state
Store the mc_chaining bit in struct efx_mcdi_filter_table, so that common
 code in mcdi_filters.c doesn't need to get it from ef10-specific nic_data.
Also, probe the firmware workaround just before the call to
 efx_mcdi_filter_table_probe(), rather than in a random other part of the
 driver bringup, to ensure that (a) it gets probed in time and (b) it gets
 reprobed as necessary on resets, no matter how the surrounding code gets
 reorganised and reordered.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11 13:31:49 -07:00
Edward Cree
e4fe938cff sfc: move 'must restore' flags out of ef10-specific nic_data
Common code in mcdi_filters.c uses these flags, so by moving them to
 either struct efx_nic (in the case of must_realloc_vis) or struct
 efx_mcdi_filter_table (for must_restore_rss_contexts and
 must_restore_filters), decouple this code from ef10's nic_data.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11 13:31:49 -07:00