linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-10 14:11:52 +00:00

Author	SHA1	Message	Date
Jakub Kicinski	93089de91e	MAINTAINERS: altx: move Jay Cliburn to CREDITS Jay was not active in recent years and does not have plans to return to work on ATLX drivers. Subsystem ATLX ETHERNET DRIVERS Changes 20 / 116 (17%) Last activity: 2020-02-24 Jay Cliburn <jcliburn@gmail.com>: Chris Snook <chris.snook@gmail.com>: Tags `ea97374214` 2020-02-24 00:00:00 1 Top reviewers: [4]: andrew@lunn.ch [2]: kuba@kernel.org [2]: o.rempel@pengutronix.de INACTIVE MAINTAINER Jay Cliburn <jcliburn@gmail.com> Acked-by: Chris Snook <chris.snook@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-14 10:53:48 -08:00
Eric Dumazet	3226b158e6	net: avoid 32 x truesize under-estimation for tiny skbs Both virtio net and napi_get_frags() allocate skbs with a very small skb->head While using page fragments instead of a kmalloc backed skb->head might give a small performance improvement in some cases, there is a huge risk of under estimating memory usage. For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations per page (order-3 page in x86), or even 64 on PowerPC We have been tracking OOM issues on GKE hosts hitting tcp_mem limits but consuming far more memory for TCP buffers than instructed in tcp_mem[2] Even if we force napi_alloc_skb() to only use order-0 pages, the issue would still be there on arches with PAGE_SIZE >= 32768 This patch makes sure that small skb head are kmalloc backed, so that other objects in the slab page can be reused instead of being held as long as skbs are sitting in socket queues. Note that we might in the future use the sk_buff napi cache, instead of going through a more expensive __alloc_skb() Another idea would be to use separate page sizes depending on the allocated length (to never have more than 4 frags per page) I would like to thank Greg Thelen for his precious help on this matter, analysing crash dumps is always a time consuming task. Fixes: `fd11a83dd3` ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Greg Thelen <gthelen@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20210113161819.1155526-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-14 10:51:52 -08:00
Geert Uytterhoeven	7da17624e7	nt: usb: USB_RTL8153_ECM should not default to y In general, device drivers should not be enabled by default. Fixes: `657bc1d10b` ("r8153_ecm: avoid to be prior to r8152 driver") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20210113144309.1384615-1-geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-14 10:51:06 -08:00
Yannick Vignon	fe28c53ed7	net: stmmac: fix taprio configuration when base_time is in the past The Synopsys TSN MAC supports Qbv base times in the past, but only up to a certain limit. As a result, a taprio qdisc configuration with a small base time (for example when treating the base time as a simple phase offset) is not applied by the hardware and silently ignored. This was observed on an NXP i.MX8MPlus device, but likely affects all TSN-variants of the MAC. Fix the issue by making sure the base time is in the future, pushing it by an integer amount of cycle times if needed. (a similar check is already done in several other taprio implementations, see for example drivers/net/ethernet/intel/igc/igc_tsn.c#L116 or drivers/net/dsa/sja1105/sja1105_ptp.h#L39). Fixes: `b60189e039` ("net: stmmac: Integrate EST with TAPRIO scheduler API") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Link: https://lore.kernel.org/r/20210113131557.24651-2-yannick.vignon@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-14 10:49:42 -08:00
Yannick Vignon	b76889ff51	net: stmmac: fix taprio schedule configuration When configuring a 802.1Qbv schedule through the tc taprio qdisc on an NXP i.MX8MPlus device, the effective cycle time differed from the requested one by N*96ns, with N number of entries in the Qbv Gate Control List. This is because the driver was adding a 96ns margin to each interval of the GCL, apparently to account for the IPG. The problem was observed on NXP i.MX8MPlus devices but likely affected all devices relying on the same configuration callback (dwmac 4.00, 4.10, 5.10 variants). Fix the issue by removing the margins, and simply setup the MAC with the provided cycle time value. This is the behavior expected by the user-space API, as altering the Qbv schedule timings would break standards conformance. This is also the behavior of several other Ethernet MAC implementations supporting taprio, including the dwxgmac variant of stmmac. Fixes: `504723af0d` ("net: stmmac: Add basic EST support for GMAC5+") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Link: https://lore.kernel.org/r/20210113131557.24651-1-yannick.vignon@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-14 10:49:42 -08:00
Mauro Carvalho Chehab	2576477929	net: tip: fix a couple kernel-doc markups A function has a different name between their prototype and its kernel-doc markup: ../net/tipc/link.c:2551: warning: expecting prototype for link_reset_stats(). Prototype was for tipc_link_reset_stats() instead ../net/tipc/node.c:1678: warning: expecting prototype for is the general link level function for message sending(). Prototype was for tipc_node_xmit() instead Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-14 10:30:24 -08:00
Jakub Kicinski	47e4bb147a	net: sit: unregister_netdevice on newlink's error path We need to unregister the netdevice if config failed. .ndo_uninit takes care of most of the heavy lifting. This was uncovered by recent commit `c269a24ce0` ("net: make free_netdev() more lenient with unregistering devices"). Previously the partially-initialized device would be left in the system. Reported-and-tested-by: syzbot+2393580080a2da190f04@syzkaller.appspotmail.com Fixes: `e2f1f072db` ("sit: allow to configure 6rd tunnels via netlink") Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20210114012947.2515313-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-14 10:26:46 -08:00
David Wu	5b55299eed	net: stmmac: Fixed mtu channged by cache aligned Since the original mtu is not used when the mtu is updated, the mtu is aligned with cache, this will get an incorrect. For example, if you want to configure the mtu to be 1500, but mtu 1536 is configured in fact. Fixed: `eaf4fac478` ("net: stmmac: Do not accept invalid MTU values") Signed-off-by: David Wu <david.wu@rock-chips.com> Link: https://lore.kernel.org/r/20210113034109.27865-1-david.wu@rock-chips.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-13 20:09:26 -08:00
Ayush Sawal	8ad2a970d2	cxgb4/chtls: Fix tid stuck due to wrong update of qid TID stuck is seen when there is a race in CPL_PASS_ACCEPT_RPL/CPL_ABORT_REQ and abort is arriving before the accept reply, which sets the queue number. In this case HW ends up sending CPL_ABORT_RPL_RSS to an incorrect ingress queue. V1->V2: - Removed the unused variable len in chtls_set_quiesce_ctrl(). V2->V3: - As kfree_skb() has a check for null skb, so removed this check before calling kfree_skb() in func chtls_send_reset(). Fixes: `cc35c88ae4` ("crypto : chtls - CPL handler definition") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Link: https://lore.kernel.org/r/20210112053600.24590-1-ayush.sawal@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-13 19:40:05 -08:00
Cristian Dumitrescu	7128c834d3	i40e: fix potential NULL pointer dereferencing Currently, the function i40e_construct_skb_zc only frees the input xdp buffer when the output skb is successfully built. On error, the function i40e_clean_rx_irq_zc does not commit anything for the current packet descriptor and simply exits the packet descriptor processing loop, with the plan to restart the processing of this descriptor on the next invocation. Therefore, on error the ring next-to-clean pointer should not advance, the xdp i.e. bi buffer should not be freed and the current buffer info should not be invalidated by setting bi to NULL. Therefore, the bi should only be set to NULL when the function i40e_construct_skb_zc is successful, otherwise a NULL bi will be dereferenced when the work for the current descriptor is eventually restarted. Fixes: `3b4f0b66c2` ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/r/20210111181138.49757-1-cristian.dumitrescu@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-13 19:33:28 -08:00
Jakub Kicinski	7b25339f4e	linux-can-fixes-for-5.11-20210113 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAl//Y3QTHG1rbEBwZW5n dXRyb25peC5kZQAKCRCpyVqK+u3vqR52CACSt3kU1AySJTY9QhNqnxpxyOcrte50 6eFa2rNGpvZzSACh/OUb85jR0dIwAakvLmAkLmwn2zn4HQH/b9UapBPBf8HbVch5 Mm748XF6LzWNjRgJZEVkn1SV7MiKlvL3Q05HrjH4SsugIjT+LX257JoUrDRbdlgV cGI4urUAtm9ap34QH7ngijnV8DTGAbSycmtptDNxRkKUG1WWg9F28p3mbrphVl5q 13biWoKequCbF30LRf5X2g8lbVny64SnVyFsiqX1fiioXY6CuogBWXYpmzT46X+H zi0WoKf36+fh6rSdSTNKza46Ja5ndI9Tw2wZwMGSh2PQnRfFI4VcVwx3 =wBY+ -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-5.11-20210113' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-01-13 The first patch is by Oliver Hartkopp for the CAn ISO-TP protocol and fixes a kernel information leak to userspace. The last patch is by Qinglang Miao for the mcp251xfd driver and fixes a NULL pointer check to work on the correct variable. * tag 'linux-can-fixes-for-5.11-20210113' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check can: isotp: isotp_getname(): fix kernel information leak ==================== Link: https://lore.kernel.org/r/20210113212158.925513-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-13 19:00:50 -08:00
Seb Laveze	1f02efd1bb	net: stmmac: use __napi_schedule() for PREEMPT_RT Use of __napi_schedule_irqoff() is not safe with PREEMPT_RT in which hard interrupts are not disabled while running the threaded interrupt. Using __napi_schedule() works for both PREEMPT_RT and mainline Linux, just at the cost of an additional check if interrupts are disabled for mainline (since they are already disabled). Similar to the fix done for enetc commit `215602a8d2` ("enetc: use napi_schedule to be compatible with PREEMPT_RT") Signed-off-by: Seb Laveze <sebastien.laveze@nxp.com> Link: https://lore.kernel.org/r/20210112140121.1487619-1-sebastien.laveze@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-13 15:50:46 -08:00
Qinglang Miao	ca4c6ebeeb	can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check If alloc_canfd_skb() returns NULL, 'cfg' is an uninitialized variable, so we should check 'skb' rather than 'cfd' after calling alloc_canfd_skb(priv->ndev, &cfd). Fixes: `55e5b97f00` ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20210113073100.79552-1-miaoqinglang@huawei.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-13 22:16:16 +01:00
Oliver Hartkopp	b42b3a2744	can: isotp: isotp_getname(): fix kernel information leak Initialize the sockaddr_can structure to prevent a data leak to user space. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Reported-by: syzbot+057884e2f453e8afebc8@syzkaller.appspotmail.com Fixes: `e057dd3fc2` ("can: add ISO 15765-2:2016 transport protocol") Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20210112091643.11789-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-13 22:15:13 +01:00
Baptiste Lepers	a95d25dd7b	rxrpc: Call state should be read with READ_ONCE() under some circumstances The call state may be changed at any time by the data-ready routine in response to received packets, so if the call state is to be read and acted upon several times in a function, READ_ONCE() must be used unless the call state lock is held. As it happens, we used READ_ONCE() to read the state a few lines above the unmarked read in rxrpc_input_data(), so use that value rather than re-reading it. Fixes: `a158bdd324` ("rxrpc: Fix call timeouts") Signed-off-by: Baptiste Lepers <baptiste.lepers@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/161046715522.2450566.488819910256264150.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-13 10:38:20 -08:00
David Howells	d52e419ac8	rxrpc: Fix handling of an unsupported token type in rxrpc_read() Clang static analysis reports the following: net/rxrpc/key.c:657:11: warning: Assigned value is garbage or undefined toksize = toksizes[tok++]; ^ ~~~~~~~~~~~~~~~ rxrpc_read() contains two consecutive loops. The first loop calculates the token sizes and stores the results in toksizes[] and the second one uses the array. When there is an error in identifying the token in the first loop, the token is skipped, no change is made to the toksizes[] array. When the same error happens in the second loop, the token is not skipped. This will cause the toksizes[] array to be out of step and will overrun past the calculated sizes. Fix this by making both loops log a message and return an error in this case. This should only happen if a new token type is incompletely implemented, so it should normally be impossible to trigger this. Fixes: `9a059cd5ca` ("rxrpc: Downgrade the BUG() for unsupported token type in rxrpc_read()") Reported-by: Tom Rix <trix@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/161046503122.2445787.16714129930607546635.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-13 10:38:00 -08:00
Jakub Kicinski	c8a8ead017	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Pass conntrack -f to specify family in netfilter conntrack helper selftests, from Chen Yi. 2) Honor hashsize modparam from nf_conntrack_buckets sysctl, from Jesper D. Brouer. 3) Fix memleak in nf_nat_init() error path, from Dinghao Liu. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: nf_nat: Fix memleak in nf_nat_init netfilter: conntrack: fix reading nf_conntrack_buckets selftests: netfilter: Pass family parameter "-f" to conntrack tool ==================== Link: https://lore.kernel.org/r/20210112222033.9732-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:25:29 -08:00
Jakub Kicinski	5527d0ea19	Merge branch 'net-smc-fix-out-of-bound-access-in-netlink-interface' Karsten Graul says: ==================== net/smc: fix out of bound access in netlink interface Both patches fix possible out-of-bounds reads. The original code expected that snprintf() reads len-1 bytes from source and appends the terminating null, but actually snprintf() first copies len bytes and finally overwrites the last byte with a null. Fix this by using memcpy() and terminating the string afterwards. ==================== Link: https://lore.kernel.org/r/20210112162122.26832-1-kgraul@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:22:15 -08:00
Guvenc Gulce	8a44653689	net/smc: use memcpy instead of snprintf to avoid out of bounds read Using snprintf() to convert not null-terminated strings to null terminated strings may cause out of bounds read in the source string. Therefore use memcpy() and terminate the target string with a null afterwards. Fixes: `a3db10efcc` ("net/smc: Add support for obtaining SMCR device list") Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:22:01 -08:00
Jakub Kicinski	25fe2c9c4c	smc: fix out of bound access in smc_nl_get_sys_info() smc_clc_get_hostname() sets the host pointer to a buffer which is not NULL-terminated (see smc_clc_init()). Reported-by: syzbot+f4708c391121cfc58396@syzkaller.appspotmail.com Fixes: `099b990bd1` ("net/smc: Add support for obtaining system information") Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:22:01 -08:00
Jakub Kicinski	584c19f927	Merge branch 'mptcp-a-couple-of-fixes' Paolo Abeni says: ==================== mptcp: a couple of fixes This series includes two related fixes addressing potential divide by 0 bugs in the MPTCP datapath. ==================== Link: https://lore.kernel.org/r/cover.1610471474.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:09:21 -08:00
Paolo Abeni	76e2a55d16	mptcp: better msk-level shutdown. Instead of re-implementing most of inet_shutdown, re-use such helper, and implement the MPTCP-specific bits at the 'proto' level. The msk-level disconnect() can now be invoked, lets provide a suitable implementation. As a side effect, this fixes bad state management for listener sockets. The latter could lead to division by 0 oops since commit `ea4ca586b1` ("mptcp: refine MPTCP-level ack scheduling"). Fixes: `43b54c6ee3` ("mptcp: Use full MPTCP-level disconnect state machine") Fixes: `ea4ca586b1` ("mptcp: refine MPTCP-level ack scheduling") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:09:19 -08:00
Paolo Abeni	20bc80b6f5	mptcp: more strict state checking for acks Syzkaller found a way to trigger division by zero in mptcp_subflow_cleanup_rbuf(). The current checks implemented into tcp_can_send_ack() are too week, let's be more accurate. Reported-by: Christoph Paasch <cpaasch@apple.com> Fixes: `ea4ca586b1` ("mptcp: refine MPTCP-level ack scheduling") Fixes: `fd8976790a` ("mptcp: be careful on MPTCP-level ack.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:09:19 -08:00
Jakub Kicinski	ece9ab2a78	Merge branch 'bnxt_en-bug-fixes' Michael Chan says: ==================== bnxt_en: Bug fixes. This series has 2 fixes. The first one fixes a resource accounting error with the RDMA driver loaded and the second one fixes the firmware flashing sequence after defragmentation. ==================== Link: https://lore.kernel.org/r/1610357200-30755-1-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:05:37 -08:00
Pavan Chebbi	6874877518	bnxt_en: Clear DEFRAG flag in firmware message when retry flashing. When the FW tells the driver to retry the INSTALL_UPDATE command after it has cleared the NVM area, the driver is not clearing the previously used ALLOWED_TO_DEFRAG flag. As a result the FW tries to defrag the NVM area a second time in a loop and can fail the request. Fixes: `1432c3f6a6` ("bnxt_en: Retry installing FW package under NO_SPACE error condition.") Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:05:35 -08:00
Michael Chan	869c4d5eb1	bnxt_en: Improve stats context resource accounting with RDMA driver loaded. The function bnxt_get_ulp_stat_ctxs() does not count the stats contexts used by the RDMA driver correctly when the RDMA driver is freeing the MSIX vectors. It assumes that if the RDMA driver is registered, the additional stats contexts will be needed. This is not true when the RDMA driver is about to unregister and frees the MSIX vectors. This slight error leads to over accouting of the stats contexts needed after the RDMA driver has unloaded. This will cause some firmware warning and error messages in dmesg during subsequent config. changes or ifdown/ifup. Fix it by properly accouting for extra stats contexts only if the RDMA driver is registered and MSIX vectors have been successfully requested. Fixes: `c027c6b4e9` ("bnxt_en: get rid of num_stat_ctxs variable") Reviewed-by: Yongping Zhang <yongping.zhang@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:05:35 -08:00
Leon Schuermann	2284bbd0cf	r8153_ecm: Add Lenovo Powered USB-C Hub as a fallback of r8152 This commit enables the use of the r8153_ecm driver, introduced with commit `c1aedf015e` ("net/usb/r8153_ecm: support ECM mode for RTL8153") for the Lenovo Powered USB-C Hub (17ef:721e) based on the Realtek RTL8153B chip. This results in the following driver preference: - if r8152 is available, use the r8152 driver - if r8152 is not available, use the r8153_ecm driver This is done to prevent the NIC from constantly sending pause frames when the host system enters standby (fixed by using the r8152 driver in "r8152: Add Lenovo Powered USB-C Travel Hub"), while still allowing the device to work with the r8153_ecm driver as a fallback. Signed-off-by: Leon Schuermann <leon@is.currently.online> Tested-by: Leon Schuermann <leon@is.currently.online> Link: https://lore.kernel.org/r/20210111190312.12589-3-leon@is.currently.online Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:00:51 -08:00
Leon Schuermann	cb82a54904	r8152: Add Lenovo Powered USB-C Travel Hub This USB-C Hub (17ef:721e) based on the Realtek RTL8153B chip used to use the cdc_ether driver. However, using this driver, with the system suspended the device constantly sends pause-frames as soon as the receive buffer fills up. This causes issues with other devices, where some Ethernet switches stop forwarding packets altogether. Using the Realtek driver (r8152) fixes this issue. Pause frames are no longer sent while the host system is suspended. Signed-off-by: Leon Schuermann <leon@is.currently.online> Tested-by: Leon Schuermann <leon@is.currently.online> Link: https://lore.kernel.org/r/20210111190312.12589-2-leon@is.currently.online Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 20:00:51 -08:00
Vladimir Oltean	91158e1680	net: dsa: clear devlink port type before unregistering slave netdevs Florian reported a use-after-free bug in devlink_nl_port_fill found with KASAN: (devlink_nl_port_fill) (devlink_port_notify) (devlink_port_unregister) (dsa_switch_teardown.part.3) (dsa_tree_teardown_switches) (dsa_unregister_switch) (bcm_sf2_sw_remove) (platform_remove) (device_release_driver_internal) (device_links_unbind_consumers) (device_release_driver_internal) (device_driver_detach) (unbind_store) Allocated by task 31: alloc_netdev_mqs+0x5c/0x50c dsa_slave_create+0x110/0x9c8 dsa_register_switch+0xdb0/0x13a4 b53_switch_register+0x47c/0x6dc bcm_sf2_sw_probe+0xaa4/0xc98 platform_probe+0x90/0xf4 really_probe+0x184/0x728 driver_probe_device+0xa4/0x278 __device_attach_driver+0xe8/0x148 bus_for_each_drv+0x108/0x158 Freed by task 249: free_netdev+0x170/0x194 dsa_slave_destroy+0xac/0xb0 dsa_port_teardown.part.2+0xa0/0xb4 dsa_tree_teardown_switches+0x50/0xc4 dsa_unregister_switch+0x124/0x250 bcm_sf2_sw_remove+0x98/0x13c platform_remove+0x44/0x5c device_release_driver_internal+0x150/0x254 device_links_unbind_consumers+0xf8/0x12c device_release_driver_internal+0x84/0x254 device_driver_detach+0x30/0x34 unbind_store+0x90/0x134 What happens is that devlink_port_unregister emits a netlink DEVLINK_CMD_PORT_DEL message which associates the devlink port that is getting unregistered with the ifindex of its corresponding net_device. Only trouble is, the net_device has already been unregistered. It looks like we can stub out the search for a corresponding net_device if we clear the devlink_port's type. This looks like a bit of a hack, but also seems to be the reason why the devlink_port_type_clear function exists in the first place. Fixes: `3122433eb5` ("net: dsa: Register devlink ports before calling DSA driver setup()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian fainelli <f.fainelli@gmail.com> Reported-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210112004831.3778323-1-olteanv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 18:48:50 -08:00
Vladimir Oltean	07b90056cb	net: dsa: unbind all switches from tree when DSA master unbinds Currently the following happens when a DSA master driver unbinds while there are DSA switches attached to it: $ echo 0000:00:00.5 > /sys/bus/pci/drivers/mscc_felix/unbind ------------[ cut here ]------------ WARNING: CPU: 0 PID: 392 at net/core/dev.c:9507 Call trace: rollback_registered_many+0x5fc/0x688 unregister_netdevice_queue+0x98/0x120 dsa_slave_destroy+0x4c/0x88 dsa_port_teardown.part.16+0x78/0xb0 dsa_tree_teardown_switches+0x58/0xc0 dsa_unregister_switch+0x104/0x1b8 felix_pci_remove+0x24/0x48 pci_device_remove+0x48/0xf0 device_release_driver_internal+0x118/0x1e8 device_driver_detach+0x28/0x38 unbind_store+0xd0/0x100 Located at the above location is this WARN_ON: /* Notifier chain MUST detach us all upper devices. */ WARN_ON(netdev_has_any_upper_dev(dev)); Other stacked interfaces, like VLAN, do indeed listen for NETDEV_UNREGISTER on the real_dev and also unregister themselves at that time, which is clearly the behavior that rollback_registered_many expects. But DSA interfaces are not VLAN. They have backing hardware (platform devices, PCI devices, MDIO, SPI etc) which have a life cycle of their own and we can't just trigger an unregister from the DSA framework when we receive a netdev notifier that the master unregisters. Luckily, there is something we can do, and that is to inform the driver core that we have a runtime dependency to the DSA master interface's device, and create a device link where that is the supplier and we are the consumer. Having this device link will make the DSA switch unbind before the DSA master unbinds, which is enough to avoid the WARN_ON from rollback_registered_many. Note that even before the blamed commit, DSA did nothing intelligent when the master interface got unregistered either. See the discussion here: https://lore.kernel.org/netdev/20200505210253.20311-1-f.fainelli@gmail.com/ But this time, at least the WARN_ON is loud enough that the upper_dev_link commit can be blamed. The advantage with this approach vs dev_hold(master) in the attached link is that the latter is not meant for long term reference counting. With dev_hold, the only thing that will happen is that when the user attempts an unbind of the DSA master, netdev_wait_allrefs will keep waiting and waiting, due to DSA keeping the refcount forever. DSA would not access freed memory corresponding to the master interface, but the unbind would still result in a freeze. Whereas with device links, graceful teardown is ensured. It even works with cascaded DSA trees. $ echo 0000:00:00.2 > /sys/bus/pci/drivers/fsl_enetc/unbind [ 1818.797546] device swp0 left promiscuous mode [ 1819.301112] sja1105 spi2.0: Link is Down [ 1819.307981] DSA: tree 1 torn down [ 1819.312408] device eno2 left promiscuous mode [ 1819.656803] mscc_felix 0000:00:00.5: Link is Down [ 1819.667194] DSA: tree 0 torn down [ 1819.711557] fsl_enetc 0000:00:00.2 eno2: Link is Down This approach allows us to keep the DSA framework absolutely unchanged, and the driver core will just know to unbind us first when the master goes away - as opposed to the large (and probably impossible) rework required if attempting to listen for NETDEV_UNREGISTER. As per the documentation at Documentation/driver-api/device_link.rst, specifying the DL_FLAG_AUTOREMOVE_CONSUMER flag causes the device link to be automatically purged when the consumer fails to probe or later unbinds. So we don't need to keep the consumer_link variable in struct dsa_switch. Fixes: `2f1e8ea726` ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210111230943.3701806-1-olteanv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 18:48:40 -08:00
Marco Felsch	a18caa97b1	net: phy: smsc: fix clk error handling Commit `bedd8d78ab` ("net: phy: smsc: LAN8710/20: add phy refclk in support") added the phy clk support. The commit already checks if clk_get_optional() throw an error but instead of returning the error it ignores it. Fixes: `bedd8d78ab` ("net: phy: smsc: LAN8710/20: add phy refclk in support") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210111085932.28680-1-m.felsch@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 18:39:19 -08:00
Petr Machata	df85bc140a	net: dcb: Accept RTM_GETDCB messages carrying set-like DCB commands In commit `826f328e2b` ("net: dcb: Validate netlink message in DCB handler"), Linux started rejecting RTM_GETDCB netlink messages if they contained a set-like DCB_CMD_ command. The reason was that privileges were only verified for RTM_SETDCB messages, but the value that determined the action to be taken is the command, not the message type. And validation of message type against the DCB command was the obvious missing piece. Unfortunately it turns out that mlnx_qos, a somewhat widely deployed tool for configuration of DCB, accesses the DCB set-like APIs through RTM_GETDCB. Therefore do not bounce the discrepancy between message type and command. Instead, in addition to validating privileges based on the actual message type, validate them also based on the expected message type. This closes the loophole of allowing DCB configuration on non-admin accounts, while maintaining backward compatibility. Fixes: `2f90b8657e` ("ixgbe: this patch adds support for DCB to the kernel and ixgbe driver") Fixes: `826f328e2b` ("net: dcb: Validate netlink message in DCB handler") Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/a3edcfda0825f2aa2591801c5232f2bbf2d8a554.1610384801.git.me@pmachata.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-12 15:55:21 -08:00
Jakub Kicinski	1ee527a79f	Merge branch 'skb-frag-kmap_atomic-fixes' Willem de Bruijn says: ==================== skb frag: kmap_atomic fixes skb frags may be backed by highmem and/or compound pages. Various code calls kmap_atomic to safely access highmem pages. But this needs additional care for compound pages. Fix a few issues: patch 1 expect kmap mappings with CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP patch 2 fixes kmap_atomic + compound page support in skb_seq_read patch 3 fixes kmap_atomic + compound page support in esp ==================== Link: https://lore.kernel.org/r/20210109221834.3459768-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 18:20:12 -08:00
Willem de Bruijn	9bd6b629c3	esp: avoid unneeded kmap_atomic call esp(6)_output_head uses skb_page_frag_refill to allocate a buffer for the esp trailer. It accesses the page with kmap_atomic to handle highmem. But skb_page_frag_refill can return compound pages, of which kmap_atomic only maps the first underlying page. skb_page_frag_refill does not return highmem, because flag __GFP_HIGHMEM is not set. ESP uses it in the same manner as TCP. That also does not call kmap_atomic, but directly uses page_address, in skb_copy_to_page_nocache. Do the same for ESP. This issue has become easier to trigger with recent kmap local debugging feature CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP. Fixes: `cac2661c53` ("esp4: Avoid skb_cow_data whenever possible") Fixes: `03e2a30f6a` ("esp6: Avoid skb_cow_data whenever possible") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 18:20:09 -08:00
Willem de Bruijn	97550f6fa5	net: compound page support in skb_seq_read skb_seq_read iterates over an skb, returning pointer and length of the next data range with each call. It relies on kmap_atomic to access highmem pages when needed. An skb frag may be backed by a compound page, but kmap_atomic maps only a single page. There are not enough kmap slots to always map all pages concurrently. Instead, if kmap_atomic is needed, iterate over each page. As this increases the number of calls, avoid this unless needed. The necessary condition is captured in skb_frag_must_loop. I tried to make the change as obvious as possible. It should be easy to verify that nothing changes if skb_frag_must_loop returns false. Tested: On an x86 platform with CONFIG_HIGHMEM=y CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y CONFIG_NETFILTER_XT_MATCH_STRING=y Run ip link set dev lo mtu 1500 iptables -A OUTPUT -m string --string 'badstring' -algo bm -j ACCEPT dd if=/dev/urandom of=in bs=1M count=20 nc -l -p 8000 > /dev/null & nc -w 1 -q 0 localhost 8000 < in Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 18:20:09 -08:00
Willem de Bruijn	29766bcffa	net: support kmap_local forced debugging in skb_frag_foreach Skb frags may be backed by highmem and/or compound pages. Highmem pages need kmap_atomic mappings to access. But kmap_atomic maps a single page, not the entire compound page. skb_foreach_page iterates over an skb frag, in one step in the common case, page by page only if kmap_atomic must be called for each page. The decision logic is captured in skb_frag_must_loop. CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP extends kmap from highmem to all pages, to increase code coverage. Extend skb_frag_must_loop to this new condition. Link: https://lore.kernel.org/linux-mm/20210106180132.41dc249d@gandalf.local.home/ Fixes: `0e91a0c698` ("mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP") Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willem de Bruijn <willemb@google.com> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 18:20:09 -08:00
Andrey Zhizhikin	e56b3d94d9	rndis_host: set proper input size for OID_GEN_PHYSICAL_MEDIUM request MSFT ActiveSync implementation requires that the size of the response for incoming query is to be provided in the request input length. Failure to set the input size proper results in failed request transfer, where the ActiveSync counterpart reports the NDIS_STATUS_INVALID_LENGTH (0xC0010014L) error. Set the input size for OID_GEN_PHYSICAL_MEDIUM query to the expected size of the response in order for the ActiveSync to properly respond to the request. Fixes: `039ee17d1b` ("rndis_host: Add RNDIS physical medium checking into generic_rndis_bind()") Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com> Link: https://lore.kernel.org/r/20210108095839.3335-1-andrey.zhizhikin@leica-geosystems.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 17:54:40 -08:00
Stefan Chulski	6f83802a1a	net: mvpp2: Remove Pause and Asym_Pause support Packet Processor hardware not connected to MAC flow control unit and cannot support TX flow control. This patch disable flow control support. Fixes: `3f518509de` ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Stefan Chulski <stefanc@marvell.com> Acked-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/r/1610306582-16641-1-git-send-email-stefanc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 16:57:54 -08:00
Seb Laveze	938288349c	dt-bindings: net: dwmac: fix queue priority documentation The priority field is not the queue priority (queue priority is fixed) but a bitmask of priorities assigned to this queue. In receive, priorities relate to tagged frames priorities. In transmit, priorities relate to PFC frames. Signed-off-by: Seb Laveze <sebastien.laveze@nxp.com> Link: https://lore.kernel.org/r/20210111081406.1348622-1-sebastien.laveze@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 15:18:47 -08:00
Dinghao Liu	869f4fdaf4	netfilter: nf_nat: Fix memleak in nf_nat_init When register_pernet_subsys() fails, nf_nat_bysource should be freed just like when nf_ct_extend_register() fails. Fixes: `1cd472bf03` ("netfilter: nf_nat: add nat hook register functions to nf_nat") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-01-11 00:34:11 +01:00
Jesper Dangaard Brouer	f6351c3f1c	netfilter: conntrack: fix reading nf_conntrack_buckets The old way of changing the conntrack hashsize runtime was through changing the module param via file /sys/module/nf_conntrack/parameters/hashsize. This was extended to sysctl change in commit `3183ab8997` ("netfilter: conntrack: allow increasing bucket size via sysctl too"). The commit introduced second "user" variable nf_conntrack_htable_size_user which shadow actual variable nf_conntrack_htable_size. When hashsize is changed via module param this "user" variable isn't updated. This results in sysctl net/netfilter/nf_conntrack_buckets shows the wrong value when users update via the old way. This patch fix the issue by always updating "user" variable when reading the proc file. This will take care of changes to the actual variable without sysctl need to be aware. Fixes: `3183ab8997` ("netfilter: conntrack: allow increasing bucket size via sysctl too") Reported-by: Yoel Caspersen <yoel@kviknet.dk> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-01-10 09:39:22 +01:00
Chen Yi	fab336b424	selftests: netfilter: Pass family parameter "-f" to conntrack tool Fix nft_conntrack_helper.sh false fail report: 1) Conntrack tool need "-f ipv6" parameter to show out ipv6 traffic items. 2) Sleep 1 second after background nc send packet, to make sure check is after this statement executed. False report: FAIL: ns1-lkjUemYw did not show attached helper ip set via ruleset PASS: ns1-lkjUemYw connection on port 2121 has ftp helper attached ... After fix: PASS: ns1-2hUniwU2 connection on port 2121 has ftp helper attached PASS: ns2-2hUniwU2 connection on port 2121 has ftp helper attached ... Fixes: `619ae8e069` ("selftests: netfilter: add test case for conntrack helper assignment") Signed-off-by: Chen Yi <yiche@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2021-01-10 09:38:15 +01:00
Geert Uytterhoeven	f97844f9c5	dt-bindings: net: renesas,etheravb: RZ/G2H needs tx-internal-delay-ps The merge resolution of the interaction of commits `307eea32b2` ("dt-bindings: net: renesas,ravb: Add support for r8a774e1 SoC") and `d7adf63311` ("dt-bindings: net: renesas,etheravb: Convert to json-schema") missed that "tx-internal-delay-ps" should be a required property on RZ/G2H. Fixes: `8b0308fe31` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20210105151516.1540653-1-geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 19:12:21 -08:00
Jakub Kicinski	26c49f0d10	Merge branch 'mlxsw-core-thermal-control-fixes' Ido Schimmel says: ==================== mlxsw: core: Thermal control fixes This series includes two fixes for thermal control in mlxsw. Patch #1 validates that the alarm temperature threshold read from a transceiver is above the warning temperature threshold. If not, the current thresholds are maintained. It was observed that some transceiver might be unreliable and sometimes report a too low alarm temperature threshold which would result in thermal shutdown of the system. Patch #2 increases the temperature threshold above which thermal shutdown is triggered for the ASIC thermal zone. It is currently too low and might result in thermal shutdown under perfectly fine operational conditions. ==================== Link: https://lore.kernel.org/r/20210108145210.1229820-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 16:25:14 -08:00
Vadim Pasternak	b06ca3d5a4	mlxsw: core: Increase critical threshold for ASIC thermal zone Increase critical threshold for ASIC thermal zone from 110C to 140C according to the system hardware requirements. All the supported ASICs (Spectrum-1, Spectrum-2, Spectrum-3) could be still operational with ASIC temperature below 140C. With the old critical threshold value system can perform unjustified shutdown. All the systems equipped with the above ASICs implement thermal protection mechanism at firmware level and firmware could decide to perform system thermal shutdown in case the temperature is below 140C. So with the new threshold system will not meltdown, while thermal operating range will be aligned with hardware abilities. Fixes: `41e760841d` ("mlxsw: core: Replace thermal temperature trips with defines") Fixes: `a50c1e3565` ("mlxsw: core: Implement thermal zone") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 16:25:10 -08:00
Vadim Pasternak	57726ebe27	mlxsw: core: Add validation of transceiver temperature thresholds Validate thresholds to avoid a single failure due to some transceiver unreliability. Ignore the last readouts in case warning temperature is above alarm temperature, since it can cause unexpected thermal shutdown. Stay with the previous values and refresh threshold within the next iteration. This is a rare scenario, but it was observed at a customer site. Fixes: `6a79507cfe` ("mlxsw: core: Extend thermal module with per QSFP module thermal zones") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 16:25:10 -08:00
Hoang Le	b774134464	tipc: fix NULL deref in tipc_link_xmit() The buffer list can have zero skb as following path: tipc_named_node_up()->tipc_node_xmit()->tipc_link_xmit(), so we need to check the list before casting an &sk_buff. Fault report: [] tipc: Bulk publication failure [] general protection fault, probably for non-canonical [#1] PREEMPT [...] [] KASAN: null-ptr-deref in range [0x00000000000000c8-0x00000000000000cf] [] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 5.10.0-rc4+ #2 [] Hardware name: Bochs ..., BIOS Bochs 01/01/2011 [] RIP: 0010:tipc_link_xmit+0xc1/0x2180 [] Code: 24 b8 00 00 00 00 4d 39 ec 4c 0f 44 e8 e8 d7 0a 10 f9 48 [...] [] RSP: 0018:ffffc90000006ea0 EFLAGS: 00010202 [] RAX: dffffc0000000000 RBX: ffff8880224da000 RCX: 1ffff11003d3cc0d [] RDX: 0000000000000019 RSI: ffffffff886007b9 RDI: 00000000000000c8 [] RBP: ffffc90000007018 R08: 0000000000000001 R09: fffff52000000ded [] R10: 0000000000000003 R11: fffff52000000dec R12: ffffc90000007148 [] R13: 0000000000000000 R14: 0000000000000000 R15: ffffc90000007018 [] FS: 0000000000000000(0000) GS:ffff888037400000(0000) knlGS:000[...] [] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [] CR2: 00007fffd2db5000 CR3: 000000002b08f000 CR4: 00000000000006f0 Fixes: `af9b028e27` ("tipc: make media xmit call outside node spinlock context") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Link: https://lore.kernel.org/r/20210108071337.3598-1-hoang.h.le@dektech.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 14:44:47 -08:00
Vadim Fedorenko	3502bd9b57	selftests/tls: fix selftests after adding ChaCha20-Poly1305 TLS selftests where broken because of wrong variable types used. Fix it by changing u16 -> uint16_t Fixes: `4f336e88a8` ("selftests/tls: add CHACHA20-POLY1305 to tls selftests") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Link: https://lore.kernel.org/r/1610141865-7142-1-git-send-email-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 14:26:30 -08:00
Aya Levin	b210de4f8c	net: ipv6: Validate GSO SKB before finish IPv6 processing There are cases where GSO segment's length exceeds the egress MTU: - Forwarding of a TCP GRO skb, when DF flag is not set. - Forwarding of an skb that arrived on a virtualisation interface (virtio-net/vhost/tap) with TSO/GSO size set by other network stack. - Local GSO skb transmitted on an NETIF_F_TSO tunnel stacked over an interface with a smaller MTU. - Arriving GRO skb (or GSO skb in a virtualised environment) that is bridged to a NETIF_F_TSO tunnel stacked over an interface with an insufficient MTU. If so: - Consume the SKB and its segments. - Issue an ICMP packet with 'Packet Too Big' message containing the MTU, allowing the source host to reduce its Path MTU appropriately. Note: These cases are handled in the same manner in IPv4 output finish. This patch aligns the behavior of IPv6 and the one of IPv4. Fixes: `9e50849054` ("netfilter: ipv6: move POSTROUTING invocation before fragmentation") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/1610027418-30438-1-git-send-email-ayal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 14:06:32 -08:00
Manish Chopra	a2bc221b97	netxen_nic: fix MSI/MSI-x interrupts For all PCI functions on the netxen_nic adapter, interrupt mode (INTx or MSI) configuration is dependent on what has been configured by the PCI function zero in the shared interrupt register, as these adapters do not support mixed mode interrupts among the functions of a given adapter. Logic for setting MSI/MSI-x interrupt mode in the shared interrupt register based on PCI function id zero check is not appropriate for all family of netxen adapters, as for some of the netxen family adapters PCI function zero is not really meant to be probed/loaded in the host but rather just act as a management function on the device, which caused all the other PCI functions on the adapter to always use legacy interrupt (INTx) mode instead of choosing MSI/MSI-x interrupt mode. This patch replaces that check with port number so that for all type of adapters driver attempts for MSI/MSI-x interrupt modes. Fixes: `b37eb210c0` ("netxen_nic: Avoid mixed mode interrupts") Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Link: https://lore.kernel.org/r/20210107101520.6735-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 13:47:57 -08:00

1 2 3 4 5 ...

982396 Commits