linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-13 23:51:39 +00:00

Author	SHA1	Message	Date
Alexander Lobakin	4a36d0180c	net: skbuff: always try to recycle PP pages directly when in softirq Commit `8c48eea3ad` ("page_pool: allow caching from safely localized NAPI") allowed direct recycling of skb pages to their PP for some cases, but unfortunately missed a couple of other major ones. For example, %XDP_DROP in skb mode. The netstack just calls kfree_skb(), which unconditionally passes `false` as @napi_safe. Thus, all pages go through ptr_ring and locks, although most of the time we're actually inside the NAPI polling this PP is linked with, so that it would be perfectly safe to recycle pages directly. Let's address such cases. If @napi_safe is true, we're fine, don't change anything for this path. But if it's false, check whether we are in the softirq context. It will most likely be so and then if ->list_owner is our current CPU, we're good to use direct recycling, even though @napi_safe is false -- concurrent access is excluded. in_softirq() protection is needed mostly because we can hit this place in the process context (not the hardirq though). For the mentioned xdp-drop-skb-mode case, the improvement I got is 3-4% in Mpps. As for page_pool stats, recycle_ring is now 0 and alloc_slow counter doesn't change most of the time, which means the MM layer is not even called to allocate any new pages. Suggested-by: Jakub Kicinski <kuba@kernel.org> # in_softirq() Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-7-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 13:05:53 -07:00
Jakub Kicinski	ff4e538c8c	page_pool: add a lockdep check for recycling in hardirq Page pool use in hardirq is prohibited, add debug checks to catch misuses. IIRC we previously discussed using DEBUG_NET_WARN_ON_ONCE() for this, but there were concerns that people will have DEBUG_NET enabled in perf testing. I don't think anyone enables lockdep in perf testing, so use lockdep to avoid pushback and arguing :) Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-6-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 13:05:53 -07:00
Alexander Lobakin	5b899c33b3	net: skbuff: avoid accessing page_pool if !napi_safe when returning page Currently, pp->p.napi is always read, but the actual variable it gets assigned to is read only when @napi_safe is true. For the !napi_safe cases, of which there are still plenty, it's an unneeded operation. Moreover, it can lead to premature or even redundant page_pool cacheline access. For example, when page_pool_is_last_frag() returns false (with the recent frag improvements). Thus, read it only when @napi_safe is true. This also allows moving @napi inside the condition block itself. Constify it while we are here, because why not. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-5-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 13:05:53 -07:00
Alexander Lobakin	06d0fbdad6	page_pool: place frag_* fields in one cacheline On x86_64, frag_* fields of struct page_pool are scattered across two cachelines despite their total size of 24 bytes. All three fields are used in pretty much the same places, but the last field, ::frag_users, is pushed out to the next CL, provoking unwanted false-sharing on hotpath (frags allocation code). There are some holes and cold members to move around. Move frag_* one block up, placing them right after &page_pool_params perfectly at the beginning of CL2. This doesn't do anything meaningful to the second block, as those are some destroy-path cold structures, and doesn't do anything to ::alloc_stats, which still starts at 200-byte offset, 8 bytes after CL3 (still fitting into 1 cacheline). On my setup, this yields 1-2% of Mpps when using PP frags actively. When it comes to 32-bit architectures with 32-byte CL: &page_pool_params plus ::pad is 44 bytes, the block taken care of is 16 bytes within one CL, so there should be at least no regressions from the actual change. ::pages_state_hold_cnt is not related directly to that triple, but is paired currently with ::frag_offset and decoupling them would mean either two 4-byte holes or more invasive layout changes. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-4-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 13:05:53 -07:00
Alexander Lobakin	75eaf63ea7	net: skbuff: don't include <net/page_pool/types.h> to <linux/skbuff.h> Currently, touching <net/page_pool/types.h> triggers a rebuild of more than half of the kernel. That's because it's included in <linux/skbuff.h>. And each new include to page_pool/types.h adds more [useless] data for the toolchain to process per each source file from that pile. In commit `6a5bcd84e8` ("page_pool: Allow drivers to hint on SKB recycling"), Matteo included it to be able to call a couple of functions defined there. Then, in commit `57f05bc2ab` ("page_pool: keep pp info as long as page pool owns the page") one of the calls was removed, so only one was left. It's the call to page_pool_return_skb_page() in napi_frag_unref(). The function is external and doesn't have any dependencies. Having very niche page_pool_types.h included only for that looks like an overkill. As %PP_SIGNATURE is not local to page_pool.c (was only in the early submissions), nothing holds this function there. Teleport page_pool_return_skb_page() to skbuff.c, just next to the main consumer, skb_pp_recycle(), and rename it to napi_pp_put_page(), as it doesn't work with skbs at all and the former name tells nothing. The #if guards here are only to not compile and have it in the vmlinux when not needed -- both call sites are already guarded. Now, touching page_pool_types.h only triggers rebuilding of the drivers using it and a couple of core networking files. Suggested-by: Jakub Kicinski <kuba@kernel.org> # make skbuff.h less heavy Suggested-by: Alexander Duyck <alexanderduyck@fb.com> # move to skbuff.c Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-3-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 13:05:53 -07:00
Yunsheng Lin	a9ca9f9cef	page_pool: split types and declarations from page_pool.h Split types and pure function declarations from page_pool.h and add them in page_pool/types.h, so that C sources can include page_pool.h and headers should generally only include page_pool/types.h as suggested by Jakub. Rename page_pool.h to page_pool/helpers.h to have both in one place. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-2-aleksander.lobakin@intel.com [Jakub: change microsoft/mana, fix kdoc paths in Documentation] Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 13:05:19 -07:00
Jakub Kicinski	96bc313783	linux-can-next-for-6.6-20230807 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEDs2BvajyNKlf9TJQvlAcSiqKBOgFAmTQn1YTHG1rbEBwZW5n dXRyb25peC5kZQAKCRC+UBxKKooE6Eh3CACQmSMV6FLXhFsOUzQS9ZyiKwQaAMG7 1giCbXRJS6CiyHIbkye/h7AIC4WYcr9bTXxFTLypWBMqo9uFEm/jiFRPMyJcPsZa g8ySbQqkYeaGb0RkrHFbChsJaSnZhH1niatrAw+Vk2jeh/3Dait+LPFtdDWbLFsw 6mPoZMv18tVy1r/0kiqPCive1Gie3eKzmVwBk9AK6XVUPS88bX7OKRppoRXv3f9x DpNKfJjhyWtBCoK8wsR3vVc1jRNL3eeLN7t8E7zYhfKRQsts1rVnyrWdihEjZ9ik e97oMG5Zeu8yStYiAkbGjNlhs1Uujlzc3yJoN0tuIFvniD2xC4bxeZ+X =X3r+ -----END PGP SIGNATURE----- Merge tag 'linux-can-next-for-6.6-20230807' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2023-08-07 The patch is from me and reverts the addition of the CAN controller nodes in the allwinner d1 SoC. * tag 'linux-can-next-for-6.6-20230807' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: Revert "riscv: dts: allwinner: d1: Add CAN controller nodes" ==================== Link: https://lore.kernel.org/r/20230807074222.1576119-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 12:25:46 -07:00
Jakub Kicinski	0d0c5f0b9b	Merge branch 'net-stmmac-correct-mac-propagation-delay' Johannes Zink says: ==================== net: stmmac: correct MAC propagation delay Changes in v3: - work in Richard's review feedback. Thank you for reviewing my patch: - as some of the hardware may have no or invalid correction value registers: introduce feature switch which can be enabled in the glue code drivers depending on the actual hardware support - only enable the feature on the i.MX8MP for the time being, as the patch improves timing accuracy and is tested for this hardware - Link to v2: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de Changes in v2: - fix builds for 32bit, this was found by the kernel build bot Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307200225.B8rmKQPN-lkp@intel.com/ - while at it also fix an overflow by shifting a u32 constant from macro by 10bits by casting the constant to u64 - Link to v1: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v1-1-768aa4d09334@pengutronix.de Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # imx8mp ==================== Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-0-61e63427735e@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 12:17:15 -07:00
Johannes Zink	6cb2e613c7	net: stmmac: dwmac-imx: enable MAC propagation delay correction for i.MX8MP As the i.MX8MP supports reading MAC propagation delay and correcting the Hardware timestamp counter for additional delays [1], enable the feature for this SoC. This reduces phase error of the PPS output from the PTP Hardware Clock from approx 150ns to 100ns. [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp correction" Signed-off-by: Johannes Zink <j.zink@pengutronix.de> Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-2-61e63427735e@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 12:17:13 -07:00
Johannes Zink	26cfb838aa	net: stmmac: correct MAC propagation delay The IEEE1588 Standard specifies that the timestamps of Packets must be captured when the PTP message timestamp point (leading edge of first octet after the start of frame delimiter) crosses the boundary between the node and the network. As the MAC latches the timestamp at an internal point, the captured timestamp must be corrected for the additional data transmission latency, as described in the publicly available datasheet [1]. This patch only corrects for the MAC-Internal delay, which can be read out from the MAC_Ingress_Timestamp_Latency register on DWMAC version 5, since the Phy framework currently does not support querying the Phy ingress and egress latency. The Clock Domain Crossing Circuits errors as indicated in [1] are already being accounted in the stmmac_get_tx_hwtstamp() function and are not corrected here. As the Latency varies for different link speeds and MII modes of operation, the correction value needs to be updated on each link state change. As the delay also causes a phase shift in the timestamp counter compared to the rest of the network, this correction will also reduce phase error when generating PPS outputs from the timestamp counter. Since the correction registers may be unavailable on some hardware and no feature bits are documented for dynamic detection of the MAC propagation delay readout, introduce a feature bit to explicitly enable MAC delay Correction in the glue code driver. [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp correction" Signed-off-by: Johannes Zink <j.zink@pengutronix.de> Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-1-61e63427735e@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-07 12:17:13 -07:00
Yue Haibing	cc97777c80	udp/udplite: Remove unused function declarations udp{,lite}_get_port() Commit `6ba5a3c52d` ("[UDP]: Make full use of proto.h.udp_hash innovation.") removed these implementations but left the declarations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-07 08:53:55 +01:00
Yue Haibing	a6ab5c29b8	net: sfp: Remove unused function declaration sfp_link_configure() Commit `ce0aa27ff3` ("sfp: add sfp-bus to bridge between network devices and sfp cages") declared but never implemented it. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-07 08:53:55 +01:00
Yue Haibing	2c6af36beb	ndisc: Remove unused ndisc_ifinfo_sysctl_strategy() declaration Commit `f8572d8f2a` ("sysctl net: Remove unused binary sysctl code") left behind this declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-07 08:53:55 +01:00
Yue Haibing	992b47851b	net: pkt_cls: Remove unused inline helpers Commit `acb674428c` ("net: sched: introduce per-block callbacks") implemented these but never used them. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-07 08:53:54 +01:00
Yue Haibing	047551cd30	neighbour: Remove unused function declaration pneigh_for_each() pneigh_for_each() has never been implemented since the beginning of git history. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-07 08:53:54 +01:00
Yue Haibing	f6ecb68b38	net/tls: Remove unused function declarations Commit `3c4d755915` ("tls: kernel TLS support") declared but never implemented these functions. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-07 08:53:54 +01:00
Marc Kleine-Budde	84059a0ef5	Revert "riscv: dts: allwinner: d1: Add CAN controller nodes" It turned out the dtsi changes were not quite ready, revert them for now. This reverts commit `6ea1ad888f`. Link: https://lore.kernel.org/all/2690764.mvXUDI8C0e@jernej-laptop Suggested-by: Jernej Škrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/all/20230807-riscv-allwinner-d1-revert-can-controller-nodes-v1-1-eb3f70b435d9@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2023-08-07 09:37:31 +02:00
Vladimir Oltean	c35e927cbe	net: omit ndo_hwtstamp_get() call when possible in dev_set_hwtstamp_phylib() Setting dev->priv_flags & IFF_SEE_ALL_HWTSTAMP_REQUESTS is only legal for drivers which were converted to ndo_hwtstamp_get() and ndo_hwtstamp_set(), and it is only there that we call ndo_hwtstamp_set() for a request that otherwise goes to phylib (for stuff like packet traps, which need to be undone if phylib failed, hence the old_cfg logic). The problem is that we end up calling ndo_hwtstamp_get() when we don't need to (even if the SIOCSHWTSTAMP wasn't intended for phylib, or if it was, but the driver didn't set IFF_SEE_ALL_HWTSTAMP_REQUESTS). For those unnecessary conditions, we share a code path with virtual drivers (vlan, macvlan, bonding) where ndo_hwtstamp_get() is implemented as generic_hwtstamp_get_lower(), and may be resolved through generic_hwtstamp_ioctl_lower() if the lower device is unconverted. I.e. this situation: $ ip link add link eno0 name eno0.100 type vlan id 100 $ hwstamp_ctl -i eno0.100 -t 1 We are unprepared to deal with this, because if ndo_hwtstamp_get() is resolved through a legacy ndo_eth_ioctl(SIOCGHWTSTAMP) lower_dev implementation, that needs a non-NULL old_cfg.ifr pointer, and we don't have it. But we don't even need to deal with it either. In the general case, drivers may not even implement SIOCGHWTSTAMP handling, only SIOCSHWTSTAMP, so it makes sense to completely avoid a SIOCGHWTSTAMP call if we can. The solution is to split the single "if" condition into 3 smaller ones, thus separating the decision to call ndo_hwtstamp_get() from the decision to call ndo_hwtstamp_set(). The third "if" condition is identical to the first one, and both are subsets of the second one. Thus, the "cfg" argument of kernel_hwtstamp_config_changed() is always valid. Reported-by: Eric Dumazet <edumazet@google.com> Closes: https://lore.kernel.org/netdev/CANn89iLOspJsvjPj+y8jikg7erXDomWe8sqHMdfL_2LQSFrPAg@mail.gmail.com/ Fixes: `fd770e856e` ("net: remove phy_has_hwtstamp() -> phy_mii_ioctl() decision from converted drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 13:25:10 +01:00
Yang Yingliang	54024dbec9	net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address Use eth_broadcast_addr() to assign broadcast address instead of memset(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 13:22:47 +01:00
Yu Liao	813f3662c2	ibmvnic: remove unused rc variable gcc with W=1 reports drivers/net/ethernet/ibm/ibmvnic.c:194:13: warning: variable 'rc' set but not used [-Wunused-but-set-variable] ^ This variable is not used so remove it. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308040609.zQsSXWXI-lkp@intel.com/ Signed-off-by: Yu Liao <liaoyu15@huawei.com> Reviewed-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 13:20:44 +01:00
Haiyang Zhang	b1d13f7a3b	net: mana: Add page pool for RX buffers Add page pool for RX buffers for faster buffer cycle and reduce CPU usage. The standard page pool API is used. With iperf and 128 threads test, this patch improved the throughput by 12-15%, and decreased the IRQ associated CPU's usage from 99-100% to 10-50%. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:36:06 +01:00
David S. Miller	48ae409aaf	Merge branch 'gve-desc' Rushil Gupta says: ==================== gve: Add QPL mode for DQO descriptor format GVE supports QPL ("queue-page-list") mode where all data is communicated through a set of pre-registered pages. Adding this mode to DQO. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:34:37 +01:00
Rushil Gupta	5a3f8d1231	gve: update gve.rst Add a note about QPL and RDA mode Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:34:37 +01:00
Rushil Gupta	e7075ab4fb	gve: RX path for DQO-QPL The RX path allocates the QPL page pool at queue creation, and tries to reuse these pages through page recycling. This patch ensures that on refill no non-QPL pages are posted to the device. When the driver is running low on free buffers, an ondemand allocation step kicks in that allocates a non-qpl page for SKB business to free up the QPL page in use. gve_try_recycle_buf was moved to gve_rx_append_frags so that driver does not attempt to mark buffer as used if a non-qpl page was allocated ondemand. Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:34:36 +01:00
Rushil Gupta	a6fb8d5a8b	gve: Tx path for DQO-QPL Each QPL page is divided into GVE_TX_BUFS_PER_PAGE_DQO buffers. When a packet needs to be transmitted, we break the packet into max GVE_TX_BUF_SIZE_DQO sized chunks and transmit each chunk using a TX descriptor. We allocate the TX buffers from the free list in dqo_tx. We store these TX buffer indices in an array in the pending_packet structure. The TX buffers are returned to the free list in dqo_compl after receiving packet completion or when removing packets from miss completions list. Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:34:36 +01:00
Rushil Gupta	66ce8e6b49	gve: Control path for DQO-QPL GVE supports QPL ("queue-page-list") mode where all data is communicated through a set of pre-registered pages. Adding this mode to DQO descriptor format. Add checks, abi-changes and device options to support QPL mode for DQO in addition to GQI. Also, use pages-per-qpl supplied by device-option to control the size of the "queue-page-list". Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:34:36 +01:00
David S. Miller	16fd753995	Merge branch 'tcp-options-lockless' Eric Dumazet says: ==================== tcp: set few options locklessly This series is avoiding the socket lock for six TCP options. They are not heavily used, but this exercise can give ideas for other parts of TCP/IP stack :) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:24:56 +01:00
Eric Dumazet	6e97ba552b	tcp: set TCP_DEFER_ACCEPT locklessly rskq_defer_accept field can be read/written without the need of holding the socket lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:24:56 +01:00
Eric Dumazet	a81722ddd7	tcp: set TCP_LINGER2 locklessly tp->linger2 can be set locklessly as long as readers use READ_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:24:55 +01:00
Eric Dumazet	84485080cb	tcp: set TCP_KEEPCNT locklessly tp->keepalive_probes can be set locklessly, readers are already taking care of this field being potentially set by other threads. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:24:55 +01:00
Eric Dumazet	6fd70a6b4e	tcp: set TCP_KEEPINTVL locklessly tp->keepalive_intvl can be set locklessly, readers are already taking care of this field being potentially set by other threads. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:24:55 +01:00
Eric Dumazet	d58f2e15aa	tcp: set TCP_USER_TIMEOUT locklessly icsk->icsk_user_timeout can be set locklessly, if all read sides use READ_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:24:55 +01:00
Eric Dumazet	d44fd4a767	tcp: set TCP_SYNCNT locklessly icsk->icsk_syn_retries can safely be set without locking the socket. We have to add READ_ONCE() annotations in tcp_fastopen_synack_timer() and tcp_write_timeout(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-06 08:24:55 +01:00
Jakub Kicinski	81083076a0	wireless-next patches for v6.6 The first pull request for v6.6 and only driver patches this time. Nothing special really standing out, it has been quiet most likely due to vacations. Major changes: rtl8xxxu * enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU mwifiex * allow moving to a different namespace mt76 * preparation for mt7925 support * mt7981 support ath12k * Extremely High Throughput (EHT) PHY support for Wi-Fi 7 -----BEGIN PGP SIGNATURE----- iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmTM6gkRHGt2YWxvQGtl cm5lbC5vcmcACgkQbhckVSbrbZuuAggAjRQi7Vjsfr+GlZ+g/y/vf+ircw8YjKgy wJqnQ0fnJ4rpyxqVFjMr+ocuOrdBufTSs/W4fqOBbbg9oimsgg+vxIQA8GmQIUVQ ZQVWQHVqPLQ6NVp/YZJnt9seeCewGHW6UZxG9k0MqR1RJn+KinmSjWKRo1D56niL rJQAK0FWrVqkj5nt9lKRJLMGxX0k/ftrdZgHanUOVCNYi9Ukx0jXSbqSMftTk7xz r3jtuY5zAV+2GXoMIbW4ogBks4Yx06XzVycByzj+dYt5E3VBdDFX+mXlsw9vnjbv whVzsuMnYBu6CCFlKDPdGsmrzZA0GrLCRZE9uw7yhwZZ+qKkQZ8kNw== =5r3T -----END PGP SIGNATURE----- Merge tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.6 The first pull request for v6.6 and only driver patches this time. Nothing special really standing out, it has been quiet most likely due to vacations. Major changes: rtl8xxxu - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU mwifiex - allow moving to a different namespace mt76 - preparation for mt7925 support - mt7981 support ath12k - Extremely High Throughput (EHT) PHY support for Wi-Fi 7 * tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (172 commits) wifi: rtw89: return failure if needed firmware elements are not recognized wifi: rtw89: add to parse firmware elements of BB and RF tables wifi: rtw89: introduce infrastructure of firmware elements wifi: rtw89: add firmware suit for BB MCU 0/1 wifi: rtw89: add firmware parser for v1 format wifi: rtw89: introduce v1 format of firmware header wifi: rtw89: support firmware log with formatted text wifi: rtw89: recognize log format from firmware file wifi: ath12k: avoid deadlock by change ieee80211_queue_work for regd_update_work wifi: ath12k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED wifi: ath12k: relax list iteration in ath12k_mac_vif_unref() wifi: ath12k: configure puncturing bitmap wifi: ath12k: parse WMI service ready ext2 event wifi: ath12k: add MLO header in peer association wifi: ath12k: peer assoc for 320 MHz wifi: ath12k: add WMI support for EHT peer wifi: ath12k: prepare EHT peer assoc parameters wifi: ath12k: add EHT PHY modes wifi: ath12k: propagate EHT capabilities to userspace wifi: ath12k: WMI support to process EHT capabilities ... ==================== Link: https://lore.kernel.org/r/87msz7j942.fsf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 18:34:25 -07:00
Jakub Kicinski	90cd5467d1	Merge branch 'tcp-disable-header-prediction-for-md5' Kuniyuki Iwashima says: ==================== tcp: Disable header prediction for MD5. The 1st patch disables header prediction for MD5 flows and the 2nd patch updates the stale comment in tcp_parse_options(). ==================== Link: https://lore.kernel.org/r/20230803224552.69398-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 18:28:38 -07:00
Kuniyuki Iwashima	b205153689	tcp: Update stale comment for MD5 in tcp_parse_options(). Since commit `9ea88a1530` ("tcp: md5: check md5 signature without socket lock"), the MD5 option is checked in tcp_v[46]_rcv(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230803224552.69398-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 18:28:36 -07:00
Kuniyuki Iwashima	d0f2b7a9ca	tcp: Disable header prediction for MD5 flow. TCP socket saves the minimum required header length in tcp_header_len of struct tcp_sock, and later the value is used in __tcp_fast_path_on() to generate a part of TCP header in tcp_sock(sk)->pred_flags. In tcp_rcv_established(), if the incoming packet has the same pattern with pred_flags, we enter the fast path and skip full option parsing. The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it again later in tcp_rcv_established() unless other options exist. We add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in two paths to avoid the slow path. For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in tcp_create_openreq_child() after 3WHS. On the other hand, we do it in tcp_connect_init() for active open connections. However, the value is overwritten while processing SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process(). These two cases will have the wrong value in pred_flags and never go into the fast path. We could update tcp_header_len in tcp_rcv_synsent_state_process(), but a test with slightly modified netperf which uses MD5 for each flow shows that the slow path is actually a bit faster than the fast path. On c5.4xlarge EC2 instance (16 vCPU, 32 GiB mem) $ for i in {1..10}; do ./super_netperf $(nproc) -H localhost -l 10 -- -m 256 -M 256; done Avg of 10 * `36e68eadd3` : 10.376 Gbps * all fast path : 10.374 Gbps (patch v2, See Link) * all slow path : 10.394 Gbps The header prediction is not worth adding complexity for MD5, so let's disable it for MD5. Link: https://lore.kernel.org/netdev/20230803042214.38309-1-kuniyu@amazon.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230803224552.69398-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 18:28:36 -07:00
Russell King (Oracle)	f4bf467883	net: phy: move marking PHY on SFP module into SFP code Move marking the PHY as being on a SFP module into the SFP code between getting the PHY device (and thus initialising the phy_device structure) and registering the discovered device. This means that PHY drivers can use phy_on_sfp() in their match and get_features methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1qRaga-001vKt-8X@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 18:25:52 -07:00
Yue Haibing	852c18d561	mlxsw: spectrum: Remove unused function declarations Commit `c3d2ed93b1` ("mlxsw: Remove old parsing depth infrastructure") left behind mlxsw_sp_nve_inc_parsing_depth_get()/mlxsw_sp_nve_inc_parsing_depth_put(). And commit `532b49e41e` ("mlxsw: spectrum_span: Derive SBIB from maximum port speed & MTU") removed mlxsw_sp_span_port_mtu_update()/mlxsw_sp_span_speed_update_work() but left the declarations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20230803142047.42660-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 18:03:05 -07:00
Yue Haibing	f5f2d9bb52	ixgbevf: Remove unused function declarations ixgbe_napi_add_all()/ixgbe_napi_del_all() are declared but never implemented in commit `92915f7120` ("ixgbevf: Driver main and ethool interface module and main header") Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230803141904.15316-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 18:03:00 -07:00
Yue Haibing	781486e415	af_vsock: Remove unused declaration vsock_release_pending()/vsock_init_tap() Commit `d021c34405` ("VSOCK: Introduce VM Sockets") declared but never implemented vsock_release_pending(). Also, vsock_init_tap() has never been implemented since its introduction in commit `531b374834` ("VSOCK: Add vsockmon tap functions"). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20230803134507.22660-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 15:34:03 -07:00
Yue Haibing	2f0e807bc2	net: 802: Remove unused function declarations Commit `d8d9ba8dc9` ("net: 802: remove dead leftover after ipx driver removal") removed these implementations but left the declarations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230803135424.41664-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 15:33:50 -07:00
Eric Dumazet	c4a6b2da4b	tcp_metrics: hash table allocation cleanup After commit `098a697b49` ("tcp_metrics: Use a single hash table for all network namespaces.") we can avoid calling tcp_net_metrics_init() for each new netns. Instead, rename tcp_net_metrics_init() to tcp_metrics_hash_alloc(), and move it to __init section. Also move tcpmhash_entries to __initdata section. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230803135417.2716879-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 15:33:39 -07:00
Yue Haibing	faa9039161	net: hns3: Remove unused function declarations Commit `1e6e76101f` ("net: hns3: configure promisc mode for VF asynchronously") left behind hclge_inform_vf_promisc_info() declaration. And commit `68c0a5c706` ("net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support") declared but never implemented hclge_cmd_mdio_write() and hclge_cmd_mdio_read(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230803135138.37456-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 15:33:27 -07:00
Yue Haibing	57ecc157b6	net: llc: Remove unused function declarations llc_conn_ac_send_i_rsp_as_ack() and llc_conn_ev_sendack_tmr_exp() are never implemented since beginning of git history. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230803134747.41512-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 15:33:17 -07:00
Jakub Kicinski	eef9630de0	Merge branch 'devlink-use-spec-to-generate-split-ops' Jiri Pirko says: ==================== devlink: use spec to generate split ops This is an outcome of the discussion in the following thread: https://lore.kernel.org/netdev/20230720121829.566974-1-jiri@resnulli.us/ It serves as a dependency on the linked selector patchset. There is an existing spec for devlink used for userspace part generation. There are two commands supported there. This patchset extends the spec so kernel split ops code could be generated from it. ==================== Link: https://lore.kernel.org/r/20230803111340.1074067-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 14:03:04 -07:00
Jiri Pirko	6e067d0cab	devlink: use generated split ops and remove duplicated commands from small ops Do the switch and use generated split ops for get and info_get commands. Remove those from small ops array. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-13-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 14:03:02 -07:00
Jiri Pirko	b2551b1517	devlink: include the generated netlink header Put the newly added generated header to the include list. Remove the duplicated temporary function prototypes. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-12-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 14:03:02 -07:00
Jiri Pirko	6b7c486cae	devlink: add split ops generated according to spec Improve the existing devlink spec in order to serve as a source for generation of valid devlink split ops for the existing commands. Add the generated sources. Note that the policies are narrowed down only to the attributes that are actually parsed. The dont-validate-strict parsing policy makes sure that other possibly passed garbage attributes from userspace are ignored during validation. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-11-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 14:03:01 -07:00
Jiri Pirko	759f661012	netlink: specs: devlink: add info-get dump op Add missing dump op for info-get command and re-generate related devlink-user.[ch] code. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-10-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-04 14:03:01 -07:00

1202040 Commits