linux/net/xdp
Maciej Fijalkowski f7f6aa8e24 xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
XDP multi-buffer support introduced XDP_FLAGS_HAS_FRAGS flag that is
used by drivers to notify data path whether xdp_buff contains fragments
or not. Data path looks up mentioned flag on first buffer that occupies
the linear part of xdp_buff, so drivers only modify it there. This is
sufficient for SKB and XDP_DRV modes as usually xdp_buff is allocated on
stack or it resides within struct representing driver's queue and
fragments are carried via skb_frag_t structs. IOW, we are dealing with
only one xdp_buff.

ZC mode though relies on list of xdp_buff structs that is carried via
xsk_buff_pool::xskb_list, so ZC data path has to make sure that
fragments do *not* have XDP_FLAGS_HAS_FRAGS set. Otherwise,
xsk_buff_free() could misbehave if it would be executed against xdp_buff
that carries a frag with XDP_FLAGS_HAS_FRAGS flag set. Such scenario can
take place when within supplied XDP program bpf_xdp_adjust_tail() is
used with negative offset that would in turn release the tail fragment
from multi-buffer frame.

Calling xsk_buff_free() on tail fragment with XDP_FLAGS_HAS_FRAGS would
result in releasing all the nodes from xskb_list that were produced by
driver before XDP program execution, which is not what is intended -
only tail fragment should be deleted from xskb_list and then it should
be put onto xsk_buff_pool::free_list. Such multi-buffer frame will never
make it up to user space, so from AF_XDP application POV there would be
no traffic running, however due to free_list getting constantly new
nodes, driver will be able to feed HW Rx queue with recycled buffers.
Bottom line is that instead of traffic being redirected to user space,
it would be continuously dropped.

To fix this, let us clear the mentioned flag on xsk_buff_pool side
during xdp_buff initialization, which is what should have been done
right from the start of XSK multi-buffer support.

Fixes: 1bbc04de60 ("ice: xsk: add RX multi-buffer support")
Fixes: 1c9ba9c146 ("i40e: xsk: add RX multi-buffer support")
Fixes: 24ea50127e ("xsk: support mbuf on ZC RX")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-3-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:06 -08:00
..
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Makefile xsk: Introduce AF_XDP buffer allocation API 2020-05-21 17:31:26 -07:00
xdp_umem.c xsk: Add option to calculate TX checksum in SW 2023-11-29 14:59:40 -08:00
xdp_umem.h xsk: Fix umem cleanup bug at socket destruct 2020-11-20 15:52:39 +01:00
xsk_buff_pool.c xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags 2024-01-24 16:24:06 -08:00
xsk_diag.c net: fill in MODULE_DESCRIPTION()s for SOCK_DIAG modules 2023-11-19 20:09:13 +00:00
xsk_queue.c xdp: Fix zero-size allocation warning in xskq_create() 2023-10-09 16:13:29 +02:00
xsk_queue.h xsk: Add TX timestamp and TX checksum offload support 2023-11-29 14:59:40 -08:00
xsk.c xsk: recycle buffer in case Rx queue was full 2024-01-24 16:24:06 -08:00
xsk.h xdp: Add proper __rcu annotations to redirect map entries 2021-06-24 19:41:15 +02:00
xskmap.c bpf: Centralize permissions checks for all BPF map types 2023-06-19 14:04:04 +02:00