linux/samples
Jesper Dangaard Brouer eff94154cc samples/bpf: xdp_redirect_cpu_user: Cpumap qsize set larger default
Experience from production shows that a queue size of 192 is too small:
it caused packet drops during cpumap-enqueue on the RX-CPU.  This can be
diagnosed with the xdp_monitor sample program.

This bpftrace program was used to diagnose the problem in more detail:

 bpftrace -e '
  tracepoint:xdp:xdp_cpumap_kthread { @deq_bulk = lhist(args->processed,0,10,1); @drop_net = lhist(args->drops,0,10,1) }
  tracepoint:xdp:xdp_cpumap_enqueue { @enq_bulk = lhist(args->processed,0,10,1); @enq_drops = lhist(args->drops,0,10,1); }'

Watch the @enq_drops counter. The @drop_net counter can increase when
the netstack receives invalid packets, so don't despair: that can be
natural. The counter will likely disappear in newer kernels, as it
was a source of confusion (look at netstat info for the reason behind
the netstack @drop_net counts).

The production system was configured with CPU power-saving C6 state.
Learn more in this blogpost[1].

The wakeup latencies in usec for the idle states are:

 # grep -H . /sys/devices/system/cpu/cpu0/cpuidle/*/latency
 /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
 /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
 /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
 /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:133
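
As a rough illustration, the listing above can be parsed to pick out the
deepest idle state. This is only a sketch: the helper names parse_latencies
and deepest_state are hypothetical, and the sample text is copied from the
grep output rather than read from a live system.

```python
import re

# Sample output copied from the grep command above. On a live system the
# same values could be read from /sys/devices/system/cpu/cpu0/cpuidle/.
SAMPLE = """\
/sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
/sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:133
"""


def parse_latencies(text):
    """Map idle-state name (e.g. 'state3') to its wakeup latency in usec."""
    latencies = {}
    for line in text.splitlines():
        match = re.search(r"/(state\d+)/latency:(\d+)$", line.strip())
        if match:
            latencies[match.group(1)] = int(match.group(2))
    return latencies


def deepest_state(latencies):
    """Return (state, latency_usec) for the state with the largest latency."""
    return max(latencies.items(), key=lambda item: item[1])


if __name__ == "__main__":
    state, usec = deepest_state(parse_latencies(SAMPLE))
    print(f"deepest idle state: {state}, wakeup latency {usec} usec")
```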

The deepest state takes 133 usec to wake up from (133/10^6 sec). The link
speed is 25Gbit/s ((25*10^9/8) in bytes/sec). How many bytes can arrive
within 133 usec at this speed? (25*10^9/8)*(133/10^6) = 415625 bytes. With
MTU size packets this is 275 packets, and with minimum-size Ethernet
packets (84 bytes incl. inter-frame gap overhead) it is 4948 packets.
Clearly the default queue size is too small.
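
The arithmetic above can be checked with a short sketch. The function names
are hypothetical. The 1514-byte frame size used for the MTU case (1500-byte
MTU plus a 14-byte Ethernet header) is an assumption: the text does not say
which size was used, only that the result is 275 packets.

```python
def bytes_in_window(link_bps, window_usec):
    """Bytes that can arrive on a link of link_bps bits/sec in window_usec."""
    return link_bps / 8 * window_usec / 1_000_000


def packets_in_window(link_bps, window_usec, wire_bytes):
    """Packets of wire_bytes each that fit into the same time window."""
    return bytes_in_window(link_bps, window_usec) / wire_bytes


if __name__ == "__main__":
    link_bps = 25 * 10**9   # 25Gbit/s link
    wakeup_usec = 133       # deepest idle state wakeup latency

    print(bytes_in_window(link_bps, wakeup_usec), "bytes")
    # Minimum Ethernet packet incl. overhead, as stated in the text.
    print(round(packets_in_window(link_bps, wakeup_usec, 84)), "small packets")
    # Assumed full-size frame: 1500-byte MTU + 14-byte Ethernet header.
    print(round(packets_in_window(link_bps, wakeup_usec, 1514)), "MTU packets")
```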

Set the default cpumap queue size to 2048: the worst case (small packets)
at 10Gbit/s is 1979 packets with a 133 usec wakeup time, plus 64 packets
before the kthread wakeup call (due to xdp_do_flush), for a worst case of
2043 packets.
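
A minimal sketch of this sizing rule, using only the numbers quoted in this
message (84-byte minimum packet, 133 usec wakeup, 64 packets queued before
the flush). The function name and constants are hypothetical, chosen for
illustration, and are not taken from the sample program.

```python
MIN_WIRE_BYTES = 84   # minimum Ethernet packet incl. overhead (from the text)
FLUSH_BATCH = 64      # packets enqueued before the kthread wakeup call
NEW_QSIZE = 2048      # new default cpumap queue size


def worst_case_queue_depth(link_bps, wakeup_usec,
                           wire_bytes=MIN_WIRE_BYTES,
                           flush_batch=FLUSH_BATCH):
    """Packets that can pile up while the remote CPU is still waking up."""
    # Integer arithmetic: whole packets arriving within the wakeup latency.
    arrived = link_bps * wakeup_usec // (8 * 1_000_000 * wire_bytes)
    return arrived + flush_batch


if __name__ == "__main__":
    depth = worst_case_queue_depth(10 * 10**9, 133)
    print(depth, "packets worst case at 10Gbit/s;",
          "fits" if depth <= NEW_QSIZE else "does not fit",
          "in a queue of", NEW_QSIZE)
```

By the same arithmetic, the 25Gbit/s small-packet case from the earlier
paragraph would still exceed 2048 entries.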

Thus, if a packet burst on the RX-CPU enqueues packets to a remote
cpumap CPU that is in a deep-sleep state, it can overrun the cpumap queue.

The production system was also configured to avoid deep-sleep via:
 tuned-adm profile network-latency

[1] https://jeremyeder.com/2013/08/30/oh-did-you-expect-the-cpu/

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/162523477604.786243.13372630844944530891.stgit@firesoul
2021-07-07 20:11:48 -07:00
acrn sample/acrn: Introduce a sample of HSM ioctl interface usage 2021-02-09 10:58:19 +01:00
auxdisplay .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
binderfs .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
bpf samples/bpf: xdp_redirect_cpu_user: Cpumap qsize set larger default 2021-07-07 20:11:48 -07:00
configfs treewide: remove editor modelines and cruft 2021-05-07 00:26:34 -07:00
connector .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
ftrace samples/ftrace: Mark my_tramp[12]? global 2020-11-30 21:42:48 -05:00
hidraw .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
hw_breakpoint samples/hw_breakpoint: drop use of kallsyms_lookup_name() 2020-04-07 10:43:44 -07:00
kdb treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
kfifo kfifo: fix ternary sign extension bugs 2021-04-30 11:20:35 -07:00
kmemleak mm,kmemleak-test.c: move kmemleak-test.c to samples dir 2020-10-13 18:38:27 -07:00
kobject treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
kprobes This was a reasonably active cycle for documentation; this pull includes: 2021-06-28 16:53:05 -07:00
landlock samples/landlock: Add a sandbox manager example 2021-04-22 12:22:11 -07:00
livepatch livepatch: Handle allocation failure in the sample of shadow variable API 2020-01-17 11:12:06 +01:00
mei .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
nitro_enclaves .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
pidfd .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
pktgen samples: pktgen: add UDP tx checksum support 2021-05-28 14:52:13 -07:00
qmi samples: qmi: Constify static qmi ops 2020-11-24 17:08:47 -06:00
rpmsg samples/rpmsg: Introduce a module parameter for message count 2019-08-26 22:10:39 -07:00
seccomp .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
timers .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
trace_events tracing: Fix doc mistakes in trace sample 2020-05-07 13:32:57 -04:00
trace_printk samples/trace_printk: Wait for IRQ work to finish 2019-12-21 16:08:22 -05:00
uhid kbuild: introduce hostprogs-always-y and userprogs-always-y 2020-08-10 01:32:59 +09:00
v4l media: rename VFL_TYPE_GRABBER to _VIDEO 2020-02-24 16:52:39 +01:00
vfio-mdev samples: vfio-mdev: fix error handing in mdpy_fb_probe() 2021-05-24 13:40:13 -06:00
vfs .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
watch_queue .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
watchdog .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
Kconfig samples/landlock: Add a sandbox manager example 2021-04-22 12:22:11 -07:00
Makefile samples/landlock: Add a sandbox manager example 2021-04-22 12:22:11 -07:00