linux

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	467cd948f8	Merge remote-tracking branch 'torvalds/master' into perf/core Get fixes sent via perf/urgent, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-11 12:37:19 -03:00
Magnus Karlsson	27e934bec3	selftests: xsk: make stat tests not spin on getsockopt Convert the stats tests from spinning on the getsockopt to just check getsockopt once when the Rx thread has received all the packets. The actual completion of receiving the last packet forms a natural point in time when the receiver is ready to call the getsockopt to check the stats. In the previous version , we just span on the getsockopt until we received the right answer. This could be forever or just getting the "correct" answer by shear luck. The pacing_on variable can now be dropped since all test can now handle pacing properly. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-10-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:16 -07:00
Magnus Karlsson	4fec7028ff	selftests: xsk: make the stats tests normal tests Make the stats tests look and feel just like normal tests instead of bunched under the umbrella of TEST_STATS. This means we will always run each of them even if one fails. Also gets rid of some special case code. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-9-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:16 -07:00
Magnus Karlsson	76c576638f	selftests: xsk: introduce validation functions Introduce validation functions that can be optionally called by the Rx and Tx threads. These are then used to replace the Rx and Tx stats dispatchers. This so that we in the next commit can make the stats tests proper normal tests and not be some special case, as today. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-8-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:16 -07:00
Magnus Karlsson	d41cb6c474	selftests: xsk: cleanup veth pair at ctrl-c Remove the veth pair when the tests are aborted by pressing ctrl-c. Currently in this situation, the veth pair is left on the system polluting the netdev space. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-7-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:16 -07:00
Magnus Karlsson	db1bd7a994	selftests: xsk: add timeout to tests Add a timeout to the tests so that if all packets have not been received within 3 seconds, fail the ongoing test. Hinders a test from dead-locking if there is something wrong. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-6-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:15 -07:00
Magnus Karlsson	895b62eed2	selftests: xsk: fix reporting of failed tests Fix the reporting of failed tests as it was broken in several ways. First, a failed test was reported as both failed and passed messing up the count. Second, tests were not aborted after a failure and could generate more "failures" messing up the count even more. Third, the failure reporting from the application to the shell script was wrong. It always reported pass. And finally, the handling of the failures in the launch script was not correct. Correct all this by propagating the failure up through the function calls to a calling function that can abort the test. A receiver or sender thread will mark the new variable in the test spec called fail, if a test has failed. This is then picked up by the main thread when everyone else has exited and this is then marked and propagated up to the calling script. Also add a summary function in the calling script so that a user does not have to go through the sub tests to see if something has failed. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-5-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:15 -07:00
Magnus Karlsson	f90062b532	selftests: xsk: run all tests for busy-poll Execute all xsk selftests for busy-poll mode too. Currently they were only run for the standard interrupt driven softirq mode. Replace the unused option queue-id with the new option busy-poll. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-4-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:15 -07:00
Magnus Karlsson	f3e619bb34	selftests: xsk: do not send zero-length packets Do not try to send packets of zero length since they are dropped by veth after commit `726e2c5929` ("veth: Ensure eth header is in skb's linear part"). Replace these two packets with packets of length 60 so that they are not dropped. Also clean up the confusing naming. MIN_PKT_SIZE was really MIN_ETH_PKT_SIZE and PKT_SIZE was both MIN_ETH_SIZE and the default packet size called just PKT_SIZE. Make it consistent by using the right define in the right place. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-3-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:15 -07:00
Magnus Karlsson	685e64a3c9	selftests: xsk: cleanup bash scripts Remove the spec-file that is not used any longer from the shell scripts. Also remove an unused option. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-2-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 08:03:15 -07:00
Ravi Bangoria	9cb23f598c	perf/ibs: Fix comment s/IBS Op Data 2/IBS Op Data 1/ for MSR 0xc0011035. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220509044914.1473-9-ravi.bangoria@amd.com	2022-05-11 16:27:10 +02:00
Jiri Olsa	b63b3c490e	libbpf: Add bpf_program__set_insns function Adding bpf_program__set_insns that allows to set new instructions for a BPF program. This is a very advanced libbpf API and users need to know what they are doing. This should be used from prog_prepare_load_fn callback only. We can have changed instructions after calling prog_prepare_load_fn callback, reloading them. One of the users of this new API will be perf's internal BPF prologue generation. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510074659.2557731-2-jolsa@kernel.org	2022-05-11 14:15:17 +02:00
Andrii Nakryiko	5eefe17c7a	libbpf: Clean up ringbuf size adjustment implementation Drop unused iteration variable, move overflow prevention check into the for loop. Fixes: `0087a681fa` ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220510185159.754299-1-andrii@kernel.org	2022-05-11 14:06:29 +02:00
Kui-Feng Lee	ddc0027a4c	selftest/bpf: The test cases of BPF cookie for fentry/fexit/fmod_ret/lsm. Make sure BPF cookies are correct for fentry/fexit/fmod_ret/lsm. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-6-kuifeng@fb.com	2022-05-10 21:58:40 -07:00
Kui-Feng Lee	129b9c5ee2	libbpf: Assign cookies to links in libbpf. Add a cookie field to the attributes of bpf_link_create(). Add bpf_program__attach_trace_opts() to attach a cookie to a link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-5-kuifeng@fb.com	2022-05-10 21:58:40 -07:00
Kui-Feng Lee	2fcc82411e	bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm. Pass a cookie along with BPF_LINK_CREATE requests. Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie. The cookie of a bpf_tracing_link is available by calling bpf_get_attach_cookie when running the BPF program of the attached link. The value of a cookie will be set at bpf_tramp_run_ctx by the trampoline of the link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com	2022-05-10 21:58:31 -07:00
Kui-Feng Lee	f7e0beaf39	bpf, x86: Generate trampolines from bpf_tramp_links Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect struct bpf_tramp_link(s) for a trampoline. struct bpf_tramp_link extends bpf_link to act as a linked list node. arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links to collects all bpf_tramp_link(s) that a trampoline should call. Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links instead of bpf_tramp_progs. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com	2022-05-10 17:50:40 -07:00
Jiri Olsa	5b6c7e5c44	selftests/bpf: Add attach bench test Adding test that reads all functions from ftrace available_filter_functions file and attach them all through kprobe_multi API. It also prints stats info with -v option, like on my setup: test_bench_attach: found 48712 functions test_bench_attach: attached in 1.069s test_bench_attach: detached in 0.373s Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220510122616.2652285-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-10 14:42:06 -07:00
Dmitrii Dolgov	5a9b8e2c1a	selftests/bpf: Add bpf link iter test Add a simple test for bpf link iterator Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20220510155233.9815-5-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-10 11:20:45 -07:00
Dmitrii Dolgov	f78625fdc9	selftests/bpf: Use ASSERT_* instead of CHECK Replace usage of CHECK with a corresponding ASSERT_* macro for bpf_iter tests. Only done if the final result is equivalent, no changes when replacement means loosing some information, e.g. from formatting string. Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20220510155233.9815-4-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-10 11:20:45 -07:00
Dmitrii Dolgov	6b2d16b657	selftests/bpf: Fix result check for test_bpf_hash_map The original condition looks like a typo, verify the skeleton loading result instead. Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20220510155233.9815-3-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-10 11:20:45 -07:00
Kaixi Fan	71b2ec21c3	selftests/bpf: Replace bpf_trace_printk in tunnel kernel code Replace bpf_trace_printk with bpf_printk in test_tunnel_kern.c. function bpf_printk is more easier and useful than bpf_trace_printk. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-4-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-10 10:49:03 -07:00
Kaixi Fan	1ee7efd40a	selftests/bpf: Move vxlan tunnel testcases to test_progs Move vxlan tunnel testcases from test_tunnel.sh to test_progs. And add vxlan tunnel source testcases also. Other tunnel testcases will be moved to test_progs step by step in the future. Rename bpf program section name as SEC("tc") because test_progs bpf loader could not load sections with name SEC("gre_set_tunnel"). Because of this, add bpftool to load bpf programs in test_tunnel.sh. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-3-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-10 10:49:03 -07:00
Kaixi Fan	26101f5ab6	bpf: Add source ip in "struct bpf_tunnel_key" Add tunnel source ip field in "struct bpf_tunnel_key". Add related code to set and get tunnel source field. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-10 10:49:03 -07:00
KP Singh	bd2331b375	bpftool: bpf_link_get_from_fd support for LSM programs in lskel bpf_link_get_from_fd currently returns a NULL fd for LSM programs. LSM programs are similar to tracing programs and can also use skel_raw_tracepoint_open. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220509214905.3754984-1-kpsingh@kernel.org	2022-05-10 10:42:08 -07:00
Namhyung Kim	cad10ce366	perf annotate: Add --percent-limit option Like in 'perf report' and 'perf top', Add this option to limit the number of functions it displays based on the overhead value in percent. This affects only stdio and stdio2 output modes. Without this, it shows very long disassembly lines for every function in the data file. If users don't want this behavior, they can set a value in percent to suppress that. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220502232015.697243-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:37:55 -03:00
Takshak Chahande	a82ebb093f	selftests/bpf: Handle batch operations for map-in-map bpf-maps This patch adds up test cases that handles 4 combinations: a) outer map: BPF_MAP_TYPE_ARRAY_OF_MAPS inner maps: BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_HASH b) outer map: BPF_MAP_TYPE_HASH_OF_MAPS inner maps: BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_HASH Signed-off-by: Takshak Chahande <ctakshak@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220510082221.2390540-2-ctakshak@fb.com	2022-05-10 10:34:57 -07:00
Adrian Hunter	7df319e5b3	perf auxtrace: Record whether an auxtrace mmap is needed Add a flag needs_auxtrace_mmap to record whether an auxtrace mmap is needed, in preparation for correctly determining whether or not an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:27:19 -03:00
Adrian Hunter	8f111be643	libperf evlist: Add evsel as a parameter to ->idx() Add evsel as a parameter to ->idx() in preparation for correctly determining whether an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:26:48 -03:00
Adrian Hunter	d8fe2efb65	libperf evlist: Move ->idx() into mmap_per_evsel() Move ->idx() into mmap_per_evsel() in preparation for adding evsel as a parameter. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:26:27 -03:00
Adrian Hunter	6a7b8a5a30	libperf evlist: Remove ->idx() per_cpu parameter Remove ->idx() per_cpu parameter because it isn't needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:25:57 -03:00
Adrian Hunter	d205a3a665	perf auxtrace: Do not mix up mmap idx The idx is with respect to evlist not evsel. That hasn't mattered because they are the same at present. Prepare for that not being the case, which it won't be when sideband tracking events are allowed on all CPUs even when auxtrace is limited to selected CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:24:32 -03:00
Adrian Hunter	024b3b42ad	perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c evlist__enable_event_idx() is used only by auxtrace. Move it to auxtrace.c in preparation for making it even more auxtrace specific. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:23:34 -03:00
Adrian Hunter	a40bb7518e	perf evlist: Use libperf functions in evlist__enable_event_idx() evlist__enable_event_idx() is used only for auxtrace events which are never system_wide. Simplify by using libperf enable event functions. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:22:47 -03:00
Adrian Hunter	00632610c2	libperf evsel: Add perf_evsel__enable_thread() Add perf_evsel__enable_thread() as a counterpart to perf_evsel__enable_cpu(), to enable all events for a thread. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-05-10 14:19:09 -03:00
Joel Savitz	17de1e559c	selftests: clarify common error when running gup_test The gup_test binary will fail showing only the output of perror("open") in the case that /sys/kernel/debug/gup_test is not found. This will almost always be due to CONFIG_GUP_TEST not being set, which enables compilation of a kernel that provides this file. Add a short error message to clarify this failure and point the user to the solution. Link: https://lkml.kernel.org/r/20220502224942.995427-1-jsavitz@redhat.com Signed-off-by: Joel Savitz <jsavitz@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Nico Pache <npache@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-05-09 18:20:47 -07:00
David Hildenbrand	78fbe906cc	mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages The basic question we would like to have a reliable and efficient answer to is: is this anonymous page exclusive to a single process or might it be shared? We need that information for ordinary/single pages, hugetlb pages, and possibly each subpage of a THP. Introduce a way to mark an anonymous page as exclusive, with the ultimate goal of teaching our COW logic to not do "wrong COWs", whereby GUP pins lose consistency with the pages mapped into the page table, resulting in reported memory corruptions. Most pageflags already have semantics for anonymous pages, however, PG_mappedtodisk should never apply to pages in the swapcache, so let's reuse that flag. As PG_has_hwpoisoned also uses that flag on the second tail page of a compound page, convert it to PG_error instead, which is marked as PF_NO_TAIL, so never used for tail pages. Use custom page flag modification functions such that we can do additional sanity checks. The semantics we'll put into some kernel doc in the future are: " PG_anon_exclusive is usually only expressive in combination with a page table entry. Depending on the page table entry type it might store the following information: Is what's mapped via this page table entry exclusive to the single process and can be mapped writable without further checks? If not, it might be shared and we might have to COW. For now, we only expect PTE-mapped THPs to make use of PG_anon_exclusive in subpages. For other anonymous compound folios (i.e., hugetlb), only the head page is logically mapped and holds this information. For example, an exclusive, PMD-mapped THP only has PG_anon_exclusive set on the head page. When replacing the PMD by a page table full of PTEs, PG_anon_exclusive, if set on the head page, will be set on all tail pages accordingly. Note that converting from a PTE-mapping to a PMD mapping using the same compound page is currently not possible and consequently doesn't require care. If GUP wants to take a reliable pin (FOLL_PIN) on an anonymous page, it should only pin if the relevant PG_anon_exclusive is set. In that case, the pin will be fully reliable and stay consistent with the pages mapped into the page table, as the bit cannot get cleared (e.g., by fork(), KSM) while the page is pinned. For anonymous pages that are mapped R/W, PG_anon_exclusive can be assumed to always be set because such pages cannot possibly be shared. The page table lock protecting the page table entry is the primary synchronization mechanism for PG_anon_exclusive; GUP-fast that does not take the PT lock needs special care when trying to clear the flag. Page table entry types and PG_anon_exclusive: * Present: PG_anon_exclusive applies. * Swap: the information is lost. PG_anon_exclusive was cleared. * Migration: the entry holds this information instead. PG_anon_exclusive was cleared. * Device private: PG_anon_exclusive applies. * Device exclusive: PG_anon_exclusive applies. * HW Poison: PG_anon_exclusive is stale and not changed. If the page may be pinned (FOLL_PIN), clearing PG_anon_exclusive is not allowed and the flag will stick around until the page is freed and folio->mapping is cleared. " We won't be clearing PG_anon_exclusive on destructive unmapping (i.e., zapping) of page table entries, page freeing code will handle that when also invalidate page->mapping to not indicate PageAnon() anymore. Letting information about exclusivity stick around will be an important property when adding sanity checks to unpinning code. Note that we properly clear the flag in free_pages_prepare() via PAGE_FLAGS_CHECK_AT_PREP for each individual subpage of a compound page, so there is no need to manually clear the flag. Link: https://lkml.kernel.org/r/20220428083441.37290-12-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Rientjes <rientjes@google.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Khalid Aziz <khalid.aziz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-05-09 18:20:44 -07:00
Jason Wang	56c3e749d0	bpftool: Declare generator name Most code generators declare its name so did this for bfptool. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220509090247.5457-1-jasowang@redhat.com	2022-05-09 17:42:53 -07:00
Joel Savitz	41c240099f	selftests: vm: Makefile: rename TARGETS to VMTARGETS The tools/testing/selftests/vm/Makefile uses the variable TARGETS internally to generate a list of platform-specific binary build targets suffixed with _{32,64}. When building the selftests using its own Makefile directly, such as via the following command run in a kernel tree: One receives an error such as the following: make: Entering directory '/root/linux/tools/testing/selftests' make --no-builtin-rules ARCH=x86 -C ../../.. headers_install make[1]: Entering directory '/root/linux' INSTALL ./usr/include make[1]: Leaving directory '/root/linux' make[1]: Entering directory '/root/linux/tools/testing/selftests/vm' make[1]: * No rule to make target 'vm.c', needed by '/root/linux/tools/testing/selftests/vm/vm_64'. Stop. make[1]: Leaving directory '/root/linux/tools/testing/selftests/vm' make: * [Makefile:175: all] Error 2 make: Leaving directory '/root/linux/tools/testing/selftests' The TARGETS variable passed to tools/testing/selftests/Makefile collides with the TARGETS used in tools/testing/selftests/vm/Makefile, so rename the latter to VMTARGETS, eliminating the collision with no functional change. Link: https://lkml.kernel.org/r/20220504213454.1282532-1-jsavitz@redhat.com Fixes: `f21fda8f64` ("selftests: vm: pkeys: fix multilib builds for x86") Signed-off-by: Joel Savitz <jsavitz@redhat.com> Acked-by: Nico Pache <npache@redhat.com> Cc: Joel Savitz <jsavitz@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-05-09 17:34:29 -07:00
Milan Landaverde	b06a92a18d	bpftool: Output message if no helpers found in feature probing Currently in libbpf, we have hardcoded program types that are not supported for helper function probing (e.g. tracing, ext, lsm). Due to this (and other legitimate failures), bpftool feature probe returns empty for those program type helper functions. Instead of implying to the user that there are no helper functions available for a program type, we output a message to the user explaining that helper function probing failed for that program type. Signed-off-by: Milan Landaverde <milan@mdaverde.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220504161356.3497972-3-milan@mdaverde.com	2022-05-09 17:16:05 -07:00
Milan Landaverde	6d9f63b9df	bpftool: Adjust for error codes from libbpf probes Originally [1], libbpf's (now deprecated) probe functions returned a bool to acknowledge support but the new APIs return an int with a possible negative error code to reflect probe failure. This change decides for bpftool to declare maps and helpers are not available on probe failures. [1]: https://lore.kernel.org/bpf/20220202225916.3313522-3-andrii@kernel.org/ Signed-off-by: Milan Landaverde <milan@mdaverde.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220504161356.3497972-2-milan@mdaverde.com	2022-05-09 17:16:05 -07:00
Andrii Nakryiko	7b3a063824	selftests/bpf: Test libbpf's ringbuf size fix up logic Make sure we always excercise libbpf's ringbuf map size adjustment logic by specifying non-zero size that's definitely not a page size multiple. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-10-andrii@kernel.org	2022-05-09 17:15:32 +02:00
Andrii Nakryiko	0087a681fa	libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary Kernel imposes a pretty particular restriction on ringbuf map size. It has to be a power-of-2 multiple of page size. While generally this isn't hard for user to satisfy, sometimes it's impossible to do this declaratively in BPF source code or just plain inconvenient to do at runtime. One such example might be BPF libraries that are supposed to work on different architectures, which might not agree on what the common page size is. Let libbpf find the right size for user instead, if it turns out to not satisfy kernel requirements. If user didn't set size at all, that's most probably a mistake so don't upsize such zero size to one full page, though. Also we need to be careful about not overflowing __u32 max_entries. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-9-andrii@kernel.org	2022-05-09 17:15:32 +02:00
Andrii Nakryiko	f760d05379	libbpf: Provide barrier() and barrier_var() in bpf_helpers.h Add barrier() and barrier_var() macros into bpf_helpers.h to be used by end users. While a bit advanced and specialized instruments, they are sometimes indispensable. Instead of requiring each user to figure out exact asm volatile incantations for themselves, provide them from bpf_helpers.h. Also remove conflicting definitions from selftests. Some tests rely on barrier_var() definition being nothing, those will still work as libbpf does the #ifndef/#endif guarding for barrier() and barrier_var(), allowing users to redefine them, if necessary. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-8-andrii@kernel.org	2022-05-09 17:15:32 +02:00
Andrii Nakryiko	785c3342cf	selftests/bpf: Add bpf_core_field_offset() tests Add test cases for bpf_core_field_offset() helper. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-7-andrii@kernel.org	2022-05-09 17:15:32 +02:00
Andrii Nakryiko	7715f549a9	libbpf: Complete field-based CO-RE helpers with field offset helper Add bpf_core_field_offset() helper to complete field-based CO-RE helpers. This helper can be useful for feature-detection and for some more advanced cases of field reading (e.g., reading flexible array members). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-6-andrii@kernel.org	2022-05-09 17:15:32 +02:00
Andrii Nakryiko	2a4ca46b7d	selftests/bpf: Use both syntaxes for field-based CO-RE helpers Excercise both supported forms of bpf_core_field_exists() and bpf_core_field_size() helpers: variable-based field reference and type/field name-based one. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-5-andrii@kernel.org	2022-05-09 17:15:32 +02:00
Andrii Nakryiko	73d0280f6b	libbpf: Improve usability of field-based CO-RE helpers Allow to specify field reference in two ways: - if user has variable of necessary type, they can use variable-based reference (my_var.my_field or my_var_ptr->my_field). This was the only supported syntax up till now. - now, bpf_core_field_exists() and bpf_core_field_size() support also specifying field in a fashion similar to offsetof() macro, by specifying type of the containing struct/union separately and field name separately: bpf_core_field_exists(struct my_type, my_field). This forms is quite often more convenient in practice and it matches type-based CO-RE helpers that support specifying type by its name without requiring any variables. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-4-andrii@kernel.org	2022-05-09 17:15:28 +02:00
Andrii Nakryiko	8e2f618e8b	libbpf: Make __kptr and __kptr_ref unconditionally use btf_type_tag() attr It will be annoying and surprising for users of __kptr and __kptr_ref if libbpf silently ignores them just because Clang used for compilation didn't support btf_type_tag(). It's much better to get clear compiler error than debug BPF verifier failures later on. Fixes: `ef89654f2b` ("libbpf: Add kptr type tag macros to bpf_helpers.h") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-3-andrii@kernel.org	2022-05-09 17:14:40 +02:00
Andrii Nakryiko	1e2666e029	selftests/bpf: Prevent skeleton generation race Prevent "classic" and light skeleton generation rules from stomping on each other's toes due to the use of the same <obj>.linked{1,2,3}.o naming pattern. There is no coordination and synchronizataion between .skel.h and .lskel.h rules, so they can easily overwrite each other's intermediate object files, leading to errors like: /bin/sh: line 1: 170928 Bus error (core dumped) /data/users/andriin/linux/tools/testing/selftests/bpf/tools/sbin/bpftool gen skeleton /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.linked3.o name test_ksyms_weak > /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h make: * [Makefile:507: /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h] Error 135 make: * Deleting file '/data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h' Fix by using different suffix for light skeleton rule. Fixes: `c48e51c8b0` ("bpf: selftests: Add selftests for module kfunc support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-2-andrii@kernel.org	2022-05-09 17:14:40 +02:00

... 6 7 8 9 10 ...

30354 Commits