Commit Graph

2235 Commits

Author SHA1 Message Date
Andrii Nakryiko
3e6fe5ce4d libbpf: Fix internal USDT address translation logic for shared libraries
Perform the same virtual address to file offset translation that libbpf
is doing for executable ELF binaries also for shared libraries.
Currently libbpf is making a simplifying and sometimes wrong assumption
that for shared libraries relative virtual addresses inside ELF are
always equal to file offsets.

Unfortunately, this is not always the case with LLVM's lld linker, which
now by default generates quite more complicated ELF segments layout.
E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from
readelf output listing ELF segments (a.k.a. program headers):

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R   0x8
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R   0x1000
  LOAD           0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000
  LOAD           0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW  0x1000
  LOAD           0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW  0x1000

Compare that to what is generated by GNU ld (or LLVM lld's with extra
-znoseparate-code argument which disables this cleverness in the name of
file size reduction):

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R   0x1000
  LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000
  LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R   0x1000
  LOAD           0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW  0x1000

You can see from the first example above that for executable (Flg == "R E")
PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns.
And it does in the second case (GNU ld output).

This is important because all the addresses, including USDT specs,
operate in a virtual address space, while kernel is expecting file
offsets when performing uprobe attach. So such mismatches have to be
properly taken care of and compensated by libbpf, which is what this
patch is fixing.

Also patch clarifies few function and variable names, as well as updates
comments to reflect this important distinction (virtaddr vs file offset)
and to ephasize that shared libraries are not all that different from
executables in this regard.

This patch also changes selftests/bpf Makefile to force urand_read and
liburand_read.so to be built with Clang and LLVM's lld (and explicitly
request this ELF file size optimization through -znoseparate-code linker
parameter) to validate libbpf logic and ensure regressions don't happen
in the future. I've bundled these selftests changes together with libbpf
changes to keep the above description tied with both libbpf and
selftests changes.

Fixes: 74cc6311ce ("libbpf: Add USDT notes parsing and resolution logic")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220616055543.3285835-1-andrii@kernel.org
2022-06-17 01:20:10 +02:00
Yonghong Song
c49a44b39b libbpf: Fix an unsigned < 0 bug
Andrii reported a bug with the following information:

  2859 	if (enum64_placeholder_id == 0) {
  2860 		enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0);
  >>>     CID 394804:  Control flow issues  (NO_EFFECT)
  >>>     This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U".
  2861 		if (enum64_placeholder_id < 0)
  2862 			return enum64_placeholder_id;
  2863    	...

Here enum64_placeholder_id declared as '__u32' so enum64_placeholder_id < 0
is always false. Declare enum64_placeholder_id as 'int' in order to capture
the potential error properly.

Fixes: f2a625889b ("libbpf: Add enum64 sanitization")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com
2022-06-14 17:01:54 +02:00
Andrii Nakryiko
fe92833524 libbpf: Fix uprobe symbol file offset calculation logic
Fix libbpf's bpf_program__attach_uprobe() logic of determining
function's *file offset* (which is what kernel is actually expecting)
when attaching uprobe/uretprobe by function name. Previously calculation
was determining virtual address offset relative to base load address,
which (offset) is not always the same as file offset (though very
frequently it is which is why this went unnoticed for a while).

Fixes: 433966e3ae ("libbpf: Support function name-based attach uprobes")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Riham Selim <rihams@fb.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220606220143.3796908-1-andrii@kernel.org
2022-06-09 14:09:41 +02:00
Yonghong Song
23b2a3a8f6 libbpf: Add enum64 relocation support
The enum64 relocation support is added. The bpf local type
could be either enum or enum64 and the remote type could be
either enum or enum64 too. The all combinations of local enum/enum64
and remote enum/enum64 are supported.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yonghong Song
6ec7d79be2 libbpf: Add enum64 support for bpf linking
Add BTF_KIND_ENUM64 support for bpf linking, which is
very similar to BTF_KIND_ENUM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062642.3721494-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yonghong Song
f2a625889b libbpf: Add enum64 sanitization
When old kernel does not support enum64 but user space btf
contains non-zero enum kflag or enum64, libbpf needs to
do proper sanitization so modified btf can be accepted
by the kernel.

Sanitization for enum kflag can be achieved by clearing
the kflag bit. For enum64, the type is replaced with an
union of integer member types and the integer member size
must be smaller than enum64 size. If such an integer
type cannot be found, a new type is created and used
for union members.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yonghong Song
d90ec262b3 libbpf: Add enum64 support for btf_dump
Add enum64 btf dumping support. For long long and unsigned long long
dump, suffixes 'LL' and 'ULL' are added to avoid compilation errors
in some cases.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062631.3720526-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yonghong Song
2ef2026349 libbpf: Add enum64 deduplication support
Add enum64 deduplication support. BTF_KIND_ENUM64 handling
is very similar to BTF_KIND_ENUM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062626.3720166-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yonghong Song
dffbbdc2d9 libbpf: Add enum64 parsing and new enum64 public API
Add enum64 parsing support and two new enum64 public APIs:
  btf__add_enum64
  btf__add_enum64_value

Also add support of signedness for BTF_KIND_ENUM. The
BTF_KIND_ENUM API signatures are not changed. The signedness
will be changed from unsigned to signed if btf__add_enum_value()
finds any negative values.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062621.3719391-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yonghong Song
8479aa7522 libbpf: Refactor btf__add_enum() for future code sharing
Refactor btf__add_enum() function to create a separate
function btf_add_enum_common() so later the common function
can be used to add enum64 btf type. There is no functionality
change for this patch.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062615.3718063-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:42 -07:00
Yonghong Song
b58b2b3a31 libbpf: Fix an error in 64bit relocation value computation
Currently, the 64bit relocation value in the instruction
is computed as follows:
  __u64 imm = insn[0].imm + ((__u64)insn[1].imm << 32)

Suppose insn[0].imm = -1 (0xffffffff) and insn[1].imm = 1.
With the above computation, insn[0].imm will first sign-extend
to 64bit -1 (0xffffffffFFFFFFFF) and then add 0x1FFFFFFFF,
producing incorrect value 0xFFFFFFFF. The correct value
should be 0x1FFFFFFFF.

Changing insn[0].imm to __u32 first will prevent 64bit sign
extension and fix the issue. Merging high and low 32bit values
also changed from '+' to '|' to be consistent with other
similar occurences in kernel and libbpf.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062610.3717378-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:42 -07:00
Yonghong Song
776281652d libbpf: Permit 64bit relocation value
Currently, the libbpf limits the relocation value to be 32bit
since all current relocations have such a limit. But with
BTF_KIND_ENUM64 support, the enum value could be 64bit.
So let us permit 64bit relocation value in libbpf.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062605.3716779-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:42 -07:00
Linus Torvalds
d0e60d46bc Bitmap patches for 5.19-rc1
This series includes the following patchsets:
  - bitmap: optimize bitmap_weight() usage(w/o bitmap_weight_cmp), from me;
  - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro
    Carvalho Chehab;
  - include/linux/find: Fix documentation, from Anna-Maria Behnsen;
  - bitmap: fix conversion from/to fix-sized arrays, from me;
  - bitmap: Fix return values to be unsigned, from Kees Cook.
 
 It has been in linux-next for at least a week with no problems.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmKaEzYACgkQsUSA/Tof
 vsiGKwv8Dgr3G0mLbSfmHZqdFMIsmSmwhxlEH6eBNtX6vjQbGafe/Buhj/1oSY8N
 NYC4+5Br6s7MmMRth3Kp6UECdl94TS3Ka06T+lVBKkG+C+B1w1/svqUMM2ZCQF3e
 Z5R/HhR6av9X9Qb2mWSasWLkWp629NjdtRsJSDWiVt1emVVwh+iwxQnMH9VuE+ao
 z3mvaQfSRhe4h+xCZOiohzFP+0jZb1EnPrQAIVzNUjigo7mglpNvVyO7p/8LU7gD
 dIjfGmSbtsHU72J+/0lotRqjhjORl1F/EILf8pIzx5Ga7ExUGhOzGWAj7/3uZxfA
 Cp1Z/QV271MGwv/sNdSPwCCJHf51eOmsbyOyUScjb3gFRwIStEa1jB4hKwLhS5wF
 3kh4kqu3WGuIQAdxkUpDBsy3CQjAPDkvtRJorwyWGbjwa9xUETESAgH7XCCTsgWc
 0sIuldWWaxC581+fAP1Dzmo8uuWBURTaOrVmRMILQHMTw54zoFyLY+VI9gEAT9aV
 gnPr3M4F
 =U7DN
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - bitmap: optimize bitmap_weight() usage, from me

 - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro
   Carvalho Chehab

 - include/linux/find: Fix documentation, from Anna-Maria Behnsen

 - bitmap: fix conversion from/to fix-sized arrays, from me

 - bitmap: Fix return values to be unsigned, from Kees Cook

It has been in linux-next for at least a week with no problems.

* tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits)
  nodemask: Fix return values to be unsigned
  bitmap: Fix return values to be unsigned
  KVM: x86: hyper-v: replace bitmap_weight() with hweight64()
  KVM: x86: hyper-v: fix type of valid_bank_mask
  ia64: cleanup remove_siblinginfo()
  drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
  KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate
  lib/bitmap: add test for bitmap_{from,to}_arr64
  lib: add bitmap_{from,to}_arr64
  lib/bitmap: extend comment for bitmap_(from,to)_arr32()
  include/linux/find: Fix documentation
  lib/bitmap.c make bitmap_print_bitmask_to_buf parseable
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API
  arch/x86: replace nodes_weight with nodes_empty where appropriate
  mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  clocksource: replace cpumask_weight with cpumask_empty in clocksource.c
  genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate
  irq: mips: replace cpumask_weight with cpumask_empty where appropriate
  drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  ...
2022-06-04 14:04:27 -07:00
Yuze Chi
611edf1bac libbpf: Fix is_pow_of_2
Move the correct definition from linker.c into libbpf_internal.h.

Fixes: 0087a681fa ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Reported-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220603055156.2830463-1-irogers@google.com
2022-06-03 14:53:33 -07:00
Daniel Müller
9bbdfad8a5 libbpf: Fix a couple of typos
This change fixes a couple of typos that were encountered while studying
the source code.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220601154025.3295035-1-deso@posteo.net
2022-06-03 14:53:33 -07:00
Kees Cook
005f17007f bitmap: Fix return values to be unsigned
Both nodemask and bitmap routines had mixed return values that provided
potentially signed return values that could never happen. This was
leading to the compiler getting confusing about the range of possible
return values (it was thinking things could be negative where they could
not be). In preparation for fixing nodemask, fix all the bitmap routines
that should be returning unsigned (or bool) values.

Cc: Yury Norov <yury.norov@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Christophe de Dinechin <dinechin@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03 06:52:58 -07:00
Douglas Raillard
610cd93b44 libbpf: Fix determine_ptr_size() guessing
One strategy employed by libbpf to guess the pointer size is by finding
the size of "unsigned long" type. This is achieved by looking for a type
of with the expected name and checking its size.

Unfortunately, the C syntax is friendlier to humans than to computers
as there is some variety in how such a type can be named. Specifically,
gcc and clang do not use the same names for integer types in debug info:

    - clang uses "unsigned long"
    - gcc uses "long unsigned int"

Lookup all the names for such a type so that libbpf can hope to find the
information it wants.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220524094447.332186-1-douglas.raillard@arm.com
2022-06-02 16:26:51 -07:00
Daniel Müller
ba5d1b5802 libbpf: Introduce libbpf_bpf_link_type_str
This change introduces a new function, libbpf_bpf_link_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_link_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-11-deso@posteo.net
2022-06-02 16:26:33 -07:00
Daniel Müller
ccde5760ba libbpf: Introduce libbpf_bpf_attach_type_str
This change introduces a new function, libbpf_bpf_attach_type_str, to
the public libbpf API. The function allows users to get a string
representation for a bpf_attach_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-8-deso@posteo.net
2022-06-02 16:26:26 -07:00
Daniel Müller
3e6dc0207b libbpf: Introduce libbpf_bpf_map_type_str
This change introduces a new function, libbpf_bpf_map_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_map_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-5-deso@posteo.net
2022-06-02 16:26:20 -07:00
Daniel Müller
d18616e7aa libbpf: Introduce libbpf_bpf_prog_type_str
This change introduces a new function, libbpf_bpf_prog_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_prog_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-2-deso@posteo.net
2022-06-02 16:26:10 -07:00
Adrian Hunter
a41e24f6c3 perf tools: Allow system-wide events to keep their own threads
System-wide events do not have threads, so do not propagate threads to
them.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
Adrian Hunter
298613b8e3 perf tools: Allow system-wide events to keep their own CPUs
Currently, user_requested_cpus supplants system-wide CPUs when the evlist
has_user_cpus. Change that so that system-wide events retain their own
CPUs and they are added to all_cpus.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
Adrian Hunter
f5fb6d4efe libperf evsel: Add comments for booleans
Add comments for 'system_wide' and 'requires_cpu' booleans

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
Adrian Hunter
d3345fecf9 perf stat: Add requires_cpu flag for uncore
Uncore events require a CPU i.e. it cannot be -1.

The evsel system_wide flag is intended for events that should be on every
CPU, which does not make sense for uncore events because uncore events do
not map one-to-one with CPUs.

These 2 requirements are not exactly the same, so introduce a new flag
'requires_cpu' for the uncore case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
Adrian Hunter
4ce47d842d libperf evlist: Check nr_mmaps is correct
Print an error message if the predetermined number of mmaps is
incorrect.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
Adrian Hunter
ae4f8ae16a libperf evlist: Allow mixing per-thread and per-cpu mmaps
mmap_per_evsel() will skip events that do not match the CPU, so all CPUs
can be iterated in any case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
Adrian Hunter
7be1fedd2a perf tools: Allow all_cpus to be a superset of user_requested_cpus
To support collection of system-wide events with user requested CPUs,
all_cpus must be a superset of user_requested_cpus.

In order to support all_cpus to be a superset of user_requested_cpus,
all_cpus must be used instead of user_requested_cpus when dealing with CPUs
of all events instead of CPUs of requested events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
Linus Torvalds
d223575e50 perf tools changes for v5.19: 1st batch
Intel PT:
 
 - Allow hardware tracing on KVM test programs.  In this case, the VM is not
   running an OS, but only the functions loaded into it by the hypervisor test
   program, and conveniently, loaded at the same virtual addresses.
 
 - Improve documentation:
 
   - Add link to perf wiki's page.
 
 - Cleanups
 
   - Delete now unusdd perf-with-kcore.sh script.
 
   - Remove unused machines__find_host().
 
 ARM SPE (Statistical Profile Extensions):
 
   - Add man page entry.
 
 Vendor Events:
 
   - Update various Intel event topics.
 
   - Update various microarch events.
 
   - Fix various cstate metrics.
 
   - Fix Alderlake metric groups.
 
   - Add sapphirerapids events.
 
   - Add JSON files for ARM Cortex A34, A35, A55, A510, A65, A73, A75, A77, A78,
     A710, X1, X2 and Neoverse E1.
 
   - Update Cortex A57/A72.
 
 perf stat:
 
   - Introduce stats for the user and system rusage times.
 
 perf c2c:
 
   - Prep work to support ARM systems.
 
 perf annotate:
 
   - Add --percent-limit option
 
 perf lock:
 
   - Add -t/--thread option for report.
 
   - Do not discard broken lock stats.
 
 perf bench:
 
   Add breakpoint benchmarks.
 
 perf test:
 
   - Limit to only run executable scripts in tests.
 
   - Add basic perf record tests.
 
   - Add stat record+report test.
 
   - Add basic stat and topdown group test.
 
   - Skip several tests when the user hasn't permission to perform them.
 
   - Fix test case 81 ("perf record tests") on s390x.
 
 perf version:
 
   - debuginfod support improvements.
 
 perf scripting python:
 
   - Expose symbol offset and source information.
 
 perf build:
 
   - Error for BPF skeletons without LIBBPF.
 
   - Use Python devtools for version autodetection rather than runtime.
 
 Miscellaneous:
 
   - Add riscv64 support to 'perf jitdump'.
 
   - Various fixes/tidy ups related to cpu_map.
 
   - Fixes for handling Intel hybrid systems.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYow4VgAKCRCyPKLppCJ+
 J9txAP9pWif22k+pwIgw2NHzaA/TLlGoatUQRvryX02fohj+RAD8DGrPXD33zgMb
 QP9inIXOjQUPJrXomxVVskNdCCzolgo=
 =B/vL
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.19-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "Intel PT:

   - Allow hardware tracing on KVM test programs. In this case, the VM
     is not running an OS, but only the functions loaded into it by the
     hypervisor test program, and conveniently, loaded at the same
     virtual addresses.

   - Improve documentation:
      - Add link to perf wiki's page

   - Cleanups:
      - Delete now unused perf-with-kcore.sh script
      - Remove unused machines__find_host()

  ARM SPE (Statistical Profile Extensions):

   - Add man page entry.

  Vendor Events:

   - Update various Intel event topics

   - Update various microarch events

   - Fix various cstate metrics

   - Fix Alderlake metric groups

   - Add sapphirerapids events

   - Add JSON files for ARM Cortex A34, A35, A55, A510, A65, A73, A75,
     A77, A78, A710, X1, X2 and Neoverse E1

   - Update Cortex A57/A72

  perf stat:

   - Introduce stats for the user and system rusage times

  perf c2c:

   - Prep work to support ARM systems

  perf annotate:

   - Add --percent-limit option

  perf lock:

   - Add -t/--thread option for report

   - Do not discard broken lock stats

  perf bench:

   - Add breakpoint benchmarks

  perf test:

   - Limit to only run executable scripts in tests

   - Add basic perf record tests

   - Add stat record+report test

   - Add basic stat and topdown group test

   - Skip several tests when the user hasn't permission to perform them

   - Fix test case 81 ("perf record tests") on s390x

  perf version:

   - debuginfod support improvements

  perf scripting python:

   - Expose symbol offset and source information

  perf build:

   - Error for BPF skeletons without LIBBPF

   - Use Python devtools for version autodetection rather than runtime

  Miscellaneous:

   - Add riscv64 support to 'perf jitdump'

   - Various fixes/tidy ups related to cpu_map

   - Fixes for handling Intel hybrid systems"

* tag 'perf-tools-for-v5.19-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (122 commits)
  perf intel-pt: Add guest_code support
  perf kvm report: Add guest_code support
  perf script: Add guest_code support
  perf tools: Add guest_code support
  perf tools: Factor out thread__set_guest_comm()
  perf tools: Add machine to machines back pointer
  perf vendors events arm64: Update Cortex A57/A72
  perf vendors events arm64: Arm Neoverse E1
  perf vendors events arm64: Arm Cortex-X2
  perf vendors events arm64: Arm Cortex-X1
  perf vendors events arm64: Arm Cortex-A710
  perf vendors events arm64: Arm Cortex-A78
  perf vendors events arm64: Arm Cortex-A77
  perf vendors events arm64: Arm Cortex-A75
  perf vendors events arm64: Arm Cortex-A73
  perf vendors events arm64: Arm Cortex-A65
  perf vendors events arm64: Arm Cortex-A510
  perf vendors events arm64: Arm Cortex-A55
  perf vendors events arm64: Arm Cortex-A35
  perf vendors events arm64: Arm Cortex-A34
  ...
2022-05-25 14:46:09 -07:00
Linus Torvalds
7e062cda7d Networking changes for 5.19.
Core
 ----
 
  - Support TCPv6 segmentation offload with super-segments larger than
    64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).
 
  - Generalize skb freeing deferral to per-cpu lists, instead of
    per-socket lists.
 
  - Add a netdev statistic for packets dropped due to L2 address
    mismatch (rx_otherhost_dropped).
 
  - Continue work annotating skb drop reasons.
 
  - Accept alternative netdev names (ALT_IFNAME) in more netlink
    requests.
 
  - Add VLAN support for AF_PACKET SOCK_RAW GSO.
 
  - Allow receiving skb mark from the socket as a cmsg.
 
  - Enable memcg accounting for veth queues, sysctl tables and IPv6.
 
 BPF
 ---
 
  - Add libbpf support for User Statically-Defined Tracing (USDTs).
 
  - Speed up symbol resolution for kprobes multi-link attachments.
 
  - Support storing typed pointers to referenced and unreferenced
    objects in BPF maps.
 
  - Add support for BPF link iterator.
 
  - Introduce access to remote CPU map elements in BPF per-cpu map.
 
  - Allow middle-of-the-road settings for the
    kernel.unprivileged_bpf_disabled sysctl.
 
  - Implement basic types of dynamic pointers e.g. to allow for
    dynamically sized ringbuf reservations without extra memory copies.
 
 Protocols
 ---------
 
  - Retire port only listening_hash table, add a second bind table
    hashed by port and address. Avoid linear list walk when binding
    to very popular ports (e.g. 443).
 
  - Add bridge FDB bulk flush filtering support allowing user space
    to remove all FDB entries matching a condition.
 
  - Introduce accept_unsolicited_na sysctl for IPv6 to implement
    router-side changes for RFC9131.
 
  - Support for MPTCP path manager in user space.
 
  - Add MPTCP support for fallback to regular TCP for connections
    that have never connected additional subflows or transmitted
    out-of-sequence data (partial support for RFC8684 fallback).
 
  - Avoid races in MPTCP-level window tracking, stabilize and improve
    throughput.
 
  - Support lockless operation of GRE tunnels with seq numbers enabled.
 
  - WiFi support for host based BSS color collision detection.
 
  - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.
 
  - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).
 
  - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).
 
  - Allow matching on the number of VLAN tags via tc-flower.
 
  - Add tracepoint for tcp_set_ca_state().
 
 Driver API
 ----------
 
  - Improve error reporting from classifier and action offload.
 
  - Add support for listing line cards in switches (devlink).
 
  - Add helpers for reporting page pool statistics with ethtool -S.
 
  - Add support for reading clock cycles when using PTP virtual clocks,
    instead of having the driver convert to time before reporting.
    This makes it possible to report time from different vclocks.
 
  - Support configuring low-latency Tx descriptor push via ethtool.
 
  - Separate Clause 22 and Clause 45 MDIO accesses more explicitly.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
    - Sunplus SP7021 SoC (sp7021_emac)
    - Add support for Renesas RZ/V2M (in ravb)
    - Add support for MediaTek mt7986 switches (in mtk_eth_soc)
 
  - Ethernet PHYs:
    - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
    - TI DP83TD510 PHY
    - Microchip LAN8742/LAN88xx PHYs
 
  - WiFi:
    - Driver for pureLiFi X, XL, XC devices (plfxlc)
    - Driver for Silicon Labs devices (wfx)
    - Support for WCN6750 (in ath11k)
    - Support Realtek 8852ce devices (in rtw89)
 
  - Mobile:
    - MediaTek T700 modems (Intel 5G 5000 M.2 cards)
 
  - CAN:
   - ctucanfd: add support for CTU CAN FD open-source IP core
     from Czech Technical University in Prague
 
 Drivers
 -------
 
  - Delete a number of old drivers still using virt_to_bus().
 
  - Ethernet NICs:
    - intel: support TSO on tunnels MPLS
    - broadcom: support multi-buffer XDP
    - nfp: support VF rate limiting
    - sfc: use hardware tx timestamps for more than PTP
    - mlx5: multi-port eswitch support
    - hyper-v: add support for XDP_REDIRECT
    - atlantic: XDP support (including multi-buffer)
    - macb: improve real-time perf by deferring Tx processing to NAPI
 
  - High-speed Ethernet switches:
    - mlxsw: implement basic line card information querying
    - prestera: add support for traffic policing on ingress and egress
 
  - Embedded Ethernet switches:
    - lan966x: add support for packet DMA (FDMA)
    - lan966x: add support for PTP programmable pins
    - ti: cpsw_new: enable bc/mc storm prevention
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - Wake-on-WLAN support for QCA6390 and WCN6855
    - device recovery (firmware restart) support
    - support setting Specific Absorption Rate (SAR) for WCN6855
    - read country code from SMBIOS for WCN6855/QCA6390
    - enable keep-alive during WoWLAN suspend
    - implement remain-on-channel support
 
  - MediaTek WiFi (mt76):
    - support Wireless Ethernet Dispatch offloading packet movement
      between the Ethernet switch and WiFi interfaces
    - non-standard VHT MCS10-11 support
    - mt7921 AP mode support
    - mt7921 IPv6 NS offload support
 
  - Ethernet PHYs:
    - micrel: ksz9031/ksz9131: cabletest support
    - lan87xx: SQI support for T1 PHYs
    - lan937x: add interrupt support for link detection
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmKNMPQACgkQMUZtbf5S
 IrsRARAAuDyYs6jFYB3p+xazZdOnbF4iAgVv71+DQGvmsCl6CB9OrsNZMlvE85OL
 Q3gjcRbgjrkN4lhgI8DmiGYbsUJnAvVjFdNjccz1Z/vTLYvuIM0ol54MUp5S+9WY
 StncOJkOGJxxR/Gi5gzVmejPDsysU3Jik+hm/fpIcz8pybXxAsFKU5waY5qfl+/T
 TZepfV0VCfqRDjqcF1qA5+jJZNU8pdodQlZ1+mh8bwu6Jk1ZkWkj6Ov8MWdwQldr
 LnPeK/9hIGzkdJYHZfajxA3t8D0K5CHzSuih2bJ9ry8ZXgVBkXEThew778/R5izW
 uB0YZs9COFlrIP7XHjtRTy/2xHOdYIPlj2nWhVdfuQDX8Crvt4VRN6EZ1rjko1ZJ
 WanfG6WHF8NH5pXBRQbh3kIMKBnYn6OIzuCfCQSqd+niHcxFIM4vRiggeXI5C5TW
 vJgEWfK6X+NfDiFVa3xyCrEmp5ieA/pNecpwd8rVkql+MtFAAw4vfsotLKOJEAru
 J/XL6UE+YuLqIJV9ACZ9x1AFXXAo661jOxBunOo4VXhXVzWS9lYYz5r5ryIkgT/8
 /Fr0zjANJWgfIuNdIBtYfQ4qG+LozGq038VA06RhFUAZ5tF9DzhqJs2Q2AFuWWBC
 ewCePJVqo1j2Ceq2mGonXRt47OEnlePoOxTk9W+cKZb7ZWE+zEo=
 =Wjii
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Support TCPv6 segmentation offload with super-segments larger than
     64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).

   - Generalize skb freeing deferral to per-cpu lists, instead of
     per-socket lists.

   - Add a netdev statistic for packets dropped due to L2 address
     mismatch (rx_otherhost_dropped).

   - Continue work annotating skb drop reasons.

   - Accept alternative netdev names (ALT_IFNAME) in more netlink
     requests.

   - Add VLAN support for AF_PACKET SOCK_RAW GSO.

   - Allow receiving skb mark from the socket as a cmsg.

   - Enable memcg accounting for veth queues, sysctl tables and IPv6.

  BPF
  ---

   - Add libbpf support for User Statically-Defined Tracing (USDTs).

   - Speed up symbol resolution for kprobes multi-link attachments.

   - Support storing typed pointers to referenced and unreferenced
     objects in BPF maps.

   - Add support for BPF link iterator.

   - Introduce access to remote CPU map elements in BPF per-cpu map.

   - Allow middle-of-the-road settings for the
     kernel.unprivileged_bpf_disabled sysctl.

   - Implement basic types of dynamic pointers e.g. to allow for
     dynamically sized ringbuf reservations without extra memory copies.

  Protocols
  ---------

   - Retire port only listening_hash table, add a second bind table
     hashed by port and address. Avoid linear list walk when binding to
     very popular ports (e.g. 443).

   - Add bridge FDB bulk flush filtering support allowing user space to
     remove all FDB entries matching a condition.

   - Introduce accept_unsolicited_na sysctl for IPv6 to implement
     router-side changes for RFC9131.

   - Support for MPTCP path manager in user space.

   - Add MPTCP support for fallback to regular TCP for connections that
     have never connected additional subflows or transmitted
     out-of-sequence data (partial support for RFC8684 fallback).

   - Avoid races in MPTCP-level window tracking, stabilize and improve
     throughput.

   - Support lockless operation of GRE tunnels with seq numbers enabled.

   - WiFi support for host based BSS color collision detection.

   - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.

   - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).

   - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).

   - Allow matching on the number of VLAN tags via tc-flower.

   - Add tracepoint for tcp_set_ca_state().

  Driver API
  ----------

   - Improve error reporting from classifier and action offload.

   - Add support for listing line cards in switches (devlink).

   - Add helpers for reporting page pool statistics with ethtool -S.

   - Add support for reading clock cycles when using PTP virtual clocks,
     instead of having the driver convert to time before reporting. This
     makes it possible to report time from different vclocks.

   - Support configuring low-latency Tx descriptor push via ethtool.

   - Separate Clause 22 and Clause 45 MDIO accesses more explicitly.

  New hardware / drivers
  ----------------------

   - Ethernet:
      - Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
      - Sunplus SP7021 SoC (sp7021_emac)
      - Add support for Renesas RZ/V2M (in ravb)
      - Add support for MediaTek mt7986 switches (in mtk_eth_soc)

   - Ethernet PHYs:
      - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
      - TI DP83TD510 PHY
      - Microchip LAN8742/LAN88xx PHYs

   - WiFi:
      - Driver for pureLiFi X, XL, XC devices (plfxlc)
      - Driver for Silicon Labs devices (wfx)
      - Support for WCN6750 (in ath11k)
      - Support Realtek 8852ce devices (in rtw89)

   - Mobile:
      - MediaTek T700 modems (Intel 5G 5000 M.2 cards)

   - CAN:
      - ctucanfd: add support for CTU CAN FD open-source IP core from
        Czech Technical University in Prague

  Drivers
  -------

   - Delete a number of old drivers still using virt_to_bus().

   - Ethernet NICs:
      - intel: support TSO on tunnels MPLS
      - broadcom: support multi-buffer XDP
      - nfp: support VF rate limiting
      - sfc: use hardware tx timestamps for more than PTP
      - mlx5: multi-port eswitch support
      - hyper-v: add support for XDP_REDIRECT
      - atlantic: XDP support (including multi-buffer)
      - macb: improve real-time perf by deferring Tx processing to NAPI

   - High-speed Ethernet switches:
      - mlxsw: implement basic line card information querying
      - prestera: add support for traffic policing on ingress and egress

   - Embedded Ethernet switches:
      - lan966x: add support for packet DMA (FDMA)
      - lan966x: add support for PTP programmable pins
      - ti: cpsw_new: enable bc/mc storm prevention

   - Qualcomm 802.11ax WiFi (ath11k):
      - Wake-on-WLAN support for QCA6390 and WCN6855
      - device recovery (firmware restart) support
      - support setting Specific Absorption Rate (SAR) for WCN6855
      - read country code from SMBIOS for WCN6855/QCA6390
      - enable keep-alive during WoWLAN suspend
      - implement remain-on-channel support

   - MediaTek WiFi (mt76):
      - support Wireless Ethernet Dispatch offloading packet movement
        between the Ethernet switch and WiFi interfaces
      - non-standard VHT MCS10-11 support
      - mt7921 AP mode support
      - mt7921 IPv6 NS offload support

   - Ethernet PHYs:
      - micrel: ksz9031/ksz9131: cabletest support
      - lan87xx: SQI support for T1 PHYs
      - lan937x: add interrupt support for link detection"

* tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits)
  ptp: ocp: Add firmware header checks
  ptp: ocp: fix PPS source selector debugfs reporting
  ptp: ocp: add .init function for sma_op vector
  ptp: ocp: vectorize the sma accessor functions
  ptp: ocp: constify selectors
  ptp: ocp: parameterize input/output sma selectors
  ptp: ocp: revise firmware display
  ptp: ocp: add Celestica timecard PCI ids
  ptp: ocp: Remove #ifdefs around PCI IDs
  ptp: ocp: 32-bit fixups for pci start address
  Revert "net/smc: fix listen processing for SMC-Rv2"
  ath6kl: Use cc-disable-warning to disable -Wdangling-pointer
  selftests/bpf: Dynptr tests
  bpf: Add dynptr data slices
  bpf: Add bpf_dynptr_read and bpf_dynptr_write
  bpf: Dynptr support for ring buffers
  bpf: Add bpf_dynptr_from_mem for local dynptrs
  bpf: Add verifier support for dynptrs
  bpf: Suppress 'passing zero to PTR_ERR' warning
  bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
  ...
2022-05-25 12:22:58 -07:00
Linus Torvalds
f4fb859665 Thermal control updates for 5.19-rc1
- Add thermal library and thermal tools to encapsulate the netlink
    into event based callbacks (Daniel Lezcano, Jiapeng Chong).
 
  - Improve overheat condition handling during suspend-to-idle in the
    Intel PCH thermal driver (Zhang Rui).
 
  - Use local ops instead of global ops in devfreq_cooling (Kant Fan).
 
  - Clean up _OSC handling in int340x (Davidlohr Bueso).
 
  - Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
    (Hesham Almatary).
 
  - Add new k3 j72xx bangdap driver and the corresponding bindings
    (Keerthy).
 
  - Fix missing of_node_put() in the SC iMX driver at probe time
    (Miaoqian Lin).
 
  - Fix memory leak in __thermal_cooling_device_register() when
    device_register() fails by calling thermal_cooling_device_destroy_sysfs()
    (Yang Yingliang).
 
  - Add sc8180x and sc8280xp compatible string in the DT bindings and
    lMH support for QCom tsens driver (Bjorn Andersson).
 
  - Fix OTP Calibration Register values conforming to the documentation
    on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
 
  - Fix type in kerneldoc description for __thermal_bind_params (Corentin
    Labbe).
 
  - Fix potential NULL dereference in sr_thermal_probe() on Broadcom
    platform (Zheng Yongjun).
 
  - Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
    Pallikunhi).
 
  - Fix non-negative value support by preventing the value to be clamp
    to zero (Stefan Wahren).
 
  - Add compatible string and DT bindings for MSM8960 tsens driver
    (Dmitry Baryshkov).
 
  - Add hwmon support for K3 driver (Massimiliano Minella).
 
  - Refactor and add multiple generations support for QCom ADC driver
    (Jishnu Prakash).
 
  - Use platform_get_irq_optional() to get the interrupt on RCar driver and
    document Document RZ/V2L bindings (Lad Prabhakar).
 
  - Remove NULL check after container_of() call from the Intel HFI
    thermal driver (Haowen Bai).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL3MESHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxgzsQAKj4D0ROHLdqReYeIutS7IU6YQ+ejh3N
 krcAZvxQd8B3sVwSHHABPdsHwwfYjP4BwoHHVtha5q+5zFGBq5F3fJHsDeRxN23T
 NEOW81HBaLu81N5gNuAYg+NQcdqBR4U4IGKZfxWRyx13OMECXGtJRb/SIW/TuDYk
 AQIrqOivVEsVRtn+qp+x0rKHsj6ge+2lU1OyJagr2BDhgLrwzxl6cmNBfali1C3D
 sw4SqGwGVjEB0QDDpMbty9BsLEjNRE86kwoPh2u0K9dT9/b/P9pk1lp+pOnltspZ
 eOdwr0CtoEHw2x5tuhbO7fmvNIuf5jgSDWsCrP/xYnsozcRiwWsgyeJj4f5soreN
 kTJB8S/FH+fd7UuYqdmDIeJpDnkkGt1jIqjkDvfJNL7ffBWIe/meXKF4vBnrdlbd
 FMNQwzgc5eug07IFjOE43V1v5Hw1H1leVUwZczOMGNIhqyy0WZyQr7vzwbr16jCO
 x0hu3q3Y++5SUAUy9WzsOOrpRHa9JxUwVhZuLG7ajf+6pqPxB8kqs9CJe+sgWKju
 T4KqHbljJ2Gnkm9SngT0BdnB+AgparhXf8/8Mj0ZBQvamXw+ylNbYfG7qTZk5Ne/
 rUf4YDew6hGDqikyJBe2e0a1lLSoN4zaADC1lLYhC+Zp7m6eK0r3y3mhic0bZe1c
 RU/2eeA4kTuZ
 =3ltV
 -----END PGP SIGNATURE-----

Merge tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "These add a thermal library and thermal tools to wrap the netlink
  interface into event-based callbacks, improve overheat condition
  handling during suspend-to-idle on Intel SoCs, add some new hardware
  support, fix bugs and clean up code.

  Specifics:

   - Add thermal library and thermal tools to encapsulate the netlink
     into event based callbacks (Daniel Lezcano, Jiapeng Chong).

   - Improve overheat condition handling during suspend-to-idle in the
     Intel PCH thermal driver (Zhang Rui).

   - Use local ops instead of global ops in devfreq_cooling (Kant Fan).

   - Clean up _OSC handling in int340x (Davidlohr Bueso).

   - Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
     (Hesham Almatary).

   - Add new k3 j72xx bangdap driver and the corresponding bindings
     (Keerthy).

   - Fix missing of_node_put() in the SC iMX driver at probe time
     (Miaoqian Lin).

   - Fix memory leak in __thermal_cooling_device_register()
     when device_register() fails by calling
     thermal_cooling_device_destroy_sysfs() (Yang Yingliang).

   - Add sc8180x and sc8280xp compatible string in the DT bindings and
     lMH support for QCom tsens driver (Bjorn Andersson).

   - Fix OTP Calibration Register values conforming to the documentation
     on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).

   - Fix type in kerneldoc description for __thermal_bind_params
     (Corentin Labbe).

   - Fix potential NULL dereference in sr_thermal_probe() on Broadcom
     platform (Zheng Yongjun).

   - Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
     Pallikunhi).

   - Fix non-negative value support by preventing the value to be clamp
     to zero (Stefan Wahren).

   - Add compatible string and DT bindings for MSM8960 tsens driver
     (Dmitry Baryshkov).

   - Add hwmon support for K3 driver (Massimiliano Minella).

   - Refactor and add multiple generations support for QCom ADC driver
     (Jishnu Prakash).

   - Use platform_get_irq_optional() to get the interrupt on RCar driver
     and document Document RZ/V2L bindings (Lad Prabhakar).

   - Remove NULL check after container_of() call from the Intel HFI
     thermal driver (Haowen Bai)"

* tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (38 commits)
  thermal: intel: pch: improve the cooling delay log
  thermal: intel: pch: enhance overheat handling
  thermal: intel: pch: move cooling delay to suspend_noirq phase
  PM: wakeup: expose pm_wakeup_pending to modules
  thermal: k3_j72xx_bandgap: Add the bandgap driver support
  dt-bindings: thermal: k3-j72xx: Add VTM bindings documentation
  thermal/drivers/imx_sc_thermal: Fix refcount leak in imx_sc_thermal_probe
  thermal/core: Fix memory leak in __thermal_cooling_device_register()
  dt-bindings: thermal: tsens: Add sc8280xp compatible
  dt-bindings: thermal: lmh: Add Qualcomm sc8180x compatible
  thermal/drivers/qcom/lmh: Add sc8180x compatible
  thermal/drivers/rz2gl: Fix OTP Calibration Register values
  dt-bindings: thermal: rzg2l-thermal: Document RZ/G2UL bindings
  thermal: thermal_of: fix typo on __thermal_bind_params
  tools/thermal: remove unneeded semicolon
  tools/lib/thermal: remove unneeded semicolon
  thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe
  tools/thermal: Add thermal daemon skeleton
  tools/thermal: Add a temperature capture tool
  tools/thermal: Add util library
  ...
2022-05-24 16:19:30 -07:00
Julia Lawall
bb412cf1d7 libbpf: Fix typo in comment
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220521111145.81697-71-Julia.Lawall@inria.fr
2022-05-23 11:24:50 -07:00
Adrian Hunter
618ee7838e libperf: Add preadn()
Add preadn() to provide pread() and readn() semantics.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/ab8918a4-7ac8-a37e-2e2c-28438c422d87@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:11:12 -03:00
Ian Rogers
e696f6dbbf perf cpumap: Add perf_cpu_map__for_each_idx()
A variant of perf_cpu_map__for_each_cpu() that just iterates index values
without the corresponding load of the CPU.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220519032005.1273691-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:52:38 -03:00
Andrii Nakryiko
d16495a982 libbpf: remove bpf_create_map*() APIs
To test API removal, get rid of bpf_create_map*() APIs. Perf defines
__weak implementation of bpf_map_create() that redirects to old
bpf_create_map() and that seems to compile and run fine.

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-19 09:03:31 -07:00
Andrii Nakryiko
e2371b1632 libbpf: start 1.0 development cycle
Start libbpf 1.0 development cycle by adding LIBBPF_1.0.0 section to
libbpf.map file and marking all current symbols as local. As we remove
all the deprecated APIs we'll populate global list before the final 1.0
release.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-19 09:03:31 -07:00
Andrii Nakryiko
056431ae4d libbpf: fix up global symbol counting logic
Add the same negative ABS filter that we use in VERSIONED_SYM_COUNT to
filter out ABS symbols like LIBBPF_0.8.0.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-19 09:03:31 -07:00
Jiapeng Chong
f21b57eb12 tools/lib/thermal: remove unneeded semicolon
Fix the following coccicheck warnings:

./tools/lib/thermal/commands.c:215:2-3: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220427030619.81556-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
Daniel Lezcano
47c4b0de08 tools/lib/thermal: Add a thermal library
The thermal framework implements a netlink notification mechanism to
be used by the userspace to have a thermal configuration discovery,
trip point changes or violation, cooling device changes notifications,
etc...

This library provides a level of abstraction for the thermal netlink
notification allowing the userspace to connect to the notification
mechanism more easily. The library is callback oriented.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220420160933.347088-2-daniel.lezcano@linaro.org
2022-05-19 12:11:51 +02:00
Andrii Nakryiko
ac6a65868a libbpf: fix memory leak in attach_tp for target-less tracepoint program
Fix sec_name memory leak if user defines target-less SEC("tp").

Fixes: 9af8efc45e ("libbpf: Allow "incomplete" basic tracing SEC() definitions")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20220516184547.3204674-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-16 13:42:44 -07:00
Andrii Nakryiko
737d0646a8 libbpf: Add safer high-level wrappers for map operations
Add high-level API wrappers for most common and typical BPF map
operations that works directly on instances of struct bpf_map * (so
you don't have to call bpf_map__fd()) and validate key/value size
expectations.

These helpers require users to specify key (and value, where
appropriate) sizes when performing lookup/update/delete/etc. This forces
user to actually think and validate (for themselves) those. This is
a good thing as user is expected by kernel to implicitly provide correct
key/value buffer sizes and kernel will just read/write necessary amount
of data. If it so happens that user doesn't set up buffers correctly
(which bit people for per-CPU maps especially) kernel either randomly
overwrites stack data or return -EFAULT, depending on user's luck and
circumstances. These high-level APIs are meant to prevent such
unpleasant and hard to debug bugs.

This patch also adds bpf_map_delete_elem_flags() low-level API and
requires passing flags to bpf_map__delete_elem() API for consistency
across all similar APIs, even though currently kernel doesn't expect
any extra flags for BPF_MAP_DELETE_ELEM operation.

List of map operations that get these high-level APIs:

  - bpf_map_lookup_elem;
  - bpf_map_update_elem;
  - bpf_map_delete_elem;
  - bpf_map_lookup_and_delete_elem;
  - bpf_map_get_next_key.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220512220713.2617964-1-andrii@kernel.org
2022-05-13 15:15:02 +02:00
Jiri Olsa
b63b3c490e libbpf: Add bpf_program__set_insns function
Adding bpf_program__set_insns that allows to set new instructions
for a BPF program.

This is a very advanced libbpf API and users need to know what
they are doing. This should be used from prog_prepare_load_fn
callback only.

We can have changed instructions after calling prog_prepare_load_fn
callback, reloading them.

One of the users of this new API will be perf's internal BPF prologue
generation.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510074659.2557731-2-jolsa@kernel.org
2022-05-11 14:15:17 +02:00
Andrii Nakryiko
5eefe17c7a libbpf: Clean up ringbuf size adjustment implementation
Drop unused iteration variable, move overflow prevention check into the
for loop.

Fixes: 0087a681fa ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220510185159.754299-1-andrii@kernel.org
2022-05-11 14:06:29 +02:00
Kui-Feng Lee
129b9c5ee2 libbpf: Assign cookies to links in libbpf.
Add a cookie field to the attributes of bpf_link_create().
Add bpf_program__attach_trace_opts() to attach a cookie to a link.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-5-kuifeng@fb.com
2022-05-10 21:58:40 -07:00
Adrian Hunter
8f111be643 libperf evlist: Add evsel as a parameter to ->idx()
Add evsel as a parameter to ->idx() in preparation for correctly
determining whether an auxtrace mmap is needed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220506122601.367589-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10 14:26:48 -03:00
Adrian Hunter
d8fe2efb65 libperf evlist: Move ->idx() into mmap_per_evsel()
Move ->idx() into mmap_per_evsel() in preparation for adding evsel as a
parameter.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220506122601.367589-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10 14:26:27 -03:00
Adrian Hunter
6a7b8a5a30 libperf evlist: Remove ->idx() per_cpu parameter
Remove ->idx() per_cpu parameter because it isn't needed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220506122601.367589-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10 14:25:57 -03:00
Adrian Hunter
00632610c2 libperf evsel: Add perf_evsel__enable_thread()
Add perf_evsel__enable_thread() as a counterpart to
perf_evsel__enable_cpu(), to enable all events for a thread.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220506122601.367589-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10 14:19:09 -03:00
Andrii Nakryiko
0087a681fa libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary
Kernel imposes a pretty particular restriction on ringbuf map size. It
has to be a power-of-2 multiple of page size. While generally this isn't
hard for user to satisfy, sometimes it's impossible to do this
declaratively in BPF source code or just plain inconvenient to do at
runtime.

One such example might be BPF libraries that are supposed to work on
different architectures, which might not agree on what the common page
size is.

Let libbpf find the right size for user instead, if it turns out to not
satisfy kernel requirements. If user didn't set size at all, that's most
probably a mistake so don't upsize such zero size to one full page,
though. Also we need to be careful about not overflowing __u32
max_entries.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-9-andrii@kernel.org
2022-05-09 17:15:32 +02:00
Andrii Nakryiko
f760d05379 libbpf: Provide barrier() and barrier_var() in bpf_helpers.h
Add barrier() and barrier_var() macros into bpf_helpers.h to be used by
end users. While a bit advanced and specialized instruments, they are
sometimes indispensable. Instead of requiring each user to figure out
exact asm volatile incantations for themselves, provide them from
bpf_helpers.h.

Also remove conflicting definitions from selftests. Some tests rely on
barrier_var() definition being nothing, those will still work as libbpf
does the #ifndef/#endif guarding for barrier() and barrier_var(),
allowing users to redefine them, if necessary.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-8-andrii@kernel.org
2022-05-09 17:15:32 +02:00
Andrii Nakryiko
7715f549a9 libbpf: Complete field-based CO-RE helpers with field offset helper
Add bpf_core_field_offset() helper to complete field-based CO-RE
helpers. This helper can be useful for feature-detection and for some
more advanced cases of field reading (e.g., reading flexible array members).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-6-andrii@kernel.org
2022-05-09 17:15:32 +02:00
Andrii Nakryiko
73d0280f6b libbpf: Improve usability of field-based CO-RE helpers
Allow to specify field reference in two ways:

  - if user has variable of necessary type, they can use variable-based
    reference (my_var.my_field or my_var_ptr->my_field). This was the
    only supported syntax up till now.
  - now, bpf_core_field_exists() and bpf_core_field_size() support also
    specifying field in a fashion similar to offsetof() macro, by
    specifying type of the containing struct/union separately and field
    name separately: bpf_core_field_exists(struct my_type, my_field).
    This forms is quite often more convenient in practice and it matches
    type-based CO-RE helpers that support specifying type by its name
    without requiring any variables.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-4-andrii@kernel.org
2022-05-09 17:15:28 +02:00
Andrii Nakryiko
8e2f618e8b libbpf: Make __kptr and __kptr_ref unconditionally use btf_type_tag() attr
It will be annoying and surprising for users of __kptr and __kptr_ref if
libbpf silently ignores them just because Clang used for compilation
didn't support btf_type_tag(). It's much better to get clear compiler
error than debug BPF verifier failures later on.

Fixes: ef89654f2b ("libbpf: Add kptr type tag macros to bpf_helpers.h")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-3-andrii@kernel.org
2022-05-09 17:14:40 +02:00
Ian Rogers
33cd692803 perf evlist: Clear all_cpus before propagating
all_cpus is merged into during propagation. Initially all_cpus is set
from PMU sysfs. perf_evlist__set_maps() will recompute it and change
evsel->cpus to user_requested_cpus if they are given.

If all_cpus isn't cleared then the union of the user_requested_cpus and
PMU sysfs values is set to all_cpus, whereas just user_requested_cpus is
necessary.

To avoid this make all_cpus empty prior to propagation.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20220503041757.2365696-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-05 14:38:01 -03:00
Andrii Nakryiko
ec41817b4a libbpf: Allow to opt-out from creating BPF maps
Add bpf_map__set_autocreate() API that allows user to opt-out from
libbpf automatically creating BPF map during BPF object load.

This is a useful feature when building CO-RE-enabled BPF application
that takes advantage of some new-ish BPF map type (e.g., socket-local
storage) if kernel supports it, but otherwise uses some alternative way
(e.g., extra HASH map). In such case, being able to disable the creation
of a map that kernel doesn't support allows to successfully create and
load BPF object file with all its other maps and programs.

It's still up to user to make sure that no "live" code in any of their BPF
programs are referencing such map instance, which can be achieved by
guarding such code with CO-RE relocation check or by using .rodata
global variables.

If user fails to properly guard such code to turn it into "dead code",
libbpf will helpfully post-process BPF verifier log and will provide
more meaningful error and map name that needs to be guarded properly. As
such, instead of:

  ; value = bpf_map_lookup_elem(&missing_map, &zero);
  4: (85) call unknown#2001000000
  invalid func unknown#2001000000

... user will see:

  ; value = bpf_map_lookup_elem(&missing_map, &zero);
  4: <invalid BPF map reference>
  BPF map 'missing_map' is referenced but wasn't created

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-4-andrii@kernel.org
2022-04-28 20:03:29 -07:00
Andrii Nakryiko
69721203b1 libbpf: Use libbpf_mem_ensure() when allocating new map
Reuse libbpf_mem_ensure() when adding a new map to the list of maps
inside bpf_object. It takes care of proper resizing and reallocating of
map array and zeroing out newly allocated memory.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-3-andrii@kernel.org
2022-04-28 20:03:29 -07:00
Andrii Nakryiko
b198881d4b libbpf: Append "..." in fixed up log if CO-RE spec is truncated
Detect CO-RE spec truncation and append "..." to make user aware that
there was supposed to be more of the spec there.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-2-andrii@kernel.org
2022-04-28 20:03:29 -07:00
Andrii Nakryiko
cc7d8f2c8e libbpf: Support target-less SEC() definitions for BTF-backed programs
Similar to previous patch, support target-less definitions like
SEC("fentry"), SEC("freplace"), etc. For such BTF-backed program types
it is expected that user will specify BTF target programmatically at
runtime using bpf_program__set_attach_target() *before* load phase. If
not, libbpf will report this as an error.

Aslo use SEC_ATTACH_BTF flag instead of explicitly listing a set of
types that are expected to require attach_btf_id. This was an accidental
omission during custom SEC() support refactoring.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220428185349.3799599-3-andrii@kernel.org
2022-04-28 23:46:04 +02:00
Andrii Nakryiko
9af8efc45e libbpf: Allow "incomplete" basic tracing SEC() definitions
In a lot of cases the target of kprobe/kretprobe, tracepoint, raw
tracepoint, etc BPF program might not be known at the compilation time
and will be discovered at runtime. This was always a supported case by
libbpf, with APIs like bpf_program__attach_{kprobe,tracepoint,etc}()
accepting full target definition, regardless of what was defined in
SEC() definition in BPF source code.

Unfortunately, up till now libbpf still enforced users to specify at
least something for the fake target, e.g., SEC("kprobe/whatever"), which
is cumbersome and somewhat misleading.

This patch allows target-less SEC() definitions for basic tracing BPF
program types:

  - kprobe/kretprobe;
  - multi-kprobe/multi-kretprobe;
  - tracepoints;
  - raw tracepoints.

Such target-less SEC() definitions are meant to specify declaratively
proper BPF program type only. Attachment of them will have to be handled
programmatically using correct APIs. As such, skeleton's auto-attachment
of such BPF programs is skipped and generic bpf_program__attach() will
fail, if attempted, due to the lack of enough target information.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220428185349.3799599-2-andrii@kernel.org
2022-04-28 23:45:59 +02:00
Jakub Kicinski
50c6afabfd Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-04-27

We've added 85 non-merge commits during the last 18 day(s) which contain
a total of 163 files changed, 4499 insertions(+), 1521 deletions(-).

The main changes are:

1) Teach libbpf to enhance BPF verifier log with human-readable and relevant
   information about failed CO-RE relocations, from Andrii Nakryiko.

2) Add typed pointer support in BPF maps and enable it for unreferenced pointers
   (via probe read) and referenced ones that can be passed to in-kernel helpers,
   from Kumar Kartikeya Dwivedi.

3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward
   progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel.

4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced
   the effective prog array before the rcu_read_lock, from Stanislav Fomichev.

5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic
   for USDT arguments under riscv{32,64}, from Pu Lehui.

6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire.

7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage
   so it can be shipped under Alpine which is musl-based, from Dominique Martinet.

8) Clean up {sk,task,inode} local storage trace RCU handling as they do not
   need to use call_rcu_tasks_trace() barrier, from KP Singh.

9) Improve libbpf API documentation and fix error return handling of various
   API functions, from Grant Seltzer.

10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data
    length of frags + frag_list may surpass old offset limit, from Liu Jian.

11) Various improvements to prog_tests in area of logging, test execution
    and by-name subtest selection, from Mykola Lysenko.

12) Simplify map_btf_id generation for all map types by moving this process
    to build time with help of resolve_btfids infra, from Menglong Dong.

13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*()
    helpers; the probing caused always to use old helpers, from Runqing Yang.

14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS
    tracing macros, from Vladimir Isaev.

15) Cleanup BPF selftests to remove old & unneeded rlimit code given kernel
    switched to memcg-based memory accouting a while ago, from Yafang Shao.

16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu.

17) Fix BPF selftests in two occasions to work around regressions caused by latest
    LLVM to unblock CI until their fixes are worked out, from Yonghong Song.

18) Misc cleanups all over the place, from various others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
  selftests/bpf: Add libbpf's log fixup logic selftests
  libbpf: Fix up verifier log for unguarded failed CO-RE relos
  libbpf: Simplify bpf_core_parse_spec() signature
  libbpf: Refactor CO-RE relo human description formatting routine
  libbpf: Record subprog-resolved CO-RE relocations unconditionally
  selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
  libbpf: Avoid joining .BTF.ext data with BPF programs by section name
  libbpf: Fix logic for finding matching program for CO-RE relocation
  libbpf: Drop unhelpful "program too large" guess
  libbpf: Fix anonymous type check in CO-RE logic
  bpf: Compute map_btf_id during build time
  selftests/bpf: Add test for strict BTF type check
  selftests/bpf: Add verifier tests for kptr
  selftests/bpf: Add C tests for kptr
  libbpf: Add kptr type tag macros to bpf_helpers.h
  bpf: Make BTF type match stricter for release arguments
  bpf: Teach verifier about kptr_get kfunc helpers
  bpf: Wire up freeing of referenced kptr
  bpf: Populate pairs of btf_id and destructor kfunc in btf
  bpf: Adapt copy_map_value for multiple offset case
  ...
====================

Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27 17:09:32 -07:00
Andrii Nakryiko
9fdc4273b8 libbpf: Fix up verifier log for unguarded failed CO-RE relos
Teach libbpf to post-process BPF verifier log on BPF program load
failure and detect known error patterns to provide user with more
context.

Currently there is one such common situation: an "unguarded" failed BPF
CO-RE relocation. While failing CO-RE relocation is expected, it is
expected to be property guarded in BPF code such that BPF verifier
always eliminates BPF instructions corresponding to such failed CO-RE
relos as dead code. In cases when user failed to take such precautions,
BPF verifier provides the best log it can:

  123: (85) call unknown#195896080
  invalid func unknown#195896080

Such incomprehensible log error is due to libbpf "poisoning" BPF
instruction that corresponds to failed CO-RE relocation by replacing it
with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
"bad relo" if you squint hard enough).

Luckily, libbpf has all the necessary information to look up CO-RE
relocation that failed and provide more human-readable description of
what's going on:

  5: <invalid CO-RE relocation>
  failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)

This hopefully makes it much easier to understand what's wrong with
user's BPF program without googling magic constants.

This BPF verifier log fixup is setup to be extensible and is going to be
used for at least one other upcoming feature of libbpf in follow up patches.
Libbpf is parsing lines of BPF verifier log starting from the very end.
Currently it processes up to 10 lines of code looking for familiar
patterns. This avoids wasting lots of CPU processing huge verifier logs
(especially for log_level=2 verbosity level). Actual verification error
should normally be found in last few lines, so this should work
reliably.

If libbpf needs to expand log beyond available log_buf_size, it
truncates the end of the verifier log. Given verifier log normally ends
with something like:

  processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

... truncating this on program load error isn't too bad (end user can
always increase log size, if it needs to get complete log).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
14032f2644 libbpf: Simplify bpf_core_parse_spec() signature
Simplify bpf_core_parse_spec() signature to take struct bpf_core_relo as
an input instead of requiring callers to decompose them into type_id,
relo, spec_str, etc. This makes using and reusing this helper easier.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
b58af63aab libbpf: Refactor CO-RE relo human description formatting routine
Refactor how CO-RE relocation is formatted. Now it dumps human-readable
representation, currently used by libbpf in either debug or error
message output during CO-RE relocation resolution process, into provided
buffer. This approach allows for better reuse of this functionality
outside of CO-RE relocation resolution, which we'll use in next patch
for providing better error message for BPF verifier rejecting BPF
program due to unguarded failed CO-RE relocation.

It also gets rid of annoying "stitching" of libbpf_print() calls, which
was the only place where we did this.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
185cfe837f libbpf: Record subprog-resolved CO-RE relocations unconditionally
Previously, libbpf recorded CO-RE relocations with insns_idx resolved
according to finalized subprog locations (which are appended at the end
of entry BPF program) to simplify the job of light skeleton generator.

This is necessary because once subprogs' instructions are appended to
main entry BPF program all the subprog instruction indices are shifted
and that shift is different for each entry (main) BPF program, so it's
generally impossible to map final absolute insn_idx of the finalized BPF
program to their original locations inside subprograms.

This information is now going to be used not only during light skeleton
generation, but also to map absolute instruction index to subprog's
instruction and its corresponding CO-RE relocation. So start recording
these relocations always, not just when obj->gen_loader is set.

This information is going to be freed at the end of bpf_object__load()
step, as before (but this can change in the future if there will be
a need for this information post load step).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
11d5daa892 libbpf: Avoid joining .BTF.ext data with BPF programs by section name
Instead of using ELF section names as a joining key between .BTF.ext and
corresponding BPF programs, pre-build .BTF.ext section number to ELF
section index mapping during bpf_object__open() and use it later for
matching .BTF.ext information (func/line info or CO-RE relocations) to
their respective BPF programs and subprograms.

This simplifies corresponding joining logic and let's libbpf do
manipulations with BPF program's ELF sections like dropping leading '?'
character for non-autoloaded programs. Original joining logic in
bpf_object__relocate_core() (see relevant comment that's now removed)
was never elegant, so it's a good improvement regardless. But it also
avoids unnecessary internal assumptions about preserving original ELF
section name as BPF program's section name (which was broken when
SEC("?abc") support was added).

Fixes: a3820c4811 ("libbpf: Support opting out from autoloading BPF programs declaratively")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
966a750932 libbpf: Fix logic for finding matching program for CO-RE relocation
Fix the bug in bpf_object__relocate_core() which can lead to finding
invalid matching BPF program when processing CO-RE relocation. IF
matching program is not found, last encountered program will be assumed
to be correct program and thus error detection won't detect the problem.

Fixes: 9c82a63cf3 ("libbpf: Fix CO-RE relocs against .text section")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
0994a54c52 libbpf: Drop unhelpful "program too large" guess
libbpf pretends it knows actual limit of BPF program instructions based
on UAPI headers it compiled with. There is neither any guarantee that
UAPI headers match host kernel, nor BPF verifier actually uses
BPF_MAXINSNS constant anymore. Just drop unhelpful "guess", BPF verifier
will emit actual reason for failure in its logs anyways.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-3-andrii@kernel.org
2022-04-26 15:41:45 -07:00
Andrii Nakryiko
afe98d46ba libbpf: Fix anonymous type check in CO-RE logic
Use type name for checking whether CO-RE relocation is referring to
anonymous type. Using spec string makes no sense.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-2-andrii@kernel.org
2022-04-26 15:41:45 -07:00
Kumar Kartikeya Dwivedi
ef89654f2b libbpf: Add kptr type tag macros to bpf_helpers.h
Include convenience definitions:
__kptr:	Unreferenced kptr
__kptr_ref: Referenced kptr

Users can use them to tag the pointer type meant to be used with the new
support directly in the map value definition.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-11-memxor@gmail.com
2022-04-25 20:26:44 -07:00
Yuntao Wang
003fed595c libbpf: Remove unnecessary type cast
The link variable is already of type 'struct bpf_link *', casting it to
'struct bpf_link *' is redundant, drop it.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com
2022-04-25 17:39:16 +02:00
Arnaldo Carvalho de Melo
e0c1b8f9eb Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes, such as the llvm one for ubuntu:22.04.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24 07:50:49 -03:00
Adrian Hunter
4bbac9a1f5 libperf evsel: Factor out perf_evsel__ioctl()
Factor out perf_evsel__ioctl() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220422162402.147958-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24 07:50:38 -03:00
Andrii Nakryiko
8462e0b46f libbpf: Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open()
Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() on
older kernels for programs that are attachable through
BPF_RAW_TRACEPOINT_OPEN. This makes bpf_link_create() more unified and
convenient interface for creating bpf_link-based attachments.

With this approach end users can just use bpf_link_create() for
tp_btf/fentry/fexit/fmod_ret/lsm program attachments without needing to
care about kernel support, as libbpf will handle this transparently. On
the other hand, as newer features (like BPF cookie) are added to
LINK_CREATE interface, they will be readily usable though the same
bpf_link_create() API without any major refactoring from user's
standpoint.

bpf_program__attach_btf_id() is now using bpf_link_create() internally
as well and will take advantaged of this unified interface when BPF
cookie is added for fentry/fexit.

Doing proactive feature detection of LINK_CREATE support for
fentry/tp_btf/etc is quite involved. It requires parsing vmlinux BTF,
determining some stable and guaranteed to be in all kernels versions
target BTF type (either raw tracepoint or fentry target function),
actually attaching this program and thus potentially affecting the
performance of the host kernel briefly, etc. So instead we are taking
much simpler "lazy" approach of falling back to
bpf_raw_tracepoint_open() call only if initial LINK_CREATE command
fails. For modern kernels this will mean zero added overhead, while
older kernels will incur minimal overhead with a single fast-failing
LINK_CREATE call.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Kui-Feng Lee <kuifeng@fb.com>
Link: https://lore.kernel.org/bpf/20220421033945.3602803-3-andrii@kernel.org
2022-04-23 00:37:02 +02:00
Josh Poimboeuf
aa3d60e050 libsubcmd: Fix OPTION_GROUP sorting
The OPTION_GROUP option type is a way of grouping certain options
together in the printed usage text.  It happens to be completely broken,
thanks to the fact that the subcmd option sorting just sorts everything,
without regard for grouping.  Luckily, nobody uses this option anyway,
though that will change shortly.

Fix it by sorting each group individually.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/e167ea3a11e2a9800eb062c1fd0f13e9cd05140c.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:01 +02:00
Paolo Abeni
f70925bf99 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/microchip/lan966x/lan966x_main.c
  d08ed85256 ("net: lan966x: Make sure to release ptp interrupt")
  c834963932 ("net: lan966x: Add FDMA functionality")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-22 09:56:00 +02:00
Gaosheng Cui
b71a2ebf74 libbpf: Remove redundant non-null checks on obj_elf
Obj_elf is already non-null checked at the function entry, so remove
redundant non-null checks on obj_elf.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220421031803.2283974-1-cuigaosheng1@huawei.com
2022-04-21 09:56:26 -07:00
Grant Seltzer
a66ab9a9e6 libbpf: Add documentation to API functions
This adds documentation for the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
- bpf_program__set_attach_target()
- bpf_program__attach()
- bpf_program__pin()
- bpf_program__unpin()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com
2022-04-21 16:31:07 +02:00
Grant Seltzer
df28671632 libbpf: Update API functions usage to check error
This updates usage of the following API functions within
libbpf so their newly added error return is checked:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com
2022-04-21 16:28:25 +02:00
Grant Seltzer
93442f132b libbpf: Add error returns to two API functions
This adds an error return to the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

In both cases, the error occurs when the BPF object has
already been loaded when the function is called. In this
case -EBUSY is returned.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com
2022-04-21 16:28:11 +02:00
Pu Lehui
58ca8b0572 libbpf: Support riscv USDT argument parsing logic
Add riscv-specific USDT argument specification parsing logic.
riscv USDT argument format is shown below:
- Memory dereference case:
  "size@off(reg)", e.g. "-8@-88(s0)"
- Constant value case:
  "size@val", e.g. "4@5"
- Register read case:
  "size@reg", e.g. "-8@a1"

s8 will be marked as poison while it's a reg of riscv, we need
to alias it in advance. Both RV32 and RV64 have been tested.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-3-pulehui@huawei.com
2022-04-19 21:59:35 -07:00
Pu Lehui
5af25a410a libbpf: Fix usdt_cookie being cast to 32 bits
The usdt_cookie is defined as __u64, which should not be
used as a long type because it will be cast to 32 bits
in 32-bit platforms.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-2-pulehui@huawei.com
2022-04-19 21:59:35 -07:00
Andrii Nakryiko
a3820c4811 libbpf: Support opting out from autoloading BPF programs declaratively
Establish SEC("?abc") naming convention (i.e., adding question mark in
front of otherwise normal section name) that allows to set corresponding
program's autoload property to false. This is effectively just
a declarative way to do bpf_program__set_autoload(prog, false).

Having a way to do this declaratively in BPF code itself is useful and
convenient for various scenarios. E.g., for testing, when BPF object
consists of multiple independent BPF programs that each needs to be
tested separately. Opting out all of them by default and then setting
autoload to true for just one of them at a time simplifies testing code
(see next patch for few conversions in BPF selftests taking advantage of
this new feature).

Another real-world use case is in libbpf-tools for cases when different
BPF programs have to be picked depending on particulars of the host
kernel due to various incompatible changes (like kernel function renames
or signature change, or to pick kprobe vs fentry depending on
corresponding kernel support for the latter). Marking all the different
BPF program candidates as non-autoloaded declaratively makes this more
obvious in BPF source code and allows simpler code in user-space code.

When BPF program marked as SEC("?abc") it is otherwise treated just like
SEC("abc") and bpf_program__section_name() reported will be "abc".

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419002452.632125-1-andrii@kernel.org
2022-04-19 13:48:20 -07:00
Adrian Hunter
a668cc07f9 perf tools: Fix segfault accessing sample_id xyarray
perf_evsel::sample_id is an xyarray which can cause a segfault when
accessed beyond its size. e.g.

  # perf record -e intel_pt// -C 1 sleep 1
  Segmentation fault (core dumped)
  #

That is happening because a dummy event is opened to capture text poke
events accross all CPUs, however the mmap logic is allocating according
to the number of user_requested_cpus.

In general, perf sometimes uses the evsel cpus to open events, and
sometimes the evlist user_requested_cpus. However, it is not necessary
to determine which case is which because the opened event file
descriptors are also in an xyarray, the size of whch can be used
to correctly allocate the size of the sample_id xyarray, because there
is one ID per file descriptor.

Note, in the affected code path, perf_evsel fd array is subsequently
used to get the file descriptor for the mmap, so it makes sense for the
xyarrays to be the same size there.

Fixes: d1a177595b ("libperf: Adopt perf_evlist__mmap()/munmap() from tools/perf")
Fixes: 246eba8e90 ("perf tools: Add support for PERF_RECORD_TEXT_POKE")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org # 5.5+
Link: https://lore.kernel.org/r/20220413114232.26914-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-13 22:23:02 -03:00
Alan Maguire
0f8619929c libbpf: Usdt aarch64 arg parsing support
Parsing of USDT arguments is architecture-specific. On aarch64 it is
relatively easy since registers used are x[0-31] and sp. Format is
slightly different compared to x86_64. Possible forms are:

- "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]";
- "size@reg" for register values, e.g. "-4@x0";
- "size@value" for raw values, e.g. "-8@1".

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com
2022-04-11 15:32:28 -07:00
Runqing Yang
d252a4a499 libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old kernels
Background:
Libbpf automatically replaces calls to BPF bpf_probe_read_{kernel,user}
[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that
kernel doesn't support new APIs. Specifically, libbpf invokes the
probe_kern_probe_read_kernel function to load a small eBPF program into
the kernel in which bpf_probe_read_kernel API is invoked and lets the
kernel checks whether the new API is valid. If the loading fails, libbpf
considers the new API invalid and replaces it with the old API.

static int probe_kern_probe_read_kernel(void)
{
	struct bpf_insn insns[] = {
		BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),	/* r1 = r10 (fp) */
		BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),	/* r1 += -8 */
		BPF_MOV64_IMM(BPF_REG_2, 8),		/* r2 = 8 */
		BPF_MOV64_IMM(BPF_REG_3, 0),		/* r3 = 0 */
		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),
		BPF_EXIT_INSN(),
	};
	int fd, insn_cnt = ARRAY_SIZE(insns);

	fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL,
                           "GPL", insns, insn_cnt, NULL);
	return probe_fd(fd);
}

Bug:
On older kernel versions [0], the kernel checks whether the version
number provided in the bpf syscall, matches the LINUX_VERSION_CODE.
If not matched, the bpf syscall fails. eBPF However, the
probe_kern_probe_read_kernel code does not set the kernel version
number provided to the bpf syscall, which causes the loading process
alwasys fails for old versions. It means that libbpf will replace the
new API with the old one even the kernel supports the new one.

Solution:
After a discussion in [1], the solution is using BPF_PROG_TYPE_TRACEPOINT
program type instead of BPF_PROG_TYPE_KPROBE because kernel does not
enfoce version check for tracepoint programs. I test the patch in old
kernels (4.18 and 4.19) and it works well.

  [0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360
  [1] Closes: https://github.com/libbpf/libbpf/issues/473

Signed-off-by: Runqing Yang <rainkin1993@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com
2022-04-10 20:15:22 -07:00
Vladimir Isaev
0738599856 libbpf: Add ARC support to bpf_tracing.h
Add PT_REGS macros suitable for ARCompact and ARCv2.

Signed-off-by: Vladimir Isaev <isaev@synopsys.com>
Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220408224442.599566-1-geomatsi@gmail.com
2022-04-10 18:53:37 -07:00
Jakub Kicinski
34ba23b44c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-04-09

We've added 63 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 4852 insertions(+), 619 deletions(-).

The main changes are:

1) Add libbpf support for USDT (User Statically-Defined Tracing) probes.
   USDTs are an abstraction built on top of uprobes, critical for tracing
   and BPF, and widely used in production applications, from Andrii Nakryiko.

2) While Andrii was adding support for x86{-64}-specific logic of parsing
   USDT argument specification, Ilya followed-up with USDT support for s390
   architecture, from Ilya Leoshkevich.

3) Support name-based attaching for uprobe BPF programs in libbpf. The format
   supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g.
   attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc")
   now, from Alan Maguire.

4) Various load/store optimizations for the arm64 JIT to shrink the image
   size by using arm64 str/ldr immediate instructions. Also enable pointer
   authentication to verify return address for JITed code, from Xu Kuohai.

5) BPF verifier fixes for write access checks to helper functions, e.g.
   rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that
   write into passed buffers, from Kumar Kartikeya Dwivedi.

6) Fix overly excessive stack map allocation for its base map structure and
   buckets which slipped-in from cleanups during the rlimit accounting removal
   back then, from Yuntao Wang.

7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter
   connection tracking tuple direction, from Lorenzo Bianconi.

8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde.

9) Minor cleanups all over the place from various others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
  bpf: Fix excessive memory allocation in stack_map_alloc()
  selftests/bpf: Fix return value checks in perf_event_stackmap test
  selftests/bpf: Add CO-RE relos into linked_funcs selftests
  libbpf: Use weak hidden modifier for USDT BPF-side API functions
  libbpf: Don't error out on CO-RE relos for overriden weak subprogs
  samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread
  libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
  libbpf: Use strlcpy() in path resolution fallback logic
  libbpf: Add s390-specific USDT arg spec parsing logic
  libbpf: Make BPF-side of USDT support work on big-endian machines
  libbpf: Minor style improvements in USDT code
  libbpf: Fix use #ifdef instead of #if to avoid compiler warning
  libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
  selftests/bpf: Uprobe tests should verify param/return values
  libbpf: Improve string parsing for uprobe auto-attach
  libbpf: Improve library identification for uprobe binary path resolution
  selftests/bpf: Test for writes to map key from BPF helpers
  selftests/bpf: Test passing rdonly mem to global func
  bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access
  bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access
  ...
====================

Link: https://lore.kernel.org/r/20220408231741.19116-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08 17:07:29 -07:00
Andrii Nakryiko
2fa5b0f290 libbpf: Use weak hidden modifier for USDT BPF-side API functions
Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more
confusing `static inline __noinline`. This was previously impossible due
to libbpf erroring out on CO-RE relocations pointing to eliminated weak
subprogs. Now that previous patch fixed this issue, switch back to
__weak __hidden as it's a more direct way of specifying the desired
behavior.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org
2022-04-08 22:24:15 +02:00
Andrii Nakryiko
e89d57d938 libbpf: Don't error out on CO-RE relos for overriden weak subprogs
During BPF static linking, all the ELF relocations and .BTF.ext
information (including CO-RE relocations) are preserved for __weak
subprograms that were logically overriden by either previous weak
subprogram instance or by corresponding "strong" (non-weak) subprogram.
This is just how native user-space linkers work, nothing new.

But libbpf is over-zealous when processing CO-RE relocation to error out
when CO-RE relocation belonging to such eliminated weak subprogram is
encountered. Instead of erroring out on this expected situation, log
debug-level message and skip the relocation.

Fixes: db2b8b0642 ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org
2022-04-08 22:24:15 +02:00
Andrii Nakryiko
3a06ec0a99 libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
During BTF fix up for global variables, global variable can be global
weak and will have STB_WEAK binding in ELF. Support such global
variables in addition to non-weak ones.

This is not the problem when using BPF static linking, as BPF static
linker "fixes up" BTF during generation so that libbpf doesn't have to
do it anymore during bpf_object__open(), which led to this not being
noticed for a while, along with a pretty rare (currently) use of __weak
variables and maps.

Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org
2022-04-08 09:16:09 -07:00
Andrii Nakryiko
3c0dfe6e4c libbpf: Use strlcpy() in path resolution fallback logic
Coverity static analyzer complains that strcpy() can cause buffer
overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't
happen.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org
2022-04-08 09:16:09 -07:00
Ilya Leoshkevich
bd022685bd libbpf: Add s390-specific USDT arg spec parsing logic
The logic is superficially similar to that of x86, but the small
differences (no need for register table and dynamic allocation of
register names, no $ sign before constants) make maintaining a common
implementation too burdensome. Therefore simply add a s390x-specific
version of parse_usdt_arg().

Note that while bcc supports index registers, this patch does not. This
should not be a problem in most cases, since s390 uses a default value
"nor" for STAP_SDT_ARG_CONSTRAINT.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com
2022-04-08 07:04:20 -07:00
Ilya Leoshkevich
6f403d9d53 libbpf: Make BPF-side of USDT support work on big-endian machines
BPF_USDT_ARG_REG_DEREF handling always reads 8 bytes, regardless of
the actual argument size. On little-endian the relevant argument bits
end up in the lower bits of val, and later on the code that handles
all the argument types expects them to be there.

On big-endian they end up in the upper bits of val, breaking that
expectation. Fix by right-shifting val on big-endian.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-3-iii@linux.ibm.com
2022-04-07 20:59:10 -07:00
Ilya Leoshkevich
e1b6df598a libbpf: Minor style improvements in USDT code
Fix several typos and references to non-existing headers.
Also use __BYTE_ORDER__ instead of __BYTE_ORDER for consistency with
the rest of the bpf code - see commit 45f2bebc80 ("libbpf: Fix
endianness detection in BPF_CORE_READ_BITFIELD_PROBED()") for
rationale).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-2-iii@linux.ibm.com
2022-04-07 20:59:10 -07:00
Andrii Nakryiko
ded6dffaed libbpf: Fix use #ifdef instead of #if to avoid compiler warning
As reported by Naresh:

  perf build errors on i386 [1] on Linux next-20220407 [2]

  usdt.c:1181:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1181 | #if __x86_64__
        |     ^~~~~~~~~~
  usdt.c:1196:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1196 | #if __x86_64__
        |     ^~~~~~~~~~
  cc1: all warnings being treated as errors

Use #ifdef instead of #if to avoid this.

Fixes: 4c59e584d1 ("libbpf: Add x86-specific USDT arg spec parsing logic")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220407203842.3019904-1-andrii@kernel.org
2022-04-07 23:34:15 +02:00
Haowen Bai
e58c5c9717 libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
link could be null but still dereference bpf_link__destroy(&link->link)
and it will lead to a null pointer access.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649299098-2069-1-git-send-email-baihaowen@meizu.com
2022-04-07 11:46:33 -07:00
Alan Maguire
90db26e6be libbpf: Improve string parsing for uprobe auto-attach
For uprobe auto-attach, the parsing can be simplified for the SEC()
name to a single sscanf(); the return value of the sscanf can then
be used to distinguish between sections that simply specify
"u[ret]probe" (and thus cannot auto-attach), those that specify
"u[ret]probe/binary_path:function+offset" etc.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-3-git-send-email-alan.maguire@oracle.com
2022-04-07 11:42:50 -07:00
Alan Maguire
a1c9d61b19 libbpf: Improve library identification for uprobe binary path resolution
In the process of doing path resolution for uprobe attach, libraries are
identified by matching a ".so" substring in the binary_path.
This matches a lot of patterns that do not conform to library.so[.version]
format, so instead match a ".so" _suffix_, and if that fails match a
".so." substring for the versioned library case.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-2-git-send-email-alan.maguire@oracle.com
2022-04-07 11:42:50 -07:00
Colin Ian King
a8d600f6bc libbpf: Fix spelling mistake "libaries" -> "libraries"
There is a spelling mistake in a pr_warn message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406080835.14879-1-colin.i.king@gmail.com
2022-04-06 10:14:27 -07:00
Andrii Nakryiko
4c59e584d1 libbpf: Add x86-specific USDT arg spec parsing logic
Add x86/x86_64-specific USDT argument specification parsing. Each
architecture will require their own logic, as all this is arch-specific
assembly-based notation. Architectures that libbpf doesn't support for
USDTs will pr_warn() with specific error and return -ENOTSUP.

We use sscanf() as a very powerful and easy to use string parser. Those
spaces in sscanf's format string mean "skip any whitespaces", which is
pretty nifty (and somewhat little known) feature.

All this was tested on little-endian architecture, so bit shifts are
probably off on big-endian, which our CI will hopefully prove.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-6-andrii@kernel.org
2022-04-05 13:16:08 -07:00