Commit Graph

28483 Commits

Author SHA1 Message Date
Jarkko Sakkinen
39f62536be selftests/sgx: Assign source for each segment
Define source per segment so that enclave pages can be added from different
sources, e.g. anonymous VMA for zero pages. In other words, add 'src' field
to struct encl_segment, and assign it to 'encl->src' for pages inherited
from the enclave binary.

Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/7850709c3089fe20e4bcecb8295ba87c54cc2b4a.1636997631.git.reinette.chatre@intel.com
2021-11-15 11:34:00 -08:00
Sean Christopherson
5064343fb1 selftests/sgx: Fix a benign linker warning
The enclave binary (test_encl.elf) is built with only three sections (tcs,
text, and data) as controlled by its custom linker script.

If gcc is built with "--enable-linker-build-id" (this appears to be a
common configuration even if it is by default off) then gcc
will pass "--build-id" to the linker that will prompt it (the linker) to
write unique bits identifying the linked file to a ".note.gnu.build-id"
section.

The section ".note.gnu.build-id" does not exist in the test enclave
resulting in the following warning emitted by the linker:

/usr/bin/ld: warning: .note.gnu.build-id section discarded, --build-id
ignored

The test enclave does not use the build id within the binary so fix the
warning by passing a build id of "none" to the linker that will disable the
setting from any earlier "--build-id" options and thus disable the attempt
to write the build id to a ".note.gnu.build-id" section that does not
exist.

Link: https://lore.kernel.org/linux-sgx/20191017030340.18301-2-sean.j.christopherson@intel.com/
Suggested-by: Cedric Xing <cedric.xing@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/ca0f8a81fc1e78af9bdbc6a88e0f9c37d82e53f2.1636997631.git.reinette.chatre@intel.com
2021-11-15 11:33:53 -08:00
Dan Williams
814dff9ae2 cxl/test: Mock acpi_table_parse_cedt()
Now that cxl_acpi has been converted to use the core ACPI CEDT sub-table
parser, update cxl_test to inject CFMWS and CHBS data directly into
cxl_acpi's handlers.

Cc: Alison Schofield <alison.schofield@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/163553711363.2509508.17428994087868269952.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:03:00 -08:00
Vishal Verma
09eac2ca98 tools/testing/cxl: add mock output for the GET_HEALTH_INFO command
Add mocked health information for cxl_test memdevs. This allows
cxl-cli's 'list' command to display the canned health_info fields.

Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20211018051251.2289112-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:59 -08:00
Ira Weiny
5e2411ae80 cxl/memdev: Change cxl_mem to a more descriptive name
The 'struct cxl_mem' object actually represents the state of a CXL
device within the driver. Comments indicating that 'struct cxl_mem' is a
device itself are incorrect. It is data layered on top of a CXL Memory
Expander class device. Rename it 'struct cxl_dev_state'. The 'struct'
cxl_memdev' structure represents a Linux CXL memory device object, and
it uses services and information provided by 'struct cxl_dev_state'.

Update the structure name, function names, and the kdocs to reflect the
real uses of this structure.

Some helper functions that were previously prefixed "cxl_mem_" are
renamed to just "cxl_".

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211102202901.3675568-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:58 -08:00
Jakub Kicinski
a5bdc36354 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2021-11-15

We've added 72 non-merge commits during the last 13 day(s) which contain
a total of 171 files changed, 2728 insertions(+), 1143 deletions(-).

The main changes are:

1) Add btf_type_tag attributes to bring kernel annotations like __user/__rcu to
   BTF such that BPF verifier will be able to detect misuse, from Yonghong Song.

2) Big batch of libbpf improvements including various fixes, future proofing APIs,
   and adding a unified, OPTS-based bpf_prog_load() low-level API, from Andrii Nakryiko.

3) Add ingress_ifindex to BPF_SK_LOOKUP program type for selectively applying the
   programmable socket lookup logic to packets from a given netdev, from Mark Pashmfouroush.

4) Remove the 128M upper JIT limit for BPF programs on arm64 and add selftest to
   ensure exception handling still works, from Russell King and Alan Maguire.

5) Add a new bpf_find_vma() helper for tracing to map an address to the backing
   file such as shared library, from Song Liu.

6) Batch of various misc fixes to bpftool, fixing a memory leak in BPF program dump,
   updating documentation and bash-completion among others, from Quentin Monnet.

7) Deprecate libbpf bpf_program__get_prog_info_linear() API and migrate its users as
   the API is heavily tailored around perf and is non-generic, from Dave Marchevsky.

8) Enable libbpf's strict mode by default in bpftool and add a --legacy option as an
   opt-out for more relaxed BPF program requirements, from Stanislav Fomichev.

9) Fix bpftool to use libbpf_get_error() to check for errors, from Hengqi Chen.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits)
  bpftool: Use libbpf_get_error() to check error
  bpftool: Fix mixed indentation in documentation
  bpftool: Update the lists of names for maps and prog-attach types
  bpftool: Fix indent in option lists in the documentation
  bpftool: Remove inclusion of utilities.mak from Makefiles
  bpftool: Fix memory leak in prog_dump()
  selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning
  selftests/bpf: Fix an unused-but-set-variable compiler warning
  bpf: Introduce btf_tracing_ids
  bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs
  bpftool: Enable libbpf's strict mode by default
  docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support
  selftests/bpf: Clarify llvm dependency with btf_tag selftest
  selftests/bpf: Add a C test for btf_type_tag
  selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c
  selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication
  selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests
  selftests/bpf: Test libbpf API function btf__add_type_tag()
  bpftool: Support BTF_KIND_TYPE_TAG
  libbpf: Support BTF_KIND_TYPE_TAG
  ...
====================

Link: https://lore.kernel.org/r/20211115162008.25916-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-15 08:49:23 -08:00
Kent Gibson
4f4d0af7b2 selftests: gpio: restore CFLAGS options
All the CFLAGS options were incorrectly removed in the recent rework
of the GPIO selftests.  While some of the flags were specific to the old
implementation the remainder are still relevant.  Restore those options.

Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-11-15 14:28:03 +01:00
Kent Gibson
c472d71be0 selftests: gpio: fix uninitialised variable warning
When compiled with -Wall gpio-mockup-cdev.c reports an uninitialised
variable warning.  This is a false positive, as the variable is ignored
in the case it is uninitialised, but initialise the variable anyway
to remove the warning.

Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-11-15 14:27:50 +01:00
Li Zhijian
92a59d7f38 selftests: gpio: fix gpio compiling error
The gpio selftests build against the system includes rather than the
headers from the linux tree.  This results in the compile failing if
the system includes are outdated.

Prefer the headers from the linux tree, as per other selftests.

Fixes: 8bc395a6a2 ("selftests: gpio: rework and simplify test implementation")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
[Kent: reworded commit comment and added Fixes:]
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-11-15 14:27:38 +01:00
Florian Westphal
a2acf0c0e2 selftests: nft_nat: switch port shadow test cases to socat
There are now at least three distinct flavours of netcat/nc tool:
'original' version, one version ported from openbsd and nmap-ncat.

The script only works with original because it sets SOREUSEPORT option.

Other nc versions return 'port already in use' error and port shadow test fails:

PASS: inet IPv6 redirection for ns2-hMHcaRvx
nc: bind failed: Address already in use
ERROR: portshadow test default: got reply from "ROUTER", not CLIENT as intended

Switch to socat instead.

Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-15 12:02:11 +01:00
Hengqi Chen
e5043894b2 bpftool: Use libbpf_get_error() to check error
Currently, LIBBPF_STRICT_ALL mode is enabled by default for
bpftool which means on error cases, some libbpf APIs would
return NULL pointers. This makes IS_ERR check failed to detect
such cases and result in segfault error. Use libbpf_get_error()
instead like we do in libbpf itself.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211115012436.3143318-1-hengqi.chen@gmail.com
2021-11-14 18:38:13 -08:00
Quentin Monnet
b06be5651f bpftool: Fix mixed indentation in documentation
Some paragraphs in bpftool's documentation have a mix of tabs and spaces
for indentation. Let's make it consistent.

This patch brings no change to the text content.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211110114632.24537-7-quentin@isovalent.com
2021-11-14 18:35:02 -08:00
Quentin Monnet
3811e2753a bpftool: Update the lists of names for maps and prog-attach types
To support the different BPF map or attach types, bpftool must remain
up-to-date with the types supported by the kernel. Let's update the
lists, by adding the missing Bloom filter map type and the perf_event
attach type.

Both missing items were found with test_bpftool_synctypes.py.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211110114632.24537-6-quentin@isovalent.com
2021-11-14 18:35:02 -08:00
Quentin Monnet
986dec18bb bpftool: Fix indent in option lists in the documentation
Mixed indentation levels in the lists of options in bpftool's
documentation produces some unexpected results. For the "bpftool" man
page, it prints a warning:

    $ make -C bpftool.8
      GEN     bpftool.8
    <stdin>:26: (ERROR/3) Unexpected indentation.

For other pages, there is no warning, but it results in a line break
appearing in the option lists in the generated man pages.

RST paragraphs should have a uniform indentation level. Let's fix it.

Fixes: c07ba629df ("tools: bpftool: Update and synchronise option list in doc and help msg")
Fixes: 8cc8c6357c ("tools: bpftool: Document and add bash completion for -L, -B options")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211110114632.24537-5-quentin@isovalent.com
2021-11-14 18:35:02 -08:00
Quentin Monnet
48f5aef4c4 bpftool: Remove inclusion of utilities.mak from Makefiles
Bpftool's Makefile, and the Makefile for its documentation, both include
scripts/utilities.mak, but they use none of the items defined in this
file. Remove the includes.

Fixes: 71bb428fe2 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211110114632.24537-3-quentin@isovalent.com
2021-11-14 18:34:09 -08:00
Quentin Monnet
ebbd7f64a3 bpftool: Fix memory leak in prog_dump()
Following the extraction of prog_dump() from do_dump(), the struct btf
allocated in prog_dump() is no longer freed on error; the struct
bpf_prog_linfo is not freed at all. Make sure we release them before
exiting the function.

Fixes: ec2025095c ("bpftool: Match several programs with same tag")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211110114632.24537-2-quentin@isovalent.com
2021-11-14 18:34:08 -08:00
Linus Torvalds
218cc8b860 A single fix for static calls to make the trampoline patching more robust
by placing explicit signature bytes after the call trampoline to prevent
 patching random other jumps like the CFI jump table entries.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmGRDKsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZeNEADEFTbUJKd8812O9vkY9we1GDAtH7bY
 z6sYkh0/rPvYjdPfHuwqW8tUAl+CO2ne2X8FRPKgEdRLg44BY4HaMHmujdbGh3fh
 zpqynUBPoOIgtWxAPGdF+JxjrKlzjFd+WwjG3qBXOF3pjKgCc5knyjTucsl6ced3
 wF293rSYrIJ6uRv2TTNbM5hWJdC0arWbdMFnwQTxeZR54WLpu7Wfm+CCK41w0fAU
 nrfSsv73WEwpmAZNh04wsZsf7h6yCO7dCrIJD/3mpJtrUVBZXuZAKDzUzJPvHJal
 T8LcKwxZQAgPv0ubmOCrolj98Qp6PAPSdDJbzNsCJUYEbBqaB2inJ0PeHcZPspy9
 YyW00EHXD2UKm/GNF/DIlhoiNxOSh8Wn4b6H5ZRML50bS7jsMp8YVbticWEjItL6
 N4/61c45/uPILBS+Lysj0aqyj4TvagiuffJFWjw3YAQ+Gp/pzlJwRNjrw7/4DxAx
 KdpM881IKCR8UowBz3gIiA9FrJv2dGMqq31Rs1fjuauxkIX0gV3c64tAIRWrVscT
 k6GKGvHSis5cT97K3yhmNH0BUND+Skeku8G/SnTkefvcB85aU/7HBkLLJpw0w84F
 F6PTCaCJOEHrl3ADkilsi3z0sKWrph6aAzDEgp6Q6cmo9ulFAGw0bjuJb59xsvVK
 flIvTLUY3n76FA==
 =dgiF
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 static call update from Thomas Gleixner:
 "A single fix for static calls to make the trampoline patching more
  robust by placing explicit signature bytes after the call trampoline
  to prevent patching random other jumps like the CFI jump table
  entries"

* tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  static_call,x86: Robustify trampoline patching
2021-11-14 10:30:17 -08:00
Linus Torvalds
35c8fad4a7 perf tools changes for v5.16: 2nd batch
Hardware tracing:
 
 ARM:
 
 - Print the size of the buffer size consistently in hexadecimal in ARM Coresight.
 
 - Add Coresight snapshot mode support.
 
 - Update --switch-events docs in 'perf record'.
 
 - Support hardware-based PID tracing.
 
 - Track task context switch for cpu-mode events.
 
 Vendor events:
 
 - Add metric events JSON file for power10 platform
 
 perf test:
 
 - Get 'perf test' unit tests closer to kunit.
 
 - Topology tests improvements.
 
 - Remove bashisms from some tests.
 
 perf bench:
 
 - Fix memory leak of perf_cpu_map__new() in the futex benchmarks.
 
 libbpf:
 
 - Add some more weak libbpf functions o allow building with the libbpf versions, old ones,
   present in distros.
 
 libbeauty:
 
 - Translate [gs]setsockopt 'level' argument integer values to strings.
 
 tools headers UAPI:
 
 - Sync futex_waitv, arch prctl, sound, i195_drm and msr-index files with the kernel sources.
 
 Documentation:
 
 - Add documentation to 'struct symbol'.
 
 - Synchronize the definition of enum perf_hw_id with code in tools/perf/design.txt.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYZAxDwAKCRCyPKLppCJ+
 J6mlAQD9Oz+atprlAikeneljy3xTquBcHl0Wg2Ta6shR5JjuogEA4hPQXUDFz6/4
 C1tsmSDp/UOYFumkX1VW8KOi1TAMCQ4=
 =Ib8h
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:
 "Hardware tracing:

   - ARM:
      * Print the size of the buffer size consistently in hexadecimal in
        ARM Coresight.
      * Add Coresight snapshot mode support.
      * Update --switch-events docs in 'perf record'.
      * Support hardware-based PID tracing.
      * Track task context switch for cpu-mode events.

   - Vendor events:
      * Add metric events JSON file for power10 platform

  perf test:

   - Get 'perf test' unit tests closer to kunit.

   - Topology tests improvements.

   - Remove bashisms from some tests.

  perf bench:

   - Fix memory leak of perf_cpu_map__new() in the futex benchmarks.

  libbpf:

   - Add some more weak libbpf functions o allow building with the
     libbpf versions, old ones, present in distros.

  libbeauty:

   - Translate [gs]setsockopt 'level' argument integer values to
     strings.

  tools headers UAPI:

   - Sync futex_waitv, arch prctl, sound, i195_drm and msr-index files
     with the kernel sources.

  Documentation:

   - Add documentation to 'struct symbol'.

   - Synchronize the definition of enum perf_hw_id with code in
     tools/perf/design.txt"

* tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits)
  perf tests: Remove bash constructs from stat_all_pmu.sh
  perf tests: Remove bash construct from record+zstd_comp_decomp.sh
  perf test: Remove bash construct from stat_bpf_counters.sh test
  perf bench futex: Fix memory leak of perf_cpu_map__new()
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  tools headers UAPI: Sync sound/asound.h with the kernel sources
  tools headers UAPI: Sync linux/prctl.h with the kernel sources
  tools headers UAPI: Sync arch prctl headers with the kernel sources
  perf tools: Add more weak libbpf functions
  perf bpf: Avoid memory leak from perf_env__insert_btf()
  perf symbols: Factor out annotation init/exit
  perf symbols: Bit pack to save a byte
  perf symbols: Add documentation to 'struct symbol'
  tools headers UAPI: Sync files changed by new futex_waitv syscall
  perf test bpf: Use ARRAY_CHECK() instead of ad-hoc equivalent, addressing array_size.cocci warning
  perf arm-spe: Support hardware-based PID tracing
  perf arm-spe: Save context ID in record
  perf arm-spe: Update --switch-events docs in 'perf record'
  perf arm-spe: Track task context switch for cpu-mode events
  ...
2021-11-14 09:25:01 -08:00
James Clark
ac96f463cc perf tests: Remove bash constructs from stat_all_pmu.sh
The tests were passing but without testing and were printing the
following:

  $ ./perf test -v 90
  90: perf all PMU test                                               :
  --- start ---
  test child forked, pid 51650
  Testing cpu/branch-instructions/
  ./tests/shell/stat_all_pmu.sh: 10: [:
   Performance counter stats for 'true':

             137,307      cpu/branch-instructions/

         0.001686672 seconds time elapsed

         0.001376000 seconds user
         0.000000000 seconds sys: unexpected operator

Changing the regexes to a grep works in sh and prints this:

  $ ./perf test -v 90
  90: perf all PMU test                                               :
  --- start ---
  test child forked, pid 60186
  [...]
  Testing tlb_flush.stlb_any
  test child finished with 0
  ---- end ----
  perf all PMU test: Ok

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20211028134828.65774-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
James Clark
a9cdc1c5e3 perf tests: Remove bash construct from record+zstd_comp_decomp.sh
Commit 463538a383 ("perf tests: Fix test 68 zstd compression for
s390") inadvertently removed the -g flag from all platforms rather than
just s390, because the [[ ]] construct fails in sh. Changing to single
brackets restores testing of call graphs and removes the following error
from the output:

  $ ./perf test -v 85
  85: Zstd perf.data compression/decompression                        :
  --- start ---
  test child forked, pid 50643
  Collecting compressed record file:
  ./tests/shell/record+zstd_comp_decomp.sh: 15: [[: not found

Fixes: 463538a383 ("perf tests: Fix test 68 zstd compression for s390")
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20211028134828.65774-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
James Clark
c8b947642d perf test: Remove bash construct from stat_bpf_counters.sh test
Currently the test skips with an error because == only works in bash:

  $ ./perf test 91 -v
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
  91: perf stat --bpf-counters test                                   :
  --- start ---
  test child forked, pid 44586
  ./tests/shell/stat_bpf_counters.sh: 26: [: -v: unexpected operator
  test child finished with -2
  ---- end ----
  perf stat --bpf-counters test: Skip

Changing == to = does the same thing, but doesn't result in an error:

  ./perf test 91 -v
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
  91: perf stat --bpf-counters test                                   :
  --- start ---
  test child forked, pid 45833
  Skipping: --bpf-counters not supported
    Error: unknown option `bpf-counters'
  [...]
  test child finished with -2
  ---- end ----
  perf stat --bpf-counters test: Skip

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20211028134828.65774-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Sohaib Mohamed
88e48238d5 perf bench futex: Fix memory leak of perf_cpu_map__new()
ASan reports memory leaks while running:

  $ sudo ./perf bench futex all

The leaks are caused by perf_cpu_map__new not being freed.
This patch adds the missing perf_cpu_map__put since it calls
cpu_map_delete implicitly.

Fixes: 9c3516d1b8 ("libperf: Add perf_cpu_map__new()/perf_cpu_map__read() functions")
Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20211112201134.77892-1-sohaib.amhmd@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Arnaldo Carvalho de Melo
3442b5e05a tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  dae1bd5838 ("x86/msr-index: Add MSRs for XFD")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  --- tools/arch/x86/include/asm/msr-index.h	2021-07-15 16:17:01.819817827 -0300
  +++ arch/x86/include/asm/msr-index.h	2021-11-06 15:49:33.738517311 -0300
  @@ -625,6 +625,8 @@

   #define MSR_IA32_BNDCFGS_RSVD		0x00000ffc

  +#define MSR_IA32_XFD			0x000001c4
  +#define MSR_IA32_XFD_ERR		0x000001c5
   #define MSR_IA32_XSS			0x00000da0

   #define MSR_IA32_APICBASE		0x0000001b
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/after
  $ diff -u /tmp/before /tmp/after
  --- /tmp/before	2021-11-13 11:10:39.964201505 -0300
  +++ /tmp/after	2021-11-13 11:10:47.902410873 -0300
  @@ -93,6 +93,8 @@
   	[0x000001b0] = "IA32_ENERGY_PERF_BIAS",
   	[0x000001b1] = "IA32_PACKAGE_THERM_STATUS",
   	[0x000001b2] = "IA32_PACKAGE_THERM_INTERRUPT",
  +	[0x000001c4] = "IA32_XFD",
  +	[0x000001c5] = "IA32_XFD_ERR",
   	[0x000001c8] = "LBR_SELECT",
   	[0x000001c9] = "LBR_TOS",
   	[0x000001d9] = "IA32_DEBUGCTLMSR",
  $

And this gets rebuilt:

  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  INSTALL  trace_plugins
  LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD       /tmp/build/perf/trace/beauty/perf-in.o
  LD       /tmp/build/perf/perf-in.o
  LINK     /tmp/build/perf/perf

Now one can trace systemwide asking to see backtraces to where those
MSRs are being read/written with:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR"
  ^C#
  #

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR"
  <SNIP>
  New filter for msr:read_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781)
  New filter for msr:write_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781)
  <SNIP>
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
       0.000 pipewire/2479 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         __switch_to_xtra ([kernel.kallsyms])
                                         __switch_to ([kernel.kallsyms])
                                         __schedule ([kernel.kallsyms])
                                         schedule ([kernel.kallsyms])
                                         schedule_hrtimeout_range_clock ([kernel.kallsyms])
                                         do_epoll_wait ([kernel.kallsyms])
                                         __x64_sys_epoll_wait ([kernel.kallsyms])
                                         do_syscall_64 ([kernel.kallsyms])
                                         entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                         epoll_wait (/usr/lib64/libc-2.33.so)
                                         [0x76c4] (/usr/lib64/spa-0.2/support/libspa-support.so)
                                         [0x4cf0] (/usr/lib64/spa-0.2/support/libspa-support.so)
       0.027 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         __switch_to_xtra ([kernel.kallsyms])
                                         __switch_to ([kernel.kallsyms])
                                         __schedule ([kernel.kallsyms])
                                         schedule_idle ([kernel.kallsyms])
                                         do_idle ([kernel.kallsyms])
                                         cpu_startup_entry ([kernel.kallsyms])
                                         start_kernel ([kernel.kallsyms])
                                         secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Borislav Petkov <bp@suse.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YY%2FJdb6on7swsn+C@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Arnaldo Carvalho de Melo
06cf00c48f tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick up the changes in:

  e5e32171a2 ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface")
  9409eb3594 ("drm/i915: Expose logical engine instance to user")
  ea673f17ab ("drm/i915/uapi: Add comment clarifying purpose of I915_TILING_* values")
  d3ac8d4216 ("drm/i915/pxp: interfaces for using protected objects")
  cbbd3764b2 ("drm/i915/pxp: Create the arbitrary session after boot")

That don't add any new ioctl, so no changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Huang, Sean Z <sean.z.huang@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Arnaldo Carvalho de Melo
37057e743c tools headers UAPI: Sync sound/asound.h with the kernel sources
To pick up the changes in:

  5aec579e08 ("ALSA: uapi: Fix a C++ style comment in asound.h")

That is just changing a // style comment to /* */.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Arnaldo Carvalho de Melo
4902420432 tools headers UAPI: Sync linux/prctl.h with the kernel sources
To pick the changes in:

  61bc346ce6 ("uapi/linux/prctl: provide macro definitions for the PR_SCHED_CORE type argument")

That don't result in any changes in tooling:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  $

Just silences this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
  diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Arnaldo Carvalho de Melo
5b749efe2d tools headers UAPI: Sync arch prctl headers with the kernel sources
To pick the changes in this cset:

  db8268df09 ("x86/arch_prctl: Add controls for dynamic XSTATE components")

This picks these new prctls:

  $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/before
  $ cp arch/x86/include/uapi/asm/prctl.h tools/arch/x86/include/uapi/asm/prctl.h
  $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/after
  $ diff -u /tmp/before /tmp/after
  --- /tmp/before	2021-11-13 10:42:52.787308809 -0300
  +++ /tmp/after	2021-11-13 10:43:02.295558837 -0300
  @@ -6,6 +6,9 @@
   	[0x1004 - 0x1001]= "GET_GS",
   	[0x1011 - 0x1001]= "GET_CPUID",
   	[0x1012 - 0x1001]= "SET_CPUID",
  +	[0x1021 - 0x1001]= "GET_XCOMP_SUPP",
  +	[0x1022 - 0x1001]= "GET_XCOMP_PERM",
  +	[0x1023 - 0x1001]= "REQ_XCOMP_PERM",
   };

   #define x86_arch_prctl_codes_2_offset 0x2001
  $

With this 'perf trace' can translate those numbers into strings and use
the strings in filter expressions:

  # perf trace -e prctl
       0.000 ( 0.011 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9c014b7df5)     = 0
       0.032 ( 0.002 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9bb6b51580)     = 0
       5.452 ( 0.003 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfeb70) = 0
       5.468 ( 0.002 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfea70) = 0
      24.494 ( 0.009 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f562a32ae28) = 0
      24.540 ( 0.002 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f563c6d4b30) = 0
     670.281 ( 0.008 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30805c8) = 0
     670.293 ( 0.002 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30800f0) = 0
  ^C#

This addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/prctl.h' differs from latest version at 'arch/x86/include/uapi/asm/prctl.h'
  diff -u tools/arch/x86/include/uapi/asm/prctl.h arch/x86/include/uapi/asm/prctl.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YY%2FER104k852WOTK@kernel.org/T/#u
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Jiri Olsa
2a4898fc26 perf tools: Add more weak libbpf functions
We hit the window where perf uses libbpf functions, that did not make it
to the official libbpf release yet and it's breaking perf build with
dynamicly linked libbpf.

Fixing this by providing the new interface as weak functions which calls
the original libbpf functions. Fortunatelly the changes were just
renames.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211109140707.1689940-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Ian Rogers
4924b1f7c4 perf bpf: Avoid memory leak from perf_env__insert_btf()
perf_env__insert_btf() doesn't insert if a duplicate BTF id is
encountered and this causes a memory leak. Modify the function to return
a success/error value and then free the memory if insertion didn't
happen.

v2. Adds a return -1 when the insertion error occurs in
    perf_env__fetch_btf. This doesn't affect anything as the result is
    never checked.

Fixes: 3792cb2ff4 ("perf bpf: Save BTF in a rbtree in perf_env")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211112074525.121633-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Ian Rogers
4f74f18789 perf symbols: Factor out annotation init/exit
The exit function fixes a memory leak with the src field as detected by
leak sanitizer. An example of which is:

Indirect leak of 25133184 byte(s) in 207 object(s) allocated from:
    #0 0x7f199ecfe987 in __interceptor_calloc libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55defe638224 in annotated_source__alloc_histograms util/annotate.c:803
    #2 0x55defe6397e4 in symbol__hists util/annotate.c:952
    #3 0x55defe639908 in symbol__inc_addr_samples util/annotate.c:968
    #4 0x55defe63aa29 in hist_entry__inc_addr_samples util/annotate.c:1119
    #5 0x55defe499a79 in hist_iter__report_callback tools/perf/builtin-report.c:182
    #6 0x55defe7a859d in hist_entry_iter__add util/hist.c:1236
    #7 0x55defe49aa63 in process_sample_event tools/perf/builtin-report.c:315
    #8 0x55defe731bc8 in evlist__deliver_sample util/session.c:1473
    #9 0x55defe731e38 in machines__deliver_event util/session.c:1510
    #10 0x55defe732a23 in perf_session__deliver_event util/session.c:1590
    #11 0x55defe72951e in ordered_events__deliver_event util/session.c:183
    #12 0x55defe740082 in do_flush util/ordered-events.c:244
    #13 0x55defe7407cb in __ordered_events__flush util/ordered-events.c:323
    #14 0x55defe740a61 in ordered_events__flush util/ordered-events.c:341
    #15 0x55defe73837f in __perf_session__process_events util/session.c:2390
    #16 0x55defe7385ff in perf_session__process_events util/session.c:2420
    ...

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211112035124.94327-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Ian Rogers
4270456704 perf symbols: Bit pack to save a byte
Use a bit field alongside the earlier bit fields.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211112035124.94327-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Ian Rogers
bd9acd9cc6 perf symbols: Add documentation to 'struct symbol'
Refactor some existing comments and then infer the rest.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211112035124.94327-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Arnaldo Carvalho de Melo
7380aa8990 tools headers UAPI: Sync files changed by new futex_waitv syscall
To pick the changes in these csets:

  039c0ec9bb ("futex,x86: Wire up sys_futex_waitv()")
  bf69bad38c ("futex: Implement sys_futex_waitv()")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible:

  # perf trace -e futex_waitv
  ^C#
  # perf trace -v -e futex_waitv
  Using CPUID AuthenticAMD-25-21-0
  event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
  mmap size 528384B
  ^C#
  # perf trace -v -e futex* --max-events 10
  Using CPUID AuthenticAMD-25-21-0
  event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 202 || id == 449)
  mmap size 528384B
           ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
       0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
       0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
       0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
       0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep futex_waitv tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
  449	common	futex_waitv		sys_futex_waitv
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl

Cc: André Almeida <andrealmeid@collabora.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
Guo Zhengkui
f08a8fccd7 perf test bpf: Use ARRAY_CHECK() instead of ad-hoc equivalent, addressing array_size.cocci warning
Address following coccicheck warnings:

  ./tools/perf/tests/bpf.c:316:22-23: WARNING: Use ARRAY_SIZE.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: kernel@vivo.com
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211108070801.5540-1-guozhengkui@vivo.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
German Gomez
27d113cfe8 perf arm-spe: Support hardware-based PID tracing
If ARM SPE traces contains CONTEXT packets with TID info, use these
values for tracking the TID of samples. Otherwise fall back to using
context switch events and display a message warning to the user of
possible timing inaccuracies [1].

[1] https://lore.kernel.org/lkml/f877cfa6-9b25-6445-3806-ca44a4042eaf@arm.com/

Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-5-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
German Gomez
169de64f5d perf arm-spe: Save context ID in record
This patch is to save context ID in record, this will be used to set TID
for samples.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-4-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:51 -03:00
German Gomez
455c988225 perf arm-spe: Update --switch-events docs in 'perf record'
Update 'perf record' docs and ARM SPE recording options so that they are
consistent. This includes supporting the --no-switch-events flag in ARM
SPE as well.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-3-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Namhyung Kim
9dc9855f18 perf arm-spe: Track task context switch for cpu-mode events
When perf report synthesize events from ARM SPE data, it refers to
current cpu, pid and tid in the machine.  But there's no place to set
them in the ARM SPE decoder.  I'm seeing all pid/tid is set to -1 and
user symbols are not resolved in the output.

  # perf record -a -e arm_spe_0/ts_enable=1/ sleep 1

  # perf report -q | head
     8.77%     8.77%  :-1      [kernel.kallsyms]  [k] format_decode
     7.02%     7.02%  :-1      [kernel.kallsyms]  [k] seq_printf
     7.02%     7.02%  :-1      [unknown]          [.] 0x0000ffff9f687c34
     5.26%     5.26%  :-1      [kernel.kallsyms]  [k] vsnprintf
     3.51%     3.51%  :-1      [kernel.kallsyms]  [k] string
     3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f66ae20
     3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f670b3c
     3.51%     3.51%  :-1      [unknown]          [.] 0x0000ffff9f67c040
     1.75%     1.75%  :-1      [kernel.kallsyms]  [k] ___cache_free
     1.75%     1.75%  :-1      [kernel.kallsyms]  [k] __count_memcg_events

Like Intel PT, add context switch records to track task info.  As ARM
SPE support was added later than PERF_RECORD_SWITCH_CPU_WIDE, I think
we can safely set the attr.context_switch bit and use it.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-2-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Kajol Jain
3ca3af7d1f perf vendor events power10: Add metric events JSON file for power10 platform
Add PMU metric JSON file for power10 platform.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Paul Clarke <pc@us.ibm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20211108060010.177517-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Like Xu
438f1a9f54 perf design.txt: Synchronize the definition of enum perf_hw_id with code
We're not surprised that there are tons of Linux users who only read the
documentation to learn about the kernel.

Let's update the perf part for common hardware events since three new
*generic* hardware events were added.

Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211109090147.56978-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Andrew Kilroy
09e9afac8c perf arm-spe: Print size using consistent format
Since the size is already printed earlier in hex, print the same data
using the same format, in hex.

Reviewed-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211109142153.56546-3-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Andrew Kilroy
d54e50b7c9 perf cs-etm: Print size using consistent format
Since the size is already printed earlier in hex, print the same data
using the same format, in hex.

Reviewed-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211109142153.56546-2-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
German Gomez
6b1b208bef perf arm-spe: Snapshot mode test
Shell script test_arm_spe.sh has been added to test the recording of SPE
tracing events in snapshot mode.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211109163009.92072-4-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
German Gomez
56c31cdff7 perf arm-spe: Implement find_snapshot callback
The head pointer of the AUX buffer managed by the arm_spe_pmu.c driver
is not monotonically increasing, therefore the find_snapshot callback is
needed in order to find the trace data within the AUX buffer and avoid
wasting space in the perf.data file.

The pointer is assumed to have wrapped if the buffer contains non-zero
data at the end. If it has wrapped, the entire contents of the AUX
buffer are stored in the perf.data file. Otherwise only the data up to
the head pointer is stored.

Reviewed-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211109163009.92072-3-german.gomez@arm.com
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
German Gomez
0901b56028 perf arm-spe: Add snapshot mode support
This patch enables support for snapshot mode of arm_spe events,
including the implementation of the necessary callbacks (excluding
find_snapshot, which is to be included in a followup commit).

Reviewed-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211109163009.92072-2-german.gomez@arm.com
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
9aba0adae8 perf expr: Add source_count for aggregating events
Events like uncore_imc/cas_count_read/ on Skylake open multiple events
and then aggregate in the metric leader. To determine the average value
per event the number of these events is needed. Add a source_count
function that returns this value by counting the number of events with
the given metric leader. For most events the value is 1 but for
uncore_imc/cas_count_read/ it can yield values like 6.

Add a generic test, but manually tested with a test metric that uses
the function.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
1e7ab82975 perf expr: Move ID handling to its own function
This will facilitate sharing in a follow-on change.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
fdf1e29b61 perf expr: Add metric literals for topology.
Allow the number of cpus, cores, dies and packages to be queried by a
metric expression.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
3613f6c118 perf expr: Add literal values starting with #
It is useful to have literal values for constants relating to
topologies, SMT, etc. Make the parsing of literals shared code and add a
lookup function. Move #smt_on to this function.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
0b6b84cca6 perf cputopo: Match thread_siblings to topology ABI name
The topology name for thread_siblings is core_cpus_list, use this for
consistency and add documentation.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
406018dcc1 perf cputopo: Match die_siblings to topology ABI name
The topology name for die_siblings is die_cpus_list, use this for
consistency and add documentation.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
48f07b0b2a perf cputopo: Update to use pakage_cpus
core_siblings_list is the deprecated topology name for
package_cpus_list, update the code to try the non-deprecated path first.
Adjust variable names to match topology name.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
604ce2f004 perf test: Add expr test for events with hyphens
An example of such an event is topdown-fe-bound.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul A . Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20211111002109.194172-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
b47d2fb40f perf test: Remove skip_if_fail
Remove optionality, always run tests in a suite even if one fails. This
brings perf's test more inline with kunit that lacks this notion.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-23-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
848ddf5999 perf test: Remove is_supported function
All tests now return TEST_SKIP if not supported. Removing this function
brings perf's test_suite struct more inline with kunit.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-22-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
e74dd9cb33 perf test: TSC test, remove is_supported use
Migrate the is_supported functionality to returning TEST_SKIP.
Motivation is kunit has no is_supported function.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-21-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
4935e2cd1b perf test: BP tests, remove is_supported use
Migrate the is_supported functionality to returning TEST_SKIP.
Motivation is kunit has no is_supported function.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
c76ec1cf25 perf test: Remove non test case style support.
Convert shell tests to also run using test case style.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-19-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
1870356f35 perf test: Convert time to tsc test to test case.
Migration toward kunit style test cases.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:50 -03:00
Ian Rogers
e329f03a1f perf test: bp tests use test case
Migration toward kunit style test cases.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:49 -03:00
Ian Rogers
94e11fc771 perf test: Remove now unused subtest helpers
Replaced by null terminated test case array.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:49 -03:00
Ian Rogers
e65bc1fa29 perf test: Convert llvm tests to test cases.
Use null terminated array of test cases rather than the previous sub
test functions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:49 -03:00
Ian Rogers
5801e96b88 perf test: Convert bpf tests to test cases.
Use null terminated array of test cases rather than the previous sub
test functions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:49 -03:00
Ian Rogers
44a8528c24 perf test: Convert clang tests to test cases.
Use null terminated array of test cases rather than the previous sub
test functions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:49 -03:00
Ian Rogers
e47c6ecaae perf test: Convert watch point tests to test cases.
Use null terminated array of test cases rather than the previous sub
test functions.

Committer notes:

On s/390x we don't use __event(), so wrap it with __s390x__

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 18:11:10 -03:00
Linus Torvalds
4d6fe79fde New x86 features:
* Guest API and guest kernel support for SEV live migration
 
 * SEV and SEV-ES intra-host migration
 
 Bugfixes and cleanups for x86:
 
 * Fix misuse of gfn-to-pfn cache when recording guest steal time / preempted status
 
 * Fix selftests on APICv machines
 
 * Fix sparse warnings
 
 * Fix detection of KVM features in CPUID
 
 * Cleanups for bogus writes to MSR_KVM_PV_EOI_EN
 
 * Fixes and cleanups for MSR bitmap handling
 
 * Cleanups for INVPCID
 
 * Make x86 KVM_SOFT_MAX_VCPUS consistent with other architectures
 
 Bugfixes for ARM:
 
 * Fix finalization of host stage2 mappings
 
 * Tighten the return value of kvm_vcpu_preferred_target()
 
 * Make sure the extraction of ESR_ELx.EC is limited to architected bits
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGO1usUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOsXgf/d2CgZmTxZii/pOt0JKGSDKwRdoti
 pfbmtXwTkagqg2tIJtEsxwvcc26Q2VLyg9mlEelX1I8KVlexSa8mDtbGWHLFilsc
 JIZY8lE96wtlr9AyuN06K44QingDMIbXzlxcwO+muS3zTlSPsNkPcaVDk5PL35sN
 Wy2GA6GCLWv6iTLCtlX5EzbcgkrR+Mypj4lGSdXGRqfBKVFOQjeVt+Q3YzOPuacS
 03NBlYOmRxTnAGXfeAVs0/zEa69u/dYjBmwDPe6ERma4u8thHv0U/fDAh5C/ES7q
 42Thes/JOYnEZiBQ1xZLsHqRII0achbJKZhmxqPwjRf/u6eXsfz0KY9FBg==
 =kN8U
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
 "New x86 features:

   - Guest API and guest kernel support for SEV live migration

   - SEV and SEV-ES intra-host migration

  Bugfixes and cleanups for x86:

   - Fix misuse of gfn-to-pfn cache when recording guest steal time /
     preempted status

   - Fix selftests on APICv machines

   - Fix sparse warnings

   - Fix detection of KVM features in CPUID

   - Cleanups for bogus writes to MSR_KVM_PV_EOI_EN

   - Fixes and cleanups for MSR bitmap handling

   - Cleanups for INVPCID

   - Make x86 KVM_SOFT_MAX_VCPUS consistent with other architectures

  Bugfixes for ARM:

   - Fix finalization of host stage2 mappings

   - Tighten the return value of kvm_vcpu_preferred_target()

   - Make sure the extraction of ESR_ELx.EC is limited to architected
     bits"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (34 commits)
  KVM: SEV: unify cgroup cleanup code for svm_vm_migrate_from
  KVM: x86: move guest_pv_has out of user_access section
  KVM: x86: Drop arbitrary KVM_SOFT_MAX_VCPUS
  KVM: Move INVPCID type check from vmx and svm to the common kvm_handle_invpcid()
  KVM: VMX: Add a helper function to retrieve the GPR index for INVPCID, INVVPID, and INVEPT
  KVM: nVMX: Clean up x2APIC MSR handling for L2
  KVM: VMX: Macrofy the MSR bitmap getters and setters
  KVM: nVMX: Handle dynamic MSR intercept toggling
  KVM: nVMX: Query current VMCS when determining if MSR bitmaps are in use
  KVM: x86: Don't update vcpu->arch.pv_eoi.msr_val when a bogus value was written to MSR_KVM_PV_EOI_EN
  KVM: x86: Rename kvm_lapic_enable_pv_eoi()
  KVM: x86: Make sure KVM_CPUID_FEATURES really are KVM_CPUID_FEATURES
  KVM: x86: Add helper to consolidate core logic of SET_CPUID{2} flows
  kvm: mmu: Use fast PF path for access tracking of huge pages when possible
  KVM: x86/mmu: Properly dereference rcu-protected TDP MMU sptep iterator
  KVM: x86: inhibit APICv when KVM_GUESTDBG_BLOCKIRQ active
  kvm: x86: Convert return type of *is_valid_rdpmc_ecx() to bool
  KVM: x86: Fix recording of guest steal time / preempted status
  selftest: KVM: Add intra host migration tests
  selftest: KVM: Add open sev dev helper
  ...
2021-11-13 10:01:10 -08:00
Ian Rogers
2a74fe8283 perf test: Convert pmu event tests to test cases.
Use null terminated array of test cases rather than the previous sub
test functions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:32:27 -03:00
Ian Rogers
039f355545 perf test: Convert pfm tests to use test cases.
Use null terminated array of test cases rather than the previous sub
test functions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:32:26 -03:00
Ian Rogers
9be56d3080 perf test: Add skip reason to test case.
This doesn't exist in kunit, but will ease the transition from perf
tests.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:32:26 -03:00
Ian Rogers
78244d2e21 perf test: Add test case struct.
Add a test case struct mirroring the 'struct kunit_case'. Use the struct
with the DEFINE_SUITE macro, where the single test is turned into a test
case. Update the helpers in builtin-test to handle test cases.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:32:26 -03:00
Ian Rogers
f832044c8e perf test: Add helper functions for abstraction.
Abstract certain test features so that they can be refactored in later
changes. No functional change.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:32:26 -03:00
Ian Rogers
33f44bfd3c perf test: Rename struct test to test_suite
This is to align with kunit's terminology.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:32:22 -03:00
Ian Rogers
d68f036508 perf test: Move each test suite struct to its test
Rather than export test functions, export the test struct. Rename with a
suite__ prefix to avoid name collisions.

Committer notes:

Its '&suite__vectors_page', not '&suite__vectors_pages', noticed when
cross building to arm (32-bit).

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:30:58 -03:00
Ian Rogers
df2252054e perf test: Make each test/suite its own struct.
By switching to an array of pointers to tests (later to be suites)
the definition of the tests can be moved to the file containing the
tests.

Committer notes:

It's "&vectors_page", not "&vectors_pages", noticed when cross building
to 32-bit ARM.

Also the DEFINE_SUITE(vectors_page) should be done where its function is
implemented, in tools/perf/arch/arm/tests/vectors-page.c, so that we can
make it static, as we don't have anymore its declaration in tests.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-13 10:30:04 -03:00
Jakub Kicinski
0cda7d4bac selftests: net: switch to socat in the GSO GRE test
Commit a985442fde ("selftests: net: properly support IPv6 in GSO GRE test")
is not compatible with:

  Ncat: Version 7.80 ( https://nmap.org/ncat )

(which is distributed with Fedora/Red Hat), tests fail with:

  nc: invalid option -- 'N'

Let's switch to socat which is far more dependable.

Fixes: 025efa0a82 ("selftests: add simple GSO GRE test")
Fixes: a985442fde ("selftests: net: properly support IPv6 in GSO GRE test")
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Link: https://lore.kernel.org/r/20211111162929.530470-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-12 19:59:01 -08:00
Kumar Kartikeya Dwivedi
ba05fd36b8 libbpf: Perform map fd cleanup for gen_loader in case of error
Alexei reported a fd leak issue in gen loader (when invoked from
bpftool) [0]. When adding ksym support, map fd allocation was moved from
stack to loader map, however I missed closing these fds (relevant when
cleanup label is jumped to on error). For the success case, the
allocated fd is returned in loader ctx, hence this problem is not
noticed.

Make three changes, first MAX_USED_MAPS in MAX_FD_ARRAY_SZ instead of
MAX_USED_PROGS, the braino was not a problem until now for this case as
we didn't try to close map fds (otherwise use of it would have tried
closing 32 additional fds in ksym btf fd range). Then, do a cleanup for
all nr_maps fds in cleanup label code, so that in case of error all
temporary map fds from bpf_gen__map_create are closed.

Then, adjust the cleanup label to only generate code for the required
number of program and map fds.  To trim code for remaining program
fds, lay out prog_fd array in stack in the end, so that we can
directly skip the remaining instances.  Still stack size remains same,
since changing that would require changes in a lot of places
(including adjustment of stack_off macro), so nr_progs_sz variable is
only used to track required number of iterations (and jump over
cleanup size calculated from that), stack offset calculation remains
unaffected.

The difference for test_ksyms_module.o is as follows:
libbpf: //prog cleanup iterations: before = 34, after = 5
libbpf: //maps cleanup iterations: before = 64, after = 2

Also, move allocation of gen->fd_array offset to bpf_gen__init. Since
offset can now be 0, and we already continue even if add_data returns 0
in case of failure, we do not need to distinguish between 0 offset and
failure case 0, as we rely on bpf_gen__finish to check errors. We can
also skip check for gen->fd_array in add_*_fd functions, since
bpf_gen__init will take care of it.

  [0]: https://lore.kernel.org/bpf/CAADnVQJ6jSitKSNKyxOrUzwY2qDRX0sPkJ=VLGHuCLVJ=qOt9g@mail.gmail.com

Fixes: 18f4fccbf3 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112232022.899074-1-memxor@gmail.com
2021-11-12 17:23:46 -08:00
Jean-Philippe Brucker
e4ac80ef81 tools/runqslower: Fix cross-build
Commit be79505caf ("tools/runqslower: Install libbpf headers when
building") uses the target libbpf to build the host bpftool, which
doesn't work when cross-building:

  make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -C tools/bpf/runqslower O=/tmp/runqslower
  ...
    LINK    /tmp/runqslower/bpftool/bpftool
  /usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a(libbpf-in.o): Relocations in generic ELF (EM: 183)
  /usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a: error adding symbols: file in wrong format
  collect2: error: ld returned 1 exit status

When cross-building, the target architecture differs from the host. The
bpftool used for building runqslower is executed on the host, and thus
must use a different libbpf than that used for runqslower itself.
Remove the LIBBPF_OUTPUT and LIBBPF_DESTDIR parameters, so the bpftool
build makes its own library if necessary.

In the selftests, pass the host bpftool, already a prerequisite for the
runqslower recipe, as BPFTOOL_OUTPUT. The runqslower Makefile will use
the bpftool that's already built for selftests instead of making a new
one.

Fixes: be79505caf ("tools/runqslower: Install libbpf headers when building")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211112155128.565680-1-jean-philippe@linaro.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-12 17:23:16 -08:00
Lorenz Bauer
6af2e12374 selftests/bpf: Check map in map pruning
Ensure that two registers with a map_value loaded from a nested
map are considered equivalent for the purpose of state pruning
and don't cause the verifier to revisit a pruning point.

This uses a rather crude match on the number of insns visited by
the verifier, which might change in the future. I've therefore
tried to keep the code as "unpruneable" as possible by having
the code paths only converge on the second to last instruction.

Should you require to adjust the test in the future, reducing the
number of processed instructions should always be safe. Increasing
them could cause another regression, so proceed with caution.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/CACAyw99hVEJFoiBH_ZGyy=+oO-jyydoz6v1DeKPKs2HVsUH28w@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211111161452.86864-1-lmb@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-12 17:23:04 -08:00
Yonghong Song
325d956d67 selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning
When using clang to build selftests with LLVM=1 in make commandline,
I hit the following compiler warning:

  benchs/bench_bloom_filter_map.c:84:46: warning: result of comparison of constant 256
    with expression of type '__u8' (aka 'unsigned char') is always false
    [-Wtautological-constant-out-of-range-compare]
                if (args.value_size < 2 || args.value_size > 256) {
                                           ~~~~~~~~~~~~~~~ ^ ~~~

The reason is arg.vaue_size has type __u8, so comparison "args.value_size > 256"
is always false.

This patch fixed the issue by doing proper comparison before assigning the
value to args.value_size. The patch also fixed the same issue in two
other places.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112204838.3579953-1-yhs@fb.com
2021-11-12 14:11:46 -08:00
Yonghong Song
21c6ec3d52 selftests/bpf: Fix an unused-but-set-variable compiler warning
When using clang to build selftests with LLVM=1 in make commandline,
I hit the following compiler warning:
  xdpxceiver.c:747:6: warning: variable 'total' set but not used [-Wunused-but-set-variable]
          u32 total = 0;
              ^

This patch fixed the issue by removing that declaration and its
assocatied unused operation.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112204833.3579457-1-yhs@fb.com
2021-11-12 14:11:46 -08:00
Sasha Levin
7246f4dcac tools/lib/lockdep: drop liblockdep
TL;DR: While a tool like liblockdep is useful, it probably doesn't
belong within the kernel tree.

liblockdep attempts to reuse kernel code both directly (by directly
building the kernel's lockdep code) as well as indirectly (by using
sanitized headers). This makes liblockdep an integral part of the
kernel.

It also makes liblockdep quite unique: while other userspace code might
use sanitized headers, it generally doesn't attempt to use kernel code
directly which means that changes on the kernel side of things don't
affect (and break) it directly.

All our workflows and tooling around liblockdep don't support this
uniqueness. Changes that go into the kernel code aren't validated to not
break in-tree userspace code.

liblockdep ended up being very fragile, breaking over and over, to the
point that living in the same tree as the lockdep code lost most of it's
value.

liblockdep should continue living in an external tree, syncing with
the kernel often, in a controllable way.

Signed-off-by: Sasha Levin <sashal@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-12 11:07:17 -08:00
Stanislav Fomichev
314f14abde bpftool: Enable libbpf's strict mode by default
Otherwise, attaching with bpftool doesn't work with strict section names.

Also:

  - Add --legacy option to switch back to pre-1.0 behavior
  - Print a warning when program fails to load in strict mode to
    point to --legacy flag
  - By default, don't append / to the section name; in strict
    mode it's relevant only for a small subset of prog types

+ bpftool --legacy prog loadall tools/testing/selftests/bpf/test_cgroup_link.o /sys/fs/bpf/kprobe type kprobe
libbpf: failed to pin program: File exists
Error: failed to pin all programs
+ bpftool prog loadall tools/testing/selftests/bpf/test_cgroup_link.o /sys/fs/bpf/kprobe type kprobe

v1 -> v2:
  - strict by default (Quentin Monnet)
  - add more info to --legacy description (Quentin Monnet)
  - add bash completion (Quentin Monnet)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211110192324.920934-1-sdf@google.com
2021-11-12 16:54:58 +01:00
Ian Rogers
54df5c8e01 perf test: Use macro for "suite" definitions
Add a macro to simplify later refactoring. No functional change.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Ian Rogers
fe90d37877 perf test: Use macro for "suite" declarations
Currently tests are setup in builtin-test with function pointers. Kunit
exposes tests as a kunit_suite with a null terminated array of test
cases. Use a macro to aid transition from one to the other in later
changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
66aee54ba4 perf beauty: Add socket level scnprintf that handles ARCH specific SOL_SOCKET
SOL_SOCKET has a different value according to the architecture, some
have it as 0xffff while all the others have it as 1, so a simple string
array isn't usable, add a scnprintf routine that treats it as a special
case, using the array for other values.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
0826b7fd0a perf trace: Beautify the 'level' argument of setsockopt
# perf trace -e setsockopt
     0.000 ( 0.019 ms): systemd-resolv/1121 setsockopt(fd: 22, level: IP, optname: 50, optval: 0x7ffee2c0c134, optlen: 4) = 0
     0.022 ( 0.003 ms): systemd-resolv/1121 setsockopt(fd: 22, level: IP, optname: 11, optval: 0x7ffee2c0c114, optlen: 4) = 0
     0.027 ( 0.003 ms): systemd-resolv/1121 setsockopt(fd: 22, level: IP, optname: 8, optval: 0x7ffee2c0c134, optlen: 4) = 0
     0.032 ( 0.002 ms): systemd-resolv/1121 setsockopt(fd: 22, level: IP, optname: 10, optval: 0x7ffee2c0c134, optlen: 4) = 0
     0.036 ( 0.002 ms): systemd-resolv/1121 setsockopt(fd: 22, level: IP, optname: 25, optval: 0x7ffee2c0c114, optlen: 4) = 0
     0.043 ( 0.003 ms): systemd-resolv/1121 setsockopt(fd: 22, level: 1, optname: 62, optval: 0x7ffee2c0c0fc, optlen: 4) = 0
     0.055 ( 0.003 ms): systemd-resolv/1121 setsockopt(fd: 22, level: 1, optname: 25)
  ^C#

So the simple straight STRARRAY method is not enough as SOL_SOCKET is
'1' in most architectures but some use 0xffff (alpha, mips, parisc and
sparc), so a followup patch will create a specialized scnprintf to cover
that.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
f1c1e45e9c perf trace: Beautify the 'level' argument of getsockopt
# perf trace -e getsockopt
       0.000 ( 0.006 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0
       0.301 ( 0.003 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected)
       2.215 ( 0.005 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0
       2.422 ( 0.005 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected)
    1001.308 ( 0.006 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0
    1001.586 ( 0.003 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected)
    1001.647 ( 0.002 ms): systemd-resolv/1121 getsockopt(fd: 23, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected)
    1003.868 ( 0.010 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0
    1004.036 ( 0.006 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected)
    1004.087 ( 0.002 ms): systemd-resolv/1121 getsockopt(fd: 23, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected)
  ^C#

So the simple straight STRARRAY method is not enough as SOL_SOCKET is
'1' in most architectures but some use 0xffff (alpha, mips, parisc and
sparc), so a followup patch will create a specialized scnprintf to cover
that.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
ecf0a35ba2 perf beauty socket: Add generator for socket level (SOL_*) string table
$ tools/perf/trace/beauty/socket.sh
  static const char *socket_ipproto[] = {
  	[0] = "IP",
  	[1] = "ICMP",
  <SNIP>
  	[255] = "RAW",
  	[262] = "MPTCP",
  };

  static const char *socket_level[] = {
  	[0] = "IP",
  	[6] = "TCP",
  	[17] = "UDP",
  	[41] = "IPV6",
  	[58] = "ICMPV6",
  	[132] = "SCTP",
  	[136] = "UDPLITE",
  	[255] = "RAW",
  	[256] = "IPX",
  	[257] = "AX25",
  	[258] = "ATALK",
  	[259] = "NETROM",
  	[260] = "ROSE",
  	[261] = "DECNET",
  	[262] = "X25",
  	[263] = "PACKET",
  	[264] = "ATM",
  	[265] = "AAL",
  	[266] = "IRDA",
  	[267] = "NETBEUI",
  	[268] = "LLC",
  	[269] = "DCCP",
  	[270] = "NETLINK",
  	[271] = "TIPC",
  	[272] = "RXRPC",
  	[273] = "PPPOL2TP",
  	[274] = "BLUETOOTH",
  	[275] = "PNPIPE",
  	[276] = "RDS",
  	[277] = "IUCV",
  	[278] = "CAIF",
  	[279] = "ALG",
  	[280] = "NFC",
  	[281] = "KCM",
  	[282] = "TLS",
  	[283] = "XDP",
  	[284] = "MPTCP",
  	[285] = "MCTP",
  };
  $

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
d3f82839f8 perf beauty socket: Sort the ipproto array entries
Just tidying up the output for human consumption.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
82e3664b0a perf beauty socket: Rename 'regex' to 'ipproto_regex'
Paving the way for more regexps to be used here.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
1a1edf3320 perf beauty socket: Prep to receive more input header files
Move from ternary like expression to an if block, this way we'll
have just the extra lines for new files in the following patches.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
012e569036 perf beauty socket: Rename header_dir to uapi_header_dir
Paving the way to pass more headers to be consumed, like
tools/perf/trace/beauty/include/linux/socket.h in addition to the
current tools/include/uapi/linux/in.h, to get the SOL_* defines.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
795f91db26 perf beauty: Rename socket_ipproto.sh to socket.sh to hold more socket table generators
To avoid having to add new entries to tools/perf/Makefile.perf prep
socket.sh so that it can generate other socket table generators, such as
the upcoming SOL_ socket level one.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Arnaldo Carvalho de Melo
fa020dd78f perf beauty: Make all sockaddr files use a common naming scheme
The script that generates the tables was named 'socket.sh', which is
confusing, rename it to sockaddr.sh and make sure the related
Makefile.perf targets also use the 'sockaddr' namespace.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 10:40:34 -03:00
Yonghong Song
3f1d0dc0ba selftests/bpf: Clarify llvm dependency with btf_tag selftest
btf_tag selftest needs certain llvm versions (>= llvm14).
Make it clear in the selftests README.rst file.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012651.1508549-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
5698a42a73 selftests/bpf: Add a C test for btf_type_tag
The following is the main btf_type_tag usage in the
C test:
  #define __tag1 __attribute__((btf_type_tag("tag1")))
  #define __tag2 __attribute__((btf_type_tag("tag2")))
  struct btf_type_tag_test {
       int __tag1 * __tag1 __tag2 *p;
  } g;

The bpftool raw dump with related types:
  [4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
  [11] STRUCT 'btf_type_tag_test' size=8 vlen=1
          'p' type_id=14 bits_offset=0
  [12] TYPE_TAG 'tag1' type_id=16
  [13] TYPE_TAG 'tag2' type_id=12
  [14] PTR '(anon)' type_id=13
  [15] TYPE_TAG 'tag1' type_id=4
  [16] PTR '(anon)' type_id=15
  [17] VAR 'g' type_id=11, linkage=global

With format C dump, we have
  struct btf_type_tag_test {
        int __attribute__((btf_type_tag("tag1"))) * __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag2"))) *p;
  };
The result C code is identical to the original definition except macro's are gone.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012646.1508231-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
26c79fcbfa selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c
Rename progs/tag.c to progs/btf_decl_tag.c so we can introduce
progs/btf_type_tag.c in the next patch.

Also create a subtest for btf_decl_tag in prog_tests/btf_tag.c
so we can introduce btf_type_tag subtest in the next patch.

I also took opportunity to remove the check whether __has_attribute
is defined or not in progs/btf_decl_tag.c since all recent
clangs should already support this macro.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012641.1507144-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
846f4826d1 selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication
Add BTF_KIND_TYPE_TAG duplication unit tests.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012635.1506853-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
6aa5dabc9d selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests
Add BTF_KIND_TYPE_TAG unit tests.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012630.1506095-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
0dc8587220 selftests/bpf: Test libbpf API function btf__add_type_tag()
Add unit tests for btf__add_type_tag().

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012625.1505748-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
3da5ba6f05 bpftool: Support BTF_KIND_TYPE_TAG
Add bpftool support for BTF_KIND_TYPE_TAG.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012620.1505506-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
2dc1e488e5 libbpf: Support BTF_KIND_TYPE_TAG
Add libbpf support for BTF_KIND_TYPE_TAG.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012614.1505315-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
8c42d2fa4e bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes
LLVM patches ([1] for clang, [2] and [3] for BPF backend)
added support for btf_type_tag attributes. This patch
added support for the kernel.

The main motivation for btf_type_tag is to bring kernel
annotations __user, __rcu etc. to btf. With such information
available in btf, bpf verifier can detect mis-usages
and reject the program. For example, for __user tagged pointer,
developers can then use proper helper like bpf_probe_read_user()
etc. to read the data.

BTF_KIND_TYPE_TAG may also useful for other tracing
facility where instead of to require user to specify
kernel/user address type, the kernel can detect it
by itself with btf.

  [1] https://reviews.llvm.org/D111199
  [2] https://reviews.llvm.org/D113222
  [3] https://reviews.llvm.org/D113496

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012609.1505032-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Andrii Nakryiko
164b04f27f bpftool: Update btf_dump__new() and perf_buffer__new_raw() calls
Use v1.0-compatible variants of btf_dump and perf_buffer "constructors".
This is also a demonstration of reusing struct perf_buffer_raw_opts as
OPTS-style option struct for new perf_buffer__new_raw() API.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-10-andrii@kernel.org
2021-11-11 16:54:06 -08:00
Andrii Nakryiko
eda8bfa5b7 tools/runqslower: Update perf_buffer__new() calls
Use v1.0+ compatible variant of perf_buffer__new() call to prepare for
deprecation.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-9-andrii@kernel.org
2021-11-11 16:54:06 -08:00
Andrii Nakryiko
60ba87bb6b selftests/bpf: Update btf_dump__new() uses to v1.0+ variant
Update to-be-deprecated forms of btf_dump__new().

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-8-andrii@kernel.org
2021-11-11 16:54:06 -08:00
Andrii Nakryiko
0b52a5f4b9 selftests/bpf: Migrate all deprecated perf_buffer uses
Migrate all old-style perf_buffer__new() and perf_buffer__new_raw()
calls to new v1.0+ variants.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-7-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
4178893465 libbpf: Make perf_buffer__new() use OPTS-based interface
Add new variants of perf_buffer__new() and perf_buffer__new_raw() that
use OPTS-based options for future extensibility ([0]). Given all the
currently used API names are best fits, re-use them and use
___libbpf_override() approach and symbol versioning to preserve ABI and
source code compatibility. struct perf_buffer_opts and struct
perf_buffer_raw_opts are kept as well, but they are restructured such
that they are OPTS-based when used with new APIs. For struct
perf_buffer_raw_opts we keep few fields intact, so we have to also
preserve the memory location of them both when used as OPTS and for
legacy API variants. This is achieved with anonymous padding for OPTS
"incarnation" of the struct.  These pads can be eventually used for new
options.

  [0] Closes: https://github.com/libbpf/libbpf/issues/311

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-6-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
6084f5dc92 libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof
Change btf_dump__new() and corresponding struct btf_dump_ops structure
to be extensible by using OPTS "framework" ([0]). Given we don't change
the names, we use a similar approach as with bpf_prog_load(), but this
time we ended up with two APIs with the same name and same number of
arguments, so overloading based on number of arguments with
___libbpf_override() doesn't work.

Instead, use "overloading" based on types. In this particular case,
print callback has to be specified, so we detect which argument is
a callback. If it's 4th (last) argument, old implementation of API is
used by user code. If not, it must be 2nd, and thus new implementation
is selected. The rest is handled by the same symbol versioning approach.

btf_ext argument is dropped as it was never used and isn't necessary
either. If in the future we'll need btf_ext, that will be added into
OPTS-based struct btf_dump_opts.

struct btf_dump_opts is reused for both old API and new APIs. ctx field
is marked deprecated in v0.7+ and it's put at the same memory location
as OPTS's sz field. Any user of new-style btf_dump__new() will have to
set sz field and doesn't/shouldn't use ctx, as ctx is now passed along
the callback as mandatory input argument, following the other APIs in
libbpf that accept callbacks consistently.

Again, this is quite ugly in implementation, but is done in the name of
backwards compatibility and uniform and extensible future APIs (at the
same time, sigh). And it will be gone in libbpf 1.0.

  [0] Closes: https://github.com/libbpf/libbpf/issues/283

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-5-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
957d350a8b libbpf: Turn btf_dedup_opts into OPTS-based struct
btf__dedup() and struct btf_dedup_opts were added before we figured out
OPTS mechanism. As such, btf_dedup_opts is non-extensible without
breaking an ABI and potentially crashing user application.

Unfortunately, btf__dedup() and btf_dedup_opts are short and succinct
names that would be great to preserve and use going forward. So we use
___libbpf_override() macro approach, used previously for bpf_prog_load()
API, to define a new btf__dedup() variant that accepts only struct btf *
and struct btf_dedup_opts * arguments, and rename the old btf__dedup()
implementation into btf__dedup_deprecated(). This keeps both source and
binary compatibility with old and new applications.

The biggest problem was struct btf_dedup_opts, which wasn't OPTS-based,
and as such doesn't have `size_t sz;` as a first field. But btf__dedup()
is a pretty rarely used API and I believe that the only currently known
users (besides selftests) are libbpf's own bpf_linker and pahole.
Neither use case actually uses options and just passes NULL. So instead
of doing extra hacks, just rewrite struct btf_dedup_opts into OPTS-based
one, move btf_ext argument into those opts (only bpf_linker needs to
dedup btf_ext, so it's not a typical thing to specify), and drop never
used `dont_resolve_fwds` option (it was never used anywhere, AFAIK, it
makes BTF dedup much less useful and efficient).

Just in case, for old implementation, btf__dedup_deprecated(), detect
non-NULL options and error out with helpful message, to help users
migrate, if there are any user playing with btf__dedup().

The last remaining piece is dedup_table_size, which is another
anachronism from very early days of BTF dedup. Since then it has been
reduced to the only valid value, 1, to request forced hash collisions.
This is only used during testing. So instead introduce a bool flag to
force collisions explicitly.

This patch also adapts selftests to new btf__dedup() and btf_dedup_opts
use to avoid selftests breakage.

  [0] Closes: https://github.com/libbpf/libbpf/issues/281

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-4-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
de29e6bbb9 selftests/bpf: Minor cleanups and normalization of Makefile
Few clean ups and single-line simplifications. Also split CLEAN command
into multiple $(RM) invocations as it gets dangerously close to too long
argument list. Make sure that -o <output.o> is used always as the last
argument for saner verbose make output.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-3-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
6501182c08 bpftool: Normalize compile rules to specify output file last
When dealing with verbose Makefile output, it's extremely confusing when
compiler invocation commands don't specify -o <output.o> as the last
argument. Normalize bpftool's Makefile to do just that, as most other
BPF-related Makefiles are already doing that.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-2-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
50dee7078b selftests/bpf: Fix bpf_prog_test_load() logic to pass extra log level
After recent refactoring bpf_prog_test_load(), used across multiple
selftests, lost ability to specify extra log_level 1 or 2 (for -vv and
-vvv, respectively). Fix that problem by using bpf_object__load_xattr()
API that supports extra log_level flags. Also restore
BPF_F_TEST_RND_HI32 prog_flags by utilizing new bpf_program__set_extra_flags()
API.

Fixes: f87c1930ac ("selftests/bpf: Merge test_stub.c into testing_helpers.c")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111051758.92283-3-andrii@kernel.org
2021-11-11 16:44:26 -08:00
Andrii Nakryiko
a6ca715831 libbpf: Add ability to get/set per-program load flags
Add bpf_program__flags() API to retrieve prog_flags that will be (or
were) supplied to BPF_PROG_LOAD command.

Also add bpf_program__set_extra_flags() API to allow to set *extra*
flags, in addition to those determined by program's SEC() definition.
Such flags are logically OR'ed with libbpf-derived flags.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111051758.92283-2-andrii@kernel.org
2021-11-11 16:44:26 -08:00
Linus Torvalds
f54ca91fe6 Networking fixes for 5.16-rc1, including fixes from bpf, can
and netfilter.
 
 Current release - regressions:
 
  - bpf: do not reject when the stack read size is different
    from the tracked scalar size
 
  - net: fix premature exit from NAPI state polling in napi_disable()
 
  - riscv, bpf: fix RV32 broken build, and silence RV64 warning
 
 Current release - new code bugs:
 
  - net: fix possible NULL deref in sock_reserve_memory
 
  - amt: fix error return code in amt_init(); fix stopping the workqueue
 
  - ax88796c: use the correct ioctl callback
 
 Previous releases - always broken:
 
  - bpf: stop caching subprog index in the bpf_pseudo_func insn
 
  - security: fixups for the security hooks in sctp
 
  - nfc: add necessary privilege flags in netlink layer, limit operations
    to admin only
 
  - vsock: prevent unnecessary refcnt inc for non-blocking connect
 
  - net/smc: fix sk_refcnt underflow on link down and fallback
 
  - nfnetlink_queue: fix OOB when mac header was cleared
 
  - can: j1939: ignore invalid messages per standard
 
  - bpf, sockmap:
    - fix race in ingress receive verdict with redirect to self
    - fix incorrect sk_skb data_end access when src_reg = dst_reg
    - strparser, and tls are reusing qdisc_skb_cb and colliding
 
  - ethtool: fix ethtool msg len calculation for pause stats
 
  - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries
    to access an unregistering real_dev
 
  - udp6: make encap_rcv() bump the v6 not v4 stats
 
  - drv: prestera: add explicit padding to fix m68k build
 
  - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
 
  - drv: mvpp2: fix wrong SerDes reconfiguration order
 
 Misc & small latecomers:
 
  - ipvs: auto-load ipvs on genl access
 
  - mctp: sanity check the struct sockaddr_mctp padding fields
 
  - libfs: support RENAME_EXCHANGE in simple_rename()
 
  - avoid double accounting for pure zerocopy skbs
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGNQdwACgkQMUZtbf5S
 IrsiMQ//f66lTJ8PJ5Qj70hX9dC897olx7uGHB9eiKoyOcJI459hFlfXwRU2T4Tf
 fPNwPNUQ9Mynw9tX/jWEi+7zd6r6TSHGXK49U9/rIbQ95QjKY4LHowIE63x+vPl2
 5Cpf+80zXC3DUX1fijgyG1ujnU3kBaqopTxDLmlsHw2PGkwT5Ox1DUwkhc370eEL
 xlpq3PYGWA8/AQNyhSVBkG/UmoLaq0jYNP5yVcOj4jGjgcgLe1SLrqczENr35QHZ
 cRkuBsFBMBZF7wSX2f9qQIB/+b1pcLlD9IO+K3S7Ruq+rUd7qfL/tmwNxEh0axYK
 AyIun1Bxcy7QJGjtpGAz+Ku7jS9T3HxzyxhqilQo3co8jAW0WJ1YwHl+XPgQXyjV
 DLG6Vxt4syiwsoSXGn8MQugs4nlBT+0qWl8YamIR+o7KkAYPc2QWkXlzEDfNeIW8
 JNCZA3sy7VGi1ytorZGx16sQsEWnyRG9a6/WV20Dr+HVs1SKPcFzIfG6mVngR07T
 mQMHnbAF6Z5d8VTcPQfMxd7UH48s1bHtk5lcSTa3j0Cw+GkA6ytTmjPdJ1qRcdkH
 dl9jAfADe4O6frG+9XH7FEFqhmkghVI7bOCA4ZOhClVaIcDGgEZc2y7sY9/oZ7P4
 KXBD2R5X1caCUM0UtzwL7/8ddOtPtHIrFnhY+7+I6ijt9qmI0BY=
 =Ttgq
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, can and netfilter.

  Current release - regressions:

   - bpf: do not reject when the stack read size is different from the
     tracked scalar size

   - net: fix premature exit from NAPI state polling in napi_disable()

   - riscv, bpf: fix RV32 broken build, and silence RV64 warning

  Current release - new code bugs:

   - net: fix possible NULL deref in sock_reserve_memory

   - amt: fix error return code in amt_init(); fix stopping the
     workqueue

   - ax88796c: use the correct ioctl callback

  Previous releases - always broken:

   - bpf: stop caching subprog index in the bpf_pseudo_func insn

   - security: fixups for the security hooks in sctp

   - nfc: add necessary privilege flags in netlink layer, limit
     operations to admin only

   - vsock: prevent unnecessary refcnt inc for non-blocking connect

   - net/smc: fix sk_refcnt underflow on link down and fallback

   - nfnetlink_queue: fix OOB when mac header was cleared

   - can: j1939: ignore invalid messages per standard

   - bpf, sockmap:
      - fix race in ingress receive verdict with redirect to self
      - fix incorrect sk_skb data_end access when src_reg = dst_reg
      - strparser, and tls are reusing qdisc_skb_cb and colliding

   - ethtool: fix ethtool msg len calculation for pause stats

   - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to
     access an unregistering real_dev

   - udp6: make encap_rcv() bump the v6 not v4 stats

   - drv: prestera: add explicit padding to fix m68k build

   - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge

   - drv: mvpp2: fix wrong SerDes reconfiguration order

  Misc & small latecomers:

   - ipvs: auto-load ipvs on genl access

   - mctp: sanity check the struct sockaddr_mctp padding fields

   - libfs: support RENAME_EXCHANGE in simple_rename()

   - avoid double accounting for pure zerocopy skbs"

* tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits)
  selftests/net: udpgso_bench_rx: fix port argument
  net: wwan: iosm: fix compilation warning
  cxgb4: fix eeprom len when diagnostics not implemented
  net: fix premature exit from NAPI state polling in napi_disable()
  net/smc: fix sk_refcnt underflow on linkdown and fallback
  net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer()
  gve: fix unmatched u64_stats_update_end()
  net: ethernet: lantiq_etop: Fix compilation error
  selftests: forwarding: Fix packet matching in mirroring selftests
  vsock: prevent unnecessary refcnt inc for nonblocking connect
  net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
  net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
  net: stmmac: allow a tc-taprio base-time of zero
  selftests: net: test_vxlan_under_vrf: fix HV connectivity test
  net: hns3: allow configure ETS bandwidth of all TCs
  net: hns3: remove check VF uc mac exist when set by PF
  net: hns3: fix some mac statistics is always 0 in device version V2
  net: hns3: fix kernel crash when unload VF while it is being reset
  net: hns3: sync rx ring head in echo common pull
  net: hns3: fix pfc packet number incorrect after querying pfc parameters
  ...
2021-11-11 09:49:36 -08:00
Paolo Bonzini
1f05833193 Merge branch 'kvm-sev-move-context' into kvm-master
Add support for AMD SEV and SEV-ES intra-host migration support.  Intra
host migration provides a low-cost mechanism for userspace VMM upgrades.

In the common case for intra host migration, we can rely on the normal
ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other
confidential compute environments make most of this information opaque, and
render KVM ioctls such as "KVM_GET_REGS" irrelevant.  As a result, we need
the ability to pass this opaque metadata from one VMM to the next. The
easiest way to do this is to leave this data in the kernel, and transfer
ownership of the metadata from one KVM VM (or vCPU) to the next.  In-kernel
hand off makes it possible to move any data that would be
unsafe/impossible for the kernel to hand directly to userspace, and
cannot be reproduced using data that can be handed to userspace.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11 11:02:58 -05:00
Peter Gonda
6a58150859 selftest: KVM: Add intra host migration tests
Adds testcases for intra host migration for SEV and SEV-ES. Also adds
locking test to confirm no deadlock exists.

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-6-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11 10:36:17 -05:00
Peter Gonda
7a6ab3cf39 selftest: KVM: Add open sev dev helper
Refactors out open path support from open_kvm_dev_path_or_exit() and
adds new helper for SEV device path.

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-5-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11 10:35:27 -05:00
Willem de Bruijn
d336509cb9 selftests/net: udpgso_bench_rx: fix port argument
The below commit added optional support for passing a bind address.
It configures the sockaddr bind arguments before parsing options and
reconfigures on options -b and -4.

This broke support for passing port (-p) on its own.

Configure sockaddr after parsing all arguments.

Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-11 12:24:26 +00:00
Peter Zijlstra
2105a92748 static_call,x86: Robustify trampoline patching
Add a few signature bytes after the static call trampoline and verify
those bytes match before patching the trampoline. This avoids patching
random other JMPs (such as CFI jump-table entries) instead.

These bytes decode as:

   d:   53                      push   %rbx
   e:   43 54                   rex.XB push %r12

And happen to spell "SCT".

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211030074758.GT174703@worktop.programming.kicks-ass.net
2021-11-11 13:09:31 +01:00
Mark Pashmfouroush
8b4fd2bf1f selftests/bpf: Add tests for accessing ingress_ifindex in bpf_sk_lookup
A new field was added to the bpf_sk_lookup data that users can access.
Add tests that validate that the new ingress_ifindex field contains the
right data.

Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211110111016.5670-3-markpash@cloudflare.com
2021-11-10 16:29:59 -08:00
Mark Pashmfouroush
f89315650b bpf: Add ingress_ifindex to bpf_sk_lookup
It may be helpful to have access to the ifindex during bpf socket
lookup. An example may be to scope certain socket lookup logic to
specific interfaces, i.e. an interface may be made exempt from custom
lookup code.

Add the ifindex of the arriving connection to the bpf_sk_lookup API.

Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211110111016.5670-2-markpash@cloudflare.com
2021-11-10 16:29:58 -08:00
Linus Torvalds
89fa0be0a0 arm64 fixes for -rc1
- Fix double-evaluation of 'pte' macro argument when using 52-bit PAs
 
 - Fix signedness of some MTE prctl PR_* constants
 
 - Fix kmemleak memory usage by skipping early pgtable allocations
 
 - Fix printing of CPU feature register strings
 
 - Remove redundant -nostdlib linker flag for vDSO binaries
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmGL8ukQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNB0XCADIFxVeLM3YMPsnDC6ttmGp/w1TXh5RoNBm
 IY5utdu8ybdrdCYO//Z6i7rQiYwPBRbR8Xts1LAfZDVuD/IPKqxiY+wjMe9FYu0U
 lb14dnT5iI7L3l0yo/5s1EC8FC/pfCMwi50+trAS5H5jo2wg/+6B/bgbymzTd1ZB
 vzVVzyGv1L/3xKHT4uEDjY1jMPRBH4Ugx2bJ1tzRzsC3Gz0D7ZeVd1Dg3ftPMsqI
 MA4Q5QeE9MsafcV8sKDRLnzNSYEr50goA0xtCre81VmJDNcm/y6UybUCF75KU72M
 kAOImXs0+wrDqNIocYgYlSWRu9xjw8qjoxjnyqikFx47HBesjY8T
 =pHQk
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Fix double-evaluation of 'pte' macro argument when using 52-bit PAs

 - Fix signedness of some MTE prctl PR_* constants

 - Fix kmemleak memory usage by skipping early pgtable allocations

 - Fix printing of CPU feature register strings

 - Remove redundant -nostdlib linker flag for vDSO binaries

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
  arm64: Track no early_pgtable_alloc() for kmemleak
  arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long
  arm64: vdso: remove -nostdlib compiler flag
  arm64: arm64_ftr_reg->name may not be a human-readable string
2021-11-10 11:29:30 -08:00
Quentin Monnet
1a8b597dda bpftool: Fix SPDX tag for Makefiles and .gitignore
Bpftool is dual-licensed under GPLv2 and BSD-2-Clause. In commit
907b223651 ("tools: bpftool: dual license all files") we made sure
that all its source files were indeed covered by the two licenses, and
that they had the correct SPDX tags.

However, bpftool's Makefile, the Makefile for its documentation, and the
.gitignore file were skipped at the time (their GPL-2.0-only tag was
added later). Let's update the tags.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Joe Stringer <joe@cilium.io>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/20211105221904.3536-1-quentin@isovalent.com
2021-11-10 09:00:52 -08:00
Petr Machata
af0a51113c selftests: forwarding: Fix packet matching in mirroring selftests
In commit 6de6e46d27 ("cls_flower: Fix inability to match GRE/IPIP
packets"), cls_flower was fixed to match an outer packet of a tunneled
packet as would be expected, rather than dissecting to the inner packet and
matching on that.

This fix uncovered several issues in packet matching in mirroring
selftests:

- in mirror_gre_bridge_1d_vlan.sh and mirror_gre_vlan_bridge_1q.sh, the
  vlan_ethtype match is copied around as "ip", even as some of the tests
  are running over ip6gretap. This is fixed by using an "ipv6" for
  vlan_ethtype in the ip6gretap tests.

- in mirror_gre_changes.sh, a filter to count GRE packets is set up to
  match TTL of 50. This used to trigger in the offloaded datapath, where
  the envelope TTL was matched, but not in the software datapath, which
  considered TTL of the inner packet. Now that both match consistently, all
  the packets were double-counted. This is fixed by marking the filter as
  skip_hw, leaving only the SW datapath component active.

Fixes: 6de6e46d27 ("cls_flower: Fix inability to match GRE/IPIP packets")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-10 14:38:44 +00:00
Andrea Righi
e7e4785fa3 selftests: net: test_vxlan_under_vrf: fix HV connectivity test
It looks like test_vxlan_under_vrf.sh is always failing to verify the
connectivity test during the ping between the two simulated VMs.

This is due to the fact that veth-hv in each VM should have a distinct
MAC address.

Fix by setting a unique MAC address on each simulated VM interface.

Without this fix:

 $ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh
 Checking HV connectivity                                           [ OK ]
 Check VM connectivity through VXLAN (underlay in the default VRF)  [FAIL]

With this fix applied:

 $ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh
 Checking HV connectivity                                           [ OK ]
 Check VM connectivity through VXLAN (underlay in the default VRF)  [ OK ]
 Check VM connectivity through VXLAN (underlay in a VRF)            [FAIL]

NOTE: the connectivity test with the underlay VRF is still failing; it
seems that ARP requests are blocked at the simulated hypervisor level,
probably due to some missing ARP forwarding rules. This requires more
investigation (in the meantime we may consider to set that test as
expected failure - XFAIL).

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-10 14:31:02 +00:00
Jakub Kicinski
fceb07950a Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2021-11-09

We've added 7 non-merge commits during the last 3 day(s) which contain
a total of 10 files changed, 174 insertions(+), 48 deletions(-).

The main changes are:

1) Various sockmap fixes, from John and Jussi.

2) Fix out-of-bound issue with bpf_pseudo_func, from Martin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg
  bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
  bpf, sockmap: Fix race in ingress receive verdict with redirect to self
  bpf, sockmap: Remove unhash handler for BPF sockmap usage
  bpf, sockmap: Use stricter sk state checks in sk_lookup_assign
  bpf: selftest: Trigger a DCE on the whole subprog
  bpf: Stop caching subprog index in the bpf_pseudo_func insn
====================

Link: https://lore.kernel.org/r/20211109215702.38350-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-09 15:44:48 -08:00
Kumar Kartikeya Dwivedi
3a74ac2d11 libbpf: Compile using -std=gnu89
The minimum supported C standard version is C89, with use of GNU
extensions, hence make sure to catch any instances that would break
the build for this mode by passing -std=gnu89.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211105234243.390179-4-memxor@gmail.com
2021-11-09 13:27:52 -08:00
Linus Torvalds
59a2ceeef6 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "87 patches.

  Subsystems affected by this patch series: mm (pagecache and hugetlb),
  procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs,
  init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork,
  sysvfs, kcov, gdb, resource, selftests, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits)
  ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
  ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
  selftests/kselftest/runner/run_one(): allow running non-executable files
  virtio-mem: disallow mapping virtio-mem memory via /dev/mem
  kernel/resource: disallow access to exclusive system RAM regions
  kernel/resource: clean up and optimize iomem_is_exclusive()
  scripts/gdb: handle split debug for vmlinux
  kcov: replace local_irq_save() with a local_lock_t
  kcov: avoid enable+disable interrupts if !in_task()
  kcov: allocate per-CPU memory on the relevant node
  Documentation/kcov: define `ip' in the example
  Documentation/kcov: include types.h in the example
  sysv: use BUILD_BUG_ON instead of runtime check
  kernel/fork.c: unshare(): use swap() to make code cleaner
  seq_file: fix passing wrong private data
  seq_file: move seq_escape() to a header
  signal: remove duplicate include in signal.h
  crash_dump: remove duplicate include in crash_dump.h
  crash_dump: fix boolreturn.cocci warning
  hfs/hfsplus: use WARN_ON for sanity check
  ...
2021-11-09 10:11:53 -08:00
SeongJae Park
303f8e2d02 selftests/kselftest/runner/run_one(): allow running non-executable files
When running a test program, 'run_one()' checks if the program has the
execution permission and fails if it doesn't.  However, it's easy to
mistakenly lose the permissions, as some common tools like 'diff' don't
support the permission change well[1].  Compared to that, making mistakes
in the test program's path would only rare, as those are explicitly listed
in 'TEST_PROGS'.  Therefore, it might make more sense to resolve the
situation on our own and run the program.

For this reason, this commit makes the test program runner function still
print the warning message but to try parsing the interpreter of the
program and to explicitly run it with the interpreter, in this case.

[1] https://lore.kernel.org/mm-commits/YRJisBs9AunccCD4@kroah.com/

Link: https://lkml.kernel.org/r/20210810164534.25902-1-sj38.park@gmail.com
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:53 -08:00
Florian Weimer
0658a0961b procfs: do not list TID 0 in /proc/<pid>/task
If a task exits concurrently, task_pid_nr_ns may return 0.

[akpm@linux-foundation.org: coding style tweaks]
[adobriyan@gmail.com: test that /proc/*/task doesn't contain "0"]
  Link: https://lkml.kernel.org/r/YV88AnVzHxPafQ9o@localhost.localdomain

Link: https://lkml.kernel.org/r/8735pn5dx7.fsf@oldenburg.str.redhat.com
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:48 -08:00
Alan Maguire
c23551c9c3 selftests/bpf: Add exception handling selftests for tp_bpf program
Exception handling is triggered in BPF tracing programs when a NULL pointer
is dereferenced; the exception handler zeroes the target register and
execution of the BPF program progresses.

To test exception handling then, we need to trigger a NULL pointer dereference
for a field which should never be zero; if it is, the only explanation is the
exception handler ran. task->task_works is the NULL pointer chosen (for a new
task from fork() no work is associated), and the task_works->func field should
not be zero if task_works is non-NULL. The test verifies that task_works and
task_works->func are 0.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1636131046-5982-3-git-send-email-alan.maguire@oracle.com
2021-11-08 22:17:55 +01:00
Linus Torvalds
dd72945c43 cxl for v5.16
- Fix support for platforms that do not enumerate every ACPI0016 (CXL
   Host Bridge) in the CHBS (ACPI Host Bridge Structure).
 
 - Introduce a common pci_find_dvsec_capability() helper, clean up open
   coded implementations in various drivers.
 
 - Add 'cxl_test' for regression testing CXL subsystem ABIs. 'cxl_test'
   is a module built from tools/testing/cxl/ that mocks up a CXL topology
   to augment the nascent support for emulation of CXL devices in QEMU.
 
 - Convert libnvdimm to use the uuid API.
 
 - Complete the definition of CXL namespace labels in libnvdimm.
 
 - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
   mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.
 
 - Continue to sort and refactor functionality into distinct driver and
   core-infrastructure buckets. For example, mailbox handling is now a
   generic core capability consumed by the PCI and cxl_test drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYYRtyQAKCRDfioYZHlFs
 Z7UsAP9DzUN6IWWnYk1R95YXYVxFriRtRsBjujAqTg49EMghawEAoHaA9lxO3Hho
 l25TLYUOmB/zFTlUbe6YQptMJZ5YLwY=
 =im9j
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "More preparation and plumbing work in the CXL subsystem.

  From an end user perspective the highlight here is lighting up the CXL
  Persistent Memory related commands (label read / write) with the
  generic ioctl() front-end in LIBNVDIMM.

  Otherwise, the ability to instantiate new persistent and volatile
  memory regions is still on track for v5.17.

  Summary:

   - Fix support for platforms that do not enumerate every ACPI0016 (CXL
     Host Bridge) in the CHBS (ACPI Host Bridge Structure).

   - Introduce a common pci_find_dvsec_capability() helper, clean up
     open coded implementations in various drivers.

   - Add 'cxl_test' for regression testing CXL subsystem ABIs.
     'cxl_test' is a module built from tools/testing/cxl/ that mocks up
     a CXL topology to augment the nascent support for emulation of CXL
     devices in QEMU.

   - Convert libnvdimm to use the uuid API.

   - Complete the definition of CXL namespace labels in libnvdimm.

   - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
     mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.

   - Continue to sort and refactor functionality into distinct driver
     and core-infrastructure buckets. For example, mailbox handling is
     now a generic core capability consumed by the PCI and cxl_test
     drivers"

* tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (34 commits)
  ocxl: Use pci core's DVSEC functionality
  cxl/pci: Use pci core's DVSEC functionality
  PCI: Add pci_find_dvsec_capability to find designated VSEC
  cxl/pci: Split cxl_pci_setup_regs()
  cxl/pci: Add @base to cxl_register_map
  cxl/pci: Make more use of cxl_register_map
  cxl/pci: Remove pci request/release regions
  cxl/pci: Fix NULL vs ERR_PTR confusion
  cxl/pci: Remove dev_dbg for unknown register blocks
  cxl/pci: Convert register block identifiers to an enum
  cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS
  cxl/pci: Disambiguate cxl_pci further from cxl_mem
  Documentation/cxl: Add bus internal docs
  cxl/core: Split decoder setup into alloc + add
  tools/testing/cxl: Introduce a mock memory device + driver
  cxl/mbox: Move command definitions to common location
  cxl/bus: Populate the target list at decoder create
  tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  cxl/pmem: Add support for multiple nvdimm-bridge objects
  cxl/pmem: Translate NVDIMM label commands to CXL label commands
  ...
2021-11-08 11:49:48 -08:00
Linus Torvalds
05b8cd3db7 Add 'tools/perf/libbpf/' to ignored files
Commit 6b491a86b7 ("perf build: Install libbpf headers locally when
building") installed copies of the libbpf headers into the build tree,
causing unnecessary noise from 'git status' after a perf tools build.

Add the 'libbpf/' subdirectory to the .gitignore file to silence it all
again.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-08 11:33:35 -08:00
Linus Torvalds
bbdbeb0048 perf tools changes for v5.16:
perf annotate:
 
 - Add riscv64 support.
 
 - Add fusion logic for AMD microarchs.
 
 perf record:
 
 - Add an option to control the synthesizing behavior:
 
     --synth <no|all|task|mmap|cgroup>
                       Fine-tune event synthesis: default=all
 
 core:
 
 - Allow controlling synthesizing PERF_RECORD_ metadata events during record.
 
 - perf.data reader prep work for multithreaded processing.
 
 - Fix missing exclude_{host,guest} setting in PMUs that don't support it and
   that were causing the feature detection code to disable it for all events,
   even the ones in PMUs that support it.
 
 - Fix the default use of precise events on AMD, that were always falling back
   to non-precise because perf_event_attr.exclude_guest=1 was set and IBS does
   not have filtering capability, refusing precise + exclude_guest.
 
 - Add bitfield_swap() to handle branch_stack endian issue.
 
 perf script:
 
 - Show binary offsets for userspace addresses in callchains.
 
 - Support instruction latency via new "ins_lat" selectable field.
 
 - Add dlfilter-show-cycles
 
 perf inject:
 
 - Add vmlinux and ignore-vmlinux arguments, similar to other tools.
 
 perf list:
 
 - Display PMU prefix for partially supported hybrid cache events.
 
 - Display hybrid PMU events with cpu type.
 
 perf stat:
 
 - Improve metrics documentation of data structures.
 
 - Fix memory leaks in the metric code.
 
 - Use NAN for missing event IDs.
 
 - Don't compute unused events.
 
 - Fix memory leak on error path.
 
 - Encode and use metric-id as a metric qualifier.
 
 - Allow metrics with no events.
 
 - Avoid events for an 'if' constant result.
 
 - Only add a referenced metric once.
 
 - Simplify metric_refs calculation.
 
 - Allow modifiers on metrics.
 
 perf test:
 
 - Add workload test of metric and metric groups.
 
 - Workload test of all PMUs.
 
 - vmlinux-kallsyms: Ignore hidden symbols.
 
 - Add pmu-event test for event described as "config=".
 
 - Verify more event members in pmu-events test.
 
 - Add endian test for struct branch_flags on the sample-parsing test.
 
 - Improve temp file cleanup in several tests.
 
 perf daemon:
 
 - Address MSAN warnings on send_cmd().
 
 perf kmem:
 
 - Improve man page for record options
 
 perf srcline:
 
 - Use long-running addr2line per DSO, greatly speeding up the 'srcline' sort order.
 
 perf symbols:
 
 - Ignore $a/$d symbols for ARM modules.
 
 - Fix /proc/kcore access on 32 bit systems.
 
 Kernel UAPI copies:
 
 - Update copy of linux/socket.h with the kernel sources, no change in tooling output.
 
 libbpf:
 
 - Pull in bpf_program__get_prog_info_linear() from libbpf, too much specific to perf.
 
 - Deprecate bpf_map__resize() in favor of bpf_map_set_max_entries()
 
 - Install libbpf headers locally when building.
 
 - Bump minimum LLVM C++ std to GNU++14.
 
 libperf:
 
 - Use binary search in perf_cpu_map__idx() as array are sorted.
 
 libtracefs:
 
 - Enable libtracefs dynamic linking.
 
 libtraceevent:
 
 - Increase logging when verbose.
 
 Arch specific:
 
 PowerPC:
 
 - Add support to expose instruction and data address registers as part of
   extended regs.
 
 Vendor events:
 
 JSON parser:
 
 - Support ConfigCode to set the config= in PMUs like:
 
   $ cat /sys/bus/event_source/devices/hisi_sccl1_ddrc3/events/act_cmd
   config=0x5
 
 - Make the JSON parser more conformant when in strict mode.
 
 All JSON files:
 
 - Fix all remaining invalid JSON files.
 
 ARM:
 
 - Syntax corrections in Neoverse N1 json.
 
 - Categorise the Neoverse V1 counters.
 
 - Add new armv8 PMU events.
 
 - Revise hip08 uncore events.
 
 Hardware tracing:
 
 auxtrace:
 
 - Add missing Z option to ITRACE_HELP.
 
 - Add itrace A option to approximate IPC.
 
 - Add itrace d+o option to direct debug log to stdout.
 
 Intel PT:
 
 - Add support for PERF_RECORD_AUX_OUTPUT_HW_ID
 
 - Support itrace A option to approximate IPC
 
 - Support itrace d+o option to direct debug log to stdout.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYYg7RwAKCRCyPKLppCJ+
 J1KgAQCGC3802gsI38/xwli1SLBNHsZ9DEy2nX5Ikw/f64K+0QEA6DsU0hpRTR0D
 i8ZFa2zNX748n+pU2WX6E7AjO01hrQo=
 =sXHg
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.16-2021-11-07-without-bpftool-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf annotate:
   - Add riscv64 support.
   - Add fusion logic for AMD microarchs.

  perf record:
   - Add an option to control the synthesizing behavior:
       --synth <no|all|task|mmap|cgroup>

  core:
   - Allow controlling synthesizing PERF_RECORD_ metadata events during
     record.
   - perf.data reader prep work for multithreaded processing.
   - Fix missing exclude_{host,guest} setting in PMUs that don't support
     it and that were causing the feature detection code to disable it
     for all events, even the ones in PMUs that support it.
   - Fix the default use of precise events on AMD, that were always
     falling back to non-precise because perf_event_attr.exclude_guest=1
     was set and IBS does not have filtering capability, refusing
     precise + exclude_guest.
   - Add bitfield_swap() to handle branch_stack endian issue.

  perf script:
   - Show binary offsets for userspace addresses in callchains.
   - Support instruction latency via new "ins_lat" selectable field.
   - Add dlfilter-show-cycles

  perf inject:
   - Add vmlinux and ignore-vmlinux arguments, similar to other tools.

  perf list:
   - Display PMU prefix for partially supported hybrid cache events.
   - Display hybrid PMU events with cpu type.

  perf stat:
   - Improve metrics documentation of data structures.
   - Fix memory leaks in the metric code.
   - Use NAN for missing event IDs.
   - Don't compute unused events.
   - Fix memory leak on error path.
   - Encode and use metric-id as a metric qualifier.
   - Allow metrics with no events.
   - Avoid events for an 'if' constant result.
   - Only add a referenced metric once.
   - Simplify metric_refs calculation.
   - Allow modifiers on metrics.

  perf test:
   - Add workload test of metric and metric groups.
   - Workload test of all PMUs.
   - vmlinux-kallsyms: Ignore hidden symbols.
   - Add pmu-event test for event described as "config=".
   - Verify more event members in pmu-events test.
   - Add endian test for struct branch_flags on the sample-parsing test.
   - Improve temp file cleanup in several tests.

  perf daemon:
   - Address MSAN warnings on send_cmd().

  perf kmem:
   - Improve man page for record options

  perf srcline:
   - Use long-running addr2line per DSO, greatly speeding up the
     'srcline' sort order.

  perf symbols:
   - Ignore $a/$d symbols for ARM modules.
   - Fix /proc/kcore access on 32 bit systems.

  Kernel UAPI copies:
   - Update copy of linux/socket.h with the kernel sources, no change in
     tooling output.

  libbpf:
   - Pull in bpf_program__get_prog_info_linear() from libbpf, too much
     specific to perf.
   - Deprecate bpf_map__resize() in favor of bpf_map_set_max_entries()
   - Install libbpf headers locally when building.
   - Bump minimum LLVM C++ std to GNU++14.

  libperf:
   - Use binary search in perf_cpu_map__idx() as array are sorted.

  libtracefs:
   - Enable libtracefs dynamic linking.

  libtraceevent:
   - Increase logging when verbose.

  Arch specific:

   * PowerPC:
      - Add support to expose instruction and data address registers as
        part of extended regs.

  Vendor events:

   * JSON parser:
      - Support ConfigCode to set the config= in PMUs
      - Make the JSON parser more conformant when in strict mode.

   * All JSON files:
      - Fix all remaining invalid JSON files.

   * ARM:
      - Syntax corrections in Neoverse N1 json.
      - Categorise the Neoverse V1 counters.
      - Add new armv8 PMU events.
      - Revise hip08 uncore events.

  Hardware tracing:

   * auxtrace:
      - Add missing Z option to ITRACE_HELP.
      - Add itrace A option to approximate IPC.
      - Add itrace d+o option to direct debug log to stdout.

   * Intel PT:
      - Add support for PERF_RECORD_AUX_OUTPUT_HW_ID
      - Support itrace A option to approximate IPC
      - Support itrace d+o option to direct debug log to stdout"

* tag 'perf-tools-for-v5.16-2021-11-07-without-bpftool-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (120 commits)
  perf build: Install libbpf headers locally when building
  perf MANIFEST: Add bpftool files to allow building with BUILD_BPF_SKEL=1
  perf metric: Fix memory leaks
  perf parse-event: Add init and exit to parse_event_error
  perf parse-events: Rename parse_events_error functions
  perf stat: Fix memory leak on error path
  perf tools: Use __BYTE_ORDER__
  perf inject: Add vmlinux and ignore-vmlinux arguments
  perf tools: Check vmlinux/kallsyms arguments in all tools
  perf tools: Refactor out kernel symbol argument sanity checking
  perf symbols: Ignore $a/$d symbols for ARM modules
  perf evsel: Don't set exclude_guest by default
  perf evsel: Fix missing exclude_{host,guest} setting
  perf bpf: Add missing free to bpf_event__print_bpf_prog_info()
  perf beauty: Update copy of linux/socket.h with the kernel sources
  perf clang: Fixes for more recent LLVM/clang
  tools: Bump minimum LLVM C++ std to GNU++14
  perf bpf: Pull in bpf_program__get_prog_info_linear()
  Revert "perf bench futex: Add support for 32-bit systems with 64-bit time_t"
  perf test sample-parsing: Add endian test for struct branch_flags
  ...
2021-11-08 09:25:26 -08:00
Phil Sutter
85c0c8b342 selftests: nft_nat: Simplify port shadow notrack test
The second rule in prerouting chain was probably a leftover: The router
listens on veth0, so not tracking connections via that interface is
sufficient. Likewise, the rule in output chain can be limited to that
interface as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:26:57 +01:00
Phil Sutter
e1f8bc06e4 selftests: nft_nat: Improve port shadow test stability
Setup phase in test_port_shadow() relied upon a race-condition:
Listening nc on port 1405 was started in background before attempting to
create the fake conntrack entry using the same source port. If listening
nc won, fake conntrack entry could not be created causing wrong
behaviour. Reorder nc calls to fix this and introduce a short delay
before testing the setup to wait for listening nc process startup.

Fixes: 465f15a6d1 ("selftests: nft_nat: add udp hole punch test case")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:26:57 +01:00
Florian Westphal
228c3fa054 selftests: netfilter: extend nfqueue tests to cover vrf device
VRF device calls the output/postrouting hooks so packet should be seeon
with oifname tvrf and once with eth0.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:24:01 +01:00
Florian Westphal
33b8aad21a selftests: netfilter: add a vrf+conntrack testcase
Rework the reproducer for the vrf+conntrack regression reported
by Eugene into a selftest and also add a test for ip masquerading
that Lahav fixed recently.

With net or net-next tree, the first test fails and the latter
two pass.

With 09e856d54b ("vrf: Reset skb conntrack connection on VRF rcv")
reverted first test passes but the last two fail.

A proper fix needs more work, for time being a revert seems to be
the best choice, snat/masquerade did not work before the fix.

Link: https://lore.kernel.org/netdev/378ca299-4474-7e9a-3d36-2350c8c98995@gmail.com/T/#m95358a31810df7392f541f99d187227bc75c9963
Reported-by: Eugene Crosser <crosser@average.org>
Cc: Lahav Schlesinger <lschlesinger@drivenets.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:24:01 +01:00
Peter Collingbourne
aedad3e1c6 arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long
This constant was previously an unsigned long, but was changed
into an int in commit 433c38f40f ("arm64: mte: change ASYNC and
SYNC TCF settings into bitfields"). This ended up causing spurious
unsigned-signed comparison warnings in expressions such as:

(x & PR_MTE_TCF_MASK) != PR_MTE_TCF_NONE

Therefore, change it back into an unsigned long to silence these
warnings.

Link: https://linux-review.googlesource.com/id/I07a72310db30227a5b7d789d0b817d78b657c639
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://lore.kernel.org/r/20211105230829.2254790-1-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-11-08 10:04:33 +00:00
Song Liu
f108662b27 selftests/bpf: Add tests for bpf_find_vma
Add tests for bpf_find_vma in perf_event program and kprobe program. The
perf_event program is triggered from NMI context, so the second call of
bpf_find_vma() will return -EBUSY (irq_work busy). The kprobe program,
on the other hand, does not have this constraint.

Also add tests for illegal writes to task or vma from the callback
function. The verifier should reject both cases.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211105232330.1936330-3-songliubraving@fb.com
2021-11-07 11:54:51 -08:00
Song Liu
7c7e3d31e7 bpf: Introduce helper bpf_find_vma
In some profiler use cases, it is necessary to map an address to the
backing file, e.g., a shared library. bpf_find_vma helper provides a
flexible way to achieve this. bpf_find_vma maps an address of a task to
the vma (vm_area_struct) for this address, and feed the vma to an callback
BPF function. The callback function is necessary here, as we need to
ensure mmap_sem is unlocked.

It is necessary to lock mmap_sem for find_vma. To lock and unlock mmap_sem
safely when irqs are disable, we use the same mechanism as stackmap with
build_id. Specifically, when irqs are disabled, the unlocked is postponed
in an irq_work. Refactor stackmap.c so that the irq_work is shared among
bpf_find_vma and stackmap helpers.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211105232330.1936330-2-songliubraving@fb.com
2021-11-07 11:54:51 -08:00
Anders Roxell
62b12ab5df selftests: net: tls: remove unused variable and code
When building selftests/net with clang, the compiler warn about the
function abs() see below:

tls.c:657:15: warning: variable 'len_compared' set but not used [-Wunused-but-set-variable]
        unsigned int len_compared = 0;
                     ^

Rework to remove the unused variable and the for-loop where the variable
'len_compared' was assinged.

Fixes: 7f657d5bf5 ("selftests: tls: add selftests for TLS sockets")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-07 19:35:08 +00:00
Quentin Monnet
6b491a86b7 perf build: Install libbpf headers locally when building
API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's adjust perf's Makefile to install those headers
locally when building libbpf.

v2:

- Fix $(LIBBPF_OUTPUT) when $(OUTPUT) is null.
- Make sure the recipe for $(LIBBPF_OUTPUT) is not under a "ifdef".

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211107002445.4790-1-quentin@isovalent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 15:39:28 -03:00
Arnaldo Carvalho de Melo
f174940488 perf MANIFEST: Add bpftool files to allow building with BUILD_BPF_SKEL=1
We need bpftool and required kernel/bpf/disasm.[ch] to bootstrap the
cgroups, bperf and other BPF skels used by perf.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 15:39:28 -03:00
Ian Rogers
aba8c5e380 perf metric: Fix memory leaks
Certain error paths may leak memory as caught by address sanitizer.
Ensure this is cleaned up to make sure address/leak sanitizer is happy.

Fixes: 5ecd5a0c7d ("perf metrics: Modify setup and deduplication")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211107090002.3784612-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 15:39:28 -03:00
Ian Rogers
07eafd4e05 perf parse-event: Add init and exit to parse_event_error
parse_events() may succeed but leave string memory allocations reachable
in the error.

Add an init/exit that must be called to initialize and clean up the
error. This fixes a leak in metricgroup parse_ids.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211107090002.3784612-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 15:39:25 -03:00
Ian Rogers
6c1912898e perf parse-events: Rename parse_events_error functions
Group error functions and name after the data type they manipulate.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211107090002.3784612-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 15:38:54 -03:00
Andrii Nakryiko
8c7a955201 selftests/bpf: Fix bpf_object leak in skb_ctx selftest
skb_ctx selftest didn't close bpf_object implicitly allocated by
bpf_prog_test_load() helper. Fix the problem by explicitly calling
bpf_object__close() at the end of the test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-10-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f91231eeee selftests/bpf: Destroy XDP link correctly
bpf_link__detach() was confused with bpf_link__destroy() and leaves
leaked FD in the process. Fix the problem.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-9-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f92321d706 selftests/bpf: Avoid duplicate btf__parse() call
btf__parse() is repeated after successful setup, leaving the first
instance leaked. Remove redundant and premature call.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-8-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f79587520a selftests/bpf: Clean up btf and btf_dump in dump_datasec test
Free up used resources at the end and on error. Also make it more
obvious that there is btf__parse() call that creates struct btf
instance.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-7-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
5309b516bc selftests/bpf: Free inner strings index in btf selftest
Inner array of allocated strings wasn't freed on success. Now it's
always freed.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-6-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
b8b26e585f selftests/bpf: Free per-cpu values array in bpf_iter selftest
Array holding per-cpu values wasn't freed. Fix that.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211107165521.9240-5-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
8ba2858749 selftests/bpf: Fix memory leaks in btf_type_c_dump() helper
Free up memory and resources used by temporary allocated memstream and
btf_dump instance.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-4-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
8f7b239ea8 libbpf: Free up resources used by inner map definition
It's not enough to just free(map->inner_map), as inner_map itself can
have extra memory allocated, like map name.

Fixes: 646f02ffdd ("libbpf: Add BTF-defined map-in-map support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-3-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
2a2cb45b72 selftests/bpf: Pass sanitizer flags to linker through LDFLAGS
When adding -fsanitize=address to SAN_CFLAGS, it has to be passed both
to compiler through CFLAGS as well as linker through LDFLAGS. Add
SAN_CFLAGS into LDFLAGS to allow building selftests with ASAN.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-2-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f19ddfe036 selftests/bpf: Use explicit bpf_test_load_program() helper calls
Remove the second part of prog loading testing helper re-definition:

  -Dbpf_load_program=bpf_test_load_program

This completes the clean up of deprecated libbpf program loading APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-13-andrii@kernel.org
2021-11-07 08:34:24 -08:00
Andrii Nakryiko
cbdb1461dc selftests/bpf: Use explicit bpf_prog_test_load() calls everywhere
-Dbpf_prog_load_deprecated=bpf_prog_test_load trick is both ugly and
breaks when deprecation goes into effect due to macro magic. Convert all
the uses to explicit bpf_prog_test_load() calls which avoid deprecation
errors and makes everything less magical.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-12-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
f87c1930ac selftests/bpf: Merge test_stub.c into testing_helpers.c
Move testing prog and object load wrappers (bpf_prog_test_load and
bpf_test_load_program) into testing_helpers.{c,h} and get rid of
otherwise useless test_stub.c. Make testing_helpers.c available to
non-test_progs binaries as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-11-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
d8e86407e5 selftests/bpf: Convert legacy prog load APIs to bpf_prog_load()
Convert all the uses of legacy low-level BPF program loading APIs
(mostly bpf_load_program_xattr(), but also some bpf_verify_program()) to
bpf_prog_load() uses.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-10-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
3d1d62397f selftests/bpf: Fix non-strict SEC() program sections
Fix few more SEC() definitions that were previously missed.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-9-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
5c5edcdebf libbpf: Remove deprecation attribute from struct bpf_prog_prep_result
This deprecation annotation has no effect because for struct deprecation
attribute has to be declared after struct definition. But instead of
moving it to the end of struct definition, remove it. When deprecation
will go in effect at libbpf v0.7, this deprecation attribute will cause
libbpf's own source code compilation to trigger deprecation warnings,
which is unavoidable because libbpf still has to support that API.

So keep deprecation of APIs, but don't mark structs used in API as
deprecated.

Fixes: e21d585cb3 ("libbpf: Deprecate multi-instance bpf_program APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-8-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
a3c7c7e805 bpftool: Stop using deprecated bpf_load_program()
Switch to bpf_prog_load() instead.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-7-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
bcc40fc002 libbpf: Stop using to-be-deprecated APIs
Remove all the internal uses of libbpf APIs that are slated to be
deprecated in v0.7.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-6-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
e32660ac6f libbpf: Remove internal use of deprecated bpf_prog_load() variants
Remove all the internal uses of bpf_load_program_xattr(), which is
slated for deprecation in v0.7.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-5-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
d10ef2b825 libbpf: Unify low-level BPF_PROG_LOAD APIs into bpf_prog_load()
Add a new unified OPTS-based low-level API for program loading,
bpf_prog_load() ([0]).  bpf_prog_load() accepts few "mandatory"
parameters as input arguments (program type, name, license,
instructions) and all the other optional (as in not required to specify
for all types of BPF programs) fields into struct bpf_prog_load_opts.

This makes all the other non-extensible APIs variant for BPF_PROG_LOAD
obsolete and they are slated for deprecation in libbpf v0.7:
  - bpf_load_program();
  - bpf_load_program_xattr();
  - bpf_verify_program().

Implementation-wise, internal helper libbpf__bpf_prog_load is refactored
to become a public bpf_prog_load() API. struct bpf_prog_load_params used
internally is replaced by public struct bpf_prog_load_opts.

Unfortunately, while conceptually all this is pretty straightforward,
the biggest complication comes from the already existing bpf_prog_load()
*high-level* API, which has nothing to do with BPF_PROG_LOAD command.

We try really hard to have a new API named bpf_prog_load(), though,
because it maps naturally to BPF_PROG_LOAD command.

For that, we rename old bpf_prog_load() into bpf_prog_load_deprecated()
and mark it as COMPAT_VERSION() for shared library users compiled
against old version of libbpf. Statically linked users and shared lib
users compiled against new version of libbpf headers will get "rerouted"
to bpf_prog_deprecated() through a macro helper that decides whether to
use new or old bpf_prog_load() based on number of input arguments (see
___libbpf_overload in libbpf_common.h).

To test that existing
bpf_prog_load()-using code compiles and works as expected, I've compiled
and ran selftests as is. I had to remove (locally) selftest/bpf/Makefile
-Dbpf_prog_load=bpf_prog_test_load hack because it was conflicting with
the macro-based overload approach. I don't expect anyone else to do
something like this in practice, though. This is testing-specific way to
replace bpf_prog_load() calls with special testing variant of it, which
adds extra prog_flags value. After testing I kept this selftests hack,
but ensured that we use a new bpf_prog_load_deprecated name for this.

This patch also marks bpf_prog_load() and bpf_prog_load_xattr() as deprecated.
bpf_object interface has to be used for working with struct bpf_program.
Libbpf doesn't support loading just a bpf_program.

The silver lining is that when we get to libbpf 1.0 all these
complication will be gone and we'll have one clean bpf_prog_load()
low-level API with no backwards compatibility hackery surrounding it.

  [0] Closes: https://github.com/libbpf/libbpf/issues/284

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-4-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
45493cbaf5 libbpf: Pass number of prog load attempts explicitly
Allow to control number of BPF_PROG_LOAD attempts from outside the
sys_bpf_prog_load() helper.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-3-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
be80e9cdbc libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS
It's confusing that libbpf-provided helper macro doesn't start with
LIBBPF. Also "declare" vs "define" is confusing terminology, I can never
remember and always have to look up previous examples.

Bypass both issues by renaming DECLARE_LIBBPF_OPTS into a short and
clean LIBBPF_OPTS. To avoid breaking existing code, provide:

  #define DECLARE_LIBBPF_OPTS LIBBPF_OPTS

in libbpf_legacy.h. We can decide later if we ever want to remove it or
we'll keep it forever because it doesn't add any maintainability burden.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-2-andrii@kernel.org
2021-11-07 08:34:22 -08:00
Ian Rogers
e4e290791d perf stat: Fix memory leak on error path
strdup() is used to deduplicate, ensure it isn't leaking an already
created string by freeing first.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211107085444.3781604-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:44:53 -03:00
Ilya Leoshkevich
4e88118c20 perf tools: Use __BYTE_ORDER__
Switch from the libc-defined __BYTE_ORDER to the compiler-defined
__BYTE_ORDER__ in order to make endianness detection more robust, like
it was done for libbpf.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20211104132311.984703-1-iii@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:27:38 -03:00
James Clark
b3a018fc31 perf inject: Add vmlinux and ignore-vmlinux arguments
Other perf tools allow specifying the path to vmlinux. 'perf inject'
didn't have this argument which made some auxtrace workflows difficult.

Also add --ignore-vmlinux for consistency with other tools.

Suggested-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Denis Nikitin <denik@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-4-james.clark@arm.com
[ Added the perf-inject man page entries for these options, as noted by Denis ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:27:38 -03:00
James Clark
7cc72553ac perf tools: Check vmlinux/kallsyms arguments in all tools
Only perf report checked the validity of these arguments so apply the
same check to all tools that read them for consistency.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:27:38 -03:00
James Clark
a3df50abeb perf tools: Refactor out kernel symbol argument sanity checking
User supplied values for vmlinux and kallsyms are checked before
continuing. Refactor this into a function so that it can be used
elsewhere.

Reviewed-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:27:38 -03:00
Lexi Shao
1a86f4ba5c perf symbols: Ignore $a/$d symbols for ARM modules
On anARM machine, kernel symbols from modules can be resolved to $a
instead of printing the actual symbol name. Ignore symbols starting with
"$" when building kallsyms rbtree.

A sample stacktrace is shown as follows:

  c0f2e39c schedule_hrtimeout+0x14 ([kernel.kallsyms])
  bf4a66d8 $a+0x78 ([test_module])
  c0a4f5f4 kthread+0x15c ([kernel.kallsyms])
  c0a001f8 ret_from_fork+0x14 ([kernel.kallsyms])

On an ARM machine, $a/$d symbols are used by the compiler to mark the
beginning of code/data part in code section. These symbols are filtered
out when linking vmlinux(see scripts/kallsyms.c ignored_prefixes), but
are left on modules. So there are $a symbols in /proc/kallsyms which
share the same addresses with the actual module symbols and confuses
perf when resolving symbols.

After this patch, the module symbol name is printed:

  c0f2e39c schedule_hrtimeout+0x14 ([kernel.kallsyms])
  bf4a66d8 test_func+0x78 ([test_module])
  c0a4f5f4 kthread+0x15c ([kernel.kallsyms])
  c0a001f8 ret_from_fork+0x14 ([kernel.kallsyms])

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Lexi Shao <shaolexi@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: QiuXi <qiuxi1@huawei.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Wangbing <wangbing6@huawei.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: clang-built-linux@googlegroups.com
Link: https://lore.kernel.org/r/20211029065038.39449-2-shaolexi@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:27:38 -03:00
Ravi Bangoria
eb39bf3256 perf evsel: Don't set exclude_guest by default
Perf tool sets exclude_guest by default while calling perf_event_open().
Because IBS does not have filtering capability, it always gets rejected
by IBS PMU driver and thus perf falls back to non-precise sampling. Fix
it by not setting exclude_guest by default on AMD.

Before:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2
  decreasing precise_ip by one (1)
    precise_ip                       1
  decreasing precise_ip by one (0)

After:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2

Committer notes:

Fixup init to zero for perf_env in older compilers:

  arch/x86/util/evsel.c:15:26: error: missing field 'os_release' initializer [-Werror,-Wmissing-field-initializers]
          struct perf_env env = {0};
                                  ^

Committer notes:

Namhyung remarked:

  It'd be nice if it can cover explicit "-e cycles:pp" as well.

Ravi clarified:

  For explicit :pp modifier, evsel->precise_max does not get set and thus perf
  does not try with different attr->precise_ip values while exclude_guest set.
  So no issue with explicit :pp:

    $ sudo ./perf record -C 0 -e cycles:pp -vvv |& grep "precise_ip\|exclude_guest"
      precise_ip                       2
      exclude_guest                    1
      precise_ip                       2
      exclude_guest                    1
    switching off exclude_guest, exclude_host
      precise_ip                       2
    ^C

  Also, with :P modifier, evsel->precise_max gets set but exclude_guest does
  not and thus :P also works fine:

    $ sudo ./perf record -C 0 -e cycles:P -vvv |& grep "precise_ip\|exclude_guest"
      precise_ip                       3
    decreasing precise_ip by one (2)
      precise_ip                       2
    ^C

Reported-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211103072112.32312-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:26:24 -03:00
Andrii Nakryiko
b8b5cb55f5 libbpf: Fix non-C89 loop variable declaration in gen_loader.c
Fix the `int i` declaration inside the for statement. This is non-C89
compliant. See [0] for user report breaking BCC build.

  [0] https://github.com/libbpf/libbpf/issues/403

Fixes: 18f4fccbf3 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20211105191055.3324874-1-andrii@kernel.org
2021-11-06 16:16:52 -07:00
Linus Torvalds
0b707e572a s390 updates for the 5.16 merge window
- Add support for ftrace with direct call and ftrace direct call samples.
 
 - Add support for kernel command lines longer than current 896 bytes and
   make its length configurable.
 
 - Add support for BEAR enhancement facility to improve last breaking
   event instruction tracking.
 
 - Add kprobes sanity checks and testcases to prevent kprobe in the mid
   of an instruction.
 
 - Allow concurrent access to /dev/hwc for the CPUMF users.
 
 - Various ftrace / jump label improvements.
 
 - Convert unwinder tests to KUnit.
 
 - Add s390_iommu_aperture kernel parameter to tweak the limits on
   concurrently usable DMA mappings.
 
 - Add ap.useirq AP module option which can be used to disable interrupt
   use.
 
 - Add add_disk() error handling support to block device drivers.
 
 - Drop arch specific and use generic implementation of strlcpy and strrchr.
 
 - Several __pa/__va usages fixes.
 
 - Various cio, crypto, pci, kernel doc and other small fixes and
   improvements all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmGFW6EACgkQjYWKoQLX
 FBg20Qf/UbohgnKnE6vxbbH3sNTlI2dk3Cw4z3IobcsZgqXAu6AFLgLQGLk/X07F
 DIyUdrgSgCzLIEKLqrLrFXIOMIK44zAGaurIltNt7IrnWWlA+/YVD+YeL2gHwccq
 wT7KXRcrVMZQ1z18djJQ45DpPUC8ErBdL6+P+ftHck90YGFZsfMA5S7jf8X1h08U
 IlqdPTmY8t4unKHWVpHbxx9b+xrUuV6KTEXADsllpMV2jQoTLdDECd3vmefYR6tR
 3lssgop1m/RzH5OCqvia5Sy2D5fOQObNWDMakwOkVMxOD43lmGCTHstzS2Uo2OFE
 QcY79lfZ5NrzKnenUdE5Fd0XJ9kSwQ==
 =k0Ab
 -----END PGP SIGNATURE-----

Merge tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for ftrace with direct call and ftrace direct call
   samples.

 - Add support for kernel command lines longer than current 896 bytes
   and make its length configurable.

 - Add support for BEAR enhancement facility to improve last breaking
   event instruction tracking.

 - Add kprobes sanity checks and testcases to prevent kprobe in the mid
   of an instruction.

 - Allow concurrent access to /dev/hwc for the CPUMF users.

 - Various ftrace / jump label improvements.

 - Convert unwinder tests to KUnit.

 - Add s390_iommu_aperture kernel parameter to tweak the limits on
   concurrently usable DMA mappings.

 - Add ap.useirq AP module option which can be used to disable interrupt
   use.

 - Add add_disk() error handling support to block device drivers.

 - Drop arch specific and use generic implementation of strlcpy and
   strrchr.

 - Several __pa/__va usages fixes.

 - Various cio, crypto, pci, kernel doc and other small fixes and
   improvements all over the code.

[ Merge fixup as per https://lore.kernel.org/all/YXAqZ%2FEszRisunQw@osiris/ ]

* tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (63 commits)
  s390: make command line configurable
  s390: support command lines longer than 896 bytes
  s390/kexec_file: move kernel image size check
  s390/pci: add s390_iommu_aperture kernel parameter
  s390/spinlock: remove incorrect kernel doc indicator
  s390/string: use generic strlcpy
  s390/string: use generic strrchr
  s390/ap: function rework based on compiler warning
  s390/cio: make ccw_device_dma_* more robust
  s390/vfio-ap: s390/crypto: fix all kernel-doc warnings
  s390/hmcdrv: fix kernel doc comments
  s390/ap: new module option ap.useirq
  s390/cpumf: Allow multiple processes to access /dev/hwc
  s390/bitops: return true/false (not 1/0) from bool functions
  s390: add support for BEAR enhancement facility
  s390: introduce nospec_uses_trampoline()
  s390: rename last_break to pgm_last_break
  s390/ptrace: add last_break member to pt_regs
  s390/sclp: sort out physical vs virtual pointers usage
  s390/setup: convert start and end initrd pointers to virtual
  ...
2021-11-06 14:48:06 -07:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
Namhyung Kim
3500eeebed perf evsel: Fix missing exclude_{host,guest} setting
The current logic for the perf missing feature has a bug that it can
wrongly clear some modifiers like G or H.  Actually some PMUs don't
support any filtering or exclusion while others do.  But we check it as
a global feature.

For example, the cycles event can have 'G' modifier to enable it only in
the guest mode on x86.  When you don't run any VMs it'll return 0.

  # perf stat -a -e cycles:G sleep 1

    Performance counter stats for 'system wide':

                    0      cycles:G

          1.000721670 seconds time elapsed

But when it's used with other pmu events that don't support G modifier,
it'll be reset and return non-zero values.

  # perf stat -a -e cycles:G,msr/tsc/ sleep 1

    Performance counter stats for 'system wide':

          538,029,960      cycles:G
       16,924,010,738      msr/tsc/

          1.001815327 seconds time elapsed

This is because of the missing feature detection logic being global.
Add a hashmap to set pmu-specific exclude_host/guest features.

Committer notes:

Fix 'perf test python' by adding a stub for evsel__find_pmu() in
tools/perf/util/python.c, document that it is used so far only for the
above reasons so that if anybody needs this in the python binding
usecases, we can revisit this.

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: http://lore.kernel.org/lkml/20211105205847.120950-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-06 17:54:42 -03:00
Ian Rogers
88c42f4d6c perf bpf: Add missing free to bpf_event__print_bpf_prog_info()
If btf__new() is called then there needs to be a corresponding btf__free().

Fixes: f8dfeae009 ("perf bpf: Show more BPF program info in print_bpf_prog_info()")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211106053733.3580931-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-06 17:54:42 -03:00
Arnaldo Carvalho de Melo
6da2a45e15 perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in:

  99ce45d5e7 ("mctp: Implement extended addressing")
  55c42fa7fa ("mptcp: add MPTCP_INFO getsockopt")

That don't result in any changes in the tables generated from that
header.

A table generator for setsockopt is needed, probably will be done in the
5.16 cycle.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Westphal <fw@strlen.de>
Cc: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-06 17:54:42 -03:00
SeongJae Park
1dc90ccd15 selftests/damon: support watermarks
This updates DAMON selftests for 'schemes' debugfs file to reflect the
changes in the format.

Link: https://lkml.kernel.org/r/20211019150731.16699-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:45 -07:00
SeongJae Park
5a0d6a08b8 tools/selftests/damon: update for regions prioritization of schemes
This updates the DAMON selftests for 'schemes' debugfs file, as the file
format is updated.

Link: https://lkml.kernel.org/r/20211019150731.16699-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:45 -07:00
SeongJae Park
a2cb4dd0d4 mm/damon/selftests: support schemes quotas
This updates DAMON selftests to support updated schemes debugfs file
format for the quotas.

Link: https://lkml.kernel.org/r/20211019150731.16699-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:45 -07:00
SeongJae Park
8d5d4c6359 selftests/damon: add 'schemes' debugfs tests
This adds simple selftets for 'schemes' debugfs file of DAMON.

Link: https://lkml.kernel.org/r/20211001125604.29660-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:44 -07:00
David Hildenbrand
50f9481ed9 mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE
CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for
CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use
CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE.

Link: https://lkml.kernel.org/r/20210929143600.49379-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>	[kselftest]
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:42 -07:00
David Hildenbrand
39b2e5cae4 selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers
The madv_populate selftest currently builds with a warning when the
local installed headers (via the distribution) don't include
MADV_POPULATE_READ and MADV_POPULATE_WRITE.  The warning is correct,
because the test cannot locate the necessary header.

The reason is that the in-tree installed headers (usr/include) have a
"linux" instead of a "sys" subdirectory.

Including "linux/mman.h" instead of "sys/mman.h" doesn't work (e.g.,
mmap() and madvise() are not defined that way).  The only thing that
seems to work is including "linux/mman.h" in addition to "sys/mman.h".

We can get rid of our availability check and simplify.

Link: https://lkml.kernel.org/r/20211015165758.41374-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:42 -07:00
Pedro Demarchi Gomes
3252548996 selftests: vm: add KSM huge pages merging time test
Add test case of KSM merging time using mostly huge pages

Link: https://lkml.kernel.org/r/20211013044045.360251-1-pedrodemargomes@gmail.com
Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Cc: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Aneesh Kumar K.V
e3820ab252 selftest/vm: fix ksm selftest to run with different NUMA topologies
Platforms can have non-contiguous NUMA nodes like below

   #numactl  -H
  available: 2 nodes (0,8)
  .....
  node distances:
  node   0   8
    0:  10  40
    8:  40  10

   #numactl  -H
  available: 1 nodes (1)
  ....
  node distances:
  node   1
    1:  10

Hence update the test to not assume the presence of Node 0 and 1 and
also use numa_num_configured_nodes() instead of numa_max_node for
finding whether to skip the test.

Link: https://lkml.kernel.org/r/20210914141414.350759-1-aneesh.kumar@linux.ibm.com
Fixes: 82e717ad35 ("selftests: vm: add KSM merging across nodes test")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
George G. Davis
39cad8878a selftests/vm/transhuge-stress: fix ram size thinko
When executing transhuge-stress with an argument to specify the virtual
memory size for testing, the ram size is reported as 0, e.g.

  transhuge-stress 384
  thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 0 MiB of ram
  thp-mmap: 0.184 s/loop, 0.957 ms/page,   2090.265 MiB/s  192 succeed,    0 failed

This appears to be due to a thinko in commit 0085d61fe0
("selftests/vm/transhuge-stress: stress test for memory compaction"),
where, at a guess, the intent was to base "xyz MiB of ram" on `ram`
size.

Here are results after using `ram` size:

  thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 14 MiB of ram

Link: https://lkml.kernel.org/r/20210825135843.29052-1-george_davis@mentor.com
Fixes: 0085d61fe0 ("selftests/vm/transhuge-stress: stress test for memory compaction")
Signed-off-by: George G. Davis <davis.george@siemens.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Naoya Horiguchi
41d4613b37 tools/vm/page-types.c: print file offset in hexadecimal
In page list mode (with -l and -L option), virtual address and physical
address are printed in hexadecimal, but file offset is not, which is
confusing, so let's align it.

Link: https://lkml.kernel.org/r/20211004061325.1525902-4-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Bin Wang <wangbin224@huawei.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:40 -07:00
Naoya Horiguchi
b76901db7b tools/vm/page-types.c: move show_file() to summary output
Currently file info from show_file() is printed out within page list
like below, but this is inconvenient a little to utilize the page list
from other scripts (maybe needs additional filtering).

    $ ./page-types -f page-types.c -l
    foffset offset  len     flags
    page-types.c Inode: 15108680 Size: 30953 (8 pages)
    Modify: Sat Oct  2 23:11:20 2021 (2399 seconds ago)
    Access: Sat Oct  2 23:11:28 2021 (2391 seconds ago)
    0       d9f59e  1       ___U_lA____________________________________
    1       1031eb5 1       __RU_l_____________________________________
    2       13bf717 1       __RU_l_____________________________________
    3       13ac333 1       ___U_lA____________________________________
    4       d9f59f  1       __RU_l_____________________________________
    5       183fd49 1       ___U_lA____________________________________
    6       13cbf69 1       ___U_lA____________________________________
    7       d9ef05  1       ___U_lA____________________________________

                 flags      page-count       MB  symbolic-flags                     long-symbolic-flags
    0x000000000000002c               3        0  __RU_l_____________________________________        referenced,uptodate,lru
    0x0000000000000068               5        0  ___U_lA____________________________________        uptodate,lru,active
                 total               8        0

With this patch file info is printed out in summary part like below:

    $ ./page-types -f page-types.c -l
    foffset offset  len     flags
    0       d9f59e  1       ___U_lA_____________________________________
    1       1031eb5 1       __RU_l______________________________________
    2       13bf717 1       __RU_l______________________________________
    3       13ac333 1       ___U_lA_____________________________________
    4       d9f59f  1       __RU_l______________________________________
    5       183fd49 1       ___U_lA_____________________________________
    6       13cbf69 1       ___U_lA_____________________________________

    page-types.c Inode: 15108680 Size: 30953 (8 pages)
    Modify: Sat Oct  2 23:11:20 2021 (2435 seconds ago)
    Access: Sat Oct  2 23:11:28 2021 (2427 seconds ago)

                 flags      page-count       MB  symbolic-flags                     long-symbolic-flags
    0x000000000000002c               3        0  __RU_l______________________________________       referenced,uptodate,lru
    0x0000000000000068               4        0  ___U_lA_____________________________________       uptodate,lru,active
                 total               7        0

Link: https://lkml.kernel.org/r/20211004061325.1525902-3-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Bin Wang <wangbin224@huawei.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:40 -07:00
Naoya Horiguchi
a62f5ecbfb tools/vm/page-types.c: make walk_file() aware of address range option
Patch series "tools/vm/page-types.c: a few improvements".

This patchset adds some improvements on tools/vm/page-types.c.  Patch
1/3 makes -a option (specify address range) work with -f (file cache
mode).  Patch 2/3 and 3/3 are to fix minor formatting issues of this
tool.  These would make life a little easier for the users of this tool.

Please see individual patches for more details about specific issues.

This patch (of 3):

-a|--addr option is used to limit the range of address to be scanned for
page status.  It works now for physical address space (dafult mode) or for
virtual address space (with -p option), but not for file address space
(with -f option).  So make walk_file() aware of -a option.

Link: https://lkml.kernel.org/r/20211004061325.1525902-1-naoya.horiguchi@linux.dev
Link: https://lkml.kernel.org/r/20211004061325.1525902-2-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Bin Wang <wangbin224@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:40 -07:00
Zhenliang Wei
f7df2b1cf0 tools/vm/page_owner_sort.c: count and sort by mem
When viewing page owner information, we may be more concerned about the
total memory rather than the times of stack appears.  Therefore, the
following adjustments are made:

1. Added the statistics on the total number of pages.

2. Added the optional parameter "-m" to configure the program to sort by
   memory (total pages).

The general output of page_owner is as follows:

	Page allocated via order XXX, ...
	PFN XXX ...
	 // Detailed stack

	Page allocated via order XXX, ...
	PFN XXX ...
	 // Detailed stack

The original page_owner_sort ignores PFN rows, puts the remaining rows
in buf, counts the times of buf, and finally sorts them according to the
times.  General output:

	XXX times:
	Page allocated via order XXX, ...
	 // Detailed stack

Now, we use regexp to extract the page order value from the buf, and
count the total pages for the buf.  General output:

	XXX times, XXX pages:
	Page allocated via order XXX, ...
	 // Detailed stack

By default, it is still sorted by the times of buf; If you want to sort
by the pages nums of buf, use the new -m parameter.

Link: https://lkml.kernel.org/r/1631678242-41033-1-git-send-email-weizhenliang@huawei.com
Signed-off-by: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:40 -07:00
Axel Rasmussen
ad0ce23ed0 userfaultfd/selftests: fix calculation of expected ioctls
Today, we assert that the ioctls the kernel reports as supported for a
registration match a precomputed list.  We decide which ioctls are
supported by examining the memory type.  Then, in several locations we
"fix up" this list by adding or removing things this initial decision
got wrong.

What ioctls the kernel reports is actually a function of several things:
- The memory type
- Kernel feature support (e.g., no writeprotect on aarch64)
- The registration type (e.g., CONTINUE only supported for MINOR mode)

So, we can't fully compute this at the start, in set_test_type.  It
varies per test, depending on what registration mode(s) those tests use.

Instead, introduce a new function which computes the correct list.  This
centralizes the add/remove of ioctls depending on these function inputs
in one place, so we don't have to repeat ourselves in various tests.

Not only is the resulting code a bit shorter, but it fixes a real bug in
the existing code: previously, we would incorrectly require the
writeprotect ioctl to be present on aarch64, where it isn't actually
supported.

Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Axel Rasmussen
1042a53d0e userfaultfd/selftests: fix feature support detection
Before any tests are run, in set_test_type, we decide what feature(s) we
are going to be testing, based upon our command line arguments.
However, the supported features are not just a function of the memory
type being used, so this is broken.

For instance, consider writeprotect support.  It is "normally" supported
for anonymous memory, but furthermore it requires that the kernel has
CONFIG_HAVE_ARCH_USERFAULTFD_WP.  So, it is *not* supported at all on
aarch64, for example.

So, this fixes this by querying the kernel for the set of features it
supports in set_test_type, by opening a userfaultfd and issuing a
UFFDIO_API ioctl.  Based upon the reported features, we toggle what
tests are enabled.

Link: https://lkml.kernel.org/r/20210930212309.4001967-3-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Axel Rasmussen
1c10e674b3 userfaultfd/selftests: don't rely on GNU extensions for random numbers
Patch series "Small userfaultfd selftest fixups", v2.

This patch (of 3):

Two arguments for doing this:

First, and maybe most importantly, the resulting code is significantly
shorter / simpler.

Then, we avoid using GNU libc extensions.  Why does this matter? It
makes testing userfaultfd with the selftest easier e.g.  on distros
which use something other than glibc (e.g., Alpine, which uses musl);
basically, it makes the test more portable.

Link: https://lkml.kernel.org/r/20210930212309.4001967-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Ran Jianping
b65c23f72e mm: remove duplicate include in hugepage-mremap.c
Remove duplicate includes 'unistd.h' included in
 '/tools/testing/selftests/vm/hugepage-mremap.c'  is duplicated.It is also
 included on 23 line.

Link: https://lkml.kernel.org/r/20211018102336.869726-1-ran.jianping@zte.com.cn
Signed-off-by: Ran Jianping <ran.jianping@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Mina Almasry
12b6132064 mm, hugepages: add hugetlb vma mremap() test
[almasrymina@google.com: v8]
  Link: https://lkml.kernel.org/r/20211014200542.4126947-2-almasrymina@google.com
[wanjiabing@vivo.com: remove duplicated include in hugepage-mremap]
  Link: https://lkml.kernel.org/r/20211021122944.8857-1-wanjiabing@vivo.com

Link: https://lkml.kernel.org/r/20211013195825.3058275-2-almasrymina@google.com
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Martin KaFai Lau
d99341b373 bpf: selftest: Trigger a DCE on the whole subprog
This patch adds a test to trigger the DCE to remove
the whole subprog to ensure the verifier  does not
depend on a stable subprog index.  The DCE is done
by testing a global const.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211106014020.651638-1-kafai@fb.com
2021-11-06 12:54:12 -07:00
Arnaldo Carvalho de Melo
7f9f879243 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up some tools/perf/ patches that went via tip/perf/core, such
as:

  tools/perf: Add mem_hops field in perf_mem_data_src structure

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-06 15:49:33 -03:00
Jakub Kicinski
9bea6aa498 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-11-05

We've added 15 non-merge commits during the last 3 day(s) which contain
a total of 14 files changed, 199 insertions(+), 90 deletions(-).

The main changes are:

1) Fix regression from stack spill/fill of <8 byte scalars, from Martin KaFai Lau.

2) Fix perf's build of bpftool's bootstrap version due to missing libbpf
   headers, from Quentin Monnet.

3) Fix riscv{32,64} BPF exception tables build errors and warnings, from Björn Töpel.

4) Fix bpf fs to allow RENAME_EXCHANGE support for atomic upgrades on sk_lookup
   control planes, from Lorenz Bauer.

5) Fix libbpf's error reporting in bpf_map_lookup_and_delete_elem_flags() due to
   missing libbpf_err_errno(), from Mehrdad Arshad Rad.

6) Various fixes to make xdp_redirect_multi selftest more reliable, from Hangbin Liu.

7) Fix netcnt selftest to make it run serial and thus avoid conflicts with other
   cgroup/skb selftests run in parallel that could cause flakes, from Andrii Nakryiko.

8) Fix reuseport_bpf_numa networking selftest to skip unavailable NUMA nodes,
   from Kleber Sacilotto de Souza.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  riscv, bpf: Fix RV32 broken build, and silence RV64 warning
  selftests/bpf/xdp_redirect_multi: Limit the tests in netns
  selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
  selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
  selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
  libbpf: Fix lookup_and_delete_elem_flags error reporting
  bpftool: Install libbpf headers for the bootstrap version, too
  selftests/net: Fix reuseport_bpf_numa by skipping unavailable nodes
  selftests/bpf: Verifier test on refill from a smaller spill
  bpf: Do not reject when the stack read size is different from the tracked scalar size
  selftests/bpf: Make netcnt selftests serial to avoid spurious failures
  selftests/bpf: Test RENAME_EXCHANGE and RENAME_NOREPLACE on bpffs
  selftests/bpf: Convert test_bpffs to ASSERT macros
  libfs: Support RENAME_EXCHANGE in simple_rename()
  libfs: Move shmem_exchange to simple_rename_exchange
====================

Link: https://lore.kernel.org/r/20211105165803.29372-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-05 14:13:40 -07:00
Hangbin Liu
8955c1a329 selftests/bpf/xdp_redirect_multi: Limit the tests in netns
As I want to test both DEVMAP and DEVMAP_HASH in XDP multicast redirect, I
limited DEVMAP max entries to a small value for performace. When the test
runs after amount of interface creating/deleting tests. The interface index
will exceed the map max entries and xdp_redirect_multi will error out with
"Get interfacesInterface index to large".

Fix this issue by limit the tests in netns and specify the ifindex when
creating interfaces.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Reported-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-5-liuhangbin@gmail.com
2021-11-05 16:42:56 +01:00
Hangbin Liu
648c367706 selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
No need to kill tcpdump with -9.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-4-liuhangbin@gmail.com
2021-11-05 16:38:26 +01:00
Hangbin Liu
f53ea9dbf7 selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
The arp request number triggered by ping none exist address is not accurate,
which may lead the test false negative/positive. Change to use arping to
accurate the arp number. Also do not use grep pattern match for dot.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-3-liuhangbin@gmail.com
2021-11-05 16:38:02 +01:00
Hangbin Liu
8b4ac13abe selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
The xdp_redirect_multi test logs are created in selftest folder and not cleaned
after test. Let's creat a tmp dir and remove the logs after testing.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-2-liuhangbin@gmail.com
2021-11-05 16:37:51 +01:00
Mehrdad Arshad Rad
64165ddf8e libbpf: Fix lookup_and_delete_elem_flags error reporting
Fix bpf_map_lookup_and_delete_elem_flags() to pass the return code through
libbpf_err_errno() as we do similarly in bpf_map_lookup_and_delete_elem().

Fixes: f12b654327 ("libbpf: Streamline error reporting for low-level APIs")
Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211104171354.11072-1-arshad.rad@gmail.com
2021-11-05 16:26:08 +01:00
Quentin Monnet
e41ac2020b bpftool: Install libbpf headers for the bootstrap version, too
We recently changed bpftool's Makefile to make it install libbpf's
headers locally instead of pulling them from the source directory of the
library. Although bpftool needs two versions of libbpf, a "regular" one
and a "bootstrap" version, we would only install headers for the regular
libbpf build. Given that this build always occurs before the bootstrap
build when building bpftool, this is enough to ensure that the bootstrap
bpftool will have access to the headers exported through the regular
libbpf build.

However, this did not account for the case when we only want the
bootstrap version of bpftool, through the "bootstrap" target. For
example, perf needs the bootstrap version only, to generate BPF
skeletons. In that case, when are the headers installed? For some time,
the issue has been masked, because we had a step (the installation of
headers internal to libbpf) which would depend on the regular build of
libbpf and hence trigger the export of the headers, just for the sake of
creating a directory. But this changed with commit 8b6c46241c
("bpftool: Remove Makefile dep. on $(LIBBPF) for
$(LIBBPF_INTERNAL_HDRS)"), where we cleaned up that stage and removed
the dependency on the regular libbpf build. As a result, when we only
want the bootstrap bpftool version, the regular libbpf is no longer
built. The bootstrap libbpf version is built, but headers are not
exported, and the bootstrap bpftool build fails because of the missing
headers.

To fix this, we also install the library headers for the bootstrap
version of libbpf, to use them for the bootstrap bpftool and for
generating the skeletons.

Fixes: f012ade10b ("bpftool: Install libbpf headers instead of including the dir")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/bpf/20211105015813.6171-1-quentin@isovalent.com
2021-11-05 16:19:48 +01:00
Linus Torvalds
5c0b0c676a powerpc updates for 5.16
- Enable STRICT_KERNEL_RWX for Freescale 85xx platforms.
 
  - Activate CONFIG_STRICT_KERNEL_RWX by default, while still allowing it to be disabled.
 
  - Add support for out-of-line static calls on 32-bit.
 
  - Fix oopses doing bpf-to-bpf calls when STRICT_KERNEL_RWX is enabled.
 
  - Fix boot hangs on e5500 due to stale value in ESR passed to do_page_fault().
 
  - Fix several bugs on pseries in handling of device tree cache information for hotplugged
    CPUs, and/or during partition migration.
 
  - Various other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Alistair Popple, Anatolij Gustschin, Andrew Donnellan,
 Athira Rajeev, Bixuan Cui, Bjorn Helgaas, Cédric Le Goater, Christophe Leroy, Daniel
 Axtens, Daniel Henrique Barboza, Denis Kirjanov, Fabiano Rosas, Frederic Barrat, Gustavo
 A. R. Silva, Hari Bathini, Jacques de Laval, Joel Stanley, Kai Song, Kajol Jain, Laurent
 Vivier, Leonardo Bras, Madhavan Srinivasan, Nathan Chancellor, Nathan Lynch, Naveen N.
 Rao, Nicholas Piggin, Nick Desaulniers, Niklas Schnelle, Oliver O'Halloran, Rob Herring,
 Russell Currey, Srikar Dronamraju, Stan Johnson, Tyrel Datwyler, Uwe Kleine-König, Vasant
 Hegde, Wan Jiabing, Xiaoming Ni,
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmGFDPoTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgNbsEACVczMVwMBEny5a7W1tqq1bnY9RFVw3
 K+/rE7/FpSLX+RrwgoMkmqfPvfyc9WISVLlIQDKz4XhkBjaXv0+Y4OMsymuDCbxL
 Qk7F1ff22UfLmPjjJk39gHJ8QZQqZk3wmFu2QzTO4yBZbz2SqqXFLxwyoLpZ0LJ8
 pdGl51+bIsTkDJzrdkhX9X4AKS/fYyjbQxq5u7FS89ZCCs+KvzjLcDRo0GZYaOK/
 hgDBa60DCCszL/9yjbh0ANZxmM2Z3+6AXkvAAXrtXzIGk4JzenZfiV+VEzmq8Tt0
 UpWSsUEe7VgykMR3MTrL9G8op70PpKX6OMUPegJq4iRQ24h4mpFCK4oV9OMKJqpF
 ifN9NO2ZZKOz1ke4l7Xe8SEHLX7rq5U/P7INh3AsKYNYwo6HkJhSPxiCBWUTlnZ3
 OYoZ7czyO4gMPHWP92z4CoSiTYVBFuyhYexRcnQskg60TIwbr+lMXziRyPRGI8b6
 taf2rD8eAiyUJnvkFUsyAHtYHpkSkuMeiVqY2CDQdh2SdtIFgwKzB2RjFL0gzaBZ
 XP9RWD+HernGQAJSlIk7cVthont3JHklcKk+ohhDbsWzPeUJcz6t4ChtgRq0x4q4
 Hpes1lsVfXpyxj5ouBK/E/t+diwPvBLM0dCcarQJE6ExgMzBC/C7Br7jCSgyVJA2
 VhtcZaCYh+vRlQ==
 =f7HE
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Enable STRICT_KERNEL_RWX for Freescale 85xx platforms.

 - Activate CONFIG_STRICT_KERNEL_RWX by default, while still allowing it
   to be disabled.

 - Add support for out-of-line static calls on 32-bit.

 - Fix oopses doing bpf-to-bpf calls when STRICT_KERNEL_RWX is enabled.

 - Fix boot hangs on e5500 due to stale value in ESR passed to
   do_page_fault().

 - Fix several bugs on pseries in handling of device tree cache
   information for hotplugged CPUs, and/or during partition migration.

 - Various other small features and fixes.

Thanks to Alexey Kardashevskiy, Alistair Popple, Anatolij Gustschin,
Andrew Donnellan, Athira Rajeev, Bixuan Cui, Bjorn Helgaas, Cédric Le
Goater, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Denis
Kirjanov, Fabiano Rosas, Frederic Barrat, Gustavo A.  R.  Silva, Hari
Bathini, Jacques de Laval, Joel Stanley, Kai Song, Kajol Jain, Laurent
Vivier, Leonardo Bras, Madhavan Srinivasan, Nathan Chancellor, Nathan
Lynch, Naveen N.  Rao, Nicholas Piggin, Nick Desaulniers, Niklas
Schnelle, Oliver O'Halloran, Rob Herring, Russell Currey, Srikar
Dronamraju, Stan Johnson, Tyrel Datwyler, Uwe Kleine-König, Vasant
Hegde, Wan Jiabing, and Xiaoming Ni,

* tag 'powerpc-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (73 commits)
  powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
  powerpc/32e: Ignore ESR in instruction storage interrupt handler
  powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload
  powerpc: Don't provide __kernel_map_pages() without ARCH_SUPPORTS_DEBUG_PAGEALLOC
  MAINTAINERS: Update powerpc KVM entry
  powerpc/xmon: fix task state output
  powerpc/44x/fsp2: add missing of_node_put
  powerpc/dcr: Use cmplwi instead of 3-argument cmpli
  KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling
  powerpc/security: Use a mutex for interrupt exit code patching
  powerpc/83xx/mpc8349emitx: Make mcu_gpiochip_remove() return void
  powerpc/fsl_booke: Fix setting of exec flag when setting TLBCAMs
  powerpc/book3e: Fix set_memory_x() and set_memory_nx()
  powerpc/nohash: Fix __ptep_set_access_flags() and ptep_set_wrprotect()
  powerpc/bpf: Fix write protecting JIT code
  selftests/powerpc: Use date instead of EPOCHSECONDS in mitigation-patching.sh
  powerpc/64s/interrupt: Fix check_return_regs_valid() false positive
  powerpc/boot: Set LC_ALL=C in wrapper script
  powerpc/64s: Default to 64K pages for 64 bit book3s
  Revert "powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC"
  ...
2021-11-05 08:15:46 -07:00
Linus Torvalds
5c904c66ed Char/Misc driver update for 5.16-rc1
Here is the big set of char and misc and other tiny driver subsystem
 updates for 5.16-rc1.
 
 Loads of things in here, all of which have been in linux-next for a
 while with no reported problems (except for one called out below.)
 
 Included are:
 	- habanana labs driver updates, including dma_buf usage,
 	  reviewed and acked by the dma_buf maintainers
 	- iio driver update (going through this tree not staging as they
 	  really do not belong going through that tree anymore)
 	- counter driver updates
 	- hwmon driver updates that the counter drivers needed, acked by
 	  the hwmon maintainer
 	- xillybus driver updates
 	- binder driver updates
 	- extcon driver updates
 	- dma_buf module namespaces added (will cause a build error in
 	  arm64 for allmodconfig, but that change is on its way through
 	  the drm tree)
 	- lkdtm driver updates
 	- pvpanic driver updates
 	- phy driver updates
 	- virt acrn and nitr_enclaves driver updates
 	- smaller char and misc driver updates
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPX2A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymUUgCbB4EKysgLuXYdjUalZDx+vvZO4k0AniS14O4k
 F+2dVSZ5WX6wumUzCaA6
 =bXQM
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char and misc and other tiny driver subsystem
  updates for 5.16-rc1.

  Loads of things in here, all of which have been in linux-next for a
  while with no reported problems (except for one called out below.)

  Included are:

   - habanana labs driver updates, including dma_buf usage, reviewed and
     acked by the dma_buf maintainers

   - iio driver update (going through this tree not staging as they
     really do not belong going through that tree anymore)

   - counter driver updates

   - hwmon driver updates that the counter drivers needed, acked by the
     hwmon maintainer

   - xillybus driver updates

   - binder driver updates

   - extcon driver updates

   - dma_buf module namespaces added (will cause a build error in arm64
     for allmodconfig, but that change is on its way through the drm
     tree)

   - lkdtm driver updates

   - pvpanic driver updates

   - phy driver updates

   - virt acrn and nitr_enclaves driver updates

   - smaller char and misc driver updates"

* tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits)
  comedi: dt9812: fix DMA buffers on stack
  comedi: ni_usb6501: fix NULL-deref in command paths
  arm64: errata: Enable TRBE workaround for write to out-of-range address
  arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  coresight: trbe: Work around write to out of range
  coresight: trbe: Make sure we have enough space
  coresight: trbe: Add a helper to determine the minimum buffer size
  coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  coresight: trbe: Add infrastructure for Errata handling
  coresight: trbe: Allow driver to choose a different alignment
  coresight: trbe: Decouple buffer base from the hardware base
  coresight: trbe: Add a helper to pad a given buffer area
  coresight: trbe: Add a helper to calculate the trace generated
  coresight: trbe: Defer the probe on offline CPUs
  coresight: trbe: Fix incorrect access of the sink specific data
  coresight: etm4x: Add ETM PID for Kryo-5XX
  coresight: trbe: Prohibit trace before disabling TRBE
  coresight: trbe: End the AUX handle on truncation
  coresight: trbe: Do not truncate buffer on IRQ
  coresight: trbe: Fix handling of spurious interrupts
  ...
2021-11-04 08:21:47 -07:00
Kleber Sacilotto de Souza
a38bc45a08 selftests/net: Fix reuseport_bpf_numa by skipping unavailable nodes
In some platforms the numa node numbers are not necessarily consecutive,
meaning that not all nodes from 0 to the value returned by numa_max_node()
are available on the system. Using node numbers which are not available
results on errors from libnuma such as:

  ---- IPv4 UDP ----
  send node 0, receive socket 0
  libnuma: Warning: Cannot read node cpumask from sysfs
  ./reuseport_bpf_numa: failed to pin to node: No such file or directory

Fix it by checking if the node number bit is set on numa_nodes_ptr, which
is defined on libnuma as "Set with all nodes the kernel has exposed to
userspace".

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211101145317.286118-1-kleber.souza@canonical.com
2021-11-04 16:21:13 +01:00
Ian Rogers
32f7aa2731 perf clang: Fixes for more recent LLVM/clang
The parameters to two functions and the location of a variable have
changed in more recent LLVM/clang releases.

Remove the unneecessary -fmessage-length and -ferror-limit flags, the
former causes failures like:

  58: builtin clang support                                           :
  58.1: builtin clang compile C source to IR                          :
  --- start ---
  test child forked, pid 279307
  error: unknown argument: '-fmessage-length'
  1 error generated.
  test child finished with -1

Tested with LLVM 6, 8, 9, 10 and 11.

Reviewed-by: Fangrui Song <maskray@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>,
Cc: llvm@lists.linux.dev
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-04 09:32:00 -03:00
Ian Rogers
d0d0f0c124 tools: Bump minimum LLVM C++ std to GNU++14
LLVM 9 (current release is LLVM 13) moved the minimum C++ version to
GNU++14. Bump the version numbers in the feature test and perf build.

Reviewed-by: Fangrui Song <maskray@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20211012021321.291635-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-04 09:31:30 -03:00
Andrea Righi
a985442fde selftests: net: properly support IPv6 in GSO GRE test
Explicitly pass -6 to netcat when the test is using IPv6 to prevent
failures.

Also make sure to pass "-N" to netcat to close the socket after EOF on
the client side, otherwise we would always hit the timeout and the test
would fail.

Without this fix applied:

 TEST: GREv6/v4 - copy file w/ TSO                                   [FAIL]
 TEST: GREv6/v4 - copy file w/ GSO                                   [FAIL]
 TEST: GREv6/v6 - copy file w/ TSO                                   [FAIL]
 TEST: GREv6/v6 - copy file w/ GSO                                   [FAIL]

With this fix applied:

 TEST: GREv6/v4 - copy file w/ TSO                                   [ OK ]
 TEST: GREv6/v4 - copy file w/ GSO                                   [ OK ]
 TEST: GREv6/v6 - copy file w/ TSO                                   [ OK ]
 TEST: GREv6/v6 - copy file w/ GSO                                   [ OK ]

Fixes: 025efa0a82 ("selftests: add simple GSO GRE test")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-04 11:23:57 +00:00
Andrii Nakryiko
be2f2d1680 libbpf: Deprecate bpf_program__load() API
Mark bpf_program__load() as deprecated ([0]) since v0.6. Also rename few
internal program loading bpf_object helper functions to have more
consistent naming.

  [0] Closes: https://github.com/libbpf/libbpf/issues/301

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103051449.1884903-1-andrii@kernel.org
2021-11-03 14:29:56 -07:00
Andrii Nakryiko
b7332d2820 libbpf: Improve ELF relo sanitization
Add few sanity checks for relocations to prevent div-by-zero and
out-of-bounds array accesses in libbpf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-6-andrii@kernel.org
2021-11-03 13:25:37 -07:00
Andrii Nakryiko
0d6988e16a libbpf: Fix section counting logic
e_shnum does include section #0 and as such is exactly the number of ELF
sections that we need to allocate memory for to use section indices as
array indices. Fix the off-by-one error.

This is purely accounting fix, previously we were overallocating one
too many array items. But no correctness errors otherwise.

Fixes: 25bbbd7a44 ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-5-andrii@kernel.org
2021-11-03 13:25:37 -07:00
Andrii Nakryiko
62554d52e7 libbpf: Validate that .BTF and .BTF.ext sections contain data
.BTF and .BTF.ext ELF sections should have SHT_PROGBITS type and contain
data. If they are not, ELF is invalid or corrupted, so bail out.
Otherwise this can lead to data->d_buf being NULL and SIGSEGV later on.
Reported by oss-fuzz project.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-4-andrii@kernel.org
2021-11-03 13:25:37 -07:00
Andrii Nakryiko
88918dc12d libbpf: Improve sanity checking during BTF fix up
If BTF is corrupted DATASEC's variable type ID might be incorrect.
Prevent this easy to detect situation with extra NULL check.
Reported by oss-fuzz project.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-3-andrii@kernel.org
2021-11-03 13:25:37 -07:00
Andrii Nakryiko
833907876b libbpf: Detect corrupted ELF symbols section
Prevent divide-by-zero if ELF is corrupted and has zero sh_entsize.
Reported by oss-fuzz project.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-2-andrii@kernel.org
2021-11-03 13:25:37 -07:00
Kees Cook
1e6d69c7b9 selftests/seccomp: Report event mismatches more clearly
When running under tracer, more explicitly report the status and event
mismatches to help with debugging. Additionally add an "immediate kill"
test when under tracing to verify that fatal SIGSYS behaves the same
under ptrace or seccomp tracing.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://lore.kernel.org/r/20211103163039.2104830-3-keescook@chromium.org
2021-11-03 12:02:07 -07:00
Kees Cook
48d5fd0645 selftests/seccomp: Stop USER_NOTIF test if kcmp() fails
If kcmp() fails during the USER_NOTIF test, the test is likely to hang,
so switch from EXPECT to ASSERT.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://lore.kernel.org/r/20211103163039.2104830-2-keescook@chromium.org
2021-11-03 12:02:07 -07:00
Dave Marchevsky
f5aafbc2af libbpf: Deprecate bpf_program__get_prog_info_linear
As part of the road to libbpf 1.0, and discussed in libbpf issue tracker
[0], bpf_program__get_prog_info_linear and its associated structs and
helper functions should be deprecated. The functionality is too specific
to the needs of 'perf', and there's little/no out-of-tree usage to
preclude introduction of a more general helper in the future.

  [0] Closes: https://github.com/libbpf/libbpf/issues/313

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211101224357.2651181-5-davemarchevsky@fb.com
2021-11-03 11:25:36 -07:00
Dave Marchevsky
199e06fe83 perf: Pull in bpf_program__get_prog_info_linear
To prepare for impending deprecation of libbpf's
bpf_program__get_prog_info_linear, pull in the function and associated
helpers into the perf codebase and migrate existing uses to the perf
copy.

Since libbpf's deprecated definitions will still be visible to perf, it
is necessary to rename perf's definitions.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211101224357.2651181-4-davemarchevsky@fb.com
2021-11-03 11:25:36 -07:00
Dave Marchevsky
c59765cfd1 bpftool: Use bpf_obj_get_info_by_fd directly
To prepare for impending deprecation of libbpf's
bpf_program__get_prog_info_linear, migrate uses of this function to use
bpf_obj_get_info_by_fd.

Since the profile_target_name and dump_prog_id_as_func_ptr helpers were
only looking at the first func_info, avoid grabbing the rest to save a
malloc. For do_dump, add a more full-featured helper, but avoid
free/realloc of buffer when possible for multi-prog dumps.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211101224357.2651181-3-davemarchevsky@fb.com
2021-11-03 11:25:32 -07:00
Dave Marchevsky
60f2707539 bpftool: Migrate -1 err checks of libbpf fn calls
Per [0], callers of libbpf functions with LIBBPF_STRICT_DIRECT_ERRS set
should handle negative error codes of various values (e.g. -EINVAL).
Migrate two callsites which were explicitly checking for -1 only to
handle the new scheme.

  [0]: https://github.com/libbpf/libbpf/wiki/Libbpf-1.0-migration-guide#direct-error-code-returning-libbpf_strict_direct_errs

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211101224357.2651181-2-davemarchevsky@fb.com
2021-11-03 11:22:30 -07:00
Linus Torvalds
e1fd0b2acd Second set of tracing updates for 5.16:
- osnoise and timerlat updates that will work with the RTLA tool (Real-Time
   Linux Analysis). Specifically it disconnects the work load (threads
   that look for latency) from the tracing instances attached to them,
   allowing for more than one instance to retrieve data from the work load.
 
 - Optimization on division in the trace histogram trigger code to use shift
   and multiply when possible. Also added documentation.
 
 - Fix prototype to my_direct_func in direct ftrace trampoline sample code.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYKWXxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqJEAP9czpSZ/nFvDjxdGHZAcKKXCFWbGcK5
 IF2cHDDwxXjZ/gD+NnpRhR1JPfA55fO52DUJPn2cOU5xOsP6DmJxu6mwDg0=
 =AKVv
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:

 - osnoise and timerlat updates that will work with the RTLA tool
   (Real-Time Linux Analysis).

   Specifically it disconnects the work load (threads that look for
   latency) from the tracing instances attached to them, allowing for
   more than one instance to retrieve data from the work load.

 - Optimization on division in the trace histogram trigger code to use
   shift and multiply when possible. Also added documentation.

 - Fix prototype to my_direct_func in direct ftrace trampoline sample
   code.

* tag 'trace-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace/samples: Add missing prototype for my_direct_func
  tracing/selftests: Add tests for hist trigger expression parsing
  tracing/histogram: Document hist trigger variables
  tracing/histogram: Update division by 0 documentation
  tracing/histogram: Optimize division by constants
  tracing/osnoise: Remove PREEMPT_RT ifdefs from inside functions
  tracing/osnoise: Remove STACKTRACE ifdefs from inside functions
  tracing/osnoise: Allow multiple instances of the same tracer
  tracing/osnoise: Remove TIMERLAT ifdefs from inside functions
  tracing/osnoise: Support a list of trace_array *tr
  tracing/osnoise: Use start/stop_per_cpu_kthreads() on osnoise_cpus_write()
  tracing/osnoise: Split workload start from the tracer start
  tracing/osnoise: Improve comments about barrier need for NMI callbacks
  tracing/osnoise: Do not follow tracing_cpumask
2021-11-03 09:08:47 -07:00
Martin KaFai Lau
c08455dec5 selftests/bpf: Verifier test on refill from a smaller spill
This patch adds a verifier test to ensure the verifier can read 8 bytes
from the stack after two 32bit write at fp-4 and fp-8. The test is similar
to the reported case from bcc [0].

  [0] https://github.com/iovisor/bcc/pull/3683

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211102064541.316414-1-kafai@fb.com
2021-11-03 15:55:43 +01:00
Andrii Nakryiko
401a33da3a selftests/bpf: Make netcnt selftests serial to avoid spurious failures
When running `./test_progs -j` test_netcnt fails with a very high
probability, undercounting number of packets received (9999 vs expected
10000). It seems to be conflicting with other cgroup/skb selftests. So
make it serial for now to make parallel mode more robust.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211103054113.2130582-1-andrii@kernel.org
2021-11-03 15:43:09 +01:00
Lorenz Bauer
7e5ad817ec selftests/bpf: Test RENAME_EXCHANGE and RENAME_NOREPLACE on bpffs
Add tests to exercise the behaviour of RENAME_EXCHANGE and RENAME_NOREPLACE
on bpffs. The former checks that after an exchange the inode of two
directories has changed. The latter checks that the source still exists
after a failed rename. Generally, having support for renameat2(RENAME_EXCHANGE)
in bpffs fixes atomic upgrades of our sk_lookup control plane.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028094724.59043-5-lmb@cloudflare.com
2021-11-03 15:43:08 +01:00
Lorenz Bauer
9fc23c22e5 selftests/bpf: Convert test_bpffs to ASSERT macros
Remove usage of deprecated CHECK macros.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028094724.59043-4-lmb@cloudflare.com
2021-11-03 15:43:08 +01:00
Hangbin Liu
17b67370c3 kselftests/net: add missed toeplitz.sh/toeplitz_client.sh to Makefile
When generating the selftests to another folder, the toeplitz.sh
and toeplitz_client.sh are missing as they are not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Making them under TEST_PROGS_EXTENDED as they test NIC hardware features
and are not intended to be run from kselftests.

Fixes: 5ebfb4cc30 ("selftests/net: toeplitz test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
8883deb50e kselftests/net: add missed vrf_strict_mode_test.sh test to Makefile
When generating the selftests to another folder, the
vrf_strict_mode_test.sh test will miss as it is not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 8735e6eaa4 ("selftests: add selftest for the VRF strict mode")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
653e7f19b4 kselftests/net: add missed SRv6 tests
When generating the selftests to another folder, the SRv6 tests are
missing as they are not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 03a0b567a0 ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
Fixes: 2195444e09 ("selftests: add selftest for the SRv6 End.DT4 behavior")
Fixes: 2bc035538e ("selftests: add selftest for the SRv6 End.DT6 (VRF) behavior")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
b99ac18411 kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile
When generating the selftests to another folder, the include file
setup_loopback.sh/setup_veth.sh for gro.sh/gre_gro.sh are missing as
they are not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Fixes: 9af771d2ec ("selftests/net: allow GRO coalesce test on veth")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
ca3676f94b kselftests/net: add missed icmp.sh test to Makefile
When generating the selftests to another folder, the icmp.sh test will
miss as it is not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 7e9838b791 ("selftests/net: Add icmp.sh for testing ICMP dummy address responses")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Linus Torvalds
313b6ffc8e linux-kselftest-kunit-5.16-rc1
This KUnit update for Linux 5.16-rc1 consist of several enhancements
 and fixes:
 
 - ability to run each test suite and test separately
 - support for timing test run
 - several fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmGBhpoACgkQCwJExA0N
 Qxx2xQ//bCql+nCJdowo2er7MgYV1Jw9bkjapriV7vaakHVBOQhMItMAD3lw2WNI
 SXuX4n4x4Ap4FMBd3gBQ3Flc1e7MdY/FHQIIcIE5+xDoU/ehl9ypbUMd7NrGOTI7
 KMN18TJawHHyMHjKz/fFFV5Bfi97YptQMy3VB/ujdgRCIF2/bPhra5F/rFYUwdSk
 GLql9CBPbomgaginzAuvnfPKoxWGjiiRZjNTlsXDyRihras0ezS7kfSYPLEV8f/i
 L/ydZK+Lc3QCrWCYadBeAtq/3hkb1pV7FGKfjypvEGhGcvqEQZW7irKxE4KoLB2D
 VFeVVZmK0WCfCFHtOQookTKPVwkgTzItwpXxl57ILMWo6FaB8O8tQGHqEvcF7Xfm
 NIgT/V0laiUWABpWWpQjf1jnY+X1qI+s+f8eZ2eI889AduBdlrNIfCRotYz2tB0i
 /GfnmVLp5pDIYZX0SxCERpu91297sDXSV+uw4mf9hjDs7189LmWppEpPENfjWebM
 geFlconrtrmnWx3E3Wvqxk8pEmJwChDYVAGUUhKFZXLxYO0YvWoBM4REb9v8sv3c
 0x3A1ZhpdeN5f7S6BHEAVKa3VrDHXLG9nZCK6wlaxYIiM2rOJDQN9h4BF5/RO9/2
 SebYQuMA95aQgAPXjJHYKQ9fdxrR3BOgFdn0Xfp/EShEWTl/usA=
 =UpDf
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:
 "Several enhancements and fixes:

   - ability to run each test suite and test separately

   - support for timing test run

   - several fixes and improvements"

* tag 'linux-kselftest-kunit-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: fix typecheck errors about loading qemu configs
  kunit: tool: continue past invalid utf-8 output
  kunit: Reset suite count after running tests
  kunit: tool: improve compatibility of kunit_parser with KTAP specification
  kunit: tool: yield output from run_kernel in real time
  kunit: tool: support running each suite/test separately
  kunit: tool: actually track how long it took to run tests
  kunit: tool: factor exec + parse steps into a function
  kunit: add 'kunit.action' param to allow listing out tests
  kunit: tool: show list of valid --arch options when invalid
  kunit: tool: misc fixes (unused vars, imports, leaked files)
  kunit: fix too small allocation when using suite-only kunit.filter_glob
  kunit: tool: allow filtering test cases via glob
  kunit: drop assumption in kunit-log-test about current suite
2021-11-02 22:06:20 -07:00
Linus Torvalds
84924e2e62 linux-kselftest-next-5.16-rc1
This Kselftest update for Linux 5.16-rc1 consists of fixes to compile
 time error and warnings.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmGBdeoACgkQCwJExA0N
 QxywbBAAuiQow4KPkTt39HGKazGrM6J6rVlxp2QJ+jN8xnIWHmwVO53qbcyED++X
 uC9ORwkKU8jr6sA9pL+uhFT4MauJO8+hlGVJ5A9edt1xer94EGOY41XAGmznHIhL
 KXtFJ/YbeMBZjjaFLiSePUxdOQhmHq4v3rWd9p/Do60PPhfAkLcob8DCt5W/CMXF
 FclEIZnaPlm0b/JSxljJDy4zl5QGDuTmL+Sk8ohtiWH6spwDfPs4eaDcxCjQryYH
 W5FcaBGo+ISh2xecl1Wa6RlmHOqmtM3WlIzT9vUTTbNGRR5b7titOepDz8kPQxtX
 Cz8swXfEI2ZVksXza7JJHkWPfKFw7BBFoUennuBq82FfjUlEaEv90YowIiXX0ibw
 INkYrB1e206tZdpcmS6BjcqqD/6bYf0Xwrz49950WyJYFLTJ3Fq3wUPK/jftS90+
 v8SDKEVnEalughVqzRDtdgM8WqXkDDZgCalMj6eH6nEuxaSLMoMm297y17LLpHPU
 4Bhqc2X9ObqLsEyLNmlbMw5qBg2xNRJQtTpGOEcf3S5eZ7eonnBnNhvtLZdKPHY4
 IMEybYyiiX27F+5eaVvRtNbLUKWXg9YAFLFsQ/7Gu8bcnShKH2ZpfW0saPihcOYH
 frGHyNMCZMW3y+CA3qGq5l7ddSYF8rUo5hIG8XwC1niRqyyuCWk=
 =j8/P
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-next-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:
 "Fixes to compile time errors and warnings"

* tag 'linux-kselftest-next-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/core: fix conflicting types compile error for close_range()
  selftests: x86: fix [-Wstringop-overread] warn in test_process_vm_readv()
  selftests: kvm: fix mismatched fclose() after popen()
2021-11-02 22:00:17 -07:00
Linus Torvalds
d7e0a795bf ARM:
* More progress on the protected VM front, now with the full
   fixed feature set as well as the limitation of some hypercalls
   after initialisation.
 
 * Cleanup of the RAZ/WI sysreg handling, which was pointlessly
   complicated
 
 * Fixes for the vgic placement in the IPA space, together with a
   bunch of selftests
 
 * More memcg accounting of the memory allocated on behalf of a guest
 
 * Timer and vgic selftests
 
 * Workarounds for the Apple M1 broken vgic implementation
 
 * KConfig cleanups
 
 * New kvmarm.mode=none option, for those who really dislike us
 
 RISC-V:
 * New KVM port.
 
 x86:
 * New API to control TSC offset from userspace
 
 * TSC scaling for nested hypervisors on SVM
 
 * Switch masterclock protection from raw_spin_lock to seqcount
 
 * Clean up function prototypes in the page fault code and avoid
 repeated memslot lookups
 
 * Convey the exit reason to userspace on emulation failure
 
 * Configure time between NX page recovery iterations
 
 * Expose Predictive Store Forwarding Disable CPUID leaf
 
 * Allocate page tracking data structures lazily (if the i915
 KVM-GT functionality is not compiled in)
 
 * Cleanups, fixes and optimizations for the shadow MMU code
 
 s390:
 * SIGP Fixes
 
 * initial preparations for lazy destroy of secure VMs
 
 * storage key improvements/fixes
 
 * Log the guest CPNC
 
 Starting from this release, KVM-PPC patches will come from
 Michael Ellerman's PPC tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGBOiEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNowwf/axlx3g9sgCwQHr12/6UF/7hL/RwP
 9z+pGiUzjl2YQE+RjSvLqyd6zXh+h4dOdOKbZDLSkSTbcral/8U70ojKnQsXM0XM
 1LoymxBTJqkgQBLm9LjYreEbzrPV4irk4ygEmuk3CPOHZu8xX1ei6c5LdandtM/n
 XVUkXsQY+STkmnGv4P3GcPoDththCr0tBTWrFWtxa0w9hYOxx0ay1AZFlgM4FFX0
 QFuRc8VBLoDJpIUjbkhsIRIbrlHc/YDGjuYnAU7lV/CIME8vf2BW6uBwIZJdYcDj
 0ejozLjodEnuKXQGnc8sXFioLX2gbMyQJEvwCgRvUu/EU7ncFm1lfs7THQ==
 =UxKM
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:

   - More progress on the protected VM front, now with the full fixed
     feature set as well as the limitation of some hypercalls after
     initialisation.

   - Cleanup of the RAZ/WI sysreg handling, which was pointlessly
     complicated

   - Fixes for the vgic placement in the IPA space, together with a
     bunch of selftests

   - More memcg accounting of the memory allocated on behalf of a guest

   - Timer and vgic selftests

   - Workarounds for the Apple M1 broken vgic implementation

   - KConfig cleanups

   - New kvmarm.mode=none option, for those who really dislike us

  RISC-V:

   - New KVM port.

  x86:

   - New API to control TSC offset from userspace

   - TSC scaling for nested hypervisors on SVM

   - Switch masterclock protection from raw_spin_lock to seqcount

   - Clean up function prototypes in the page fault code and avoid
     repeated memslot lookups

   - Convey the exit reason to userspace on emulation failure

   - Configure time between NX page recovery iterations

   - Expose Predictive Store Forwarding Disable CPUID leaf

   - Allocate page tracking data structures lazily (if the i915 KVM-GT
     functionality is not compiled in)

   - Cleanups, fixes and optimizations for the shadow MMU code

  s390:

   - SIGP Fixes

   - initial preparations for lazy destroy of secure VMs

   - storage key improvements/fixes

   - Log the guest CPNC

  Starting from this release, KVM-PPC patches will come from Michael
  Ellerman's PPC tree"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  RISC-V: KVM: fix boolreturn.cocci warnings
  RISC-V: KVM: remove unneeded semicolon
  RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions
  RISC-V: KVM: Factor-out FP virtualization into separate sources
  KVM: s390: add debug statement for diag 318 CPNC data
  KVM: s390: pv: properly handle page flags for protected guests
  KVM: s390: Fix handle_sske page fault handling
  KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol
  KVM: x86: On emulation failure, convey the exit reason, etc. to userspace
  KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info
  KVM: x86: Clarify the kvm_run.emulation_failure structure layout
  KVM: s390: Add a routine for setting userspace CPU state
  KVM: s390: Simplify SIGP Set Arch handling
  KVM: s390: pv: avoid stalls when making pages secure
  KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
  KVM: s390: pv: avoid double free of sida page
  KVM: s390: pv: add macros for UVC CC values
  s390/mm: optimize reset_guest_reference_bit()
  s390/mm: optimize set_guest_storage_key()
  s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present
  ...
2021-11-02 11:24:14 -07:00
Linus Torvalds
cc0356d6a0 - Do not #GP on userspace use of CLI/STI but pretend it was a NOP to
keep old userspace from breaking. Adjust the corresponding iopl selftest
 to that.
 
 - Improve stack overflow warnings to say which stack got overflowed and
 raise the exception stack sizes to 2 pages since overflowing the single
 page of exception stack is very easy to do nowadays with all the tracing
 machinery enabled. With that, rip out the custom mapping of AMD SEV's
 too.
 
 - A bunch of changes in preparation for FGKASLR like supporting more
 than 64K section headers in the relocs tool, correct ORC lookup table
 size to cover the whole kernel .text and other adjustments.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/uugACgkQEsHwGGHe
 VUroKw//e8BJ3Aun8bg00FHxfiMGbPYcozjLGDkaoMtMDZ8WlfCUrvtqYICEr8eB
 UU0eRyygAPI167dre1O9JvAcbilkNTKntaU6qbu/ZVyUwS3+Jkjwsotbqn3xKtkd
 QDDTDNiCU+beCJ2ZbspbrPgEh13+H0MwMHUfRxZB9Scpmo6aGSEaU3g295f6GX57
 VFGJ/LNov5MV1dTD7Pp/h6/Nb+R6WmflKcBzJmQxYuKyKX+g1xsSv0VSga+t+uf3
 M9pUkizqTiUxzC2eLgtcEZTqqBHu810E8M76FmhKBUMilsFJT5YAJTiqyahwHXds
 HYarOFRgcnFuJPd29vn8UHjqeeoi6ru8GtcZYzccEc7U3ku/gXPaDJ9ffmvhs7vU
 pJX5Um3GiiFm0w/ZZOKDqh78wRAsCKLN+jIoyszuhkkNchZSj/jKfOgdd3EmcZst
 6L6rxBA4oRHwNOgM7uVMp+jFeRe1/prR280OWWH0D4QmmuqybThOdO23Iuh/Deth
 W3qPUH3UQtfSWxGy2yODzJ1ciuGAr/AzJZ9zjg04e3Vl0DkEpyWtLKJiG3ClXZag
 Nj+3xc4xYH2Aw+M0HRaONk5XVKLpqVjuAfgU5iLQa0YSUbtrR+wCWvY8KgQNbAqK
 xZmzYzQ89stwVCuGKx10gPsL3jSJ3VCylMfqdHD2Ajmld1yApr0=
 =DOZU
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Borislav Petkov:

 - Do not #GP on userspace use of CLI/STI but pretend it was a NOP to
   keep old userspace from breaking. Adjust the corresponding iopl
   selftest to that.

 - Improve stack overflow warnings to say which stack got overflowed and
   raise the exception stack sizes to 2 pages since overflowing the
   single page of exception stack is very easy to do nowadays with all
   the tracing machinery enabled. With that, rip out the custom mapping
   of AMD SEV's too.

 - A bunch of changes in preparation for FGKASLR like supporting more
   than 64K section headers in the relocs tool, correct ORC lookup table
   size to cover the whole kernel .text and other adjustments.

* tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
  vmlinux.lds.h: Have ORC lookup cover entire _etext - _stext
  x86/boot/compressed: Avoid duplicate malloc() implementations
  x86/boot: Allow a "silent" kaslr random byte fetch
  x86/tools/relocs: Support >64K section headers
  x86/sev: Make the #VC exception stacks part of the default stacks storage
  x86: Increase exception stack sizes
  x86/mm/64: Improve stack overflow warnings
  x86/iopl: Fake iopl(3) CLI/STI usage
2021-11-02 07:56:47 -07:00
Linus Torvalds
fc02cb2b37 Core:
- Remove socket skb caches
 
  - Add a SO_RESERVE_MEM socket op to forward allocate buffer space
    and avoid memory accounting overhead on each message sent
 
  - Introduce managed neighbor entries - added by control plane and
    resolved by the kernel for use in acceleration paths (BPF / XDP
    right now, HW offload users will benefit as well)
 
  - Make neighbor eviction on link down controllable by userspace
    to work around WiFi networks with bad roaming implementations
 
  - vrf: Rework interaction with netfilter/conntrack
 
  - fq_codel: implement L4S style ce_threshold_ect1 marking
 
  - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
 
 BPF:
 
  - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
    as implemented in LLVM14
 
  - Introduce bpf_get_branch_snapshot() to capture Last Branch Records
 
  - Implement variadic trace_printk helper
 
  - Add a new Bloomfilter map type
 
  - Track <8-byte scalar spill and refill
 
  - Access hw timestamp through BPF's __sk_buff
 
  - Disallow unprivileged BPF by default
 
  - Document BPF licensing
 
 Netfilter:
 
  - Introduce egress hook for looking at raw outgoing packets
 
  - Allow matching on and modifying inner headers / payload data
 
  - Add NFT_META_IFTYPE to match on the interface type either from
    ingress or egress
 
 Protocols:
 
  - Multi-Path TCP:
    - increase default max additional subflows to 2
    - rework forward memory allocation
    - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
 
  - MCTP flow support allowing lower layer drivers to configure msg
    muxing as needed
 
  - Automatic Multicast Tunneling (AMT) driver based on RFC7450
 
  - HSR support the redbox supervision frames (IEC-62439-3:2018)
 
  - Support for the ip6ip6 encapsulation of IOAM
 
  - Netlink interface for CAN-FD's Transmitter Delay Compensation
 
  - Support SMC-Rv2 eliminating the current same-subnet restriction,
    by exploiting the UDP encapsulation feature of RoCE adapters
 
  - TLS: add SM4 GCM/CCM crypto support
 
  - Bluetooth: initial support for link quality and audio/codec
    offload
 
 Driver APIs:
 
  - Add a batched interface for RX buffer allocation in AF_XDP
    buffer pool
 
  - ethtool: Add ability to control transceiver modules' power mode
 
  - phy: Introduce supported interfaces bitmap to express MAC
    capabilities and simplify PHY code
 
  - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
 
 New drivers:
 
  - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
 
  - Ethernet driver for ASIX AX88796C SPI device (x88796c)
 
 Drivers:
 
  - Broadcom PHYs
    - support 72165, 7712 16nm PHYs
    - support IDDQ-SR for additional power savings
 
  - PHY support for QCA8081, QCA9561 PHYs
 
  - NXP DPAA2: support for IRQ coalescing
 
  - NXP Ethernet (enetc): support for software TCP segmentation
 
  - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
    Gigabit-capable IP found on RZ/G2L SoC
 
  - Intel 100G Ethernet
    - support for eswitch offload of TC/OvS flow API, including
      offload of GRE, VxLAN, Geneve tunneling
    - support application device queues - ability to assign Rx and Tx
      queues to application threads
    - PTP and PPS (pulse-per-second) extensions
 
  - Broadcom Ethernet (bnxt)
    - devlink health reporting and device reload extensions
 
  - Mellanox Ethernet (mlx5)
    - offload macvlan interfaces
    - support HW offload of TC rules involving OVS internal ports
    - support HW-GRO and header/data split
    - support application device queues
 
  - Marvell OcteonTx2:
    - add XDP support for PF
    - add PTP support for VF
 
  - Qualcomm Ethernet switch (qca8k): support for QCA8328
 
  - Realtek Ethernet DSA switch (rtl8366rb)
    - support bridge offload
    - support STP, fast aging, disabling address learning
    - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
 
  - Mellanox Ethernet/IB switch (mlxsw)
    - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
    - offload root TBF qdisc as port shaper
    - support multiple routing interface MAC address prefixes
    - support for IP-in-IP with IPv6 underlay
 
  - MediaTek WiFi (mt76)
    - mt7921 - ASPM, 6GHz, SDIO and testmode support
    - mt7915 - LED and TWT support
 
  - Qualcomm WiFi (ath11k)
    - include channel rx and tx time in survey dump statistics
    - support for 80P80 and 160 MHz bandwidths
    - support channel 2 in 6 GHz band
    - spectral scan support for QCN9074
    - support for rx decapsulation offload (data frames in 802.3
      format)
 
  - Qualcomm phone SoC WiFi (wcn36xx)
    - enable Idle Mode Power Save (IMPS) to reduce power consumption
      during idle
 
  - Bluetooth driver support for MediaTek MT7922 and MT7921
 
  - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
    and Realtek 8822C/8852A
 
  - Microsoft vNIC driver (mana)
    - support hibernation and kexec
 
  - Google vNIC driver (gve)
    - support for jumbo frames
    - implement Rx page reuse
 
 Refactor:
 
  - Make all writes to netdev->dev_addr go thru helpers, so that we
    can add this address to the address rbtree and handle the updates
 
  - Various TCP cleanups and optimizations including improvements
    to CPU cache use
 
  - Simplify the gnet_stats, Qdisc stats' handling and remove
    qdisc->running sequence counter
 
  - Driver changes and API updates to address devlink locking
    deficiencies
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
 IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
 wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
 wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
 n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
 lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
 +qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
 6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
 QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
 0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
 QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
 84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
 =srde
 -----END PGP SIGNATURE-----

Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Remove socket skb caches

   - Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
     avoid memory accounting overhead on each message sent

   - Introduce managed neighbor entries - added by control plane and
     resolved by the kernel for use in acceleration paths (BPF / XDP
     right now, HW offload users will benefit as well)

   - Make neighbor eviction on link down controllable by userspace to
     work around WiFi networks with bad roaming implementations

   - vrf: Rework interaction with netfilter/conntrack

   - fq_codel: implement L4S style ce_threshold_ect1 marking

   - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

  BPF:

   - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
     as implemented in LLVM14

   - Introduce bpf_get_branch_snapshot() to capture Last Branch Records

   - Implement variadic trace_printk helper

   - Add a new Bloomfilter map type

   - Track <8-byte scalar spill and refill

   - Access hw timestamp through BPF's __sk_buff

   - Disallow unprivileged BPF by default

   - Document BPF licensing

  Netfilter:

   - Introduce egress hook for looking at raw outgoing packets

   - Allow matching on and modifying inner headers / payload data

   - Add NFT_META_IFTYPE to match on the interface type either from
     ingress or egress

  Protocols:

   - Multi-Path TCP:
      - increase default max additional subflows to 2
      - rework forward memory allocation
      - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS

   - MCTP flow support allowing lower layer drivers to configure msg
     muxing as needed

   - Automatic Multicast Tunneling (AMT) driver based on RFC7450

   - HSR support the redbox supervision frames (IEC-62439-3:2018)

   - Support for the ip6ip6 encapsulation of IOAM

   - Netlink interface for CAN-FD's Transmitter Delay Compensation

   - Support SMC-Rv2 eliminating the current same-subnet restriction, by
     exploiting the UDP encapsulation feature of RoCE adapters

   - TLS: add SM4 GCM/CCM crypto support

   - Bluetooth: initial support for link quality and audio/codec offload

  Driver APIs:

   - Add a batched interface for RX buffer allocation in AF_XDP buffer
     pool

   - ethtool: Add ability to control transceiver modules' power mode

   - phy: Introduce supported interfaces bitmap to express MAC
     capabilities and simplify PHY code

   - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks

  New drivers:

   - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)

   - Ethernet driver for ASIX AX88796C SPI device (x88796c)

  Drivers:

   - Broadcom PHYs
      - support 72165, 7712 16nm PHYs
      - support IDDQ-SR for additional power savings

   - PHY support for QCA8081, QCA9561 PHYs

   - NXP DPAA2: support for IRQ coalescing

   - NXP Ethernet (enetc): support for software TCP segmentation

   - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
     Gigabit-capable IP found on RZ/G2L SoC

   - Intel 100G Ethernet
      - support for eswitch offload of TC/OvS flow API, including
        offload of GRE, VxLAN, Geneve tunneling
      - support application device queues - ability to assign Rx and Tx
        queues to application threads
      - PTP and PPS (pulse-per-second) extensions

   - Broadcom Ethernet (bnxt)
      - devlink health reporting and device reload extensions

   - Mellanox Ethernet (mlx5)
      - offload macvlan interfaces
      - support HW offload of TC rules involving OVS internal ports
      - support HW-GRO and header/data split
      - support application device queues

   - Marvell OcteonTx2:
      - add XDP support for PF
      - add PTP support for VF

   - Qualcomm Ethernet switch (qca8k): support for QCA8328

   - Realtek Ethernet DSA switch (rtl8366rb)
      - support bridge offload
      - support STP, fast aging, disabling address learning
      - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch

   - Mellanox Ethernet/IB switch (mlxsw)
      - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
      - offload root TBF qdisc as port shaper
      - support multiple routing interface MAC address prefixes
      - support for IP-in-IP with IPv6 underlay

   - MediaTek WiFi (mt76)
      - mt7921 - ASPM, 6GHz, SDIO and testmode support
      - mt7915 - LED and TWT support

   - Qualcomm WiFi (ath11k)
      - include channel rx and tx time in survey dump statistics
      - support for 80P80 and 160 MHz bandwidths
      - support channel 2 in 6 GHz band
      - spectral scan support for QCN9074
      - support for rx decapsulation offload (data frames in 802.3
        format)

   - Qualcomm phone SoC WiFi (wcn36xx)
      - enable Idle Mode Power Save (IMPS) to reduce power consumption
        during idle

   - Bluetooth driver support for MediaTek MT7922 and MT7921

   - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
     Realtek 8822C/8852A

   - Microsoft vNIC driver (mana)
      - support hibernation and kexec

   - Google vNIC driver (gve)
      - support for jumbo frames
      - implement Rx page reuse

  Refactor:

   - Make all writes to netdev->dev_addr go thru helpers, so that we can
     add this address to the address rbtree and handle the updates

   - Various TCP cleanups and optimizations including improvements to
     CPU cache use

   - Simplify the gnet_stats, Qdisc stats' handling and remove
     qdisc->running sequence counter

   - Driver changes and API updates to address devlink locking
     deficiencies"

* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
  Revert "net: avoid double accounting for pure zerocopy skbs"
  selftests: net: add arp_ndisc_evict_nocarrier
  net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
  net: arp: introduce arp_evict_nocarrier sysctl parameter
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
  net: avoid double accounting for pure zerocopy skbs
  tcp: rename sk_wmem_free_skb
  netdevsim: fix uninit value in nsim_drv_configure_vfs()
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  ...
2021-11-02 06:20:58 -07:00
Linus Torvalds
6fedc28076 RCU pull request for v5.16
This pull request contains the following branches:
 
 fixes.2021.10.07a: Miscellaneous fixes.
 
 scftorture.2021.09.16a: smp_call_function torture-test updates, most
 	notably better checking of module parameters.
 
 tasks.2021.09.15a: Tasks-trace RCU updates that fix a number of rare
 	but important race-condition bugs.
 
 torture.2021.09.13b: Other torture-test updates, most notably
 	better checking of module parameters.  In addition, rcutorture
 	may now be run on CONFIG_PREEMPT_RT kernels.
 
 torturescript.2021.09.16a: Torture-test scripting updates, most notably
 	specifying the new CONFIG_KCSAN_STRICT kconfig option rather
 	than maintaining an ever-changing list of individual KCSAN
 	kconfig options.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmGAVMMTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jGJBD/9ld6USOpedBLAbTYVMQYvIKoSqqDIG
 74ZFhKvZ5I6Y8OZAGxXjb5U06rh4V2brlTN7IJ7XLEA1t401ENffsGeQSCxEmpEf
 PqQN04dbmVvaWjD4jiLZCcl3oDp+w1gIKwmX6wh0Weogr3KZWu5aNvD5tl9qIz4a
 uPC1JqTBxf7WDrLhqNxG5N4MXs27+KvukCd9wftk3NTzRJ9tyLM/YNGOVArM8rW2
 QpEh8n6veB5dEoXBxmRHzuxYHN1k0Fhkbm3irMjcI0T5wj8TDod89zbg9mdFXMIj
 AjZ9CGpIBa4frThdu654ZNuEQHDCsPWtMi925xNOWxh5lkPGjeWnwYpcRrwfI2pj
 op0xVlur+Nam5CT/AJNT9+KogpZthAWXvwqCs5GbYNSU30Rlw99bw1vyAsJUD+af
 Mv08/z4o7Kuhr4cw2vkd2UfF9zuIQsJ1jWCIjMxfj4ctBnIpedrEnEISp8Y61fWk
 w9vXgCRhZCSkxoURoNss+nAUsiePUafptsvqKLu6Z53ufPA5yL0rVS778xq8vurP
 Xyd34TVlQ94ydZDC5pkSNpri1HGV1U7pztFwey5GloE66iV+7TSQCfMhzLd4CM0K
 wW96wimHrDtIxD6LedCZOHLHkS9AJd7F9uSoNodKspTH0tJowQztrzPW1eZifDE3
 iJP8xcJ+vL67Og==
 =nmaP
 -----END PGP SIGNATURE-----

Merge tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Miscellaneous fixes

 - Torture-test updates for smp_call_function(), most notably improved
   checking of module parameters.

 - Tasks-trace RCU updates that fix a number of rare but important
   race-condition bugs.

 - Other torture-test updates, most notably better checking of module
   parameters. In addition, rcutorture may once again be run on
   CONFIG_PREEMPT_RT kernels.

 - Torture-test scripting updates, most notably specifying the new
   CONFIG_KCSAN_STRICT kconfig option rather than maintaining an
   ever-changing list of individual KCSAN kconfig options.

* tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (46 commits)
  rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr
  rcu: Always inline rcu_dynticks_task*_{enter,exit}()
  torture: Make kvm-remote.sh print size of downloaded tarball
  torture: Allot 1G of memory for scftorture runs
  tools/rcu: Add an extract-stall script
  scftorture: Warn on individual scf_torture_init() error conditions
  scftorture: Count reschedule IPIs
  scftorture: Account for weight_resched when checking for all zeroes
  scftorture: Shut down if nonsensical arguments given
  scftorture: Allow zero weight to exclude an smp_call_function*() category
  rcu: Avoid unneeded function call in rcu_read_unlock()
  rcu-tasks: Update comments to cond_resched_tasks_rcu_qs()
  rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader
  rcu-tasks: Fix read-side primitives comment for call_rcu_tasks_trace
  rcu-tasks: Clarify read side section info for rcu_tasks_rude GP primitives
  rcu-tasks: Correct comparisons for CPU numbers in show_stalled_task_trace
  rcu-tasks: Correct firstreport usage in check_all_holdout_tasks_trace
  rcu-tasks: Fix s/rcu_add_holdout/trc_add_holdout/ typo in comment
  rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop
  rcu-tasks: Fix s/instruction/instructions/ typo in comment
  ...
2021-11-01 20:25:38 -07:00
Linus Torvalds
79ef0c0014 Tracing updates for 5.16:
- kprobes: Restructured stack unwinder to show properly on x86 when a stack
   dump happens from a kretprobe callback.
 
 - Fix to bootconfig parsing
 
 - Have tracefs allow owner and group permissions by default (only denying
   others). There's been pressure to allow non root to tracefs in a
   controlled fashion, and using groups is probably the safest.
 
 - Bootconfig memory managament updates.
 
 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.
 
 - Allow perf to be traced by function tracer.
 
 - Rewrite of function graph tracer to be a callback from the function tracer
   instead of having its own trampoline (this change will happen on an arch
   by arch basis, and currently only x86_64 implements it).
 
 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.
 
 - Allow histogram triggers to add variables that can perform calculations
   against the event's fields.
 
 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent warnings
   from the compiler.
 
 - Extend histogram triggers to key off of variables.
 
 - Have trace recursion use bit magic to determine preempt context over if
   branches.
 
 - Have trace recursion disable preemption as all use cases do anyway.
 
 - Added testing for verification of tracing utilities.
 
 - Various small clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYBdxhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qp1sAQD2oYFwaG3sx872gj/myBcHIBSKdiki
 Hry5csd8zYDBpgD+Poylopt5JIbeDuoYw/BedgEXmscZ8Qr7VzjAXdnv/Q4=
 =Loz8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - kprobes: Restructured stack unwinder to show properly on x86 when a
   stack dump happens from a kretprobe callback.

 - Fix to bootconfig parsing

 - Have tracefs allow owner and group permissions by default (only
   denying others). There's been pressure to allow non root to tracefs
   in a controlled fashion, and using groups is probably the safest.

 - Bootconfig memory managament updates.

 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.

 - Allow perf to be traced by function tracer.

 - Rewrite of function graph tracer to be a callback from the function
   tracer instead of having its own trampoline (this change will happen
   on an arch by arch basis, and currently only x86_64 implements it).

 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.

 - Allow histogram triggers to add variables that can perform
   calculations against the event's fields.

 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent
   warnings from the compiler.

 - Extend histogram triggers to key off of variables.

 - Have trace recursion use bit magic to determine preempt context over
   if branches.

 - Have trace recursion disable preemption as all use cases do anyway.

 - Added testing for verification of tracing utilities.

 - Various small clean ups and fixes.

* tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits)
  tracing/histogram: Fix semicolon.cocci warnings
  tracing/histogram: Fix documentation inline emphasis warning
  tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together
  tracing: Show size of requested perf buffer
  bootconfig: Initialize ret in xbc_parse_tree()
  ftrace: do CPU checking after preemption disabled
  ftrace: disable preemption when recursion locked
  tracing/histogram: Document expression arithmetic and constants
  tracing/histogram: Optimize division by a power of 2
  tracing/histogram: Covert expr to const if both operands are constants
  tracing/histogram: Simplify handling of .sym-offset in expressions
  tracing: Fix operator precedence for hist triggers expression
  tracing: Add division and multiplication support for hist triggers
  tracing: Add support for creating hist trigger variables from literal
  selftests/ftrace: Stop tracing while reading the trace file by default
  MAINTAINERS: Update KPROBES and TRACING entries
  test_kprobes: Move it from kernel/ to lib/
  docs, kprobes: Remove invalid URL and add new reference
  samples/kretprobes: Fix return value if register_kretprobe() failed
  lib/bootconfig: Fix the xbc_get_info kerneldoc
  ...
2021-11-01 20:05:19 -07:00
Jakub Kicinski
8a33dcc2f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in the fixes we had queued in case there was another -rc.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 20:05:14 -07:00
Jakub Kicinski
b7b98f8689 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-11-01

We've added 181 non-merge commits during the last 28 day(s) which contain
a total of 280 files changed, 11791 insertions(+), 5879 deletions(-).

The main changes are:

1) Fix bpf verifier propagation of 64-bit bounds, from Alexei.

2) Parallelize bpf test_progs, from Yucong and Andrii.

3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus.

4) Improve bpf selftests on s390, from Ilya.

5) bloomfilter bpf map type, from Joanne.

6) Big improvements to JIT tests especially on Mips, from Johan.

7) Support kernel module function calls from bpf, from Kumar.

8) Support typeless and weak ksym in light skeleton, from Kumar.

9) Disallow unprivileged bpf by default, from Pawan.

10) BTF_KIND_DECL_TAG support, from Yonghong.

11) Various bpftool cleanups, from Quentin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits)
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  bpf: Factor out helpers for ctx access checking
  bpf: Factor out a helper to prepare trampoline for struct_ops prog
  selftests, bpf: Fix broken riscv build
  riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
  tools, build: Add RISC-V to HOSTARCH parsing
  riscv, bpf: Increase the maximum number of iterations
  selftests, bpf: Add one test for sockmap with strparser
  selftests, bpf: Fix test_txmsg_ingress_parser error
  ...
====================

Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:59:46 -07:00
James Prestwood
f86ca07eb5 selftests: net: add arp_ndisc_evict_nocarrier
This tests the sysctl options for ARP/ND:

/net/ipv4/conf/<iface>/arp_evict_nocarrier
/net/ipv4/conf/all/arp_evict_nocarrier
/net/ipv6/conf/<iface>/ndisc_evict_nocarrier
/net/ipv6/conf/all/ndisc_evict_nocarrier

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:57:15 -07:00
Magnus Karlsson
0b170456e0 libbpf: Deprecate AF_XDP support
Deprecate AF_XDP support in libbpf ([0]). This has been moved to
libxdp as it is a better fit for that library. The AF_XDP support only
uses the public libbpf functions and can therefore just use libbpf as
a library from libxdp. The libxdp APIs are exactly the same so it
should just be linking with libxdp instead of libbpf for the AF_XDP
functionality. If not, please submit a bug report. Linking with both
libraries is supported but make sure you link in the correct order so
that the new functions in libxdp are used instead of the deprecated
ones in libbpf.

Libxdp can be found at https://github.com/xdp-project/xdp-tools.

  [0] Closes: https://github.com/libbpf/libbpf/issues/270

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20211029090111.4733-1-magnus.karlsson@gmail.com
2021-11-01 18:12:44 -07:00
Alexei Starovoitov
0869e5078a selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
./test_progs-no_alu32 -vv -t twfw

Before the 64-bit_into_32-bit fix:
19: (25) if r1 > 0x3f goto pc+6
 R1_w=inv(id=0,umax_value=63,var_off=(0x0; 0xff),s32_max_value=255,u32_max_value=255)

and eventually:

invalid access to map value, value_size=8 off=7 size=8
R6 max value is outside of the allowed memory range
libbpf: failed to load object 'no_alu32/twfw.o'

After the fix:
19: (25) if r1 > 0x3f goto pc+6
 R1_w=inv(id=0,umax_value=63,var_off=(0x0; 0x3f))

verif_twfw:OK

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211101222153.78759-3-alexei.starovoitov@gmail.com
2021-11-01 18:05:12 -07:00
Alexei Starovoitov
b9979db834 bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
Before this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0xffffffff))

After this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0x1))

While processing BPF_JLE the reg_set_min_max() would set true_reg->umax_value = 1
and call __reg_combine_64_into_32(true_reg).

Without the fix it would not pass the condition:
if (__reg64_bound_u32(reg->umin_value) && __reg64_bound_u32(reg->umax_value))

since umin_value == 0 at this point.
Before commit 10bf4e8316 the umin was incorrectly ingored.
The commit 10bf4e8316 fixed the correctness issue, but pessimized
propagation of 64-bit min max into 32-bit min max and corresponding var_off.

Fixes: 10bf4e8316 ("bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211101222153.78759-1-alexei.starovoitov@gmail.com
2021-11-01 18:05:11 -07:00
Kalesh Singh
4e9f63c9e5 tracing/selftests: Add tests for hist trigger expression parsing
Add tests for the parsing of hist trigger expressions; and to
validate expression evaluation.

Link: https://lkml.kernel.org/r/20211029183339.3216491-5-kaleshsingh@google.com

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-11-01 20:47:00 -04:00
Linus Torvalds
46f8763228 arm64 updates for 5.16
- Support for the Arm8.6 timer extensions, including a self-synchronising
   view of the system registers to elide some expensive ISB instructions.
 
 - Exception table cleanup and rework so that the fixup handlers appear
   correctly in backtraces.
 
 - A handful of miscellaneous changes, the main one being selection of
   CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
 
 - More mm and pgtable cleanups.
 
 - KASAN support for "asymmetric" MTE, where tag faults are reported
   synchronously for loads (via an exception) and asynchronously for
   stores (via a register).
 
 - Support for leaving the MMU enabled during kexec relocation, which
   significantly speeds up the operation.
 
 - Minor improvements to our perf PMU drivers.
 
 - Improvements to the compat vDSO build system, particularly when
   building with LLVM=1.
 
 - Preparatory work for handling some Coresight TRBE tracing errata.
 
 - Cleanup and refactoring of the SVE code to pave the way for SME
   support in future.
 
 - Ensure SCS pages are unpoisoned immediately prior to freeing them
   when KASAN is enabled for the vmalloc area.
 
 - Try moving to the generic pfn_valid() implementation again now that
   the DMA mapping issue from last time has been resolved.
 
 - Numerous improvements and additions to our FPSIMD and SVE selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmF74ZYQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNI/eB/UZYAtmNi6xC5StPaETyMLeZph9BV/IqIFq
 N71ds7MFzlX/agR6MwLbH2tBHezBtlQ90O732Jjz8zAec2cHd+7sx/w82JesX7PB
 IuOfqP78rvtU4ZkKe1Rcd96QtYvbtNAqcRhIo95OzfV9xwuzkvdXI+ZTYhtCfCuZ
 GozCqQoJtnNDayMtfzbDSXyJLNJc/qnIcUQhrt3vg12zbF3BcHxnmp0nBcHCqZEo
 lDJYufju7p87kCzaFYda2WhlI3t+NThqKOiZ332wQfqzNcr+rw1Y4jWbnCfrdLtI
 JfHT9yiuHDmFSYaJrk7NU8kftW31NV70bbhD7rZ+DQCVndl0lRc=
 =3R3j
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's the usual summary below, but the highlights are support for
  the Armv8.6 timer extensions, KASAN support for asymmetric MTE, the
  ability to kexec() with the MMU enabled and a second attempt at
  switching to the generic pfn_valid() implementation.

  Summary:

   - Support for the Arm8.6 timer extensions, including a
     self-synchronising view of the system registers to elide some
     expensive ISB instructions.

   - Exception table cleanup and rework so that the fixup handlers
     appear correctly in backtraces.

   - A handful of miscellaneous changes, the main one being selection of
     CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.

   - More mm and pgtable cleanups.

   - KASAN support for "asymmetric" MTE, where tag faults are reported
     synchronously for loads (via an exception) and asynchronously for
     stores (via a register).

   - Support for leaving the MMU enabled during kexec relocation, which
     significantly speeds up the operation.

   - Minor improvements to our perf PMU drivers.

   - Improvements to the compat vDSO build system, particularly when
     building with LLVM=1.

   - Preparatory work for handling some Coresight TRBE tracing errata.

   - Cleanup and refactoring of the SVE code to pave the way for SME
     support in future.

   - Ensure SCS pages are unpoisoned immediately prior to freeing them
     when KASAN is enabled for the vmalloc area.

   - Try moving to the generic pfn_valid() implementation again now that
     the DMA mapping issue from last time has been resolved.

   - Numerous improvements and additions to our FPSIMD and SVE
     selftests"

[ armv8.6 timer updates were in a shared branch and already came in
  through -tip in the timer pull  - Linus ]

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits)
  arm64: Select POSIX_CPU_TIMERS_TASK_WORK
  arm64: Document boot requirements for FEAT_SME_FA64
  arm64/sve: Fix warnings when SVE is disabled
  arm64/sve: Add stub for sve_max_virtualisable_vl()
  arm64: errata: Add detection for TRBE write to out-of-range
  arm64: errata: Add workaround for TSB flush failures
  arm64: errata: Add detection for TRBE overwrite in FILL mode
  arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
  selftests: arm64: Factor out utility functions for assembly FP tests
  arm64: vmlinux.lds.S: remove `.fixup` section
  arm64: extable: add load_unaligned_zeropad() handler
  arm64: extable: add a dedicated uaccess handler
  arm64: extable: add `type` and `data` fields
  arm64: extable: use `ex` for `exception_table_entry`
  arm64: extable: make fixup_exception() return bool
  arm64: extable: consolidate definitions
  arm64: gpr-num: support W registers
  arm64: factor out GPR numbering helpers
  arm64: kvm: use kvm_exception_table_entry
  arm64: lib: __arch_copy_to_user(): fold fixups into body
  ...
2021-11-01 16:33:53 -07:00
Andrii Nakryiko
a20eac0af0 selftests/bpf: Fix also no-alu32 strobemeta selftest
Previous fix aded bpf_clamp_umax() helper use to re-validate boundaries.
While that works correctly, it introduces more branches, which blows up
past 1 million instructions in no-alu32 variant of strobemeta selftests.

Switching len variable from u32 to u64 also fixes the issue and reduces
the number of validated instructions, so use that instead. Fix this
patch and bpf_clamp_umax() removed, both alu32 and no-alu32 selftests
pass.

Fixes: 0133c20480 ("selftests/bpf: Fix strobemeta selftest regression")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211101230118.1273019-1-andrii@kernel.org
2021-11-01 16:07:27 -07:00
Linus Torvalds
160729afc8 - Use the proper interface for the job: get_unaligned() instead of
memcpy() in the insn decoder
 
 - A randconfig build fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/wogACgkQEsHwGGHe
 VUoQIw//WdNg7rD++X4GG5l73lGt5ajerqnxjpipiAQTy029cUx0OzeYlWeHR2QH
 p+zLb3xzghjHn0Gviv9omadcPjHjXbqU6vlR3b95JARM5NnJEKRE7nho/w3mRfaT
 gWBzo6awh5SXLlo7DYESHRfvyr/Ryjl6LvgBFXprO33ST+0RMsWW/J4bx63xEIUF
 TKIYtm994O/qQBNLIEu/CB2cOAxtGZrVfRfVK+8QJcUy9xwgP0Oa9I6o9LvzaoJ1
 UEvOkL1w6TttRsxgoHz/gskj8+LbXQD9LWVQ55u/HpRDhpNAe4f+RI73Fsgr7Av9
 irbrhKwXherKCk9lHgaXQ6XgrrkZyvDY/pvdlj3RlnDt0jsJa6R4gwBGCOXmTgkU
 5MF0hHr5kGgXAIJ7AVmYIaTBiLs99/JpF9+9lLW9UuJE2oKj2GxMot3YGTOokj1h
 u7Y32cta6Ve96ZHHtIXObY5c+LD3OQaljdBayLFaJuTVB6TqVc3dfsEzSNNf/duS
 56K28CQEIpPGMe/KW6uZW9eYzQsGv+Jux1X3p650Z/e9A5wVCbdmdEshtACbXSac
 FVhaybv8ksJKNQmHi3xqbDUpFSMlbXZB3UfpCoQoGR20IfN1H+L7h64Xro5bvbXd
 LResoLmpnyU3gs3gn9xRYsb4fBr4KYW9jFwzTZSEH3h/Si/Hm2c=
 =Wj9y
 -----END PGP SIGNATURE-----

Merge tag 'x86_misc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 changes from Borislav Petkov:

 - Use the proper interface for the job: get_unaligned() instead of
   memcpy() in the insn decoder

 - A randconfig build fix

* tag 'x86_misc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn: Use get_unaligned() instead of memcpy()
  x86/Kconfig: Fix an unused variable error in dell-smm-hwmon
2021-11-01 15:45:14 -07:00
Dave Marchevsky
6ac22d036f perf bpf: Pull in bpf_program__get_prog_info_linear()
To prepare for impending deprecation of libbpf's bpf_program__get_prog_info_linear(),
pull in the function and associated helpers into the perf codebase and migrate
existing uses to the perf copy.

Since libbpf's deprecated definitions will still be visible to perf, it is necessary
to rename perf's definitions.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211011082031.4148337-4-davemarchevsky@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-01 18:16:40 -03:00
Joanne Koong
7a67087250 selftests/bpf: Add bloom map success test for userspace calls
This patch has two changes:
1) Adds a new function "test_success_cases" to test
successfully creating + adding + looking up a value
in a bloom filter map from the userspace side.

2) Use bpf_create_map instead of bpf_create_map_xattr in
the "test_fail_cases" and test_inner_map to make the
code look cleaner.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-4-joannekoong@fb.com
2021-11-01 14:16:03 -07:00
Joanne Koong
8845b4681b bpf: Add alignment padding for "map_extra" + consolidate holes
This patch makes 2 changes regarding alignment padding
for the "map_extra" field.

1) In the kernel header, "map_extra" and "btf_value_type_id"
are rearranged to consolidate the hole.

Before:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u32		map_flags;	/*    40     4	*/

        /* XXX 4 bytes hole, try to pack */

        u64		map_extra;	/*    48     8	*/
        int		spin_lock_off;	/*    56     4	*/
        int		timer_off;	/*    60     4	*/
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32		id;		/*    64     4	*/
        int		numa_node;	/*    68     4	*/
	...
        bool		frozen;		/*   117     1	*/

        /* XXX 10 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
	struct mutex	freeze_mutex;	/*   216   144 	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt; 	/*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 2, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 10 */

} __attribute__((__aligned__(64)));

After:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u64		map_extra;	/*    40     8 	*/
        u32		map_flags;	/*    48     4	*/
        int		spin_lock_off;	/*    52     4	*/
        int		timer_off;	/*    56     4	*/
        u32		id;		/*    60     4	*/

        /* --- cacheline 1 boundary (64 bytes) --- */
        int		numa_node;	/*    64     4	*/
	...
	bool		frozen		/*   113     1  */

        /* XXX 14 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
        struct mutex	freeze_mutex;	/*   216   144	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt;       /*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 1, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 14 */

} __attribute__((__aligned__(64)));

2) Add alignment padding to the bpf_map_info struct
More details can be found in commit 36f9814a49 ("bpf: fix uapi hole
for 32 bit compat applications")

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-3-joannekoong@fb.com
2021-11-01 14:16:03 -07:00
Hou Tao
31122b2f76 selftests/bpf: Add test cases for struct_ops prog
Running a BPF_PROG_TYPE_STRUCT_OPS prog for dummy_st_ops::test_N()
through bpf_prog_test_run(). Four test cases are added:
(1) attach dummy_st_ops should fail
(2) function return value of bpf_dummy_ops::test_1() is expected
(3) pointer argument of bpf_dummy_ops::test_1() works as expected
(4) multiple arguments passed to bpf_dummy_ops::test_2() are correct

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211025064025.2567443-5-houtao1@huawei.com
2021-11-01 14:10:00 -07:00
Linus Torvalds
8cb1ae19bf x86/fpu updates:
- Cleanup of extable fixup handling to be more robust, which in turn
    allows to make the FPU exception fixups more robust as well.
 
  - Change the return code for signal frame related failures from explicit
    error codes to a boolean fail/success as that's all what the calling
    code evaluates.
 
  - A large refactoring of the FPU code to prepare for adding AMX support:
 
    - Distangle the public header maze and remove especially the misnomed
      kitchen sink internal.h which is despite it's name included all over
      the place.
 
    - Add a proper abstraction for the register buffer storage (struct
      fpstate) which allows to dynamically size the buffer at runtime by
      flipping the pointer to the buffer container from the default
      container which is embedded in task_struct::tread::fpu to a
      dynamically allocated container with a larger register buffer.
 
    - Convert the code over to the new fpstate mechanism.
 
    - Consolidate the KVM FPU handling by moving the FPU related code into
      the FPU core which removes the number of exports and avoids adding
      even more export when AMX has to be supported in KVM. This also
      removes duplicated code which was of course unnecessary different and
      incomplete in the KVM copy.
 
    - Simplify the KVM FPU buffer handling by utilizing the new fpstate
      container and just switching the buffer pointer from the user space
      buffer to the KVM guest buffer when entering vcpu_run() and flipping
      it back when leaving the function. This cuts the memory requirements
      of a vCPU for FPU buffers in half and avoids pointless memory copy
      operations.
 
      This also solves the so far unresolved problem of adding AMX support
      because the current FPU buffer handling of KVM inflicted a circular
      dependency between adding AMX support to the core and to KVM.  With
      the new scheme of switching fpstate AMX support can be added to the
      core code without affecting KVM.
 
    - Replace various variables with proper data structures so the extra
      information required for adding dynamically enabled FPU features (AMX)
      can be added in one place
 
  - Add AMX (Advanved Matrix eXtensions) support (finally):
 
     AMX is a large XSTATE component which is going to be available with
     Saphire Rapids XEON CPUs. The feature comes with an extra MSR (MSR_XFD)
     which allows to trap the (first) use of an AMX related instruction,
     which has two benefits:
 
     1) It allows the kernel to control access to the feature
 
     2) It allows the kernel to dynamically allocate the large register
        state buffer instead of burdening every task with the the extra 8K
        or larger state storage.
 
     It would have been great to gain this kind of control already with
     AVX512.
 
     The support comes with the following infrastructure components:
 
     1) arch_prctl() to
        - read the supported features (equivalent to XGETBV(0))
        - read the permitted features for a task
        - request permission for a dynamically enabled feature
 
        Permission is granted per process, inherited on fork() and cleared
        on exec(). The permission policy of the kernel is restricted to
        sigaltstack size validation, but the syscall obviously allows
        further restrictions via seccomp etc.
 
     2) A stronger sigaltstack size validation for sys_sigaltstack(2) which
        takes granted permissions and the potentially resulting larger
        signal frame into account. This mechanism can also be used to
        enforce factual sigaltstack validation independent of dynamic
        features to help with finding potential victims of the 2K
        sigaltstack size constant which is broken since AVX512 support was
        added.
 
     3) Exception handling for #NM traps to catch first use of a extended
        feature via a new cause MSR. If the exception was caused by the use
        of such a feature, the handler checks permission for that
        feature. If permission has not been granted, the handler sends a
        SIGILL like the #UD handler would do if the feature would have been
        disabled in XCR0. If permission has been granted, then a new fpstate
        which fits the larger buffer requirement is allocated.
 
        In the unlikely case that this allocation fails, the handler sends
        SIGSEGV to the task. That's not elegant, but unavoidable as the
        other discussed options of preallocation or full per task
        permissions come with their own set of horrors for kernel and/or
        userspace. So this is the lesser of the evils and SIGSEGV caused by
        unexpected memory allocation failures is not a fundamentally new
        concept either.
 
        When allocation succeeds, the fpstate properties are filled in to
        reflect the extended feature set and the resulting sizes, the
        fpu::fpstate pointer is updated accordingly and the trap is disarmed
        for this task permanently.
 
     4) Enumeration and size calculations
 
     5) Trap switching via MSR_XFD
 
        The XFD (eXtended Feature Disable) MSR is context switched with the
        same life time rules as the FPU register state itself. The mechanism
        is keyed off with a static key which is default disabled so !AMX
        equipped CPUs have zero overhead. On AMX enabled CPUs the overhead
        is limited by comparing the tasks XFD value with a per CPU shadow
        variable to avoid redundant MSR writes. In case of switching from a
        AMX using task to a non AMX using task or vice versa, the extra MSR
        write is obviously inevitable.
 
        All other places which need to be aware of the variable feature sets
        and resulting variable sizes are not affected at all because they
        retrieve the information (feature set, sizes) unconditonally from
        the fpstate properties.
 
     6) Enable the new AMX states
 
   Note, this is relatively new code despite the fact that AMX support is in
   the works for more than a year now.
 
   The big refactoring of the FPU code, which allowed to do a proper
   integration has been started exactly 3 weeks ago. Refactoring of the
   existing FPU code and of the original AMX patches took a week and has
   been subject to extensive review and testing. The only fallout which has
   not been caught in review and testing right away was restricted to AMX
   enabled systems, which is completely irrelevant for anyone outside Intel
   and their early access program. There might be dragons lurking as usual,
   but so far the fine grained refactoring has held up and eventual yet
   undetected fallout is bisectable and should be easily addressable before
   the 5.16 release. Famous last words...
 
   Many thanks to Chang Bae and Dave Hansen for working hard on this and
   also to the various test teams at Intel who reserved extra capacity to
   follow the rapid development of this closely which provides the
   confidence level required to offer this rather large update for inclusion
   into 5.16-rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/NkITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodDkEADH4+/nN/QoSUHIuuha5Zptj3g2b16a
 /3TxT9fhwPen/kzMGsUk70s3iWJMA+I5dCfkSZexJ2hfhcRe9cBzZIa1HCawKwf3
 YCISTsO/M+LpeORuZ+TpfFLJKnxNr1SEOl+EYffGhq0AkCjifb9Cnr0JZuoMUzGU
 jpfJZ2bj28ri5lG812DtzSMBM9E3SAwgJv+GNjmZbxZKb9mAfhbAMdBUXHirX7Ej
 jmx6koQjYOKwYIW8w1BrdC270lUKQUyJTbQgdRkN9Mh/HnKyFixQ18JqGlgaV2cT
 EtYePUfTEdaHdAhUINLIlEug1MfOslHU+HyGsdywnoChNB4GHPQuePC5Tz60VeFN
 RbQ9aKcBUu8r95rjlnKtAtBijNMA4bjGwllVxNwJ/ZoA9RPv1SbDZ07RX3qTaLVY
 YhVQl8+shD33/W24jUTJv1kMMexpHXIlv0gyfMryzpwI7uzzmGHRPAokJdbYKctC
 dyMPfdE90rxTiMUdL/1IQGhnh3awjbyfArzUhHyQ++HyUyzCFh0slsO0CD18vUy8
 FofhCugGBhjuKw3XwLNQ+KsWURz5qHctSzBc3qMOSyqFHbAJCVRANkhsFvWJo2qL
 75+Z7OTRebtsyOUZIdq26r4roSxHrps3dupWTtN70HWx2NhQG1nLEw986QYiQu1T
 hcKvDmehQLrUvg==
 =x3WL
 -----END PGP SIGNATURE-----

Merge tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu updates from Thomas Gleixner:

 - Cleanup of extable fixup handling to be more robust, which in turn
   allows to make the FPU exception fixups more robust as well.

 - Change the return code for signal frame related failures from
   explicit error codes to a boolean fail/success as that's all what the
   calling code evaluates.

 - A large refactoring of the FPU code to prepare for adding AMX
   support:

      - Distangle the public header maze and remove especially the
        misnomed kitchen sink internal.h which is despite it's name
        included all over the place.

      - Add a proper abstraction for the register buffer storage (struct
        fpstate) which allows to dynamically size the buffer at runtime
        by flipping the pointer to the buffer container from the default
        container which is embedded in task_struct::tread::fpu to a
        dynamically allocated container with a larger register buffer.

      - Convert the code over to the new fpstate mechanism.

      - Consolidate the KVM FPU handling by moving the FPU related code
        into the FPU core which removes the number of exports and avoids
        adding even more export when AMX has to be supported in KVM.
        This also removes duplicated code which was of course
        unnecessary different and incomplete in the KVM copy.

      - Simplify the KVM FPU buffer handling by utilizing the new
        fpstate container and just switching the buffer pointer from the
        user space buffer to the KVM guest buffer when entering
        vcpu_run() and flipping it back when leaving the function. This
        cuts the memory requirements of a vCPU for FPU buffers in half
        and avoids pointless memory copy operations.

        This also solves the so far unresolved problem of adding AMX
        support because the current FPU buffer handling of KVM inflicted
        a circular dependency between adding AMX support to the core and
        to KVM. With the new scheme of switching fpstate AMX support can
        be added to the core code without affecting KVM.

      - Replace various variables with proper data structures so the
        extra information required for adding dynamically enabled FPU
        features (AMX) can be added in one place

 - Add AMX (Advanced Matrix eXtensions) support (finally):

   AMX is a large XSTATE component which is going to be available with
   Saphire Rapids XEON CPUs. The feature comes with an extra MSR
   (MSR_XFD) which allows to trap the (first) use of an AMX related
   instruction, which has two benefits:

    1) It allows the kernel to control access to the feature

    2) It allows the kernel to dynamically allocate the large register
       state buffer instead of burdening every task with the the extra
       8K or larger state storage.

   It would have been great to gain this kind of control already with
   AVX512.

   The support comes with the following infrastructure components:

    1) arch_prctl() to
        - read the supported features (equivalent to XGETBV(0))
        - read the permitted features for a task
        - request permission for a dynamically enabled feature

       Permission is granted per process, inherited on fork() and
       cleared on exec(). The permission policy of the kernel is
       restricted to sigaltstack size validation, but the syscall
       obviously allows further restrictions via seccomp etc.

    2) A stronger sigaltstack size validation for sys_sigaltstack(2)
       which takes granted permissions and the potentially resulting
       larger signal frame into account. This mechanism can also be used
       to enforce factual sigaltstack validation independent of dynamic
       features to help with finding potential victims of the 2K
       sigaltstack size constant which is broken since AVX512 support
       was added.

    3) Exception handling for #NM traps to catch first use of a extended
       feature via a new cause MSR. If the exception was caused by the
       use of such a feature, the handler checks permission for that
       feature. If permission has not been granted, the handler sends a
       SIGILL like the #UD handler would do if the feature would have
       been disabled in XCR0. If permission has been granted, then a new
       fpstate which fits the larger buffer requirement is allocated.

       In the unlikely case that this allocation fails, the handler
       sends SIGSEGV to the task. That's not elegant, but unavoidable as
       the other discussed options of preallocation or full per task
       permissions come with their own set of horrors for kernel and/or
       userspace. So this is the lesser of the evils and SIGSEGV caused
       by unexpected memory allocation failures is not a fundamentally
       new concept either.

       When allocation succeeds, the fpstate properties are filled in to
       reflect the extended feature set and the resulting sizes, the
       fpu::fpstate pointer is updated accordingly and the trap is
       disarmed for this task permanently.

    4) Enumeration and size calculations

    5) Trap switching via MSR_XFD

       The XFD (eXtended Feature Disable) MSR is context switched with
       the same life time rules as the FPU register state itself. The
       mechanism is keyed off with a static key which is default
       disabled so !AMX equipped CPUs have zero overhead. On AMX enabled
       CPUs the overhead is limited by comparing the tasks XFD value
       with a per CPU shadow variable to avoid redundant MSR writes. In
       case of switching from a AMX using task to a non AMX using task
       or vice versa, the extra MSR write is obviously inevitable.

       All other places which need to be aware of the variable feature
       sets and resulting variable sizes are not affected at all because
       they retrieve the information (feature set, sizes) unconditonally
       from the fpstate properties.

    6) Enable the new AMX states

   Note, this is relatively new code despite the fact that AMX support
   is in the works for more than a year now.

   The big refactoring of the FPU code, which allowed to do a proper
   integration has been started exactly 3 weeks ago. Refactoring of the
   existing FPU code and of the original AMX patches took a week and has
   been subject to extensive review and testing. The only fallout which
   has not been caught in review and testing right away was restricted
   to AMX enabled systems, which is completely irrelevant for anyone
   outside Intel and their early access program. There might be dragons
   lurking as usual, but so far the fine grained refactoring has held up
   and eventual yet undetected fallout is bisectable and should be
   easily addressable before the 5.16 release. Famous last words...

   Many thanks to Chang Bae and Dave Hansen for working hard on this and
   also to the various test teams at Intel who reserved extra capacity
   to follow the rapid development of this closely which provides the
   confidence level required to offer this rather large update for
   inclusion into 5.16-rc1

* tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits)
  Documentation/x86: Add documentation for using dynamic XSTATE features
  x86/fpu: Include vmalloc.h for vzalloc()
  selftests/x86/amx: Add context switch test
  selftests/x86/amx: Add test cases for AMX state management
  x86/fpu/amx: Enable the AMX feature in 64-bit mode
  x86/fpu: Add XFD handling for dynamic states
  x86/fpu: Calculate the default sizes independently
  x86/fpu/amx: Define AMX state components and have it used for boot-time checks
  x86/fpu/xstate: Prepare XSAVE feature table for gaps in state component numbers
  x86/fpu/xstate: Add fpstate_realloc()/free()
  x86/fpu/xstate: Add XFD #NM handler
  x86/fpu: Update XFD state where required
  x86/fpu: Add sanity checks for XFD
  x86/fpu: Add XFD state to fpstate
  x86/msr-index: Add MSRs for XFD
  x86/cpufeatures: Add eXtended Feature Disabling (XFD) feature bit
  x86/fpu: Reset permission and fpstate on exec()
  x86/fpu: Prepare fpu_clone() for dynamically enabled features
  x86/fpu/signal: Prepare for variable sigframe length
  x86/signal: Use fpu::__state_user_size for sigalt stack validation
  ...
2021-11-01 14:03:56 -07:00
Linus Torvalds
9a7e0a90a4 Scheduler updates:
- Revert the printk format based wchan() symbol resolution as it can leak
    the raw value in case that the symbol is not resolvable.
 
  - Make wchan() more robust and work with all kind of unwinders by
    enforcing that the task stays blocked while unwinding is in progress.
 
  - Prevent sched_fork() from accessing an invalid sched_task_group
 
  - Improve asymmetric packing logic
 
  - Extend scheduler statistics to RT and DL scheduling classes and add
    statistics for bandwith burst to the SCHED_FAIR class.
 
  - Properly account SCHED_IDLE entities
 
  - Prevent a potential deadlock when initial priority is assigned to a
    newly created kthread. A recent change to plug a race between cpuset and
    __sched_setscheduler() introduced a new lock dependency which is now
    triggered. Break the lock dependency chain by moving the priority
    assignment to the thread function.
 
  - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
 
  - Improve idle balancing in general and especially for NOHZ enabled
    systems.
 
  - Provide proper interfaces for live patching so it does not have to
    fiddle with scheduler internals.
 
  - Add cluster aware scheduling support.
 
  - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
    scheduler options and delaying mmdrop)
 
  - The usual small tweaks and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29
 iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec
 /1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t
 aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7
 3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo
 ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1
 vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6
 mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci
 V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy
 s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6
 i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj
 d2qWG7UvoseT+g==
 =fgtS
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Thomas Gleixner:

 - Revert the printk format based wchan() symbol resolution as it can
   leak the raw value in case that the symbol is not resolvable.

 - Make wchan() more robust and work with all kind of unwinders by
   enforcing that the task stays blocked while unwinding is in progress.

 - Prevent sched_fork() from accessing an invalid sched_task_group

 - Improve asymmetric packing logic

 - Extend scheduler statistics to RT and DL scheduling classes and add
   statistics for bandwith burst to the SCHED_FAIR class.

 - Properly account SCHED_IDLE entities

 - Prevent a potential deadlock when initial priority is assigned to a
   newly created kthread. A recent change to plug a race between cpuset
   and __sched_setscheduler() introduced a new lock dependency which is
   now triggered. Break the lock dependency chain by moving the priority
   assignment to the thread function.

 - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.

 - Improve idle balancing in general and especially for NOHZ enabled
   systems.

 - Provide proper interfaces for live patching so it does not have to
   fiddle with scheduler internals.

 - Add cluster aware scheduling support.

 - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
   scheduler options and delaying mmdrop)

 - The usual small tweaks and improvements all over the place

* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
  sched/fair: Cleanup newidle_balance
  sched/fair: Remove sysctl_sched_migration_cost condition
  sched/fair: Wait before decaying max_newidle_lb_cost
  sched/fair: Skip update_blocked_averages if we are defering load balance
  sched/fair: Account update_blocked_averages in newidle_balance cost
  x86: Fix __get_wchan() for !STACKTRACE
  sched,x86: Fix L2 cache mask
  sched/core: Remove rq_relock()
  sched: Improve wake_up_all_idle_cpus() take #2
  irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
  irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
  irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
  sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
  sched: Add cluster scheduler level for x86
  sched: Add cluster scheduler level in core and related Kconfig for ARM64
  topology: Represent clusters of CPUs within a die
  sched: Disable -Wunused-but-set-variable
  sched: Add wrapper for get_wchan() to keep task blocked
  x86: Fix get_wchan() to support the ORC unwinder
  proc: Use task_is_running() for wchan in /proc/$pid/stat
  ...
2021-11-01 13:48:52 -07:00
Linus Torvalds
43aa0a195f objtool updates:
- Improve retpoline code patching by separating it from alternatives which
    reduces memory footprint and allows to do better optimizations in the
    actual runtime patching.
 
  - Add proper retpoline support for x86/BPF
 
  - Address noinstr warnings in x86/kvm, lockdep and paravirtualization code
 
  - Add support to handle pv_opsindirect calls in the noinstr analysis
 
  - Classify symbols upfront and cache the result to avoid redundant
    str*cmp() invocations.
 
  - Add a CFI hash to reduce memory consumption which also reduces runtime
    on a allyesconfig by ~50%
 
  - Adjust XEN code to make objtool handling more robust and as a side
    effect to prevent text fragmentation due to placement of the hypercall
    page.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GFgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoc1JD/0Sz6seP2OUMxbMT3gCcFo9sMvYTdsM
 7WuGFbBbnCIo7g8JH7k0zRRBigptMp2eUtQXKkgaaIbWN4JbuVKf8KxN5/qXxLi4
 fJ12QnNTGH9N2jtzl5wKmpjaKJnnJMD9D10XwoR+T6gn6NHd+AgLEs7GxxuQUlgo
 eC9oEXhNHC8uNhiZc38EwfwmItI1bRgaLrnZWIL4rYGSMxfCK1/cEOpWrFfX9wmj
 /diB6oqMyPXZXMCtgpX7TniUr5XOTCcUkeO9mQv5bmyq/YM/8hrTbcVSJlsVYLvP
 EsBnUSHAcfLFiHXwa1RNiIGdbiPjbN+UYeXGAvqF58f3e5dTIHtN/UmWo7OH93If
 9rLMVNcMpsfPx7QRk2IxEPumLCkyfwjzfKrVDM6P6TKEIUzD1og4IK9gTlfykVsh
 56G5XiCOC/X2x8IMxKTLGuBiAVLFHXK/rSwoqhvNEWBFKDbP13QWs0LurBcW09Sa
 /kQI9pIBT1xFA/R+OY5Xy1cqNVVK1Gxmk8/bllCijA9pCFSCFM4hLZE5CevdrBCV
 h5SdqEK5hIlzFyypXfsCik/4p/+rfvlGfUKtFsPctxx29SPe+T0orx+l61jiWQok
 rZOflwMawK5lDuASHrvNHGJcWaTwoo3VcXMQDnQY0Wulc43J5IFBaPxkZzgyd+S1
 4lktHxatrCMUgw==
 =pfZi
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Thomas Gleixner:

 - Improve retpoline code patching by separating it from alternatives
   which reduces memory footprint and allows to do better optimizations
   in the actual runtime patching.

 - Add proper retpoline support for x86/BPF

 - Address noinstr warnings in x86/kvm, lockdep and paravirtualization
   code

 - Add support to handle pv_opsindirect calls in the noinstr analysis

 - Classify symbols upfront and cache the result to avoid redundant
   str*cmp() invocations.

 - Add a CFI hash to reduce memory consumption which also reduces
   runtime on a allyesconfig by ~50%

 - Adjust XEN code to make objtool handling more robust and as a side
   effect to prevent text fragmentation due to placement of the
   hypercall page.

* tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  bpf,x86: Respect X86_FEATURE_RETPOLINE*
  bpf,x86: Simplify computing label offsets
  x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
  x86/alternative: Add debug prints to apply_retpolines()
  x86/alternative: Try inline spectre_v2=retpoline,amd
  x86/alternative: Handle Jcc __x86_indirect_thunk_\reg
  x86/alternative: Implement .retpoline_sites support
  x86/retpoline: Create a retpoline thunk array
  x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h
  x86/asm: Fixup odd GEN-for-each-reg.h usage
  x86/asm: Fix register order
  x86/retpoline: Remove unused replacement symbols
  objtool,x86: Replace alternatives with .retpoline_sites
  objtool: Shrink struct instruction
  objtool: Explicitly avoid self modifying code in .altinstr_replacement
  objtool: Classify symbols
  objtool: Support pv_opsindirect calls for noinstr
  x86/xen: Rework the xen_{cpu,irq,mmu}_opsarrays
  x86/xen: Mark xen_force_evtchn_callback() noinstr
  x86/xen: Make irq_disable() noinstr
  ...
2021-11-01 13:24:43 -07:00
Linus Torvalds
595b28fb0c Locking updates:
- Move futex code into kernel/futex/ and split up the kitchen sink into
    seperate files to make integration of sys_futex_waitv() simpler.
 
  - Add a new sys_futex_waitv() syscall which allows to wait on multiple
    futexes. The main use case is emulating Windows' WaitForMultipleObjects
    which allows Wine to improve the performance of Windows Games. Also
    native Linux games can benefit from this interface as this is a common
    wait pattern for this kind of applications.
 
  - Add context to ww_mutex_trylock() to provide a path for i915 to rework
    their eviction code step by step without making lockdep upset until the
    final steps of rework are completed. It's also useful for regulator and
    TTM to avoid dropping locks in the non contended path.
 
  - Lockdep and might_sleep() cleanups and improvements
 
  - A few improvements for the RT substitutions.
 
  - The usual small improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5
 mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX
 At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5
 tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ
 vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9
 73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO
 h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ
 XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s
 clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C
 F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5
 C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp
 BvHj68zteAtlYg==
 =RZ4x
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:

 - Move futex code into kernel/futex/ and split up the kitchen sink into
   seperate files to make integration of sys_futex_waitv() simpler.

 - Add a new sys_futex_waitv() syscall which allows to wait on multiple
   futexes.

   The main use case is emulating Windows' WaitForMultipleObjects which
   allows Wine to improve the performance of Windows Games. Also native
   Linux games can benefit from this interface as this is a common wait
   pattern for this kind of applications.

 - Add context to ww_mutex_trylock() to provide a path for i915 to
   rework their eviction code step by step without making lockdep upset
   until the final steps of rework are completed. It's also useful for
   regulator and TTM to avoid dropping locks in the non contended path.

 - Lockdep and might_sleep() cleanups and improvements

 - A few improvements for the RT substitutions.

 - The usual small improvements and cleanups.

* tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  locking: Remove spin_lock_flags() etc
  locking/rwsem: Fix comments about reader optimistic lock stealing conditions
  locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able()
  locking/rwsem: Disable preemption for spinning region
  docs: futex: Fix kernel-doc references
  futex: Fix PREEMPT_RT build
  futex2: Documentation: Document sys_futex_waitv() uAPI
  selftests: futex: Test sys_futex_waitv() wouldblock
  selftests: futex: Test sys_futex_waitv() timeout
  selftests: futex: Add sys_futex_waitv() test
  futex,arm: Wire up sys_futex_waitv()
  futex,x86: Wire up sys_futex_waitv()
  futex: Implement sys_futex_waitv()
  futex: Simplify double_lock_hb()
  futex: Split out wait/wake
  futex: Split out requeue
  futex: Rename mark_wake_futex()
  futex: Rename: match_futex()
  futex: Rename: hb_waiter_{inc,dec,pending}()
  futex: Split out PI futex
  ...
2021-11-01 13:15:36 -07:00
Linus Torvalds
91e1c99e17 perf updates:
core:
 
   - Allow ftrace to instrument parts of the perf core code
 
   - Add a new mem_hops field to perf_mem_data_src which allows to represent
     intra-node/package or inter-node/off-package details to prepare for
     next generation systems which have more hieararchy within the
     node/pacakge level.
 
  tools:
 
   - Update for the new mem_hops field in perf_mem_data_src
 
  arch:
 
   - A set of constraints fixes for the Intel uncore PMU
 
   - The usual set of small fixes and improvements for x86 and PPC
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GkQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaD8D/wLhXR8RxtF4W9HJmHA+5XFsPtg+isp
 ZNU2kOs4gZskFx75NQaRv5ikA8y68TKdIx+NuQvRLYItaMveTToLSsJ55bfGMxIQ
 JHqDvANUNxBmAACnbYQlqf9WgB0i/3fCUHY5lpmN0waKjaswz7WNpycv4ccShVZr
 PKbgEjkeFBhplCqqOF0X5H3V+4q85+nZONm1iSNd4S7/3B6OCxOf1u78usL1bbtW
 yJAMSuTeOVUZCJm7oVywKW/ZlCscT135aKr6xe5QTrjlPuRWzuLaXNezdMnMyoVN
 HVv8a0ClACb8U5KiGfhvaipaIlIAliWJp2qoiNjrspDruhH6Yc+eNh1gUhLbtNpR
 4YZR5jxv4/mS13kzMMQg00cCWQl7N4whPT+ZE9pkpshGt+EwT+Iy3U+v13wDfnnp
 MnDggpWYGEkAck13t/T6DwC3qBIsVujtpiG+tt/ERbTxiuxi1ccQTGY3PDjtHV3k
 tIMH5n7l4jEpfl8VmoSUgz/2h1MLZnQUWp41GXkjkaOt7uunQZen+nAwqpTm28KV
 7U6U0h1q6r7HxOZRxkPPe4HSV+aBNH3H1LeNBfEd3hDCFGf6MY6vLow+2BE9ybk7
 Y6LPbRqq0SN3sd5MND0ZvQEt5Zgol8CMlX+UKoLEEv7RognGbIxkgpK7exv5pC9w
 nWj7TaMfpRzPgw==
 =Oj0G
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Thomas Gleixner:
 "Core:

   - Allow ftrace to instrument parts of the perf core code

   - Add a new mem_hops field to perf_mem_data_src which allows to
     represent intra-node/package or inter-node/off-package details to
     prepare for next generation systems which have more hieararchy
     within the node/pacakge level.

  Tools:

   - Update for the new mem_hops field in perf_mem_data_src

  Arch:

   - A set of constraints fixes for the Intel uncore PMU

   - The usual set of small fixes and improvements for x86 and PPC"

* tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings
  powerpc/perf: Fix data source encodings for L2.1 and L3.1 accesses
  tools/perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add comment about current state of PERF_MEM_LVL_* namespace and remove an extra line
  perf/core: Allow ftrace for functions in kernel/event/core.c
  perf/x86: Add new event for AUX output counter index
  perf/x86: Add compiler barrier after updating BTS
  perf/x86/intel/uncore: Fix Intel SPR M3UPI event constraints
  perf/x86/intel/uncore: Fix Intel SPR M2PCIE event constraints
  perf/x86/intel/uncore: Fix Intel SPR IIO event constraints
  perf/x86/intel/uncore: Fix Intel SPR CHA event constraints
  perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
  perf/x86/intel/uncore: Fix invalid unit check
  perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
2021-11-01 13:12:15 -07:00
Björn Töpel
36e70b9b06 selftests, bpf: Fix broken riscv build
This patch is closely related to commit 6016df8fe8 ("selftests/bpf:
Fix broken riscv build"). When clang includes the system include
directories, but targeting BPF program, __BITS_PER_LONG defaults to
32, unless explicitly set. Work around this problem, by explicitly
setting __BITS_PER_LONG to __riscv_xlen.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-5-bjorn@kernel.org
2021-11-01 17:10:40 +01:00
Björn Töpel
589fed479b riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
Add macros for 64-bit RISC-V PT_REGS to bpf_tracing.h.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-4-bjorn@kernel.org
2021-11-01 17:08:21 +01:00
Björn Töpel
b390d69831 tools, build: Add RISC-V to HOSTARCH parsing
Add RISC-V to the HOSTARCH parsing, so that ARCH is "riscv", and not
"riscv32" or "riscv64".

This affects the perf and libbpf builds, so that arch specific
includes are correctly picked up for RISC-V.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-3-bjorn@kernel.org
2021-11-01 17:08:21 +01:00
Liu Jian
d69672147f selftests, bpf: Add one test for sockmap with strparser
Add the test to check sockmap with strparser is working well.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211029141216.211899-3-liujian56@huawei.com
2021-11-01 17:08:21 +01:00
Liu Jian
b556c3fd46 selftests, bpf: Fix test_txmsg_ingress_parser error
After "skmsg: lose offset info in sk_psock_skb_ingress", the test case
with ktls failed. This because ktls parser(tls_read_size) return value
is 285 not 256.

The case like this:

	tls_sk1 --> redir_sk --> tls_sk2

tls_sk1 sent out 512 bytes data, after tls related processing redir_sk
recved 570 btyes data, and redirect 512 (skb_use_parser) bytes data to
tls_sk2; but tls_sk2 needs 285 * 2 bytes data, receive timeout occurred.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211029141216.211899-2-liujian56@huawei.com
2021-11-01 17:08:21 +01:00
Andrii Nakryiko
0133c20480 selftests/bpf: Fix strobemeta selftest regression
After most recent nightly Clang update strobemeta selftests started
failing with the following error (relevant portion of assembly included):

  1624: (85) call bpf_probe_read_user_str#114
  1625: (bf) r1 = r0
  1626: (18) r2 = 0xfffffffe
  1628: (5f) r1 &= r2
  1629: (55) if r1 != 0x0 goto pc+7
  1630: (07) r9 += 104
  1631: (6b) *(u16 *)(r9 +0) = r0
  1632: (67) r0 <<= 32
  1633: (77) r0 >>= 32
  1634: (79) r1 = *(u64 *)(r10 -456)
  1635: (0f) r1 += r0
  1636: (7b) *(u64 *)(r10 -456) = r1
  1637: (79) r1 = *(u64 *)(r10 -368)
  1638: (c5) if r1 s< 0x1 goto pc+778
  1639: (bf) r6 = r8
  1640: (0f) r6 += r7
  1641: (b4) w1 = 0
  1642: (6b) *(u16 *)(r6 +108) = r1
  1643: (79) r3 = *(u64 *)(r10 -352)
  1644: (79) r9 = *(u64 *)(r10 -456)
  1645: (bf) r1 = r9
  1646: (b4) w2 = 1
  1647: (85) call bpf_probe_read_user_str#114

  R1 unbounded memory access, make sure to bounds check any such access

In the above code r0 and r1 are implicitly related. Clang knows that,
but verifier isn't able to infer this relationship.

Yonghong Song narrowed down this "regression" in code generation to
a recent Clang optimization change ([0]), which for BPF target generates
code pattern that BPF verifier can't handle and loses track of register
boundaries.

This patch works around the issue by adding an BPF assembly-based helper
that helps to prove to the verifier that upper bound of the register is
a given constant by controlling the exact share of generated BPF
instruction sequence. This fixes the immediate issue for strobemeta
selftest.

  [0] acabad9ff6

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029182907.166910-1-andrii@kernel.org
2021-11-01 17:08:21 +01:00
Arnaldo Carvalho de Melo
ba4026b09d Revert "perf bench futex: Add support for 32-bit systems with 64-bit time_t"
This reverts commit c1ff12dac4.

This commit makes the build break on ubuntu 20.04 and other older
systems and it as well has identation problems, lets revert it till we
get these problems fixed.

Test results:

   1    78.36 almalinux:8                   : Ok   gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1) , clang version 11.0.0 (Red Hat 11.0.0-1.module_el8.4.0+2107+39fed697)
   2     8.40 alpine:3.4                    : FAIL gcc version 5.3.0 (Alpine 5.3.0)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
   3     8.89 alpine:3.5                    : FAIL gcc version 6.2.1 20160822 (Alpine 6.2.1)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
   4     8.59 alpine:3.6                    : FAIL gcc version 6.3.0 (Alpine 6.3.0)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
   5     9.01 alpine:3.7                    : FAIL gcc version 6.4.0 (Alpine 6.4.0)
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
   6     8.70 alpine:3.8                    : FAIL gcc version 6.4.0 (Alpine 6.4.0)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
   7     9.70 alpine:3.9                    : FAIL gcc version 8.3.0 (Alpine 8.3.0)
    In file included from bench/futex-wake.c:25:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-hash.c:29:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
   8     9.40 alpine:3.10                   : FAIL gcc version 8.3.0 (Alpine 8.3.0)
    In file included from bench/futex-hash.c:29:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
   9     9.81 alpine:3.11                   : FAIL gcc version 9.3.0 (Alpine 9.3.0)
    In file included from bench/futex-hash.c:29:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
       16 | #include <linux/time_types.h>
          |          ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
       16 | #include <linux/time_types.h>
          |          ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  10    10.32 alpine:3.12                   : FAIL gcc version 9.3.0 (Alpine 9.3.0)
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  11    99.82 alpine:3.13                   : Ok   gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 , Alpine clang version 10.0.1
  12    87.39 alpine:3.14                   : Ok   gcc (Alpine 10.3.1_git20210424) 10.3.1 20210424 , Alpine clang version 11.1.0
  13    86.89 alpine:edge                   : Ok   gcc (Alpine 10.3.1_git20210921) 10.3.1 20210921 , Alpine clang version 12.0.1
  14     7.30 alt:p8                        : FAIL gcc version 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    make[3]: *** [bench] Error 2
  15    63.92 alt:p9                        : Ok   x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1) , clang version 10.0.0
  16    61.42 alt:sisyphus                  : Ok   x86_64-alt-linux-gcc (GCC) 11.2.1 20210911 (ALT Sisyphus 11.2.1-alt1) , ALT Linux Team clang version 12.0.1
  17     8.30 amazonlinux:1                 : FAIL gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [bench] Error 2
  18     8.71 amazonlinux:2                 : FAIL gcc version 7.3.1 20180712 (Red Hat 7.3.1-13) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [bench] Error 2
  19    79.56 centos:8                      : Ok   gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1) , clang version 11.0.0 (Red Hat 11.0.0-1.module_el8.4.0+587+5187cac0)
  20    82.28 centos:stream                 : Ok   gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-3) , clang version 12.0.1 (Red Hat 12.0.1-2.module_el8.6.0+937+1cafe22c)
  21    55.24 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 11.2.1 20211020 releases/gcc-11.2.0-375-g40b209e340 , clang version 11.1.0
  22     7.41 debian:9                      : FAIL gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  23     7.90 debian:10                     : FAIL gcc version 8.3.0 (Debian 8.3.0-6)
    In file included from bench/futex-hash.c:29:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  24    60.32 debian:11                     : Ok   gcc (Debian 10.2.1-6) 10.2.1 20210110 , Debian clang version 11.0.1-2
  25    59.42 debian:experimental           : Ok   gcc (Debian 11.2.0-10) 11.2.0 , Debian clang version 11.1.0-4
  26    23.76 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 11.2.0-9) 11.2.0
  27    19.25 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
  28    21.25 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 10.2.1-6) 10.2.1 20210110
  29    21.88 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 11.2.0-9) 11.2.0
  30     8.20 fedora:22                     : FAIL gcc version 5.3.1 20160406 (Red Hat 5.3.1-6) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  31     8.20 fedora:23                     : FAIL gcc version 5.3.1 20160406 (Red Hat 5.3.1-6) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  32     8.59 fedora:24                     : FAIL gcc version 6.3.1 20161221 (Red Hat 6.3.1-1) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  33     6.60 fedora:24-x-ARC-uClibc        : FAIL gcc version 7.1.1 20170710 (ARCompact ISA Linux uClibc toolchain 2017.09-rc2)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  34     8.59 fedora:25                     : FAIL gcc version 6.4.1 20170727 (Red Hat 6.4.1-1) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
                                  ^
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  35    14.61 fedora:26                     : FAIL gcc version 7.3.1 20180130 (Red Hat 7.3.1-2) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  36     8.79 fedora:27                     : FAIL gcc version 7.3.1 20180712 (Red Hat 7.3.1-6) (GCC)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  37    15.12 fedora:28                     : FAIL gcc version 8.3.1 20190223 (Red Hat 8.3.1-2) (GCC)
    In file included from bench/futex-wake.c:25:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-hash.c:29:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  38     9.60 fedora:29                     : FAIL gcc version 8.3.1 20190223 (Red Hat 8.3.1-2) (GCC)
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
      if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
                                     ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       struct __kernel_old_timespec ts32;
                                    ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  39   101.90 fedora:30                     : Ok   gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 8.0.0 (Fedora 8.0.0-3.fc30)
  40    99.30 fedora:31                     : Ok   gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 9.0.1 (Fedora 9.0.1-4.fc31)
  41    82.46 fedora:32                     : Ok   gcc (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1) , clang version 10.0.1 (Fedora 10.0.1-3.fc32)
  42    81.32 fedora:33                     : Ok   gcc (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1) , clang version 11.0.0 (Fedora 11.0.0-3.fc33)
  43    84.07 fedora:34                     : Ok   gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1) , clang version 12.0.1 (Fedora 12.0.1-1.fc34)
  44     7.09 fedora:34-x-ARC-glibc         : FAIL gcc version 8.3.1 20190225 (ARC HS GNU/Linux glibc toolchain 2019.03-rc1)
    In file included from bench/futex-hash.c:29:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  45     6.29 fedora:34-x-ARC-uClibc        : FAIL gcc version 8.3.1 20190225 (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1)
    In file included from bench/futex-hash.c:29:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  46    74.74 fedora:35                     : Ok   gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1) , clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
  47    73.13 fedora:rawhide                : Ok   gcc (GCC) 11.2.1 20211019 (Red Hat 11.2.1-6) , clang version 13.0.0 (Fedora 13.0.0-4.fc36)
  48    28.17 gentoo-stage3:latest          : Ok   gcc (Gentoo 11.2.0 p1) 11.2.0
  49     9.10 mageia:6                      : FAIL gcc version 5.5.0 (Mageia 5.5.0-1.mga6)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  50    38.60 mageia:7                      : FAIL clang version 8.0.0 (Mageia 8.0.0-1.mga7)
          yychar = yylex (&yylval, &yylloc, scanner);
                   ^
    #define yylex           parse_events_lex
                            ^
    1 error generated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: util] Error 2
  51     6.18 openmandriva:cooker           : FAIL gcc version 11.2.0 20210728 (OpenMandriva) (GCC)
    In file included from builtin-bench.c:22:
    bench/bench.h:66:19: error: conflicting types for 'pthread_attr_setaffinity_np'; have 'int(pthread_attr_t *, size_t,  cpu_set_t *)' {aka 'int(pthread_attr_t *, long unsigned int,  cpu_set_t *)'}
       66 | static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr __maybe_unused,
          |                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from bench/bench.h:64,
                     from builtin-bench.c:22:
    /usr/include/pthread.h:394:12: note: previous declaration of 'pthread_attr_setaffinity_np' with type 'int(pthread_attr_t *, size_t,  const cpu_set_t *)' {aka 'int(pthread_attr_t *, long unsigned int,  const cpu_set_t *)'}
      394 | extern int pthread_attr_setaffinity_np (pthread_attr_t *__attr,
          |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~
    file: Compiled magic version [540] does not match with shared library magic version [539]

    ld: warning: -r and --gc-sections may not be used together, disabling --gc-sections
    ld: warning: -r and --icf may not be used together, disabling --icf
    ld: warning: -r and --gc-sections may not be used together, disabling --gc-sections
    ld: warning: -r and --icf may not be used together, disabling --icf
    file: Compiled magic version [540] does not match with shared library magic version [539]

    file: Compiled magic version [540] does not match with shared library magic version [539]

    ld: warning: -r and --gc-sections may not be used together, disabling --gc-sections
    ld: warning: -r and --icf may not be used together, disabling --icf
  52    12.51 opensuse:15.0                 : FAIL gcc version 7.4.1 20190905 [gcc-7-branch revision 275407] (SUSE Linux)
    Makefile.config:999: No libbabeltrace found, disables 'perf data' CTF format support, please install libbabeltrace-dev[el]/libbabeltrace-ctf-dev
    update-alternatives: error: no alternatives for java
    update-alternatives: error: no alternatives for java
    Makefile.config:1043: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel

    Auto-detecting system features:
    ...                         dwarf: [ on  ]
    ...            dwarf_getlocations: [ on  ]
    ...                         glibc: [ on  ]
    ...                        libbfd: [ OFF ]
    ...                libbfd-buildid: [ OFF ]
    ...                        libcap: [ on  ]
    ...                        libelf: [ on  ]
    ...                       libnuma: [ on  ]
    ...        numa_num_possible_cpus: [ on  ]
    ...                       libperl: [ on  ]
    ...                     libpython: [ on  ]
    ...                     libcrypto: [ on  ]
    ...                     libunwind: [ on  ]
    ...            libdw-dwarf-unwind: [ on  ]
    ...                          zlib: [ on  ]
    ...                          lzma: [ on  ]
    ...                     get_cpuid: [ on  ]
    ...                           bpf: [ on  ]
    ...                        libaio: [ on  ]
    ...                       libzstd: [ on  ]
    ...        disassembler-four-args: [ on  ]

      PERF_VERSION = 5.15.g875eaa399042
      GEN     perf-archive
      GEN     perf-with-kcore
      GEN     perf-iostat
    --
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-requeue.c:26:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  53    12.41 opensuse:15.1                 : FAIL gcc version 7.5.0 (SUSE Linux)
    Makefile.config:999: No libbabeltrace found, disables 'perf data' CTF format support, please install libbabeltrace-dev[el]/libbabeltrace-ctf-dev
    update-alternatives: error: no alternatives for java
    update-alternatives: error: no alternatives for java
    Makefile.config:1043: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel

    Auto-detecting system features:
    ...                         dwarf: [ on  ]
    ...            dwarf_getlocations: [ on  ]
    ...                         glibc: [ on  ]
    ...                        libbfd: [ OFF ]
    ...                libbfd-buildid: [ OFF ]
    ...                        libcap: [ on  ]
    ...                        libelf: [ on  ]
    ...                       libnuma: [ on  ]
    ...        numa_num_possible_cpus: [ on  ]
    ...                       libperl: [ on  ]
    ...                     libpython: [ on  ]
    ...                     libcrypto: [ on  ]
    ...                     libunwind: [ on  ]
    ...            libdw-dwarf-unwind: [ on  ]
    ...                          zlib: [ on  ]
    ...                          lzma: [ on  ]
    ...                     get_cpuid: [ on  ]
    ...                           bpf: [ on  ]
    ...                        libaio: [ on  ]
    ...                       libzstd: [ on  ]
    ...        disassembler-four-args: [ on  ]

      PERF_VERSION = 5.15.g875eaa399042
      GEN     perf-archive
      GEN     perf-with-kcore
      GEN     perf-iostat
    --
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-requeue.c:26:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  54    12.20 opensuse:15.2                 : FAIL gcc version 7.5.0 (SUSE Linux)
    Makefile.config:999: No libbabeltrace found, disables 'perf data' CTF format support, please install libbabeltrace-dev[el]/libbabeltrace-ctf-dev
    update-alternatives: error: no alternatives for java
    update-alternatives: error: no alternatives for java
    Makefile.config:1043: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel

    Auto-detecting system features:
    ...                         dwarf: [ on  ]
    ...            dwarf_getlocations: [ on  ]
    ...                         glibc: [ on  ]
    ...                        libbfd: [ OFF ]
    ...                libbfd-buildid: [ OFF ]
    ...                        libcap: [ on  ]
    ...                        libelf: [ on  ]
    ...                       libnuma: [ on  ]
    ...        numa_num_possible_cpus: [ on  ]
    ...                       libperl: [ on  ]
    ...                     libpython: [ on  ]
    ...                     libcrypto: [ on  ]
    ...                     libunwind: [ on  ]
    ...            libdw-dwarf-unwind: [ on  ]
    ...                          zlib: [ on  ]
    ...                          lzma: [ on  ]
    ...                     get_cpuid: [ on  ]
    ...                           bpf: [ on  ]
    ...                        libaio: [ on  ]
    ...                       libzstd: [ on  ]
    ...        disassembler-four-args: [ on  ]

      PERF_VERSION = 5.15.g875eaa399042
      GEN     perf-archive
      GEN     perf-with-kcore
      GEN     perf-iostat
    --
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
      if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
                                     ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       struct __kernel_old_timespec ts32;
                                    ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
      if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
                                     ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       struct __kernel_old_timespec ts32;
                                    ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
      if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
                                     ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       struct __kernel_old_timespec ts32;
                                    ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  55    12.30 opensuse:15.3                 : FAIL gcc version 7.5.0 (SUSE Linux)
    Makefile.config:999: No libbabeltrace found, disables 'perf data' CTF format support, please install libbabeltrace-dev[el]/libbabeltrace-ctf-dev
    update-alternatives: error: no alternatives for java
    update-alternatives: error: no alternatives for java
    Makefile.config:1043: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel

    Auto-detecting system features:
    ...                         dwarf: [ on  ]
    ...            dwarf_getlocations: [ on  ]
    ...                         glibc: [ on  ]
    ...                        libbfd: [ OFF ]
    ...                libbfd-buildid: [ OFF ]
    ...                        libcap: [ on  ]
    ...                        libelf: [ on  ]
    ...                       libnuma: [ on  ]
    ...        numa_num_possible_cpus: [ on  ]
    ...                       libperl: [ on  ]
    ...                     libpython: [ on  ]
    ...                     libcrypto: [ on  ]
    ...                     libunwind: [ on  ]
    ...            libdw-dwarf-unwind: [ on  ]
    ...                          zlib: [ on  ]
    ...                          lzma: [ on  ]
    ...                     get_cpuid: [ on  ]
    ...                           bpf: [ on  ]
    ...                        libaio: [ on  ]
    ...                       libzstd: [ on  ]
    ...        disassembler-four-args: [ on  ]

      PERF_VERSION = 5.15.g875eaa399042
      GEN     perf-archive
      GEN     perf-with-kcore
      GEN     perf-iostat
    --
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
      if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
                                     ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       struct __kernel_old_timespec ts32;
                                    ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
      if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
                                     ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       struct __kernel_old_timespec ts32;
                                    ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    cc1: all warnings being treated as errors
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
      if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
                                     ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       struct __kernel_old_timespec ts32;
                                    ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  56    92.79 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 11.2.1 20210816 [revision 056e324ce46a7924b5cf10f61010cf9dd2ca10e9] , clang version 13.0.0
  57    78.85 oraclelinux:8                 : Ok   gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1.0.4) , clang version 11.0.0 (Red Hat 11.0.0-1.0.1.module+el8.4.0+20046+39fed697)
  58    78.47 rockylinux:8                  : Ok   gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1) , clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+412+05cf643f)
  59     8.32 ubuntu:16.04                  : FAIL gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  60     7.19 ubuntu:16.04-x-arm            : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  61    18.14 ubuntu:16.04-x-arm64          : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  62     6.99 ubuntu:16.04-x-powerpc        : FAIL gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  63     7.29 ubuntu:16.04-x-powerpc64      : FAIL gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-requeue.c:26:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-lock-pi.c:19:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  64     7.29 ubuntu:16.04-x-powerpc64el    : FAIL gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  65     6.59 ubuntu:16.04-x-s390           : FAIL gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h:16:30: fatal error: linux/time_types.h: No such file or directory
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  66     9.00 ubuntu:18.04                  : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  67     7.49 ubuntu:18.04-x-arm            : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  68     7.49 ubuntu:18.04-x-arm64          : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  69     6.09 ubuntu:18.04-x-m68k           : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake-parallel.c:31:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  70     7.40 ubuntu:18.04-x-powerpc        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  71     8.00 ubuntu:18.04-x-powerpc64      : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  72     7.99 ubuntu:18.04-x-powerpc64el    : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  73     6.89 ubuntu:18.04-x-riscv64        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  74     6.69 ubuntu:18.04-x-s390           : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  75     7.29 ubuntu:18.04-x-sh4            : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  76     6.69 ubuntu:18.04-x-sparc64        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    In file included from bench/futex-hash.c:29:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from bench/futex-wake.c:25:0:
    bench/futex.h:16:10: fatal error: linux/time_types.h: No such file or directory
     #include <linux/time_types.h>
              ^~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    /git/perf-5.15.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
    make[3]: *** [bench] Error 2
  77     9.59 ubuntu:20.04                  : FAIL gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    In file included from bench/futex-wake.c:25:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    In file included from bench/futex-wake-parallel.c:31:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  78     8.29 ubuntu:20.04-x-powerpc64el    : FAIL gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1~20.04)
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    In file included from bench/futex-wake.c:25:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    In file included from bench/futex-requeue.c:26:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    In file included from bench/futex-wake-parallel.c:31:
    bench/futex.h: In function 'futex_syscall':
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    bench/futex.h:64:33: error: invalid application of 'sizeof' to incomplete type 'struct __kernel_old_timespec'
       64 |  if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
          |                                 ^~~~~~
    bench/futex.h:68:32: error: storage size of 'ts32' isn't known
       68 |   struct __kernel_old_timespec ts32;
          |                                ^~~~
    bench/futex.h:68:32: error: unused variable 'ts32' [-Werror=unused-variable]
    cc1: all warnings being treated as errors
    cc1: all warnings being treated as errors
    cc1: all warnings being treated as errors
    make[3]: *** [/git/perf-5.15.0/tools/build/Makefile.build:139: bench] Error 2
  79    65.92 ubuntu:20.10                  : Ok   gcc (Ubuntu 10.3.0-1ubuntu1~20.10) 10.3.0 , Ubuntu clang version 11.0.0-2
  80    65.91 ubuntu:21.04                  : Ok   gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0 , Ubuntu clang version 12.0.0-3ubuntu1~21.04.2
  81    68.12 ubuntu:21.10                  : Ok   gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0 , Ubuntu clang version 13.0.0-2

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-01 11:42:54 -03:00
Taehee Yoo
c08e8baea7 selftests: add amt interface selftest script
This is selftest script for amt interface.
This script includes basic forwarding scenarion and torture scenario.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:36:09 +00:00
Paolo Abeni
b6ab64b074 selftests: mptcp: more stable simult_flows tests
Currently the simult_flows.sh self-tests are not very stable,
especially when running on slow VMs.

The tests measure runtime for transfers on multiple subflows
and check that the time is near the theoretical maximum.

The current test infra introduces a bit of jitter in test
runtime, due to multiple explicit delays. Additionally the
runtime is measured by the shell script wrapper. On a slow
VM, the script overhead is measurable and subject to relevant
jitter.

One solution to make the test more stable would be adding more
slack to the expected time; that could possibly hide real
regressions. Instead move the measurement inside the command
doing the transfer, and drop most unneeded sleeps.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:19:49 +00:00
Geliang Tang
7c909a9804 selftests: mptcp: fix proto type in link_failure tests
In listener_ns, we should pass srv_proto argument to mptcp_connect command,
not cl_proto.

Fixes: 7d1e6f1639 ("selftests: mptcp: add testcase for active-back")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:19:49 +00:00
Jakub Kicinski
b0ced8f290 selftests: udp: test for passing SO_MARK as cmsg
Before fix:
|  Case IPv6 rejection returned 0, expected 1
|FAIL - 1/4 cases failed

With the fix:
| OK

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:12:48 +00:00
Arnaldo Carvalho de Melo
875eaa3990 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-01 07:10:30 -03:00
Kan Liang
27730c8cd6 perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
-F weight in perf script is broken.

  # ./perf mem record
  # ./perf script -F weight
  Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.

The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.

With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.

Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT

Fixes: ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Jiri Olsa
89ac61ff05 perf callchain: Fix compilation on powerpc with gcc11+
Got following build fail on powerpc:

    CC      arch/powerpc/util/skip-callchain-idx.o
  In function ‘check_return_reg’,
      inlined from ‘check_return_addr’ at arch/powerpc/util/skip-callchain-idx.c:213:7,
      inlined from ‘arch_skip_callchain_idx’ at arch/powerpc/util/skip-callchain-idx.c:265:7:
  arch/powerpc/util/skip-callchain-idx.c:54:18: error: ‘dwarf_frame_register’ accessing 96 bytes \
  in a region of size 64 [-Werror=stringop-overflow=]
     54 |         result = dwarf_frame_register(frame, ra_regno, ops_mem, &ops, &nops);
        |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/powerpc/util/skip-callchain-idx.c: In function ‘arch_skip_callchain_idx’:
  arch/powerpc/util/skip-callchain-idx.c:54:18: note: referencing argument 3 of type ‘Dwarf_Op *’
  In file included from /usr/include/elfutils/libdwfl.h:32,
                   from arch/powerpc/util/skip-callchain-idx.c:10:
  /usr/include/elfutils/libdw.h:1069:12: note: in a call to function ‘dwarf_frame_register’
   1069 | extern int dwarf_frame_register (Dwarf_Frame *frame, int regno,
        |            ^~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

The dwarf_frame_register args changed with [1],
Updating ops_mem accordingly.

[1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=5621fe5443da23112170235dd5cac161e5c75e65

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Mark Wieelard <mjw@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20210928195253.1267023-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Song Liu
29c77550ee perf script: Check session->header.env.arch before using it
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.

Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.

Committer notes:

If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.

Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Adrian Hunter
095729484e perf build: Suppress 'rm dlfilter' build message
The following build message:

	rm dlfilters/dlfilter-test-api-v0.o

is unwanted.

The object file is being treated as an intermediate file and being
automatically removed. Mark the object file as .SECONDARY to prevent
removal and hence the message.

Requested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210930062849.110416-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Paolo Bonzini
4e33868433 KVM/arm64 updates for Linux 5.16
- More progress on the protected VM front, now with the full
   fixed feature set as well as the limitation of some hypercalls
   after initialisation.
 
 - Cleanup of the RAZ/WI sysreg handling, which was pointlessly
   complicated
 
 - Fixes for the vgic placement in the IPA space, together with a
   bunch of selftests
 
 - More memcg accounting of the memory allocated on behalf of a guest
 
 - Timer and vgic selftests
 
 - Workarounds for the Apple M1 broken vgic implementation
 
 - KConfig cleanups
 
 - New kvmarm.mode=none option, for those who really dislike us
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmF7u5YPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpD6w8QAIKDLJCTqkxv5Vh4ZSmtXxg4gTZMBlg8oSQ8
 sVL639aqBvFe3A6Vmz6IwBm+NT7Sm1zxkuH9qHzVR1gmXq0oLYNrIuyrzRW8PvqO
 hIkSRRoVsf03755TmkxwR7/2jAFxb6FhEVAy6VWdQyI44orihIPvMp8aTIq+jvU+
 XoNGb/rPf9HpSUtvuaHYvZhSZBhoi5dRnkr33R1+VR69n7Axs8lm905xcl6Pt0a0
 QqYZWQvFu/BXPyNflG7LUsegRF/iiV2vNTbNNowkzlV5suqxBpJAp6ApDL/gWrHv
 ya/6cMqicSjBIkWnawhXY98w6/5xfzK4IV/zc00FNWOlUdVP89Thqrgc8EkigS9R
 BGcxFFqj41snr+ensSBBIkNtV+dBX52H3rUE0F9seiTXm8QWI86JobdeNadT8tUP
 TXdOeCUcA+cp4Ngln18lsbOEaBkPA5H1po1nUFPHbKnVOxnqXScB7E/xF6rAbryV
 m+Z+oidU7MyS/Ev/Da0ww/XFx7cs2ez9EgeQvjcdFAvUMqS6kcXEExvgGYlm+KRQ
 GBMKPLCNHKdflMANoSpol7MZUmPJ45XoWKW1rntj2r9X+oJW2Z2hEx32xrWDJdqK
 ixnbjog5kNZb0CjLGsUC90lo2hpRJecaLhAjgTLYaNC1QxGPrt92eat6gnwuMTBc
 mpADqi7w
 =qBAO
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 5.16

- More progress on the protected VM front, now with the full
  fixed feature set as well as the limitation of some hypercalls
  after initialisation.

- Cleanup of the RAZ/WI sysreg handling, which was pointlessly
  complicated

- Fixes for the vgic placement in the IPA space, together with a
  bunch of selftests

- More memcg accounting of the memory allocated on behalf of a guest

- Timer and vgic selftests

- Workarounds for the Apple M1 broken vgic implementation

- KConfig cleanups

- New kvmarm.mode=none option, for those who really dislike us
2021-10-31 02:28:48 -04:00
Borislav Petkov
a72fdfd21e selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
Commit in Fixes changed the iopl emulation to not #GP on CLI and STI
because it would break some insane luserspace tools which would toggle
interrupts.

The corresponding selftest would rely on the fact that executing CLI/STI
would trigger a #GP and thus detect it this way but since that #GP is
not happening anymore, the detection is now wrong too.

Extend the test to actually look at the IF flag and whether executing
those insns had any effect on it. The STI detection needs to have the
fact that interrupts were previously disabled, passed in so do that from
the previous CLI test, i.e., STI test needs to follow a previous CLI one
for it to make sense.

Fixes: b968e84b50 ("x86/iopl: Fake iopl(3) CLI/STI usage")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20211030083939.13073-1-bp@alien8.de
2021-10-30 23:18:04 +02:00
Shuah Khan
f35dcaa0a8 selftests/core: fix conflicting types compile error for close_range()
close_range() test type conflicts with close_range() library call in
x86_64-linux-gnu/bits/unistd_ext.h. Fix it by changing the name to
core_close_range().

gcc -g -I../../../../usr/include/    close_range_test.c  -o ../tools/testing/selftests/core/close_range_test
In file included from close_range_test.c:16:
close_range_test.c:57:6: error: conflicting types for ‘close_range’; have ‘void(struct __test_metadata *)’
   57 | TEST(close_range)
      |      ^~~~~~~~~~~
../kselftest_harness.h:181:21: note: in definition of macro ‘__TEST_IMPL’
  181 |         static void test_name(struct __test_metadata *_metadata); \
      |                     ^~~~~~~~~
close_range_test.c:57:1: note: in expansion of macro ‘TEST’
   57 | TEST(close_range)
      | ^~~~
In file included from /usr/include/unistd.h:1204,
                 from close_range_test.c:13:
/usr/include/x86_64-linux-gnu/bits/unistd_ext.h:56:12: note: previous declaration of ‘close_range’ with type ‘int(unsigned int,  unsigned int,  int)’
   56 | extern int close_range (unsigned int __fd, unsigned int __max_fd,
      |            ^~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-29 13:09:42 -06:00
Daniel Latypov
52a5d80a22 kunit: tool: fix typecheck errors about loading qemu configs
Currently, we have these errors:
$ mypy ./tools/testing/kunit/*.py
tools/testing/kunit/kunit_kernel.py:213: error: Item "_Loader" of "Optional[_Loader]" has no attribute "exec_module"
tools/testing/kunit/kunit_kernel.py:213: error: Item "None" of "Optional[_Loader]" has no attribute "exec_module"
tools/testing/kunit/kunit_kernel.py:214: error: Module has no attribute "QEMU_ARCH"
tools/testing/kunit/kunit_kernel.py:215: error: Module has no attribute "QEMU_ARCH"

exec_module
===========

pytype currently reports no errors, but that's because there's a comment
disabling it on 213.

This is due to https://github.com/python/typeshed/pull/2626.
The fix is to assert the loaded module implements the ABC
(abstract base class) we want which has exec_module support.

QEMU_ARCH
=========

pytype is fine with this, but mypy is not:
https://github.com/python/mypy/issues/5059

Add a check that the loaded module does indeed have QEMU_ARCH.
Note: this is not enough to appease mypy, so we also add a comment to
squash the warning.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-29 13:05:47 -06:00
Andrea Righi
f48ad69097 selftests/bpf: Fix fclose/pclose mismatch in test_progs
Make sure to use pclose() to properly close the pipe opened by popen().

Fixes: 81f77fd0de ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211026143409.42666-1-andrea.righi@canonical.com
2021-10-29 17:48:43 +02:00
Nikolay Aleksandrov
34d7ecb3d4 selftests: net: bridge: update IGMP/MLD membership interval value
When I fixed IGMPv3/MLDv2 to use the bridge's multicast_membership_interval
value which is chosen by user-space instead of calculating it based on
multicast_query_interval and multicast_query_response_interval I forgot
to update the selftests relying on that behaviour. Now we have to
manually set the expected GMI value to perform the tests correctly and get
proper results (similar to IGMPv2 behaviour).

Fixes: fac3cb82a5 ("net: bridge: mcast: use multicast_membership_interval for IGMPv3")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:58:21 +01:00
Shuah Khan
e300a85db1 selftests/net: update .gitignore with newly added tests
Update .gitignore with newly added tests:
	tools/testing/selftests/net/af_unix/test_unix_oob
	tools/testing/selftests/net/gro
	tools/testing/selftests/net/ioam6_parser
	tools/testing/selftests/net/toeplitz

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:14:58 +01:00
Petr Machata
2b11e24eba selftests: mlxsw: Test port shaper
TBF can be used as a root qdisc, in which case it is supposed to configure
port shaper. Add a test that verifies that this is so by installing a root
TBF with a ETS or PRIO below it, and then expecting individual bands to all
be shaped according to the root TBF configuration.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 19:47:50 -07:00
Petr Machata
3d5290ea1d selftests: mlxsw: Test offloadability of root TBF
TBF can be used as a root qdisc, with the usual ETS/RED/TBF hierarchy below
it. This use should now be offloaded. Add a test that verifies that it is.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 19:47:49 -07:00
David Yang
9c7516d669 tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
The coccinelle check report:

  ./tools/testing/selftests/vm/split_huge_page_test.c:344:36-42:
  ERROR: application of sizeof to pointer

Use "strlen" to fix it.

Link: https://lkml.kernel.org/r/20211012030116.184027-1-davidcomponentone@gmail.com
Signed-off-by: David Yang <davidcomponentone@gmail.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Kumar Kartikeya Dwivedi
efadf2ad17 selftests/bpf: Fix memory leak in test_ima
The allocated ring buffer is never freed, do so in the cleanup path.

Fixes: f446b570ac ("bpf/selftests: Update the IMA test to use BPF ring buffer")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-9-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi
c3fc706e94 selftests/bpf: Fix fd cleanup in sk_lookup test
Similar to the fix in commit:
e31eec77e4 ("bpf: selftests: Fix fd cleanup in get_branch_snapshot")

We use designated initializer to set fds to -1 without breaking on
future changes to MAX_SERVER constant denoting the array size.

The particular close(0) occurs on non-reuseport tests, so it can be seen
with -n 115/{2,3} but not 115/4. This can cause problems with future
tests if they depend on BTF fd never being acquired as fd 0, breaking
internal libbpf assumptions.

Fixes: 0ab5539f85 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-8-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi
087cba799c selftests/bpf: Add weak/typeless ksym test for light skeleton
Also, avoid using CO-RE features, as lskel doesn't support CO-RE, yet.
Include both light and libbpf skeleton in same file to test both of them
together.

In c48e51c8b0 ("bpf: selftests: Add selftests for module kfunc support"),
I added support for generating both lskel and libbpf skel for a BPF
object, however the name parameter for bpftool caused collisions when
included in same file together. This meant that every test needed a
separate file for a libbpf/light skeleton separation instead of
subtests.

Change that by appending a "_lskel" suffix to the name for files using
light skeleton, and convert all existing users.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-7-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi
92274e24b0 libbpf: Use O_CLOEXEC uniformly when opening fds
There are some instances where we don't use O_CLOEXEC when opening an
fd, fix these up. Otherwise, it is possible that a parallel fork causes
these fds to leak into a child process on execve.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-6-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi
549a632386 libbpf: Ensure that BPF syscall fds are never 0, 1, or 2
Add a simple wrapper for passing an fd and getting a new one >= 3 if it
is one of 0, 1, or 2. There are two primary reasons to make this change:
First, libbpf relies on the assumption a certain BPF fd is never 0 (e.g.
most recently noticed in [0]). Second, Alexei pointed out in [1] that
some environments reset stdin, stdout, and stderr if they notice an
invalid fd at these numbers. To protect against both these cases, switch
all internal BPF syscall wrappers in libbpf to always return an fd >= 3.
We only need to modify the syscall wrappers and not other code that
assumes a valid fd by doing >= 0, to avoid pointless churn, and because
it is still a valid assumption. The cost paid is two additional syscalls
if fd is in range [0, 2].

  [0]: e31eec77e4 ("bpf: selftests: Fix fd cleanup in get_branch_snapshot")
  [1]: https://lore.kernel.org/bpf/CAADnVQKVKY8o_3aU8Gzke443+uHa-eGoM0h7W4srChMXU1S4Bg@mail.gmail.com

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-5-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi
585a357198 libbpf: Add weak ksym support to gen_loader
This extends existing ksym relocation code to also support relocating
weak ksyms. Care needs to be taken to zero out the src_reg (currently
BPF_PSEUOD_BTF_ID, always set for gen_loader by bpf_object__relocate_data)
when the BTF ID lookup fails at runtime.  This is not a problem for
libbpf as it only sets ext->is_set when BTF ID lookup succeeds (and only
proceeds in case of failure if ext->is_weak, leading to src_reg
remaining as 0 for weak unresolved ksym).

A pattern similar to emit_relo_kfunc_btf is followed of first storing
the default values and then jumping over actual stores in case of an
error. For src_reg adjustment, we also need to perform it when copying
the populated instruction, so depending on if copied insn[0].imm is 0 or
not, we decide to jump over the adjustment.

We cannot reach that point unless the ksym was weak and resolved and
zeroed out, as the emit_check_err will cause us to jump to cleanup
label, so we do not need to recheck whether the ksym is weak before
doing the adjustment after copying BTF ID and BTF FD.

This is consistent with how libbpf relocates weak ksym. Logging
statements are added to show the relocation result and aid debugging.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-4-memxor@gmail.com
2021-10-28 16:30:06 -07:00
Kumar Kartikeya Dwivedi
c24941cd37 libbpf: Add typeless ksym support to gen_loader
This uses the bpf_kallsyms_lookup_name helper added in previous patches
to relocate typeless ksyms. The return value ENOENT can be ignored, and
the value written to 'res' can be directly stored to the insn, as it is
overwritten to 0 on lookup failure. For repeating symbols, we can simply
copy the previously populated bpf_insn.

Also, we need to take care to not close fds for typeless ksym_desc, so
reuse the 'off' member's space to add a marker for typeless ksym and use
that to skip them in cleanup_relos.

We add a emit_ksym_relo_log helper that avoids duplicating common
logging instructions between typeless and weak ksym (for future commit).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-3-memxor@gmail.com
2021-10-28 16:30:06 -07:00
Kumar Kartikeya Dwivedi
d6aef08a87 bpf: Add bpf_kallsyms_lookup_name helper
This helper allows us to get the address of a kernel symbol from inside
a BPF_PROG_TYPE_SYSCALL prog (used by gen_loader), so that we can
relocate typeless ksym vars.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-2-memxor@gmail.com
2021-10-28 16:30:06 -07:00
Peter Zijlstra
134ab5bd18 objtool,x86: Replace alternatives with .retpoline_sites
Instead of writing complete alternatives, simply provide a list of all
the retpoline thunk calls. Then the kernel is free to do with them as
it pleases. Simpler code all-round.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.850007165@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
c509331b41 objtool: Shrink struct instruction
Any one instruction can only ever call a single function, therefore
insn->mcount_loc_node is superfluous and can use insn->call_node.

This shrinks struct instruction, which is by far the most numerous
structure objtool creates.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.785456706@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
dd003edeff objtool: Explicitly avoid self modifying code in .altinstr_replacement
Assume ALTERNATIVE()s know what they're doing and do not change, or
cause to change, instructions in .altinstr_replacement sections.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.722511775@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
1739c66eb7 objtool: Classify symbols
In order to avoid calling str*cmp() on symbol names, over and over, do
them all once upfront and store the result.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.658539311@infradead.org
2021-10-28 23:25:24 +02:00
Joanne Koong
f44bc543a0 bpf/benchs: Add benchmarks for comparing hashmap lookups w/ vs. w/out bloom filter
This patch adds benchmark tests for comparing the performance of hashmap
lookups without the bloom filter vs. hashmap lookups with the bloom filter.

Checking the bloom filter first for whether the element exists should
overall enable a higher throughput for hashmap lookups, since if the
element does not exist in the bloom filter, we can avoid a costly lookup in
the hashmap.

On average, using 5 hash functions in the bloom filter tended to perform
the best across the widest range of different entry sizes. The benchmark
results using 5 hash functions (running on 8 threads on a machine with one
numa node, and taking the average of 3 runs) were roughly as follows:

value_size = 4 bytes -
	10k entries: 30% faster
	50k entries: 40% faster
	100k entries: 40% faster
	500k entres: 70% faster
	1 million entries: 90% faster
	5 million entries: 140% faster

value_size = 8 bytes -
	10k entries: 30% faster
	50k entries: 40% faster
	100k entries: 50% faster
	500k entres: 80% faster
	1 million entries: 100% faster
	5 million entries: 150% faster

value_size = 16 bytes -
	10k entries: 20% faster
	50k entries: 30% faster
	100k entries: 35% faster
	500k entres: 65% faster
	1 million entries: 85% faster
	5 million entries: 110% faster

value_size = 40 bytes -
	10k entries: 5% faster
	50k entries: 15% faster
	100k entries: 20% faster
	500k entres: 65% faster
	1 million entries: 75% faster
	5 million entries: 120% faster

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-6-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
57fd1c63c9 bpf/benchs: Add benchmark tests for bloom filter throughput + false positive
This patch adds benchmark tests for the throughput (for lookups + updates)
and the false positive rate of bloom filter lookups, as well as some
minor refactoring of the bash script for running the benchmarks.

These benchmarks show that as the number of hash functions increases,
the throughput and the false positive rate of the bloom filter decreases.
>From the benchmark data, the approximate average false-positive rates
are roughly as follows:

1 hash function = ~30%
2 hash functions = ~15%
3 hash functions = ~5%
4 hash functions = ~2.5%
5 hash functions = ~1%
6 hash functions = ~0.5%
7 hash functions  = ~0.35%
8 hash functions = ~0.15%
9 hash functions = ~0.1%
10 hash functions = ~0%

For reference data, the benchmarks run on one thread on a machine
with one numa node for 1 to 5 hash functions for 8-byte and 64-byte
values are as follows:

1 hash function:
  50k entries
	8-byte value
	    Lookups - 51.1 M/s operations
	    Updates - 33.6 M/s operations
	    False positive rate: 24.15%
	64-byte value
	    Lookups - 15.7 M/s operations
	    Updates - 15.1 M/s operations
	    False positive rate: 24.2%
  100k entries
	8-byte value
	    Lookups - 51.0 M/s operations
	    Updates - 33.4 M/s operations
	    False positive rate: 24.04%
	64-byte value
	    Lookups - 15.6 M/s operations
	    Updates - 14.6 M/s operations
	    False positive rate: 24.06%
  500k entries
	8-byte value
	    Lookups - 50.5 M/s operations
	    Updates - 33.1 M/s operations
	    False positive rate: 27.45%
	64-byte value
	    Lookups - 15.6 M/s operations
	    Updates - 14.2 M/s operations
	    False positive rate: 27.42%
  1 mil entries
	8-byte value
	    Lookups - 49.7 M/s operations
	    Updates - 32.9 M/s operations
	    False positive rate: 27.45%
	64-byte value
	    Lookups - 15.4 M/s operations
	    Updates - 13.7 M/s operations
	    False positive rate: 27.58%
  2.5 mil entries
	8-byte value
	    Lookups - 47.2 M/s operations
	    Updates - 31.8 M/s operations
	    False positive rate: 30.94%
	64-byte value
	    Lookups - 15.3 M/s operations
	    Updates - 13.2 M/s operations
	    False positive rate: 30.95%
  5 mil entries
	8-byte value
	    Lookups - 41.1 M/s operations
	    Updates - 28.1 M/s operations
	    False positive rate: 31.01%
	64-byte value
	    Lookups - 13.3 M/s operations
	    Updates - 11.4 M/s operations
	    False positive rate: 30.98%

2 hash functions:
  50k entries
	8-byte value
	    Lookups - 34.1 M/s operations
	    Updates - 20.1 M/s operations
	    False positive rate: 9.13%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.9 M/s operations
	    False positive rate: 9.21%
  100k entries
	8-byte value
	    Lookups - 33.7 M/s operations
	    Updates - 18.9 M/s operations
	    False positive rate: 9.13%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.7 M/s operations
	    False positive rate: 9.19%
  500k entries
	8-byte value
	    Lookups - 32.7 M/s operations
	    Updates - 18.1 M/s operations
	    False positive rate: 12.61%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.5 M/s operations
	    False positive rate: 12.61%
  1 mil entries
	8-byte value
	    Lookups - 30.6 M/s operations
	    Updates - 18.9 M/s operations
	    False positive rate: 12.54%
	64-byte value
	    Lookups - 8.0 M/s operations
	    Updates - 7.0 M/s operations
	    False positive rate: 12.52%
  2.5 mil entries
	8-byte value
	    Lookups - 25.3 M/s operations
	    Updates - 16.7 M/s operations
	    False positive rate: 16.77%
	64-byte value
	    Lookups - 7.9 M/s operations
	    Updates - 6.5 M/s operations
	    False positive rate: 16.88%
  5 mil entries
	8-byte value
	    Lookups - 20.8 M/s operations
	    Updates - 14.7 M/s operations
	    False positive rate: 16.78%
	64-byte value
	    Lookups - 7.0 M/s operations
	    Updates - 6.0 M/s operations
	    False positive rate: 16.78%

3 hash functions:
  50k entries
	8-byte value
	    Lookups - 25.1 M/s operations
	    Updates - 14.6 M/s operations
	    False positive rate: 7.65%
	64-byte value
	    Lookups - 5.8 M/s operations
	    Updates - 5.5 M/s operations
	    False positive rate: 7.58%
  100k entries
	8-byte value
	    Lookups - 24.7 M/s operations
	    Updates - 14.1 M/s operations
	    False positive rate: 7.71%
	64-byte value
	    Lookups - 5.8 M/s operations
	    Updates - 5.3 M/s operations
	    False positive rate: 7.62%
  500k entries
	8-byte value
	    Lookups - 22.9 M/s operations
	    Updates - 13.9 M/s operations
	    False positive rate: 2.62%
	64-byte value
	    Lookups - 5.6 M/s operations
	    Updates - 4.8 M/s operations
	    False positive rate: 2.7%
  1 mil entries
	8-byte value
	    Lookups - 19.8 M/s operations
	    Updates - 12.6 M/s operations
	    False positive rate: 2.60%
	64-byte value
	    Lookups - 5.3 M/s operations
	    Updates - 4.4 M/s operations
	    False positive rate: 2.69%
  2.5 mil entries
	8-byte value
	    Lookups - 16.2 M/s operations
	    Updates - 10.7 M/s operations
	    False positive rate: 4.49%
	64-byte value
	    Lookups - 4.9 M/s operations
	    Updates - 4.1 M/s operations
	    False positive rate: 4.41%
  5 mil entries
	8-byte value
	    Lookups - 18.8 M/s operations
	    Updates - 9.2 M/s operations
	    False positive rate: 4.45%
	64-byte value
	    Lookups - 5.2 M/s operations
	    Updates - 3.9 M/s operations
	    False positive rate: 4.54%

4 hash functions:
  50k entries
	8-byte value
	    Lookups - 19.7 M/s operations
	    Updates - 11.1 M/s operations
	    False positive rate: 1.01%
	64-byte value
	    Lookups - 4.4 M/s operations
	    Updates - 4.0 M/s operations
	    False positive rate: 1.00%
  100k entries
	8-byte value
	    Lookups - 19.5 M/s operations
	    Updates - 10.9 M/s operations
	    False positive rate: 1.00%
	64-byte value
	    Lookups - 4.3 M/s operations
	    Updates - 3.9 M/s operations
	    False positive rate: 0.97%
  500k entries
	8-byte value
	    Lookups - 18.2 M/s operations
	    Updates - 10.6 M/s operations
	    False positive rate: 2.05%
	64-byte value
	    Lookups - 4.3 M/s operations
	    Updates - 3.7 M/s operations
	    False positive rate: 2.05%
  1 mil entries
	8-byte value
	    Lookups - 15.5 M/s operations
	    Updates - 9.6 M/s operations
	    False positive rate: 1.99%
	64-byte value
	    Lookups - 4.0 M/s operations
	    Updates - 3.4 M/s operations
	    False positive rate: 1.99%
  2.5 mil entries
	8-byte value
	    Lookups - 13.8 M/s operations
	    Updates - 7.7 M/s operations
	    False positive rate: 3.91%
	64-byte value
	    Lookups - 3.7 M/s operations
	    Updates - 3.6 M/s operations
	    False positive rate: 3.78%
  5 mil entries
	8-byte value
	    Lookups - 13.0 M/s operations
	    Updates - 6.9 M/s operations
	    False positive rate: 3.93%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.7 M/s operations
	    False positive rate: 3.39%

5 hash functions:
  50k entries
	8-byte value
	    Lookups - 16.4 M/s operations
	    Updates - 9.1 M/s operations
	    False positive rate: 0.78%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.2 M/s operations
	    False positive rate: 0.77%
  100k entries
	8-byte value
	    Lookups - 16.3 M/s operations
	    Updates - 9.0 M/s operations
	    False positive rate: 0.79%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.2 M/s operations
	    False positive rate: 0.78%
  500k entries
	8-byte value
	    Lookups - 15.1 M/s operations
	    Updates - 8.8 M/s operations
	    False positive rate: 1.82%
	64-byte value
	    Lookups - 3.4 M/s operations
	    Updates - 3.0 M/s operations
	    False positive rate: 1.78%
  1 mil entries
	8-byte value
	    Lookups - 13.2 M/s operations
	    Updates - 7.8 M/s operations
	    False positive rate: 1.81%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.8 M/s operations
	    False positive rate: 1.80%
  2.5 mil entries
	8-byte value
	    Lookups - 10.5 M/s operations
	    Updates - 5.9 M/s operations
	    False positive rate: 0.29%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.4 M/s operations
	    False positive rate: 0.28%
  5 mil entries
	8-byte value
	    Lookups - 9.6 M/s operations
	    Updates - 5.7 M/s operations
	    False positive rate: 0.30%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.7 M/s operations
	    False positive rate: 0.30%

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-5-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
ed9109ad64 selftests/bpf: Add bloom filter map test cases
This patch adds test cases for bpf bloom filter maps. They include tests
checking against invalid operations by userspace, tests for using the
bloom filter map as an inner map, and a bpf program that queries the
bloom filter map for values added by a userspace program.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-4-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
47512102cd libbpf: Add "map_extra" as a per-map-type extra flag
This patch adds the libbpf infrastructure for supporting a
per-map-type "map_extra" field, whose definition will be
idiosyncratic depending on map type.

For example, for the bloom filter map, the lower 4 bits of
map_extra is used to denote the number of hash functions.

Please note that until libbpf 1.0 is here, the
"bpf_create_map_params" struct is used as a temporary
means for propagating the map_extra field to the kernel.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-3-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
9330986c03 bpf: Add bloom filter map implementation
This patch adds the kernel-side changes for the implementation of
a bpf bloom filter map.

The bloom filter map supports peek (determining whether an element
is present in the map) and push (adding an element to the map)
operations.These operations are exposed to userspace applications
through the already existing syscalls in the following way:

BPF_MAP_LOOKUP_ELEM -> peek
BPF_MAP_UPDATE_ELEM -> push

The bloom filter map does not have keys, only values. In light of
this, the bloom filter map's API matches that of queue stack maps:
user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
APIs to query or add an element to the bloom filter map. When the
bloom filter map is created, it must be created with a key_size of 0.

For updates, the user will pass in the element to add to the map
as the value, with a NULL key. For lookups, the user will pass in the
element to query in the map as the value, with a NULL key. In the
verifier layer, this requires us to modify the argument type of
a bloom filter's BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE;
as well, in the syscall layer, we need to copy over the user value
so that in bpf_map_peek_elem, we know which specific value to query.

A few things to please take note of:
 * If there are any concurrent lookups + updates, the user is
responsible for synchronizing this to ensure no false negative lookups
occur.
 * The number of hashes to use for the bloom filter is configurable from
userspace. If no number is specified, the default used will be 5 hash
functions. The benchmarks later in this patchset can help compare the
performance of using different number of hashes on different entry
sizes. In general, using more hashes decreases both the false positive
rate and the speed of a lookup.
 * Deleting an element in the bloom filter map is not supported.
 * The bloom filter map may be used as an inner map.
 * The "max_entries" size that is specified at map creation time is used
to approximate a reasonable bitmap size for the bloom filter, and is not
otherwise strictly enforced. If the user wishes to insert more entries
into the bloom filter than "max_entries", they may do so but they should
be aware that this may lead to a higher false positive rate.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-2-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Jakub Kicinski
7df621a3ee Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/net/sock.h
  7b50ecfcc6 ("net: Rename ->stream_memory_read to ->sock_is_readable")
  4c1e34c0db ("vsock: Enable y2038 safe timeval for timeout")

drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
  0daa55d033 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table")
  e77bcdd1f6 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.")

Adjacent code addition in both cases, keep both.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 10:43:58 -07:00
Chang S. Bae
101c669d16 selftests/x86/amx: Add context switch test
XSAVE state is thread-local.  The kernel switches between thread
state at context switch time.  Generally, running a selftest for
a while will naturally expose it to some context switching and
and will test the XSAVE code.

Instead of just hoping that the tests get context-switched at
random times, force context-switches on purpose.  Spawn off a few
userspace threads and force context-switches between them.
Ensure that the kernel correctly context switches each thread's
unique AMX state.

 [ dhansen: bunches of cleanups ]

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211026122525.6EFD5758@davehans-spike.ostc.intel.com
2021-10-28 14:35:27 +02:00
Chang S. Bae
6a3e0651b4 selftests/x86/amx: Add test cases for AMX state management
AMX TILEDATA is a very large XSAVE feature.  It could have caused
nasty XSAVE buffer space waste in two places:

 * Signal stacks
 * Kernel task_struct->fpu buffers

To avoid this waste, neither of these buffers have AMX state by
default.  The non-default features are called "dynamic" features.

There is an arch_prctl(ARCH_REQ_XCOMP_PERM) which allows a task
to declare that it wants to use AMX or other "dynamic" XSAVE
features.  This arch_prctl() ensures that sufficient sigaltstack
space is available before it will succeed.  It also expands the
task_struct buffer.

Functions of this test:
 * Test arch_prctl(ARCH_REQ_XCOMP_PERM).  Ensure that it checks for
   proper sigaltstack sizing and that the sizing is enforced for
   future sigaltstack calls.
 * Ensure that ARCH_REQ_XCOMP_PERM is inherited across fork()
 * Ensure that TILEDATA use before the prctl() is fatal
 * Ensure that TILEDATA is cleared across fork()

Note: Generally, compiler support is needed to do something with
AMX.  Instead, directly load AMX state from userspace with a
plain XSAVE.  Do not depend on the compiler.

 [ dhansen: bunches of cleanups ]

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211026122524.7BEDAA95@davehans-spike.ostc.intel.com
2021-10-28 14:35:27 +02:00
Madhavan Srinivasan
10269a2ca2 perf test sample-parsing: Add endian test for struct branch_flags
Extend the sample-parsing test to include a branch_flag bitfield-endian
swap test.

This patch adds a include for "util/trace-event.h" in the sample-parsing
test for importing tep_is_bigendian() and extends samples_same() to
include "needs_swap" to detect/enable check for bitfield-endian swap.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211028113714.600549-2-maddy@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 09:34:01 -03:00
Madhavan Srinivasan
63c12ae2f2 perf evsel: Add bitfield_swap() to handle branch_stack endian issue
The branch_stack struct has bit field definition which produces
different bit ordering for big/little endian.

Because of this, when branch_stack sample is collected in a BE system
and viewed/reported in a LE system, bit fields of the branch stack are
not presented properly.

To address this issue, a evsel__bitfield_swap_branch_stack() is defined
and introduced in evsel__parse_sample.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211028113714.600549-1-maddy@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 09:33:02 -03:00
Kan Liang
6ea5d1a3e3 perf script: Support instruction latency
The instruction latency information can be recorded on
some platforms, e.g., the Intel Sapphire Rapids server. With both memory
latency (weight) and the new instruction latency information, users can
easily locate the expensive load instructions, and also understand the time
spent in different stages. The users can optimize their applications in
different pipeline stages.

Add a new field "ins_lat" to filter the instruction latency information,
which is available with sample type PERF_SAMPLE_WEIGHT_STRUCT.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Link: https://lore.kernel.org/r/1632929894-102778-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 09:28:03 -03:00
Lexi Shao
57d7ecfd11 perf script: Show binary offsets for userspace addr
Show binary offsets for userspace addr with map in perf script output
with callchain.

In commit 19610184693c("perf script: Show virtual addresses instead of
offsets"), the addr shown in perf script output with callchain is changed
from binary offsets to virtual address to fix the incorrectness when
displaying symbol offset.

This is inconvenient in scenario that the binary is stripped and
symbol cannot be resolved. If someone wants to further resolve symbols for
specific binaries later, he would need an extra step to translate virtual
address to binary offset with mapping information recorded in perf.data,
which can be difficult for people not familiar with perf.

This patch modifies function sample__fprintf_callchain to print binary
offset for userspace addr with dsos, and virtual address otherwise. It
does not affect symbol offset calculation so symoff remains correct.

Before applying this patch:

  test  1512    78.711307:     533129 cycles:
  	aaaae0da07f4 [unknown] (/tmp/test)
  	aaaae0da0704 [unknown] (/tmp/test)
  	ffffbe9f7ef4 __libc_start_main+0xe4 (/lib64/libc-2.31.so)

After this patch:

  test  1519   111.330127:     406953 cycles:
  	7f4 [unknown] (/tmp/test)
  	704 [unknown] (/tmp/test)
  	20ef4 __libc_start_main+0xe4 (/lib64/libc-2.31.so)

Fixes: 19610184693c("perf script: Show virtual addresses instead of offsets")

Signed-off-by: Lexi Shao <shaolexi@huawei.com>

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: QiuXi <qiuxi1@huawei.com>
Cc: Wangbing <wangbing6@huawei.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Link: http://lore.kernel.org/lkml/20211019072417.122576-1-shaolexi@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 20:56:36 -03:00
Alistair Francis
c1ff12dac4 perf bench futex: Add support for 32-bit systems with 64-bit time_t
Some 32-bit architectures (such are 32-bit RISC-V) only have a 64-bit
time_t and as such don't have the SYS_futex syscall. This patch will
allow us to use the SYS_futex_time64 syscall on those platforms.

This also converts the futex calls to be y2038 safe (when built for a
5.1+ kernel).

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alistair Francis <alistair23@gmail.com>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-riscv@lists.infradead.org
Link: http://lore.kernel.org/lkml/20211022013343.2262938-2-alistair.francis@opensource.wdc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 20:55:17 -03:00
Alistair Francis
fec5c3a515 perf bench futex: Call the futex syscall from a function
In preparation for a more complex futex() function let's convert the
current macro into two functions. We need two functions to avoid
compiler failures as the macro is overloaded.

This will allow us to include pre-processor conditionals in the futex
syscall functions.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alistair Francis <alistair23@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-riscv@lists.infradead.org
Link: http://lore.kernel.org/lkml/20211022013343.2262938-1-alistair.francis@opensource.wdc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 20:54:00 -03:00
Adrian Hunter
624ff63abf perf intel-pt: Support itrace d+o option to direct debug log to stdout
It can be useful to see debug output in between normal output.

Add support for AUXTRACE_LOG_FLG_USE_STDOUT to Intel PT.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 16:21:01 -03:00
Adrian Hunter
4b2b2c6a7d perf auxtrace: Add itrace d+o option to direct debug log to stdout
It can be useful to see debug output in between normal output.

Add 'o' to the flags of debug option 'd', so that '--itrace=d+o' can
specify output of the debug log to stdout.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 16:20:45 -03:00
Adrian Hunter
c3afd6e50f perf dlfilter: Add dlfilter-show-cycles
Add a new dlfilter to show cycles.

Cycle counts are accumulated per CPU (or per thread if CPU is not recorded)
from IPC information, and printed together with the change since the last
print, at the start of each line. Separate counts are kept for branches,
instructions or other events.

Note also, the itrace A option can be useful to provide higher granularity
cycle information.

Example:

  $ perf record -e intel_pt/cyc/u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.044 MB perf.data ]
  $ perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so --deltatime | head
         0                   perf-exec  8509 [001]     0.000000000:  psb offs: 0
         0                   perf-exec  8509 [001]     0.000000000:  cbr: 42 freq: 4219 MHz (156%)
       833        833            uname  8509 [001]     0.000047689: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )        _start
       833                       uname  8509 [001]     0.000003261: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
      2015       1182            uname  8509 [001]     0.000000282: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
      2676        661            uname  8509 [001]     0.000002629: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
      3612        936            uname  8509 [001]     0.000001232: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
      4579        967            uname  8509 [001]     0.000002519: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
      6145       1566            uname  8509 [001]     0.000001050: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_setup_hash
      6239         94            uname  8509 [001]     0.000000023: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_sysdep_start

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 16:20:30 -03:00
Adrian Hunter
f2b91386ff perf intel-pt: Support itrace A option to approximate IPC
Normally, for cycle-acccurate mode, IPC values are an exact number of
instructions and cycles. Due to the granularity of timestamps, that happens
only when a CYC packet correlates to the event.

Support the itrace 'A' option, to use instead, the number of cycles
associated with the current timestamp. This provides IPC information for
every change of timestamp, but at the expense of accuracy. Due to the
granularity of timestamps, the actual number of cycles increases even
though the cycles reported does not. The number of instructions is known,
but if IPC is reported, cycles can be too low and so IPC is too high. Note
that inaccuracy decreases as the period of sampling increases i.e. if the
number of cycles is too low by a small amount, that becomes less
significant if the number of cycles is large.

Furthermore, it can be used in conjunction with dlfilter-show-cycles.so
to provide higher granularity cycle information.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 16:20:18 -03:00
Adrian Hunter
b6778fe1bb perf auxtrace: Add itrace A option to approximate IPC
Add an option to specify that synthesized IPC can be approximate, rather
than completely accurate.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 16:20:08 -03:00
Adrian Hunter
cf14013b6c perf auxtrace: Add missing Z option to ITRACE_HELP
ITRACE_HELP is used by perf commands to display help text for the --itrace
option. Add missing Z option.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 16:19:58 -03:00
Yucong Sun
e1ef62a4dd selftests/bpf: Adding a namespace reset for tc_redirect
This patch delete ns_src/ns_dst/ns_redir namespaces before recreating
them, making the test more robust.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-5-fallentree@fb.com
2021-10-27 11:59:02 -07:00
Yucong Sun
9e7240fb2d selftests/bpf: Fix attach_probe in parallel mode
This patch makes attach_probe uses its own method as attach point,
avoiding conflict with other tests like bpf_cookie.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-4-fallentree@fb.com
2021-10-27 11:59:02 -07:00
Yucong Sun
547208a386 selfetests/bpf: Update vmtest.sh defaults
Increase memory to 4G, 8 SMP core with host cpu passthrough. This
make it run faster in parallel mode and more likely to succeed.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-2-fallentree@fb.com
2021-10-27 11:58:44 -07:00
Joe Burton
689624f037 libbpf: Deprecate bpf_objects_list
Add a flag to `enum libbpf_strict_mode' to disable the global
`bpf_objects_list', preventing race conditions when concurrent threads
call bpf_object__open() or bpf_object__close().

bpf_object__next() will return NULL if this option is set.

Callers may achieve the same workflow by tracking bpf_objects in
application code.

  [0] Closes: https://github.com/libbpf/libbpf/issues/293

Signed-off-by: Joe Burton <jevburton@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026223528.413950-1-jevburton.kernel@gmail.com
2021-10-27 11:00:12 -07:00
Russell Currey
cb662608e5 selftests/powerpc: Use date instead of EPOCHSECONDS in mitigation-patching.sh
The EPOCHSECONDS environment variable was added in bash 5.0 (released
2019).  Some distributions of the "stable" and "long-term" variety ship
older versions of bash than this, so swap to using the date command
instead.

"%s" was added to coreutils `date` in 1993 so we should be good, but who
knows, it is a GNU extension and not part of the POSIX spec for `date`.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211025102436.19177-1-ruscur@russell.cc
2021-10-27 22:34:02 +11:00
Masami Hiramatsu
25b9513872 selftests/ftrace: Stop tracing while reading the trace file by default
Stop tracing while reading the trace file by default, to prevent
the test results while checking it and to avoid taking a long time
to check the result.
If there is any testcase which wants to test the tracing while reading
the trace file, please override this setting inside the test case.

This also recovers the pause-on-trace when clean it up.

Link: https://lkml.kernel.org/r/163529053143.690749.15365238954175942026.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-26 20:27:19 -04:00
Jakub Kicinski
440ffcdd9d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-10-26

We've added 12 non-merge commits during the last 7 day(s) which contain
a total of 23 files changed, 118 insertions(+), 98 deletions(-).

The main changes are:

1) Fix potential race window in BPF tail call compatibility check, from Toke Høiland-Jørgensen.

2) Fix memory leak in cgroup fs due to missing cgroup_bpf_offline(), from Quanyang Wang.

3) Fix file descriptor reference counting in generic_map_update_batch(), from Xu Kuohai.

4) Fix bpf_jit_limit knob to the max supported limit by the arch's JIT, from Lorenz Bauer.

5) Fix BPF sockmap ->poll callbacks for UDP and AF_UNIX sockets, from Cong Wang and Yucong Sun.

6) Fix BPF sockmap concurrency issue in TCP on non-blocking sendmsg calls, from Liu Jian.

7) Fix build failure of INODE_STORAGE and TASK_STORAGE maps on !CONFIG_NET, from Tejun Heo.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix potential race in tail call compatibility check
  bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET
  selftests/bpf: Use recv_timeout() instead of retries
  net: Implement ->sock_is_readable() for UDP and AF_UNIX
  skmsg: Extract and reuse sk_msg_is_readable()
  net: Rename ->stream_memory_read to ->sock_is_readable
  tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
  cgroup: Fix memory leak caused by missing cgroup_bpf_offline
  bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch()
  bpf: Prevent increasing bpf_jit_limit above max
  bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
  bpf: Define bpf_jit_alloc_exec_limit for riscv JIT
====================

Link: https://lore.kernel.org/r/20211026201920.11296-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-26 14:38:55 -07:00
Yucong Sun
67b821502d selftests/bpf: Use recv_timeout() instead of retries
We use non-blocking sockets in those tests, retrying for
EAGAIN is ugly because there is no upper bound for the packet
arrival time, at least in theory. After we fix poll() on
sockmap sockets, now we can switch to select()+recv().

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-5-xiyou.wangcong@gmail.com
2021-10-26 12:29:33 -07:00
John Keeping
432d7f5282 tools build: Drop needless slang include path in test-all
Commit cbefd24f0a ("tools build: Add test to check if slang.h is
in /usr/include/slang/") added a proper test to check whether slang.h is
in a subdirectory, and commit 1955c8cf5e ("perf tools: Don't
hardcode host include path for libslang") removed the include path for
test-libslang.bin but missed test-all.bin.

Apply the same change to test-all.bin.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: 1955c8cf5e ("perf tools: Don't hardcode host include path for libslang")
Signed-off-by: John Keeping <john@metanate.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211025172314.3766032-1-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:24:18 -03:00
James Clark
133fe2e617 perf tests: Improve temp file cleanup in test_arm_coresight.sh
Cleanup perf.data.old files which are also dropped by perf, handle
sigint and propagate it to the parent in case the test is run in a bash
while loop and don't create the temp files if the test will be skipped.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20210921131009.390810-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:17:46 -03:00
James Clark
39c534889e perf tests: Fix trace+probe_vfs_getname.sh /tmp cleanup
The temp file is only cleaned up if the test is not skipped, so delay
making it until after the skip so it doesn't get left behind in /tmp.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20210921131009.390810-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:17:38 -03:00
James Clark
cf95f85e27 perf test: Fix record+script_probe_vfs_getname.sh /tmp cleanup
The temp files are only cleaned up if the test is not skipped, so delay
making them until after the skip so they don't get left behind in /tmp.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20210921131009.390810-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:14:49 -03:00
Arnaldo Carvalho de Melo
3a55445f11 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up the fixes from upstream.

Fix simple conflict on session.c related to the file position fix that
went upstream and is touched by the active decomp changes in perf/core.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:03:02 -03:00
Danielle Ratson
c24dbf3d4f selftests: mlxsw: Remove deprecated test cases
After adding the previous patches, the constraint that all the router
interface MAC addresses have the same prefix is no longer relevant.

Remove the test cases that validated that this constraint is honored.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Danielle Ratson
20d446db61 selftests: Add an occupancy test for RIF MAC profiles
When all the RIF MAC profiles are in use, test that it is possible to
change the MAC of a netdev (i.e., a RIF) when its MAC profile is not
shared with other RIFs. Test that replacement fails when the MAC profile
is shared.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Danielle Ratson
a10b7bacde selftests: mlxsw: Add forwarding test for RIF MAC profiles
Verify that MAC profile changes are indeed applied and that packets are
forwarded with the correct source MAC.

Output example:

$ ./rif_mac_profiles.sh
TEST: h1->h2: new mac profile                                       [ OK ]
TEST: h2->h1: new mac profile                                       [ OK ]
TEST: h1->h2: edit mac profile                                      [ OK ]
TEST: h2->h1: edit mac profile                                      [ OK ]

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Danielle Ratson
152f98e7c5 selftests: mlxsw: Add a scale test for RIF MAC profiles
Query the maximum number of supported RIF MAC profiles using
devlink-resource and verify that all available MAC profiles can be utilized
and that an error is generated when user space tries to exceed this number.

Output example in Spectrum-2:

$ TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4                                           [ OK ]
TEST: 'rif_mac_profile' overflow 5                                  [ OK ]

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Song Liu
20d1b54a52 selftests/bpf: Guess function end for test_get_branch_snapshot
Function in modules could appear in /proc/kallsyms in random order.

ffffffffa02608a0 t bpf_testmod_loop_test
ffffffffa02600c0 t __traceiter_bpf_testmod_test_writable_bare
ffffffffa0263b60 d __tracepoint_bpf_testmod_test_write_bare
ffffffffa02608c0 T bpf_testmod_test_read
ffffffffa0260d08 t __SCT__tp_func_bpf_testmod_test_writable_bare
ffffffffa0263300 d __SCK__tp_func_bpf_testmod_test_read
ffffffffa0260680 T bpf_testmod_test_write
ffffffffa0260860 t bpf_testmod_test_mod_kfunc

Therefore, we cannot reliably use kallsyms_find_next() to find the end of
a function. Replace it with a simple guess (start + 128). This is good
enough for this test.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022234814.318457-1-songliubraving@fb.com
2021-10-25 21:43:05 -07:00
Song Liu
b4e8707276 selftests/bpf: Skip all serial_test_get_branch_snapshot in vm
Skipping the second half of the test is not enough to silent the warning
in dmesg. Skip the whole test before we can either properly silent the
warning in kernel, or fix LBR snapshot for VM.

Fixes: 025bd7c753 ("selftests/bpf: Add test for bpf_get_branch_snapshot")
Fixes: aa67fdb464 ("selftests/bpf: Skip the second half of get_branch_snapshot in vm")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026000733.477714-1-songliubraving@fb.com
2021-10-25 20:41:00 -07:00
Ilya Leoshkevich
2e2c6d3fb3 selftests/bpf: Fix test_core_reloc_mods on big-endian machines
This is the same as commit d164dd9a5c ("selftests/bpf: Fix
test_core_autosize on big-endian machines"), but for
test_core_reloc_mods.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-7-iii@linux.ibm.com
2021-10-25 20:39:42 -07:00
Ilya Leoshkevich
3e7ed9cebb selftests/seccomp: Use __BYTE_ORDER__
Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined
__BYTE_ORDER for consistency.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-6-iii@linux.ibm.com
2021-10-25 20:39:42 -07:00
Ilya Leoshkevich
06fca841fb selftests/bpf: Use __BYTE_ORDER__
Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined
__BYTE_ORDER for consistency.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-4-iii@linux.ibm.com
2021-10-25 20:39:42 -07:00
Ilya Leoshkevich
3930198dc9 libbpf: Use __BYTE_ORDER__
Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined
__BYTE_ORDER for consistency.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-3-iii@linux.ibm.com
2021-10-25 20:39:41 -07:00
Ilya Leoshkevich
45f2bebc80 libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()
__BYTE_ORDER is supposed to be defined by a libc, and __BYTE_ORDER__ -
by a compiler. bpf_core_read.h checks __BYTE_ORDER == __LITTLE_ENDIAN,
which is true if neither are defined, leading to incorrect behavior on
big-endian hosts if libc headers are not included, which is often the
case.

Fixes: ee26dade0e ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-2-iii@linux.ibm.com
2021-10-25 20:39:41 -07:00
Viktor Rosendahl
f604de20c0 tools/latency-collector: Use correct size when writing queue_full_warning
queue_full_warning is a pointer, so it is wrong to use sizeof to calculate
the number of characters of the string it points to. The effect is that we
only print out the first few characters of the warning string.

The correct way is to use strlen(). We don't need to add 1 to the strlen()
because we don't want to write the terminating null character to stdout.

Link: https://lkml.kernel.org/r/20211019160701.15587-1-Viktor.Rosendahl@bmw.de

Link: https://lore.kernel.org/r/8fd4bb65ef3da67feac9ce3258cdbe9824752cf1.1629198502.git.jing.yangyang@zte.com.cn
Link: https://lore.kernel.org/r/20211012025424.180781-1-davidcomponentone@gmail.com
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-25 22:27:19 -04:00
Andrii Nakryiko
c4813e969a libbpf: Deprecate ambiguously-named bpf_program__size() API
The name of the API doesn't convey clearly that this size is in number
of bytes (there needed to be a separate comment to make this clear in
libbpf.h). Further, measuring the size of BPF program in bytes is not
exactly the best fit, because BPF programs always consist of 8-byte
instructions. As such, bpf_program__insn_cnt() is a better alternative
in pretty much any imaginable case.

So schedule bpf_program__size() deprecation starting from v0.7 and it
will be removed in libbpf 1.0.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-5-andrii@kernel.org
2021-10-25 18:37:21 -07:00
Andrii Nakryiko
e21d585cb3 libbpf: Deprecate multi-instance bpf_program APIs
Schedule deprecation of a set of APIs that are related to multi-instance
bpf_programs:
  - bpf_program__set_prep() ([0]);
  - bpf_program__{set,unset}_instance() ([1]);
  - bpf_program__nth_fd().

These APIs are obscure, very niche, and don't seem to be used much in
practice. bpf_program__set_prep() is pretty useless for anything but the
simplest BPF programs, as it doesn't allow to adjust BPF program load
attributes, among other things. In short, it already bitrotted and will
bitrot some more if not removed.

With bpf_program__insns() API, which gives access to post-processed BPF
program instructions of any given entry-point BPF program, it's now
possible to do whatever necessary adjustments were possible with
set_prep() API before, but also more. Given any such use case is
automatically an advanced use case, requiring users to stick to
low-level bpf_prog_load() APIs and managing their own prog FDs is
reasonable.

  [0] Closes: https://github.com/libbpf/libbpf/issues/299
  [1] Closes: https://github.com/libbpf/libbpf/issues/300

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-4-andrii@kernel.org
2021-10-25 18:37:21 -07:00
Andrii Nakryiko
65a7fa2e4e libbpf: Add ability to fetch bpf_program's underlying instructions
Add APIs providing read-only access to bpf_program BPF instructions ([0]).
This is useful for diagnostics purposes, but it also allows a cleaner
support for cloning BPF programs after libbpf did all the FD resolution
and CO-RE relocations, subprog instructions appending, etc. Currently,
cloning BPF program is possible only through hijacking a half-broken
bpf_program__set_prep() API, which doesn't really work well for anything
but most primitive programs. For instance, set_prep() API doesn't allow
adjusting BPF program load parameters which are necessary for loading
fentry/fexit BPF programs (the case where BPF program cloning is
a necessity if doing some sort of mass-attachment functionality).

Given bpf_program__set_prep() API is set to be deprecated, having
a cleaner alternative is a must. libbpf internally already keeps track
of linear array of struct bpf_insn, so it's not hard to expose it. The
only gotcha is that libbpf previously freed instructions array during
bpf_object load time, which would make this API much less useful overall,
because in between bpf_object__open() and bpf_object__load() a lot of
changes to instructions are done by libbpf.

So this patch makes libbpf hold onto prog->insns array even after BPF
program loading. I think this is a small price for added functionality
and improved introspection of BPF program code.

See retsnoop PR ([1]) for how it can be used in practice and code
savings compared to relying on bpf_program__set_prep().

  [0] Closes: https://github.com/libbpf/libbpf/issues/298
  [1] https://github.com/anakryiko/retsnoop/pull/1

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-3-andrii@kernel.org
2021-10-25 18:37:21 -07:00
Andrii Nakryiko
de5d0dcef6 libbpf: Fix off-by-one bug in bpf_core_apply_relo()
Fix instruction index validity check which has off-by-one error.

Fixes: 3ee4f53355 ("libbpf: Split bpf_core_apply_relo() into bpf_program independent helper.")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-2-andrii@kernel.org
2021-10-25 18:37:21 -07:00
Quentin Monnet
d6699f8e0f bpftool: Switch to libbpf's hashmap for PIDs/names references
In order to show PIDs and names for processes holding references to BPF
programs, maps, links, or BTF objects, bpftool creates hash maps to
store all relevant information. This commit is part of a set that
transitions from the kernel's hash map implementation to the one coming
with libbpf.

The motivation is to make bpftool less dependent of kernel headers, to
ease the path to a potential out-of-tree mirror, like libbpf has.

This is the third and final step of the transition, in which we convert
the hash maps used for storing the information about the processes
holding references to BPF objects (programs, maps, links, BTF), and at
last we drop the inclusion of tools/include/linux/hashtable.h.

Note: Checkpatch complains about the use of __weak declarations, and the
missing empty lines after the bunch of empty function declarations when
compiling without the BPF skeletons (none of these were introduced in
this patch). We want to keep things as they are, and the reports should
be safe to ignore.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211023205154.6710-6-quentin@isovalent.com
2021-10-25 17:31:39 -07:00
Quentin Monnet
2828d0d75b bpftool: Switch to libbpf's hashmap for programs/maps in BTF listing
In order to show BPF programs and maps using BTF objects when the latter
are being listed, bpftool creates hash maps to store all relevant items.
This commit is part of a set that transitions from the kernel's hash map
implementation to the one coming with libbpf.

The motivation is to make bpftool less dependent of kernel headers, to
ease the path to a potential out-of-tree mirror, like libbpf has.

This commit focuses on the two hash maps used by bpftool when listing
BTF objects to store references to programs and maps, and convert them
to the libbpf's implementation.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211023205154.6710-5-quentin@isovalent.com
2021-10-25 17:31:39 -07:00
Quentin Monnet
8f184732b6 bpftool: Switch to libbpf's hashmap for pinned paths of BPF objects
In order to show pinned paths for BPF programs, maps, or links when
listing them with the "-f" option, bpftool creates hash maps to store
all relevant paths under the bpffs. So far, it would rely on the
kernel implementation (from tools/include/linux/hashtable.h).

We can make bpftool rely on libbpf's implementation instead. The
motivation is to make bpftool less dependent of kernel headers, to ease
the path to a potential out-of-tree mirror, like libbpf has.

This commit is the first step of the conversion: the hash maps for
pinned paths for programs, maps, and links are converted to libbpf's
hashmap.{c,h}. Other hash maps used for the PIDs of process holding
references to BPF objects are left unchanged for now. On the build side,
this requires adding a dependency to a second header internal to libbpf,
and making it a dependency for the bootstrap bpftool version as well.
The rest of the changes are a rather straightforward conversion.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211023205154.6710-4-quentin@isovalent.com
2021-10-25 17:31:38 -07:00
Quentin Monnet
46241271d1 bpftool: Do not expose and init hash maps for pinned path in main.c
BPF programs, maps, and links, can all be listed with their pinned paths
by bpftool, when the "-f" option is provided. To do so, bpftool builds
hash maps containing all pinned paths for each kind of objects.

These three hash maps are always initialised in main.c, and exposed
through main.h. There appear to be no particular reason to do so: we can
just as well make them static to the files that need them (prog.c,
map.c, and link.c respectively), and initialise them only when we want
to show objects and the "-f" switch is provided.

This may prevent unnecessary memory allocations if the implementation of
the hash maps was to change in the future.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211023205154.6710-3-quentin@isovalent.com
2021-10-25 17:31:38 -07:00
Quentin Monnet
8b6c46241c bpftool: Remove Makefile dep. on $(LIBBPF) for $(LIBBPF_INTERNAL_HDRS)
The dependency is only useful to make sure that the $(LIBBPF_HDRS_DIR)
directory is created before we try to install locally the required
libbpf internal header. Let's create this directory properly instead.

This is in preparation of making $(LIBBPF_INTERNAL_HDRS) a dependency to
the bootstrap bpftool version, in which case we want no dependency on
$(LIBBPF).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211023205154.6710-2-quentin@isovalent.com
2021-10-25 17:31:38 -07:00
Andrii Nakryiko
3762a39ce8 selftests/bpf: Split out bpf_verif_scale selftests into multiple tests
Instead of using subtests in bpf_verif_scale selftest, turn each scale
sub-test into its own test. Each subtest is compltely independent and
just reuses a bit of common test running logic, so the conversion is
trivial. For convenience, keep all of BPF verifier scale tests in one
file.

This conversion shaves off a significant amount of time when running
test_progs in parallel mode. E.g., just running scale tests (-t verif_scale):

BEFORE
======
Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED

real    0m22.894s
user    0m0.012s
sys     0m22.797s

AFTER
=====
Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED

real    0m12.044s
user    0m0.024s
sys     0m27.869s

Ten second saving right there. test_progs -j is not yet ready to be
turned on by default, unfortunately, and some tests fail almost every
time, but this is a good improvement nevertheless. Ignoring few
failures, here is sequential vs parallel run times when running all
tests now:

SEQUENTIAL
==========
Summary: 206/953 PASSED, 4 SKIPPED, 0 FAILED

real    1m5.625s
user    0m4.211s
sys     0m31.650s

PARALLEL
========
Summary: 204/952 PASSED, 4 SKIPPED, 2 FAILED

real    0m35.550s
user    0m4.998s
sys     0m39.890s

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-5-andrii@kernel.org
2021-10-25 14:45:46 -07:00
Andrii Nakryiko
2c0f51ac32 selftests/bpf: Mark tc_redirect selftest as serial
It seems to cause a lot of harm to kprobe/tracepoint selftests. Yucong
mentioned before that it does manipulate sysfs, which might be the
reason. So let's mark it as serial, though ideally it would be less
intrusive on the system at test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-4-andrii@kernel.org
2021-10-25 14:45:46 -07:00
Andrii Nakryiko
8ea688e7f4 selftests/bpf: Support multiple tests per file
Revamp how test discovery works for test_progs and allow multiple test
entries per file. Any global void function with no arguments and
serial_test_ or test_ prefix is considered a test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-3-andrii@kernel.org
2021-10-25 14:45:46 -07:00
Andrii Nakryiko
6972dc3b87 selftests/bpf: Normalize selftest entry points
Ensure that all test entry points are global void functions with no
input arguments. Mark few subtest entry points as static.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-2-andrii@kernel.org
2021-10-25 14:45:45 -07:00
Daniel Latypov
2ab5d5e67f kunit: tool: continue past invalid utf-8 output
kunit.py currently crashes and fails to parse kernel output if it's not
fully valid utf-8.

This can come from memory corruption or just inadvertently printing
out binary data as strings.

E.g. adding this line into a kunit test
  pr_info("\x80")
will cause this exception
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position
  1961: invalid start byte

We can tell Python how to handle errors, see
https://docs.python.org/3/library/codecs.html#error-handlers

Unfortunately, it doesn't seem like there's a way to specify this in
just one location, so we need to repeat ourselves quite a bit.

Specify `errors='backslashreplace'` so we instead:
* print out the offending byte as '\x80'
* try and continue parsing the output.
  * as long as the TAP lines themselves are valid, we're fine.

Fixed spelling/grammar in commit log:
Shuah Khan <<skhan@linuxfoundation.org>

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-25 13:06:45 -06:00
John Garry
342cb7ebf5 perf jevents: Fix some would-be warnings
Before enabling warnings through HOSTCFLAGS, fix the would-be warnings:

    HOSTCC  pmu-events/jevents.o
  pmu-events/jevents.c:74:22: warning: no previous prototype for ‘convert’ [-Wmissing-prototypes]
     74 | enum aggr_mode_class convert(const char *aggr_mode)
        |                      ^~~~~~~
  pmu-events/jevents.c: In function ‘print_events_table_entry’:
  pmu-events/jevents.c:373:8: warning: declaration of ‘topic’ shadows a global declaration [-Wshadow]
    373 |  char *topic = pd->topic;
        |        ^~~~~
  pmu-events/jevents.c:316:14: note: shadowed declaration is here
    316 | static char *topic;
        |              ^~~~~
  pmu-events/jevents.c: In function ‘json_events’:
  pmu-events/jevents.c:554:9: warning: declaration of ‘func’ shadows a global declaration [-Wshadow]
    554 |   int (*func)(void *data, struct json_event *je),
        |   ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  pmu-events/jevents.c:85:15: note: shadowed declaration is here
     85 | typedef int (*func)(void *data, struct json_event *je);
        |               ^~~~
  pmu-events/jevents.c: In function ‘main’:
  pmu-events/jevents.c:1211:25: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
   1211 |  char *err_string_ext = "";
        |                         ^~
  pmu-events/jevents.c:1304:17: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
   1304 |  err_string_ext = " for std arch event";
        |                 ^

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1634807805-40093-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:42 -03:00
James Clark
d4145960e5 perf dso: Fix /proc/kcore access on 32 bit systems
Because _LARGEFILE64_SOURCE is set in perf, file offset sizes can be
64 bits. If a workflow needs to open /proc/kcore on a 32 bit system (for
example to decode Arm ETM kernel trace) then the size value will be
wrapped to 32 bits in the function file_size() at this line:

  dso->data.file_size = st.st_size;

Setting the file_size member to be u64 fixes the issue and allows
/proc/kcore to be opened.

Reported-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20211021112700.112499-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:42 -03:00
Adrian Hunter
e277ac28df perf build: Suppress 'rm dlfilter' build message
The following build message:

	rm dlfilters/dlfilter-test-api-v0.o

is unwanted.

The object file is being treated as an intermediate file and being
automatically removed. Mark the object file as .SECONDARY to prevent
removal and hence the message.

Requested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210930062849.110416-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:42 -03:00
Jin Yao
0e0ae87422 perf list: Display hybrid PMU events with cpu type
Add a new option '--cputype' to 'perf list' to display core-only PMU
events or atom-only PMU events.

Each hybrid PMU event has been assigned with a PMU name, this patch
compares the PMU name before listing the result.

For example:

  perf list --cputype atom
  ...
  cache:
    core_reject_l2q.any
         [Counts the number of request that were not accepted into the L2Q because the L2Q is FULL. Unit: cpu_atom]
  ...

The "Unit: cpu_atom" is displayed in the brief description section
to indicate this is an atom event.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210903025239.22754-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:42 -03:00
Athira Rajeev
83e1ada67a perf powerpc: Add support to expose instruction and data address registers as part of extended regs
This patch enables presenting Sampled Instruction Address Register
(SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended
registers for the perf tool.

Add these SPR's to sample_reg_mask in the tool side (to use with -I?
option).

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20211018114948.16830-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:42 -03:00
Athira Rajeev
637b8b90fe perf powerpc: Refactor the code definition of perf reg extended mask in tools side header file
PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 defines the mask value
for extended registers. Current definition of these mask values uses hex
constant and does not use registers by name, making it less readable.
Patch refactor the macro values in perf tools side header file by or'ing
together the actual register value constants.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20211018114948.16830-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:42 -03:00
Alexey Bayduraev
25900ea85c perf session: Introduce reader EOF function
Introduce function to check end-of-file status.

Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/r/b3b0e0904da01f9ec84d4ae9368df99ecd231598.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Alexey Bayduraev
4c0028864c perf session: Introduce reader return codes
Add READER_OK and READER_NODATA return codes to make the code more
clear.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/5fca481e91c3c5d2ba033d4c6e9b969f8033ab0f.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Alexey Bayduraev
5c10dc9244 perf session: Move the event read code to a separate function
Separate the reading code of a single event to a new
reader__read_event() function.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/ffe570d937138dd24f282978ce7ed9c46a06ff9b.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Alexey Bayduraev
de096489d0 perf session: Move unmap code to reader__mmap
Move the unmapping code to reader__mmap(), so that the mmap code is
located together.

Move the head/file_offset computation to reader__mmap(), so all the
offset computation is located together and in one place only.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/f1c5e17cfa1ecfe912d10b411be203b55d148bc7.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Alexey Bayduraev
06763e7b30 perf session: Move reader map code to a separate function
Move the mapping code into a separate reader__mmap() function.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/e445de5bb85bbd91287986802d6ed0ce1b419b5a.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Alexey Bayduraev
5965063094 perf session: Move init/release code to separate functions
Separate init/release code into reader__init() and reader__release_decomp()
functions.

Remove a duplicate call to ui_progress__init_size(), the same call can
be found in __perf_session__process_events().

For multiple traces ui_progress should be initialized by total size
before reader__init() calls.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/8bacf247de220be8e57af1d2b796322175f5e257.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Alexey Bayduraev
3a3535e67d perf session: Introduce decompressor in reader object
Introduce a decompressor data structure with pointers to decomp
objects and to zstd object.

We cannot just move session->zstd_data to decomp_data as
session->zstd_data is not only used for decompression.

Adding decompressor data object to reader object and introducing
active_decomp into perf_session object to select current decompressor.

Thus decompression could be executed separately for each data file.

Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/0eee270cb52aebcbd029c8445d9009fd17709d53.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Alexey Bayduraev
529b6fbca0 perf session: Move all state items to reader object
We need all the state info about reader in separate object to load data
from multiple files, so we can keep multiple readers at the same time.
Moving all items that need to be kept from reader__process_events to
the reader object. Introducing mmap_cur to keep current mapping.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/5c7bdebfaadd7fcb729bd999b181feccaa292e8e.1634113027.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:41 -03:00
Adrian Hunter
dedcc0ea6d perf intel-pt: Add support for PERF_RECORD_AUX_OUTPUT_HW_ID
Originally, software only supported redirecting at most one PEBS event to
Intel PT (PEBS-via-PT) because it was not able to differentiate one event
from another. To overcome that, add support for the
PERF_RECORD_AUX_OUTPUT_HW_ID side-band event.

Committer notes:

Cast the pointer arg to for_each_set_bit() to (unsigned long *), to fix
the build on 32-bit systems.

Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210907163903.11820-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25 13:47:05 -03:00
Shuah Khan
dd40f44eab selftests: x86: fix [-Wstringop-overread] warn in test_process_vm_readv()
Fix the following [-Wstringop-overread] by passing in the variable
instead of the value.

test_vsyscall.c: In function ‘test_process_vm_readv’:
test_vsyscall.c:500:22: warning: ‘__builtin_memcmp_eq’ specified bound 4096 exceeds source size 0 [-Wstringop-overread]
  500 |                 if (!memcmp(buf, (const void *)0xffffffffff600000, 4096)) {
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-25 09:27:20 -06:00
Ido Schimmel
e860419684 selftests: mlxsw: Reduce test run time
Instead of iterating over all the available trap policers, only perform
the tests with three policers: The first, the last and the one in the
middle of the range. On a Spectrum-3 system, this reduces the run time
from almost an hour to a few minutes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:10:11 +01:00
Ido Schimmel
535ac9a5fb selftests: mlxsw: Use permanent neighbours instead of reachable ones
The nexthop objects tests configure dummy reachable neighbours so that
the nexthops will have a MAC address and be programmed to the device.

Since these are dummy reachable neighbours, they can be transitioned by
the kernel to a failed state if they are around for too long. This can
happen, for example, if the "TIMEOUT" variable is configured with a too
high value.

Make the tests more robust by configuring the neighbours as permanent,
so that the tests do not depend on the configured timeout value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:10:11 +01:00
Petr Machata
b8bfafe434 selftests: mlxsw: Add helpers for skipping selftests
A number of mlxsw-specific selftests currently detect whether they are run
on a compatible machine, and bail out silently when not. These tests are
however done in a somewhat impenetrable manner by directly comparing PCI
IDs against a blacklist or a whitelist, and bailing out silently if the
machine is not compatible.

Instead, add a helper, mlxsw_only_on_spectrum(), which allows specifying
the supported machines in a human-readable manner. If the current machine
is incompatible, the helper emits a SKIP message and returns an error code,
based on which the caller can gracefully bail out in a suitable way. This
allows a more readable conditions such as:

	mlxsw_only_on_spectrum 2+ || return

Convert all existing open-coded guards to the new helper. Also add two new
guards to do_mark_test() and do_drop_test(), which are supported only on
Spectrum-2+, but the corresponding check was not there.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:10:11 +01:00
Vladimir Oltean
eccd0a80dc selftests: net: dsa: add a stress test for unlocked FDB operations
This test is a bit strange in that it is perhaps more manual than
others: it does not transmit a clear OK/FAIL verdict, because user space
does not have synchronous feedback from the kernel. If a hardware access
fails, it is in deferred context.

Nonetheless, on sja1105 I have used it successfully to find and solve a
concurrency issue, so it can be used as a starting point for other
driver maintainers too.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 12:59:42 +01:00
Vladimir Oltean
d70b51f284 selftests: lib: forwarding: allow tests to not require mz and jq
These programs are useful, but not all selftests require them.

Additionally, on embedded boards without package management (things like
buildroot), installing mausezahn or jq is not always as trivial as
downloading a package from the web.

So it is actually a bit annoying to require programs that are not used.
Introduce options that can be set by scripts to not enforce these
dependencies. For compatibility, default to "yes".

Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Cc: Guillaume Nault <gnault@redhat.com>
Cc: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 12:59:42 +01:00
David S. Miller
2d7e73f09f Revert "Merge branch 'dsa-rtnl'"
This reverts commit 965e6b262f, reversing
changes made to 4d98bb0d7e.
2021-10-25 12:59:25 +01:00
Kees Cook
d46e58ef77 lkdtm/bugs: Check that a per-task stack canary exists
Introduce REPORT_STACK_CANARY to check for differing stack canaries
between two processes (i.e. that an architecture is correctly implementing
per-task stack canaries), using the task_struct canary as the hint to
locate in the stack. Requires that one of the processes being tested
not be pid 1.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211022223826.330653-3-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-25 09:13:46 +02:00
Kees Cook
149538cd55 selftests/lkdtm: Add way to repeat a test
Some LKDTM tests need to be run more than once (usually to setup and
then later trigger). Until now, the only case was the SOFT_LOCKUP test,
which wasn't useful to run in the bulk selftests. The coming stack canary
checking needs to run twice, so support this with a new test output prefix
"repeat".

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211022223826.330653-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-25 09:13:46 +02:00
Vladimir Oltean
edc90d1585 selftests: net: dsa: add a stress test for unlocked FDB operations
This test is a bit strange in that it is perhaps more manual than
others: it does not transmit a clear OK/FAIL verdict, because user space
does not have synchronous feedback from the kernel. If a hardware access
fails, it is in deferred context.

Nonetheless, on sja1105 I have used it successfully to find and solve a
concurrency issue, so it can be used as a starting point for other
driver maintainers too.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-24 13:47:45 +01:00
Vladimir Oltean
016748961b selftests: lib: forwarding: allow tests to not require mz and jq
These programs are useful, but not all selftests require them.

Additionally, on embedded boards without package management (things like
buildroot), installing mausezahn or jq is not always as trivial as
downloading a package from the web.

So it is actually a bit annoying to require programs that are not used.
Introduce options that can be set by scripts to not enforce these
dependencies. For compatibility, default to "yes".

Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Cc: Guillaume Nault <gnault@redhat.com>
Cc: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-24 13:47:45 +01:00
Andrii Nakryiko
c825f5fee1 libbpf: Fix BTF header parsing checks
Original code assumed fixed and correct BTF header length. That's not
always the case, though, so fix this bug with a proper additional check.
And use actual header length instead of sizeof(struct btf_header) in
sanity checks.

Fixes: 8a138aed4a ("bpf: btf: Add BTF support to libbpf")
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211023003157.726961-2-andrii@kernel.org
2021-10-22 17:33:31 -07:00
Andrii Nakryiko
5245dafe3d libbpf: Fix overflow in BTF sanity checks
btf_header's str_off+str_len or type_off+type_len can overflow as they
are u32s. This will lead to bypassing the sanity checks during BTF
parsing, resulting in crashes afterwards. Fix by using 64-bit signed
integers for comparison.

Fixes: d812362450 ("libbpf: Fix BTF data layout checks and allow empty BTF")
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211023003157.726961-1-andrii@kernel.org
2021-10-22 17:33:31 -07:00
Yonghong Song
8c18ea2d2c selftests/bpf: Add BTF_KIND_DECL_TAG typedef example in tag.c
Change value type in progs/tag.c to a typedef with a btf_decl_tag.
With `bpftool btf dump file tag.o`, we have
  ...
  [14] TYPEDEF 'value_t' type_id=17
  [15] DECL_TAG 'tag1' type_id=14 component_idx=-1
  [16] DECL_TAG 'tag2' type_id=14 component_idx=-1
  [17] STRUCT '(anon)' size=8 vlen=2
        'a' type_id=2 bits_offset=0
        'b' type_id=2 bits_offset=32
  ...

The btf_tag selftest also succeeded:
  $ ./test_progs -t tag
    #21 btf_tag:OK
    Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021195643.4020315-1-yhs@fb.com
2021-10-22 17:04:44 -07:00
Yonghong Song
557c8c4804 selftests/bpf: Test deduplication for BTF_KIND_DECL_TAG typedef
Add unit tests for deduplication of BTF_KIND_DECL_TAG to typedef types.
Also changed a few comments from "tag" to "decl_tag" to match
BTF_KIND_DECL_TAG enum value name.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021195638.4019770-1-yhs@fb.com
2021-10-22 17:04:44 -07:00
Yonghong Song
9d19a12b02 selftests/bpf: Add BTF_KIND_DECL_TAG typedef unit tests
Test good and bad variants of typedef BTF_KIND_DECL_TAG encoding.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021195633.4019472-1-yhs@fb.com
2021-10-22 17:04:43 -07:00
Stanislav Fomichev
d1321207b1 selftests/bpf: Fix flow dissector tests
- update custom loader to search by name, not section name
- update bpftool commands to use proper pin path

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211021214814.1236114-4-sdf@google.com
2021-10-22 16:53:38 -07:00
Stanislav Fomichev
a77f879ba1 libbpf: Use func name when pinning programs with LIBBPF_STRICT_SEC_NAME
We can't use section name anymore because they are not unique
and pinning objects with multiple programs with the same
progtype/secname will fail.

  [0] Closes: https://github.com/libbpf/libbpf/issues/273

Fixes: 33a2c75c55 ("libbpf: add internal pin_name")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211021214814.1236114-2-sdf@google.com
2021-10-22 16:53:11 -07:00
Quentin Monnet
e89ef634f8 bpftool: Avoid leaking the JSON writer prepared for program metadata
Bpftool creates a new JSON object for writing program metadata in plain
text mode, regardless of metadata being present or not. Then this writer
is freed if any metadata has been found and printed, but it leaks
otherwise. We cannot destroy the object unconditionally, because the
destructor prints an undesirable line break. Instead, make sure the
writer is created only after we have found program metadata to print.

Found with valgrind.

Fixes: aff52e685e ("bpftool: Support dumping metadata")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022094743.11052-1-quentin@isovalent.com
2021-10-22 16:44:56 -07:00
Hengqi Chen
487ef148cf selftests/bpf: Switch to new btf__type_cnt/btf__raw_data APIs
Replace the calls to btf__get_nr_types/btf__get_raw_data in
selftests with new APIs btf__type_cnt/btf__raw_data. The old
APIs will be deprecated in libbpf v0.7+.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022130623.1548429-6-hengqi.chen@gmail.com
2021-10-22 16:09:14 -07:00
Hengqi Chen
58fc155b0e bpftool: Switch to new btf__type_cnt API
Replace the call to btf__get_nr_types with new API btf__type_cnt.
The old API will be deprecated in libbpf v0.7+. No functionality
change.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022130623.1548429-5-hengqi.chen@gmail.com
2021-10-22 16:09:14 -07:00
Hengqi Chen
2d8f09fafc tools/resolve_btfids: Switch to new btf__type_cnt API
Replace the call to btf__get_nr_types with new API btf__type_cnt.
The old API will be deprecated in libbpf v0.7+. No functionality
change.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022130623.1548429-4-hengqi.chen@gmail.com
2021-10-22 16:09:14 -07:00
Hengqi Chen
2502e74bb5 perf bpf: Switch to new btf__raw_data API
Replace the call to btf__get_raw_data with new API btf__raw_data.
The old APIs will be deprecated in libbpf v0.7+. No functionality
change.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022130623.1548429-3-hengqi.chen@gmail.com
2021-10-22 16:09:14 -07:00
Hengqi Chen
6a886de070 libbpf: Add btf__type_cnt() and btf__raw_data() APIs
Add btf__type_cnt() and btf__raw_data() APIs and deprecate
btf__get_nr_type() and btf__get_raw_data() since the old APIs
don't follow the libbpf naming convention for getters which
omit 'get' in the name (see [0]). btf__raw_data() is just an
alias to the existing btf__get_raw_data(). btf__type_cnt()
now returns the number of all types of the BTF object
including 'void'.

  [0] Closes: https://github.com/libbpf/libbpf/issues/279

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022130623.1548429-2-hengqi.chen@gmail.com
2021-10-22 16:09:14 -07:00
Mauricio Vásquez
1000298c76 libbpf: Fix memory leak in btf__dedup()
Free btf_dedup if btf_ensure_modifiable() returns error.

Fixes: 919d2b1dbb ("libbpf: Allow modification of BTF and add btf__add_str API")
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022202035.48868-1-mauricio@kinvolk.io
2021-10-22 16:00:53 -07:00
Andrii Nakryiko
57385ae31f selftests/bpf: Make perf_buffer selftests work on 4.9 kernel again
Recent change to use tp/syscalls/sys_enter_nanosleep for perf_buffer
selftests causes this selftest to fail on 4.9 kernel in libbpf CI ([0]):

  libbpf: prog 'handle_sys_enter': failed to attach to perf_event FD 6: Invalid argument
  libbpf: prog 'handle_sys_enter': failed to attach to tracepoint 'syscalls/sys_enter_nanosleep': Invalid argument

It's not exactly clear why, because perf_event itself is created for
this tracepoint, but I can't even compile 4.9 kernel locally, so it's
hard to figure this out. If anyone has better luck and would like to
help investigating this, I'd really appreciate this.

For now, unblock CI by switching back to raw_syscalls/sys_enter, but reduce
amount of unnecessary samples emitted by filter by process ID. Use
explicit ARRAY map for that to make it work on 4.9 as well, because
global data isn't yet supported there.

Fixes: aa274f98b2 ("selftests/bpf: Fix possible/online index mismatch in perf_buffer test")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022201342.3490692-1-andrii@kernel.org
2021-10-22 14:26:33 -07:00
Andrii Nakryiko
fae1b05e6f libbpf: Fix the use of aligned attribute
Building libbpf sources out of kernel tree (in Github repo) we run into
compilation error due to unknown __aligned attribute. It must be coming
from some kernel header, which is not available to Github sources. Use
explicit __attribute__((aligned(16))) instead.

Fixes: 961632d541 ("libbpf: Fix dumping non-aligned __int128")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022192502.2975553-1-andrii@kernel.org
2021-10-22 14:24:36 -07:00
Florian Westphal
1f83b835a3 fcnal-test: kill hanging ping/nettest binaries on cleanup
On my box I see a bunch of ping/nettest processes hanging
around after fcntal-test.sh is done.

Clean those up before netns deletion.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211021140247.29691-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 14:03:18 -07:00
Jim Mattson
ed290e1c20 KVM: selftests: Fix nested SVM tests when built with clang
Though gcc conveniently compiles a simple memset to "rep stos," clang
prefers to call the libc version of memset. If a test is dynamically
linked, the libc memset isn't available in L1 (nor is the PLT or the
GOT, for that matter). Even if the test is statically linked, the libc
memset may choose to use some CPU features, like AVX, which may not be
enabled in L1. Note that __builtin_memset doesn't solve the problem,
because (a) the compiler is free to call memset anyway, and (b)
__builtin_memset may also choose to use features like AVX, which may
not be available in L1.

To avoid a myriad of problems, use an explicit "rep stos" to clear the
VMCB in generic_svm_setup(), which is called both from L0 and L1.

Reported-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Fixes: 20ba262f86 ("selftests: KVM: AMD Nested test infrastructure")
Message-Id: <20210930003649.4026553-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 12:46:37 -04:00
David S. Miller
bdfa75ad70 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Lots of simnple overlapping additions.

With a build fix from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-22 11:41:16 +01:00
Michael Roth
413eaa4ecd KVM: selftests: set CPUID before setting sregs in vcpu creation
Recent kernels have checks to ensure the GPA values in special-purpose
registers like CR3 are within the maximum physical address range and
don't overlap with anything in the upper/reserved range. In the case of
SEV kselftest guests booting directly into 64-bit mode, CR3 needs to be
initialized to the GPA of the page table root, with the encryption bit
set. The kernel accounts for this encryption bit by removing it from
reserved bit range when the guest advertises the bit position via
KVM_SET_CPUID*, but kselftests currently call KVM_SET_SREGS as part of
vm_vcpu_add_default(), before KVM_SET_CPUID*.

As a result, KVM_SET_SREGS will return an error in these cases.
Address this by moving vcpu_set_cpuid() (which calls KVM_SET_CPUID*)
ahead of vcpu_setup() (which calls KVM_SET_SREGS).

While there, address a typo in the assertion that triggers when
KVM_SET_SREGS fails.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20211006203617.13045-1-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Nathan Tempelman <natet@google.com>
2021-10-22 05:19:29 -04:00
Linus Torvalds
6c2c712767 Networking fixes for 5.15-rc7, including fixes from netfilter, and can.
Current release - regressions:
 
  - revert "vrf: reset skb conntrack connection on VRF rcv",
    there are valid uses for previous behavior
 
  - can: m_can: fix iomap_read_fifo() and iomap_write_fifo()
 
 Current release - new code bugs:
 
  - mlx5: e-switch, return correct error code on group creation failure
 
 Previous releases - regressions:
 
  - sctp: fix transport encap_port update in sctp_vtag_verify
 
  - stmmac: fix E2E delay mechanism (in PTP timestamping)
 
 Previous releases - always broken:
 
  - netfilter: ip6t_rt: fix out-of-bounds read of ipv6_rt_hdr
 
  - netfilter: xt_IDLETIMER: fix out-of-bound read caused by lack of init
 
  - netfilter: ipvs: make global sysctl read-only in non-init netns
 
  - tcp: md5: fix selection between vrf and non-vrf keys
 
  - ipv6: count rx stats on the orig netdev when forwarding
 
  - bridge: mcast: use multicast_membership_interval for IGMPv3
 
  - can:
    - j1939: fix UAF for rx_kref of j1939_priv
             abort sessions on receiving bad messages
 
    - isotp: fix TX buffer concurrent access in isotp_sendmsg()
             fix return error on FC timeout on TX path
 
  - ice: fix re-init of RDMA Tx queues and crash if RDMA was not inited
 
  - hns3: schedule the polling again when allocation fails,
    prevent stalls
 
  - drivers: add missing of_node_put() when aborting
    for_each_available_child_of_node()
 
  - ptp: fix possible memory leak and UAF in ptp_clock_register()
 
  - e1000e: fix packet loss in burst mode on Tiger Lake and later
 
  - mlx5e: ipsec: fix more checksum offload issues
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFxgHwACgkQMUZtbf5S
 IrvFgw//T73aR3B2Xvz5/1rtglfmtcUqFQsyGDXGD5HnfbAbsRcz8vcQ/mTsExl7
 +mJY/ZuQefsD7UQDyg3GNhbgf1+pEjHC81ryeNsfET7+JxgYLD3NEYSBYUqIFZUo
 gStAStGBG+ClQUaqlkGFyyf6GrqwpmxZKRr6F9fUsufQ14m9tvcT/QPcrXL4q7qX
 Fz644yUe/IvKnuJDHJVZsc8UXR9NTPCyCNJT9kVewwPMIMEc/xMOg5QONLZT0TlC
 Zk4XJIqlBBEQWrN/QwrGXm82aO+3gQZyD5K9AvpczgcBjOr6FJOmN6zkQrqNNWaC
 2wPAfWi7DALPtOZR6lCxoeWfLRfdn1ZOn5x2z5xrtAXCV2FTaMg8in9TzJ57qmcb
 /l43QzcNGSj1ytyny8pqgdsX2MSqs0O5VSG4egMtz7TeU/rs7uAx2IVHbPT8CHop
 PvhVHeUeu9lGu+FUK8piQbb5aVpbA9qlOj/rXNrHDIxdA9McQgVs+tljNG4X5KtX
 L7BR84wNg98HtIINVx6RjYz9lOpG1qBuw5RCiqiAaN1RBY7lYAhMaAE6U3azjgC+
 AIz/MacNuAz/oTuutQB6/0WZDDJhy4WEy3TrDLlpQNz6yIrpKFN+ftyF6DuVUSMH
 PmtZ4E/DLooQL5KwuoDdYDH1gSMlggBejeGHTFJ+RUMuvRePZQ8=
 =Hwqr
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, and can.

  We'll have one more fix for a socket accounting regression, it's still
  getting polished. Otherwise things look fine.

  Current release - regressions:

   - revert "vrf: reset skb conntrack connection on VRF rcv", there are
     valid uses for previous behavior

   - can: m_can: fix iomap_read_fifo() and iomap_write_fifo()

  Current release - new code bugs:

   - mlx5: e-switch, return correct error code on group creation failure

  Previous releases - regressions:

   - sctp: fix transport encap_port update in sctp_vtag_verify

   - stmmac: fix E2E delay mechanism (in PTP timestamping)

  Previous releases - always broken:

   - netfilter: ip6t_rt: fix out-of-bounds read of ipv6_rt_hdr

   - netfilter: xt_IDLETIMER: fix out-of-bound read caused by lack of
     init

   - netfilter: ipvs: make global sysctl read-only in non-init netns

   - tcp: md5: fix selection between vrf and non-vrf keys

   - ipv6: count rx stats on the orig netdev when forwarding

   - bridge: mcast: use multicast_membership_interval for IGMPv3

   - can:
      - j1939: fix UAF for rx_kref of j1939_priv abort sessions on
        receiving bad messages

      - isotp: fix TX buffer concurrent access in isotp_sendmsg() fix
        return error on FC timeout on TX path

   - ice: fix re-init of RDMA Tx queues and crash if RDMA was not inited

   - hns3: schedule the polling again when allocation fails, prevent
     stalls

   - drivers: add missing of_node_put() when aborting
     for_each_available_child_of_node()

   - ptp: fix possible memory leak and UAF in ptp_clock_register()

   - e1000e: fix packet loss in burst mode on Tiger Lake and later

   - mlx5e: ipsec: fix more checksum offload issues"

* tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits)
  usbnet: sanity check for maxpacket
  net: enetc: make sure all traffic classes can send large frames
  net: enetc: fix ethtool counter name for PM0_TERR
  ptp: free 'vclock_index' in ptp_clock_release()
  sfc: Don't use netif_info before net_device setup
  sfc: Export fibre-specific supported link modes
  net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flags
  net/mlx5e: IPsec: Fix a misuse of the software parser's fields
  net/mlx5e: Fix vlan data lost during suspend flow
  net/mlx5: E-switch, Return correct error code on group creation failure
  net/mlx5: Lag, change multipath and bonding to be mutually exclusive
  ice: Add missing E810 device ids
  igc: Update I226_K device ID
  e1000e: Fix packet loss on Tiger Lake and later
  e1000e: Separate TGP board type from SPT
  ptp: Fix possible memory leak in ptp_clock_register()
  net: stmmac: Fix E2E delay mechanism
  nfc: st95hf: Make spi remove() callback return zero
  net: hns3: disable sriov before unload hclge layer
  net: hns3: fix vf reset workqueue cannot exit
  ...
2021-10-21 15:36:50 -10:00
Andrii Nakryiko
4f2511e199 selftests/bpf: Switch to ".bss"/".rodata"/".data" lookups for internal maps
Utilize libbpf's feature of allowing to lookup internal maps by their
ELF section names. No need to guess or calculate the exact truncated
prefix taken from the object name.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-11-andrii@kernel.org
2021-10-21 17:10:11 -07:00
Andrii Nakryiko
26071635ac libbpf: Simplify look up by name of internal maps
Map name that's assigned to internal maps (.rodata, .data, .bss, etc)
consist of a small prefix of bpf_object's name and ELF section name as
a suffix. This makes it hard for users to "guess" the name to use for
looking up by name with bpf_object__find_map_by_name() API.

One proposal was to drop object name prefix from the map name and just
use ".rodata", ".data", etc, names. One downside called out was that
when multiple BPF applications are active on the host, it will be hard
to distinguish between multiple instances of .rodata and know which BPF
object (app) they belong to. Having few first characters, while quite
limiting, still can give a bit of a clue, in general.

Note, though, that btf_value_type_id for such global data maps (ARRAY)
points to DATASEC type, which encodes full ELF name, so tools like
bpftool can take advantage of this fact to "recover" full original name
of the map. This is also the reason why for custom .data.* and .rodata.*
maps libbpf uses only their ELF names and doesn't prepend object name at
all.

Another downside of such approach is that it is not backwards compatible
and, among direct use of bpf_object__find_map_by_name() API, will break
any BPF skeleton generated using bpftool that was compiled with older
libbpf version.

Instead of causing all this pain, libbpf will still generate map name
using a combination of object name and ELF section name, but it will
allow looking such maps up by their natural names, which correspond to
their respective ELF section names. This means non-truncated ELF section
names longer than 15 characters are going to be expected and supported.

With such set up, we get the best of both worlds: leave small bits of
a clue about BPF application that instantiated such maps, as well as
making it easy for user apps to lookup such maps at runtime. In this
sense it closes corresponding libbpf 1.0 issue ([0]).

BPF skeletons will continue using full names for lookups.

  [0] Closes: https://github.com/libbpf/libbpf/issues/275

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-10-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
30c5bd9647 selftests/bpf: Demonstrate use of custom .rodata/.data sections
Enhance existing selftests to demonstrate the use of custom
.data/.rodata sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-9-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
aed659170a libbpf: Support multiple .rodata.* and .data.* BPF maps
Add support for having multiple .rodata and .data data sections ([0]).
.rodata/.data are supported like the usual, but now also
.rodata.<whatever> and .data.<whatever> are also supported. Each such
section will get its own backing BPF_MAP_TYPE_ARRAY, just like
.rodata and .data.

Multiple .bss maps are not supported, as the whole '.bss' name is
confusing and might be deprecated soon, as well as user would need to
specify custom ELF section with SEC() attribute anyway, so might as well
stick to just .data.* and .rodata.* convention.

User-visible map name for such new maps is going to be just their ELF
section names.

  [0] https://github.com/libbpf/libbpf/issues/274

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-8-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
ef9356d392 bpftool: Improve skeleton generation for data maps without DATASEC type
It can happen that some data sections (e.g., .rodata.cst16, containing
compiler populated string constants) won't have a corresponding BTF
DATASEC type. Now that libbpf supports .rodata.* and .data.* sections,
situation like that will cause invalid BPF skeleton to be generated that
won't compile successfully, as some parts of skeleton would assume
memory-mapped struct definitions for each special data section.

Fix this by generating empty struct definitions for such data sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-7-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
8654b4d35e bpftool: Support multiple .rodata/.data internal maps in skeleton
Remove the assumption about only single instance of each of .rodata and
.data internal maps. Nothing changes for '.rodata' and '.data' maps, but new
'.rodata.something' map will get 'rodata_something' section in BPF
skeleton for them (as well as having struct bpf_map * field in maps
section with the same field name).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-6-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
25bbbd7a44 libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps
Remove internal libbpf assumption that there can be only one .rodata,
.data, and .bss map per BPF object. To achieve that, extend and
generalize the scheme that was used for keeping track of relocation ELF
sections. Now each ELF section has a temporary extra index that keeps
track of logical type of ELF section (relocations, data, read-only data,
BSS). Switch relocation to this scheme, as well as .rodata/.data/.bss
handling.

We don't yet allow multiple .rodata, .data, and .bss sections, but no
libbpf internal code makes an assumption that there can be only one of
each and thus they can be explicitly referenced by a single index. Next
patches will actually allow multiple .rodata and .data sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-5-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
ad23b72384 libbpf: Use Elf64-specific types explicitly for dealing with ELF
Minimize the usage of class-agnostic gelf_xxx() APIs from libelf. These
APIs require copying ELF data structures into local GElf_xxx structs and
have a more cumbersome API. BPF ELF file is defined to be always 64-bit
ELF object, even when intended to be run on 32-bit host architectures,
so there is no need to do class-agnostic conversions everywhere. BPF
static linker implementation within libbpf has been using Elf64-specific
types since initial implementation.

Add two simple helpers, elf_sym_by_idx() and elf_rel_by_idx(), for more
succinct direct access to ELF symbol and relocation records within ELF
data itself and switch all the GElf_xxx usage into Elf64_xxx
equivalents. The only remaining place within libbpf.c that's still using
gelf API is gelf_getclass(), as there doesn't seem to be a direct way to
get underlying ELF bitness.

No functional changes intended.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-4-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
29a30ff501 libbpf: Extract ELF processing state into separate struct
Name currently anonymous internal struct that keeps ELF-related state
for bpf_object. Just a bit of clean up, no functional changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-3-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Andrii Nakryiko
b96c07f3b5 libbpf: Deprecate btf__finalize_data() and move it into libbpf.c
There isn't a good use case where anyone but libbpf itself needs to call
btf__finalize_data(). It was implemented for internal use and it's not
clear why it was made into public API in the first place. To function, it
requires active ELF data, which is stored inside bpf_object for the
duration of opening phase only. But the only BTF that needs bpf_object's
ELF is that bpf_object's BTF itself, which libbpf fixes up automatically
during bpf_object__open() operation anyways. There is no need for any
additional fix up and no reasonable scenario where it's useful and
appropriate.

Thus, btf__finalize_data() is just an API atavism and is better removed.
So this patch marks it as deprecated immediately (v0.6+) and moves the
code from btf.c into libbpf.c where it's used in the context of
bpf_object opening phase. Such code co-location allows to make code
structure more straightforward and remove bpf_object__section_size() and
bpf_object__variable_offset() internal helpers from libbpf_internal.h,
making them static. Their naming is also adjusted to not create
a wrong illusion that they are some sort of method of bpf_object. They
are internal helpers and are called appropriately.

This is part of libbpf 1.0 effort ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/276

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-2-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Jiri Olsa
99d099757a selftests/bpf: Use nanosleep tracepoint in perf buffer test
The perf buffer tests triggers trace with nanosleep syscall,
but monitors all syscalls, which results in lot of data in the
buffer and makes it harder to debug. Let's lower the trace
traffic and monitor just nanosleep syscall.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-4-jolsa@kernel.org
2021-10-21 15:59:31 -07:00
Jiri Olsa
aa274f98b2 selftests/bpf: Fix possible/online index mismatch in perf_buffer test
The perf_buffer fails on system with offline cpus:

  # test_progs -t perf_buffer
  serial_test_perf_buffer:PASS:nr_cpus 0 nsec
  serial_test_perf_buffer:PASS:nr_on_cpus 0 nsec
  serial_test_perf_buffer:PASS:skel_load 0 nsec
  serial_test_perf_buffer:PASS:attach_kprobe 0 nsec
  serial_test_perf_buffer:PASS:perf_buf__new 0 nsec
  serial_test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #4
  serial_test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  serial_test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  serial_test_perf_buffer:PASS:buf_cnt 0 nsec
  ...
  serial_test_perf_buffer:PASS:fd_check 0 nsec
  serial_test_perf_buffer:PASS:drain_buf 0 nsec
  serial_test_perf_buffer:PASS:consume_buf 0 nsec
  serial_test_perf_buffer:FAIL:cpu_seen cpu 5 not seen
  #88 perf_buffer:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

If the offline cpu is from the middle of the possible set,
we get mismatch with possible and online cpu buffers.

The perf buffer test calls perf_buffer__consume_buffer for
all 'possible' cpus, but the library holds only 'online'
cpu buffers and perf_buffer__consume_buffer returns them
based on index.

Adding extra (online) index to keep track of online buffers,
we need the original (possible) index to trigger trace on
proper cpu.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-3-jolsa@kernel.org
2021-10-21 15:59:28 -07:00
Jiri Olsa
d4121376ac selftests/bpf: Fix perf_buffer test on system with offline cpus
The perf_buffer fails on system with offline cpus:

  # test_progs -t perf_buffer
  test_perf_buffer:PASS:nr_cpus 0 nsec
  test_perf_buffer:PASS:nr_on_cpus 0 nsec
  test_perf_buffer:PASS:skel_load 0 nsec
  test_perf_buffer:PASS:attach_kprobe 0 nsec
  test_perf_buffer:PASS:perf_buf__new 0 nsec
  test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #24
  skipping offline CPU #25
  skipping offline CPU #26
  skipping offline CPU #27
  skipping offline CPU #28
  skipping offline CPU #29
  skipping offline CPU #30
  skipping offline CPU #31
  test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  test_perf_buffer:FAIL:buf_cnt got 24, expected 32
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Changing the test to check online cpus instead of possible.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-2-jolsa@kernel.org
2021-10-21 15:59:20 -07:00
Dave Marchevsky
e1b9023fc7 selftests/bpf: Add verif_stats test
verified_insns field was added to response of bpf_obj_get_info_by_fd
call on a prog. Confirm that it's being populated by loading a simple
program and asking for its info.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211020074818.1017682-3-davemarchevsky@fb.com
2021-10-21 15:51:47 -07:00
Dave Marchevsky
aba64c7da9 bpf: Add verified_insns to bpf_prog_info and fdinfo
This stat is currently printed in the verifier log and not stored
anywhere. To ease consumption of this data, add a field to bpf_prog_aux
so it can be exposed via BPF_OBJ_GET_INFO_BY_FD and fdinfo.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211020074818.1017682-2-davemarchevsky@fb.com
2021-10-21 15:51:47 -07:00
Ilya Leoshkevich
632f96d265 libbpf: Fix ptr_is_aligned() usages
Currently ptr_is_aligned() takes size, and not alignment, as a
parameter, which may be overly pessimistic e.g. for __i128 on s390,
which must be only 8-byte aligned. Fix by using btf__align_of().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211021104658.624944-2-iii@linux.ibm.com
2021-10-21 15:50:04 -07:00
Hengqi Chen
b6c4e71516 selftests/bpf: Test bpf_skc_to_unix_sock() helper
Add a new test which triggers unix_listen kernel function
to test bpf_skc_to_unix_sock helper.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021134752.1223426-3-hengqi.chen@gmail.com
2021-10-21 15:11:06 -07:00
Hengqi Chen
9eeb3aa33a bpf: Add bpf_skc_to_unix_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a unix_sock pointer.
The return value could be NULL if the casting is illegal.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021134752.1223426-2-hengqi.chen@gmail.com
2021-10-21 15:11:06 -07:00
Shuah Khan
c3867ab592 selftests: kvm: fix mismatched fclose() after popen()
get_warnings_count() does fclose() using File * returned from popen().
Fix it to call pclose() as it should.

tools/testing/selftests/kvm/x86_64/mmio_warning_test
x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:87:9: warning: ‘fclose’ called on pointer returned from a mismatched allocation function [-Wmismatched-dealloc]
   87 |         fclose(f);
      |         ^~~~~~~~~
x86_64/mmio_warning_test.c:84:13: note: returned from ‘popen’
   84 |         f = popen("dmesg | grep \"WARNING:\" | wc -l", "r");
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-21 15:16:28 -06:00
David S. Miller
1439caa1d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter fixes for net:

1) Crash due to missing initialization of timer data in
   xt_IDLETIMER, from Juhee Kang.

2) NF_CONNTRACK_SECMARK should be bool in Kconfig, from Vegard Nossum.

3) Skip netdev events on netns removal, from Florian Westphal.

4) Add testcase to show port shadowing via UDP, also from Florian.

5) Remove pr_debug() code in ip6t_rt, this fixes a crash due to
   unsafe access to non-linear skbuff, from Xin Long.

6) Make net/ipv4/vs/debug_level read-only from non-init netns,
   from Antoine Tenart.

7) Remove bogus invocation to bash in selftests/netfilter/nft_flowtable.sh
   also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:32:41 +01:00
Marc Zyngier
5a2acbbb01 Merge branch kvm/selftests/memslot into kvmarm-master/next
* kvm/selftests/memslot:
  : .
  : Enable KVM memslot selftests on arm64, making them less
  : x86 specific.
  : .
  KVM: selftests: Build the memslot tests for arm64
  KVM: selftests: Make memslot_perf_test arch independent

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-21 11:40:03 +01:00
Ricardo Koller
358928fd52 KVM: selftests: Build the memslot tests for arm64
Add memslot_perf_test and memslot_modification_stress_test to the list
of aarch64 selftests.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210907180957.609966-3-ricarkol@google.com
2021-10-21 11:36:41 +01:00
Ricardo Koller
ffb4ce3c49 KVM: selftests: Make memslot_perf_test arch independent
memslot_perf_test uses ucalls for synchronization between guest and
host. Ucalls API is architecture independent: tests do not need to know
details like what kind of exit they generate on a specific arch.  More
specifically, there is no need to check whether an exit is KVM_EXIT_IO
in x86 for the host to know that the exit is ucall related, as
get_ucall() already makes that check.

Change memslot_perf_test to not require specifying what exit does a
ucall generate. Also add a missing ucall_init.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210907180957.609966-2-ricarkol@google.com
2021-10-21 11:36:34 +01:00
Mark Brown
260ea4ba94 selftests: arm64: Factor out utility functions for assembly FP tests
The various floating point test programs written in assembly have a bunch
of helper functions and macros which are cut'n'pasted between them. Factor
them out into a separate source file which is linked into all of them.

We don't include memcmp() since it isn't as generic as it should be and
directly branches to report an error in the programs.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019181851.3341232-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 11:11:27 +01:00
Brendan Jackman
7960d02ddd selftests/bpf: Some more atomic tests
Some new verifier tests that hit some important gaps in the parameter
space for atomic ops.

There are already exhaustive tests for the JIT part in
lib/test_bpf.c, but these exercise the verifier too.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211015093318.1273686-1-jackmanb@google.com
2021-10-20 18:09:08 -07:00
Ilya Leoshkevich
961632d541 libbpf: Fix dumping non-aligned __int128
Non-aligned integers are dumped as bitfields, which is supported for at
most 64-bit integers. Fix by using the same trick as
btf_dump_float_data(): copy non-aligned values to the local buffer.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-4-iii@linux.ibm.com
2021-10-20 11:40:45 -07:00
Ilya Leoshkevich
c9e982b879 libbpf: Fix dumping big-endian bitfields
On big-endian arches not only bytes, but also bits are numbered in
reverse order (see e.g. S/390 ELF ABI Supplement, but this is also true
for other big-endian arches as well).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-3-iii@linux.ibm.com
2021-10-20 11:40:01 -07:00
Ilya Leoshkevich
b16d12f390 selftests/bpf: Use cpu_number only on arches that have it
cpu_number exists only on Intel and aarch64, so skip the test involing
it on other arches. An alternative would be to replace it with an
exported non-ifdefed primitive-typed percpu variable from the common
code, but there appears to be none.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-2-iii@linux.ibm.com
2021-10-20 11:40:01 -07:00
Quentin Monnet
efc36d6c64 bpftool: Remove useless #include to <perf-sys.h> from map_perf_ring.c
The header is no longer needed since the event_pipe implementation
was updated to rely on libbpf's perf_buffer. This makes bpftool free of
dependencies to perf files, and we can update the Makefile accordingly.

Fixes: 9b190f185d ("tools/bpftool: switch map event_pipe to libbpf's perf_buffer")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211020094826.16046-1-quentin@isovalent.com
2021-10-20 10:47:39 -07:00
Wan Jiabing
b8f49dce79 selftests/bpf: Remove duplicated include in cgroup_helpers
Fix following checkincludes.pl warning:
./scripts/checkincludes.pl tools/testing/selftests/bpf/cgroup_helpers.c
tools/testing/selftests/bpf/cgroup_helpers.c: unistd.h is included more
than once.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211012023231.19911-1-wanjiabing@vivo.com
2021-10-20 10:45:09 -07:00
Dave Marchevsky
ebc7b50a38 libbpf: Migrate internal use of bpf_program__get_prog_info_linear
In preparation for bpf_program__get_prog_info_linear deprecation, move
the single use in libbpf to call bpf_obj_get_info_by_fd directly.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211011082031.4148337-2-davemarchevsky@fb.com
2021-10-20 10:35:49 -07:00
Linus Torvalds
0afe64bebb Tools:
* kvm_stat: do not show halt_wait_ns since it is not a cumulative statistic
 
 x86:
 * clean ups and fixes for bus lock vmexit and lazy allocation of rmaps
 * two fixes for SEV-ES (one more coming as soon as I get reviews)
 * fix for static_key underflow
 
 ARM:
 * Properly refcount pages used as a concatenated stage-2 PGD
 * Fix missing unlock when detecting the use of MTE+VM_SHARED
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFtuqYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNGbAf9Ha4mlieY7lDQLk96GydPwlMofi1B
 dteRaWizokT0Xk7HovPr8G1zwwE9DrqO1FuHiZrkckzf7cloaPDvncLag3D3Vakr
 dWIqa7MaavSWBKDpcEIKOEo2SfIBU38xXQSEpegz2f2fhZK0Ud2xUNtGQMNrYatX
 Lz6FXHRvHDmv4+9EjASoGBd0/C/NxMaumYa1VOxMt8JPyn+zho0z5rUDKDF4pg70
 KAgxVZuksy15XFRTgaSaU0BqVn9uCHwZVqRFKBm+ocPXIFjhdMkgrxJ7NSYB1T+N
 VFqcUBTFTjhg9e5eZnQ6GMf9FXpLzK912VhCRd0uU5PGeBwUDJTSnyu5OQ==
 =GZqR
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Tools:
   - kvm_stat: do not show halt_wait_ns since it is not a cumulative statistic

  x86:
   - clean ups and fixes for bus lock vmexit and lazy allocation of rmaps
   - two fixes for SEV-ES (one more coming as soon as I get reviews)
   - fix for static_key underflow

  ARM:
   - Properly refcount pages used as a concatenated stage-2 PGD
   - Fix missing unlock when detecting the use of MTE+VM_SHARED"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SEV-ES: reduce ghcb_sa_len to 32 bits
  KVM: VMX: Remove redundant handling of bus lock vmexit
  KVM: kvm_stat: do not show halt_wait_ns
  KVM: x86: WARN if APIC HW/SW disable static keys are non-zero on unload
  Revert "KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU RESET"
  KVM: SEV-ES: Set guest_state_protected after VMSA update
  KVM: X86: fix lazy allocation of rmaps
  KVM: SEV-ES: fix length of string I/O
  KVM: arm64: Release mmap_lock when using VM_SHARED with MTE
  KVM: arm64: Report corrupted refcount at EL2
  KVM: arm64: Fix host stage-2 PGD refcount
  KVM: s390: Function documentation fixes
2021-10-20 05:52:10 -10:00
Adrian Hunter
6175047358 perf tools: Add support for PERF_RECORD_AUX_OUTPUT_HW_ID
The PERF_RECORD_AUX_OUTPUT_HW_ID event provides a way to match AUX output
data like Intel PT PEBS-via-PT back to the event that it came from, by
providing a hardware ID that is present in the AUX output.

Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210907163903.11820-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:22:27 -03:00
Andrew Kilroy
70ae034d49 perf vendor events arm64: Categorise the Neoverse V1 counters
This is so they are categorised in the perf list output.  The pmus all
exist in the armv8-common-and-microarch.json and arm-recommended.json
files, so this commit places them into each category's own file under

  tools/perf/pmu-events/arch/arm64/arm/neoverse-v1

Also add the Neoverse V1 to the arm64 mapfile

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211006081106.8649-3-andrew.kilroy@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:22:19 -03:00
Andrew Kilroy
e166fc328b perf vendor events arm64: Add new armv8 pmu events
Add new armv8 common events for use by Arm Neoverse V1 cores in a later
commit. These are defined in the ArmV8 architecture reference manual
available from

  https://developer.arm.com/documentation/ddi0487/gb/?lang=en

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211006081106.8649-2-andrew.kilroy@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:22:11 -03:00
Andrew Kilroy
25bc4793dc perf vendor events: Syntax corrections in Neoverse N1 json
There are some syntactical mistakes in the json files for the Cortex A76
N1 (Neoverse N1).  This was obstructing parsing from an external tool.

This patch fixes the erroneous placement of commas causing the problems.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211006081106.8649-1-andrew.kilroy@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:19:50 -03:00
Ian Rogers
b85a4d61d3 perf metric: Allow modifiers on metrics
By allowing modifiers on metrics we can, for example, gather the
same metric for kernel and user mode. On a SkylakeX with
TopDownL1 this gives:

  $ perf stat -M TopDownL1:u,TopDownL1:k -a sleep 2

   Performance counter stats for 'system wide':

         849,855,577    uops_issued.any:k         #     0.06 Bad_Speculation:k
                                                  #     0.51 Backend_Bound:k          (16.71%)
       1,995,257,996    cycles:k
                                                  # 7981031984.00 SLOTS:k
                                                  #     0.35 Frontend_Bound:k
                                                  #     0.08 Retiring:k               (16.71%)
       2,791,940,753    idq_uops_not_delivered.core:k                                 (16.71%)
         641,961,928    uops_retired.retire_slots:k                                   (16.71%)
          72,239,337    int_misc.recovery_cycles:k                                    (16.71%)
       2,294,413,647    uops_issued.any:u         #     0.04 Bad_Speculation:u
                                                  #     0.39 Backend_Bound:u          (16.78%)
       1,333,248,940    cycles:u
                                                  # 5332995760.00 SLOTS:u
                                                  #     0.16 Frontend_Bound:u
                                                  #     0.40 Retiring:u               (16.78%)
         858,517,081    idq_uops_not_delivered.core:u                                 (16.78%)
       2,153,789,582    uops_retired.retire_slots:u                                   (16.78%)
          19,373,627    int_misc.recovery_cycles:u                                    (16.78%)
          31,503,661    cpu_clk_unhalted.one_thread_active:k #     0.18 CoreIPC_SMT:k (16.73%)
         315,454,104    inst_retired.any:k        # 315454104.00 Instructions:k       (16.73%)
          42,533,729    cpu_clk_unhalted.ref_xclk:k                                   (16.73%)
       2,043,119,037    cpu_clk_unhalted.thread:k                                     (16.73%)
          28,843,803    cpu_clk_unhalted.one_thread_active:u #     1.55 CoreIPC_SMT:u (16.60%)
       2,153,353,869    inst_retired.any:u        # 2153353869.00 Instructions:u      (16.60%)
          28,844,743    cpu_clk_unhalted.ref_xclk:u                                   (16.60%)
       1,387,544,378    cpu_clk_unhalted.thread:u                                     (16.60%)
         308,031,603    inst_retired.any:k        #     0.15 CoreIPC:k                (33.19%)
       2,036,774,753    cycles:k                                                      (33.19%)
       1,994,344,281    inst_retired.any:u        #     1.59 CoreIPC:u                (33.18%)
       1,251,538,227    cycles:u                                                      (33.18%)

         2.000342948 seconds time elapsed

Modifiers are naively copy and pasted on to events, this can yield errors like:

  $ perf stat -M Kernel_Utilization:k -a sleep 2
  event syntax error: '..d.thread:k/kk,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/k..'
                                    \___ Bad modifier

   Usage: perf stat [<options>] [<command>]

      -M, --metrics <metric/metric group list>
                            monitor specified metrics or metric groups (separated by ,)

When modifiers are present with constraints, from --metric-no-group or
the NMI watchdog, they are no longer placed in the same set - which may
miss deduplicating events.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-22-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:13:13 -03:00
Ian Rogers
eabd452339 perf parse-events: Identify broken modifiers
Previously the broken modifier causes a usage message to printed but
nothing else.

After:

  $ perf stat -e 'cycles:kk' -a sleep 2
  event syntax error: 'cycles:kk'
                              \___ Bad modifier
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

      -e, --event <event>   event selector. use 'perf list' to list available events

  $ perf stat -e '{instructions,cycles}:kk' -a sleep 2
  event syntax error: '..ns,cycles}:kk'
                                    \___ Bad modifier
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

      -e, --event <event>   event selector. use 'perf list' to list available events

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-21-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:12:42 -03:00
Ian Rogers
e068c25671 perf metric: Switch fprintf() to pr_err()
There's no clear reason for the inconsistency that stems from the
initial commit.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:09:37 -03:00
Ian Rogers
5ecd5a0c7d perf metrics: Modify setup and deduplication
Previously find_evsel_group was trying to share events while
mark-sweeping to eliminate unused events, this was complicated and had
issues around uncore events and grouped sharing.

This was further complicated by the event string being created while
metrics and metric groups were being added, with the string affecting
the evlist order.

This change moves deduplication before event parsing.  Ungrouped events
are placed in a single combined set. Groups are checked to see if an
earlier (larger) group can support their events.

As the deduplication and sharing detection is done on metric IDs before
parsing, wildcard expansion problems with uncore events are avoided.

Overall the code is simpler while working better.

An example of failing to deduplicate can be seen with a list of metrics
like the following, where in the after case multiplexing has been
avoided:

Before:

  $ perf stat -M Bad_Speculation,Backend_Bound,Frontend_Bound,Retiring -a sleep 2

   Performance counter stats for 'system wide':

         959,620,872      uops_issued.any           #     0.06 Bad_Speculation    (50.03%)
       2,163,072,261      cycles
                                                    #     0.09 Retiring           (50.03%)
         735,827,436      uops_retired.retire_slots                               (50.03%)
          74,676,484      int_misc.recovery_cycles                                (50.03%)
         987,062,794      uops_issued.any           #     0.50 Backend_Bound      (49.97%)
       2,203,734,187      cycles
                                                    #     0.35 Frontend_Bound     (49.97%)
       3,085,016,091      idq_uops_not_delivered.core                             (49.97%)
         758,599,232      uops_retired.retire_slots                               (49.97%)
          75,807,526      int_misc.recovery_cycles                                (49.97%)

         2.002103760 seconds time elapsed

After:

  $ sudo perf stat -M Bad_Speculation,Backend_Bound,Frontend_Bound,Retiring -a sleep 2

   Performance counter stats for 'system wide':

         769,694,676      uops_issued.any           #     0.08 Bad_Speculation
                                                    #     0.41 Backend_Bound
       1,087,548,633      cycles
                                                    #     0.38 Frontend_Bound
                                                    #     0.14 Retiring
       1,642,085,777      idq_uops_not_delivered.core
         603,112,590      uops_retired.retire_slots
          43,787,854      int_misc.recovery_cycles

         2.003844383 seconds time elapsed

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-19-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 11:00:17 -03:00
Ian Rogers
798c3f4a66 perf expr: Add subset_of_ids() utility
Add a helper that returns true if all the IDs in needles are present in
haystack.

Later this will be used in sharing events between metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:59:17 -03:00
Ian Rogers
ec5c5b3d2c perf metric: Encode and use metric-id as qualifier
For a metric like IPC a group of events like {instructions,cycles}:W
would be formed.

If the events names were changed in parsing then the metric expression
parser would fail to find them.

This change makes the event encoding be something like:

  {instructions/metric-id=instructions/, cycles/metric-id=cycles/}

and then uses the evsel's stable metric-id value to locate the events.

This fixes the case that an event is restricted to user because of the
paranoia setting:

  $ echo 2 > /proc/sys/kernel/perf_event_paranoid
  $ perf stat -M IPC /bin/true
   Performance counter stats for '/bin/true':

             150,298      inst_retired.any:u        #      0.77 IPC
             187,095      cpu_clk_unhalted.thread:u

         0.002042731 seconds time elapsed

         0.000000000 seconds user
         0.002377000 seconds sys

Adding the metric-id as a qualifier has a complication in that
qualifiers will become embedded in qualifiers.

For example, msr/tsc/ could become msr/tsc,metric-id=msr/tsc// which
will fail parse-events.

To solve this problem the metric is encoded and decoded for the
metric-id with !<num> standing in for an encoded value.

Previously ! wasn't parsed.

With this msr/tsc/ becomes msr/tsc,metric-id=msr!3tsc!3/

The metric expression parser is changed so that @ isn't changed to /,
instead this is done when the ID is encoded for parse events.

metricgroup__add_metric_non_group() and metricgroup__add_metric_weak_group()
need to inject the metric-id qualifier, so to avoid repetition they are
merged into a single metricgroup__build_event_string with error codes
more rigorously checked.

stat-shadow's prepare_metric() uses the metric-id to match the metricgroup
code.

As "metric-id=..." is added to all events, it is adding during testing
with the fake PMU.

This complicates pmu_str_check code as PE_PMU_EVENT_FAKE won't match as
part of a configuration.

The testing fake PMU case is fixed so that if a known qualifier with an
! is parsed then it isn't reported as a fake PMU.

This is sufficient to pass all testing but it and the original mechanism
are somewhat brittle.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:57:05 -03:00
Ian Rogers
fb0811535e perf parse-events: Allow config on kernel PMU events
An event like inst_retired.any on an Intel skylake is found in the
pmu-events code created from the pipeline event JSON.

The event is an alias for cpu/event=0xc0,period=2000003/ and
parse-events recognizes the event with the token PE_KERNEL_PMU_EVENT.

The parser doesn't currently allow extra configuration on such events,
except for modifiers, so:

  $ perf stat -e inst_retired.any// /bin/true
  event syntax error: 'inst_retired.any//'
                       \___ parser error
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

      -e, --event <event>   event selector. use 'perf list' to list available events

This patch adds configuration to these events which can be useful for a
number of parameters like name and call-graph:

  $ sudo perf record -e inst_retired.any/call-graph=lbr/ -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.856 MB perf.data (44 samples) ]

It is necessary for the metric code so that we may add metric-id values
to these events before they are parsed.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:55:56 -03:00
Ian Rogers
2b62b3a611 perf parse-events: Add new "metric-id" term
Add a new "metric-id" term to events so that metric parsing can set an
ID that can be reliably looked up.

Metric parsing currently will turn a metric like "instructions/cycles"
into a parse events string of "{instructions,cycles}:W".

However, parse-events may change "instructions" into "instructions:u" if
perf_event_paranoid=2.

When this happens expr__resolve_id currently fails as stat-shadow adds
the ID "instructions:u" to match with the counter value and the metric
tries to look up the ID just "instructions".

A later patch will use the new term.

An example of the current problem:

  $ echo -1 > /proc/sys/kernel/perf_event_paranoid
  $ perf stat -M IPC /bin/true
   Performance counter stats for '/bin/true':

           1,217,161      inst_retired.any          #     0.97 IPC
           1,250,389      cpu_clk_unhalted.thread

         0.002064773 seconds time elapsed

         0.002378000 seconds user
         0.000000000 seconds sys

  $ echo 2 > /proc/sys/kernel/perf_event_paranoid
  $ perf stat -M IPC /bin/true
   Performance counter stats for '/bin/true':

             150,298      inst_retired.any:u        #      nan IPC
             187,095      cpu_clk_unhalted.thread:u

         0.002042731 seconds time elapsed

         0.000000000 seconds user
         0.002377000 seconds sys

Note: nan IPC is printed as an effect of "perf metric: Use NAN for
missing event IDs." but earlier versions of perf just fail with a parse
error and display no value.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:54:44 -03:00
Ian Rogers
8e8bbfb311 perf parse-events: Add const to evsel name
The evsel name is strdup-ed before assignment and so can be const.

A later change will add another similar string.

Using const makes it clearer that these are not out arguments.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:54:13 -03:00
Ian Rogers
46bdc0bf8d perf metric: Simplify metric_refs calculation
Don't build a list and then turn to an array, just directly build the
array.

The size of the array is known due to the search for a duplicate.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:41:06 -03:00
Ian Rogers
485fcaed98 perf metric: Document the internal 'struct metric'
Add documentation as part of code tidying.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:39:55 -03:00
Ian Rogers
4d61aef93d perf metric: Comment data structures
Document the data structures maintained by metricgroup.c and used by
stat-shadow.c for metric output.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:36:18 -03:00
Ian Rogers
80be6434c3 perf metric: Modify resolution and recursion check
Modify resolution. Rather than resolving a list of metrics, resolve a
metric immediately after it is added.

This simplifies knowing the root of the metric's tree so that IDs may be
associated with it.

A bug in the current implementation is that all the IDs were placed on
the first metric in a metric group.

Rather than maintain data on IDs' parents to detect cycles, maintain
a list of visited metrics and detect cycles if the same metric is
visited twice.

Only place the root metric onto the list of metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:35:33 -03:00
Ian Rogers
a3de76903d perf metric: Only add a referenced metric once
If a metric references other metrics then the same other metrics may be
referenced more than once, but the events and metric ref are only needed
once.

An example of this is in tests/parse-metric.c where DCache_L2_Hits
references the metric DCache_L2_All_Hits twice, once directly and once
through DCache_L2_All.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:34:53 -03:00
Ian Rogers
3d81d761a5 perf metric: Add metric new() and free() methods
Metrics are complex enough that a new/free reduces the risk of memory
leaks. Move static functions used in new.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:34:22 -03:00
Ian Rogers
68074811df perf metric: Add documentation and rename a variable.
Documentation to make current functionality clearer.

Rename a variable called 'metric' to 'metric_name' as it can be
ambiguous as to whether a string is the name of a metric or the
expression.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:33:59 -03:00
Ian Rogers
fa831fbb43 perf metric: Move runtime value to the expr context
The runtime value is needed when recursively parsing metrics, currently
a value of 1 is passed which is incorrect.

Rather than add more arguments to the bison parser, add runtime to the
context.

Fix call sites not to pass a value. The runtime value is defaulted to 0,
which is arbitrary. In some places this replaces a value of 1, which was
also arbitrary.

This shouldn't affect anything other than PPC.

The use of 0 or 1 shouldn't matter as a proper runtime value would be
needed in a case that it did matter.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:33:02 -03:00
Ian Rogers
47f572aad5 perf pmu: Make pmu_event tables const.
Make lookup nature of data structures clearer through their type. Reduce
scope of architecture specific pmu_event tables by making them static.

Suggested-by: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:32:33 -03:00
Ian Rogers
857974a642 perf pmu: Make pmu_sys_event_tables const.
Make lookup nature of data structures clearer through their type.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:32:16 -03:00
Ian Rogers
0ec43c0837 perf pmu: Add const to pmu_events_map.
The pmu_events_map is generated at compile time and used for lookup. For
testing purposes we need to swap the map being used.

Having the pmu_events_map be non-const is misleading as it may be an out
argument.

Make it const and update uses so they work on const too.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:31:47 -03:00
Ian Rogers
92ec3cc94c tools lib: Adopt list_sort() from the kernel sources
Add list_sort.[ch] from the main kernel tree. The linux/bug.h #include
is removed due to conflicting definitions. Add check-headers and modify
perf build accordingly.

MANIFEST and python-ext-sources fixes suggested by Arnaldo.

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Denys Zagorui <dzagorui@cisco.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wan Jiabing <wanjiabing@vivo.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20211015172132.1162559-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20 10:30:59 -03:00
Quentin Monnet
062e1fc008 bpftool: Turn check on zlib from a phony target into a conditional error
One of bpftool's object files depends on zlib. To make sure we do not
attempt to build that object when the library is not available, commit
d66fa3c70e ("tools: bpftool: add feature check for zlib") introduced a
feature check to detect whether zlib is present.

This check comes as a rule for which the target ("zdep") is a
nonexistent file (phony target), which means that the Makefile always
attempts to rebuild it. It is mostly harmless. However, one side effect
is that, on running again once bpftool is already built, make considers
that "something" (the recipe for zdep) was executed, and does not print
the usual message "make: Nothing to be done for 'all'", which is a
user-friendly indicator that the build went fine.

Before, with some level of debugging information:

    $ make --debug=m
    [...]
    Reading makefiles...

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

    Updating makefiles....
    Updating goal targets....
     File 'all' does not exist.
           File 'zdep' does not exist.
          Must remake target 'zdep'.
     File 'all' does not exist.
    Must remake target 'all'.
    Successfully remade target file 'all'.

After the patch:

    $ make --debug=m
    [...]

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

    Updating makefiles....
    Updating goal targets....
     File 'all' does not exist.
    Must remake target 'all'.
    Successfully remade target file 'all'.
    make: Nothing to be done for 'all'.

(Note the last line, which is not part of make's debug information.)

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211009210341.6291-4-quentin@isovalent.com
2021-10-19 16:41:50 -07:00
Quentin Monnet
ced846c65e bpftool: Do not FORCE-build libbpf
In bpftool's Makefile, libbpf has a FORCE dependency, to make sure we
rebuild it in case its source files changed. Let's instead make the
rebuild depend on the source files directly, through a call to the
"$(wildcard ...)" function. This avoids descending into libbpf's
directory if there is nothing to update.

Do the same for the bootstrap libbpf version.

This results in a slightly faster operation and less verbose output when
running make a second time in bpftool's directory.

Before:

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

    make[1]: Entering directory '/root/dev/linux/tools/lib/bpf'
    make[1]: Entering directory '/root/dev/linux/tools/lib/bpf'
    make[1]: Nothing to be done for 'install_headers'.
    make[1]: Leaving directory '/root/dev/linux/tools/lib/bpf'
    make[1]: Leaving directory '/root/dev/linux/tools/lib/bpf'

After:

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

Other ways to clean up the output could be to pass the "-s" option, or
to redirect the output to >/dev/null, when calling make recursively to
descend into libbpf's directory. However, this would suppress some
useful output if something goes wrong during the build. A better
alternative would be to pass "--no-print-directory" to the recursive
make, but that would still leave us with some noise for
"install_headers". Skipping the descent into libbpf's directory if no
source file has changed works best, and seems the most logical option
overall.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211009210341.6291-3-quentin@isovalent.com
2021-10-19 16:41:37 -07:00
Quentin Monnet
34e3ab1447 bpftool: Fix install for libbpf's internal header(s)
We recently updated bpftool's Makefile to make it install the headers
from libbpf, instead of pulling them directly from libbpf's directory.
There is also an additional header, internal to libbpf, that needs be
installed. The way that bpftool's Makefile installs that particular
header is currently correct, but would break if we were to modify
$(LIBBPF_INTERNAL_HDRS) to make it point to more than one header.

Use a static pattern rule instead, so that the Makefile can withstand
the addition of other headers to install.

The objective is simply to make the Makefile more robust. It should
_not_ be read as an invitation to import more internal headers from
libbpf into bpftool.

Fixes: f012ade10b ("bpftool: Install libbpf headers instead of including the dir")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211009210341.6291-2-quentin@isovalent.com
2021-10-19 16:40:42 -07:00
Quentin Monnet
d51b6b2287 libbpf: Remove Makefile warnings on out-of-sync netlink.h/if_link.h
Although relying on some definitions from the netlink.h and if_link.h
headers copied into tools/include/uapi/linux/, libbpf does not need
those headers to stay entirely up-to-date with their original versions,
and the warnings emitted by the Makefile when it detects a difference
are usually just noise. Let's remove those warnings.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211010002528.9772-1-quentin@isovalent.com
2021-10-19 16:33:10 -07:00
Rae Moar
d65d07cb5b kunit: tool: improve compatibility of kunit_parser with KTAP specification
Update to kunit_parser to improve compatibility with KTAP
specification including arbitrarily nested tests. Patch accomplishes
three major changes:

- Use a general Test object to represent all tests rather than TestCase
and TestSuite objects. This allows for easier implementation of arbitrary
levels of nested tests and promotes the idea that both test suites and test
cases are tests.

- Print errors incrementally rather than all at once after the
parsing finishes to maximize information given to the user in the
case of the parser given invalid input and to increase the helpfulness
of the timestamps given during printing. Note that kunit.py parse does
not print incrementally yet. However, this fix brings us closer to
this feature.

- Increase compatibility for different formats of input. Arbitrary levels
of nested tests supported. Also, test cases and test suites are now
supported to be present on the same level of testing.

This patch now implements the draft KTAP specification here:
https://lore.kernel.org/linux-kselftest/CA+GJov6tdjvY9x12JsJT14qn6c7NViJxqaJk+r-K1YJzPggFDQ@mail.gmail.com/
We'll update the parser as the spec evolves.

This patch adjusts the kunit_tool_test.py file to check for
the correct outputs from the new parser and adds a new test to check
the parsing for a KTAP result log with correct format for multiple nested
subtests (test_is_test_passed-all_passed_nested.log).

This patch also alters the kunit_json.py file to allow for arbitrarily
nested tests.

Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:22:02 -06:00
Daniel Latypov
7d7c48df81 kunit: tool: yield output from run_kernel in real time
Currently, `run_kernel()` dumps all the kernel output to a file
(.kunit/test.log) and then opens the file and yields it to callers.
This made it easier to respect the requested timeout, if any.

But it means that we can't yield the results in real time, either to the
parser or to stdout (if --raw_output is set).

This change spins up a background thread to enforce the timeout, which
allows us to yield the kernel output in real time, while also copying it
to the .kunit/test.log file.
It's also careful to ensure that the .kunit/test.log file is complete,
even in the kunit_parser throws an exception/otherwise doesn't consume
every line, see the new `finally` block and unit test.

For example:

$ ./tools/testing/kunit/kunit.py run --arch=x86_64 --raw_output
<configure + build steps>
...
<can now see output from QEMU in real time>

This does not currently have a visible effect when --raw_output is not
passed, as kunit_parser.py currently only outputs everything at the end.
But that could change, and this patch is a necessary step towards
showing parsed test results in real time.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:22:02 -06:00
Daniel Latypov
ff9e09a376 kunit: tool: support running each suite/test separately
The new --run_isolated flag makes the tool boot the kernel once per
suite or test, preventing leftover state from one suite to impact the
other. This can be useful as a starting point to debugging test
hermeticity issues.

Note: it takes a lot longer, so people should not use it normally.

Consider the following very simplified example:

  bool disable_something_for_test = false;
  void function_being_tested() {
    ...
    if (disable_something_for_test) return;
    ...
  }

  static void test_before(struct kunit *test)
  {
    disable_something_for_test = true;
    function_being_tested();
    /* oops, we forgot to reset it back to false */
  }

  static void test_after(struct kunit *test)
  {
    /* oops, now "fixing" test_before can cause test_after to fail! */
    function_being_tested();
  }

Presented like this, the issues are obvious, but it gets a lot more
complicated to track down as the amount of test setup and helper
functions increases.

Another use case is memory corruption. It might not be surfaced as a
failure/crash in the test case or suite that caused it. I've noticed in
kunit's own unit tests, the 3rd suite after might be the one to finally
crash after an out-of-bounds write, for example.

Example usage:

Per suite:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit --run_isolated=suite
...
Starting KUnit Kernel (1/7)...
============================================================
======== [PASSED] kunit_executor_test ========
....
Testing complete. 5 tests run. 0 failed. 0 crashed. 0 skipped.
Starting KUnit Kernel (2/7)...
============================================================
======== [PASSED] kunit-try-catch-test ========
...

Per test:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit --run_isolated=test
Starting KUnit Kernel (1/23)...
============================================================
======== [PASSED] kunit_executor_test ========
[PASSED] parse_filter_test
============================================================
Testing complete. 1 tests run. 0 failed. 0 crashed. 0 skipped.
Starting KUnit Kernel (2/23)...
============================================================
======== [PASSED] kunit_executor_test ========
[PASSED] filter_subsuite_test
...

It works with filters as well:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit --run_isolated=suite example
...
Starting KUnit Kernel (1/1)...
============================================================
======== [PASSED] example ========
...

It also handles test filters, '*.*skip*' runs these 3 tests:
  kunit_status.kunit_status_mark_skipped_test
  example.example_skip_test
  example.example_mark_skipped_test

Fixed up merge conflict between:
  d8c23ead70 ("kunit: tool: better handling of quasi-bool args (--json, --raw_output)") and
  6710951ee039 ("kunit: tool: support running each suite/test separately")
    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Shuah Khan <skhan@linuxfoundation.org>

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:21:08 -06:00
Daniel Latypov
5f6aa6d82e kunit: tool: actually track how long it took to run tests
This is a long standing bug in kunit tool.
Since these files were added, run_kernel() has always yielded lines.

That means, the call to run_kernel() returns before the kernel finishes
executing tests, potentially before a single line of output is even
produced.

So code like this
  time_start = time.time()
  result = linux.run_kernel(...)
  time_end = time.time()

would only measure the time taken for python to give back the generator
object.

From a caller's perspective, the only way to know the kernel has exited
is for us to consume all the output from the `result` generator object.
Alternatively, we could change run_kernel() to try and do its own book
keeping and return the total time, but that doesn't seem worth it.

This change makes us record `time_end` after we're done parsing all the
output (which should mean we've consumed all of it, or errored out).
That means we're including in the parsing time as well, but that should
be quite small, and it's better than claiming it took 0s to run tests.

Let's use this as an example:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit example

Before:
Elapsed time: 7.684s total, 0.001s configuring, 4.692s building, 0.000s running

After:
Elapsed time: 6.283s total, 0.001s configuring, 3.202s building, 3.079s running

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:50 -06:00
Daniel Latypov
7ef925ea81 kunit: tool: factor exec + parse steps into a function
Currently this code is copy-pasted between the normal "run" subcommand
and the "exec" subcommand.

Given we don't have any interest in just executing the tests without
giving the user any indication what happened (i.e. parsing the output),
make a function that does both this things and can be reused.

This will be useful when we allow more complicated ways of running
tests, e.g. invoking the kernel multiple times instead of just once,
etc.

We remove input_data from the ParseRequest so the callers don't have to
pass in a dummy value for this field. Named tuples are also immutable,
so if they did pass in a dummy, exec_tests() would need to make a copy
to call parse_tests().

Removing it also makes KunitParseRequest match the other *Request types,
as they only contain user arguments/flags, not data.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:50 -06:00
Daniel Latypov
fe678fed2c kunit: tool: show list of valid --arch options when invalid
Consider this attempt to run KUnit in QEMU:
$ ./tools/testing/kunit/kunit.py run --arch=x86

Before you'd get this error message:
kunit_kernel.ConfigError: x86 is not a valid arch

After:
kunit_kernel.ConfigError: x86 is not a valid arch, options are ['alpha', 'arm', 'arm64', 'i386', 'powerpc', 'riscv', 's390', 'sparc', 'x86_64']

This should make it a bit easier for people to notice when they make
typos, etc. Currently, one would have to dive into the python code to
figure out what the valid set is.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:50 -06:00
Daniel Latypov
a54ea2e057 kunit: tool: misc fixes (unused vars, imports, leaked files)
Drop some variables in unit tests that were unused and/or add assertions
based on them.

For ExitStack, it was imported, but the `es` variable wasn't used so it
didn't do anything, and we were leaking the file objects.
Refactor it to just use nested `with` statements to properly close them.

And drop the direct use of .close() on file objects in the kunit tool
unit test, as these can be leaked if test assertions fail.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:49 -06:00
Daniel Latypov
a127b154a8 kunit: tool: allow filtering test cases via glob
Commit 1d71307a6f ("kunit: add unit test for filtering suites by
names") introduced the ability to filter which suites we run via glob.

This change extends it so we can also filter individual test cases
inside of suites as well.

This is quite useful when, e.g.
* trying to run just the tests cases you've just added or are working on
* trying to debug issues with test hermeticity

Examples:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit '*exec*.parse*'
...
============================================================
======== [PASSED] kunit_executor_test ========
[PASSED] parse_filter_test
============================================================
Testing complete. 1 tests run. 0 failed. 0 crashed.

$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit '*.no_matching_tests'
...
[ERROR] no tests run!

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:49 -06:00
Kajol Jain
cae1d75906 tools/perf: Add mem_hops field in perf_mem_data_src structure
Going forward, future generation systems can have more hierarchy
within the node/package level but currently we don't have any data source
encoding field in perf, which can be used to represent this level of data.

Add a new field called 'mem_hops' in the perf_mem_data_src structure
which can be used to represent intra-node/package or inter-node/off-package
details. This field is of size 3 bits where PERF_MEM_HOPS_{NA, 0..6} value
can be used to present different hop levels data.

Also add corresponding macros to define mem_hop field values
and shift value.

Currently we define macro for HOPS_0 which corresponds
to data coming from another core but same node.

Add functionality to represent mem_hop field data in
perf_mem__lvl_scnprintf function with the help of added string
array called mem_hops.

For ex: Encodings for mem_hops fields with L2 cache:

L2                      - local L2
L2 | REMOTE | HOPS_0    - remote core, same node L2

Since with the addition of HOPS field, now remote can be used to
denote cache access from the same node but different core, a check
is added in the c2c_decode_stats function to set mrem only when HOPS
is zero along with set remote field.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211006140654.298352-4-kjain@linux.ibm.com
2021-10-19 17:27:00 +02:00
Kajol Jain
f4c6217f7f perf: Add comment about current state of PERF_MEM_LVL_* namespace and remove an extra line
Add a comment about PERF_MEM_LVL_* namespace being depricated
to some extent in favour of added PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_}
fields.

Remove an extra line present in perf_mem__lvl_scnprintf function.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211006140654.298352-2-kjain@linux.ibm.com
2021-10-19 17:27:00 +02:00
Petr Machata
29c1eac2e6 selftests: mlxsw: Add a test for un/offloadable qdisc trees
This checks that various qdisc configurations either are or are not
offloaded.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19 12:24:52 +01:00
Greg Kroah-Hartman
2b74240be3 First set of counter subsystem new feature support for the 5.16 cycle
Most interesting element this time is the new chrdev based interface
 for the counter subsystem.  Affects all drivers. Some minor precursor
 patches.
 
 Major parts:
 * Bring all the sysfs attribute setup into the counter core rather than
   leaving it to individual drivers.  Docs updates accompany these changes.
 * Move various definitions to a uapi header as now needed from userspace.
 * Add the chardev interface + extensive documentation and example tool
 * Add new ABI needed to identify indexes needed for chrdev interface
 * Implement new interface for the 104-quad-8
 * Follow up deals with wrong path for documentation build
 * Various trivial cleanups and missing feature additions related to this
   series
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmFtvWMRHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0FohYnQ//U58yVdWHoav81bm08xl38Hv3F1O9Gk1E
 1pWVb2FNn065q9ES9n0YB1UjECRU/4zvEw+5S+vRqfXlbQiUMVvgUSDWWokzK4Zf
 rMxRFeYjzS89A8b+VoTLSxOn5wAWUc67v29HMDl1MiQHG6gP9UW7mxBbCIeIU9Et
 BhNpo32IS9Qu9LeoaPsRz8yEfAyBheLGMfrlJAYy6ENTh4EMdyhzLvVlJXiE+rm7
 8KwFU7XVTdIG8JQxJGSs9/CMd6CGhKaBr+B6LG4rnSZTy5k41a1DXpIBQU24nLns
 ofX+KllTCnuHW5g+rqi05FkNApCN06RCV7FgUKmqOCa4vbXVEF4mFXejkQo+mgIS
 zHy7lLfkiYVNprpKRKZv5lzj1go8cl4OB0F5odyaiRVh8KeCS9UbYe5fV3vLTiY9
 2/yhmW/dbaogM0bFsx8pbWdn84MiBNXT/PvyqJ+I0EfndDZcmrl0mPWQlAS31QPC
 3aS2wKyBt2pjTUQIaPq80xPeWAAVCVQ4IxrV5SLT8BUxJc6YwZUDEMAM+/3oEhFC
 f1/fxqpcctoFEauJXrggiQWpcVbYkbA9RzF+ecUT+3ngABlbf0KOI0+RZnn6GVvJ
 UVM99Dy0q5utkzFXsHZ4FjXhxiW9LcbvU27KbyrZjKJjOoQAs9Ky5uhjt5DKXh4B
 KSofs8zHEs8=
 =AaeD
 -----END PGP SIGNATURE-----

Merge tag 'counter-for-5.16a-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next

Jonathan writes:

First set of counter subsystem new feature support for the 5.16 cycle

Most interesting element this time is the new chrdev based interface
for the counter subsystem.  Affects all drivers. Some minor precursor
patches.

Major parts:
* Bring all the sysfs attribute setup into the counter core rather than
  leaving it to individual drivers.  Docs updates accompany these changes.
* Move various definitions to a uapi header as now needed from userspace.
* Add the chardev interface + extensive documentation and example tool
* Add new ABI needed to identify indexes needed for chrdev interface
* Implement new interface for the 104-quad-8
* Follow up deals with wrong path for documentation build
* Various trivial cleanups and missing feature additions related to this
  series

* tag 'counter-for-5.16a-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
  docs: counter: Include counter-chrdev kernel-doc to generic-counter.rst
  counter: fix docum. build problems after filename change
  counter: microchip-tcb-capture: Tidy up a false kernel-doc /** marking.
  counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8
  counter: 104-quad-8: Replace mutex with spinlock
  counter: Implement events_queue_size sysfs attribute
  counter: Implement *_component_id sysfs attributes
  counter: Implement signalZ_action_component_id sysfs attribute
  tools/counter: Create Counter tools
  docs: counter: Document character device interface
  counter: Add character device interface
  counter: Move counter enums to uapi header
  docs: counter: Update to reflect sysfs internalization
  counter: Update counter.h comments to reflect sysfs internalization
  counter: Internalize sysfs interface code
  counter: stm32-timer-cnt: Provide defines for slave mode selection
  counter: stm32-lptimer-cnt: Provide defines for clock polarities
2021-10-19 09:08:16 +02:00
Peter Xu
8913970c19 mm/userfaultfd: selftests: fix memory corruption with thp enabled
In RHEL's gating selftests we've encountered memory corruption in the
uffd event test even with upstream kernel:

        # ./userfaultfd anon 128 4
        nr_pages: 32768, nr_pages_per_cpu: 32768
        bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
        bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
        bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
        bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
        testing uffd-wp with pagemap (pgsize=4096): done
        testing uffd-wp with pagemap (pgsize=2097152): done
        testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
        ERROR: faulting process failed (errno=0, line=1117)

It can be easily reproduced when global thp enabled, which is the
default for RHEL.

It's also known as a side effect of commit 0db282ba2c ("selftest: use
mmap instead of posix_memalign to allocate memory", 2021-07-23), which
is imho right itself on using mmap() to make sure the addresses will be
untagged even on arm.

The problem is, for each test we allocate buffers using two
allocate_area() calls.  We assumed these two buffers won't affect each
other, however they could, because mmap() could have found that the two
buffers are near each other and having the same VMA flags, so they got
merged into one VMA.

It won't be a big problem if thp is not enabled, but when thp is
agressively enabled it means when initializing the src buffer it could
accidentally setup part of the dest buffer too when there's a shared THP
that overlaps the two regions.  Then some of the dest buffer won't be
able to be trapped by userfaultfd missing mode, then it'll cause memory
corruption as described.

To fix it, do release_pages() after initializing the src buffer.

Since the previous two release_pages() calls are after
uffd_test_ctx_clear() which will unmap all the buffers anyway (which is
stronger than release pages; as unmap() also tear town pgtables), drop
them as they shouldn't really be anything useful.

We can mark the Fixes tag upon 0db282ba2c as it's reported to only
happen there, however the real "Fixes" IMHO should be 8ba6e86408, as
before that commit we'll always do explicit release_pages() before
registration of uffd, and 8ba6e86408 changed that logic by adding
extra unmap/map and we didn't release the pages at the right place.
Meanwhile I don't have a solid glue anyway on whether posix_memalign()
could always avoid triggering this bug, hence it's safer to attach this
fix to commit 8ba6e86408.

Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
Fixes: 8ba6e86408 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Li Wang <liwan@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Yonghong Song
223f903e9c bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
Patch set [1] introduced BTF_KIND_TAG to allow tagging
declarations for struct/union, struct/union field, var, func
and func arguments and these tags will be encoded into
dwarf. They are also encoded to btf by llvm for the bpf target.

After BTF_KIND_TAG is introduced, we intended to use it
for kernel __user attributes. But kernel __user is actually
a type attribute. Upstream and internal discussion showed
it is not a good idea to mix declaration attribute and
type attribute. So we proposed to introduce btf_type_tag
as a type attribute and existing btf_tag renamed to
btf_decl_tag ([2]).

This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
other declarations with *_tag to *_decl_tag to make it clear
the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
might be introduced per [3].

 [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
 [2] https://reviews.llvm.org/D111588
 [3] https://reviews.llvm.org/D111199

Fixes: b5ea834dde ("bpf: Support for new btf kind BTF_KIND_TAG")
Fixes: 5b84bd1036 ("libbpf: Add support for BTF_KIND_TAG")
Fixes: 5c07f2fec0 ("bpftool: Add support for BTF_KIND_TAG")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com
2021-10-18 18:35:36 -07:00
Oliver Upton
3f9808cac0 selftests: KVM: Introduce system counter offset test
Introduce a KVM selftest to verify that userspace manipulation of the
TSC (via the new vCPU attribute) results in the correct behavior within
the guest.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210916181555.973085-6-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:43:46 -04:00
Oliver Upton
c895513453 selftests: KVM: Add helpers for vCPU device attributes
vCPU file descriptors are abstracted away from test code in KVM
selftests, meaning that tests cannot directly access a vCPU's device
attributes. Add helpers that tests can use to get at vCPU device
attributes.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210916181555.973085-5-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:43:46 -04:00
Oliver Upton
c1901feef5 selftests: KVM: Fix kvm device helper ioctl assertions
The KVM_CREATE_DEVICE and KVM_{GET,SET}_DEVICE_ATTR ioctls are defined
to return a value of zero on success. As such, tighten the assertions in
the helper functions to only pass if the return code is zero.

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210916181555.973085-4-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:43:46 -04:00
Oliver Upton
61fb1c5485 selftests: KVM: Add test for KVM_{GET,SET}_CLOCK
Add a selftest for the new KVM clock UAPI that was introduced. Ensure
that the KVM clock is consistent between userspace and the guest, and
that the difference in realtime will only ever cause the KVM clock to
advance forward.

Cc: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210916181555.973085-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:43:45 -04:00
Oliver Upton
5000653934 tools: arch: x86: pull in pvclock headers
Copy over approximately clean versions of the pvclock headers into
tools. Reconcile headers/symbols missing in tools that are unneeded.

Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210916181555.973085-2-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:43:45 -04:00
Christian Borntraeger
01c7d2672a KVM: kvm_stat: do not show halt_wait_ns
Similar to commit 111d0bda8e ("tools/kvm_stat: Exempt time-based
counters"), we should not show timer values in kvm_stat. Remove the new
halt_wait_ns.

Fixes: 87bcc5fa09 ("KVM: stats: Add halt_wait_ns stats for all architectures")
Cc: Jing Zhang <jingzhangos@google.com>
Cc: Stefan Raspl <raspl@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Stefan Raspl <raspl@linux.ibm.com>
Message-Id: <20211006121724.4154-1-borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:18 -04:00
Tianjia Zhang
d49fe5e815 selftests/tls: add SM4 algorithm dependency for tls selftests
Kernel TLS test has added SM4 GCM/CCM algorithm support, but SM4
algorithm is not compiled by default, this patch add SM4 config
dependency.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:52:11 +01:00
Linus Torvalds
6890acacde - Update section headers before the respective relocations to not
trigger a safety check in elftoolchain's implementation of libelf
 
 - Do not add garbage data to the .rela.orc_unwind_ip section
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmFr+qQACgkQEsHwGGHe
 VUqYCxAAqkHYOu4ok1t0ZUePJSDNFXbMNEq8gfb9gY4zQfsk80eE40EUeZhaqyvq
 a4JHKJSMus+towDB+Zpy1s3ZEfT8V2leZCjMsIS4yIy/dLHPC3TCL4kS1CVlpw0U
 funRhy4FkeiY2Un7ghvhvqSM9l9Dnw3TqxhGUZ0LcdlUaPIKIwzFhHvEVmZsXRFM
 cccz5F/r0yeg7r4TbGtTAmxpKSIsx4BaFqf/LICaLrWGo3pDVGlhn9sPflzv1vue
 rlR4Lb1NKKCaQTefFmLdEpV7FCiJ0QXeoZNGCJlk9NHZyGoNSRfkVSZRDGt/fb5A
 aKfaDvrimgALDB4naWSEplC3slVp9Z2+TpS2zHyqDchGnHIAyx9E6MSOt9F0fs//
 DSv83K2j18rRvmnOPsZoy2pySqtMULohIPECPHOtdZ1y+PwhD/FpSNtO51t/hig5
 2cPq8DnVycJiXM7VQpTcNXv+M2tZ2IZ/E3g8SHJCNSlGZ73YmZdiO2ZTmXGRPnmx
 Xq8gFl/nomMFQSdv6H+nlSy0EA5C3dWqa3pwtKBNrZ8mGAsrEZ2Ptl1TSKfjs28x
 tI7KTeZouxA2NlggBllgOFzm5HSSq+4S6l/gWjuR0yu7j2Kpha+UZUaXZHKHt+OU
 iFZ6BlITsFEE+cd2Nns2rTJjsjN08j/AfNJOGcjtiFVUjHeCc24=
 =tiWi
 -----END PGP SIGNATURE-----

Merge tag 'objtool_urgent_for_v5.15_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Borislav Petkov:

 - Update section headers before the respective relocations to not
   trigger a safety check in elftoolchain's implementation of libelf

 - Do not add garbage data to the .rela.orc_unwind_ip section

* tag 'objtool_urgent_for_v5.15_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Update section header before relocations
  objtool: Check for gelf_update_rel[a] failures
2021-10-17 17:41:39 -10:00
Marc Zyngier
551a13346e Merge branch kvm-arm64/selftest/timer into kvmarm-master/next
* kvm-arm64/selftest/timer:
  : .
  : Add a set of selftests for the KVM/arm64 timer emulation.
  : Comes with a minimal GICv3 infrastructure.
  : .
  KVM: arm64: selftests: arch_timer: Support vCPU migration
  KVM: arm64: selftests: Add arch_timer test
  KVM: arm64: selftests: Add host support for vGIC
  KVM: arm64: selftests: Add basic GICv3 support
  KVM: arm64: selftests: Add light-weight spinlock support
  KVM: arm64: selftests: Add guest support to get the vcpuid
  KVM: arm64: selftests: Maintain consistency for vcpuid type
  KVM: arm64: selftests: Add support to disable and enable local IRQs
  KVM: arm64: selftests: Add basic support to generate delays
  KVM: arm64: selftests: Add basic support for arch_timers
  KVM: arm64: selftests: Add support for cpu_relax
  KVM: arm64: selftests: Introduce ARM64_SYS_KVM_REG
  tools: arm64: Import sysreg.h
  KVM: arm64: selftests: Add MMIO readl/writel support

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-17 11:19:42 +01:00
Raghavendra Rao Ananta
61f6fadbf9 KVM: arm64: selftests: arch_timer: Support vCPU migration
Since the timer stack (hardware and KVM) is per-CPU, there
are potential chances for races to occur when the scheduler
decides to migrate a vCPU thread to a different physical CPU.
Hence, include an option to stress-test this part as well by
forcing the vCPUs to migrate across physical CPUs in the
system at a particular rate.

Originally, the bug for the fix with commit 3134cc8beb
("KVM: arm64: vgic: Resample HW pending state on deactivation")
was discovered using arch_timer test with vCPU migrations and
can be easily reproduced.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-16-rananta@google.com
2021-10-17 11:17:22 +01:00
Raghavendra Rao Ananta
4959d8650e KVM: arm64: selftests: Add arch_timer test
Add a KVM selftest to validate the arch_timer functionality.
Primarily, the test sets up periodic timer interrupts and
validates the basic architectural expectations upon its receipt.

The test provides command-line options to configure the period
of the timer, number of iterations, and number of vCPUs.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-15-rananta@google.com
2021-10-17 11:17:21 +01:00
Raghavendra Rao Ananta
250b8d6cb3 KVM: arm64: selftests: Add host support for vGIC
Implement a simple library to perform vGIC-v3 setup
from a host point of view. This includes creating a
vGIC device, setting up distributor and redistributor
attributes, and mapping the guest physical addresses.

The definition of REDIST_REGION_ATTR_ADDR is taken from
aarch64/vgic_init test. Hence, replace the definition
by including vgic.h in the test file.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-14-rananta@google.com
2021-10-17 11:17:21 +01:00
Raghavendra Rao Ananta
28281652f9 KVM: arm64: selftests: Add basic GICv3 support
Add basic support for ARM Generic Interrupt Controller v3.
The support provides guests to setup interrupts.

The work is inspired from kvm-unit-tests and the kernel's
GIC driver (drivers/irqchip/irq-gic-v3.c).

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-13-rananta@google.com
2021-10-17 11:17:21 +01:00
Raghavendra Rao Ananta
414de89df1 KVM: arm64: selftests: Add light-weight spinlock support
Add a simpler version of spinlock support for ARM64 for
the guests to use.

The implementation is loosely based on the spinlock
implementation in kvm-unit-tests.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-12-rananta@google.com
2021-10-17 11:17:21 +01:00
Raghavendra Rao Ananta
17229bdc86 KVM: arm64: selftests: Add guest support to get the vcpuid
At times, such as when in the interrupt handler, the guest wants
to get the vcpuid that it's running on to pull the per-cpu private
data. As a result, introduce guest_get_vcpuid() that returns the
vcpuid of the calling vcpu. The interface is architecture
independent, but defined only for arm64 as of now.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-11-rananta@google.com
2021-10-17 11:17:21 +01:00
Raghavendra Rao Ananta
0226cd531c KVM: arm64: selftests: Maintain consistency for vcpuid type
The prototype of aarch64_vcpu_setup() accepts vcpuid as
'int', while the rest of the aarch64 (and struct vcpu)
carries it as 'uint32_t'. Hence, change the prototype
to make it consistent throughout the board.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-10-rananta@google.com
2021-10-17 11:17:21 +01:00
Raghavendra Rao Ananta
5c636d585c KVM: arm64: selftests: Add support to disable and enable local IRQs
Add functions local_irq_enable() and local_irq_disable() to
enable and disable the IRQs from the guest, respectively.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-9-rananta@google.com
2021-10-17 11:17:20 +01:00
Raghavendra Rao Ananta
8016690465 KVM: arm64: selftests: Add basic support to generate delays
Add udelay() support to generate a delay in the guest.

The routines are derived and simplified from kernel's
arch/arm64/lib/delay.c.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-8-rananta@google.com
2021-10-17 11:17:20 +01:00
Raghavendra Rao Ananta
d977ed3994 KVM: arm64: selftests: Add basic support for arch_timers
Add a minimalistic library support to access the virtual timers,
that can be used for simple timing functionalities, such as
introducing delays in the guest.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-7-rananta@google.com
2021-10-17 11:17:20 +01:00
Raghavendra Rao Ananta
740826ec02 KVM: arm64: selftests: Add support for cpu_relax
Implement the guest helper routine, cpu_relax(), to yield
the processor to other tasks.

The function was derived from
arch/arm64/include/asm/vdso/processor.h.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-6-rananta@google.com
2021-10-17 11:17:20 +01:00
Raghavendra Rao Ananta
b3c79c6130 KVM: arm64: selftests: Introduce ARM64_SYS_KVM_REG
With the inclusion of sysreg.h, that brings in system register
encodings, it would be redundant to re-define register encodings
again in processor.h to use it with ARM64_SYS_REG for the KVM
functions such as set_reg() or get_reg(). Hence, add helper macro,
ARM64_SYS_KVM_REG, that converts SYS_* definitions in sysreg.h
into ARM64_SYS_REG definitions.

Also replace all the users of ARM64_SYS_REG, relying on
the encodings created in processor.h, with ARM64_SYS_KVM_REG and
remove the definitions.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-5-rananta@google.com
2021-10-17 11:17:20 +01:00
Raghavendra Rao Ananta
272a067df3 tools: arm64: Import sysreg.h
Bring-in the kernel's arch/arm64/include/asm/sysreg.h
into tools/ for arm64 to make use of all the standard
register definitions in consistence with the kernel.

Make use of the register read/write definitions from
sysreg.h, instead of the existing definitions. A syntax
correction is needed for the files that use write_sysreg()
to make it compliant with the new (kernel's) syntax.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
[maz: squashed two commits in order to keep the series bisectable]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-3-rananta@google.com
Link: https://lore.kernel.org/r/20211007233439.1826892-4-rananta@google.com
2021-10-17 11:15:51 +01:00
Raghavendra Rao Ananta
88ec7e258b KVM: arm64: selftests: Add MMIO readl/writel support
Define the readl() and writel() functions for the guests to
access (4-byte) the MMIO region.

The routines, and their dependents, are inspired from the kernel's
arch/arm64/include/asm/io.h and arch/arm64/include/asm/barrier.h.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211007233439.1826892-2-rananta@google.com
2021-10-17 11:15:11 +01:00
William Breathitt Gray
086099893f tools/counter: Create Counter tools
This creates an example Counter program under tools/counter/*
to exemplify the Counter character device interface.

Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Link: https://lore.kernel.org/r/7c0f975ba098952122302d258ec9ffdef04befaf.1632884256.git.vilhelm.gray@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-10-17 10:54:16 +01:00
Linus Torvalds
d999ade1cc perf tools fixes for v5.15: 4th patch
- Fix 'perf test evsel' build error on !x86 architectures.
 
 - Fix libperf's test_stat_cpu mixup of CPU numbers and CPU indexes.
 
 - Output offsets for decompressed records, not just useless zeros.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYWq4ZAAKCRCyPKLppCJ+
 JyRdAQD1iZFc0g9GLL0n50Q6HJCAFDLfufbkTpYiPPTFhemDRgD9Em5IvOAW6Zhv
 9xLhduRa0DfTe/NIxz82bjSD4BvuzQw=
 =JBso
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.15-2021-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix 'perf test evsel' build error on !x86 architectures

 - Fix libperf's test_stat_cpu mixup of CPU numbers and CPU indexes

 - Output offsets for decompressed records, not just useless zeros

* tag 'perf-tools-fixes-for-v5.15-2021-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  libperf tests: Fix test_stat_cpu
  libperf test evsel: Fix build error on !x86 architectures
  perf report: Output non-zero offset for decompressed records
2021-10-16 11:11:07 -07:00
Linus Torvalds
368a978cc5 Tracing fixes for 5.15:
- Fix defined but not use warning/error for osnoise function
 
  - Fix memory leak in event probe
 
  - Fix memblock leak in bootconfig
 
  - Fix the API of event probes to be like kprobes
 
  - Added test to check removal of event probe API
 
  - Fix recordmcount.pl for nds32 failed build
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYWpB6BQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnu/AQD1eYekS43uCDyzzpvjsz0tZ6tzVH8z
 ainpgtcAd11q4AD8CHLvhBsEyo99Yna2Mvir6nCkafm2Y2IVGvVbnDofnAA=
 =yvDo
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Tracing fixes for 5.15:

 - Fix defined but not use warning/error for osnoise function

 - Fix memory leak in event probe

 - Fix memblock leak in bootconfig

 - Fix the API of event probes to be like kprobes

 - Added test to check removal of event probe API

 - Fix recordmcount.pl for nds32 failed build

* tag 'trace-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  nds32/ftrace: Fix Error: invalid operands (*UND* and *UND* sections) for `^'
  selftests/ftrace: Update test for more eprobe removal process
  tracing: Fix event probe removal from dynamic events
  tracing: Fix missing * in comment block
  bootconfig: init: Fix memblock leak in xbc_make_cmdline()
  tracing: Fix memory leak in eprobe_register()
  tracing: Fix missing osnoise tracer on max_latency
2021-10-16 10:51:41 -07:00
Paolo Abeni
72bcbc46a5 mptcp: increase default max additional subflows to 2
The current default does not allowing additional subflows, mostly
as a safety restriction to avoid uncontrolled resource consumption
on busy servers.

Still the system admin and/or the application have to opt-in to
MPTCP explicitly. After that, they need to change (increase) the
default maximum number of additional subflows.

Let set that to reasonable default, and make end-users life easier.

Additionally we need to update some self-tests accordingly.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:46:08 +01:00
Stefano Garzarella
ba95a6225b vsock_diag_test: remove free_sock_stat() call in test_no_sockets
In `test_no_sockets` we don't expect any sockets, indeed
check_no_sockets() prints an error and exits if `sockets` list is
not empty, so free_sock_stat() call is unnecessary since it would
only be called when the `sockets` list is empty.

This was discovered by a strange warning printed by gcc v11.2.1:
  In file included from ../../include/linux/list.h:7,
                   from vsock_diag_test.c:18:
  vsock_diag_test.c: In function ‘test_no_sockets’:
  ../../include/linux/kernel.h:35:45: error: array subscript ‘struct vsock_stat[0]’ is partly outside array bound
  s of ‘struct list_head[1]’ [-Werror=array-bounds]
     35 |         const typeof(((type *)0)->member) * __mptr = (ptr);     \
        |                                             ^~~~~~
  ../../include/linux/list.h:352:9: note: in expansion of macro ‘container_of’
    352 |         container_of(ptr, type, member)
        |         ^~~~~~~~~~~~
  ../../include/linux/list.h:393:9: note: in expansion of macro ‘list_entry’
    393 |         list_entry((pos)->member.next, typeof(*(pos)), member)
        |         ^~~~~~~~~~
  ../../include/linux/list.h:522:21: note: in expansion of macro ‘list_next_entry’
    522 |                 n = list_next_entry(pos, member);                       \
        |                     ^~~~~~~~~~~~~~~
  vsock_diag_test.c:325:9: note: in expansion of macro ‘list_for_each_entry_safe’
    325 |         list_for_each_entry_safe(st, next, sockets, list) {
        |         ^~~~~~~~~~~~~~~~~~~~~~~~
  In file included from vsock_diag_test.c:18:
  vsock_diag_test.c:333:19: note: while referencing ‘sockets’
    333 |         LIST_HEAD(sockets);
        |                   ^~~~~~~
  ../../include/linux/list.h:23:26: note: in definition of macro ‘LIST_HEAD’
     23 |         struct list_head name = LIST_HEAD_INIT(name)

It seems related to some compiler optimization and assumption
about the empty `sockets` list, since this warning is printed
only with -02 or -O3. Also removing `exit(1)` from
check_no_sockets() makes the warning disappear since in that
case free_sock_stat() can be reached also when the list is
not empty.

Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211014152045.173872-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-15 17:21:34 -07:00
Stephen Suryaputra
0857d6f8c7 ipv6: When forwarding count rx stats on the orig netdev
Commit bdb7cc643f ("ipv6: Count interface receive statistics on the
ingress netdev") does not work when ip6_forward() executes on the skbs
with vrf-enslaved netdev. Use IP6CB(skb)->iif to get to the right one.

Add a selftest script to verify.

Fixes: bdb7cc643f ("ipv6: Count interface receive statistics on the ingress netdev")
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211014130845.410602-1-ssuryaextr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-15 15:32:04 -07:00
Leonard Crestez
64e4017778 selftests: net/fcnal: Test --{force,no}-bind-key-ifindex
Test that applications binding listening sockets to VRFs without
specifying TCP_MD5SIG_FLAG_IFINDEX will work as expected. This would
be broken if __tcp_md5_do_lookup always made a strict comparison on
l3index. See this email:

https://lore.kernel.org/netdev/209548b5-27d2-2059-f2e9-2148f5a0291b@gmail.com/

Applications using tcp_l3mdev_accept=1 and a single global socket (not
bound to any interface) also should have a way to specify keys that are
only for the default VRF, this is done by --force-bind-key-ifindex
without otherwise binding to a device.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:36:57 +01:00
Leonard Crestez
78a9cf6143 selftests: nettest: Add --{force,no}-bind-key-ifindex
These options allow explicit control over the TCP_MD5SIG_FLAG_IFINDEX
flag instead of always setting it based on binding to an interface.

Do this by converting to getopt_long because nettest has too many
single-character flags already and getopt_long is widely used in
selftests.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:36:57 +01:00
Jakub Kicinski
e15f5972b8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/testing/selftests/net/ioam6.sh
  7b1700e009 ("selftests: net: modify IOAM tests for undef bits")
  bf77b1400a ("selftests: net: Test for the IOAM encapsulation with IPv6")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-14 16:50:14 -07:00
Linus Torvalds
ec681c53f8 Networking fixes for 5.15-rc6.
Current release - regressions:
 
  - af_unix: rename UNIX-DGRAM to UNIX to maintain backwards compatibility
 
  - procfs: revert "add seq_puts() statement for dev_mcast",
    minor format change broke user space
 
 Current release - new code bugs:
 
  - dsa: fix bridge_num not getting cleared after ports leaving
    the bridge, resource leak
 
  - dsa: tag_dsa: send packets with TX fwd offload from VLAN-unaware
    bridges using VID 0, prevent packet drops if pvid is removed
 
  - dsa: mv88e6xxx: keep the pvid at 0 when VLAN-unaware, prevent
    HW getting confused about station to VLAN mapping
 
 Previous releases - regressions:
 
  - virtio-net: fix for skb_over_panic inside big mode
 
  - phy: do not shutdown PHYs in READY state
 
  - dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's,
    fix link LED staying lit after ifdown
 
  - mptcp: fix possible infinite wait on recvmsg(MSG_WAITALL)
 
  - mqprio: Correct stats in mqprio_dump_class_stats()
 
  - ice: fix deadlock for Tx timestamp tracking flush
 
  - stmmac: fix feature detection on old hardware
 
 Previous releases - always broken:
 
  - sctp: account stream padding length for reconf chunk
 
  - icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe()
 
  - isdn: cpai: check ctr->cnr to avoid array index out of bound
 
  - isdn: mISDN: fix sleeping function called from invalid context
 
  - nfc: nci: fix potential UAF of rf_conn_info object
 
  - dsa: microchip: prevent ksz_mib_read_work from kicking back
    in after it's canceled in .remove and crashing
 
  - dsa: mv88e6xxx: isolate the ATU databases of standalone and
    bridged ports
 
  - dsa: sja1105, ocelot: break circular dependency between switch
    and tag drivers
 
  - dsa: felix: improve timestamping in presence of packe loss
 
  - mlxsw: thermal: fix out-of-bounds memory accesses
 
 Misc:
 
  - ipv6: ioam: move the check for undefined bits to improve
    interoperability
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFoSAcACgkQMUZtbf5S
 Irupdw/7BAWMN6LZ/tmnDJMO9st3TPVKfd9hE8P0sl3YMw568kC61nNLei9k34Pl
 7GfQRjBnalnr5ueX9hZHZmJBqj0XfXP4ZLjCoTNNfwG3mgoZ34BRODxgM60hnvK/
 VFFG5z1bEwPRXDm5urgOmbtVadUXDu/6uZHC/SxnPpy4LlLkpCigUM9FMFaOOx1q
 vJu/0D0RGPv+ukBTyLwyZ9ux1erzD8UAR9uVA8HMFYpSH5MFDG+DmsWHT/IC+0Jl
 TbWmltj9ED5kKqfQxW5gW/xc30H5o33SAzAM1/l6dnHhGfjoKqr5+6MdgAYNT3Y3
 6VcNyMArqqJF/+gBFiRzBJek/K5w40bW+EXLGIaa/BdJtJg6UrMhSlcmE3My+4WU
 vFp1+kTuLhxSp7co319IcuHTaPnvw9U7NUmdoOCDMOdbTPT369VNjDs9PN3SXhO7
 6mXUNPyS9zycZfBYkCRd5uWHjWBMvImY6VdrTaPsWCBrtWjZY7+HProKcUxLnD6t
 AwhEsVlrxVJKqNPRjtB9/NzqlXxW5TEuPKHzGK90ZWRdnErj5pDWLbQiG2bcIvZ6
 JHYZeWHhKyRADj29KzvD3nFJODzK8fqkYTK0k//dTbmFsVwRnCGrKM13Dt8f5Cly
 /FZsISOxq7JIaWQVdkoOOx+9P50dxWYN2Ibzl+upFJqs9ZNvbKA=
 =/K9E
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Quite calm.

  The noisy DSA driver (embedded switches) changes, and adjustment to
  IPv6 IOAM behavior add to diffstat's bottom line but are not scary.

  Current release - regressions:

   - af_unix: rename UNIX-DGRAM to UNIX to maintain backwards
     compatibility

   - procfs: revert "add seq_puts() statement for dev_mcast", minor
     format change broke user space

  Current release - new code bugs:

   - dsa: fix bridge_num not getting cleared after ports leaving the
     bridge, resource leak

   - dsa: tag_dsa: send packets with TX fwd offload from VLAN-unaware
     bridges using VID 0, prevent packet drops if pvid is removed

   - dsa: mv88e6xxx: keep the pvid at 0 when VLAN-unaware, prevent HW
     getting confused about station to VLAN mapping

  Previous releases - regressions:

   - virtio-net: fix for skb_over_panic inside big mode

   - phy: do not shutdown PHYs in READY state

   - dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's, fix link
     LED staying lit after ifdown

   - mptcp: fix possible infinite wait on recvmsg(MSG_WAITALL)

   - mqprio: Correct stats in mqprio_dump_class_stats()

   - ice: fix deadlock for Tx timestamp tracking flush

   - stmmac: fix feature detection on old hardware

  Previous releases - always broken:

   - sctp: account stream padding length for reconf chunk

   - icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe()

   - isdn: cpai: check ctr->cnr to avoid array index out of bound

   - isdn: mISDN: fix sleeping function called from invalid context

   - nfc: nci: fix potential UAF of rf_conn_info object

   - dsa: microchip: prevent ksz_mib_read_work from kicking back in
     after it's canceled in .remove and crashing

   - dsa: mv88e6xxx: isolate the ATU databases of standalone and bridged
     ports

   - dsa: sja1105, ocelot: break circular dependency between switch and
     tag drivers

   - dsa: felix: improve timestamping in presence of packe loss

   - mlxsw: thermal: fix out-of-bounds memory accesses

  Misc:

   - ipv6: ioam: move the check for undefined bits to improve
     interoperability"

* tag 'net-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (60 commits)
  icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe
  MAINTAINERS: Update the devicetree documentation path of imx fec driver
  sctp: account stream padding length for reconf chunk
  mlxsw: thermal: Fix out-of-bounds memory accesses
  ethernet: s2io: fix setting mac address during resume
  NFC: digital: fix possible memory leak in digital_in_send_sdd_req()
  NFC: digital: fix possible memory leak in digital_tg_listen_mdaa()
  nfc: fix error handling of nfc_proto_register()
  Revert "net: procfs: add seq_puts() statement for dev_mcast"
  net: encx24j600: check error in devm_regmap_init_encx24j600
  net: korina: select CRC32
  net: arc: select CRC32
  net: dsa: felix: break at first CPU port during init and teardown
  net: dsa: tag_ocelot_8021q: fix inability to inject STP BPDUs into BLOCKING ports
  net: dsa: felix: purge skb from TX timestamping queue if it cannot be sent
  net: dsa: tag_ocelot_8021q: break circular dependency with ocelot switch lib
  net: dsa: tag_ocelot: break circular dependency with ocelot switch lib driver
  net: mscc: ocelot: cross-check the sequence id from the timestamp FIFO with the skb PTP header
  net: mscc: ocelot: deny TX timestamping of non-PTP packets
  net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown skb
  ...
2021-10-14 18:21:39 -04:00
Florian Westphal
3e6ed7703d selftests: netfilter: remove stray bash debug line
This should not be there.

Fixes: 2de03b4523 ("selftests: netfilter: add flowtable test script")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-14 23:08:35 +02:00
Shunsuke Nakamura
3ff6d64e68 libperf tests: Fix test_stat_cpu
The `cpu` argument of perf_evsel__read() must specify the cpu index.

perf_cpu_map__for_each_cpu() is for iterating the cpu number (not index)
and is thus not appropriate for use with perf_evsel__read().

So, if there is an offline CPU, the cpu number specified in the argument
may point out of range because the cpu number and the cpu index are
different.

Fix test_stat_cpu().

Testing it:

  # make tests -C tools/lib/perf/
  make: Entering directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf'
  running static:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...OK
  - running tests/test-evsel.c...OK
  running dynamic:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...OK
  - running tests/test-evsel.c...OK
  make: Leaving directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf'

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211011083704.4108720-1-nakamura.shun@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-14 15:41:35 -03:00
Shunsuke Nakamura
f304c8d949 libperf test evsel: Fix build error on !x86 architectures
In test_stat_user_read, following build error occurs except i386 and
x86_64 architectures:

tests/test-evsel.c:129:31: error: variable 'pc' set but not used [-Werror=unused-but-set-variable]
  struct perf_event_mmap_page *pc;

Fix build error.

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211006095703.477826-1-nakamura.shun@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-14 15:41:35 -03:00
Alexey Bayduraev
8e820f9623 perf report: Output non-zero offset for decompressed records
Print offset of PERF_RECORD_COMPRESSED record instead of zero for
decompressed records in raw trace dump (-D option of perf-report):

  0x17cf08 [0x28]: event: 9

instead of:

  0 [0x28]: event: 9

The fix is not critical, because currently file_pos for compressed
events is used in perf_session__process_event only to show offsets
in the raw dump.

This patch was separated from patchset:

https://lore.kernel.org/lkml/cover.1629186429.git.alexey.v.bayduraev@linux.intel.com/

and was already rewieved.

Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210929091445.18274-1-alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-14 15:41:35 -03:00
Petr Machata
bf86273294 selftests: mlxsw: RED: Test per-TC ECN counters
Add a variant of ECN test that uses qdisc marked counter (supported on
Spectrum-3 and above) instead of the aggregate ethtool ecn_marked counter.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-13 17:47:18 -07:00
Steven Rostedt (VMware)
0282b0f012 selftests/ftrace: Update test for more eprobe removal process
The removal of eprobes was broken and missed in testing. Add various ways
to remove eprobes that are considered acceptable to the testing process to
catch when/if they break again.

Link: https://lkml.kernel.org/r/20211013205533.836644549@goodmis.org

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-13 19:27:53 -04:00
Justin Iurman
7b1700e009 selftests: net: modify IOAM tests for undef bits
The output behavior for undefined bits is now directly tested inside the bash
script. Trying to set an undefined bit should be refused.

The input behavior for undefined bits has been removed due to the fact that we
would need another sender allowed to set undefined bits.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12 11:49:49 +01:00
Petr Machata
0cd6fa99a0 selftests: mlxsw: RED: Add selftests for the mark qevent
Add do_mark_test(), which is to do_ecn_test() like do_drop_test() is to
do_red_test(): meant to test that actions on the RED mark qevent block are
offloaded, and executed on ECN-marked packets.

The test splits install_qdisc() into its constituents, install_root_qdisc()
and install_qdisc_tcX(). This is in order to test that when mirroring is
enabled on one TC, the other TC does not mirror.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12 11:19:35 +01:00
Petr Machata
a703b5179b selftests: mlxsw: sch_red_core: Drop two unused variables
These variables are cut'n'pasted from other functions in the file and not
actually used.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12 11:19:35 +01:00
Linus Torvalds
fa58787605 linux-kselftest-kunit-fixes-5.15-rc6
This KUnit fixes update for Linux 5.15-rc6 consists of:
 
 - Fixes to address the structleak plugin causing the stack frame size
   to grow immensely when used with KUnit. Fixes include adding a new
   makefile to disable structleak and using it from KUnit iio, device
   property, thunderbolt, and bitfield tests to disable it.
 
 - KUnit framework reference count leak in kfree_at_end
 
 - KUnit tool fix to resolve conflict between --json and --raw_output
   and generate correct test output in either case.
 
 - kernel-doc warnings due to mismatched arg names
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFkqr8ACgkQCwJExA0N
 QxyziBAAn+02Iq8Kd1p9/+ajr78iQKmjIpSHH51R6wPwOBJg0RgusyuheJ/pccl1
 OOt7sAIdoBWlbZjkwi9SOoFabdvtAO6i7SWpw1R0AdWT5iG4ewe17ODIJiGQ6l9E
 4fNayU+XFunXSqxxEsiSJYO5ztoyDDNOqWJ1a43r+9j6ApsHPH3aKwtx4hFutaJA
 TAUvSOJOMv93SWTHdQh1ihHMVBkhvdnnpSvsRzJFzkEmI5rmL+p7HcbuM0Zk/gyD
 9obygEx8QsJ4pg1AJs8qAj1YbV599qbVhBYYr3EewWscyWME4RrIiXPGImcYKN3/
 99S5pOVTI+wp12r5c8WOdXta+Qe+1RvT9wvTPS+msGQakQhIq0eNePtO1AwIoIor
 S+36JbJdW1rOepg1WssL55F6+re//GYdluSUGwmwYCL3eUEIMpfRqeamg7BKQH3j
 gfukFJvLzzdLiyeYRg2f/G3Vxwa18shlIJgz+6ADCmemura3waeelTG0zFWiNsd5
 XYXXhrxV8eq+Cj5ee/iBddKIgn2+QjFtDmZtsLtiZts3aV7zlugYm2bAYCN1Yxsu
 ZDvymnZUozoOCRFvBe60Eo4DSTDTnmAaRRnuX5L/x1ybpAUvwZQxzyNDIuvfZKQ9
 JB8LrxAJDz7G4jj/9OY5IeacoXNKqDVZYK+Kb9L7hAS08ITk6vU=
 =08Ra
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-fixes-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kunit fixes from Shuah Khan:

 - Fixes to address the structleak plugin causing the stack frame size
   to grow immensely when used with KUnit. Fixes include adding a new
   makefile to disable structleak and using it from KUnit iio, device
   property, thunderbolt, and bitfield tests to disable it.

 - KUnit framework reference count leak in kfree_at_end

 - KUnit tool fix to resolve conflict between --json and --raw_output
   and generate correct test output in either case.

 - kernel-doc warnings due to mismatched arg names

* tag 'linux-kselftest-kunit-fixes-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: fix kernel-doc warnings due to mismatched arg names
  bitfield: build kunit tests without structleak plugin
  thunderbolt: build kunit tests without structleak plugin
  device property: build kunit tests without structleak plugin
  iio/test-format: build kunit tests without structleak plugin
  gcc-plugins/structleak: add makefile var for disabling structleak
  kunit: fix reference count leak in kfree_at_end
  kunit: tool: better handling of quasi-bool args (--json, --raw_output)
2021-10-11 17:25:08 -07:00
Florian Westphal
465f15a6d1 selftests: nft_nat: add udp hole punch test case
Add a test case that demonstrates port shadowing via UDP.

ns2 sends packet to ns1, from source port used by a udp service on the
router, ns0.  Then, ns1 sends packet to ns0:service, but that ends up getting
forwarded to ns2.

Also add three test cases that demonstrate mitigations:
1. disable use of $port as source from 'unstrusted' origin
2. make the service untracked.  This prevents masquerade entries
   from having any effects.
3. add forced PAT via 'random' mode to translate the "wrong" sport
   into an acceptable range.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-12 01:42:39 +02:00
Heiko Carstens
3990b5baf2 selftests/ftrace: add s390 support for kprobe args tests
This is the s390 variant of commit 9855c4626c ("selftests/ftrace:
Add ppc support for kprobe args tests").

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:59 +02:00
Ricardo Koller
3e197f17b2 KVM: arm64: selftests: Add init ITS device test
Add some ITS device init tests: general KVM device tests (address not
defined already, address aligned) and tests for the ITS region being
within the addressable IPA range.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-12-ricarkol@google.com
2021-10-11 09:31:43 +01:00
Ricardo Koller
1883458638 KVM: arm64: selftests: Add test for legacy GICv3 REDIST base partially above IPA range
Add a new test into vgic_init which checks that the first vcpu fails to
run if there is not sufficient REDIST space below the addressable IPA
range.  This only applies to the KVM_VGIC_V3_ADDR_TYPE_REDIST legacy API
as the required REDIST space is not know when setting the DIST region.

Note that using the REDIST_REGION API results in a different check at
first vcpu run: that the number of redist regions is enough for all
vcpus. And there is already a test for that case in, the first step of
test_v3_new_redist_regions.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-11-ricarkol@google.com
2021-10-11 09:31:43 +01:00
Ricardo Koller
2dcd9aa1c3 KVM: arm64: selftests: Add tests for GIC redist/cpuif partially above IPA range
Add tests for checking that KVM returns the right error when trying to
set GICv2 CPU interfaces or GICv3 Redistributors partially above the
addressable IPA range. Also tighten the IPA range by replacing
KVM_CAP_ARM_VM_IPA_SIZE with the IPA range currently configured for the
guest (i.e., the default).

The check for the GICv3 redistributor created using the REDIST legacy
API is not sufficient as this new test only checks the check done using
vcpus already created when setting the base. The next commit will add
the missing test which verifies that the KVM check is done at first vcpu
run.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-10-ricarkol@google.com
2021-10-11 09:31:42 +01:00
Ricardo Koller
c44df5f9ff KVM: arm64: selftests: Add some tests for GICv2 in vgic_init
Add some GICv2 tests: general KVM device tests and DIST/CPUIF overlap
tests.  Do this by making test_vcpus_then_vgic and test_vgic_then_vcpus
in vgic_init GIC version agnostic.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-9-ricarkol@google.com
2021-10-11 09:31:42 +01:00
Ricardo Koller
46fb941bc0 KVM: arm64: selftests: Make vgic_init/vm_gic_create version agnostic
Make vm_gic_create GIC version agnostic in the vgic_init test. Also
add a nr_vcpus arg into it instead of defaulting to NR_VCPUS.

No functional change.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-8-ricarkol@google.com
2021-10-11 09:31:42 +01:00
Ricardo Koller
3f4db37e20 KVM: arm64: selftests: Make vgic_init gic version agnostic
As a preparation for the next commits which will add some tests for
GICv2, make aarch64/vgic_init GIC version agnostic. Add a new generic
run_tests function(gic_dev_type) that starts all applicable tests using
GICv3 or GICv2. GICv2 tests are attempted if GICv3 is not available in
the system. There are currently no GICv2 tests, but the test passes now
in GICv2 systems.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005011921.437353-7-ricarkol@google.com
2021-10-11 09:31:42 +01:00
Masami Hiramatsu
4ee1b4cac2 bootconfig: Cleanup dummy headers in tools/bootconfig
Cleanup dummy headers in tools/bootconfig/include except
for tools/bootconfig/include/linux/bootconfig.h.
For this change, I use __KERNEL__ macro to split kernel
header #include and introduce xbc_alloc_mem() and
xbc_free_mem().

Link: https://lkml.kernel.org/r/163187299574.2366983.18371329724128746091.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 22:16:02 -04:00
Masami Hiramatsu
4f292c4886 bootconfig: Replace u16 and u32 with uint16_t and uint32_t
Replace u16 and u32 with uint16_t and uint32_t so
that the tools/bootconfig only needs <stdint.h>.

Link: https://lkml.kernel.org/r/163187298835.2366983.9838262576854319669.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 20:44:05 -04:00
Masami Hiramatsu
160321b260 tools/bootconfig: Print all error message in stderr
Print all error message in stderr. This also removes
unneeded tools/bootconfig/include/linux/printk.h.

Link: https://lkml.kernel.org/r/163187298106.2366983.15210300267326257397.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 20:44:05 -04:00
Masami Hiramatsu
115d4d08ae bootconfig: Rename xbc_destroy_all() to xbc_exit()
Avoid using this noisy name and use more calm one.
This is just a name change. No functional change.

Link: https://lkml.kernel.org/r/163187295918.2366983.5231840238429996027.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 20:44:05 -04:00
Masami Hiramatsu
f30f00cc96 tools/bootconfig: Run test script when build all
Run the bootconfig test script when build all target
so that user can notice any issue when build it.

Link: https://lkml.kernel.org/r/163187295173.2366983.18295281097397499118.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 20:44:05 -04:00
Masami Hiramatsu
e306220cb7 bootconfig: Add xbc_get_info() for the node information
Add xbc_get_info() API which allows user to get the
number of used xbc_nodes and the size of bootconfig
data. This is also useful for checking the bootconfig
is initialized or not.

Link: https://lkml.kernel.org/r/163177340877.682366.4360676589783197627.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 20:43:53 -04:00
Masami Hiramatsu
bdac5c2b24 bootconfig: Allocate xbc_data inside xbc_init()
Allocate 'xbc_data' in the xbc_init() so that it does
not need to care about the ownership of the copied
data.

Link: https://lkml.kernel.org/r/163177339986.682366.898762699429769117.stgit@devnote2

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 20:43:42 -04:00
Linus Torvalds
75cd9b0152 - Remove an extra section.len member in favour of section.sh_size
- Align .altinstructions section creation with the kernel's by creating
 them with entry size of 0
 
 - Fix objtool to convert a reloc symbol to a section offset and not to
 not warn about not knowing how
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmFirC4ACgkQEsHwGGHe
 VUqkNg/8Dc8CUTz950A0aGmqrTvy4JWlbh+vHMF3o45oEA7JI34awSAGR2I6USTf
 Z9EQCjaSf/i6IddZKJBqO8ajBXcOfZi6w6lsAtN2mKK4zgx5JLEP7PKTBjaCBWZC
 8+nTKhCd1ZBsh5ndjntO7ippMZYpiMzneMtHn84qtCarMxX6HEkm1zjB7f6y1S5v
 C3I1V0EdxNfyJQIFiOoGVpMZTmPLp9akvJAe5tUElAsMJ+fQgvlc2jgCRX3JeXUr
 buaZRgAONbIhnxC9xgCt7s7nn0ICgp5r6P5aA8QAzfDBPbeL/g2ZggozTufYECxM
 KE2FVPibwejKpUEtUyBNjA+yz46Ce89xryoTKN8uwTKcDQIheo0mE03eWtbObZW1
 vDPFyGAEl6a3g7WOotL5tDYUEHtGPKSmXrz8UL8F+E4nEKtlNySZspkrOx+aJSg7
 fHbbc4L1GllDaC8spF8hH1EDYva2QZyc1xt9rFdEAOBluwX0unVo250+Ue5D6DE8
 aOQtv/RJgkPxlk8UEyPzP1HHQ/VXQ07F8jRRgkuQbEsvY3mxAvt+BcG+ZtLF0FKk
 e7jO3I3wCHclS4EnP/4kY/Diy9A9htqr30vJayiy2+G80hbFYoE7nuiLU75rukWp
 fQcsHGv8UjkOO9qkNecDn0znc2sqhgxF+3RyaEfNaeEsVmsUASE=
 =0JCB
 -----END PGP SIGNATURE-----

Merge tag 'objtool_urgent_for_v5.15_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Borislav Petkov:

 - Remove an extra section.len member in favour of section.sh_size

 - Align .altinstructions section creation with the kernel's by creating
   them with entry size of 0

 - Fix objtool to convert a reloc symbol to a section offset and not to
   not warn about not knowing how

* tag 'objtool_urgent_for_v5.15_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Remove redundant 'len' field from struct section
  objtool: Make .altinstructions section entry size consistent
  objtool: Remove reloc symbol type checks in get_alt_entry()
2021-10-10 10:05:39 -07:00
Ilya Leoshkevich
5319255b8d selftests/bpf: Skip verifier tests that fail to load with ENOTSUPP
The verifier tests added in commit c48e51c8b0 ("bpf: selftests: Add
selftests for module kfunc support") fail on s390, since the JIT does
not support calling kernel functions. This is most likely an issue for
all the other non-Intel arches, as well as on Intel with
!CONFIG_DEBUG_INFO_BTF or !CONFIG_BPF_JIT.

Trying to check for messages from all the possible add_kfunc_call()
failure cases in test_verifier looks pointless, so do a much simpler
thing instead: just like it's already done in do_prog_test_run(), skip
the tests that fail to load with ENOTSUPP.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211007173329.381754-1-iii@linux.ibm.com
2021-10-08 20:07:05 -07:00
Tianjia Zhang
e506342a03 selftests/tls: add SM4 GCM/CCM to tls selftests
Add new cipher as a variant of standard tls selftests.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20211008091745.42917-1-tianjia.zhang@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-08 16:58:46 -07:00
Yucong Sun
d3f7b1664d selfetest/bpf: Make some tests serial
Change tests that often fails in parallel execution mode to serial.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-15-fallentree@fb.com
2021-10-08 15:17:00 -07:00
Yucong Sun
5db02dd7f0 selftests/bpf: Fix pid check in fexit_sleep test
bpf_get_current_pid_tgid() returns u64, whose upper 32 bits are the same
as userspace getpid() return value.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-13-fallentree@fb.com
2021-10-08 15:17:00 -07:00
Yucong Sun
0f4feacc91 selftests/bpf: Adding pid filtering for atomics test
This make atomics test able to run in parallel with other tests.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-11-fallentree@fb.com
2021-10-08 15:17:00 -07:00
Yucong Sun
445e72c782 selftests/bpf: Make cgroup_v1v2 use its own port
This patch change cgroup_v1v2 use a different port, avoid conflict with
other tests.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-8-fallentree@fb.com
2021-10-08 15:10:43 -07:00
Yucong Sun
d719de0d2f selftests/bpf: Fix race condition in enable_stats
In parallel execution mode, this test now need to use atomic operation
to avoid race condition.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-7-fallentree@fb.com
2021-10-08 15:10:43 -07:00
Yucong Sun
e87c3434f8 selftests/bpf: Add per worker cgroup suffix
This patch make each worker use a unique cgroup base directory, thus
allowing tests that uses cgroups to run concurrently.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-5-fallentree@fb.com
2021-10-08 14:45:18 -07:00
Yucong Sun
6587ff58ce selftests/bpf: Allow some tests to be executed in sequence
This patch allows tests to define serial_test_name() instead of
test_name(), and this will make test_progs execute those in sequence
after all other tests finished executing concurrently.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-3-fallentree@fb.com
2021-10-08 14:38:02 -07:00
Yucong Sun
91b2c0afd0 selftests/bpf: Add parallelism to test_progs
This patch adds "-j" mode to test_progs, executing tests in multiple
process.  "-j" mode is optional, and works with all existing test
selection mechanism, as well as "-v", "-l" etc.

In "-j" mode, main process use UDS/SEQPACKET to communicate to each forked
worker, commanding it to run tests and collect logs. After all tests are
finished, a summary is printed. main process use multiple competing
threads to dispatch work to worker, trying to keep them all busy.

The test status will be printed as soon as it is finished, if there are
error logs, it will be printed after the final summary line.

By specifying "--debug", additional debug information on server/worker
communication will be printed.

Example output:
  > ./test_progs -n 15-20 -j
  [   12.801730] bpf_testmod: loading out-of-tree module taints kernel.
  Launching 8 workers.
  #20 btf_split:OK
  #16 btf_endian:OK
  #18 btf_module:OK
  #17 btf_map_in_map:OK
  #19 btf_skc_cls_ingress:OK
  #15 btf_dump:OK
  Summary: 6/20 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-2-fallentree@fb.com
2021-10-08 14:37:56 -07:00
Hou Tao
fa7f17d066 bpf/selftests: Add test for writable bare tracepoint
Add a writable bare tracepoint in bpf_testmod module, and
trigger its calling when reading /sys/kernel/bpf_testmod
with a specific buffer length. The reading will return
the value in writable context if the early return flag
is enabled in writable context.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-4-hotforest@gmail.com
2021-10-08 13:22:57 -07:00
Hou Tao
ccaf12d621 libbpf: Support detecting and attaching of writable tracepoint program
Program on writable tracepoint is BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
but its attachment is the same as BPF_PROG_TYPE_RAW_TRACEPOINT.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-3-hotforest@gmail.com
2021-10-08 13:22:57 -07:00
Ian Rogers
f792cf8a09 perf kmem: Improve man page for record options
Since:

  https://lore.kernel.org/lkml/20200708183919.4141023-1-irogers@google.com/

The output option works for 'perf kmem', however, it must appear after
'record'. This is different to 'stat' where '-i' for the input must
appear before. Try to capture this complication in the man page.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210922212031.485950-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 16:10:02 -03:00
Quentin Monnet
d7db0a4e8d bpftool: Add install-bin target to install binary only
With "make install", bpftool installs its binary and its bash completion
file. Usually, this is what we want. But a few components in the kernel
repository (namely, BPF iterators and selftests) also install bpftool
locally before using it. In such a case, bash completion is not
necessary and is just a useless build artifact.

Let's add an "install-bin" target to bpftool, to offer a way to install
the binary only.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-13-quentin@isovalent.com
2021-10-08 12:02:40 -07:00
Quentin Monnet
87ee33bfdd selftests/bpf: Better clean up for runqslower in test_bpftool_build.sh
The script test_bpftool_build.sh attempts to build bpftool in the
various supported ways, to make sure nothing breaks.

One of those ways is to run "make tools/bpf" from the root of the kernel
repository. This command builds bpftool, along with the other tools
under tools/bpf, and runqslower in particular. After running the
command and upon a successful bpftool build, the script attempts to
cleanup the generated objects. However, after building with this target
and in the case of runqslower, the files are not cleaned up as expected.

This is because the "tools/bpf" target sets $(OUTPUT) to
.../tools/bpf/runqslower/ when building the tool, causing the object
files to be placed directly under the runqslower directory. But when
running "cd tools/bpf; make clean", the value for $(OUTPUT) is set to
".output" (relative to the runqslower directory) by runqslower's
Makefile, and this is where the Makefile looks for files to clean up.

We cannot easily fix in the root Makefile (where "tools/bpf" is defined)
or in tools/scripts/Makefile.include (setting $(OUTPUT)), where changing
the way the output variables are passed would likely have consequences
elsewhere. We could change runqslower's Makefile to build in the
repository instead of in a dedicated ".output/", but doing so just to
accommodate a test script doesn't sound great. Instead, let's just make
sure that we clean up runqslower properly by adding the correct command
to the script.

This will attempt to clean runqslower twice: the first try with command
"cd tools/bpf; make clean" will search for tools/bpf/runqslower/.output
and fail to clean it (but will still clean the other tools, in
particular bpftool), the second one (added in this commit) sets the
$(OUTPUT) variable like for building with the "tool/bpf" target and
should succeed.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-12-quentin@isovalent.com
2021-10-08 12:02:38 -07:00
James Clark
eda1a84cb4 perf tools: Enable strict JSON parsing
This is to ensure that the PMU event files can always be parsed by
other tools.

Testing
=======

 * There are no errors when parsing files for all architectures:
     # pmu-events/jevents nds32 pmu-events/arch/ test
     # pmu-events/jevents s390 pmu-events/arch/ test
     # pmu-events/jevents powerpc pmu-events/arch/ test
     # pmu-events/jevents arm64 pmu-events/arch/ test
     # pmu-events/jevents test pmu-events/arch/ test
     # pmu-events/jevents x86 pmu-events/arch/ test

 * Trailing and leading commas now cause a parse error

 * Double commas now cause a parse error

 * Compilation and parsing works with strict mode disabled and enabled

 * A diff of the output files shows no changes

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Kajol Jain<kjain@linux.ibm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew.Kilroy@arm.com
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick.Forrington@arm.com
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211007110543.564963-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 15:59:55 -03:00
James Clark
21813684e4 perf tools: Make the JSON parser more conformant when in strict mode
Return an error when a trailing comma is found or a new item is
encountered before a comma or an opening brace. This ensures that the
perf JSON files conform more closely to the spec at https://www.json.org

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Kajol Jain<kjain@linux.ibm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew.Kilroy@arm.com
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick.Forrington@arm.com
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211007110543.564963-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 15:59:46 -03:00
James Clark
08f3e0873a perf vendor-events: Fix all remaining invalid JSON files
Remove trailing commas. A later commit will make the parser more strict
and these will not be valid anymore.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Kajol Jain<kjain@linux.ibm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew.Kilroy@arm.com
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick.Forrington@arm.com
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211007110543.564963-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 15:59:21 -03:00
Quentin Monnet
be79505caf tools/runqslower: Install libbpf headers when building
API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that runqslower installs the
headers properly when building.

We use a libbpf_hdrs target to mark the logical dependency on libbpf's
headers export for a number of object files, even though the headers
should have been exported at this time (since bpftool needs them, and is
required to generate the skeleton or the vmlinux.h).

When descending from a parent Makefile, the specific output directories
for building the library and exporting the headers are configurable with
BPFOBJ_OUTPUT and BPF_DESTDIR, respectively. This is in addition to
OUTPUT, on top of which those variables are constructed by default.

Also adjust the Makefile for the BPF selftests. We pass a number of
variables to the "make" invocation, because we want to point runqslower
to the (target) libbpf shared with other tools, instead of building its
own version. In addition, runqslower relies on (target) bpftool, and we
also want to pass the proper variables to its Makefile so that bpftool
itself reuses the same libbpf.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-6-quentin@isovalent.com
2021-10-08 11:54:15 -07:00
Quentin Monnet
1478994aad tools/resolve_btfids: Install libbpf headers when building
API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that resolve_btfids installs the
headers properly when building.

When descending from a parent Makefile, the specific output directories
for building the library and exporting the headers are configurable with
LIBBPF_OUT and LIBBPF_DESTDIR, respectively. This is in addition to
OUTPUT, on top of which those variables are constructed by default.

Also adjust the Makefile for the BPF selftests in order to point to the
(target) libbpf shared with other tools, instead of building a version
specific to resolve_btfids. Remove libbpf's order-only dependencies on
the include directories (they are created by libbpf and don't need to
exist beforehand).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-5-quentin@isovalent.com
2021-10-08 11:54:11 -07:00
Quentin Monnet
f012ade10b bpftool: Install libbpf headers instead of including the dir
Bpftool relies on libbpf, therefore it relies on a number of headers
from the library and must be linked against the library. The Makefile
for bpftool exposes these objects by adding tools/lib as an include
directory ("-I$(srctree)/tools/lib"). This is a working solution, but
this is not the cleanest one. The risk is to involuntarily include
objects that are not intended to be exposed by the libbpf.

The headers needed to compile bpftool should in fact be "installed" from
libbpf, with its "install_headers" Makefile target. In addition, there
is one header which is internal to the library and not supposed to be
used by external applications, but that bpftool uses anyway.

Adjust the Makefile in order to install the header files properly before
compiling bpftool. Also copy the additional internal header file
(nlattr.h), but call it out explicitly. Build (and install headers) in a
subdirectory under bpftool/ instead of tools/lib/bpf/. When descending
from a parent Makefile, this is configurable by setting the OUTPUT,
LIBBPF_OUTPUT and LIBBPF_DESTDIR variables.

Also adjust the Makefile for BPF selftests, so as to reuse the (host)
libbpf compiled earlier and to avoid compiling a separate version of the
library just for bpftool.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-4-quentin@isovalent.com
2021-10-08 11:48:43 -07:00
Quentin Monnet
c66a248f19 bpftool: Remove unused includes to <bpf/bpf_gen_internal.h>
It seems that the header file was never necessary to compile bpftool,
and it is not part of the headers exported from libbpf. Let's remove the
includes from prog.c and gen.c.

Fixes: d510296d33 ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-3-quentin@isovalent.com
2021-10-08 11:47:40 -07:00
Quentin Monnet
b79c2ce3ba libbpf: Skip re-installing headers file if source is older than target
The "install_headers" target in libbpf's Makefile would unconditionally
export all API headers to the target directory. When those headers are
installed to compile another application, this means that make always
finds newer dependencies for the source files relying on those headers,
and deduces that the targets should be rebuilt.

Avoid that by making "install_headers" depend on the source header
files, and (re-)install them only when necessary.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-2-quentin@isovalent.com
2021-10-08 11:47:34 -07:00
Guo Zhengkui
c6c00900c7 perf daemon: Remove duplicate sys/file.h include
There is a "#include <sys/file.h>" in line 10, so remove a duplicate
one in line 1124.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211006062235.6364-1-guozhengkui@vivo.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 15:14:50 -03:00
Yucong Sun
7e3cbd3405 selftests/bpf: Fix btf_dump test under new clang
New clang version changed ([0]) type name in dwarf from "long int" to "long",
this is causing btf_dump tests to fail.

  [0] f6a561c4d6

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211008173139.1457407-1-fallentree@fb.com
2021-10-08 11:08:11 -07:00
Amit Cohen
7f63cdde50 selftests: mlxsw: devlink_trap_tunnel_ipip: Send a full-length key
As part of adding same test for GRE tunnel with IPv6 underlay, missing
bytes for key were found.

mausezahn does not fill zeros between two colons, so send them
explicitly. For example, use "00:00:00:E9:" instead of ":E9:"

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:59 +01:00
Amit Cohen
8bb0ebd522 selftests: mlxsw: devlink_trap_tunnel_ipip: Remove code duplication
As part of adding same test for GRE tunnel with IPv6 underlay, an
optional improvement was found - call ipip_payload_get from
ecn_payload_get, so do not duplicate the code which creates the payload.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:59 +01:00
Amit Cohen
c473f723f9 selftests: mlxsw: devlink_trap_tunnel_ipip: Align topology drawing correctly
As part of adding same test for GRE tunnel with IPv6 underlay, wrong
alignments were found, fix them.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:59 +01:00
Amit Cohen
4bb6cce00a selftests: mlxsw: devlink_trap_tunnel_ipip6: Add test case for IPv6 decap_error
IPv6 underlay support was added, add test to check that "decap_error" trap
is triggered under the right conditions and that devlink counters increase.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen
4b3d967b5c selftests: forwarding: Add IPv6 GRE hierarchical tests
Add tests that check IPv6-in-IPv6, IPv4-in-IPv6 and MTU change of GRE
tunnel. The tests use hierarchical model - the tunnel is bound to a device
in a different VRF.

These tests can be run with TC_FLAG=skip_sw, so then they will verify
that packets go through hardware as part of enacp and decap phases.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen
7df29960fa selftests: forwarding: Add IPv6 GRE flat tests
Add tests that check IPv6-in-IPv6, IPv4-in-IPv6 and MTU change of GRE
tunnel. The tests use flat model - overlay and underlay share the same VRF.

These tests can be run with TC_FLAG=skip_sw, so then they will verify
that packets go through hardware as part of enacp and decap phases.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen
c08d227290 testing: selftests: tc_common: Add tc_check_at_least_x_packets()
Add function that checks that at least X packets hit the tc rule.
There are cases that it is not possible to catch only the interesting
packets, so then, it is possible to send many packets and verify that at
least this amount of packets hit the rule.

This function will be used in the next patch for general tc rule that
can be used to test both software and hardware.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen
45d45e5323 testing: selftests: forwarding.config.sample: Add tc flag
Add TC_FLAG value to tests topology.
This flag supposed to be skip_sw/skip_hw which means do not filter by
software/hardware.

This can be useful for adding tests to forwarding directory, and be able
to verify that packets go through the hardware.

When the flag is not set or set to 'skip_hw', tests can still be executed
with veth pairs.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:57 +01:00
Riccardo Mancini
c2d4fab01f perf test evlist-open-close: Use inline func to convert timeval to usec
This patch introduces a new inline function to convert a timeval to
usec.

This function will be used also in the next patch.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/b95035ec4a125355be8ea843f7275c4580da6398.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 11:45:38 -03:00
Riccardo Mancini
6bd006c6eb perf mmap: Introduce mmap_cpu_mask__duplicate()
This patch adds a new function in util/mmap.c to duplicate a mmap_cpu_mask.

This new function will be used in patches in the workqueue patchkit.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/8943a548ef7a3dd3e015095afad7e9a8b2154c05.1629490974.git.rickyman7@gmail.com
[ bitmap_alloc() was renamed to bitmap_zalloc() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 11:42:17 -03:00
Riccardo Mancini
73e40c9bd4 libperf cpumap: Use binary search in perf_cpu_map__idx() as array are sorted
Since 7074674e73 ("perf cpumap: Maintain cpumaps ordered and
without dups") perf_cpu_map elements are sorted in ascending order.

This patch improves perf_cpu_map__idx() by using a binary search.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/f1543c15797169c21e8b205a4a6751159180580d.1629490974.git.rickyman7@gmail.com
[ Removed 'else' after if + return, declared variables where needed ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 11:29:16 -03:00
Arnaldo Carvalho de Melo
47e7dd34a2 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up the fixes in perf/urgent that were just merged into upstream.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-08 11:13:17 -03:00
Dave Marchevsky
dd65acf72d selftests/bpf: Remove SEC("version") from test progs
Since commit 6c4fc209fc ("bpf: remove useless version check for prog
load") these "version" sections, which result in bpf_attr.kern_version
being set, have been unnecessary.

Remove them so that it's obvious to folks using selftests as a guide that
"modern" BPF progs don't need this section.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007231234.2223081-1-davemarchevsky@fb.com
2021-10-07 22:01:56 -07:00
Song Liu
aa67fdb464 selftests/bpf: Skip the second half of get_branch_snapshot in vm
VMs running on upstream 5.12+ kernel support LBR. However,
bpf_get_branch_snapshot couldn't stop the LBR before too many entries
are flushed. Skip the hit/waste test for VMs before we find a proper fix
for LBR in VM.

Fixes: 025bd7c753 ("selftests/bpf: Add test for bpf_get_branch_snapshot")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007050231.728496-1-songliubraving@fb.com
2021-10-07 21:51:04 -07:00
Jakub Kicinski
9fe1155233 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-07 15:24:06 -07:00
Linus Torvalds
14df9235aa perf tools fixes for v5.15: 3rd batch
- Fix plugin static linking with libopencsd on ARM and ARM64.
 
 - Add missing -lstdc++ when linking with libopencsd.
 
 - Add missing topdown metrics events to 'perf test attr'.
 
 - Plug leak sys_event_tables list after processing JSON vendor events entries.
 
 - Sync sound/asound.h copy with the kernel sources.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYV8V0gAKCRCyPKLppCJ+
 JwEKAQCqrmoaUcX3hsrb1GqRNVgVpxztL+GmfK8dL9IEWZxSfgEA6ekSZzSx8bn/
 i7t3ccKYYuDKA0nqtB5hcaXFArqVTQA=
 =AIE3
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix plugin static linking with libopencsd on ARM and ARM64

 - Add missing -lstdc++ when linking with libopencsd

 - Add missing topdown metrics events to 'perf test attr'

 - Plug leak sys_event_tables list after processing JSON vendor events
   entries

 - Sync sound/asound.h copy with the kernel sources

* tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf tests attr: Add missing topdown metrics events
  tools include UAPI: Sync sound/asound.h copy with the kernel sources
  perf build: Fix plugin static linking with libopencsd on ARM and ARM64
  perf build: Add missing -lstdc++ when linking with libopencsd
  perf jevents: Free the sys_event_tables list after processing entries
2021-10-07 10:58:42 -07:00
Linus Torvalds
4a16df549d Networking fixes for 5.15-rc5, including fixes from xfrm, bpf,
netfilter, and wireless.
 
 Current release - regressions:
 
  - xfrm: fix XFRM_MSG_MAPPING ABI breakage caused by inserting
    a new value in the middle of an enum
 
  - unix: fix an issue in unix_shutdown causing the other end
    read/write failures
 
  - phy: mdio: fix memory leak
 
 Current release - new code bugs:
 
  - mlx5e: improve MQPRIO resiliency against bad configs
 
 Previous releases - regressions:
 
  - bpf: fix integer overflow leading to OOB access in map element
    pre-allocation
 
  - stmmac: dwmac-rk: fix ethernet on rk3399 based devices
 
  - netfilter: conntrack: fix boot failure with nf_conntrack.enable_hooks=1
 
  - brcmfmac: revert using ISO3166 country code and 0 rev as fallback
 
  - i40e: fix freeing of uninitialized misc IRQ vector
 
  - iavf: fix double unlock of crit_lock
 
 Previous releases - always broken:
 
  - bpf, arm: fix register clobbering in div/mod implementation
 
  - netfilter: nf_tables: correct issues in netlink rule change
    event notifications
 
  - dsa: tag_dsa: fix mask for trunked packets
 
  - usb: r8152: don't resubmit rx immediately to avoid soft lockup
    on device unplug
 
  - i40e: fix endless loop under rtnl if FW fails to correctly
    respond to capability query
 
  - mlx5e: fix rx checksum offload coexistence with ipsec offload
 
  - mlx5: force round second at 1PPS out start time and allow it
    only in supported clock modes
 
  - phy: pcs: xpcs: fix incorrect CL37 AN sequence, EEE disable
    sequence
 
 Misc:
 
  - xfrm: slightly rejig the new policy uAPI to make it less cryptic
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFfEpUACgkQMUZtbf5S
 Iru3vQ//fgm+pdDE860BXmLEgrbJTHU4rq/YD1vwZBcWw/i5wI1vnLr6BzZsdPNX
 DkhcKFGOZUTj+8ctuRDuqrkqoDjva6uRjwM0vcPWh5i9sGqJpKjxB3dFksyxJELR
 SnXM3Jmlk7YiGAw9Bi+66OuIwt2ouRLR/bNIwg8/qCnFI1efIF7IPeCpuvKCw/yd
 SOiSBIfuSPD1qcs5Sy4aHZqA8Xr9qbwDNwWQfFLXgNDKEiY7XOSdo3CoCddSxdR+
 2nmpOiz4w68wspapdZn3GSZHYrQh5kjz7b0Aru0Jvw86M79mKp3b9AfJ9uXTcJhp
 4cQBralLnQvLKanvKi1z5CI6NjXx+r6rXI43N6NjHOtjRUPoFMqxZEH0d7o11aT1
 sN3UDgtFtuE9Pfrhnc5ZHuHqFCCyxA6NWD6nt1dUoSEo0oWt9mOHfeoFT4+45fO0
 5no5+1q3EkYdH4jiJlavnM2DMvVzMd6FbxDzWpXJ2j4W1vM6TLkexEJIK4MLGxPV
 76lxeXzcvbM9a0vq5BabR4QbPIAv+A9qYPmXJwPVrvjo+zynwuWc3gMO5hc4EaOf
 FXF2Ka5Jn97jW8/JS7i7Gj6M8GKdyIxaFHgS4MLtJNs6pt3h7m6bSgcIOQZ5psBZ
 dKRYjM2lxeVkvDDmy5Gztkw2asbofYQP004tgagc+jwXP7DwXaY=
 =xzZg
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from xfrm, bpf, netfilter, and wireless.

  Current release - regressions:

   - xfrm: fix XFRM_MSG_MAPPING ABI breakage caused by inserting a new
     value in the middle of an enum

   - unix: fix an issue in unix_shutdown causing the other end
     read/write failures

   - phy: mdio: fix memory leak

  Current release - new code bugs:

   - mlx5e: improve MQPRIO resiliency against bad configs

  Previous releases - regressions:

   - bpf: fix integer overflow leading to OOB access in map element
     pre-allocation

   - stmmac: dwmac-rk: fix ethernet on rk3399 based devices

   - netfilter: conntrack: fix boot failure with
     nf_conntrack.enable_hooks=1

   - brcmfmac: revert using ISO3166 country code and 0 rev as fallback

   - i40e: fix freeing of uninitialized misc IRQ vector

   - iavf: fix double unlock of crit_lock

  Previous releases - always broken:

   - bpf, arm: fix register clobbering in div/mod implementation

   - netfilter: nf_tables: correct issues in netlink rule change event
     notifications

   - dsa: tag_dsa: fix mask for trunked packets

   - usb: r8152: don't resubmit rx immediately to avoid soft lockup on
     device unplug

   - i40e: fix endless loop under rtnl if FW fails to correctly respond
     to capability query

   - mlx5e: fix rx checksum offload coexistence with ipsec offload

   - mlx5: force round second at 1PPS out start time and allow it only
     in supported clock modes

   - phy: pcs: xpcs: fix incorrect CL37 AN sequence, EEE disable
     sequence

  Misc:

   - xfrm: slightly rejig the new policy uAPI to make it less cryptic"

* tag 'net-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
  net: prefer socket bound to interface when not in VRF
  iavf: fix double unlock of crit_lock
  i40e: Fix freeing of uninitialized misc IRQ vector
  i40e: fix endless loop under rtnl
  dt-bindings: net: dsa: marvell: fix compatible in example
  ionic: move filter sync_needed bit set
  gve: report 64bit tx_bytes counter from gve_handle_report_stats()
  gve: fix gve_get_stats()
  rtnetlink: fix if_nlmsg_stats_size() under estimation
  gve: Properly handle errors in gve_assign_qpl
  gve: Avoid freeing NULL pointer
  gve: Correct available tx qpl check
  unix: Fix an issue in unix_shutdown causing the other end read/write failures
  net: stmmac: trigger PCS EEE to turn off on link down
  net: pcs: xpcs: fix incorrect steps on disable EEE
  netlink: annotate data races around nlk->bound
  net: pcs: xpcs: fix incorrect CL37 AN sequence
  net: sfp: Fix typo in state machine debug string
  net/sched: sch_taprio: properly cancel timer from taprio_destroy()
  net: bridge: fix under estimation in br_get_linkxstats_size()
  ...
2021-10-07 09:50:31 -07:00
Jakub Kicinski
7671b026bb Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-10-07

We've added 7 non-merge commits during the last 8 day(s) which contain
a total of 8 files changed, 38 insertions(+), 21 deletions(-).

The main changes are:

1) Fix ARM BPF JIT to preserve caller-saved regs for DIV/MOD JIT-internal
   helper call, from Johan Almbladh.

2) Fix integer overflow in BPF stack map element size calculation when
   used with preallocation, from Tatsuhiko Yasumatsu.

3) Fix an AF_UNIX regression due to added BPF sockmap support related
   to shutdown handling, from Jiang Wang.

4) Fix a segfault in libbpf when generating light skeletons from objects
   without BTF, from Kumar Kartikeya Dwivedi.

5) Fix a libbpf memory leak in strset to free the actual struct strset
   itself, from Andrii Nakryiko.

6) Dual-license bpf_insn.h similarly as we did for libbpf and bpftool,
   with ACKs from all contributors, from Luca Boccassi.
====================

Link: https://lore.kernel.org/r/20211007135010.21143-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-07 07:11:33 -07:00
André Almeida
9d57f7c797 selftests: futex: Test sys_futex_waitv() wouldblock
Test if futex_waitv() returns -EWOULDBLOCK correctly when the expected
value is different from the actual value for a waiter.

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210923171111.300673-22-andrealmeid@collabora.com
2021-10-07 13:51:12 +02:00
André Almeida
02e56ccbae selftests: futex: Test sys_futex_waitv() timeout
Test if the futex_waitv timeout is working as expected, using the
supported clockid options.

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210923171111.300673-21-andrealmeid@collabora.com
2021-10-07 13:51:12 +02:00
André Almeida
5e59c1d1c7 selftests: futex: Add sys_futex_waitv() test
Create a new file to test the waitv mechanism. Test both private and
shared futexes. Wake the last futex in the array, and check if the
return value from futex_waitv() is the right index.

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210923171111.300673-20-andrealmeid@collabora.com
2021-10-07 13:51:12 +02:00
Mark Brown
0ba1ce1e86 selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance
Add a test that covers enabling and disabling of SVE vector length
inheritance via the ptrace interface.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20211005123537.976795-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-07 09:20:52 +01:00
Michael Forney
86e1e054e0 objtool: Update section header before relocations
The libelf implementation from elftoolchain has a safety check in
gelf_update_rel[a] to check that the data corresponds to a section
that has type SHT_REL[A] [0]. If the relocation is updated before
the section header is updated with the proper type, this check
fails.

To fix this, update the section header first, before the relocations.
Previously, the section size was calculated in elf_rebuild_reloc_section
by counting the number of entries in the reloc_list. However, we
now need the size during elf_write so instead keep a running total
and add to it for every new relocation.

[0] https://sourceforge.net/p/elftoolchain/mailman/elftoolchain-developers/thread/CAGw6cBtkZro-8wZMD2ULkwJ39J+tHtTtAWXufMjnd3cQ7XG54g@mail.gmail.com/

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-2-mforney@mforney.org
2021-10-06 20:11:57 -07:00
Michael Forney
b46179d6bb objtool: Check for gelf_update_rel[a] failures
Otherwise, if these fail we end up with garbage data in the
.rela.orc_unwind_ip section, leading to errors like

  ld: fs/squashfs/namei.o: bad reloc symbol index (0x7f16 >= 0x12) for offset 0x7f16d5c82cc8 in section `.orc_unwind_ip'

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-1-mforney@mforney.org
2021-10-06 20:11:53 -07:00
Peter Zijlstra
b08cadbd3b Merge branch 'objtool/urgent'
Fixup conflicts.

# Conflicts:
#	tools/objtool/check.c
2021-10-07 00:40:17 +02:00
Hengqi Chen
6f2b219b62 selftests/bpf: Switch to new bpf_object__next_{map,program} APIs
Replace deprecated bpf_{map,program}__next APIs with newly added
bpf_object__next_{map,program} APIs, so that no compilation warnings
emit.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-3-hengqi.chen@gmail.com
2021-10-06 12:34:02 -07:00
Hengqi Chen
2088a3a71d libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7
Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with
a new set of APIs named bpf_object__{prev,next}_{program,map} which
follow the libbpf API naming convention ([0]). No functionality changes.

  [0] Closes: https://github.com/libbpf/libbpf/issues/296

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com
2021-10-06 12:34:02 -07:00
Hengqi Chen
4a404a7e8a libbpf: Deprecate bpf_object__unload() API since v0.6
BPF objects are not reloadable after unload. Users are expected to use
bpf_object__close() to unload and free up resources in one operation.
No need to expose bpf_object__unload() as a public API, deprecate it
([0]).  Add bpf_object__unload() as an alias to internal
bpf_object_unload() and replace all bpf_object__unload() uses to avoid
compilation errors.

  [0] Closes: https://github.com/libbpf/libbpf/issues/290

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com
2021-10-06 12:34:02 -07:00
Jiri Olsa
189c83bdde selftest/bpf: Switch recursion test to use htab_map_delete_elem
Currently the recursion test is hooking __htab_map_lookup_elem
function, which is invoked both from bpf_prog and bpf syscall.

But in our kernel build, the __htab_map_lookup_elem gets inlined
within the htab_map_lookup_elem, so it's not trigered and the
test fails.

Fixing this by using htab_map_delete_elem, which is not inlined
for bpf_prog calls (like htab_map_lookup_elem is) and is used
directly as pointer for map_delete_elem, so it won't disappear
by inlining.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/YVnfFTL/3T6jOwHI@krava
2021-10-06 12:34:02 -07:00
Quentin Monnet
929bef4677 bpf: Use $(pound) instead of \# in Makefiles
Recent-ish versions of make do no longer consider number signs ("#") as
comment symbols when they are inserted inside of a macro reference or in
a function invocation. In such cases, the symbols should not be escaped.

There are a few occurrences of "\#" in libbpf's and samples' Makefiles.
In the former, the backslash is harmless, because grep associates no
particular meaning to the escaped symbol and reads it as a regular "#".
In samples' Makefile, recent versions of make will pass the backslash
down to the compiler, making the probe fail all the time and resulting
in the display of a warning about "make headers_install" being required,
even after headers have been installed.

A similar issue has been addressed at some other locations by commit
9564a8cf42 ("Kbuild: fix # escaping in .cmd files for future Make").
Let's address it for libbpf's and samples' Makefiles in the same
fashion, by using a "$(pound)" variable (pulled from
tools/scripts/Makefile.include for libbpf, or re-defined for the
samples).

Reference for the change in make:
https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Fixes: 2f38304127 ("libbpf: Make libbpf_version.h non-auto-generated")
Fixes: 07c3bbdb1a ("samples: bpf: print a warning about headers_install")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006111049.20708-1-quentin@isovalent.com
2021-10-06 12:34:02 -07:00
Andrii Nakryiko
9d05787223 selftests/bpf: Test new btf__add_btf() API
Add a test that validates that btf__add_btf() API is correctly copying
all the types from the source BTF into destination BTF object and
adjusts type IDs and string offsets properly.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211006051107.17921-4-andrii@kernel.org
2021-10-06 15:36:30 +02:00
Andrii Nakryiko
c65eb8082d selftests/bpf: Refactor btf_write selftest to reuse BTF generation logic
Next patch will need to reuse BTF generation logic, which tests every
supported BTF kind, for testing btf__add_btf() APIs. So restructure
existing selftests and make it as a single subtest that uses bulk
VALIDATE_RAW_BTF() macro for raw BTF dump checking.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211006051107.17921-3-andrii@kernel.org
2021-10-06 15:36:29 +02:00
Andrii Nakryiko
7ca6112159 libbpf: Add API that copies all BTF types from one BTF object to another
Add a bulk copying api, btf__add_btf(), that speeds up and simplifies
appending entire contents of one BTF object to another one, taking care
of copying BTF type data, adjusting resulting BTF type IDs according to
their new locations in the destination BTF object, as well as copying
and deduplicating all the referenced strings and updating all the string
offsets in new BTF types as appropriate.

This API is intended to be used from tools that are generating and
otherwise manipulating BTFs generically, such as pahole. In pahole's
case, this API is useful for speeding up parallelized BTF encoding, as
it allows pahole to offload all the intricacies of BTF type copying to
libbpf and handle the parallelization aspects of the process.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org
2021-10-06 15:35:46 +02:00
Jie Meng
57a610f1c5 bpf, x64: Save bytes for DIV by reducing reg copies
Instead of unconditionally performing push/pop on %rax/%rdx in case of
division/modulo, we can save a few bytes in case of destination register
being either BPF r0 (%rax) or r3 (%rdx) since the result is written in
there anyway.

Also, we do not need to copy the source to %r11 unless the source is either
%rax, %rdx or an immediate.

For example, before the patch:

  22:   push   %rax
  23:   push   %rdx
  24:   mov    %rsi,%r11
  27:   xor    %edx,%edx
  29:   div    %r11
  2c:   mov    %rax,%r11
  2f:   pop    %rdx
  30:   pop    %rax
  31:   mov    %r11,%rax

After:

  22:   push   %rdx
  23:   xor    %edx,%edx
  25:   div    %rsi
  28:   pop    %rdx

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211002035626.2041910-1-jmeng@fb.com
2021-10-06 15:24:36 +02:00
Borislav Petkov
f96b467583 x86/insn: Use get_unaligned() instead of memcpy()
Use get_unaligned() instead of memcpy() to access potentially unaligned
memory, which, when accessed through a pointer, leads to undefined
behavior. get_unaligned() describes much better what is happening there
anyway even if memcpy() does the job.

In addition, since perf tool builds with -Werror, it would fire with:

  util/intel-pt-decoder/../../../arch/x86/lib/insn.c: In function '__insn_get_emulate_prefix':
  tools/include/../include/asm-generic/unaligned.h:10:15: error: packed attribute is unnecessary [-Werror=packed]
     10 |  const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \

because -Werror=packed would complain if the packed attribute would have
no effect on the layout of the structure.

In this case, that is intentional so disable the warning only for that
compilation unit.

That part is Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>

No functional changes.

Fixes: 5ba1071f75 ("x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lkml.kernel.org/r/YVSsIkj9Z29TyUjE@zn.tnic
2021-10-06 11:56:37 +02:00
Kumar Kartikeya Dwivedi
c48e51c8b0 bpf: selftests: Add selftests for module kfunc support
This adds selftests that tests the success and failure path for modules
kfuncs (in presence of invalid kfunc calls) for both libbpf and
gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can
add module BTF ID set from bpf_testmod.

This also introduces  a couple of test cases to verifier selftests for
validating whether we get an error or not depending on if invalid kfunc
call remains after elimination of unreachable instructions.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi
18f4fccbf3 libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
This change updates the BPF syscall loader to relocate BTF_KIND_FUNC
relocations, with support for weak kfunc relocations. The general idea
is to move map_fds to loader map, and also use the data for storing
kfunc BTF fds. Since both reuse the fd_array parameter, they need to be
kept together.

For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc,
we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more
chances of being <= INT16_MAX than treating data map as a sparse array
and adding fd as needed.

When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse
array model, so that as long as it does remain <= INT16_MAX, we pass an
index relative to the start of fd_array.

We store all ksyms in an array where we try to avoid calling the
bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was
already stored. This also speeds up the loading process compared to
emitting calls in all cases, in later tests.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi
466b2e1397 libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0
Preserve these calls as it allows verifier to succeed in loading the
program if they are determined to be unreachable after dead code
elimination during program load. If not, the verifier will fail at
runtime. This is done for ext->is_weak symbols similar to the case for
variable ksyms.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi
9dbe601563 libbpf: Support kernel module function calls
This patch adds libbpf support for kernel module function call support.
The fd_array parameter is used during BPF program load to pass module
BTFs referenced by the program. insn->off is set to index into this
array, but starts from 1, because insn->off as 0 is reserved for
btf_vmlinux.

We try to use existing insn->off for a module, since the kernel limits
the maximum distinct module BTFs for kfuncs to 256, and also because
index must never exceed the maximum allowed value that can fit in
insn->off (INT16_MAX). In the future, if kernel interprets signed offset
as unsigned for kfunc calls, this limit can be increased to UINT16_MAX.

Also introduce a btf__find_by_name_kind_own helper to start searching
from module BTF's start id when we know that the BTF ID is not present
in vmlinux BTF (in find_ksym_btf_id).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi
f614f2c755 tools: Allow specifying base BTF file in resolve_btfids
This commit allows specifying the base BTF for resolving btf id
lists/sets during link time in the resolve_btfids tool. The base BTF is
set to NULL if no path is passed. This allows resolving BTF ids for
module kernel objects.

Also, drop the --no-fail option, as it is only used in case .BTF_ids
section is not present, instead make no-fail the default mode. The long
option name is same as that of pahole.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-5-memxor@gmail.com
2021-10-05 17:07:41 -07:00
Joe Lawrence
fe255fe6ad objtool: Remove redundant 'len' field from struct section
The section structure already contains sh_size, so just remove the extra
'len' member that requires extra mirroring and potential confusion.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-3-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:21 -07:00
Joe Lawrence
dc02368164 objtool: Make .altinstructions section entry size consistent
Commit e31694e0a7 ("objtool: Don't make .altinstructions writable")
aligned objtool-created and kernel-created .altinstructions section
flags, but there remains a minor discrepency in their use of a section
entry size: objtool sets one while the kernel build does not.

While sh_entsize of sizeof(struct alt_instr) seems intuitive, this small
deviation can cause failures with external tooling (kpatch-build).

Fix this by creating new .altinstructions sections with sh_entsize of 0
and then later updating sec->sh_size as alternatives are added to the
section.  An added benefit is avoiding the data descriptor and buffer
created by elf_create_section(), but previously unused by
elf_add_alternative().

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-2-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:20 -07:00
Josh Poimboeuf
4d8b35968b objtool: Remove reloc symbol type checks in get_alt_entry()
Converting a special section's relocation reference to a symbol is
straightforward.  No need for objtool to complain that it doesn't know
how to handle it.  Just handle it.

This fixes the following warning:

  arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception

Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/feadbc3dfb3440d973580fad8d3db873cbfe1694.1633367242.git.jpoimboe@redhat.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:20 -07:00
Kan Liang
0b6c5371c0 perf tests attr: Add missing topdown metrics events
The Topdown metrics events were added as 'perf stat' default events
since commit 42641d6f4d ("perf stat: Add Topdown metrics events as
default events").

However, the perf attr tests were not updated
accordingly.

The perf attr test fails on the platform which supports Topdown metrics.

  # perf test 17

    17: Setup struct perf_event_attr        :FAILED!

Add Topdown metrics events into perf attr test cases. Make them optional
since they are only available on newer platforms.

Fixes: 42641d6f4d ("perf stat: Add Topdown metrics events as default events")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1633031566-176517-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:55:38 -03:00
Arnaldo Carvalho de Melo
9fce636e5c tools include UAPI: Sync sound/asound.h copy with the kernel sources
Picking the changes from:

  09d2317440 ("ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSION")

Which entails no changes in the tooling side as it doesn't introduce new
SNDRV_PCM_IOCTL_ ioctls.

To silence this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:51:33 -03:00
Branislav Rankov
35c46bf545 perf build: Fix plugin static linking with libopencsd on ARM and ARM64
Filter out -static flag when building plugins as they are always built
as dynamic libraries and -static and -dynamic don't work well together
on arm and arm64.

Signed-off-by: Branislav Rankov <branislav.rankov@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Brown <Mark.Brown@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: nd@arm.com
Link: https://lore.kernel.org/r/e88952b3-2470-da96-dee9-e247a1759cd0@arm.com
Signed-off-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:48:10 -03:00
Branislav Rankov
573cf5c9a1 perf build: Add missing -lstdc++ when linking with libopencsd
Add -lstdc++ to perf when linking libopencsd as it is a dependency. It
does not hurt to add it when dynamic linking.

Signed-off-by: Branislav Rankov <branislav.rankov@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Brown <Mark.Brown@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: nd@arm.com
Link: https://lore.kernel.org/r/e88952b3-2470-da96-dee9-e247a1759cd0@arm.com
Signed-off-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:48:10 -03:00
Like Xu
b94729919d perf jevents: Free the sys_event_tables list after processing entries
The compiler reports that free_sys_event_tables() is dead code.

But according to the semantics, the "LIST_HEAD(sys_event_tables)" should
also be released, just like we do with 'arch_std_events' in main().

Fixes: e9d32c1bf0 ("perf vendor events: Add support for arch standard events")
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210928102938.69681-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:48:10 -03:00
Li Zhijian
1c36432b27 kselftests/sched: cleanup the child processes
Previously, 'make -C sched run_tests' will block forever when it occurs
something wrong where the *selftests framework* is waiting for its child
processes to exit.

[root@iaas-rpma sched]# ./cs_prctl_test

 ## Create a thread/process/process group hiearchy
Not a core sched system
tid=74985, / tgid=74985 / pgid=74985: ffffffffffffffff
Not a core sched system
    tid=74986, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74988, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74989, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74990, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
    tid=74987, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74991, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74992, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74993, / tgid=74987 / pgid=74985: ffffffffffffffff

Not a core sched system
(268) FAILED: get_cs_cookie(0) == 0

 ## Set a cookie on entire process group
-1 = prctl(62, 1, 0, 2, 0)
core_sched create failed -- PGID: Invalid argument
(cs_prctl_test.c:272) -
[root@iaas-rpma sched]# ps
    PID TTY          TIME CMD
   4605 pts/2    00:00:00 bash
  74986 pts/2    00:00:00 cs_prctl_test
  74987 pts/2    00:00:00 cs_prctl_test
  74999 pts/2    00:00:00 ps

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Link: https://lore.kernel.org/r/20210902024333.75983-1-lizhijian@cn.fujitsu.com
2021-10-05 15:51:43 +02:00
Linus Torvalds
f6274b06e3 linux-kselftest-fixes-5.15-rc5
This Kselftest fixes update for Linux 5.15-rc5 consists of a fix
 to implicit declaration warns in drivers/dma-buf test.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFbaykACgkQCwJExA0N
 QxwXRhAAwGntF/dFNcKspMKTepdmKLFHr5Mopd1+OMbVLgjQAtXZs+wCoR0tm6RK
 EZf5D7y8IpGedFnbjWJ1t18XCRJtBW7HladC6otTmR0+1SEFdHR1n4H/pqrmSagV
 5rYqwh1BgTb5Eb/9GPWKlZ0zagXGujR5UVdFPqCEMWpF6be9J989NdKGzpD4Qbkq
 eux3x5iaz5dsVf7iTGhzy+iSxKqIqSw5LCtSrE3PoTRbM5PS+K4I7ImafysnsYK9
 kZ8zkmEzixziwoWUEJLRqFveS7cY/5l7Nd2kD9YH1WE2xBYxYQZFdj4HVWdlQePd
 99UthTSfcyO6C2fplorlKkoNKuX1tlGCJyZbNjgQSHPYuVrLoau136yoSk5F+ZPg
 OUFsIs52d09e7vN70nh7UD9rRqihFRWwgI4EQt9nUP1mAmYm8y7bXKSFDF5hp3ay
 4VbvYl50QEkllqU0sDcEuFAodGLz72T2PZNWP5DQxg5q70Hni/CWqFm+EizztRCi
 mv749twroD61b8Jziu8Mqhp5BqlO1tY8uu9z4GuRC4UIMhdARY2J4mD+8wm03JRO
 4S28o+mH4StrrhyjngrCM9k8YYwa/n+XDrswiNJIpd/PmxVXD1KUP6z80sQE6Tcb
 o34yFVeABzdovryC/8VwDC9n3RDYIN6SM5XqOxvX5jckdtfwCbI=
 =t85S
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
 "A fix to implicit declaration warns in drivers/dma-buf test"

* tag 'linux-kselftest-fixes-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: drivers/dma-buf: Fix implicit declaration warns
2021-10-04 14:33:30 -07:00
Tony Garnock-Jones
be8ecc57f1 perf srcline: Use long-running addr2line per DSO
Invoking addr2line in a separate subprocess, one for each required
lookup, takes a terribly long time.

This patch introduces a long-running addr2line process for each DSO,
*DRAMATICALLY* speeding up runs of perf.

What used to take tens of minutes now takes tens of seconds.

Debian bug report about this issue:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911815

Signed-off-by: Tony Garnock-Jones <tonyg@leastfixedpoint.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210916120939.453536-1-tonyg@leastfixedpoint.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-04 09:29:07 -03:00
Justin Iurman
bf77b1400a selftests: net: Test for the IOAM encapsulation with IPv6
This patch adds support for testing the encap (ip6ip6) mode of IOAM.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04 12:53:36 +01:00
Linus Torvalds
7fab1c12bd objtool: print out the symbol type when complaining about it
The objtool warning that the kvm instruction emulation code triggered
wasn't very useful:

    arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception

in that it helpfully tells you which symbol name it had trouble figuring
out the relocation for, but it doesn't actually say what the unknown
symbol type was that triggered it all.

In this case it was because of missing type information (type 0, aka
STT_NOTYPE), but on the whole it really should just have printed that
out as part of the message.

Because if this warning triggers, that's very much the first thing you
want to know - why did reloc2sec_off() return failure for that symbol?

So rather than just saying you can't handle some type of symbol without
saying what the type _was_, just print out the type number too.

Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-03 13:45:48 -07:00
Linus Torvalds
52c3c17062 - Handle symbol relocations properly due to changes in the toolchains
which remove section symbols now
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmFZesIACgkQEsHwGGHe
 VUp3PhAAg6pe71fCCoO3Z30wwTpqEI9n4CV93Nj+L6xdNw2LhvKWuPVYee0eAYnW
 oMyMv82kWngAMDa55U4o4isvB3CK4yOcCLOk+Odtok/2xmjfL1NB4wGMsk0Zz7Pr
 Rf3gNcyc4/5d8njxPvxEx4jj4thXm//vIKYpoeMXYdMoSGOT1p3w/DVBB2LhneZf
 7ygEH0zTmX75Nxfc9YFZSwXweg1sB6vz0sK0fXPb/BF547y0qyplPRKhK8BOLLCe
 LAN15cxdV0/Su3Wcu59aguzuwDGa3N7GGIa1Lbsg6rxXgMwOzVY6fzk1edCvFnOo
 ro+19jm+riDJGBJUCfi0O8JzTXY2gEBxDACvMDkUtKkONxpeet+mWS3CmFOMqQM4
 0z6C052//vJd6grqDzzy1nb151bW4lU/RrDNrcihA99rVtmfe249xVzniIgR5zAy
 Tzezu+DgCZnN5Os/dLFoMOKTrRqGfqJ/yGNjs9KSgN2Lwe6SmKkVmrNL/i0KIwgd
 4sIh+nXhMerZaR/GAPcazEBR3yUvCtKDMmtNYpzxfVo+JRtNUolvlglPpzxE3e0t
 gOjnqrf1/wutYzJ65xpa4h7JjTtXCwey/83JdY2f463kid2+t94mYpp9wirooY23
 yfv+wTyECcMrlP+p0tTdrMD8mD9m0Cxo96vMVqfT5ER8g4CwiTo=
 =Buve
 -----END PGP SIGNATURE-----

Merge tag 'objtool_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Borislav Petkov:

 - Handle symbol relocations properly due to changes in the toolchains
   which remove section symbols now

* tag 'objtool_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Teach get_alt_entry() about more relocation types
2021-10-03 10:23:54 -07:00
Vladimir Oltean
434ef35095 selftests: net: mscc: ocelot: add a test for egress VLAN modification
For this test we are exercising the VCAP ES0 block's ability to match on
a packet with a given VLAN ID, and push an ES0 TAG A with a VID derived
from VID_A_VAL plus the classified VLAN.

$eth3.200 is the generator port
$eth0 is the bridged DUT port that receives
$eth1 is the bridged DUT port that forwards and rewrites VID 200 to 300
      on egress via VCAP ES0
$eth2 is the port that receives from the DUT port $eth1

Since the egress rewriting happens outside the bridging service, VID 300
does not need to be in the bridge VLAN table of $eth1.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:15:57 +01:00
Vladimir Oltean
4a907f6594 selftests: net: mscc: ocelot: rename the VLAN modification test to ingress
There will be one more VLAN modification selftest added, this time for
egress. Rename the one that exists right now to be more specific.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:15:57 +01:00
Vladimir Oltean
239f163cea selftests: net: mscc: ocelot: bring up the ports automatically
Looks like when I wrote the selftests I was using a network manager that
brought up the ports automatically. In order to not rely on that, let
the script open them up.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:15:57 +01:00
Jakub Kicinski
6b7b0c3091 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
bpf-next 2021-10-02

We've added 85 non-merge commits during the last 15 day(s) which contain
a total of 132 files changed, 13779 insertions(+), 6724 deletions(-).

The main changes are:

1) Massive update on test_bpf.ko coverage for JITs as preparatory work for
   an upcoming MIPS eBPF JIT, from Johan Almbladh.

2) Add a batched interface for RX buffer allocation in AF_XDP buffer pool,
   with driver support for i40e and ice from Magnus Karlsson.

3) Add legacy uprobe support to libbpf to complement recently merged legacy
   kprobe support, from Andrii Nakryiko.

4) Add bpf_trace_vprintk() as variadic printk helper, from Dave Marchevsky.

5) Support saving the register state in verifier when spilling <8byte bounded
   scalar to the stack, from Martin Lau.

6) Add libbpf opt-in for stricter BPF program section name handling as part
   of libbpf 1.0 effort, from Andrii Nakryiko.

7) Add a document to help clarifying BPF licensing, from Alexei Starovoitov.

8) Fix skel_internal.h to propagate errno if the loader indicates an internal
   error, from Kumar Kartikeya Dwivedi.

9) Fix build warnings with -Wcast-function-type so that the option can later
   be enabled by default for the kernel, from Kees Cook.

10) Fix libbpf to ignore STT_SECTION symbols in legacy map definitions as it
    otherwise errors out when encountering them, from Toke Høiland-Jørgensen.

11) Teach libbpf to recognize specialized maps (such as for perf RB) and
    internally remove BTF type IDs when creating them, from Hengqi Chen.

12) Various fixes and improvements to BPF selftests.
====================

Link: https://lore.kernel.org/r/20211002001327.15169-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-01 19:58:02 -07:00
Hengqi Chen
bd368cb554 selftests/bpf: Use BTF-defined key/value for map definitions
Change map definitions in BPF selftests to use BTF-defined
key/value types. This unifies the map definitions and ensures
libbpf won't emit warning about retrying map creation.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210930161456.3444544-3-hengqi.chen@gmail.com
2021-10-01 15:31:51 -07:00
Hengqi Chen
f731052325 libbpf: Support uniform BTF-defined key/value specification across all BPF maps
A bunch of BPF maps do not support specifying BTF types for key and value.
This is non-uniform and inconvenient[0]. Currently, libbpf uses a retry
logic which removes BTF type IDs when BPF map creation failed. Instead
of retrying, this commit recognizes those specialized maps and removes
BTF type IDs when creating BPF map.

  [0] Closes: https://github.com/libbpf/libbpf/issues/355

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210930161456.3444544-2-hengqi.chen@gmail.com
2021-10-01 15:31:50 -07:00
Andrii Nakryiko
b0e875bac0 libbpf: Fix memory leak in strset
Free struct strset itself, not just its internal parts.

Fixes: 90d76d3ece ("libbpf: Extract internal set-of-strings datastructure APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211001185910.86492-1-andrii@kernel.org
2021-10-01 22:54:38 +02:00
Daniel Latypov
d8c23ead70 kunit: tool: better handling of quasi-bool args (--json, --raw_output)
Problem:

What does this do?
$ kunit.py run --json
Well, it runs all the tests and prints test results out as JSON.

And next is
$ kunit.py run my-test-suite --json
This runs just `my-test-suite` and prints results out as JSON.

But what about?
$ kunit.py run --json my-test-suite
This runs all the tests and stores the json results in a "my-test-suite"
file.

Why:
--json, and now --raw_output are actually string flags. They just have a
default value. --json in particular takes the name of an output file.

It was intended that you'd do
$ kunit.py run --json=my_output_file my-test-suite
if you ever wanted to specify the value.

Workaround:
It doesn't seem like there's a way to make
https://docs.python.org/3/library/argparse.html only accept arg values
after a '='.

I believe that `--json` should "just work" regardless of where it is.
So this patch automatically rewrites a bare `--json` to `--json=stdout`.

That makes the examples above work the same way.
Add a regression test that can catch this for --raw_output.

Fixes: 6a499c9c42 ("kunit: tool: make --raw_output support only showing kunit output")
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Tested-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-01 13:45:25 -06:00
Linus Torvalds
b2626f1e32 Small x86 fixes.
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFXQUoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMglgf/egh3zb9/+BUQWe0xWfhcINNzpsVk
 PJtiBmJc3nQLbZbTSLp63rouy1lNgR0s2DiMwP7G1u39OwW8W3LHMrBUSqF1F01+
 gntb4GGiRTiTPJI64K4z6ytORd3tuRarHq8TUIa2zvki9ZW5Obgkm1i1RsNMOo+s
 AOA7whhpS8e/a5fBbtbS9bTZb30PKTZmbW4oMjvO9Sw4Eb76IauqPSEtRPSuCAc7
 r7z62RTlm10Qk0JR3tW1iXMxTJHZk+tYPJ8pclUAWVX5bZqWa/9k8R0Z5i/miFiZ
 glW/y3R4+aUwIQV2v7V3Jx9MOKDhZxniMtnqZG/Hp9NVDtWIz37V/U37vw==
 =zQQ1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm fixes from Paolo Bonzini:
 "Small x86 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Ensure all migrations are performed when test is affined
  KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks
  ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm
  x86/kvmclock: Move this_cpu_pvti into kvmclock.h
  selftests: KVM: Don't clobber XMM register when read
  KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
2021-10-01 11:08:07 -07:00
Peter Zijlstra
24ff652573 objtool: Teach get_alt_entry() about more relocation types
Occasionally objtool encounters symbol (as opposed to section)
relocations in .altinstructions. Typically they are the alternatives
written by elf_add_alternative() as encountered on a noinstr
validation run on vmlinux after having already ran objtool on the
individual .o files.

Basically this is the counterpart of commit 44f6a7c075 ("objtool:
Fix seg fault with Clang non-section symbols"), because when these new
assemblers (binutils now also does this) strip the section symbols,
elf_add_reloc_to_insn() is forced to emit symbol based relocations.

As such, teach get_alt_entry() about different relocation types.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/YVWUvknIEVNkPvnP@hirez.programming.kicks-ass.net
2021-10-01 13:57:47 +02:00
Josh Poimboeuf
5b284b1933 objtool: Ignore unwind hints for ignored functions
If a function is ignored, also ignore its hints.  This is useful for the
case where the function ignore is conditional on frame pointers, e.g.
STACK_FRAME_NON_STANDARD_FP().

Link: https://lkml.kernel.org/r/163163048317.489837.10988954983369863209.stgit@devnote2

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:07 -04:00
Josh Poimboeuf
e028c4f7ac objtool: Add frame-pointer-specific function ignore
Add a CONFIG_FRAME_POINTER-specific version of
STACK_FRAME_NON_STANDARD() for the case where a function is
intentionally missing frame pointer setup, but otherwise needs
objtool/ORC coverage when frame pointers are disabled.

Link: https://lkml.kernel.org/r/163163047364.489837.17377799909553689661.stgit@devnote2

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:07 -04:00
Jakub Kicinski
dd9a887b35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/phy/bcm7xxx.c
  d88fd1b546 ("net: phy: bcm7xxx: Fixed indirect MMD operations")
  f68d08c437 ("net: phy: bcm7xxx: Add EPHY entry for 72165")

net/sched/sch_api.c
  b193e15ac6 ("net: prevent user from passing illegal stab size")
  69508d4333 ("net_sched: Use struct_size() and flex_array_size() helpers")

Both cases trivial - adjacent code additions.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-30 14:49:21 -07:00
Linus Torvalds
4de593fb96 Networking fixes for 5.15-rc4, including fixes from mac80211, netfilter
and bpf.
 
 Current release - regressions:
 
  - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from
    interrupt
 
  - mdio: revert mechanical patches which broke handling of optional
    resources
 
  - dev_addr_list: prevent address duplication
 
 Previous releases - regressions:
 
  - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
    (NULL deref)
 
  - Revert "mac80211: do not use low data rates for data frames with no
    ack flag", fixing broadcast transmissions
 
  - mac80211: fix use-after-free in CCMP/GCMP RX
 
  - netfilter: include zone id in tuple hash again, minimize collisions
 
  - netfilter: nf_tables: unlink table before deleting it (race -> UAF)
 
  - netfilter: log: work around missing softdep backend module
 
  - mptcp: don't return sockets in foreign netns
 
  - sched: flower: protect fl_walk() with rcu (race -> UAF)
 
  - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup
 
  - smsc95xx: fix stalled rx after link change
 
  - enetc: fix the incorrect clearing of IF_MODE bits
 
  - ipv4: fix rtnexthop len when RTA_FLOW is present
 
  - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this SKU
 
  - e100: fix length calculation & buffer overrun in ethtool::get_regs
 
 Previous releases - always broken:
 
  - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx
 
  - mac80211: drop frames from invalid MAC address in ad-hoc mode
 
  - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
    (race -> UAF)
 
  - bpf, x86: Fix bpf mapping of atomic fetch implementation
 
  - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog
 
  - netfilter: ip6_tables: zero-initialize fragment offset
 
  - mhi: fix error path in mhi_net_newlink
 
  - af_unix: return errno instead of NULL in unix_create1() when
    over the fs.file-max limit
 
 Misc:
 
  - bpf: exempt CAP_BPF from checks against bpf_jit_limit
 
  - netfilter: conntrack: make max chain length random, prevent guessing
    buckets by attackers
 
  - netfilter: nf_nat_masquerade: make async masq_inet6_event handling
    generic, defer conntrack walk to work queue (prevent hogging RTNL lock)
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFV5KYACgkQMUZtbf5S
 Irs3cRAAqNsgaQXSVOXcPKsndeDDKHIv1Ktes8aOP9tgEXZw5rpsbct7g9Yxc0os
 Oolyt6HThjWr3u1/e3HHJO9I5Klr/J8eQReoRZnKW+6TYZflmmzfuf8u1nx6SLP/
 tliz5y8wKbp8BqqNuTMdRpm+R1QQcNkXTeruUoR1PgREcY4J7bC2BRrqeZhBGHSR
 Z5yPOIietFN3nITxNwbe4AYJXlesMc6QCWhBtXjMPGQ4Zc4/sjDNfqi7eHJi2H2y
 kW2dHeXG86gnlgFllOBBWP85ptxynyxoNQJuhrxgC9T+/FpSVST7cwKbtmkwDI3M
 5WGmeE6B3yfF8iOQuR8fbKQmsnLgQlYhjpbbhgN0GxzkyI7RpGYOFroX0Pht4IVZ
 mwprDOtvoLs4UeDjULRMB0JZfRN75PCtVlhfUkhhJxXGCCmnhGYaxG/pE+6OQWlr
 +n8RXYYMoOzPaHIYTS9NGSGqT0r32IUy/W5Yfv3rEzSeehy2/fxzGr2fOyBGs+q7
 xrnqpsOnM8cODDwGMy3TclCI4Dd72WoHNCHPhA/bk/ZMjHpBd4CSEZPm8IROY3Ja
 g1t68cncgL8fB7TSD9WLFgYu67Lg5j0gC/BHOzUQDQMX5IbhOq/fj1xRy5Lc6SYp
 mqW1f7LdnixBe4W61VjDAYq5jJRqrwEWedx+rvV/ONLvr77KULk=
 =rSti
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from mac80211, netfilter and bpf.

  Current release - regressions:

   - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from
     interrupt

   - mdio: revert mechanical patches which broke handling of optional
     resources

   - dev_addr_list: prevent address duplication

  Previous releases - regressions:

   - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
     (NULL deref)

   - Revert "mac80211: do not use low data rates for data frames with no
     ack flag", fixing broadcast transmissions

   - mac80211: fix use-after-free in CCMP/GCMP RX

   - netfilter: include zone id in tuple hash again, minimize collisions

   - netfilter: nf_tables: unlink table before deleting it (race -> UAF)

   - netfilter: log: work around missing softdep backend module

   - mptcp: don't return sockets in foreign netns

   - sched: flower: protect fl_walk() with rcu (race -> UAF)

   - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup

   - smsc95xx: fix stalled rx after link change

   - enetc: fix the incorrect clearing of IF_MODE bits

   - ipv4: fix rtnexthop len when RTA_FLOW is present

   - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this
     SKU

   - e100: fix length calculation & buffer overrun in ethtool::get_regs

  Previous releases - always broken:

   - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx

   - mac80211: drop frames from invalid MAC address in ad-hoc mode

   - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race
     -> UAF)

   - bpf, x86: Fix bpf mapping of atomic fetch implementation

   - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog

   - netfilter: ip6_tables: zero-initialize fragment offset

   - mhi: fix error path in mhi_net_newlink

   - af_unix: return errno instead of NULL in unix_create1() when over
     the fs.file-max limit

  Misc:

   - bpf: exempt CAP_BPF from checks against bpf_jit_limit

   - netfilter: conntrack: make max chain length random, prevent
     guessing buckets by attackers

   - netfilter: nf_nat_masquerade: make async masq_inet6_event handling
     generic, defer conntrack walk to work queue (prevent hogging RTNL
     lock)"

* tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
  net: stmmac: fix EEE init issue when paired with EEE capable PHYs
  net: dev_addr_list: handle first address in __hw_addr_add_ex
  net: sched: flower: protect fl_walk() with rcu
  net: introduce and use lock_sock_fast_nested()
  net: phy: bcm7xxx: Fixed indirect MMD operations
  net: hns3: disable firmware compatible features when uninstall PF
  net: hns3: fix always enable rx vlan filter problem after selftest
  net: hns3: PF enable promisc for VF when mac table is overflow
  net: hns3: fix show wrong state when add existing uc mac address
  net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
  net: hns3: don't rollback when destroy mqprio fail
  net: hns3: remove tc enable checking
  net: hns3: do not allow call hns3_nic_net_open repeatedly
  ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  net: bridge: mcast: Associate the seqcount with its protecting lock.
  net: mdio-ipq4019: Fix the error for an optional regs resource
  net: hns3: fix hclge_dbg_dump_tm_pg() stack usage
  net: mdio: mscc-miim: Fix the mdio controller
  af_unix: Return errno instead of NULL in unix_create1().
  ...
2021-09-30 14:28:05 -07:00
Kumar Kartikeya Dwivedi
4729445b47 libbpf: Fix segfault in light skeleton for objects without BTF
When fed an empty BPF object, bpftool gen skeleton -L crashes at
btf__set_fd() since it assumes presence of obj->btf, however for
the sequence below clang adds no .BTF section (hence no BTF).

Reproducer:

  $ touch a.bpf.c
  $ clang -O2 -g -target bpf -c a.bpf.c
  $ bpftool gen skeleton -L a.bpf.o
  /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
  /* THIS FILE IS AUTOGENERATED! */

  struct a_bpf {
	struct bpf_loader_ctx ctx;
  Segmentation fault (core dumped)

The same occurs for files compiled without BTF info, i.e. without
clang's -g flag.

Fixes: 6723474373 (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com
2021-09-30 23:19:58 +02:00
Po-Hsu Lin
d4b6f87e8d selftests/bpf: Use kselftest skip code for skipped tests
There are several test cases in the bpf directory are still using
exit 0 when they need to be skipped. Use kselftest framework skip
code instead so it can help us to distinguish the return status.

Criterion to filter out what should be fixed in bpf directory:
  grep -r "exit 0" -B1 | grep -i skip

This change might cause some false-positives if people are running
these test scripts directly and only checking their return codes,
which will change from 0 to 4. However I think the impact should be
small as most of our scripts here are already using this skip code.
And there will be no such issue if running them with the kselftest
framework.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210929051250.13831-1-po-hsu.lin@canonical.com
2021-09-30 23:09:17 +02:00
Thomas Huth
22d7108ce4 KVM: selftests: Fix kvm_vm_free() in cr4_cpuid_sync and vmx_tsc_adjust tests
The kvm_vm_free() statement here is currently dead code, since the loop
in front of it can only be left with the "goto done" that jumps right
after the kvm_vm_free(). Fix it by swapping the locations of the "done"
label and the kvm_vm_free().

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210826074928.240942-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:27:08 -04:00
Colin Ian King
d22869aff4 kvm: selftests: Fix spelling mistake "missmatch" -> "mismatch"
There is a spelling mistake in an error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Message-Id: <20210826120752.12633-1-colin.king@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:27:08 -04:00
Juergen Gross
a1c42ddedf kvm: rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS
KVM_MAX_VCPU_ID is not specifying the highest allowed vcpu-id, but the
number of allowed vcpu-ids. This has already led to confusion, so
rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS to make its semantics more
clear

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210913135745.13944-3-jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:27:05 -04:00
Sean Christopherson
7b0035eaa7 KVM: selftests: Ensure all migrations are performed when test is affined
Rework the CPU selection in the migration worker to ensure the specified
number of migrations are performed when the test iteslf is affined to a
subset of CPUs.  The existing logic skips iterations if the target CPU is
not in the original set of possible CPUs, which causes the test to fail
if too many iterations are skipped.

  ==== Test Assertion Failure ====
  rseq_test.c:228: i > (NR_TASK_MIGRATIONS / 2)
  pid=10127 tid=10127 errno=4 - Interrupted system call
     1  0x00000000004018e5: main at rseq_test.c:227
     2  0x00007fcc8fc66bf6: ?? ??:0
     3  0x0000000000401959: _start at ??:?
  Only performed 4 KVM_RUNs, task stalled too much?

Calculate the min/max possible CPUs as a cheap "best effort" to avoid
high runtimes when the test is affined to a small percentage of CPUs.
Alternatively, a list or xarray of the possible CPUs could be used, but
even in a horrendously inefficient setup, such optimizations are not
needed because the runtime is completely dominated by the cost of
migrating the task, and the absolute runtime is well under a minute in
even truly absurd setups, e.g. running on a subset of vCPUs in a VM that
is heavily overcommited (16 vCPUs per pCPU).

Fixes: 61e52f1630 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Reported-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210929234112.1862848-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:25:57 -04:00
Kumar Kartikeya Dwivedi
e68ac00827 libbpf: Fix skel_internal.h to set errno on loader retval < 0
When the loader indicates an internal error (result of a checked bpf
system call), it returns the result in attr.test.retval. However, tests
that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may
miss that NULL denotes an error if errno is set to 0. This would result
in skel pointer being NULL, while ASSERT_OK_PTR returning 1, leading to
a SEGV on dereference of skel, because libbpf_get_error relies on the
assumption that errno is always set in case of error for ptr == NULL.

In particular, this was observed for the ksyms_module test. When
executed using `./test_progs -t ksyms`, prior tests manipulated errno
and the test didn't crash when it failed at ksyms_module load, while
using `./test_progs -t ksyms_module` crashed due to errno being
untouched.

Fixes: 6723474373 (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com
2021-09-29 20:42:32 -07:00
Toke Høiland-Jørgensen
161ecd5379 libbpf: Properly ignore STT_SECTION symbols in legacy map definitions
The previous patch to ignore STT_SECTION symbols only added the ignore
condition in one of them. This fails if there's more than one map
definition in the 'maps' section, because the subsequent modulus check will
fail, resulting in error messages like:

libbpf: elf: unable to determine legacy map definition size in ./xdpdump_xdp.o

Fix this by also ignoring STT_SECTION in the first loop.

Fixes: c3e8c44a90 ("libbpf: Ignore STT_SECTION symbols in 'maps' section")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210929213837.832449-1-toke@redhat.com
2021-09-29 15:50:32 -07:00
Alexei Starovoitov
66fe332417 libbpf: Make gen_loader data aligned.
Align gen_loader data to 8 byte boundary to make sure union bpf_attr,
bpf_insns and other structs are aligned.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-9-memxor@gmail.com
2021-09-29 13:27:19 -07:00
Kumar Kartikeya Dwivedi
e31eec77e4 bpf: selftests: Fix fd cleanup in get_branch_snapshot
Cleanup code uses while (cpu++ < cpu_cnt) for closing fds, which means
it starts iterating from 1 for closing fds. If the first fd is -1, it
skips over it and closes garbage fds (typically zero) in the remaining
array. This leads to test failures for future tests when they end up
storing fd 0 (as the slot becomes free due to close(0)) in ldimm64's BTF
fd, ending up trying to match module BTF id with vmlinux.

This was observed as spurious CI failure for the ksym_module_libbpf and
module_attach tests. The test ends up closing fd 0 and breaking libbpf's
assumption that module BTF fd will always be > 0, which leads to the
kernel thinking that we are pointing to a BTF ID in vmlinux BTF.

Fixes: 025bd7c753 (selftests/bpf: Add test for bpf_get_branch_snapshot)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-12-memxor@gmail.com
2021-09-29 13:25:09 -07:00
Michael Petlan
2b775152bb perf tests vmlinux-kallsyms: Ignore hidden symbols
Certain kernel symbols are purposely hidden from kallsyms. The function
is_ignored_symbol() from scripts/kallsyms.c decides if a symbol should
be hidden or not.

The perf test "vmlinux symtab matches kallsyms" fails in case perf finds
some of the hidden symbols in its machine image and can't match them to
kallsyms.

Let's add a filter to check if a symbol not found isn't one of these
before failing the test.

The function is_ignored_symbol() has been copied from scripts/kallsyms.c
and needs to be updated along with the original.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20210922152706.23655-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 14:13:42 -03:00
Ian Rogers
94886961e3 perf metric: Avoid events for an 'if' constant result
For a metric like:

  CONST if expr else CONST

if the values of CONST are identical then expr doesn't need evaluating,
and events, in order to compute a result.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:51:04 -03:00
Ian Rogers
a8e4e88083 perf metric: Don't compute unused events
For a metric like:

  EVENT1 if #smt_on else EVENT2

currently EVENT1 and EVENT2 will be measured and then when the metric is
reported EVENT1 or EVENT2 will be printed depending on the value from
smt_on() during the expr parsing. Computing both events is unnecessary and
can lead to multiplexing as discussed in this thread:

  https://lore.kernel.org/lkml/20201110100346.2527031-1-irogers@google.com/

If the input is constant to certain operators like:

 IDS1 if CONST else IDS2

then the result will be either IDS1 or IDS2 depending on CONST (which
may be evaluated from an entire expression), and so IDS1 or IDS2 may
be discarded avoiding events from being programmed.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:51:04 -03:00
Ian Rogers
970f7afe55 perf expr: Propagate constants for binary operations
When we're computing ID values, if we have constant values then compute
the constant result. For example:

  1 + 2

Previously .val would be set to BOTTOM by union_expr, meaning that
all values are possible. With this change .val is set to 3.

Later changes will use the constant values to hopefully eliminate ID
values that don't need to be computed.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:51:04 -03:00
Ian Rogers
3f965a7df0 perf expr: Merge find_ids and regular parsing
Add a new option to parsing that the set of IDs being used should be
computed, this means every action needs to handle the compute_ids and
regular case. This means actions yield a new ids type is a set of ids or
the value being computed. Use the compute_ids case to replace find IDs
parsing.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:51:04 -03:00
Ian Rogers
762a05c561 perf metric: Allow metrics with no events
A metric may be a constant value, for example, some SMT metrics are
constant 0 if #smt_on is 0. If we eliminate all the events then there is
no printing. Fix this by forcing metrics like this to have a
duration_time tool event, previously the metric would fail when parsing
the events with a parse error.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-10-irogers@google.com
[ Reflow one __parse_events() call so that a ternary operation gets in a single line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:50:16 -03:00
Ian Rogers
114a9d6e39 perf metric: Add utilities to work on ids map.
Add utilities to new/free an ids hashmap, as well as to union. Add
testing of the union. Unioning hashmaps will be used when parsing the
metric, if a value is known then the hashmap is unnecessary, otherwise
we need to union together all the event ids to compute their values for
reporting.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:42:11 -03:00
Ian Rogers
7e06a5e30a perf metric: Rename expr__find_other.
A later change will remove the notion of other, rename the function to
expr__find_ids as this is what it populates.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:42:03 -03:00
Ian Rogers
c924e0cc05 perf expr: Move actions to the left.
No functional change, just modifying whitespace. This creates additional
space for adding logic to actions in later changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:41:49 -03:00
Ian Rogers
e87576c5ac perf expr: Use macros for operators
No functional change, switch the operators to use macros so that
additional complexity for constants can be added in a later change.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:41:41 -03:00
Ian Rogers
aed0d6f8c6 perf expr: Separate token declataion from type
No functional change, so the type of expr remains <num>. A later patch
will change the computation to be an aggregate type and making this
change makes that later change smaller.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:27:17 -03:00
Ian Rogers
7f8fdcbbbe perf expr: Remove unused headers and inline d_ratio
No functional change. Inlining d_ratio makes it easier to special case
for constants in a later patch.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:26:27 -03:00
Ian Rogers
edfe7f554a perf metric: Use NAN for missing event IDs.
If during computing a metric an event (id) is missing the parsing
aborts. A later patch will make it so that events that aren't used in
the output are deliberately omitted, in which case we don't want the
abort. Modify the missing ID case to report NAN for these cases.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:25:56 -03:00
Ian Rogers
cb94a02e74 perf metric: Restructure struct expr_parse_ctx.
A later change to parsing the ids out (in expr__find_other) will
potentially drop hashmaps and so it is more convenient to move
expr_parse_ctx to have a hashmap pointer rather than a struct value.

As this pointer must be freed, rather than just going out of scope, add
expr__ctx_new and expr__ctx_free to manage expr_parse_ctx memory.
Adjust use of struct expr_parse_ctx accordingly.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210923074616.674826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-29 13:21:47 -03:00
Mark Brown
8694e5e638 selftests: arm64: Verify that all possible vector lengths are handled
As part of the enumeration interface for setting vector lengths it is valid
to set vector lengths not supported in the system, these will be rounded to
a supported vector length and returned from the prctl(). Add a test which
exercises this for every valid vector length and makes sure that the return
value is as expected and that this is reflected in the actual system state.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20210929151925.9601-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown
e42391150e selftests: arm64: Fix and enable test for setting current VL in vec-syscfg
We had some test code for verifying that we can write the current VL via
the prctl() interface but the condition for the test was inverted which
wasn't noticed as it was never actually hooked up to the array of tests
we execute. Fix this.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210929151925.9601-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown
4caf339c03 selftests: arm64: Remove bogus error check on writing to files
Due to some refactoring with the error handling we ended up mangling things
so we never actually set ret and therefore shouldn't be checking it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210929151925.9601-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown
ff944c44b7 selftests: arm64: Fix printf() format mismatch in vec-syscfg
The format for this error message calls for the plain text version of the
error but we weren't supply it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210929151925.9601-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown
34785030dc selftests: arm64: Move FPSIMD in SVE ptrace test into a function
Now that all the other tests are in functions rather than inline in the
main parent process function also move the test for accessing the FPSIMD
registers via the SVE regset out into their own function.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-9-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown
a1d7111257 selftests: arm64: More comprehensively test the SVE ptrace interface
Currently the selftest for the SVE register set is not quite as thorough
as is desirable - it only validates that the value of a single Z register
is not modified by a partial write to a lower numbered Z register after
having previously been set through the FPSIMD regset.

Make this more thorough:
 - Test the ability to set vector lengths and enumerate those supported in
   the system.
 - Validate data in all Z and P registers, plus FPSR and FPCR.
 - Test reads via the FPSIMD regset after set via the SVE regset.

There's still some oversights, the main one being that due to the need to
generate a pattern in FFR and the fact that this rewrite is primarily
motivated by SME's streaming SVE which doesn't have FFR we don't currently
test FFR. Update the TODO to reflect those that occurred to me (and fix an
adjacent typo in there).

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-8-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown
9f7d03a2c5 selftests: arm64: Verify interoperation of SVE and FPSIMD register sets
After setting the FPSIMD registers via the SVE register set read them back
via the FPSIMD register set, validating that the two register sets are
interoperating and that the values we thought we set made it into the
child process.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown
8c9eece0bf selftests: arm64: Clarify output when verifying SVE register set
When verifying setting a Z register via ptrace we check each byte by hand,
iterating over the buffer using a pointer called p and treating each
register value written as a test. This creates output referring to "p[X]"
which is confusing since SVE also has predicate registers Pn. Tweak the
output to avoid confusion here.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown
736e6d5a54 selftests: arm64: Document what the SVE ptrace test is doing
Before we go modifying it further let's add some comments and output
clarifications explaining what this test is actually doing.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown
eab281e3af selftests: arm64: Remove extraneous register setting code
For some reason the SVE ptrace test code starts off by setting values in
some of the SVE vector registers in the parent process which it then never
interacts with when verifying the ptrace interfaces. This is not especially
relevant to what's being tested and somewhat confusing when reading the
code so let's remove it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:31 +01:00
Mark Brown
09121ad718 selftests: arm64: Don't log child creation as a test in SVE ptrace test
Currently we log the creation of the child process as a test but it's not
really relevant to what we're trying to test and can make the output a
little confusing so don't do that.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:31 +01:00
Mark Brown
78d2d816c4 selftests: arm64: Use a define for the number of SVE ptrace tests to be run
Partly in preparation for future refactoring move from hard coding the
number of tests in main() to putting #define at the top of the source
instead.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:31 +01:00
Yonghong Song
38261f369f selftests/bpf: Fix probe_user test failure with clang build kernel
clang build kernel failed the selftest probe_user.
  $ ./test_progs -t probe_user
  $ ...
  $ test_probe_user:PASS:get_kprobe_res 0 nsec
  $ test_probe_user:FAIL:check_kprobe_res wrong kprobe res from probe read: 0.0.0.0:0
  $ #94 probe_user:FAIL

The test attached to kernel function __sys_connect(). In net/socket.c, we have
  int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen)
  {
        ......
  }
  ...
  SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
                  int, addrlen)
  {
        return __sys_connect(fd, uservaddr, addrlen);
  }

The gcc compiler (8.5.0) does not inline __sys_connect() in syscall entry
function. But latest clang trunk did the inlining. So the bpf program
is not triggered.

To make the test more reliable, let us kprobe the syscall entry function
instead. Note that x86_64, arm64 and s390 have syscall wrappers and they have
to be handled specially.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210929033000.3711921-1-yhs@fb.com
2021-09-28 23:17:22 -07:00
Yucong Sun
09710d82c0 bpftool: Avoid using "?: " in generated code
"?:" is a GNU C extension, some environment has warning flags for its
use, or even prohibit it directly.  This patch avoid triggering these
problems by simply expand it to its full form, no functionality change.

Signed-off-by: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210928184221.1545079-1-fallentree@fb.com
2021-09-28 15:19:22 -07:00
Andrii Nakryiko
7c80c87ad5 selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup") use
Update "sk_lookup/" definition to be a stand-alone type specifier,
with backwards-compatible prefix match logic in non-libbpf-1.0 mode.

Currently in selftests all the "sk_lookup/<whatever>" uses just use
<whatever> for duplicated unique name encoding, which is redundant as
BPF program's name (C function name) uniquely and descriptively
identifies the intended use for such BPF programs.

With libbpf's SEC_DEF("sk_lookup") definition updated, switch existing
sk_lookup programs to use "unqualified" SEC("sk_lookup") section names,
with no random text after it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-11-andrii@kernel.org
2021-09-28 13:51:20 -07:00
Andrii Nakryiko
dd94d45cf0 libbpf: Add opt-in strict BPF program section name handling logic
Implement strict ELF section name handling for BPF programs. It utilizes
`libbpf_set_strict_mode()` framework and adds new flag: LIBBPF_STRICT_SEC_NAME.

If this flag is set, libbpf will enforce exact section name matching for
a lot of program types that previously allowed just partial prefix
match. E.g., if previously SEC("xdp_whatever_i_want") was allowed, now
in strict mode only SEC("xdp") will be accepted, which makes SEC("")
definitions cleaner and more structured. SEC() now won't be used as yet
another way to uniquely encode BPF program identifier (for that
C function name is better and is guaranteed to be unique within
bpf_object). Now SEC() is strictly BPF program type and, depending on
program type, extra load/attach parameter specification.

Libbpf completely supports multiple BPF programs in the same ELF
section, so multiple BPF programs of the same type/specification easily
co-exist together within the same bpf_object scope.

Additionally, a new (for now internal) convention is introduced: section
name that can be a stand-alone exact BPF program type specificator, but
also could have extra parameters after '/' delimiter. An example of such
section is "struct_ops", which can be specified by itself, but also
allows to specify the intended operation to be attached to, e.g.,
"struct_ops/dctcp_init". Note, that "struct_ops_some_op" is not allowed.
Such section definition is specified as "struct_ops+".

This change is part of libbpf 1.0 effort ([0], [1]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/271
  [1] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-10-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
d41ea045a6 libbpf: Complete SEC() table unification for BPF_APROG_SEC/BPF_EAPROG_SEC
Complete SEC() table refactoring towards unified form by rewriting
BPF_APROG_SEC and BPF_EAPROG_SEC definitions with
SEC_DEF(SEC_ATTACHABLE_OPT) (for optional expected_attach_type) and
SEC_DEF(SEC_ATTACHABLE) (mandatory expected_attach_type), respectively.
Drop BPF_APROG_SEC, BPF_EAPROG_SEC, and BPF_PROG_SEC_IMPL macros after
that, leaving SEC_DEF() macro as the only one used.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-9-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
15ea31fadd libbpf: Refactor ELF section handler definitions
Refactor ELF section handler definitions table to use a set of flags and
unified SEC_DEF() macro. This allows for more succinct and table-like
set of definitions, and allows to more easily extend the logic without
adding more verbosity (this is utilized in later patches in the series).

This approach is also making libbpf-internal program pre-load callback
not rely on bpf_sec_def definition, which demonstrates that future
pluggable ELF section handlers will be able to achieve similar level of
integration without libbpf having to expose extra types and APIs.

For starters, update SEC_DEF() definitions and make them more succinct.
Also convert BPF_PROG_SEC() and BPF_APROG_COMPAT() definitions to
a common SEC_DEF() use.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-8-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
13d35a0cf1 libbpf: Reduce reliance of attach_fns on sec_def internals
Move closer to not relying on bpf_sec_def internals that won't be part
of public API, when pluggable SEC() handlers will be allowed. Drop
pre-calculated prefix length, and in various helpers don't rely on this
prefix length availability. Also minimize reliance on knowing
bpf_sec_def's prefix for few places where section prefix shortcuts are
supported (e.g., tp vs tracepoint, raw_tp vs raw_tracepoint).

Given checking some string for having a given string-constant prefix is
such a common operation and so annoying to be done with pure C code, add
a small macro helper, str_has_pfx(), and reuse it throughout libbpf.c
where prefix comparison is performed. With __builtin_constant_p() it's
possible to have a convenient helper that checks some string for having
a given prefix, where prefix is either string literal (or compile-time
known string due to compiler optimization) or just a runtime string
pointer, which is quite convenient and saves a lot of typing and string
literal duplication.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-7-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
12d9466d8b libbpf: Refactor internal sec_def handling to enable pluggability
Refactor internals of libbpf to allow adding custom SEC() handling logic
easily from outside of libbpf. To that effect, each SEC()-handling
registration sets mandatory program type/expected attach type for
a given prefix and can provide three callbacks called at different
points of BPF program lifetime:

  - init callback for right after bpf_program is initialized and
  prog_type/expected_attach_type is set. This happens during
  bpf_object__open() step, close to the very end of constructing
  bpf_object, so all the libbpf APIs for querying and updating
  bpf_program properties should be available;

  - pre-load callback is called right before BPF_PROG_LOAD command is
  called in the kernel. This callbacks has ability to set both
  bpf_program properties, as well as program load attributes, overriding
  and augmenting the standard libbpf handling of them;

  - optional auto-attach callback, which makes a given SEC() handler
  support auto-attachment of a BPF program through bpf_program__attach()
  API and/or BPF skeletons <skel>__attach() method.

Each callbacks gets a `long cookie` parameter passed in, which is
specified during SEC() handling. This can be used by callbacks to lookup
whatever additional information is necessary.

This is not yet completely ready to be exposed to the outside world,
mainly due to non-public nature of struct bpf_prog_load_params. Instead
of making it part of public API, we'll wait until the planned low-level
libbpf API improvements for BPF_PROG_LOAD and other typical bpf()
syscall APIs, at which point we'll have a public, probably OPTS-based,
way to fully specify BPF program load parameters, which will be used as
an interface for custom pre-load callbacks.

But this change itself is already a good first step to unify the BPF
program hanling logic even within the libbpf itself. As one example, all
the extra per-program type handling (sleepable bit, attach_btf_id
resolution, unsetting optional expected attach type) is now more obvious
and is gathered in one place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-6-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
15669e1dcd selftests/bpf: Normalize all the rest SEC() uses
Normalize all the other non-conforming SEC() usages across all
selftests. This is in preparation for libbpf to start to enforce
stricter SEC() rules in libbpf 1.0 mode.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-5-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
c22bdd2825 selftests/bpf: Switch SEC("classifier*") usage to a strict SEC("tc")
Convert all SEC("classifier*") uses to a new and strict SEC("tc")
section name. In reference_tracking selftests switch from ambiguous
searching by program title (section name) to non-ambiguous searching by
name in some selftests, getting closer to completely removing
bpf_object__find_program_by_title().

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-4-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
8fffa0e345 selftests/bpf: Normalize XDP section names in selftests
Convert almost all SEC("xdp_blah") uses to strict SEC("xdp") to comply
with strict libbpf 1.0 logic of exact section name match for XDP program
types. There is only one exception, which is only tested through
iproute2 and defines multiple XDP programs within the same BPF object.
Given iproute2 still works in non-strict libbpf mode and it doesn't have
means to specify XDP programs by its name (not section name/title),
leave that single file alone for now until iproute2 gains lookup by
function/program name.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-3-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko
9673268f03 libbpf: Add "tc" SEC_DEF which is a better name for "classifier"
As argued in [0], add "tc" ELF section definition for SCHED_CLS BPF
program type. "classifier" is a misleading terminology and should be
migrated away from.

  [0] https://lore.kernel.org/bpf/270e27b1-e5be-5b1c-b343-51bd644d0747@iogearbox.net/

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-2-andrii@kernel.org
2021-09-28 13:51:19 -07:00
John Garry
c801612875 perf vendor events arm64: Revise hip08 uncore events
To improve alias matching, remove the PMU name prefix from the
EventName.  This will mean that the pmu code will merge aliases, such
that we no longer get a huge list of per-PMU events - see
perf_pmu_merge_alias().

Also make the following associated changes:

- Use "ConfigCode" rather than "EventCode", so the pmu code is not so
  disagreeable about inconsistent event codes

- Add undocumented HHA event codes to allow alias merging (for those
  events)

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Cc: liuqi115@huawei.com
Link: https://lore.kernel.org/r/1631795665-240946-6-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:15:33 -03:00
John Garry
b8b350afaa perf test: Add pmu-event test for event described as "config="
Add a new test event for a system event whose event member is in form
"config=".

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Cc: liuqi115@huawei.com
Link: https://lore.kernel.org/r/1631795665-240946-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:15:15 -03:00
John Garry
56be05103a perf test: Verify more event members in pmu-events test
Function compare_pmu_events() does not compare all struct pmu-events
members, so add tests for missing members "name", "event", "aggr_mod",
"event", "metric_constraint", and "metric_group", and re-order the tests
to match current struct pmu-events member ordering.

Also fix uncore_hisi_l3c_rd_hit_cpipe.event member, now that we're
actually testing it.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Cc: liuqi115@huawei.com
Link: https://lore.kernel.org/r/1631795665-240946-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:14:57 -03:00
John Garry
d60bad10c4 perf jevents: Support ConfigCode
Some PMUs use "config=XXX" for eventcodes, like:

more /sys/bus/event_source/devices/hisi_sccl1_ddrc3/events/act_cmd
config=0x5

However jevents would give an alias with .event field "event=0x5" for
this event. This is handled without issue by the parse events code, but
the pmu alias code gets a bit confused, as it warns about assigning
"event=0x5" over "config=0x5" in perf_pmu_assign_str() when merging
aliases: ./perf stat -v -e act_cmd ...  alias act_cmd differs in field
'value' ...

To make things a bit more straightforward, allow jevents to support
"config=XXX" as well, by supporting a "ConfigCode" field.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Cc: liuqi115@huawei.com
Link: https://lore.kernel.org/r/1631795665-240946-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:14:32 -03:00
John Garry
4f9d4f8aa7 perf parse-events: Set numeric term config
For numeric terms, the config field may be NULL as it is not set from
the l+y parsing.

Fix by setting the term config from the term type name.

Also fix up the pmu-events test to set the alias strings to set the
period term properly, and fix up parse-events test to check the term
config string.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Cc: liuqi115@huawei.com
Link: https://lore.kernel.org/r/1631795665-240946-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:14:05 -03:00
Ian Rogers
08efcb4a63 libtraceevent: Increase libtraceevent logging when verbose
libtraceevent has added more levels of debug printout and with changes
like:

  https://lore.kernel.org/linux-trace-devel/20210507095022.1079364-3-tz.stoyanov@gmail.com

previously generated output like "registering plugin" is no longer
displayed.

This change makes it so that if perf's verbose debug output is enabled
then the debug and info libtraceevent messages can be displayed.

The code is conditionally enabled based on the libtraceevent version as
discussed in the RFC:

  https://lore.kernel.org/lkml/20210610060643.595673-1-irogers@google.com/

v2. Is a rebase and handles the case of building without
    LIBTRACEEVENT_DYNAMIC.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210923001024.550263-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:09:16 -03:00
Ian Rogers
359cad09e4 perf tools: Add define for libtracefs version
This will allow version specific support of libtracefs.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210923001024.550263-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:09:08 -03:00
Ian Rogers
569715164b perf tools: Add define for libtraceevent version
The definition is derived from pkg-config as discussed in:

  https://lore.kernel.org/lkml/20210610155915.20a252d3@oasis.local.home/

The definition is computed using expr rather than passed to be computed
in C code, this avoids complications with quote  in the variable
expansions.

For example see the target python/perf.so in Makefile.perf.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210923001024.550263-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:08:46 -03:00
Ian Rogers
b758a61b39 perf tools: Enable libtracefs dynamic linking
Currently libtracefs isn't used by perf, but there are potential
improvements by using it as identified Steven Rostedt's e-mail:
https://lore.kernel.org/lkml/20210610154759.1ef958f0@oasis.local.home/

This change is modelled on the dynamic libtraceevent patch by Michael
Petlan:

https://lore.kernel.org/linux-perf-users/20210428092023.4009-1-mpetlan@redhat.com/

v3. Adds file missed in v1 and v2 spotted by Jiri Olsa.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210923001024.550263-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 16:08:37 -03:00
Ian Rogers
3d5ac9effc perf test: Workload test of all PMUs
Iterate over the list of PMUs and run the 'true' workload on them. If
the event isn't printed then run the large 'perf bench internals
synthesize' workload and check the event is counted.

On a Skylake this test takes 1m15s mainly running the 'true' workload.

Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210917184240.2181186-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 15:54:42 -03:00
Ian Rogers
4a87dea9e6 perf test: Workload test of metric and metricgroups
Test every metric and metricgroup with 'true' as a workload. For
metrics, check that we see the metric printed or get unsupported. If the
'true' workload executes too quickly retry with 'perf bench internals
synthesize'.

v3. Fix test condition (thanks to Paul A. Clarke <pc@us.ibm.com>). Add a
    fallback case of a larger workload so that we don't ignore "<not
    counted>".
v2. Switched the workload to something faster.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210917184240.2181186-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-28 15:43:49 -03:00
Arnaldo Carvalho de Melo
0e46c83075 perf jevents: Add __maybe_unused attribute to unused function arg
The tools/perf/pmu-events/jevents.c file isn't being compiled with
-Werror and -Wextra, which will be the case soon, so before we turn
those compiler flags on, fix what it would flag.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Like Xu <like.xu.linux@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To: John Garry <john.garry@huawei.com>
2021-09-28 14:15:01 -03:00
Oliver Upton
e02c16b9cd selftests: KVM: Don't clobber XMM register when read
There is no need to clobber a register that is only being read from.
Oops. Drop the XMM register from the clobbers list.

Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210927223621.50178-1-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-28 11:31:29 -04:00
David S. Miller
4ccb9f03fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-28

The following pull-request contains BPF updates for your *net* tree.

We've added 10 non-merge commits during the last 14 day(s) which contain
a total of 11 files changed, 139 insertions(+), 53 deletions(-).

The main changes are:

1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk.

2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh.

3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann.

4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi.

5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer.

6) Fix return value handling for struct_ops BPF programs, from Hou Tao.

7) Various fixes to BPF selftests, from Jiri Benc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
,
2021-09-28 13:52:46 +01:00
Jiri Benc
79e2c30666 selftests, bpf: test_lwt_ip_encap: Really disable rp_filter
It's not enough to set net.ipv4.conf.all.rp_filter=0, that does not override
a greater rp_filter value on the individual interfaces. We also need to set
net.ipv4.conf.default.rp_filter=0 before creating the interfaces. That way,
they'll also get their own rp_filter value of zero.

Fixes: 0fde56e438 ("selftests: bpf: add test_lwt_ip_encap selftest")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/b1cdd9d469f09ea6e01e9c89a6071c79b7380f89.1632386362.git.jbenc@redhat.com
2021-09-28 09:30:38 +02:00
Jiri Benc
d888eaac4f selftests, bpf: Fix makefile dependencies on libbpf
When building bpf selftest with make -j, I'm randomly getting build failures
such as this one:

  In file included from progs/bpf_flow.c:19:
  [...]/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found
  #include "bpf_helper_defs.h"
           ^~~~~~~~~~~~~~~~~~~

The file that fails the build varies between runs but it's always in the
progs/ subdir.

The reason is a missing make dependency on libbpf for the .o files in
progs/. There was a dependency before commit 3ac2e20fba but that commit
removed it to prevent unneeded rebuilds. However, that only works if libbpf
has been built already; the 'wildcard' prerequisite does not trigger when
there's no bpf_helper_defs.h generated yet.

Keep the libbpf as an order-only prerequisite to satisfy both goals. It is
always built before the progs/ objects but it does not trigger unnecessary
rebuilds by itself.

Fixes: 3ac2e20fba ("selftests/bpf: BPF object files should depend only on libbpf headers")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/ee84ab66436fba05a197f952af23c98d90eb6243.1632758415.git.jbenc@redhat.com
2021-09-28 09:30:14 +02:00
Kumar Kartikeya Dwivedi
bcfd367c28 libbpf: Fix segfault in static linker for objects without BTF
When a BPF object is compiled without BTF info (without -g),
trying to link such objects using bpftool causes a SIGSEGV due to
btf__get_nr_types accessing obj->btf which is NULL. Fix this by
checking for the NULL pointer, and return error.

Reproducer:
$ cat a.bpf.c
extern int foo(void);
int bar(void) { return foo(); }
$ cat b.bpf.c
int foo(void) { return 0; }
$ clang -O2 -target bpf -c a.bpf.c
$ clang -O2 -target bpf -c b.bpf.c
$ bpftool gen obj out a.bpf.o b.bpf.o
Segmentation fault (core dumped)

After fix:
$ bpftool gen obj out a.bpf.o b.bpf.o
libbpf: failed to find BTF info for object 'a.bpf.o'
Error: failed to link 'a.bpf.o': Unknown error -22 (-22)

Fixes: a46349227c (libbpf: Add linker extern resolution support for functions and global variables)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
2021-09-28 09:29:03 +02:00
Toke Høiland-Jørgensen
c3e8c44a90 libbpf: Ignore STT_SECTION symbols in 'maps' section
When parsing legacy map definitions, libbpf would error out when
encountering an STT_SECTION symbol. This becomes a problem because some
versions of binutils will produce SECTION symbols for every section when
processing an ELF file, so BPF files run through 'strip' will end up with
such symbols, making libbpf refuse to load them.

There's not really any reason why erroring out is strictly necessary, so
change libbpf to just ignore SECTION symbols when parsing the ELF.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com
2021-09-27 21:29:37 -07:00
Magnus Karlsson
e34087fc00 selftests: xsk: Add frame_headroom test
Add a test for the frame_headroom feature that can be set on the
umem. The logic added validates that all offsets in all tests and
packets are valid, not just the ones that have a specifically
configured frame_headroom.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-14-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
e4e9baf06a selftests: xsk: Change interleaving of packets in unaligned mode
Change the interleaving of packets in unaligned mode. With the current
buffer addresses in the packet stream, the last buffer in the umem
could not be used as a large packet could potentially write over the
end of the umem. The kernel correctly threw this buffer address away
and refused to use it. This is perfectly fine for all regular packet
streams, but the ones used for unaligned mode have every other packet
being at some different offset. As we will add checks for correct
offsets in the next patch, this needs to be fixed. Just start these
page-boundary straddling buffers one page earlier so that the last
one is not on the last page of the umem, making all buffers valid.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-13-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
96a40678ce selftests: xsk: Add single packet test
Add a test where a single packet is sent and received. This might
sound like a silly test, but since many of the interfaces in xsk are
batched, it is important to be able to validate that we did not break
something as fundamental as just receiving single packets, instead of
batches of packets at high speed.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-12-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
1bf3649688 selftests: xsk: Introduce pacing of traffic
Introduce pacing of traffic so that the Tx thread can never send more
packets than the receiver has processed plus the number of packets it
can have in its umem. So at any point in time, the number of in flight
packets (not processed by the Rx thread) are less than or equal to the
number of packets that can be held in the Rx thread's umem.

The batch size is also increased to improve running time.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-11-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
89013b8a29 selftests: xsk: Fix socket creation retry
The socket creation retry unnecessarily registered the umem once for
every retry. No reason to do this. It wastes memory and it might lead
to too many pages being locked at some point and the failure of a
test.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-10-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
872a1184db selftests: xsk: Put the same buffer only once in the fill ring
Fix a problem where the fill ring was populated with too many
entries. If number of buffers in the umem was smaller than the fill
ring size, the code used to loop over from the beginning of the umem
and start putting the same buffers in again. This is racy indeed as a
later packet can be received overwriting an earlier one before the Rx
thread manages to validate it.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-9-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
5b13205612 selftests: xsk: Fix missing initialization
Fix missing initialization of the member rx_pkt_nb in the packet
stream. This leads to some tests declaring success too early as the
test thought all packets had already been received.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-8-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Linus Torvalds
0513e464f9 perf tools fixes for v5.15: 2nd batch
- Fix 'perf test' DWARF unwind for optimized builds.
 
 - Fix 'perf test' 'Object code reading' when dealing with samples in @plt
   symbols.
 
 - Fix off-by-one directory paths in the ARM support code.
 
 - Fix error message to eliminate confusion in 'perf config' when first creating
   a config file.
 
 - 'perf iostat' fix for system wide operation.
 
 - Fix printing of metrics when 'perf iostat' is used with one or more
   iio_root_ports and unconnected cpus (using -C).
 
 - Fix several typos in the documentation files.
 
 - Fix spelling mistake "icach" -> "icache" in the power8 JSON vendor files.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYVIO3wAKCRCyPKLppCJ+
 J9piAP4jmxYEnimD6qvVHjOLio2LvwGI0u7MakZCHWVKQZKHbgEArb8l3+D2+YXw
 U7RxDmXoSE+0EjTV8o13sQlerRTU3wM=
 =oVI7
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix 'perf test' DWARF unwind for optimized builds.

 - Fix 'perf test' 'Object code reading' when dealing with samples in
   @plt symbols.

 - Fix off-by-one directory paths in the ARM support code.

 - Fix error message to eliminate confusion in 'perf config' when first
   creating a config file.

 - 'perf iostat' fix for system wide operation.

 - Fix printing of metrics when 'perf iostat' is used with one or more
   iio_root_ports and unconnected cpus (using -C).

 - Fix several typos in the documentation files.

 - Fix spelling mistake "icach" -> "icache" in the power8 JSON vendor
   files.

* tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
  perf iostat: Use system-wide mode if the target cpu_list is unspecified
  perf config: Refine error message to eliminate confusion
  perf doc: Fix typos all over the place
  perf arm: Fix off-by-one directory paths.
  perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
  perf tests: Fix flaky test 'Object code reading'
  perf test: Fix DWARF unwind for optimized builds.
2021-09-27 14:06:42 -07:00
Linus Torvalds
9cccec2bf3 x86:
- missing TLB flush
 
 - nested virtualization fixes for SMM (secure boot on nested hypervisor)
   and other nested SVM fixes
 
 - syscall fuzzing fixes
 
 - live migration fix for AMD SEV
 
 - mirror VMs now work for SEV-ES too
 
 - fixes for reset
 
 - possible out-of-bounds access in IOAPIC emulation
 
 - fix enlightened VMCS on Windows 2022
 
 ARM:
 
 - Add missing FORCE target when building the EL2 object
 
 - Fix a PMU probe regression on some platforms
 
 Generic:
 
 - KCSAN fixes
 
 selftests:
 
 - random fixes, mostly for clang compilation
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFN0EwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNqaQf/Vx7ePFTqwWpo+8wKapnc6JN9SLjC
 hM4jipxfc1WyQWcfCt8ZuPhCnhF7o8mG/mrqTm+JB+oGqIsydHW19DiUT8ekv09F
 dQ+XYSiR4B547wUH5XLQc4xG9imwYlXGEOHqrE7eJvGH3LOqVFX2fLRBnFefZbO8
 GKhRJrGXwG3/JSAP6A0c22iVU+pLbfV9gpKwrAj0V7o8nzT2b3Wmh74WBNb47BzE
 a4+AwKpWO4rqJGOwdYwy67pdFHh1YmrlZ59cFZc7fzlXE+o0D0bitaJyioZALpOl
 4mRGdzoYkNB++ZjDzVFnAClCYQV/oNxCNGFaFF2mh/gzXG1TLmN7B8zGDg==
 =7oVh
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A bit late... I got sidetracked by back-from-vacation routines and
  conferences. But most of these patches are already a few weeks old and
  things look more calm on the mailing list than what this pull request
  would suggest.

  x86:

   - missing TLB flush

   - nested virtualization fixes for SMM (secure boot on nested
     hypervisor) and other nested SVM fixes

   - syscall fuzzing fixes

   - live migration fix for AMD SEV

   - mirror VMs now work for SEV-ES too

   - fixes for reset

   - possible out-of-bounds access in IOAPIC emulation

   - fix enlightened VMCS on Windows 2022

  ARM:

   - Add missing FORCE target when building the EL2 object

   - Fix a PMU probe regression on some platforms

  Generic:

   - KCSAN fixes

  selftests:

   - random fixes, mostly for clang compilation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  selftests: KVM: Explicitly use movq to read xmm registers
  selftests: KVM: Call ucall_init when setting up in rseq_test
  KVM: Remove tlbs_dirty
  KVM: X86: Synchronize the shadow pagetable before link it
  KVM: X86: Fix missed remote tlb flush in rmap_write_protect()
  KVM: x86: nSVM: don't copy virt_ext from vmcb12
  KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround
  KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
  KVM: x86: nSVM: restore int_vector in svm_clear_vintr
  kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
  KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
  KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
  KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state
  KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
  KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  KVM: x86: reset pdptrs_from_userspace when exiting smm
  KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit
  KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
  KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs
  KVM: Clean up benign vcpu->cpu data races when kicking vCPUs
  ...
2021-09-27 13:58:23 -07:00
Shuah Khan
2f96028708 selftests: drivers/dma-buf: Fix implicit declaration warns
udmabuf has the following implicit declaration warns:

udmabuf.c:30:10: warning: implicit declaration of function 'open';
udmabuf.c:42:8: warning: implicit declaration of function 'fcntl'

These are caused due to not including fcntl.h and including just
linux/fcntl.h. Fix it to include fcntl.h which will bring in the
linux/fcntl.h. In addition, define __EXPORTED_HEADERS__ to bring in
F_ADD_SEALS and F_SEAL_SHRINK defines and fix the following error
that show up when just fcntl.h is included.

udmabuf.c:45:21: error: 'F_ADD_SEALS' undeclared
   45 |  ret = fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);
      |                     ^~~~~~~~~~~
udmabuf.c:45:34: error: 'F_SEAL_SHRINK' undeclared
   45 |  ret = fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);
      |                                  ^~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-27 09:52:29 -06:00
Like Xu
4da8b12188 perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
If the 'perf iostat' user specifies two or more iio_root_ports and also
specifies the cpu(s) by -C which is not *connected to all* the above iio
ports, the iostat_print_metric() will run into trouble:

For example:

  $ perf iostat list
  S0-uncore_iio_0<0000:16>
  S1-uncore_iio_0<0000:97> # <--- CPU 1 is located in the socket S0

  $ perf iostat 0000:16,0000:97 -C 1 -- ls
  port 	Inbound Read(MB)	Inbound Write(MB)	Outbound Read(MB)	Outbound
  Write(MB) ../perf-iostat: line 12: 104418 Segmentation fault
  (core dumped) perf stat --iostat$DELIMITER$*

The core-dump stack says, in the above corner case, the returned
(struct perf_counts_values *) count will be NULL, and the caller
iostat_print_metric() apparently doesn't not handle this case.

  433	struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
  434
  435	if (count->run && count->ena) {
  (gdb) p count
  $1 = (struct perf_counts_values *) 0x0

The deeper reason is that there are actually no statistics from the user
specified pair "iostat 0000:X, -C (disconnected) Y ", but let's fix it with
minimum cost by adding a NULL check in the user space.

Fixes: f9ed693e8b ("perf stat: Enable iostat mode for x86 platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-2-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:41:07 -03:00
Like Xu
e4fe5d7349 perf iostat: Use system-wide mode if the target cpu_list is unspecified
An iostate use case like "perf iostat 0000:16,0000:97 -- ls" should be
implemented to work in system-wide mode to ensure that the output from
print_header() is consistent with the user documentation perf-iostat.txt,
rather than incorrectly assuming that the kernel does not support it:

 Error:
 The sys_perf_event_open() syscall returned with 22 (Invalid argument) \
 for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
 /bin/dmesg | grep -i perf may provide additional information.

This error is easily fixed by assigning system-wide mode by default
for IOSTAT_RUN only when the target cpu_list is unspecified.

Fixes: f07952b179 ("perf stat: Basic support for iostat in perf")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:39:30 -03:00
William Cohen
0ba37e05c2 perf annotate: Add riscv64 support
This patch adds basic arch initialization and instruction associate
support for the riscv64 CPU architecture.

Example output:

  $ perf annotate --stdio2
  Samples: 122K of event 'task-clock:u', 4000 Hz, Event count (approx.): 30637250000, [percent: local period]
  strcmp() /usr/lib64/libc-2.32.so
  Percent

	      Disassembly of section .text:

	      0000000000069a30 <strcmp>:
	      __GI_strcmp():
	      const unsigned char *s2 = (const unsigned char *) p2;
	      unsigned char c1, c2;

	      do
	      {
	      c1 = (unsigned char) *s1++;
   37.30        lbu  a5,0(a0)
	      c2 = (unsigned char) *s2++;
    1.23        addi a1,a1,1
	      c1 = (unsigned char) *s1++;
   18.68        addi a0,a0,1
	      c2 = (unsigned char) *s2++;
    1.37        lbu  a4,-1(a1)
	      if (c1 == '\0')
   18.71      ↓ beqz a5,18
	       return c1 - c2;
	       }

Signed-off-by: William Cohen <wcohen@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-riscv@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210927005115.610264-1-wcohen@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:33:44 -03:00
Like Xu
a827c007c7 perf config: Refine error message to eliminate confusion
If there is no configuration file at first, the user can write any pair
of "key.subkey=value" to the newly created configuration file, while
value validation against a valid configurable key is *deferred* until
the next execution or the implied execution of "perf config ... ".

For example:

  $ rm ~/.perfconfig
  $ perf config call-graph.dump-size=65529
  $ cat ~/.perfconfig
  # this file is auto-generated.
  [call-graph]
 	dump-size = 65529
  $ perf config call-graph.dump-size=2048
  callchain: Incorrect stack dump size (max 65528): 65529
  Error: wrong config key-value pair call-graph.dump-size=65529

The user might expect that the second value 2048 is valid and can be
updated to the configuration file, but the error message is very
confusing because the first value 65529 is not reported as an error
during the last configuration.

It is recommended not to change the current behavior of delayed
validation (as more effort is needed), but to refine the original error
message to *clearly indicate* that the cause of the error is the
configuration file.

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210924115817.58689-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Like Xu
4da6552c5d perf doc: Fix typos all over the place
Considering that perf and its subcommands have so many parameters, the
documentation is always the first stop for perf beginners. Fixing some
spelling errors will relax the eyes of some readers a little bit.

 s/specicfication/specification/
 s/caheline/cacheline/
 s/tranasaction/transaction/
 s/complan/complain/
 s/sched_wakep/sched_wakeup/
 s/possble/possible/
 s/methology/methodology/

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210924081942.38368-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers
c6613bd4a5 perf arm: Fix off-by-one directory paths.
Relative path include works in the regular build due to -I paths but may
fail in other situations.

v2. Rebase. Comments on v1 were that we should handle include paths
    differently and it is agreed that can be a sensible refactor but
    beyond the scope of this change.
https://lore.kernel.org/lkml/20210504191227.793712-1-irogers@google.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210923154254.737657-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Colin Ian King
774f2c0890 perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
There is a spelling mistake in the description text, fix it.

Signed-off-by: Colin King <colin.king@canonical.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210916081314.41751-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
James Clark
0f892fd1bd perf tests: Fix flaky test 'Object code reading'
This test occasionally fails on aarch64 when a sample is taken in
free@plt and it fails with "Bytes read differ from those read by
objdump".

This is because that symbol is near a section boundary in the elf file.
Despite the -z option to always output zeros, objdump uses
bfd_map_over_sections() to iterate through the elf file so it doesn't
see outside of the sections where these zeros are and can't print them.

For example this boundary proceeds free@plt in libc with a gap of 48
bytes between .plt and .text:

  objdump -d -z --start-address=0x23cc8 --stop-address=0x23d08 libc-2.30.so

  libc-2.30.so:     file format elf64-littleaarch64

  Disassembly of section .plt:

  0000000000023cc8 <*ABS*+0x7fd00@plt+0x8>:
     23cc8:	91018210 	add	x16, x16, #0x60
     23ccc:	d61f0220 	br	x17

  Disassembly of section .text:

  0000000000023d00 <abort@@GLIBC_2.17-0x98>:
     23d00:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
     23d04:	910003fd 	mov	x29, sp

Taking a sample in free@plt is very rare because it is so small, but the
test can be forced to fail almost every time on any platform by linking
the test with a shared library that has a single empty function and
calling it in a loop.

The fix is to zero the buffers so that when there is a jump in the
addresses output by objdump, zeros are already filled in between.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210906152238.3415467-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers
5c34aea341 perf test: Fix DWARF unwind for optimized builds.
To ensure the stack frames are on the stack tail calls optimizations
need to be inhibited. If your compiler supports an attribute use it,
otherwise use an asm volatile barrier.

The barrier fix was suggested here:
https://lore.kernel.org/lkml/20201028081123.GT2628@hirez.programming.kicks-ass.net/
Tested with an optimized clang build and by forcing the asm barrier
route with an optimized clang build.

A GCC bug tracking a proper disable_tail_calls is:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97831

Fixes: 9ae1e990f1 ("perf tools: Remove broken __no_tail_call
       attribute")

v2. is a rebase. The original fix patch generated quite a lot of
    discussion over the right place for the fix:
    https://lore.kernel.org/lkml/20201114000803.909530-1-irogers@google.com/
    The patch reflects my preference of it being near the use, so that
    future code cleanups don't break this somewhat special usage.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210922173812.456348-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Petr Machata
b69c99463d selftests: net: fib_nexthops: Wait before checking reported idle time
The purpose of this test is to verify that after a short activity passes,
the reported time is reasonable: not zero (which could be reported by
mistake), and not something outrageous (which would be indicative of an
issue in used units).

However, the idle time is reported in units of clock_t, or hundredths of
second. If the initial sequence of commands is very quick, it is possible
that the idle time is reported as just flat-out zero. When this test was
recently enabled in our nightly regression, we started seeing spurious
failures for exactly this reason.

Therefore buffer the delay leading up to the test with a sleep, to make
sure there is no legitimate way of reporting 0.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:15:36 +01:00
Martin KaFai Lau
ef979017b8 bpf: selftest: Add verifier tests for <8-byte scalar spill and refill
This patch adds a few verifier tests for <8-byte spill and refill.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210922004953.627183-1-kafai@fb.com
2021-09-26 13:07:28 -07:00
Martin KaFai Lau
54ea6079b7 bpf: selftest: A bpf prog that has a 32bit scalar spill
It is a simplified example that can trigger a 32bit scalar spill.
The const scalar is refilled and added to a skb->data later.
Since the reg state of the 32bit scalar spill is not saved now,
adding the refilled reg to skb->data and then comparing it with
skb->data_end cannot verify the skb->data access.

With the earlier verifier patch and the llvm patch [1].  The verifier
can correctly verify the bpf prog.

Here is the snippet of the verifier log that leads to verifier conclusion
that the packet data is unsafe to read.  The log is from the kerne
without the previous verifier patch to save the <8-byte scalar spill.
67: R0=inv1 R1=inv17 R2=invP2 R3=inv1 R4=pkt(id=0,off=68,r=102,imm=0) R5=inv102 R6=pkt(id=0,off=62,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0
67: (63) *(u32 *)(r10 -12) = r5
68: R0=inv1 R1=inv17 R2=invP2 R3=inv1 R4=pkt(id=0,off=68,r=102,imm=0) R5=inv102 R6=pkt(id=0,off=62,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmm????
...
101: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
101: (61) r1 = *(u32 *)(r10 -12)
102: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
102: (bc) w1 = w1
103: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
103: (0f) r7 += r1
last_idx 103 first_idx 67
regs=2 stack=0 before 102: (bc) w1 = w1
regs=2 stack=0 before 101: (61) r1 = *(u32 *)(r10 -12)
104: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7_w=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
...
127: R0_w=inv1 R1=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
127: (bf) r1 = r7
128: R0_w=inv1 R1_w=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
128: (07) r1 += 8
129: R0_w=inv1 R1_w=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
129: (b4) w0 = 1
130: R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
130: (2d) if r1 > r8 goto pc-66
 R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
131: R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
131: (69) r6 = *(u16 *)(r7 +0)
invalid access to packet, off=0 size=2, R7(id=3,off=0,r=0)
R7 offset is outside of the packet

[1]: https://reviews.llvm.org/D109073

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210922004947.626286-1-kafai@fb.com
2021-09-26 13:07:28 -07:00
Linus Torvalds
5bb7b2107f A set of fixes for X86:
- Prevent sending the wrong signal when protection keys are enabled and
    the kernel handles a fault in the vsyscall emulation.
 
  - Invoke early_reserve_memory() before invoking e820_memory_setup() which
    is required to make the Xen dom0 e820 hooks work correctly.
 
  - Use the correct data type for the SETZ operand in the EMQCMDS
    instruction wrapper.
 
  - Prevent undefined behaviour to the potential unaligned accesss in the
    instroction decoder library.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmFQQaITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaZjD/0TF0mE8QUhI4tyGELdNgwvje5iZ9vg
 Nd9KJpR4hUALHgfUD44NVl9JWawFY2d8FXyIPoAFEcvmy6o4f1w0ia8US3hQWA0Y
 EdLSigXi/eYSstkONaJUEBCxlLbwy7JDzaazA9DeKOEuRc7NWSyZURYvzTAkPK1Y
 mbE9kjKhjFa5NGnSB8HbSF2yEzFsKaTo4nreWP/OkzDjnEMshLR1/FUOUvZmlsgA
 CWjMxAVYFqeJN3QhDgR/vRKPoz1sOjDL1s4AsU+xdy63WyFJZ7Z1b8t6bOBoYh6w
 UztkuOkzZ6pIdzz4O1WGoFx4/FJ74qNx0vO/hOB+cKH6rgJs6AkHAvwlnjI/fE2C
 Y+IsuE4PBXMRpkaayTCsAq/enabwgKsmLSUu916APrhVvuUtb3GJgyhedLE3mEBw
 yZXezzRDhNpYop2yQSRXDeKebpoQgl+zqEP5g1O8pAFnud8FGHnz64eJV7Su7Y7C
 BCac0hmv+drlqb/jOSYqjsfo6QfhvR60WwDIgTplOMMLa3plEJFx/rIuU2xVg5g9
 w0m2QUsZboyT2yBnl8gRrqrcQmv2t4iX6TAj9Wm23Lx41h94JQMRtZyJT9bcNqY9
 jMJu27BcNSveciZA7W2DVUlFf/gTF3bwpF7ZDWRt/VSrHPtkI9WKlERhQaywo1L0
 rF8SGCEuNU2ktw==
 =h7v1
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Prevent sending the wrong signal when protection keys are enabled
     and the kernel handles a fault in the vsyscall emulation.

   - Invoke early_reserve_memory() before invoking e820_memory_setup()
     which is required to make the Xen dom0 e820 hooks work correctly.

   - Use the correct data type for the SETZ operand in the EMQCMDS
     instruction wrapper.

   - Prevent undefined behaviour to the potential unaligned accesss in
     the instruction decoder library"

* tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
  x86/asm: Fix SETZ size enqcmds() build failure
  x86/setup: Call early_reserve_memory() earlier
  x86/fault: Fix wrong signal when vsyscall fails with pkey
2021-09-26 10:09:20 -07:00
Linus Torvalds
a3b397b4ff Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 patches.

  Subsystems affected by this patch series: xtensa, sh, ocfs2, scripts,
  lib, and mm (memory-failure, kasan, damon, shmem, tools, pagecache,
  debug, and pagemap)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix uninitialized use in overcommit_policy_handler
  mm/memory_failure: fix the missing pte_unmap() call
  kasan: always respect CONFIG_KASAN_STACK
  sh: pgtable-3level: fix cast to pointer from integer of different size
  mm/debug: sync up latest migrate_reason to migrate_reason_names
  mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN
  mm: fs: invalidate bh_lrus for only cold path
  lib/zlib_inflate/inffast: check config in C to avoid unused function warning
  tools/vm/page-types: remove dependency on opt_file for idle page tracking
  scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error
  ocfs2: drop acl cache for directories too
  mm/shmem.c: fix judgment error in shmem_is_huge()
  xtensa: increase size of gcc stack frame check
  mm/damon: don't use strnlen() with known-bogus source length
  kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
  mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-25 16:20:34 -07:00
Linus Torvalds
90316e6ea0 linux-kselftest-fixes-5.15-rc3
This Kselftest fixes update for Linux 5.15-rc3 consists of:
 
 - fix to Kselftest common framework header install to run before
   other targets for it work correctly in parallel build case.
 - fixes to kvm test to not ignore fscanf() returns which could
   result in inconsistent test behavior and failures.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFOL14ACgkQCwJExA0N
 Qxyz9hAA6LKcNpv7BGXWpTNB2z/p9KcbSWdA3eejvb+3kiovd2AyFPvf1U1fT5kA
 VlJWk4d3cbhPkCrXmkpsI2vBuSb+u5HcwoEy0WH4T4q2aDjoEOnyn3xr2ktnBdXl
 oeWqPusSRKTJn7gCCZN+H6s5JaUbGZxTWsL2n0v5Pzb6sW4hikc8nXpd+1IeoTKq
 0xg7FM6NcaWRzKb6D97wFpQEo6mnC9Zifv6TalxQn71d//n9MXGW2600ZSDDfBRG
 XfolWqVUHGI2Lyy0mIT788fmntA7xOka3Tajzk4WrfmcgczABJFRJr8ZG6iZ+J0j
 dT+KaTqEZHcL1L9Pusf2VehJZTUwKMenc2NmOZ+n6pK+PUL/KZNobcXoFW01I9jI
 z4UYeH9DUnTiaTP8b7OJ1RB7H5XU3CncuO2gELkera212XckdmiTldpox0ywwyh9
 x+X6mk4lzbDSFMPrSPJrVTasvLMEZv6A05I8Td8Gw7mgeYIxupyk0/k5jgjMlZGN
 6CHyQu4iGre3oM72snuthomtzqwArF3bhWAm7ooZYG4qLQ6z6naDdvcpWYmszIV0
 RuC74FZCHZWGXXojVJWPquf47C79QPFQyQcwXxfAaSMY3XezVsI7c/QaXLoT1VeA
 BZbsUoAPvLa8mLRiFU2jZmLbx57rxej5zM+UvKJjI1JdCL9+cUA=
 =Q9zx
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - fix to Kselftest common framework header install to run before other
   targets for it work correctly in parallel build case.

 - fixes to kvm test to not ignore fscanf() returns which could result
   in inconsistent test behavior and failures.

* tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: kvm: fix get_run_delay() ignoring fscanf() return warn
  selftests: kvm: move get_run_delay() into lib/test_util
  selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn
  selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn
  selftests: be sure to make khdr before other targets
2021-09-25 15:30:29 -07:00
Linus Torvalds
2c4e969c38 USB driver fixes for 5.15-rc3
Here are some USB driver fixes and new device ids for 5.15-rc3.
 
 They include:
 	- usb-storage quirk additions
 	- usb-serial new device ids
 	- usb-serial driver fixes
 	- USB roothub registration bugfix to resolve a long-reported
 	  issue
 	- usb gadget driver fixes for a large number of small things
 	- dwc2 driver fixes
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYU8uKQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynjlgCeIE84XRGpjE/jCK/63Sjve9zyJjoAn2ZUFwLN
 lcLJxlHV3XHK8coC5/YZ
 =hNgV
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are some USB driver fixes and new device ids for 5.15-rc3.

  They include:

   - usb-storage quirk additions

   - usb-serial new device ids

   - usb-serial driver fixes

   - USB roothub registration bugfix to resolve a long-reported issue

   - usb gadget driver fixes for a large number of small things

   - dwc2 driver fixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
  USB: serial: option: add device id for Foxconn T99W265
  USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
  USB: serial: cp210x: add part-number debug printk
  USB: serial: cp210x: fix dropped characters with CP2102
  MAINTAINERS: usb, update Peter Korsgaard's entries
  usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
  usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
  Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
  USB: serial: option: remove duplicate USB device ID
  USB: serial: mos7840: remove duplicated 0xac24 device ID
  arm64: dts: qcom: ipq8074: remove USB tx-fifo-resize property
  usb: gadget: f_uac2: Populate SS descriptors' wBytesPerInterval
  usb: gadget: f_uac2: Add missing companion descriptor for feedback EP
  usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA
  usb: core: hcd: Modularize HCD stop configuration in usb_stop_hcd()
  xhci: Set HCD flag to defer primary roothub registration
  usb: core: hcd: Add support for deferring roothub registration
  usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave
  usb: dwc3: core: balance phy init and exit
  Revert "USB: bcma: Add a check for devm_gpiod_get"
  ...
2021-09-25 10:10:38 -07:00
Jakub Kicinski
7fe7f3182a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

1) ipset limits the max allocatable memory via kvmalloc() to MAX_INT,
   from Jozsef Kadlecsik.

2) Check ip_vs_conn_tab_bits value to be in the range specified
   in Kconfig, from Andrea Claudi.

3) Initialize fragment offset in ip6tables, from Jeremy Sowden.

4) Make conntrack hash chain length random, from Florian Westphal.

5) Add zone ID to conntrack and NAT hashtuple again, also from Florian.

6) Add selftests for bidirectional zone support and colliding tuples,
   from Florian Westphal.

7) Unlink table before synchronize_rcu when cleaning tables with
   owner, from Florian.

8) ipset limits the max allocatable memory via kvmalloc() to MAX_INT.

9) Release conntrack entries via workqueue in masquerade, from Florian.

10) Fix bogus net_init in iptables raw table definition, also from Florian.

11) Work around missing softdep in log extensions, from Florian Westphal.

12) Serialize hash resizes and cleanups with mutex, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: conntrack: serialize hash resizes and cleanups
  netfilter: log: work around missing softdep backend module
  netfilter: iptable_raw: drop bogus net_init annotation
  netfilter: nf_nat_masquerade: defer conntrack walk to work queue
  netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic
  netfilter: nf_tables: Fix oversized kvmalloc() calls
  netfilter: nf_tables: unlink table before deleting it
  selftests: netfilter: add zone stress test with colliding tuples
  selftests: netfilter: add selftest for directional zone support
  netfilter: nat: include zone id in nat table hash again
  netfilter: conntrack: include zone id in tuple hash again
  netfilter: conntrack: make max chain length random
  netfilter: ip6_tables: zero-initialize fragment offset
  ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
  netfilter: ipset: Fix oversized kvmalloc() calls
====================

Link: https://lore.kernel.org/r/20210924221113.348767-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-24 17:27:20 -07:00
Changbin Du
ebaeab2fe8 tools/vm/page-types: remove dependency on opt_file for idle page tracking
Idle page tracking can also be used for process address space, not only
file mappings.

Without this change, using with '-i' option for process address space
encounters below errors reported.

  $ sudo ./page-types -p $(pidof bash) -i
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  ...

Link: https://lkml.kernel.org/r/20210917032826.10669-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24 16:13:35 -07:00
Yonghong Song
091037fb77 selftests/bpf: Fix btf_dump __int128 test failure with clang build kernel
With clang build kernel (adding LLVM=1 to kernel and selftests/bpf build
command line), I hit the following test failure:

  $ ./test_progs -t btf_dump
  ...
  btf_dump_data:PASS:ensure expected/actual match 0 nsec
  btf_dump_data:FAIL:find type id unexpected find type id: actual -2 < expected 0
  btf_dump_data:FAIL:find type id unexpected find type id: actual -2 < expected 0
  test_btf_dump_int_data:FAIL:dump __int128 unexpected error: -2 (errno 2)
  #15/9 btf_dump/btf_dump: int_data:FAIL

Further analysis showed gcc build kernel has type "__int128" in dwarf/BTF
and it doesn't exist in clang build kernel. Code searching for kernel code
found the following:
  arch/s390/include/asm/types.h:  unsigned __int128 pair;
  crypto/ecc.c:   unsigned __int128 m = (unsigned __int128)left * right;
  include/linux/math64.h: return (u64)(((unsigned __int128)a * mul) >> shift);
  include/linux/math64.h: return (u64)(((unsigned __int128)a * mul) >> shift);
  lib/ubsan.h:typedef __int128 s_max;
  lib/ubsan.h:typedef unsigned __int128 u_max;

In my case, CONFIG_UBSAN is not enabled. Even if we only have "unsigned __int128"
in the code, somehow gcc still put "__int128" in dwarf while clang didn't.
Hence current test works fine for gcc but not for clang.

Enabling CONFIG_UBSAN is an option to provide __int128 type into dwarf
reliably for both gcc and clang, but not everybody enables CONFIG_UBSAN
in their kernel build. So the best choice is to use "unsigned __int128" type
which is available in both clang and gcc build kernels. But clang and gcc
dwarf encoded names for "unsigned __int128" are different:

  [$ ~] cat t.c
  unsigned __int128 a;
  [$ ~] gcc -g -c t.c && llvm-dwarfdump t.o | grep __int128
                  DW_AT_type      (0x00000031 "__int128 unsigned")
                  DW_AT_name      ("__int128 unsigned")
  [$ ~] clang -g -c t.c && llvm-dwarfdump t.o | grep __int128
                  DW_AT_type      (0x00000033 "unsigned __int128")
                  DW_AT_name      ("unsigned __int128")

The test change in this patch tries to test type name before
doing actual test.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210924025856.2192476-1-yhs@fb.com
2021-09-24 15:35:03 -07:00
Jin Yao
6c93f39f2f perf list: Display pmu prefix for partially supported hybrid cache events
Part of hardware cache events are only available on one CPU PMU.
For example, 'L1-dcache-load-misses' is only available on cpu_core.
perf list should clearly report this info.

root@otcpl-adl-s-2:~# ./perf list

Before:
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-loads                                         [Hardware cache event]
  node-store-misses                                  [Hardware cache event]
  node-stores                                        [Hardware cache event]

After:
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  cpu_atom/L1-icache-loads/                          [Hardware cache event]
  cpu_core/L1-dcache-load-misses/                    [Hardware cache event]
  cpu_core/node-load-misses/                         [Hardware cache event]
  cpu_core/node-loads/                               [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]

Now we can clearly see 'L1-dcache-load-misses' is only available
on cpu_core.

If without pmu prefix, it indicates the event is available on both
cpu_core and cpu_atom.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210909061844.10221-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-24 15:54:08 -03:00
Linus Torvalds
1b7eaf5701 arm64 fixes:
- It turns out that the optimised string routines merged in 5.14 are not
   safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond the
   end of a string (strcmp, strncmp). Such reading may go across a 16
   byte tag granule and cause a tag check fault. When KASAN_HW_TAGS is
   enabled, use the generic strcmp/strncmp C implementation.
 
 - An errata workaround for ThunderX relied on the CPU capabilities being
   enabled in a specific order. This disappeared with the automatic
   generation of the cpucaps.h file (sorted alphabetically). Fix it by
   checking the current CPU only rather than the system-wide capability.
 
 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.
 
 - kselftests: skip arm64 tests if the required features are missing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmFOC7oACgkQa9axLQDI
 XvHGhBAAkTzrLtyveymoeD5bk0Lh4Co2i0PKqoOl82ltvagjcvUY5lDLfgKV8ac2
 0fZ6E7Jjt6Khq0VAWryRf/RkjxTXmuqDmyb7hZ/t0TNciRXKMfQyE8o8KhsWOvRF
 hD0COUOnJ4CrgQuAJwsW9uagcrix9NucK9WziHB65RACxnD+k7foffCNv1b9Tw9s
 oqAfZ144B+L1Z0WFoqG4uGkngIyG/lcSPJMhDRo45iRWtNUnlR/m/ApWKb+r+X1X
 /Aq2dWklUnrGBB1MdO3WAaRH/lvulH6t4KVhTJw2Gkq5ogy+mYLO15wGYmM+FAWT
 ExS8fy4/G2vX4RAxYRBZF7Xl3zN7F9n3RYKyYfxgGwSy30TnOLYUMoemYvA9Eh27
 O6cjaZacyPVfaB5LUwj6H84NbDbcJKCO775sLuIW8IEMklyZhorS8HG7nYnL6rBk
 jXM5AVPC2j8kXCgawW9u+s/pEQAvLOvM9y7Q0pjj3n2WNeV09bxzkgX5wStSaMYJ
 Ok95zCK1SbKBYdA7yQnFcQkM52XkqpML8vccBHUdxVbtAetuYxoyGKKs/9G6mPEH
 vWSxP4ymvRza62EofmUJzNIEMWwwNUgyNeY/bVJXYf1oc324Wx4T1ewQagrGVQve
 miEgOVEpn+ygHFkEO3mc0wnFfQ3m5CSBvOjeWlTbv7lgekRDyx4=
 =JIq+
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - It turns out that the optimised string routines merged in 5.14 are
   not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
   the end of a string (strcmp, strncmp). Such reading may go across a
   16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
   is enabled, use the generic strcmp/strncmp C implementation.

 - An errata workaround for ThunderX relied on the CPU capabilities
   being enabled in a specific order. This disappeared with the
   automatic generation of the cpucaps.h file (sorted alphabetically).
   Fix it by checking the current CPU only rather than the system-wide
   capability.

 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.

 - kselftests: skip arm64 tests if the required features are missing.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Restore forced disabling of KPTI on ThunderX
  kselftest/arm64: signal: Skip tests if required features are missing
  arm64: Mitigate MTE issues with str{n}cmp()
  arm64: add MTE supported check to thread switching and syscall entry/exit
2021-09-24 11:12:17 -07:00
Numfor Mbiziwo-Tiapo
5ba1071f75 x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:

"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."

(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)

These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.

 [ bp: Massage commit message. ]

Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
2021-09-24 12:37:38 +02:00
Oliver Upton
386ca9d7fd selftests: KVM: Explicitly use movq to read xmm registers
Compiling the KVM selftests with clang emits the following warning:

>> include/x86_64/processor.h:297:25: error: variable 'xmm0' is uninitialized when used here [-Werror,-Wuninitialized]
>>                return (unsigned long)xmm0;

where xmm0 is accessed via an uninitialized register variable.

Indeed, this is a misuse of register variables, which really should only
be used for specifying register constraints on variables passed to
inline assembly. Rather than attempting to read xmm registers via
register variables, just explicitly perform the movq from the desired
xmm register.

Fixes: 783e9e5126 ("kvm: selftests: add API testing infrastructure")
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210924005147.1122357-1-oupton@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24 02:32:58 -04:00
Oliver Upton
fbf094ce52 selftests: KVM: Call ucall_init when setting up in rseq_test
While x86 does not require any additional setup to use the ucall
infrastructure, arm64 needs to set up the MMIO address used to signal a
ucall to userspace. rseq_test does not initialize the MMIO address,
resulting in the test spinning indefinitely.

Fix the issue by calling ucall_init() during setup.

Fixes: 61e52f1630 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210923220033.4172362-1-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24 02:32:12 -04:00
Linus Torvalds
f10f0481a5 A fix for a bug with restartable sequences and KVM. KVM's handling
of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without
 informing rseq and leads to stale data in userspace's rseq struct.
 
 I'm sending this as a separate pull request since it's not code
 that I usually touch.  In particular, patch 2 ("entry: rseq: Call
 rseq_handle_notify_resume() in tracehook_notify_resume()") is just a
 cleanup to try and make future bugs less likely.  If you prefer this to
 be sent via Thomas and only in 5.16, please speak up.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFLPYgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNCowf6A9pTuDspCC2IRICfQnHj/Q/Xc7HH
 UmwYo26GctfYaq9AyIAGzcRx3iOEGjb9fZVJ6mzPUGIygio9ZyuiPHogf7lMAb+x
 39ts5uSOp+N+8e0fvX578WFfmG5hQa4Tp9W3T2Y5KsVgK2Nf8F08DckzIgD8cbkN
 NQKTRIi8AYgb20y3NFZjzsPRxF8850QK7xVCI+LBjryyWpEGT5ZsthrYUeexiJPz
 XN+VOYJen5GXVBCar2JbA7EVSrMZbKSy+M3fJ1vuW5dZHySaiu69JXJHop71jTnJ
 5BGue917MfH6RTDzIFFUcg7NmwcuXHpw4dsFeiyExYFNw1uWWQpk0efC1g==
 =/xlE
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull rseq fixes from Paolo Bonzini:
 "A fix for a bug with restartable sequences and KVM.

  KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears
  the flag without informing rseq and leads to stale data in userspace's
  rseq struct"

* tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Remove __NR_userfaultfd syscall fallback
  KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
  tools: Move x86 syscall number fallbacks to .../uapi/
  entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
  KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
2021-09-23 11:24:12 -07:00
Jakub Kicinski
2fcd14d0f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/mptcp/protocol.c
  977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
  efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")

same patch merged in both trees, keep net-next.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-23 11:19:49 -07:00
Linus Torvalds
9bc62afe03 Networking fixes for 5.15-rc3.
Current release - regressions:
 
  - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()
 
 Previous releases - regressions:
 
  - introduce a shutdown method to mdio device drivers, and make DSA
    switch drivers compatible with masters disappearing on shutdown;
    preventing infinite reference wait
 
  - fix issues in mdiobus users related to ->shutdown vs ->remove
 
  - virtio-net: fix pages leaking when building skb in big mode
 
  - xen-netback: correct success/error reporting for the SKB-with-fraglist
 
  - dsa: tear down devlink port regions when tearing down the devlink
         port on error
 
  - nexthop: fix division by zero while replacing a resilient group
 
  - hns3: check queue, vf, vlan ids range before using
 
 Previous releases - always broken:
 
  - napi: fix race against netpoll causing NAPI getting stuck
 
  - mlx4_en: ensure link operstate is updated even if link comes up
             before netdev registration
 
  - bnxt_en: fix TX timeout when TX ring size is set to the smallest
 
  - enetc: fix illegal access when reading affinity_hint;
           prevent oops on sysfs access
 
  - mtk_eth_soc: avoid creating duplicate offload entries
 
 Misc:
 
  - core: correct the sock::sk_lock.owned lockdep annotations
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFMr4MACgkQMUZtbf5S
 Irv6Ag/+Ml4q6/IVO0jBppztZO1RrSalb3YE9JjPQyMchauVcdcADNpYF+Jo/gcH
 4q+/Oikfp6gkQpJTFd0Y9X7UhwA4Jm4wWtEisqc6PJeHOagDZmVUn353WtgpnCNL
 CgYBfa5k3msGudkqgeXIyiP/2sekBevTy+fOptubLZClyBrEwNUUUZBlpT9aI9Sj
 ru1eMYklfcxP60AQgNhqq6ZwJnRELgN75fSR6ypVCGcRnTK4UGL/b6TvnPYn8uYY
 zeNuMZZzYZK5B73tC6rWpteHWZ7VW3Km0WvIKs+ORM8nYchz/EprKZ0HCLPYrWvf
 ib5Wi7HyL7/n9k9NUTCGrQY3tkOWNzXOepjpiBZPqCG9r2hc3JSR7Q2lFwL+gKv/
 sh2y+T2xfp0WFGmG2XiU2MgnkypMSKah1sC/XRE7YLw02vPAnWQxxl/KVNek4j7M
 CH/Tg9ErVKDRLN7KO/kKl3s8I8N4hdctms/YUt9QD5J9Rw/Jqwr/79bq1uLy6d4o
 //ipmCTHex57Nvy80PtgcuKJhoeqGwR/Av6BvBMRZ1SOYs/C6q45skHTlYyiNY3+
 Dyj9+nfrhsyE835GKPe8lqBFZONBXpXw+EUNXeYRiv0Pcd+JKek07bbajSQVSpd8
 8nqQwylpGII0iPGyOc9wKajzh7O5W2odFIdOwtY/5yVjrcFgBd0=
 =VcL3
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Current release - regressions:

   - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()

  Previous releases - regressions:

   - introduce a shutdown method to mdio device drivers, and make DSA
     switch drivers compatible with masters disappearing on shutdown;
     preventing infinite reference wait

   - fix issues in mdiobus users related to ->shutdown vs ->remove

   - virtio-net: fix pages leaking when building skb in big mode

   - xen-netback: correct success/error reporting for the
     SKB-with-fraglist

   - dsa: tear down devlink port regions when tearing down the devlink
     port on error

   - nexthop: fix division by zero while replacing a resilient group

   - hns3: check queue, vf, vlan ids range before using

  Previous releases - always broken:

   - napi: fix race against netpoll causing NAPI getting stuck

   - mlx4_en: ensure link operstate is updated even if link comes up
     before netdev registration

   - bnxt_en: fix TX timeout when TX ring size is set to the smallest

   - enetc: fix illegal access when reading affinity_hint; prevent oops
     on sysfs access

   - mtk_eth_soc: avoid creating duplicate offload entries

  Misc:

   - core: correct the sock::sk_lock.owned lockdep annotations"

* tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
  atlantic: Fix issue in the pm resume flow.
  net/mlx4_en: Don't allow aRFS for encapsulated packets
  net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
  net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
  nfc: st-nci: Add SPI ID matching DT compatible
  MAINTAINERS: remove Guvenc Gulce as net/smc maintainer
  nexthop: Fix memory leaks in nexthop notification chain listeners
  mptcp: ensure tx skbs always have the MPTCP ext
  qed: rdma - don't wait for resources under hw error recovery flow
  s390/qeth: fix deadlock during failing recovery
  s390/qeth: Fix deadlock in remove_discipline
  s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
  net: dsa: realtek: register the MDIO bus under devres
  net: dsa: don't allocate the slave_mii_bus using devres
  Doc: networking: Fox a typo in ice.rst
  net: dsa: fix dsa_tree_setup error path
  net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
  net/smc: add missing error check in smc_clc_prfx_set()
  net: hns3: fix a return value error in hclge_get_reset_status()
  net: hns3: check vlan id before using it
  ...
2021-09-23 10:30:31 -07:00
Maxim Levitsky
1ad32105d7 KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
Test that if:

* L1 disables virtual interrupt masking, and INTR intercept.

* L1 setups a virtual interrupt to be injected to L2 and enters L2 with
  interrupts disabled, thus the virtual interrupt is pending.

* Now an external interrupt arrives in L1 and since
  L1 doesn't intercept it, it should be delivered to L2 when
  it enables interrupts.

  to do this L0 (abuses) V_IRQ to setup an
  interrupt window, and returns to L2.

* L2 enables interrupts.
  This should trigger the interrupt window,
  injection of the external interrupt and delivery
  of the virtual interrupt that can now be done.

* Test that now L2 gets those interrupts.

This is the test that demonstrates the issue that was
fixed in the previous patch.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:05:07 -04:00
David Matlack
7c236b816e KVM: selftests: Create a separate dirty bitmap per slot
The calculation to get the per-slot dirty bitmap was incorrect leading
to a buffer overrun. Fix it by splitting out the dirty bitmap into a
separate bitmap per slot.

Fixes: 609e6202ea ("KVM: selftests: Support multiple slots in dirty_log_perf_test")
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210917173657.44011-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:14 -04:00
David Matlack
9f2fc5554a KVM: selftests: Refactor help message for -s backing_src
All selftests that support the backing_src option were printing their
own description of the flag and then calling backing_src_help() to dump
the list of available backing sources. Consolidate the flag printing in
backing_src_help() to align indentation, reduce duplicated strings, and
improve consistency across tests.

Note: Passing "-s" to backing_src_help is unnecessary since every test
uses the same flag. However I decided to keep it for code readability
at the call sites.

While here this opportunistically fixes the incorrectly interleaved
printing -x help message and list of backing source types in
dirty_log_perf_test.

Fixes: 609e6202ea ("KVM: selftests: Support multiple slots in dirty_log_perf_test")
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210917173657.44011-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:14 -04:00
David Matlack
a1e638da1b KVM: selftests: Change backing_src flag to -s in demand_paging_test
Every other KVM selftest uses -s for the backing_src, so switch
demand_paging_test to match.

Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210917173657.44011-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:13 -04:00
Oliver Upton
01f91acb55 selftests: KVM: Align SMCCC call with the spec in steal_time
The SMC64 calling convention passes a function identifier in w0 and its
parameters in x1-x17. Given this, there are two deviations in the
SMC64 call performed by the steal_time test: the function identifier is
assigned to a 64 bit register and the parameter is only 32 bits wide.

Align the call with the SMCCC by using a 32 bit register to handle the
function identifier and increasing the parameter width to 64 bits.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210921171121.2148982-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:08 -04:00
Oliver Upton
90b54129e8 selftests: KVM: Fix check for !POLLIN in demand_paging_test
The logical not operator applies only to the left hand side of a bitwise
operator. As such, the check for POLLIN not being set in revents wrong.
Fix it by adding parentheses around the bitwise expression.

Fixes: 4f72180eb4 ("KVM: selftests: Add demand paging content to the demand paging test")
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210921171121.2148982-2-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:08 -04:00
Sean Christopherson
2da4a23599 KVM: selftests: Remove __NR_userfaultfd syscall fallback
Revert the __NR_userfaultfd syscall fallback added for KVM selftests now
that x86's unistd_{32,63}.h overrides are under uapi/ and thus not in
KVM selftests' search path, i.e. now that KVM gets x86 syscall numbers
from the installed kernel headers.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:02 -04:00
Sean Christopherson
61e52f1630 KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
Add a test to verify an rseq's CPU ID is updated correctly if the task is
migrated while the kernel is handling KVM_RUN.  This is a regression test
for a bug introduced by commit 72c3c0fe54 ("x86/kvm: Use generic xfer
to guest work function"), where TIF_NOTIFY_RESUME would be cleared by KVM
without updating rseq, leading to a stale CPU ID and other badness.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Message-Id: <20210901203030.1292304-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:02 -04:00
Sean Christopherson
de5f4213da tools: Move x86 syscall number fallbacks to .../uapi/
Move unistd_{32,64}.h from x86/include/asm to x86/include/uapi/asm so
that tools/selftests that install kernel headers, e.g. KVM selftests, can
include non-uapi tools headers, e.g. to get 'struct list_head', without
effectively overriding the installed non-tool uapi headers.

Swapping KVM's search order, e.g. to search the kernel headers before
tool headers, is not a viable option as doing results in linux/type.h and
other core headers getting pulled from the kernel headers, which do not
have the kernel-internal typedefs that are used through tools, including
many files outside of selftests/kvm's control.

Prior to commit cec07f53c3 ("perf tools: Move syscall number fallbacks
from perf-sys.h to tools/arch/x86/include/asm/"), the handcoded numbers
were actual fallbacks, i.e. overriding unistd_{32,64}.h from the kernel
headers was unintentional.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:01 -04:00
Jiri Benc
17b52c226a seltests: bpf: test_tunnel: Use ip neigh
The 'arp' command is deprecated and is another dependency of the selftest.
Just use 'ip neigh', the test depends on iproute2 already.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/40f24b9d3f0f53b5c44471b452f9a11f4d13b7af.1632236133.git.jbenc@redhat.com
2021-09-21 20:40:16 -07:00
Andrii Nakryiko
cc10623c68 libbpf: Add legacy uprobe attaching support
Similarly to recently added legacy kprobe attach interface support
through tracefs, support attaching uprobes using the legacy interface if
host kernel doesn't support newer FD-based interface.

For uprobes event name consists of "libbpf_" prefix, PID, sanitized
binary path and offset within that binary. Structuraly the code is
aligned with kprobe logic refactoring in previous patch. struct
bpf_link_perf is re-used and all the same legacy_probe_name and
legacy_is_retprobe fields are used to ensure proper cleanup on
bpf_link__destroy().

Users should be aware, though, that on old kernels which don't support
FD-based interface for kprobe/uprobe attachment, if the application
crashes before bpf_link__destroy() is called, uprobe legacy
events will be left in tracefs. This is the same limitation as with
legacy kprobe interfaces.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-5-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko
46ed5fc33d libbpf: Refactor and simplify legacy kprobe code
Refactor legacy kprobe handling code to follow the same logic as uprobe
legacy logic added in the next patchs:
  - add append_to_file() helper that makes it simpler to work with
    tracefs file-based interface for creating and deleting probes;
  - move out probe/event name generation outside of the code that
    adds/removes it, which simplifies bookkeeping significantly;
  - change the probe name format to start with "libbpf_" prefix and
    include offset within kernel function;
  - switch 'unsigned long' to 'size_t' for specifying kprobe offsets,
    which is consistent with how uprobes define that, simplifies
    printf()-ing internally, and also avoids unnecessary complications on
    architectures where sizeof(long) != sizeof(void *).

This patch also implicitly fixes the problem with invalid open() error
handling present in poke_kprobe_events(), which (the function) this
patch removes.

Fixes: ca304b40c2 ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-4-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko
d3b0e3b03c selftests/bpf: Adopt attach_probe selftest to work on old kernels
Make sure to not use ref_ctr_off feature when running on old kernels
that don't support this feature. This allows to test libbpf's legacy
kprobe and uprobe logic on old kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-3-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko
303a257223 libbpf: Fix memory leak in legacy kprobe attach logic
In some error scenarios legacy_probe string won't be free()'d. Fix this.
This was reported by Coverity static analysis.

Fixes: ca304b40c2 ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-2-andrii@kernel.org
2021-09-21 19:40:08 -07:00
Ian Rogers
cb7bfb1da6 perf parse-events: Remove unnecessary #includes
Minor cleanup motivated by trying to separately fuzz test parse-events.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210127184629.516169-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-21 20:55:48 -03:00
Dan Williams
7d3eb23c4c tools/testing/cxl: Introduce a mock memory device + driver
Introduce an emulated device-set plus driver to register CXL memory
devices, 'struct cxl_memdev' instances, in the mock cxl_test topology.
This enables the development of HDM Decoder (Host-managed Device Memory
Decoder) programming flow (region provisioning) in an environment that
can be updated alongside the kernel as it gains more functionality.

Whereas the cxl_pci module looks for CXL memory expanders on the 'pci'
bus, the cxl_mock_mem module attaches to CXL expanders on the platform
bus emitted by cxl_test.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116440099.2460985.10692549614409346604.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams
67dcdd4d3b tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
Create an environment for CXL plumbing unit tests. Especially when it
comes to an algorithm for HDM Decoder (Host-managed Device Memory
Decoder) programming, the availability of an in-kernel-tree emulation
environment for CXL configuration complexity and corner cases speeds
development and deters regressions.

The approach taken mirrors what was done for tools/testing/nvdimm/. I.e.
an external module, cxl_test.ko built out of the tools/testing/cxl/
directory, provides mock implementations of kernel APIs and kernel
objects to simulate a real world device hierarchy.

One feedback for the tools/testing/nvdimm/ proposal was "why not do this
in QEMU?". In fact, the CXL development community has developed a QEMU
model for CXL [1]. However, there are a few blocking issues that keep
QEMU from being a tight fit for topology + provisioning unit tests:

1/ The QEMU community has yet to show interest in merging any of this
   support that has had patches on the list since November 2020. So,
   testing CXL to date involves building custom QEMU with out-of-tree
   patches.

2/ CXL mechanisms like cross-host-bridge interleave do not have a clear
   path to be emulated by QEMU without major infrastructure work. This
   is easier to achieve with the alloc_mock_res() approach taken in this
   patch to shortcut-define emulated system physical address ranges with
   interleave behavior.

The QEMU enabling has been critical to get the driver off the ground,
and may still move forward, but it does not address the ongoing needs of
a regression testing environment and test driven development.

This patch adds an ACPI CXL Platform definition with emulated CXL
multi-ported host-bridges. A follow on patch adds emulated memory
expander devices.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1]
Link: https://lore.kernel.org/r/163164680798.2831381.838684634806668012.stgit@dwillia2-desk3.amr.corp.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:10 -07:00
Ian Rogers
b28e5e4391 perf daemon: Avoid msan warnings on send_cmd
As a full union is always sent, ensure all bytes of the union are
initialized with memset to avoid msan warnings of use of uninitialized
memory.

An example warning from the daemon test:

Uninitialized bytes in __interceptor_write at offset 71 inside [0x7ffd98da6280, 72)
==11602==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x5597edccdbe4 in ion tools/lib/perf/lib.c:18:6
    #1 0x5597edccdbe4 in writen tools/lib/perf/lib.c:47:9
    #2 0x5597ed221d30 in send_cmd tools/perf/builtin-daemon.c:1376:22
    #3 0x5597ed21b48c in cmd_daemon tools/perf/builtin-daemon.c
    #4 0x5597ed1d6b67 in run_builtin tools/perf/perf.c:313:11
    #5 0x5597ed1d6036 in handle_internal_command tools/perf/perf.c:365:8
    #6 0x5597ed1d6036 in run_argv tools/perf/perf.c:409:2
    #7 0x5597ed1d6036 in main tools/perf/perf.c:539:3

SUMMARY: MemorySanitizer: use-of-uninitialized-value tools/lib/perf/lib.c:18:6 in ion
Exiting

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210617055554.1917997-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-21 16:25:41 -03:00
Arnaldo Carvalho de Melo
4122c9c3f0 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes in the last pushed perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-21 15:35:47 -03:00
Cristian Marussi
0e3dbf765f kselftest/arm64: signal: Skip tests if required features are missing
During initialization of a signal testcase, features declared as required
are properly checked against the running system but no action is then taken
to effectively skip such a testcase.

Fix core signals test logic to abort initialization and report such a
testcase as skipped to the KSelfTest framework.

Fixes: f96bf43403 ("kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210920121228.35368-1-cristian.marussi@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-21 18:12:03 +01:00
Florian Westphal
cb89f63ba6 selftests: netfilter: add zone stress test with colliding tuples
Add 20k entries to the connection tracking table, once from the
data plane, once via ctnetlink.

In both cases, each entry lives in a different conntrack zone
and addresses/ports are identical.

Expectation is that insertions work and occurs in constant time:

PASS: added 10000 entries in 1215 ms (now 10000 total, loop 1)
PASS: added 10000 entries in 1214 ms (now 20000 total, loop 2)
PASS: inserted 20000 entries from packet path in 2434 ms total
PASS: added 10000 entries in 57631 ms (now 10000 total)
PASS: added 10000 entries in 58572 ms (now 20000 total)
PASS: inserted 20000 entries via ctnetlink in 116205 ms

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-09-21 03:46:55 +02:00
Florian Westphal
0f1148abb2 selftests: netfilter: add selftest for directional zone support
Add a script to exercise NAT port clash resolution with directional zones.

Add net namespaces that use the same IP address and connect them to a
gateway.

Gateway uses policy routing based on iif/mark and conntrack zones to
isolate the client namespaces.  In server direction, same zone with NAT
to single address is used.

Then, connect to a server from each client netns, using identical
connection id, i.e.  saddr:sport -> daddr:dport.

Expectation is for all connections to succeeed: NAT gatway is
supposed to do port reallocation for each of the (clashing) connections.

This is based on the description/use case provided in the commit message of
deedb59039 ("netfilter: nf_conntrack: add direction support for zones").

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-09-21 03:46:55 +02:00
Grant Seltzer
97c140d94e libbpf: Add doc comments in libbpf.h
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

These doc comments are for:
- bpf_object__find_map_by_name()
- bpf_map__fd()
- bpf_map__is_internal()
- libbpf_get_error()
- libbpf_num_possible_cpus()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210918031457.36204-1-grantseltzer@gmail.com
2021-09-20 17:23:31 -07:00
Linus Torvalds
62453a460a powerpc fixes for 5.15 #2
Fix crashes when scv (System Call Vectored) is used to make a syscall when a transaction
 is active, on Power9 or later.
 
 Fix bad interactions between rfscv (Return-from scv) and Power9 fake-suspend mode.
 
 Fix crashes when handling machine checks in LPARs using the Hash MMU.
 
 Partly revert a recent change to our XICS interrupt controller code, which broke the
 recently added Microwatt support.
 
 Thanks to: Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero, Joel Stanley,
 Nicholas Piggin.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmFHDfsTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgNlAD/oC/14jV2fCv54vH5Al1XAz2WuPhhEu
 UsN+MLlx7JshXfLeGzzEogvvakfLFuVkpIxsv+g8a9H6jejjCj/A4wjUy34Im0hS
 nhsMsa1vZPZY/Vz9v3WzdYmqjHVZDKo6ZdUmn0z4OjPnv7wQaEEbaXmoe0+CaXXQ
 jIY2TjVQxbgXeUI1QcPUWh36hiSaYn4O8fAD4VdjxdStl2z6q8zbo/5nzQ8dkBpP
 lldC0T/HRCKLe1ODUExFfbBFqFuLVdbALq0RIuh0GXd+dAqpK1iOkNsxUWChU6Q7
 gu0PrmhRydX3bmL9jmOsZfUk/hIR0fEgwSGITQAb8v0LCvZI0JRn+yvkl7uVZ9Ce
 OWY7B8HZQldvoQqA4/f7YCpAIS2ag7ugDrcAfuDVkDl2o/qJHoTi58IU24wI6+LL
 lwJZjNJoxZ8n1tl1l+gtCNp4E4YYOfuKZi0zH0oJoLiSWeK6PYOvwEGuEC2iQ9cJ
 BEdyYVXqSBuc2anCQ4k+y8gu/zJwP1MCpDNMPMU4P/JcaG9lnrLUhvctf4GKtOSy
 UiktHSTL52J5ldMxSMicpw9qCVgePsQX78Sk3UTHRilRmix6TtWOWYslm4mL+zeD
 u3wpnK+NKkju72LRS/AB1U6cWvModW+WBANPws9j6caoOorh397xsDGEYiKWvp0Z
 xSG/dzXzi2oscQ==
 =yC9U
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix crashes when scv (System Call Vectored) is used to make a syscall
   when a transaction is active, on Power9 or later.

 - Fix bad interactions between rfscv (Return-from scv) and Power9
   fake-suspend mode.

 - Fix crashes when handling machine checks in LPARs using the Hash MMU.

 - Partly revert a recent change to our XICS interrupt controller code,
   which broke the recently added Microwatt support.

Thanks to Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero,
Joel Stanley, Nicholas Piggin.

* tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/xics: Set the IRQ chip data for the ICS native backend
  powerpc/mce: Fix access error in mce handler
  KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing registers
  powerpc/64s: system call rfscv workaround for TM bugs
  selftests/powerpc: Add scv versions of the basic TM syscall tests
  powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
2021-09-19 13:00:23 -07:00
Shuah Khan
e30cd812df selftests: net: af_unix: Fix makefile to use TEST_GEN_PROGS
Makefile uses TEST_PROGS instead of TEST_GEN_PROGS to define
executables. TEST_PROGS is for shell scripts that need to be
installed and run by the common lib.mk framework. The common
framework doesn't touch TEST_PROGS when it does build and clean.

As a result "make kselftest-clean" and "make clean" fail to remove
executables. Run and install work because the common framework runs
and installs TEST_PROGS. Build works because the Makefile defines
"all" rule which is unnecessary if TEST_GEN_PROGS is used.

Use TEST_GEN_PROGS so the common framework can handle build/run/
install/clean properly.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:22:18 +01:00
Shuah Khan
48514a2233 selftests: net: af_unix: Fix incorrect args in test result msg
Fix the args to fprintf(). Splitting the message ends up passing
incorrect arg for "sigurg %d" and an extra arg overall. The test
result message ends up incorrect.

test_unix_oob.c: In function ‘main’:
test_unix_oob.c:274:43: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘char *’ [-Wformat=]
  274 |   fprintf(stderr, "Test 3 failed, sigurg %d len %d OOB %c ",
      |                                          ~^
      |                                           |
      |                                           int
      |                                          %s
  275 |   "atmark %d\n", signal_recvd, len, oob, atmark);
      |   ~~~~~~~~~~~~~
      |   |
      |   char *
test_unix_oob.c:274:19: warning: too many arguments for format [-Wformat-extra-args]
  274 |   fprintf(stderr, "Test 3 failed, sigurg %d len %d OOB %c ",

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:15:19 +01:00
Andrii Nakryiko
219d720e6d perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id()
Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
a weak function, presumably to dynamically link against old version of
libbpf shared library. Unfortunately this causes compilation warning
when perf is compiled against libbpf v0.6+.

For now, just ignore deprecation warning, but there might be a better
solution, depending on perf's needs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: kernel-team@fb.com
LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:47:02 -03:00
Ian Rogers
aba5daeb64 libperf evsel: Make use of FD robust.
FD uses xyarray__entry that may return NULL if an index is out of
bounds. If NULL is returned then a segv happens as FD unconditionally
dereferences the pointer. This was happening in a case of with perf
iostat as shown below. The fix is to make FD an "int*" rather than an
int and handle the NULL case as either invalid input or a closed fd.

  $ sudo gdb --args perf stat --iostat  list
  ...
  Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
  50      {
  (gdb) bt
   #0  perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
   #1  0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
      threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
   #2  0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
      at util/evsel.c:2045
   #3  0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
      at util/evsel.c:2065
   #4  0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
      config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
   #5  0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:833
   #6  0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:1048
   #7  0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
   #8  0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
      argv=0x7fffffffe4a0) at perf.c:313
   #9  0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
   #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
   #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
  ...
  (gdb) c
  Continuing.
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
  /bin/dmesg | grep -i perf may provide additional information.

  Program received signal SIGSEGV, Segmentation fault.
  0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
  166                     if (FD(evsel, cpu, thread) >= 0)

v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
    backward.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:06 -03:00
Michael Petlan
57f0ff059e perf machine: Initialize srcline string member in add_location struct
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:

  # perf top --sort comm -g --ignore-callees=do_idle

terminates with:

  #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
  #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
  #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
  #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
  #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
      sample_self=sample_self@entry=true) at util/hist.c:657
  #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
      sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
  #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
      at util/hist.c:1056
  #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
  #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
      at util/hist.c:1231
  #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
      at builtin-top.c:842
  #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
  #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
  #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
  #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
  #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
  #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
  #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
  #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
  #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6

If you look at the frame #2, the code is:

488	 if (he->srcline) {
489          he->srcline = strdup(he->srcline);
490          if (he->srcline == NULL)
491              goto err_rawdata;
492	 }

If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.

Also, if you look at the commit 1fb7d06a50, it adds the srcline property
into the struct, but not initializing it everywhere needed.

Committer notes:

Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():

2181         if (al.sym != NULL) {
2182                 if (perf_hpp_list.parent && !*parent &&
2183                     symbol__match_regex(al.sym, &parent_regex))
2184                         *parent = al.sym;
2185                 else if (have_ignore_callees && root_al &&
2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187                         /* Treat this symbol as the root,
2188                            forgetting its callees. */
2189                         *root_al = al;
2190                         callchain_cursor_reset(cursor);
2191                 }
2192         }

And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:

1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212                          int max_stack_depth, void *arg)
1213 {
1214         int err, err2;
1215         struct map *alm = NULL;
1216
1217         if (al)
1218                 alm = map__get(al->map);
1219
1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221                                         iter->evsel, al, max_stack_depth);
1222         if (err) {
1223                 map__put(alm);
1224                 return err;
1225         }
1226
1227         err = iter->ops->prepare_entry(iter, al);
1228         if (err)
1229                 goto out;
1230
1231         err = iter->ops->add_single_entry(iter, al);
1232         if (err)
1233                 goto out;
1234

That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:

        iter->ops->add_single_entry(iter, al);

will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06a50 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Adrian Hunter
ff6f41fbce perf script: Fix ip display when type != attr->type
set_print_ip_opts() was not being called when type != attr->type
because there is not a one-to-one relationship between output types
and attr->type. That resulted in ip not printing.

The attr_type() function is removed, and the match of attr->type to
output type is corrected.

Example on ADL using taskset to select an atom cpu:

 # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]

 Before:

  # perf script | head
         taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
           uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:

 After:

  # perf script | head
         taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
           uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Ravi Bangoria
7efbcc8c07 perf annotate: Fix fused instr logic for assembly functions
Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
with branch instructions, and thus perf annotate highlight such valid
pairs as fused.

When annotated with source, perf uses struct disasm_line to contain
either source or instruction line from objdump output. Usually, a C
statement generates multiple instructions which include such
cmp/test/ALU + branch instruction pairs. But in case of assembly
function, each individual assembly source line generate one
instruction.

The 'perf annotate' instruction fusion logic assumes the previous
disasm_line as the previous instruction line, which is wrong because,
for assembly function, previous disasm_line contains source line.  And
thus perf fails to highlight valid fused instruction pairs for assembly
functions.

Fix it by searching backward until we find an instruction line and
consider that disasm_line as fused with current branch instruction.

Before:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │      cmp    %rcx,0x88(%rsp)
         │    je      .Lerror_bad_iret      <--- Source line
    0.14 │   ┌──je     b4                   <--- Instruction line
         │   │movl    %ecx, %eax

After:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │   ┌──cmp    %rcx,0x88(%rsp)
         │   │je      .Lerror_bad_iret
    0.14 │   ├──je     b4
         │   │movl    %ecx, %eax

Reviewed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https //lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Florian Westphal
ce9979129a selftests: mptcp: add mptcp getsockopt test cases
Add a test program that retrieves the three info types:
1. mptcp meta information
2. tcp info for subflow
3. subflow endpoint addresses

For all three rudimentary checks are added.

1. Meta information checks that the logical mptcp
   sequence numbers advance as expected, based on the bytes read
   (init seq + bytes_received/sent) and the connection state
   (after close, we should exect 1 extra byte due to FIN).

2. TCP info checks the number of bytes sent/received vs.
   sums of read/write syscall return values.

3. Subflow endpoint addresses are checked vs. getsockname/getpeername
   result.

Tests for forward compatibility (0-initialisation of output-only
fields in mptcp_subflow_data structure) are added as well.

Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 14:20:01 +01:00
Dave Marchevsky
a42effb0b2 bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf comments
Since the data_len in these two functions is a byte len of the preceding
u64 *data array, it must always be a multiple of 8. If this isn't the
case both helpers error out, so let's make the requirement explicit so
users don't need to infer it.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-10-davemarchevsky@fb.com
2021-09-17 14:02:06 -07:00
Dave Marchevsky
7606729fe2 selftests/bpf: Add trace_vprintk test prog
This commit adds a test prog for vprintk which confirms that:
  * bpf_trace_vprintk is writing to /sys/kernel/debug/tracing/trace_pipe
  * __bpf_vprintk macro works as expected
  * >3 args are printed
  * bpf_printk w/ 0 format args compiles
  * bpf_trace_vprintk call w/ a fmt specifier but NULL fmt data fails

Approach and code are borrowed from trace_printk test.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-9-davemarchevsky@fb.com
2021-09-17 14:02:06 -07:00
Dave Marchevsky
d313d45a22 selftests/bpf: Migrate prog_tests/trace_printk CHECKs to ASSERTs
Guidance for new tests is to use ASSERT macros instead of CHECK. Since
trace_vprintk test will borrow heavily from trace_printk's, migrate its
CHECKs so it remains obvious that the two are closely related.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-8-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky
4190c299a4 bpftool: Only probe trace_vprintk feature in 'full' mode
Since commit 368cb0e7cd ("bpftool: Make probes which emit dmesg
warnings optional"), some helpers aren't probed by bpftool unless
`full` arg is added to `bpftool feature probe`.

bpf_trace_vprintk can emit dmesg warnings when probed, so include it.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-7-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky
6c66b0e7c9 libbpf: Use static const fmt string in __bpf_printk
The __bpf_printk convenience macro was using a 'char' fmt string holder
as it predates support for globals in libbpf. Move to more efficient
'static const char', but provide a fallback to the old way via
BPF_NO_GLOBAL_DATA so users on old kernels can still use the macro.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-6-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky
c2758baa97 libbpf: Modify bpf_printk to choose helper based on arg count
Instead of being a thin wrapper which calls into bpf_trace_printk,
libbpf's bpf_printk convenience macro now chooses between
bpf_trace_printk and bpf_trace_vprintk. If the arg count (excluding
format string) is >3, use bpf_trace_vprintk, otherwise use the older
helper.

The motivation behind this added complexity - instead of migrating
entirely to bpf_trace_vprintk - is to maintain good developer experience
for users compiling against new libbpf but running on older kernels.
Users who are passing <=3 args to bpf_printk will see no change in their
bytecode.

__bpf_vprintk functions similarly to BPF_SEQ_PRINTF and BPF_SNPRINTF
macros elsewhere in the file - it allows use of bpf_trace_vprintk
without manual conversion of varargs to u64 array. Previous
implementation of bpf_printk macro is moved to __bpf_printk for use by
the new implementation.

This does change behavior of bpf_printk calls with >3 args in the "new
libbpf, old kernels" scenario. Before this patch, attempting to use 4
args to bpf_printk results in a compile-time error. After this patch,
using bpf_printk with 4 args results in a trace_vprintk helper call
being emitted and a load-time failure on older kernels.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-5-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky
10aceb629e bpf: Add bpf_trace_vprintk helper
This helper is meant to be "bpf_trace_printk, but with proper vararg
support". Follow bpf_snprintf's example and take a u64 pseudo-vararg
array. Write to /sys/kernel/debug/tracing/trace_pipe using the same
mechanism as bpf_trace_printk. The functionality of this helper was
requested in the libbpf issue tracker [0].

[0] Closes: https://github.com/libbpf/libbpf/issues/315

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-4-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky
84b4c52960 selftests/bpf: Stop using bpf_program__load
bpf_program__load is not supposed to be used directly. Replace it with
bpf_object__ APIs for the reference_tracking prog_test, which is the
last offender in bpf selftests.

Some additional complexity is added for this test, namely the use of one
bpf_object to iterate through progs, while a second bpf_object is
created and opened/closed to test actual loading of progs. This is
because the test was doing bpf_program__load then __unload to test
loading of individual progs and same semantics with
bpf_object__load/__unload result in failure to load an __unload-ed obj.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-3-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Jakub Kicinski
af54faab84 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-09-17

We've added 63 non-merge commits during the last 12 day(s) which contain
a total of 65 files changed, 2653 insertions(+), 751 deletions(-).

The main changes are:

1) Streamline internal BPF program sections handling and
   bpf_program__set_attach_target() in libbpf, from Andrii.

2) Add support for new btf kind BTF_KIND_TAG, from Yonghong.

3) Introduce bpf_get_branch_snapshot() to capture LBR, from Song.

4) IMUL optimization for x86-64 JIT, from Jie.

5) xsk selftest improvements, from Magnus.

6) Introduce legacy kprobe events support in libbpf, from Rafael.

7) Access hw timestamp through BPF's __sk_buff, from Vadim.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
  selftests/bpf: Fix a few compiler warnings
  libbpf: Constify all high-level program attach APIs
  libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
  selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target() API
  libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
  libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
  selftests/bpf: Stop using relaxed_core_relocs which has no effect
  libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
  bpf: Update bpf_get_smp_processor_id() documentation
  libbpf: Add sphinx code documentation comments
  selftests/bpf: Skip btf_tag test if btf_tag attribute not supported
  docs/bpf: Add documentation for BTF_KIND_TAG
  selftests/bpf: Add a test with a bpf program with btf_tag attributes
  selftests/bpf: Test BTF_KIND_TAG for deduplication
  selftests/bpf: Add BTF_KIND_TAG unit tests
  selftests/bpf: Change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG format
  selftests/bpf: Test libbpf API function btf__add_tag()
  bpftool: Add support for BTF_KIND_TAG
  libbpf: Add support for BTF_KIND_TAG
  libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
  ...
====================

Link: https://lore.kernel.org/r/20210917173738.3397064-1-ast@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-17 12:40:21 -07:00
Yonghong Song
ca21a3e5ed selftests/bpf: Fix a few compiler warnings
With clang building selftests/bpf, I hit a few warnings like below:

  .../bpf_iter.c:592:48: warning: variable 'expected_key_c' set but not used [-Wunused-but-set-variable]
  __u32 expected_key_a = 0, expected_key_b = 0, expected_key_c = 0;
                                                ^

  .../bpf_iter.c:688:48: warning: variable 'expected_key_c' set but not used [-Wunused-but-set-variable]
  __u32 expected_key_a = 0, expected_key_b = 0, expected_key_c = 0;
                                                ^

  .../tc_redirect.c:657:6: warning: variable 'target_fd' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
  if (!ASSERT_OK_PTR(nstoken, "setns " NS_FWD))
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  .../tc_redirect.c:743:6: note: uninitialized use occurs here
  if (target_fd >= 0)
      ^~~~~~~~~

Removing unused variables and initializing the previously-uninitialized variable
to ensure these warnings are gone.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210917043343.3711917-1-yhs@fb.com
2021-09-17 09:10:54 -07:00
Andrii Nakryiko
942025c9f3 libbpf: Constify all high-level program attach APIs
Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so
change all struct bpf_program and struct bpf_map pointers to const
pointers. This is completely backwards compatible with no functional
change.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko
91b555d73e libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption
that bpf_object contains either only single freplace BPF program or all
of BPF programs in BPF object are freplaces intended to replace
different subprograms of the same target BPF program. This seems both
a bit confusing, too assuming, and limiting.

We've had bpf_program__set_attach_target() API which allows more
fine-grained control over this, on a per-program level. As such, mark
open_opts.attach_prog_fd as deprecated starting from v0.7, so that we
have one more universal way of setting freplace targets. With previous
change to allow NULL attach_func_name argument, and especially combined
with BPF skeleton, arguable bpf_program__set_attach_target() is a more
convenient and explicit API as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko
60aed22076 selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target() API
Switch fexit_bpf2bpf selftest to bpf_program__set_attach_target()
instead of using bpf_object_open_opts.attach_prog_fd, which is going to
be deprecated. These changes also demonstrate the new mode of
set_attach_target() in which it allows NULL when the target is BPF
program (attach_prog_fd != 0).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-6-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko
2d5ec1c66e libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
Allow to use bpf_program__set_attach_target to only set target attach
program FD, while letting libbpf to use target attach function name from
SEC() definition. This might be useful for some scenarios where
bpf_object contains multiple related freplace BPF programs intended to
replace different sub-programs in target BPF program. In such case all
programs will have the same attach_prog_fd, but different
attach_func_name. It's convenient to specify such target function names
declaratively in SEC() definitions, but attach_prog_fd is a dynamic
runtime setting.

To simplify such scenario, allow bpf_program__set_attach_target() to
delay BTF ID resolution till the BPF program load time by providing NULL
attach_func_name. In that case the behavior will be similar to using
bpf_object_open_opts.attach_prog_fd (which is marked deprecated since
v0.7), but has the benefit of allowing more control by user in what is
attached to what. Such setup allows having BPF programs attached to
different target attach_prog_fd with target functions still declaratively
recorded in BPF source code in SEC() definitions.

Selftests changes in the next patch should make this more obvious.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org
2021-09-17 09:05:00 -07:00
Andrii Nakryiko
277641859e libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
It's relevant and hasn't been doing anything for a long while now.
Deprecated it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Andrii Nakryiko
23a7baaa93 selftests/bpf: Stop using relaxed_core_relocs which has no effect
relaxed_core_relocs option hasn't had any effect for a while now, stop
specifying it. Next patch marks it as deprecated.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-3-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Andrii Nakryiko
f11f86a393 libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
Don't perform another search for sec_def inside
libbpf_find_attach_btf_id(), as each recognized bpf_program already has
prog->sec_def set.

Also remove unnecessary NULL check for prog->sec_name, as it can never
be NULL.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Vladimir Oltean
3c9cfb5269 net: update NXP copyright text
NXP Legal insists that the following are not fine:

- Saying "NXP Semiconductors" instead of "NXP", since the company's
  registered name is "NXP"

- Putting a "(c)" sign in the copyright string

- Putting a comma in the copyright string

The only accepted copyright string format is "Copyright <year-range> NXP".

This patch changes the copyright headers in the networking files that
were sent by me, or derived from code sent by me.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 13:52:17 +01:00
Namhyung Kim
41b740b6e8 perf record: Add --synth option
Add an option to control the synthesizing behavior.

    --synth <no|all|task|mmap|cgroup>
                      Fine-tune event synthesis: default=all

This can be useful when we know it doesn't need some synthesis like
in a specific usecase and/or when using pipe:

  $ perf record -a --all-cgroups --synth cgroup -o- sleep 1 | \
  > perf report -i- -s cgroup

Committer notes:

Added a clarification to the man page entry for --synth that this is
about pre-existing threads.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210811044658.1313391-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-17 08:55:00 -03:00
Namhyung Kim
84111b9c95 perf tools: Allow controlling synthesizing PERF_RECORD_ metadata events during record
Depending on the use case, it might require some kind of synthesizing
and some not.  Make it controllable to turn off heavy operations like
MMAP for all tasks.

Currently all users are converted to enable all the synthesis by
default.  It'll be updated in the later patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210811044658.1313391-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-17 08:44:19 -03:00
Peter Zijlstra
db2b0c5d7b objtool: Support pv_opsindirect calls for noinstr
Normally objtool will now follow indirect calls; there is no need.

However, this becomes a problem with noinstr validation; if there's an
indirect call from noinstr code, we very much need to know it is to
another noinstr function. Luckily there aren't many indirect calls in
entry code with the obvious exception of paravirt. As such, noinstr
validation didn't work with paravirt kernels.

In order to track pv_ops[] call targets, objtool reads the static
pv_ops[] tables as well as direct assignments to the pv_ops[] array,
provided the compiler makes them a single instruction like:

  bf87:       48 c7 05 00 00 00 00 00 00 00 00        movq   $0x0,0x0(%rip)
    bf92 <xen_init_spinlocks+0x5f>
    bf8a: R_X86_64_PC32     pv_ops+0x268

There are, as of yet, no warnings for when this goes wrong :/

Using the functions found with the above means, all pv_ops[] calls are
now subject to noinstr validation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095149.118815755@infradead.org
2021-09-17 13:20:26 +02:00
Jakub Kicinski
561bed688b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts!

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-16 13:58:38 -07:00
Linus Torvalds
fc0c0548c1 Networking fixes for 5.15-rc2, including fixes from bpf.
Current release - regressions:
 
  - vhost_net: fix OoB on sendmsg() failure
 
  - mlx5: bridge, fix uninitialized variable usage
 
  - bnxt_en: fix error recovery regression
 
 Current release - new code bugs:
 
  - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()
 
 Previous releases - regressions:
 
  - r6040: restore MDIO clock frequency after MAC reset
 
  - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
 
  - dsa: flush switchdev workqueue before tearing down CPU/DSA ports
 
 Previous releases - always broken:
 
  - ptp: dp83640: don't define PAGE0, avoid compiler warning
 
  - igc: fix tunnel segmentation offloads
 
  - phylink: update SFP selected interface on advertising changes
 
  - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
 
  - mlx5e: fix mutual exclusion between CQE compression and HW TS
 
 Misc:
 
  - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode
 
  - sfc: fallback for lack of xdp tx queues
 
  - hns3: add option to turn off page pool feature
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFDnXIACgkQMUZtbf5S
 Irs+pA//UvT0UqL+G/9He9eVMaX/7mnvoShbU/hHaymLuvJE4dx6B3qw8im5fQ0i
 b7ovy1dxjMjSQI4E+2FleajRBgsdXvWgMCopFZr7h35+qPXziLxTTztwsaOIjD8U
 42HBpdwFrVtdrOhSsiKsGKe8KXqIehHhFAp4vQrRlsvsz1Yk/93o6wYaXaXqR6pQ
 qlTVgI7nAuLAw4VfFZ3k7iAiQsyrXrn5kUXq5/Wy6kK7iFnIN7X9j+epTJP/EXHn
 +17SW/iGnjUfwXsRSC2ysgICtOPatu48Cke5hq5XOGqXAtRVhhLfMDiZlr8a6+z4
 gS+QETakBXLrAfvYK6neLjlMhg1FX+Y952xd3XIU4jz+DgIh03r2Nd1xBsp17Yhu
 RRBc96w2KxuHQ6USEXbi5dY1K6o9DK52m/WRmnqXnaNX4ADeesgcrxXHqZ6O6I5R
 +ddvVWylaHxvngCQSL4DvZdYNJVoKjdlo6fHrwD1qu0Uv0rGz6O+AEI9o8DkK5Af
 ZhFBETvjTKzjCykd3UlaRo/ibYChIPYwpGTK+TTI9rUm8OWNw1+sC/Z14bx0/3Vl
 RB53rkJ93/qi6Wb9ljUxM8CPGViuPvMkQJjpJOaktrj0GOgQz3EBdDkIQMaplZx5
 r2AeHW7X5VGxCWMRg4NS/x7Do1jh8m/ch8/EulOAB5fG7KQPfYs=
 =xd6p
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  Current release - regressions:

   - vhost_net: fix OoB on sendmsg() failure

   - mlx5: bridge, fix uninitialized variable usage

   - bnxt_en: fix error recovery regression

  Current release - new code bugs:

   - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()

  Previous releases - regressions:

   - r6040: restore MDIO clock frequency after MAC reset

   - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()

   - dsa: flush switchdev workqueue before tearing down CPU/DSA ports

  Previous releases - always broken:

   - ptp: dp83640: don't define PAGE0, avoid compiler warning

   - igc: fix tunnel segmentation offloads

   - phylink: update SFP selected interface on advertising changes

   - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume

   - mlx5e: fix mutual exclusion between CQE compression and HW TS

  Misc:

   - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode

   - sfc: fallback for lack of xdp tx queues

   - hns3: add option to turn off page pool feature"

* tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  mlxbf_gige: clear valid_polarity upon open
  igc: fix tunnel offloading
  net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
  net: wan: wanxl: define CROSS_COMPILE_M68K
  selftests: nci: replace unsigned int with int
  net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
  Revert "net: phy: Uniform PHY driver access"
  net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
  ptp: dp83640: don't define PAGE0
  bnx2x: Fix enabling network interfaces without VFs
  Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers""
  tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
  net-caif: avoid user-triggerable WARN_ON(1)
  bpf, selftests: Add test case for mixed cgroup v1/v2
  bpf, selftests: Add cgroup v1 net_cls classid helpers
  bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
  bpf: Add oversize check before call kvcalloc()
  net: hns3: fix the timing issue of VF clearing interrupt sources
  net: hns3: fix the exception when query imp info
  net: hns3: disable mac in flr process
  ...
2021-09-16 13:05:42 -07:00
Shuah Khan
f5013d412a selftests: kvm: fix get_run_delay() ignoring fscanf() return warn
Fix get_run_delay() to check fscanf() return value to get rid of the
following warning. When fscanf() fails return MIN_RUN_DELAY_NS from
get_run_delay(). Move MIN_RUN_DELAY_NS from steal_time.c to test_util.h
so get_run_delay() and steal_time.c can use it.

lib/test_util.c: In function ‘get_run_delay’:
lib/test_util.c:316:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  316 |  fscanf(fp, "%ld %ld ", &val[0], &val[1]);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:32 -06:00
Shuah Khan
20175d5eac selftests: kvm: move get_run_delay() into lib/test_util
get_run_delay() is defined static in xen_shinfo_test and steal_time test.
Move it to lib and remove code duplication.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:26 -06:00
Shuah Khan
3a4f0cc693 selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn
Fix get_trans_hugepagesz() to check fscanf() return value to get rid
of the following warning:

lib/test_util.c: In function ‘get_trans_hugepagesz’:
lib/test_util.c:138:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  138 |  fscanf(f, "%ld", &size);
      |  ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:11 -06:00
Shuah Khan
39a71f712d selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn
Fix get_warnings_count() to check fscanf() return value to get rid
of the following warning:

x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:85:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
   85 |  fscanf(f, "%d", &warnings);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:02 -06:00
Paul E. McKenney
faaaf2ac03 torture: Make kvm-remote.sh print size of downloaded tarball
This commit causes kvm-remote.sh to print the size of the tarball that
is downloaded to each of the remote systems.  This size can help with
performance projections and analysis.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-16 10:32:35 -07:00
Paul E. McKenney
ae3357ac11 torture: Allot 1G of memory for scftorture runs
By default, torture.sh allots 512M of memory for each guest OS.  However,
when running scftorture with KASAN, 1G is needed.  This commit therefore
causes torture.sh to provide the required 1G.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-16 10:32:35 -07:00
Paul E. McKenney
2010776f8c tools/rcu: Add an extract-stall script
This commit adds a script that extracts RCU CPU stall warnings
from console output.  The user can optionally specify the number of
lines preceding the stall to output, and also the number of lines of
stall-warning text.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-16 10:31:26 -07:00
Xiang wangx
98dc68f8b0 selftests: nci: replace unsigned int with int
Should not use comparison of unsigned expressions < 0.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:55:51 +01:00
Ian Rogers
8228e9361e perf parse-events: Avoid enum forward declaration.
Enum forward declarations aren't allowed as the size can't be implied.
Switch to just using an int. This fixes a clang warning:

  In file included from tools/perf/bench/evlist-open-close.c:13:
  tools/perf/bench/../util/parse-events.h:185:6: error: redeclaration of already-defined enum 'perf_tool_event' is a GNU extension [-Werror,-Wgnu-redeclared-enum]
  enum perf_tool_event;
       ^
  tools/perf/bench/../util/evsel.h:28:6: note: previous definition is here
  enum perf_tool_event {
       ^

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210915211428.1773567-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-15 18:21:00 -03:00
Andrii Nakryiko
00e0ca3721 perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id()
Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
a weak function, presumably to dynamically link against old version of
libbpf shared library. Unfortunately this causes compilation warning
when perf is compiled against libbpf v0.6+.

For now, just ignore deprecation warning, but there might be a better
solution, depending on perf's needs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: kernel-team@fb.com
LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-15 18:06:08 -03:00
Muhammad Falak R Wani
ddf0d4dee4 perf bpf: Deprecate bpf_map__resize() in favor of bpf_map_set_max_entries()
As a part of libbpf 1.0 plan[0], this patch deprecates use of
bpf_map__resize in favour of bpf_map__set_max_entries.

Reference: https://github.com/libbpf/libbpf/issues/304
[0]: https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#libbpfh-high-level-apis

Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Muhammad Falak R Wani <falakreyaz@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Link: http //lore.kernel.org/lkml/20210815103610.27887-1-falakreyaz@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-15 17:57:30 -03:00
Ravi Bangoria
3149733584 perf annotate: Add fusion logic for AMD microarchs
AMD family 15h and above microarchs fuse a subset of cmp/test/ALU
instructions with branch instructions[1][2]. Add perf annotate
fused instruction support for these microarchs.

Before:
         │       testb  $0x80,0x51(%rax)
         │    ┌──jne    5b3
    0.78 │    │  mov    %r13,%rdi
         │    │→ callq  mark_page_accessed
    1.08 │5b3:└─→mov    0x8(%r13),%rax

After:
         │    ┌──testb  $0x80,0x51(%rax)
         │    ├──jne    5b3
    0.78 │    │  mov    %r13,%rdi
         │    │→ callq  mark_page_accessed
    1.08 │5b3:└─→mov    0x8(%r13),%rax

[1] https://bugzilla.kernel.org/attachment.cgi?id=298553
[2] https://bugzilla.kernel.org/attachment.cgi?id=298555

Committer testing:

On a:

  $ grep -m1 "model name" /proc/cpuinfo
  model name	: AMD Ryzen 9 3900X 12-Core Processor
  $

  Samples: 44K of event 'cycles', 4000 Hz, Event count (approx.): 7533249650
  _int_malloc  /usr/lib64/libc-2.33.so [Percent: local period]
  Percent│    ┌──test   %eax,%eax
         │    ├──jne    884
         │    │↓ jmpq   943
         │    │  nop
         │878:│  add    $0x10,%rdx
    0.64 │    │  add    %eax,%eax
    0.57 │    │↓ je     cc9
    0.77 │884:└─→test   %esi,%eax
         │     ↑ je     878
         │       mov    0x18(%rdx),%r15

Reported-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https //lore.kernel.org/r/20210911043854.8373-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-15 17:54:52 -03:00
Matteo Croce
336562752a bpf: Update bpf_get_smp_processor_id() documentation
BPF programs run with migration disabled regardless of preemption, as
they are protected by migrate_disable(). Update the uapi documentation
accordingly.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914235400.59427-1-mcroce@linux.microsoft.com
2021-09-15 22:39:55 +02:00
Grant Seltzer
69cd823956 libbpf: Add sphinx code documentation comments
This adds comments above five functions in btf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210915021951.117186-1-grantseltzer@gmail.com
2021-09-15 13:16:02 -07:00
Masami Hiramatsu
80be5998ad tools/bootconfig: Define memblock_free_ptr() to fix build error
The lib/bootconfig.c file is shared with the 'bootconfig' tooling, and
as a result, the changes incommit 77e02cf57b ("memblock: introduce
saner 'memblock_free_ptr()' interface") need to also be reflected in the
tooling header file.

So define the new memblock_free_ptr() wrapper, and remove unused __pa()
and memblock_free().

Fixes: 77e02cf57b ("memblock: introduce saner 'memblock_free_ptr()' interface")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-15 09:49:48 -07:00
Li Zhijian
8914a7a247 selftests: be sure to make khdr before other targets
LKP/0Day reported some building errors about kvm, and errors message
are not always same:
- lib/x86_64/processor.c:1083:31: error: ‘KVM_CAP_NESTED_STATE’ undeclared
(first use in this function); did you mean ‘KVM_CAP_PIT_STATE2’?
- lib/test_util.c:189:30: error: ‘MAP_HUGE_16KB’ undeclared (first use
in this function); did you mean ‘MAP_HUGE_16GB’?

Although kvm relies on the khdr, they still be built in parallel when -j
is specified. In this case, it will cause compiling errors.

Here we mark target khdr as NOTPARALLEL to make it be always built
first.

CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-15 10:34:21 -06:00
Yonghong Song
2220ecf55c selftests/bpf: Skip btf_tag test if btf_tag attribute not supported
Commit c240ba2878 ("selftests/bpf: Add a test with a bpf
program with btf_tag attributes") added btf_tag selftest
to test BTF_KIND_TAG generation from C source code, and to
test kernel validation of generated BTF types.
But if an old clang (clang 13 or earlier) is used, the
following compiler warning may be seen:
  progs/tag.c:23:20: warning: unknown attribute 'btf_tag' ignored
and the test itself is marked OK. The compiler warning is bad
and the test itself shouldn't be marked OK.

This patch added the check for btf_tag attribute support.
If btf_tag is not supported by the clang, the attribute will
not be used in the code and the test will be marked as skipped.
For example, with clang 13:
  ./test_progs -t btf_tag
  #21 btf_tag:SKIP
  Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED

The selftests/README.rst is updated to clarify when the btf_tag
test may be skipped.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210915061036.2577971-1-yhs@fb.com
2021-09-15 08:49:49 -07:00
Jakub Kicinski
2d6a58996e selftests: net: test ethtool -L vs mq
Add a selftest for checking mq children are visible after ethtool -L.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 15:46:02 +01:00
Peter Zijlstra
f56dae88a8 objtool: Handle __sanitize_cov*() tail calls
Turns out the compilers also generate tail calls to __sanitize_cov*(),
make sure to also patch those out in noinstr code.

Fixes: 0f1441b44e ("objtool: Fix noinstr vs KCOV")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20210624095147.818783799@infradead.org
2021-09-15 15:51:45 +02:00
Peter Zijlstra
8b946cc38e objtool: Introduce CFI hash
Andi reported that objtool on vmlinux.o consumes more memory than his
system has, leading to horrific performance.

This is in part because we keep a struct instruction for every
instruction in the file in-memory. Shrink struct instruction by
removing the CFI state (which includes full register state) from it
and demand allocating it.

Given most instructions don't actually change CFI state, there's lots
of repetition there, so add a hash table to find previous CFI
instances.

Reduces memory consumption (and runtime) for processing an
x86_64-allyesconfig:

  pre:  4:40.84 real,   143.99 user,    44.18 sys,      30624988 mem
  post: 2:14.61 real,   108.58 user,    25.04 sys,      16396184 mem

Suggested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095147.756759107@infradead.org
2021-09-15 15:51:45 +02:00
Peter Zijlstra
9af9dcf11b x86/xen: Mark cpu_bringup_and_idle() as dead_end_function
The asm_cpu_bringup_and_idle() function is required to push the return
value on the stack in order to make ORC happy, but the only reason
objtool doesn't complain is because of a happy accident.

The thing is that asm_cpu_bringup_and_idle() doesn't return, so
validate_branch() never terminates and falls through to the next
function, which in the normal case is the hypercall_page. And that, as
it happens, is 4095 NOPs and a RET.

Make asm_cpu_bringup_and_idle() terminate on it's own, by making the
function it calls as a dead-end. This way we no longer rely on what
code happens to come after.

Fixes: c3881eb58d ("x86/xen: Make the secondary CPU idle tasks reliable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20210624095147.693801717@infradead.org
2021-09-15 15:51:44 +02:00
Yonghong Song
c240ba2878 selftests/bpf: Add a test with a bpf program with btf_tag attributes
Add a bpf program with btf_tag attributes. The program is
loaded successfully with the kernel. With the command
  bpftool btf dump file ./tag.o
the following dump shows that tags are properly encoded:
  [8] STRUCT 'key_t' size=12 vlen=3
          'a' type_id=2 bits_offset=0
          'b' type_id=2 bits_offset=32
          'c' type_id=2 bits_offset=64
  [9] TAG 'tag1' type_id=8 component_id=-1
  [10] TAG 'tag2' type_id=8 component_id=-1
  [11] TAG 'tag1' type_id=8 component_id=1
  [12] TAG 'tag2' type_id=8 component_id=1
  ...
  [21] FUNC_PROTO '(anon)' ret_type_id=2 vlen=1
          'x' type_id=2
  [22] FUNC 'foo' type_id=21 linkage=static
  [23] TAG 'tag1' type_id=22 component_id=0
  [24] TAG 'tag2' type_id=22 component_id=0
  [25] TAG 'tag1' type_id=22 component_id=-1
  [26] TAG 'tag2' type_id=22 component_id=-1
  ...
  [29] VAR 'total' type_id=27, linkage=global
  [30] TAG 'tag1' type_id=29 component_id=-1
  [31] TAG 'tag2' type_id=29 component_id=-1

If an old clang compiler, which does not support btf_tag attribute,
is used, these btf_tag attributes will be silently ignored.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223058.248949-1-yhs@fb.com
2021-09-14 18:45:53 -07:00
Yonghong Song
ad526474ae selftests/bpf: Test BTF_KIND_TAG for deduplication
Add unit tests for BTF_KIND_TAG deduplication for
  - struct and struct member
  - variable
  - func and func argument

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223052.248535-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
35baba7a83 selftests/bpf: Add BTF_KIND_TAG unit tests
Test good and bad variants of BTF_KIND_TAG encoding.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223047.248223-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
3df3bd68d4 selftests/bpf: Change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG format
BTF_KIND_TAG ELF format has a component_idx which might have value -1.
test_btf may confuse it with common_type.name as NAME_NTH checkes
high 16bit to be 0xffff. Change NAME_NTH high 16bit check to be
0xfffe so it won't confuse with component_idx.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223041.248009-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
71d29c2d47 selftests/bpf: Test libbpf API function btf__add_tag()
Add btf_write tests with btf__add_tag() function.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223036.247560-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
5c07f2fec0 bpftool: Add support for BTF_KIND_TAG
Added bpftool support to dump BTF_KIND_TAG information.
The new bpftool will be used in later patches to dump
btf in the test bpf program object file.

Currently, the tags are not emitted with
  bpftool btf dump file <path> format c
and they are silently ignored.  The tag information is
mostly used in the kernel for verification purpose and the kernel
uses its own btf to check. With adding these tags
to vmlinux.h, tags will be encoded in program's btf but
they will not be used by the kernel, at least for now.
So let us delay adding these tags to format C header files
until there is a real need.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223031.246951-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
5b84bd1036 libbpf: Add support for BTF_KIND_TAG
Add BTF_KIND_TAG support for parsing and dedup.
Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not
supported in the kernel, sanitize it to INTs.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
30025e8bd8 libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
This patch renames functions btf_{hash,equal}_int() to
btf_{hash,equal}_int_tag() so they can be reused for
BTF_KIND_TAG support. There is no functionality change for
this patch.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
b5ea834dde bpf: Support for new btf kind BTF_KIND_TAG
LLVM14 added support for a new C attribute ([1])
  __attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to dwarf ([2]) and pahole
will convert it to BTF. Or for bpf target, this
attribute will be emitted to BTF directly ([3], [4]).
The attribute is intended to provide additional
information for
  - struct/union type or struct/union member
  - static/global variables
  - static/global function or function parameter.

For linux kernel, the btf_tag can be applied
in various places to specify user pointer,
function pre- or post- condition, function
allow/deny in certain context, etc. Such information
will be encoded in vmlinux BTF and can be used
by verifier.

The btf_tag can also be applied to bpf programs
to help global verifiable functions, e.g.,
specifying preconditions, etc.

This patch added basic parsing and checking support
in kernel for new BTF_KIND_TAG kind.

 [1] https://reviews.llvm.org/D106614
 [2] https://reviews.llvm.org/D106621
 [3] https://reviews.llvm.org/D106622
 [4] https://reviews.llvm.org/D109560

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
41ced4cd88 btf: Change BTF_KIND_* macros to enums
Change BTF_KIND_* macros to enums so they are encoded in dwarf and
appear in vmlinux.h. This will make it easier for bpf programs
to use these constants without macro definitions.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223009.245307-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Andrii Nakryiko
8987ede3ed selftests/bpf: Fix .gitignore to not ignore test_progs.c
List all possible test_progs flavors explicitly to avoid accidentally
ignoring valid source code files. In this case, test_progs.c was still
ignored after recent 809ed84de8 ("selftests/bpf: Whitelist test_progs.h
from .gitignore") fix that added exception only for test_progs.h.

Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914162228.3995740-1-andrii@kernel.org
2021-09-14 18:41:14 -07:00
Jie Meng
c035407743 bpf,x64 Emit IMUL instead of MUL for x86-64
IMUL allows for multiple operands and saving and storing rax/rdx is no
longer needed. Signedness of the operands doesn't matter here because
the we only keep the lower 32/64 bit of the product for 32/64 bit
multiplications.

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913211337.1564014-1-jmeng@fb.com
2021-09-14 18:36:36 -07:00
Andrii Nakryiko
b6291a6f30 libbpf: Minimize explicit iterator of section definition array
Remove almost all the code that explicitly iterated BPF program section
definitions in favor of using find_sec_def(). The only remaining user of
section_defs is libbpf_get_type_names that has to iterate all of them to
construct its result.

Having one internal API entry point for section definitions will
simplify further refactorings around libbpf's program section
definitions parsing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko
5532dfd42e libbpf: Simplify BPF program auto-attach code
Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF
programs, as it is already recorded at bpf_object__open() time for all
recognized type of BPF programs. This further reduces number of explicit
calls to find_sec_def(), simplifying further refactorings.

No functional changes are done by this patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko
91b4d1d1d5 libbpf: Ensure BPF prog types are set before relocations
Refactor bpf_object__open() sequencing to perform BPF program type
detection based on SEC() definitions before we get to relocations
collection. This allows to have more information about BPF program by
the time we get to, say, struct_ops relocation gathering. This,
subsequently, simplifies struct_ops logic and removes the need to
perform extra find_sec_def() resolution.

With this patch libbpf will require all struct_ops BPF programs to be
marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations.
Real-world applications are already doing that through something like
selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's
internal handling of SEC() definitions and is in the sprit of
upcoming libbpf-1.0 section strictness changes ([0]).

  [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko
53df63ccdc selftests/bpf: Update selftests to always provide "struct_ops" SEC
Update struct_ops selftests to always specify "struct_ops" section
prefix. Libbpf will require a proper BPF program type set in the next
patch, so this prevents tests breaking.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-2-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Rafael David Tinoco
ca304b40c2 libbpf: Introduce legacy kprobe events support
Allow kprobe tracepoint events creation through legacy interface, as the
kprobe dynamic PMUs support, used by default, was only created in v4.17.

Store legacy kprobe name in struct bpf_perf_link, instead of creating
a new "subclass" off of bpf_perf_link. This is ok as it's just two new
fields, which are also going to be reused for legacy uprobe support in
follow up patches.

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com
2021-09-14 14:44:45 -07:00
David S. Miller
2865ba8247 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-14

The following pull-request contains BPF updates for your *net* tree.

We've added 7 non-merge commits during the last 13 day(s) which contain
a total of 18 files changed, 334 insertions(+), 193 deletions(-).

The main changes are:

1) Fix mmap_lock lockdep splat in BPF stack map's build_id lookup, from Yonghong Song.

2) Fix BPF cgroup v2 program bypass upon net_cls/prio activation, from Daniel Borkmann.

3) Fix kvcalloc() BTF line info splat on oversized allocation attempts, from Bixuan Cui.

4) Fix BPF selftest build of task_pt_regs test for arm64/s390, from Jean-Philippe Brucker.

5) Fix BPF's disasm.{c,h} to dual-license so that it is aligned with bpftool given the former
   is a build dependency for the latter, from Daniel Borkmann with ACKs from contributors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 13:09:54 +01:00
Faizel K B
f81c08f897 usb: testusb: Fix for showing the connection speed
testusb' application which uses 'usbtest' driver reports 'unknown speed'
from the function 'find_testdev'. The variable 'entry->speed' was not
updated from  the application. The IOCTL mentioned in the FIXME comment can
only report whether the connection is low speed or not. Speed is read using
the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade.  The
call is implemented in the function 'handle_testdev' where the file
descriptor was availble locally. Sample output is given below where 'high
speed' is printed as the connected speed.

sudo ./testusb -a
high speed      /dev/bus/usb/001/011    0
/dev/bus/usb/001/011 test 0,    0.000015 secs
/dev/bus/usb/001/011 test 1,    0.194208 secs
/dev/bus/usb/001/011 test 2,    0.077289 secs
/dev/bus/usb/001/011 test 3,    0.170604 secs
/dev/bus/usb/001/011 test 4,    0.108335 secs
/dev/bus/usb/001/011 test 5,    2.788076 secs
/dev/bus/usb/001/011 test 6,    2.594610 secs
/dev/bus/usb/001/011 test 7,    2.905459 secs
/dev/bus/usb/001/011 test 8,    2.795193 secs
/dev/bus/usb/001/011 test 9,    8.372651 secs
/dev/bus/usb/001/011 test 10,    6.919731 secs
/dev/bus/usb/001/011 test 11,   16.372687 secs
/dev/bus/usb/001/011 test 12,   16.375233 secs
/dev/bus/usb/001/011 test 13,    2.977457 secs
/dev/bus/usb/001/011 test 14 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 17,    0.148826 secs
/dev/bus/usb/001/011 test 18,    0.068718 secs
/dev/bus/usb/001/011 test 19,    0.125992 secs
/dev/bus/usb/001/011 test 20,    0.127477 secs
/dev/bus/usb/001/011 test 21 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 24,    4.133763 secs
/dev/bus/usb/001/011 test 27,    2.140066 secs
/dev/bus/usb/001/011 test 28,    2.120713 secs
/dev/bus/usb/001/011 test 29,    0.507762 secs

Signed-off-by: Faizel K B <faizel.kb@dicortech.com>
Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14 10:31:41 +02:00
Paul E. McKenney
b380b10b84 torture: Make torture.sh print the number of files to be compressed
Compressing gigabyte vmlinux files can take some time, and it can be a
bit annoying to not know many more batches of compression there will be.
This commit therefore makes torture.sh print the number of files to be
compressed just before starting compression and just after compression
completes.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-13 16:37:11 -07:00
Daniel Borkmann
43d2b88c29 bpf, selftests: Add test case for mixed cgroup v1/v2
Minimal selftest which implements a small BPF policy program to the
connect(2) hook which rejects TCP connection requests to port 60123
with EPERM. This is being attached to a non-root cgroup v2 path. The
test asserts that this works under cgroup v2-only and under a mixed
cgroup v1/v2 environment where net_classid is set in the former case.

Before fix:

  # ./test_progs -t cgroup_v1v2
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  test_cgroup_v1v2:PASS:client_fd 0 nsec
  test_cgroup_v1v2:PASS:cgroup_fd 0 nsec
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  run_test:PASS:join_classid 0 nsec
  (network_helpers.c:219: errno: None) Unexpected success to connect to server
  test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 0)
  #27 cgroup_v1v2:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

After fix:

  # ./test_progs -t cgroup_v1v2
  #27 cgroup_v1v2:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-3-daniel@iogearbox.net
2021-09-13 16:35:58 -07:00
Daniel Borkmann
d8079d8026 bpf, selftests: Add cgroup v1 net_cls classid helpers
Minimal set of helpers for net_cls classid cgroupv1 management in order
to set an id, join from a process, initiate setup and teardown. cgroupv2
helpers are left as-is, but reused where possible.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-2-daniel@iogearbox.net
2021-09-13 16:35:58 -07:00
Nathan Chancellor
d0ee23f9d7 tools: compiler-gcc.h: Guard error attribute use with __has_attribute
When building objtool with HOSTCC=clang, there are several errors along
the lines of

  orc_dump.c:201:28: error: unknown attribute 'error' ignored [-Werror,-Wunknown-attributes]

This occurs after commit 4e59869aa6 ("compiler-gcc.h: drop checks for
older GCC versions"), which removed the GCC_VERSION gating.  The removed
version check just so happened to prevent __compiletime_error() from
being defined with clang because it pretends to be GCC 4.2.1 for
compatibility but the error attribute was not added to clang until
14.0.0.

Commit 815f0ddb34 ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") and commit a3f8a30f3f ("Compiler Attributes: use
feature checks instead of version checks") refactored the handling of
attributes in the main kernel to avoid situations like this but that
refactoring has never been done for the tools directory.

Refactoring is a rather large undertaking and this has never been an
issue before so instead, just guard the definition of
__compiletime_error() with __has_attribute() so that there are no more
errors.

Fixes: 4e59869aa6 ("compiler-gcc.h: drop checks for older GCC versions")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 15:51:41 -07:00
Andrii Nakryiko
2f38304127 libbpf: Make libbpf_version.h non-auto-generated
Turn previously auto-generated libbpf_version.h header into a normal
header file. This prevents various tricky Makefile integration issues,
simplifies the overall build process, but also allows to further extend
it with some more versioning-related APIs in the future.

To prevent accidental out-of-sync versions as defined by libbpf.map and
libbpf_version.h, Makefile checks their consistency at build time.

Simultaneously with this change bump libbpf.map to v0.6.

Also undo adding libbpf's output directory into include path for
kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary
because libbpf_version.h is just a normal header like any other.

Fixes: 0b46b75505 ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org
2021-09-13 15:36:47 -07:00
Daniel Borkmann
dbd7eb14e0 bpf, selftests: Replicate tailcall limit test for indirect call case
The tailcall_3 test program uses bpf_tail_call_static() where the JIT
would patch a direct jump. Add a new tailcall_6 test program replicating
exactly the same test just ensuring that bpf_tail_call() uses a map
index where the verifier cannot make assumptions this time.

In other words, this will now cover both on x86-64 JIT, meaning, JIT
images with emit_bpf_tail_call_direct() emission as well as JIT images
with emit_bpf_tail_call_indirect() emission.

  # echo 1 > /proc/sys/net/core/bpf_jit_enable
  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  #136/7 tailcalls/tailcall_bpf2bpf_1:OK
  #136/8 tailcalls/tailcall_bpf2bpf_2:OK
  #136/9 tailcalls/tailcall_bpf2bpf_3:OK
  #136/10 tailcalls/tailcall_bpf2bpf_4:OK
  #136/11 tailcalls/tailcall_bpf2bpf_5:OK
  #136 tailcalls:OK
  Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

  # echo 0 > /proc/sys/net/core/bpf_jit_enable
  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  [...]

For interpreter, the tailcall_1-6 tests are passing as well. The later
tailcall_bpf2bpf_* are failing due lack of bpf2bpf + tailcall support
in interpreter, so this is expected.

Also, manual inspection shows that both loaded programs from tailcall_3
and tailcall_6 test case emit the expected opcodes:

* tailcall_3 disasm, emit_bpf_tail_call_direct():

  [...]
   b:   push   %rax
   c:   push   %rbx
   d:   push   %r13
   f:   mov    %rdi,%rbx
  12:   movabs $0xffff8d3f5afb0200,%r13
  1c:   mov    %rbx,%rdi
  1f:   mov    %r13,%rsi
  22:   xor    %edx,%edx                 _
  24:   mov    -0x4(%rbp),%eax          |  limit check
  2a:   cmp    $0x20,%eax               |
  2d:   ja     0x0000000000000046       |
  2f:   add    $0x1,%eax                |
  32:   mov    %eax,-0x4(%rbp)          |_
  38:   nopl   0x0(%rax,%rax,1)
  3d:   pop    %r13
  3f:   pop    %rbx
  40:   pop    %rax
  41:   jmpq   0xffffffffffffe377
  [...]

* tailcall_6 disasm, emit_bpf_tail_call_indirect():

  [...]
  47:   movabs $0xffff8d3f59143a00,%rsi
  51:   mov    %edx,%edx
  53:   cmp    %edx,0x24(%rsi)
  56:   jbe    0x0000000000000093        _
  58:   mov    -0x4(%rbp),%eax          |  limit check
  5e:   cmp    $0x20,%eax               |
  61:   ja     0x0000000000000093       |
  63:   add    $0x1,%eax                |
  66:   mov    %eax,-0x4(%rbp)          |_
  6c:   mov    0x110(%rsi,%rdx,8),%rcx
  74:   test   %rcx,%rcx
  77:   je     0x0000000000000093
  79:   pop    %rax
  7a:   mov    0x30(%rcx),%rcx
  7e:   add    $0xb,%rcx
  82:   callq  0x000000000000008e
  87:   pause
  89:   lfence
  8c:   jmp    0x0000000000000087
  8e:   mov    %rcx,(%rsp)
  92:   retq
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Paul Chaignon <paul@cilium.io>
Link: https://lore.kernel.org/bpf/CAM1=_QRyRVCODcXo_Y6qOm1iT163HoiSj8U2pZ8Rj3hzMTT=HQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20210910091900.16119-1-daniel@iogearbox.net
2021-09-13 14:52:22 -07:00
Song Liu
025bd7c753 selftests/bpf: Add test for bpf_get_branch_snapshot
This test uses bpf_get_branch_snapshot from a fexit program. The test uses
a target function (bpf_testmod_loop_test) and compares the record against
kallsyms. If there isn't enough record matching kallsyms, the test fails.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-4-songliubraving@fb.com
2021-09-13 10:53:50 -07:00
Song Liu
856c02dbce bpf: Introduce helper bpf_get_branch_snapshot
Introduce bpf_get_branch_snapshot(), which allows tracing pogram to get
branch trace from hardware (e.g. Intel LBR). To use the feature, the
user need to create perf_event with proper branch_record filtering
on each cpu, and then calls bpf_get_branch_snapshot in the bpf function.
On Intel CPUs, VLBR event (raw event 0x1b00) can be use for this.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-3-songliubraving@fb.com
2021-09-13 10:53:50 -07:00
Linus Torvalds
316346243b Merge branch 'gcc-min-version-5.1' (make gcc-5.1 the minimum version)
Merge patch series from Nick Desaulniers to update the minimum gcc
version to 5.1.

This is some of the left-overs from the merge window that I didn't want
to deal with yesterday, so it comes in after -rc1 but was sent before.

Gcc-4.9 support has been an annoyance for some time, and with -Werror I
had the choice of applying a fairly big patch from Kees Cook to remove a
fair number of initializer warnings (still leaving some), or this patch
series from Nick that just removes the source of the problem.

The initializer cleanups might still be worth it regardless, but
honestly, I preferred just tackling the problem with gcc-4.9 head-on.
We've been more aggressiuve about no longer having to care about
compilers that were released a long time ago, and I think it's been a
good thing.

I added a couple of patches on top to sort out a few left-overs now that
we no longer support gcc-4.x.

As noted by Arnd, as a result of this minimum compiler version upgrade
we can probably change our use of '--std=gnu89' to '--std=gnu11', and
finally start using local loop declarations etc.  But this series does
_not_ yet do that.

Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/
Link: https://lore.kernel.org/lkml/CAK7LNASs6dvU6D3jL2GG3jW58fXfaj6VNOe55NJnTB8UPuk2pA@mail.gmail.com/
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

* emailed patches from Nick Desaulniers <ndesaulniers@google.com>:
  Drop some straggling mentions of gcc-4.9 as being stale
  compiler_attributes.h: drop __has_attribute() support for gcc4
  vmlinux.lds.h: remove old check for GCC 4.9
  compiler-gcc.h: drop checks for older GCC versions
  Makefile: drop GCC < 5 -fno-var-tracking-assignments workaround
  arm64: remove GCC version check for ARCH_SUPPORTS_INT128
  powerpc: remove GCC version check for UPD_CONSTR
  riscv: remove Kconfig check for GCC version for ARCH_RV64I
  Kconfig.debug: drop GCC 5+ version check for DWARF5
  mm/ksm: remove old GCC 4.9+ check
  compiler.h: drop fallback overflow checkers
  Documentation: raise minimum supported version of GCC to 5.1
2021-09-13 10:43:04 -07:00
Nick Desaulniers
4e59869aa6 compiler-gcc.h: drop checks for older GCC versions
Now that GCC 5.1 is the minimally supported default, drop the values we
don't use.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:18:29 -07:00
Nick Desaulniers
4eb6bd55cf compiler.h: drop fallback overflow checkers
Once upgrading the minimum supported version of GCC to 5.1, we can drop
the fallback code for !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW.

This is effectively a revert of commit f0907827a8 ("compiler.h: enable
builtin overflow checkers and add fallback code")

Link: https://github.com/ClangBuiltLinux/linux/issues/1438#issuecomment-916745801
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:18:28 -07:00
Paul E. McKenney
11e46f0804 torture: Apply CONFIG_KCSAN_STRICT to kvm.sh --kcsan argument
Currently, the --kcsan argument to kvm.sh applies a laundry list of
Kconfig options.  Now that KCSAN provides the CONFIG_KCSAN_STRICT Kconfig
option, this commit reduces the laundry list to this one option.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-13 08:14:33 -07:00
Nicholas Piggin
5379ef2a60 selftests/powerpc: Add scv versions of the basic TM syscall tests
The basic TM vs syscall test code hard codes an sc instruction for the
system call, which fails to cover scv even when the userspace libc has
support for it.

Duplicate the tests with hard coded scv variants so both are tested
when possible.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix build on old toolchains by using .long for scv]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210903125707.1601269-2-npiggin@gmail.com
2021-09-13 22:34:11 +10:00
Linus Torvalds
b5b65f1398 perf tools changes for v5.15: 2nd batch
- Add missing fields and remove some duplicate fields when printing a perf_event_attr.
 
 - Fix hybrid config terms list corruption.
 
 - Update kernel header copies, some resulted in new kernel features being
   automagically added to 'perf trace' syscall/tracepoint argument id->string translators.
 
 - Add a file generated during the documentation build to .gitignore.
 
 - Add an option to build without libbfd, as some distros, like Debian consider
   its ABI unstable.
 
 - Add support to print a textual representation of IBS raw sample data in 'perf report'.
 
 - Fix bpf 'perf test' sample mismatch reporting
 
 - Fix passing arguments to stackcollapse report in a 'perf script' python script.
 
 - Allow build-id with trailing zeros.
 
 - Look for ImageBase in PE file to compute .text offset.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYT0+hwAKCRCyPKLppCJ+
 JxcPAQDO+iCKK/sF3TVN8f0T8xkFD6y8krBXPAtQHCAhVBeiqAD9F4R0VMX6nwy3
 8rJnsNd2ODjywgFBO4uPy0N2fxBWjwo=
 =/hH1
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:

 - Add missing fields and remove some duplicate fields when printing a
   perf_event_attr.

 - Fix hybrid config terms list corruption.

 - Update kernel header copies, some resulted in new kernel features
   being automagically added to 'perf trace' syscall/tracepoint argument
   id->string translators.

 - Add a file generated during the documentation build to .gitignore.

 - Add an option to build without libbfd, as some distros, like Debian
   consider its ABI unstable.

 - Add support to print a textual representation of IBS raw sample data
   in 'perf report'.

 - Fix bpf 'perf test' sample mismatch reporting

 - Fix passing arguments to stackcollapse report in a 'perf script'
   python script.

 - Allow build-id with trailing zeros.

 - Look for ImageBase in PE file to compute .text offset.

* tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (25 commits)
  tools headers UAPI: Update tools's copy of drm.h headers
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  tools headers UAPI: Sync linux/fs.h with the kernel sources
  tools headers UAPI: Sync linux/in.h copy with the kernel sources
  perf tools: Add an option to build without libbfd
  perf tools: Allow build-id with trailing zeros
  perf tools: Fix hybrid config terms list corruption
  perf tools: Factor out copy_config_terms() and free_config_terms()
  perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
  perf tools: Ignore Documentation dependency file
  perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
  tools include UAPI: Update linux/mount.h copy
  perf beauty: Cover more flags in the  move_mount syscall argument beautifier
  tools headers UAPI: Sync linux/prctl.h with the kernel sources
  tools include UAPI: Sync sound/asound.h copy with the kernel sources
  tools headers UAPI: Sync linux/kvm.h with the kernel sources
  tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
  perf report: Add support to print a textual representation of IBS raw sample data
  perf report: Add tools/arch/x86/include/asm/amd-ibs.h
  perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings
  ...
2021-09-12 16:18:15 -07:00
Andrea Claudi
1b704b27be selftest: net: fix typo in altname test
If altname deletion of the short alternative name fails, the error
message printed is: "Failed to add short alternative name".
This is obviously a typo, as we are testing altname deletion.

Fix this using a proper error message.

Fixes: f95e6c9c46 ("selftest: net: add alternative names test")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-12 10:48:26 +01:00
Linus Torvalds
78e709522d virtio,vdpa,vhost: features, fixes
vduse driver supporting blk
 virtio-vsock support for end of record with SEQPACKET
 vdpa: mac and mq support for ifcvf and mlx5
 vdpa: management netlink for ifcvf
 virtio-i2c, gpio dt bindings
 
 misc fixes, cleanups
 
 NB: when merging this with
 b542e383d8 ("eventfd: Make signal recursion protection a task bit")
 from Linus' tree, replace eventfd_signal_count with
 eventfd_signal_allowed, and drop the export of eventfd_wake_count from
 ("eventfd: Export eventfd_wake_count to modules").
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmE1+awPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpt6EIAJy0qrc62lktNA0IiIVJSLbUbTMmFj8MzkGR
 8UxZdhpjWqBPJPyaOuNeksAqTGm/UAPEYx3C2c95Jhej7anFpy7dbCtIXcPHLJME
 DjcJg+EDrlNCj8m0FcsHpHWsFzPMERJpyEZNxgB5WazQbv+yWhGrg2FN5DCnF0Ro
 ZFYeKSVty148pQ0nHl8X0JM2XMtqit+O+LvKN2HQZ+fubh7BCzMxzkHY0QLHIzUS
 UeZqd3Qm8YcbqnlX38P5D6k+NPiTEgknmxaBLkPxg6H3XxDAmaIRFb8Ldd1rsgy1
 zTLGDiSGpVDIpawRnuEAzqJThV3Y5/MVJ1WD+mDYQ96tmhfp+KY=
 =DBH/
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:

 - vduse driver ("vDPA Device in Userspace") supporting emulated virtio
   block devices

 - virtio-vsock support for end of record with SEQPACKET

 - vdpa: mac and mq support for ifcvf and mlx5

 - vdpa: management netlink for ifcvf

 - virtio-i2c, gpio dt bindings

 - misc fixes and cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits)
  Documentation: Add documentation for VDUSE
  vduse: Introduce VDUSE - vDPA Device in Userspace
  vduse: Implement an MMU-based software IOTLB
  vdpa: Support transferring virtual addressing during DMA mapping
  vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
  vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
  vhost-iotlb: Add an opaque pointer for vhost IOTLB
  vhost-vdpa: Handle the failure of vdpa_reset()
  vdpa: Add reset callback in vdpa_config_ops
  vdpa: Fix some coding style issues
  file: Export receive_fd() to modules
  eventfd: Export eventfd_wake_count to modules
  iova: Export alloc_iova_fast() and free_iova_fast()
  virtio-blk: remove unneeded "likely" statements
  virtio-balloon: Use virtio_find_vqs() helper
  vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
  vsock_test: update message bounds test for MSG_EOR
  af_vsock: rename variables in receive loop
  virtio/vsock: support MSG_EOR bit processing
  vhost/vsock: support MSG_EOR bit processing
  ...
2021-09-11 14:48:42 -07:00
Arnaldo Carvalho de Melo
17a99e521f tools headers UAPI: Update tools's copy of drm.h headers
Picking the changes from:

  17ce9c61c7 ("drm: document DRM_IOCTL_MODE_RMFB")

Doesn't result in any tooling changes:

  $ tools/perf/trace/beauty/drm_ioctl.sh  > before
  $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h
  $ tools/perf/trace/beauty/drm_ioctl.sh  > after
  $ diff -u before after

Silencing these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
  diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h

Cc: Simon Ser <contact@emersion.fr>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:24:10 -03:00
Arnaldo Carvalho de Melo
4dc24d7cf4 tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick the changes in:

  b65a948973 ("drm/i915/userptr: Probe existence of backing struct pages upon creation")
  ee242ca704 ("drm/i915/guc: Implement GuC priority management")
  81340cf3bd ("drm/i915/uapi: reject set_domain for discrete")
  7961c5b60f ("drm/i915: Add TTM offset argument to mmap.")
  aef7b67a79 ("drm/i915/uapi: convert drm_i915_gem_userptr to kernel doc")
  e7737b67ab ("drm/i915/uapi: reject caching ioctls for discrete")
  3aa8c57fe2 ("drm/i915/uapi: convert drm_i915_gem_set_domain to kernel doc")
  289f5a7200 ("drm/i915/uapi: convert drm_i915_gem_caching to kernel doc")
  4a766ae40e ("drm/i915: Drop the CONTEXT_CLONE API (v2)")
  6ff6d61dd2 ("drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP")
  fe4751c3d5 ("drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE")
  577729533c ("drm/i915: Document the Virtual Engine uAPI")
  c649432e86 ("drm/i915: Fix busy ioctl commentary")

That doesn't result in any changes to tooling as no new ioctl were
added (at least not perceived by tools/perf/trace/beauty/drm_ioctl.sh).

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:21:22 -03:00
Arnaldo Carvalho de Melo
2bae3e64ec tools headers UAPI: Sync linux/fs.h with the kernel sources
To pick the change in:

  7957d93bf3 ("block: add ioctl to read the disk sequence number")

It adds a new ioctl, but we are still not using that to generate tables
for 'perf trace', so no changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
  diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:14:53 -03:00
Arnaldo Carvalho de Melo
ee286c60c2 tools headers UAPI: Sync linux/in.h copy with the kernel sources
To pick the changes in:

  db243b7964 ("net/ipv4/ipv6: Replace one-element arraya with flexible-array members")
  2d3e5caf96 ("net/ipv4: Replace one-element array with flexible-array member")

That don't result in any change in tooling, the structs changed remains
with the same layout.

This addresses this build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:12:26 -03:00
Ian Rogers
0d1c50ac48 perf tools: Add an option to build without libbfd
Some distributions, like debian, don't link perf with libbfd. Add a
build flag to make this configuration buildable and testable.

This was inspired by:

  https://lore.kernel.org/linux-perf-users/20210910102307.2055484-1-tonyg@leastfixedpoint.com/T/#u

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: tony garnock-jones <tonyg@leastfixedpoint.com>
Link: http://lore.kernel.org/lkml/20210910225756.729087-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:06:52 -03:00
Namhyung Kim
4a86d41404 perf tools: Allow build-id with trailing zeros
Currently perf saves a build-id with size but old versions assumes the
size of 20.  In case the build-id is less than 20 (like for MD5), it'd
fill the rest with 0s.

I saw a problem when old version of perf record saved a binary in the
build-id cache and new version of perf reads the data.  The symbols
should be read from the build-id cache (as the path no longer has the
same binary) but it failed due to mismatch in the build-id.

  symsrc__init: build id mismatch for /home/namhyung/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf.

The build-id event in the data has 20 byte build-ids, but it saw a
different size (16) when it reads the build-id of the elf file in the
build-id cache.

  $ readelf -n ~/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf

  Displaying notes found in: .note.gnu.build-id
    Owner                Data size 	Description
    GNU                  0x00000010	NT_GNU_BUILD_ID (unique build ID bitstring)
      Build ID: 53e4c2f42a4c61a2d632d92a72afa08f

Let's fix this by allowing trailing zeros if the size is different.

Fixes: 39be8d0115 ("perf tools: Pass build_id object to dso__build_id_equal()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910224630.1084877-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:04:47 -03:00
Adrian Hunter
99fc5941b8 perf tools: Fix hybrid config terms list corruption
A config terms list was spliced twice, resulting in a never-ending loop
when the list was traversed. Fix by using list_splice_init() and copying
and freeing the lists as necessary.

This patch also depends on patch "perf tools: Factor out
copy_config_terms() and free_config_terms()"

Example on ADL:

 Before:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  # jobs
  [1]+  Running                    perf record -e "{intel_pt//,cycles/aux-sample-size=4096/pp}" uname
  # perf top -E 10
    PerfTop:    4071 irqs/sec  kernel: 6.9%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles],  (all, 24 CPUs)
  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    97.60%  perf           [.] __evsel__get_config_term
     0.25%  [kernel]       [k] kallsyms_expand_symbol.constprop.13
     0.24%  perf           [.] kallsyms__parse
     0.15%  [kernel]       [k] _raw_spin_lock
     0.14%  [kernel]       [k] number
     0.13%  [kernel]       [k] advance_transaction
     0.08%  [kernel]       [k] format_decode
     0.08%  perf           [.] map__process_kallsym_symbol
     0.08%  perf           [.] rb_insert_color
     0.08%  [kernel]       [k] vsnprintf
  exiting.
  # kill %1

After:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.060 MB perf.data ]
  # perf script | head
       perf-exec   604 [001]  1827.312293:                            psb:  psb offs: 0                       ffffffffb8415e87 pt_config_start+0x37 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a3bd event_sched_in.isra.133+0xfd ([kernel.kallsyms]) => ffffffffb856a9a0 perf_pmu_nop_void+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856b10e merge_sched_in+0x26e ([kernel.kallsyms]) => ffffffffb856a2c0 event_sched_in.isra.133+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a45d event_sched_in.isra.133+0x19d ([kernel.kallsyms]) => ffffffffb8568b80 perf_event_set_state.part.61+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8568b86 perf_event_set_state.part.61+0x6 ([kernel.kallsyms]) => ffffffffb85662a0 perf_event_update_time+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a35c event_sched_in.isra.133+0x9c ([kernel.kallsyms]) => ffffffffb8567610 perf_log_itrace_start+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a377 event_sched_in.isra.133+0xb7 ([kernel.kallsyms]) => ffffffffb8403b40 x86_pmu_add+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403b86 x86_pmu_add+0x46 ([kernel.kallsyms]) => ffffffffb8403940 collect_events+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403a7b collect_events+0x13b ([kernel.kallsyms]) => ffffffffb8402cd0 collect_event+0x0 ([kernel.kallsyms])

Fixes: 30def61f64 ("perf parse-events Create two hybrid cache events")
Fixes: 94da591b1c ("perf parse-events Create two hybrid raw events")
Fixes: 9cbfa2f64c ("perf parse-events Create two hybrid hardware events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https //lore.kernel.org/r/20210909125508.28693-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:00:34 -03:00
Adrian Hunter
a7d212fc6c perf tools: Factor out copy_config_terms() and free_config_terms()
Factor out copy_config_terms() and free_config_terms() so that they can
be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https //lore.kernel.org/r/20210909125508.28693-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:00:13 -03:00
Adrian Hunter
eb34363ae1 perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
Some fields are missing and text_poke is duplicated. Fix that up.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911120550.12203-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 15:58:36 -03:00
Ian Rogers
da4572d62d perf tools: Ignore Documentation dependency file
When building directly on the checked out repository the build process
produces a file that should be ignored, so add it to .gitignore.

Fixes: a81df63a5d ("perf doc: Fix doc.dep")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910232249.739661-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 15:36:16 -03:00
Linus Torvalds
dd4703876e - Add the tegra3 thermal sensor and fix the compilation testing on
tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
   (Dmitry Osipenko)
 
 - Fix the error code for the exynos when devm_get_clk() fails (Dan
   Carpenter)
 
 - Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)
 
 - Add support for hardware trip points for the rcar gen3 thermal
   driver and store TSC id as unsigned int (Niklas Söderlund)
 
 - Replace the deprecated CPU-hotplug functions get_online_cpus() and
   put_online_cpus (Sebastian Andrzej Siewior)
 
 - Add the thermal tools directory in the MAINTAINERS file (Daniel
   Lezcano)
 
 - Fix the Makefile and the cross compilation flags for the userspace
   'tmon' tool (Rolf Eike Beer)
 
 - Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
   Pawnikar)
 
 - Fix the stub thermal_cooling_device_register() function prototype
   which does not match the real function (Arnd Bergmann)
 
 - Make the thermal trip point optional in the DT bindings (Maxime
   Ripard)
 
 - Fix a typo in a comment in the core code (Geert Uytterhoeven)
 
 - Reduce the verbosity of the trace in the SoC thermal tegra driver
   (Dmitry Osipenko)
 
 - Add the support for the LMh (Limit Management hardware) driver on
   the QCom platforms (Thara Gopinath)
 
 - Allow processing of HWP interrupt by adding a weak function in the
   Intel driver (Srinivas Pandruvada)
 
 - Prevent an abort of the sensor probe is a channel is not used
   (Matthias Kaehlcke)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmE7yAcACgkQqDIjiipP
 6E8WKwgAmI+kdOURCz1CtCIv6FnQFx20suxZidfATPULBPZ8zcHqdjxJECXJpvDz
 y8T8dGAPRqXmMBJm2vF/qVraHaDJvscHXaflhsurhyEp0tiGptNeugBosso0jDEQ
 JFrQ6Rny/NCmNMOkWW4YlSfkik0zQ43xlHocy3UfOKfcshhYWGsPTbIXdA6wH/zh
 3tFW8j327tkyenrKZiaioJOSX88Oyy7Ft8l5k0HP+hPYkrvihWDJuEfEEmpfkdA7
 Q70Lacf2QtFP8mUfGKpQxURSgndjphDPU7tzP5NqqWvLuT94aTspvJx4ywVwK8Iv
 o+jNvR33yBIs5FXXB7B1ymkNE1h8uQ==
 =blIQ
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal updates from Daniel Lezcano:

 - Add the tegra3 thermal sensor and fix the compilation testing on
   tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
   (Dmitry Osipenko)

 - Fix the error code for the exynos when devm_get_clk() fails (Dan
   Carpenter)

 - Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)

 - Add support for hardware trip points for the rcar gen3 thermal driver
   and store TSC id as unsigned int (Niklas Söderlund)

 - Replace the deprecated CPU-hotplug functions get_online_cpus() and
   put_online_cpus (Sebastian Andrzej Siewior)

 - Add the thermal tools directory in the MAINTAINERS file (Daniel
   Lezcano)

 - Fix the Makefile and the cross compilation flags for the userspace
   'tmon' tool (Rolf Eike Beer)

 - Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
   Pawnikar)

 - Fix the stub thermal_cooling_device_register() function prototype
   which does not match the real function (Arnd Bergmann)

 - Make the thermal trip point optional in the DT bindings (Maxime
   Ripard)

 - Fix a typo in a comment in the core code (Geert Uytterhoeven)

 - Reduce the verbosity of the trace in the SoC thermal tegra driver
   (Dmitry Osipenko)

 - Add the support for the LMh (Limit Management hardware) driver on the
   QCom platforms (Thara Gopinath)

 - Allow processing of HWP interrupt by adding a weak function in the
   Intel driver (Srinivas Pandruvada)

 - Prevent an abort of the sensor probe is a channel is not used
   (Matthias Kaehlcke)

* tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used
  thermal/drivers/intel: Allow processing of HWP interrupt
  dt-bindings: thermal: Add dt binding for QCOM LMh
  thermal/drivers/qcom: Add support for LMh driver
  firmware: qcom_scm: Introduce SCM calls to access LMh
  thermal/drivers/tegra-soctherm: Silence message about clamped temperature
  thermal: Spelling s/scallbacks/callbacks/
  dt-bindings: thermal: Make trips node optional
  thermal/core: Fix thermal_cooling_device_register() prototype
  thermal/drivers/int340x: Use IMOK independently
  tools/thermal/tmon: Add cross compiling support
  thermal/tools/tmon: Improve the Makefile
  MAINTAINERS: Add missing userspace thermal tools to the thermal section
  thermal/drivers/intel_powerclamp: Replace deprecated CPU-hotplug functions.
  thermal/drivers/rcar_gen3_thermal: Store TSC id as unsigned int
  thermal/drivers/rcar_gen3_thermal: Add support for hardware trip points
  drivers/thermal/intel: Add TCC cooling support for AlderLake platform
  thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()
  thermal/drivers/tegra: Correct compile-testing of drivers
  thermal/drivers/tegra: Add driver for Tegra30 thermal sensor
2021-09-11 09:20:57 -07:00
Arnaldo Carvalho de Melo
218e7b775d perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
The btf__get_from_id() function was deprecated in favour of
btf__load_from_kernel_by_id(), but it is still avaiable, so use it to
provide a weak function btf__load_from_kernel_by_id() for older libbpf
when building perf with LIBBPF_DYNAMIC=1, i.e. using the system's libbpf
package.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:32:04 -03:00
Vadim Fedorenko
3384c7c764 selftests/bpf: Test new __sk_buff field hwtstamp
Analogous to the gso_segs selftests introduced in commit d9ff286a0f
("bpf: allow BPF programs access skb_shared_info->gso_segs field").

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210909220409.8804-3-vfedorenko@novek.ru
2021-09-10 23:20:13 +02:00
Vadim Fedorenko
f64c4acea5 bpf: Add hardware timestamp field to __sk_buff
BPF programs may want to know hardware timestamps if NIC supports
such timestamping.

Expose this data as hwtstamp field of __sk_buff the same way as
gso_segs/gso_size. This field could be accessed from the same
programs as tstamp field, but it's read-only field. Explicit test
to deny access to padding data is added to bpf_skb_is_valid_access.

Also update BPF_PROG_TEST_RUN tests of the feature.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210909220409.8804-2-vfedorenko@novek.ru
2021-09-10 23:19:58 +02:00
Arnaldo Carvalho de Melo
37ce9e4fc5 tools include UAPI: Update linux/mount.h copy
To pick the changes from:

  9ffb14ef61 ("move_mount: allow to add a mount into an existing group")

That ends up adding support for the new MOVE_MOUNT_SET_GROUP move_mount
flag.

  $ tools/perf/trace/beauty/move_mount_flags.sh > before
  $ cp include/uapi/linux/mount.h tools/include/uapi/linux/mount.h
  $ tools/perf/trace/beauty/move_mount_flags.sh > after
  $ diff -u before after
  --- before	2021-09-10 12:28:43.865279808 -0300
  +++ after	2021-09-10 12:28:50.183429184 -0300
  @@ -5,4 +5,5 @@
   	[ilog2(0x00000010) + 1] = "T_SYMLINKS",
   	[ilog2(0x00000020) + 1] = "T_AUTOMOUNTS",
   	[ilog2(0x00000040) + 1] = "T_EMPTY_PATH",
  +	[ilog2(0x00000100) + 1] = "SET_GROUP",
   };
  $

So now one can use it in --filter expressions for tracepoints.

This silences this perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h'
  diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo
155ed9f1b5 perf beauty: Cover more flags in the move_mount syscall argument beautifier
Previously the regext expected MOVE_MOUNT_[FT]_*, but in the next patch
a flag that doesn't match that expression will be added, MOVE_MOUNT_SET_GROUP

To make this more future proof, take advantage of the fact that the only
one we don't need to cover is MOVE_MOUNT__MASK and use MOVE_MOUNT_[^_]+_*_.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo
2c3ef25c4a tools headers UAPI: Sync linux/prctl.h with the kernel sources
To pick the changes in:

  433c38f40f ("arm64: mte: change ASYNC and SYNC TCF settings into bitfields")
  e893bb1bb4 ("x86, prctl: Hook L1D flushing in via prctl")

That don't result in any changes in tooling:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  $

Just silences this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
  diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Cc: Balbir Singh <sblbir@amazon.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo
f9f018e4d9 tools include UAPI: Sync sound/asound.h copy with the kernel sources
Picking the changes from:

  81be109349 ("ALSA: pcm: Add SNDRV_PCM_INFO_EXPLICIT_SYNC flag")

Which entails no changes in the tooling side as it doesn't introduce new
ioctls.

To silence this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo
dfa00459c6 tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:

  f95937ccf5 ("KVM: stats: Support linear and logarithmic histogram statistics")
  f0376edb1d ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
  ea7fc1bb1c ("KVM: arm64: Introduce MTE VM feature")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, so that will
pick the new KVM_STATS_TYPE_LINEAR_HIST and KVM_STATS_TYPE_LOG_HIST
defines.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Jing Zhang <jingzhangos@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Arnaldo Carvalho de Melo
03d6f3fe54 tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
To pick the changes in:

  61e5f69ef0 ("KVM: x86: implement KVM_GUESTDBG_BLOCKIRQ")

That just rebuilds kvm-stat.c on x86, no change in functionality.

This silences these perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Kim Phillips
291dcb98d7 perf report: Add support to print a textual representation of IBS raw sample data
Perf records IBS (Instruction Based Sampling) extra sample data when
'perf record --raw-samples' is used with an IBS-compatible event, on a
machine that supports IBS.  IBS support is indicated in
CPUID_Fn80000001_ECX bit #10.

Up until now, users have been able to see the extra sample data solely
in raw hex format using 'perf report --dump-raw-trace'.  From there,
users could decode the data either manually, or by using an external
script.

Enable the built-in 'perf report --dump-raw-trace' to do the decoding of
the extra sample data bits, so manual or external script decoding isn't
necessary.

Example usage:

  $ sudo perf record -c 10000001 -a --raw-samples -e ibs_fetch/rand_en=1/,ibs_op/cnt_ctl=1/ -C 0,1 taskset -c 0,1 7za b -mmt2 | perf report --dump-raw-trace

Stdout contains IBS Fetch samples, e.g.:

  ibs_fetch_ctl:	02170007ffffffff MaxCnt 1048560 Cnt 1048560 Lat     7 En 1 Val 1 Comp 1 IcMiss 0 PhyAddrValid 1 L1TlbPgSz 4KB L1TlbMiss 0 L2TlbMiss 0 RandEn 1 L2Miss 0
  IbsFetchLinAd:	000056016b2ead40
  IbsFetchPhysAd:	000000115cedfd40
  c_ibs_ext_ctl:	0000000000000000 IbsItlbRefillLat   0

..and IBS Op samples, e.g.:

  ibs_op_ctl:	0000009e009e8968 MaxCnt  10000000 En 1 Val 1 CntCtl 1=uOps CurCnt       158
  IbsOpRip:	000056016b2ea73d
  ibs_op_data:	00000000000b0002 CompToRetCtr     2 TagToRetCtr    11 BrnRet 0  RipInvalid 0 BrnFuse 0 Microcode 0
  ibs_op_data2:	0000000000000002 CacheHitSt 0=M-state RmtNode 0 DataSrc 2=Local node cache
  ibs_op_data3:	0000000000c60002 LdOp 0 StOp 1 DcL1TlbMiss 0 DcL2TlbMiss 0 DcL1TlbHit2M 0 DcL1TlbHit1G 0 DcL2TlbHit2M 0 DcMiss 0 DcMisAcc 0 DcWcMemAcc 0 DcUcMemAcc 0 DcLockedOp 0 DcMissNoMabAlloc 0 DcLinAddrValid 1 DcPhyAddrValid 1 DcL2TlbHit1G 0 L2Miss 0 SwPf 0 OpMemWidth  4 bytes OpDcMissOpenMemReqs  0 DcMissLat     0 TlbRefillLat     0
  IbsDCLinAd:	00007f133c319ce0
  IbsDCPhysAd:	0000000270485ce0

Committer notes:

Fixed up this:

  util/amd-sample-raw.c: In function ‘evlist__amd_sample_raw’:
  util/amd-sample-raw.c:125:42: error: ‘ bytes’ directive output may be truncated writing 6 bytes into a region of size between 4 and 7 [-Werror=format-truncation=]
    125 |                          " OpMemWidth %2d bytes", 1 << (reg.op_mem_width - 1));
        |                                          ^~~~~~
  In file included from /usr/include/stdio.h:866,
                   from util/amd-sample-raw.c:7:
  /usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 21 and 24 bytes into a destination of size 21
     71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     72 |                                    __glibc_objsize (__s), __fmt,
        |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     73 |                                    __va_arg_pack ());
        |                                    ~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

As that %2d won't limit the number of chars to 2, just state that 2 is
the minimal width:

  $ cat printf.c
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char *argv[])
  {
  	char bf[64];
  	int len = snprintf(bf, sizeof(bf), "%2d", atoi(argv[1]));

  	printf("strlen(%s): %u\n", bf, len);

  	return 0;
  }
  $ ./printf 1
  strlen( 1): 2
  $ ./printf 12
  strlen(12): 2
  $ ./printf 123
  strlen(123): 3
  $ ./printf 1234
  strlen(1234): 4
  $ ./printf 12345
  strlen(12345): 5
  $ ./printf 123456
  strlen(123456): 6
  $

And since we probably don't want that output to be truncated, just
assume the worst case, as the compiler did, and add a few more chars to
that buffer.

Also use sizeof(var) instead of sizeof(dup-of-wanted-format-string) to
avoid bugs when changing one but not the other.

I also had to change this:

  -#include <asm/amd-ibs.h>
  +#include "../../arch/x86/include/asm/amd-ibs.h"

To make it build on other architectures, just like intel-pt does.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-4-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:21 -03:00
Kim Phillips
dde994dd54 perf report: Add tools/arch/x86/include/asm/amd-ibs.h
This is a tools/-side patch for the patch that adds the original copy
of the IBS header file, in arch/x86/include/asm/.

We also add an entry to check-headers.sh, so future changes continue
to be copied.

Committer notes:

Had to add this

  -#include <asm/msr-index.h>
  +#include "msr-index.h"

And change the check-headers.sh entry to ignore this line when diffing
with the original kernel header.

This is needed so that we can use 'perf report' on a perf.data with IBS
data on a !x86 system, i.e. building on ARM fails without this as there
is no asm/msr-index.h there.

This was done on the next patch in this series and is done for things
like Intel PT and ARM CoreSight.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-3-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:13:20 -03:00
Magnus Karlsson
909f0e2820 selftests: xsk: Add tests for 2K frame size
Add tests for 2K frame size. Both a standard send and receive test and
one testing for invalid descriptors when the frame size is 2K.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-21-magnus.karlsson@gmail.com
2021-09-10 21:15:32 +02:00
Magnus Karlsson
0d1b7f3a00 selftests: xsk: Add tests for invalid xsk descriptors
Add tests for invalid xsk descriptors in the Tx ring. A number of
handcrafted nasty invalid descriptors are created and submitted to the
tx ring to check that they are validated correctly. Corner case valid
ones are also sent. The tests are run for both aligned and unaligned
mode.

pkt_stream_set() is introduced to be able to create a hand-crafted
packet stream where every single packet is specified in detail.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-20-magnus.karlsson@gmail.com
2021-09-10 21:15:32 +02:00
Magnus Karlsson
6ce67b5165 selftests: xsk: Eliminate test specific if-statement in test runner
Eliminate a test specific if-statement for the RX_FILL_EMTPY stats
test that is present in the test runner. We can do this as we now have
the use_addr_for_fill option. Just create and empty Rx packet stream
and indicated that the test runner should use the addresses in that to
populate the fill ring. As there are no packets in the stream, the
fill ring will be empty and we will get the error stats that we want
to test.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-19-magnus.karlsson@gmail.com
2021-09-10 21:15:32 +02:00
Magnus Karlsson
a4ba98dd0c selftests: xsk: Add test for unaligned mode
Add a test for unaligned mode in which packet buffers can be placed
anywhere within the umem. Some packets are made to straddle page
boundaries in order to check for correctness. On the Tx side, buffers
are now allocated according to the addresses found in the packet
stream. Thus, the placement of buffers can be controlled with the
boolean use_addr_for_fill in the packet stream.

One new pkt_stream interface is introduced: pkt_stream_replace_half()
that replaces every other packet in the default packet stream with the
specified new packet. The constant DEFAULT_OFFSET is also
introduced. It specifies at what offset from the start of a chunk a Tx
packet is placed by the sending thread. This is just to be able to
test that it is possible to send packets at an offset not equal to
zero.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-18-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
605091c510 selftests: xsk: Introduce replacing the default packet stream
Introduce the concept of a default packet stream that is the set of
packets sent by most tests. Then add the ability to replace it for a
test that would like to send or receive something else through the use
of the function pkt_stream_replace() and then restored with
pkt_stream_restore_default(). These are then used to convert the
STAT_TEST_TX_INVALID to use these new APIs.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-17-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
8abf6f725a selftests: xsk: Allow for invalid packets
Allow for invalid packets to be sent. These are verified by the Rx
thread not to be received. Or put in another way, if they are
received, the test will fail. This feature will be used to eliminate
an if statement for a stats test and will also be used by other tests
in later patches. The previous code could only deal with valid
packets.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-16-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
8ce7192b50 selftests: xsk: Eliminate MAX_SOCKS define
Remove the MAX_SOCKS define as it always will be one for the forseable
future and the code does not work for any other case anyway.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-15-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
e2d850d534 selftests: xsx: Make pthreads local scope
Make the pthread_t variables local scope instead of global. No reason
for them to be global.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-14-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
af6731d1e1 selftests: xsk: Make xdp_flags and bind_flags local
Make xdp_flags and bind_flags local instead of global by moving them
into the interface object. These flags decide if the socket should be
created in SKB mode or in DRV mode and therefore they are sticky and
will survive a test_spec_reset. Since every test is first run in SKB
mode then in DRV mode, this change only happens once. With this
change, the configured_mode global variable can also be
erradicated. The first test_spec_init() also becomes superfluous and
can be eliminated.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-13-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
85c6c95739 selftests: xsk: Specify number of sockets to create
Add the ability in the test specification to specify numbers of
sockets to create. The default is one socket. This is then used to
remove test specific if-statements around the bpf_res tests.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-12-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
55be575dc1 selftests: xsk: Replace second_step global variable
Replace the second_step global variable with a test specification
variable called total_steps that a test can be set to indicate how
many times the packet stream should be sent without reinitializing any
sockets. This eliminates test specific code in the test runner around
the bidirectional test.

The total_steps variable is 1 by default as most tests only need a
single round of packets.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-11-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
1856c24db0 selftests: xsk: Introduce rx_on and tx_on in ifobject
Introduce rx_on and tx_on in the ifobject so that we can describe if
the thread should create a socket with only tx, rx, or both. This
eliminates some test specific if statements from the code. We can also
eliminate the flow vector structure now as this is fully specified
by the tx_on and rx_on variables.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-10-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
119d4b02fe selftests: xsk: Add use_poll to ifobject
Add a use_poll option to the ifobject so that we do not need to use a
test specific if-statement in the test runner.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-9-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
53cb3cec2f selftests: xsx: Introduce test name in test spec
Introduce the test name in the test specification. This so we can set
the name locally in the test function and simplify the logic for
printing out test results.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-8-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
c160d7afba selftests: xsk: Make frame_size configurable
Make the frame size configurable instead of it being hard coded to a
default. This is a property of the umem and will make it possible to
implement tests for different umem frame sizes in a later patch.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-7-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
4bf8ee65ba selftests: xsk: Move rxqsize into xsk_socket_info
Move the global variable rxqsize to struct xsk_socket_info as it
describes the size of a ring in that struct. By default, it is set to
the size dictated by libbpf.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-6-magnus.karlsson@gmail.com
2021-09-10 21:15:31 +02:00
Magnus Karlsson
83f4ae2f26 selftests: xsk: Move num_frames and frame_headroom to xsk_umem_info
Move the global variables num_frames and frame_headroom to struct
xsk_umem_info. They describe properties of the umem so no reason for
them to be global.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-5-magnus.karlsson@gmail.com
2021-09-10 21:15:30 +02:00
Magnus Karlsson
ce74acaf01 selftests: xsk: Introduce test specifications
Introduce a test specification to be able to concisely describe a
test. Currently, a test is implemented by sprinkling test specific if
statements here and there, which is not scalable or easy to
understand. The end goal with this patch set is to come to the point
in which a test is completely specified by a test specification that
can easily be constructed in a single function so that new tests can
be added without too much trouble. This test specification will be run
by a test runner that has no idea about tests. It just executes the
what test specification states.

This patch introduces the test specification and, as a start, puts the
two interface objects in there, one containing the packet stream to be
sent and the other one the packet stream that is supposed to be
received for a test to pass. The global variables containing these can
then be eliminated. The following patches will convert each existing
test into a test specification and add the needed fields into it and
the functionality in the test runner that act on the test
specification. At the end, the test runner should contain no test
specific code and each test should be described in a single simple
function.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-4-magnus.karlsson@gmail.com
2021-09-10 21:15:30 +02:00
Magnus Karlsson
744eb5c882 selftests: xsk: Introduce type for thread function
Introduce a typedef of the thread function so this can be passed to
init_iface() in order to simplify that function.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-3-magnus.karlsson@gmail.com
2021-09-10 21:15:30 +02:00
Magnus Karlsson
ed7b74dc77 selftests: xsk: Simplify xsk and umem arrays
Simplify the xsk_info and umem_info allocation by allocating them
upfront in an array, instead of allocating an array of pointers to
future creations of these. Allocating them upfront also has the
advantage that configuration information can be stored in these
structures instead of relying on global variables. With the previous
structure, xsk_info and umem_info were created too late to be able to
store most configuration information. This will be used to eliminate
most global variables in later patches in this series.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210907071928.9750-2-magnus.karlsson@gmail.com
2021-09-10 21:15:30 +02:00
Kim Phillips
9fe8895a27 perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings
To be used by IBS raw data display: It needs the recorder's cpuid in
order to determine which errata workarounds to apply to the data, and
the pmu_mappings are needed in order to figure out which PMU sample
type is IBS Fetch vs. IBS Op.

When not available from perf.data, we assume local operation, and
retrieve cpuid and pmu mappings directly from the running system.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-2-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Remi Bernon
d2930ede52 perf symbol: Look for ImageBase in PE file to compute .text offset
Instead of using the file offset in the debug file.

This fixes a regression from 00a3423492 ("perf symbols: Make
dso__load_bfd_symbols() load PE files from debug cache only"), causing
incorrect symbol resolution when debug file have been stripped from
non-debug sections (in which case its .text section is empty and doesn't
have any file position).

The debug files could also be created with a different file alignment,
and have different file positions from the mmap-ed binary, or have the
section reordered.

This instead looks for the file image base, using the corresponding bfd
*ABS* symbols. As PE symbols only have 4 bytes, it also needs to keep
.text section vma high bits.

Signed-off-by: Remi Bernon <rbernon@codeweavers.com>
Fixes: 00a3423492 ("perf symbols: Make dso__load_bfd_symbols() load PE files from debug cache only")
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210909192637.4139125-1-rbernon@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Michael Petlan
51ae7fa62d perf scripts python: Fix passing arguments to stackcollapse report
The '--' prevented arguments from being passed to the script, such as:

  $ perf script report stackcollapse -i my_perf.data

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
LPU-Reference: 20200427142327.21172-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Michael Petlan
3e11300cdf perf test: Fix bpf test sample mismatch reporting
When the expected sample count in the condition changed, the message
needs to be changed too, otherwise we'll get:

  0x1001f2091d8: mmap mask[0]:
  BPF filter result incorrect, expected 56, got 56 samples

Fixes: 4b04e0decd ("perf test: Fix basic bpf filtering test")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https //lore.kernel.org/r/20210805160611.5542-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Arnaldo Carvalho de Melo
64f4535166 tools headers UAPI: Sync files changed by new process_mrelease syscall and the removal of some compat entry points
To pick the changes in these csets:

  59ab844eed ("compat: remove some compat entry points")
  dce4910396 ("mm: wire up syscall process_mrelease")
  b48c7236b1 ("exit/bdflush: Remove the deprecated bdflush system call")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible:

  # perf trace -v -e process_mrelease
  event qualifier tracepoint filter: (common_pid != 19351 && common_pid != 9112) && (id == 448)
  ^C#

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep process_mrelease tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
  448    common  process_mrelease            sys_process_mrelease
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
  diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:07 -03:00
Arnaldo Carvalho de Melo
bb91de4469 perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in:

Fixes: d32f89da7f ("net: add accept helper not installing fd")
Fixes: bc49d8169a ("mctp: Add MCTP base")

This automagically adds support for the AF_MCTP protocol domain:

  $ tools/perf/trace/beauty/socket.sh > before
  $ cp include/linux/socket.h tools/perf/trace/beauty/include/linux/socket.h
  $ tools/perf/trace/beauty/socket.sh > after
  $ diff -u before after
  --- before	2021-09-06 11:57:14.972747200 -0300
  +++ after	2021-09-06 11:57:30.541920222 -0300
  @@ -44,4 +44,5 @@
   	[42] = "QIPCRTR",
   	[43] = "SMC",
   	[44] = "XDP",
  +	[45] = "MCTP",
   };
  $

This will allow 'perf trace' to translate 45 into "MCTP" as is done with
the other domains:

  # perf trace -e socket*
     0.000 chronyd/1029 socket(family: INET, type: DGRAM|CLOEXEC|NONBLOCK, protocol: IP) = 4
  ^C#

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jeremy Kerr <jk@codeconstruct.com.au>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 10:42:49 -03:00
Quentin Monnet
0b46b75505 libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations
Introduce a macro LIBBPF_DEPRECATED_SINCE(major, minor, message) to prepare
the deprecation of two API functions. This macro marks functions as deprecated
when libbpf's version reaches the values passed as an argument.

As part of this change libbpf_version.h header is added with recorded major
(LIBBPF_MAJOR_VERSION) and minor (LIBBPF_MINOR_VERSION) libbpf version macros.
They are now part of libbpf public API and can be relied upon by user code.
libbpf_version.h is installed system-wide along other libbpf public headers.

Due to this new build-time auto-generated header, in-kernel applications
relying on libbpf (resolve_btfids, bpftool, bpf_preload) are updated to
include libbpf's output directory as part of a list of include search paths.
Better fix would be to use libbpf's make_install target to install public API
headers, but that clean up is left out as a future improvement. The build
changes were tested by building kernel (with KBUILD_OUTPUT and O= specified
explicitly), bpftool, libbpf, selftests/bpf, and resolve_btfids builds. No
problems were detected.

Note that because of the constraints of the C preprocessor we have to write
a few lines of macro magic for each version used to prepare deprecation (0.6
for now).

Also, use LIBBPF_DEPRECATED_SINCE() to schedule deprecation of
btf__get_from_id() and btf__load(), which are replaced by
btf__load_from_kernel_by_id() and btf__load_into_kernel(), respectively,
starting from future libbpf v0.6. This is part of libbpf 1.0 effort ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/278

Co-developed-by: Quentin Monnet <quentin@isovalent.com>
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210908213226.1871016-1-andrii@kernel.org
2021-09-09 23:28:05 +02:00
Linus Torvalds
43175623dd More tracing updates for 5.15:
- Add migrate-disable counter to tracing header
 
  - Fix error handling in event probes
 
  - Fix missed unlock in osnoise in error path
 
  - Fix merge issue with tools/bootconfig
 
  - Clean up bootconfig data when init memory is removed
 
  - Fix bootconfig to loop only on subkeys
 
  - Have kernel command lines override bootconfig options
 
  - Increase field counts for synthetic events
 
  - Have histograms dynamic allocate event elements to save space
 
  - Fixes in testing and documentation
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYToFZBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtg5AP44U3Dn1m1lQo3y1DJ9kUP3HsAsDofS
 Cv7ZM9tLV2p4MQEA9KJc3/B/5BZEK1kso3uLeLT+WxJOC4YStXY19WwmjAI=
 =Wuo+
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:

 - Add migrate-disable counter to tracing header

 - Fix error handling in event probes

 - Fix missed unlock in osnoise in error path

 - Fix merge issue with tools/bootconfig

 - Clean up bootconfig data when init memory is removed

 - Fix bootconfig to loop only on subkeys

 - Have kernel command lines override bootconfig options

 - Increase field counts for synthetic events

 - Have histograms dynamic allocate event elements to save space

 - Fixes in testing and documentation

* tag 'trace-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/boot: Fix to loop on only subkeys
  selftests/ftrace: Exclude "(fault)" in testing add/remove eprobe events
  tracing: Dynamically allocate the per-elt hist_elt_data array
  tracing: synth events: increase max fields count
  tools/bootconfig: Show whole test command for each test case
  bootconfig: Fix missing return check of xbc_node_compose_key function
  tools/bootconfig: Fix tracing_on option checking in ftrace2bconf.sh
  docs: bootconfig: Add how to use bootconfig for kernel parameters
  init/bootconfig: Reorder init parameter from bootconfig and cmdline
  init: bootconfig: Remove all bootconfig data when the init memory is removed
  tracing/osnoise: Fix missed cpus_read_unlock() in start_per_cpu_kthreads()
  tracing: Fix some alloc_event_probe() error handling bugs
  tracing: Add migrate-disabled counter to tracing output.
2021-09-09 13:11:15 -07:00
Linus Torvalds
2d338201d5 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "147 patches, based on 7d2a07b769.

  Subsystems affected by this patch series: mm (memory-hotplug, rmap,
  ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
  alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
  checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
  selftests, ipc, and scripts"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
  scripts: check_extable: fix typo in user error message
  mm/workingset: correct kernel-doc notations
  ipc: replace costly bailout check in sysvipc_find_ipc()
  selftests/memfd: remove unused variable
  Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
  configs: remove the obsolete CONFIG_INPUT_POLLDEV
  prctl: allow to setup brk for et_dyn executables
  pid: cleanup the stale comment mentioning pidmap_init().
  kernel/fork.c: unexport get_{mm,task}_exe_file
  coredump: fix memleak in dump_vma_snapshot()
  fs/coredump.c: log if a core dump is aborted due to changed file permissions
  nilfs2: use refcount_dec_and_lock() to fix potential UAF
  nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
  nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
  nilfs2: fix NULL pointer in nilfs_##name##_attr_release
  nilfs2: fix memory leak in nilfs_sysfs_create_device_group
  trap: cleanup trap_init()
  init: move usermodehelper_enable() to populate_rootfs()
  ...
2021-09-08 12:55:35 -07:00
Steven Rostedt (VMware)
04178ea130 selftests/ftrace: Exclude "(fault)" in testing add/remove eprobe events
The original test for adding and removing eprobes used synthetic events
and retrieved the filename from the open system call at the end of the
system call. This would allow it to always be loaded into the page tables
when accessed.

Masami suggested that the test was too complex for just testing add and
remove, so it was changed to test just adding and removing an event probe
on top of the start of the open system call event. Now it is possible that
the filename will not be loaded into memory at the time the eprobe is
triggered, and will result in "(fault)" being displayed in the event. This
causes the test to fail.

Account for "(fault)" also being one of the values of the filename field
of the event probe.

Link: https://lkml.kernel.org/r/20210907230429.5783d519@rorschach.local.home

Fixes: 079db70794 ("selftests/ftrace: Add test case to test adding and removing of event probe")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-08 15:29:16 -04:00