Commit Graph

31555 Commits

Author SHA1 Message Date
Len Brown
7535249d10 tools/power turbostat: update turbostat.8
Update turbostat.8 to reflect new uncore frequency output (UncMHz)
Also, refresh examples.

Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28 14:23:25 -04:00
Len Brown
a5c6d65d06 tools/power turbostat: Show uncore frequency
When CONFIG_INTEL_UNCORE_FREQ_CONTROL is effective,
(Linux 5.9 and later), print the current (and default)
min and max uncore frequency limits.

When that driver provides the current uncore frequency
(Linux 5.18 and later), print a UncMHz column
reflecting the current uncore frequency.

Note that UncMHz is an instantaneous sample, not an average.

eg.

$ sudo ./turbostat -S --show frequency
...
Uncore Frequency pkg0 die0: 800 - 3900 MHz (800 - 3900 MHz)
...
Avg_MHz	Busy%	Bzy_MHz	TSC_MHz	UncMHz
28	0.70	4049	3095	3900

Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28 14:23:25 -04:00
Colin Ian King
5e5fd36c58 tools/power turbostat: Fix file pointer leak
Currently if a fscanf fails then an early return leaks an open
file pointer. Fix this by fclosing the file before the return.
Detected using static analysis with cppcheck:

tools/power/x86/turbostat/turbostat.c:2039:3: error: Resource leak: fp [resourceLeak]

Fixes: eae97e053f ("tools/power turbostat: Support thermal throttle count print")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28 14:23:25 -04:00
Colin Ian King
e13da9a1db tools/power turbostat: replace strncmp with single character compare
Using strncmp for a single character comparison is overly complicated,
just use a simpler single character comparison instead. Also stops
static analyzers (such as cppcheck) from complaining about strncmp on
non-null terminated strings.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28 14:23:25 -04:00
Chen Yu
033312336d tools/power turbostat: print the kernel boot commandline
It would be handy to have cmdline in turbostat output. For example,
according to the turbostat output, there are no C-states requested.
In this case the user is very curious if something like
intel_idle.max_cstate=0 was used, or may be idle=none too. It is
also curious whether things like intel_pstate=nohwp were used.

Print the boot command line accordingly:
turbostat version 21.05.04 - Len Brown <lenb@kernel.org>
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.16.0+ root=UUID=
 b42359ed-1e05-42eb-8757-6bf2a1c19070 ro quiet splash vt.handoff=7

Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28 14:23:25 -04:00
Zhang Rui
fb5e29df8d tools/power turbostat: Introduce support for RaptorLake
RaptorLake is compatible with AlderLake.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-07-28 14:23:25 -04:00
Xin Gao
c55ae10230 tools/power/x86/intel-speed-select: Remove unneeded semicolon
Remove an unneeded semicolon.

Signed-off-by: Xin Gao <gaoxin@cdjrlc.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-07-28 19:41:26 +02:00
Dan Carpenter
d9f74d98bb tools/power/x86/intel-speed-select: Fix off by one check
Change > MAX_DIE_PER_PACKAGE to >= MAX_DIE_PER_PACKAGE to prevent
accessing one element beyond the end of the array.

Fixes: 7fd786dfbd ("tools/power/x86/intel-speed-select: OOB daemon mode")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-07-28 19:41:07 +02:00
Sean Christopherson
ce30d8b976 KVM: selftests: Verify VMX MSRs can be restored to KVM-supported values
Verify that KVM allows toggling VMX MSR bits to be "more" restrictive,
and also allows restoring each MSR to KVM's original, less restrictive
value.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220607213604.3346000-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-28 13:25:24 -04:00
Sean Christopherson
cfe12e64b0 KVM: selftests: Add an option to run vCPUs while disabling dirty logging
Add a command line option to dirty_log_perf_test to run vCPUs for the
entire duration of disabling dirty logging.  By default, the test stops
running runs vCPUs before disabling dirty logging, which is faster but
less interesting as it doesn't stress KVM's handling of contention
between page faults and the zapping of collapsible SPTEs.  Enabling the
flag also lets the user verify that KVM is indeed rebuilding zapped SPTEs
as huge pages by checking KVM's pages_{1g,2m,4k} stats.  Without vCPUs to
fault in the zapped SPTEs, the stats will show that KVM is zapping pages,
but they never show whether or not KVM actually allows huge pages to be
recreated.

Note!  Enabling the flag can _significantly_ increase runtime, especially
if the thread that's disabling dirty logging doesn't have a dedicated
pCPU, e.g. if all pCPUs are used to run vCPUs.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220715232107.3775620-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-28 13:22:24 -04:00
Slark Xiao
7a12f91885 thermal/tools/tmon: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Link: https://lore.kernel.org/r/20220722104047.83312-1-slark_xiao@163.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28 17:29:52 +02:00
Markus Mayer
0cf51bfe99 thermal/tools/tmon: Include pthread and time headers in tmon.h
Include sys/time.h and pthread.h in tmon.h, so that types
"pthread_mutex_t" and "struct timeval tv" are known when tmon.h
references them.

Without these headers, compiling tmon against musl-libc will fail with
these errors:

In file included from sysfs.c:31:0:
tmon.h:47:8: error: unknown type name 'pthread_mutex_t'
 extern pthread_mutex_t input_lock;
        ^~~~~~~~~~~~~~~
make[3]: *** [<builtin>: sysfs.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from tui.c:31:0:
tmon.h:54:17: error: field 'tv' has incomplete type
  struct timeval tv;
                 ^~
make[3]: *** [<builtin>: tui.o] Error 1
make[2]: *** [Makefile:83: tmon] Error 2

Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: Alejandro González <alejandro.gonzalez.correo@gmail.com>
Tested-by: Alejandro González <alejandro.gonzalez.correo@gmail.com>
Fixes: 94f69966fa ("tools/thermal: Introduce tmon, a tool for thermal  subsystem")
Link: https://lore.kernel.org/r/20220718031040.44714-1-f.fainelli@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28 17:29:51 +02:00
Rashmica Gupta
cd1e64935f selftests/powerpc: Fix matrix multiply assist test
The ISA states: "when ACC[i] contains defined data, the contents of VSRs
4×i to 4×i+3 are undefined until either a VSX Move From ACC instruction
is used to copy the contents of ACC[i] to VSRs 4×i to 4×i+3 or some other
instruction directly writes to one of these VSRs." We aren't doing this.

This test only works on Power10 because the hardware implementation
happens to map ACC0 to VSRs 0-3, but will fail on any other implementation
that doesn't do this. So add xxmfacc between writing to the accumulator
and accessing the VSRs.

Fixes: 3527e1ab9a ("selftests/powerpc: Add matrix multiply assist (MMA) test")
Signed-off-by: Rashmica Gupta <rashmica@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220617043935.428083-1-rashmica@linux.ibm.com
2022-07-28 16:22:15 +10:00
YiFei Zhu
3ce4b78f73 selftests/seccomp: Fix compile warning when CC=clang
clang has -Wconstant-conversion by default, and the constant 0xAAAAAAAAA
(9 As) being converted to an int, which is generally 32 bits, results
in the compile warning:

  clang -Wl,-no-as-needed -Wall -isystem ../../../../usr/include/  -lpthread  seccomp_bpf.c -lcap -o seccomp_bpf
  seccomp_bpf.c:812:67: warning: implicit conversion from 'long' to 'int' changes value from 45812984490 to -1431655766 [-Wconstant-conversion]
          int kill = kill_how == KILL_PROCESS ? SECCOMP_RET_KILL_PROCESS : 0xAAAAAAAAA;
              ~~~~                                                         ^~~~~~~~~~~
  1 warning generated.

-1431655766 is the expected truncation, 0xAAAAAAAA (8 As), so use
this directly in the code to avoid the warning.

Fixes: 3932fcecd9 ("selftests/seccomp: Add test for unknown SECCOMP_RET kill behavior")
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220526223407.1686936-1-zhuyifei@google.com
2022-07-27 12:12:16 -07:00
Linus Torvalds
6e7765cb47 Merge tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fixes from Arnd Bergmann:
 "Two more bug fixes for asm-generic, one addressing an incorrect
  Kconfig symbol reference and another one fixing a build failure for
  the perf tool on mips and possibly others"

* tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: remove a broken and needless ifdef conditional
  tools: Fixed MIPS builds due to struct flock re-definition
2022-07-27 09:50:18 -07:00
Daniel Müller
40b09653b1 selftests/bpf: Adjust vmtest.sh to use local kernel configuration
So far the vmtest.sh script, which can be used as a convenient way to
run bpf selftests, has obtained the kernel config safe to use for
testing from the libbpf/libbpf GitHub repository [0].

Given that we now have included this configuration into this very
repository, we can just consume it from here as well, eliminating the
necessity of remote accesses.

With this change we adjust the logic in the script to use the
configuration from below tools/testing/selftests/bpf/configs/ instead
of pulling it over the network.

  [0] https://github.com/libbpf/libbpf

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-4-deso@posteo.net
2022-07-27 17:03:14 +02:00
Daniel Müller
cbd620fc18 selftests/bpf: Copy over libbpf configs
This change integrates libbpf maintained configurations and black/white
lists [0] into the repository, co-located with the BPF selftests themselves.
We minimize the kernel configurations to keep future updates as small as
possible [1].

Furthermore, we make both kernel configurations build on top of the existing
configuration tools/testing/selftests/bpf/config (to be concatenated before
build). Lastly, we replaced the terms blacklist & whitelist with denylist and
allowlist, respectively.

  [0] 20f0330235/travis-ci/vmtest/configs
  [1] https://lore.kernel.org/bpf/20220712212124.3180314-1-deso@posteo.net/T/#m30a53648352ed494e556ac003042a9ad0a8f98c6

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-3-deso@posteo.net
2022-07-27 17:02:38 +02:00
Daniel Müller
aee993bbd0 selftests/bpf: Sort configuration
This change makes sure to sort the existing minimal kernel configuration
containing options required for running BPF selftests alphabetically.
Doing so will make it easier to diff it against other configurations,
which in turn helps with maintaining disjunct config files that build on
top of each other. It also helped identify the CONFIG_IPV6_GRE being set
twice and removes one of the occurrences.

Lastly, we change NET_CLS_BPF from 'm' to 'y'. Having this option as 'm'
will cause failures of the btf_skc_cls_ingress selftest.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-2-deso@posteo.net
2022-07-27 17:02:29 +02:00
Ian Rogers
9a24180567 perf bpf: Remove undefined behavior from bpf_perf_object__next()
bpf_perf_object__next() folded the last element in the list test with the
empty list test. However, this meant that offsets were computed against
null and that a struct list_head was compared against a 'struct
bpf_perf_object'.

Working around this with clang's undefined behavior sanitizer required
-fno-sanitize=null and -fno-sanitize=object-size.

Remove the undefined behavior by using the regular Linux list APIs and
handling the starting case separately from the end testing case.

Looking at uses like bpf_perf_object__for_each(), as the constant NULL
or non-NULL argument can be constant propagated, the code is no less
efficient.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:19:39 -03:00
Leo Yan
882528d2e7 perf symbol: Skip symbols if SHF_ALLOC flag is not set
Some symbols are observed with the 'st_value' field zeroed.  E.g.
libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which
resides in the '.gnu.warning.getwd' section.

Unlike normal sections, such kind of sections are used for linker
warning when a file calls deprecated functions, but they are not part of
memory images, the symbols in these sections should be dropped.

This patch checks the section attribute SHF_ALLOC bit, if the bit is not
set, it skips symbols to avoid spurious ones.

Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chang Rui <changruinj@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Leo Yan
2d86612aac perf symbol: Correct address for bss symbols
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols.  This is a common
issue on both x86 and Arm64 platforms.

Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:

  ...

  Disassembly of section .bss:

  0000000000004040 <completed.0>:
  	...

  0000000000004080 <buf1>:
  	...

  00000000000040c0 <buf2>:
  	...

  0000000000004100 <thread>:
  	...

First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.

  # ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf2 0x30a8-0x30e8
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf1 0x3068-0x30a8
    ...

The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section.  The perf tool uses below formula to convert
a symbol's memory address to a file address:

  file_address = st_value - sh_addr + sh_offset
                    ^
                    ` Memory address

We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.

The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.

Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info.  A better choice for converting
memory address to file address is using the formula:

  file_address = st_value - p_vaddr + p_offset

This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.

After applying the change:

  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf2 0x30c0-0x3100
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf1 0x3080-0x30c0
    ...

Fixes: f17e04afaf ("perf report: Fix ELF symbol parsing")
Reported-by: Chang Rui <changruinj@gmail.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Leo Yan
b226521923 perf scripts python: Let script to be python2 compliant
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.

To fix the building failure, this patch changes from the python f-string
format to traditional string format.

Fixes: 12fdd6c009 ("perf scripts python: Support Arm CoreSight trace data disassembly")
Reported-by: Akemi Yagi <toracat@elrepo.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: ElRepo <contact@elrepo.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Arnaldo Carvalho de Melo
553de6e115 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  28a99e95f5 ("x86/amd: Use IBPB for firmware calls")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Arnaldo Carvalho de Melo
40d02efad9 Merge remote-tracking branch 'torvalds/master' into perf/core
To get upstream fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:08:48 -03:00
Slark Xiao
060468f0dd selftests: net: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Link: https://lore.kernel.org/r/20220725020124.5760-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 20:26:58 -07:00
Adam Sindelar
187e7c4144 selftests/vm: fix va_128TBswitch.sh permissions
Restore the +x bit to va_128TBswitch.sh, which got dropped from the
previous patch, somehow.

Link: https://lkml.kernel.org/r/20220708090646.34927-1-adam@wowsignal.io
Fixes: 1afd01d43efc3 ("selftests/vm: Only run 128TBswitch with 5-level paging")

Signed-off-by: Adam Sindelar <adam@wowsignal.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-26 16:59:50 -07:00
Jiri Pirko
949c84f05e selftests: mlxsw: Check line card info on activated line card
Once line card is activated, check the FW version and PSID are exposed.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:44 -07:00
Jiri Pirko
e96c8da380 selftests: mlxsw: Check line card info on provisioned line card
Once line card is provisioned, check if HW revision and INI version
are exposed on associated nested auxiliary device.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:44 -07:00
Ian Rogers
a061a8ad3f perf test: Avoid sysfs state affecting fake events
Icelake has a slots event, on my Skylakex I have CPU events in sysfs of
topdown-slots-issued and topdown-total-slots.

Legacy event parsing would try to use '-' to separate parts of an event
and so perf_pmu__parse_init sets 'slots' to be a
PMU_EVENT_SYMBOL_SUFFIX2.

As such parsing the slots event for a fake PMU fails as a
PMU_EVENT_SYMBOL_SUFFIX2 isn't made into the PE_PMU_EVENT_FAKE token.

Resolve this issue by test initializing the PMU parsing state before
every parse. This must be done every parse as the state is removes after
each parse_events.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220725223633.2301737-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:55 -03:00
Zhengjun Xing
bedd17381b perf vendor events intel: Update event list for haswellx
Update JSON core/uncore events for haswellx to perf.

Based on HSX JSON list v24:

https://download.01.org/perfmon/HSX

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:55 -03:00
Zhengjun Xing
28738de918 perf vendor events intel: Update event list for broadwellx
Update JSON core/uncore events for broadwellx to perf.

Based on BDX JSON list v19:

https://download.01.org/perfmon/BDX

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:55 -03:00
Zhengjun Xing
b43a5442d8 perf vendor events intel: Update event list for Snowridgex
More uncore events are added in the converter tool:

https://github.com/intel/event-converter-for-linux-perf

Keep both alias and the original name for the events, in case someone
already used the alias in their script.

Generate the perf events based on Snowridgex(SNR) event list v1.20:

https://download.01.org/perfmon/SNR/

Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Zhengjun Xing
9146af4413 perf vendor events intel: Rename tremontx to snowridgex
Tremontx was an old name for Snowridgex, so rename Tremontx to Snowridgex.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Zhengjun Xing
6a92916de5 perf vendor events intel: Update event list for Sapphirerapids
Update JSON event list for Sapphirerapids to perf.

Based on JSON list v1.02:

https://download.01.org/perfmon/SPR/

Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Zhengjun Xing
5fa2481cdf perf vendor events intel: Update event list for Alderlake
Update JSON event list for Alderlake to perf.

It is a hybrid event list for both Atom and Core.

Based on JSON list v1.11:

https://download.01.org/perfmon/ADL/

Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Colin Ian King
8147f79ea5 perf inject: Fix spelling mistake "theads" -> "threads"
There is a spelling mistake in a pr_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20220721124528.20997-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
acfb65fe1d perf kwork: Add workqueue trace BPF support
Implements workqueue trace bpf function.

Test cases:

  # perf kwork -k workqueue lat -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)addrconf_verify_work        | 0002 |      5.856 ms |         1 |      5.856 ms |     111994.634313 s |     111994.640169 s |
    (w)vmstat_update               | 0001 |      1.247 ms |         1 |      1.247 ms |     111996.462651 s |     111996.463899 s |
    (w)neigh_periodic_work         | 0001 |      1.183 ms |         1 |      1.183 ms |     111996.462789 s |     111996.463973 s |
    (w)neigh_managed_work          | 0001 |      0.989 ms |         2 |      1.635 ms |     111996.462820 s |     111996.464455 s |
    (w)wb_workfn                   | 0000 |      0.667 ms |         1 |      0.667 ms |     111996.384273 s |     111996.384940 s |
    (w)bpf_prog_free_deferred      | 0001 |      0.495 ms |         1 |      0.495 ms |     111986.314201 s |     111986.314696 s |
    (w)mix_interrupt_randomness    | 0002 |      0.421 ms |         6 |      0.749 ms |     111995.927750 s |     111995.928499 s |
    (w)vmstat_shepherd             | 0000 |      0.374 ms |         2 |      0.385 ms |     111991.265242 s |     111991.265627 s |
    (w)e1000_watchdog              | 0002 |      0.356 ms |         5 |      0.390 ms |     111994.528380 s |     111994.528770 s |
    (w)vmstat_update               | 0000 |      0.231 ms |         2 |      0.365 ms |     111996.384407 s |     111996.384772 s |
    (w)flush_to_ldisc              | 0006 |      0.165 ms |         1 |      0.165 ms |     111995.930606 s |     111995.930771 s |
    (w)flush_to_ldisc              | 0000 |      0.094 ms |         2 |      0.095 ms |     111996.460453 s |     111996.460548 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k workqueue rep -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)e1000_watchdog              | 0002 |      0.627 ms |         2 |      0.324 ms |     112002.720665 s |     112002.720989 s |
    (w)flush_to_ldisc              | 0007 |      0.598 ms |         2 |      0.534 ms |     112000.875226 s |     112000.875761 s |
    (w)wq_barrier_func             | 0007 |      0.492 ms |         1 |      0.492 ms |     112000.876981 s |     112000.877473 s |
    (w)flush_to_ldisc              | 0007 |      0.281 ms |         1 |      0.281 ms |     112005.826882 s |     112005.827163 s |
    (w)mix_interrupt_randomness    | 0002 |      0.229 ms |         3 |      0.102 ms |     112005.825671 s |     112005.825774 s |
    (w)vmstat_shepherd             | 0000 |      0.202 ms |         1 |      0.202 ms |     112001.504511 s |     112001.504713 s |
    (w)bpf_prog_free_deferred      | 0001 |      0.181 ms |         1 |      0.181 ms |     112000.883251 s |     112000.883432 s |
    (w)wb_workfn                   | 0007 |      0.130 ms |         1 |      0.130 ms |     112001.505195 s |     112001.505325 s |
    (w)vmstat_update               | 0000 |      0.053 ms |         1 |      0.053 ms |     112001.504763 s |     112001.504815 s |
   --------------------------------------------------------------------------------------------------------------------------------

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-18-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
5a81927a40 perf kwork: Add softirq trace BPF support
Implements softirq trace bpf function.

Test cases:
Trace softirq latency without filter:

  # perf kwork -k softirq lat -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0005 |      0.281 ms |         3 |      0.338 ms |     111295.752222 s |     111295.752560 s |
    (s)RCU:9                       | 0002 |      0.262 ms |        24 |      1.400 ms |     111301.335986 s |     111301.337386 s |
    (s)SCHED:7                     | 0005 |      0.177 ms |        14 |      0.212 ms |     111295.752270 s |     111295.752481 s |
    (s)RCU:9                       | 0007 |      0.161 ms |        47 |      2.022 ms |     111295.402159 s |     111295.404181 s |
    (s)NET_RX:3                    | 0003 |      0.149 ms |        12 |      1.261 ms |     111301.192964 s |     111301.194225 s |
    (s)TIMER:1                     | 0001 |      0.105 ms |         9 |      0.198 ms |     111301.180191 s |     111301.180389 s |
    ... <SNIP> ...
    (s)NET_RX:3                    | 0002 |      0.098 ms |         6 |      0.124 ms |     111295.403760 s |     111295.403884 s |
    (s)SCHED:7                     | 0001 |      0.093 ms |        19 |      0.242 ms |     111301.180256 s |     111301.180498 s |
    (s)SCHED:7                     | 0007 |      0.078 ms |        15 |      0.188 ms |     111300.064226 s |     111300.064415 s |
    (s)SCHED:7                     | 0004 |      0.077 ms |        11 |      0.213 ms |     111301.361759 s |     111301.361973 s |
    (s)SCHED:7                     | 0000 |      0.063 ms |        33 |      0.805 ms |     111295.401811 s |     111295.402616 s |
    (s)SCHED:7                     | 0003 |      0.063 ms |        14 |      0.085 ms |     111301.192255 s |     111301.192340 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace softirq latency with cpu filter:

  # perf kwork -k softirq lat -b -C 1
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0001 |      0.178 ms |         5 |      0.572 ms |     111435.534135 s |     111435.534707 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace softirq latency with name filter:

  # perf kwork -k softirq lat -b -n SCHED
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)SCHED:7                     | 0001 |      0.295 ms |        15 |      2.183 ms |     111452.534950 s |     111452.537133 s |
    (s)SCHED:7                     | 0002 |      0.215 ms |        10 |      0.315 ms |     111460.000238 s |     111460.000553 s |
    (s)SCHED:7                     | 0005 |      0.190 ms |        29 |      0.338 ms |     111457.032538 s |     111457.032876 s |
    (s)SCHED:7                     | 0003 |      0.097 ms |        10 |      0.319 ms |     111452.434351 s |     111452.434670 s |
    (s)SCHED:7                     | 0006 |      0.089 ms |         1 |      0.089 ms |     111450.737450 s |     111450.737539 s |
    (s)SCHED:7                     | 0007 |      0.085 ms |        17 |      0.169 ms |     111452.471333 s |     111452.471502 s |
    (s)SCHED:7                     | 0004 |      0.071 ms |        15 |      0.221 ms |     111452.535252 s |     111452.535473 s |
    (s)SCHED:7                     | 0000 |      0.044 ms |        32 |      0.130 ms |     111460.001982 s |     111460.002112 s |
   --------------------------------------------------------------------------------------------------------------------------------

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-17-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
420298aefe perf kwork: Add IRQ trace BPF support
Implements irq trace bpf function.

Test cases:
Trace irq without filter:

  # perf kwork -k irq rep -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |     31.026 ms |       285 |      1.493 ms |     110326.049963 s |     110326.051456 s |
    eth0:10                        | 0002 |      7.875 ms |        96 |      1.429 ms |     110313.916835 s |     110313.918264 s |
    ata_piix:14                    | 0002 |      2.510 ms |        28 |      0.396 ms |     110331.367987 s |     110331.368383 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace irq with cpu filter:

  # perf kwork -k irq rep -b -C 0
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |     34.288 ms |       282 |      2.061 ms |     110358.078968 s |     110358.081029 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace irq with name filter:

  # perf kwork -k irq rep -b -n eth0
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    eth0:10                        | 0002 |      2.184 ms |        21 |      0.572 ms |     110386.541699 s |     110386.542271 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace irq with summary:

  # perf kwork -k irq rep -b -S
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |     42.923 ms |       285 |      1.181 ms |     110418.128867 s |     110418.130049 s |
    eth0:10                        | 0002 |      2.085 ms |        20 |      0.668 ms |     110416.002935 s |     110416.003603 s |
    ata_piix:14                    | 0002 |      0.970 ms |         4 |      0.656 ms |     110424.034482 s |     110424.035138 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :       309
    Total runtime   (msec) :    45.977 (0.003% load average)
    Total time span (msec) : 17017.655
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork -k irq rep -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    nvme0q20:145                   | 0019 |      0.570 ms |        28 |      0.064 ms |      26966.635102 s |      26966.635167 s |
    amdgpu:162                     | 0002 |      0.568 ms |        29 |      0.068 ms |      26966.644346 s |      26966.644414 s |
    nvme0q4:129                    | 0003 |      0.565 ms |        31 |      0.037 ms |      26966.614830 s |      26966.614866 s |
    nvme0q16:141                   | 0015 |      0.205 ms |        66 |      0.012 ms |      26967.145161 s |      26967.145174 s |
    nvme0q29:154                   | 0028 |      0.154 ms |        44 |      0.014 ms |      26967.078970 s |      26967.078984 s |
    nvme0q10:135                   | 0009 |      0.134 ms |        43 |      0.011 ms |      26967.132093 s |      26967.132104 s |
    nvme0q2:127                    | 0001 |      0.132 ms |        26 |      0.011 ms |      26966.883584 s |      26966.883595 s |
    nvme0q25:150                   | 0024 |      0.127 ms |        32 |      0.014 ms |      26966.631419 s |      26966.631433 s |
    nvme0q14:139                   | 0013 |      0.110 ms |        21 |      0.017 ms |      26966.760843 s |      26966.760861 s |
    nvme0q30:155                   | 0029 |      0.102 ms |        30 |      0.022 ms |      26966.677171 s |      26966.677193 s |
    nvme0q13:138                   | 0012 |      0.088 ms |        20 |      0.015 ms |      26966.738733 s |      26966.738748 s |
    nvme0q6:131                    | 0005 |      0.087 ms |        13 |      0.020 ms |      26966.648445 s |      26966.648465 s |
    nvme0q28:153                   | 0027 |      0.066 ms |        12 |      0.015 ms |      26966.771431 s |      26966.771447 s |
    nvme0q26:151                   | 0025 |      0.060 ms |        13 |      0.012 ms |      26966.704266 s |      26966.704278 s |
    nvme0q21:146                   | 0020 |      0.054 ms |        20 |      0.011 ms |      26967.322082 s |      26967.322094 s |
    nvme0q1:126                    | 0000 |      0.046 ms |        11 |      0.013 ms |      26966.859754 s |      26966.859767 s |
    nvme0q17:142                   | 0016 |      0.046 ms |        10 |      0.011 ms |      26967.114513 s |      26967.114524 s |
    xhci_hcd:74                    | 0015 |      0.041 ms |         3 |      0.016 ms |      26967.086004 s |      26967.086020 s |
    nvme0q8:133                    | 0007 |      0.039 ms |        12 |      0.008 ms |      26966.712056 s |      26966.712063 s |
    nvme0q32:157                   | 0031 |      0.036 ms |        10 |      0.014 ms |      26966.627054 s |      26966.627068 s |
    nvme0q9:134                    | 0008 |      0.036 ms |        11 |      0.011 ms |      26967.258452 s |      26967.258462 s |
    nvme0q7:132                    | 0006 |      0.024 ms |         3 |      0.014 ms |      26966.767404 s |      26966.767418 s |
    nvme0q11:136                   | 0010 |      0.023 ms |         5 |      0.006 ms |      26966.935455 s |      26966.935461 s |
    nvme0q31:156                   | 0030 |      0.018 ms |         5 |      0.006 ms |      26966.627517 s |      26966.627524 s |
    nvme0q12:137                   | 0011 |      0.015 ms |         2 |      0.014 ms |      26966.799588 s |      26966.799602 s |
    enp5s0-rx-0:164                | 0006 |      0.009 ms |         2 |      0.005 ms |      26966.742024 s |      26966.742028 s |
    enp5s0-rx-1:165                | 0007 |      0.006 ms |         2 |      0.004 ms |      26966.939486 s |      26966.939490 s |
    enp5s0-tx-0:166                | 0008 |      0.005 ms |         1 |      0.005 ms |      26966.939484 s |      26966.939489 s |
    enp5s0-tx-1:167                | 0009 |      0.005 ms |         1 |      0.005 ms |      26966.939484 s |      26966.939489 s |
   --------------------------------------------------------------------------------------------------------------------------------

  #t

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-16-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
daf07d2207 perf kwork: Implement BPF trace
'perf record' generates perf.data, which generates extra interrupts
for hard disk, amount of data to be collected increases with time.

Using eBPF trace can process the data in kernel, which solves the
preceding two problems.

Add -b/--use-bpf option for latency and report to support
tracing kwork events using eBPF:

1. Create bpf prog and attach to tracepoints,
2. Start tracing after command is entered,
3. After user hit "ctrl+c", stop tracing and report,
4. Support CPU and name filtering.

This commit implements the framework code and
does not add specific event support.

Test cases:

  # perf kwork rep -h

   Usage: perf kwork report [<options>]

      -b, --use-bpf         Use BPF to measure kwork runtime
      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): runtime, max, count
      -S, --with-summary    Show summary with statistics
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat -h

   Usage: perf kwork latency [<options>]

      -b, --use-bpf         Use BPF to measure kwork latency
      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): avg, max, count
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat -b
  Unsupported bpf trace class irq

  # perf kwork rep -b
  Unsupported bpf trace class irq

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
[ Simplify work_findnew() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
bcc8b3e88d perf kwork: Implement perf kwork timehist
Implements framework of perf kwork timehist,
to provide an analysis of kernel work events.

Test cases:

  # perf kwork tim
   Runtime start      Runtime end        Cpu     Kwork name                      Runtime     Delaytime
                                                 (TYPE)NAME:NUM                  (msec)      (msec)
   -----------------  -----------------  ------  ------------------------------  ----------  ----------
        91576.060290       91576.060344  [0000]  (s)RCU:9                             0.055       0.111
        91576.061470       91576.061547  [0000]  (s)SCHED:7                           0.077       0.073
        91576.062604       91576.062697  [0001]  (s)RCU:9                             0.094       0.409
        91576.064443       91576.064517  [0002]  (s)RCU:9                             0.074       0.114
        91576.065144       91576.065211  [0000]  (s)SCHED:7                           0.067       0.058
        91576.066564       91576.066609  [0003]  (s)RCU:9                             0.045       0.110
        91576.068495       91576.068559  [0000]  (s)SCHED:7                           0.064       0.059
        91576.068900       91576.068996  [0004]  (s)RCU:9                             0.096       0.726
        91576.069364       91576.069420  [0002]  (s)RCU:9                             0.056       0.082
        91576.069649       91576.069701  [0004]  (s)RCU:9                             0.052       0.111
        91576.070147       91576.070206  [0000]  (s)SCHED:7                           0.060       0.057
        91576.073147       91576.073202  [0000]  (s)SCHED:7                           0.054       0.060
  <SNIP>

  # perf kwork tim --max-stack 2 -g
   Runtime start      Runtime end        Cpu     Kwork name                      Runtime     Delaytime
                                                 (TYPE)NAME:NUM                  (msec)      (msec)
   -----------------  -----------------  ------  ------------------------------  ----------  ----------
        91576.060290       91576.060344  [0000]  (s)RCU:9                             0.055       0.111   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.061470       91576.061547  [0000]  (s)SCHED:7                           0.077       0.073   irq_exit_rcu <- sysvec_call_function_single
        91576.062604       91576.062697  [0001]  (s)RCU:9                             0.094       0.409   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.064443       91576.064517  [0002]  (s)RCU:9                             0.074       0.114   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.065144       91576.065211  [0000]  (s)SCHED:7                           0.067       0.058   irq_exit_rcu <- sysvec_call_function_single
        91576.066564       91576.066609  [0003]  (s)RCU:9                             0.045       0.110   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.068495       91576.068559  [0000]  (s)SCHED:7                           0.064       0.059   irq_exit_rcu <- sysvec_call_function_single
        91576.068900       91576.068996  [0004]  (s)RCU:9                             0.096       0.726   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.069364       91576.069420  [0002]  (s)RCU:9                             0.056       0.082   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.069649       91576.069701  [0004]  (s)RCU:9                             0.052       0.111   irq_exit_rcu <- sysvec_apic_timer_interrupt
  <SNIP>

Committer testing:

  # perf kwork -k workqueue timehist | head -40
   Runtime start      Runtime end        Cpu     Kwork name                      Runtime     Delaytime
                                                 (TYPE)NAME:NUM                  (msec)      (msec)
   -----------------  -----------------  ------  ------------------------------  ----------  ----------
        26520.211825       26520.211832  [0019]  (w)free_work                         0.007       0.004
        26520.212929       26520.212934  [0020]  (w)free_work                         0.005       0.004
        26520.213226       26520.213228  [0014]  (w)kfree_rcu_work                    0.002       0.004
        26520.214057       26520.214061  [0021]  (w)free_work                         0.004       0.004
        26520.221239       26520.221241  [0007]  (w)kfree_rcu_work                    0.002       0.009
        26520.223232       26520.223238  [0013]  (w)psi_avgs_work                     0.005       0.006
        26520.230057       26520.230060  [0020]  (w)free_work                         0.003       0.003
        26520.270428       26520.270434  [0015]  (w)free_work                         0.006       0.004
        26520.270546       26520.270550  [0014]  (w)free_work                         0.004       0.003
        26520.281626       26520.281629  [0015]  (w)free_work                         0.003       0.002
        26520.287225       26520.287230  [0012]  (w)psi_avgs_work                     0.005       0.008
        26520.287231       26520.287235  [0001]  (w)psi_avgs_work                     0.004       0.011
        26520.287236       26520.287239  [0001]  (w)psi_avgs_work                     0.003       0.012
        26520.329488       26520.329492  [0024]  (w)free_work                         0.004       0.004
        26520.330600       26520.330605  [0007]  (w)free_work                         0.005       0.004
        26520.334218       26520.334218  [0007]  (w)kfree_rcu_monitor                 0.001       0.002
        26520.335220       26520.335221  [0005]  (w)kfree_rcu_monitor                 0.001       0.004
        26520.343980       26520.343985  [0007]  (w)free_work                         0.005       0.002
        26520.345093       26520.345097  [0006]  (w)free_work                         0.004       0.003
        26520.351233       26520.351238  [0027]  (w)psi_avgs_work                     0.005       0.008
        26520.353228       26520.353229  [0007]  (w)kfree_rcu_work                    0.001       0.002
        26520.353229       26520.353231  [0005]  (w)kfree_rcu_work                    0.001       0.006
        26520.382381       26520.382383  [0006]  (w)free_work                         0.003       0.002
        26520.386547       26520.386548  [0006]  (w)free_work                         0.002       0.001
        26520.391243       26520.391245  [0015]  (w)console_callback                  0.002       0.016
        26520.415369       26520.415621  [0027]  (w)btrfs_work_helper                 0.252
        26520.415351       26520.416174  [0002]  (w)btrfs_work_helper                 0.823       0.037
        26520.415343       26520.416304  [0031]  (w)btrfs_work_helper                 0.961
        26520.415335       26520.417078  [0001]  (w)btrfs_work_helper                 1.743
        26520.415250       26520.417564  [0002]  (w)wb_workfn                         2.314
        26520.424777       26520.424787  [0002]  (w)btrfs_work_helper                 0.010
        26520.424788       26520.424798  [0002]  (w)btrfs_work_helper                 0.010
        26520.424790       26520.424805  [0001]  (w)btrfs_work_helper                 0.016       0.016
        26520.424801       26520.424807  [0002]  (w)btrfs_work_helper                 0.006
        26520.424809       26520.424831  [0002]  (w)btrfs_work_helper                 0.022       0.030
        26520.424824       26520.424835  [0027]  (w)btrfs_work_helper                 0.011
        26520.424809       26520.424867  [0001]  (w)btrfs_work_helper                 0.059       0.032
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-14-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
53e49e32ae perf kwork: Add workqueue latency support
Implements workqueue latency function.

Test cases:

  # perf kwork -k workqueue lat

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)vmstat_update               | 0001 |      5.004 ms |         1 |      5.004 ms |      44001.745646 s |      44001.750650 s |
    (w)vmstat_update               | 0006 |      1.773 ms |         1 |      1.773 ms |      44000.830840 s |      44000.832613 s |
    (w)vmstat_shepherd             | 0000 |      0.992 ms |         8 |      2.474 ms |      44007.717845 s |      44007.720318 s |
    (w)vmstat_update               | 0000 |      0.974 ms |         5 |      2.624 ms |      44004.785970 s |      44004.788594 s |
    (w)e1000_watchdog              | 0002 |      0.687 ms |         5 |      2.632 ms |      44005.009334 s |      44005.011966 s |
    (w)vmstat_update               | 0002 |      0.307 ms |         1 |      0.307 ms |      44004.817395 s |      44004.817702 s |
    (w)vmstat_update               | 0004 |      0.296 ms |         1 |      0.296 ms |      43997.913677 s |      43997.913973 s |
    (w)mix_interrupt_randomness    | 0000 |      0.283 ms |       285 |      3.724 ms |      44006.790889 s |      44006.794613 s |
    (w)neigh_managed_work          | 0001 |      0.271 ms |         1 |      0.271 ms |      43997.665542 s |      43997.665813 s |
    (w)vmstat_update               | 0005 |      0.261 ms |         1 |      0.261 ms |      44007.820542 s |      44007.820803 s |
    (w)neigh_managed_work          | 0004 |      0.220 ms |         1 |      0.220 ms |      44002.953287 s |      44002.953507 s |
    (w)neigh_periodic_work         | 0004 |      0.217 ms |         1 |      0.217 ms |      43999.929718 s |      43999.929935 s |
    (w)mix_interrupt_randomness    | 0002 |      0.199 ms |         5 |      0.310 ms |      44005.012316 s |      44005.012625 s |
    (w)vmstat_update               | 0003 |      0.199 ms |         4 |      0.307 ms |      44005.714391 s |      44005.714699 s |
    (w)gc_worker                   | 0001 |      0.071 ms |       173 |      1.128 ms |      44002.062579 s |      44002.063707 s |
   --------------------------------------------------------------------------------------------------------------------------------
    INFO: 0.020% skipped events (17 including 10 raise, 7 entry, 0 exit)

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-13-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
19807bba5a perf kwork: Add softirq latency support
Implements softirq latency function.

Test cases:

  # perf kwork -k softirq lat

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0006 |      1.048 ms |         1 |      1.048 ms |      44000.829759 s |      44000.830807 s |
    (s)TIMER:1                     | 0001 |      1.008 ms |         4 |      3.434 ms |      43997.662069 s |      43997.665503 s |
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)RCU:9                       | 0005 |      0.245 ms |        88 |      1.866 ms |      43997.683105 s |      43997.684971 s |
    (s)SCHED:7                     | 0000 |      0.158 ms |       677 |      2.639 ms |      44004.785716 s |      44004.788355 s |
    ... <SNIP> ...
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)RCU:9                       | 0003 |      0.129 ms |      2193 |      1.507 ms |      44006.010208 s |      44006.011715 s |
    (s)TIMER:1                     | 0005 |      0.128 ms |         1 |      0.128 ms |      44007.820346 s |      44007.820474 s |
    (s)SCHED:7                     | 0002 |      0.040 ms |      1731 |      0.211 ms |      44005.009237 s |      44005.009447 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat -C 1,2

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0001 |      1.008 ms |         4 |      3.434 ms |      43997.662069 s |      43997.665503 s |
    (s)RCU:9                       | 0001 |      0.216 ms |      1619 |      3.659 ms |      43997.662069 s |      43997.665727 s |
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)NET_RX:3                    | 0002 |      0.106 ms |         5 |      0.163 ms |      44005.012255 s |      44005.012418 s |
    (s)TIMER:1                     | 0002 |      0.084 ms |         9 |      0.114 ms |      44005.009168 s |      44005.009282 s |
    (s)SCHED:7                     | 0001 |      0.049 ms |       655 |      0.837 ms |      44005.707998 s |      44005.708835 s |
    (s)SCHED:7                     | 0002 |      0.040 ms |      1731 |      0.211 ms |      44005.009237 s |      44005.009447 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat -n RCU

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)RCU:9                       | 0005 |      0.245 ms |        88 |      1.866 ms |      43997.683105 s |      43997.684971 s |
    (s)RCU:9                       | 0004 |      0.237 ms |        26 |      0.792 ms |      43997.683018 s |      43997.683810 s |
    (s)RCU:9                       | 0007 |      0.217 ms |       140 |      1.335 ms |      43997.671080 s |      43997.672415 s |
    (s)RCU:9                       | 0001 |      0.216 ms |      1619 |      3.659 ms |      43997.662069 s |      43997.665727 s |
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)RCU:9                       | 0003 |      0.129 ms |      2193 |      1.507 ms |      44006.010208 s |      44006.011715 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat -s count,avg -n RCU

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0003 |      0.129 ms |      2193 |      1.507 ms |      44006.010208 s |      44006.011715 s |
    (s)RCU:9                       | 0001 |      0.216 ms |      1619 |      3.659 ms |      43997.662069 s |      43997.665727 s |
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)RCU:9                       | 0007 |      0.217 ms |       140 |      1.335 ms |      43997.671080 s |      43997.672415 s |
    (s)RCU:9                       | 0005 |      0.245 ms |        88 |      1.866 ms |      43997.683105 s |      43997.684971 s |
    (s)RCU:9                       | 0004 |      0.237 ms |        26 |      0.792 ms |      43997.683018 s |      43997.683810 s |
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat --time 43997,

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0006 |      1.048 ms |         1 |      1.048 ms |      44000.829759 s |      44000.830807 s |
    (s)TIMER:1                     | 0001 |      1.008 ms |         4 |      3.434 ms |      43997.662069 s |      43997.665503 s |
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)TIMER:1                     | 0004 |      0.083 ms |        21 |      0.127 ms |      44004.969171 s |      44004.969298 s |
    ... <SNIP> ...
    (s)SCHED:7                     | 0005 |      0.050 ms |         4 |      0.086 ms |      43997.684852 s |      43997.684938 s |
    (s)SCHED:7                     | 0001 |      0.049 ms |       655 |      0.837 ms |      44005.707998 s |      44005.708835 s |
    (s)SCHED:7                     | 0007 |      0.044 ms |       171 |      0.077 ms |      43997.943265 s |      43997.943342 s |
    (s)SCHED:7                     | 0002 |      0.040 ms |      1731 |      0.211 ms |      44005.009237 s |      44005.009447 s |
   --------------------------------------------------------------------------------------------------------------------------------

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-12-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
ad3d9f7a92 perf kwork: Implement perf kwork latency
Implements framework of perf kwork latency, which is used to report time
properties such as delay time and frequency.

Test cases:

  # perf kwork lat -h

   Usage: perf kwork latency [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): avg, max, count
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat -C 199
  Requested CPU 199 too large. Consider raising MAX_NR_CPUS
  Invalid cpu bitmap

  # perf kwork lat -i perf_no_exist.data
  failed to open perf_no_exist.data: No such file or directory

  # perf kwork lat -s avg1
    Error: Unknown --sort key: `avg1'

   Usage: perf kwork latency [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): avg, max, count
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat --time FFFF,
  Invalid time span

  # perf kwork lat

    Kwork Name                     | Cpu  | Avg delay     | Count    | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------
    INFO: 36.570% skipped events (31537 including 0 raise, 31537 entry, 0 exit)

Since there are no latency-enabled events, the output is empty.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-11-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
8dbc3c8689 perf kwork: Add workqueue report support
Implements workqueue report function.

Test cases:

  # perf kwork -k workqueue rep

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)gc_worker                   | 0001 |   1912.389 ms |       173 |     12.896 ms |      44002.050787 s |      44002.063683 s |
    (w)mix_interrupt_randomness    | 0000 |     24.308 ms |       285 |      3.349 ms |      44004.784908 s |      44004.788257 s |
    (w)e1000_watchdog              | 0002 |      5.332 ms |         5 |      2.059 ms |      44000.914366 s |      44000.916424 s |
    (w)vmstat_update               | 0005 |      0.989 ms |         2 |      0.953 ms |      43997.986991 s |      43997.987944 s |
    (w)vmstat_shepherd             | 0000 |      0.964 ms |         8 |      0.195 ms |      43997.986453 s |      43997.986648 s |
    (w)vmstat_update               | 0003 |      0.306 ms |         6 |      0.077 ms |      44004.689543 s |      44004.689620 s |
    (w)vmstat_update               | 0000 |      0.196 ms |         5 |      0.049 ms |      44005.713732 s |      44005.713781 s |
    (w)vmstat_update               | 0001 |      0.162 ms |         2 |      0.130 ms |      44000.192034 s |      44000.192164 s |
    (w)mix_interrupt_randomness    | 0002 |      0.114 ms |         5 |      0.037 ms |      44005.012625 s |      44005.012662 s |
    (w)vmstat_update               | 0002 |      0.084 ms |         2 |      0.043 ms |      44004.817702 s |      44004.817745 s |
    (w)vmstat_update               | 0006 |      0.067 ms |         2 |      0.041 ms |      43997.987214 s |      43997.987254 s |
    (w)neigh_periodic_work         | 0004 |      0.039 ms |         1 |      0.039 ms |      43999.929935 s |      43999.929974 s |
    (w)vmstat_update               | 0007 |      0.037 ms |         1 |      0.037 ms |      43997.988969 s |      43997.989006 s |
    (w)neigh_managed_work          | 0001 |      0.036 ms |         1 |      0.036 ms |      43997.665813 s |      43997.665849 s |
    (w)neigh_managed_work          | 0004 |      0.036 ms |         1 |      0.036 ms |      44002.953507 s |      44002.953543 s |
    (w)vmstat_update               | 0004 |      0.027 ms |         1 |      0.027 ms |      43997.913973 s |      43997.914000 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k workqueue rep -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)gc_worker                   | 0001 |   1912.389 ms |       173 |     12.896 ms |      44002.050787 s |      44002.063683 s |
    (w)mix_interrupt_randomness    | 0000 |     24.308 ms |       285 |      3.349 ms |      44004.784908 s |      44004.788257 s |
    (w)e1000_watchdog              | 0002 |      5.332 ms |         5 |      2.059 ms |      44000.914366 s |      44000.916424 s |
    (w)vmstat_update               | 0005 |      0.989 ms |         2 |      0.953 ms |      43997.986991 s |      43997.987944 s |
    (w)vmstat_shepherd             | 0000 |      0.964 ms |         8 |      0.195 ms |      43997.986453 s |      43997.986648 s |
    (w)vmstat_update               | 0003 |      0.306 ms |         6 |      0.077 ms |      44004.689543 s |      44004.689620 s |
    (w)vmstat_update               | 0000 |      0.196 ms |         5 |      0.049 ms |      44005.713732 s |      44005.713781 s |
    (w)vmstat_update               | 0001 |      0.162 ms |         2 |      0.130 ms |      44000.192034 s |      44000.192164 s |
    (w)mix_interrupt_randomness    | 0002 |      0.114 ms |         5 |      0.037 ms |      44005.012625 s |      44005.012662 s |
    (w)vmstat_update               | 0002 |      0.084 ms |         2 |      0.043 ms |      44004.817702 s |      44004.817745 s |
    (w)vmstat_update               | 0006 |      0.067 ms |         2 |      0.041 ms |      43997.987214 s |      43997.987254 s |
    (w)neigh_periodic_work         | 0004 |      0.039 ms |         1 |      0.039 ms |      43999.929935 s |      43999.929974 s |
    (w)vmstat_update               | 0007 |      0.037 ms |         1 |      0.037 ms |      43997.988969 s |      43997.989006 s |
    (w)neigh_managed_work          | 0001 |      0.036 ms |         1 |      0.036 ms |      43997.665813 s |      43997.665849 s |
    (w)neigh_managed_work          | 0004 |      0.036 ms |         1 |      0.036 ms |      44002.953507 s |      44002.953543 s |
    (w)vmstat_update               | 0004 |      0.027 ms |         1 |      0.027 ms |      43997.913973 s |      43997.914000 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :       500
    Total runtime   (msec) :  1945.085 (0.192% load average)
    Total time span (msec) : 10155.026
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k workqueue rep -n vmstat_update

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)vmstat_update               | 0005 |      0.989 ms |         2 |      0.953 ms |      43997.986991 s |      43997.987944 s |
    (w)vmstat_update               | 0003 |      0.306 ms |         6 |      0.077 ms |      44004.689543 s |      44004.689620 s |
    (w)vmstat_update               | 0000 |      0.196 ms |         5 |      0.049 ms |      44005.713732 s |      44005.713781 s |
    (w)vmstat_update               | 0001 |      0.162 ms |         2 |      0.130 ms |      44000.192034 s |      44000.192164 s |
    (w)vmstat_update               | 0002 |      0.084 ms |         2 |      0.043 ms |      44004.817702 s |      44004.817745 s |
    (w)vmstat_update               | 0006 |      0.067 ms |         2 |      0.041 ms |      43997.987214 s |      43997.987254 s |
    (w)vmstat_update               | 0007 |      0.037 ms |         1 |      0.037 ms |      43997.988969 s |      43997.989006 s |
    (w)vmstat_update               | 0004 |      0.027 ms |         1 |      0.027 ms |      43997.913973 s |      43997.914000 s |
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork -k workqueue rep -C 1 | head -20

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)commit_work                 | 0001 |     25.896 ms |         2 |     13.200 ms |      26522.906700 s |      26522.919900 s |
    (w)commit_work                 | 0001 |     13.316 ms |         1 |     13.316 ms |      26522.573246 s |      26522.586562 s |
    (w)commit_work                 | 0001 |     13.177 ms |         1 |     13.177 ms |      26522.673406 s |      26522.686583 s |
    (w)commit_work                 | 0001 |     12.630 ms |         1 |     12.630 ms |      26522.123921 s |      26522.136551 s |
    (w)btrfs_work_helper           | 0001 |      3.544 ms |         1 |      3.544 ms |      26529.131296 s |      26529.134840 s |
    (w)btrfs_work_helper           | 0001 |      3.330 ms |         1 |      3.330 ms |      26529.137698 s |      26529.141028 s |
    (w)btrfs_work_helper           | 0001 |      2.855 ms |         1 |      2.855 ms |      26529.134842 s |      26529.137697 s |
    (w)btrfs_work_helper           | 0001 |      2.757 ms |         1 |      2.757 ms |      26529.124086 s |      26529.126843 s |
    (w)btrfs_work_helper           | 0001 |      2.182 ms |         1 |      2.182 ms |      26529.141030 s |      26529.143212 s |
    (w)btrfs_work_helper           | 0001 |      1.743 ms |         1 |      1.743 ms |      26520.415335 s |      26520.417078 s |
    (w)btrfs_work_helper           | 0001 |      1.499 ms |         1 |      1.499 ms |      26529.127774 s |      26529.129272 s |
    (w)btrfs_work_helper           | 0001 |      1.446 ms |         1 |      1.446 ms |      26529.129848 s |      26529.131294 s |
    (w)btrfs_work_helper           | 0001 |      1.373 ms |         1 |      1.373 ms |      26523.808270 s |      26523.809643 s |
    (w)wb_workfn                   | 0001 |      1.165 ms |         2 |      0.763 ms |      26527.071056 s |      26527.071819 s |
    (w)btrfs_work_helper           | 0001 |      0.926 ms |         1 |      0.926 ms |      26529.126846 s |      26529.127771 s |
    (w)btrfs_work_helper           | 0001 |      0.571 ms |         1 |      0.571 ms |      26529.129275 s |      26529.129846 s |
    (w)wb_workfn                   | 0001 |      0.525 ms |         1 |      0.525 ms |      26522.975151 s |      26522.975676 s |
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-10-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
4c14819169 perf kwork: Add softirq report support
Implements softirq kwork report function.

Test cases:

  # perf kwork -k softirq rep

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0003 |    181.387 ms |      2476 |      1.240 ms |      44004.787960 s |      44004.789201 s |
    (s)RCU:9                       | 0003 |     91.573 ms |      2193 |      0.650 ms |      44004.790258 s |      44004.790908 s |
    (s)RCU:9                       | 0001 |     78.960 ms |      1619 |      1.195 ms |      44001.496553 s |      44001.497749 s |
    (s)SCHED:7                     | 0003 |     55.962 ms |      1255 |      0.954 ms |      44004.812008 s |      44004.812962 s |
    ... <SNIP> ...
    (s)RCU:9                       | 0004 |      0.830 ms |        26 |      0.058 ms |      43997.666418 s |      43997.666476 s |
    (s)TIMER:1                     | 0001 |      0.471 ms |         4 |      0.158 ms |      44007.834694 s |      44007.834852 s |
    (s)RCU:9                       | 0006 |      0.220 ms |         7 |      0.048 ms |      44004.833764 s |      44004.833812 s |
    (s)NET_RX:3                    | 0002 |      0.164 ms |         5 |      0.049 ms |      44005.012418 s |      44005.012466 s |
    (s)TIMER:1                     | 0005 |      0.164 ms |         1 |      0.164 ms |      44007.820474 s |      44007.820638 s |
    (s)TIMER:1                     | 0006 |      0.087 ms |         1 |      0.087 ms |      44000.830807 s |      44000.830894 s |
    (s)SCHED:7                     | 0006 |      0.080 ms |         2 |      0.044 ms |      43997.826145 s |      43997.826189 s |
   --------------------------------------------------------------------------------------------------------------------------------

  #
  # perf kwork -k softirq rep -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0003 |    181.387 ms |      2476 |      1.240 ms |      44004.787960 s |      44004.789201 s |
    (s)RCU:9                       | 0003 |     91.573 ms |      2193 |      0.650 ms |      44004.790258 s |      44004.790908 s |
    (s)RCU:9                       | 0001 |     78.960 ms |      1619 |      1.195 ms |      44001.496553 s |      44001.497749 s |
    (s)SCHED:7                     | 0000 |     63.631 ms |       680 |      2.690 ms |      44006.721976 s |      44006.724666 s |
    ... <SNIP> ...
    (s)SCHED:7                     | 0003 |     55.962 ms |      1255 |      0.954 ms |      44004.812008 s |      44004.812962 s |
    (s)RCU:9                       | 0006 |      0.220 ms |         7 |      0.048 ms |      44004.833764 s |      44004.833812 s |
    (s)NET_RX:3                    | 0002 |      0.164 ms |         5 |      0.049 ms |      44005.012418 s |      44005.012466 s |
    (s)TIMER:1                     | 0005 |      0.164 ms |         1 |      0.164 ms |      44007.820474 s |      44007.820638 s |
    (s)TIMER:1                     | 0006 |      0.087 ms |         1 |      0.087 ms |      44000.830807 s |      44000.830894 s |
    (s)SCHED:7                     | 0006 |      0.080 ms |         2 |      0.044 ms |      43997.826145 s |      43997.826189 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :     12748
    Total runtime   (msec) :   661.433 (0.065% load average)
    Total time span (msec) : 10176.441
   --------------------------------------------------------------------------------------------------------------------------------

  #
  # perf kwork -k softirq rep -s count,max

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0003 |    181.387 ms |      2476 |      1.240 ms |      44004.787960 s |      44004.789201 s |
    (s)RCU:9                       | 0003 |     91.573 ms |      2193 |      0.650 ms |      44004.790258 s |      44004.790908 s |
    (s)SCHED:7                     | 0002 |     50.039 ms |      1731 |      0.074 ms |      44005.009447 s |      44005.009521 s |
    (s)RCU:9                       | 0001 |     78.960 ms |      1619 |      1.195 ms |      44001.496553 s |      44001.497749 s |
    (s)SCHED:7                     | 0003 |     55.962 ms |      1255 |      0.954 ms |      44004.812008 s |      44004.812962 s |
    ... <SNIP> ...
    (s)RCU:9                       | 0002 |     35.241 ms |       932 |      0.407 ms |      44005.009541 s |      44005.009949 s |
    (s)RCU:9                       | 0000 |     45.710 ms |       702 |      1.144 ms |      44004.787023 s |      44004.788167 s |
    (s)SCHED:7                     | 0006 |      0.080 ms |         2 |      0.044 ms |      43997.826145 s |      43997.826189 s |
    (s)TIMER:1                     | 0005 |      0.164 ms |         1 |      0.164 ms |      44007.820474 s |      44007.820638 s |
    (s)TIMER:1                     | 0006 |      0.087 ms |         1 |      0.087 ms |      44000.830807 s |      44000.830894 s |
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork -k softirq report -C 2 -s count,max

  Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
 --------------------------------------------------------------------------------------------------------------------------------
  (s)SCHED:7                     | 0002 |      0.980 ms |       159 |      0.024 ms |      26035.571037 s |      26035.571061 s |
  (s)RCU:9                       | 0002 |      0.124 ms |        88 |      0.021 ms |      26035.177050 s |      26035.177071 s |
  (s)TIMER:1                     | 0002 |      0.122 ms |        56 |      0.007 ms |      26035.468045 s |      26035.468052 s |
 --------------------------------------------------------------------------------------------------------------------------------

  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-9-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
94348520c6 perf kwork: Add irq report support
Implements irq kwork report function.

Test cases:

  # perf kwork record -- sleep 10
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 6.134 MB perf.data ]

  # perf kwork report

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -C 2

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -C 3

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -i perf.data

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -s max,freq

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :     18289
    Total runtime   (msec) :  1167.686 (0.115% load average)
    Total time span (msec) : 10159.155
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report --time 44005,

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |    402.173 ms |      4695 |      0.981 ms |      44007.831992 s |      44007.832973 s |
    eth0:10                        | 0002 |      0.089 ms |         2 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork report

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    nvme0q5:130                    | 0004 |      1.101 ms |        49 |      0.051 ms |      26035.056403 s |      26035.056455 s |
    amdgpu:162                     | 0002 |      0.176 ms |         9 |      0.046 ms |      26035.268020 s |      26035.268066 s |
    nvme0q24:149                   | 0023 |      0.161 ms |        55 |      0.009 ms |      26035.655280 s |      26035.655288 s |
    nvme0q20:145                   | 0019 |      0.090 ms |        33 |      0.014 ms |      26035.939018 s |      26035.939032 s |
    nvme0q31:156                   | 0030 |      0.075 ms |        21 |      0.010 ms |      26035.052237 s |      26035.052247 s |
    nvme0q8:133                    | 0007 |      0.062 ms |        12 |      0.021 ms |      26035.416840 s |      26035.416861 s |
    nvme0q6:131                    | 0005 |      0.054 ms |        22 |      0.010 ms |      26035.199919 s |      26035.199929 s |
    nvme0q19:144                   | 0018 |      0.052 ms |        14 |      0.010 ms |      26035.110615 s |      26035.110625 s |
    nvme0q7:132                    | 0006 |      0.049 ms |        13 |      0.007 ms |      26035.125180 s |      26035.125187 s |
    nvme0q18:143                   | 0017 |      0.033 ms |        14 |      0.007 ms |      26035.169698 s |      26035.169705 s |
    nvme0q17:142                   | 0016 |      0.013 ms |         1 |      0.013 ms |      26035.565147 s |      26035.565160 s |
    enp5s0-rx-0:164                | 0006 |      0.004 ms |         4 |      0.002 ms |      26035.928882 s |      26035.928884 s |
    enp5s0-tx-0:166                | 0008 |      0.003 ms |         3 |      0.002 ms |      26035.870923 s |      26035.870925 s |
   --------------------------------------------------------------------------------------------------------------------------------

  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-8-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
f98919ec4f perf kwork: Implement 'report' subcommand
Implements framework of 'perf kwork report', which is used to report
time properties such as run time and frequency:

Test cases:

  # perf kwork

   Usage: perf kwork [<options>] {record|report}

      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, etc)
      -v, --verbose         be more verbose (show symbol address, etc)

  # perf kwork report -h

   Usage: perf kwork report [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): runtime, max, count
      -S, --with-summary    Show summary with statistics
          --time <str>      Time span for analysis (start,stop)

  # perf kwork report

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :         0
    Total runtime   (msec) :     0.000 (0.000% load average)
    Total time span (msec) :     0.000
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -C 0,100
  Requested CPU 100 too large. Consider raising MAX_NR_CPUS
  Invalid cpu bitmap

  # perf kwork report -s runtime1
    Error: Unknown --sort key: `runtime1'

   Usage: perf kwork report [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): runtime, max, count
      -S, --with-summary    Show summary with statistics
          --time <str>      Time span for analysis (start,stop)

  # perf kwork report -i perf_no_exist.data
  failed to open perf_no_exist.data: No such file or directory

  # perf kwork report --time 00FFF,
  Invalid time span

Since there are no report supported events, the output is empty.

Briefly describe the data structure:

1. "class" indicates event type. For example, irq and softiq correspond
to different types.

2. "cluster" refers to a specific event corresponding to a type. For
example, RCU and TIMER in softirq correspond to different clusters,
which contains three types of events: raise, entry, and exit.

3. "atom" includes time of each sample and sample of the previous phase.
(For example, exit corresponds to entry, which is used for timehist.)

Committer notes:

- Add {} for multiline if blocks.

- report_print_work() should either return that ret variable that
  accounts how many bytes were printed or stop accounting and be void.
  Do the former for now to avoid this:

builtin-kwork.c:534:6: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable]
        int ret = 0;
            ^
1 error generated.

  When building with:

  ⬢[acme@toolbox perf]$ clang --version
  clang version 13.0.0 (https://github.com/llvm/llvm-project e8991caea8690ec2d17b0b7e1c29bf0da6609076)

Also:

  -       if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {
  +       if (dst_type < KWORK_TRACE_MAX) {

Several versions of clang and at least this gcc:

   3    51.40 alpine:3.9                    : FAIL gcc version 8.3.0 (Alpine 8.3.0)
    builtin-kwork.c:411:16: error: comparison of unsigned enum expression >= 0 is
          always true [-Werror,-Wtautological-compare]
            if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {

As the first entry in a enum is zero.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-7-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:30:05 -03:00
Yang Jihong
e432947ef5 tools lib: Add list_last_entry_or_null()
Add list_last_entry_or_null() to get the last element from a list,
returns NULL if the list is empty.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-6-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:02:13 -03:00
Yang Jihong
97179d9d08 perf kwork: Add workqueue kwork record support
Record workqueue events workqueue:workqueue_activate_work,
workqueue:workqueue_execute_start & workqueue:workqueue_execute_end

Tese cases:
Record all events:

  # perf kwork record -o perf_kwork.date -- sleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 0.857 MB perf_kwork.date ]
  #
  # perf evlist -i perf_kwork.date
  irq:irq_handler_entry
  irq:irq_handler_exit
  irq:softirq_raise
  irq:softirq_entry
  irq:softirq_exit
  workqueue:workqueue_activate_work
  workqueue:workqueue_execute_start
  workqueue:workqueue_execute_end
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Record workqueue events:

  # perf kwork -k workqueue record -o perf_kwork.date -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.081 MB perf_kwork.date ]
  #
  # perf evlist -i perf_kwork.date
  workqueue:workqueue_activate_work
  workqueue:workqueue_execute_start
  workqueue:workqueue_execute_end
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Committer testing:

  # perf kwork record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 3.430 MB perf.data (24130 samples) ]
  # perf evlist -v
  irq:irq_handler_entry: type: 2, size: 128, config: 0x97, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:irq_handler_exit: type: 2, size: 128, config: 0x96, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_raise: type: 2, size: 128, config: 0x93, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_entry: type: 2, size: 128, config: 0x95, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_exit: type: 2, size: 128, config: 0x94, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  workqueue:workqueue_activate_work: type: 2, size: 128, config: 0x106, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  workqueue:workqueue_execute_start: type: 2, size: 128, config: 0x105, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  workqueue:workqueue_execute_end: type: 2, size: 128, config: 0x104, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  dummy:HG: type: 1, size: 128, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|RAW|IDENTIFIER, read_format: ID, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  # perf script | grep workqueue | head
           swapper     0 [018] 26035.043289: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368
   kworker/18:2-ev 70440 [018] 26035.043293: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work
   kworker/18:2-ev 70440 [018] 26035.043301:   workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work
           swapper     0 [021] 26035.044704: workqueue:workqueue_activate_work: work struct 0xffff8b8ffef6e368
   kworker/21:0-ev 4080535 [021] 26035.044709: workqueue:workqueue_execute_start: work struct 0xffff8b8ffef6e368: function free_work
   kworker/21:0-ev 4080535 [021] 26035.044716:   workqueue:workqueue_execute_end: work struct 0xffff8b8ffef6e368: function free_work
           swapper     0 [018] 26035.045230: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368
   kworker/18:2-ev 70440 [018] 26035.045232: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work
   kworker/18:2-ev 70440 [018] 26035.045235:   workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work
           swapper     0 [001] 26035.052046: workqueue:workqueue_activate_work: work struct 0xffff8b8108901590
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-5-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:02:04 -03:00