Fix various spelling errors as reported by Debian's lintian tool.
"amount of times" -> "number of times"
ocurrence -> occurrence
upto -> up to
Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230122122034.48020-1-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the context of LBR stitching documentation.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230119201036.156441-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a test to check function symbols do not overlap and are not zero
length.
The main motivation for the test is to make it easier to review changes
to PLT symbol synthesis i.e. changes to dso__synthesize_plt_symbols().
By default the test uses the perf executable as a test DSO, but a
specific DSO can be specified via a new perf test option "--dso".
The test is useful in the following ways:
- Any DSO can be tested, even ones that do not run on the current
architecture. For example, using cross-compiled DSOs to see how
well perf handles different architectures.
- With verbose > 1 (e.g. -vv), all the symbols are printed, which
makes it easier to see issues.
- perf removes duplicate symbols and expands zero-length symbols
to reach the next symbol, however that is done before adding
synthesized symbols, so the test is checking those also.
Example:
$ perf test -v Symbols
74: Symbols :
--- start ---
test child forked, pid 154918
Testing /home/user/bin/perf
Overlapping symbols:
7d000-7f3a0 g _init
7d030-7d040 g __printf_chk@plt
test child finished with -1
---- end ----
Symbols: FAILED!
Note the test fails because perf expands the _init symbol over the PLT
because there are no PLT symbols at that point, but then
dso__synthesize_plt_symbols() creates them.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This explodes the build if HEAD is signed, since the generated version
is gpg: Signature made Mon 26 Dec 2022 20:34:48 CET, then a few more
lines, then the SHA.
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/7c9637711271f50ec2341fb8a7c29585335dab04.1672174189.git.nabijaczleweli@nabijaczleweli.xyz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -L/--lock-filter option is to filter only given locks. The locks
can be specified by address or name (if exists).
$ sudo ./perf lock record -a sleep 1
$ sudo ./perf lock con -l
contended total wait max wait avg wait address symbol
57 1.11 ms 42.83 us 19.54 us ffff9f4140059000
15 280.88 us 23.51 us 18.73 us ffffffff9d007a40 jiffies_lock
1 20.49 us 20.49 us 20.49 us ffffffff9d0d50c0 rcu_state
1 9.02 us 9.02 us 9.02 us ffff9f41759e9ba0
$ sudo ./perf lock con -L jiffies_lock,rcu_state
contended total wait max wait avg wait type caller
15 280.88 us 23.51 us 18.73 us spinlock tick_sched_do_timer+0x93
1 20.49 us 20.49 us 20.49 us spinlock __softirqentry_text_start+0xeb
$ sudo ./perf lock con -L ffff9f4140059000
contended total wait max wait avg wait type caller
38 779.40 us 42.83 us 20.51 us spinlock worker_thread+0x50
11 216.30 us 39.87 us 19.66 us spinlock queue_work_on+0x39
8 118.13 us 20.51 us 14.77 us spinlock kthread+0xe5
Committer testing:
# uname -a
Linux quaco 6.0.12-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 8 17:15:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# perf lock record
^C[ perf record: Woken up 1 times to write data ]
# perf lock con -L jiffies_lock,rcu_state
contended total wait max wait avg wait type caller
# perf lock con
contended total wait max wait avg wait type caller
1 9.06 us 9.06 us 9.06 us spinlock call_timer_fn+0x24
# perf lock con -L call
ignore unknown symbol: call
contended total wait max wait avg wait type caller
1 9.06 us 9.06 us 9.06 us spinlock call_timer_fn+0x24
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -Y/--type-filter option is to filter the result for specific lock
types only. It can accept comma-separated values. Note that it would
accept type names like one in the output. spinlock, mutex, rwsem:R and
so on.
For RW-variant lock types, it converts the name to the both variants.
In other words, "rwsem" is same as "rwsem:R,rwsem:W". Also note that
"mutex" has two different encoding - one for sleeping wait, another for
optimistic spinning. Add "mutex-spin" entry for the lock_type_table so
that we can add it for "mutex" under the table.
$ sudo ./perf lock record -a -- ./perf bench sched messaging
$ sudo ./perf lock con -E 5 -Y spinlock
contended total wait max wait avg wait type caller
802 1.26 ms 11.73 us 1.58 us spinlock __wake_up_common_lock+0x62
13 787.16 us 105.44 us 60.55 us spinlock remove_wait_queue+0x14
12 612.96 us 78.70 us 51.08 us spinlock prepare_to_wait+0x27
114 340.68 us 12.61 us 2.99 us spinlock try_to_wake_up+0x1f5
83 226.38 us 9.15 us 2.73 us spinlock folio_lruvec_lock_irqsave+0x5e
Committer notes:
Make get_type_flag() return UINT_MAX for error instad of -1UL, as that
function returns 'unsigned int' and we store the value on a 'unsigned
int' 'flags' variable which makes clang unhappy:
35 98.23 fedora:37 : FAIL clang version 15.0.6 (Fedora 15.0.6-1.fc37)
builtin-lock.c:2012:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
if (flags != -1UL) {
~~~~~ ^ ~~~~
builtin-lock.c:2021:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
if (flags != -1UL) {
~~~~~ ^ ~~~~
builtin-lock.c:2037:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
if (flags != -1UL) {
~~~~~ ^ ~~~~
3 errors generated.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The group option predates grouping events using curly braces added in
commit 89efb02950 ("perf tools: Add support to parse event group
syntax").
The --group option was retained for legacy support (in August
2012) but keeping it adds complexity.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Shaomin Deng <dengshaomin@cdjrlc.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213232651.1269909-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -l/--lock-addr option is to implement per-lock-instance contention
stat using LOCK_AGGR_ADDR. It displays lock address and optionally
symbol name if exists.
$ sudo ./perf lock con -abl sleep 1
contended total wait max wait avg wait address symbol
1 36.28 us 36.28 us 36.28 us ffff92615d6448b8
9 10.91 us 1.84 us 1.21 us ffffffffbaed50c0 rcu_state
1 10.49 us 10.49 us 10.49 us ffff9262ac4f0c80
8 4.68 us 1.67 us 585 ns ffffffffbae07a40 jiffies_lock
3 3.03 us 1.45 us 1.01 us ffff9262277861e0
1 924 ns 924 ns 924 ns ffff926095ba9d20
1 436 ns 436 ns 436 ns ffff9260bfda4f60
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This adds all remaining branch filters i.e "no_cycles", "no_flags" and
"hw_index". While here, also updates the documentation.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20221205064443.533587-1-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Deprecate the --cputype option and add a --unit option where '--unit
cpu_atom' behaves like '--cputype atom'. The --unit option can be used
with arbitrary PMUs, for example:
```
$ perf list --unit msr pmu
List of pre-defined events (to be used in -e or -M):
msr/aperf/ [Kernel PMU event]
msr/cpu_thermal_margin/ [Kernel PMU event]
msr/mperf/ [Kernel PMU event]
msr/pperf/ [Kernel PMU event]
msr/smi/ [Kernel PMU event]
msr/tsc/ [Kernel PMU event]
```
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xin Gao <gaoxin@cdjrlc.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20221114210723.2749751-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the global quiet variable everywhere so that all tools hide warnings
in quiet mode and update the documentation to reflect this.
'perf probe' claimed that errors are not printed in quiet mode but I
don't see this so remove it from the docs.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221018094137.783081-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT timestamps are not provided for every branch, let alone every
instruction, so there can be many samples with the same timestamp. With
per-cpu contexts, decoding is done for each CPU in turn, which can make it
difficult to see what is happening on different CPUs at the same time.
Currently the interleaving from perf script --itrace=i0ns is quite coarse
grained. There are often long stretches executing on one CPU and nothing on
another.
Some people are interested in seeing what happened on multiple CPUs before
a crash to debug races etc.
To improve perf script interleaving for parallel execution, the
intel-pt-events.py script has been enhanced to enable interleaving the
output with the same timestamp from different CPUs. It is understood that
interleaving is not perfect or causal.
Add parameter --interleave [<n>] to interleave sample output for the same
timestamp so that no more than n samples for a CPU are displayed in a row.
'n' defaults to 4. Note this only affects the order of output, and only
when the timestamp is the same.
Example:
$ perf script intel-pt-events.py --insn-trace --interleave 3
...
bash 2267/2267 [004] 9323.692625625 563caa3c86f0 jz 0x563caa3c89c7 run_pending_traps+0x30 (/usr/bin/bash) IPC: 1.52 (38/25)
bash 2267/2267 [004] 9323.692625625 563caa3c89c7 movq 0x118(%rsp), %rax run_pending_traps+0x307 (/usr/bin/bash)
bash 2267/2267 [004] 9323.692625625 563caa3c89cf subq %fs:0x28, %rax run_pending_traps+0x30f (/usr/bin/bash)
bash 2270/2270 [007] 9323.692625625 55dc58cabf02 jz 0x55dc58cabf48 unquoted_glob_pattern_p+0x102 (/usr/bin/bash) IPC: 1.56 (25/16)
bash 2270/2270 [007] 9323.692625625 55dc58cabf04 cmp $0x5d, %al unquoted_glob_pattern_p+0x104 (/usr/bin/bash)
bash 2270/2270 [007] 9323.692625625 55dc58cabf06 jnz 0x55dc58cabf10 unquoted_glob_pattern_p+0x106 (/usr/bin/bash)
bash 2264/2264 [001] 9323.692625625 7fd556a4376c jbe 0x7fd556a43ac8 round_and_return+0x3fc (/usr/lib/x86_64-linux-gnu/libc.so.6) IPC: 4.30 (43/10)
bash 2264/2264 [001] 9323.692625625 7fd556a43772 and $0x8, %edx round_and_return+0x402 (/usr/lib/x86_64-linux-gnu/libc.so.6)
bash 2264/2264 [001] 9323.692625625 7fd556a43775 jnz 0x7fd556a43ac8 round_and_return+0x405 (/usr/lib/x86_64-linux-gnu/libc.so.6)
bash 2267/2267 [004] 9323.692625625 563caa3c89d8 jnz 0x563caa3c8b11 run_pending_traps+0x318 (/usr/bin/bash)
bash 2267/2267 [004] 9323.692625625 563caa3c89de add $0x128, %rsp run_pending_traps+0x31e (/usr/bin/bash)
bash 2267/2267 [004] 9323.692625625 563caa3c89e5 popq %rbx run_pending_traps+0x325 (/usr/bin/bash)
...
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221020152509.5298-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf build assumes documentation files starting with "perf-" are man
pages but perf-arm-coresight.txt is not a man page:
asciidoc: ERROR: perf-arm-coresight.txt: line 2: malformed manpage title
asciidoc: ERROR: perf-arm-coresight.txt: line 3: name section expected
asciidoc: FAILED: perf-arm-coresight.txt: line 3: section title expected
make[3]: *** [Makefile:266: perf-arm-coresight.xml] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [Makefile.perf:895: man] Error 2
Fix by renaming it.
Fixes: dc2e0fb00b ("perf test coresight: Add relevant documentation about ARM64 CoreSight testing")
Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: coresight@lists.linaro.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/a176a3e1-6ddc-bb63-e41c-15cda8c2d5d2@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf mem' and 'perf c2c' tools are wrappers around 'perf record'
with mem load/ store events. IBS tagged load/store sample provides most
of the information needed for these tools. Wire in the "ibs_op//" event
as mem-ldst event for AMD.
There are some limitations though: Only load/store micro-ops provide
mem/c2c information. Whereas, IBS does not have a way to choose a
particular type of micro-op to tag. This results in many non-LS
micro-ops being tagged which appear as N/A in the perf report. IBS,
being an uncore pmu from kernel point of view[1], does not support per
process monitoring. Thus, perf mem/c2c on AMD are currently supported in
per-cpu mode only.
Example:
$ sudo perf mem record -- -c 10000
^C[ perf record: Woken up 227 times to write data ]
[ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
$ sudo perf mem report -F mem,sample,snoop
Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
Memory access Samples Snoop
N/A 700620 N/A
L1 hit 126675 N/A
L2 hit 424 N/A
L3 hit 664 HitM
L3 hit 10 N/A
Local RAM hit 2 N/A
Remote RAM (1 hop) hit 8558 N/A
Remote Cache (1 hop) hit 3 N/A
Remote Cache (1 hop) hit 2 HitM
Remote Cache (2 hops) hit 10 HitM
Remote Cache (2 hops) hit 6 N/A
Uncached hit 4 N/A
$
[1]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-6-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently perf sets PERF_SAMPLE_WEIGHT flag only for mem load events.
Set it for combined load-store event as well which will enable recording
of load latency by default on arch that does not support independent
mem load event.
Also document missing -W in perf-record man page.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add/improve documentation helping people get started with CoreSight and
perf as well as describe the testing and how it works.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/20220909152803.2317006-14-carsten.haitzler@foss.arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like in 'perf report', this option is to suppress header and debug messages.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220924004221.841024-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like in 'perf top', the -E option can limit number of entries to print.
It can be useful when users want to see top N contended locks only.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220924004221.841024-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes users want to see actual (virtual) address of sampled instructions.
Add a new 'addr' sort key to display the raw addresses.
$ perf record -o- true | perf report -i- -s addr
# To display the perf.data header info, please use --header/--header-only options.
#
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
#
# Total Lost Samples: 0
#
# Samples: 12 of event 'cycles:u'
# Event count (approx.): 252512
#
# Overhead Address
# ........ ..................
#
42.96% 0x7f96f08443d7
29.55% 0x7f96f0859b50
14.76% 0x7f96f0852e02
8.30% 0x7f96f0855028
4.43% 0xffffffff8de01087
Note that it just compares and displays the sample ip. Each process can
have a different memory layout and the ip will be different even if they run
the same binary. So this sort key is mostly meaningful for per-process
profile data.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220923173142.805896-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the documentation of --build-id and --buildid-all options to
clarify the difference between them. The former requires full sample
processing to find which DSOs are actually used. While the latter simply
injects every DSO's build-id from MMAP{,2} records, skipping SAMPLEs.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220923173142.805896-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It needs stack traces to find callers of locks. To minimize the
performance overhead it only collects up to 8 entries for each stack
trace. And it skips first 3 entries as they came from BPF, tracepoint
and lock functions which are not interested for most users.
But it turned out that those numbers are different in some
configuration. Using fixed number can result in non meaningful caller
names. Let's make them adjustable with --stack-depth and --skip-stack
options.
On my setup, the default output is like below:
# /perf lock con -ab -F contended,wait_total sleep 3
contended total wait type caller
28 4.55 ms rwlock:W __bpf_trace_contention_begin+0xb
33 1.67 ms rwlock:W __bpf_trace_contention_begin+0xb
12 580.28 us spinlock __bpf_trace_contention_begin+0xb
60 240.54 us rwsem:R __bpf_trace_contention_begin+0xb
27 64.45 us spinlock __bpf_trace_contention_begin+0xb
If I change the stack skip to 5, the result will be like:
# perf lock con -ab -F contended,wait_total --stack-skip 5 sleep 3
contended total wait type caller
32 715.45 us spinlock folio_lruvec_lock_irqsave+0x61
26 550.22 us spinlock folio_lruvec_lock_irqsave+0x61
15 486.93 us rwsem:R mmap_read_lock+0x13
12 139.66 us rwsem:W vm_mmap_pgoff+0x93
1 7.04 us spinlock tick_do_update_jiffies64+0x25
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220912055314.744552-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass d+e option and log size via intel_pt_log_enable(). Allocate a buffer
for log messages and provide intel_pt_log_dump_buf() to dump and reset the
buffer upon decoder errors.
Example:
$ sudo perf record -e intel_pt// sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.094 MB perf.data ]
$ sudo perf config itrace.debug-log-buffer-size=300
$ sudo perf script --itrace=ed+e+o | head -20
Dumping debug log buffer (first line may be sliced)
Other
ffffffff96ca22f6: 48 89 e5 Other
ffffffff96ca22f9: 65 48 8b 05 ff e0 38 69 Other
ffffffff96ca2301: 48 3d c0 a5 c1 98 Other
ffffffff96ca2307: 74 08 Jcc +8
ffffffff96ca2311: 5d Other
ffffffff96ca2312: c3 Ret
ERROR: Bad RET compression (TNT=N) at 0xffffffff96ca2312
End of debug log buffer dump
instruction trace error type 1 time 15913.537143482 cpu 5 pid 36292 tid 36292 ip 0xffffffff96ca2312 code 6: Trace doesn't match instruction
Dumping debug log buffer (first line may be sliced)
Other
ffffffff96ce7fe9: f6 47 2e 20 Other
ffffffff96ce7fed: 74 11 Jcc +17
ffffffff96ce7fef: 48 8b 87 28 0a 00 00 Other
ffffffff96ce7ff6: 5d Other
ffffffff96ce7ff7: 48 8b 40 18 Other
ffffffff96ce7ffb: c3 Ret
ERROR: Bad RET compression (TNT=N) at 0xffffffff96ce7ffb
Warning:
8 instruction trace errors
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220905073424.3971-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add flag +e to the itrace d (decoder debug log) option to get output only
on decoding errors.
The log can be very big so reducing the output to where there are decoding
errors can be useful for analyzing errors.
By default, the log size in that case is 16384 bytes, but can be altered by
perf config e.g. perf config itrace.debug-log-buffer-size=30000
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220905073424.3971-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the perf tools with branch privilege information request flag
i.e PERF_SAMPLE_BRANCH_PRIV_SAVE that has been added earlier in the kernel.
This also updates 'perf record' documentation, branch_modes[], and generic
branch privilege level enumeration as added earlier in the kernel.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-8-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
AUX area traces can produce too much data to record successfully or
analyze subsequently. Add another means to reduce data collection by
allowing multiple recording time ranges.
This is useful, for instance, in cases where a workload produces
predictably reproducible events in specific time ranges.
Today we only have perf record -D <msecs> to start at a specific region, or
some complicated approach using snapshot mode and external scripts sending
signals or using the fifos. But these approaches are difficult to set up
compared with simply having perf do it.
Extend perf record option -D/--delay option to specifying relative time
stamps for start stop controlled by perf with the right time offset, for
instance:
perf record -e intel_pt// -D 10-20,30-40
to record 10ms to 20ms into the trace and 30ms to 40ms.
Example:
The example workload is:
$ cat repeat-usleep.c
int usleep(useconds_t usec);
int usage(int ret, const char *msg)
{
if (msg)
fprintf(stderr, "%s\n", msg);
fprintf(stderr, "Usage is: repeat-usleep <microseconds>\n");
return ret;
}
int main(int argc, char *argv[])
{
unsigned long usecs;
char *end_ptr;
if (argc != 2)
return usage(1, "Error: Wrong number of arguments!");
errno = 0;
usecs = strtoul(argv[1], &end_ptr, 0);
if (errno || *end_ptr || usecs > UINT_MAX)
return usage(1, "Error: Invalid argument!");
while (1) {
int ret = usleep(usecs);
if (ret & errno != EINTR)
return usage(1, "Error: usleep() failed!");
}
return 0;
}
$ perf record -e intel_pt//u --delay 10-20,40-70,110-160 -- ./repeat-usleep 500
Events disabled
Events enabled
Events disabled
Events enabled
Events disabled
Events enabled
Events disabled
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 0.204 MB perf.data ]
Terminated
A dlfilter is used to determine continuous data collection (timestamps
less than 1ms apart):
$ cat dlfilter-show-delays.c
static __u64 start_time;
static __u64 last_time;
int start(void **data, void *ctx)
{
printf("%-17s\t%-9s\t%-6s\n", " Time", " Duration", " Delay");
return 0;
}
int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
__u64 delta;
if (!sample->time)
return 1;
if (!last_time)
goto out;
delta = sample->time - last_time;
if (delta < 1000000)
goto out2;;
printf("%17.9f\t%9.1f\t%6.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0, delta / 1000000.0);
out:
start_time = sample->time;
out2:
last_time = sample->time;
return 1;
}
int stop(void *data, void *ctx)
{
printf("%17.9f\t%9.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0);
return 0;
}
The result shows the times roughly match the --delay option:
$ perf script --itrace=qb --dlfilter dlfilter-show-delays.so
Time Duration Delay
39215.302317300 9.7 20.5
39215.332480217 30.4 40.9
39215.403837717 49.8
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220824072814.16422-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit adds the option --known-build-ids to perf inject.
It allows the user to explicitly specify the build id for a given
path, instead of retrieving it from the current system. This is
useful in cases where a perf.data file is processed on a different
system from where it was collected, or if some of the binaries are
no longer available.
The build ids and paths are specified in pairs in the command line.
Using the file:// specifier, build ids can be loaded from a file
directly generated by perf buildid-list. This is convenient to copy
build ids from one perf.data file to another.
** Example: In this example we use perf record to create two
perf.data files, one with build ids and another without, and use
perf buildid-list and perf inject to copy the build ids from the
first file to the second.
$ perf record ls /tmp
$ perf record --no-buildid -o perf.data.no-buildid ls /tmp
$ perf buildid-list > build-ids.txt
$ perf inject -b --known-build-ids='file://build-ids.txt' \
-i perf.data.no-buildid -o perf.data.buildid
Signed-off-by: Raul Silvera <rsilvera@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220815225922.2118745-1-rsilvera@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the documentation to reflect the kernel changes.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220816125612.2042397-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Intel hybrid description is written in a different style than the
rest of the perf record man page. There were some new command line
options added after it which resulted in very strange section ordering.
Move the hybrid include last.
Also the sub sections in the hybrid document don't fit the record
manpage well (especially since it talks about all kinds of unrelated
commands). I left this for now, but would be better to separate this
properly in the different man pages.
It would be better to use sub sections for the other sections, but these
don't seem to be supported in AsciiDoc?
Some of the examples are still misrendered in the manpage with an
indented troff command, but I don't know how to fix that.
In any case it's now better than before.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220818100127.249401-1-ak@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since the new display option 'peer' is introduced, this patch is to
update the documentation to reflect it.
Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-16-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move common guest options into include files. Use attribute substitution to
customize an example, using "[verse]" to define the block instead of a
"literal" block which does not permit substitution.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220811170411.84154-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf inject' documentation is missing the guestmount option. Add it.
Fixes: 97406a7e4f ("perf inject: Add support for injecting guest sideband events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220811170411.84154-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf script' documentation is missing several options relating to
guests. Add them.
Fixes: 15a108af1a ("perf script: Allow specifying the files to process guest samples")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220811170411.84154-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --map-nr-entries option is to control number of max entries in the
perf lock contention BPF maps.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add -a/--all-cpus and -C/--cpu options for cpu filtering. Also -p/--pid
and --tid options are added for task filtering. The short -t option is
taken for --threads already. Tracking the command line workload is
possible as well.
$ sudo perf lock contention -a -b sleep 1
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add -b/--use-bpf option to use BPF to collect lock contention stats.
For simplicity it now runs system-wide and requires C-c to stop.
Upcoming changes will add the usual filtering.
$ sudo perf lock con -b
^C
contended total wait max wait avg wait type caller
42 192.67 us 13.64 us 4.59 us spinlock queue_work_on+0x20
23 85.54 us 10.28 us 3.72 us spinlock worker_thread+0x14a
6 13.92 us 6.51 us 2.32 us mutex kernfs_iop_permission+0x30
3 11.59 us 10.04 us 3.86 us mutex kernfs_dop_revalidate+0x3c
1 7.52 us 7.52 us 7.52 us spinlock kthread+0x115
1 7.24 us 7.24 us 7.24 us rwlock:W sys_epoll_wait+0x148
2 7.08 us 3.99 us 3.54 us spinlock delayed_work_timer_fn+0x1b
1 6.41 us 6.41 us 6.41 us spinlock idle_balance+0xa06
2 2.50 us 1.83 us 1.25 us mutex kernfs_iop_lookup+0x2f
1 1.71 us 1.71 us 1.71 us mutex kernfs_iop_getattr+0x2c
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' generates perf.data, which generates extra interrupts
for hard disk, amount of data to be collected increases with time.
Using eBPF trace can process the data in kernel, which solves the
preceding two problems.
Add -b/--use-bpf option for latency and report to support
tracing kwork events using eBPF:
1. Create bpf prog and attach to tracepoints,
2. Start tracing after command is entered,
3. After user hit "ctrl+c", stop tracing and report,
4. Support CPU and name filtering.
This commit implements the framework code and
does not add specific event support.
Test cases:
# perf kwork rep -h
Usage: perf kwork report [<options>]
-b, --use-bpf Use BPF to measure kwork runtime
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-b, --use-bpf Use BPF to measure kwork latency
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -b
Unsupported bpf trace class irq
# perf kwork rep -b
Unsupported bpf trace class irq
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
[ Simplify work_findnew() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements framework of perf kwork latency, which is used to report time
properties such as delay time and frequency.
Test cases:
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -C 199
Requested CPU 199 too large. Consider raising MAX_NR_CPUS
Invalid cpu bitmap
# perf kwork lat -i perf_no_exist.data
failed to open perf_no_exist.data: No such file or directory
# perf kwork lat -s avg1
Error: Unknown --sort key: `avg1'
Usage: perf kwork latency [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat --time FFFF,
Invalid time span
# perf kwork lat
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
INFO: 36.570% skipped events (31537 including 0 raise, 31537 entry, 0 exit)
Since there are no latency-enabled events, the output is empty.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-11-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements framework of 'perf kwork report', which is used to report
time properties such as run time and frequency:
Test cases:
# perf kwork
Usage: perf kwork [<options>] {record|report}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, etc)
-v, --verbose be more verbose (show symbol address, etc)
# perf kwork report -h
Usage: perf kwork report [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -S
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
Total count : 0
Total runtime (msec) : 0.000 (0.000% load average)
Total time span (msec) : 0.000
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -C 0,100
Requested CPU 100 too large. Consider raising MAX_NR_CPUS
Invalid cpu bitmap
# perf kwork report -s runtime1
Error: Unknown --sort key: `runtime1'
Usage: perf kwork report [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork report -i perf_no_exist.data
failed to open perf_no_exist.data: No such file or directory
# perf kwork report --time 00FFF,
Invalid time span
Since there are no report supported events, the output is empty.
Briefly describe the data structure:
1. "class" indicates event type. For example, irq and softiq correspond
to different types.
2. "cluster" refers to a specific event corresponding to a type. For
example, RCU and TIMER in softirq correspond to different clusters,
which contains three types of events: raise, entry, and exit.
3. "atom" includes time of each sample and sample of the previous phase.
(For example, exit corresponds to entry, which is used for timehist.)
Committer notes:
- Add {} for multiline if blocks.
- report_print_work() should either return that ret variable that
accounts how many bytes were printed or stop accounting and be void.
Do the former for now to avoid this:
builtin-kwork.c:534:6: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable]
int ret = 0;
^
1 error generated.
When building with:
⬢[acme@toolbox perf]$ clang --version
clang version 13.0.0 (https://github.com/llvm/llvm-project e8991caea8690ec2d17b0b7e1c29bf0da6609076)
Also:
- if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {
+ if (dst_type < KWORK_TRACE_MAX) {
Several versions of clang and at least this gcc:
3 51.40 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0)
builtin-kwork.c:411:16: error: comparison of unsigned enum expression >= 0 is
always true [-Werror,-Wtautological-compare]
if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {
As the first entry in a enum is zero.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-7-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Record interrupt events irq:irq_handler_entry & irq_handler_exit
Test cases:
# perf kwork record -o perf_kwork.date -- sleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 0.556 MB perf_kwork.date ]
#
# perf evlist -i perf_kwork.date
irq:irq_handler_entry
irq:irq_handler_exit
dummy:HG
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf kwork' tool is used to trace time properties of kernel work
(such as irq, softirq, and workqueue), including runtime, latency, and
timehist, using the infrastructure in the perf tools to allow tracing
extra targets.
This is the first commit to reuse the 'perf record' framework code to
implement a simple record function, kwork is not supported currently.
Test cases:
# perf
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
The most commonly used perf commands are:
<SNIP>
iostat Show I/O performance metrics
kallsyms Searches running kernel for symbols
kmem Tool to trace/measure kernel memory properties
kvm Tool to trace/measure kvm guest os
kwork Tool to trace/measure kernel work properties (latencies)
list List all symbolic event types
lock Analyze lock events
mem Profile memory accesses
record Run a command and record its profile into perf.data
<SNIP>
See 'perf help COMMAND' for more information on a specific command.
# perf kwork
Usage: perf kwork [<options>] {record}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-k, --kwork <kwork> list of kwork to profile
-v, --verbose be more verbose (show symbol address, etc)
# perf kwork record -- sleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1.787 MB perf.data ]
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-2-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like perf lock report, it can report lock contention stat of each task.
$ perf lock contention -t
contended total wait max wait avg wait pid comm
5 945.20 us 902.08 us 189.04 us 316167 EventManager_De
33 98.17 us 6.78 us 2.97 us 766063 kworker/0:1-get
7 92.47 us 61.26 us 13.21 us 316170 EventManager_De
14 76.31 us 12.87 us 5.45 us 12949 timedcall
24 76.15 us 12.27 us 3.17 us 767992 sched-pipe
15 75.62 us 11.93 us 5.04 us 15127 switchto-defaul
24 71.84 us 5.59 us 2.99 us 629168 kworker/u513:2-
17 67.41 us 7.94 us 3.96 us 13504 coroner-
1 59.56 us 59.56 us 59.56 us 316165 EventManager_De
14 56.21 us 6.89 us 4.01 us 0 swapper
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like perf lock report, add -k/--key and -F/--field options to control
output formatting and sorting. Note that it has slightly different
default options as some fields are not available and to optimize the
screen space.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf lock contention' processes the lock contention events and
displays the result like perf lock report. Right now, there's not
much difference between the two but the lock contention specific
features will come soon.
$ perf lock contention
contended total wait max wait avg wait type caller
238 1.41 ms 29.20 us 5.94 us spinlock update_blocked_averages+0x4c
1 902.08 us 902.08 us 902.08 us rwsem:R do_user_addr_fault+0x1dd
81 330.30 us 17.24 us 4.08 us spinlock _nohz_idle_balance+0x172
2 89.54 us 61.26 us 44.77 us spinlock do_anonymous_page+0x16d
24 78.36 us 12.27 us 3.27 us mutex pipe_read+0x56
2 71.58 us 59.56 us 35.79 us spinlock __handle_mm_fault+0x6aa
6 25.68 us 6.89 us 4.28 us spinlock do_idle+0x28d
1 18.46 us 18.46 us 18.46 us rtmutex exec_fw_cmd+0x21b
3 15.25 us 6.26 us 5.08 us spinlock tick_do_update_jiffies64+0x2c
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now it is possible to decode a host Intel PT trace including guest machine
user space, add documentation for the steps needed to do it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-36-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Inject events from a perf.data file recorded in a virtual machine into
a perf.data file recorded on the host at the same time.
Only side band events (e.g. mmap, comm, fork, exit etc) and build IDs are
injected. Additionally, the guest kcore_dir is copied as kcore_dir__
appended to the machine PID.
This is non-trivial because:
o It is not possible to process 2 sessions simultaneously so instead
events are first written to a temporary file.
o To avoid conflict, guest sample IDs are replaced with new unused sample
IDs.
o Guest event's CPU is changed to be the host CPU because it is more
useful for reporting and analysis.
o Sample ID is mapped to machine PID which is recorded with VCPU in the
id index. This is important to allow guest events to be related to the
guest machine and VCPU.
o Timestamps must be converted.
o Events are inserted to obey finished-round ordering.
The anticipated use-case is:
- start recording sideband events in a guest machine
- start recording an AUX area trace on the host which can trace also the
guest (e.g. Intel PT)
- run test case on the guest
- stop recording on the host
- stop recording on the guest
- copy the guest perf.data file to the host
- inject the guest perf.data file sideband events into the host perf.data
file using perf inject
- the resulting perf.data file can now be used
Subsequent patches provide Intel PT support for this.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add machine_pid and vcpu to struct perf_dlfilter_sample. The 'size' can be
used to determine if the values are present, however machine_pid is zero if
unused in any case. vcpu should be ignored if machine_pid is zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add fields machine_pid and vcpu. These are displayed only if machine_pid is
non-zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When reviewing the results of perf inject, it is useful to be able to see
the events in the order they appear in the file.
So add --dump-unsorted-raw-trace option to do an unsorted dump.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This new option displays all of the information needed to do external
BuildID-based symbolization of kernel stack traces, such as those collected
by bpf_get_stackid().
For each kernel module plus the main kernel, it displays the BuildID,
the start and end virtual addresses of that module's text range (rounded
out to page boundaries), and the pathname of the module.
When run as a non-privileged user, the actual addresses of the modules'
text ranges are not available, so the tools displays "0, <text length>" for
kernel modules and "0, 0xffffffffffffffff" for the kernel itself.
Sample output:
root# perf buildid-list -m
cf6df852fd4da122d616153353cc8f560fd12fe0 ffffffffa5400000 ffffffffa6001e27 [kernel.kallsyms]
1aa7209aa2acb067d66ed6cf7676d65066384d61 ffffffffc0087000 ffffffffc008b000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/sha512_generic.ko
3857815b5bf0183697b68f8fe0ea06121644041e ffffffffc008c000 ffffffffc0098000 /lib/modules/5.15.15-1rodete2-amd64/kernel/arch/x86/crypto/sha512-ssse3.ko
4081fde0bca2bc097cb3e9d1efcb836047d485f1 ffffffffc0099000 ffffffffc009f000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/acpi/button.ko
1ef81ba4890552ea6b0314f9635fc43fc8cef568 ffffffffc00a4000 ffffffffc00aa000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/cryptd.ko
cc5c985506cb240d7d082b55ed260cbb851f983e ffffffffc00af000 ffffffffc00b6000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/i2c/busses/i2c-piix4.ko
[...]
Committer notes:
u64 formatter should be PRIx64 for printing as hex numbers, fix this:
28 5.28 debian:experimental-x-mips : FAIL gcc version 11.2.0 (Debian 11.2.0-18)
builtin-buildid-list.c: In function 'buildid__map_cb':
builtin-buildid-list.c:32:24: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
32 | printf("%s %16lx %16lx", bid_buf, map->start, map->end);
| ~~~~^ ~~~~~~~~~~
| | |
| long unsigned int u64 {aka long long unsigned int}
| %16llx
builtin-buildid-list.c:32:30: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
32 | printf("%s %16lx %16lx", bid_buf, map->start, map->end);
| ~~~~^ ~~~~~~~~
| | |
| long unsigned int u64 {aka long long unsigned int}
| %16llx
cc1: all warnings being treated as errors
Signed-off-by: Blake Jones <blakejones@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220629213632.3899212-1-blakejones@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently it has no interface to specify the max stack depth for perf
record. Extend the command line parameter to accept a number after
'fp' to specify the depth like '--call-graph fp,32'.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add --vmlinux and --kallsyms options to support data file from
different kernels.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PMUs advertise their capabilities via sysfs attribute files but
the perf tool currently parses only core(CPU) or hybrid core PMU
capabilities. Add support of recording non-core PMU capabilities
int perf.data header.
Note that a newly proposed HEADER_PMU_CAPS is replacing existing
HEADER_HYBRID_CPU_PMU_CAPS. Special care is taken for hybrid core
PMUs by writing their capabilities first in the perf.data header
to make sure new perf.data file being read by old perf tool does
not break.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20220604044519.594-6-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation for recording sideband events in a virtual machine guest so
that they can be injected into a host perf.data file.
This is needed to enable injecting events after the initial synthesized
user events (that have an all zero id sample) but before regular events.
Committer notes:
Add entry about PERF_RECORD_FINISHED_INIT to
tools/perf/Documentation/perf.data-file-format.txt.
Committer testing:
Before:
# perf report -D | grep FINISHED
0 0x5910 [0x8]: PERF_RECORD_FINISHED_ROUND
FINISHED_ROUND events: 1 ( 0.5%)
#
After:
# perf record -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ]
# perf report -D | grep FINISHED
0 0x5068 [0x8]: PERF_RECORD_FINISHED_INIT: unhandled!
0 0x5390 [0x8]: PERF_RECORD_FINISHED_ROUND
FINISHED_ROUND events: 1 ( 0.5%)
FINISHED_INIT events: 1 ( 0.5%)
#
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220610113316.6682-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add --off-cpu option to enable the off-cpu profiling with BPF. It'd
use a bpf_output event and rename it to "offcpu-time". Samples will
be synthesized at the end of the record session using data from a BPF
map which contains the aggregated off-cpu time at context switches.
So it needs root privilege to get the off-cpu profiling.
Each sample will have a separate user stacktrace so it will skip
kernel threads. The sample ip will be set from the stacktrace and
other sample data will be updated accordingly. Currently it only
handles some basic sample types.
The sample timestamp is set to a dummy value just not to bother with
other events during the sorting. So it has a very big initial value
and increase it on processing each samples.
Good thing is that it can be used together with regular profiling like
cpu cycles. If you don't want to that, you can use a dummy event to
enable off-cpu profiling only.
Example output:
$ sudo perf record --off-cpu perf bench sched messaging -l 1000
$ sudo perf report --stdio --call-graph=no
# Total Lost Samples: 0
#
# Samples: 41K of event 'cycles'
# Event count (approx.): 42137343851
...
# Samples: 1K of event 'offcpu-time'
# Event count (approx.): 587990831640
#
# Children Self Command Shared Object Symbol
# ........ ........ ............... .................. .........................
#
81.66% 0.00% sched-messaging libc-2.33.so [.] __libc_start_main
81.66% 0.00% sched-messaging perf [.] cmd_bench
81.66% 0.00% sched-messaging perf [.] main
81.66% 0.00% sched-messaging perf [.] run_builtin
81.43% 0.00% sched-messaging perf [.] bench_sched_messaging
40.86% 40.86% sched-messaging libpthread-2.33.so [.] __read
37.66% 37.66% sched-messaging libpthread-2.33.so [.] __write
2.91% 2.91% sched-messaging libc-2.33.so [.] __poll
...
As you can see it spent most of off-cpu time in read and write in
bench_sched_messaging(). The --call-graph=no was added just to make
the output concise here.
It uses perf hooks facility to control BPF program during the record
session rather than adding new BPF/off-cpu specific calls.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an option to indicate that guest code can be found in the hypervisor
process.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220517131011.6117-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an option to indicate that guest code can be found in the hypervisor
process.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220517131011.6117-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'N/A' metric is added for store operations, update documentation to
reflect changes in the report table.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adam Li <adamli@amperemail.onmicrosoft.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220518055729.1869566-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
ptwrite is an Intel x86 instruction that writes arbitrary values into an
Intel PT trace. It is not supported on all hardware, so provide an
alternative that makes use of TNT packets to convey the payload data.
TNT packets encode Taken/Not-taken conditional branch information, so
taking branches based on the payload value will encode the value into
the TNT packet. Refer to the changes to the documentation file
perf-intel-pt.txt in this patch for an example.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220509152400.376613-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like in 'perf report' and 'perf top', Add this option to limit the
number of functions it displays based on the overhead value in percent.
This affects only stdio and stdio2 output modes. Without this, it
shows very long disassembly lines for every function in the data
file. If users don't want this behavior, they can set a value in
percent to suppress that.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220502232015.697243-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For hybrid events, by default stat aggregates and reports the event counts
per pmu.
# ./perf stat -e cycles -a sleep 1
Performance counter stats for 'system wide':
14,066,877,268 cpu_core/cycles/
6,814,443,147 cpu_atom/cycles/
1.002760625 seconds time elapsed
Sometimes, it's also useful to aggregate event counts from all PMUs.
Create a new option '--hybrid-merge' to enable that behavior and report
the counts without PMUs.
# ./perf stat -e cycles -a --hybrid-merge sleep 1
Performance counter stats for 'system wide':
20,732,982,512 cycles
1.002776793 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The SPE integration in Perf has quite a few usability quirks that
can't be found by just reading the reference manual. So document this
and at the same time add a summary of the feature that is also hard to
find elsewhere.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Co-authored-by: Al Grant <al.grant@arm.com>
Co-authored-by: Luke Dare <Luke.Dare@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220413084021.2556142-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When analyzing with 'perf script', it's useful to understand the
captured instruction and the next sequential instruction.
To calculate the address of the next sequential instruction, the length
of the captured instruction is required.
For example, you can’t know the next sequential instruction after an
unconditional branch unless you calculate that based on its length.
For branch stacks, 'perf script' only prints the instruction bytes with
'brstackinsn', but lacks the instruction length.
Add 'brstackinsnlen' to print the instruction length.
$ perf script -F ip,brstackinsn,brstackinsnlen --xed
7fa555be8f75
_start:
00007fa555be8090 mov %rsp, %rdi ilen: 3
00007fa555be8093 callq 0x7fa555be8ea0 ilen: 5 # PRED 102 cycles [102] 0.02 IPC
_dl_start+38:
00007fa555be8ec6 movq %rdx,0x227853(%rip) ilen: 7
00007fa555be8ecd leaq 0x227f94(%rip),%rdx ilen: 7
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/1647871212-184070-1-git-send-email-kan.liang@linux.intel.com
[ Added the new field to tools/perf/Documentation/perf-script.txt ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add documentation for Event Trace and TNT disable to the perf Intel PT man
page.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-26-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Amend the display to include D and t flags in the same way as the x flag.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add itrace option "I" to synthesize interrupt or similar (asynchronous)
events. This will be used for Intel PT Event Trace events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Extend --threads option in perf record command line interface.
The option can have a value in the form of masks that specify
CPUs to be monitored with data streaming threads and its layout
in system topology. The masks can be filtered using CPU mask
provided via -C option.
The specification value can be user defined list of masks. Masks
separated by colon define CPUs to be monitored by one thread and
affinity mask of that thread is separated by slash. For example:
<cpus mask 1>/<affinity mask 1>:<cpu mask 2>/<affinity mask 2>
specifies parallel threads layout that consists of two threads
with corresponding assigned CPUs to be monitored.
The specification value can be a string e.g. "cpu", "core" or
"package" meaning creation of data streaming thread for every
CPU or core or package to monitor distinct CPUs or CPUs grouped
by core or package.
The option provided with no or empty value defaults to per-cpu
parallel threads layout creating data streaming thread for every
CPU being monitored.
Document --threads option syntax and parallel data streaming modes
in Documentation/perf-record.txt.
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/079e2619be70c465317cf7c9fdaf5fa069728c32.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Provide --threads option in perf record command line interface.
The option creates a data streaming thread for each CPU in the system.
Document --threads option in Documentation/perf-record.txt.
Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/01aeae43b047f428596c4ef9f9342ab94865cedd.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to
unexpected stalls in perf record exit path, when we try to cache
profiled binaries.
# DEBUGINFOD_PROGRESS=1 ./perf record -a
^C[ perf record: Woken up 1 times to write data ]
Downloading from https://debuginfod.fedoraproject.org/ 447069
Downloading from https://debuginfod.fedoraproject.org/ 1502175
Downloading \^Z
Disabling DEBUGINFOD_URLS by default in perf record and adding
debuginfod option and .perfconfig variable support to enable id.
Default without debuginfo processing:
# perf record -a
Using system debuginfod setup:
# perf record -a --debuginfod
Using custom debuginfd url:
# perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava'
Adding single perf_debuginfod_setup function and using
it also in perf buildid-cache command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211209200425.303561-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In previous patch, we have supported the syntax which enables
the event on a specified pmu, such as:
cpu_core/<event>/
cpu_atom/<event>/
While this syntax is not very easy for applying on a set of
events or applying on a group. In following example, we have to
explicitly assign the pmu prefix.
# ./perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' -- sleep 1
Performance counter stats for 'sleep 1':
1,158,545 cpu_core/cycles/
1,003,113 cpu_core/instructions/
1.002428712 seconds time elapsed
A much easier way is:
# ./perf stat --cputype core -e '{cycles,instructions}' -- sleep 1
Performance counter stats for 'sleep 1':
1,101,071 cpu_core/cycles/
939,892 cpu_core/instructions/
1.002363142 seconds time elapsed
For this example, the '--cputype' enables the events from specified
pmu (cpu_core).
If '--cputype' conflicts with pmu prefix, '--cputype' is ignored.
# ./perf stat --cputype core -e cycles,cpu_atom/instructions/ -a -- sleep 1
Performance counter stats for 'system wide':
21,003,407 cpu_core/cycles/
367,886 cpu_atom/instructions/
1.002203520 seconds time elapsed
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210909062215.10278-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the link to documentation on AMD processors. The new link
points to a page where users can find the Processor Programming
Reference (PPR) documents for the family and model codes corresponding
to processors they are using.
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Robert Richter <rrichter@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Link: https://lore.kernel.org/r/20211123084613.243792-2-sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
AMD processors have events with event select codes and unit masks larger
than a byte. The core PMU, for example, uses 12-bit event select codes
split between bits 0-7 and 32-35 of the PERF_CTL MSRs as can be seen
from /sys/bus/event_sources/devices/cpu/format/*.
The Processor Programming Reference (PPR) lists the event codes as
unified 12-bit hexadecimal values instead and the split between the bits
is not apparent to someone who is not aware of the layout of the
PERF_CTL MSRs.
8-bit event select codes continue to work as the layout matches that of
the PERF_CTL MSRs i.e. bits 0-7 for event select and 8-15 for unit mask.
This adds more details in the perf man pages about using
/sys/bus/event_sources/devices/*/format/* for determining the correct
raw event encoding scheme.
E.g. the "op_cache_hit_miss.op_cache_hit" event with code 0x28f and
umask 0x03 can be programmed using its symbolic name as:
$ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x20000038f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
One might use a simple eventsel+umask combination based on what the
current man pages say and incorrectly program the event as:
$ sudo perf --debug perf-event-open stat -e r0328f sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x328f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
When it should have been based on the format from sysfs:
$ cat /sys/bus/event_source/devices/cpu/format/event
config:0-7,32-35
$ sudo perf --debug perf-event-open stat -e r20000038f sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x20000038f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Robert Richter <rrichter@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Link: https://lore.kernel.org/r/20211123084613.243792-1-sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update 'perf record' docs and ARM SPE recording options so that they are
consistent. This includes supporting the --no-switch-events flag in ARM
SPE as well.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-3-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Other perf tools allow specifying the path to vmlinux. 'perf inject'
didn't have this argument which made some auxtrace workflows difficult.
Also add --ignore-vmlinux for consistency with other tools.
Suggested-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Denis Nikitin <denik@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-4-james.clark@arm.com
[ Added the perf-inject man page entries for these options, as noted by Denis ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To prepare for impending deprecation of libbpf's bpf_program__get_prog_info_linear(),
pull in the function and associated helpers into the perf codebase and migrate
existing uses to the perf copy.
Since libbpf's deprecated definitions will still be visible to perf, it is necessary
to rename perf's definitions.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211011082031.4148337-4-davemarchevsky@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The instruction latency information can be recorded on
some platforms, e.g., the Intel Sapphire Rapids server. With both memory
latency (weight) and the new instruction latency information, users can
easily locate the expensive load instructions, and also understand the time
spent in different stages. The users can optimize their applications in
different pipeline stages.
Add a new field "ins_lat" to filter the instruction latency information,
which is available with sample type PERF_SAMPLE_WEIGHT_STRUCT.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Link: https://lore.kernel.org/r/1632929894-102778-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It can be useful to see debug output in between normal output.
Add support for AUXTRACE_LOG_FLG_USE_STDOUT to Intel PT.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It can be useful to see debug output in between normal output.
Add 'o' to the flags of debug option 'd', so that '--itrace=d+o' can
specify output of the debug log to stdout.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new dlfilter to show cycles.
Cycle counts are accumulated per CPU (or per thread if CPU is not recorded)
from IPC information, and printed together with the change since the last
print, at the start of each line. Separate counts are kept for branches,
instructions or other events.
Note also, the itrace A option can be useful to provide higher granularity
cycle information.
Example:
$ perf record -e intel_pt/cyc/u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data ]
$ perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so --deltatime | head
0 perf-exec 8509 [001] 0.000000000: psb offs: 0
0 perf-exec 8509 [001] 0.000000000: cbr: 42 freq: 4219 MHz (156%)
833 833 uname 8509 [001] 0.000047689: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _start
833 uname 8509 [001] 0.000003261: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start
2015 1182 uname 8509 [001] 0.000000282: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start
2676 661 uname 8509 [001] 0.000002629: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start
3612 936 uname 8509 [001] 0.000001232: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start
4579 967 uname 8509 [001] 0.000002519: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start
6145 1566 uname 8509 [001] 0.000001050: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_setup_hash
6239 94 uname 8509 [001] 0.000000023: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_sysdep_start
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Normally, for cycle-acccurate mode, IPC values are an exact number of
instructions and cycles. Due to the granularity of timestamps, that happens
only when a CYC packet correlates to the event.
Support the itrace 'A' option, to use instead, the number of cycles
associated with the current timestamp. This provides IPC information for
every change of timestamp, but at the expense of accuracy. Due to the
granularity of timestamps, the actual number of cycles increases even
though the cycles reported does not. The number of instructions is known,
but if IPC is reported, cycles can be too low and so IPC is too high. Note
that inaccuracy decreases as the period of sampling increases i.e. if the
number of cycles is too low by a small amount, that becomes less
significant if the number of cycles is large.
Furthermore, it can be used in conjunction with dlfilter-show-cycles.so
to provide higher granularity cycle information.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an option to specify that synthesized IPC can be approximate, rather
than completely accurate.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20211027080334.365596-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new option '--cputype' to 'perf list' to display core-only PMU
events or atom-only PMU events.
Each hybrid PMU event has been assigned with a PMU name, this patch
compares the PMU name before listing the result.
For example:
perf list --cputype atom
...
cache:
core_reject_l2q.any
[Counts the number of request that were not accepted into the L2Q because the L2Q is FULL. Unit: cpu_atom]
...
The "Unit: cpu_atom" is displayed in the brief description section
to indicate this is an atom event.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210903025239.22754-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Originally, software only supported redirecting at most one PEBS event to
Intel PT (PEBS-via-PT) because it was not able to differentiate one event
from another. To overcome that, add support for the
PERF_RECORD_AUX_OUTPUT_HW_ID side-band event.
Committer notes:
Cast the pointer arg to for_each_set_bit() to (unsigned long *), to fix
the build on 32-bit systems.
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210907163903.11820-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since:
https://lore.kernel.org/lkml/20200708183919.4141023-1-irogers@google.com/
The output option works for 'perf kmem', however, it must appear after
'record'. This is different to 'stat' where '-i' for the input must
appear before. Try to capture this complication in the man page.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210922212031.485950-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Considering that perf and its subcommands have so many parameters, the
documentation is always the first stop for perf beginners. Fixing some
spelling errors will relax the eyes of some readers a little bit.
s/specicfication/specification/
s/caheline/cacheline/
s/tranasaction/transaction/
s/complan/complain/
s/sched_wakep/sched_wakeup/
s/possble/possible/
s/methology/methodology/
Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210924081942.38368-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an option to control the synthesizing behavior.
--synth <no|all|task|mmap|cgroup>
Fine-tune event synthesis: default=all
This can be useful when we know it doesn't need some synthesis like
in a specific usecase and/or when using pipe:
$ perf record -a --all-cgroups --synth cgroup -o- sleep 1 | \
> perf report -i- -s cgroup
Committer notes:
Added a clarification to the man page entry for --synth that this is
about pre-existing threads.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210811044658.1313391-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like all locally-built programs, dlfilters may need to be re-built if
shared libraries they use change. Also there may be unexpected results
if the dfilter uses different versions of the shared libraries that perf
uses.
Note those things in the documentation.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The option --list-dlfilters does use a string value.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 638e2b9984 ("perf script Add option to list dlfilters")
Link: https //lore.kernel.org/r/20210811101036.17986-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_dlfilter_fns must not be const, because it is not.
Declaring it const can result in it being mapped read-only, causing a
segfaullt when it is written. Update documentation accordingly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 8defa7147d5572 ("perf script Add API for filtering via dynamically loaded shared object")
Link: https //lore.kernel.org/r/20210811101036.17986-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
" -- " is an em dash (—) in asciidoc, so all these examples that were
supposed to be producing a literal two dashes were being misrendered.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210809153226.332545-1-hi@alyssa.is
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The text ranging from "subsystem__event_name" to "raw_syscalls__sys_enter()"
is interpreted by asciidoc as a pair of unconstrained text formatting markers.
The result is that the manual page displayed this text as underlined,
and the HTML pages displayed this text as italicized. Escape the first
double-underscore to prevent this.
https://docs.asciidoctor.org/asciidoc/latest/syntax-quick-reference/
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210806204502.110305-1-stephen.s.brennan@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Place early, as they are in the git Makefile. Remove references to a
'technical` directory that doesn't exist in perf.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
howto-index.sh exists in git but not in perf, as such targets that
depend upon it fail. Remove such failing targets.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
cmd-list.perl exists in git but not in perf. As such these targets fail
with missing dependencies. Remove them.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Enabled to ensure that info pages build.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf doesn't have a user-manual.txt, but git does and this explains why
there are references here. Having these references breaks 'make info' as
user-manual.info can't be created given the missing dependency. Remove
all references to user-manual so that 'make info' can succeed.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The doc.dep dependencies for the Makefile fail to build as
build-docdep.perl is missing. Add this file from git.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this change 'make perfman.info' fails as cat-texi.perl is
missing. It also fails as the makeinfo output isn't written into the
appropriate file. Add cat-texi.perl from git. Add missing output file
flag for makeinfo.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Intel PT decoder limits the number of unconditional branches (e.g.
jmps) decoded without consuming any trace packets. Generally, a loop
needs a conditional branch which generates a TNT packet, whereas a "ret"
instruction will generate a TIP or TNT packet. So exceeding the limit is
assumed to be a never-ending loop, which can happen if there has been a
decoding error putting the decoder at the wrong place in the code.
Up until now, the limit of 10000 has been enough but some analytic
purposes have been reported to exceed that.
Increase the limit to 100000, and make it configurable via perf config
intel-pt.max-loops. Also amend the "Never-ending loop" message to
mention the configuration entry.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210701175132.3977-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a function, for use by dlfilters, to read object code.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a function, for use by dlfilters, to return the perf_event_attr
structure.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a function, for use by dlfilters, to return source code file name and
line number.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a function, for use by dlfilters, to return instruction bytes.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a function, for use by dlfilters, to resolve addresses from branch
stacks or callchains.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
be repeated to pass more than 1 argument.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add option --list-dlfilters to list dlfilters in the current directory or
the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must
come before option --list-dlfilters) to show long descriptions.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
filter_event_early() can be more than 30% faster than filter_event()
because it is called before internal filtering. In other respects it
is the same as filter_event(), except that it will be passed events
that have yet to be filtered out.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In some cases, users want to filter very large amounts of data (e.g.
from AUX area tracing like Intel PT) looking for something specific.
While scripting such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written
and loaded.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a section to notify the permission and sysctl setting for perf
probe. And fix some indentations.
Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/162204068898.388434.16842705842611255787.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an instruction trace and a source trace to the intel-pt-events.py
script.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new fields and functions to the perf-script-python documentation.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add 'g' (guest) for VM-Entry and 'h' (host) for VM-Exit.
Fixes: c025d46cd9 ("perf script: Add branch types for VM-Entry and VM-Exit")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210521175127.27264-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf has supported the CPU_PMU_CAPS feature to display a list of CPU PMU
capabilities. But on a hybrid platform, it may have several CPU PMUs (such
as "cpu_core" and "cpu_atom"). The CPU_PMU_CAPS feature is hard to extend
to support multiple CPU PMUs well if it needs to be compatible for the case
of old perf data file + new perf tool.
So for better compatibility we now create a new feature HYBRID_CPU_PMU_CAPS
in the header.
For the perf.data generated on hybrid platform,
root@otcpl-adl-s-2:~# perf report --header-only -I
# cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
# cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CPU_PMU_CAPS CLOCK_DATA
For the perf.data generated on non-hybrid platform
root@kbl-ppc:~# perf report --header-only -I
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY HYBRID_CPU_PMU_CAPS
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210514122948.9472-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is useful to let the user know about the hybrid topology.
Add the HYBRID_TOPOLOGY feature in header to indicate the core CPUs
and the atom CPUs.
With this patch a perf.data generated on a hybrid platform reports
the hybrid CPU list:
root@otcpl-adl-s-2:~# perf report --header-only -I
...
# hybrid cpu system:
# cpu_core cpu list : 0-15
# cpu_atom cpu list : 16-23
For a perf.data generated on a non-hybrid platform, reports a message
that HYBRID_TOPOLOGY is missing:
root@kbl-ppc:~# perf report --header-only -I
...
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210514122948.9472-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add parsing and validation of VM Time Correlation options, and pass
parameters to the decoder. Also update the Intel PT documentation
accordingly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT timestamps are affected by virtualization. Add a new option
that will allow the Intel PT decoder to correlate the timestamps and
translate the virtual machine timestamps to host timestamps.
The advantages of making this a separate step, rather than a part of
normal decoding are that it is simpler to implement, and it needs to
be done only once.
This patch adds only the option. Later patches add Intel PT support.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Correlating virtual machine TSC packets is not supported at present, so
instead support the Z itrace option.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Issues correlating timestamps can be avoided with timeless decoding. Add
an option for that, so that timeless decoding can be used even when
timestamps are present.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add some words and examples to help understanding of
Intel hybrid perf support.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-27-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To make the output more readable, I think it's better to remove 0's in
the output. Also the dummy event has no event stats so it just wasts
the space. Let's use the --skip-empty option to suppress it.
$ perf report --stat --skip-empty
Aggregated stats:
TOTAL events: 16530
MMAP events: 226
COMM events: 1596
EXIT events: 2
THROTTLE events: 121
UNTHROTTLE events: 117
FORK events: 1595
SAMPLE events: 719
MMAP2 events: 12147
CGROUP events: 2
FINISHED_ROUND events: 2
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
cycles stats:
SAMPLE events: 719
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, to use BPF to aggregate perf event counters, the user uses
--bpf-counters option. Enable "use bpf by default" events with a config
option, stat.bpf-counter-events. Events with name in the option will use
BPF.
This also enables mixed BPF event and regular event in the same sesssion.
For example:
perf config stat.bpf-counter-events=instructions
perf stat -e instructions,cs
The second command will use BPF for "instructions" but not "cs".
Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update Topdown documentation to permit calls to rdpmc, and describe
interaction with system calls.
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/20210421091009.1711565-1-mdr@ashroe.eu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This functionality is based on recently introduced sysfs attributes for
Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
#EventCount * 4B / (1024 * 1024)
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf annotate' supports --symbol but it's impossible to filter a C++
symbol. With --no-demangle one can filter easily by mangled function
name.
Signed-off-by: Martin Liška <mliska@suse.cz>
Link: http://lore.kernel.org/lkml/c3c7e959-9f7f-18e2-e795-f604275cbac3@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The pipeline stage cycles details can be recorded on powerpc from the
contents of Performance Monitor Unit (PMU) registers. On ISA v3.1
platform, sampling registers exposes the cycles spent in different
pipeline stages. Patch adds perf tools support to present two of the
cycle counter information along with memory latency (weight).
Re-use the field 'ins_lat' for storing the first pipeline stage cycle.
This is stored in 'var2_w' field of 'perf_sample_weight'.
Add a new field 'p_stage_cyc' to store the second pipeline stage cycle
which is stored in 'var3_w' field of perf_sample_weight.
Add new sort function 'Pipeline Stage Cycle' and include this in
default_mem_sort_order[]. This new sort function may be used to denote
some other pipeline stage in another architecture. So add this to list
of sort entries that can have dynamic header string.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/1616425047-1666-5-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf stat' subcommand supports the request for a summary of the
interval counter readings. But the summary lines break the CSV output
so it's hard for scripts to parse the result.
Before:
# perf stat -x, -I1000 --interval-count 1 --summary
1.001323097,8013.48,msec,cpu-clock,8013483384,100.00,8.013,CPUs utilized
1.001323097,270,,context-switches,8013513297,100.00,0.034,K/sec
1.001323097,13,,cpu-migrations,8013530032,100.00,0.002,K/sec
1.001323097,184,,page-faults,8013546992,100.00,0.023,K/sec
1.001323097,20574191,,cycles,8013551506,100.00,0.003,GHz
1.001323097,10562267,,instructions,8013564958,100.00,0.51,insn per cycle
1.001323097,2019244,,branches,8013575673,100.00,0.252,M/sec
1.001323097,106152,,branch-misses,8013585776,100.00,5.26,of all branches
8013.48,msec,cpu-clock,8013483384,100.00,7.984,CPUs utilized
270,,context-switches,8013513297,100.00,0.034,K/sec
13,,cpu-migrations,8013530032,100.00,0.002,K/sec
184,,page-faults,8013546992,100.00,0.023,K/sec
20574191,,cycles,8013551506,100.00,0.003,GHz
10562267,,instructions,8013564958,100.00,0.51,insn per cycle
2019244,,branches,8013575673,100.00,0.252,M/sec
106152,,branch-misses,8013585776,100.00,5.26,of all branches
The summary line loses the timestamp column, which breaks the CSV
output.
We add a column at the original 'timestamp' position and it just says
'summary' for the summary line.
After:
# perf stat -x, -I1000 --interval-count 1 --summary
1.001196053,8012.72,msec,cpu-clock,8012722903,100.00,8.013,CPUs utilized
1.001196053,218,,context-switches,8012753271,100.00,0.027,K/sec
1.001196053,9,,cpu-migrations,8012769767,100.00,0.001,K/sec
1.001196053,0,,page-faults,8012786257,100.00,0.000,K/sec
1.001196053,15004518,,cycles,8012790637,100.00,0.002,GHz
1.001196053,7954691,,instructions,8012804027,100.00,0.53,insn per cycle
1.001196053,1590259,,branches,8012814766,100.00,0.198,M/sec
1.001196053,82601,,branch-misses,8012824365,100.00,5.19,of all branches
summary,8012.72,msec,cpu-clock,8012722903,100.00,7.986,CPUs utilized
summary,218,,context-switches,8012753271,100.00,0.027,K/sec
summary,9,,cpu-migrations,8012769767,100.00,0.001,K/sec
summary,0,,page-faults,8012786257,100.00,0.000,K/sec
summary,15004518,,cycles,8012790637,100.00,0.002,GHz
summary,7954691,,instructions,8012804027,100.00,0.53,insn per cycle
summary,1590259,,branches,8012814766,100.00,0.198,M/sec
summary,82601,,branch-misses,8012824365,100.00,5.19,of all branches
Now it's easy for script to analyse the summary lines.
Of course, we also consider not to break possible existing scripts which
can continue to use the broken CSV format by using a new '--no-csv-summary.'
option.
# perf stat -x, -I1000 --interval-count 1 --summary --no-csv-summary
1.001213261,8012.67,msec,cpu-clock,8012672327,100.00,8.013,CPUs utilized
1.001213261,197,,context-switches,8012703742,100.00,24.586,/sec
1.001213261,9,,cpu-migrations,8012720902,100.00,1.123,/sec
1.001213261,644,,page-faults,8012738266,100.00,80.373,/sec
1.001213261,18350698,,cycles,8012744109,100.00,0.002,GHz
1.001213261,12745021,,instructions,8012759001,100.00,0.69,insn per cycle
1.001213261,2458033,,branches,8012770864,100.00,306.768,K/sec
1.001213261,102107,,branch-misses,8012781751,100.00,4.15,of all branches
8012.67,msec,cpu-clock,8012672327,100.00,7.985,CPUs utilized
197,,context-switches,8012703742,100.00,24.586,/sec
9,,cpu-migrations,8012720902,100.00,1.123,/sec
644,,page-faults,8012738266,100.00,80.373,/sec
18350698,,cycles,8012744109,100.00,0.002,GHz
12745021,,instructions,8012759001,100.00,0.69,insn per cycle
2458033,,branches,8012770864,100.00,306.768,K/sec
102107,,branch-misses,8012781751,100.00,4.15,of all branches
This option can be enabled in perf config by setting the variable
'stat.no-csv-summary'.
# perf config stat.no-csv-summary=true
# perf config -l
stat.no-csv-summary=true
# perf stat -x, -I1000 --interval-count 1 --summary
1.001330198,8013.28,msec,cpu-clock,8013279201,100.00,8.013,CPUs utilized
1.001330198,205,,context-switches,8013308394,100.00,25.583,/sec
1.001330198,10,,cpu-migrations,8013324681,100.00,1.248,/sec
1.001330198,0,,page-faults,8013340926,100.00,0.000,/sec
1.001330198,8027742,,cycles,8013344503,100.00,0.001,GHz
1.001330198,2871717,,instructions,8013356501,100.00,0.36,insn per cycle
1.001330198,553564,,branches,8013366204,100.00,69.081,K/sec
1.001330198,54021,,branch-misses,8013375952,100.00,9.76,of all branches
8013.28,msec,cpu-clock,8013279201,100.00,7.985,CPUs utilized
205,,context-switches,8013308394,100.00,25.583,/sec
10,,cpu-migrations,8013324681,100.00,1.248,/sec
0,,page-faults,8013340926,100.00,0.000,/sec
8027742,,cycles,8013344503,100.00,0.001,GHz
2871717,,instructions,8013356501,100.00,0.36,insn per cycle
553564,,branches,8013366204,100.00,69.081,K/sec
54021,,branch-misses,8013375952,100.00,9.76,of all branches
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210319070156.20394-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf tool uses performance monitoring counters (PMCs) to monitor
system performance. The PMCs are limited hardware resources. For
example, Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
Modern data center systems use these PMCs in many different ways: system
level monitoring, (maybe nested) container level monitoring, per process
monitoring, profiling (in sample mode), etc. In some cases, there are
more active perf_events than available hardware PMCs. To allow all
perf_events to have a chance to run, it is necessary to do expensive
time multiplexing of events.
On the other hand, many monitoring tools count the common metrics
(cycles, instructions). It is a waste to have multiple tools create
multiple perf_events of "cycles" and occupy multiple PMCs.
bperf tries to reduce such wastes by allowing multiple perf_events of
"cycles" or "instructions" (at different scopes) to share PMUs. Instead
of having each perf-stat session to read its own perf_events, bperf uses
BPF programs to read the perf_events and aggregate readings to BPF maps.
Then, the perf-stat session(s) reads the values from these BPF maps.
Please refer to the comment before the definition of bperf_ops for the
description of bperf architecture.
bperf is off by default. To enable it, pass --bpf-counters option to
perf-stat. bperf uses a BPF hashmap to share information about BPF
programs and maps used by bperf. This map is pinned to bpffs. The
default path is /sys/fs/bpf/perf_attr_map. The user could change the
path with option --bpf-attr-map.
Committer testing:
# dmesg|grep "Performance Events" -A5
[ 0.225277] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
[ 0.225280] ... version: 0
[ 0.225280] ... bit width: 48
[ 0.225281] ... generic registers: 6
[ 0.225281] ... value mask: 0000ffffffffffff
[ 0.225281] ... max period: 00007fffffffffff
#
# for a in $(seq 6) ; do perf stat -a -e cycles,instructions sleep 100000 & done
[1] 2436231
[2] 2436232
[3] 2436233
[4] 2436234
[5] 2436235
[6] 2436236
# perf stat -a -e cycles,instructions sleep 0.1
Performance counter stats for 'system wide':
310,326,987 cycles (41.87%)
236,143,290 instructions # 0.76 insn per cycle (41.87%)
0.100800885 seconds time elapsed
#
We can see that the counters were enabled for this workload 41.87% of
the time.
Now with --bpf-counters:
# for a in $(seq 32) ; do perf stat --bpf-counters -a -e cycles,instructions sleep 100000 & done
[1] 2436514
[2] 2436515
[3] 2436516
[4] 2436517
[5] 2436518
[6] 2436519
[7] 2436520
[8] 2436521
[9] 2436522
[10] 2436523
[11] 2436524
[12] 2436525
[13] 2436526
[14] 2436527
[15] 2436528
[16] 2436529
[17] 2436530
[18] 2436531
[19] 2436532
[20] 2436533
[21] 2436534
[22] 2436535
[23] 2436536
[24] 2436537
[25] 2436538
[26] 2436539
[27] 2436540
[28] 2436541
[29] 2436542
[30] 2436543
[31] 2436544
[32] 2436545
#
# ls -la /sys/fs/bpf/perf_attr_map
-rw-------. 1 root root 0 Mar 23 14:53 /sys/fs/bpf/perf_attr_map
# bpftool map | grep bperf | wc -l
64
#
# bpftool map | tail
1265: percpu_array name accum_readings flags 0x0
key 4B value 24B max_entries 1 memlock 4096B
1266: hash name filter flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
1267: array name bperf_fo.bss flags 0x400
key 4B value 8B max_entries 1 memlock 4096B
btf_id 996
pids perf(2436545)
1268: percpu_array name accum_readings flags 0x0
key 4B value 24B max_entries 1 memlock 4096B
1269: hash name filter flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
1270: array name bperf_fo.bss flags 0x400
key 4B value 8B max_entries 1 memlock 4096B
btf_id 997
pids perf(2436541)
1285: array name pid_iter.rodata flags 0x480
key 4B value 4B max_entries 1 memlock 4096B
btf_id 1017 frozen
pids bpftool(2437504)
1286: array flags 0x0
key 4B value 32B max_entries 1 memlock 4096B
#
# bpftool map dump id 1268 | tail
value (CPU 21):
8f f3 bc ca 00 00 00 00 80 fd 2a d1 4d 00 00 00
80 fd 2a d1 4d 00 00 00
value (CPU 22):
7e d5 64 4d 00 00 00 00 a4 8a 2e ee 4d 00 00 00
a4 8a 2e ee 4d 00 00 00
value (CPU 23):
a7 78 3e 06 01 00 00 00 b2 34 94 f6 4d 00 00 00
b2 34 94 f6 4d 00 00 00
Found 1 element
# bpftool map dump id 1268 | tail
value (CPU 21):
c6 8b d9 ca 00 00 00 00 20 c6 fc 83 4e 00 00 00
20 c6 fc 83 4e 00 00 00
value (CPU 22):
9c b4 d2 4d 00 00 00 00 3e 0c df 89 4e 00 00 00
3e 0c df 89 4e 00 00 00
value (CPU 23):
18 43 66 06 01 00 00 00 5b 69 ed 83 4e 00 00 00
5b 69 ed 83 4e 00 00 00
Found 1 element
# bpftool map dump id 1268 | tail
value (CPU 21):
f2 6e db ca 00 00 00 00 92 67 4c ba 4e 00 00 00
92 67 4c ba 4e 00 00 00
value (CPU 22):
dc 8e e1 4d 00 00 00 00 d9 32 7a c5 4e 00 00 00
d9 32 7a c5 4e 00 00 00
value (CPU 23):
bd 2b 73 06 01 00 00 00 7c 73 87 bf 4e 00 00 00
7c 73 87 bf 4e 00 00 00
Found 1 element
#
# perf stat --bpf-counters -a -e cycles,instructions sleep 0.1
Performance counter stats for 'system wide':
119,410,122 cycles
152,105,479 instructions # 1.27 insn per cycle
0.101395093 seconds time elapsed
#
See? We had the counters enabled all the time.
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: kernel-team@fb.com
Link: http://lore.kernel.org/lkml/20210316211837.910506-2-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>