Detected with gcc's ASan:
Direct leak of 66 byte(s) in 5 object(s) allocated from:
#0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
#1 0x560c8761034d in collect_config util/config.c:597
#2 0x560c8760d9cb in get_value util/config.c:169
#3 0x560c8760dfd7 in perf_parse_file util/config.c:285
#4 0x560c8760e0d2 in perf_config_from_file util/config.c:476
#5 0x560c876108fd in perf_config_set__init util/config.c:661
#6 0x560c87610c72 in perf_config_set__new util/config.c:709
#7 0x560c87610d2f in perf_config__init util/config.c:718
#8 0x560c87610e5d in perf_config util/config.c:730
#9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442
#10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Fixes: 20105ca124 ("perf config: Introduce perf_config_set class")
Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The option 'sort-order' should be 'sort_order'.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes: 893c5c798b ("perf config: Show default report configuration in example and docs")
Link: http://lkml.kernel.org/r/20190316080556.3075-5-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Optimization level '-Og' offers a reasonable level of optimization while
maintaining fast compilation and a good debugging experience. This patch
tries to make it work.
$ make DEBUG=1 EXTRA_CFLAGS='-Og'
bench/epoll-ctl.c: In function ‘do_threads’:
bench/epoll-ctl.c:274:9: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
return ret;
^~~
...
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20190316080556.3075-4-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Detected via gcc's ASan:
Direct leak of 2048 byte(s) in 64 object(s) allocated from:
6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370)
7 #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43
8 #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85
9 #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250
10 #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382
11 #5 0x556b0f0e3231 in print_events util/parse-events.c:2514
12 #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
13 #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
14 #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
15 #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
16 #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520
17 #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes: 89896051f8 ("perf tools: Do not put a variable sized type not at the end of a struct")
Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
AddressSanitizer (or ASan) and UndefinedBehaviorSanitizer (or UBSan) are
very useful tools to detect program bugs:
- AddressSanitizer (or ASan) is a GCC feature that detects memory
corruption bugs such as buffer overflows and memory leaks.
- UndefinedBehaviorSanitizer (or UBSan) is a fast undefined behavior
detector supported by GCC. UBSan detects undefined behaviors of programs
at runtime.
This patch adds a document about how to use them on perf. Later patches will fix
some of the issues disclosed by them.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20190316080556.3075-2-changbin.du@gmail.com
[ Make some changes based on comments made by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -c option to enable multiplex scaling has been useless for quite
some time because scaling is default.
It's only useful as --no-scale to disable scaling. But the non scaling
code path has bitrotted and doesn't print anything because perf output
code relies on value run/ena information.
Also even when we don't want to scale a value it's still useful to show
its multiplex percentage.
This patch:
- Fixes help and documentation to show --no-scale instead of -c
- Removes -c, only keeps the long option because -c doesn't support negatives.
- Enables running/enabled even with --no-scale
- And fixes some other problems in the no-scale output.
Before:
$ perf stat --no-scale -e cycles true
Performance counter stats for 'true':
<not counted> cycles
0.000984154 seconds time elapsed
After:
$ ./perf stat --no-scale -e cycles true
Performance counter stats for 'true':
706,070 cycles
0.001219821 seconds time elapsed
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LPU-Reference: 20190314225002.30108-9-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-xggjvwcdaj2aqy8ib3i4b1g6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When comparing time stamps in 'perf script' traces it can be annoying to
work with the full perf time stamps.
Add a --reltime option that displays time stamps relative to the trace
start to make it easier to read the traces.
Note: not currently supported for --time. Report an error in this
case.
Before:
% perf script
swapper 0 [000] 245402.891216: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 245402.891223: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 245402.891227: 5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 245402.891231: 41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms])
swapper 0 [000] 245402.891235: 355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms])
swapper 0 [000] 245402.891239: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms])
After:
% perf script --reltime
swapper 0 [000] 0.000000: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 0.000006: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 0.000010: 5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 0.000014: 41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms])
swapper 0 [000] 0.000018: 355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms])
swapper 0 [000] 0.000022: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms])
Committer notes:
Do not use 'time' as the name of a variable, as this breaks the build on
older glibcs:
cc1: warnings being treated as errors
builtin-script.c: In function 'perf_sample__fprintf_start':
builtin-script.c:691: warning: declaration of 'time' shadows a global declaration
/usr/include/time.h:187: warning: shadowed declaration is here
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20190314225002.30108-8-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-bpahyi6pr9r399mvihu65fvc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Show all the supported sort keys in the command line help output, so
that it's not needed to refer to the manpage.
Before:
% perf report -h
...
-s, --sort <key[,key2...]>
sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline, ... Please refer the man page for the complete list.
After:
% perf report -h
...
-s, --sort <key[,key2...]>
sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys overhead_guest_us overhead_children sample period pid comm dso symbol parent cpu ...
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20190314225002.30108-5-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-9r3uz2ch4izoi1uln3f889co@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The help description for --switch-output looks like there are multiple
comma separated fields. But it's actually a choice of different options.
Make it clear and less confusing.
Before:
% perf record -h
...
--switch-output[=<signal,size,time>]
Switch output when receive SIGUSR2 or cross size,time threshold
After:
% perf record -h
...
--switch-output[=<signal or size[BKMG] or time[smhd]>]
Switch output when receiving SIGUSR2 (signal) or cross a size or time threshold
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20190314225002.30108-4-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-9yecyuha04nyg8toyd1b2pgi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When doing long term recording and waiting for some event to snapshot
on, we often only care about the last minute or so.
The --switch-output command line option supports rotating the perf.data
file when the size exceeds a threshold. But the disk would still be
filled with unnecessary old files.
Add a new option to only keep a number of rotated files, so that the
disk space usage can be limited.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20190314225002.30108-3-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-y5u2lik0ragt4vlktz6qc9ks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a filter is specified on the command line, filter the metrics too.
Before:
% perf list foo
List of pre-defined events (to be used in -e):
Metric Groups:
DSB:
DSB_Coverage
[Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
... more metrics ...
After:
% perf list foo
List of pre-defined events (to be used in -e):
Metric Groups:
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20190314225002.30108-1-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-1y8oi2s8c4jhjtykgs5zvda1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a way to define custom scripts through ~/.perfconfig, which are then
added to the scripts menu. The scripts get the same arguments as 'perf
script', in particular -i, --cpu, --tid.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-10-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the argv ui browser code to correctly display more entries than fit
on the screen without crashing. The problem was some type confusion with
pointer types in the ->seek function. Do the argv arithmetic correctly
with char ** pointers. Also add some asserts to find overruns and limit
the display function correctly.
Then finally remove a workaround for this in the res sample browser.
Committer testing:
1) Resize the x terminal to have just some 5 lines
2) Use 'perf report --samples 1' to activate the sample browser options
in the menu
3) Press ENTER, this will cause the crash:
# perf report --samples 1
perf: Segmentation fault
-------- backtrace --------
perf[0x5a514a]
/lib64/libc.so.6(+0x385bf)[0x7f27281b55bf]
/lib64/libc.so.6(+0x161a67)[0x7f27282dea67]
/lib64/libslang.so.2(SLsmg_write_wrapped_string+0x82)[0x7f272874a0b2]
perf(ui_browser__argv_refresh+0x77)[0x5939a7]
perf[0x5924cc]
perf(ui_browser__run+0x39)[0x593449]
perf(ui__popup_menu+0x83)[0x5a5263]
perf[0x59f421]
perf(perf_evlist__tui_browse_hists+0x3a0)[0x5a3780]
perf(cmd_report+0x2746)[0x447136]
perf[0x4a95fe]
perf(main+0x61c)[0x42dc6c]
/lib64/libc.so.6(__libc_start_main+0xf2)[0x7f27281a1412]
perf(_start+0x2d)[0x42de9d]
#
After applying this patch no crash takes place in such situation.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-12-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Don't overflow array when the scripts directory is too large, or the
script file name is too long.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now 'perf report' can show whole time periods with 'perf script', but
the user still has to find individual samples of interest manually.
It would be expensive and complicated to search for the right samples in
the whole perf file. Typically users only need to look at a small number
of samples for useful analysis.
Also the full scripts tend to show samples of all CPUs and all threads
mixed up, which can be very confusing on larger systems.
Add a new --samples option to save a small random number of samples per
hist entry.
Use a reservoir sample technique to select a representatve number of
samples.
Then allow browsing the samples using 'perf script' as part of the hist
entry context menu. This automatically adds the right filters, so only
the thread or cpu of the sample is displayed. Then we use less' search
functionality to directly jump the to the time stamp of the selected
sample.
It uses different menus for assembler and source display. Assembler
needs xed installed and source needs debuginfo.
Currently it only supports as many samples as fit on the screen due to
some limitations in the slang ui code.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311174605.GA29294@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The scripts menu traditionally only showed custom perf scripts.
Allow to run standard perf script with useful default options too.
- Normal perf script
- perf script with assembler (needs xed installed)
- perf script with source code output (needs debuginfo)
- perf script with custom arguments
Then we automatically select the right options to display the
information in the perf.data file.
For example with -b display branch contexts.
It's not easily possible to check for xed's existence in advance. perf
script usually gives sensible error messages when it's not available.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using the time sort key, add new context menus to run scripts for
only the currently selected time range. Compute the correct range for
the selection add pass it as the --time option to perf script.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a time sort key to perf report to display samples for different time
quantums separately. This allows easier analysis of workloads that
change over time, and also will allow looking at the context of samples.
% perf record ...
% perf report --sort time,overhead,symbol --time-quantum 1ms --stdio
...
0.67% 277061.87300 [.] _dl_start
0.50% 277061.87300 [.] f1
0.50% 277061.87300 [.] f2
0.33% 277061.87300 [.] main
0.29% 277061.87300 [.] _dl_lookup_symbol_x
0.29% 277061.87300 [.] dl_main
0.29% 277061.87300 [.] do_lookup_x
0.17% 277061.87300 [.] _dl_debug_initialize
0.17% 277061.87300 [.] _dl_init_paths
0.08% 277061.87300 [.] check_match
0.04% 277061.87300 [.] _dl_count_modids
1.33% 277061.87400 [.] f1
1.33% 277061.87400 [.] f2
1.33% 277061.87400 [.] main
1.17% 277061.87500 [.] main
1.08% 277061.87500 [.] f1
1.08% 277061.87500 [.] f2
1.00% 277061.87600 [.] main
0.83% 277061.87600 [.] f1
0.83% 277061.87600 [.] f2
1.00% 277061.87700 [.] main
Committer notes:
Rename 'time' argument to hist_time() to htime to overcome this in older
distros:
cc1: warnings being treated as errors
util/hist.c: In function 'hist_time':
util/hist.c:251: error: declaration of 'time' shadows a global declaration
/usr/include/time.h:186: error: shadowed declaration is here
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --cpu option only filtered samples. Filter other perf events, such
as COMM, FORK, SWITCH by the CPU too.
Reported-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in 7948450d45 ("x86/x32: use time64 versions of
sigtimedwait and recvmmsg"), that doesn't cause any change in behaviour
in tools/perf/ as it deals just with the x32 entries.
This silences this tools/perf build warning:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-mqpvshayeqidlulx5qpioa59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a printdate function to eliminate the repetitive use of
datetime.datetime.today() in the SQL exporting scripts.
Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20190309000518.2438-5-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the export-to-sqlite.py script
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20190309000518.2438-4-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the export-to-postgresql.py script.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Link: http://lkml.kernel.org/r/20190309000518.2438-3-tonyj@suse.de
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the exported-sql-viewer.py script.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20190309000518.2438-2-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The UI viewer for scripts output has a lot of limitations: limited size,
no search or save function, slow, and various other issues.
Just use 'less' to display directly on the terminal instead.
This won't work in GTK mode, but GTK doesn't support these context menus
anyways. If that is ever done could use an terminal for the output.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Feng Tang <feng.tang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190309055628.21617-8-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding callback function to reader object so callers can process data in
different ways.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190308134745.5057-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The data files layout is described by HEADER_DIR_FORMAT feature.
Currently it holds only version number (1):
uint64_t version;
The current version holds only version value (1) means that data files:
- Follow the 'data.*' name format.
- Contain raw events data in standard perf format as read from kernel
(and need to be sorted)
Future versions are expected to describe different data files layout
according to special needs.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190308134745.5057-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make perf_data__size() return proper size for directory data, summing up
all the individual file sizes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190308134745.5057-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add perf_data__update_dir() to update the size for every file within the
perf.data directory.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190308134745.5057-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't store the auxtrace index when we store into multiple files,
because we keep only offset for it, not the file.
The auxtrace data will be processed correctly in the 'pipe' mode.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190308134745.5057-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The caller needs to set 'struct perf_data::is_dir flag and the path will
be treated as a directory.
The 'struct perf_data::file' is initialized and open as 'path/header'
file.
Add a check to the direcory interface functions to check the is_dir flag.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190308134745.5057-2-jolsa@kernel.org
[ Be consistent on how to signal failure, i.e. use -1 and let users check errno ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thi patch adds PMC events for AMD Family 17 CPUs as defined in [1]. It
covers events described in section: 2.1.13. Regex pattern in mapfile.csv
covers all CPUs of the family.
[1] https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
Signed-off-by: Martin Liška <mliska@suse.cz>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/r/d65873ca-e402-b198-4fe9-8c4af81258c8@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit 4d99e41365 ("perf machine: Workaround missing maps for
x86 PTI entry trampolines"), perf tools has been creating more than one
kernel map, however 'perf probe' assumed there could be only one.
Fix by using machine__kernel_map() to get the main kernel map.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Xu Yu <xuyu@linux.alibaba.com>
Fixes: 4d99e41365 ("perf machine: Workaround missing maps for x86 PTI entry trampolines")
Fixes: d83212d5dd ("kallsyms, x86: Export addresses of PTI entry trampolines")
Link: http://lkml.kernel.org/r/2ed432de-e904-85d2-5c36-5897ddc5b23b@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Many workloads change over time. 'perf report' currently aggregates the
whole time range reported in perf.data.
This patch adds an option for a time quantum to quantisize the perf.data
over time.
This just adds the option, will be used in follow on patches for a time
sort key.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-6-andi@firstfloor.org
[ Use NSEC_PER_[MU]SEC ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a utility function to print nanosecond timestamps.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Upcoming changes add timestamp output in perf report. Add a --ns
argument similar to perf script to support nanoseconds resolution when
needed.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf updates from Thomas Gleixner:
"Perf updates and fixes:
Kernel:
- Handle events which have the bpf_event attribute set as side band
events as they carry information about BPF programs.
- Add missing switch-case fall-through comments
Libraries:
- Fix leaks and double frees in error code paths.
- Prevent buffer overflows in libtraceevent
Tools:
- Improvements in handling Intel BT/PTS
- Add BTF ELF markers to perf trace BPF programs to improve output
- Support --time, --cpu, --pid and --tid filters for perf diff
- Calculate the column width in perf annotate as the hardcoded 6
characters for the instruction are not sufficient
- Small fixes all over the place"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
perf/core: Mark expected switch fall-through
perf/x86/intel/uncore: Fix client IMC events return huge result
perf/ring_buffer: Use high order allocations for AUX buffers optimistically
perf data: Force perf_data__open|close zero data->file.path
perf session: Fix double free in perf_data__close
perf evsel: Probe for precise_ip with simple attr
perf tools: Read and store caps/max_precise in perf_pmu
perf hist: Fix memory leak of srcline
perf hist: Add error path into hist_entry__init
perf c2c: Fix c2c report for empty numa node
perf script python: Add Python3 support to intel-pt-events.py
perf script python: Add Python3 support to event_analyzing_sample.py
perf script python: add Python3 support to check-perf-trace.py
perf script python: Add Python3 support to futex-contention.py
perf script python: Remove mixed indentation
perf diff: Support --pid/--tid filter options
perf diff: Support --cpu filter option
perf diff: Support --time filter option
perf thread: Generalize function to copy from thread addr space from intel-bts code
perf annotate: Calculate the max instruction name, align column to that
...
Arnaldo Carvalho de Melo:
- Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
tools such as 'bpftool map dump' can pretty print map keys and values.
perf c2c:
Jiri Olsa:
- Fix report for empty NUMA node.
perf diff:
Jin Yao:
- Support --time, --cpu, --pid and --tid filter options.
perf probe:
Arnaldo Carvalho de Melo:
- Clarify error message about not finding kernel modules debuginfo.
perf record:
Jiri Olsa:
- Fixup probing for max attr.precise_ip.
perf trace:
Arnaldo Carvalho de Melo:
- Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
perf annotate:
Arnaldo Carvalho de Melo:
- Calculate the max instruction name, align column to that, removing the
hardcoded max 6 chars and cope with instructions with names longer than that,
such as vpmovmskb, vpcmpeqb, etc.
kernel:
Song Liu:
- Consider events with attr.bpf_event set as side-band.
Gustavo A. R. Silva:
- Mark expected switch fall-through in perf_event_parse_addr_filter().
Libraries:
Jiri Olsa:
- Fix leaks and double frees on error paths.
libtraceevent:
Tony Jones:
- Fix buffer overflow in arg_eval().
python scripting:
Tony Jones:
- More python3 fixes.
Trivial:
Yang Wei:
- Remove needless extra semicolon in clang C++ glue code.
Intel PT/BTS:
Adrian Hunter:
- Improve auxtrace address filter error message when there is no DSO.
- Fix divide by zero when TSC is not available.
- Further improvements to the export to sqlite/posgresql python scripts
and to the GUI sqlviewer, exporting 'parent_id' so that we have enable
the creation of call trees.
Andi Kleen:
- Generalize function to copy from thread addr space from intel-bts code.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXIFXsgAKCRCyPKLppCJ+
Jz++AQDVDXs1rKyZ5JDmnDpJ1tvVPZM1tTAU+6C/GnnoSDgX/AD+L3smvLoPihbu
msd3TpSroXuQ7nZ4BQ894jHyX3STqQE=
=MN9Q
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-5.1-20190307' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core changes from Arnaldo Carvalho de Melo:
perf bpf:
Arnaldo Carvalho de Melo:
- Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
tools such as 'bpftool map dump' can pretty print map keys and values.
perf c2c:
Jiri Olsa:
- Fix report for empty NUMA node.
perf diff:
Jin Yao:
- Support --time, --cpu, --pid and --tid filter options.
perf probe:
Arnaldo Carvalho de Melo:
- Clarify error message about not finding kernel modules debuginfo.
perf record:
Jiri Olsa:
- Fixup probing for max attr.precise_ip.
perf trace:
Arnaldo Carvalho de Melo:
- Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
perf annotate:
Arnaldo Carvalho de Melo:
- Calculate the max instruction name, align column to that, removing the
hardcoded max 6 chars and cope with instructions with names longer than that,
such as vpmovmskb, vpcmpeqb, etc.
kernel:
Song Liu:
- Consider events with attr.bpf_event set as side-band.
Gustavo A. R. Silva:
- Mark expected switch fall-through in perf_event_parse_addr_filter().
Libraries:
Jiri Olsa:
- Fix leaks and double frees on error paths.
libtraceevent:
Tony Jones:
- Fix buffer overflow in arg_eval().
python scripting:
Tony Jones:
- More python3 fixes.
Trivial:
Yang Wei:
- Remove needless extra semicolon in clang C++ glue code.
Intel PT/BTS:
Adrian Hunter:
- Improve auxtrace address filter error message when there is no DSO.
- Fix divide by zero when TSC is not available.
- Further improvements to the export to sqlite/posgresql python scripts
and to the GUI sqlviewer, exporting 'parent_id' so that we have enable
the creation of call trees.
Andi Kleen:
- Generalize function to copy from thread addr space from intel-bts code.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Making sure the data->file.path is zeroed on perf_data__open error path
and in perf_data__close, so we don't double free it in case someone call
it twice.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't call perf_data__close and subsequently perf_session__delete,
because it will call perf_data__close again and cause double free for
data->file.path.
$ perf report -i .
incompatible file format (rerun with -v to learn more)
free(): double free detected in tcache 2
Aborted (core dumped)
In fact we don't need to call perf_data__close at all, because at the
time the got out_close is reached, session->data is already initialized,
so the perf_data__close call will be triggered from
perf_session__delete.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: 2d4f27999b ("perf data: Add global path holder")
Link: http://lkml.kernel.org/r/20190305152536.21035-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we probe for precise_ip with user specified perf_event_attr,
which might fail because of unsupported kernel features, which would get
disabled during the open time anyway.
Switching the probe to take place on simple hw cycles, so the following
record sets proper precise_ip:
# perf record -e cycles:P ls
# perf evlist -v
cycles:P: size: 112, ... precise_ip: 3, ...
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Read the caps/max_precise value and store it in struct perf_pmu to be
used when setting the maximum precise_ip field in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't allocate he->srcline unconditionaly, only when new hist_entry
is created. Moving he->srcline allocation into hist_entry__init
function.
Original-patch-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding error path into hist_entry__init to unify error handling, so
every new member does not need to free everything else.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: nageswara r sastry <nasastry@in.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ravi Bangoria reported that we fail with an empty NUMA node with the
following message:
$ lscpu
NUMA node0 CPU(s):
NUMA node1 CPU(s): 0-4
$ sudo ./perf c2c report
node/cpu topology bugFailed setup nodes
Fix this by detecting the empty node and keeping its CPU set empty.
Reported-by: Nageswara R Sastry <nasastry@in.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190305152536.21035-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the intel-pt-events.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/fd26acf9-0c0f-717f-9664-a3c33043ce19@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the event_analyzing_sample.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Feng Tang <feng.tang@intel.com>
Link: http://lkml.kernel.org/r/20190302011903.2416-5-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python 2 and Python 3 in the check-perf-trace.py script.
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of from __future__ implies the minimum supported version of
Python2 is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20190302011903.2416-4-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the futex-contention.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Link: http://lkml.kernel.org/r/20190302011903.2416-3-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove mixed indentation in Python scripts. Revert to either all tabs
(most common form) or all spaces (4 or 8) depending on what was the
intent of the original commit. This is necessary to complete Python3
support as it will flag an error if it encounters mixed indentation.
Signed-off-by: Tony Jones <tonyj@suse.de>
Link: http://lkml.kernel.org/r/20190302011903.2416-2-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using the existing symbol_conf.pid_list_str and symbol_conf.tid_list_str
logic.
For example:
perf diff --tid 13965
It'll only diff the samples for thread 13965.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1551791143-10334-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To improve 'perf diff', implement a --cpu filter option.
Multiple CPUs can be provided as a comma-separated list with no space:
0,1. Ranges of CPUs are specified with -: 0-2. Default is to report
samples on all CPUs.
For example,
perf diff --cpu 0,1
It only diff the samples for CPU0 and CPU1.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1551791143-10334-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To improve 'perf diff', implement a --time filter option to diff the
samples within given time window.
It supports time percent with multiple time ranges. The time string
format is 'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
For example:
Select the second 10% time slice to diff:
perf diff --time 10%/2
Select from 0% to 10% time slice to diff:
perf diff --time 0%-10%
Select the first and the second 10% time slices to diff:
perf diff --time 10%/1,10%/2
Select from 0% to 10% and 30% to 40% slices to diff:
perf diff --time 0%-10%,30%-40%
It also supports analysing samples within a given time window
<start>,<stop>.
Times have the format seconds.microseconds.
If 'start' is not given (i.e., time string is ',x.y') then analysis starts at
the beginning of the file.
If the stop time is not given (i.e, time string is 'x.y,') then analysis
goes to end of file.
Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for
different perf.data files.
For example, we get the timestamp information from perf script.
perf script -i perf.data.old
mgen 13940 [000] 3946.361400: ...
perf script -i perf.data
mgen 13940 [000] 3971.150589 ...
perf diff --time 3946.361400,:3971.150589,
It analyzes the perf.data.old from the timestamp 3946.361400 to the end of
perf.data.old and analyzes the perf.data from the timestamp 3971.150589 to the
end of perf.data.
v4:
---
Update abstime_str_dup(), let it return error if strdup
is failed, and update __cmd_diff() accordingly.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1551791143-10334-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a utility function to fetch executable code. Convert one
user over to it. There are more places doing that, but they
do significantly different actions, so they are not
easy to fit into a single library function.
Committer changes:
. No need to cast around, make 'buf' be a void pointer.
. Rename it to thread__memcpy() to reflect the fact it is about copying
a chunk of memory from a thread, i.e. from its address space.
. No need to have it in a separate object file, move it to thread.[ch]
. Check the return of map__load(), the original code didn't do it, but
since we're moving this around, check that as well.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/r/20190305144758.12397-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge misc updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
proc: more robust bulk read test
proc: test /proc/*/maps, smaps, smaps_rollup, statm
proc: use seq_puts() everywhere
proc: read kernel cpu stat pointer once
proc: remove unused argument in proc_pid_lookup()
fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
fs/proc/self.c: code cleanup for proc_setup_self()
proc: return exit code 4 for skipped tests
mm,mremap: bail out earlier in mremap_to under map pressure
mm/sparse: fix a bad comparison
mm/memory.c: do_fault: avoid usage of stale vm_area_struct
writeback: fix inode cgroup switching comment
mm/huge_memory.c: fix "orig_pud" set but not used
mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
mm/memcontrol.c: fix bad line in comment
mm/cma.c: cma_declare_contiguous: correct err handling
mm/page_ext.c: fix an imbalance with kmemleak
mm/compaction: pass pgdat to too_many_isolated() instead of zone
mm: remove zone_lru_lock() function, access ->lru_lock directly
...
Pull perf updates from Ingo Molnar:
"Lots of tooling updates - too many to list, here's a few highlights:
- Various subcommand updates to 'perf trace', 'perf report', 'perf
record', 'perf annotate', 'perf script', 'perf test', etc.
- CPU and NUMA topology and affinity handling improvements,
- HW tracing and HW support updates:
- Intel PT updates
- ARM CoreSight updates
- vendor HW event updates
- BPF updates
- Tons of infrastructure updates, both on the build system and the
library support side
- Documentation updates.
- ... and lots of other changes, see the changelog for details.
Kernel side updates:
- Tighten up kprobes blacklist handling, reduce the number of places
where developers can install a kprobe and hang/crash the system.
- Fix/enhance vma address filter handling.
- Various PMU driver updates, small fixes and additions.
- refcount_t conversions
- BPF updates
- error code propagation enhancements
- misc other changes"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
perf script python: Add Python3 support to syscall-counts-by-pid.py
perf script python: Add Python3 support to syscall-counts.py
perf script python: Add Python3 support to stat-cpi.py
perf script python: Add Python3 support to stackcollapse.py
perf script python: Add Python3 support to sctop.py
perf script python: Add Python3 support to powerpc-hcalls.py
perf script python: Add Python3 support to net_dropmonitor.py
perf script python: Add Python3 support to mem-phys-addr.py
perf script python: Add Python3 support to failed-syscalls-by-pid.py
perf script python: Add Python3 support to netdev-times.py
perf tools: Add perf_exe() helper to find perf binary
perf script: Handle missing fields with -F +..
perf data: Add perf_data__open_dir_data function
perf data: Add perf_data__(create_dir|close_dir) functions
perf data: Fail check_backup in case of error
perf data: Make check_backup work over directories
perf tools: Add rm_rf_perf_data function
perf tools: Add pattern name checking to rm_rf
perf tools: Add depth checking to rm_rf
perf data: Add global path holder
...
Delete a superfluous semicolon in getBPFObjectFromModule().
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Wei <albin_yang@163.com>
Link: http://lkml.kernel.org/r/1551710174-3349-1-git-send-email-albin_yang@163.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The libbpf loader expects that some __btf_map_<MAP_NAME> structs be in
place with the keys and values types of maps so that one can store the
struct definitions and have them sent to the kernel via sys_bpf(fd, cmd
= BTF_LOAD) and then later be retrievable via sys_bpf(fd, cmd =
BPF_OBJ_GET_INFO_BY_FD) for use by tools such as 'bpftool map dump id
MAP_ID'.
Since we already have this for defining maps in 'perf trace' BPF events:
bpf_map(name, _type, type_key, type_val, _max_entries)
As used in the tools/perf/examples/bpf/augmented_raw_syscalls.c:
--- 8< ---
struct syscall {
bool enabled;
};
bpf_map(syscalls, ARRAY, int, struct syscall, 512);
--- 8< ---
All we need is to get all that already available info, piggyback on the
'bpf_map' define in tools/perf/include/bpf/bpf.h, that is included by
'perf trace' BPF programs and do that without requiring changes to the
BPF programs already defining maps using 'bpf_map()'.
So this is what we have before this patch:
1) With this in ~/.perfconfig to dump .c events as .o, aka save a copy
so that we can use the .o later as a pre-compiled BPF bytecode:
# grep '\[llvm\]' -A2 ~/.perfconfig
[llvm]
dump-obj = true
clang-opt = -g
#
# clang --version
clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm/bin
2) Note the -g there so that we get clang to generate debuginfo, and
since the target is 'bpf' it will generate the BTF info in this
clang version (9.0).
3) Run a simple 'perf record' specifiying as an event the augmented_raw_syscalls.c
source code:
# perf record -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data ]
# file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
4) Look at the BTF structs encoded in it:
# pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
syscall_enter_args 64 0
augmented_filename 264 0
syscall 1 0
syscall_exit_args 24 0
bpf_map 28 0
#
# pahole -F btf -C syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
# pahole -F btf -C syscall /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
struct syscall {
bool enabled; /* 0 1 */
/* size: 1, cachelines: 1, members: 1 */
/* last cacheline: 1 bytes */
};
#
5) Ok, with just this we don't have the markers expected by the libbpf
loader and when we run with this BPF bytecode, because we have:
# grep '\[trace\]' -A1 ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
#
6) Lets do a 'perf trace' system wide session using this BPF program:
# perf trace -e *mmsg,open*
Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR) = 106
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
DNS Res~ver #3/23340 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 106
DNS Res~ver #3/23340 sendmmsg(106<socket:[3482690]>, 0x7f252f1fcaf0, 2, MSG_NOSIGNAL) = 2
Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR) = 106
lighttpd/18915 openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 12
7) While it runs lets see the maps that 'perf trace' + libbpf's BPF
loader loaded into the kernel via sys_bpf(fd, BPF_BTF_LOAD, ...):
# bpftool map list | tail -6
149: perf_event_array name __augmented_sys flags 0x0
key 4B value 4B max_entries 8 memlock 4096B
150: array name syscalls flags 0x0
key 4B value 1B max_entries 512 memlock 8192B
151: hash name pids_filtered flags 0x0
key 4B value 1B max_entries 64 memlock 8192B
#
8) Dump the "pids_filtered", map, that will have one entry per PID that
'perf trace' wants filtered, which includes its own, to avoid a
tracing feedback loop (perf trace shows the syscalls it does which
generates more syscalls that it has to show that...), it also
auto-filters the 'gnome-terminal' and 'sshd' parent PIDs, for the
same reason:
# bpftool map dump id 151
key: a5 0c 00 00 value: 01
key: 14 63 00 00 value: 01
Found 2 elements
#
9) Since there is no BTF info available, it does a generic hex dump :-\
10) Now, with this patch applied, we'll do steps 3 to 6 again and look
with pahole if there are extra structs encoded in BTF:
# pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
syscall_enter_args 64 0
augmented_filename 264 0
syscall 1 0
syscall_exit_args 24 0
bpf_map 28 0
____btf_map___augmented_syscalls__ 8 0
____btf_map_syscalls 8 0
____btf_map_pids_filtered 8 0
#
11) Yes, those __btf_map_ + the map names, lets see how they look like:
# pahole -F btf -C ____btf_map_syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
struct ____btf_map_syscalls {
int key; /* 0 4 */
struct syscall value; /* 4 1 */
/* size: 8, cachelines: 1, members: 2 */
/* padding: 3 */
/* last cacheline: 8 bytes */
};
#
12) Lets repeat step 7 to get the new map ids:
# bpftool map list | tail -6
155: perf_event_array name __augmented_sys flags 0x0
key 4B value 4B max_entries 8 memlock 4096B
156: array name syscalls flags 0x0
key 4B value 1B max_entries 512 memlock 8192B
157: hash name pids_filtered flags 0x0
key 4B value 1B max_entries 64 memlock 8192B
#
13) And finally lets dump the 'pids_filtered':
# bpftool map dump id 157
[{
"key": 3237,
"value": true
},{
"key": 26435,
"value": true
}
]
#
Looks much better! BTF info was used to interpret the key as an integer
and the value as a struct with just one boolean member, so to make it
more compact, show just the 'true' value where we saw '01'.
Now to make 'perf trace --dump-map' to use BTF!
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-ybuf9wpkm30xk28iq7jbwb40@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This replaces all open encodings in tools with NUMA_NO_NODE. Also
linux/numa.h is now needed for the perf build.
[sfr@canb.auug.org.au: fix for replace open encodings for NUMA_NO_NODE]
Link: http://lkml.kernel.org/r/20190108131141.730e9c4f@canb.auug.org.au
Link: http://lkml.kernel.org/r/1545127933-10711-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Hildenbrand <david@redhat.com>
Cc: Doug Ledford <dledford@redhat.com> [drivers/infiniband]
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [ixgbe]
Cc: Jens Axboe <axboe@kernel.dk> [mtip32xx]
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Cc: Vinod Koul <vkoul@kernel.org> [dmaengine.c]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Factor out a base class CallGraphModelBase from CallGraphModel, so that
CallGraphModelBase can be reused.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lkml.kernel.org/n/tip-76eybebzjwvgnadkm2oufrqi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of passing the tree root, get it from a method that can be
implemented in any derived class.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lkml.kernel.org/n/tip-ovcv28bg4mt9swk36ypdyz14@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out a base class TreeWindowBase from CallGraphWindow, so that
TreeWindowBase can be reused.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lkml.kernel.org/n/tip-ifirw0c0mhkwxg6l12lk6k4p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Export to the 'calls' table the newly created 'parent_id' and create an
index for it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lkml.kernel.org/n/tip-eybd6fnk6j9r7g643lsideoo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix SQL query error "invalid input syntax for integer":
Traceback (most recent call last):
File "tools/perf/scripts/python/export-to-postgresql.py", line 465, in <module>
do_query(query, 'CREATE VIEW calls_view AS '
File "tools/perf/scripts/python/export-to-postgresql.py", line 274, in do_query
raise Exception("Query failed: " + q.lastError().text())
Exception: Query failed: ERROR: invalid input syntax for integer: ""
LINE 1: ...ch_count,call_id,return_id,CASE WHEN flags=0 THEN '' WHEN fl...
^
(22P02) QPSQL: Unable to create query
Error running python script tools/perf/scripts/python/export-to-postgresql.py
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: f08046cb30 ("perf thread-stack: Represent jmps to the start of a different symbol")
Link: https://lkml.kernel.org/n/tip-strfpdozrvg7bi1xzrivxzqt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Export to the 'calls' table the newly created 'parent_id'.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lkml.kernel.org/n/tip-b09oukl48rsl9azkp2wmh0bl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The call_path can be used to find the parent symbol for a call but not
the exact parent call. To do that add parent_id to the call_return
export. This enables the creation of a call tree from the exported data.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lkml.kernel.org/n/tip-6j7tzdxo67cox6kan7k22oo6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When TSC is not available, "timeless" decoding is used but a divide by
zero occurs if perf_time_to_tsc() is called.
Ensure the divisor is not zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.9+
Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The message does not indicate the possibility that the symbol is not
found because the file does not exist.
Before:
$ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls
Symbol 'strcmp' not found.
Note that symbols must be functions.
Failed to parse address filter: 'filter strcmp / strcpy @ foo '
Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
Where multiple filters are separated by space or comma.
After:
$ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls
File 'foo' not found or has no symbols.
Symbol 'strcmp' not found.
Note that symbols must be functions.
Failed to parse address filter: 'filter strcmp / strcpy @ foo '
Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
Where multiple filters are separated by space or comma.
Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lkml.kernel.org/n/tip-dvngzxd0jkplzw1ary69dilb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For historical reasons the helper to loop over maps in an object
is called bpf_map__for_each while it really should be called
bpf_object__for_each_map. Rename and add a correctly named
define for backward compatibility.
Switch all in-tree users to the correct name (Quentin).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
'perf probe' supports using just the kernel module name, but that will
work only when the module is loaded, or using the full pathname to the
file with the DWARF debug info, but the warning was cryptic:
Before:
# perf probe -m cls_flower -L fl_change
Failed to find the path for cls_flower: No such file or directory
Error: Failed to show lines.
#
After:
# perf probe -m cls_flower -L fl_change
Module cls_flower is not loaded, please specify its full path name.
Error: Failed to show lines.
# perf probe -m /lib/modules/5.0.0-rc7+/kernel/net/sched/cls_flower.ko -L fl_change | head -7
<fl_change@/home/acme/git/linux/net/sched/cls_flower.c:0>
0 static int fl_change(struct net *net, struct sk_buff *in_skb,
struct tcf_proto *tp, unsigned long base,
u32 handle, struct nlattr **tca,
void **arg, bool ovr, struct netlink_ext_ack *extack)
4 {
5 struct cls_fl_head *head = rtnl_dereference(tp->root);
#
The behaviour doesn't change when the module is loaded:
# modprobe cls_flower
# perf probe -m cls_flower -L fl_change | head -7
<fl_change@/home/acme/git/linux/net/sched/cls_flower.c:0>
0 static int fl_change(struct net *net, struct sk_buff *in_skb,
struct tcf_proto *tp, unsigned long base,
u32 handle, struct nlattr **tca,
void **arg, bool ovr, struct netlink_ext_ack *extack)
4 {
5 struct cls_fl_head *head = rtnl_dereference(tp->root);
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-q4njvk9mshra00jacqjbzfn5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the syscall-counts-by-pid.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Link: http://lkml.kernel.org/r/20190222230619.17887-15-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the syscall-counts.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Link: http://lkml.kernel.org/r/20190222230619.17887-14-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the stat-cpi.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190222230619.17887-13-tonyj@suse.de
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the stackcollapse.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com> <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20190222230619.17887-12-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the sctop.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20190222230619.17887-11-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the powerpc-hcalls.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190222230619.17887-10-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the net_dropmonitor.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Link: http://lkml.kernel.org/r/20190222230619.17887-9-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the mem-phys-addr.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Link: http://lkml.kernel.org/r/20190222230619.17887-8-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the failed-syscalls-by-pid.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20190222230619.17887-5-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support both Python2 and Python3 in the netdev-times.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2
version is now v2.6.
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Sanagi Koki <sanagi.koki@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/20190222230619.17887-2-tonyj@suse.de
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using -F + syntax to add a field the existing defaults are
currently all marked user_set. This can cause errors when some field is
missing in the perf.data
This patch tracks the actually user set fields separately, so that we don't
error out in this case.
Before:
% perf record true
% perf script -F +metric
Samples for 'cycles:ppp' event do not have CPU attribute set. Cannot print 'cpu' field.
%
After:
5 perf record true
% perf script -F +metric
perf 28936 278636.237688: 1 cycles:ppp: ffffffff8117da99 perf_event_exec+0x59 (/lib/modules/4.20.0-odilo/build/vmlinux)
...
%
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224153722.27020-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add perf_data__create_dir() to create nr files inside 'struct perf_data'
path directory:
int perf_data__create_dir(struct perf_data *data, int nr);
and function to close that data:
void perf_data__close_dir(struct perf_data *data);
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And display the error message from removing the old data file:
$ perf record ls
Can't remove old data: Permission denied (perf.data.old)
Perf session creation failed.
$ perf record ls
Can't remove old data: Unknown file found (perf.data.old)
Perf session creation failed.
Not sure how to make fail the rename (after we successfully remove the
destination file/dir) to show the message, anyway let's have it there.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Change check_backup() to call rm_rf_perf_data() instead of unlink() to
work over directory paths.
Also move the call earlier in the code, before we fork for file/dir, so
it can backup also directory data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To remove perf.data including the directory, with checking on expected
files and no other directories inside.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add pattern argument to rm_rf_depth() (and rename it to rm_rf_depth_pat())
to specify the name pattern files need to match inside the directory.
The function fails if we find different file to remove.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding depth argument to rm_rf (and renaming it to rm_rf_depth) to
specify the depth we will go searching for files to remove.
It will be used to specify single depth for perf.data directory removal
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.
This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.
Also it actually makes the code little simpler.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We are about to add support for multiple files, so we need each file to
keep its size.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new report to display top calls by elapsed time. It displays calls
in descending order of time elapsed between when the function was called
and when it returned.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If no selection is made on the 'Selected branches' dialog, then the
output is the same as the 'All branches' report. That is not really an
error, and is not desirable for future reports, so remove it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove SQLTableDialogDataItem as it is no longer used.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create new dialog data item classes to replace SQLTableDialogDataItem.
This separates out different dialog data items and makes it easier to
add new ones. SQLTableDialogDataItem is removed in a separate patch
because it makes the diff more readable.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The report name is a report variable so move it into into ReportVars.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out ReportVars to provide a single container for information from
report dialogs.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out ReportDialogBase so it can be re-used.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move column headers from SQLAutoTableModel into SQLTableModel so that
they can be used for other models based on SQLTableModel.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Call Graph depends on the calls table which is optional when exporting
data, so hide the Call Graph option if there is no calls table.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
exported-sql-viewer.py is a standalone python script and requires a
shebang. Also only python2 is supported at present. Restore the shebang
but use the more flexible 'env' form.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: a38352de44 ("perf script python: Remove explicit shebang from Python script")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Let rm_rf() remove a file if it's provided by path, not just
directories.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So it does not screw up single -v verbose output.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a missing new line into pr_debug call in perf_event__synthesize_bpf_events(),
so that the error message does not screw the verbose output.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lkml.kernel.org/r/20190220122800.864-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support to add/remove fields for specific event types in -F option.
It's now possible to use '+-' after event type, like:
# cat > test.c
#include <stdio.h>
int main(void)
{
printf("Hello world\n");
while(1) {}
}
^D
# gcc -g -o test test.c
# perf probe -x test 'test.c:5'
# perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
...
# perf script -Ftrace:+period,-cpu
test 3859 396291.117343: 10275 cpu/cpu-cycles,period=10000/: 7f..
test 3859 396291.118234: 11041 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118234: 1 probe_test:main:
test 3859 396291.118248: 8668 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118263: 10139 cpu/cpu-cycles,period=10000/: ffffff..
Committer testing:
Couldn't make the test above work, but tested it with:
# perf probe -x hello main
Added new event:
probe_hello:main (on main in /home/acme/c/hello)
You can now use it in all perf tools, such as:
perf record -e probe_hello:main -aR sleep 1
# perf record -e probe_hello:main ./hello
hello, world
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ]
# perf script
hello 21454 [002] 254116.874005: probe_hello:main: (401126)
#
# perf script -Ftrace:+period,-cpu
hello 21454 254116.874005: 1 probe_hello:main: (401126)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Force sample_type setup for slave events in group leader sessions.
We don't get sample for slave events, we make them when delivering group
leader sample. Set the slave event to follow the master sample_type to
ease up report.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no reason to deliver a sample with zero period. It means there
was no value for slave event since its last group leader sample.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
At some point I'll suggest moving this to libbpf, for now I'll
experiment with ways to dump BPF maps set by events in 'perf trace',
starting with a very basic dumper for the current very limited needs
of the augmented_raw_syscalls code: dumping booleans.
Having functions that apply to the map keys and values and do table
lookup in things like syscall id to string tables should come next.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-lz14w0esqyt1333aon05jpwc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 489338a717 ("perf tests evsel-tp-sched: Fix bitwise operator")
causes test case 14 "Parse sched tracepoints fields" to fail on s390.
This test succeeds on x86.
In fact this test now fails on all architectures with type char treated
as type unsigned char.
The root cause is the signed-ness of character arrays in the tracepoints
sched_switch for structure members prev_comm and next_comm.
On s390 the output of:
[root@m35lp76 perf]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 287
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
...
field:char prev_comm[16]; offset:8; size:16; signed:0;
...
field:char next_comm[16]; offset:40; size:16; signed:0;
reveals the character arrays prev_comm and next_comm are per
default unsigned char and have values in the range of 0..255.
On x86 both fields are signed as this output shows:
[root@f29]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 287
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
...
field:char prev_comm[16]; offset:8; size:16; signed:1;
...
field:char next_comm[16]; offset:40; size:16; signed:1;
and the character arrays prev_comm and next_comm are per default signed
char and have values in the range of -1..127. The implementation of
type char is architecture specific.
Since the character arrays in both tracepoints sched_switch and
sched_wakeup should contain ascii characters, simply omit the check for
signedness in the test case.
Output before:
[root@m35lp76 perf]# ./perf test -F 14
14: Parse sched tracepoints fields :
--- start ---
sched:sched_switch: "prev_comm" signedness(0) is wrong, should be 1
sched:sched_switch: "next_comm" signedness(0) is wrong, should be 1
sched:sched_wakeup: "comm" signedness(0) is wrong, should be 1
---- end ----
14: Parse sched tracepoints fields : FAILED!
[root@m35lp76 perf]#
Output after:
[root@m35lp76 perf]# ./perf test -Fv 14
14: Parse sched tracepoints fields :
--- start ---
---- end ----
Parse sched tracepoints fields: Ok
[root@m35lp76 perf]#
Fixes: 489338a717 ("perf tests evsel-tp-sched: Fix bitwise operator")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20190219153639.31267-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
According to the current documentation the flags section is placed after
the file header itself but the code assumes to find the flags section
after the data section. This change updates the documentation to that
assumption.
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190219154515.3954-2-jonas.rabenstein@studium.uni-erlangen.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The content of the HEADER_CMDLINE feature header is a perf_header_string_list
of the argument vector and not a perf_header_string of the commandline.
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190219154515.3954-1-jonas.rabenstein@studium.uni-erlangen.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't assume inlined symbols with the same name are equal, because
their address range may be different. This will cause the symbols with
different addresses be shadowed when adding to the hist entry, and lead
to ERANGE error when checking the symbol address during sample parse,
the addr should be within the range of [sym.start, sym.end].
The error message is like: "0x36aea60 [0x8]: failed to process type: 68".
The second parameter of symbol__new() is the length of the fake symbol
for the inline frame, which is the subtraction of the end and start
address of base_sym.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: aa441895f7 ("perf report: Compare symbol name for inlined frames when sorting")
Link: http://lkml.kernel.org/r/20190219130531.15692-1-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use sysfs__mountpoint() when reading sysfs files to obtain cpu/numa
topologies.
Also use scnprintf instead of sprintf as suggested by Namhyung.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190219095815.15931-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the numa_topology object to return the list of numa nodes together
with their cpus. It will replace the numa code in header.c and will be
used from 'perf record' in the following patches.
Add the following interface functions to load numa details:
struct numa_topology *numa_topology__new(void);
void numa_topology__delete(struct numa_topology *tp);
And replace the current (copied) local interface, with no functional
changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190219095815.15931-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make struct cpu_topo global and rename it to 'struct cpu_topology', so
that it can be used from the 'perf record' command in the following
patches.
Add the following interface functions to load/free cpu topology details:
struct cpu_topology *cpu_topology__new(void);
void cpu_topology__delete(struct cpu_topology *tp);
Move it to a separate source file cputopo.c together with numa related
object in the following patches.
No functional change, the new interface will be used in upcoming changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190219095815.15931-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We are currently passing the node index instead of the real node number.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: fbe96f29ce ("perf tools: Make perf.data more self-descriptive (v8)"
Link: http://lkml.kernel.org/r/20190219095815.15931-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The netfilter conflicts were rather simple overlapping
changes.
However, the cls_tcindex.c stuff was a bit more complex.
On the 'net' side, Cong is fixing several races and memory
leaks. Whilst on the 'net-next' side we have Vlad adding
the rtnl-ness support.
What I've decided to do, in order to resolve this, is revert the
conversion over to using a workqueue that Cong did, bringing us back
to pure RCU. I did it this way because I believe that either Cong's
races don't apply with have Vlad did things, or Cong will have to
implement the race fix slightly differently.
Signed-off-by: David S. Miller <davem@davemloft.net>
If perf was built without trace support, the trace+probe_vfs_getname.sh
'perf test' entry fails:
# perf trace -h
perf: 'trace' is not a perf-command. See 'perf --help'
# perf test 64
64: Check open filename arg using perf trace + vfs_getname: FAILED!
Check trace support, so that we'll skip the test in that case:
# perf test 64
64: Check open filename arg using perf trace + vfs_getname: Skip
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190215134253.11454-1-tt.rantala@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Not used at all.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Simplifying the code a bit.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixing legacy symbol events parsing. We can't support single slash
separator, like 'cycles/u', because it conflicts with non empty terms,
like 'cycles/period/u'.
Keeping only '//' and ':' separator for these events:
cycles//u
cycles:k
And removing '/' separator support, which is not working
anymore. Also adding automated tests for above events.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename build libperf to perf, because it's used to build perf.
The libperf build object name will be used for libperf library.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Simple rename, no functional change.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no need for perf build to use libperf.a,
we can use directly libperf-in.o.
The libperf.a stays as a target if needed:
$ make libperf.a
...
CC util/pmu.o
CC util/pmu-flex.o
LD util/libperf-in.o
LD libperf-in.o
AR libperf.a
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Making the auxtrace_buffer fetch function modular so that it can be
called from different decoding context (timeless vs. non-timeless),
avoiding to repeat code.
No change in functionality is introduced by this patch.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-14-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Making the main packet processing loop modular so that it can be called
from different decoding context (timeless vs. non-timless), avoiding to
repeat code.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-13-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Making the main decoder block modular so that it can be called from
different decoding context (timeless vs. non-timeless), avoiding
to repeat code.
No change in functionality is introduced by this patch.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-12-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch makes decoding of auxtrace buffer centered around a struct
cs_etm_queue. This eliminates surperflous variables and is a precursor
for work that simplifies the main decoder loop.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-11-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving initialisation of the kernel start address to function
cs_etm__setup_queues(), considered to be the common denominator for
queue initialisation. That way we don't have to repeat the same code
at different places.
No change of functionatlity is introduced by this patch.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-10-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Function cs_etm__alloc_queue() should only be concerned with the allocation
of memory for the etmq and accompanying decoder. Everything else should
be done in the calling function.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-9-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The comment just before initialising the decoder is plane wrong since it
is part of the decoding queue setup function and the operation code
specifically mention that trace data is to be decoded rather than printed
out.
This patch simply fix the comment to prevent people from getting really
confused.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-8-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The trace parameter initialisation code is repeated in two different
places, something that bloats the file and can lead to errors. This
is fixed by introducing a helper function and calling the right
protocol initialisation code when required.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-7-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Memory allocated for variable 't_params' isn't released properly in the
error path of function cs_etm_queue *cs_etm__alloc_queue() and
cs_etm__dump_event(), something this patch addresses.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-6-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing function cs_etm_decoder__init_dparams() to avoid repeating
code at two different places.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-5-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Function cs_etm__mem_access() is supposed to return a u32 but the error
path returns negative values at a couple of places, something that really
throws off the clients using it. Fix the situation by return '0'.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-4-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Field "time" and "timestamp" in structure cs_etm_queue are no longer
used and need to be removed.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-3-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Field "state" in structure cs_etm_queue is no longer used and needs
to be removed.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190212171618.25355-2-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the libcrypto feature test was added we forgot to add its
FEATURE_CHECK_LDFLAGS pointing to the library needed to link with the
test-all.bin feature test fast path binary, so even when it was
introduced we got this:
$ cat /tmp/build/perf/feature/test-all.make.output
/usr/bin/ld: /tmp/ccjKeJJU.o: in function `main_test_libcrypto':
/home/acme/git/perf/tools/build/feature/test-libcrypto.c:10: undefined reference to `MD5_Init'
/usr/bin/ld: /home/acme/git/perf/tools/build/feature/test-libcrypto.c:11: undefined reference to `MD5_Update'
/usr/bin/ld: /home/acme/git/perf/tools/build/feature/test-libcrypto.c:12: undefined reference to `MD5_Final'
/usr/bin/ld: /home/acme/git/perf/tools/build/feature/test-libcrypto.c:14: undefined reference to `SHA1'
collect2: error: ld returned 1 exit status
$ cat /tmp/build/perf/feature/test-libcrypto.
test-libcrypto.bin test-libcrypto.d test-libcrypto.make.output
$ cat /tmp/build/perf/feature/test-libcrypto.make.output
$
Fix it, so that we keep the fast path, which, at this point, will fail
with the unwind-ARCH feature tests, that will be fixed in a followup
patch:
$ make -C tools/perf O=/tmp/build/perf
... libcrypto: [ on ]
<SNIP>
$ cat /tmp/build/perf/feature/test-all.make.output
$ ldd /tmp/build/perf/feature/test-all.bin | grep libcrypto
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f9892805000)
$
$ grep libcrypto /tmp/build/perf/FEATURE-DUMP
feature-libcrypto=1
$
With the unwind-ARCH tests fixed, we now finally manage to get
test-all.bin built and linked with the features it tests, among them the
ones fixed in this patchkit:
$ ldd /tmp/build/perf/feature/test-all.bin | egrep 'unwind|crypto'
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f95cf2b8000)
libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f95cf294000)
libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f95cf278000)
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Fixes: 8ee4646038 ("perf build: Add libcrypto feature detection")
Link: https://lkml.kernel.org/n/tip-rexc248jorf5b4l3qjn888cz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is not normally available on x86_64 not being tested on test-all.c
but being in FEATURE_TESTS_BASIC ends up implying that those features
are present, which leads to trying to link with those libraries and a
build failure now that test-all.c is finally again building
successfully:
/usr/bin/ld: cannot find -lunwind-x86
/usr/bin/ld: cannot find -lunwind-aarch64
collect2: error: ld returned 1 exit status
make[3]: *** [Makefile:199: /tmp/build/perf/plugin_jbd2.so] Error 1
make[3]: *** Waiting for unfinished jobs....
/usr/bin/ld: cannot find -lunwind-x86
/usr/bin/ld: cannot find -lunwind-aarch64
So remove those features from there and explicitely test them.
And then move this patch to just before the last one that allows this to
be exposed, so that we keep the tree bisectable.
With all this in place we get, at this point:
$ ldd /tmp/build/perf/feature/test-libunwind.bin
linux-vdso.so.1 (0x00007fffa09c6000)
libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007fbcf4451000)
libunwind.so.8 => /lib64/libunwind.so.8 (0x00007fbcf4435000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fbcf440c000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007fbcf43f2000)
libc.so.6 => /lib64/libc.so.6 (0x00007fbcf422c000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fbcf4211000)
/lib64/ld-linux-x86-64.so.2 (0x00007fbcf4491000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fbcf41ed000)
libz.so.1 => /lib64/libz.so.1 (0x00007fbcf41d3000)
$ cat /tmp/build/perf/feature/test-libunwind-x86.make.output
test-libunwind-x86.c:2:10: fatal error: libunwind-x86.h: No such file or directory
#include <libunwind-x86.h>
^~~~~~~~~~~~~~~~~
compilation terminated.
$ cat /tmp/build/perf/feature/test-libunwind-aarch64.make.output
test-libunwind-aarch64.c:2:10: fatal error: libunwind-aarch64.h: No such file or directory
#include <libunwind-aarch64.h>
^~~~~~~~~~~~~~~~~~~~~
compilation terminated.
$
$ ldd ~/bin/perf | grep unwind
libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f5ceb24b000)
libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f5ceb22f000)
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/n/tip-vs6kwqsvwk7oxhs6z9mq87pp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since it is not yet that generally available, avoid testing for the
presence of libcoresight in the fast path test-all.bin feature test.
# dnf search opencsd
No matches found.
# dnf search OpenCSD
No matches found.
# cat /etc/fedora-release
Fedora release 29 (Twenty Nine)
#
I.e. right now, in my system test-all.bin is failing all the time since
Fedora29 doesn't have libopencsd available:
$ cat /tmp/build/perf/feature/test-all.make.output
In file included from test-all.c:174:
test-libopencsd.c:2:10: fatal error: opencsd/c_api/opencsd_c_api.h: No such file or directory
#include <opencsd/c_api/opencsd_c_api.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
See:
6ab2b762be ("perf build: Disable libbabeltrace check by default")
For the rationale, as soon as libopencsd becomes more generally packaged
and available, we do the same thing we did with babeltrace, enabling it
by default, as done in:
24787afbcd ("perf tools: Enable LIBBABELTRACE by default")
For now, to explicitely ask for opencsd, make sure you have it installed
and use:
make -C tools/perf CORESIGHT=1
The feature test output will be there as an empty file:
$ ls -la /tmp/build/perf/feature/test-libopencsd.make.output
Because the binary used for the feature check was successfully built:
$ ls -la /tmp/build/perf/feature/test-libopencsd.bin
-rwxrwxr-x. 1 acme acme 18336 Feb 12 14:49 /tmp/build/perf/feature/test-libopencsd.bin
$ ldd /tmp/build/perf/feature/test-libopencsd.bin
linux-vdso.so.1 (0x00007fffe18cc000)
libopencsd_c_api.so.0 => /lib64/libopencsd_c_api.so.0 (0x00007fb8e67f6000)
libopencsd.so.0 => /lib64/libopencsd.so.0 (0x00007fb8e676f000)
libc.so.6 => /lib64/libc.so.6 (0x00007fb8e65a9000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fb8e6411000)
libm.so.6 => /lib64/libm.so.6 (0x00007fb8e628d000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb8e6272000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb8e6828000)
$
And the resulting perf binary will be linked with it:
-rw-rw-r--. 1 acme acme 0 Feb 12 14:49 /tmp/build/perf/feature/test-libopencsd.make.output
$ ldd ~/bin/perf | grep opencsd
libopencsd_c_api.so.0 => /lib64/libopencsd_c_api.so.0 (0x00007fd43097f000)
libopencsd.so.0 => /lib64/libopencsd.so.0 (0x00007fd4308f8000)
$
To make sure this gets built before pushing things upstream I have a
ubuntu:19.04-x-arm64 container that has:
[root@quaco x-arm64]# grep CORESIGHT Dockerfile
ENV EXTRA_MAKE_ARGS=CORESIGHT=1
[root@quaco x-arm64]#
So that I always build with libopencsd before pushing things upstream.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Link: https://lkml.kernel.org/n/tip-20vyy39jw9jgrijesi30fgox@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like it does with 'sshd', to reduce the feedback loop when doing
system wide tracing on on a gnome GUI.
Need to figure out how to auto-filter the calls to other UI components
tho.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-rjopq5y92itgokppdhe8sc6z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since we need it to resolve the AIO symbols, otherwise we fail with:
$ cat /tmp/build/perf/feature/test-all.make.output
/usr/bin/ld: /tmp/ccEqrj36.o: undefined reference to symbol 'aio_return64@@GLIBC_2.2.5'
/usr/bin/ld: //usr/lib64/librt.so.1: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
$
When we added the aio support in 'perf record' only the test-libaio.bin
target got the -lrt, i.e. the feature detection slow path. Fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 2a07d81474 ("tools build feature: Check if libaio is available")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When introducing the possibility for selecting if the common prefix to
options such as the waitid ones, i.e. all 'waitid' options start with
'W', so, to make it make it more compact if configured to suppress it,
'perf trace' will do so, other examples include mmap's PROT_ prefix for
its 'prot' argument, etc, which, when showing the syscall argument name
ends up producing duplicated info that clutters the screen, i.e.:
# perf trace -e mmap --max-events 2 sleep 1
0.000 ( 0.014 ms): sleep/20886 mmap(len: 112595, prot: PROT_READ, flags: MAP_PRIVATE, fd: 3) = 0x7f3e986d2000
0.041 ( 0.005 ms): sleep/20886 mmap(len: 8192, prot: PROT_READ|PROT_WRITE, flags: MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f3e986d0000
#
So it is possible to suppress that and make it more compact by having
this in your ~/.perfconfig:
# cat ~/.perfconfig
[trace]
show_prefix = no
#
# perf trace -e mmap --max-events 2 sleep 1
0.000 ( 0.014 ms): sleep/8009 mmap(len: 112595, prot: READ, flags: PRIVATE, fd: 3) = 0x7ff2373de000
0.040 ( 0.005 ms): sleep/8009 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7ff2373dc000
#
To have it look more like strace's output, we instead want to suppress
the arg name and show the prefix, so use:
# cat ~/.perfconfig
[trace]
show_prefix = yes
show_arg_names = no
#
# perf trace -e mmap --max-events 2 sleep 1
0.000 ( 0.006 ms): sleep/15513 mmap(NULL, 112595, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7a9b6d3000
0.020 ( 0.002 ms): sleep/15513 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f7a9b6d1000
#
When this logic was introduced a bug came with it when processing the
waitid 'option' arg that ended up expecting 3 strings when just two were
being provided, fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: c65c83ffe9 ("perf trace: Allow asking for not suppressing common string prefixes")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were crashing when processing a negative fd:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000609bbf in syscall_arg__scnprintf_ioctl_cmd (bf=0x1172eca "", size=2038, arg=0x7fffffff8360) at trace/beauty/ioctl.c:182
182 if (file->dev_maj == USB_DEVICE_MAJOR)
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.6-28.fc29.x86_64 elfutils-libelf-0.174-5.fc29.x86_64 elfutils-libs-0.174-5.fc29.x86_64 glib2-2.58.3-1.fc29.x86_64 libbabeltrace-1.5.6-1.fc29.x86_64 libunwind-1.2.1-6.fc29.x86_64 libuuid-2.32.1-1.fc29.x86_64 libxcrypt-4.4.3-2.fc29.x86_64 numactl-libs-2.0.12-1.fc29.x86_64 openssl-libs-1.1.1a-1.fc29.x86_64 pcre-8.42-6.fc29.x86_64 perl-libs-5.28.1-427.fc29.x86_64 popt-1.16-15.fc29.x86_64 python2-libs-2.7.15-11.fc29.x86_64 slang-2.3.2-4.fc29.x86_64 xz-libs-5.2.4-3.fc29.x86_64
(gdb) bt
#0 0x0000000000609bbf in syscall_arg__scnprintf_ioctl_cmd (bf=0x1172eca "", size=2038, arg=0x7fffffff8360) at trace/beauty/ioctl.c:182
#1 0x000000000048e295 in syscall__scnprintf_val (sc=0x123b500, bf=0x1172eca "", size=2038, arg=0x7fffffff8360, val=21519)
at builtin-trace.c:1594
#2 0x000000000048e60d in syscall__scnprintf_args (sc=0x123b500, bf=0x1172ec6 "-1, ", size=2042, args=0x7ffff6a7c034 "\377\377\377\377",
augmented_args=0x7ffff6a7c064, augmented_args_size=4, trace=0x7fffffffa8d0, thread=0x1175cd0) at builtin-trace.c:1661
#3 0x000000000048f04e in trace__sys_enter (trace=0x7fffffffa8d0, evsel=0xb260b0, event=0x7ffff6a7bfe8, sample=0x7fffffff84f0)
at builtin-trace.c:1880
#4 0x00000000004915a4 in trace__handle_event (trace=0x7fffffffa8d0, event=0x7ffff6a7bfe8, sample=0x7fffffff84f0) at builtin-trace.c:2590
#5 0x0000000000491eed in __trace__deliver_event (trace=0x7fffffffa8d0, event=0x7ffff6a7bfe8) at builtin-trace.c:2818
#6 0x0000000000492030 in trace__deliver_event (trace=0x7fffffffa8d0, event=0x7ffff6a7bfe8) at builtin-trace.c:2845
#7 0x0000000000492896 in trace__run (trace=0x7fffffffa8d0, argc=0, argv=0x7fffffffdb58) at builtin-trace.c:3040
#8 0x000000000049603a in cmd_trace (argc=0, argv=0x7fffffffdb58) at builtin-trace.c:3952
#9 0x00000000004d5103 in main (argc=1, argv=0x7fffffffdb58) at perf.c:474
(gdb) p fd
$1 = -1
(gdb) p file
$7 = (struct file *) 0xfffffffffffffff0
(gdb) p ((struct thread_trace *)arg->thread)->files.table + fd
$8 = (struct file *) 0xfffffffffffffff0
(gdb)
Check for that and return NULL instead.
This problem was introduced recently, the other codepaths leading to
thread_trace__files_entry() check for negative fds, like thread__fd_path(),
but we need to do it at thread_trace__files_entry() as more users are now
calling it directly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 2d473389f8 ("perf trace beauty: Export function to get the files for a thread")
Link: https://lkml.kernel.org/n/tip-oq7bvaaf07gsd4yqty3107u2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is possible to pass a negative number as the fd and that has to be
handled, so stop using 'unsigned int fd' in the ioctl syscall 'cmd'
beautifier.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-b7qwa0l19dswa09h3s41akfu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesizing BPF events is only supported for root. Silent warning msg
when non-root user runs perf-record.
Reported-by: David Carrillo-Cisneros <davidca@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: David Carrillo-Cisneros <davidca@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20190204193140.719740-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Descriptions of metrics for POWER9 processors can be found in the
"POWER9 Performance Monitor Unit User’s Guide", which is currently
available on the "IBM Portal for OpenPOWER"
(https://www-355.ibm.com/systems/power/openpower/welcome.xhtml) at
https://www-355.ibm.com/systems/power/openpower/posting.xhtml?postingId=4948CDE1963C9BCA852582F800718190
This patch is for metric groups:
- branch_prediction
- instruction_stats_percent_per_ref
- latency
- lsu_rejects
- memory
- prefetch
- translation
Plus, some whitespace changes.
Signed-off-by: Paul Clarke <pc@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/20190209181429.23950-4-pc@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Descriptions of metrics for POWER9 processors can be found in the
"POWER9 Performance Monitor Unit User’s Guide", which is currently
available on the "IBM Portal for OpenPOWER"
(https://www-355.ibm.com/systems/power/openpower/welcome.xhtml) at
https://www-355.ibm.com/systems/power/openpower/posting.xhtml?postingId=4948CDE1963C9BCA852582F800718190
This patch is for metric groups:
- dl1_reloads_percent_per_inst
- dl1_reloads_percent_per_ref
- instruction_misses_percent_per_inst
- l2_stats
- l3_stats
- pteg_reloads_percent_per_inst
- pteg_reloads_percent_per_ref
Signed-off-by: Paul Clarke <pc@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/20190209181429.23950-3-pc@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
POWER8 metrics are not well publicized. Some are here:
https://www.ibm.com/support/knowledgecenter/en/SSFK5S_2.2.0/com.ibm.cluster.pedev.v2r2.pedev100.doc/bl7ug_derivedmetricspower8.htm
This patch is for metric groups:
- branch_prediction
- latency
- bus_stats
- instruction_mix
- instruction_stats_percent_per_ref
Signed-off-by: Paul Clarke <pc@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190207175314.31813-4-pc@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
POWER8 metrics are not well publicized.
Some are here:
https://www.ibm.com/support/knowledgecenter/en/SSFK5S_2.2.0/com.ibm.cluster.pedev.v2r2.pedev100.doc/bl7ug_derivedmetricspower8.htm
This patch is for metric groups:
- dl1_reloads_percent_per_inst
- dl1_reloads_percent_per_ref
- instruction_misses_percent_per_inst
- l2_stats
- lsu_rejects
- memory
- pteg_reloads_percent_per_inst
- pteg_reloads_percent_per_ref
Signed-off-by: Paul Clarke <pc@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190207175314.31813-3-pc@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On IBM z13 machine types 2964 and 2965 the descriptor
sizes for sampling and diagnostic sampling entries
might be missing in the trailer entry and are set to zero.
This leads to a perf report failure when processing diagnostic
sampling entries.
This patch adds missing descriptor sizes when the trailer entry
contains zero for these fields.
Output before:
[root@s38lp82 perf]# ./perf report --stdio | fgrep Samples
0xabbf0 [0x8]: failed to process type: 68
Error:
failed to process sample
[root@s38lp82 perf]#
Output after:
[root@s38lp82 perf]# ./perf report --stdio | fgrep Samples
# Total Lost Samples: 0
# Samples: 3K of event 'SF_CYCLES_BASIC_DIAG'
# Samples: 162 of event 'CF_DIAG'
[root@s38lp82 perf]#
Fixes: 2b1444f2e2 ("perf report: Add raw report support for s390 auxiliary trace")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20190211100627.85714-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After 'commit e22c1c7511 ("perf thread: Don't include symbol.h,
symbol_conf.h is enough")'
Compilation of the perf tools is broken when using the functionality
provided by the openCSD library:
[...]
... timerfd: [ on ]
... sched_getcpu: [ on ]
... sdt: [ OFF ]
... setns: [ on ]
... libopencsd: [ on ]
[...]
CC util/arm-spe.o
CC util/arm-spe-pkt-decoder.o
CC util/s390-cpumsf.o
CC util/cs-etm.o
CC util/parse-branch-options.o
util/cs-etm.c: In function ‘cs_etm__mem_access’:
util/cs-etm.c:297:24: error: storage size of ‘al’ isn’t known
struct addr_location al;
And rightly so since file cs-etm.c doesn't include symbol.h, something
that is rectified in this patch.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190208223543.31836-1-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement --affinity=node|cpu option for the record mode defaulting
to system affinity mask bouncing.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/083f5422-ece9-10dd-8305-bf59c860f10f@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Hardware tracing:
Adrian Hunter:
- Handle calls optimized into jumps to a different symbol
in the thread stack routines used to process hardware traces (Adrian Hunter)
Intel PT:
Adrian Hunter:
- Fix overlap calculation for padding.
- Fix CYC timestamp calculation after OVF.
- Packet splitting can only happen in 32-bit.
- Add timestamp to auxtrace errors.
ARM CoreSight:
Leo Yan:
- Add last instruction information in packet
- Set sample flags for instruction range, exception and
return packets and for a trace discontinuity.
- Add exception number in exception packet
- Change tuple from traceID-CPU# to traceID-metadata
- Add traceID in packet
Mathieu Poirier:
- Add "sinks" group to PMU directory
- Use event attributes to send sink information to kernel
- Remove set_drv_config() API, no longer used.
perf annotate:
Jiri Olsa:
- Delay symbol annotation to the resort phase, speeding up 'perf report'
startup.
perf record:
Alexey Budankov:
- Allow binding userspace buffers to NUMA nodes.
Symbols:
Adrian Hunter:
- Fix calculating of symbol sizes when splitting kallsyms into
maps for kcore processing.
Vendor events:
William Cohen:
- Intel: Fix Load_Miss_Real_Latency on CLX
Misc:
Arnaldo Carvalho de Melo:
- Streamline headers, removing includes when all that is needed are
just forward declarations, fixup the fallout for cases where headers
should have been explicitely included but were instead obtained
indirectly, by sheer luck.
- Add fallback versions for CPU_{OR,EQUAL}(), so that code using it
continue to build on older systems where those were not yet introduced
or in systems using some other libc than the GNU one where those
helpers aren't present.
Documentation:
Changbin Du:
- Add documentation for BPF event selection.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXFsqugAKCRCyPKLppCJ+
JzpwAQDEh1mNZoxfdGZEi9d+8p2hnRlOs3GOUG4iGnqAYfae4QEAkMJ0V1wrmkdw
NXgV+PgWfDcgbD4Cn90eWA8M6KEcbgA=
=ogOF
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-5.1-20190206' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
Hardware tracing:
Adrian Hunter:
- Handle calls optimized into jumps to a different symbol
in the thread stack routines used to process hardware traces (Adrian Hunter)
Intel PT:
Adrian Hunter:
- Fix overlap calculation for padding.
- Fix CYC timestamp calculation after OVF.
- Packet splitting can only happen in 32-bit.
- Add timestamp to auxtrace errors.
ARM CoreSight:
Leo Yan:
- Add last instruction information in packet
- Set sample flags for instruction range, exception and
return packets and for a trace discontinuity.
- Add exception number in exception packet
- Change tuple from traceID-CPU# to traceID-metadata
- Add traceID in packet
Mathieu Poirier:
- Add "sinks" group to PMU directory
- Use event attributes to send sink information to kernel
- Remove set_drv_config() API, no longer used.
perf annotate:
Jiri Olsa:
- Delay symbol annotation to the resort phase, speeding up 'perf report'
startup.
perf record:
Alexey Budankov:
- Allow binding userspace buffers to NUMA nodes.
Symbols:
Adrian Hunter:
- Fix calculating of symbol sizes when splitting kallsyms into
maps for kcore processing.
Vendor events:
William Cohen:
- Intel: Fix Load_Miss_Real_Latency on CLX
Misc:
Arnaldo Carvalho de Melo:
- Streamline headers, removing includes when all that is needed are
just forward declarations, fixup the fallout for cases where headers
should have been explicitely included but were instead obtained
indirectly, by sheer luck.
- Add fallback versions for CPU_{OR,EQUAL}(), so that code using it
continue to build on older systems where those were not yet introduced
or in systems using some other libc than the GNU one where those
helpers aren't present.
Documentation:
Changbin Du:
- Add documentation for BPF event selection.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
An ipvlan bug fix in 'net' conflicted with the abstraction away
of the IPV6 specific support in 'net-next'.
Similarly, a bug fix for mlx5 in 'net' conflicted with the flow
action conversion in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
The timestamp can use useful to find part of a trace that has an error
without outputting all of the trace e.g. using the itrace 's' option to
skip initial number of events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190206103947.15750-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Data is copied when the trace is stopped, so packets are never split
between buffers except when processing if the buffer cannot fit in the
address space which can only happen on 32-bit systems. Change the logic
to reflect that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190206103947.15750-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
CYC packet timestamp calculation depends upon CBR which was being
cleared upon overflow (OVF). That can cause errors due to failing to
synchronize with sideband events. Even if a CBR change has been lost,
the old CBR is still a better estimate than zero. So remove the clearing
of CBR.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190206103947.15750-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Define auxtrace record alignment so that it can be referenced elsewhere.
Note this is preparation for patch "perf intel-pt: Fix overlap calculation
for padding"
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190206103947.15750-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The compiler might optimize a call/ret combination by making it a jmp.
However the thread-stack does not presently cater for that, so that such
control flow is not visible in the call graph. Make it visible by
recording on the stack a branch to the start of a different symbol.
Note, that means when a ret pops the stack, all jmps must be popped off
first.
Example:
$ cat jmp-to-fn.c
__attribute__((noinline)) int bar(void)
{
return -1;
}
__attribute__((noinline)) int foo(void)
{
return bar() + 1;
}
int main()
{
return foo();
}
$ gcc -ggdb3 -Wall -Wextra -O2 -o jmp-to-fn jmp-to-fn.c
$ objdump -d jmp-to-fn
<SNIP>
0000000000001040 <main>:
1040: 31 c0 xor %eax,%eax
1042: e9 09 01 00 00 jmpq 1150 <foo>
<SNIP>
0000000000001140 <bar>:
1140: b8 ff ff ff ff mov $0xffffffff,%eax
1145: c3 retq
<SNIP>
0000000000001150 <foo>:
1150: 31 c0 xor %eax,%eax
1152: e8 e9 ff ff ff callq 1140 <bar>
1157: 83 c0 01 add $0x1,%eax
115a: c3 retq
<SNIP>
$ perf record -o jmp-to-fn.perf.data -e intel_pt/cyc/u ./jmp-to-fn
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,017 MB jmp-to-fn.perf.data ]
$ perf script -i jmp-to-fn.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py jmp-to-fn.db branches calls
2019-01-08 13:24:58.783069 Creating database...
2019-01-08 13:24:58.794650 Writing records...
2019-01-08 13:24:59.008050 Adding indexes
2019-01-08 13:24:59.015802 Done
$ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py jmp-to-fn.db
Before:
main
-> bar
After:
main
-> foo
-> bar
Committer testing:
Install the python2-pyside package, then select these menu options
on the GUI:
"Reports"
"Context sensitive callgraphs"
Then go on expanding the symbols, to get, full picture when doing this
on a fedora:29 with gcc version 8.2.1 20181215 (Red Hat 8.2.1-6) (GCC):
jmp-to-fn
PID:TID
_start (ld-2.28.so)
__libc_start_main
main
foo
bar
To verify that indeed, this fixes the problem.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make thread_stack__no_call_return() more readable by adding more local
variables.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If 'cp' is checked in thread_stack__push_cp() a number of error checks
can be removed, reducing code size and improving readability.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kallsyms symbols do not have a size, so the size becomes the distance to
the next symbol.
Consequently the recently added trampoline symbols end up with large
sizes because the trampolines are some distance from one another and the
main kernel map.
However, symbols that end outside their map can disrupt the symbol tree
because, after mapping, it can appear incorrectly that they overlap
other symbols.
Add logic to truncate symbol size to the end of the corresponding map.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Fixes: d83212d5dd ("kallsyms, x86: Export addresses of PTI entry trampolines")
Link: http://lkml.kernel.org/r/20190109091835.5570-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix incorrect event names for the Load_Miss_Real_Latency metric for
Cascadelake server in the same manner as commit 91b2b97025 for SKL/SKX.
Signed-off-by: William Cohen <wcohen@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190129170536.22510-1-wcohen@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When return from exception, we need to distinguish if it's system call
return or for other type exceptions for setting sample flags. Due to
the exception return packet doesn't contain exception number, so we
cannot decide sample flags based on exception number.
On the other hand, the exception return packet is followed by an
instruction range packet; this range packet deliveries the start address
after exception handling, we can check if it is a SVC instruction just
before the start address. If there has one SVC instruction is found
ahead the return address, this means it's an exception return for system
call; otherwise it is an normal return for other exceptions.
This patch is to set sample flags for exception return packet, firstly
it simply set sample flags as PERF_IP_FLAG_INTERRUPT for all exception
returns since at this point it doesn't know what's exactly the exception
type. We will defer to decide if it's an exception return for system
call when the next instruction range packet comes, it checks if there
has one SVC instruction prior to the start address and if so we will
change sample flags to PERF_IP_FLAG_SYSCALLRET for system call return.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-9-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The exception taken and returning are typical flow for instruction jump
but it needs to be handled with exception packets. This patch is to set
sample flags for exception packet.
Since the exception packet contains the exception number, according to
the exception number this patch makes decision for belonging to which
exception types.
The decoder have defined different exception number for ETMv3 and ETMv4
separately, hence this patch needs firstly decide the ETM version by
using the metadata magic number, and this patch adds helper function
cs_etm__get_magic() for easily getting magic number.
Based on different ETM version, the exception packet contains the
exception number, according to the exception number this patch makes
decision for the exception belonging to which exception types.
In this patch, it introduces helper function cs_etm__is_svc_instr(); for
ETMv4 CS_ETMV4_EXC_CALL covers SVC, SMC and HVC cases in the single
exception number, thus need to use cs_etm__is_svc_instr() to decide an
exception taken for system call.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Robert Walker <robert.walker@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-8-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add traceID in packet, thus we can use traceID to retrieve metadata
pointer from traceID-metadata tuple.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-7-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If packet processing wants to know the packet is bound with which ETM
version, it needs to access metadata to decide that based on metadata
magic number; but we cannot simply to use CPU logic ID number as index
to access metadata sequential array, especially when system have
hotplugged off CPUs, the metadata array are only allocated for online
CPUs but not offline CPUs, so the CPU logic number doesn't match with
its index in the array.
This patch is to change tuple from traceID-CPU# to traceID-metadata,
thus it can use the tuple to retrieve metadata pointer according to
traceID.
For safe accessing metadata fields, this patch provides helper function
cs_etm__get_cpu() which is used to return CPU number according to
traceID; cs_etm_decoder__buffer_packet() is the first consumer for this
helper function.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When an exception packet comes, it contains the information for
exception number; the exception number indicates the exception types, so
from it we can know if the exception is taken for interrupt, system call
or other traps, etc.
This patch simply adds a field in cs_etm_packet struct, it records
exception number for exception packet that will then be used to properly
identify exception types to the perf synthesize mechanic.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the middle of trace stream, it might be interrupted thus the trace
data is not continuous, the trace stream firstly is ended for previous
trace block and restarted for next block.
To display related information for showing trace is restarted, this
patch set sample flags for trace discontinuity:
- If one discontinuity packet is coming, append flag
PERF_IP_FLAG_TRACE_END to the previous packet to indicate the trace
has been ended;
- If one instruction packet is following discontinuity packet, this
instruction packet is the first one packet to restarting trace. So
set flag PERF_IP_FLAG_TRACE_START to discontinuity packet, this flag
will be used to generate sample when connect with the sequential
instruction packet.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf sample data contains flags to indicate the hardware trace data
is belonging to which type branch instruction, thus this can be used to
print out the human readable string. Arm CoreSight ETM sample data is
missed to set flags and it is always set to zeros, this results in perf
tool skips to print string for instruction types.
This patch is to set branch instruction flags for instruction range
packet.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Decoder provides last instruction related information, these information
can be used for trace analysis; specifically we can get to know what
kind of branch instruction has been executed, mainly the information are
contained in three element fields:
last_i_type: this is significant type for waypoint calculation, it
indicates the last instruction is one of immediate branch instruction,
indirect branch instruction, instruction barrier (ISB), or data
barrier (DSB/DMB).
last_i_subtype: this is used for instruction sub type, it can be
branch with link, ARMv8 return instruction, ARMv8 eret instruction
(return from exception), or ARMv7 instruction which could imply
return (e.g. MOV PC, LR; POP { ,PC}).
last_instr_cond: it indicates if the last instruction was conditional.
But these three fields are not saved into cs_etm_packet struct, thus
cs-etm layer don't know related information and cannot generate sample
flags for branch instructions.
This patch add corresponding three new fields in cs_etm_packet struct
and save related value into the packet structure, it is preparation for
supporting sample flags.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add documentation for how to pass a BPF program as a perf event.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190201134651.12373-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we make the annotation for the IPC column during the entry
display, already outside of the progress bar scope, so it appears like
'perf report' is stuck.
Move the annotation retrieval to the resort phase, so that all the data
are ready for display.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190204141808.23031-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add perf_evsel__output_resort_cb() so we have an interface with a
callback for each hist entry. It will be used in the following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190204141808.23031-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add argument to hists__resort_cb_t so that we can pass data from upper
layers to the callback function. It will be used in the following
patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190204141808.23031-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It prevents copy elision, generating this warning when building with
fedora:rawhide's clang:
clang version 7.0.1 (Fedora 7.0.1-2.fc30)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/9
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
$ make -C tools/perf CC=clang LIBCLANGLLVM=1
<SNIP>
util/c++/clang.cpp: In function 'std::unique_ptr<llvm::SmallVectorImpl<char> > perf::getBPFObjectFromModule(llvm::Module*)':
util/c++/clang.cpp:163:18: error: moving a local object in a return statement prevents copy elision [-Werror=pessimizing-move]
163 | return std::move(Buffer);
| ~~~~~~~~~^~~~~~~~
util/c++/clang.cpp:163:18: note: remove 'std::move' call
cc1plus: all warnings being treated as errors
<SNIP>
References:
http://www.cplusplus.com/forum/general/186411/#msg908572https://en.cppreference.com/w/cpp/language/return#Noteshttps://en.cppreference.com/w/cpp/language/copy_elision
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-lehqf5x5q96l0o8myhb6blz6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Build node cpu masks for mmap data buffers. Apply node cpu masks to tool
thread every time it references data buffers cross node or cross cpu.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/b25e4ebc-078d-2c7b-216c-f0bed108d073@linux.intel.com
[ Use cpu-set-sched.h to get the CPU_{EQUAL,OR}() fallbacks for older systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
From the glibc sources, so that we can keep the tooling buildable in
older systems while using recent sched.h CPU_ macros.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-hvm9ysmrjip75ebdzhzoh429@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>