To make the output more readable, I think it's better to remove 0's in
the output. Also the dummy event has no event stats so it just wasts
the space. Let's use the --skip-empty option to suppress it.
$ perf report --stat --skip-empty
Aggregated stats:
TOTAL events: 16530
MMAP events: 226
COMM events: 1596
EXIT events: 2
THROTTLE events: 121
UNTHROTTLE events: 117
FORK events: 1595
SAMPLE events: 719
MMAP2 events: 12147
CGROUP events: 2
FINISHED_ROUND events: 2
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
cycles stats:
SAMPLE events: 719
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In hist__find_annotations(), since different 'struct hist_entry' entries
may point to same symbol, we free notes->src to signal already processed
this symbol in stdio mode; when annotate, entry will skipped if
notes->src is NULL to avoid repeated output.
However, there is a problem, for example, run the following command:
# perf record -e branch-misses -e branch-instructions -a sleep 1
perf.data file contains different types of sample event.
If the same IP sample event exists in branch-misses and branch-instructions,
this event uses the same symbol. When annotate branch-misses events, notes->src
corresponding to this event is set to null, as a result, when annotate
branch-instructions events, this event is skipped and no annotate is output.
Solution of this patch is to remove zfree in hists__find_annotations and
change sort order to "dso,symbol" to avoid duplicate output when different
processes correspond to the same symbol.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: zhangjinhao2@huawei.com
Link: http://lore.kernel.org/lkml/20210319123527.173883-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The objdump utility has useful --prefix / --prefix-strip options to
allow changing source code file names hardcoded into executables' debug
info. Add options to 'perf report', 'perf top' and 'perf annotate',
which are then passed to objdump.
$ mkdir foo
$ echo 'main() { for (;;); }' > foo/foo.c
$ gcc -g foo/foo.c
foo/foo.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | main() { for (;;); }
| ^~~~
$ perf record ./a.out
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.230 MB perf.data (5721 samples) ]
$ mv foo bar
$ perf annotate
<does not show source code>
$ perf annotate --prefix=/home/ak/lsrc/git/bar --prefix-strip=5
<does show source code>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20200107210444.214071-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we pass that substructure around and with it consolidate lots of
functions that receive a (map, symbol) pair and now can receive just a
'struct map_symbol' pointer.
This further paves the way to add 'struct map_groups' to 'struct
map_symbol' so that we can have all we need for annotation so that we
can ditch 'struct map'->groups, i.e. have the map_groups pointer in a
more central place, avoiding the pointer in the 'struct map' that have
tons of instances.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fs90ttd9q12l7989fo7pw81q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hist__account_cycles() function is executed when the
hist_iter__branch_callback() is called.
But it looks it's not necessary. In hist__account_cycles, it already
walks on all branch entries.
This patch moves the hist__account_cycles out of callback, now the data
processing is much faster than before.
Previous code has an issue that the ch[offset].num++ (in
__symbol__account_cycles) is executed repeatedly since
hist__account_cycles is called in each hist_iter__branch_callback, so
the counting of ch[offset].num is not correct (too big).
With this patch, the issue is fixed. And we don't need the code of
"ch->reset >= ch->num / 2" to check if there are too many overlaps (in
annotation__count_and_fill), otherwise some data would be hidden.
Now, we can try, for example:
perf record -b ...
perf annotate or perf report -s symbol
The before/after output should be no change.
v3:
---
Fix the crash in stdio mode.
Like previous code, it needs the checking of ui__has_annotation()
before hist__account_cycles()
v2:
---
1. Cover the similar perf report
2. Remove the checking code "ch->reset >= ch->num / 2"
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1552684577-29041-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something heavily required for histograms. Specifically, the following
are converted to rb_root_cached, and users accordingly:
hist::entries_in_array
hist::entries_in
hist::entries
hist::entries_collapsed
hist_entry::hroot_in
hist_entry::hroot_out
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-7-dave@stgolabs.net
[ Added some missing conversions to rb_first_cached() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add --percent-type option to set annotation percent type from following
choices:
global-period, local-period, global-hits, local-hits
Examples:
$ perf annotate --percent-type period-local --stdio | head -1
Percent | Source code ... es, percent: local period)
$ perf annotate --percent-type hits-local --stdio | head -1
Percent | Source code ... es, percent: local hits)
$ perf annotate --percent-type hits-global --stdio | head -1
Percent | Source code ... es, percent: global hits)
$ perf annotate --percent-type period-global --stdio | head -1
Percent | Source code ... es, percent: global period)
The local/global keywords set if the percentage is computed in the scope
of the function (local) or the whole data (global).
The period/hits keywords set the base the percentage is computed on -
the samples period or the number of samples (hits).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180804130521.11408-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_event__process_feature() accesses feat_ops[HEADER_LAST_FEATURE]
which is not defined and thus perf is crashing. HEADER_LAST_FEATURE is
used as an end marker for the perf report but it's unused for perf
script/annotate. Ignore HEADER_LAST_FEATURE for perf script/annotate,
just like it is done in 'perf report'.
Before:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
#
After:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
ls 7031 4392.099856: 250000 cpu-clock:uhH: 7f5e0ce7cd60
ls 7031 4392.100355: 250000 cpu-clock:uhH: 7f5e0c706ef7
#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 57b5de4639 ("perf report: Support forced leader feature in pipe mode")
Link: http://lkml.kernel.org/r/20180625124220.6434-4-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the split of symbol tables for data (MAP__VARIABLE) and for
functions (MAP__FUNCTION), its unneeded and there were various places
doing two lookups to find a symbol, so simplify this.
We still will consider only the symbols that matched the filters in
place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in
the patch, just so that we consider only the same symbols as before,
to reduce the possibility of regressions.
All the tests on 50-something build environments, in varios versions
of lots of distros and cross build environments were performed without
build regressions, as usual with all pull requests the other tests were
also performed: 'perf test' and 'make -C tools/perf build-test'.
Also this was done at a great granularity so that regressions can be
bisected more easily.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hiq0fy2rsleupnqqwuojo1ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>