Implement the TopDown formulas in 'perf stat'. The topdown basic metrics
reported by the kernel are collected, and the formulas are computed and
output as normal metrics.
See the kernel commit exporting the events for details on the used
metrics.
Committer note:
Output example:
# perf stat --topdown -a usleep 1
Performance counter stats for 'system wide':
retiring bad speculation frontend bound backend bound
S0-C0 2 23.8% 11.6% 28.3% 36.3%
S0-C1 2 16.2% 15.7% 36.5% 31.6%
0.000579956 seconds time elapsed
#
v2: Always print all metrics, only use thresholds for coloring.
v3: Mark retiring over threshold green, not red.
v4: Only print one decimal digit
Fix color printing of one metric
v5: Avoid printing -0.0
v6: Remove extra frontend event lookup
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently only the task-clock event updates the runtime_nsec so it
cannot show the metric when using cpu-clock events. However cpu clock
works basically same as task-clock, so no need to not update the runtime
IMHO.
Before:
# perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1
Performance counter stats for 'system wide':
1217.759506 cpu-clock (msec)
93 context-switches
61 page-faults
18,958,022 cycles
0.101393794 seconds time elapsed
After:
Performance counter stats for 'system wide':
1220.471884 cpu-clock (msec) # 12.013 CPUs utilized
118 context-switches # 0.097 K/sec
59 page-faults # 0.048 K/sec
17,941,247 cycles # 0.015 GHz
0.101594777 seconds time elapsed
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All over the tree.
Cc: David Ahern <dsahern@gmail.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-8nzhnokxyp8y4v7gf0j00oyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an extra check for frontend stalled in the metrics. This avoids an
extra column for the --metric-only case when the CPU does not support
frontend stalled.
v2: Add separate init function
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1456858672-21594-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now support CSV output for metrics. With the new output callbacks this
is relatively straight forward by creating new callbacks.
This allows to easily plot metrics from CSV files.
The new line callback needs to know the number of fields to skip them
correctly
Example output before:
% perf stat -x, true
0.200687,,task-clock,200687,100.00
0,,context-switches,200687,100.00
0,,cpu-migrations,200687,100.00
40,,page-faults,200687,100.00
730871,,cycles,203601,100.00
551056,,stalled-cycles-frontend,203601,100.00
<not supported>,,stalled-cycles-backend,0,100.00
385523,,instructions,203601,100.00
78028,,branches,203601,100.00
3946,,branch-misses,203601,100.00
After:
% perf stat -x, true
.502457,,task-clock,502457,100.00,0.485,CPUs utilized
0,,context-switches,502457,100.00,0.000,K/sec
0,,cpu-migrations,502457,100.00,0.000,K/sec
45,,page-faults,502457,100.00,0.090,M/sec
644692,,cycles,509102,100.00,1.283,GHz
423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
<not supported>,,stalled-cycles-backend,0,100.00,,,,
492701,,instructions,509102,100.00,0.76,insn per cycle
,,,,,0.86,stalled cycles per insn
97767,,branches,509102,100.00,194.578,M/sec
4788,,branch-misses,509102,100.00,4.90,of all branches
or easier readable
$ perf stat -x, -o x.csv true
$ column -s, -t x.csv
0.490635 task-clock 490635 100.00 0.489 CPUs utilized
0 context-switches 490635 100.00 0.000 K/sec
0 cpu-migrations 490635 100.00 0.000 K/sec
45 page-faults 490635 100.00 0.092 M/sec
629080 cycles 497698 100.00 1.282 GHz
409498 stalled-cycles-frontend 497698 100.00 65.09 frontend cycles idle
<not supported> stalled-cycles-backend 0 100.00
491424 instructions 497698 100.00 0.78 insn per cycle
0.83 stalled cycles per insn
97278 branches 497698 100.00 198.270 M/sec
4569 branch-misses 497698 100.00 4.70 of all branches
Two new fields are added: metric value and metric name.
v2: Split out function argument changes
v3: Reenable metrics for real.
v4: Fix wrong hunk from refactoring.
v5: Remove extra "noise" printing (Jiri), but add it to the not counted case.
Print empty metrics for not counted.
v6: Avoid outputting metric on empty format.
v7: Print metric at the end
v8: Remove extra run, ena fields
v9: Avoid extra new line for unsupported counters
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1456785386-19481-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Abstract the printing of shadow metrics. Instead of every metric calling
fprintf directly and taking care of indentation, use two call backs: one
to print metrics and another to start a new line.
This will allow adding metrics to CSV mode and also using them for other
purposes.
The computation of padding is now done in the central callback, instead
of every metric doing it manually. This makes it easier to add new
metrics.
v2: Refactor functions, printout now does more. Move
shadow printing. Improve fallback callbacks. Don't
use void * callback data.
v3: Remove unnecessary hunk. Add typedef for new_line
v4: Remove unnecessary hunk. Don't print metrics for CSV/interval
mode yet. Move printout change to separate patch.
v5: Fix bisect bugs. Avoid bogus frontend cycles printing.
Fix indentation in different aggregation modes.
v6: Delay newline handling
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sw clock metrics printing was missed in the earlier move to
stat-shadow of all the other metric printouts. Move it too.
v2: Fix metrics printing in this version to make bisect safe.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1446515428-7450-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The transaction length metrics in perf stat -T broke recently.
It would not match the metric correctly and always print K/sec.
This was caused by a incorrect update of the cycles_in_tx statistics.
Update the correct variable.
Also the check for zero division was reversed, which resulted in K/sec
being printed for no transactions. Fix this also up.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1438039491-22091-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Separating shadow counters code into separate object as a cleanup, but
mainly for upcomming changes, so could use it from script command
context.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1433341559-31848-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>