This option is already present in 'perf top', albeit undocumented (will
fix). It makes perf use /proc/kcore instead of vmlinux, so the annotation
shows the code that is really in place rather than what the kernel starts
with, before alternatives, ftrace .text patching, etc. See the differences:
# perf annotate --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /lib/modules/4.16.0-rc4/build/vmlinux
Event: anon group { cycles, instructions }
  0.00   3.17    → callq  __fentry__
  0.00   7.94      push   %rbx
  7.69  36.51    → callq  __page_file_index
                   mov    %rax,%rbx
  7.69   3.17    → callq  *ffffffff82225cd0
                   xor    %eax,%eax
                   mov    $0x1,%edx
 80.77  49.21      lock   cmpxchg %edx,(%rdi)
                   test   %eax,%eax
                 ↓ jne    2b
  3.85   0.00      mov    %rbx,%rax
                   pop    %rbx
                 ← retq
               2b: mov    %eax,%esi
                 → callq  queued_spin_lock_slowpath
                   mov    %rbx,%rax
                   pop    %rbx
                 ← retq
[root@jouet ~]# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /proc/kcore
Event: anon group { cycles, instructions }
  0.00   3.17      nop
  0.00   7.94      push   %rbx
  0.00  23.81      pushfq
  7.69  12.70      pop    %rax
                   nop
                   mov    %rax,%rbx
  7.69   3.17      cli
                   nop
                   xor    %eax,%eax
                   mov    $0x1,%edx
 80.77  49.21      lock   cmpxchg %edx,(%rdi)
                   test   %eax,%eax
                 ↓ jne    2b
  3.85   0.00      mov    %rbx,%rax
                   pop    %rbx
                 ← retq
               2b: mov    %eax,%esi
                 → callq  *ffffffff820e96b0
                   mov    %rbx,%rax
                   pop    %rbx
                 ← retq
#
Diff of the output of those commands:
# perf annotate --stdio2 _raw_spin_lock_irqsave > /tmp/vmlinux
# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave > /tmp/kcore
# diff -y /tmp/vmlinux /tmp/kcore
_raw_spin_lock_irqsave() vmlinux                     | _raw_spin_lock_irqsave() /proc/kcore
Event: anon group { cycles, instructions }             Event: anon group { cycles, instructions }
  0.00   3.17    → callq  __fentry__                 |   0.00   3.17      nop
  0.00   7.94      push   %rbx                           0.00   7.94      push   %rbx
  7.69  36.51    → callq  __page_file_index          |   0.00  23.81      pushfq
                                                     >   7.69  12.70      pop    %rax
                                                     >                    nop
                   mov    %rax,%rbx                                       mov    %rax,%rbx
  7.69   3.17    → callq  *ffffffff82225cd0          |   7.69   3.17      cli
                                                     >                    nop
                   xor    %eax,%eax                                       xor    %eax,%eax
                   mov    $0x1,%edx                                       mov    $0x1,%edx
 80.77  49.21      lock   cmpxchg %edx,(%rdi)           80.77  49.21      lock   cmpxchg %edx,(%rdi)
                   test   %eax,%eax                                       test   %eax,%eax
                 ↓ jne    2b                                            ↓ jne    2b
  3.85   0.00      mov    %rbx,%rax                      3.85   0.00      mov    %rbx,%rax
                   pop    %rbx                                            pop    %rbx
                 ← retq                                                 ← retq
               2b: mov    %eax,%esi                                   2b: mov    %eax,%esi
                 → callq  queued_spin_lock_slowpath  |                  → callq  *ffffffff820e96b0
                   mov    %rbx,%rax                                       mov    %rbx,%rax
                   pop    %rbx                                            pop    %rbx
                 ← retq                                                 ← retq
#
This should be further streamlined by doing both annotations,
allowing the TUI to toggle between initial and current, and showing the
patched instructions in a slightly different color.
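Until the TUI can do this by itself, the comparison can be roughed out
outside of perf. What follows is only a sketch, not anything perf
implements: it assumes the two --stdio2 outputs were saved as text, and
the helper names (instruction, patched_lines) and the '!' marker are made
up for illustration. It strips the percentage columns and jump arrows,
then uses Python's difflib to flag the instructions in the current
(/proc/kcore) listing that differ from the initial (vmlinux) one:

```python
import difflib
import re

# Leading sample percentages (e.g. "7.69 36.51") and jump arrows are
# annotation decoration, not part of the instruction itself.
_PREFIX = re.compile(r"^\s*(?:\d+\.\d+\s+)*[→↓←↑]?\s*")


def instruction(line):
    """Return the instruction text of one annotate line, whitespace-normalised."""
    return " ".join(_PREFIX.sub("", line).split())


def patched_lines(initial, current):
    """Tag each instruction in `current` (hypothetical helper).

    Returns (tag, instruction) pairs, where tag is '!' if the instruction
    was replaced or inserted relative to `initial`, and ' ' otherwise.
    """
    a = [instruction(line) for line in initial]
    b = [instruction(line) for line in current]
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    tagged = []
    for op, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if op == "delete":
            continue  # present only in the initial image, nothing to show
        tag = " " if op == "equal" else "!"
        tagged.extend((tag, insn) for insn in b[j1:j2])
    return tagged


if __name__ == "__main__":
    vmlinux = ["0.00 3.17 → callq __fentry__", "0.00 7.94 push %rbx"]
    kcore = ["0.00 3.17 nop", "0.00 7.94 push %rbx"]
    for tag, insn in patched_lines(vmlinux, kcore):
        print(tag, insn)
```

The lines tagged with '!' are the ones a front end could draw in a
different color.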
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-wz8d269hxkcwaczr0r4rhyjg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
124 lines
2.9 KiB
Plaintext
perf-annotate(1)
================

NAME
----
perf-annotate - Read perf.data (created by perf record) and display annotated code

SYNOPSIS
--------
[verse]
'perf annotate' [-i <file> | --input=<file>] [symbol_name]

DESCRIPTION
-----------
This command reads the input file and displays an annotated version of the
code. If the object file has debug symbols then the source code will be
displayed alongside assembly code.

If there is no debug info in the object, then annotated assembly is displayed.

OPTIONS
-------
-i::
--input=<file>::
	Input file name. (default: perf.data unless stdin is a fifo)

-d::
--dsos=<dso[,dso...]>::
	Only consider symbols in these dsos.
-s::
--symbol=<symbol>::
	Symbol to annotate.

-f::
--force::
	Don't do ownership validation.

-v::
--verbose::
	Be more verbose. (Show symbol address, etc)

-q::
--quiet::
	Do not show any message. (Suppress -v)

-n::
--show-nr-samples::
	Show the number of samples for each symbol.

-D::
--dump-raw-trace::
	Dump raw trace in ASCII.

-k::
--vmlinux=<file>::
	vmlinux pathname.

--ignore-vmlinux::
	Ignore vmlinux files.

-m::
--modules::
	Load module symbols. WARNING: use only with -k and LIVE kernel.

-l::
--print-line::
	Print matching source lines (may be slow).

-P::
--full-paths::
	Don't shorten the displayed pathnames.

--stdio:: Use the stdio interface.

--stdio2:: Use the stdio2 interface, non-interactive, uses the TUI formatting.

--stdio-color=<mode>::
	'always', 'never' or 'auto'. Allows configuring color output
	via the command line, in addition to via "color.ui" in .perfconfig.
	Use '--stdio-color always' to generate color even when redirecting
	to a pipe or file. Using just '--stdio-color' is equivalent to
	using 'always'.

--tui:: Use the TUI interface. Use of --tui requires a tty; if one is not
	present, as when piping to other commands, the stdio interface is
	used. This interface starts by centering on the line with the most
	samples; TAB/UNTAB cycles through the lines with the most samples.

--gtk:: Use the GTK interface.

-C::
--cpu=<cpu>:: Only report samples for the list of CPUs provided. Multiple CPUs can
	be provided as a comma-separated list with no space: 0,1. Ranges of
	CPUs are specified with -: 0-2. Default is to report samples on all
	CPUs.

--asm-raw::
	Show raw instruction encoding of assembly instructions.

--show-total-period:: Show a column with the sum of periods.

--source::
	Interleave source code with assembly code. Enabled by default,
	disable with --no-source.

--symfs=<directory>::
	Look for files with symbols relative to this directory.

-M::
--disassembler-style=:: Set disassembler style for objdump.

--objdump=<path>::
	Path to objdump binary.

--skip-missing::
	Skip symbols that cannot be annotated.

--group::
	Show event group information together.

SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1]