linux/tools
Adrian Hunter 3c0cd952cf perf thread-stack: Hide x86 retpolines
x86 retpoline functions pollute the call graph by showing up everywhere
there is an indirect branch, but they do not really mean anything. Make
changes so that the default retpoline functions will no longer appear in
the call graph. Note this only affects the call graph, since all the
original branches are left unchanged.

This does not handle function return thunks, nor is there any
improvement for the handling of inline thunks or extern thunks.

Example:

  $ cat simple-retpoline.c
  __attribute__((noinline)) int bar(void)
  {
          return -1;
  }

  int foo(void)
  {
          return bar() + 1;
  }

  __attribute__((indirect_branch("thunk"))) int main()
  {
          int (*volatile fn)(void) = foo;

          fn();
          return fn();
  }
  $ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
  $ objdump -d simple-retpoline
  <SNIP>
  0000000000001040 <main>:
      1040:       48 83 ec 18             sub    $0x18,%rsp
      1044:       48 8d 05 25 01 00 00    lea    0x125(%rip),%rax        # 1170 <foo>
      104b:       48 89 44 24 08          mov    %rax,0x8(%rsp)
      1050:       48 8b 44 24 08          mov    0x8(%rsp),%rax
      1055:       e8 1f 01 00 00          callq  1179 <__x86_indirect_thunk_rax>
      105a:       48 8b 44 24 08          mov    0x8(%rsp),%rax
      105f:       48 83 c4 18             add    $0x18,%rsp
      1063:       e9 11 01 00 00          jmpq   1179 <__x86_indirect_thunk_rax>
  <SNIP>
  0000000000001160 <bar>:
      1160:       b8 ff ff ff ff          mov    $0xffffffff,%eax
      1165:       c3                      retq
  <SNIP>
  0000000000001170 <foo>:
      1170:       e8 eb ff ff ff          callq  1160 <bar>
      1175:       83 c0 01                add    $0x1,%eax
      1178:       c3                      retq
  0000000000001179 <__x86_indirect_thunk_rax>:
      1179:       e8 07 00 00 00          callq  1185 <__x86_indirect_thunk_rax+0xc>
      117e:       f3 90                   pause
      1180:       0f ae e8                lfence
      1183:       eb f9                   jmp    117e <__x86_indirect_thunk_rax+0x5>
      1185:       48 89 04 24             mov    %rax,(%rsp)
      1189:       c3                      retq
  <SNIP>
  $ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
  $ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
  2019-01-08 14:03:37.851655 Creating database...
  2019-01-08 14:03:37.863256 Writing records...
  2019-01-08 14:03:38.069750 Adding indexes
  2019-01-08 14:03:38.078799 Done
  $ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db

Before:

    main
        -> __x86_indirect_thunk_rax
            -> __x86_indirect_thunk_rax
                -> foo
                    -> bar

After:

    main
        -> foo
            -> bar

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-7-adrian.hunter@intel.com
[ Remove (sym->name != NULL) test, this is not a pointer and breaks the build with clang version 7.0.1 (Fedora 7.0.1-2.fc30) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-22 16:49:49 -03:00
..
accounting delayacct: track delays from thrashing cache pages 2018-10-26 16:26:32 -07:00
arch powerpc fixes for 5.0 #3 2019-01-19 05:55:42 +12:00
bpf tools: bpftool: Cleanup license mess 2019-01-18 15:26:54 -08:00
build tools build feature sched_getcpu: Undef _GNU_SOURCE at the end 2019-02-14 15:39:21 -03:00
cgroup
crypto crypto: user - rename err_cnt parameter 2018-12-07 14:15:00 +08:00
firewire
firmware tools: Add 'firmware' category and add ihex2fw tool 2018-11-11 12:58:27 -08:00
gpio tools gpio: Allow overriding CFLAGS 2018-12-28 16:33:08 -03:00
hv Tools: hv: kvp: Fix a warning of buffer overflow with gcc 8.0.1 2018-11-11 12:58:27 -08:00
iio tools iio: Override CFLAGS assignments 2019-01-04 12:54:49 -03:00
include Merge branch 'perf/urgent' into perf/core, to pick up fixes 2019-02-09 13:15:32 +01:00
kvm/kvm_stat tools/kvm_stat: switch to python3 2018-11-27 12:53:44 +01:00
laptop
leds
lib Merge branch 'perf/urgent' into perf/core, to pick up fixes 2019-02-04 08:45:42 +01:00
memory-model tools/memory-model: Add more LKMM limitations 2018-10-02 10:28:04 +02:00
nfsd
objtool objtool: Fix segfault in .cold detection with -ffunction-sections 2018-11-20 18:59:00 +01:00
pci tools: PCI: Change pcitest compiling process 2018-10-03 11:19:52 +01:00
pcmcia
perf perf thread-stack: Hide x86 retpolines 2019-02-22 16:49:49 -03:00
power perf/core improvements and fixes: 2019-01-03 14:05:16 +01:00
scripts
spi spi: spidev_test: Improve decoded text part of hex dump 2018-09-04 17:00:37 +01:00
testing proc: fix /proc/net/* after setns(2) 2019-02-01 15:46:22 -08:00
thermal/tmon tools thermal tmon: Use -O3 instead of -O1 if available 2019-01-04 12:54:49 -03:00
time
usb usbip: tools: fix atoi() on non-null terminated string 2018-10-18 19:44:39 +02:00
virtio virtio: fix test build after uio.h change 2018-12-19 18:23:49 -05:00
vm tools/vm/page_owner: use page_owner_sort in the use example 2019-01-08 17:15:11 -08:00
wmi
Makefile tools: Add 'firmware' category and add ihex2fw tool 2018-11-11 12:58:27 -08:00