linux/tools/perf
Masami Hiramatsu d2d4edbebe perf probe: Fix to show correct locations for events on modules
Fix to show correct locations for events on modules by relocating given
address instead of retrying after failure.

This happens when the module text size is big enough, bigger than
sh_addr, because the original code retries with given address + sh_addr
if it failed to find CU DIE at the given address.

Any address smaller than sh_addr always fails and it retries with the
correct address, but addresses bigger than sh_addr will get a CU DIE
which is on the given address (not adjusted by sh_addr).

In my environment(x86-64), the sh_addr of ".text" section is 0x10030.
Since i915 is a huge kernel module, we can see this issue as below.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
  ffffffffc0270000 t i915_switcheroo_can_switch	[i915]

ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
symbols cross this boundary.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\
  | head -n 2
  ffffffffc027ff80 t haswell_init_clock_gating	[i915]
  ffffffffc0280110 t valleyview_init_clock_gating	[i915]

So setup probes on both function and see what happen.

  $ sudo ./perf probe -m i915 -a haswell_init_clock_gating \
        -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

  	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)

As you can see, haswell_init_clock_gating is correctly shown,
but valleyview_init_clock_gating is not.

With this patch, both events are shown correctly.

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)

Committer notes:

In my case:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	  perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915)
    probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915)
  #

  # readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name'
   [Nr] Name   Type      Address          Off    Size   ES Flg Lk Inf Al
   [ 1] .text  PROGBITS  ffffffff81000000 200000 822fd3 00  AX  0   0 4096
  #

  So both are b0rked, now with the fix:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
  #

Both looks correct.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:14:06 -03:00
..
arch perf annotate: AArch64 support 2016-12-01 13:03:19 -03:00
bench perf bench futex: Fix lock-pi help string 2016-12-20 09:37:40 -03:00
Documentation perf record: Fix --switch-output documentation and comment 2017-01-03 11:11:38 -03:00
jvmti perf kvmti: Remove unused Makefile file 2016-11-14 12:42:56 -03:00
pmu-events perf vendor events: Support couple more POWER8 PVRs in mapfile 2016-10-17 13:39:47 -03:00
python perf python: Add tracepoint example 2016-07-12 16:23:35 -03:00
scripts perf/core improvements and fixes: 2016-08-04 11:02:38 +02:00
tests Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-23 16:49:12 -08:00
trace perf trace: Check if MAP_32BIT is defined (again) 2016-12-20 09:37:40 -03:00
ui perf annotate: Fix jump target outside of function address range 2016-12-15 16:25:46 -03:00
util perf probe: Fix to show correct locations for events on modules 2017-01-16 15:14:06 -03:00
.gitignore
Build perf c2c: Add c2c command 2016-10-19 13:18:31 -03:00
builtin-annotate.c perf annotate: Add branch stack / basic block 2016-09-08 13:44:03 -03:00
builtin-bench.c
builtin-buildid-cache.c
builtin-buildid-list.c
builtin-c2c.c perf tools: Remove some needless __maybe_unused 2016-12-15 16:25:45 -03:00
builtin-config.c perf config: Mark where are config items from (user or system) 2016-11-14 13:10:37 -03:00
builtin-data.c
builtin-diff.c perf hists: Add support for header span 2016-08-23 15:37:33 -03:00
builtin-evlist.c
builtin-help.c
builtin-inject.c perf symbols: Remove symbol_filter_t machinery 2016-09-05 11:14:50 -03:00
builtin-kmem.c perf kmem: Add option to specify time window of interest 2016-12-01 13:03:02 -03:00
builtin-kvm.c perf kvm: Use NSEC_PER_USEC 2016-08-23 15:37:33 -03:00
builtin-list.c perf list: Support long jevents descriptions 2016-10-03 21:35:47 -03:00
builtin-lock.c
builtin-mem.c perf mem: Fix --all-user/--all-kernel options 2016-12-15 16:25:45 -03:00
builtin-probe.c perf probe: Ignore vmlinux Build-id when offline vmlinux given 2016-09-01 12:42:22 -03:00
builtin-record.c perf record: Fix --switch-output documentation and comment 2017-01-03 11:11:38 -03:00
builtin-report.c perf tools: Remove some needless __maybe_unused 2016-12-15 16:25:45 -03:00
builtin-sched.c perf sched timehist: Show total scheduling time 2016-12-27 21:47:57 -03:00
builtin-script.c perf script: Add option to specify time window of interest 2016-12-01 13:02:45 -03:00
builtin-stat.c perf tools: Remove some needless __maybe_unused 2016-12-15 16:25:45 -03:00
builtin-timechart.c perf timechart: Use NSEC_PER_U?SEC 2016-08-23 15:37:33 -03:00
builtin-top.c perf annotate: Start supporting cross arch annotation 2016-11-17 17:12:50 -03:00
builtin-trace.c perf trace: Update tid/pid filtering option to leverage symbol_conf 2016-11-25 16:04:22 -03:00
builtin-version.c
builtin.h perf c2c: Add c2c command 2016-10-19 13:18:31 -03:00
check-headers.sh perf tools: Move headers check into bash script 2016-12-15 16:25:44 -03:00
command-list.txt
CREDITS
design.txt
Makefile
Makefile.config perf build: Check LLVM version in feature check 2016-12-06 13:21:55 -03:00
Makefile.perf perf/urgent fixes and one improvement: 2017-01-05 08:33:02 +01:00
MANIFEST tools lib: Add for_each_clear_bit macro 2016-10-24 11:07:33 -03:00
perf-archive.sh
perf-completion.sh
perf-read-vdso.c
perf-sys.h perf powerpc: Fix build-test failure 2016-09-08 13:44:07 -03:00
perf-with-kcore.sh
perf.c perf c2c: Add c2c command 2016-10-19 13:18:31 -03:00
perf.h perf evsel: Allow to ignore missing pid 2016-12-15 16:25:46 -03:00