linux/tools/perf
Thomas Richter ea1fa48c05 perf stat: Handle different PMU names with common prefix
On s390 the CPU Measurement Facility for counters now supports
2 PMUs named cpum_cf (CPU Measurement Facility for counters) and
cpum_cf_diag (CPU Measurement Facility for diagnostic counters)
for one and the same CPU.

Running command

 [root@s35lp76 perf]# ./perf stat -e tx_c_tend \
	 -- ~/mytests/cf-tx-events 1

 Measuring transactions
 TX_C_TABORT_NO_SPECIAL: 0 expected:0
 TX_C_TABORT_SPECIAL: 0 expected:0
 TX_C_TEND: 1 expected:1
 TX_NC_TABORT: 11 expected:11
 TX_NC_TEND: 1 expected:1

 Performance counter stats for '/root/mytests/cf-tx-events 1':

  2      tx_c_tend

      0.002120091 seconds time elapsed

      0.000121000 seconds user
      0.002127000 seconds sys

 [root@s35lp76 perf]#

displays output which is unexpected (and wrong):

  2      tx_c_tend

The test program definitely triggers only one transaction, as shown
in line 'TX_C_TEND: 1 expected:1'.

This is caused by the following call sequence:

pmu_lookup() scans and installs a PMU.
+--> pmu_aliases() parses all aliases in directory
		.../<pmu-name>/events/* which are file names.
     +--> pmu_aliases_parse() Read each file in directory and create
                      an new alias entry. This is done with
          +--> perf_pmu__new_alias() and
	       +--> __perf_pmu__new_alias() which also check for
	                   identical alias names.

After pmu_aliases() returns, a complete list of event names
for this pmu has been created. Now function

pmu_add_cpu_aliases()   is called to add the events listed in the json
|                       files to the alias list of the cpu.
+--> perf_pmu__find_map()  Returns a pointer to the json events.

Now function pmu_add_cpu_aliases() scans through all events listed
in the JSON files for this CPU.
Each json event pmu name is compared with the current PMU being
built up and if they mismatch, the json event is added to the
current PMUs alias list.
To avoid duplicate entries the following comparison is done:

	if (!is_arm_pmu_core(name)) {
	     pname = pe->pmu ? pe->pmu : "cpu";
	     if (strncmp(pname, name, strlen(pname)))
		     continue;
     }

The culprit is the strncmp() function.

Using current s390 PMU naming, the first PMU is 'cpum_cf'
and a long list of events is added, among them 'tx_c_tend'

When the second PMU named 'cpum_cf_diag' is added, only one event
named 'CF_DIAG' is added by the pmu_aliases()  function.

Now function pmu_add_cpu_aliases() is invoked for PMU 'cpum_cf_diag'.
Since the CPUID string is the same for both PMUs, json file events
for PMU named 'cpum_cf' are added to the PMU 'cpm_cf_diag'

This happens because the strncmp() actually compares:

     strncmp("cpum_cf", "cpum_cf_diag", 6);

The first parameter is the pmu name taken from the event in
the json file. The second parameter is the pmu name of the PMU
currently being built.
They are different, but the length of the compare only tests the
common prefix and this returns 0(true) when it should return false.

Now all events for PMU cpum_cf are added to the alias list for pmu
cpum_cf_diag.

Later on in function parse_events_add_pmu() the event 'tx_c_end' is
searched in all available PMUs and found twice, adding it two
times to the evsel_list global variable which is the root
of all events. This results in a counter value of 2 instead
of 1.

Output with this patch:

 [root@s35lp76 perf]# ./perf stat -e tx_c_tend \
			-- ~/mytests/cf-tx-events 1
 Measuring transactions
 TX_C_TABORT_NO_SPECIAL: 0 expected:0
 TX_C_TABORT_SPECIAL: 0 expected:0
 TX_C_TEND: 1 expected:1
 TX_NC_TABORT: 11 expected:11
 TX_NC_TEND: 1 expected:1

 Performance counter stats for '/root/mytests/cf-tx-events 1':

                  1      tx_c_tend

      0.001815365 seconds time elapsed

      0.000123000 seconds user
      0.001756000 seconds sys

 [root@s35lp76 perf]#

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Sebastien Boisvert <sboisvert@gydle.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 292c34c102 ("perf pmu: Fix core PMU alias list for X86 platform")
Link: http://lkml.kernel.org/r/20181023151616.78193-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:37:10 -03:00
..
arch Merge branch 'linus' into perf/urgent, to pick up fixes 2018-10-29 07:20:52 +01:00
bench tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy' 2018-07-30 12:36:51 -03:00
Documentation perf record: Support weak groups 2018-11-05 14:37:10 -03:00
examples/bpf perf augmented_syscalls: Start collecting pathnames in the BPF program 2018-11-05 12:41:10 -03:00
include/bpf perf bpf: Add syscall_exit() helper 2018-08-30 15:52:20 -03:00
jvmti perf tools: Fix compilation errors on gcc8 2018-07-11 09:39:57 -04:00
pmu-events Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
python perf python: Make twatch.py work with both python2 and python3 2018-02-19 12:28:08 -03:00
scripts perf scripts python: exported-sql-viewer.py: Add All branches report 2018-10-23 14:47:14 -03:00
tests perf test: S390 does not support watchpoints in test 22 2018-10-08 14:23:44 -03:00
trace perf trace beauty: Use the mmap flags table generated from headers 2018-10-31 09:57:53 -03:00
ui perf annotate: Add support to toggle percent type 2018-08-08 15:55:52 -03:00
util perf stat: Handle different PMU names with common prefix 2018-11-05 14:37:10 -03:00
.gitignore perf tools: Add trace/beauty/generated/ into .gitignore 2018-02-05 13:58:02 -03:00
Build perf trace: Remove audit-libs dependency if syscall tables are present 2018-01-23 09:51:38 -03:00
builtin-annotate.c perf tools: Remove perf_tool from event_op2 2018-09-19 10:25:10 -03:00
builtin-bench.c
builtin-buildid-cache.c perf buildid-cache: Warn --purge-all failures 2018-05-15 10:32:16 -03:00
builtin-buildid-list.c
builtin-c2c.c perf c2c report: Fix crash for empty browser 2018-07-31 10:53:20 -03:00
builtin-config.c
builtin-data.c
builtin-diff.c perf hists: Clarify callchain disabling when available 2018-07-24 14:37:33 -03:00
builtin-evlist.c
builtin-ftrace.c perf ftrace: Append an EOL when write tracing files 2018-02-19 09:49:12 -03:00
builtin-help.c perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT 2018-04-12 10:33:31 -03:00
builtin-inject.c perf tools: Report itrace options in help 2018-09-19 15:06:59 -03:00
builtin-kallsyms.c perf machine: Ditch find_kernel_function variants 2018-04-30 12:20:54 -03:00
builtin-kmem.c tools lib traceevent, perf tools: Rename 'enum pevent_flag' to 'enum tep_flag' 2018-08-13 15:22:18 -03:00
builtin-kvm.c perf tools: Ditch the symbol_conf.nr_events global 2018-06-04 10:28:52 -03:00
builtin-list.c
builtin-lock.c
builtin-mem.c perf mem: Allow all record/report options 2018-04-18 15:35:48 -03:00
builtin-probe.c perf tools: No need to check if the argument to __get() function is NULL 2018-06-04 10:28:50 -03:00
builtin-record.c perf record: Support weak groups 2018-11-05 14:37:10 -03:00
builtin-report.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
builtin-sched.c perf sched: Use sched->show_callchain where appropriate 2018-06-05 10:09:54 -03:00
builtin-script.c perf script: Support total cycles count 2018-10-24 15:29:56 -03:00
builtin-stat.c perf evlist: Move perf_evsel__reset_weak_group into evlist 2018-11-05 14:37:09 -03:00
builtin-timechart.c perf thread: Make thread__find_symbol() return the symbol searched 2018-04-26 13:47:09 -03:00
builtin-top.c perf top: Start display thread earlier 2018-10-31 10:10:11 -03:00
builtin-trace.c perf trace: Fix setting of augmented payload when using eBPF + raw_syscalls 2018-11-03 08:19:56 -03:00
builtin-version.c perf version: Print status for syscall_table 2018-04-12 10:33:34 -03:00
builtin.h
check-headers.sh tools include uapi: Grab a copy of linux/fs.h 2018-10-30 11:46:22 -03:00
command-list.txt perf help: Add missing subcommand version 2018-09-19 14:53:36 -03:00
CREDITS
design.txt
Makefile perf tools: Disable parallelism for 'make clean' 2018-08-20 08:54:58 -03:00
Makefile.config perf tools: Fix use of alternatives to find JDIR 2018-10-16 12:06:47 -03:00
Makefile.perf perf beauty: Wire up the mmap flags table generator to the Makefile 2018-10-31 09:57:52 -03:00
MANIFEST
perf-archive.sh
perf-completion.sh perf tools: Auto-complete for events with ':' 2017-12-27 12:16:00 -03:00
perf-read-vdso.c
perf-sys.h Drop a bunch of metag references 2018-02-23 14:29:59 +00:00
perf-with-kcore.sh
perf.c perf tools: Remove dead quote.[ch] code 2018-06-04 10:28:50 -03:00
perf.h perf record: Encode -k clockid frequency into Perf trace 2018-10-18 11:16:38 -03:00