Commit Graph

7716 Commits

Author SHA1 Message Date
Li Huafei
28b8e87abf perf mem-events: Remove duplicate #undef
Remove duplicate '#undef E'.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210616120339.219807-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-16 15:05:24 -03:00
Leo Yan
197eecb6ec perf session: Correct buffer copying when peeking events
When peeking an event, it has a short path and a long path.  The short
path uses the session pointer "one_mmap_addr" to directly fetch the
event; and the long path needs to read out the event header and the
following event data from file and fill into the buffer pointer passed
through the argument "buf".

The issue is in the long path that it copies the event header and event
data into the same destination address which pointer "buf", this means
the event header is overwritten.  We are just lucky to run into the
short path in most cases, so we don't hit the issue in the long path.

This patch adds the offset "hdr_sz" to the pointer "buf" when copying
the event data, so that it can reserve the event header which can be
used properly by its caller.

Fixes: 5a52f33adf ("perf session: Add perf_session__peek_event()")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-11 12:54:24 -03:00
Jin Yao
1fcc57b7e5 perf evsel: Adjust hybrid event and global event mixed group
A group mixed with hybrid event and global event is allowed. For
example, group leader is 'intel_pt//' and the group member is
'cpu_atom/cycles/'.

e.g.:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u'

The challenge is that their available cpus are not fully matched. For
example, 'intel_pt//' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
is available on CPU16-CPU23.

When getting the group id for group member, we must be very careful.
Because the cpu for 'intel_pt//' is not equal to the cpu for
'cpu_atom/cycles/'. Actually the cpu here is the index of evsel->core.cpus,
not the real CPU ID.

e.g. cpu0 for 'intel_pt//' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.

Before:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             10
    size                             128
    config                           0xe601
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|IDENTIFIER
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    enable_on_exec                   1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 4084  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 4084  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 4084  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid 4084  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 4084  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 4084  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 4084  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid 4084  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid 4084  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid 4084  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid 4084  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid 4084  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid 4084  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid 4084  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid 4084  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid 4084  cpu 15  group_fd -1  flags 0x8 = 21
  sys_perf_event_open: pid 4084  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid 4084  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid 4084  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid 4084  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid 4084  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid 4084  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid 4084  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid 4084  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER|AUX
    read_format                      ID
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
    aux_sample_size                  4096
  ------------------------------------------------------------
  sys_perf_event_open: pid 4084  cpu 16  group_fd 5  flags 0x8
  sys_perf_event_open failed, error -22

The group_fd 5 is not correct. It should be 22 (the fd of
'intel_pt' on CPU16).

After:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             10
    size                             128
    config                           0xe601
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|IDENTIFIER
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    enable_on_exec                   1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 5162  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 5162  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 5162  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid 5162  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 5162  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 5162  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 5162  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid 5162  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid 5162  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid 5162  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid 5162  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid 5162  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid 5162  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid 5162  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid 5162  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid 5162  cpu 15  group_fd -1  flags 0x8 = 21
  sys_perf_event_open: pid 5162  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid 5162  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid 5162  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid 5162  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid 5162  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid 5162  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid 5162  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid 5162  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER|AUX
    read_format                      ID
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
    aux_sample_size                  4096
  ------------------------------------------------------------
  sys_perf_event_open: pid 5162  cpu 16  group_fd 22  flags 0x8 = 30
  sys_perf_event_open: pid 5162  cpu 17  group_fd 23  flags 0x8 = 31
  sys_perf_event_open: pid 5162  cpu 18  group_fd 24  flags 0x8 = 32
  sys_perf_event_open: pid 5162  cpu 19  group_fd 25  flags 0x8 = 33
  sys_perf_event_open: pid 5162  cpu 20  group_fd 26  flags 0x8 = 34
  sys_perf_event_open: pid 5162  cpu 21  group_fd 27  flags 0x8 = 35
  sys_perf_event_open: pid 5162  cpu 22  group_fd 28  flags 0x8 = 36
  sys_perf_event_open: pid 5162  cpu 23  group_fd 29  flags 0x8 = 37
  ------------------------------------------------------------
  ...

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210609044555.27180-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-10 13:41:50 -03:00
Masami Hiramatsu
0808b3d5b7 perf probe: Provide clearer message permission error for tracefs access
Report permission error for the tracefs open and rewrite whole the error
message code around it.

You'll see a hint according to what you want to do with perf probe as
below.

  $ perf probe -l
  No permission to read tracefs.
  Please try 'sudo mount -o remount,mode=755 /sys/kernel/tracing/'
    Error: Failed to show event list.

  $ perf probe -d \*
  No permission to write tracefs.
  Please run this command again with sudo.
    Error: Failed to delete events.

This also fixes -ENOTSUP checking for mounting tracefs/debugfs.
Actually open returns -ENOENT in that case and we have to check it with
current mount point list. If we unmount debugfs and tracefs perf probe
shows correct message as below.

  $ perf probe -l
  Debugfs or tracefs is not mounted
  Please try 'sudo mount -t tracefs nodev /sys/kernel/tracing/'
    Error: Failed to show event list.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162299456839.503471.13863002017089255222.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 14:12:14 -03:00
Leo Yan
bde1e7d934 perf auxtrace: Change to use SMP memory barriers
The kernel and the userspace tool can access the AUX ring buffer head
and tail from different CPUs, thus SMP class of barriers are required
on SMP system.

This patch changes to use SMP barriers to replace mb() and rmb()
barriers.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210602103007.184993-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 13:45:04 -03:00
Zou Wei
f54cad25a1 perf srccode: Use list_move() instead of equivalent list_del() + list_add() sequence
Using list_move() instead of list_del() + list_add(), shorter,
equivalent.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623113566-49455-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 09:36:36 -03:00
Masami Hiramatsu
f4f1c42953 perf probe: Report possible permission error for map__load() failure
Report possible permission error including kptr_restrict setting
for map__load() failure. This can happen when non-superuser runs
perf probe.

With this patch, perf probe shows the following message.

 $ perf probe vfs_read
 Failed to load symbols from /proc/kallsyms
 Please ensure you can read the /proc/kallsyms symbol addresses.
 If the /proc/sys/kernel/kptr_restrict is '2', you can not read
 kernel symbol address even if you are a superuser. Please change
 it to '1'. If kptr_restrict is '1', the superuser can read the
 symbol addresses.
 In that case, please run this command again with sudo.
   Error: Failed to add events.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162282065877.448336.10047912688119745151.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 15:43:37 -03:00
Riccardo Mancini
67069a1f0f perf env: Fix memory leak of bpf_prog_info_linear member
ASan reported a memory leak caused by info_linear not being deallocated.

The info_linear was allocated during in perf_event__synthesize_one_bpf_prog().

This patch adds the corresponding free() when bpf_prog_info_node
is freed in perf_env__purge_bpf().

  $ sudo ./perf record -- sleep 5
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

  =================================================================
  ==297735==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 7688 byte(s) in 19 object(s) allocated from:
      #0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f)
      #1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16
      #2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16
      #3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9
      #4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8
      #5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8
      #6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8
      #7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
      #8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
      #9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2
      #10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3
      #11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20210602224024.300485-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:26:20 -03:00
Riccardo Mancini
69c9ffed6c perf symbol-elf: Fix memory leak by freeing sdt_note.args
Reported by ASan.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Link: http://lore.kernel.org/lkml/20210602220833.285226-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:06:27 -03:00
Namhyung Kim
3cc84399e9 perf stat: Honor event config name on --no-merge
If user gave an event name explicitly, it should be displayed in the
output as is.  But with --no-merge option it adds a pmu name at the
end so might confuse users.

Actually this is true for hybrid pmus, I think we should do the same
for others.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210602212241.2175005-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:05:23 -03:00
Namhyung Kim
2dc065eae5 perf evsel: Add missing cloning of evsel->use_config_name
The evsel__clone() should copy all fields in the evsel which are set
during the event parsing.  But it missed the use_config_name field.

Fixes: 12279429d8 ("perf stat: Uniquify hybrid event name")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210602212241.2175005-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:04:20 -03:00
Arnaldo Carvalho de Melo
0ab8009b3e Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes from perf/urgent to allow perf/core to be used for new
development.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 14:58:44 -03:00
Jin Yao
d5a8bd0fcd perf mem: Disable 'mem-loads-aux' group before reporting
For some platforms, such as Alderlake, the 'mem-loads' event is required
to use together with 'mem-loads-aux' within a group and 'mem-loads-aux'
must be the group leader. Now we disable this group before reporting
because 'mem-loads-aux' is just an auxiliary event. It doesn't carry
any valid memory load result. If we show the 'mem-loads-aux' +
'mem-loads' as a group in report, it needs many of changes but they
are totally unnecessary.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:06:01 -03:00
Jin Yao
4a9086adc3 perf mem: Support record for hybrid platform
Support 'perf mem record' for hybrid platform. On hybrid platform,
such as Alderlake, when executing 'perf mem record', it actually calls:

record -e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P
       -e cpu_atom/mem-loads,ldlat=30/P
       -e cpu_core/mem-stores/P
       -e cpu_atom/mem-stores/P

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:59 -03:00
Jin Yao
e7ce8d11bf perf tools: Check if mem_events is supported for hybrid platform
Check if the mem_events ('mem-loads' and 'mem-stores') exist
in the sysfs path.

For Alderlake, the hybrid cpu pmu are "cpu_core" and "cpu_atom".
Check the existing of following paths:

/sys/devices/cpu_atom/events/mem-loads
/sys/devices/cpu_atom/events/mem-stores
/sys/devices/cpu_core/events/mem-loads
/sys/devices/cpu_core/events/mem-stores

If the patch exists, the mem_event is supported.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:33 -03:00
Jin Yao
d2f327acc6 perf tools: Support pmu prefix for mem-load event
The perf_mem_events__name() can generate the mem-load event name.
It uses a variable 'mem_loads_name__init' to avoid generating the
event name every time (because perf_pmu__scan takes some time).

The perf_mem_events__name() assumes the pmu is "cpu" but it's not
correct for hybrid platform. For Alderlake, the pmu is "cpu_core" or
"cpu_atom"

Introduce a new parameter 'pmu_name' in perf_mem_events__name
to let the caller specify a pmu name.

Considering such event name is x86 specific, so move
perf_mem_events[] to arch/x86/util/mem-events.c.

We still keep the variable 'mem_loads_name__init' but it's only
used when pmu_name is NULL (compatible for original behavior). When
pmu_name is not NULL (e.g. "cpu_core"), this patch doesn't have
optimization. That can be implemented in follow up patch.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:03:35 -03:00
Yu Kuai
d3fddc355a perf stat: Fix error return code in bperf__load()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.

Committer notes:

Added the missing {} for the now multiline 'if' block, fixing this error:

    CC      /tmp/build/perf/util/bpf_counter.o
  util/bpf_counter.c: In function ‘bperf__load’:
  util/bpf_counter.c:523:9: error: this ‘if’ clause does not guard... [-Werror=misleading-indentation]
    523 |         if (evsel->bperf_leader_link_fd < 0 &&
        |         ^~
  util/bpf_counter.c:526:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
    526 |                 goto out;
        |                 ^~~~
  cc1: all warnings being treated as errors

Fixes: 7fac83aaf2 ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Link: http://lore.kernel.org/lkml/20210517081254.1561564-1-yukuai3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:56:27 -03:00
Namhyung Kim
4f2abe9192 perf record: Move probing cgroup sampling support
I found that checking cgroup sampling support using the missing features
doesn't work on old kernels.  Because it added both attr.cgroup bit and
PERF_SAMPLE_CGROUP bit, it needs to check whichever comes first (usually
the actual event, not dummy).

But it only checks the attr.cgroup bit which is set only in the dummy
event so cannot detect failtures due the sample bits.  Also we don't
ignore the missing feature and retry, it'd be better checking it with
the API probing logic.

Committer notes:

Extracted the minimal part to check using the new cgroup API probe
routine, the part that removes the cgroup member can be left for further
discussion.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210527182835.1634339-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:32:00 -03:00
Li Huafei
3cb17cce1e perf probe: Fix NULL pointer dereference in convert_variable_location()
If we just check whether the variable can be converted, 'tvar' should be
a null pointer. However, the null pointer check is missing in the
'Constant value' execution path.

The following cases can trigger this problem:

	$ cat test.c
	#include <stdio.h>

	void main(void)
	{
	        int a;
	        const int b = 1;

	        asm volatile("mov %1, %0" : "=r"(a): "i"(b));
	        printf("a: %d\n", a);
	}

	$ gcc test.c -o test -O -g
	$ sudo ./perf probe -x ./test -L "main"
	<main@/home/lhf/test.c:0>
	      0  void main(void)
	         {
	      2          int a;
	                 const int b = 1;

	                 asm volatile("mov %1, %0" : "=r"(a): "i"(b));
	      6          printf("a: %d\n", a);
	         }

	$ sudo ./perf probe -x ./test -V "main:6"
	Segmentation fault

The check on 'tvar' is added. If 'tavr' is a null pointer, we return 0
to indicate that the variable can be converted. Now, we can successfully
show the variables that can be accessed.

	$ sudo ./perf probe -x ./test -V "main:6"
	Available variables at main:6
	        @<main+13>
	                char*   __fmt
	                int     a
	                int     b

However, the variable 'b' cannot be tracked.

	$ sudo ./perf probe -x ./test -D "main:6 b"
	Failed to find the location of the 'b' variable at this address.
	 Perhaps it has been optimized out.
	 Use -V with the --range option to show 'b' location range.
	  Error: Failed to add events.

This is because __die_find_variable_cb() did not successfully match
variable 'b', which has the DW_AT_const_value attribute instead of
DW_AT_location. We added support for DW_AT_const_value in
__die_find_variable_cb(). With this modification, we can successfully
track the variable 'b'.

	$ sudo ./perf probe -x ./test -D "main:6 b"
	p:probe_test/main_L6 /home/lhf/test:0x1156 b=\1:s32

Fixes: 66f69b2197 ("perf probe: Support DW_AT_const_value constant value")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Jianlin Lv <jianlin.lv@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
http://lore.kernel.org/lkml/20210601092750.169601-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:11:24 -03:00
Adrian Hunter
e621b8ffec perf auxtrace: Factor out itrace_do_parse_synth_opts()
Factor out itrace_do_parse_synth_opts() so that it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:04:10 -03:00
Adrian Hunter
d9ae9c9776 perf script: Factor out script_fetch_insn()
Factor out script_fetch_insn() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:46 -03:00
Adrian Hunter
cf9bfa6c15 perf scripting python: Assign perf_script_context
The scripting_context pointer itself does not change and nor does it need
to. Put it directly into the script as a variable at the start so it does
not have to be passed on each call into the script.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:33 -03:00
Adrian Hunter
67e50ce0e3 perf scripting: Add perf_session to scripting_context
This is preparation for allowing a script to set the itrace options
for the session if they have not already been set.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:17 -03:00
Adrian Hunter
cac30400a6 perf scripting: Add scripting_context__update()
Move scripting_context update to a separate function and add
the arguments of ->process_event() to it.

This prepares the way for adding more methods to the perf_trace_context
module, by providing the context information that they will need.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:02 -03:00
Namhyung Kim
c673b7f59e perf stat: Fix error check for bpf_program__attach
It seems the bpf_program__attach() returns a negative error code instead
of a NULL pointer in case of error.

Fixes: 7fac83aaf2 ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20210527220052.1657578-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-27 21:51:21 -03:00
Ravi Bangoria
41ca1d1e88 perf probe: Provide more detail with relocation warning
When run as normal user with default sysctl kernel.kptr_restrict=0
and kernel.perf_event_paranoid=2, perf probe fails with:

  $ ./perf probe move_page_tables
  Relocated base symbol is not found!

The warning message is not much informative. The reason perf fails
is because /proc/kallsyms is restricted by perf_event_paranoid=2
for normal user and thus perf fails to read relocated address of
the base symbol.

Tweaking kptr_restrict and perf_event_paranoid can change the
behavior of perf probe. Also, running as root or privileged user
works too. Add these details in the warning message.

Plus, kmap->ref_reloc_sym might not be always set even if
host_machine is initialized. Above is the example of the same.
Remove that comment.

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210525043744.193297-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-27 13:55:28 -03:00
Denys Zagorui
6793672acc perf parse-events: Add bison --file-prefix-map option
During a perf build with O= bison stores full paths in generated files
and those paths are stored in resulting perf binary.

Starting from bison v3.7.1 those paths can be remapped by using the
--file-prefix-map option.  Use this option if possible to make perf
binary more reproducible.

Signed-off-by: Denys Zagorui <dzagorui@cisco.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210524111514.65713-3-dzagorui@cisco.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-27 13:55:28 -03:00
Adrian Hunter
2ede92173f perf scripting python: Add auxtrace error
Add auxtrace_error to general python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
0db2134069 perf scripting python: Add context switch
Add context_switch to general python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
22cc2f74bb perf scripting python: Add cpumode
Add cpumode to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
142b05182e perf scripting python: Add IPC
Add IPC to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
bee272af78 perf scripting python: Add sample flags
Add sample flags to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
54cd8b0324 perf script: Factor out perf_sample__sprintf_flags()
Factor out perf_sample__sprintf_flags() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
3f8e009e01 perf scripting python: Add 'addr_location' for 'addr'
If sample addr correlates to a symbol, add  "addr_dso", "addr_symbol", and
"addr_symoff" to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
8271b50958 perf scripting python: Factor out set_sym_in_dict()
Factor out set_sym_in_dict() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
d04c1ff0b3 perf scripting python: Fix tuple_set_u64()
tuple_set_u64() produces a signed value instead of an unsigned value.
That works for database export but not other cases. Rename to
tuple_set_d64() for database export and fix tuple_set_u64().

Fixes: df919b400a ("perf scripting python: Extend interface to export data in a database-friendly way")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-2-adrian.hunter@intel.com
2021-05-25 10:07:16 -03:00
Arnaldo Carvalho de Melo
0461296878 perf auxtrace: Make perf_event__process_auxtrace*() callable
As we'll use it in the upcoming python interfaces and when built with:

                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
  +NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1
  make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1
  +NO_SYSCALL_TABLE=1
    BUILD:   Doing 'make -j24' parallel build
  <SNIP>
    CC      /tmp/tmp.rGrdpQlTCr/builtin-daemon.o
  In file included from util/events_stats.h:8,
                   from util/evlist.h:12,
                   from builtin-script.c:18:
  builtin-script.c: In function ‘process_auxtrace_error’:
  util/auxtrace.h:708:57: error: called object is not a function or function pointer
    708 | #define perf_event__process_auxtrace_error              0
        |                                                         ^
  builtin-script.c:2443:16: note: in expansion of macro ‘perf_event__process_auxtrace_error’
   2443 |         return perf_event__process_auxtrace_error(session, event);
        |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    MKDIR   /tmp/tmp.rGrdpQlTCr/tests/
    MKDIR   /tmp/tmp.rGrdpQlTCr/bench/
    CC      /tmp/tmp.rGrdpQlTCr/tests/builtin-test.o
    CC      /tmp/tmp.rGrdpQlTCr/bench/sched-messaging.o
  builtin-script.c:2444:1: error: control reaches end of non-void function [-Werror=return-type]
   2444 | }
        | ^

To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:16 -03:00
Adrian Hunter
6ea4b5dbe0 perf script: Find script file relative to exec path
Allow perf script to find a script in the exec path.

Example:

Before:

 $ perf record -a -e intel_pt/branch=0/ sleep 0.1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.954 MB perf.data ]
 $ perf script intel-pt-events.py 2>&1 | head -3
   Error: Couldn't find script `intel-pt-events.py'
   See perf script -l for available scripts.
 $ perf script -s intel-pt-events.py 2>&1 | head -3
 Can't open python script "intel-pt-events.py": No such file or directory
 $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
   Error: Couldn't find script `/home/ahunter/libexec/perf-core/scripts/python/intel-pt-events.py'
   See perf script -l for available scripts.
 $

After:

 $ perf script intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $ perf script -s intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210524065718.11421-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 09:51:44 -03:00
Arnaldo Carvalho de Melo
100475f83b Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes from perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 09:13:52 -03:00
Song Liu
f8b61bd204 perf stat: Skip evlist__[enable|disable] when all events uses BPF
When all events of a perf-stat session use BPF, it is not necessary to
call evlist__enable() and evlist__disable(). Skip them when
all_counters_use_bpf is true.

Signed-off-by: Song Liu <song@kernel.org>
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-21 16:50:17 -03:00
Adrian Hunter
f42907e8a4 perf script: Add missing PERF_IP_FLAG_CHARS for VM-Entry and VM-Exit
Add 'g' (guest) for VM-Entry and 'h' (host) for VM-Exit.

Fixes: c025d46cd9 ("perf script: Add branch types for VM-Entry and VM-Exit")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210521175127.27264-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-21 16:41:37 -03:00
Arnaldo Carvalho de Melo
3b2f17ad17 perf parse-events: Check if the software events array slots are populated
To avoid a NULL pointer dereference when the kernel supports the new
feature but the tooling still hasn't an entry for it.

This happened with the recently added PERF_COUNT_SW_CGROUP_SWITCHES
software event.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/linux-perf-users/YKVESEKRjKtILhog@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-21 07:47:56 -03:00
Namhyung Kim
fb6c79d726 perf tools: Add 'cgroup-switches' software event
It counts how often cgroups are changed actually during the context
switches.

  # perf stat -a -e context-switches,cgroup-switches -a sleep 1

   Performance counter stats for 'system wide':

              11,267      context-switches
              10,950      cgroup-switches

         1.015634369 seconds time elapsed

Committer notes:

The kernel patches landed in v5.13, but this entry wasn't filled in
perf's parse-events tables, which was leading to a segfault when running
'perf list' on a kernel with that feature, as reported by Thomas
Richter.

Also removed the part touching tools/include/uapi/linux/perf_event.h as
it was updated in the usual sync with the kernel UAPI headers, in a
previous, already upstream, patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210210083327.22726-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 14:23:23 -03:00
Adrian Hunter
0a0c597245 perf intel-pt: Remove redundant setting of ptq->insn_len
Remove redundant "ptq->insn_len = 0" statement.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210519074515.9262-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:35:31 -03:00
Adrian Hunter
c954eb72b3 perf intel-pt: Fix sample instruction bytes
The decoder reports the current instruction if it was decoded. In some
cases the current instruction is not decoded, in which case the instruction
bytes length must be set to zero. Ensure that is always done.

Note perf script can anyway get the instruction bytes for any samples where
they are not present.

Also note, that there is a redundant "ptq->insn_len = 0" statement which is
not removed until a subsequent patch in order to make this patch apply
cleanly to stable branches.

Example:

A machne that supports TSX is required. It will have flag "rtm". Kernel
parameter tsx=on may be required.

 # for w in `cat /proc/cpuinfo | grep -m1 flags `;do echo $w | grep rtm ; done
 rtm

Test program:

 #include <stdio.h>
 #include <immintrin.h>

 int main()
 {
        int x = 0;

        if (_xbegin() == _XBEGIN_STARTED) {
                x = 1;
                _xabort(1);
        } else {
                printf("x = %d\n", x);
        }
        return 0;
 }

Compile with -mrtm i.e.

 gcc -Wall -Wextra -mrtm xabort.c -o xabort

Record:

 perf record -e intel_pt/cyc/u --filter 'filter main @ ./xabort' ./xabort

Before:

 # perf script --itrace=xe -F+flags,+insn,-period --xed --ns
          xabort  1478 [007] 92161.431348581:   transactions:   x                              400b81 main+0x14 (/root/xabort)          mov $0xffffffff, %eax
          xabort  1478 [007] 92161.431348624:   transactions:   tx abrt                        400b93 main+0x26 (/root/xabort)          mov $0xffffffff, %eax

After:

 # perf script --itrace=xe -F+flags,+insn,-period --xed --ns
          xabort  1478 [007] 92161.431348581:   transactions:   x                              400b81 main+0x14 (/root/xabort)          xbegin 0x6
          xabort  1478 [007] 92161.431348624:   transactions:   tx abrt                        400b93 main+0x26 (/root/xabort)          xabort $0x1

Fixes: faaa87680b ("perf intel-pt/bts: Report instruction bytes and length in sample")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210519074515.9262-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:33:43 -03:00
Adrian Hunter
cb7987837c perf intel-pt: Fix transaction abort handling
When adding support for power events, some handling of FUP packets was
unified. That resulted in breaking reporting of TSX aborts, by not
considering the associated TIP packet. Fix that.

Example:

A machine that supports TSX is required. It will have flag "rtm". Kernel
parameter tsx=on may be required.

 # for w in `cat /proc/cpuinfo | grep -m1 flags `;do echo $w | grep rtm ; done
 rtm

Test program:

 #include <stdio.h>
 #include <immintrin.h>

 int main()
 {
        int x = 0;

        if (_xbegin() == _XBEGIN_STARTED) {
                x = 1;
                _xabort(1);
        } else {
                printf("x = %d\n", x);
        }
        return 0;
 }

Compile with -mrtm i.e.

 gcc -Wall -Wextra -mrtm xabort.c -o xabort

Record:

 perf record -e intel_pt/cyc/u --filter 'filter main @ ./xabort' ./xabort

Before:

 # perf script --itrace=be -F+flags,+addr,-period,-event --ns
          xabort  1478 [007] 92161.431348552:   tr strt                             0 [unknown] ([unknown]) =>           400b6d main+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431348624:   jmp                            400b96 main+0x29 (/root/xabort) =>           400bae main+0x41 (/root/xabort)
          xabort  1478 [007] 92161.431348624:   return                         400bb4 main+0x47 (/root/xabort) =>           400b87 main+0x1a (/root/xabort)
          xabort  1478 [007] 92161.431348637:   jcc                            400b8a main+0x1d (/root/xabort) =>           400b98 main+0x2b (/root/xabort)
          xabort  1478 [007] 92161.431348644:   tr end  call                   400ba9 main+0x3c (/root/xabort) =>           40f690 printf+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431360859:   tr strt                             0 [unknown] ([unknown]) =>           400bae main+0x41 (/root/xabort)
          xabort  1478 [007] 92161.431360882:   tr end  return                 400bb4 main+0x47 (/root/xabort) =>           401139 __libc_start_main+0x309 (/root/xabort)

After:

 # perf script --itrace=be -F+flags,+addr,-period,-event --ns
          xabort  1478 [007] 92161.431348552:   tr strt                             0 [unknown] ([unknown]) =>           400b6d main+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431348624:   tx abrt                        400b93 main+0x26 (/root/xabort) =>           400b87 main+0x1a (/root/xabort)
          xabort  1478 [007] 92161.431348637:   jcc                            400b8a main+0x1d (/root/xabort) =>           400b98 main+0x2b (/root/xabort)
          xabort  1478 [007] 92161.431348644:   tr end  call                   400ba9 main+0x3c (/root/xabort) =>           40f690 printf+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431360859:   tr strt                             0 [unknown] ([unknown]) =>           400bae main+0x41 (/root/xabort)
          xabort  1478 [007] 92161.431360882:   tr end  return                 400bb4 main+0x47 (/root/xabort) =>           401139 __libc_start_main+0x309 (/root/xabort)

Fixes: a472e65fc4 ("perf intel-pt: Add decoder support for ptwrite and power event packets")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210519074515.9262-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:31:04 -03:00
Thomas Richter
316a76a58c perf test: Fix libpfm4 support (63) test error for nested event groups
Compiling perf with make LIBPFM4=1 includes libpfm support and
enables test case 63 'Test libpfm4 support'. This test reports an error
on all platforms for subtest 63.2 'test groups of --pfm-events'.
The reported error message is 'nested event groups not supported'

 # ./perf test -F 63
 63: Test libpfm4 support                                            :
 63.1: test of individual --pfm-events                               :
 Error:
 failed to parse event stereolab : event not found
 Error:
 failed to parse event stereolab,instructions : event not found
 Error:
 failed to parse event instructions,stereolab : event not found
  Ok
 63.2: test groups of --pfm-events                                   :
 Error:
 nested event groups not supported    <------ Error message here
 Error:
 failed to parse event {stereolab} : event not found
 Error:
 failed to parse event {instructions,cycles},{instructions,stereolab} :\
	 event not found
 Ok
 #

This patch addresses the error message 'nested event groups not supported'.
The root cause is function parse_libpfm_events_option() which parses the
event string '{},{instructions}' and can not handle a leading empty
group notation '{},...'.

The code detects the first (empty) group indicator '{' but does not
terminate group processing on the following group closing character '}'.
So when the second group indicator '{' is detected, the code assumes
a nested group and returns an error.

With the error message fixed, also change the expected event number to
one for the test case to succeed.

While at it also fix a memory leak. In good case the function does not
free the duplicated string given as first parameter.

Output after:
 # ./perf test -F 63
 63: Test libpfm4 support                                            :
 63.1: test of individual --pfm-events                               :
 Error:
 failed to parse event stereolab : event not found
 Error:
 failed to parse event stereolab,instructions : event not found
 Error:
 failed to parse event instructions,stereolab : event not found
  Ok
 63.2: test groups of --pfm-events                                   :
 Error:
 failed to parse event {stereolab} : event not found
 Error:
 failed to parse event {instructions,cycles},{instructions,stereolab} : \
	 event not found
  Ok
 #
Error message 'nested event groups not supported' is gone.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-By: Ian Rogers <irogers@google.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210517140931.2559364-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:30:37 -03:00
James Clark
c1a6165a63 perf cs-etm: Prevent and warn on underflows during timestamp calculation.
When a zero timestamp is encountered, warn once. This is to make
hardware or configuration issues visible. Also suggest that the issue
can be worked around with the --itrace=Z option.

When an underflow with a non-zero timestamp occurs, warn every time.
This is an unexpected scenario, and with increasing timestamps, it's
unlikely that it would occur more than once, therefore it should be
ok to warn every time.

Only try to calculate the timestamp by subtracting the instruction
count if neither of the above cases are true. This makes attempting
to decode files with zero timestamps in non-timeless mode
more consistent. Currently it can half work if the timestamp wraps
around and becomes non-zero, although the behavior is undefined and
unpredictable.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210517131741.3027-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 11:06:56 -03:00
James Clark
c36c1ef6f6 perf cs-etm: Start reading 'Z' --itrace option
Recently the 'Z' --itrace option was added to override detection
of timeless decoding. This is also useful in Coresight to work around
issues with invalid timestamps on some hardware.

When the 'Z' option is provided, the existing timeless decoding mode
will be used, even if timestamps were recorded.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210517131741.3027-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 11:06:14 -03:00
James Clark
cac3141867 perf cs-etm: Move synth_opts initialisation
Move initialisation of synth_opts earlier in the function
so that synth_opts can be used at an earlier stage in a
later commit.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210517131741.3027-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 11:02:50 -03:00