Detected with gcc's ASan:
Direct leak of 4356 byte(s) in 120 object(s) allocated from:
#0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
#1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215
#2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339
#3 0x55719af66272 in print_events util/parse-events.c:2542
#4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
#5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520
#9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes: 40218daea1 ("perf list: Show SDT and pre-cached events")
Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Detected via gcc's ASan:
Direct leak of 2048 byte(s) in 64 object(s) allocated from:
6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370)
7 #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43
8 #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85
9 #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250
10 #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382
11 #5 0x556b0f0e3231 in print_events util/parse-events.c:2514
12 #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
13 #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
14 #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
15 #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
16 #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520
17 #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes: 89896051f8 ("perf tools: Do not put a variable sized type not at the end of a struct")
Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.
This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
util/parse-events.c: In function 'print_symbol_events':
util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
strncpy(name, syms->symbol, MAX_NAME_LEN);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'print_symbol_events.constprop',
inlined from 'print_events' at util/parse-events.c:2508:2:
util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
strncpy(name, syms->symbol, MAX_NAME_LEN);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'print_symbol_events.constprop',
inlined from 'print_events' at util/parse-events.c:2511:2:
util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
strncpy(name, syms->symbol, MAX_NAME_LEN);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 947b4ad1d1 ("perf list: Fix max event string size")
Link: https://lkml.kernel.org/n/tip-b663e33bm6x8hrkie4uxh7u2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This simply adds the field to 'struct perf_evsel' and allows setting
it via the event parser, to test it lets trace trace:
First look at where in a function that receives an evsel we can put a probe
to read how evsel->max_events was setup:
# perf probe -x ~/bin/perf -L trace__event_handler
<trace__event_handler@/home/acme/git/perf/tools/perf/builtin-trace.c:0>
0 static int trace__event_handler(struct trace *trace, struct perf_evsel *evsel,
union perf_event *event __maybe_unused,
struct perf_sample *sample)
3 {
4 struct thread *thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
5 int callchain_ret = 0;
7 if (sample->callchain) {
8 callchain_ret = trace__resolve_callchain(trace, evsel, sample, &callchain_cursor);
9 if (callchain_ret == 0) {
10 if (callchain_cursor.nr < trace->min_stack)
11 goto out;
12 callchain_ret = 1;
}
}
See what variables we can probe at line 7:
# perf probe -x ~/bin/perf -V trace__event_handler:7
Available variables at trace__event_handler:7
@<trace__event_handler+89>
int callchain_ret
struct perf_evsel* evsel
struct perf_sample* sample
struct thread* thread
struct trace* trace
union perf_event* event
Add a probe at that line asking for evsel->max_events to be collected and named
as "max_events":
# perf probe -x ~/bin/perf trace__event_handler:7 'max_events=evsel->max_events'
Added new event:
probe_perf:trace__event_handler (on trace__event_handler:7 in /home/acme/bin/perf with max_events=evsel->max_events)
You can now use it in all perf tools, such as:
perf record -e probe_perf:trace__event_handler -aR sleep 1
Now use 'perf trace', here aliased to just 'trace' and trace trace, i.e.
the first 'trace' is tracing just that 'probe_perf:trace__event_handler' event,
while the traced trace is tracing all scheduler tracepoints, will stop at two
events (--max-events 2) and will just set evsel->max_events for all the sched
tracepoints to 9, we will see the output of both traces intermixed:
# trace -e *perf:*event_handler trace --max-events 2 -e sched:*/nr=9/
0.000 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=000
0.009 :0/0 sched:sched_wakeup:comm=rcu_sched pid=10 prio=120 target_cpu=000
0.000 trace/23949 probe_perf:trace__event_handler:(48c34a) max_events=0x9
0.046 trace/23949 probe_perf:trace__event_handler:(48c34a) max_events=0x9
#
Now, if the traced trace sends its output to /dev/null, we'll see just
what the first level trace outputs: that evsel->max_events is indeed
being set to 9:
# trace -e *perf:*event_handler trace -o /dev/null --max-events 2 -e sched:*/nr=9/
0.000 trace/23961 probe_perf:trace__event_handler:(48c34a) max_events=0x9
0.030 trace/23961 probe_perf:trace__event_handler:(48c34a) max_events=0x9
#
Now that we can set evsel->max_events, we can go to the next step, honour that
per-event property in 'perf trace'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-og00yasj276joem6e14l1eas@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is the second version of a patch that improves the error message of
the perf events parser when the PMU hardware does not support address
filters.
Previously, the perf returned the following error:
$ perf record -e intel_pt// --filter 'filter sys_write'
--filter option should follow a -e tracepoint or HW tracer option
This implies there is some syntax error present in the command line,
which is not true. Rather, notify the user that the CPU does not have
support for this feature.
For example, Intel chips based on the Broadwell micro-archticture have
the Intel PT PMU, but do not support address filtering.
Now, perf prints the following error message:
$ perf record -e intel_pt// --filter 'filter sys_write'
This CPU does not support address filtering
Signed-off-by: Jack Henschel <jackdev@mailbox.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180704121345.19025-1-jackdev@mailbox.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That takes care of using the right call to get the tracing_path
directory, the one that will end up calling tracing_path_set() to figure
out where tracefs is mounted.
One more step in doing just lazy reading of system structures to reduce
the number of operations done unconditionaly at 'perf' start.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-42zzi0f274909bg9mxzl81bu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of accessing the trace_events_path variable directly, that may
not have been properly initialized wrt detecting where tracefs is
mounted.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-id7hzn1ydgkxbumeve5wapqz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using for_each_event() we needlessly rebuild the whole path to
the tracepoint directory, reuse the dir_path instead, saving some cycles
and reducing the size of the next patch.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-54bcs15n0cp6gwcgpc4hptyc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf stat doesn't count the uncore event aliases from the same uncore
block in a group, for example:
perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000
# time counts unit events
1.000447342 <not counted> unc_m_cas_count.all
1.000447342 <not counted> unc_m_clockticks
2.000740654 <not counted> unc_m_cas_count.all
2.000740654 <not counted> unc_m_clockticks
The output is very misleading. It gives a wrong impression that the
uncore event doesn't work.
An uncore block could be composed by several PMUs. An uncore event alias
is a joint name which means the same event runs on all PMUs of a block.
Perf doesn't support mixed events from different PMUs in the same group.
It is wrong to put uncore event aliases in a big group.
The right way is to split the big group into multiple small groups which
only include the events from the same PMU.
Only uncore event aliases from the same uncore block should be specially
handled here. It doesn't make sense to mix the uncore events with other
uncore events from different blocks or even core events in a group.
With the patch:
# time counts unit events
1.001557653 140,833 unc_m_cas_count.all
1.001557653 1,330,231,332 unc_m_clockticks
2.002709483 85,007 unc_m_cas_count.all
2.002709483 1,429,494,563 unc_m_clockticks
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is not specific to BPF but was found when parsing a .c BPF proggie
that while valid, had no events attached to tracepoints, kprobes, etc:
Very minimal file that perf's BPF code can compile:
# cat empty.c
char _license[] __attribute__((section("license"), used)) = "GPL";
int _version __attribute__((section("version"), used)) = LINUX_VERSION_CODE;
#
Before this patch:
# perf trace -e empty.c
WARNING: event parser found nothinginvalid or unsupported event: 'empty.c'
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
#
After:
# perf trace -e empty.c
WARNING: event parser found nothing
invalid or unsupported event: 'empty.c'
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8ysughiz00h6mjpcot04qyjj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With gcc 8 we get new set of snprintf() warnings that breaks the
compilation, one example:
tests/mem.c: In function ‘check’:
tests/mem.c:19:48: error: ‘%s’ directive output may be truncated writing \
up to 99 bytes into a region of size 89 [-Werror=format-truncation=]
snprintf(failure, sizeof failure, "unexpected %s", out);
The gcc docs says:
To avoid the warning either use a bigger buffer or handle the
function's return value which indicates whether or not its output
has been truncated.
Given that all these warnings are harmless, because the code either
properly fails due to uncomplete file path or we don't care for
truncated output at all, I'm changing all those snprintf() calls to
scnprintf(), which actually 'checks' for the snprint return value so the
gcc stays silent.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Link: http://lkml.kernel.org/r/20180319082902.4518-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Auto-merge for these events was disabled when auto-merging of non-alias
events was disabled in commit 63ce844 (perf stat: Only auto-merge events
that are PMU aliases).
Non-merging of legacy events is preserved:
$ perf stat -ag -e cache-misses,cache-misses sleep 1
Performance counter stats for 'system wide':
86,323 cache-misses
86,323 cache-misses
1.002623307 seconds time elapsed
But prefix or glob matching auto-merges the events created:
$ perf stat -a -e l3cache/read-miss/ sleep 1
Performance counter stats for 'system wide':
328 l3cache/read-miss/
1.002627008 seconds time elapsed
$ perf stat -a -e l3cache_0_[01]/read-miss/ sleep 1
Performance counter stats for 'system wide':
172 l3cache/read-miss/
1.002627008 seconds time elapsed
As with events created with aliases, auto-merging can be suppressed with
the --no-merge option:
$ perf stat -a -e l3cache/read-miss/ --no-merge sleep 1
Performance counter stats for 'system wide':
67 l3cache/read-miss/
67 l3cache/read-miss/
63 l3cache/read-miss/
60 l3cache/read-miss/
1.002622192 seconds time elapsed
Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: linux-arm-kernel@lists.infradead.org
Change-Id: I0a47eed54c05e1982ca964d743b37f50f60c508c
Link: http://lkml.kernel.org/r/1520345084-42646-4-git-send-email-agustinv@codeaurora.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To simplify creation of events accross multiple instances of the same
type of PMU stat supports two methods for creating multiple events from
a single event specification:
1. A prefix or glob can be used in the PMU name.
2. Aliases, which are listed immediately after the Kernel PMU events
by perf list, are used.
When the --no-merge option is passed and these events are displayed
individually the PMU name is lost and it's not possible to see which
count corresponds to which pmu:
$ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null
Performance counter stats for 'system wide':
67 l3cache/read-miss/
67 l3cache/read-miss/
63 l3cache/read-miss/
60 l3cache/read-miss/
0.001675706 seconds time elapsed
$ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null
Performance counter stats for 'system wide':
12 l3cache_read_miss
17 l3cache_read_miss
10 l3cache_read_miss
8 l3cache_read_miss
0.001661305 seconds time elapsed
This change adds the original pmu name to the event. For dynamic pmu
events the pmu name is restored in the event name:
$ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null
Performance counter stats for 'system wide':
63 l3cache_0_3/read-miss/
74 l3cache_0_1/read-miss/
64 l3cache_0_2/read-miss/
74 l3cache_0_0/read-miss/
0.001675706 seconds time elapsed
For alias events the name is added after the event name:
$ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null
Performance counter stats for 'system wide':
10 l3cache_read_miss [l3cache_0_3]
12 l3cache_read_miss [l3cache_0_1]
10 l3cache_read_miss [l3cache_0_2]
17 l3cache_read_miss [l3cache_0_0]
0.001661305 seconds time elapsed
Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: linux-arm-kernel@lists.infradead.org
Change-Id: I8056b9eda74bda33e95065056167ad96e97cb1fb
Link: http://lkml.kernel.org/r/1520345084-42646-3-git-send-email-agustinv@codeaurora.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Not needed there, fixup the places where it is needed and was getting
only by luck via evlist.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-yxjpetn64z8vjuguu84gr6x6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Intel PMU event aliases have a implicit period= specifier to set the
default period.
Unfortunately this breaks overriding these periods with -c or -F,
because the alias terms look like they are user specified to the
internal parser, and user specified event qualifiers override the
command line options.
Track that they are coming from aliases by adding a "weak" state to the
term. Any weak terms don't override command line options.
I only did it for -c/-F for now, I think that's the only case that's
broken currently.
Before:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 2000003
After:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 1000
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171020202755.21410-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, perf record is broken on arm/arm64 systems when the PMU is
specified explicitly as part of the event, e.g.
$ ./perf record -e armv8_cortex_a53/cpu_cycles/u true
In such cases, perf record fails to open events unless
perf_event_paranoid is set to -1, even if the PMU in question supports
mode exclusion. Further, even when perf_event_paranoid is toggled, no
samples are recorded.
This is an unintended side effect of commit:
e3ba76deef ("perf tools: Force uncore events to system wide monitoring)
... which assumes that if a PMU has an associated cpu_map, it is an
uncore PMU, and forces events for such PMUs to be system-wide.
This is not true for arm/arm64 systems, which can have heterogeneous
CPUs. To account for this, multiple CPU PMUs are exposed, each with a
"cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM
PMUs do not have a "cpumask" file, and only have a "cpus" file. For the
gory details as to why, see commit:
7e3fcffe95 ("perf pmu: Support alternative sysfs cpumask")
Given all of this, we can instead identify uncore PMUs by explicitly
checking for a "cpumask" file, and restore arm/arm64 PMU support back to
a working state. This patch does so, adding a new perf_pmu::is_uncore
field, and splitting the existing cpumask parsing so that it can be
reused.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by Will Deacon <will.deacon@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: 4.12+ <stable@vger.kernel.org>
Fixes: e3ba76deef ("perf tools: Force uncore events to system wide monitoring)
Link: http://lkml.kernel.org/r/1507315102-5942-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a PMU is missing print a better error message mentioning
the missing PMU.
% mkdir empty
% mount --bind empty /sys/devices/msr
% perf stat -M Summary true
event syntax error: '{inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_comp_ops_exe.sse_scalar..'
\___ Cannot find PMU `msr'. Missing kernel support?
It still cannot find the right column for aliases, but it's already a vast improvement.
v2: Check asprintf
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170913215006.32222-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add code to perf list to print metric groups, and metrics
that don't have an event name. The metricgroup code collects
the eventgroups and events into a rblist, and then prints
them according to the configured filters.
The metricgroups are printed by default, but can be
limited by perf list metric or perf list metricgroup
% perf list metricgroup
..
Metric Groups:
DSB:
DSB_Coverage
[Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
FLOPS:
GFLOPs
[Giga Floating Point Operations Per Second]
Frontend:
IFetch_Line_Utilization
[Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
Frontend_Bandwidth:
DSB_Coverage
[Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
Memory_BW:
MLP
[Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]
v2: Check return value of asprintf to fix warning on FC26
Fix key in lookup/addition for the groups list
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-8-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Setting up groups can be complicated due to the complicated scheduling
restrictions of different PMUs.
User tools usually don't understand all these restrictions.
Still in many cases it is useful to set up groups and they work most of
the time. However if the group is set up wrong some members will not
report any value because they never get scheduled.
Add a concept of a 'weak group': try to set up a group, but if it's not
schedulable fallback to not using a group. That gives us the best of
both worlds: groups if they work, but still a usable fallback if they
don't.
In theory it would be possible to have more complex fallback strategies
(e.g. try to split the group in half), but the simple fallback of not
using a group seems to work for now.
So far the weak group is only implemented for perf stat, not for record.
Here's an unschedulable group (on IvyBridge with SMT on)
% perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
73,806,067 branches
4,848,144 branch-misses # 6.57% of all branches
14,754,458 l1d.replacement
24,905,558 l2_lines_in.all
<not supported> l2_rqsts.all_code_rd <------- will never report anything
With the weak group:
% perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1
125,366,055 branches (80.02%)
9,208,402 branch-misses # 7.35% of all branches (80.01%)
24,560,249 l1d.replacement (80.00%)
43,174,971 l2_lines_in.all (80.05%)
31,891,457 l2_rqsts.all_code_rd (79.92%)
The extra event scheduled with some extra multiplexing
v2: Move fallback code to separate function.
Add comment on for_each_group_member
Adjust to new perf_evsel__close interface
v3: Fix debug print out.
Committer testing:
Before:
# perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
Performance counter stats for 'system wide':
<not counted> branches
<not counted> branch-misses
<not counted> l1d.replacement
<not counted> l2_lines_in.all
<not supported> l2_rqsts.all_code_rd
1.002147212 seconds time elapsed
# perf stat -e '{branches,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
Performance counter stats for 'system wide':
83,207,892 branches
11,065,444 l1d.replacement
28,484,024 l2_lines_in.all
12,186,179 l2_rqsts.all_code_rd
1.001739493 seconds time elapsed
After:
# perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}':W -a sleep 1
Performance counter stats for 'system wide':
543,323,909 branches (80.01%)
27,100,512 branch-misses # 4.99% of all branches (80.02%)
50,402,905 l1d.replacement (80.03%)
67,385,892 l2_lines_in.all (80.01%)
21,352,885 l2_rqsts.all_code_rd (79.94%)
1.001086658 seconds time elapsed
#
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170831194036.30146-2-andi@firstfloor.org
[ Add a "'perf stat' only, for now" comment in the man page, suggested by Jiri ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Peter reported that when he explicitely asked for multiple events with
the same name on the command line it got coalesced into just one line,
i.e.:
# perf stat -e cycles -e cycles -e cycles usleep 1
Performance counter stats for 'usleep 1':
3,269,652 cycles
0.000884123 seconds time elapsed
#
And while there is the --no-merges option to disable that auto-merging,
this is a blunt change in behaviour for such explicit request, so change
the code so that this auto merging is done only when handling the multi
PMU aliases with the same name that introduced this coalescing,
restoring the previous behaviour for the explicit case:
# perf stat -e cycles -e cycles -e cycles usleep 1
Performance counter stats for 'usleep 1':
1,472,837 cycles
1,472,837 cycles
1,472,837 cycles
0.001764870 seconds time elapsed
#
Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 430daf2dc7 ("perf stat: Collapse identically named events")
Link: http://lkml.kernel.org/r/20170831184122.GK4831@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Calling them just "data" is too vague, call it 'perf_state', to make it
clearer, for instance, when looking at patch hunks.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rnhk5yb05wem77rjpclrh7so@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi reported problems when parse errors were detected with vendor
events (json), because in the yyparse/parse_events_parse function we
dereferenced the _data parameter to two different structs, with
different layouts, which ended up making parse_events_evlist->error to
point to random stack addresses.
Fix it by making _data to always be struct parse_events_state, changing
the only place where 'struct parse_events_term' was used in
parse_events.y.
Reported-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-bc27lshz823hxl8n9nkelcgh@git.kernel.org
Fixes: 90e2b22dee ("perf/tool: Add support to reuse event grammar to parse out terms")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename it from 'parse_events_evlist' to 'parse_events_state' to better
state that this is parsing state that has to be passed around.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dursqtg2h2w98ztaa297u43x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Teach buildid-cache how to add, remove, and update binary objects from
other mount namespaces. Allow probe events tracing binaries in
different namespaces to add their objects to the probe and build-id
caches too. As a handy side effect, this also lets us access SDT probes
in binaries from alternate mount namespaces.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1499305693-1599-5-git-send-email-kjlx@templeofstupid.com
[ Add util/namespaces.c to tools/perf/util/python-ext-sources, to fix the python binding 'perf test' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is going away from util.h, where it is not needed.
This is mostly for things like MAXPATHLEN, MAX() and MIN(), these later
two probably should go away in favor of its kernel sources replacements.
Link: http://lkml.kernel.org/n/tip-z1666f3fl3fqobxvjr5o2r39@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The files using the dirent.h routines should instead include it,
reducing the includes hell that lead to longer build times.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-42g2f4z6nfg7mdb2ae97n7tj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Continuing the disentanglement, mostly the TUI needs CTRL(c), that is
in sys/ttydefaults.h and term.c needs the termios headers.
And term.h needs to be added to a few places too.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-il19zna7qj9ytavdbwlipc7t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are places where we just need a forward declaration, and others
were we need to include strlist.h and/or strfilter.h, reducing the
impact of changes in headers on the build time, do it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-zab42gbiki88y9k0csorxekb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing it from util.h, part of an effort to disentangle the includes
hell, that makes changes to util.h or something included by it to cause
a complete rebuild of the tools.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ztrjy52q1rqcchuy3rubfgt2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving them from util.h, where they don't belong. Since libc already
have string.h, name it slightly differently, as string2.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eh3vz5sqxsrdd8lodoro4jrw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the printing of perf expressions and internal events to a new
clearer --details flag, instead of lumping it together with other debug
options in --debug. This makes it clearer to use.
Before
perf list --debug
...
unc_m_power_critical_throttle_cycles
[Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
after
perf list --details
...
unc_m_power_critical_throttle_cycles
[Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170320201711.14142-14-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for a new JSON event attribute to name MetricExpr for better
output in perf stat.
If the event has no MetricName it uses the normal event name instead to
describe the metric.
Before
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
time unc_p_freq_max_os_cycles
1.000149775 15.7
2.000344807 19.3
3.000502544 16.7
4.000640656 6.6
5.000779955 9.9
After
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
time freq_max_os_cycles %
1.000149775 15.7
2.000344807 19.3
3.000502544 16.7
4.000640656 6.6
5.000779955 9.9
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-13-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add generic infrastructure to perf stat to output ratios for
"MetricExpr" entries in the event lists. Many events are more useful as
ratios than in raw form, typically some count in relation to total
ticks.
Transfer the MetricExpr information from the alias to the evsel.
We mark the events that need to be collected for MetricExpr, and also
link the events using them with a pointer. The code is careful to always
prefer the right event in the same group to minimize multiplexing
errors. At the moment only a single relation is supported.
Then add a rblist to the stat shadow code that remembers stats based on
the cpu and context.
Then finally update and retrieve and print these values similarly to the
existing hardcoded perf metrics. We use the simple expression parser
added earlier to evaluate the expression.
Normally we just output the result without further commentary, but for
--metric-only this would lead to empty columns. So for this case use the
original event as description.
There is no attempt to automatically add the MetricExpr event, if it is
missing, however we suggest it to the user, because the user tool
doesn't have enough information to reliably construct a group that is
guaranteed to schedule. So we leave that to the user.
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}'
1.000147889 800,085,181 unc_p_clockticks
1.000147889 93,126,241 unc_p_freq_max_os_cycles # 11.6
2.000448381 800,218,217 unc_p_clockticks
2.000448381 142,516,095 unc_p_freq_max_os_cycles # 17.8
3.000639852 800,243,057 unc_p_clockticks
3.000639852 162,292,689 unc_p_freq_max_os_cycles # 20.3
% perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
# time freq_max_os_cycles %
1.000127077 0.9
2.000301436 0.7
3.000456379 0.0
v2: Change from DivideBy to MetricExpr
v3: Use expr__ prefix. Support more than one other event.
v4: Update description
v5: Only print warning message once for multiple PMUs.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the user specifies a pmu directly, expand it automatically with a
prefix match for all available PMUs, similar as we do for the normal
aliases now.
This allows to specify attributes for duplicated boxes quickly. For
example uncore_cbox_{0,6}/.../ can be now specified as uncore_cbox/.../
and it gets automatically expanded for all boxes.
This generally makes it more concise to write uncore specifications, and
also avoids the need to know the exact topology of the system.
Before:
% perf stat -a -e uncore_cbox_0/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_1/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_2/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_3/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_4/event=0x35,umask=0x1,filter_opc=0x19C/,\
uncore_cbox_5/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
After:
% perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1
v2: Handle all bison rules. Move multi add code to separate function.
Handle uncore_ prefix correctly.
v3: Move parse_events_multi_pmu_add to separate patch. Move uncore
prefix check to separate patch.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out the PMU name matching in the event parser into a separate
function, to use the same code for other grammar rules later.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make system wide (-a) the default option if no target was specified and
one of following conditions is met:
- there's no workload specified (current behaviour)
- there is workload specified but all requested
events are system wide ones
Mixed events core/uncore with workload:
$ perf stat -e 'uncore_cbox_0/clockticks/,cycles' sleep 1
Performance counter stats for 'sleep 1':
<not supported> uncore_cbox_0/clockticks/
980,489 cycles
1.000897406 seconds time elapsed
Uncore event with workload:
$ perf stat -e 'uncore_cbox_0/clockticks/' sleep 1
Performance counter stats for 'system wide':
281,473,897,192,670 uncore_cbox_0/clockticks/
1.000833784 seconds time elapsed
Committer note:
When testing I realized the default case for !root, i.e. no events
passed via -e, was broke by v2 of this patch, reported and after a
patch provided by Jiri it is back working:
[acme@jouet linux]$ perf stat usleep 1
Performance counter stats for 'usleep 1':
0.401335 task-clock:u (msec) # 0.297 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
48 page-faults:u # 0.120 M/sec
458,146 cycles:u # 1.142 GHz
245,113 instructions:u # 0.54 insn per cycle
47,991 branches:u # 119.578 M/sec
4,022 branch-misses:u # 8.38% of all branches
0.001350029 seconds time elapsed
[acme@jouet linux]$
Suggested-and-Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170227094818.GA12764@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we allow not to specify value for numeric terms and we set
them to value 1. This was originaly meant just for single bit terms to
allow user to type:
$ perf record -e 'cpu/cpu-cycles,any'
instead of:
$ perf record -e 'cpu/cpu-cycles,any=1'
However it works also for multi bits terms like:
$ perf record -e 'cpu/event/' ls
...
$ perf evlist -v
..., config: 0x1, ...
After discussion with Peter we decided making such term usage to fail,
like:
$ perf record -e 'cpu/event/' ls
event syntax error: 'cpu/event/'
\___ no value assigned for term
...
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1487340058-10496-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to add yet another parameter to new_term function in following
patch, so it's better to move first all the current params into template
struct parse_events_term and use it as a single argument.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1487340058-10496-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As this is a GNU extension and while harmless in this case, we can do
the same thing in a more clearer way by using an existing thread_map
constructor.
With this we avoid this while compiling with clang:
util/parse-events.c:2024:21: error: field 'map' with variable sized type 'struct thread_map' not at the end of a struct or class is a GNU extension
[-Werror,-Wgnu-variable-sized-type-not-at-end]
struct thread_map map;
^
1 error generated.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-tqocbplnyyhpst6drgm2u4m3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cases changed in this patch are for when we free but keep the
pointer to the freed area, which is not always a good idea.
Be more defensive and zero the pointer to avoid possible use after
free bugs to take more time to be detected.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-5-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have zfree(&ptr) for this very common pattern:
free(ptr);
ptr = NULL;
So use it in a few more places.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-4-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The code for handling pmu aliases without specifying the PMU hardcoded
only supported the cpu PMU.
This patch extends it to work for all PMUs. We always duplicate the
event for all PMUs that have an matching alias. This allows to
automatically expand an alias for all instances of a PMU (so for example
you can monitor all cache boxes with a single event)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It can be useful to specify branch type state per event, for example if
we want to collect both software trace points and last branch PMU events
in a single collection. Currently this doesn't work because the software
trace point errors out with -b.
There was already a branch-type parameter to configure branch sample
types per event in the parser, but it was stubbed out. This patch
implements the necessary plumbing to actually enable it.
Now:
$ perf record -e sched:sched_switch,cpu/cpu-cycles,branch_type=any/ ...
works.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1476306127-19721-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make alias matching the events parser case-insensitive. This is useful
with the JSON events. perf uses lower case events, but the CPU manuals
generally use upper case event names. The JSON files use lower case by
default too. But if we search case insensitively then users can
cut-n-paste the upper case event names.
So the following works:
% perf stat -e BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL true
Performance counter stats for 'true':
305 BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL
0.000492799 seconds time elapsed
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-17-git-send-email-sukadev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This avoids the JSON PMU events parser having to know whether its
aliases are for perf stat or perf record.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-20-git-send-email-sukadev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previously we were dropping the useful longer descriptions that some
events have in the event list completely. This patch makes them appear with
perf list.
Old perf list:
baclears:
baclears.all
[Counts the number of baclears]
vs new:
perf list -v:
...
baclears:
baclears.all
[The BACLEARS event counts the number of times the front end is
resteered, mainly when the Branch Prediction Unit cannot provide
a correct prediction and this is corrected by the Branch Address
Calculator at the front end. The BACLEARS.ANY event counts the
number of baclears for any type of branch]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1473978296-20712-13-git-send-email-sukadev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a --no-desc flag to 'perf list' to not print the event descriptions
that were earlier added for JSON events. This may be useful to get a
less crowded listing.
It's still default to print descriptions as that is the more useful
default for most users.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1473978296-20712-9-git-send-email-sukadev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch makes it possible to use the current filter framework with
address filters. That way address filters for HW tracers such as
CoreSight and Intel PT can be communicated to the kernel drivers.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1474037045-31730-4-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Making function perf_evsel__append_filter() static and introducing a new
tracepoint specific function to append filters. That way we eliminate
redundant code and avoid formatting mistake.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1474037045-31730-3-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By making function perf_evsel__append_filter() take a format rather than
an operator it is possible to reuse the code for other purposes (ex.
Intel PT and CoreSight) than tracepoints.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1474037045-31730-2-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds PMU driver specific configuration to the parser
infrastructure by preceding any term with the '@' letter. As such doing
something like:
perf record -e some_event/@cfg1,@cfg2=config/ ...
will see 'cfg1' and 'cfg2=config' being added to the list of evsel
config terms. Token 'cfg1' and 'cfg2=config' are not processed in user
space and are meant to be interpreted by the PMU driver.
First the lexer/parser are supplemented with the required definitions to
recognise the driver specific configuration. From there they are simply
added to the list of event terms. The bulk of the work is done in
function "parse_events_add_pmu()" where driver config event terms are
added to a new list of driver config terms, which in turn spliced with
the event's new driver configuration list.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1473179837-3293-4-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch allows following config terms and option:
Globally setting events to overwrite;
# perf record --overwrite ...
Set specific events to be overwrite or no-overwrite.
# perf record --event cycles/overwrite/ ...
# perf record --event cycles/no-overwrite/ ...
Add missing config terms and update the config term array size because
the longest string length has changed.
For overwritable events, it automatically selects attr.write_backward
since perf requires it to be backward for reading.
Test result:
# perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
# perf evlist -v
syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-14-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Show SDT and pre-cached events by perf-list with "sdt". This also shows
the binary and build-id where the events are placed only when there are
same name events on different binaries.
e.g.:
# perf list sdt
List of pre-defined events (to be used in -e):
sdt_libc:lll_futex_wake [SDT event]
sdt_libc:lll_lock_wait_private [SDT event]
sdt_libc:longjmp [SDT event]
sdt_libc:longjmp_target [SDT event]
...
sdt_libstdcxx:rethrow@/usr/bin/gcc(0cc207fc4b27) [SDT event]
sdt_libstdcxx:rethrow@/usr/lib64/libstdc++.so.6.0.20(91c7a88fdf49)
sdt_libstdcxx:throw@/usr/bin/gcc(0cc207fc4b27) [SDT event]
sdt_libstdcxx:throw@/usr/lib64/libstdc++.so.6.0.20(91c7a88fdf49)
The binary path and build-id are shown in below format;
<GROUP>:<EVENT>@<PATH>(<BUILD-ID>)
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160624090646.25421.44225.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Following commit will allow BPF script attach to tracepoints.
bpf__foreach_tev() will iterate over all events, not only kprobes.
Rename it to bpf__foreach_event().
Since only group and event are used by caller, there's no need to pass
full 'struct probe_trace_event' to bpf_prog_iter_callback_t. Pass only
these two strings. After this patch bpf_prog_iter_callback_t natually
support tracepoints.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468406646-21642-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add missing 'const' qualifiers so following commits are able to create
tracepoints using const strings.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468406646-21642-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To match the semantics for list.h in the kernel, that are used to
implement those macros.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qbcjlgj0ffxquxscahbpddi3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when parsing tracepoint event definitions, to
avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it
instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Users can pass options to tracepoints defined in the BPF script. For
example:
# perf record -e ./test.c/no-inherit/ bash
# dd if=/dev/zero of=/dev/null count=10000
# exit
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.022 MB perf.data (139 samples) ]
(no-inherit works, only the sys_read issued by bash are captured, at
least 10000 sys_read issued by dd are skipped.)
test.c:
#define SEC(NAME) __attribute__((section(NAME), used))
SEC("func=sys_read")
int bpf_func__sys_read(void *ctx)
{
return 1;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
no-inherit is applied to the kprobe event defined in test.c.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces basic facilities to support config different slots
in a BPF map one by one.
array.nr_ranges and array.ranges are introduced into 'struct
parse_events_term', where ranges is an array of indices range (start,
length) which will be configured by this config term. nr_ranges is the
size of the array. The array is passed to 'struct bpf_map_priv'. To
indicate the new type of configuration, BPF_MAP_KEY_RANGES is added as a
new key type. bpf_map_config_foreach_key() is extended to iterate over
those indices instead of all possible keys.
Code in this commit will be enabled by following commit which enables
the indices syntax for array configuration.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds the final step for BPF map configuration. A new syntax
is appended into parser so user can config BPF objects through '/' '/'
enclosed config terms.
After this patch, following syntax is available:
# perf record -e ./test_bpf_map_1.c/map:channel.value=10/ ...
It would takes effect after appling following commits.
Test result:
# cat ./test_bpf_map_1.c
/************************ BEGIN **************************/
#include <uapi/linux/bpf.h>
#define SEC(NAME) __attribute__((section(NAME), used))
struct bpf_map_def {
unsigned int type;
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
};
static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
(void *)BPF_FUNC_map_lookup_elem;
static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
(void *)BPF_FUNC_trace_printk;
struct bpf_map_def SEC("maps") channel = {
.type = BPF_MAP_TYPE_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(int),
.max_entries = 1,
};
SEC("func=sys_nanosleep")
int func(void *ctx)
{
int key = 0;
char fmt[] = "%d\n";
int *pval = map_lookup_elem(&channel, &key);
if (!pval)
return 0;
trace_printk(fmt, sizeof(fmt), *pval);
return 0;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
/************************* END ***************************/
- Normal case:
# ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data ]
- Error case:
# ./perf record -e './test_bpf_map_1.c/map:channel.value/' usleep 10
event syntax error: '..ps:channel:value/'
\___ Config value not set (missing '=')
Hint: Valid config term:
map:[<arraymap>]:value=[value]
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
# ./perf record -e './test_bpf_map_1.c/xmap:channel.value=10/' usleep 10
event syntax error: '..pf_map_1.c/xmap:channel.value=10/'
\___ Invalid object config option
[SNIP]
# ./perf record -e './test_bpf_map_1.c/map:xchannel.value=10/' usleep 10
event syntax error: '..p_1.c/map:xchannel.value=10/'
\___ Target map not exist
[SNIP]
# ./perf record -e './test_bpf_map_1.c/map:channel.xvalue=10/' usleep 10
event syntax error: '..ps:channel.xvalue=10/'
\___ Invalid object map config option
[SNIP]
# ./perf record -e './test_bpf_map_1.c/map:channel.value=x10/' usleep 10
event syntax error: '..nnel.value=x10/'
\___ Incorrect value type for map
[SNIP]
Change BPF_MAP_TYPE_ARRAY to '1' in test_bpf_map_1.c:
# ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
event syntax error: '..ps:channel.value=10/'
\___ Can't use this config term to this type of map
Hint: Valid config term:
map:[<arraymap>].value=[value]
(add -v to see detail)
Signed-off-by: Wang Nan <wangnan0@huawei.com>
[for parser part]
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Older compilers don't like this, for instance, on RHEL6.7:
CC /tmp/build/perf/util/parse-events.o
util/parse-events.c:844: error: redefinition of typedef ‘config_term_func_t’
util/parse-events.c:353: note: previous declaration of ‘config_term_func_t’ was here
So remove the second definition, that should've been just moved in 43d0b97817
("perf tools: Enable config and setting names for legacy cache events"), not copied.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 43d0b97817 ("perf tools: Enable config and setting names for legacy cache events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In RHEL 6.7:
CC /tmp/build/perf/util/parse-events.o
cc1: warnings being treated as errors
util/parse-events.c: In function ‘parse_events_add_cache’:
util/parse-events.c:366: error: declaration of ‘error’ shadows a global declaration
util/util.h:136: error: shadowed declaration is here
Rename it to 'err'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 43d0b97817 ("perf tools: Enable config and setting names for legacy cache events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Following commits will make more events obey /name=newname/ options.
This patch makes pmu_event_name() a generic helper.
Makes new get_config_name() accept NULL input to make life easier.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf stat' accepts some config terms but doesn't apply them. For
example:
# perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
# ls
# exit
Performance counter stats for 'bash':
266258061 instructions/no-inherit/
266258061 instructions/inherit/
1.402183915 seconds time elapsed
The result is confusing, because user may expect the first
'instructions' event exclude the 'ls' command.
This patch forbid most of these config terms for 'perf stat'.
Result:
# ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
event syntax error: 'instructions/no-inherit/'
\___ 'no-inherit' is not usable in 'perf stat'
...
We can add blocked config terms back when 'perf stat' really supports them.
This patch also removes unavailable config term from error message:
# ./perf stat -e 'instructions/badterm/' ls
event syntax error: 'instructions/badterm/'
\___ unknown term
valid terms: config,config1,config2,name
# ./perf stat -e 'cpu/badterm/' ls
event syntax error: 'cpu/badterm/'
\___ unknown term
valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
config_term_names[] is introduced for future commits which will be able
to retrieve the config name through the config term.
Utilize this array in parse_events_formats_error_string() so the missing
'{,no-}inherit' terms are added.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
According to man pages, asprintf returns -1 when failure. This patch
fixes two incorrect return value checker.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: stable@vger.kernel.org # v4.4+
Fixes: ffeb883e56 ("perf tools: Show proper error message for wrong terms of hw/sw events")
Link: http://lkml.kernel.org/r/1455882283-79592-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To follow convention used in other tools/perf/ areas. Also remove the
need to check if it is NULL before calling the destructor, again, to
follow convention that goes back to free().
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-w6owu7rb8a46gvunlinxaqwx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixing a leak, since code calling parse_events__free_terms() expect it
to free the list_head too.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Purges 'struct parse_event_term' entries from a list_head.
Some users need this because they don't allocate space for the list
head, it maybe on the stack or embedded into some other struct.
Next patch will convert users that need just purging and then the
perf_events__free_terms() routine will free the list head as well,
finally being renamed to perf_events_terms__delete().
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-4w3zl4ifcl0ed0j4bu3tckqp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were just freeing them, better unlink and init its nodes to catch
bugs faster if we keep dangling references to them.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch, use list_del_init() instead of list_del() ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes segmentation fault using, for instance:
(gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0 x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
(gdb) bt
#0 0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
#1 0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
at util/parse-events.c:433
#2 0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
at util/parse-events.c:498
#3 0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
at util/parse-events.c:936
#4 0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391
#5 0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361
#6 0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401
#7 0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253
#8 0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364
#9 0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664
#10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539
#11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264
#12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390
#13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451
#14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495
#15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618
(gdb)
Intel PT attempts to find the sched:sched_switch tracepoint but that seg
faults if tracefs is not readable, because the error reporting structure
is null, as errors are not reported when automatically adding
tracepoints. Fix by checking before using.
Committer note:
This doesn't take place in a kernel that supports
perf_event_attr.context_switch, that is the default way that will be
used for tracking context switches, only in older kernels, like 4.2, in
a machine with Intel PT (e.g. Broadwell) for non-priviledged users.
Further info from a similar patch by Wang:
The error is in tracepoint_error: it assumes the 'e' parameter is valid.
However, there are many situation a parse_event() can be called without
parse_events_error. See result of
$ grep 'parse_events(.*NULL)' ./tools/perf/ -r'
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tong Zhang <ztong@vt.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 196581717d ("perf tools: Enhance parsing events tracepoint error output")
Link: http://lkml.kernel.org/r/1453809921-24596-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the subcommand-related files from perf to a new library named
libsubcmd.a.
Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to
'exec-cmd.*' to be consistent with the naming of all the other files.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a43eec3042 ("bpf: introduce bpf_perf_event_output() helper") added
PERF_COUNT_SW_BPF_OUTPUT we ended up with a new entry in the event_symbols_sw
array that wasn't initialized, thus set to NULL, fix print_symbol_events()
to check for that case so that we don't crash if this happens again.
(gdb) bt
#0 __match_glob (ignore_space=false, pat=<optimized out>, str=<optimized out>) at util/string.c:198
#1 strglobmatch (str=<optimized out>, pat=pat@entry=0x7fffffffe61d "stall") at util/string.c:252
#2 0x00000000004993a5 in print_symbol_events (type=1, syms=0x872880 <event_symbols_sw+160>, max=11, name_only=false, event_glob=0x7fffffffe61d "stall")
at util/parse-events.c:1615
#3 print_events (event_glob=event_glob@entry=0x7fffffffe61d "stall", name_only=false) at util/parse-events.c:1675
#4 0x000000000042c79e in cmd_list (argc=1, argv=0x7fffffffe390, prefix=<optimized out>) at builtin-list.c:68
#5 0x00000000004788a5 in run_builtin (p=p@entry=0x871758 <commands+120>, argc=argc@entry=2, argv=argv@entry=0x7fffffffe390) at perf.c:370
#6 0x0000000000420ab0 in handle_internal_command (argv=0x7fffffffe390, argc=2) at perf.c:429
#7 run_argv (argv=0x7fffffffe110, argcp=0x7fffffffe11c) at perf.c:473
#8 main (argc=2, argv=0x7fffffffe390) at perf.c:588
(gdb) p event_symbols_sw[PERF_COUNT_SW_BPF_OUTPUT]
$4 = {symbol = 0x0, alias = 0x0}
(gdb)
A patch to robustify perf to not segfault when the next counter gets added in
the kernel will follow this one.
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-57wysblcjfrseb0zg5u7ek10@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When PERF_COUNT_SW_BPF_OUTPUT was added to the kernel we should've
added it to tools/perf, where it is used just to list events.
This ended up causing a segfault in commands like "perf list stall".
Fix it by adding that new software counter.
A patch to robustify perf to not segfault when the next counter gets
added in the kernel will follow this one.
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uya354upi3eprsey6mi5962d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A series of bpf loader related error codes were introduced to help error
reporting. Functions were improved to return these new error codes.
Functions which return pointers were adjusted to encode error codes into
return value using the ERR_PTR() interface.
bpf_loader_strerror() was improved to convert these error messages to
strings. It checks the error codes and calls libbpf_strerror() and
strerror_r() accordingly, so caller don't need to consider checking the
range of the error code.
In bpf__strerror_load(), print kernel version of running kernel and the
object's 'version' section to notify user how to fix his/her program.
v1 -> v2:
Use macro for error code.
Fetch error message based on array index, eliminate for-loop.
Print version strings.
Before:
# perf record -e ./test_kversion_nomatch_program.o sleep 1
event syntax error: './test_kversion_nomatch_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP
After:
# perf record -e ./test_kversion_nomatch_program.o ls
event syntax error: './test_kversion_nomatch_program.o'
\___ 'version' (4.4.0) doesn't match running kernel (4.3.0)
SKIP
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446818289-87444-1-git-send-email-wangnan0@huawei.com
[ Add 'static inline' to bpf__strerror_prepare_load() when LIBBPF is disabled ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In this patch, a series of libbpf specific error numbers and
libbpf_strerror() are introduced to help reporting errors.
Functions are updated to pass correct the error number through the
CHECK_ERR() macro.
All users of bpf_object__open{_buffer}() and bpf_program__title() in
perf are modified accordingly. In addition, due to the error codes
changing, bpf__strerror_load() is also modified to use them.
bpf__strerror_head() is also changed accordingly so it can parse libbpf
errors. bpf_loader_strerror() is introduced for that purpose, and will
be improved by the following patch.
load_program() is improved not to dump log buffer if it is empty. log
buffer is also used to deduce whether the error was caused by an invalid
program or other problem.
v1 -> v2:
- Using macro for error code.
- Fetch error message based on array index, eliminate for-loop.
- Use log buffer to detect the reason of failure. 3 new error code
are introduced to replace LIBBPF_ERRNO__LOAD.
In v1:
# perf record -e ./test_ill_program.o ls
event syntax error: './test_ill_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP
# perf record -e ./test_kversion_nomatch_program.o ls
event syntax error: './test_kversion_nomatch_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP
# perf record -e ./test_big_program.o ls
event syntax error: './test_big_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP
In v2:
# perf record -e ./test_ill_program.o ls
event syntax error: './test_ill_program.o'
\___ Kernel verifier blocks program loading
SKIP
# perf record -e ./test_kversion_nomatch_program.o
event syntax error: './test_kversion_nomatch_program.o'
\___ Incorrect kernel version
SKIP
(Will be further improved by following patches)
# perf record -e ./test_big_program.o
event syntax error: './test_big_program.o'
\___ Program too big
SKIP
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446817783-86722-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch provides infrastructure for passing source files to --event
directly using:
# perf record --event bpf-file.c command
This patch does following works:
1) Allow passing '.c' file to '--event'. parse_events_load_bpf() is
expanded to allow caller tell it whether the passed file is source
file or object.
2) llvm__compile_bpf() is called to compile the '.c' file, the result
is saved into memory. Use bpf_object__open_buffer() to load the
in-memory object.
Introduces a bpf-script-example.c so we can manually test it:
# perf record --clang-opt "-DLINUX_VERSION_CODE=0x40200" --event ./bpf-script-example.c sleep 1
Note that '--clang-opt' must put before '--event'.
Futher patches will merge it into a testcase so can be tested automatically.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-10-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is the final patch which makes basic BPF filter work. After
applying this patch, users are allowed to use BPF filter like:
# perf record --event ./hello_world.o ls
A bpf_fd field is appended to 'struct evsel', and setup during the
callback function add_bpf_event() for each 'probe_trace_event'.
PERF_EVENT_IOC_SET_BPF ioctl is used to attach eBPF program to a newly
created perf event. The file descriptor of the eBPF program is passed to
perf record using previous patches, and stored into evsel->bpf_fd.
It is possible that different perf event are created for one kprobe
events for different CPUs. In this case, when trying to call the ioctl,
EEXIST will be return. This patch doesn't treat it as an error.
Committer note:
The bpf proggie used so far:
__attribute__((section("fork=_do_fork"), used))
int fork(void *ctx)
{
return 0;
}
char _license[] __attribute__((section("license"), used)) = "GPL";
int _version __attribute__((section("version"), used)) = 0x40300;
failed to produce any samples, even with forks happening and it being
running in system wide mode.
That is because now the filter is being associated, and the code above
always returns zero, meaning that all forks will be probed but filtered
away ;-/
Change it to 'return 1;' instead and after that:
# trace --no-syscalls --event /tmp/foo.o
0.000 perf_bpf_probe:fork:(ffffffff8109be30))
2.333 perf_bpf_probe:fork:(ffffffff8109be30))
3.725 perf_bpf_probe:fork:(ffffffff8109be30))
4.550 perf_bpf_probe:fork:(ffffffff8109be30))
^C#
And it works with all tools, including 'perf trace'.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch creates a 'struct perf_evsel' for every probe in a BPF object
file(s) and fills 'struct evlist' with them. The previously introduced
dummy event is now removed. After this patch, the following command:
# perf record --event filter.o ls
Can trace on each of the probes defined in filter.o.
The core of this patch is bpf__foreach_tev(), which calls a callback
function for each 'struct probe_trace_event' event for a bpf program
with each associated file descriptors. The add_bpf_event() callback
creates evsels by calling parse_events_add_tracepoint().
Since bpf-loader.c will not be built if libbpf is turned off, an empty
bpf__foreach_tev() is defined in bpf-loader.h to avoid build errors.
Committer notes:
Before:
# /tmp/oldperf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.198 MB perf.data ]
# perf evlist
/tmp/foo.o
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
I.e. we create just the PERF_TYPE_SOFTWARE (type: 1),
PERF_COUNT_SW_DUMMY(config 0x9) event, now, with this patch:
# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.210 MB perf.data ]
# perf evlist -v
perf_bpf_probe:fork: type: 2, size: 112, config: 0x6bd, { sample_period,
sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest:
1, mmap2: 1, comm_exec: 1
#
We now have a PERF_TYPE_SOFTWARE (type: 1), but the config states 0x6bd,
which is how, after setting up the event via the kprobes interface, the
'perf_bpf_probe:fork' event is accessible via the perf_event_open
syscall. This is all transient, as soon as the 'perf record' session
ends, these probes will go away.
To see how it looks like, lets try doing a neverending session, one that
expects a control+C to end:
# perf record --event /tmp/foo.o -a
So, with that in place, we can use 'perf probe' to see what is in place:
# perf probe -l
perf_bpf_probe:fork (on _do_fork@acme/git/linux/kernel/fork.c)
We also can use debugfs:
[root@felicio ~]# cat /sys/kernel/debug/tracing/kprobe_events
p:perf_bpf_probe/fork _text+638512
Ok, now lets stop and see if we got some forks:
[root@felicio linux]# perf record --event /tmp/foo.o -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.325 MB perf.data (111 samples) ]
[root@felicio linux]# perf script
sshd 1271 [003] 81797.507678: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [000] 81797.524917: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.381603: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.408635: perf_bpf_probe:fork: (ffffffff8109be30)
<SNIP>
Sure enough, we have 111 forks :-)
Callchains seems to work as well:
# perf report --stdio --no-child
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 562 of event 'perf_bpf_probe:fork'
# Event count (approx.): 562
#
# Overhead Command Shared Object Symbol
# ........ ........ ................ ............
#
44.66% sh [kernel.vmlinux] [k] _do_fork
|
---_do_fork
entry_SYSCALL_64_fastpath
__libc_fork
make_child
26.16% make [kernel.vmlinux] [k] _do_fork
<SNIP>
#
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch utilizes bpf_object__load() provided by libbpf to load all
objects into kernel.
Committer notes:
Testing it:
When using an incorrect kernel version number, i.e., having this in your
eBPF proggie:
int _version __attribute__((section("version"), used)) = 0x40100;
For a 4.3.0-rc6+ kernel, say, this happens and needs checking at event
parsing time, to provide a better error report to the user:
# perf record --event /tmp/foo.o sleep 1
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf:
libbpf: -- END LOG --
libbpf: failed to load program 'fork=_do_fork'
libbpf: failed to load object '/tmp/foo.o'
event syntax error: '/tmp/foo.o'
\___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
If we instead make it match, i.e. use 0x40300 on this v4.3.0-rc6+
kernel, the whole process goes thru:
# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.202 MB perf.data ]
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
#
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces bpf__{un,}probe() functions to enable callers to
create kprobe points based on section names a BPF program. It parses the
section names in the program and creates corresponding 'struct
perf_probe_event' structures. The parse_perf_probe_command() function is
used to do the main parsing work. The resuling 'struct perf_probe_event'
is stored into program private data for further using.
By utilizing the new probing API, this patch creates probe points during
event parsing.
To ensure probe points be removed correctly, register an atexit hook so
even perf quit through exit() bpf__clear() is still called, so probing
points are cleared. Note that bpf_clear() should be registered before
bpf__probe() is called, so failure of bpf__probe() can still trigger
bpf__clear() to remove probe points which are already probed.
strerror style error reporting scaffold is created by this patch.
bpf__strerror_probe() is the first error reporting function in
bpf-loader.c.
Committer note:
Trying it:
To build a test eBPF object file:
I am testing using a script I built from the 'perf test -v LLVM' output:
$ cat ~/bin/hello-ebpf
export KERNEL_INC_OPTIONS="-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.3/include -I/home/acme/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/acme/git/linux/include -Iinclude -I/home/acme/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -Iinclude/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h"
export WORKING_DIR=/lib/modules/4.2.0/build
export CLANG_SOURCE=-
export CLANG_OPTIONS=-xc
OBJ=/tmp/foo.o
rm -f $OBJ
echo '__attribute__((section("fork=do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | \
clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o /tmp/foo.o && file $OBJ
---
First asking to put a probe in a function not present in the kernel
(misses the initial _):
$ perf record --event /tmp/foo.o sleep 1
Probe point 'do_fork' not found.
event syntax error: '/tmp/foo.o'
\___ You need to check probing points in BPF file
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
---
Now, with "__attribute__((section("fork=_do_fork"), used)):
$ grep _do_fork /proc/kallsyms
ffffffff81099ab0 T _do_fork
$ perf record --event /tmp/foo.o sleep 1
Failed to open kprobe_events: Permission denied
event syntax error: '/tmp/foo.o'
\___ Permission denied
---
Cool, we need to provide some better hints, "kprobe_events" is too low
level, one doesn't strictly need to know the precise details of how
these things are put in place, so something that shows the command
needed to fix the permissions would be more helpful.
Lets try as root instead:
# perf record --event /tmp/foo.o sleep 1
Lowering default frequency rate to 1000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
# perf evlist
/tmp/foo.o
[root@felicio ~]# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 1000, sample_type: IP|TID|TIME|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
---
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By introducing new rules in tools/perf/util/parse-events.[ly], this
patch enables 'perf record --event bpf_file.o' to select events by an
eBPF object file. It calls parse_events_load_bpf() to load that file,
which uses bpf__prepare_load() and finally calls bpf_object__open() for
the object files.
After applying this patch, commands like:
# perf record --event foo.o sleep
become possible.
However, at this point it is unable to link any useful things onto the
evsel list because the creating of probe points and BPF program
attaching have not been implemented. Before real events are possible to
be extracted, to avoid perf report error because of empty evsel list,
this patch link a dummy evsel. The dummy event related code will be
removed when probing and extracting code is ready.
Commiter notes:
Using it:
$ ls -la foo.o
ls: cannot access foo.o: No such file or directory
$ perf record --event foo.o sleep
libbpf: failed to open foo.o: No such file or directory
event syntax error: 'foo.o'
\___ BPF object file 'foo.o' is invalid
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
$ file /tmp/build/perf/perf.o
/tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$ perf record --event /tmp/build/perf/perf.o sleep
libbpf: /tmp/build/perf/perf.o is not an eBPF object file
event syntax error: '/tmp/build/perf/perf.o'
\___ BPF object file '/tmp/build/perf/perf.o' is invalid
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
$ file /tmp/foo.o
/tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
$ perf record --event /tmp/foo.o sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
$ perf evlist
/tmp/foo.o
$ perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
$
So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok.
$ perf report --stdio
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
$
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch allows perf record setting event's attr.inherit bit by
config terms like:
# perf record -e cycles/no-inherit/ ...
# perf record -e cycles/inherit/ ...
So user can control inherit bit for each event separately.
In following example, a.out fork()s in main then do some complex
CPU intensive computations in both of its children.
Basic result with and without inherit:
# perf record -e cycles -e instructions ./a.out
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
# perf report --stdio
# ...
# Samples: 23K of event 'cycles'
# Event count (approx.): 23641752891
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30428312415
# perf record -i -e cycles -e instructions ./a.out
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
...
# Samples: 12K of event 'cycles'
# Event count (approx.): 11699501775
...
# Samples: 12K of event 'instructions'
# Event count (approx.): 15058023559
Cancel inherit for one event when globally enable:
# perf record -e cycles/no-inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
...
# Samples: 12K of event 'cycles/no-inherit/'
# Event count (approx.): 11895759282
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30668000441
Enable inherit for one event when globally disable:
# perf record -i -e cycles/inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
...
# Samples: 23K of event 'cycles/inherit/'
# Event count (approx.): 23285400229
...
# Samples: 11K of event 'instructions'
# Event count (approx.): 14969050259
Committer note:
One can check if the bit was set, in addition to seeing the result in
the perf.data file size as above by doing one of:
# perf record -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
So, the inherit bit was set in both, now, if we disable it globally using
--no-inherit:
# perf record --no-inherit -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
No inherit bit set, then disabling it and setting just on the cycles event:
# perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
# perf evlist -v
cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
We can see it as well in by using a more verbose level of debug messages in
the tool that sets up the perf_event_attr, 'perf record' in this case:
[root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
------------------------------------------------------------
perf_event_attr:
size 112
config 0x1
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
<SNIP>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
[ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we dont fail properly when pattern matching fails to find any
tracepoint.
Current behaviour:
$ perf record -e 'sched:krava*' sleep 1
WARNING: event parser found nothinginvalid or unsupported event: 'sched:krava*'
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
This patch change:
$ perf record -e 'sched:krava*' sleep 1
event syntax error: 'sched:krava*'
\___ unknown tracepoint
Error: File /sys/kernel/debug/tracing/events/sched/krava* not found.
Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444073477-3181-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'P' will cause the event to get maximum possible detected precise
level.
Following record:
$ perf record -e cycles:P ...
will detect maximum precise level for 'cycles' event and use it.
Commiter note:
Testing it:
$ perf record -e cycles:P usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data (9 samples) ]
$ perf evlist
cycles:P
$ perf evlist -v
cycles:P: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1, mmap2: 1,
comm_exec: 1
$
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>