The Intel PT decoder limits the number of unconditional branches (e.g.
jmps) decoded without consuming any trace packets. Generally, a loop
needs a conditional branch which generates a TNT packet, whereas a "ret"
instruction will generate a TIP or TNT packet. So exceeding the limit is
assumed to be a never-ending loop, which can happen if there has been a
decoding error putting the decoder at the wrong place in the code.
Up until now, the limit of 10000 has been enough but some analytic
purposes have been reported to exceed that.
Increase the limit to 100000, and make it configurable via perf config
intel-pt.max-loops. Also amend the "Never-ending loop" message to
mention the configuration entry.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210701175132.3977-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf has supported the CPU_PMU_CAPS feature to display a list of CPU PMU
capabilities. But on a hybrid platform, it may have several CPU PMUs (such
as "cpu_core" and "cpu_atom"). The CPU_PMU_CAPS feature is hard to extend
to support multiple CPU PMUs well if it needs to be compatible for the case
of old perf data file + new perf tool.
So for better compatibility we now create a new feature HYBRID_CPU_PMU_CAPS
in the header.
For the perf.data generated on hybrid platform,
root@otcpl-adl-s-2:~# perf report --header-only -I
# cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
# cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CPU_PMU_CAPS CLOCK_DATA
For the perf.data generated on non-hybrid platform
root@kbl-ppc:~# perf report --header-only -I
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY HYBRID_CPU_PMU_CAPS
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210514122948.9472-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is useful to let the user know about the hybrid topology.
Add the HYBRID_TOPOLOGY feature in header to indicate the core CPUs
and the atom CPUs.
With this patch a perf.data generated on a hybrid platform reports
the hybrid CPU list:
root@otcpl-adl-s-2:~# perf report --header-only -I
...
# hybrid cpu system:
# cpu_core cpu list : 0-15
# cpu_atom cpu list : 16-23
For a perf.data generated on a non-hybrid platform, reports a message
that HYBRID_TOPOLOGY is missing:
root@kbl-ppc:~# perf report --header-only -I
...
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210514122948.9472-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT timestamps are affected by virtualization. Add a new option
that will allow the Intel PT decoder to correlate the timestamps and
translate the virtual machine timestamps to host timestamps.
The advantages of making this a separate step, rather than a part of
normal decoding are that it is simpler to implement, and it needs to
be done only once.
This patch adds only the option. Later patches add Intel PT support.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To make the output more readable, I think it's better to remove 0's in
the output. Also the dummy event has no event stats so it just wasts
the space. Let's use the --skip-empty option to suppress it.
$ perf report --stat --skip-empty
Aggregated stats:
TOTAL events: 16530
MMAP events: 226
COMM events: 1596
EXIT events: 2
THROTTLE events: 121
UNTHROTTLE events: 117
FORK events: 1595
SAMPLE events: 719
MMAP2 events: 12147
CGROUP events: 2
FINISHED_ROUND events: 2
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
cycles stats:
SAMPLE events: 719
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, to use BPF to aggregate perf event counters, the user uses
--bpf-counters option. Enable "use bpf by default" events with a config
option, stat.bpf-counter-events. Events with name in the option will use
BPF.
This also enables mixed BPF event and regular event in the same sesssion.
For example:
perf config stat.bpf-counter-events=instructions
perf stat -e instructions,cs
The second command will use BPF for "instructions" but not "cs".
Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This functionality is based on recently introduced sysfs attributes for
Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
#EventCount * 4B / (1024 * 1024)
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The pipeline stage cycles details can be recorded on powerpc from the
contents of Performance Monitor Unit (PMU) registers. On ISA v3.1
platform, sampling registers exposes the cycles spent in different
pipeline stages. Patch adds perf tools support to present two of the
cycle counter information along with memory latency (weight).
Re-use the field 'ins_lat' for storing the first pipeline stage cycle.
This is stored in 'var2_w' field of 'perf_sample_weight'.
Add a new field 'p_stage_cyc' to store the second pipeline stage cycle
which is stored in 'var3_w' field of perf_sample_weight.
Add new sort function 'Pipeline Stage Cycle' and include this in
default_mem_sort_order[]. This new sort function may be used to denote
some other pipeline stage in another architecture. So add this to list
of sort entries that can have dynamic header string.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/1616425047-1666-5-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf stat' subcommand supports the request for a summary of the
interval counter readings. But the summary lines break the CSV output
so it's hard for scripts to parse the result.
Before:
# perf stat -x, -I1000 --interval-count 1 --summary
1.001323097,8013.48,msec,cpu-clock,8013483384,100.00,8.013,CPUs utilized
1.001323097,270,,context-switches,8013513297,100.00,0.034,K/sec
1.001323097,13,,cpu-migrations,8013530032,100.00,0.002,K/sec
1.001323097,184,,page-faults,8013546992,100.00,0.023,K/sec
1.001323097,20574191,,cycles,8013551506,100.00,0.003,GHz
1.001323097,10562267,,instructions,8013564958,100.00,0.51,insn per cycle
1.001323097,2019244,,branches,8013575673,100.00,0.252,M/sec
1.001323097,106152,,branch-misses,8013585776,100.00,5.26,of all branches
8013.48,msec,cpu-clock,8013483384,100.00,7.984,CPUs utilized
270,,context-switches,8013513297,100.00,0.034,K/sec
13,,cpu-migrations,8013530032,100.00,0.002,K/sec
184,,page-faults,8013546992,100.00,0.023,K/sec
20574191,,cycles,8013551506,100.00,0.003,GHz
10562267,,instructions,8013564958,100.00,0.51,insn per cycle
2019244,,branches,8013575673,100.00,0.252,M/sec
106152,,branch-misses,8013585776,100.00,5.26,of all branches
The summary line loses the timestamp column, which breaks the CSV
output.
We add a column at the original 'timestamp' position and it just says
'summary' for the summary line.
After:
# perf stat -x, -I1000 --interval-count 1 --summary
1.001196053,8012.72,msec,cpu-clock,8012722903,100.00,8.013,CPUs utilized
1.001196053,218,,context-switches,8012753271,100.00,0.027,K/sec
1.001196053,9,,cpu-migrations,8012769767,100.00,0.001,K/sec
1.001196053,0,,page-faults,8012786257,100.00,0.000,K/sec
1.001196053,15004518,,cycles,8012790637,100.00,0.002,GHz
1.001196053,7954691,,instructions,8012804027,100.00,0.53,insn per cycle
1.001196053,1590259,,branches,8012814766,100.00,0.198,M/sec
1.001196053,82601,,branch-misses,8012824365,100.00,5.19,of all branches
summary,8012.72,msec,cpu-clock,8012722903,100.00,7.986,CPUs utilized
summary,218,,context-switches,8012753271,100.00,0.027,K/sec
summary,9,,cpu-migrations,8012769767,100.00,0.001,K/sec
summary,0,,page-faults,8012786257,100.00,0.000,K/sec
summary,15004518,,cycles,8012790637,100.00,0.002,GHz
summary,7954691,,instructions,8012804027,100.00,0.53,insn per cycle
summary,1590259,,branches,8012814766,100.00,0.198,M/sec
summary,82601,,branch-misses,8012824365,100.00,5.19,of all branches
Now it's easy for script to analyse the summary lines.
Of course, we also consider not to break possible existing scripts which
can continue to use the broken CSV format by using a new '--no-csv-summary.'
option.
# perf stat -x, -I1000 --interval-count 1 --summary --no-csv-summary
1.001213261,8012.67,msec,cpu-clock,8012672327,100.00,8.013,CPUs utilized
1.001213261,197,,context-switches,8012703742,100.00,24.586,/sec
1.001213261,9,,cpu-migrations,8012720902,100.00,1.123,/sec
1.001213261,644,,page-faults,8012738266,100.00,80.373,/sec
1.001213261,18350698,,cycles,8012744109,100.00,0.002,GHz
1.001213261,12745021,,instructions,8012759001,100.00,0.69,insn per cycle
1.001213261,2458033,,branches,8012770864,100.00,306.768,K/sec
1.001213261,102107,,branch-misses,8012781751,100.00,4.15,of all branches
8012.67,msec,cpu-clock,8012672327,100.00,7.985,CPUs utilized
197,,context-switches,8012703742,100.00,24.586,/sec
9,,cpu-migrations,8012720902,100.00,1.123,/sec
644,,page-faults,8012738266,100.00,80.373,/sec
18350698,,cycles,8012744109,100.00,0.002,GHz
12745021,,instructions,8012759001,100.00,0.69,insn per cycle
2458033,,branches,8012770864,100.00,306.768,K/sec
102107,,branch-misses,8012781751,100.00,4.15,of all branches
This option can be enabled in perf config by setting the variable
'stat.no-csv-summary'.
# perf config stat.no-csv-summary=true
# perf config -l
stat.no-csv-summary=true
# perf stat -x, -I1000 --interval-count 1 --summary
1.001330198,8013.28,msec,cpu-clock,8013279201,100.00,8.013,CPUs utilized
1.001330198,205,,context-switches,8013308394,100.00,25.583,/sec
1.001330198,10,,cpu-migrations,8013324681,100.00,1.248,/sec
1.001330198,0,,page-faults,8013340926,100.00,0.000,/sec
1.001330198,8027742,,cycles,8013344503,100.00,0.001,GHz
1.001330198,2871717,,instructions,8013356501,100.00,0.36,insn per cycle
1.001330198,553564,,branches,8013366204,100.00,69.081,K/sec
1.001330198,54021,,branch-misses,8013375952,100.00,9.76,of all branches
8013.28,msec,cpu-clock,8013279201,100.00,7.985,CPUs utilized
205,,context-switches,8013308394,100.00,25.583,/sec
10,,cpu-migrations,8013324681,100.00,1.248,/sec
0,,page-faults,8013340926,100.00,0.000,/sec
8027742,,cycles,8013344503,100.00,0.001,GHz
2871717,,instructions,8013356501,100.00,0.36,insn per cycle
553564,,branches,8013366204,100.00,69.081,K/sec
54021,,branch-misses,8013375952,100.00,9.76,of all branches
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210319070156.20394-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf tool uses performance monitoring counters (PMCs) to monitor
system performance. The PMCs are limited hardware resources. For
example, Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
Modern data center systems use these PMCs in many different ways: system
level monitoring, (maybe nested) container level monitoring, per process
monitoring, profiling (in sample mode), etc. In some cases, there are
more active perf_events than available hardware PMCs. To allow all
perf_events to have a chance to run, it is necessary to do expensive
time multiplexing of events.
On the other hand, many monitoring tools count the common metrics
(cycles, instructions). It is a waste to have multiple tools create
multiple perf_events of "cycles" and occupy multiple PMCs.
bperf tries to reduce such wastes by allowing multiple perf_events of
"cycles" or "instructions" (at different scopes) to share PMUs. Instead
of having each perf-stat session to read its own perf_events, bperf uses
BPF programs to read the perf_events and aggregate readings to BPF maps.
Then, the perf-stat session(s) reads the values from these BPF maps.
Please refer to the comment before the definition of bperf_ops for the
description of bperf architecture.
bperf is off by default. To enable it, pass --bpf-counters option to
perf-stat. bperf uses a BPF hashmap to share information about BPF
programs and maps used by bperf. This map is pinned to bpffs. The
default path is /sys/fs/bpf/perf_attr_map. The user could change the
path with option --bpf-attr-map.
Committer testing:
# dmesg|grep "Performance Events" -A5
[ 0.225277] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
[ 0.225280] ... version: 0
[ 0.225280] ... bit width: 48
[ 0.225281] ... generic registers: 6
[ 0.225281] ... value mask: 0000ffffffffffff
[ 0.225281] ... max period: 00007fffffffffff
#
# for a in $(seq 6) ; do perf stat -a -e cycles,instructions sleep 100000 & done
[1] 2436231
[2] 2436232
[3] 2436233
[4] 2436234
[5] 2436235
[6] 2436236
# perf stat -a -e cycles,instructions sleep 0.1
Performance counter stats for 'system wide':
310,326,987 cycles (41.87%)
236,143,290 instructions # 0.76 insn per cycle (41.87%)
0.100800885 seconds time elapsed
#
We can see that the counters were enabled for this workload 41.87% of
the time.
Now with --bpf-counters:
# for a in $(seq 32) ; do perf stat --bpf-counters -a -e cycles,instructions sleep 100000 & done
[1] 2436514
[2] 2436515
[3] 2436516
[4] 2436517
[5] 2436518
[6] 2436519
[7] 2436520
[8] 2436521
[9] 2436522
[10] 2436523
[11] 2436524
[12] 2436525
[13] 2436526
[14] 2436527
[15] 2436528
[16] 2436529
[17] 2436530
[18] 2436531
[19] 2436532
[20] 2436533
[21] 2436534
[22] 2436535
[23] 2436536
[24] 2436537
[25] 2436538
[26] 2436539
[27] 2436540
[28] 2436541
[29] 2436542
[30] 2436543
[31] 2436544
[32] 2436545
#
# ls -la /sys/fs/bpf/perf_attr_map
-rw-------. 1 root root 0 Mar 23 14:53 /sys/fs/bpf/perf_attr_map
# bpftool map | grep bperf | wc -l
64
#
# bpftool map | tail
1265: percpu_array name accum_readings flags 0x0
key 4B value 24B max_entries 1 memlock 4096B
1266: hash name filter flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
1267: array name bperf_fo.bss flags 0x400
key 4B value 8B max_entries 1 memlock 4096B
btf_id 996
pids perf(2436545)
1268: percpu_array name accum_readings flags 0x0
key 4B value 24B max_entries 1 memlock 4096B
1269: hash name filter flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
1270: array name bperf_fo.bss flags 0x400
key 4B value 8B max_entries 1 memlock 4096B
btf_id 997
pids perf(2436541)
1285: array name pid_iter.rodata flags 0x480
key 4B value 4B max_entries 1 memlock 4096B
btf_id 1017 frozen
pids bpftool(2437504)
1286: array flags 0x0
key 4B value 32B max_entries 1 memlock 4096B
#
# bpftool map dump id 1268 | tail
value (CPU 21):
8f f3 bc ca 00 00 00 00 80 fd 2a d1 4d 00 00 00
80 fd 2a d1 4d 00 00 00
value (CPU 22):
7e d5 64 4d 00 00 00 00 a4 8a 2e ee 4d 00 00 00
a4 8a 2e ee 4d 00 00 00
value (CPU 23):
a7 78 3e 06 01 00 00 00 b2 34 94 f6 4d 00 00 00
b2 34 94 f6 4d 00 00 00
Found 1 element
# bpftool map dump id 1268 | tail
value (CPU 21):
c6 8b d9 ca 00 00 00 00 20 c6 fc 83 4e 00 00 00
20 c6 fc 83 4e 00 00 00
value (CPU 22):
9c b4 d2 4d 00 00 00 00 3e 0c df 89 4e 00 00 00
3e 0c df 89 4e 00 00 00
value (CPU 23):
18 43 66 06 01 00 00 00 5b 69 ed 83 4e 00 00 00
5b 69 ed 83 4e 00 00 00
Found 1 element
# bpftool map dump id 1268 | tail
value (CPU 21):
f2 6e db ca 00 00 00 00 92 67 4c ba 4e 00 00 00
92 67 4c ba 4e 00 00 00
value (CPU 22):
dc 8e e1 4d 00 00 00 00 d9 32 7a c5 4e 00 00 00
d9 32 7a c5 4e 00 00 00
value (CPU 23):
bd 2b 73 06 01 00 00 00 7c 73 87 bf 4e 00 00 00
7c 73 87 bf 4e 00 00 00
Found 1 element
#
# perf stat --bpf-counters -a -e cycles,instructions sleep 0.1
Performance counter stats for 'system wide':
119,410,122 cycles
152,105,479 instructions # 1.27 insn per cycle
0.101395093 seconds time elapsed
#
See? We had the counters enabled all the time.
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: kernel-team@fb.com
Link: http://lore.kernel.org/lkml/20210316211837.910506-2-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the fixes sent for v5.12 and continue development based on
v5.12-rc2, i.e. without the swap on file bug.
This also gets a slightly newer and better tools/perf/arch/arm/util/cs-etm.c
patch version, using the BIT() macro, that had already been slated to
v5.13 but ended up going to v5.12-rc1 on an older version.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Warning "dso not found" is reported when using "perf report -D".
66702781413407 0x32c0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x2): 28177/28177: 0x55e493e00563 period: 106578 addr: 0
... thread: perf:28177
...... dso: <not found>
66702727832429 0x9dd8 [0x38]: PERF_RECORD_COMM exec: triad_loop:28177/28177
The PERF_RECORD_SAMPLE event (timestamp: 66702781413407) should be after the
PERF_RECORD_COMM event (timestamp: 66702727832429), but it's early processed.
So for most of cases, it makes sense to keep the event ordered even for dump
mode. But it would be also useful to disable ordered_events for reporting raw
dump to see events as they are stored in the perf.data file.
So now, set ordered_events by default to true and add a new option
'disable-order' to disable it. For example,
perf report -D --disable-order
Fixes: 977f739b71 ("perf report: Disable ordered_events for raw dump")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210219070005.12397-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a 'ping' command to verify that the 'perf record' session is up and
operational.
It's used in the following patches via test code to make sure 'perf
record' is ready to receive signals.
Example:
# cat ~/.perfconfig
[daemon]
base=/opt/perfdata
[session-cycles]
run = -m 10M -e cycles --overwrite --switch-output -a
[session-sched]
run = -m 20M -e sched:* --overwrite --switch-output -a
Start the daemon:
# perf daemon start
Ping all sessions:
# perf daemon ping
OK cycles
OK sched
Ping specific session:
# perf daemon ping --session sched
OK sched
Committer notes:
Fixed up bug pointed by clang:
Buggy:
if (!pollfd.revents & POLLIN)
Correct code:
if (!(pollfd.revents & POLLIN))
clang warning:
builtin-daemon.c:560:6: error: logical not is only applied to the left hand side of this bitwise operator [-Werror,-Wlogical-not-parentheses]
if (!pollfd.revents & POLLIN) {
^ ~
builtin-daemon.c:560:6: note: add parentheses after the '!' to evaluate the bitwise operator first
Also use designated initialized with pollfd, i.e.:
struct pollfd pollfd = { .events = POLLIN, };
Instead of:
struct pollfd pollfd = { 0, };
To get past:
builtin-daemon.c:510:30: error: missing field 'events' initializer [-Werror,-Wmissing-field-initializers]
struct pollfd pollfd = { 0, };
^
1 error generated.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>