Timothy Hayes
7599b70a3c
perf arm-spe: Fix SPE events with phys addresses
...
This patch corrects a bug whereby SPE collection is invoked with
pa_enable=1 but synthesized events fail to show physical addresses.
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Fastabend <john.fastabend@gmail.com >
Cc: John Garry <john.garry@huawei.com >
Cc: KP Singh <kpsingh@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Martin KaFai Lau <kafai@fb.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Song Liu <songliubraving@fb.com >
Cc: Will Deacon <will@kernel.org >
Cc: Yonghong Song <yhs@fb.com >
Cc: bpf@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220421165205.117662-3-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-04-28 10:39:28 -03:00
Timothy Hayes
4e13f6706d
perf arm-spe: Fix addresses of synthesized SPE events
...
This patch corrects a bug whereby synthesized events from SPE
samples are missing virtual addresses.
Fixes: 54f7815efe ("perf arm-spe: Fill address info for samples")
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: bpf@vger.kernel.org
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Fastabend <john.fastabend@gmail.com >
Cc: John Garry <john.garry@huawei.com >
Cc: KP Singh <kpsingh@kernel.org >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Martin KaFai Lau <kafai@fb.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: netdev@vger.kernel.org
Cc: Song Liu <songliubraving@fb.com >
Cc: Will Deacon <will@kernel.org >
Cc: Yonghong Song <yhs@fb.com >
Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-04-28 10:39:14 -03:00
German Gomez
ff8752d761
perf arm-spe: Synthesize SPE instruction events
...
Synthesize instruction events for every ARM SPE record.
Arm SPE implements a hardware-based sample period, and perf implements a
software-based one. Add a warning message to inform the user of this.
Signed-off-by: German Gomez <german.gomez@arm.com >
Tested-by: Leo Yan <leo.yan@linaro.org >
Acked-by: Namhyung Kim <namhyung@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211216152404.52474-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-12-17 22:44:10 -03:00
Namhyung Kim
b0fde9c6e2
perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT
...
Use total latency info in the SPE counter packet as sample weight so
that we can see it in local_weight and (global) weight sort keys.
Maybe we can use PERF_SAMPLE_WEIGHT_STRUCT to support ins_lat as well
but I'm not sure which latency it matches. So just adding total latency
first.
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Ian Rogers <irogers@google.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Link: http://lore.kernel.org/lkml/20211201220855.1260688-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-12-16 12:18:11 -03:00
German Gomez
9e1a8d9f68
perf inject: Fix ARM SPE handling
...
'perf inject' is currently not working for Arm SPE. When you try to run
'perf inject' and 'perf report' with a perf.data file that contains SPE
traces, the tool reports a "Bad address" error:
# ./perf record -e arm_spe_0/ts_enable=1,store_filter=1,branch_filter=1,load_filter=1/ -a -- sleep 1
# ./perf inject -i perf.data -o perf.inject.data --itrace
# ./perf report -i perf.inject.data --stdio
0x42c00 [0x8]: failed to process type: 9 [Bad address]
Error:
failed to process sample
As far as I know, the issue was first spotted in [1], but 'perf inject'
was not yet injecting the samples. This patch does something similar to
what cs_etm does for injecting the samples [2], but for SPE.
[1] https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210412091006.468557-1-leo.yan@linaro.org/#24117339
[2] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/cs-etm.c?h=perf/core&id=133fe2e617e48ca0948983329f43877064ffda3e#n1196
Reviewed-by: James Clark <james.clark@arm.com >
Signed-off-by: German Gomez <german.gomez@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211105104130.28186-2-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-11-18 10:08:07 -03:00
German Gomez
27d113cfe8
perf arm-spe: Support hardware-based PID tracing
...
If ARM SPE traces contain CONTEXT packets with TID info, use these
values for tracking the TID of samples. Otherwise fall back to using
context switch events and display a message warning the user of
possible timing inaccuracies [1].
[1] https://lore.kernel.org/lkml/f877cfa6-9b25-6445-3806-ca44a4042eaf@arm.com/
Signed-off-by: German Gomez <german.gomez@arm.com >
Acked-by: Namhyung Kim <namhyung@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-5-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-11-13 18:11:51 -03:00
Namhyung Kim
9dc9855f18
perf arm-spe: Track task context switch for cpu-mode events
...
When perf report synthesizes events from ARM SPE data, it refers to
the current cpu, pid and tid in the machine. But there's no place to set
them in the ARM SPE decoder. I'm seeing all pid/tid set to -1 and
user symbols are not resolved in the output.
# perf record -a -e arm_spe_0/ts_enable=1/ sleep 1
# perf report -q | head
8.77% 8.77% :-1 [kernel.kallsyms] [k] format_decode
7.02% 7.02% :-1 [kernel.kallsyms] [k] seq_printf
7.02% 7.02% :-1 [unknown] [.] 0x0000ffff9f687c34
5.26% 5.26% :-1 [kernel.kallsyms] [k] vsnprintf
3.51% 3.51% :-1 [kernel.kallsyms] [k] string
3.51% 3.51% :-1 [unknown] [.] 0x0000ffff9f66ae20
3.51% 3.51% :-1 [unknown] [.] 0x0000ffff9f670b3c
3.51% 3.51% :-1 [unknown] [.] 0x0000ffff9f67c040
1.75% 1.75% :-1 [kernel.kallsyms] [k] ___cache_free
1.75% 1.75% :-1 [kernel.kallsyms] [k] __count_memcg_events
Like Intel PT, add context switch records to track task info. As ARM
SPE support was added later than PERF_RECORD_SWITCH_CPU_WIDE, I think
we can safely set the attr.context_switch bit and use it.
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: German Gomez <german.gomez@arm.com >
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-2-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-11-13 18:11:50 -03:00
Andrew Kilroy
09e9afac8c
perf arm-spe: Print size using consistent format
...
Since the size is already printed earlier in hex, print the same data
using the same format, in hex.
Reviewed-by: James Clark <james.clark@arm.com >
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Will Deacon <will@kernel.org >
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211109142153.56546-3-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com >
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-11-13 18:11:50 -03:00
Leo Yan
8941ba502f
perf arm-spe: Don't wait for PERF_RECORD_EXIT event
...
When decoding Arm SPE trace, the decoder waits for the PERF_RECORD_EXIT
event (the last perf event) before processing trace data, which is
needless and might even cause logic errors, e.g. it might fail to
correlate perf events with Arm SPE events correctly.
So this patch removes the condition check for the PERF_RECORD_EXIT event.
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <Al.Grant@arm.com >
Cc: Dave Martin <Dave.Martin@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Will Deacon <will@kernel.org >
Link: https://lore.kernel.org/r/20210519071939.1598923-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-07-01 16:14:36 -03:00
Leo Yan
afb5e9e47f
perf arm-spe: Bail out if the trace is later than perf event
...
A record in the Arm SPE trace may be later than a perf event, and vice
versa. The perf events and the synthesized Arm SPE events therefore need
to be correlated and processed in the correct time order.
To achieve the time ordering, this patch reverses the flow: it first
calls arm_spe_sample() and then calls arm_spe_decode(). By comparing
timestamps, if it detects that the perf event comes earlier than the Arm
SPE trace data, it bails out of the decoding loop; the last record is
pushed onto the auxtrace stack and its sample generation is deferred. To
track the timestamp, it updates the timestamp for the latest record
every time.
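The reversed flow can be illustrated with a small, self-contained
sketch. This is not the code from the patch: the structure, field and
function names below are hypothetical, and each record is reduced to a
bare timestamp. It only shows the "sample first, then decode, bail out
when the record is not earlier than the perf event" ordering.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue state; records are modelled as timestamps only. */
struct queue_sketch {
	const uint64_t *records; /* decoded record timestamps, in order */
	size_t nr;               /* number of records */
	size_t next;             /* index of the next record to decode */
	int have_pending;        /* a decoded record is waiting to be sampled */
	uint64_t timestamp;      /* timestamp of the latest decoded record */
};

/* Returns how many samples were generated for this perf event. */
static size_t run_decoder_sketch(struct queue_sketch *q, uint64_t event_ts)
{
	size_t emitted = 0;

	for (;;) {
		/* 1. Sample first: emit the record deferred earlier, if any. */
		if (q->have_pending) {
			emitted++;
			q->have_pending = 0;
		}

		/* 2. Then decode the next record, if there is one. */
		if (q->next == q->nr)
			break;
		q->timestamp = q->records[q->next++]; /* track latest record */
		q->have_pending = 1;

		/* 3. Bail out: the perf event must be processed first. */
		if (q->timestamp >= event_ts)
			break;
	}
	return emitted;
}
```

With record timestamps 10, 20, 30, 40 and a perf event at time 25, the
first call generates samples for 10 and 20 and leaves 30 deferred; the
deferred record is sampled at the start of the next call.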
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <Al.Grant@arm.com >
Cc: Dave Martin <Dave.Martin@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-07-01 16:14:36 -03:00
Leo Yan
85498f756f
perf arm-spe: Assign kernel time to synthesized event
...
The current code assigns the arch timer counter to the samples
synthesized from the Arm SPE trace, so the samples contain only the raw
counter value rather than the kernel time.
To fix the issue, this patch converts the timer counter to kernel time
and assigns it to the sample timestamp.
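As an illustration of the counter-to-time direction, here is a minimal
sketch that follows the time_zero/time_mult/time_shift scheme documented
for the perf mmap page. The structure and function names are
assumptions made for this example, not the identifiers used by the
patch.

```c
#include <stdint.h>

/* Hypothetical holder for the clock parameters (names are assumptions). */
struct clock_conv_sketch {
	uint64_t time_zero;  /* kernel time in ns at counter value 0 */
	uint32_t time_mult;  /* multiplier applied to the counter */
	uint16_t time_shift; /* right shift applied after multiplying */
};

/* Convert a raw counter value to kernel time in nanoseconds. */
static uint64_t counter_to_ns_sketch(uint64_t cyc,
				     const struct clock_conv_sketch *tc)
{
	/* Split the counter so the multiplication cannot lose low bits. */
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}
```

For example, with time_zero=1000, time_mult=40 and time_shift=2 (10 ns
per counter tick), a counter value of 100 maps to 2000 ns.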
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <Al.Grant@arm.com >
Cc: Dave Martin <Dave.Martin@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-07-01 16:14:36 -03:00
Leo Yan
630519014c
perf arm-spe: Convert event kernel time to counter value
...
When handling a perf event, the Arm SPE decoder needs to decide whether
this perf event is earlier or later than the samples from the Arm SPE
trace data; to do the comparison, it needs to use the same unit for the
time.
This patch converts the event kernel time to the arch timer's counter
value, so it can be compared with the counter value contained in the
Arm SPE Timestamp packet.
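The time-to-counter direction is the inverse of the
time_zero/time_mult/time_shift scheme documented for the perf mmap page.
A minimal sketch, assuming a hypothetical parameter structure and
function name (neither is taken from the patch), could look like this:

```c
#include <stdint.h>

/* Hypothetical holder for the clock parameters (names are assumptions). */
struct tc_params_sketch {
	uint64_t time_zero;  /* kernel time in ns at counter value 0 */
	uint32_t time_mult;  /* multiplier used for counter -> time */
	uint16_t time_shift; /* shift used for counter -> time */
};

/* Convert kernel time in nanoseconds back to a raw counter value. */
static uint64_t ns_to_counter_sketch(uint64_t ns,
				     const struct tc_params_sketch *tc)
{
	uint64_t t = ns - tc->time_zero;   /* time since counter value 0 */
	uint64_t quot = t / tc->time_mult; /* whole multiples of time_mult */
	uint64_t rem = t % tc->time_mult;  /* remainder, scaled separately */

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}
```

With time_zero=1000, time_mult=40 and time_shift=2, a kernel time of
2000 ns maps to counter value 100, which can then be compared directly
with the value from a Timestamp packet.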
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <Al.Grant@arm.com >
Cc: Dave Martin <Dave.Martin@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-07-01 16:14:36 -03:00
Leo Yan
c210c30696
perf arm-spe: Save clock parameters from TIME_CONV event
...
During the recording phase, the "perf record" tool synthesizes the event
PERF_RECORD_TIME_CONV for the hardware clock parameters and saves the
event into the data file.
Afterwards, when processing the data file, the TIME_CONV event is
processed very early and stored in the session context.
This patch extracts these parameters from the session context and saves
them into the structure "spe->tc" with the type perf_tsc_conversion, so
that the parameters are ready for conversion between clock counter and
timestamp.
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <Al.Grant@arm.com >
Cc: Dave Martin <Dave.Martin@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-07-01 16:14:36 -03:00
Leo Yan
a89dbc9b98
perf arm-spe: Set sample's data source field
...
The sample structure contains the field 'data_src', which describes
the attributes of the data operation, e.g. whether the operation is a
load or a store, the cache level, whether it is a snoop or a remote
access, etc. In the end, 'data_src' is parsed by the perf mem/c2c
tools to display human readable strings.
This patch fills the 'data_src' field in the synthesized samples
based on the different types. Currently the perf tool can display
statistics for L1/L2/L3 caches but it doesn't support the 'last level
cache'. To fit the current implementation, the 'data_src' field uses
the L3 cache for the last level cache.
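A rough sketch of the mapping described above, using the
perf_mem_data_src union and PERF_MEM_* constants from
<linux/perf_event.h>. The event flag names and the function name are
hypothetical and only stand in for the record types of the decoder;
this is not the function added by the patch.

```c
#include <stdint.h>
#include <linux/perf_event.h>

/* Hypothetical record type flags (assumptions for this sketch). */
enum {
	EV_L1D_ACCESS = 1 << 0,
	EV_L1D_MISS   = 1 << 1,
	EV_LLC_ACCESS = 1 << 2,
	EV_LLC_MISS   = 1 << 3,
};

static uint64_t synth_data_src_sketch(unsigned int type, int is_store)
{
	union perf_mem_data_src d = { .val = 0 };

	d.mem_op = is_store ? PERF_MEM_OP_STORE : PERF_MEM_OP_LOAD;

	if (type & (EV_LLC_ACCESS | EV_LLC_MISS)) {
		/* No 'last level cache' level: report it as L3. */
		d.mem_lvl = PERF_MEM_LVL_L3;
		d.mem_lvl |= (type & EV_LLC_MISS) ? PERF_MEM_LVL_MISS
						  : PERF_MEM_LVL_HIT;
	} else if (type & (EV_L1D_ACCESS | EV_L1D_MISS)) {
		d.mem_lvl = PERF_MEM_LVL_L1;
		d.mem_lvl |= (type & EV_L1D_MISS) ? PERF_MEM_LVL_MISS
						  : PERF_MEM_LVL_HIT;
	}

	return d.val;
}
```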
Before this commit, perf mem report looks like this:
# Samples: 75K of event 'l1d-miss'
# Total weight : 75951
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access
# ........ ....... ............ ............. ...................... ............. ...................... ........... ..... ..........
#
81.56% 61945 0 N/A [.] 0x00000000000009d8 serial_c [.] 0000000000000000 [unknown] N/A N/A
18.44% 14003 0 N/A [.] 0x0000000000000828 serial_c [.] 0000000000000000 [unknown] N/A N/A
Now on a system with Arm SPE, addresses and access types are displayed:
# Samples: 75K of event 'l1d-miss'
# Total weight : 75951
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access
# ........ ....... ............ ............. ...................... ............. ...................... ........... ..... ..........
#
0.43% 324 0 L1 miss [.] 0x00000000000009d8 serial_c [.] 0x0000ffff80794e00 anon N/A Walker hit
0.42% 322 0 L1 miss [.] 0x00000000000009d8 serial_c [.] 0x0000ffff80794580 anon N/A Walker hit
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <al.grant@arm.com >
Cc: Andre Przywara <andre.przywara@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Wei Li <liwei391@huawei.com >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: James Clark <james.clark@arm.com >
Link: https://lore.kernel.org/r/20210211133856.2137-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-02-16 11:52:00 -03:00
Leo Yan
e55ed3423c
perf arm-spe: Synthesize memory event
...
The memory event delivers two benefits:
- First, the memory event gives a global view of memory accesses.
Organizing events in a scattered way (e.g. using separate events for
the L1 cache, the last level cache, etc.) can only display an event
for a single memory type, whereas memory events include all memory
accesses, so they can display data accesses across memory levels in
the same view;
- Second, sample generation might introduce a big overhead and require
a long wait for perf reporting. We can specify the itrace option
'--itrace=M' to filter out other events and only output memory
events, which can significantly reduce the overhead caused by
generating samples.
This patch enables the memory event for Arm SPE.
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <al.grant@arm.com >
Cc: Andre Przywara <andre.przywara@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Wei Li <liwei391@huawei.com >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: James Clark <james.clark@arm.com >
Link: https://lore.kernel.org/r/20210211133856.2137-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-02-16 11:52:00 -03:00
Leo Yan
54f7815efe
perf arm-spe: Fill address info for samples
...
To properly handle memory and branch samples, this patch splits sample
generation into two functions: arm_spe__synth_mem_sample() synthesizes
memory and TLB samples; arm_spe__synth_branch_sample() synthesizes
branch samples.
The Arm SPE backend decoder already passes virtual and physical
addresses through packets; the address info is stored in the
synthesized samples in the function arm_spe__synth_mem_sample().
Committer notes:
Fixed this:
36 46.77 fedora:27 : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
util/arm-spe.c:269:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
struct perf_sample sample = { 0 };
^
util/arm-spe.c:288:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
struct perf_sample sample = { 0 };
By using = { .ip = 0, };
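For illustration, the difference between the two initializer forms can
be sketched with a stand-in structure. The structure below is
hypothetical and much smaller than the real struct perf_sample; both
forms zero every field, but the designated form does not trigger
-Wmissing-field-initializers.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct perf_sample (fields are assumptions). */
struct sample_sketch {
	uint64_t ip;
	uint32_t pid;
	uint32_t tid;
	uint64_t addr;
};

static struct sample_sketch empty_sample_sketch(void)
{
	/*
	 * "= { 0 }" names no field, so the compiler in the log above warns
	 * about the missing 'pid' initializer. A designated initializer
	 * still zeroes all remaining fields but avoids that warning.
	 */
	struct sample_sketch sample = { .ip = 0, };

	return sample;
}
```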
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <al.grant@arm.com >
Cc: Andre Przywara <andre.przywara@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Wei Li <liwei391@huawei.com >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: James Clark <james.clark@arm.com >
Link: https://lore.kernel.org/r/20210211133856.2137-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-02-16 11:51:08 -03:00
Leo Yan
845d3a65c3
perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC
...
This patch enables the sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
the perf data. When outputting the tracing data, it tells tools that
the memory event contains a data source.
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: James Clark <james.clark@arm.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <al.grant@arm.com >
Cc: Andre Przywara <andre.przywara@arm.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Wei Li <liwei391@huawei.com >
Cc: Will Deacon <will@kernel.org >
Link: https://lore.kernel.org/r/20210211133856.2137-1-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com >
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2021-02-12 17:37:46 -03:00
Leo Yan
75eeaddd57
perf arm-spe: Refactor printing string to buffer
...
When outputting strings to the decoding buffer with snprintf(), the
SPE decoder needs to detect whether snprintf() returned an error and, if
so, bail out directly. If snprintf() succeeds, it needs to advance the
buffer pointer and reduce the buffer length so that it can continue to
output the next string into the subsequent memory space.
This complex logic is spread throughout the function arm_spe_pkt_desc(),
so there is a lot of duplicated code for detecting errors, incrementing
the buffer pointer and decrementing the buffer size.
To avoid the duplicated code, this patch introduces a new helper function
arm_spe_pkt_out_string(), which wraps up the complex logic and is used
by the caller arm_spe_pkt_desc(). This patch also makes the variable
'blen' local to the function, which allows removing the unnecessary
braces and improves readability.
This patch simplifies the return value of arm_spe_pkt_desc(): '0' means
success and other values mean an error has occurred. To realize this,
it relies on arm_spe_pkt_out_string()'s parameter 'err', which is a
cumulative value: it holds the final result whether the printing helper
is called once or multiple times. Finally, the error is handled in a
central place: rather than bailing out directly in the switch cases,
arm_spe_pkt_desc() returns the error at the end.
This patch changes the caller arm_spe_dump() to respect the updated
return value semantics of arm_spe_pkt_desc().
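The cumulative-error pattern described above can be sketched as
follows. This is an illustrative helper with a hypothetical name and an
assumed signature, not the arm_spe_pkt_out_string() implementation from
the patch; it uses -1 as a placeholder error code for truncation.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Append formatted text at *buf_p, which has *blen bytes left.
 * '*err' is cumulative: once set, later calls do nothing.
 */
static int out_string_sketch(int *err, char **buf_p, size_t *blen,
			     const char *fmt, ...)
{
	va_list ap;
	int ret;

	/* A previous call already failed: keep the first error. */
	if (*err)
		return *err;

	va_start(ap, fmt);
	ret = vsnprintf(*buf_p, *blen, fmt, ap);
	va_end(ap);

	if (ret < 0) {
		*err = ret;            /* error reported by vsnprintf() */
	} else if ((size_t)ret >= *blen) {
		*err = -1;             /* output did not fit in the buffer */
	} else {
		*buf_p += ret;         /* advance past the text just written */
		*blen -= (size_t)ret;  /* shrink the remaining space */
	}

	return *err;
}
```

A caller can chain several calls without checking each one and test
'err' once at the end, which is the central error handling the message
refers to.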
Suggested-by: Dave Martin <Dave.Martin@arm.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reviewed-by: Andre Przywara <andre.przywara@arm.com >
Reviewed-by: Dave Martin <Dave.Martin@arm.com >
Acked-by: Will Deacon <will@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Al Grant <Al.Grant@arm.com >
Cc: Arnaldo Carvalho de Melo <acme@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: John Garry <john.garry@huawei.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Wei Li <liwei391@huawei.com >
Link: https://lore.kernel.org/r/20201119152441.6972-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2020-11-26 09:31:23 -03:00
Tan Xiaojun
a54ca19498
perf arm-spe: Support synthetic events
...
After commit ffd3d18c20 ("perf tools: Add ARM Statistical
Profiling Extensions (SPE) support") was merged, perf can output raw
data with the option "--dump-raw-trace". However, it lacks support for
synthetic events, so it cannot output any statistical info.
This patch improves the "perf report" support for ARM SPE by adding four
types of synthetic events:
First level cache synthetic events, including L1 data cache access
and miss events;
Last level cache synthetic events, including last level cache
access and miss events;
TLB synthetic events, including TLB access and miss events;
Remote access events, which are used to account for load/store
operations that access another socket.
Example usage:
$ perf record -c 1024 -e arm_spe_0/branch_filter=1,ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ dd if=/dev/zero of=/dev/null count=10000
$ perf report --stdio
# Samples: 59 of event 'l1d-miss'
# Event count (approx.): 59
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ..................................
#
23.73% 23.73% dd [kernel.kallsyms] [k] perf_iterate_ctx.constprop.135
20.34% 20.34% dd [kernel.kallsyms] [k] filemap_map_pages
5.08% 5.08% dd [kernel.kallsyms] [k] perf_event_mmap
5.08% 5.08% dd [kernel.kallsyms] [k] unlock_page_memcg
5.08% 5.08% dd [kernel.kallsyms] [k] unmap_page_range
3.39% 3.39% dd [kernel.kallsyms] [k] PageHuge
3.39% 3.39% dd [kernel.kallsyms] [k] release_pages
3.39% 3.39% dd ld-2.28.so [.] 0x0000000000008b5c
1.69% 1.69% dd [kernel.kallsyms] [k] __alloc_fd
[...]
# Samples: 3K of event 'l1d-access'
# Event count (approx.): 3980
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ......................................
#
26.98% 26.98% dd [kernel.kallsyms] [k] ret_to_user
10.53% 10.53% dd [kernel.kallsyms] [k] fsnotify
7.51% 7.51% dd [kernel.kallsyms] [k] new_sync_read
4.57% 4.57% dd [kernel.kallsyms] [k] vfs_read
4.35% 4.35% dd [kernel.kallsyms] [k] vfs_write
3.69% 3.69% dd [kernel.kallsyms] [k] __fget_light
3.69% 3.69% dd [kernel.kallsyms] [k] rw_verify_area
3.44% 3.44% dd [kernel.kallsyms] [k] security_file_permission
2.76% 2.76% dd [kernel.kallsyms] [k] __fsnotify_parent
2.44% 2.44% dd [kernel.kallsyms] [k] ksys_write
2.24% 2.24% dd [kernel.kallsyms] [k] iov_iter_zero
2.19% 2.19% dd [kernel.kallsyms] [k] read_iter_zero
1.81% 1.81% dd dd [.] 0x0000000000002960
1.78% 1.78% dd dd [.] 0x0000000000002980
[...]
# Samples: 35 of event 'llc-miss'
# Event count (approx.): 35
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ...........................
#
34.29% 34.29% dd [kernel.kallsyms] [k] filemap_map_pages
8.57% 8.57% dd [kernel.kallsyms] [k] unlock_page_memcg
8.57% 8.57% dd [kernel.kallsyms] [k] unmap_page_range
5.71% 5.71% dd [kernel.kallsyms] [k] PageHuge
5.71% 5.71% dd [kernel.kallsyms] [k] release_pages
5.71% 5.71% dd ld-2.28.so [.] 0x0000000000008b5c
2.86% 2.86% dd [kernel.kallsyms] [k] __queue_work
2.86% 2.86% dd [kernel.kallsyms] [k] __radix_tree_lookup
2.86% 2.86% dd [kernel.kallsyms] [k] copy_page
[...]
# Samples: 2 of event 'llc-access'
# Event count (approx.): 2
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. .............
#
50.00% 50.00% dd [kernel.kallsyms] [k] copy_page
50.00% 50.00% dd libc-2.28.so [.] _dl_addr
# Samples: 48 of event 'tlb-miss'
# Event count (approx.): 48
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ..................................
#
20.83% 20.83% dd [kernel.kallsyms] [k] perf_iterate_ctx.constprop.135
12.50% 12.50% dd [kernel.kallsyms] [k] __arch_clear_user
10.42% 10.42% dd [kernel.kallsyms] [k] clear_page
4.17% 4.17% dd [kernel.kallsyms] [k] copy_page
4.17% 4.17% dd [kernel.kallsyms] [k] filemap_map_pages
2.08% 2.08% dd [kernel.kallsyms] [k] __alloc_fd
2.08% 2.08% dd [kernel.kallsyms] [k] __mod_memcg_state.part.70
2.08% 2.08% dd [kernel.kallsyms] [k] __queue_work
2.08% 2.08% dd [kernel.kallsyms] [k] __rcu_read_unlock
2.08% 2.08% dd [kernel.kallsyms] [k] d_path
2.08% 2.08% dd [kernel.kallsyms] [k] destroy_inode
2.08% 2.08% dd [kernel.kallsyms] [k] do_dentry_open
[...]
# Samples: 9K of event 'tlb-access'
# Event count (approx.): 9573
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ......................................
#
25.79% 25.79% dd [kernel.kallsyms] [k] __arch_clear_user
11.22% 11.22% dd [kernel.kallsyms] [k] ret_to_user
8.56% 8.56% dd [kernel.kallsyms] [k] fsnotify
4.06% 4.06% dd [kernel.kallsyms] [k] new_sync_read
3.67% 3.67% dd [kernel.kallsyms] [k] el0_svc_common.constprop.2
3.04% 3.04% dd [kernel.kallsyms] [k] __fsnotify_parent
2.90% 2.90% dd [kernel.kallsyms] [k] vfs_write
2.82% 2.82% dd [kernel.kallsyms] [k] vfs_read
2.52% 2.52% dd libc-2.28.so [.] write
2.26% 2.26% dd [kernel.kallsyms] [k] security_file_permission
2.08% 2.08% dd [kernel.kallsyms] [k] ksys_write
1.96% 1.96% dd [kernel.kallsyms] [k] rw_verify_area
1.95% 1.95% dd [kernel.kallsyms] [k] read_iter_zero
[...]
# Samples: 9 of event 'branch-miss'
# Event count (approx.): 9
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. .........................
#
22.22% 22.22% dd libc-2.28.so [.] _dl_addr
11.11% 11.11% dd [kernel.kallsyms] [k] __arch_clear_user
11.11% 11.11% dd [kernel.kallsyms] [k] __arch_copy_from_user
11.11% 11.11% dd [kernel.kallsyms] [k] __dentry_kill
11.11% 11.11% dd [kernel.kallsyms] [k] __efistub_memcpy
11.11% 11.11% dd ld-2.28.so [.] 0x0000000000012b7c
11.11% 11.11% dd libc-2.28.so [.] 0x000000000002a980
11.11% 11.11% dd libc-2.28.so [.] 0x0000000000083340
# Samples: 29 of event 'remote-access'
# Event count (approx.): 29
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ...........................
#
41.38% 41.38% dd [kernel.kallsyms] [k] filemap_map_pages
10.34% 10.34% dd [kernel.kallsyms] [k] unlock_page_memcg
10.34% 10.34% dd [kernel.kallsyms] [k] unmap_page_range
6.90% 6.90% dd [kernel.kallsyms] [k] release_pages
3.45% 3.45% dd [kernel.kallsyms] [k] PageHuge
3.45% 3.45% dd [kernel.kallsyms] [k] __queue_work
3.45% 3.45% dd [kernel.kallsyms] [k] page_add_file_rmap
3.45% 3.45% dd [kernel.kallsyms] [k] page_counter_try_charge
3.45% 3.45% dd [kernel.kallsyms] [k] page_remove_rmap
3.45% 3.45% dd [kernel.kallsyms] [k] xas_start
3.45% 3.45% dd ld-2.28.so [.] 0x0000000000002a1c
3.45% 3.45% dd ld-2.28.so [.] 0x0000000000008b5c
3.45% 3.45% dd ld-2.28.so [.] 0x00000000000093cc
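The percentages in the 'branch-miss' and 'remote-access' tables above can be related back to sample counts. The following is a minimal sketch, assuming every sample carries a weight of one (the "Event count" equals the sample count for both events, which suggests this). The helper name `share` is hypothetical and is not part of perf.

```python
def share(samples: int, total: int) -> str:
    """Format a symbol's share of all samples the way the report prints it."""
    return f"{100 * samples / total:.2f}%"


# 'branch-miss': 9 samples in total.
print(share(2, 9))    # _dl_addr
print(share(1, 9))    # each of the remaining seven symbols

# 'remote-access': 29 samples in total.
print(share(12, 29))  # filemap_map_pages
print(share(3, 29))   # unlock_page_memcg, unmap_page_range
print(share(2, 29))   # release_pages
print(share(1, 29))   # each of the remaining nine symbols
```

Under that assumption the listed rows account for all samples: 2 + 7 = 9 for 'branch-miss' and 12 + 3 + 3 + 2 + 9 = 29 for 'remote-access'.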
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com >
Tested-by: James Clark <james.clark@arm.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Al Grant <al.grant@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jin Yao <yao.jin@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-4-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2020-06-01 12:24:23 -03:00
Tan Xiaojun
4db25f6693
perf tools: Move arm-spe-pkt-decoder.h/c to the new dir
...
Create a new arm-spe-decoder directory for subsequent extensions and
move arm-spe-pkt-decoder.h/c to this directory. No code changes.
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com >
Tested-by: James Clark <james.clark@arm.com >
Tested-by: Qi Liu <liuqi115@hisilicon.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Al Grant <al.grant@arm.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jin Yao <yao.jin@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-2-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2020-06-01 12:24:23 -03:00
Adrian Hunter
508c71e3f9
perf arm-spe: Implement ->evsel_is_auxtrace() callback
...
Implement ->evsel_is_auxtrace() callback.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Jiri Olsa <jolsa@redhat.com >
Cc: Kim Phillips <kim.phillips@arm.com >
Link: http://lore.kernel.org/lkml/20200401101613.6201-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2020-04-16 12:19:15 -03:00
Arnaldo Carvalho de Melo
87ffb6c640
perf env: Remove needless cpumap.h header
...
Only a 'struct perf_cpu_map' forward declaration is necessary; fix the
places that need the header but were getting it indirectly, by luck,
from env.h.
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lkml.kernel.org/n/tip-3sj3n534zghxhk7ygzeaqlx9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2019-09-20 09:19:21 -03:00
Arnaldo Carvalho de Melo
7ae811b12e
perf tools: Remove needless evlist.h include directives
...
Now that evlist.h isn't included by any other header, we can check where
it is really needed, i.e. we can remove it and be sure that it isn't
being obtained indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lkml.kernel.org/n/tip-6d7kape36m94a266md0d3xbh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2019-08-31 22:24:10 -03:00
Arnaldo Carvalho de Melo
4becb2395f
perf tools: Remove needless thread.h include directives
...
Now that thread.h isn't included by any other header, we can check where
it is really needed, i.e. we can remove it and be sure that it isn't
being obtained indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lkml.kernel.org/n/tip-kh333ivjbw05wsggckpziu86@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2019-08-31 22:24:10 -03:00
Jiri Olsa
72932371e7
libperf: Rename the PERF_RECORD_ structs to have a "perf" prefix
...
Even more, to have a "perf_record_" prefix, so that they match the
PERF_RECORD_ enum they map to.
Signed-off-by: Jiri Olsa <jolsa@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Michael Petlan <mpetlan@redhat.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lkml.kernel.org/r/20190828135717.7245-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2019-08-29 08:36:12 -03:00
Jiri Olsa
9a8dad0419
libperf: Add PERF_RECORD_AUXTRACE_INFO 'struct auxtrace_info_event' to perf/event.h
...
Move the PERF_RECORD_AUXTRACE_INFO event definition to libperf's
event.h.
In order to keep libperf simple, we switch the 'u64/u32/u16/u8' types used
in events to their generic '__u*' versions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Michael Petlan <mpetlan@redhat.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lkml.kernel.org/r/20190828135717.7245-9-jolsa@kernel.org
[ Fix cs_etm__print_auxtrace_info() arg to be __u64 too to fix the CORESIGHT=1 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2019-08-29 08:34:52 -03:00
Arnaldo Carvalho de Melo
7f7c536f23
tools lib: Adopt zalloc()/zfree() from tools/perf
...
Eroding a bit more the tools/perf/util/util.h hodgepodge header.
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2019-07-09 10:13:26 -03:00
Kim Phillips
ffd3d18c20
perf tools: Add ARM Statistical Profiling Extensions (SPE) support
...
'perf record' and 'perf report --dump-raw-trace' are supported in this
release.
Example usage:
# perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
# perf report --dump-raw-trace
Note that the perf.data file is portable, so the report can be run on
another architecture host if necessary.
Output will contain raw SPE data and its textual representation, such
as:
0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1
.
. ... ARM SPE data: size 2097152 bytes
. 00000000: 49 00 LD
. 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0
. 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1
. 00000014: 9a 00 00 LAT 0 XLAT
. 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1
. 00000022: 98 00 00 LAT 0 TOT
. 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000002e: 49 00 LD
. 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80
. 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1
. 00000042: 9a 00 00 LAT 0 XLAT
. 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1
. 00000050: 98 00 00 LAT 0 TOT
. 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000005c: 48 00 INSN-OTHER
. 0000005e: 42 02 EV RETIRED
. 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1
. 00000069: 98 00 00 LAT 0 TOT
. 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481
...
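The byte-to-text mapping in the dump above can be reproduced with a short sketch. It covers only the four 9-byte packets that appear in the dump (PC, VA, PA, TS), and the field layout (little-endian 64-bit payload, address in the low 56 bits, el in bits 62:61, ns in bit 63) is inferred from the sample output rather than taken from the perf sources. The function name `decode_packet` is hypothetical.

```python
ADDR_MASK = (1 << 56) - 1  # low 56 bits: address (assumed from the dump)
NS_BIT = 1 << 63           # top bit: printed as ns=
EL_SHIFT = 61              # bits 62:61: printed as el


def decode_packet(raw: bytes) -> str:
    """Decode one header byte plus an 8-byte little-endian payload."""
    header = raw[0]
    payload = int.from_bytes(raw[1:9], "little")
    ns = 1 if payload & NS_BIT else 0
    if header == 0xB0:  # instruction address
        el = (payload >> EL_SHIFT) & 0x3
        return f"PC {payload & ADDR_MASK:#x} el{el} ns={ns}"
    if header == 0xB2:  # data virtual address, printed unmasked
        return f"VA {payload:#x}"
    if header == 0xB3:  # data physical address
        return f"PA {payload & ADDR_MASK:#x} ns={ns}"
    if header == 0x71:  # timestamp, printed in decimal
        return f"TS {payload}"
    raise ValueError(f"header {header:#x} is not handled by this sketch")


# The first four 9-byte packets from the dump above.
for line in (
    "b2 c0 3b 29 0f 00 00 ff ff",
    "b3 c0 eb 24 fb 00 00 00 80",
    "b0 00 c4 15 08 00 00 ff ff",
    "71 36 6c 21 2c 09 00 00 00",
):
    print(decode_packet(bytes.fromhex(line)))
```

Shorter packets such as LD, LAT and EV use different header and payload sizes and are left out of this sketch.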
Other release notes:
- Applies to acme's perf/{core,urgent} branches, likely elsewhere.
- Report is self-contained within the tool. Record requires enabling the
  kernel SPE driver by setting CONFIG_ARM_SPE_PMU.
- The intel-bts implementation was used as a starting point; its
  min/default/max buffer sizes and power-of-2 pages granularity need to be
  revisited for ARM SPE.
- Recording across multiple SPE clusters/domains is not supported.
- Snapshot support (record -S) and conversion to native perf events
  (e.g., via 'perf inject --itrace') are also not supported.
- Technically both cs-etm and spe can be used simultaneously; however,
  this is disabled for simplicity in this release.
Signed-off-by: Kim Phillips <kim.phillips@arm.com >
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com >
Acked-by: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Pawel Moll <pawel.moll@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Rob Herring <robh@kernel.org >
Cc: Suzuki Poulose <suzuki.poulose@arm.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Wang Nan <wangnan0@huawei.com >
Cc: Will Deacon <will.deacon@arm.com >
Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2018-01-17 10:23:31 -03:00