Ian Rogers
f793ae185e
perf jevents: Remove the type/version variables
...
pmu_events_map has a type variable that is always initialized to "core"
and a version variable that is never read. Remove these from the API as
it is straightforward to add them back when necessary.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 14:58:40 -03:00
Ian Rogers
099b157c08
perf jevent: Add an 'all' architecture argument
...
When 'all' is passed as the architecture, generate a mapping table for
all architectures. This simplifies testing. To identify the table for an
architecture, add an arch variable to the pmu_events_map.
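A minimal sketch of how an arch field lets one combined table serve several architectures. The struct, the function name find_map(), and the sample arch/cpuid strings are hypothetical stand-ins for illustration, not the perf definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a pmu_events_map entry. */
struct events_map_sketch {
	const char *arch;   /* architecture this entry belongs to */
	const char *cpuid;  /* CPU identifier the table applies to */
	const char *table;  /* placeholder for the real event table */
};

/* Sample data only; an entry with a NULL arch terminates the array. */
static const struct events_map_sketch all_maps[] = {
	{ "arm64", "cpu-a", "arm64 events" },
	{ "x86",   "cpu-b", "x86 events" },
	{ NULL, NULL, NULL },
};

/* Return the entry matching both arch and cpuid, or NULL if none does. */
static const struct events_map_sketch *find_map(const char *arch,
						const char *cpuid)
{
	const struct events_map_sketch *m;

	for (m = all_maps; m->arch; m++) {
		if (!strcmp(m->arch, arch) && !strcmp(m->cpuid, cpuid))
			return m;
	}
	return NULL;
}
```

With every architecture in one array, a test can look up any arch/cpuid pair regardless of the host it runs on.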
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 14:58:20 -03:00
Yang Li
8d33834f9f
perf stat: Remove duplicated include in builtin-stat.c
...
util/topdown.h is included twice in builtin-stat.c,
remove one of them.
Reported-by: Abaci Robot <abaci@linux.alibaba.com >
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1818
Link: https://lore.kernel.org/r/20220804005213.71990-1-yang.lee@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:51:31 -03:00
shaomin Deng
0029e8ace1
perf scripting python: Delete repeated word in comments
...
Delete the repeated word "into" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807160239.474-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:45:53 -03:00
shaomin Deng
987f5cbd2f
perf tools: Fix double word in comments
...
Delete the repeated word "to" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807155549.30953-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:45:24 -03:00
shaomin Deng
632f5c224e
perf trace: Fix double word in comments
...
Delete repeated word "and" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807084629.23121-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:44:56 -03:00
shaomin Deng
ae4e4a0ba3
perf script: Delete repeated word "from"
...
Delete the repeated word "from" in code.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807080642.13004-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:44:30 -03:00
shaomin Deng
f3c96bec7c
perf test: Fix double word in comments
...
Delete the redundant word "then" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807074753.7857-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:42:55 -03:00
Martin Liška
1bf7d836e5
perf record: Improve error message of -p not_existing_pid
...
When one uses -p $not_existing_pid, the output of --help is printed:
$ perf record -p 123456789 2>&1 | head -n3
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
Let's change it to something similar to what perf top -p $not_existing_pid
prints:
$ ./perf top -p 123456789 --stdio
Error:
Couldn't create thread/CPU maps: No such process
Newly suggested error message:
$ ./perf record -p 123456789
Couldn't create thread/CPU maps: No such process
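The behaviour shown above, reporting the errno text instead of the usage screen, can be sketched as follows. This is an illustration under stated assumptions: target_pid_error() is a hypothetical helper, and probing the PID with kill(pid, 0) is a substitute technique chosen for brevity, not necessarily how perf detects the missing process.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return NULL if the target PID exists, otherwise a
 * message in the style of the new 'perf record' error output.
 */
static const char *target_pid_error(pid_t pid)
{
	static char msg[128];

	/* Signal 0 only checks for existence; EPERM means it exists. */
	if (kill(pid, 0) == 0 || errno == EPERM)
		return NULL;

	snprintf(msg, sizeof(msg), "Couldn't create thread/CPU maps: %s",
		 strerror(errno));
	return msg;
}
```

A caller would print the returned message and exit with an error rather than falling through to the option parser's usage text.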
Signed-off-by: Martin Liška <mliska@suse.cz >
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com >
Link: http://lore.kernel.org/lkml/8e00eda1-4de0-2c44-ce67-d4df48ac1f7c@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 10:27:21 -03:00
Martin Liška
a072a7a026
perf build-id: Print debuginfod queries if -v option is used
...
When ending a 'perf record' session, querying a debuginfod server
can take quite some time. Inform the user about it when the -v option is
used.
Signed-off-by: Martin Liška <mliska@suse.cz >
Link: http://lore.kernel.org/lkml/325871cf-b71f-6237-8793-82182272ece8@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 10:27:20 -03:00
Martin Liška
34575ded68
perf build-id: Fix coding style, replace 8 spaces by tabs
...
Use tabs instead of 8 spaces for the indentation.
Signed-off-by: Martin Liška <mliska@suse.cz >
Link: http://lore.kernel.org/lkml/2983e2e0-6850-ad59-79d8-efe83b22cffe@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 10:22:49 -03:00
Leo Yan
e754dd7e8b
perf c2c: Update documentation for new display option 'peer'
...
Now that the new display option 'peer' has been introduced, update the
documentation to reflect it.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-16-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:32 -03:00
Leo Yan
ead42a0f9b
perf c2c: Use 'peer' as default display for Arm64
...
Since the Arm64 arch doesn't support HITM flags, this patch uses
'peer' as the default display if the user doesn't specify a type; other
arches still use 'tot' as the default display type if the user doesn't
specify one.
This patch also calls perf_session__new() earlier, so the session
environment is initialized ahead of time and the arch info can be used
for setting the display type.
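The selection rule described above can be sketched roughly as follows. pick_display() is a hypothetical helper, and the arch string "arm64" is an assumption about how the session environment names the architecture.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: an explicit user choice always wins; otherwise
 * fall back to 'peer' on Arm64 and 'tot' everywhere else.
 */
static const char *pick_display(const char *user_choice, const char *arch)
{
	if (user_choice)
		return user_choice;
	if (arch && strcmp(arch, "arm64") == 0)
		return "peer";
	return "tot";
}
```

The arch argument is only available once the session has been opened, which is why the default cannot be chosen before the session is created.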
Suggested-by: Ali Saidi <alisaidi@amazon.com >
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-15-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:31 -03:00
Leo Yan
f37c5d914e
perf c2c: Sort on peer snooping for load operations
...
This patch adds a new option 'peer' so that we can sort on the cache hits
for peer snooping.
When displaying with option 'peer', the "Shared Data Cache Line Table"
and "Shared Cache Line Distribution Pareto" are both sorted by the metric
"tot_peer".
As a result, we get the 'peer' display:
# perf c2c report -d peer --coalesce tid,pid,iaddr,dso -N --stdio
=================================================
Shared Data Cache Line Table
=================================================
#
# ----------- Cacheline ---------- Peer ------- Load Peer ------- Total Total Total --------- Stores -------- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ----
# Index Address Node PA cnt Snoop Total Local Remote records Loads Stores L1Hit L1Miss N/A FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........
#
0 0xaaaac17d6000 N/A 0 100.00% 99 99 0 18851 18851 0 0 0 0 0 18752 0 99 0 0 0 0 0
=================================================
Shared Cache Line Distribution Pareto
=================================================
#
# -- Peer Snoop -- ------- Store Refs ------ --------- Data address --------- ---------- cycles ---------- Total cpu Shared
# Num Rmt Lcl L1 Hit L1 Miss N/A Offset Node PA cnt Pid Tid Code address rmt peer lcl peer load records cnt Symbol Object Source:Line Node{cpus %peers %stores}
# ..... ....... ....... ....... ....... ....... .................. .... ...... ....... ................. .................. ........ ........ ........ ....... ........ ...................... ................ ............... ....
#
----------------------------------------------------------------------
0 0 99 0 0 0 0xaaaac17d6000
----------------------------------------------------------------------
0.00% 3.03% 0.00% 0.00% 0.00% 0x20 N/A 0 3603 3603:memstress 0xaaaac17c25ac 0 376 41 9314 2 [.] 0x00000000000025ac memstress memstress[25ac] 0{ 2 100.0% n/a}
0.00% 3.03% 0.00% 0.00% 0.00% 0x20 N/A 0 3603 3606:memstress 0xaaaac17c25ac 0 375 44 9155 1 [.] 0x00000000000025ac memstress memstress[25ac] 0{ 1 100.0% n/a}
0.00% 48.48% 0.00% 0.00% 0.00% 0x29 N/A 0 3603 3606:memstress 0xaaaac17c3e88 0 180 170 65 1 [.] 0x0000000000003e88 memstress memstress[3e88] 0{ 1 100.0% n/a}
0.00% 45.45% 0.00% 0.00% 0.00% 0x29 N/A 0 3603 3603:memstress 0xaaaac17c3e88 0 180 175 70 2 [.] 0x0000000000003e88 memstress memstress[3e88] 0{ 2 100.0% n/a}
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-14-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:25 -03:00
Leo Yan
faa30dfab5
perf c2c: Refactor display string
...
The display type is shown by combining the display string array with a
suffix string "HITMs", which makes it hard to extend the display to other
sorting types (e.g. an extension for peer operations).
This patch moves the suffix string "HITMs" into the display string array
for HITM types, so that new display types need not output the string
"HITMs".
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-13-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:25 -03:00
Leo Yan
7c10b65a42
perf c2c: Refactor node header
...
The node header array contains 3 items; each item is used for one of
the 3 flavors of node access info. To extend sorting to other
snooping types and not always stick to HITMs, the second header string
"Node{cpus %hitms %stores}" should be adjustable (e.g. changed to
"Node{cpus %peer %stores}").
For this reason, this patch changes the node header array to three
flat variables and uses switch-case in function setup_nodes_header(),
which makes it easier to alter the header string.
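A rough sketch of the switch-case approach described above. The enum names and node_header() are hypothetical; only the two "Node{cpus ...}" strings come from this commit message, and the other two header strings are assumed for illustration.

```c
#include <stddef.h>

/* Hypothetical names for the three node info flavors. */
enum node_info_sketch { NODE_PLAIN, NODE_PERCENT, NODE_CPU_LIST };
/* Hypothetical display kinds: sort on HITM or on peer snooping. */
enum node_display_sketch { SHOW_HITM, SHOW_PEER };

/* Pick the header per flavor; only the second one depends on the display. */
static const char *node_header(enum node_info_sketch info,
			       enum node_display_sketch display)
{
	switch (info) {
	case NODE_PLAIN:
		return "Node";                  /* assumed string */
	case NODE_PERCENT:
		return display == SHOW_PEER ?
			"Node{cpus %peer %stores}" :
			"Node{cpus %hitms %stores}";
	case NODE_CPU_LIST:
		return "Node{cpu list}";        /* assumed string */
	}
	return NULL;
}
```

With flat strings and a switch, adding another snooping type only touches the NODE_PERCENT case instead of reshaping an array.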
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-12-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:24 -03:00
Leo Yan
2be0bc7529
perf c2c: Rename dimension from 'percent_hitm' to 'percent_costly_snoop'
...
Use more general naming for the main sort dimension so that we are not
limited to sorting on the HITM snoop type and can extend it to support
other costly snooping operations. So rename the dimension to use the
prefix 'percent_costly_'.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-11-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:22 -03:00
Leo Yan
c82ccc3a3d
perf c2c: Use explicit names for display macros
...
The perf c2c tool assumes it can depend heavily on the HITM snoop
type to detect cache false sharing; unfortunately, HITM is not supported
on some architectures.
Essentially, the perf c2c tool wants to find very costly snooping
operations that indicate false cache sharing, which means it need not
stick to HITM tags and we can explore other snooping types
(e.g. SNOOPX_PEER).
For this reason, this patch renames HITM related display macros with the
suffix '_HITM', so they stay distinct if more display types are added
later for other snooping types.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-10-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:21 -03:00
Leo Yan
682352e59b
perf c2c: Add mean dimensions for peer operations
...
This patch adds two dimensions for the mean value of peer operations.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-9-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:20 -03:00
Leo Yan
9082282fce
perf c2c: Add dimensions of peer metrics for cache line view
...
This patch adds dimensions for peer ops, which will be used for the
Shared Cache Line Distribution Pareto.
It adds the percentage dimensions for local and remote peer operations,
and the dimensions for accounting operation numbers, which are used in
stdio mode.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-8-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:19 -03:00
Leo Yan
63e74ab5e4
perf c2c: Add dimensions for peer load operations
...
This patch adds three dimensions for peer load operations of 'lcl_peer',
'rmt_peer' and 'tot_peer'. These three dimensions will be used in the
shared data cache line table.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-7-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:18 -03:00
Leo Yan
3ef1fc17b3
perf c2c: Output statistics for peer snooping
...
This patch outputs statistics for peer snooping for the whole trace
events and the global shared cache line.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:16 -03:00
Leo Yan
e843dec53a
perf mem: Add statistics for peer snooping
...
The flag PERF_MEM_SNOOPX_PEER was added to support cache snooping
from a peer cache line, which can come from a peer core, a peer cluster,
or a remote NUMA node.
This patch adds statistics for the flag PERF_MEM_SNOOPX_PEER. Note that
we treat PERF_MEM_SNOOPX_PEER as affiliated info that needs to cooperate
with cache level statistics. Therefore, when the flag is set, we account
the load operations in both the cache level's metrics (e.g. ld_l2hit,
ld_llchit, etc.) and the peer related metrics.
So three new metrics are introduced: 'lcl_peer' is for local cache
access, 'rmt_peer' is for remote access (includes remote DRAM and any
caches in a remote node), and 'tot_peer' is the sum of 'lcl_peer' and
'rmt_peer'.
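The accounting rule above can be sketched as follows. The struct and account_load() are hypothetical, and ld_cache_hit is a single stand-in for the per-level counters such as ld_l2hit and ld_llchit.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of counters for illustration. */
struct peer_stats_sketch {
	unsigned int ld_cache_hit; /* stand-in for ld_l2hit, ld_llchit, ... */
	unsigned int lcl_peer;     /* peer snoops served locally */
	unsigned int rmt_peer;     /* peer snoops served by a remote node */
	unsigned int tot_peer;     /* lcl_peer + rmt_peer */
};

/*
 * Account one load. The cache level counter is always updated; the peer
 * counters are updated in addition when the peer snoop flag is set.
 */
static struct peer_stats_sketch account_load(struct peer_stats_sketch s,
					     bool remote, bool snoop_peer)
{
	s.ld_cache_hit++;
	if (snoop_peer) {
		if (remote)
			s.rmt_peer++;
		else
			s.lcl_peer++;
		s.tot_peer++;
	}
	return s;
}
```

Because the peer counters are additive to the cache level ones, a peer load shows up in both groups of columns, which matches the "affiliated info" wording above.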
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:12 -03:00
Ali Saidi
4e6430cbb1
perf arm-spe: Use SPE data source for neoverse cores
...
When synthesizing data from SPE, augment the type with source information
for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use
the same encoding. I can't find encoding information for any other SPE
implementations to unify their choices with Arm's thus that is left for
future work.
This change populates the mem_lvl_num for Neoverse cores as well as the
deprecated mem_lvl namespace.
Reviewed-by: German Gomez <german.gomez@arm.com >
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Ali Saidi <alisaidi@amazon.com >
Tested-by: Leo Yan <leo.yan@linaro.org >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-4-leo.yan@linaro.org
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:01 -03:00
Leo Yan
f78d6250db
perf mem: Print snoop peer flag
...
Since PERF_MEM_SNOOPX_PEER flag is a new snoop type, print this flag if
it is set.
Before:
memstress 3603 [020] 122.463754: 1 l1d-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 l1d-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 llc-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 llc-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 tlb-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 memory: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
After:
memstress 3603 [020] 122.463754: 1 l1d-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 l1d-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 llc-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 llc-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 tlb-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
memstress 3603 [020] 122.463754: 1 memory: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
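The change from "SNP N/A" to "SNP Peer" in the output above can be sketched as a flag-to-label lookup. The SKETCH_* bit values and snoopx_name() are hypothetical; the real flag values live in the perf_event UAPI header, and this sketch returns a single label even if several flags are set.

```c
/* Hypothetical bit values, for illustration only. */
#define SKETCH_SNOOPX_FWD  0x01u
#define SKETCH_SNOOPX_PEER 0x02u

/* Map the extended snoop bits to the label printed after "SNP". */
static const char *snoopx_name(unsigned int snoopx)
{
	if (snoopx & SKETCH_SNOOPX_PEER)
		return "Peer";
	if (snoopx & SKETCH_SNOOPX_FWD)
		return "Fwd";
	return "N/A"; /* no known snoop flag set */
}
```

Before the patch, a record carrying only the peer bit fell through to the "N/A" branch because no label existed for it.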
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Reviewed-by: Kajol Jain <kjain@linux.ibm.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:11:36 -03:00
Ali Saidi
2e21bcf051
perf tools: Sync addition of PERF_MEM_SNOOPX_PEER
...
Add a flag to the 'perf mem' data struct to signal that a request caused
a cache-to-cache transfer of a line from a peer of the requestor and
wasn't sourced from a lower cache level.
The line being moved from one peer cache to another has latency and
performance implications.
On Arm64 Neoverse systems the data source can indicate a cache-to-cache
transfer but not if the line is dirty or clean, so instead of
overloading HITM define a new flag that indicates this type of transfer.
Committer notes:
This really is not syncing with the kernel since the patch to the kernel
wasn't merged.
But we're going ahead with this as it seems trivial: it is just a matter
of the perf kernel maintainers giving their ack, or of us finding
another way of expressing this in the perf records synthesized in
userspace from the ARM64 hardware traces.
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Ali Saidi <alisaidi@amazon.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-2-leo.yan@linaro.org
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:11:36 -03:00
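The reasoning in commit 2e21bcf051 above (use a separate flag for a peer cache-to-cache transfer instead of overloading HITM when the dirty state is unknown) can be illustrated with a small sketch. The enum names, bit values and the `example_snoop_flags()` helper are hypothetical and chosen only for this example; the real flag definitions live in the perf_event uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only, not the real uapi encoding. */
enum example_snoop {
	EXAMPLE_SNOOP_HITM = 1u << 0, /* line known to be modified in another cache */
	EXAMPLE_SNOOP_PEER = 1u << 1, /* line came from a peer cache, dirty state not claimed */
};

/* Hypothetical helper: pick the snoop flag for one sampled access. */
static uint32_t example_snoop_flags(bool peer_transfer, bool dirty_known, bool dirty)
{
	/* A dirty bit only means something when the hardware reports it. */
	assert(dirty_known || !dirty);

	if (peer_transfer && dirty_known && dirty)
		return EXAMPLE_SNOOP_HITM;
	if (peer_transfer)
		return EXAMPLE_SNOOP_PEER; /* do not claim HITM without evidence */
	return 0;
}
```

With hardware that cannot report the dirty state, every peer transfer maps to the PEER flag in this sketch, so consumers never see a HITM that the data source could not justify.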
Leo Yan
4a88c4ec3c
perf arm64: Add missing -I for tools/arch/arm64/include/ to find asm/sysreg.h when building arm_spe.h
...
This cures a current problem where tools/perf/util/arm-spe.c isn't
finding an ARM64-specific asm header, so let's add it for now to make
progress.
Adding a .o-specific rule seems clunky; let's try to find out whether this is
really the right solution.
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com >
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: James Morse <james.morse@arm.com >
Link: https://lore.kernel.org/lkml/20220811124825.GA868014@leoy-huanghe.lan
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:11:36 -03:00
Adrian Hunter
53e76d35f7
perf tools: Tidy guest option documentation
...
Move common guest options into include files. Use attribute substitution to
customize an example, using "[verse]" to define the block instead of a
"literal" block which does not permit substitution.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220811170411.84154-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 18:50:17 -03:00
Adrian Hunter
d9ca43c06f
perf inject: Fix missing guestmount option documentation
...
The 'perf inject' documentation is missing the guestmount option. Add it.
Fixes: 97406a7e4f ("perf inject: Add support for injecting guest sideband events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220811170411.84154-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 18:49:57 -03:00
Adrian Hunter
696d0a4cb8
perf script: Fix missing guest option documentation
...
The 'perf script' documentation is missing several options relating to
guests. Add them.
Fixes: 15a108af1a ("perf script: Allow specifying the files to process guest samples")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220811170411.84154-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 18:49:38 -03:00
Namhyung Kim
ade1d0307b
perf offcpu: Update offcpu test for child process
...
Record off-cpu data with the perf bench sched messaging workload and count
the number of offcpu-time events. Also update the test script to skip
the remaining tests once one has failed, and revise the error messages.
$ sudo ./perf test offcpu -v
88: perf record offcpu profiling tests :
--- start ---
test child forked, pid 344780
Checking off-cpu privilege
Basic off-cpu test
Basic off-cpu test [Success]
Child task off-cpu test
Child task off-cpu test [Success]
test child finished with 0
---- end ----
perf record offcpu profiling tests: Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Cc: Blake Jones <blakejones@google.com >
Cc: Hao Luo <haoluo@google.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@kernel.org >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Milian Wolff <milian.wolff@kdab.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Song Liu <songliubraving@fb.com >
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 17:57:45 -03:00
Namhyung Kim
d23477637a
perf offcpu: Track child processes
...
When the -p option is used or a workload is given, perf needs to handle
child processes. The perf_event can inherit those task events
automatically. We can add a new BPF program on the task_newtask
tracepoint to track child processes.
Before:
$ sudo perf record --off-cpu -- perf bench sched messaging
$ sudo perf report --stat | grep -A1 offcpu
offcpu-time stats:
SAMPLE events: 1
After:
$ sudo perf record -a --off-cpu -- perf bench sched messaging
$ sudo perf report --stat | grep -A1 offcpu
offcpu-time stats:
SAMPLE events: 856
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Cc: Blake Jones <blakejones@google.com >
Cc: Hao Luo <haoluo@google.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@kernel.org >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Milian Wolff <milian.wolff@kdab.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Song Liu <songliubraving@fb.com >
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 17:57:34 -03:00
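The child-tracking idea in commit d23477637a above can be sketched in plain userspace C. This is not the BPF program from the patch: `struct tracked_set` is a hypothetical stand-in for a BPF map keyed by process id, and `on_new_task()` models what a handler attached to a new-task event might do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 64

/* Hypothetical stand-in for a BPF hash map keyed by process id. */
struct tracked_set {
	int ids[MAX_TRACKED];
	size_t count;
};

static bool tracked_contains(const struct tracked_set *s, int id)
{
	assert(s != NULL);
	for (size_t i = 0; i < s->count; i++)
		if (s->ids[i] == id)
			return true;
	return false;
}

/* Returns false only when the set is full. */
static bool tracked_add(struct tracked_set *s, int id)
{
	assert(s != NULL);
	if (tracked_contains(s, id))
		return true;
	if (s->count == MAX_TRACKED)
		return false;
	s->ids[s->count++] = id;
	return true;
}

/* On a new-task event: a child of a tracked process becomes tracked too. */
static void on_new_task(struct tracked_set *s, int parent_id, int child_id)
{
	assert(s != NULL);
	if (tracked_contains(s, parent_id))
		tracked_add(s, child_id);
}
```

Because the child is added as soon as it is created, grandchildren are picked up the same way, which is consistent with the jump in SAMPLE events reported in the commit message.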
Namhyung Kim
d6f415ca33
perf offcpu: Parse process id separately
...
The current target code uses thread id for tracking tasks because
perf_events need to be opened for each task. But we can use tgid in
BPF maps and check it easily.
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Cc: Blake Jones <blakejones@google.com >
Cc: Hao Luo <haoluo@google.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@kernel.org >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Milian Wolff <milian.wolff@kdab.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Song Liu <songliubraving@fb.com >
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 17:57:11 -03:00
Namhyung Kim
07fc958b0c
perf offcpu: Check process id for the given workload
...
The current task filter checks task->pid, which is different for each
thread. But we want to profile all the threads in the process. So
let's compare the process id (or thread-group id: tgid) instead.
Before:
$ sudo perf record --off-cpu -- perf bench sched messaging -t
$ sudo perf report --stat | grep -A1 offcpu
offcpu-time stats:
SAMPLE events: 2
After:
$ sudo perf record --off-cpu -- perf bench sched messaging -t
$ sudo perf report --stat | grep -A1 offcpu
offcpu-time stats:
SAMPLE events: 850
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Cc: Blake Jones <blakejones@google.com >
Cc: Hao Luo <haoluo@google.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@kernel.org >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Milian Wolff <milian.wolff@kdab.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Song Liu <songliubraving@fb.com >
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 17:56:47 -03:00
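The difference between the two filters described in commit 07fc958b0c above can be shown with a minimal sketch. `struct task_ids` and both filter functions are hypothetical names for illustration; the only assumption taken from the commit message is that pid is unique per thread while tgid is shared by all threads of a process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two ids kept per task. */
struct task_ids {
	int pid;  /* unique per thread */
	int tgid; /* shared by every thread of a process */
};

/* Thread-level filter: matches only the one thread whose pid was recorded. */
static bool filter_by_pid(const struct task_ids *t, int target)
{
	assert(t != NULL);
	return t->pid == target;
}

/* Process-level filter: matches every thread in the target process. */
static bool filter_by_tgid(const struct task_ids *t, int target)
{
	assert(t != NULL);
	return t->tgid == target;
}
```

For a leader `{pid 100, tgid 100}` and a worker thread `{pid 101, tgid 100}`, the pid filter with target 100 rejects the worker while the tgid filter accepts both.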
Adrian Hunter
806731a946
perf tools: Do not pass NULL to parse_events()
...
Many cases do not use the extra error information provided by
parse_events and instead pass NULL as the struct parse_events_error
pointer. Add a wrapper for those cases so that the pointer is never
NULL.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220809080702.6921-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 14:30:09 -03:00
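The wrapper approach described in commit 806731a946 above can be sketched as follows. All names here (`example_parse_full()`, `example_parse()`, `struct example_parse_error`) are hypothetical and the parser is a toy; the point is only that callers who do not want error details go through a wrapper that supplies its own error struct, so the full function never sees NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error record; the real struct carries more detail. */
struct example_parse_error {
	int num_errors;
};

/* Full variant: err must never be NULL. An empty string is the toy error case. */
static int example_parse_full(const char *str, struct example_parse_error *err)
{
	assert(str != NULL && err != NULL);
	if (str[0] == '\0') {
		err->num_errors++;
		return -1;
	}
	return 0;
}

/* Convenience wrapper for callers that ignore the error details. */
static int example_parse(const char *str)
{
	struct example_parse_error err = { .num_errors = 0 };

	return example_parse_full(str, &err);
}
```

Keeping the NULL handling out of the full function means its error paths can write to `err` unconditionally.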
Adrian Hunter
1da1d60774
perf tests: Fix Track with sched_switch test for hybrid case
...
If the cpu_core PMU event fails to parse, also try the cpu_atom PMU event
when parsing the cycles event.
Fixes: 43eb05d066 ("perf tests: Support 'Track with sched_switch' test for hybrid")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jin Yao <yao.jin@linux.intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220809080702.6921-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 14:29:46 -03:00
Adrian Hunter
2e828582b8
perf parse-events: Fix segfault when event parser gets an error
...
parse_events() is often called with parse_events_error set to NULL.
Make parse_events_error__handle() not segfault in that case.
A subsequent patch avoids passing NULL in the first place.
Fixes: 43eb05d066 ("perf tests: Support 'Track with sched_switch' test for hybrid")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Jin Yao <yao.jin@linux.intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220809080702.6921-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 14:29:23 -03:00
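The defensive fix in commit 2e828582b8 above amounts to an early return when no error struct was supplied. A minimal sketch, with a hypothetical `struct example_parse_error` and handler name rather than the real perf definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error record used only for this illustration. */
struct example_parse_error {
	int idx;       /* position of the error in the event string */
	char msg[64];  /* human-readable description */
	int num_errors;
};

/* Record an error, tolerating callers that passed no error struct. */
static void example_error_handle(struct example_parse_error *err, int idx,
				 const char *msg)
{
	assert(msg != NULL);
	if (err == NULL) /* caller did not ask for details: nothing to fill in */
		return;

	err->idx = idx;
	snprintf(err->msg, sizeof(err->msg), "%s", msg);
	err->num_errors++;
}
```

Calling `example_error_handle(NULL, 3, "bad term")` is then a harmless no-op instead of a NULL dereference.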
Adrian Hunter
b39c9e1b10
perf machine: Fix missing free of machine->kallsyms_filename
...
Add missing free of machine->kallsyms_filename to machine__exit().
Fixes: a5367ecb53 ("perf tools: Automatically use guest kcore_dir if present")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220809130758.12800-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Adrian Hunter
0c39f14714
perf script: Fix reference to perf insert instead of perf inject
...
Amend "perf insert" to "perf inject".
Fixes: e28fb159f1 ("perf script: Add machine_pid and vcpu")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Link: https://lore.kernel.org/r/20220809123258.9086-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Yang Jihong
628881ee06
perf sched latency: Fix subcommand matching error
...
'perf sched latency' uses strncmp() to match subcommands, so invalid
subcommands such as 'lat1234' are accepted.
Before:
# perf sched lat1234 >/dev/null
# echo $?
0
#
Solution: Use strstarts() to match subcommands.
After:
# perf sched lat1234
Usage: perf sched [<options>] {record|latency|map|replay|script|timehist}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
# echo $?
129
#
# perf sched lat >/dev/null
# echo $?
0
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com >
Acked-by: Namhyung Kim <namhyung@kernel.org >
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220808092408.107399-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
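The matching problem fixed in commit 628881ee06 above (and in the 'perf kvm' commit that follows) can be reproduced with a short sketch. The `strstarts()` helper below is a local definition with the usual "does str begin with prefix" contract; the minimum abbreviation length of three characters is an assumption based on the `lat` example, and the two `match_latency_*()` functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local helper: true when str begins with prefix. */
static bool strstarts(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Old-style check: only the first three characters are compared,
 * so trailing garbage such as "lat1234" is accepted. */
static bool match_latency_old(const char *arg)
{
	assert(arg != NULL);
	return strncmp(arg, "lat", 3) == 0;
}

/* Stricter check: arg must be an abbreviation (prefix) of "latency"
 * that is at least three characters long (length is an assumption). */
static bool match_latency_new(const char *arg)
{
	assert(arg != NULL);
	return strlen(arg) >= 3 && strstarts("latency", arg);
}
```

Note the argument order in the stricter check: the full subcommand name is the string and the user input is the prefix, so `lat` and `latency` match while `lat1234` does not.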
Yang Jihong
d2f30b793e
perf kvm: Fix subcommand matching error
...
Currently the 'diff', 'top', 'buildid-list' and 'stat' perf commands use
strncmp() to match subcommands. As a result, matching does not meet
expectations.
For example:
# perf kvm diff1234
# Event 'cycles'
#
# Baseline Delta Abs Shared Object Symbol
# ........ ......... ............. ......
#
# Event 'dummy:HG'
#
# Baseline Delta Abs Shared Object Symbol
# ........ ......... ............. ......
#
# echo $?
0
#
An error should be reported, but success is actually returned.
Solution: Use strstarts() to match subcommands.
After:
# perf kvm diff1234
Usage: perf kvm [<options>] {top|record|report|diff|buildid-list|stat}
-i, --input <file> Input file name
-o, --output <file> Output file name
-v, --verbose be more verbose (show counter open errors, etc)
--guest Collect guest os data
--guest-code Guest code can be found in hypervisor process
--guestkallsyms <file>
file saving guest os /proc/kallsyms
--guestmodules <file>
file saving guest os /proc/modules
--guestmount <directory>
guest mount directory under which every guest os instance has a subdir
--guestvmlinux <file>
file saving guest os vmlinux
--host Collect host os data
# echo $?
129
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com >
Acked-by: Namhyung Kim <namhyung@kernel.org >
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220808092408.107399-2-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Christophe JAILLET
4bf6dcaa93
perf probe: Fix an error handling path in 'parse_perf_probe_command()'
...
If a memory allocation fails, we should branch to the error handling path
in order to free the resources allocated a few lines above.
Fixes: 15354d5469 ("perf probe: Generate event name with line number")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr >
Acked-by: Masami Hiramatsu <mhiramat@kernel.org >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: kernel-janitors@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/b71bcb01fa0c7b9778647235c3ab490f699ba278.1659797452.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
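The pattern behind commit 4bf6dcaa93 above (branch to a cleanup label when a later allocation fails, instead of returning directly) can be sketched like this. The struct, the function names and the `fail_second` test hook are hypothetical; this is the general goto-cleanup idiom, not the code from parse_perf_probe_command().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object holding two separately allocated strings. */
struct example_probe {
	char *event;
	char *group;
};

/* Duplicate a string, or simulate an allocation failure when fail is set. */
static char *dup_or_fail(const char *s, int fail)
{
	size_t len;
	char *copy;

	if (fail)
		return NULL;
	len = strlen(s) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

static int example_probe_init(struct example_probe *p, const char *event,
			      const char *group, int fail_second)
{
	assert(p != NULL && event != NULL && group != NULL);
	p->event = NULL;
	p->group = NULL;

	p->event = dup_or_fail(event, 0);
	if (!p->event)
		return -1; /* nothing allocated yet, plain return is fine */

	p->group = dup_or_fail(group, fail_second);
	if (!p->group)
		goto err_free_event; /* must release the first allocation */

	return 0;

err_free_event:
	free(p->event);
	p->event = NULL;
	return -1;
}
```

Returning -1 directly from the second check would leak `p->event`, which is the class of bug the commit message describes.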
Brian Robbins
46f7bd5e1b
perf inject jit: Ignore memfd and anonymous mmap events if jitdump present
...
Some processes store jitted code in memfd mappings to avoid having rwx
mappings. These processes map the code with a writeable mapping and a
read-execute mapping. They write the code using the writeable mapping
and then unmap the writeable mapping. All subsequent execution is
through the read-execute mapping.
perf inject --jit ignores //anon* mappings for each process where a
jitdump is present because it expects to inject mmap events for each
jitted code range, and said jitted code ranges will overlap with the
//anon* mappings.
Ignore /memfd: and [anon:* mappings so that jitted code contained in
/memfd: and [anon:* mappings is treated the same way as jitted code
contained in //anon* mappings.
Signed-off-by: Brian Robbins <brianrob@linux.microsoft.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220805220645.95855-1-brianrob@linux.microsoft.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
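The filtering rule described in commit 46f7bd5e1b above can be sketched as a prefix check on the mapping name. The three prefixes come from the commit message; the function names are hypothetical and the sample mapping names in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when s begins with prefix. */
static bool has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Hypothetical helper: should this mmap event be ignored because the
 * jitted code it covers will be described by injected mmap events? */
static bool example_is_jit_backing_map(const char *filename)
{
	static const char *const prefixes[] = { "//anon", "/memfd:", "[anon:" };

	assert(filename != NULL);
	for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++)
		if (has_prefix(filename, prefixes[i]))
			return true;
	return false;
}
```

Under this sketch, names such as `//anon`, `/memfd:jit-cache (deleted)` and `[anon:jit]` are all treated alike, while a regular file path like `/usr/lib/libc.so.6` is left untouched.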
Thomas Richter
e0b23af82d
perf list: Add PMU pai_crypto event description for IBM z16
...
Add the event description for the IBM z16 pai_crypto PMU released with
commit 1bf54f32f525 ("s390/pai: Add support for cryptography counters").
The document SA22-7832-13 "z/Architecture Principles of Operation",
published May 2022, contains the description of the
Processor Activity Instrumentation Facility and the cryptography
counter set. See pages 5-110 to 5-113.
Patch reworked to fit the converted jevents processing.
Committer notes:
Couldn't find 1bf54f32f525 ("s390/pai: Add support for cryptography
counters") in torvalds/master, in what tree is that cset?
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com >
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com >
Cc: Heiko Carstens <hca@linux.ibm.com >
Cc: Sven Schnelle <svens@linux.ibm.com >
Cc: Vasily Gorbik <gor@linux.ibm.com >
Link: https://lore.kernel.org/r/20220804075221.1132849-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Ian Rogers
b48ddbbb99
perf vendor events: Remove bad jaketown uncore events
...
The event converter scripts at:
https://github.com/intel/event-converter-for-linux-perf
pass Filter values from data on 01.org that are bogus in a perf command
line and can cause perf to recurse infinitely when parsing events. Remove
such events or filters using the updated patch:
afd779df99
Fixes: 376d8b581b ("perf vendor events: Update Intel jaketown")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com >
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20220805013856.1842878-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Ian Rogers
22de36ff2c
perf vendor events: Remove bad ivytown uncore events
...
The event converter scripts at:
https://github.com/intel/event-converter-for-linux-perf
pass Filter values from data on 01.org that are bogus in a perf command
line and can cause perf to recurse infinitely when parsing events. Remove
such events or filters using the updated patch:
afd779df99
Fixes: 6220136831 ("perf vendor events: Update Intel ivytown")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com >
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20220805013856.1842878-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Ian Rogers
2c98bacfd7
perf vendor events: Remove bad broadwellde uncore events
...
The event converter scripts at:
https://github.com/intel/event-converter-for-linux-perf
pass Filter values from data on 01.org that are bogus in a perf command
line and can cause perf to recurse infinitely when parsing events. Remove
such events or filters using the updated patch:
afd779df99
Fixes: ef908a1925 ("perf vendor events: Update Intel broadwellde")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com >
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Caleb Biggers <caleb.biggers@intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Perry Taylor <perry.taylor@intel.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Stephane Eranian <eranian@google.com >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Link: https://lore.kernel.org/r/20220805013856.1842878-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Ian Rogers
b4f0466082
perf jevents: Add JEVENTS_ARCH make option
...
Allow the architecture built into pmu-events.c to be set on the make
command line with JEVENTS_ARCH.
Reviewed-by: John Garry <john.garry@huawei.com >
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Ian Rogers
46acb311c6
perf jevents: Simplify generation of C-string
...
The previous implementation wanted the variable order and '(null)' string
output to match the C implementation. The '(null)' string output was a
quirk/bug and so there is no need to carry it forward.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00
Ian Rogers
e1e19d0545
perf jevents: Clean up pytype warnings
...
Improve type hints to clean up pytype warnings.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-10 10:44:02 -03:00