One metric such as 'Kernel_Utilization' may be from different PMUs and
consists of different events.
For core,
Kernel_Utilization = cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread
For atom,
Kernel_Utilization = cpu_clk_unhalted.core:k / cpu_clk_unhalted.core
The metric group string for core is:
'{cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W'
It's internally expanded to:
'{cpu_clk_unhalted.thread_p/metric-id=cpu_clk_unhalted.thread_p:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W#cpu_core'
The metric group string for atom is:
'{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W'
It's internally expanded to:
'{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W#cpu_atom'
That means the group "{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W"
is from cpu_core PMU and the group "{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}"
is from cpu_atom PMU. And then next, check if the events in the group are
valid on that PMU. If one event is not valid on that PMU, the associated
group would be removed internally.
In this example, cpu_clk_unhalted.thread is valid on cpu_core and
cpu_clk_unhalted.core is valid on cpu_atom. So the checks for these two
groups are passed.
Before:
# ./perf stat -M Kernel_Utilization -a sleep 1
WARNING: events in group from different hybrid PMUs!
WARNING: grouped events cpus do not match, disabling group:
anon group { CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD, CPU_CLK_UNHALTED.THREAD }
Performance counter stats for 'system wide':
17,639,501 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 1.00 Kernel_Utilization
17,578,757 cpu_atom/CPU_CLK_UNHALTED.CORE:k/
1,005,350,226 ns duration_time
43,012,352 cpu_core/CPU_CLK_UNHALTED.THREAD_P:k/ # 0.99 Kernel_Utilization
17,608,010 cpu_atom/CPU_CLK_UNHALTED.THREAD_P:k/
43,608,755 cpu_core/CPU_CLK_UNHALTED.THREAD/
17,630,838 cpu_atom/CPU_CLK_UNHALTED.THREAD/
1,005,350,226 ns duration_time
1.005350226 seconds time elapsed
After:
# ./perf stat -M Kernel_Utilization -a sleep 1
Performance counter stats for 'system wide':
17,981,895 CPU_CLK_UNHALTED.CORE [cpu_atom] # 1.00 Kernel_Utilization
17,925,405 CPU_CLK_UNHALTED.CORE:k [cpu_atom]
1,004,811,366 ns duration_time
41,246,425 CPU_CLK_UNHALTED.THREAD_P:k [cpu_core] # 0.99 Kernel_Utilization
41,819,129 CPU_CLK_UNHALTED.THREAD [cpu_core]
1,004,811,366 ns duration_time
1.004811366 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It bothered me that during benchmarking using 'perf stat' (to collect
for example CPU cache events) I could not simultaneously retrieve the
times spend in user or kernel mode in a machine readable format.
When running 'perf stat' the output for humans contains the times
reported by rusage and wait4.
$ perf stat -e cache-misses:u -- true
Performance counter stats for 'true':
4,206 cache-misses:u
0.001113619 seconds time elapsed
0.001175000 seconds user
0.000000000 seconds sys
But 'perf stat's machine-readable format does not provide this information.
$ perf stat -x, -e cache-misses:u -- true
4282,,cache-misses:u,492859,100.00,,
I found no way to retrieve this information using the available events
while using machine-readable output.
This patch adds two new tool internal events 'user_time' and
'system_time', similarly to the already present 'duration_time' event.
Both events use the already collected rusage information obtained by
wait4 and tracked in the global ru_stats.
Examples presenting cache-misses and rusage information in both human
and machine-readable form:
$ perf stat -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .
Performance counter stats for 'grep -q -r duration_time .':
67,422,542 ns duration_time:u
50,517,000 ns user_time:u
16,839,000 ns system_time:u
30,937 cache-misses:u
0.067422542 seconds time elapsed
0.050517000 seconds user
0.016839000 seconds sys
$ perf stat -x, -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .
72134524,ns,duration_time:u,72134524,100.00,,
65225000,ns,user_time:u,65225000,100.00,,
6865000,ns,system_time:u,6865000,100.00,,
38705,,cache-misses:u,71189328,100.00,,
Signed-off-by: Florian Fischer <florian.fischer@muhq.space>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220420102354.468173-3-florian.fischer@muhq.space
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf bench numa' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench numa mem -p 1 -t 3 -P 512 -s 100 -zZ0qcm --thp 1
Snippet of code:
<<>>
perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
Aborted (core dumped)
<<>>
bind_to_node() uses "sched_getaffinity" to save the original cpumask and
this call is returning EINVAL ((invalid argument).
This happens because the default mask size in glibc is 1024. To
overcome this 1024 CPUs mask size limitation of cpu_set_t, change the
mask size using the CPU_*_S macros ie, use CPU_ALLOC to allocate
cpumask, CPU_ALLOC_SIZE for size.
Apart from fixing this for "orig_mask", apply same logic to "mask" as
well which is used to setaffinity so that mask size is large enough to
represent number of possible CPU's in the system.
sched_getaffinity is used in one more place in perf numa bench. It is in
"bind_to_cpu" function. Apply the same logic there also. Though
currently no failure is reported from there, it is ideal to change
getaffinity to work with such system configurations having CPU's more
than default mask size supported by glibc.
Also fix "sched_setaffinity" to use mask size which is large enough to
represent number of possible CPU's in the system.
Fixed all places where "bind_cpumask" which is part of "struct
thread_data" is used such that bind_cpumask works in all configuration.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf numa bench test fails with error:
Testcase:
./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk
Failure snippet:
<<>>
Running 'numa/mem' benchmark:
# Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk"
perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed.
<<>>
The Testcases uses CPU's 0 and 8. In function "parse_setup_cpu_list",
There is check to see if cpu number is greater than max cpu's possible
in the system ie via "if (bind_cpu_0 >= g->p.nr_cpus || bind_cpu_1 >=
g->p.nr_cpus) {".
But it could happen that system has say 48 CPU's, but only number of
online CPU's is 0-7. Other CPU's are offlined. Since "g->p.nr_cpus" is
48, so function will go ahead and set bit for CPU 8 also in cpumask (
td->bind_cpumask).
bind_to_cpumask function is called to set affinity using
sched_setaffinity and the cpumask. Since the CPU8 is not present, set
affinity will fail here with EINVAL.
Fix this issue by adding a check to make sure that, CPU's provided in
the input argument values are online before proceeding further and skip
the test. For this, include new helper function "is_cpu_online" in
"tools/perf/util/header.c".
Since "BIT(x)" definition will get included from header.h, remove
that from bench/numa.c
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix incorrect debug message:
Attempting to add event pmu 'intel_pt' with '' that may result in
non-fatal errors
which always appears with perf record -vv and intel_pt e.g.
perf record -vv -e intel_pt//u uname
The message is incorrect because there will never be non-fatal errors.
Suppress the message if the PMU is 'selectable' i.e. meant to be
selected directly as an event.
Fixes: 4ac22b484d ("perf parse-events: Make add PMU verbose output clearer")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20220411061758.2458417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The `perf --list-cmds` output prints only internal commands, although
there is no reason for that from users' perspective.
Adding the external commands to commands array with NULL function
pointer allows printing all perf commands while not changing the logic
of command handler selection.
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-2-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf bench epoll' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench epoll all
Result snippet:
<<>>
Run summary [PID 106497]: 1399 threads monitoring on 64 file-descriptors for 8 secs.
perf: pthread_create: No such file or directory
<<>>
In epoll benchmarks (ctl, wait) pthread_create is invoked in do_threads
from respective bench_epoll_* function. Though the logs shows direct
failure from pthread_create, the actual failure is from
"sched_setaffinity" returning EINVAL (invalid argument).
This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.
Patch addresses this by fixing all the epoll benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf bench futex' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench futex all
Failure snippet:
<<>>Running futex/hash benchmark...
perf: pthread_create: No such file or directory
<<>>
All the futex benchmarks (ie hash, lock-api, requeue, wake,
wake-parallel), pthread_create is invoked in respective bench_futex_*
function. Though the logs shows direct failure from pthread_create,
strace logs showed that actual failure is from "sched_setaffinity"
returning EINVAL (invalid argument).
This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.
Patch addresses this by fixing all the futex benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit Fixes: b9f6fbb3b2 ("perf arm64: Inject missing frames when
using 'perf record --call-graph=fp'") intended to add a 'best effort'
DWARF unwind that improved the frame pointer stack in most scenarios.
It's expected that the unwind will fail sometimes, but this shouldn't be
reported as an error. It only works when the return address can be
determined from the contents of the link register alone.
Fix the error shown when the unwinder requires extra registers by adding
a new flag that suppresses error messages. This flag is not set in the
normal --call-graph=dwarf unwind mode so that behavior is not changed.
Fixes: b9f6fbb3b2 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'")
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220406145651.1392529-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By default `perf test tsc` does not return the error message when the
child process detected kernel does not support it. Instead, the child
process prints an error message to stderr, unfortunately stderr is
redirected to /dev/null when verbose <= 0.
This patch does:
- return TEST_SKIP to the parent process instead of TEST_OK when
perf_read_tsc_conversion() is not supported.
- Add a new subtest of testing if TSC is supported on current
architecture by moving exist code to a separate function.
It avoids two places in test__perf_time_to_tsc() that return
TEST_SKIP by doing this.
- Extend the test suite definition to contain above two subtests.
Current test_suite and test_case structs do not support printing skip
reason when the number of subtest less than 1. To print skip reason, it
is necessary to extend current test suite definition.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: likexu@tencent.com
Link: https://lore.kernel.org/r/20220408084748.43707-1-chengdongli@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using -ffat-lto-objects in the python feature test when building with
clang-13 results in:
clang-13: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]
error: command '/usr/sbin/clang' failed with exit code 1
cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf*.so': No such file or directory
make[2]: *** [Makefile.perf:639: /tmp/build/perf/python/perf.so] Error 1
Noticed when building on a docker.io/library/archlinux:base container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The clang compiler complains about some options even without a source
file being available, while others require one, so use the simple
tools/build/feature/test-hello.c file.
Then check for the "is not supported" string in its output, in addition
to the "unknown argument" already being looked for.
This was noticed when building with clang-13 where -ffat-lto-objects
isn't supported and since we were looking just for "unknown argument"
and not providing a source code to clang, was mistakenly assumed as
being available and not being filtered to set of command line options
provided to clang, leading to a build failure.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These make the feature check fail when using clang, so remove them just
like is done in tools/perf/Makefile.config to build perf itself.
Adding -Wno-compound-token-split-by-macro to tools/perf/Makefile.config
when building with clang is also necessary to avoid these warnings
turned into errors (-Werror):
CC /tmp/build/perf/util/scripting-engines/trace-event-perl.o
In file included from util/scripting-engines/trace-event-perl.c:35:
In file included from /usr/lib64/perl5/CORE/perl.h:4085:
In file included from /usr/lib64/perl5/CORE/hv.h:659:
In file included from /usr/lib64/perl5/CORE/hv_func.h:34:
In file included from /usr/lib64/perl5/CORE/sbox32_hash.h:4:
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:80:38: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
#define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \
^~~~~~~~~~
/usr/lib64/perl5/CORE/perl.h:737:29: note: expanded from macro 'STMT_START'
# define STMT_START (void)( /* gcc supports "({ STATEMENTS; })" */
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: '{' token is here
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:80:49: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
#define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '}' and ')' tokens terminating statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:87:41: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
v ^= (v>>23); \
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: ')' token is here
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:88:3: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
} STMT_END
^~~~~~~~
/usr/lib64/perl5/CORE/perl.h:738:21: note: expanded from macro 'STMT_END'
# define STMT_END )
^
Please refer to the discussion on the Link: tag below, where Nathan
clarifies the situation:
<quote>
acme> And then get to the problems at the end of this message, which seem
acme> similar to the problem described here:
acme>
acme> From Nathan Chancellor <>
acme> Subject [PATCH] mwifiex: Remove unnecessary braces from HostCmd_SET_SEQ_NO_BSS_INFO
acme>
acme> https://lkml.org/lkml/2020/9/1/135
acme>
acme> So perhaps in this case its better to disable that
acme> -Werror,-Wcompound-token-split-by-macro when building with clang?
Yes, I think that is probably the best solution. As far as I can tell,
at least in this file and context, the warning appears harmless, as the
"create a GNU C statement expression from two different macros" is very
much intentional, based on the presence of PERL_USE_GCC_BRACE_GROUPS.
The warning is fixed in upstream Perl by just avoiding creating GNU C
statement expressions using STMT_START and STMT_END:
https://github.com/Perl/perl5/issues/18780https://github.com/Perl/perl5/pull/18984
If I am reading the source code correctly, an alternative to disabling
the warning would be specifying -DPERL_GCC_BRACE_GROUPS_FORBIDDEN but it
seems like that might end up impacting more than just this site,
according to the issue discussion above.
</quote>
Based-on-a-patch-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lore.kernel.org/lkml/YkxWcYzph5pC1EK8@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This essentially reverts commit c72e3f04b4 ("tools/perf/build:
Speed up git-version test on re-make") and commit 4e666cdb06
("perf tools: Fix dependency for version file creation")
In commit c72e3f04b4 ("tools/perf/build: Speed up git-version test
on re-make"), a makefile dependency on .git/HEAD was added. The
background is that running PERF-VERSION-FILE is relatively slow, and
commands like "git describe" are particularly slow.
In commit 4e666cdb06 ("perf tools: Fix dependency for version file
creation"), an additional dependency on .git/ORIG_HEAD was added, as
.git/HEAD may not change for "git reset --hard HEAD^" command. However,
depending on whether we're on a branch or not, a "git cherry-pick" may
not lead to the version being updated.
As discussed with the git community in [0], using git internal files for
dependencies is not reliable. Commit 4e666cdb06 also breaks some build
scenarios [1].
As mentioned, c72e3f04b4 ("tools/perf/build: Speed up git-version
test on re-make") was added to speed up the build. However in commit
7572733b84 ("perf tools: Fix version kernel tag") we removed the
call to "git describe", so just revert Makefile.perf back to same as pre
c72e3f04b4 ("tools/perf/build: Speed up git-version test on
re-make") and the build should not be so slow, as below:
Pre 7572733b84:
$> time util/PERF-VERSION-GEN
PERF_VERSION = 5.17.rc8.g4e666cdb06ee
real 0m0.110s
user 0m0.091s
sys 0m0.019s
Post 7572733b84:
$> time util/PERF-VERSION-GEN
PERF_VERSION = 5.17.rc8.g7572733b8499
real 0m0.039s
user 0m0.036s
sys 0m0.007s
[0] https://lore.kernel.org/git/87wngkpddp.fsf@igel.home/T/#m4a4dd6de52fdbe21179306cd57b3761eb07f45f8
[1] https://lore.kernel.org/linux-perf-users/20220329093120.4173283-1-matthieu.baerts@tessares.net/T/#u
Committer testing:
After a fresh rebuild using 'make -C tools/perf O=/tmp/build/perf install-bin':
$ perf -v
perf version 5.17.g162f9db407b6
$ git log --oneline -1
162f9db407b6a6e5 (HEAD -> perf/core) perf tools: Stop depending on .git files for building PERF-VERSION-FILE
$
Now using a detached tarball, i.e. outside the kernel source tree:
$ ls -la perf*tar
ls: cannot access 'perf*tar': No such file or directory
$ make perf-tar-src-pkg
TAR
PERF_VERSION = 5.17.g31d10b3ef133
$ ls -la perf*tar
-rw-r--r--. 1 acme acme 22241280 Mar 30 13:26 perf-5.17.0.tar
$ mv perf-5.17.0.tar /tmp
$ cd /tmp
$ tar xf perf-5.17.0.tar
$ cd perf-5.17.0/
$ make -C tools/perf |& tail
CC util/pmu.o
CC util/pmu-flex.o
CC util/expr-flex.o
CC util/expr.o
LD util/scripting-engines/perf-in.o
LD util/intel-pt-decoder/perf-in.o
LD util/perf-in.o
LD perf-in.o
LINK perf
make: Leaving directory '/tmp/perf-5.17.0/tools/perf'
$ tools/perf/perf -v
perf version 5.17.g31d10b3ef133
$ pwd
/tmp/perf-5.17.0
$ cat PERF-VERSION-FILE
#define PERF_VERSION "5.17.g31d10b3ef133"
$
Fixes: 4e666cdb06 ("perf tools: Fix dependency for version file creation")
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1648635774-14581-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>