perf header: Additional note on AMD IBS for max_precise pmu cap

x86 core PMU exposes supported maximum precision level via max_precise
PMU capability. Although, AMD core PMU does not support precise mode,
certain core PMU events with precise_ip > 0 are allowed and forwarded to
IBS OP PMU.

Display a note about this in the 'perf report' header output and
document the details in the perf-list man page.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231107083331.901-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit is contained in:
Arnaldo Carvalho de Melo 2023-11-07 14:03:31 +05:30 committed by Arnaldo Carvalho de Melo
parent a399ee6773
commit dd678532f9
4 changed files with 35 additions and 5 deletions

View File

@ -81,11 +81,13 @@ For Intel systems precise event sampling is implemented with PEBS
which supports up to precise-level 2, and precise level 3 for
some special cases
On AMD systems it is implemented using IBS (up to precise-level 2).
The precise modifier works with event types 0x76 (cpu-cycles, CPU
clocks not halted) and 0xC1 (micro-ops retired). Both events map to
IBS execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set respectively (see the
On AMD systems it is implemented using IBS OP (up to precise-level 2).
Unlike Intel PEBS which provides levels of precision, AMD core pmu is
inherently non-precise and IBS is inherently precise. (i.e. ibs_op//,
ibs_op//p, ibs_op//pp and ibs_op//ppp are all same). The precise modifier
works with event types 0x76 (cpu-cycles, CPU clocks not halted) and 0xC1
(micro-ops retired). Both events map to IBS execution sampling (IBS op)
with the IBS Op Counter Control bit (IbsOpCntCtl) set respectively (see the
Core Complex (CCX) -> Processor x86 Core -> Instruction Based Sampling (IBS)
section of the [AMD Processor Programming Reference (PPR)] relevant to the
family, model and stepping of the processor being used).

View File

@ -531,6 +531,24 @@ int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu)
return cpu.cpu >= 0 && cpu.cpu < env->nr_numa_map ? env->numa_map[cpu.cpu] : -1;
}
bool perf_env__has_pmu_mapping(struct perf_env *env, const char *pmu_name)
{
char *pmu_mapping = env->pmu_mappings, *colon;
for (int i = 0; i < env->nr_pmu_mappings; ++i) {
if (strtoul(pmu_mapping, &colon, 0) == ULONG_MAX || *colon != ':')
goto out_error;
pmu_mapping = colon + 1;
if (strcmp(pmu_mapping, pmu_name) == 0)
return true;
pmu_mapping += strlen(pmu_mapping) + 1;
}
out_error:
return false;
}
char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
const char *cap)
{

View File

@ -179,4 +179,6 @@ struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu);
char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
const char *cap);
bool perf_env__has_pmu_mapping(struct perf_env *env, const char *pmu_name);
#endif /* __PERF_ENV_H */

View File

@ -2145,6 +2145,14 @@ static void print_pmu_caps(struct feat_fd *ff, FILE *fp)
__print_pmu_caps(fp, pmu_caps->nr_caps, pmu_caps->caps,
pmu_caps->pmu_name);
}
if (strcmp(perf_env__arch(&ff->ph->env), "x86") == 0 &&
perf_env__has_pmu_mapping(&ff->ph->env, "ibs_op")) {
char *max_precise = perf_env__find_pmu_cap(&ff->ph->env, "cpu", "max_precise");
if (max_precise != NULL && atoi(max_precise) == 0)
fprintf(fp, "# AMD systems uses ibs_op// PMU for some precise events, e.g.: cycles:p, see the 'perf list' man page for further details.\n");
}
}
static void print_pmu_mappings(struct feat_fd *ff, FILE *fp)