From aa1005d15d2aee10e5b93a25db076c47e05c4efa Mon Sep 17 00:00:00 2001 From: Raphael Gault Date: Wed, 8 Dec 2021 14:11:24 -0600 Subject: [PATCH] Documentation: arm64: Document PMU counters access from userspace Add documentation to describe the access to the pmu hardware counters from userspace. Signed-off-by: Raphael Gault Signed-off-by: Rob Herring Link: https://lore.kernel.org/r/20211208201124.310740-6-robh@kernel.org Signed-off-by: Will Deacon --- Documentation/arm64/perf.rst | 78 +++++++++++++++++++++++++++++++++++- 1 file changed, 77 insertions(+), 1 deletion(-) diff --git a/Documentation/arm64/perf.rst b/Documentation/arm64/perf.rst index b567f177d385..1f87b57c2332 100644 --- a/Documentation/arm64/perf.rst +++ b/Documentation/arm64/perf.rst @@ -2,7 +2,10 @@ .. _perf_index: -===================== +==== +Perf +==== + Perf Event Attributes ===================== @@ -88,3 +91,76 @@ exclude_host. However when using !exclude_hv there is a small blackout window at the guest entry/exit where host events are not captured. On VHE systems there are no blackout windows. + +Perf Userspace PMU Hardware Counter Access +========================================== + +Overview +-------- +The perf userspace tool relies on the PMU to monitor events. It offers an +abstraction layer over the hardware counters since the underlying +implementation is cpu-dependent. +Arm64 allows userspace tools to have access to the registers storing the +hardware counters' values directly. + +This targets specifically self-monitoring tasks in order to reduce the overhead +by directly accessing the registers without having to go through the kernel. + +How-to +------ +The focus is set on the armv8 PMUv3 which makes sure that the access to the pmu +registers is enabled and that the userspace has access to the relevant +information in order to use them. + +In order to have access to the hardware counters, the global sysctl +kernel/perf_user_access must first be enabled: + +.. code-block:: sh + + echo 1 > /proc/sys/kernel/perf_user_access + +It is necessary to open the event using the perf tool interface with config1:1 +attr bit set: the sys_perf_event_open syscall returns a fd which can +subsequently be used with the mmap syscall in order to retrieve a page of memory +containing information about the event. The PMU driver uses this page to expose +to the user the hardware counter's index and other necessary data. Using this +index enables the user to access the PMU registers using the `mrs` instruction. +Access to the PMU registers is only valid while the sequence lock is unchanged. +In particular, the PMSELR_EL0 register is zeroed each time the sequence lock is +changed. + +The userspace access is supported in libperf using the perf_evsel__mmap() +and perf_evsel__read() functions. See `tools/lib/perf/tests/test-evsel.c`_ for +an example. + +About heterogeneous systems +--------------------------- +On heterogeneous systems such as big.LITTLE, userspace PMU counter access can +only be enabled when the tasks are pinned to a homogeneous subset of cores and +the corresponding PMU instance is opened by specifying the 'type' attribute. +The use of generic event types is not supported in this case. + +Have a look at `tools/perf/arch/arm64/tests/user-events.c`_ for an example. It +can be run using the perf tool to check that the access to the registers works +correctly from userspace: + +.. code-block:: sh + + perf test -v user + +About chained events and counter sizes +-------------------------------------- +The user can request either a 32-bit (config1:0 == 0) or 64-bit (config1:0 == 1) +counter along with userspace access. The sys_perf_event_open syscall will fail +if a 64-bit counter is requested and the hardware doesn't support 64-bit +counters. Chained events are not supported in conjunction with userspace counter +access. If a 32-bit counter is requested on hardware with 64-bit counters, then +userspace must treat the upper 32-bits read from the counter as UNKNOWN. The +'pmc_width' field in the user page will indicate the valid width of the counter +and should be used to mask the upper bits as needed. + +.. Links +.. _tools/perf/arch/arm64/tests/user-events.c: + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c +.. _tools/lib/perf/tests/test-evsel.c: + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c