Commit Graph

546 Commits

Author SHA1 Message Date
Robin Murphy
558a078070 perf/arm-cmn: Move group validation data off-stack
With the value of CMN_MAX_DTMS increasing significantly, our validation
data structure is set to get quite big. Technically we could pack it at
least twice as densely, since we only need around 19 bits of information
per DTM, but that makes the code even more mind-bogglingly impenetrable,
and even half of "quite big" may still be uncomfortably large for a
stack frame (~1KB). Just move it to an off-stack allocation instead.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0cabff2e5839ddc0979e757c55515966f65359e4.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:28 +00:00
Robin Murphy
4f2c3872dd perf/arm-cmn: Optimise DTC counter accesses
In cases where we do know which DTC domain a node belongs to, we can
skip initialising or reading the global count in DTCs where we know
it won't change. The machinery to achieve that is mostly in place
already, so finish hooking it up by converting the vestigial domain
tracking to propagate suitable bitmaps all the way through to events.

Note that this does not allow allocating such an unused counter to a
different event on that DTC, because that is a flippin' nightmare.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/51d930fd945ef51c81f5889ccca055c302b0a1d0.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:28 +00:00
Robin Murphy
847eef94e6 perf/arm-cmn: Optimise DTM counter reads
When multiple nodes of the same type are connected to the same XP
(particularly in CAL configurations), it seems that they are likely
to be consecutive in logical ID. Therefore, we're likely to gain a
small benefit from an easy tweak to optimise out consecutive reads
of the same set of DTM counters for an aggregated event.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/7777d77c2df17693cd3dabb6e268906e15238d82.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
Robin Murphy
0947c80aba perf/arm-cmn: Refactor DTM handling
Untangle DTMs from XPs into a dedicated abstraction. This helps make
things a little more obvious and robust, but primarily paves the way
for further development where new IPs can grow extra DTMs per XP.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9cca18b1b98f482df7f1aaf3d3213e7f39500423.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
Robin Murphy
da5f7d2c80 perf/arm-cmn: Streamline node iteration
Refactor the places where we scan through the set of nodes to switch
from explicit array indexing to pointer-based iteration. This leads to
slightly simpler object code, but also makes the source less dense and
more pleasant for further development. It also unearths an almost-bug
in arm_cmn_event_init() where we've been depending on the "array index"
of NULL relative to cmn->dns being a sufficiently large number, yuck.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ee0c9eda9a643f46001ac43aadf3f0b1fd5660dd.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
Robin Murphy
5f167eab83 perf/arm-cmn: Refactor node ID handling
Add a bit more abstraction for the places where we decompose node IDs.
This will help keep things nice and manageable when we come to add yet
more variables which affect the node ID format. Also use the opportunity
to move the rest of the low-level node management helpers back up to the
logical place they were meant to be - how they ended up buried right in
the middle of the event-related definitions is somewhat of a mystery...

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/a2242a8c3c96056c13a04ae87bf2047e5e64d2d9.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
Robin Murphy
82d8ea4b45 perf/arm-cmn: Drop compile-test restriction
Although CMN is currently (and overwhelmingly likely to remain) deployed
in arm64-only (modulo userspace) systems, the 64-bit "dependency" for
compile-testing was just laziness due to heavy reliance on readq/writeq
accessors. Since we only need one extra include for robustness in that
regard, let's pull that in, widen the compile-test coverage, and fix up
the smattering of type laziness that that brings to light.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/baee9ee0d0bdad8aaeb70f5a4b98d8fd4b1f5786.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
Robin Murphy
6190741c29 perf/arm-cmn: Account for NUMA affinity
On a system with multiple CMN meshes, ideally we'd want to access each
PMU from within its own mesh, rather than with a long CML round-trip,
wherever feasible. Since such a system is likely to be presented as
multiple NUMA nodes, let's also hope a proximity domain is specified
for each CMN programming interface, and use that to guide our choice
of IRQ affinity to favour a node-local CPU where possible.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/32438b0d016e0649d882d47d30ac2000484287b9.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
Robin Murphy
56c7c6eaf3 perf/arm-cmn: Fix CPU hotplug unregistration
Attempting to migrate the PMU context after we've unregistered the PMU
device, or especially if we never successfully registered it in the
first place, is a woefully bad idea. It's also fundamentally pointless
anyway. Make sure to unregister an instance from the hotplug handler
*without* invoking the teardown callback.

Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/2c221d745544774e4b07583b65b5d4d94f7e0fe4.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
Linus Torvalds
c0d6586afa ACPI updates for 5.16-rc1
- Update the ACPICA code in the kernel to upstream revision 20210930
    including the following changes:
 
    * Fix system-wide resume issue caused by evaluating control
      methods too early in the resume path (Rafael Wysocki).
 
    * Add support for Windows 2020 _OSI string (Mario Limonciello).
 
    * Add Generic Port Affinity type for SRAT (Alison Schofield).
 
    * Add disassembly support for the NHLT ACPI table (Bob Moore).
 
  - Avoid flushing caches before entering C3 type of idle states on
    AMD processors (Deepak Sharma).
 
  - Avoid enumerating CPUs that are not present and not online-capable
    according to the platform firmware (Mario Limonciello).
 
  - Add DMI-based mechanism to quirk IRQ overrides and use it for two
    platforms (Hui Wang).
 
  - Change the configuration of unused ACPI device objects to reflect
    the D3cold power state after enumerating devices (Rafael Wysocki).
 
  - Update MAINTAINERS information regarding ACPI (Rafael Wysocki).
 
  - Fix typo in ACPI Kconfig (Masanari Iid).
 
  - Use sysfs_emit() instead of snprintf() in some places (Qing Wang).
 
  - Make the association of ACPI device objects with PCI devices more
    straightforward and simplify the code doing that for all devices
    in general (Rafael Wysocki).
 
  - Use acpi_device_adr() in acpi_find_child_device() instead of
    evaluating _ADR (Rafael Wysocki).
 
  - Drop duplicate device IDs from PNP device IDs list (Krzysztof
    Kozlowski).
 
  - Allow acpi_idle_play_dead() to use C3 on AMD processors (Richard
    Gong).
 
  - Use ACPI_COMPANION() to simplify code in some drivers (Rafael
    Wysocki).
 
  - Check the states of all ACPI power resources during initialization
    to avoid dealing with power resources in unknown states (Rafael
    Wysocki).
 
  - Fix ACPI power resource issues related to sharing wakeup power
    resources (Rafael Wysocki).
 
  - Avoid registering redundant suspend_ops (Rafael Wysocki).
 
  - Report battery charging state as "full" if it appears to be over
    the design capacity (André Almeida).
 
  - Quirk GK45 mini PC to skip reading _PSR in the AC driver (Stefan
    Schaeckeler).
 
  - Mark apei_hest_parse() static (Christoph Hellwig).
 
  - Relax platform response timeout to 1 second after instructing it
    to inject an error (Shuai Xue).
 
  - Make the PRM code handle memory allocation and remapping failures
    more gracefully and drop some unnecessary blank lines from that
    code (Aubrey Li).
 
  - Fix spelling mistake in the ACPI documentation (Colin Ian King).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGBkbESHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxCf8QALHX0PjsKC9cFs+hHVoI+T9VdVa/2pY4
 8vMjxYqkano4ti2xu1Fo/gaECnpgtVH+mV6iXPSAf80Wjgxnrx8K7y4qyUR4POBR
 l/5jA+XuFOyBRCSIHjJrfGof3/bTCt63LuyOc1WxiFzhkIuKeiI7Bmw6pvxrcXZw
 Ehe0VZK8eZGHR2py7RlZ68IJ32uAdlVKjlXyOFWfm9pjKtYit49WV/rjY0ldHBou
 vq8JP4EEmCKhyYAuzRprkPR2itm5fYI3lceG3UaIq5uWCj0IlaWcuv6c7K6N6QA1
 bHjCUSWPMCoHKQbZoX5MEEAjIJKdO81Rj3PGjNL1uYSz806ZmNcxav1twebm1lEa
 h6GbcFDaY9USMEl1D8s7h5Lm9cM33cen4IW9Ms5nMjugkdAr4KyLOA8CgpTbb1Ih
 vBrlB4CzxEqgpa+YAvJd7/r3FckauGthkciEjOCFpUxeZpKW5kweLIwFpPHKaqot
 KXtE+wwacxmkwYiJwEg141Bp4oFfy6iBcDn8xOapjq89DFEAn03VuvAO33Q+2KPL
 A2RtOaV2tSxxNDUss9wENI4lF5sc2JvBECXc1MZCBBNQqxTrwT562UwMhgUXD1Vy
 8LvdrYAOZzS6QizeHQC27FQ/uwQ5/uEqaYfjMv+JQjaZZU7YeG3vekADqiEstfWF
 UUj72AE8a/Cs
 =Ni9a
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These update the ACPICA code in the kernel to the most recent upstream
  revision, address some issues related to the ACPI power resources
  management, simplify the enumeration of PCI devices having ACPI
  companions, add new quirks, fix assorted problems, update the
  ACPI-related information in maintainers and clean up code in several
  places.

  Specifics:

   - Update the ACPICA code in the kernel to upstream revision 20210930
     including the following changes:

        - Fix system-wide resume issue caused by evaluating control
          methods too early in the resume path (Rafael Wysocki).

        - Add support for Windows 2020 _OSI string (Mario Limonciello).

        - Add Generic Port Affinity type for SRAT (Alison Schofield).

        - Add disassembly support for the NHLT ACPI table (Bob Moore).

   - Avoid flushing caches before entering C3 type of idle states on AMD
     processors (Deepak Sharma).

   - Avoid enumerating CPUs that are not present and not online-capable
     according to the platform firmware (Mario Limonciello).

   - Add DMI-based mechanism to quirk IRQ overrides and use it for two
     platforms (Hui Wang).

   - Change the configuration of unused ACPI device objects to reflect
     the D3cold power state after enumerating devices (Rafael Wysocki).

   - Update MAINTAINERS information regarding ACPI (Rafael Wysocki).

   - Fix typo in ACPI Kconfig (Masanari Iid).

   - Use sysfs_emit() instead of snprintf() in some places (Qing Wang).

   - Make the association of ACPI device objects with PCI devices more
     straightforward and simplify the code doing that for all devices in
     general (Rafael Wysocki).

   - Use acpi_device_adr() in acpi_find_child_device() instead of
     evaluating _ADR (Rafael Wysocki).

   - Drop duplicate device IDs from PNP device IDs list (Krzysztof
     Kozlowski).

   - Allow acpi_idle_play_dead() to use C3 on AMD processors (Richard
     Gong).

   - Use ACPI_COMPANION() to simplify code in some drivers (Rafael
     Wysocki).

   - Check the states of all ACPI power resources during initialization
     to avoid dealing with power resources in unknown states (Rafael
     Wysocki).

   - Fix ACPI power resource issues related to sharing wakeup power
     resources (Rafael Wysocki).

   - Avoid registering redundant suspend_ops (Rafael Wysocki).

   - Report battery charging state as "full" if it appears to be over
     the design capacity (André Almeida).

   - Quirk GK45 mini PC to skip reading _PSR in the AC driver (Stefan
     Schaeckeler).

   - Mark apei_hest_parse() static (Christoph Hellwig).

   - Relax platform response timeout to 1 second after instructing it to
     inject an error (Shuai Xue).

   - Make the PRM code handle memory allocation and remapping failures
     more gracefully and drop some unnecessary blank lines from that
     code (Aubrey Li).

   - Fix spelling mistake in the ACPI documentation (Colin Ian King)"

* tag 'acpi-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (36 commits)
  ACPI: glue: Use acpi_device_adr() in acpi_find_child_device()
  perf: qcom_l2_pmu: ACPI: Use ACPI_COMPANION() directly
  ACPI: APEI: mark apei_hest_parse() static
  ACPI: APEI: EINJ: Relax platform response timeout to 1 second
  gpio-amdpt: ACPI: Use the ACPI_COMPANION() macro directly
  nouveau: ACPI: Use the ACPI_COMPANION() macro directly
  ACPI: resources: Add one more Medion model in IRQ override quirk
  ACPI: AC: Quirk GK45 to skip reading _PSR
  ACPI: PM: sleep: Do not set suspend_ops unnecessarily
  ACPI: PRM: Handle memory allocation and memory remap failure
  ACPI: PRM: Remove unnecessary blank lines
  ACPI: PM: Turn off wakeup power resources on _DSW/_PSW errors
  ACPI: PM: Fix sharing of wakeup power resources
  ACPI: PM: Turn off unused wakeup power resources
  ACPI: PM: Check states of power resources during initialization
  ACPI: replace snprintf() in "show" functions with sysfs_emit()
  ACPI: LPSS: Use ACPI_COMPANION() directly
  ACPI: scan: Release PM resources blocked by unused objects
  ACPI: battery: Accept charges over the design capacity as full
  ACPICA: Update version to 20210930
  ...
2021-11-02 15:58:39 -07:00
Linus Torvalds
46f8763228 arm64 updates for 5.16
- Support for the Arm8.6 timer extensions, including a self-synchronising
   view of the system registers to elide some expensive ISB instructions.
 
 - Exception table cleanup and rework so that the fixup handlers appear
   correctly in backtraces.
 
 - A handful of miscellaneous changes, the main one being selection of
   CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
 
 - More mm and pgtable cleanups.
 
 - KASAN support for "asymmetric" MTE, where tag faults are reported
   synchronously for loads (via an exception) and asynchronously for
   stores (via a register).
 
 - Support for leaving the MMU enabled during kexec relocation, which
   significantly speeds up the operation.
 
 - Minor improvements to our perf PMU drivers.
 
 - Improvements to the compat vDSO build system, particularly when
   building with LLVM=1.
 
 - Preparatory work for handling some Coresight TRBE tracing errata.
 
 - Cleanup and refactoring of the SVE code to pave the way for SME
   support in future.
 
 - Ensure SCS pages are unpoisoned immediately prior to freeing them
   when KASAN is enabled for the vmalloc area.
 
 - Try moving to the generic pfn_valid() implementation again now that
   the DMA mapping issue from last time has been resolved.
 
 - Numerous improvements and additions to our FPSIMD and SVE selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmF74ZYQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNI/eB/UZYAtmNi6xC5StPaETyMLeZph9BV/IqIFq
 N71ds7MFzlX/agR6MwLbH2tBHezBtlQ90O732Jjz8zAec2cHd+7sx/w82JesX7PB
 IuOfqP78rvtU4ZkKe1Rcd96QtYvbtNAqcRhIo95OzfV9xwuzkvdXI+ZTYhtCfCuZ
 GozCqQoJtnNDayMtfzbDSXyJLNJc/qnIcUQhrt3vg12zbF3BcHxnmp0nBcHCqZEo
 lDJYufju7p87kCzaFYda2WhlI3t+NThqKOiZ332wQfqzNcr+rw1Y4jWbnCfrdLtI
 JfHT9yiuHDmFSYaJrk7NU8kftW31NV70bbhD7rZ+DQCVndl0lRc=
 =3R3j
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's the usual summary below, but the highlights are support for
  the Armv8.6 timer extensions, KASAN support for asymmetric MTE, the
  ability to kexec() with the MMU enabled and a second attempt at
  switching to the generic pfn_valid() implementation.

  Summary:

   - Support for the Arm8.6 timer extensions, including a
     self-synchronising view of the system registers to elide some
     expensive ISB instructions.

   - Exception table cleanup and rework so that the fixup handlers
     appear correctly in backtraces.

   - A handful of miscellaneous changes, the main one being selection of
     CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.

   - More mm and pgtable cleanups.

   - KASAN support for "asymmetric" MTE, where tag faults are reported
     synchronously for loads (via an exception) and asynchronously for
     stores (via a register).

   - Support for leaving the MMU enabled during kexec relocation, which
     significantly speeds up the operation.

   - Minor improvements to our perf PMU drivers.

   - Improvements to the compat vDSO build system, particularly when
     building with LLVM=1.

   - Preparatory work for handling some Coresight TRBE tracing errata.

   - Cleanup and refactoring of the SVE code to pave the way for SME
     support in future.

   - Ensure SCS pages are unpoisoned immediately prior to freeing them
     when KASAN is enabled for the vmalloc area.

   - Try moving to the generic pfn_valid() implementation again now that
     the DMA mapping issue from last time has been resolved.

   - Numerous improvements and additions to our FPSIMD and SVE
     selftests"

[ armv8.6 timer updates were in a shared branch and already came in
  through -tip in the timer pull  - Linus ]

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits)
  arm64: Select POSIX_CPU_TIMERS_TASK_WORK
  arm64: Document boot requirements for FEAT_SME_FA64
  arm64/sve: Fix warnings when SVE is disabled
  arm64/sve: Add stub for sve_max_virtualisable_vl()
  arm64: errata: Add detection for TRBE write to out-of-range
  arm64: errata: Add workaround for TSB flush failures
  arm64: errata: Add detection for TRBE overwrite in FILL mode
  arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
  selftests: arm64: Factor out utility functions for assembly FP tests
  arm64: vmlinux.lds.S: remove `.fixup` section
  arm64: extable: add load_unaligned_zeropad() handler
  arm64: extable: add a dedicated uaccess handler
  arm64: extable: add `type` and `data` fields
  arm64: extable: use `ex` for `exception_table_entry`
  arm64: extable: make fixup_exception() return bool
  arm64: extable: consolidate definitions
  arm64: gpr-num: support W registers
  arm64: factor out GPR numbering helpers
  arm64: kvm: use kvm_exception_table_entry
  arm64: lib: __arch_copy_to_user(): fold fixups into body
  ...
2021-11-01 16:33:53 -07:00
Rafael J. Wysocki
d5a8fb654c perf: qcom_l2_pmu: ACPI: Use ACPI_COMPANION() directly
The ACPI_HANDLE() macro is a wrapper arond the ACPI_COMPANION()
macro and the ACPI handle produced by the former comes from the
ACPI device object produced by the latter, so it is way more
straightforward to evaluate the latter directly instead of passing
the handle produced by the former to acpi_bus_get_device().

Modify l2_cache_pmu_probe_cluster() accordingly (no intentional
functional impact).

While at it, rename the ACPI device pointer to adev for more
clarity.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-10-27 20:38:22 +02:00
John Garry
e656972b69 drivers/perf: Improve build test coverage
Improve build test cover by allowing some drivers to build under
COMPILE_TEST where possible.

Some notes:
- Mostly a dependency on CONFIG_ACPI is not really required for only
  building (but left untouched), but is required for TX2 which uses ACPI
  functions which have no stubs
- XGENE required 64b dependency as it relies on some unsigned long perf
  struct fields being 64b
- I don't see why TX2 requires NUMA to build, but left untouched
- Added an explicit dependency on GENERIC_MSI_IRQ_DOMAIN for
  ARM_SMMU_V3_PMU, which is required for platform MSI functions

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633085326-156653-3-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-04 13:13:11 +01:00
John Garry
78cac393b4 drivers/perf: thunderx2_pmu: Change data in size tx2_uncore_event_update()
A LSL of 32 requires > 32b value to hold the result. However in
tx2_uncore_event_update(), 1UL << 32 currently only works as unsigned
long is 64b on a 64b system.

If we want to compile test for a 32b system, we need unsigned long long,
whose min size is 64b.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633085326-156653-2-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-04 13:13:11 +01:00
Shaokun Zhang
16cc4af286 drivers/perf: hisi: Fix PA PMU counter offset
The PA PMU counter offset was correct in [1] and the driver has
already been verified. We want to keep the register offset using
lower case character in later version that is consistent with
the existed driver. Since there was no functional change, we
didn't do more test. However there is typo when modified the PA
PMU counter offset by mistake, so fix this bad mistake.

[1] https://www.spinics.net/lists/arm-kernel/msg865263.html

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20210928123022.23467-1-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-04 13:10:14 +01:00
Marc Zyngier
e840f42a49 KVM: arm64: Fix PMU probe ordering
Russell reported that since 5.13, KVM's probing of the PMU has
started to fail on his HW. As it turns out, there is an implicit
ordering dependency between the architectural PMU probing code and
and KVM's own probing. If, due to probe ordering reasons, KVM probes
before the PMU driver, it will fail to detect the PMU and prevent it
from being advertised to guests as well as the VMM.

Obviously, this is one probing too many, and we should be able to
deal with any ordering.

Add a callback from the PMU code into KVM to advertise the registration
of a host CPU PMU, allowing for any probing order.

Fixes: 5421db1be3 ("KVM: arm64: Divorce the perf code from oprofile helpers")
Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
Cc: stable@vger.kernel.org
2021-09-20 12:43:34 +01:00
Jing Xiangfeng
d96b1b8c9f drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
ddr_perf_probe() misses to call ida_simple_remove() in an error path.
Jump to cpuhp_state_err to fix it.

Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/20210617122614.166823-1-jingxiangfeng@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-17 19:45:24 +01:00
Tuan Phan
4e16f283ed perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
When multiple dtcs share the same IRQ number, the irq_friend which
used to refer to dtc object gets calculated incorrect which leads
to invalid pointer.

Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")

Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1623946129-3290-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-17 19:45:02 +01:00
Qi Liu
773510f4d2 drivers/perf: Simplify EVENT ATTR macro in fsl_imx8_ddr_perf.c
Use common macro PMU_EVENT_ATTR_ID to simplify IMX8_DDR_PMU_EVENT_ATTR

Reviewed by Frank Li <Frank .li@nxp.com>

Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-7-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11 11:18:40 +01:00
Qi Liu
b323dfe02e drivers/perf: Simplify EVENT ATTR macro in xgene_pmu.c
Use common macro PMU_EVENT_ATTR_ID to simplify XGENE_PMU_EVENT_ATTR

Cc: Khuong Dinh <khuong@os.amperecomputing.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-6-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11 11:18:40 +01:00
Qi Liu
78b1d3c720 drivers/perf: Simplify EVENT ATTR macro in qcom_l3_pmu.c
Use common macro PMU_EVENT_ATTR_ID to simplify L3CACHE_EVENT_ATTR

Cc: Andy Gross <agross@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-5-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11 11:18:40 +01:00
Qi Liu
0bf2d72988 drivers/perf: Simplify EVENT ATTR macro in qcom_l2_pmu.c
Use common macro PMU_EVENT_ATTR_ID to simplify L2CACHE_EVENT_ATTR

Cc: Andy Gross <agross@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-4-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11 11:18:40 +01:00
Qi Liu
7ac87a8dfb drivers/perf: Simplify EVENT ATTR macro in SMMU PMU driver
Use common macro PMU_EVENT_ATTR_ID to simplify SMMU_EVENT_ATTR

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-3-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11 11:18:40 +01:00
Robin Murphy
4c1daba15c perf/smmuv3: Don't trample existing events with global filter
With global filtering, we only allow an event to be scheduled if its
filter settings exactly match those of any existing events, therefore
it is pointless to reapply the filter in that case. Much worse, though,
is that in doing that we trample the event type of counter 0 if it's
already active, and never touch the appropriate PMEVTYPERn so the new
event is likely not counting the right thing either. Don't do that.

CC: stable@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/32c80c0e46237f49ad8da0c9f8864e13c4a803aa.1623153312.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11 11:15:30 +01:00
Rikard Falkeborn
59d697a99d perf/hisi: Constify static attribute_group structs
These are only put in an array of pointers to const attribute_group
structs. Make them const like the other static attribute_group structs
to allow the compiler to put them in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210605221514.73449-1-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08 12:49:54 +01:00
ChenXiaoSong
5ca54404e6 perf: qcom: Remove redundant dev_err call in qcom_l3_cache_pmu_probe()
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Link: https://lore.kernel.org/r/20210608084816.1046485-1-chenxiaosong2@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08 12:48:39 +01:00
Shaokun Zhang
814be609ba drivers/perf: hisi: Fix data source control
'Data source' is a new function for HHA PMU and config / clear
interface was wrong by mistake. 'HHA_DATSRC_CTRL' register is
mainly used for data source configuration, if we enable bit0
as driver, it will go on count the event and we didn't check
it carefully. So fix the issue and do as the initial purpose.

Fixes: 932f6a99f9 ("drivers/perf: hisi: Add new functions for HHA PMU")
Reported-by: kernel test robot <lkp@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1622709291-37996-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-04 19:21:09 +01:00
Tian Tao
0d0f144a8f perf: qcom_l2_pmu: move to use request_irq by IRQF_NO_AUTOEN flag
request_irq() after setting IRQ_NOAUTOEN as below
irq_set_status_flags(irq, IRQ_NOAUTOEN); request_irq(dev, irq...); can
be replaced by request_irq() with IRQF_NO_AUTOEN flag.

this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which
is being merged: https://lore.kernel.org/patchwork/patch/1388765/

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1622595642-61678-3-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-02 12:48:08 +01:00
Tian Tao
3c1f2eb547 arm_pmu: move to use request_irq by IRQF_NO_AUTOEN flag
request_irq() after setting IRQ_NOAUTOEN as below
irq_set_status_flags(irq, IRQ_NOAUTOEN);
request_irq(dev, irq...);
can be replaced by request_irq() with IRQF_NO_AUTOEN flag.

this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which
is being merged: https://lore.kernel.org/patchwork/patch/1388765/

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1622595642-61678-2-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-02 12:48:08 +01:00
YueHaibing
f9e36b388a perf: arm_spe: use DEVICE_ATTR_RO macro
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528061738.23392-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01 14:25:09 +01:00
YueHaibing
21ad02e6b4 perf: xgene_pmu: use DEVICE_ATTR_RO macro
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528014940.4184-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01 14:23:46 +01:00
YueHaibing
ccbe14ce88 perf: qcom: use DEVICE_ATTR_RO macro
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528014749.24068-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01 14:23:09 +01:00
YueHaibing
29c043760e perf: arm_pmu: use DEVICE_ATTR_RO macro
Use DEVICE_ATTR_RO helper instead of plain DEVICE_ATTR,
which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528014130.7708-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01 14:22:44 +01:00
Hao Fang
2db5223731 drivers/perf: hisi: use the correct HiSilicon copyright
s/Hisilicon/HiSilicon/.
It should use capital S, according to the official website
https://www.hisilicon.com/en.

Signed-off-by: Hao Fang <fanghao11@huawei.com>
Link: https://lore.kernel.org/r/1621679037-15323-1-git-send-email-fanghao11@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01 14:18:47 +01:00
Junhao He
eb2b22f024 drivers/perf: arm-cci: Fix checkpatch spacing error
Fix some coding style issues reported by checkpatch.pl, including
following types:

ERROR: need consistent spacing around '-' (ctx:WxV)
ERROR: space required before the open parenthesis '('

Signed-off-by: Junhao He <hejunhao2@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Link: https://lore.kernel.org/r/1620736054-58412-5-git-send-email-f.fangjian@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:59:23 +01:00
Junhao He
a9f00c9760 drivers/perf: arm-cmn: Add space after ','
Fix a warning from checkpatch.pl.

ERROR: space required after that ',' (ctx:VxV)

Signed-off-by: Junhao He <hejunhao2@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Link: https://lore.kernel.org/r/1620736054-58412-4-git-send-email-f.fangjian@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:59:23 +01:00
Junhao He
f265fd166b drivers/perf: arm_pmu: Fix some coding style issues
Fix some coding style issues reported by checkpatch.pl, including
following types:

ERROR: spaces required around that '=' (ctx:VxW)
WARNING: Possible unnecessary 'out of memory' message

Signed-off-by: Junhao He <hejunhao2@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Link: https://lore.kernel.org/r/1620736054-58412-3-git-send-email-f.fangjian@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:59:23 +01:00
Junhao He
27e4482075 drivers/perf: arm_spe_pmu: Fix some coding style issues
Fix some coding style issues reported by checkpatch.pl, including
following types:

WARNING: void function return statements are not generally useful
WARNING: Possible unnecessary 'out of memory' message

Signed-off-by: Junhao He <hejunhao2@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Link: https://lore.kernel.org/r/1620736054-58412-2-git-send-email-f.fangjian@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:59:23 +01:00
Zou Wei
bf2367aaed drivers/perf: Remove redundant dev_err call in tx2_uncore_pmu_init_dev()
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Link: https://lore.kernel.org/r/1620715364-107460-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:57:19 +01:00
Thomas Gleixner
77b06ddc04 perf/hisi: Use irq_set_affinity()
These drivers use irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.

Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.

Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.

Replace the mindless abuse with irq_set_affinity().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.813375875@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-24 11:02:00 +01:00
Thomas Gleixner
ba4489fb94 perf/imx_ddr: Use irq_set_affinity()
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.

Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.

Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.

Replace the mindless abuse with irq_set_affinity().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.699566062@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-24 11:02:00 +01:00
Thomas Gleixner
2621054535 perf/arm-smmuv3: Use irq_set_affinity()
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.

Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.

Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.

Replace the mindless abuse with irq_set_affinity().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.603636289@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-24 11:02:00 +01:00
Thomas Gleixner
41ea281724 perf/arm-dsu: Use irq_set_affinity()
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.

Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.

Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.

Replace the mindless abuse with irq_set_affinity().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.505110632@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-24 11:02:00 +01:00
Thomas Gleixner
1ceeb8d430 perf/arm-dmc620: Use irq_set_affinity()
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.

Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.

Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.

Replace the mindless abuse with irq_set_affinity().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.395086573@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-24 11:01:59 +01:00
Thomas Gleixner
8ec25d3401 perf/arm-cmn: Use irq_set_affinity()
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.

Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.

Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.

Replace the mindless abuse with irq_set_affinity().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.277228577@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-24 11:01:59 +01:00
Thomas Gleixner
84fca8ba62 perf/arm-ccn: Use irq_set_affinity()
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.

Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.

Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.

Replace the mindless abuse with irq_set_affinity().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.128250213@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-24 11:01:59 +01:00
Linus Torvalds
152d32aa84 ARM:
- Stage-2 isolation for the host kernel when running in protected mode
 
 - Guest SVE support when running in nVHE mode
 
 - Force W^X hypervisor mappings in nVHE mode
 
 - ITS save/restore for guests using direct injection with GICv4.1
 
 - nVHE panics now produce readable backtraces
 
 - Guest support for PTP using the ptp_kvm driver
 
 - Performance improvements in the S2 fault handler
 
 x86:
 
 - Optimizations and cleanup of nested SVM code
 
 - AMD: Support for virtual SPEC_CTRL
 
 - Optimizations of the new MMU code: fast invalidation,
   zap under read lock, enable/disably dirty page logging under
   read lock
 
 - /dev/kvm API for AMD SEV live migration (guest API coming soon)
 
 - support SEV virtual machines sharing the same encryption context
 
 - support SGX in virtual machines
 
 - add a few more statistics
 
 - improved directed yield heuristics
 
 - Lots and lots of cleanups
 
 Generic:
 
 - Rework of MMU notifier interface, simplifying and optimizing
 the architecture-specific code
 
 - Some selftests improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCJ13kUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM1HAgAqzPxEtiTPTFeFJV5cnPPJ3dFoFDK
 y/juZJUQ1AOtvuWzzwuf175ewkv9vfmtG6rVohpNSkUlJYeoc6tw7n8BTTzCVC1b
 c/4Dnrjeycr6cskYlzaPyV6MSgjSv5gfyj1LA5UEM16LDyekmaynosVWY5wJhju+
 Bnyid8l8Utgz+TLLYogfQJQECCrsU0Wm//n+8TWQgLf1uuiwshU5JJe7b43diJrY
 +2DX+8p9yWXCTz62sCeDWNahUv8AbXpMeJ8uqZPYcN1P0gSEUGu8xKmLOFf9kR7b
 M4U1Gyz8QQbjd2lqnwiWIkvRLX6gyGVbq2zH0QbhUe5gg3qGUX7JjrhdDQ==
 =AXUi
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "This is a large update by KVM standards, including AMD PSP (Platform
  Security Processor, aka "AMD Secure Technology") and ARM CoreSight
  (debug and trace) changes.

  ARM:

   - CoreSight: Add support for ETE and TRBE

   - Stage-2 isolation for the host kernel when running in protected
     mode

   - Guest SVE support when running in nVHE mode

   - Force W^X hypervisor mappings in nVHE mode

   - ITS save/restore for guests using direct injection with GICv4.1

   - nVHE panics now produce readable backtraces

   - Guest support for PTP using the ptp_kvm driver

   - Performance improvements in the S2 fault handler

  x86:

   - AMD PSP driver changes

   - Optimizations and cleanup of nested SVM code

   - AMD: Support for virtual SPEC_CTRL

   - Optimizations of the new MMU code: fast invalidation, zap under
     read lock, enable/disably dirty page logging under read lock

   - /dev/kvm API for AMD SEV live migration (guest API coming soon)

   - support SEV virtual machines sharing the same encryption context

   - support SGX in virtual machines

   - add a few more statistics

   - improved directed yield heuristics

   - Lots and lots of cleanups

  Generic:

   - Rework of MMU notifier interface, simplifying and optimizing the
     architecture-specific code

   - a handful of "Get rid of oprofile leftovers" patches

   - Some selftests improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits)
  KVM: selftests: Speed up set_memory_region_test
  selftests: kvm: Fix the check of return value
  KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt()
  KVM: SVM: Skip SEV cache flush if no ASIDs have been used
  KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids()
  KVM: SVM: Drop redundant svm_sev_enabled() helper
  KVM: SVM: Move SEV VMCB tracking allocation to sev.c
  KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup()
  KVM: SVM: Unconditionally invoke sev_hardware_teardown()
  KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported)
  KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y
  KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables
  KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features
  KVM: SVM: Move SEV module params/variables to sev.c
  KVM: SVM: Disable SEV/SEV-ES if NPT is disabled
  KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails
  KVM: SVM: Zero out the VMCB array used to track SEV ASID association
  x86/sev: Drop redundant and potentially misleading 'sev_enabled'
  KVM: x86: Move reverse CPUID helpers to separate header file
  KVM: x86: Rename GPR accessors to make mode-aware variants the defaults
  ...
2021-05-01 10:14:08 -07:00
Marc Zyngier
e9c74a686a arm64: Get rid of oprofile leftovers
perf_pmu_name() and perf_num_counters() are now unused. Drop them.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210414134409.1266357-3-maz@kernel.org
2021-04-22 13:32:39 +01:00
Robin Murphy
e20ac6c54a perf/arm_pmu_platform: Clean up with dev_printk
Nearly all of the messages we can log from the platform device code
relate to the specific PMU device and the properties we're parsing from
its DT node. In some cases we use %pOF to point at where something was
wrong, but even that is inconsistent. Let's convert these logs to the
appropriate dev_printk variants, so that every issue specific to the
device and/or its DT description is clearly and instantly attributable,
particularly if there is more than one PMU node present in the DT.

The local refactoring in a couple of functions invites some extra
cleanup in the process - the init_fn matching can be streamlined, and
the PMU registration failure message moved to the appropriate place and
log level.

CC: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/10a4aacdf071d0c03d061c408a5899e5b32cc0a6.1616774562.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-30 11:41:50 +01:00
Robin Murphy
e338cb6bef perf/arm_pmu_platform: Fix error handling
If we're aborting after failing to register the PMU device,
we probably don't want to leak the IRQs that we've claimed.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/53031a607fc8412a60024bfb3bb8cd7141f998f5.1616774562.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-30 11:41:50 +01:00
Robin Murphy
11fa1dc802 perf/arm_pmu_platform: Use dev_err_probe() for IRQ errors
By virtue of using platform_irq_get_optional() under the covers,
platform_irq_count() needs the target interrupt controller to be
available and may return -EPROBE_DEFER if it isn't. Let's use
dev_err_probe() to avoid a spurious error log (and help debug any
deferral issues) in that case.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/073d5e0d3ed1f040592cb47ca6fe3759f40cc7d1.1616774562.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-30 11:41:50 +01:00
Shaokun Zhang
a0ab25cd82 drivers/perf: hisi: Add support for HiSilicon PA PMU driver
On HiSilicon Hip09 platform, there is a PA (Protocol Adapter) module on
each chip SICL (Super I/O Cluster) which incorporates three Hydra interface
and facilitates the cache coherency between the dies on the chip. While PA
uncore PMU model is the same as other Hip09 PMU modules and many PMU events
are supported. Let's support the PMU driver using the HiSilicon uncore PMU
framework.

PA PMU supports the following filter functions:
* tracetag_en: allows user to count events according to tt_req or
tt_core set in L3C PMU. It's the same as other PMUs.

* srcid_cmd & srcid_msk: allows user to filter statistics that come from
specific CCL/ICL by configuration source ID.

* tgtid_cmd & tgtid_msk: it is the similar function to srcid_cmd &
srcid_msk. Both are used to check where the data comes from or go to.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-9-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:46 +00:00
Shaokun Zhang
3bf30882c3 drivers/perf: hisi: Add support for HiSilicon SLLC PMU driver
HiSilicon's Hip09 is comprised by multi-dies that can be connected by SLLC
module (Skyros Link Layer Controller), its has separate PMU registers which
the driver can program it freely and interrupt is supported to handle
counter overflow. Let's support its driver under the framework of HiSilicon
uncore PMU driver.

SLLC PMU supports the following filter functions:
* tracetag_en: allows user to count data according to tt_req or
tt_core set in L3C PMU.

* srcid_cmd & srcid_msk: allows user to filter statistics that come from
specific CCL/ICL by configuration source ID.

* tgtid_hi & tgtid_lo: it also supports event statistics that these
operations will go to the CCL/ICL by configuration target ID or
target ID range. It's the same as source ID with 11-bit width in
the SoC. More introduction is added in documentation:
Documentation/admin-guide/perf/hisi-pmu.rst

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-8-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:46 +00:00
Shaokun Zhang
cce03e702c drivers/perf: hisi: Update DDRC PMU for programmable counter
DDRC PMU's events are useful for performance profiling, but the events
are limited and counter is fixed. On HiSilicon Hip09 platform, PMU
counters are the programmable and more events are supported. Let's
add the DDRC PMU v2 driver.

Bandwidth events are exposed directly in driver and some more events
will listed in JSON file later.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-7-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:46 +00:00
Shaokun Zhang
932f6a99f9 drivers/perf: hisi: Add new functions for HHA PMU
On HiSilicon Hip09 platform, some new functions are also supported on
HHA PMU.

* tracetag_en: it is the abbreviation of tracetag enable and allows user
to count events according to tt_req or tt_core set in L3C PMU.

* datasrc_skt: it is the abbreviation of data source from another
socket and it is used in the multi-chips. It's the same as L3C PMU.

* srcid_cmd & srcid_msk: pair of the fields are used to filter
statistics that come from the specific CCL/ICL by the configuration.
These are the abbreviation of source ID command and mask. The source
ID is 11-bit and detailed descriptions are documented in
Documentation/admin-guide/perf/hisi-pmu.rst.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-6-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:46 +00:00
Shaokun Zhang
486a7f46b9 drivers/perf: hisi: Add new functions for L3C PMU
On HiSilicon Hip09 platform, some new functions are enhanced on L3C PMU:

* tt_req: it is the abbreviation of tracetag request and allows user to
count only read/write/atomic operations. tt_req is 3-bit and details are
listed in the hisi-pmu document.
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5

* tt_core: it is the abbreviation of tracetag core and allows user to
filter by core/thread within the cluster, it is a 8-bit bitmap that each
bit represents the corresponding core/thread in this L3C.
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0xf/ sleep 5

* datasrc_cfg: it is the abbreviation of data source configuration and
allows user to check where the data comes from, such as: from local DDR,
cross-die DDR or cross-socket DDR. Its is 5-bit and represents different
data source in the SoC.
$# perf stat -a -e hisi_sccl3_l3c0/dat_access,datasrc_cfg=0xe/ sleep 5

* datasrc_skt: it is the abbreviation of data source from another socket
and is used in the multi-chips, if user wants to check the cross-socket
datat source, it shall be added in perf command. Only one bit is used to
control this.
$# perf stat -a -e hisi_sccl3_l3c0/dat_access,datasrc_cfg=0x10,datasrc_skt=1/ sleep 5

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-5-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:46 +00:00
Shaokun Zhang
3da582df57 drivers/perf: hisi: Add PMU version for uncore PMU drivers.
For HiSilicon uncore PMU, more versions are supported and some variables
shall be added suffix to distinguish the version which are prepared for
the new drivers.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-4-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:45 +00:00
Shaokun Zhang
baff06c315 drivers/perf: hisi: Refactor code for more uncore PMUs
On HiSilicon uncore PMU drivers, interrupt handling function and interrupt
registration function are very similar in differents PMU modules. Let's
refactor the frame.

Two new callbacks are added for the HW accessors:

* hisi_uncore_ops::get_int_status returns a bitmap of events which
  have overflowed and raised an interrupt

* hisi_uncore_ops::clear_int_status clears the overflow status for a
  specific event

These callback functions are used by a common IRQ handler,
hisi_uncore_pmu_isr().

One more function hisi_uncore_pmu_init_irq() is added to replace each
PMU initialization IRQ interface and simplify the code.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-3-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:45 +00:00
Shaokun Zhang
4e4cb8ca48 drivers/perf: hisi: Remove unnecessary check of counter index
The sanity check for counter index has been done in the function
hisi_uncore_pmu_get_event_idx, so remove the redundant interface
hisi_uncore_pmu_counter_valid() and sanity check.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-2-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:45 +00:00
Qi Liu
174744136d drivers/perf: Simplify the SMMUv3 PMU event attributes
For each PMU event, there is a SMMU_EVENT_ATTR(xx, XX) and
&smmu_event_attr_xx.attr.attr. Let's redefine the SMMU_EVENT_ATTR
to simplify the smmu_pmu_events.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1612789498-12957-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 12:58:39 +00:00
Qi Liu
fb62d67586 drivers/perf: convert sysfs sprintf family to sysfs_emit
sprintf does not know the PAGE_SIZE maximum of the temporary buffer
used for sysfs content and it's possible to overrun the buffer length.

Use sysfs_emit() function to ensures that no overrun is done.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1616148273-16374-4-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 12:55:44 +00:00
Qi Liu
9ec9f9cf86 drivers/perf: convert sysfs scnprintf family to sysfs_emit_at() and sysfs_emit()
Use the generic sysfs_emit_at() and sysfs_emit() function to take place
of scnprintf()

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1616148273-16374-3-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 12:55:44 +00:00
Zihao Tang
700a9cf052 drivers/perf: convert sysfs snprintf family to sysfs_emit
Fix the following coccicheck warning:

./drivers/perf/hisilicon/hisi_uncore_pmu.c:128:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/fsl_imx8_ddr_perf.c:173:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_spe_pmu.c:129:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_smmu_pmu.c:563:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_dsu_pmu.c:149:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_dsu_pmu.c:139:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cmn.c:563:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cmn.c:351:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-ccn.c:224:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:708:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:699:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:528:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:309:8-16: WARNING: use scnprintf or sprintf.

Signed-off-by: Zihao Tang <tangzihao1@hisilicon.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1616148273-16374-2-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 12:55:44 +00:00
Wei Yongjun
c8e3866836 perf/arm_dmc620_pmu: Fix error return code in dmc620_pmu_device_probe()
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 53c218da22 ("driver/perf: Add PMU driver for the ARM DMC-620 memory controller")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20210312080421.277562-1-weiyongjun1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-12 11:30:31 +00:00
Linus Torvalds
d652ea30ba IOMMU Updates for Linux v5.12
Including:
 
 	- ARM SMMU and Mediatek updates from Will Deacon:
 
 		- Support for MT8192 IOMMU from Mediatek
 
 		- Arm v7s io-pgtable extensions for MT8192
 
 		- Removal of TLBI_ON_MAP quirk
 
 		- New Qualcomm compatible strings
 
 		- Allow SVA without hardware broadcast TLB maintenance
 		  on SMMUv3
 
 		- Virtualization Host Extension support for SMMUv3 (SVA)
 
 		- Allow SMMUv3 PMU (perf) driver to be built
 		  independently from IOMMU
 
 	- Some tidy-up in IOVA and core code
 
 	- Conversion of the AMD IOMMU code to use the generic
 	  IO-page-table framework
 
 	- Intel VT-d updates from Lu Baolu:
 
 		- Audit capability consistency among different IOMMUs
 
 		- Add SATC reporting structure support
 
 		- Add iotlb_sync_map callback support
 
 	- SDHI Support for Renesas IOMMU driver
 
 	- Misc Cleanups and other small improvments
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmAz2AcACgkQK/BELZcB
 GuMflRAAyOfXzFqfmniYKxXmxAhOIlibCoojXzecItifdyaFkvUhxCadRg5u3dLH
 IeACDvUiaA5VQnVhjI0a7gCckKdq5YDwwAS+GQICUKNEZtWwrHwm2QTRBmL9MOlA
 v+iTrhYCqZWIAPe16BP5L4u6q4JWS2N9oNmDp6ia1VIhPjfsHU+gXpYKSxiLicmV
 VECAHJk5/JrwKXBP2nMg3ZqGz9RoJc2CzC4zvKu7ADDB9Zl+pXs74mb4ta81Y3G4
 vf07G/ORYJjbMskz5KcmYw2897I9ejMrNaHrYNjlh2IDpGqmoCJ4+8vuVO0zslrm
 GzMOHMshaI653BmuRDHyczwNrNxMxSX3NOeR2fp76d3MSouK7RoFMr7ghMAegp1u
 qmSrqFbgnOT4cdzN8QpPyU22lmVHtQHm0P4EpZZzC95dtJo1nzt8BFrDjPdJDOyZ
 D7oKvZq+OA6MtjCN4gZ2ClxQoiUZ8E/jP1uGIknpzR1oeWnyEFtx8aI9q4yRjNcp
 n8UR7wFqbOIV0O/QC7UlEp/xSpG7BDN4BkvIgbH2qe6LmRmejCrnUWv6pkmPc9sI
 wFgV/Qnh9oo7yf6zvmCsi1r0kmfGLPRe8GB9+eN3wY3xxzDOLJEJNdVghWaP8Rz7
 MBcL/0u8+fMfQlRiOPX64BSIF7fFPY0r0+erINawf1VUbyjUsUE=
 =xSX1
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - ARM SMMU and Mediatek updates from Will Deacon:
     - Support for MT8192 IOMMU from Mediatek
     - Arm v7s io-pgtable extensions for MT8192
     - Removal of TLBI_ON_MAP quirk
     - New Qualcomm compatible strings
     - Allow SVA without hardware broadcast TLB maintenance on SMMUv3
     - Virtualization Host Extension support for SMMUv3 (SVA)
     - Allow SMMUv3 PMU perf driver to be built independently from IOMMU

 - Some tidy-up in IOVA and core code

 - Conversion of the AMD IOMMU code to use the generic IO-page-table
   framework

 - Intel VT-d updates from Lu Baolu:
     - Audit capability consistency among different IOMMUs
     - Add SATC reporting structure support
     - Add iotlb_sync_map callback support

 - SDHI support for Renesas IOMMU driver

 - Misc cleanups and other small improvments

* tag 'iommu-updates-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (94 commits)
  iommu/amd: Fix performance counter initialization
  MAINTAINERS: repair file pattern in MEDIATEK IOMMU DRIVER
  iommu/mediatek: Fix error code in probe()
  iommu/mediatek: Fix unsigned domid comparison with less than zero
  iommu/vt-d: Parse SATC reporting structure
  iommu/vt-d: Add new enum value and structure for SATC
  iommu/vt-d: Add iotlb_sync_map callback
  iommu/vt-d: Move capability check code to cap_audit files
  iommu/vt-d: Audit IOMMU Capabilities and add helper functions
  iommu/vt-d: Fix 'physical' typos
  iommu: Properly pass gfp_t in _iommu_map() to avoid atomic sleeping
  iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration]
  driver/perf: Remove ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3
  MAINTAINERS: Add entry for MediaTek IOMMU
  iommu/mediatek: Add mt8192 support
  iommu/mediatek: Remove unnecessary check in attach_device
  iommu/mediatek: Support master use iova over 32bit
  iommu/mediatek: Add iova reserved function
  iommu/mediatek: Support for multi domains
  iommu/mediatek: Add get_domain_id from dev->dma_range_map
  ...
2021-02-22 10:31:29 -08:00
Qi Liu
8ee37e0f97 drivers/perf: Replace spin_lock_irqsave to spin_lock
There is no need to do spin_lock_irqsave in context of hard IRQ, so
replace them with spin_lock.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1612863742-1551-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-10 18:50:42 +00:00
Qi Liu
20116dd93f drivers/perf: Prevent forced unbinding of ARM_DMC620_PMU drivers
Set "suppress_bind_attrs" to true, so that bind/unbind can be
disabled via sysfs and prevent unbinding ARM_DMC620_PMU drivers
during perf sampling.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1612252686-50329-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-02 18:40:30 +00:00
Joerg Roedel
d1e3306ba8 Arm SMMU updates for 5.12
- Support for MT8192 IOMMU from Mediatek
 
 - Arm v7s io-pgtable extensions for MT8192
 
 - Removal of TLBI_ON_MAP quirk
 
 - New Qualcomm compatible strings
 
 - Allow SVA without hardware broadcast TLB maintenance on SMMUv3
 
 - Virtualization Host Extension support for SMMUv3 (SVA)
 
 - Allow SMMUv3 PMU (perf) driver to be built independently from IOMMU
 
 - Misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmAYD/QQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNLZjCACBK2bCmkVJfs39YeeMhHWVRyiiMpIuCvja
 7WuIqeXnbtAg19hkwPqns+XYeTTnoeo/aSZUS9z8DIDUqADMES+1UUMfVf2mfL3A
 VD1pGRtQezYS+rDOc5dL9m1CcJ2RRY5tfQEjK8Fnm5CItvylNXbURH1nfu1tMVW7
 fn1tMNKCTnY7FkABttKyR4a1XIyFX+fPdSrqd5PjQ7D6rK2ZYrorMf55WpmQEfFu
 3er4F3Xz8nJeTvswPUMNB3ibvcdICM9VcSwFHpElTq2dbHUDDz3uFQwyiXrutEmb
 Xz4UGabjS/ZEPaty6y7VxdYEbD4Q2dzii57jKKtV49N36iCvkSR2
 =5KXT
 -----END PGP SIGNATURE-----

Merge tag 'arm-smmu-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu

Arm SMMU updates for 5.12

- Support for MT8192 IOMMU from Mediatek

- Arm v7s io-pgtable extensions for MT8192

- Removal of TLBI_ON_MAP quirk

- New Qualcomm compatible strings

- Allow SVA without hardware broadcast TLB maintenance on SMMUv3

- Virtualization Host Extension support for SMMUv3 (SVA)

- Allow SMMUv3 PMU (perf) driver to be built independently from IOMMU

- Misc cleanups
2021-02-02 13:35:40 +01:00
John Garry
34eb9359c1 driver/perf: Remove ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3
The ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3_PMU was added with the idea
that a SMMUv3 PMCG would only exist on a system with an associated SMMUv3.

However it is not the job of Kconfig to make these sorts of decisions (even
if it were true), so remove the dependency.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1612175042-56866-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-01 12:39:40 +00:00
Robin Murphy
1c8147ea89 perf/arm-cmn: Move IRQs when migrating context
If we migrate the PMU context to another CPU, we need to remember to
retarget the IRQs as well.

Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e080640aea4ed8dfa870b8549dfb31221803eb6b.1611839564.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-28 20:14:45 +00:00
Robin Murphy
79d7c3dca9 perf/arm-cmn: Fix PMU instance naming
Although it's neat to avoid the suffix for the typical case of a
single PMU, it means systems with multiple CMN instances end up with
inconsistent naming. I think it also breaks perf tool's "uncore alias"
logic if the common instance prefix is also the full name of one.

Avoid any surprises by not trying to be clever and simply numbering
every instance, even when it might technically prove redundant.

Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/649a2281233f193d59240b13ed91b57337c77b32.1611839564.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-28 20:14:45 +00:00
Rikard Falkeborn
f0c140481d perf: Constify static struct attribute_group
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-5-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20 17:51:23 +00:00
Rikard Falkeborn
c2c4d5c051 perf: hisi: Constify static struct attribute_group
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-4-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20 17:51:22 +00:00
Rikard Falkeborn
3cb7d2da18 perf/imx_ddr: Constify static struct attribute_group
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-3-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20 17:51:22 +00:00
Rikard Falkeborn
30b34c4833 perf: qcom: Constify static struct attribute_group
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-2-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20 17:51:22 +00:00
Wei Li
4a669e2432 drivers/perf: Add support for ARMv8.3-SPE
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
  using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).

The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
  packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
  and filtering on these events using PMSEVFR_EL1.

Update the check of pmsevfr for empty/partial predicated SVE and
alignment event in SPE driver.

Signed-off-by: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20201203141609.14148-1-liwei391@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20 17:47:44 +00:00
Will Deacon
b90d72a6bf Revert "arm64: Enable perf events based hard lockup detector"
This reverts commit 367c820ef0.

lockup_detector_init() makes heavy use of per-cpu variables and must be
called with preemption disabled. Usually, it's handled early during boot
in kernel_init_freeable(), before SMP has been initialised.

Since we do not know whether or not our PMU interrupt can be signalled
as an NMI until considerably later in the boot process, the Arm PMU
driver attempts to re-initialise the lockup detector off the back of a
device_initcall(). Unfortunately, this is called from preemptible
context and results in the following splat:

  | BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
  | caller is debug_smp_processor_id+0x20/0x2c
  | CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276
  | Hardware name: linux,dummy-virt (DT)
  | Call trace:
  |   dump_backtrace+0x0/0x3c0
  |   show_stack+0x20/0x6c
  |   dump_stack+0x2f0/0x42c
  |   check_preemption_disabled+0x1cc/0x1dc
  |   debug_smp_processor_id+0x20/0x2c
  |   hardlockup_detector_event_create+0x34/0x18c
  |   hardlockup_detector_perf_init+0x2c/0x134
  |   watchdog_nmi_probe+0x18/0x24
  |   lockup_detector_init+0x44/0xa8
  |   armv8_pmu_driver_init+0x54/0x78
  |   do_one_initcall+0x184/0x43c
  |   kernel_init_freeable+0x368/0x380
  |   kernel_init+0x1c/0x1cc
  |   ret_from_fork+0x10/0x30

Rather than bodge this with raw_smp_processor_id() or randomly disabling
preemption, simply revert the culprit for now until we figure out how to
do this properly.

Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com
Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-01-13 15:08:41 +00:00
Joakim Zhang
881b052050 perf/imx_ddr: Add system PMU identifier for userspace
The DDR Perf for i.MX8 is a system PMU whose AXI ID would different from
SoC to SoC. Need expose system PMU identifier for userspace which refer
to /sys/bus/event_source/devices/<PMU DEVICE>/identifier.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20201130114202.26057-3-qiangqing.zhang@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-12-09 14:14:02 +00:00
Sumit Garg
367c820ef0 arm64: Enable perf events based hard lockup detector
With the recent feature added to enable perf events to use pseudo NMIs
as interrupts on platforms which support GICv3 or later, its now been
possible to enable hard lockup detector (or NMI watchdog) on arm64
platforms. So enable corresponding support.

One thing to note here is that normally lockup detector is initialized
just after the early initcalls but PMU on arm64 comes up much later as
device_initcall(). So we need to re-initialize lockup detection once
PMU has been initialized.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Acked-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/1602060704-10921-1-git-send-email-sumit.garg@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 15:18:39 +00:00
Joakim Zhang
6b46338f22 perf/imx_ddr: Add stop event counters support for i.MX8MP
DDR Perf driver only supports free-running event counters(counter1/2/3)
now, this patch adds support for stop event counters.

Legacy SoCs:
Cycle counter(counter0) is a special counter, only count cycles. When
cycle counter overflow, it will lock all counters and generate an
interrupt. In ddr_perf_irq_handler, disable cycle counter then all
counters would stop at the same time, update all counters' count, then
enable cycle counter that all counters count again. During this process,
only clear cycle counter, no need to clear event counters since they are
free-running counters. They would continue counting after overflow and
do/while loop from ddr_perf_event_update can handle event counters
overflow case.

i.MX8MP:
Almost all is the same as legacy SoCs, the only difference is that, event
counters are not free-running any more. Like cycle counter, when event
counters overflow, they would stop counting unless clear the counter,
and no interrupt generate for event counters. So we should clear event
counters that let them re-count when cycle counter overflow, which ensure
event counters will not lose data.

This patch adds stop event counters support which would be compatible to
free-running event counters. We use the cycle counter to stop overflow
of the event counters.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20201027104451.15434-1-qiangqing.zhang@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 15:14:41 +00:00
John Garry
2c25522336 perf/smmuv3: Support sysfs identifier file
SMMU_PMCG_IIDR was added in the SMMUv3.3 spec.

For the perf tool to know the specific HW implementation, expose the
PMCG_IIDR contents only when set.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1602149181-237415-5-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 15:10:35 +00:00
John Garry
ac4511c936 drivers/perf: hisi: Add identifier sysfs file
To allow userspace to identify the specific implementation of the device,
add an "identifier" sysfs file.

Encoding is as follows (same for all uncore drivers):
hi1620: 0x0
hi1630: 0x30

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1602149181-237415-2-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 15:10:35 +00:00
Wang Qing
6c8cfbf5db perf: remove duplicate check on fwnode
fwnode is checked IS_ERR_OR_NULL in following check by
is_of_node() or is_acpi_device_node(), remove duplicate check.

Signed-off-by: Wang Qing <wangqing@vivo.com>
Link: https://lore.kernel.org/r/1604644902-29655-1-git-send-email-wangqing@vivo.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 15:02:08 +00:00
Tuan Phan
53c218da22 driver/perf: Add PMU driver for the ARM DMC-620 memory controller
DMC-620 PMU supports total 10 counters which each is
independently programmable to different events and can
be started and stopped individually.

Currently, it only supports ACPI. Other platforms feel free to test and add
support for device tree.

Usage example:
  #perf stat -e arm_dmc620_10008c000/clk_cycle_count/ -C 0
  Get perf event for clk_cycle_count counter.

  #perf stat -e arm_dmc620_10008c000/clkdiv2_allocate,mask=0x1f,match=0x2f,
  incr=2,invert=1/ -C 0
  The above example shows how to specify mask, match, incr,
  invert parameters for clkdiv2_allocate event.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Link: https://lore.kernel.org/r/1604518246-6198-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 14:51:21 +00:00
Will Deacon
887e2cff0f perf: arm-cmn: Fix conversion specifiers for node type
The node type field is an enum type, so print it as a 32-bit quantity
rather than as an unsigned short.

Link: https://lore.kernel.org/r/202009302350.QIzfkx62-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-01 22:30:07 +01:00
Will Deacon
d9ef632fab perf: arm-cmn: Fix unsigned comparison to less than zero
Ensure that the 'irq' field of 'struct arm_cmn_dtc' is a signed int
so that it can be compared '< 0'.

Link: https://lore.kernel.org/r/20200929170835.GA15956@embeddedor
Addresses-Coverity-ID: 1497488 ("Unsigned compared against 0")
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-01 22:29:53 +01:00
Julien Thierry
d8f6267f7c arm_pmu: arm64: Use NMIs for PMU
Add required PMU interrupt operations for NMIs. Request interrupt lines as
NMIs when possible, otherwise fall back to normal interrupts.

NMIs are only supported on the arm64 architecture with a GICv3 irqchip.

[Alexandru E.: Added that NMIs only work on arm64 + GICv3, print message
	when PMU is using NMIs]

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-8-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28 19:00:17 +01:00
Julien Thierry
f76b130bdb arm_pmu: Introduce pmu_irq_ops
Currently the PMU interrupt can either be a normal irq or a percpu irq.
Supporting NMI will introduce two cases for each existing one. It becomes
a mess of 'if's when managing the interrupt.

Define sets of callbacks for operations commonly done on the interrupt. The
appropriate set of callbacks is selected at interrupt request time and
simplifies interrupt enabling/disabling and freeing.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-7-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28 19:00:17 +01:00
Robin Murphy
0ba64770a2 perf: Add Arm CMN-600 PMU driver
Initial driver for PMU event counting on the Arm CMN-600 interconnect.
CMN sports an obnoxiously complex distributed PMU system as part of
its debug and trace features, which can do all manner of things like
sampling, cross-triggering and generating CoreSight trace. This driver
covers the PMU functionality, plus the relevant aspects of watchpoints
for simply counting matching flits.

Tested-by: Tsahi Zidenberg <tsahee@amazon.com>
Tested-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28 18:50:20 +01:00
Mark Salter
688494a407 drivers/perf: thunderx2_pmu: Fix memory resource error handling
In tx2_uncore_pmu_init_dev(), a call to acpi_dev_get_resources() is used
to create a list _CRS resources which is searched for the device base
address. There is an error check following this:

   if (!rentry->res)
           return NULL

In no case, will rentry->res be NULL, so the test is useless. Even
if the test worked, it comes before the resource list memory is
freed. None of this really matters as long as the ACPI table has
the memory resource. Let's clean it up so that it makes sense and
will give a meaningful error should firmware leave out the memory
resource.

Fixes: 69c32972d5 ("drivers/perf: Add Cavium ThunderX2 SoC UNCORE PMU driver")
Signed-off-by: Mark Salter <msalter@redhat.com>
Link: https://lore.kernel.org/r/20200915204110.326138-2-msalter@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18 14:34:51 +01:00
Mark Salter
a76b8236ed drivers/perf: xgene_pmu: Fix uninitialized resource struct
This splat was reported on newer Fedora kernels booting on certain
X-gene based machines:

 xgene-pmu APMC0D83:00: X-Gene PMU version 3
 Unable to handle kernel read from unreadable memory at virtual \
 address 0000000000004006
 ...
 Call trace:
  string+0x50/0x100
  vsnprintf+0x160/0x750
  devm_kvasprintf+0x5c/0xb4
  devm_kasprintf+0x54/0x60
  __devm_ioremap_resource+0xdc/0x1a0
  devm_ioremap_resource+0x14/0x20
  acpi_get_pmu_hw_inf.isra.0+0x84/0x15c
  acpi_pmu_dev_add+0xbc/0x21c
  acpi_ns_walk_namespace+0x16c/0x1e4
  acpi_walk_namespace+0xb4/0xfc
  xgene_pmu_probe_pmu_dev+0x7c/0xe0
  xgene_pmu_probe.part.0+0x2c0/0x310
  xgene_pmu_probe+0x54/0x64
  platform_drv_probe+0x60/0xb4
  really_probe+0xe8/0x4a0
  driver_probe_device+0xe4/0x100
  device_driver_attach+0xcc/0xd4
  __driver_attach+0xb0/0x17c
  bus_for_each_dev+0x6c/0xb0
  driver_attach+0x30/0x40
  bus_add_driver+0x154/0x250
  driver_register+0x84/0x140
  __platform_driver_register+0x54/0x60
  xgene_pmu_driver_init+0x28/0x34
  do_one_initcall+0x40/0x204
  do_initcalls+0x104/0x144
  kernel_init_freeable+0x198/0x210
  kernel_init+0x20/0x12c
  ret_from_fork+0x10/0x18
 Code: 91000400 110004e1 eb08009f 540000c0 (38646846)
 ---[ end trace f08c10566496a703 ]---

This is due to use of an uninitialized local resource struct in the xgene
pmu driver. The thunderx2_pmu driver avoids this by using the resource list
constructed by acpi_dev_get_resources() rather than using a callback from
that function. The callback in the xgene driver didn't fully initialize
the resource. So get rid of the callback and search the resource list as
done by thunderx2.

Fixes: 832c927d11 ("perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver")
Signed-off-by: Mark Salter <msalter@redhat.com>
Link: https://lore.kernel.org/r/20200915204110.326138-1-msalter@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18 14:34:51 +01:00
Tuan Phan
2b694fc92a perf: arm_dsu: Support DSU ACPI devices
Add support for probing device from ACPI node.
Each DSU ACPI node and its associated cpus are inside a cluster node.

Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1600106656-9542-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-15 15:51:42 +01:00
Shaokun Zhang
d51eb416fa drivers/perf: hisi: Add missing include of linux/module.h
MODULE_*** is used in HiSilicon uncore PMU drivers and is provided by
linux/module.h, but the header file is not directly included. Add the
missing include.

Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1599186097-18599-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07 14:05:11 +01:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds
30185b69a2 It looks like a smaller batch of clk updates this time around. In the core
framework we just have some minor tweaks and a debugfs feature, so not much to
 see there. The driver updates are fairly well split between AT91 and Qualcomm
 clk support. Adding those two drivers together equals about 50% of the
 diffstat. Otherwise, the big amount of work this time was on supporting
 Broadcom's Raspberry Pi firmware clks. See below for some more highlights.
 
 Core:
  - Document clk_hw_round_rate() so it gets some more use
  - Remove unused __clk_get_flags()
  - Add a prepare/enable debugfs feature similar to rate setting
 
 New Drivers:
  - Add support for SAMA7G5 SoC clks
  - Enable CPU clks on Qualcomm IPQ6018 SoCs
  - Enable CPU clks on Qualcomm MSM8996 SoCs
  - GPU clk support for Qualcomm SM8150 and SM8250 SoCs
  - Audio clks on Qualcomm SC7180 SoCs
  - Microchip Sparx5 DPLL clk
  - Add support for the new Renesas RZ/G2H (R8A774E1) SoC
 
 Updates:
  - Make defines for bcm63xx-gate clks to use in DT
  - Support BCM2711 SoC firmware clks
  - Add HDMI clks for BCM2711 SoCs
  - Add RTC related clks on Ingenic SoCs
  - Support USB PHY clks on Ingenic SoCs
  - Support gate clks on BCM6318 SoCs
  - RMU and DMAC/GPIO clock support for Actions Semi S500 SoCs
  - Use poll_timeout functions in Rockchip clk driver
  - Support Rockchip rk3288w SoC variant
  - Mark mac_lbtest critical on Rockchip rk3188
  - Add CAAM clock support for i.MX vf610 driver
  - Add MU root clock support for i.MX imx8mp driver
  - Amlogic g12: add neural network accelerator clock sources
  - Amlogic meson8: remove critical flag for main PLL divider
  - Amlogic meson8: add video decoder clock gates
  - Convert one more Renesas DT binding to json-schema
  - Enhance critical clock handling on Renesas platforms to only consider
    clocks that were enabled at boot time
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAl8tspERHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSX0og//TXp134IBXVZ2Z5Wca4J41itv7tPGJFsX
 EslQ/eMm6xhGEqqrA6BVsQ228JF9rUmg1fbvdl82UPK7Zyp9P2fo6CMC65ngDTky
 B1rLZOaKitER9JdL9DmmnIR772pI//rAMdIeVC/Jn5RR7OpP4lgY6+qvK0pJjVFl
 8aK3uHY+UWVFtqZxuJEAAyZBq1+bREqmVwNC1my6kmMIf0j0KwcGhrZgASWWtjQK
 4TRmroehLC4FBqnJ78Y3E6UAOBlz6C7XnP38qJge2672Ef7QhXZU7AGrGiTtxc4h
 5fB6MNMF+5QGel54qR1eH+JxaEoKsAjLaX1VBr7hAHrGIQ26dBFHFPsPvWDA+VVR
 4bwXJKz2objAWyqlMVM/cVV3q6uDixuScdrw2aAiojmV7ZvdZXflacdmZuS5v7e4
 sh86llN+lF0YrViYz33z+up0risCAi089xVo7Z99VgyLe2DR8TE+4/d6DQTFRNxl
 m66m6mlB9pPPgIV028SAf/zmzyoVWEarwdEAQPjJC6KVRbnR9mFlhSiTQMmCsSvU
 zHRxVcInc+8qIJY6VJ552UFu/JAan/AFl5knb+kW21a7hM8p81H9HuSnS4aHDDq4
 yETPU3XSMYz8ARHX4lKo1UIB6+k9iF2CWdCP+gXHNP+QYSAsRBIJsonjul9lLUCi
 8i6mUGkS0OU=
 =QLRg
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
 "It looks like a smaller batch of clk updates this time around.

  In the core framework we just have some minor tweaks and a debugfs
  feature, so not much to see there. The driver updates are fairly well
  split between AT91 and Qualcomm clk support. Adding those two drivers
  together equals about 50% of the diffstat.

  Otherwise, the big amount of work this time was on supporting
  Broadcom's Raspberry Pi firmware clks.

  Highlights:

  Core:
   - Document clk_hw_round_rate() so it gets some more use
   - Remove unused __clk_get_flags()
   - Add a prepare/enable debugfs feature similar to rate setting

  New Drivers:
   - Add support for SAMA7G5 SoC clks
   - Enable CPU clks on Qualcomm IPQ6018 SoCs
   - Enable CPU clks on Qualcomm MSM8996 SoCs
   - GPU clk support for Qualcomm SM8150 and SM8250 SoCs
   - Audio clks on Qualcomm SC7180 SoCs
   - Microchip Sparx5 DPLL clk
   - Add support for the new Renesas RZ/G2H (R8A774E1) SoC

  Updates:
   - Make defines for bcm63xx-gate clks to use in DT
   - Support BCM2711 SoC firmware clks
   - Add HDMI clks for BCM2711 SoCs
   - Add RTC related clks on Ingenic SoCs
   - Support USB PHY clks on Ingenic SoCs
   - Support gate clks on BCM6318 SoCs
   - RMU and DMAC/GPIO clock support for Actions Semi S500 SoCs
   - Use poll_timeout functions in Rockchip clk driver
   - Support Rockchip rk3288w SoC variant
   - Mark mac_lbtest critical on Rockchip rk3188
   - Add CAAM clock support for i.MX vf610 driver
   - Add MU root clock support for i.MX imx8mp driver
   - Amlogic g12: add neural network accelerator clock sources
   - Amlogic meson8: remove critical flag for main PLL divider
   - Amlogic meson8: add video decoder clock gates
   - Convert one more Renesas DT binding to json-schema
   - Enhance critical clock handling on Renesas platforms to only
     consider clocks that were enabled at boot time"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (79 commits)
  clk: qcom: gcc: Make disp gpll0 branch aon for sc7180/sdm845
  ipq806x: gcc: add support for child probe
  clk: qcom: msm8996: Make symbol 'cpu_msm8996_clks' static
  clk: qcom: ipq8074: Add correct index for PCIe clocks
  clk: <linux/clk-provider.h>: drop a duplicated word
  clk: renesas: cpg-mssr: Add r8a774e1 support
  dt-bindings: clock: renesas,cpg-mssr: Document r8a774e1
  clk: Drop duplicate selection in Kconfig
  clk: qcom: smd: Add support for MSM8992/4 rpm clocks
  clk: qcom: ipq8074: Add missing clocks for pcie
  dt-bindings: clock: qcom: ipq8074: Add missing bindings for PCIe
  Replace HTTP links with HTTPS ones: Common CLK framework
  clk: qcom: Add CPU clock driver for msm8996
  dt-bindings: clk: qcom: Add bindings for CPU clock for msm8996
  soc: qcom: Separate kryo l2 accessors from PMU driver
  clk: meson: meson8b: add the vclk2_en gate clock
  clk: meson: meson8b: add the vclk_en gate clock
  clk: qcom: Fix return value check in apss_ipq6018_probe()
  clk: bcm: dvp: Add missing module informations
  clk: meson: meson8b: Drop CLK_IS_CRITICAL from fclk_div2
  ...
2020-08-07 13:35:51 -07:00
Linus Torvalds
145ff1ec09 arm64 and cross-arch updates for 5.9:
- Removal of the tremendously unpopular read_barrier_depends() barrier,
   which is a NOP on all architectures apart from Alpha, in favour of
   allowing architectures to override READ_ONCE() and do whatever dance
   they need to do to ensure address dependencies provide LOAD ->
   LOAD/STORE ordering. This work also offers a potential solution if
   compilers are shown to convert LOAD -> LOAD address dependencies into
   control dependencies (e.g. under LTO), as weakly ordered architectures
   will effectively be able to upgrade READ_ONCE() to smp_load_acquire().
   The latter case is not used yet, but will be discussed further at LPC.
 
 - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment
   the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
   bus-specific parameter and apply the resulting changes to the device
   ID space provided by the Freescale FSL bus.
 
 - arm64 support for TLBI range operations and translation table level
   hints (part of the ARMv8.4 architecture version).
 
 - Time namespace support for arm64.
 
 - Export the virtual and physical address sizes in vmcoreinfo for
   makedumpfile and crash utilities.
 
 - CPU feature handling cleanups and checks for programmer errors
   (overlapping bit-fields).
 
 - ACPI updates for arm64: disallow AML accesses to EFI code regions and
   kernel memory.
 
 - perf updates for arm64.
 
 - Miscellaneous fixes and cleanups, most notably PLT counting
   optimisation for module loading, recordmcount fix to ignore
   relocations other than R_AARCH64_CALL26, CMA areas reserved for
   gigantic pages on 16K and 64K configurations.
 
 - Trivial typos, duplicate words.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl8oTcsACgkQa9axLQDI
 XvEj6hAAkn39mO5xrR/Vhpg3DyFPk63ZlMSX9SsOeVyaLbovT6stTs1XAZXPpnkt
 rV3gwACyGSrqH6+uey9pHgHJuPF2TdrGEVK08yVKo9KGW/6yXSIncdKFE4jUJ/WJ
 wF5j7eMET2aGzcpm5AlzMmq6HOrKB8nZac9H8/x6H+Ox2WdgJkEjOkDvyqACUyum
 N3FsTZkWj2pIkTXHNgDZ8KjxVLO8HlFaB2hkxFDl9NPlX2UTCQJ8Tg1KiPLafKaK
 gUvH4usQDFdb5RU/UWogre37J4emO0ZTApZOyju+U+PMMWlWVHjZ4isUIS9zz/AE
 JNZ23dnKZX2HrYa5p8HZx175zwj/vXUqUHCZPLvQXaAudCEhF8BVljPiG0e80FV5
 GHFUgUbylKspp01I/9L+2JvsG96Mr0e+P3Sx7L2HTI42cmtoSa14+MpoSRj7zlft
 Qcl8hfrVOjCjUnFRHa/1y1cGvnD9GbgnKJR7zgVxl9bD/Jd48r1HUtwRORZCzWFr
 mRPVbPS72fWxMzMV9DZYJm02jJY9kLX2BMl49njbB8MhAhzOvrMVzoVVtMMeRFLR
 XHeJpmg36W09FiRGe7LRXlkXIhCQzQG2bJfiphuupCfhjRAitPoq8I925G6Pig60
 c8RWaXGU7PrEsdMNrL83vekvGKgqrkoFkRVtsCoQ2X6Hvu/XdYI=
 =mh79
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 and cross-arch updates from Catalin Marinas:
 "Here's a slightly wider-spread set of updates for 5.9.

  Going outside the usual arch/arm64/ area is the removal of
  read_barrier_depends() series from Will and the MSI/IOMMU ID
  translation series from Lorenzo.

  The notable arm64 updates include ARMv8.4 TLBI range operations and
  translation level hint, time namespace support, and perf.

  Summary:

   - Removal of the tremendously unpopular read_barrier_depends()
     barrier, which is a NOP on all architectures apart from Alpha, in
     favour of allowing architectures to override READ_ONCE() and do
     whatever dance they need to do to ensure address dependencies
     provide LOAD -> LOAD/STORE ordering.

     This work also offers a potential solution if compilers are shown
     to convert LOAD -> LOAD address dependencies into control
     dependencies (e.g. under LTO), as weakly ordered architectures will
     effectively be able to upgrade READ_ONCE() to smp_load_acquire().
     The latter case is not used yet, but will be discussed further at
     LPC.

   - Make the MSI/IOMMU input/output ID translation PCI agnostic,
     augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
     bus-specific parameter and apply the resulting changes to the
     device ID space provided by the Freescale FSL bus.

   - arm64 support for TLBI range operations and translation table level
     hints (part of the ARMv8.4 architecture version).

   - Time namespace support for arm64.

   - Export the virtual and physical address sizes in vmcoreinfo for
     makedumpfile and crash utilities.

   - CPU feature handling cleanups and checks for programmer errors
     (overlapping bit-fields).

   - ACPI updates for arm64: disallow AML accesses to EFI code regions
     and kernel memory.

   - perf updates for arm64.

   - Miscellaneous fixes and cleanups, most notably PLT counting
     optimisation for module loading, recordmcount fix to ignore
     relocations other than R_AARCH64_CALL26, CMA areas reserved for
     gigantic pages on 16K and 64K configurations.

   - Trivial typos, duplicate words"

Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org
Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits)
  arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack
  arm64/mm: save memory access in check_and_switch_context() fast switch path
  arm64: sigcontext.h: delete duplicated word
  arm64: ptrace.h: delete duplicated word
  arm64: pgtable-hwdef.h: delete duplicated words
  bus: fsl-mc: Add ACPI support for fsl-mc
  bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
  of/irq: Make of_msi_map_rid() PCI bus agnostic
  of/irq: make of_msi_map_get_device_domain() bus agnostic
  dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus
  of/device: Add input id to of_dma_configure()
  of/iommu: Make of_map_rid() PCI agnostic
  ACPI/IORT: Add an input ID to acpi_dma_configure()
  ACPI/IORT: Remove useless PCI bus walk
  ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
  ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
  ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC
  arm64: enable time namespace support
  arm64/vdso: Restrict splitting VVAR VMA
  arm64/vdso: Handle faults on timens page
  ...
2020-08-03 14:11:08 -07:00
Qi Liu
f32ed8eb0e drivers/perf: Prevent forced unbinding of PMU drivers
Forcefully unbinding PMU drivers during perf sampling will lead to
a kernel panic, because the perf upper-layer framework call a NULL
pointer in this situation.

To solve this issue, "suppress_bind_attrs" should be set to true, so
that bind/unbind can be disabled via sysfs and prevent unbinding PMU
drivers during perf sampling.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1594975763-32966-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-17 10:51:44 +01:00
Qi Liu
bdc5c744c7 drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling
When users try to remove PMU modules during perf sampling, kernel panic
will happen because the pmu->read() is a NULL pointer here.

INFO on HiSilicon hip08 platform as follow:
pc : hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu]
lr : hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu]
sp : ffff800010103e90
x29: ffff800010103e90 x28: ffff0027db0c0e40
x27: ffffa29a76f129d8 x26: ffffa29a77ceb000
x25: ffffa29a773a5000 x24: ffffa29a77392000
x23: ffffddffe5943f08 x22: ffff002784285960
x21: ffff002784285800 x20: ffff0027d2e76c80
x19: ffff0027842859e0 x18: ffff80003498bcc8
x17: ffffa29a76afe910 x16: ffffa29a7583f530
x15: 16151a1512061a1e x14: 0000000000000000
x13: ffffa29a76f1e238 x12: 0000000000000001
x11: 0000000000000400 x10: 00000000000009f0
x9 : ffff8000107b3e70 x8 : ffff0027db0c1890
x7 : ffffa29a773a7000 x6 : 00000007f5131013
x5 : 00000007f5131013 x4 : 09f257d417c00000
x3 : 00000002187bd7ce x2 : ffffa29a38f0f0d8
x1 : ffffa29a38eae268 x0 : ffff0027d2e76c80
Call trace:
hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu]
hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu]
__perf_event_read+0x1a0/0x1f8
flush_smp_call_function_queue+0xa0/0x160
generic_smp_call_function_single_interrupt+0x18/0x20
handle_IPI+0x31c/0x4dc
gic_handle_irq+0x2c8/0x310
el1_irq+0xcc/0x180
arch_cpu_idle+0x4c/0x20c
default_idle_call+0x20/0x30
do_idle+0x1b4/0x270
cpu_startup_entry+0x28/0x30
secondary_start_kernel+0x1a4/0x1fc

To solve the above issue, current module should be registered to kernel,
so that try_module_get() can be invoked when perf sampling starts. This
adds the reference counting of module and could prevent users from removing
modules during sampling.

Reported-by: Haifeng Wang <wang.wanghaifeng@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1594891165-8228-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16 11:35:24 +01:00
Jay Chen
f011856ce7 perf/smmuv3: To simplify code for ioremap page in pmcg
Use the devm_platform_get_and_ioremap_resource to simplify the code
a bit.

Signed-off-by: Jay Chen <jkchen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20200706112246.92220-2-jkchen@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-13 16:20:44 +01:00
Ilia Lin
6d0efeb14b soc: qcom: Separate kryo l2 accessors from PMU driver
The driver provides kernel level API for other drivers
to access the MSM8996 L2 cache registers.
Separating the L2 access code from the PMU driver and
making it public to allow other drivers use it.
The accesses must be separated with a single spinlock,
maintained in this driver.

Signed-off-by: Ilia Lin <ilialin@codeaurora.org>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/1593766185-16346-2-git-send-email-loic.poulain@linaro.org
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-07-10 17:08:55 -07:00
Linus Torvalds
55d728b2b0 arm64 merge window fixes for -rc1
- Fix SCS debug check to report max stack usage in bytes as advertised
 - Fix typo: CONFIG_FTRACE_WITH_REGS => CONFIG_DYNAMIC_FTRACE_WITH_REGS
 - Fix incorrect mask in HiSilicon L3C perf PMU driver
 - Fix compat vDSO compilation under some toolchain configurations
 - Fix false UBSAN warning from ACPI IORT parsing code
 - Fix booting under bootloaders that ignore TEXT_OFFSET
 - Annotate debug initcall function with '__init'
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl7iMe8QHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNIp5B/46kdFZ1M8VSsGxtZMzLVZBR4MWzjx1wBD3
 Zzvcg5x0aLAvg+VephmQ5cBiQE78/KKISUdTKndevJ9feVhzz8kxbOhLB88o14+L
 Pk63p4jol8v7cJHiqcsBgSLR6MDAiY+4epsgeFA7WkO9cf529UIMO1ea2TCx0KbT
 tKniZghX5I485Fu2RHtZGLGBxQXqFBcDJUok3/IoZnp2SDyUxrzHPViFL9fHHzCb
 FNSEJijcoHfrIKiG4bPssKICmvbtcNysembDlJeyZ+5qJXqotty2M3OK+We7vPrg
 Ne5O/tQoeCt4lLuW40yEmpQzodNLG8D+isC6cFvspmPXSyHflSCz
 =EtmQ
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "arm64 fixes that came in during the merge window.

  There will probably be more to come, but it doesn't seem like it's
  worth me sitting on these in the meantime.

   - Fix SCS debug check to report max stack usage in bytes as advertised

   - Fix typo: CONFIG_FTRACE_WITH_REGS => CONFIG_DYNAMIC_FTRACE_WITH_REGS

   - Fix incorrect mask in HiSilicon L3C perf PMU driver

   - Fix compat vDSO compilation under some toolchain configurations

   - Fix false UBSAN warning from ACPI IORT parsing code

   - Fix booting under bootloaders that ignore TEXT_OFFSET

   - Annotate debug initcall function with '__init'"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: warn on incorrect placement of the kernel by the bootloader
  arm64: acpi: fix UBSAN warning
  arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
  drivers/perf: hisi: Fix wrong value for all counters enable
  arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGS
  arm64: debug: mark a function as __init to save some memory
  scs: Report SCS usage in bytes rather than number of entries
2020-06-11 12:53:23 -07:00
Shaokun Zhang
961abd78ad drivers/perf: hisi: Fix wrong value for all counters enable
In L3C uncore PMU drivers, bit16 is used to control all counters enable &
disable. Wrong value is given in the driver and its default value is 1'b1,
it can work because each PMU counter has its own control bits too.
Let's fix the wrong value.

Fixes: 2940bc4333 ("perf: hisi: Add support for HiSilicon SoC L3C PMU driver")
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1591350221-32275-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-08 15:46:41 +01:00
Linus Torvalds
533b220f7b arm64 updates for 5.8
- Branch Target Identification (BTI)
 	* Support for ARMv8.5-BTI in both user- and kernel-space. This
 	  allows branch targets to limit the types of branch from which
 	  they can be called and additionally prevents branching to
 	  arbitrary code, although kernel support requires a very recent
 	  toolchain.
 
 	* Function annotation via SYM_FUNC_START() so that assembly
 	  functions are wrapped with the relevant "landing pad"
 	  instructions.
 
 	* BPF and vDSO updates to use the new instructions.
 
 	* Addition of a new HWCAP and exposure of BTI capability to
 	  userspace via ID register emulation, along with ELF loader
 	  support for the BTI feature in .note.gnu.property.
 
 	* Non-critical fixes to CFI unwind annotations in the sigreturn
 	  trampoline.
 
 - Shadow Call Stack (SCS)
 	* Support for Clang's Shadow Call Stack feature, which reserves
 	  platform register x18 to point at a separate stack for each
 	  task that holds only return addresses. This protects function
 	  return control flow from buffer overruns on the main stack.
 
 	* Save/restore of x18 across problematic boundaries (user-mode,
 	  hypervisor, EFI, suspend, etc).
 
 	* Core support for SCS, should other architectures want to use it
 	  too.
 
 	* SCS overflow checking on context-switch as part of the existing
 	  stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
 
 - CPU feature detection
 	* Removed numerous "SANITY CHECK" errors when running on a system
 	  with mismatched AArch32 support at EL1. This is primarily a
 	  concern for KVM, which disabled support for 32-bit guests on
 	  such a system.
 
 	* Addition of new ID registers and fields as the architecture has
 	  been extended.
 
 - Perf and PMU drivers
 	* Minor fixes and cleanups to system PMU drivers.
 
 - Hardware errata
 	* Unify KVM workarounds for VHE and nVHE configurations.
 
 	* Sort vendor errata entries in Kconfig.
 
 - Secure Monitor Call Calling Convention (SMCCC)
 	* Update to the latest specification from Arm (v1.2).
 
 	* Allow PSCI code to query the SMCCC version.
 
 - Software Delegated Exception Interface (SDEI)
 	* Unexport a bunch of unused symbols.
 
 	* Minor fixes to handling of firmware data.
 
 - Pointer authentication
 	* Add support for dumping the kernel PAC mask in vmcoreinfo so
 	  that the stack can be unwound by tools such as kdump.
 
 	* Simplification of key initialisation during CPU bringup.
 
 - BPF backend
 	* Improve immediate generation for logical and add/sub
 	  instructions.
 
 - vDSO
 	- Minor fixes to the linker flags for consistency with other
 	  architectures and support for LLVM's unwinder.
 
 	- Clean up logic to initialise and map the vDSO into userspace.
 
 - ACPI
 	- Work around for an ambiguity in the IORT specification relating
 	  to the "num_ids" field.
 
 	- Support _DMA method for all named components rather than only
 	  PCIe root complexes.
 
 	- Minor other IORT-related fixes.
 
 - Miscellaneous
 	* Initialise debug traps early for KGDB and fix KDB cacheflushing
 	  deadlock.
 
 	* Minor tweaks to early boot state (documentation update, set
 	  TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
 
 	* Refactoring and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl7U9csQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNLBHCACs/YU4SM7Om5f+7QnxIKao5DBr2CnGGvdC
 yTfDghFDTLQVv3MufLlfno3yBe5G8sQpcZfcc+hewfcGoMzVZXu8s7LzH6VSn9T9
 jmT3KjDMrg0RjSHzyumJp2McyelTk0a4FiKArSIIKsJSXUyb1uPSgm7SvKVDwEwU
 JGDzL9IGilmq59GiXfDzGhTZgmC37QdwRoRxDuqtqWQe5CHoRXYexg87HwBKOQxx
 HgU9L7ehri4MRZfpyjaDrr6quJo3TVnAAKXNBh3mZAskVS9ZrfKpEH0kYWYuqybv
 znKyHRecl/rrGePV8RTMtrwnSdU26zMXE/omsVVauDfG9hqzqm+Q
 =w3qi
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "A sizeable pile of arm64 updates for 5.8.

  Summary below, but the big two features are support for Branch Target
  Identification and Clang's Shadow Call stack. The latter is currently
  arm64-only, but the high-level parts are all in core code so it could
  easily be adopted by other architectures pending toolchain support

  Branch Target Identification (BTI):

   - Support for ARMv8.5-BTI in both user- and kernel-space. This allows
     branch targets to limit the types of branch from which they can be
     called and additionally prevents branching to arbitrary code,
     although kernel support requires a very recent toolchain.

   - Function annotation via SYM_FUNC_START() so that assembly functions
     are wrapped with the relevant "landing pad" instructions.

   - BPF and vDSO updates to use the new instructions.

   - Addition of a new HWCAP and exposure of BTI capability to userspace
     via ID register emulation, along with ELF loader support for the
     BTI feature in .note.gnu.property.

   - Non-critical fixes to CFI unwind annotations in the sigreturn
     trampoline.

  Shadow Call Stack (SCS):

   - Support for Clang's Shadow Call Stack feature, which reserves
     platform register x18 to point at a separate stack for each task
     that holds only return addresses. This protects function return
     control flow from buffer overruns on the main stack.

   - Save/restore of x18 across problematic boundaries (user-mode,
     hypervisor, EFI, suspend, etc).

   - Core support for SCS, should other architectures want to use it
     too.

   - SCS overflow checking on context-switch as part of the existing
     stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.

  CPU feature detection:

   - Removed numerous "SANITY CHECK" errors when running on a system
     with mismatched AArch32 support at EL1. This is primarily a concern
     for KVM, which disabled support for 32-bit guests on such a system.

   - Addition of new ID registers and fields as the architecture has
     been extended.

  Perf and PMU drivers:

   - Minor fixes and cleanups to system PMU drivers.

  Hardware errata:

   - Unify KVM workarounds for VHE and nVHE configurations.

   - Sort vendor errata entries in Kconfig.

  Secure Monitor Call Calling Convention (SMCCC):

   - Update to the latest specification from Arm (v1.2).

   - Allow PSCI code to query the SMCCC version.

  Software Delegated Exception Interface (SDEI):

   - Unexport a bunch of unused symbols.

   - Minor fixes to handling of firmware data.

  Pointer authentication:

   - Add support for dumping the kernel PAC mask in vmcoreinfo so that
     the stack can be unwound by tools such as kdump.

   - Simplification of key initialisation during CPU bringup.

  BPF backend:

   - Improve immediate generation for logical and add/sub instructions.

  vDSO:

   - Minor fixes to the linker flags for consistency with other
     architectures and support for LLVM's unwinder.

   - Clean up logic to initialise and map the vDSO into userspace.

  ACPI:

   - Work around for an ambiguity in the IORT specification relating to
     the "num_ids" field.

   - Support _DMA method for all named components rather than only PCIe
     root complexes.

   - Minor other IORT-related fixes.

  Miscellaneous:

   - Initialise debug traps early for KGDB and fix KDB cacheflushing
     deadlock.

   - Minor tweaks to early boot state (documentation update, set
     TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).

   - Refactoring and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
  KVM: arm64: Check advertised Stage-2 page size capability
  arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
  ACPI/IORT: Remove the unused __get_pci_rid()
  arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
  arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
  arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
  arm64/cpufeature: Introduce ID_MMFR5 CPU register
  arm64/cpufeature: Introduce ID_DFR1 CPU register
  arm64/cpufeature: Introduce ID_PFR2 CPU register
  arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
  arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
  arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
  arm64: mm: Add asid_gen_match() helper
  firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
  arm64: vdso: Fix CFI directives in sigreturn trampoline
  arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
  ...
2020-06-01 15:18:27 -07:00
Jean-Philippe Brucker
10f6cd2af2 pmu/smmuv3: Clear IRQ affinity hint on device removal
Currently when trying to remove the SMMUv3 PMU module we get a
WARN_ON_ONCE from free_irq(), because the affinity hint set during probe
hasn't been properly cleared.

[  238.878383] WARNING: CPU: 0 PID: 175 at kernel/irq/manage.c:1744 free_irq+0x324/0x358
...
[  238.897263] Call trace:
[  238.897998]  free_irq+0x324/0x358
[  238.898792]  devm_irq_release+0x18/0x28
[  238.899189]  release_nodes+0x1b0/0x228
[  238.899984]  devres_release_all+0x38/0x60
[  238.900779]  device_release_driver_internal+0x10c/0x1d0
[  238.901574]  driver_detach+0x50/0xe0
[  238.902368]  bus_remove_driver+0x5c/0xd8
[  238.903448]  driver_unregister+0x30/0x60
[  238.903958]  platform_driver_unregister+0x14/0x20
[  238.905075]  arm_smmu_pmu_exit+0x1c/0xecc [arm_smmuv3_pmu]
[  238.905547]  __arm64_sys_delete_module+0x14c/0x260
[  238.906342]  el0_svc_common.constprop.0+0x74/0x178
[  238.907355]  do_el0_svc+0x24/0x90
[  238.907932]  el0_sync_handler+0x11c/0x198
[  238.908979]  el0_sync+0x158/0x180

Just like the other perf drivers, clear the affinity hint before
releasing the device.

Fixes: 7d839b4b9e ("perf/smmuv3: Add arm64 smmuv3 pmu driver")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20200422084805.237738-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-18 18:22:58 +01:00
Zhou Wang
97807325a0 drivers/perf: hisi: Permit modular builds of HiSilicon uncore drivers
This patch lets HiSilicon uncore PMU driver can be built as modules.
A common module and three specific uncore PMU driver modules will be built.

Export necessary functions in hisi_uncore_pmu module, and change
irq_set_affinity to irq_set_affinity_hint to pass compile.

Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Tested-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1588820305-174479-1-git-send-email-wangzhou1@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-18 18:18:39 +01:00
Shaokun Zhang
88562f06eb drivers/perf: hisi: Fix typo in events attribute array
Fix up one typo: wr_dr_64b -> wr_ddr_64b.

Fixes: 2bab3cf910 ("perf: hisi: Add support for HiSilicon SoC HHA PMU driver")
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1587643530-34357-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-04-30 21:53:38 +01:00
Tang Bin
1f0d97bb70 drivers/perf: arm_spe_pmu: Avoid duplicate printouts
platform_get_irq() already screams on failure, so the redundant call to
dev_err() can be removed.

Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20200402120330.19468-1-tangbin@cmss.chinamobile.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-04-30 21:49:32 +01:00
Tang Bin
5810f00ade drivers/perf: arm_dsu_pmu: Avoid duplicate printouts
platform_get_irq() already screams on failure, so the redundant call to
dev_err() can be removed.

Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20200402115940.4928-1-tangbin@cmss.chinamobile.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-04-30 21:48:10 +01:00
Alexey Budankov
cea7d0d4a5 drivers/perf: Open access for CAP_PERFMON privileged process
Open access to monitoring for CAP_PERFMON privileged process.  Providing
the access under CAP_PERFMON capability singly, without the rest of
CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
and makes operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to the monitoring remains open
for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure monitoring is discouraged with respect to CAP_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Link: http://lore.kernel.org/lkml/4ec1d6f7-548c-8d1c-f84a-cebeb9674e4e@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-16 12:19:09 -03:00
Linus Torvalds
3cd86a58f7 arm64 updates for 5.7:
- In-kernel Pointer Authentication support (previously only offered to
   user space).
 
 - ARM Activity Monitors (AMU) extension support allowing better CPU
   utilisation numbers for the scheduler (frequency invariance).
 
 - Memory hot-remove support for arm64.
 
 - Lots of asm annotations (SYM_*) in preparation for the in-kernel
   Branch Target Identification (BTI) support.
 
 - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the PMU
   init callbacks, support for new DT compatibles.
 
 - IPv6 header checksum optimisation.
 
 - Fixes: SDEI (software delegated exception interface) double-lock on
   hibernate with shared events.
 
 - Minor clean-ups and refactoring: cpu_ops accessor, cpu_do_switch_mm()
   converted to C, cpufeature finalisation helper.
 
 - sys_mremap() comment explaining the asymmetric address untagging
   behaviour.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl6DVyIACgkQa9axLQDI
 XvHkqRAAiZA2EYKiQL4M1DJ1cNTADjT7xKX9+UtYBXj7GMVhgVWdunpHVE6qtfgk
 cT6avmKrS/6PDqizJgr+Z1yX8x3Kvs57G4BvmIUKIw97mkdewvFQ9JKv6VA1vb86
 7Qrl1WzqsGg5Kj9uUfI4h+ZoT1H4C/9PQeFxJwgZRtF9DxRh8O7VeZI+JCu8Aub2
 lIkjI8rh+EpTsGT9h/PMGWUcawnKQloZ1/F+GfMAuYBvIv2RNN2xVreJtTmm4NyJ
 VcpL0KCNyAI2lGdaJg5nBLRDyGuXDm5i+PLsCSXMquI4fie00txXeD8sjbeuO0ks
 YTJ0EhmUUhbSE17go+SxYiEFE0v09i+lD5ud+B4Vmojp0KTczTta9VSgURlbb2/9
 n9biq5G3PPDNIrZqiTT2Tf4AMz1350nkbzL2gzKecM5aIzR/u3y5yII5CgfZtFnj
 7bGbyFpFpcqI7UaISPsNCxmknbTt/7ff0WM3+7SbecxI3AD2mnxsOdN9JTLyhDp+
 owjyiaWxl5zMWF9DhplLG/9BKpNWSxh3skazdOdELd8GTq2MbJlXrVG2XgXTAOh3
 y1s6RQrfw8zXh8TSqdmmzauComXIRWTum/sbVB3U8Z3AUsIeq/NTSbN5X9JyIbOP
 HOabhlVhhkI6omN1grqPX4jwUiZLZoNfn7Ez4q71549KVK/uBtA=
 =LJVX
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "The bulk is in-kernel pointer authentication, activity monitors and
  lots of asm symbol annotations. I also queued the sys_mremap() patch
  commenting the asymmetry in the address untagging.

  Summary:

   - In-kernel Pointer Authentication support (previously only offered
     to user space).

   - ARM Activity Monitors (AMU) extension support allowing better CPU
     utilisation numbers for the scheduler (frequency invariance).

   - Memory hot-remove support for arm64.

   - Lots of asm annotations (SYM_*) in preparation for the in-kernel
     Branch Target Identification (BTI) support.

   - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the
     PMU init callbacks, support for new DT compatibles.

   - IPv6 header checksum optimisation.

   - Fixes: SDEI (software delegated exception interface) double-lock on
     hibernate with shared events.

   - Minor clean-ups and refactoring: cpu_ops accessor,
     cpu_do_switch_mm() converted to C, cpufeature finalisation helper.

   - sys_mremap() comment explaining the asymmetric address untagging
     behaviour"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
  mm/mremap: Add comment explaining the untagging behaviour of mremap()
  arm64: head: Convert install_el2_stub to SYM_INNER_LABEL
  arm64: Introduce get_cpu_ops() helper function
  arm64: Rename cpu_read_ops() to init_cpu_ops()
  arm64: Declare ACPI parking protocol CPU operation if needed
  arm64: move kimage_vaddr to .rodata
  arm64: use mov_q instead of literal ldr
  arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH
  lkdtm: arm64: test kernel pointer authentication
  arm64: compile the kernel with ptrauth return address signing
  kconfig: Add support for 'as-option'
  arm64: suspend: restore the kernel ptrauth keys
  arm64: __show_regs: strip PAC from lr in printk
  arm64: unwind: strip PAC from kernel addresses
  arm64: mask PAC bits of __builtin_return_address
  arm64: initialize ptrauth keys for kernel booting task
  arm64: initialize and switch ptrauth kernel keys
  arm64: enable ptrauth earlier
  arm64: cpufeature: handle conflicts based on capability
  arm64: cpufeature: Move cpu capability helpers inside C file
  ...
2020-03-31 10:05:01 -07:00
Takashi Iwai
06236821ae perf: arm-ccn: Use scnprintf() for robustness
snprintf() is a hard-to-use function, it's especially difficult to use
it for concatenating substrings in a buffer with a limited size.
Since snprintf() returns the would-be-output size, not the actual
size, the subsequent use of snprintf() may point to the incorrect
position easily.  Although the current code doesn't actually overflow
the buffer, it's an incorrect usage.

This patch replaces such snprintf() calls with a safer version,
scnprintf().

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-17 22:45:56 +00:00
luanshi
3ba52ad55b drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer
Fix bogus NULL checks on the return value of acpi_cpu_get_madt_gicc()
by checking for a 0 'gicc->performance_interrupt' value instead.

Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02 12:07:35 +00:00
Joakim Zhang
049d919168 drivers/perf: fsl_imx8_ddr: Correct the CLEAR bit definition
When disabling a counter from ddr_perf_event_stop(), the counter value
is reset to 0 at the same time.

Preserve the counter value by performing a read-modify-write of the
PMU register and clearing only the enable bit.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02 12:07:19 +00:00
luanshi
aaa1972715 perf: arm_spe: Remove unnecessary zero check on 'nr_pages'
We already check that the 'nr_pages' is > 2, so there's no need to check
that it's != 0 later on.

Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02 11:27:21 +00:00
John Garry
0ca2c0319a perf/smmuv3: Use platform_get_irq_optional() for wired interrupt
Even though a SMMUv3 PMCG implementation may use an MSI as the form of
interrupt source, the kernel would still complain that it does not find
the wired (GSIV) interrupt in this case:

root@(none)$ dmesg | grep arm-smmu-v3-pmcg | grep "not found"
[   59.237219] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.8.auto: IRQ index 0 not found
[   59.322841] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.9.auto: IRQ index 0 not found
[   59.422155] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.10.auto: IRQ index 0 not found
[   59.539014] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.11.auto: IRQ index 0 not found
[   59.640329] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.12.auto: IRQ index 0 not found
[   59.743112] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.13.auto: IRQ index 0 not found
[   59.880577] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.14.auto: IRQ index 0 not found
[   60.017528] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.15.auto: IRQ index 0 not found

Use platform_get_irq_optional() to silence the warning.

If neither interrupt source is found, then the driver will still warn that
IRQ setup errored and the probe will fail.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-02-10 18:14:46 +00:00
Leonard Crestez
9ee68b314e perf/imx_ddr: Fix cpu hotplug state cleanup
This driver allocates a dynamic cpu hotplug state but never releases it.
If reloaded in a loop it will quickly trigger a WARN message:

	"No more dynamic states available for CPU hotplug"

Fix by calling cpuhp_remove_multi_state on remove like several other
perf pmu drivers.

Also fix the cleanup logic on probe error paths: add the missing
cpuhp_remove_multi_state call and properly check the return value from
cpuhp_state_add_instant_nocalls.

Fixes: 9a66d36cc7 ("drivers/perf: imx_ddr: Add DDR performance counter support to perf")
Acked-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 12:48:40 +00:00
Shaokun Zhang
73daf0bba3 drivers/perf: hisi: Simplify hisi_read_sccl_and_ccl_id and its comment
hisi_read_sccl_and_ccl_id is not readable and its comment is a little
confused, so simplify the function and its comment as Mark's suggestion.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:58:57 +00:00
Hanjun Guo
8ae4bcf482 perf/smmuv3: Remove the leftover put_cpu() in error path
In smmu_pmu_probe(), there is put_cpu() in the error path,
which is wrong because we use raw_smp_processor_id() to
get the cpu ID, not get_cpu(), remove it.

While we are at it, kill 'out_cpuhp_err' altogether and
just return err if we fail to add the hotplug instance.

Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-12-18 16:15:36 +00:00
Shaokun Zhang
8703317ae5 drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform
For some HiSilicon platform, the originally designed SCCL_ID and CCL_ID
are not satisfied with much rich topology when the MT is set, so we
extend the SCCL_ID to MPIDR[aff3] and CCL_ID to MPIDR[aff2]. Let's
update this for HiSilicon uncore PMU driver.

Cc: John Garry <john.garry@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-07 13:07:55 +00:00
Joakim Zhang
f1d303a1b5 perf/imx_ddr: Dump AXI ID filter info to userspace
caps/filter indicates whether HW supports AXI ID filter or not.
caps/enhanced_filter indicates whether HW supports enhanced AXI ID filter
or not.

Users can check filter features from userspace with these attributions.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
[will: reworked cap switch to be less error-prone]
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 17:25:16 +00:00
Joakim Zhang
d3eeece9a8 perf/imx_ddr: Add driver for DDR PMU in i.MX8MPlus
Add driver for DDR PMU in i.MX8MPlus.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 16:27:34 +00:00
Joakim Zhang
44f8bd014a perf/imx_ddr: Add enhanced AXI ID filter support
With DDR_CAP_AXI_ID_FILTER quirk, indicating HW supports AXI ID filter
which only can get bursts from DDR transaction, i.e. DDR read/write
requests.

This patch add DDR_CAP_AXI_ID_ENHANCED_FILTER quirk, indicating HW
supports AXI ID filter which can get bursts and bytes from DDR
transaction at the same time. We hope PMU always return bytes in the
driver due to it is more meaningful for users.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 16:27:34 +00:00
Ganapatrao Prabhakerrao Kulkarni
5e2c27e833 drivers/perf: Add CCPI2 PMU support in ThunderX2 UNCORE driver.
CCPI2 is a low-latency high-bandwidth serial interface for inter socket
connectivity of ThunderX2 processors.

CCPI2 PMU supports up to 8 counters per socket. Counters are
independently programmable to different events and can be started and
stopped individually. The CCPI2 counters are 64-bit and do not overflow
in normal operation.

Signed-off-by: Ganapatrao Prabhakerrao Kulkarni <gkulkarni@marvell.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-29 10:08:46 +00:00
Marek Bykowski
126b0a1700 perf: arm-ccn: Enable stats for CCN-512 interconnect
Add compatible string for the ARM CCN-512 interconnect

Acked-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
Signed-off-by: Boleslaw Malecki <boleslaw.malecki@tieto.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-28 15:40:42 +00:00
YueHaibing
c8b0de762e perf/smmuv3: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-01 12:35:44 +01:00
YueHaibing
504db0f826 perf/arm-cci: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-01 12:28:47 +01:00
YueHaibing
1c8d96b41d perf/arm-ccn: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-01 12:28:46 +01:00
YueHaibing
7fdd7f7c33 perf: xgene: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-01 12:28:46 +01:00
YueHaibing
42c184ade4 perf: hisi: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-01 12:28:46 +01:00
Will Deacon
ac12cf85d6 Merge branches 'for-next/52-bit-kva', 'for-next/cpu-topology', 'for-next/error-injection', 'for-next/perf', 'for-next/psci-cpuidle', 'for-next/rng', 'for-next/smpboot', 'for-next/tbi' and 'for-next/tlbi' into for-next/core
* for-next/52-bit-kva: (25 commits)
  Support for 52-bit virtual addressing in kernel space

* for-next/cpu-topology: (9 commits)
  Move CPU topology parsing into core code and add support for ACPI 6.3

* for-next/error-injection: (2 commits)
  Support for function error injection via kprobes

* for-next/perf: (8 commits)
  Support for i.MX8 DDR PMU and proper SMMUv3 group validation

* for-next/psci-cpuidle: (7 commits)
  Move PSCI idle code into a new CPUidle driver

* for-next/rng: (4 commits)
  Support for 'rng-seed' property being passed in the devicetree

* for-next/smpboot: (3 commits)
  Reduce fragility of secondary CPU bringup in debug configurations

* for-next/tbi: (10 commits)
  Introduce new syscall ABI with relaxed requirements for pointer tags

* for-next/tlbi: (6 commits)
  Handle spurious page faults arising from kernel space
2019-08-30 12:46:12 +01:00
Joakim Zhang
c12c0288e3 perf/imx_ddr: Add support for AXI ID filtering
AXI filtering is used by events 0x41 and 0x42 to count reads or writes
with an ARID or AWID matching a specified filter. The filter is exposed
to userspace as an (ID, MASK) pair, where each set bit in the mask
causes the corresponding bit in the ID to be ignored when matching
against the ID of memory transactions for the purposes of incrementing
the counter.

For example:

  # perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xff,axi_id=0x800/ cmd

will count all read transactions from AXI IDs 0x800 - 0x8ff. If the
'axi_mask' is omitted, then it is treated as 0x0 which means that the
'axi_id' will be matched exactly.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-28 14:16:55 +01:00
Robin Murphy
3c9347351a perf/smmuv3: Validate groups for global filtering
With global filtering, it becomes possible for users to construct
self-contradictory groups with conflicting filters. Make sure we
cover that when initially validating events.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-27 19:37:04 +01:00
Robin Murphy
33e84ea433 perf/smmuv3: Validate group size
Ensure that a group will actually fit into the available counters.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-27 19:37:04 +01:00
Stephen Boyd
228f855fb5 perf: Remove dev_err() usage after platform_get_irq()
We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.

// <smpl>
@@
expression ret;
struct platform_device *E;
@@

ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);

if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>

While we're here, remove braces on if statements that only have one
statement (manually).

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-31 17:15:20 +01:00
Leonard Crestez
4b9ace9c25 perf/imx_ddr: Add MODULE_DEVICE_TABLE
This is required for automatic probing when driver is built as a module.

Fixes: 9a66d36cc7 ("drivers/perf: imx_ddr: Add DDR performance counter support to perf")
Acked-by: Frank Li <frank.li@nxp.com>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-31 17:15:16 +01:00
Will Deacon
0d7fd70f26 drivers/perf: arm_pmu: Fix failure path in PM notifier
Handling of the CPU_PM_ENTER_FAILED transition in the Arm PMU PM
notifier code incorrectly skips restoration of the counters. Fix the
logic so that CPU_PM_ENTER_FAILED follows the same path as CPU_PM_EXIT.

Cc: <stable@vger.kernel.org>
Fixes: da4e4f18af ("drivers/perf: arm_pmu: implement CPU_PM notifier")
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 11:43:48 +01:00
Mauro Carvalho Chehab
59809fe882 docs: perf: move to the admin-guide
The perf infrastructure is used for userspace to track issues.
At least a good part of what's described here is related to
it.

So, add it to the admin-guide.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15 09:20:27 -03:00
Mauro Carvalho Chehab
6baec31591 docs: perf: convert to ReST
Rename the perf documentation files to ReST, add an
index for them and adjust in order to produce a nice html
output via the Sphinx build system.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15 09:20:26 -03:00
Linus Torvalds
dfd437a257 arm64 updates for 5.3:
- arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}
 
 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly
 
 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)
 
 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG
   and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)
 
 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)
 
 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic
 
 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms
 
 - perf: DDR performance monitor support for iMX8QXP
 
 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers
 
 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent
 
 - arm64 do_page_fault() and hugetlb cleanups
 
 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)
 
 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags'
   introduced in 5.1)
 
 - CONFIG_RANDOMIZE_BASE now enabled in defconfig
 
 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area
 
 - Make ZONE_DMA32 configurable
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl0eHqcACgkQa9axLQDI
 XvFyNA/+L+bnkz8m3ncydlqqfXomQn4eJJVQ8Uksb0knJz+1+3CUxxbO4ry4jXZN
 fMkbggYrDPRKpDbsUl0lsRipj7jW9bqan+N37c3SWqCkgb6HqDaHViwxdx6Ec/Uk
 gHudozDSPh/8c7hxGcSyt/CFyuW6b+8eYIQU5rtIgz8aVY2BypBvS/7YtYCbIkx0
 w4CFleRTK1zXD5mJQhrc6jyDx659sVkrAvdhf6YIymOY8nBTv40vwdNo3beJMYp8
 Po/+0Ixu+VkHUNtmYYZQgP/AGH96xiTcRnUqd172JdtRPpCLqnLqwFokXeVIlUKT
 KZFMDPzK+756Ayn4z4huEePPAOGlHbJje8JVNnFyreKhVVcCotW7YPY/oJR10bnc
 eo7yD+DxABTn+93G2yP436bNVa8qO1UqjOBfInWBtnNFJfANIkZweij/MQ6MjaTA
 o7KtviHnZFClefMPoiI7HDzwL8XSmsBDbeQ04s2Wxku1Y2xUHLx4iLmadwLQ1ZPb
 lZMTZP3N/T1554MoURVA1afCjAwiqU3bt1xDUGjbBVjLfSPBAn/25IacsG9Li9AF
 7Rp1M9VhrfLftjFFkB2HwpbhRASOxaOSx+EI3kzEfCtM2O9I1WHgP3rvCdc3l0HU
 tbK0/IggQicNgz7GSZ8xDlWPwwSadXYGLys+xlMZEYd3pDIOiFc=
 =0TDT
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}

 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly

 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)

 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new
   XAFLAG and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)

 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)

 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic

 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms

 - perf: DDR performance monitor support for iMX8QXP

 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers

 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent

 - arm64 do_page_fault() and hugetlb cleanups

 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)

 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the
   'arm_boot_flags' introduced in 5.1)

 - CONFIG_RANDOMIZE_BASE now enabled in defconfig

 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area

 - Make ZONE_DMA32 configurable

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
  perf: arm_spe: Enable ACPI/Platform automatic module loading
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  x86/entry: Simplify _TIF_SYSCALL_EMU handling
  arm64: rename dump_instr as dump_kernel_instr
  arm64/mm: Drop [PTE|PMD]_TYPE_FAULT
  arm64: Implement panic_smp_self_stop()
  arm64: Improve parking of stopped CPUs
  arm64: Expose FRINT capabilities to userspace
  arm64: Expose ARMv8.5 CondM capability to userspace
  arm64: defconfig: enable CONFIG_RANDOMIZE_BASE
  arm64: ARM64_MODULES_PLTS must depend on MODULES
  arm64: bpf: do not allocate executable memory
  arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages
  arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
  arm64: module: create module allocations without exec permissions
  arm64: Allow user selection of ARM64_MODULE_PLTS
  acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
  arm64: Allow selecting Pseudo-NMI again
  ...
2019-07-08 09:54:55 -07:00
Jeremy Linton
d482e575fb perf: arm_spe: Enable ACPI/Platform automatic module loading
Lets add the MODULE_TABLE and platform id_table entries so that
the SPE driver can attach to the ACPI platform device created by
the core pmu code.

Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27 16:53:53 +01:00
Jeremy Linton
d24a0c7099 arm_pmu: acpi: spe: Add initial MADT/SPE probing
ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPI's. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.

Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27 16:53:42 +01:00
Thomas Gleixner
d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Thomas Gleixner
caab277b1d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:07 +02:00
Frank Li
9a66d36cc7 drivers/perf: imx_ddr: Add DDR performance counter support to perf
Add DDR performance monitor support for iMX8QXP. The PMU consists of 3
programmable event counters and a single dedicated cycle counter.

Example usage:

 $ perf stat -a -e \
   imx8_ddr0/read-cycles/,imx8_ddr0/write-cycles/,imx8_ddr0/precharge/ ls

- or -

 $ perf stat -a -e \
   imx8_ddr0/cycles/,imx8_ddr0/read-access/,imx8_ddr0/write-access/ ls

Other events are supported, and advertised via perf list.

Reviewed-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[will: rewrote commit message/kconfig and used #defines for dev/cpuhp names]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-13 11:07:57 +01:00
Thomas Gleixner
97fb5e8d9b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 284
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 and
  only version 2 as published by the free software foundation this
  program is distributed in the hope that it will be useful but
  without any warranty without even the implied warranty of
  merchantability or fitness for a particular purpose see the gnu
  general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 294 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.825281744@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thomas Gleixner
1802d0beec treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 655 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070034.575739538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:41 -07:00
Linus Torvalds
54dee40637 First round of arm64 fixes for -rc2
- Fix SPE probe failure when backing auxbuf with high-order pages
 
 - Fix handling of DMA allocations from outside of the vmalloc area
 
 - Fix generation of build-id ELF section for vDSO object
 
 - Disable huge I/O mappings if kernel page table dumping is enabled
 
 - A few other minor fixes (comments, kconfig etc)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzlRT0ACgkQt6xw3ITB
 YzRGOwgArUDryBedDdkxAvjx7fk8O+qjtWctAhdPtyuXIvVLOc3tpiKlayCguF/a
 clqr4qAfxswoDLHRMwhh7xdv955A2vraHQWlzvGUj2O2M4mG8RdbVJLm3NxpA09m
 dufjSuFcwxcou2c4rXbSXSB4AYJXPmQJiad04VsWj68+TVehy0P45zaPcjHsPNPI
 D9sTa9XhBlNa0qpJG7tP9T8FS/QP/hpWHn8v0z/DQ4QetKRTstkpwD5kmJox8WmM
 Bw593bvQQ2+5q9g+z0FM3M/7yHwTJw2RLnnIb29YsW8MxM3rUeqt+FMA2OALBgbi
 0m7WoTZwO9hDQuPU1DDvZUtw3iOpeg==
 =buiS
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Fix SPE probe failure when backing auxbuf with high-order pages

 - Fix handling of DMA allocations from outside of the vmalloc area

 - Fix generation of build-id ELF section for vDSO object

 - Disable huge I/O mappings if kernel page table dumping is enabled

 - A few other minor fixes (comments, kconfig etc)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: vdso: Explicitly add build-id option
  arm64/mm: Inhibit huge-vmap with ptdump
  arm64: Print physical address of page table base in show_pte()
  arm64: don't trash config with compat symbol if COMPAT is disabled
  arm64: assembler: Update comment above cond_yield_neon() macro
  drivers/perf: arm_spe: Don't error on high-order pages for aux buf
  arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable
2019-05-22 08:36:16 -07:00
Thomas Gleixner
1ccea77e2a treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not see http www gnu org licenses

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details [based]
  [from] [clk] [highbank] [c] you should have received a copy of the
  gnu general public license along with this program if not see http
  www gnu org licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 355 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 11:28:45 +02:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Thomas Gleixner
457c899653 treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Will Deacon
14ae42a6f0 drivers/perf: arm_spe: Don't error on high-order pages for aux buf
Since commit 5768402fd9 ("perf/ring_buffer: Use high order allocations
for AUX buffers optimistically"), the perf core tends to back aux buffer
allocations with high-order pages with the order encoded in the
PagePrivate data. The Arm SPE driver explicitly rejects such pages,
causing the perf tool to fail with:

  | failed to mmap with 12 (Cannot allocate memory)

In actual fact, we can simply treat these pages just like any other
since the perf core takes care to populate the page array appropriately.
In theory we could try to map with PMDs where possible, but for now,
let's just get things working again.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Fixes: 5768402fd9 ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-13 18:01:56 +01:00
Robin Murphy
9bcb929f96 perf/arm-ccn: Clean up CPU hotplug handling
Like arm-cci, arm-ccn has the same issue of disabling preemption around
operations which can take mutexes. Again, remove the definite bug by
simply not trying to fight the theoretical races. And since we are
touching the hotplug handling code, take the opportunity to streamline
it, as there's really no need to store a full-sized cpumask to keep
track of a single CPU ID.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23 12:29:37 +01:00
Robin Murphy
0d2e2a82d4 perf/arm-cci: Remove broken race mitigation
Uncore PMU drivers face an awkward cyclic dependency wherein:

 - They have to pick a valid online CPU to associate with before
   registering the PMU device, since it will get exposed to userspace
   immediately.
 - The PMU registration has to be be at least partly complete before
   hotplug events can be handled, since trying to migrate an
   uninitialised context would be bad.
 - The hotplug handler has to be ready as soon as a CPU is chosen, lest
   it go offline without the user-visible cpumask value getting updated.

The arm-cci driver has tried to solve this by using get_cpu() to pick
the current CPU and prevent it from disappearing while both
registrations are performed, but that results in taking mutexes with
preemption disabled, which makes certain configurations very unhappy:

[ 1.983337] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:2004
[ 1.983340] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[ 1.983342] Preemption disabled at:
[ 1.983353] [<ffffff80089801f4>] cci_pmu_probe+0x1dc/0x488
[ 1.983360] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.20-rt8-yocto-preempt-rt #1
[ 1.983362] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
[ 1.983364] Call trace:
[ 1.983369] dump_backtrace+0x0/0x158
[ 1.983372] show_stack+0x24/0x30
[ 1.983378] dump_stack+0x80/0xa4
[ 1.983383] ___might_sleep+0x138/0x160
[ 1.983386] __might_sleep+0x58/0x90
[ 1.983391] __rt_mutex_lock_state+0x30/0xc0
[ 1.983395] _mutex_lock+0x24/0x30
[ 1.983400] perf_pmu_register+0x2c/0x388
[ 1.983404] cci_pmu_probe+0x2bc/0x488
[ 1.983409] platform_drv_probe+0x58/0xa8

It is not feasible to resolve all the possible races outside of the perf
core itself, so address the immediate bug by following the example of
nearly every other PMU driver and not even trying to do so. Registering
the hotplug notifier first should minimise the window in which things
can go wrong, so that's about as much as we can reasonably do here. This
also revealed an additional race in assigning the global pointer too
late relative to the hotplug notifier, which gets fixed in the process.

Reported-by: Li, Meng <Meng.Li@windriver.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23 12:29:23 +01:00
Shameer Kolothum
24062fe858 perf/smmuv3: Enable HiSilicon Erratum 162001800 quirk
HiSilicon erratum 162001800 describes the limitation of
SMMUv3 PMCG implementation on HiSilicon Hip08 platforms.

On these platforms, the PMCG event counter registers
(SMMU_PMCG_EVCNTRn) are read only and as a result it
is not possible to set the initial counter period value
on event monitor start.

To work around this, the current value of the counter
is read and used for delta calculations. OEM information
from ACPI header is used to identify the affected hardware
platforms.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
[will: update silicon-errata.txt and add reason string to acpi match]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-04 16:49:22 +01:00
Shameer Kolothum
f202cdab3b perf/smmuv3: Add MSI irq support
This adds support for MSI-based counter overflow interrupt.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-04 16:49:22 +01:00
Neil Leeder
7d839b4b9e perf/smmuv3: Add arm64 smmuv3 pmu driver
Adds a new driver to support the SMMUv3 PMU and add it into the
perf events framework.

Each SMMU node may have multiple PMUs associated with it, each of
which may support different events.

SMMUv3 PMCG devices are named as smmuv3_pmcg_<phys_addr_page> where
<phys_addr_page> is the physical page address of the SMMU PMCG
wrapped to 4K boundary. For example, the PMCG at 0xff88840000 is
named smmuv3_pmcg_ff88840

Filtering by stream id is done by specifying filtering parameters
with the event. options are:
   filter_enable    - 0 = no filtering, 1 = filtering enabled
   filter_span      - 0 = exact match, 1 = pattern match
   filter_stream_id - pattern to filter against

Example: perf stat -e smmuv3_pmcg_ff88840/transaction,filter_enable=1,
                       filter_span=1,filter_stream_id=0x42/ -a netperf

Applies filter pattern 0x42 to transaction events, which means events
matching stream ids 0x42 & 0x43 are counted as only upper StreamID
bits are required to match the given filter. Further filtering
information is available in the SMMU documentation.

SMMU events are not attributable to a CPU, so task mode and sampling
are not supported.

Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
[will: fold in review feedback from Robin]
[will: rewrite Kconfig text and allow building as a module]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-04 16:49:21 +01:00
Linus Torvalds
3d8dfe75ef arm64 updates for 5.1:
- Pseudo NMI support for arm64 using GICv3 interrupt priorities
 
 - uaccess macros clean-up (unsafe user accessors also merged but
   reverted, waiting for objtool support on arm64)
 
 - ptrace regsets for Pointer Authentication (ARMv8.3) key management
 
 - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by the
   riscv maintainers)
 
 - arm64/perf updates: PMU bindings converted to json-schema, unused
   variable and misleading comment removed
 
 - arm64/debug fixes to ensure checking of the triggering exception level
   and to avoid the propagation of the UNKNOWN FAR value into the si_code
   for debug signals
 
 - Workaround for Fujitsu A64FX erratum 010001
 
 - lib/raid6 ARM NEON optimisations
 
 - NR_CPUS now defaults to 256 on arm64
 
 - Minor clean-ups (documentation/comments, Kconfig warning, unused
   asm-offsets, clang warnings)
 
 - MAINTAINERS update for list information to the ARM64 ACPI entry
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlyCl0cACgkQa9axLQDI
 XvEyKxAAiogBZLbyhcy8bTUHVzVoJE0FyAkdO2wWnnaff2Ohkhy1Y/npv33IeK2q
 RknxqDIx2DUUVPJNRZGoI/WwBtTZdKaAnW4rIKG84yC1eAkFcd96WQasaZzcp1qY
 HmvbJiYXM0bh+0J7i3Wgry/QzOkrltJFJW2kp6Wd5aFE+R1WyWyxT6d+Fp0J3vlA
 bT70jlpBK6LXEOmmBS+04Ml02+8MvaGxIl8EInBHSfDLRLErj5E8n41rRHKUiSWz
 maWI+kVoLYwOE68xiZlDftUBEeQpUSWgg2nxeK+640QSl1wJmVcRcY9nm6TZeMG2
 AiZTR9a7cP5rrdSN5suUmb7d4AMMVlVMisGDlwb+9oCxeTRDzg0uwACaVgHfPqQr
 UeBdHbL9nStN7uBH23H8L9mKk+tqpFmk0sgzdrKejOwysAiqWV8aazb/Na3qnVRl
 J1B5opxMnGOsjXmHvtG/tiZl281Uwz5ZmzfLmIY3gUZgUgdA3511Egp0ry5y1dzJ
 SkYC4Hmzb2ybQvXGIDDa3OzCwXXiqyqKsO+O8Egg1k4OIwbp3w+NHE7gKeA+dMgD
 gjN7zEalCUi46Q28xiCPEb+88BpQ18czIWGQLb9mAnmYeZPjqqenXKXuRHr4lgVe
 jPURJ/vqvFEglZJN1RDuQHKzHEcm5f2XE566sMZYdSoeiUCb0QM=
 =2U56
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Pseudo NMI support for arm64 using GICv3 interrupt priorities

 - uaccess macros clean-up (unsafe user accessors also merged but
   reverted, waiting for objtool support on arm64)

 - ptrace regsets for Pointer Authentication (ARMv8.3) key management

 - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by
   the riscv maintainers)

 - arm64/perf updates: PMU bindings converted to json-schema, unused
   variable and misleading comment removed

 - arm64/debug fixes to ensure checking of the triggering exception
   level and to avoid the propagation of the UNKNOWN FAR value into the
   si_code for debug signals

 - Workaround for Fujitsu A64FX erratum 010001

 - lib/raid6 ARM NEON optimisations

 - NR_CPUS now defaults to 256 on arm64

 - Minor clean-ups (documentation/comments, Kconfig warning, unused
   asm-offsets, clang warnings)

 - MAINTAINERS update for list information to the ARM64 ACPI entry

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
  arm64: mmu: drop paging_init comments
  arm64: debug: Ensure debug handlers check triggering exception level
  arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
  Revert "arm64: uaccess: Implement unsafe accessors"
  arm64: avoid clang warning about self-assignment
  arm64: Kconfig.platforms: fix warning unmet direct dependencies
  lib/raid6: arm: optimize away a mask operation in NEON recovery routine
  lib/raid6: use vdupq_n_u8 to avoid endianness warnings
  arm64: io: Hook up __io_par() for inX() ordering
  riscv: io: Update __io_[p]ar() macros to take an argument
  asm-generic/io: Pass result of I/O accessor to __io_[p]ar()
  arm64: Add workaround for Fujitsu A64FX erratum 010001
  arm64: Rename get_thread_info()
  arm64: Remove documentation about TIF_USEDFPU
  arm64: irqflags: Fix clang build warnings
  arm64: Enable the support of pseudo-NMIs
  arm64: Skip irqflags tracing for NMI in IRQs disabled context
  arm64: Skip preemption when exiting an NMI
  arm64: Handle serror in NMI context
  irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI
  ...
2019-03-10 10:17:23 -07:00
Mathieu Poirier
840018668c perf/aux: Make perf_event accessible to setup_aux()
When pmu::setup_aux() is called the coresight PMU needs to know which
sink to use for the session by looking up the information in the
event's attr::config2 field.

As such simply replace the cpu information by the complete perf_event
structure and change all affected customers.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/20190131184714.20388-2-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:39 -03:00
YueHaibing
cf2d65ec1d perf: xgene: Remove set but not used variable 'config'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/perf/xgene_pmu.c: In function 'xgene_perf_stop':
drivers/perf/xgene_pmu.c:1055:6: warning:
 variable 'config' set but not used [-Wunused-but-set-variable]

It never used since introduction.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-01-31 10:55:05 +00:00
Andrew Murray
a66b0010f8 perf/drivers: Strengthen exclusion checks with PERF_PMU_CAP_NO_EXCLUDE
For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

This change means that qcom_{l2|l3}_pmu will now also indicate that
they do not support exclude_{host|guest} and that xgene_pmu does
not also support exclude_idle and exclude_hv.

Note that for qcom_l2_pmu we now implictly return -EINVAL instead
of -EOPNOTSUPP. This change will result in the perf userspace
utility retrying the perf_event_open system call with fallback
event attributes that do not fail.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-9-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-21 11:01:25 +01:00
Andrew Murray
3065639858 For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-8-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-21 11:01:24 +01:00
Andrew Murray
1d899c0e9b perf/core, arch/arm: Use PERF_PMU_CAP_NO_EXCLUDE conditionally
The ARM PMU driver can be used to represent a variety of ARM based
PMUs. Some of these PMUs do not provide support for context
exclusion, where this is the case we advertise the
PERF_PMU_CAP_NO_EXCLUDE capability to ensure that perf prevents us
from handling events where any exclusion flags are set.

Where an ARM PMU driver has the set_event_filter function implemented,
we rely on it to perform exclusion checks. At present some of these
functions do not test for all of the available exclude flags.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-6-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-21 11:01:22 +01:00
Shaokun Zhang
eb4f521325 drivers/perf: hisi: Fixup one DDRC PMU register offset
For DDRC PMU, each PMU counter is fixed-purpose. There is a mismatch
between perf list and driver definition on rw_chg event.
# perf list | grep chg
  hisi_sccl1_ddrc0/rnk_chg/                          [Kernel PMU event]
  hisi_sccl1_ddrc0/rw_chg/                           [Kernel PMU event]
But the register offset of rw_chg event is not defined in the driver,
meanwhile bnk_chg register offset is mis-defined, let's fixup it.

Fixes: 904dcf03f0 ("perf: hisi: Add support for HiSilicon SoC DDRC PMU driver")
Cc: stable@vger.kernel.org
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reported-by: Weijian Huang <huangweijian4@hisilicon.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-01-04 10:13:27 +00:00
Kulkarni, Ganapatrao
69c32972d5 drivers/perf: Add Cavium ThunderX2 SoC UNCORE PMU driver
This patch adds a perf driver for the PMU UNCORE devices DDR4 Memory
Controller(DMC) and Level 3 Cache(L3C). Each PMU supports up to 4
counters. All counters lack overflow interrupt and are
sampled periodically.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
[will: consistent enum cpuhp_state naming]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 13:03:17 +00:00
Nicholas Mc Guire
754a58db6a perf: arm_spe: handle devm_kasprintf() failure
devm_kasprintf() may return NULL on failure of internal allocation
thus the assignment to 'name' is not safe if unchecked. If NULL
is passed in for name then perf_pmu_register() would not fail
but rather silently jump to skip_type which is not the intent
here. As perf_pmu_register() may also return -ENOMEM returning
-ENOMEM in the (unlikely) failure case of devm_kasprintf() should
be fine here as well.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Fixes: d5d9696b03 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[will: reworded error message]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-29 16:29:16 +00:00
Hoan Tran
cbb72a3c19 drivers/perf: xgene: Add CPU hotplug support
If the CPU assigned to the xgene PMU is taken offline, then subsequent
perf invocations on the PMU will fail:

  # echo 0 > /sys/devices/system/cpu/cpu0/online
  # perf stat -a -e l3c0/cycle-count/,l3c0/write/ sleep 1
    Error:
    The sys_perf_event_open() syscall returned with 19 (No such device) for event (l3c0/cycle-count/).
    /bin/dmesg may provide additional information.
    No CONFIG_PERF_EVENTS=y kernel support configured?

This patch implements a hotplug notifier in the xgene PMU driver so that
the PMU context is migrated to another online CPU should its assigned
CPU disappear.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Hoan Tran <hoan.tran@amperecomputing.com>
[will: Made naming of new cpuhp_state enum entry consistent]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21 16:28:00 +00:00
Jeremy Linton
472dc9fa7c perf: arm_spe: Enable automatic DT loading
When built as a module, the spe driver isn't automatically
loaded on DT systems. Add the MODULE_DEVICE_TABLE entry.

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21 13:16:34 +00:00
Linus Torvalds
5289851171 arm64 updates for 4.20:
- Core mmu_gather changes which allow tracking the levels of page-table
   being cleared together with the arm64 low-level flushing routines
 
 - Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
   mitigate Spectre-v4 dynamically without trapping to EL3 firmware
 
 - Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
 
 - Optimise emulation of MRS instructions to ID_* registers on ARMv8.4
 
 - Support for Common Not Private (CnP) translations allowing threads of
   the same CPU to share the TLB entries
 
 - Accelerated crc32 routines
 
 - Move swapper_pg_dir to the rodata section
 
 - Trap WFI instruction executed in user space
 
 - ARM erratum 1188874 workaround (arch_timer)
 
 - Miscellaneous fixes and clean-ups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlvKGdEACgkQa9axLQDI
 XvGSQBAAiOH6aQABL4TB7c5KIc7C+Unjm6QCFCoaeGWoHuemnM6cFJ7RQsi0GqnP
 dVEX5V/FKfmeTWO5g24Ah+MbTm3Bt6+81gywAmi1rrHhmCaCIPjT7xDqy/WsLlvt
 7WtgegSGvQ7DIMj2dbfFav6+ra67qAiYZTc46jvuynVl6DrE3BCiyTDbXAWt2nzP
 Xf3un4AHRbg3UEMUZTLqU5q4z0tbM6rEAZru8O0UOTnD2q7uttUqW3Ab7fpuEkkj
 lEVrMWD3h8SJg+Df9CbXmCNOjh4VhwBwDb5LgO8vA/AcyV/YLEF5b2OUAk/28qwo
 0GBwjqRyI4+YQ9LPg41MhGzrlnta0HCdYoeNLgLQZiDcUkuSfGhoA+MNZNOR8B08
 sCWF7F6f8UIQm8KMMBiYYdlVyUYgHLsWE/1+CyeLV0oIoWT5k3c+Xe3pho9KpVb0
 Co04TqMlqalry0sbevHz5c55H7iWIjB1Tpo3SxM105dVJVibXRPXkz+WZ5iPO+xa
 ex2j1kjNdA/AUzrSCZ5lh22zhg0WsfwD++E5meAaJMxieim8FeZDRga43rowJ0BA
 zMbSNB/+NDFZ9EhC40VaUfKk8Tkgiug9J5swv0+v7hy1QLDyydHhbOecTuIueauM
 6taiT2Iuov5yFng1eonYj4htvouVF4WOhPGthFPJMOcrB9mLMhs=
 =3Mc8
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Apart from some new arm64 features and clean-ups, this also contains
  the core mmu_gather changes for tracking the levels of the page table
  being cleared and a minor update to the generic
  compat_sys_sigaltstack() introducing COMPAT_SIGMINSKSZ.

  Summary:

   - Core mmu_gather changes which allow tracking the levels of
     page-table being cleared together with the arm64 low-level flushing
     routines

   - Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
     mitigate Spectre-v4 dynamically without trapping to EL3 firmware

   - Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack

   - Optimise emulation of MRS instructions to ID_* registers on ARMv8.4

   - Support for Common Not Private (CnP) translations allowing threads
     of the same CPU to share the TLB entries

   - Accelerated crc32 routines

   - Move swapper_pg_dir to the rodata section

   - Trap WFI instruction executed in user space

   - ARM erratum 1188874 workaround (arch_timer)

   - Miscellaneous fixes and clean-ups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits)
  arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work
  arm64: cpufeature: Trap CTR_EL0 access only where it is necessary
  arm64: cpufeature: Fix handling of CTR_EL0.IDC field
  arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
  Documentation/arm64: HugeTLB page implementation
  arm64: mm: Use __pa_symbol() for set_swapper_pgd()
  arm64: Add silicon-errata.txt entry for ARM erratum 1188873
  Revert "arm64: uaccess: implement unsafe accessors"
  arm64: mm: Drop the unused cpu parameter
  MAINTAINERS: fix bad sdei paths
  arm64: mm: Use #ifdef for the __PAGETABLE_P?D_FOLDED defines
  arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c
  arm64: xen: Use existing helper to check interrupt status
  arm64: Use daifflag_restore after bp_hardening
  arm64: daifflags: Use irqflags functions for daifflags
  arm64: arch_timer: avoid unused function warning
  arm64: Trap WFI executed in userspace
  arm64: docs: Document SSBS HWCAP
  arm64: docs: Fix typos in ELF hwcaps
  arm64/kprobes: remove an extra semicolon in arch_prepare_kprobe
  ...
2018-10-22 17:30:06 +01:00
Will Deacon
ca2b497253 arm64: perf: Reject stand-alone CHAIN events for PMUv3
It doesn't make sense for a perf event to be configured as a CHAIN event
in isolation, so extend the arm_pmu structure with a ->filter_match()
function to allow the backend PMU implementation to reject CHAIN events
early.

Cc: <stable@vger.kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-12 15:25:17 +01:00
Rob Herring
03630b3b76 perf: Convert to using %pOFn instead of device_node.name
In preparation to remove the node name pointer from struct device_node,
convert printf users to use the %pOFn format specifier.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-01 11:33:17 +01:00
Linus Torvalds
d5acba26bf Char/Misc driver patches for 4.19-rc1
Here is the bit set of char/misc drivers for 4.19-rc1
 
 There is a lot here, much more than normal, seems like everyone is
 writing new driver subsystems these days...  Anyway, major things here
 are:
 	- new FSI driver subsystem, yet-another-powerpc low-level
 	  hardware bus
 	- gnss, finally an in-kernel GPS subsystem to try to tame all of
 	  the crazy out-of-tree drivers that have been floating around
 	  for years, combined with some really hacky userspace
 	  implementations.  This is only for GNSS receivers, but you
 	  have to start somewhere, and this is great to see.
 Other than that, there are new slimbus drivers, new coresight drivers,
 new fpga drivers, and loads of DT bindings for all of these and existing
 drivers.
 
 Full details of everything is in the shortlog.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCW3g7ew8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykfBgCeOG0RkSI92XVZe0hs/QYFW9kk8JYAnRBf3Qpm
 cvW7a+McOoKz/MGmEKsi
 =TNfn
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the bit set of char/misc drivers for 4.19-rc1

  There is a lot here, much more than normal, seems like everyone is
  writing new driver subsystems these days... Anyway, major things here
  are:

   - new FSI driver subsystem, yet-another-powerpc low-level hardware
     bus

   - gnss, finally an in-kernel GPS subsystem to try to tame all of the
     crazy out-of-tree drivers that have been floating around for years,
     combined with some really hacky userspace implementations. This is
     only for GNSS receivers, but you have to start somewhere, and this
     is great to see.

  Other than that, there are new slimbus drivers, new coresight drivers,
  new fpga drivers, and loads of DT bindings for all of these and
  existing drivers.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (255 commits)
  android: binder: Rate-limit debug and userspace triggered err msgs
  fsi: sbefifo: Bump max command length
  fsi: scom: Fix NULL dereference
  misc: mic: SCIF Fix scif_get_new_port() error handling
  misc: cxl: changed asterisk position
  genwqe: card_base: Use true and false for boolean values
  misc: eeprom: assignment outside the if statement
  uio: potential double frees if __uio_register_device() fails
  eeprom: idt_89hpesx: clean up an error pointer vs NULL inconsistency
  misc: ti-st: Fix memory leak in the error path of probe()
  android: binder: Show extra_buffers_size in trace
  firmware: vpd: Fix section enabled flag on vpd_section_destroy
  platform: goldfish: Retire pdev_bus
  goldfish: Use dedicated macros instead of manual bit shifting
  goldfish: Add missing includes to goldfish.h
  mux: adgs1408: new driver for Analog Devices ADGS1408/1409 mux
  dt-bindings: mux: add adi,adgs1408
  Drivers: hv: vmbus: Cleanup synic memory free path
  Drivers: hv: vmbus: Remove use of slow_virt_to_phys()
  Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()
  ...
2018-08-18 11:04:51 -07:00
Will Deacon
ba70ffa7d2 Merge branch 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into aarch64/for-next/core
Pull in arm perf updates, including support for 64-bit (chained) event
counters and some non-critical fixes for some of the system PMU drivers.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-27 14:39:04 +01:00
Sudeep Holla
809092dc3e drivers/perf: arm-ccn: Use devm_ioremap_resource() to map memory
Instead of checking the return value of platform_get_resource(), we can
use devm_ioremap_resource() which has the NULL pointer check and the
memory region requesting. devm_ioremap_resource is designed to replace
calls to devm_request_mem_region followed by devm_ioremap, so let's use
the same.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-26 13:33:49 +01:00
Shaokun Zhang
06060ea7fb drivers/perf: hisi: update the sccl_id/ccl_id when MT is supported
MT bit in MPIDR_EL1 is now supported in certain HiSilicon platforms, so
the mapping between sccl_id/ccl_id and affinity level needs to be updated
from the generic encoding we originally used.

Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
[will: fixed comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-24 15:40:43 +01:00
Greg Kroah-Hartman
83cf9cd6d5 Merge 4.18-rc5 into char-misc-next
We want the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-16 09:04:54 +02:00
Suzuki K Poulose
c132079053 arm64: perf: Add support for chaining event counters
Add support for 64bit event by using chained event counters
and 64bit cycle counters.

PMUv3 allows chaining a pair of adjacent 32-bit counters, effectively
forming a 64-bit counter. The low/even counter is programmed to count
the event of interest, and the high/odd counter is programmed to count
the CHAIN event, taken when the low/even counter overflows.

For CPU cycles, when 64bit mode is requested, the cycle counter
is used in 64bit mode. If the cycle counter is not available,
falls back to chaining.

Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-10 18:19:30 +01:00
Suzuki K Poulose
7dfc8db1d1 arm_pmu: Tidy up clear_event_idx call backs
The armpmu uses get_event_idx callback to allocate an event
counter for a given event, which marks the selected counter
as "used". Now, when we delete the counter, the arm_pmu goes
ahead and clears the "used" bit and then invokes the "clear_event_idx"
call back, which kind of splits the job between the core code
and the backend. To keep things tidy, mandate the implementation
of clear_event_idx() and add it for exisiting backends.
This will be useful for adding the chained event support, where
we leave the event idx maintenance to the backend.

Also, when an event is removed from the PMU, reset the hw.idx
to indicate that a counter is not allocated for this event,
to help the backends do better checks. This will be also used
for the chain counter support.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-10 18:19:02 +01:00
Suzuki K Poulose
e2da97d328 arm_pmu: Add support for 64bit event counters
Each PMU has a set of 32bit event counters. But in some
special cases, the events could be counted using counters
which are effectively 64bit wide.

e.g, Arm V8 PMUv3 has a 64 bit cycle counter which can count
only the CPU cycles. Also, the PMU can chain the event counters
to effectively count as a 64bit counter.

Add support for tracking the events that uses 64bit counters.
This only affects the periods set for each counter in the core
driver.

Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-10 18:19:02 +01:00
Suzuki K Poulose
8d3e994241 arm_pmu: Clean up maximum period handling
Each PMU defines their max_period of the counter as the maximum
value that can be counted. Since all the PMU backends support
32bit counters by default, let us remove the redundant field.

No functional changes.

Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-10 18:19:02 +01:00
Randy Dunlap
ac3167257b headers: separate linux/mod_devicetable.h from linux/platform_device.h
At over 4000 #includes, <linux/platform_device.h> is the 9th most
#included header file in the Linux kernel.  It does not need
<linux/mod_devicetable.h>, so drop that header and explicitly add
<linux/mod_devicetable.h> to source files that need it.

   4146 #include <linux/platform_device.h>

After this patch, there are 225 files that use <linux/mod_devicetable.h>,
for a reduction of around 3900 times that <linux/mod_devicetable.h>
does not have to be read & parsed.

    225 #include <linux/mod_devicetable.h>

This patch was build-tested on 20 different arch-es.

It also makes these drivers SubmitChecklist#1 compliant.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <lkp@intel.com> # drivers/media/platform/vimc/
Reported-by: kbuild test robot <lkp@intel.com> # drivers/pinctrl/pinctrl-u300.c
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-07 17:52:26 +02:00
Will Deacon
59b62e7ad0 drivers/perf: Initialise return value in armpmu_request_irqs()
If a PMU doesn't have any IRQs, we should return 0 from
armpmu_request_irqs(), rather than uninitialised stack.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-04 11:50:50 +01:00
Kees Cook
1201a5a25c perf/arm-cci: Remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLA in favor of a maximum size and adds a sanity check
at registration time. The sizes are all explicitly enumerated already,
so this just collects them into macros.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-02 12:50:03 +01:00
Hoan Tran
a45fc268db drivers/perf: xgene_pmu: Fix IOB SLOW PMU parser error
This patch fixes the below parser error of the IOB SLOW PMU.

        # perf stat -a -e iob-slow0/cycle-count/ sleep 1
        evenf syntax error: 'iob-slow0/cycle-count/'
                                 \___ parser error

It replaces the "-" character by "_" character inside the PMU name.

Signed-off-by: Hoan Tran <hoan.tran@amperecomputing.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-06-18 17:48:42 +01:00
Arnd Bergmann
984e9cf1b9 drivers/bus: arm-cci: fix build warnings
When the arm-cci driver is enabled, but both CONFIG_ARM_CCI5xx_PMU and
CONFIG_ARM_CCI400_PMU are not, we get a warning about how parts of
the driver are never used:

drivers/perf/arm-cci.c:1454:29: error: 'cci_pmu_models' defined but not used [-Werror=unused-variable]
drivers/perf/arm-cci.c:693:16: error: 'cci_pmu_event_show' defined but not used [-Werror=unused-function]
drivers/perf/arm-cci.c:685:16: error: 'cci_pmu_format_show' defined but not used [-Werror=unused-function]

Marking all three functions as __maybe_unused avoids the warnings in
randconfig builds. I'm doing this lacking any ideas for a better fix.

Fixes: 3de6be7a3d ("drivers/bus: Split Arm CCI driver")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-29 16:38:16 +01:00
John Garry
b89205bd50 drivers/perf: Remove ARM_SPE_PMU explicit PERF_EVENTS dependency
Since commit bddb9b68d3 ("drivers/perf: commonise PERF_EVENTS
dependency"), all perf drivers depend on PERF_EVENTS config under a
common menu.

Config ARM_SPE_PMU still declares explicitly a dependency on
PERF_EVENTS, which is unneeded, so remove it.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-22 17:11:12 +01:00
Mark Rutland
1898eb61fb drivers/perf: arm-ccn: don't log to dmesg in event_init
The ARM CCN PMU driver uses dev_warn() to complain about parameters in
the user-provided perf_event_attr. This means that under normal
operation (e.g. a single invocation of the perf tool), a number of
messages warnings may be logged to dmesg.

Tools may issue multiple syscalls to probe for feature support, and
multiple applications (from multiple users) can attempt to open events
simultaneously, so this is not very helpful, even if a user happens to
have access to dmesg. Worse, this can push important information out of
the dmesg ring buffer, and can significantly slow down syscall fuzzers,
vastly increasing the time it takes to find critical bugs.

Demote the dev_warn() instances to dev_dbg(), as is the case for all
other PMU drivers under drivers/perf/. Users who wish to debug PMU event
initialisation can enable dynamic debug to receive these messages.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:21:32 +01:00
Robin Murphy
8b0c93c20e perf/arm-cci: Allow building as a module
Fill in the few extra bits and annotations needed to make the driver
work properly as a module, and jiggle the Kconfig to expose the
driver-level ARM_CCI_PMU option.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:12:54 +01:00
Robin Murphy
28c01dc9d8 perf/arm-cci: Remove pointless PMU disabling
The CCI PMU driver bears some legacy remnants of the arm_pmu framework
from when it was split in c6f85cb430 ("bus: cci: move away from
arm_pmu framework"). In particular this perf_pmu_{dis,en}able() dance
around pmu->add which was fixed for arm_pmu in a9e469d1c8
("drivers/perf: arm_pmu: remove pointless PMU disabling").

For the exact same reasons (i.e. perf core already does this around the
call anyway), give cci_pmu_add() the exact same change, which also
prevents having to export those core functions to build it as a module.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:12:53 +01:00
Robin Murphy
75dc344145 perf/arm-cc*: Fix MODULE_LICENSE() tags
The CCI/CCN drivers are licensed under GPLv2, but the MODULE_LICENSE()
tags are using the bare "GPL" string implying GPLv2 or later. Fix them
to match their actual file license.

Acked-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:12:53 +01:00
Mark Rutland
0788f1e973 arm_pmu: simplify arm_pmu::handle_irq
The arm_pmu::handle_irq() callback has the same prototype as a generic
IRQ handler, taking the IRQ number and a void pointer argument which it
must convert to an arm_pmu pointer.

This means that all arm_pmu::handle_irq() take an IRQ number they never
use, and all must explicitly cast the void pointer to an arm_pmu
pointer.

Instead, let's change arm_pmu::handle_irq to take an arm_pmu pointer,
allowing these casts to be removed. The redundant IRQ number parameter
is also removed.

Suggested-by: Hoeun Ryu <hoeun.ryu@lge.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:07:05 +01:00
Robin Murphy
5c591304e7 perf/arm-cci: Remove unnecessary period adjustment
Since sampling events are rejected up-front by cci_pmu_event_init(), it
doesn't make much sense to go fiddling with the sampling period later.
This would seem to be just another leftover artefact of the arm_pmu
framwork, and as such can go.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:06:11 +01:00
Wolfram Sang
d0f2e42329 perf: simplify getting .drvdata
We should get drvdata from struct device directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:02:35 +01:00
Linus Torvalds
38c23685b2 ARM: SoC driver updates for 4.17
The main addition this time around is the new ARM "SCMI" framework,
 which is the latest in a series of standards coming from ARM to do power
 management in a platform independent way. This has been through many
 review cycles, and it relies on a rather interesting way of using the
 mailbox subsystem, but in the end I agreed that Sudeep's version was
 the best we could do after all.
 
 Other changes include:
 
 - the ARM CCN driver is moved out of drivers/bus into drivers/perf,
   which makes more sense. Similarly, the performance monitoring
   portion of the CCI driver are moved the same way and cleaned up
   a little more.
 
 - a series of updates to the SCPI framework
 
 - support for the Mediatek mt7623a SoC in drivers/soc
 
 - support for additional NVIDIA Tegra hardware in drivers/soc
 
 - a new reset driver for Socionext Uniphier
 
 - lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and
   drivers/firmware and drivers/reset across platforms
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaxiNzAAoJEGCrR//JCVInYhYP/2kPhc5t/kszA1bcklcbO9dY
 eX37Ra/RR4yQ5yeQZVIZ4UkUovxk9PmG2tM4K5oJaTDsz5pPEgavVOOr3sbfj6vb
 4O9auTeysEQlHcbVdNFum0YS2gUY2YD7D12DTRorotLxCqod184ccWXq0XGfIWaY
 l3YRrcL/lPlqmyS3z/GNx9oNygOMUzEfXfIQYICyzHuYiLBUGnkKC1vIb+Hx1TDq
 Cxk++AUqH13Mss24O2A2QQh+oBHj2BybDLLqwcC5PSpsUbFrVCfzG54l43mig32T
 NOxV0Qnml2wAtU4H0QcgtSgwRimHD0YOiX8ssquvDDiqTqM5G+llSTGkEbYe+AUW
 4GIZYoBOwGkfEXS+tyymHe9yfc5h1OLYAeFU1jRm723c7phanuu67rPn35YC8UMK
 zSql10JpkAGNzMikrxxb6wnis951w2UFlzhgZQ6ItA/nRq3l+oEQA0Qiljv965nz
 DVLsD5+gdhK6GBctkzlsD5HFn6GjM8JilnsOVPHD765nKnVBSxKiXRLV228XVug2
 rChF1FhQqLnM54jCMqHZX5fS9SbSgtYswHqIXpVw6GmJkqq/Ly10yGR0vuWD+uyn
 BV7q5AKpGrwm6wZkMM2uZ1VdUtWzn856AbkqrvX/QhmJcX4McuqaLUrC8bSOj1ty
 KeVil0akq3nU+xHl5Ojs
 =Pmsx
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Arnd Bergmann:
 "The main addition this time around is the new ARM "SCMI" framework,
  which is the latest in a series of standards coming from ARM to do
  power management in a platform independent way.

  This has been through many review cycles, and it relies on a rather
  interesting way of using the mailbox subsystem, but in the end I
  agreed that Sudeep's version was the best we could do after all.

  Other changes include:

   - the ARM CCN driver is moved out of drivers/bus into drivers/perf,
     which makes more sense. Similarly, the performance monitoring
     portion of the CCI driver are moved the same way and cleaned up a
     little more.

   - a series of updates to the SCPI framework

   - support for the Mediatek mt7623a SoC in drivers/soc

   - support for additional NVIDIA Tegra hardware in drivers/soc

   - a new reset driver for Socionext Uniphier

   - lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and
     drivers/firmware and drivers/reset across platforms"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (87 commits)
  reset: uniphier: add ethernet reset control support for PXs3
  reset: stm32mp1: Enable stm32mp1 reset driver
  dt-bindings: reset: add STM32MP1 resets
  reset: uniphier: add Pro4/Pro5/PXs2 audio systems reset control
  reset: imx7: add 'depends on HAS_IOMEM' to fix unmet dependency
  reset: modify the way reset lookup works for board files
  reset: add support for non-DT systems
  clk: scmi: use devm_of_clk_add_hw_provider() API and drop scmi_clocks_remove
  firmware: arm_scmi: prevent accessing rate_discrete uninitialized
  hwmon: (scmi) return -EINVAL when sensor information is unavailable
  amlogic: meson-gx-socinfo: Update soc ids
  soc/tegra: pmc: Use the new reset APIs to manage reset controllers
  soc: mediatek: update power domain data of MT2712
  dt-bindings: soc: update MT2712 power dt-bindings
  cpufreq: scmi: add thermal dependency
  soc: mediatek: fix the mistaken pointer accessed when subdomains are added
  soc: mediatek: add SCPSYS power domain driver for MediaTek MT7623A SoC
  soc: mediatek: avoid hardcoded value with bus_prot_mask
  dt-bindings: soc: add header files required for MT7623A SCPSYS dt-binding
  dt-bindings: soc: add SCPSYS binding for MT7623 and MT7623A SoC
  ...
2018-04-05 21:29:35 -07:00
Linus Torvalds
23221d997b arm64 updates for 4.17
Nothing particularly stands out here, probably because people were tied
 up with spectre/meltdown stuff last time around. Still, the main pieces
 are:
 
 - Rework of our CPU features framework so that we can whitelist CPUs that
   don't require kpti even in a heterogeneous system
 
 - Support for the IDC/DIC architecture extensions, which allow us to elide
   instruction and data cache maintenance when writing out instructions
 
 - Removal of the large memory model which resulted in suboptimal codegen
   by the compiler and increased the use of literal pools, which could
   potentially be used as ROP gadgets since they are mapped as executable
 
 - Rework of forced signal delivery so that the siginfo_t is well-formed
   and handling of show_unhandled_signals is consolidated and made
   consistent between different fault types
 
 - More siginfo cleanup based on the initial patches from Eric Biederman
 
 - Workaround for Cortex-A55 erratum #1024718
 
 - Some small ACPI IORT updates and cleanups from Lorenzo Pieralisi
 
 - Misc cleanups and non-critical fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJaw1TCAAoJELescNyEwWM0gyQIAJVMK4QveBW+LwF96NYdZo16
 p90Aa+nqKelh/s93govQArDMv1gxyuXdFlQZVOGPQHfqpz6RhJWmBA2tFsUbQrUc
 OBcioPrRihqTmKBe+1r1XORwZxkVX6GGmCn0LYpPR7I3TjxXZpvxqaxGxiUvHkci
 yVxWlDTyN/7eL3akhCpCDagN3Fxwk3QnJLqE3fxOFMlY7NvQcmUxcITiUl/s469q
 xK6SWH9SRH1JK8jTHPitwUBiU//3FfCqSI9HLEdDIDoTuPcVM8UetWvi4QzrzJL1
 UYg8lmU0CXNmflDzZJDaMf+qFApOrGxR0YVPpBzlQvxe0JIY69g48f+JzDPz8nc=
 =+gNa
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "Nothing particularly stands out here, probably because people were
  tied up with spectre/meltdown stuff last time around. Still, the main
  pieces are:

   - Rework of our CPU features framework so that we can whitelist CPUs
     that don't require kpti even in a heterogeneous system

   - Support for the IDC/DIC architecture extensions, which allow us to
     elide instruction and data cache maintenance when writing out
     instructions

   - Removal of the large memory model which resulted in suboptimal
     codegen by the compiler and increased the use of literal pools,
     which could potentially be used as ROP gadgets since they are
     mapped as executable

   - Rework of forced signal delivery so that the siginfo_t is
     well-formed and handling of show_unhandled_signals is consolidated
     and made consistent between different fault types

   - More siginfo cleanup based on the initial patches from Eric
     Biederman

   - Workaround for Cortex-A55 erratum #1024718

   - Some small ACPI IORT updates and cleanups from Lorenzo Pieralisi

   - Misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (70 commits)
  arm64: uaccess: Fix omissions from usercopy whitelist
  arm64: fpsimd: Split cpu field out from struct fpsimd_state
  arm64: tlbflush: avoid writing RES0 bits
  arm64: cmpxchg: Include linux/compiler.h in asm/cmpxchg.h
  arm64: move percpu cmpxchg implementation from cmpxchg.h to percpu.h
  arm64: cmpxchg: Include build_bug.h instead of bug.h for BUILD_BUG
  arm64: lse: Include compiler_types.h and export.h for out-of-line LL/SC
  arm64: fpsimd: include <linux/init.h> in fpsimd.h
  drivers/perf: arm_pmu_platform: do not warn about affinity on uniprocessor
  perf: arm_spe: include linux/vmalloc.h for vmap()
  Revert "arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)"
  arm64: cpufeature: Avoid warnings due to unused symbols
  arm64: Add work around for Arm Cortex-A55 Erratum 1024718
  arm64: Delay enabling hardware DBM feature
  arm64: Add MIDR encoding for Arm Cortex-A55 and Cortex-A35
  arm64: capabilities: Handle shared entries
  arm64: capabilities: Add support for checks based on a list of MIDRs
  arm64: Add helpers for checking CPU MIDR against a range
  arm64: capabilities: Clean up midr range helpers
  arm64: capabilities: Change scope of VHE to Boot CPU feature
  ...
2018-04-04 16:01:43 -07:00
Alexander Monakov
65bd053fbf drivers/perf: arm_pmu_platform: do not warn about affinity on uniprocessor
If there is exactly one CPU present, there is no ambiguity: do not warn
that PMU setup would need to guess IRQ affinity.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27 13:13:27 +01:00
Arnd Bergmann
fcd9f8315e perf: arm_spe: include linux/vmalloc.h for vmap()
On linux-next, I get a build failure in some configurations:

drivers/perf/arm_spe_pmu.c: In function 'arm_spe_pmu_setup_aux':
drivers/perf/arm_spe_pmu.c:857:14: error: implicit declaration of function 'vmap'; did you mean 'swap'? [-Werror=implicit-function-declaration]
  buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
              ^~~~
              swap
drivers/perf/arm_spe_pmu.c:857:37: error: 'VM_MAP' undeclared (first use in this function); did you mean 'VM_MPX'?
  buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
                                     ^~~~~~
                                     VM_MPX
drivers/perf/arm_spe_pmu.c:857:37: note: each undeclared identifier is reported only once for each function it appears in
drivers/perf/arm_spe_pmu.c: In function 'arm_spe_pmu_free_aux':
drivers/perf/arm_spe_pmu.c:878:2: error: implicit declaration of function 'vunmap'; did you mean 'iounmap'? [-Werror=implicit-function-declaration]

vmap() is declared in linux/vmalloc.h, so we should include that header file.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[will: add additional missing #includes reported by Mark]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-27 13:13:11 +01:00
Ingo Molnar
134933e557 Linux 4.16-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlqvCPYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOaAH/171cgZGFEXSONxK
 3O1AAv61wN5K/ISMt6mnelWR6fZg195FarOx0Rnq7Ot8OWuVa8CGcyT4vX4Z7nb9
 SVMQKNMPCVQE4WCDOv6S0njChmRC0BxBoVJtTN9fhywdYgX1KcaTS/drMRHACF5n
 rB9eouMQScfMzKGAW08gp5NvEGJ6W1SLX7La3/u0751dYisdJSP7+vFZNxUrGXEA
 yIPOQjFu0Tfo8GXz/BwC678RZVzVLN0sE6+/vM7zNnoDlsRVkdDIVMo3UiVqm/NK
 B37/TlZz8CYoapoKnRRB5giXnSPDSXtsikbGy3mcy0u5imGe+ZgdjrdYSaLk31cR
 NVZY08k=
 =pu3X
 -----END PGP SIGNATURE-----

Merge tag 'v4.16-rc6' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-19 20:37:35 +01:00
Peter Zijlstra
edb39592a5 perf: Fix sibling iteration
Mark noticed that the change to sibling_list changed some iteration
semantics; because previously we used group_list as list entry,
sibling events would always have an empty sibling_list.

But because we now use sibling_list for both list head and list entry,
siblings will report as having siblings.

Fix this with a custom for_each_sibling_event() iterator.

Fixes: 8343aae661 ("perf/core: Remove perf_event::group_entry")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: vincent.weaver@maine.edu
Cc: alexander.shishkin@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: alexey.budankov@linux.intel.com
Cc: valery.cherepennikov@intel.com
Cc: eranian@google.com
Cc: acme@redhat.com
Cc: linux-tip-commits@vger.kernel.org
Cc: davidcc@google.com
Cc: kan.liang@intel.com
Cc: Dmitry.Prohorov@intel.com
Cc: jolsa@redhat.com
Link: https://lkml.kernel.org/r/20180315170129.GX4043@hirez.programming.kicks-ass.net
2018-03-16 20:44:12 +01:00
Peter Zijlstra
8343aae661 perf/core: Remove perf_event::group_entry
Now that all the grouping is done with RB trees, we no longer need
group_entry and can replace the whole thing with sibling_list.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valery Cherepennikov <valery.cherepennikov@intel.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12 15:28:49 +01:00
Robin Murphy
e9c112c94b perf/arm-cci: Untangle global cci_ctrl_base
Depending directly on the bus driver's global cci_ctrl_base variable
is a little unpleasant, and exporting it to allow the PMU driver to
be modular would be even more so. Let's make things a little better
abstracted by adding the control register block to the cci_pmu instance
data alongside the PMU register block, and communicating the mapped
address from the bus driver via platform data.

It's not practical to try the same thing for the bus driver itself,
given that the globals are entangled with the hairy assembly code for
port control, so we leave them be there. It would however be prudent
to move them to the __ro_after_init section in passing, since the
addresses really should never be changing once set.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06 17:27:25 +01:00
Robin Murphy
32837954db perf/arm-cci: Clean up model discovery
Since I am the self-appointed of_device_get_match_data() police, it's
only right that I should clean up this driver while I'm otherwise
touching it. This also reveals that we're passing around a struct
platform_device in places where we only ever care about its regular
device, so straighten that out in the process.

Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06 17:27:09 +01:00
Robin Murphy
03057f2626 perf/arm-cci: Simplify CPU hotplug
Realistically, systems with multiple CCIs are unlikely to ever exist,
and since the driver only actually supports a single instance anyway
there's really no need to do the multi-instance hotplug state dance.

Take the opportunity to simplify the hotplug-related code all over,
addressing the context-migration TODO in the process for good measure.

Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06 17:26:24 +01:00
Robin Murphy
3de6be7a3d drivers/bus: Split Arm CCI driver
The arm-cci driver is really two entirely separate drivers; one for MCPM
port control and the other for the performance monitors. Since they are
already relatively self-contained, let's take the plunge and move the
PMU parts out to drivers/perf where they belong these days. For non-MCPM
systems this leaves a small dependency on the remaining "bus" stub for
initial probing and discovery, but we end up with something that still
fits the general pattern of its fellow system PMU drivers to ease future
maintenance.

Moving code to a new file also offers a perfect excuse to modernise the
license/copyright headers and clean up some funky linewraps on the way.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06 17:26:17 +01:00
Robin Murphy
1888d3ddc3 drivers/bus: Move Arm CCN PMU driver
The arm-ccn driver is purely a perf driver for the CCN PMU, not a bus
driver in the sense of the other residents of drivers/bus/, so let's
move it to the appropriate place for SoC PMU drivers. Not to mention
moving the documentation accordingly as well.

Acked-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06 17:26:15 +01:00
Will Deacon
b08e5fd90b arm_pmu: Use disable_irq_nosync when disabling SPI in CPU teardown hook
Commit 6de3f79112 ("arm_pmu: explicitly enable/disable SPIs at hotplug")
moved all of the arm_pmu IRQ enable/disable calls to the CPU hotplug hooks,
regardless of whether they are implemented as PPIs or SPIs. This can
lead to us sleeping from atomic context due to disable_irq blocking:

 | BUG: sleeping function called from invalid context at kernel/irq/manage.c:112
 | in_atomic(): 1, irqs_disabled(): 128, pid: 15, name: migration/1
 | no locks held by migration/1/15.
 | irq event stamp: 192
 | hardirqs last  enabled at (191): [<00000000803c2507>]
 | _raw_spin_unlock_irq+0x2c/0x4c
 | hardirqs last disabled at (192): [<000000007f57ad28>] multi_cpu_stop+0x9c/0x140
 | softirqs last  enabled at (0): [<0000000004ee1b58>]
 | copy_process.isra.77.part.78+0x43c/0x1504
 | softirqs last disabled at (0): [<          (null)>]           (null)
 | CPU: 1 PID: 15 Comm: migration/1 Not tainted 4.16.0-rc3-salvator-x #1651
 | Hardware name: Renesas Salvator-X board based on r8a7796 (DT)
 | Call trace:
 |  dump_backtrace+0x0/0x140
 |  show_stack+0x14/0x1c
 |  dump_stack+0xb4/0xf0
 |  ___might_sleep+0x1fc/0x218
 |  __might_sleep+0x70/0x80
 |  synchronize_irq+0x40/0xa8
 |  disable_irq+0x20/0x2c
 |  arm_perf_teardown_cpu+0x80/0xac

Since the interrupt is always CPU-affine and this code is running with
interrupts disabled, we can just use disable_irq_nosync as we know there
isn't a concurrent invocation of the handler to worry about.

Fixes: 6de3f79112 ("arm_pmu: explicitly enable/disable SPIs at hotplug")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-28 14:59:47 +00:00
Mark Rutland
167e61438d arm_pmu: acpi: request IRQs up-front
We can't request IRQs in atomic context, so for ACPI systems we'll have
to request them up-front, and later associate them with CPUs.

This patch reorganises the arm_pmu code to do so. As we no longer have
the arm_pmu structure at probe time, a number of prototypes need to be
adjusted, requiring changes to the common arm_pmu code and arm_pmu
platform code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:54 +00:00
Mark Rutland
84b4be57ae arm_pmu: note IRQs and PMUs per-cpu
To support ACPI systems, we need to request IRQs before we know the
associated PMU, and thus we need some percpu variable that the IRQ
handler can find the PMU from.

As we're going to request IRQs without the PMU, we can't rely on the
arm_pmu::active_irqs mask, and similarly need to track requested IRQs
with a percpu variable.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: made armpmu_count_irq_users static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:54 +00:00
Mark Rutland
6de3f79112 arm_pmu: explicitly enable/disable SPIs at hotplug
To support ACPI systems, we need to request IRQs before CPUs are
hotplugged, and thus we need to request IRQs before we know their
associated PMU.

This is problematic if a PMU IRQ is pending out of reset, as it may be
taken before we know the PMU, and thus the IRQ handler won't be able to
handle it, leaving it screaming.

To avoid such problems, lets request all IRQs in a disabled state, and
explicitly enable/disable them at hotplug time, when we're sure the PMU
has been probed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:54 +00:00
Mark Rutland
43fc9a2feb arm_pmu: acpi: check for mismatched PPIs
The arm_pmu platform code explicitly checks for mismatched PPIs at probe
time, while the ACPI code leaves this to the core code. Future
refactoring will make this difficult for the core code to check, so
let's have the ACPI code check this explicitly.

As before, upon a failure we'll continue on without an interrupt. Ho
hum.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:54 +00:00
Mark Rutland
0dc1a1851a arm_pmu: add armpmu_alloc_atomic()
In ACPI systems, we don't know the makeup of CPUs until we hotplug them
on, and thus have to allocate the PMU datastructures at hotplug time.
Thus, we must use GFP_ATOMIC allocations.

Let's add an armpmu_alloc_atomic() that we can use in this case.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:54 +00:00
Mark Rutland
d3d5aac206 arm_pmu: fold platform helpers into platform code
The armpmu_{request,free}_irqs() helpers are only used by
arm_pmu_platform.c, so let's fold them in and make them static.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:53 +00:00
Mark Rutland
c0248c9663 arm_pmu: kill arm_pmu_platdata
Now that we have no platforms passing platform data to the arm_pmu code,
we can get rid of the platdata and associated hooks, paving the way for
rework of our IRQ handling.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:53 +00:00
Yury Norov
3aa56885e5 bitmap: replace bitmap_{from,to}_u32array
with bitmap_{from,to}_arr32 over the kernel. Additionally to it:
* __check_eq_bitmap() now takes single nbits argument.
* __check_eq_u32_array is not used in new test but may be used in
  future. So I don't remove it here, but annotate as __used.

Tested on arm64 and 32-bit BE mips.

[arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
  Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com
[ynorov@caviumnetworks.com: fix net/core/ethtool.c]
  Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David Decotigny <decot@googlers.com>,
Cc: David S. Miller <davem@davemloft.net>,
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Suzuki K Poulose
a22fde8e97 perf: dsu: Use signed field for dsu_pmu->num_counters
We set dsu_pmu->num_counters to -1, when the DSU is allocated
but not initialised when none of the CPUs are active in the DSU.
However, we use an unsigned field for num_counters. Switch this
to a signed field.

Fixes: 7520fa9924 ("perf: ARM DynamIQ Shared Unit PMU support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-15 18:02:17 +00:00
Catalin Marinas
3423cab3e0 Merge branch 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux
Support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU).

* 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
  perf: ARM DynamIQ Shared Unit PMU support
  dt-bindings: Document devicetree binding for ARM DSU PMU
  arm_pmu: Use of_cpu_node_to_id helper
  arm64: Use of_cpu_node_to_id helper for CPU topology parsing
  irqchip: gic-v3: Use of_cpu_node_to_id helper
  coresight: of: Use of_cpu_node_to_id helper
  of: Add helper for mapping device node to logical CPU number
  perf: Export perf_event_update_userpage
2018-01-12 14:33:56 +00:00
Suzuki K Poulose
7520fa9924 perf: ARM DynamIQ Shared Unit PMU support
Add support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU).
The DSU integrates one or more cores with an L3 memory system, control
logic, and external interfaces to form a multicore cluster. The PMU
allows counting the various events related to L3, SCU etc, along with
providing a cycle counter.

The PMU can be accessed via system registers, which are common
to the cores in the same cluster. The PMU registers follow the
semantics of the ARMv8 PMU, mostly, with the exception that
the counters record the cluster wide events.

This driver is mostly based on the ARMv8 and CCI PMU drivers.
The driver only supports ARM64 at the moment. It can be extended
to support ARM32 by providing register accessors like we do in
arch/arm64/include/arm_dsu_pmu.h.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-01-02 16:43:12 +00:00
Suzuki K Poulose
6658278731 arm_pmu: Use of_cpu_node_to_id helper
Use the new generic helper, of_cpu_node_to_id(), to map a
a phandle to the logical CPU number while parsing the
PMU irq affinity.

Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-01-02 16:43:12 +00:00
Will Deacon
7a4a0c1555 perf: arm_spe: Fail device probe when arm64_kernel_unmapped_at_el0()
When running with the kernel unmapped whilst at EL0, the virtually-addressed
SPE buffer is also unmapped, which can lead to buffer faults if userspace
profiling is enabled and potentially also when writing back kernel samples
unless an expensive drain operation is performed on exception return.

For now, fail the SPE driver probe when arm64_kernel_unmapped_at_el0().

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11 13:41:13 +00:00
Linus Torvalds
c9b012e5f4 arm64 updates for 4.15
Plenty of acronym soup here:
 
 - Initial support for the Scalable Vector Extension (SVE)
 - Improved handling for SError interrupts (required to handle RAS events)
 - Enable GCC support for 128-bit integer types
 - Remove kernel text addresses from backtraces and register dumps
 - Use of WFE to implement long delay()s
 - ACPI IORT updates from Lorenzo Pieralisi
 - Perf PMU driver for the Statistical Profiling Extension (SPE)
 - Perf PMU driver for Hisilicon's system PMUs
 - Misc cleanups and non-critical fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJaCcLqAAoJELescNyEwWM0JREH/2FbmD/khGzEtP8LW+o9D8iV
 TBM02uWQxS1bbO1pV2vb+512YQO+iWfeQwJH9Jv2FZcrMvFv7uGRnYgAnJuXNGrl
 W+LL6OhN22A24LSawC437RU3Xe7GqrtONIY/yLeJBPablfcDGzPK1eHRA0pUzcyX
 VlyDruSHWX44VGBPV6JRd3x0vxpV8syeKOjbRvopRfn3Nwkbd76V3YSfEgwoTG5W
 ET1sOnXLmHHdeifn/l1Am5FX1FYstpcd7usUTJ4Oto8y7e09tw3bGJCD0aMJ3vow
 v1pCUWohEw7fHqoPc9rTrc1QEnkdML4vjJvMPUzwyTfPrN+7uEuMIEeJierW+qE=
 =0qrg
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The big highlight is support for the Scalable Vector Extension (SVE)
  which required extensive ABI work to ensure we don't break existing
  applications by blowing away their signal stack with the rather large
  new vector context (<= 2 kbit per vector register). There's further
  work to be done optimising things like exception return, but the ABI
  is solid now.

  Much of the line count comes from some new PMU drivers we have, but
  they're pretty self-contained and I suspect we'll have more of them in
  future.

  Plenty of acronym soup here:

   - initial support for the Scalable Vector Extension (SVE)

   - improved handling for SError interrupts (required to handle RAS
     events)

   - enable GCC support for 128-bit integer types

   - remove kernel text addresses from backtraces and register dumps

   - use of WFE to implement long delay()s

   - ACPI IORT updates from Lorenzo Pieralisi

   - perf PMU driver for the Statistical Profiling Extension (SPE)

   - perf PMU driver for Hisilicon's system PMUs

   - misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
  arm64: Make ARMV8_DEPRECATED depend on SYSCTL
  arm64: Implement __lshrti3 library function
  arm64: support __int128 on gcc 5+
  arm64/sve: Add documentation
  arm64/sve: Detect SVE and activate runtime support
  arm64/sve: KVM: Hide SVE from CPU features exposed to guests
  arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
  arm64/sve: KVM: Prevent guests from using SVE
  arm64/sve: Add sysctl to set the default vector length for new processes
  arm64/sve: Add prctl controls for userspace vector length management
  arm64/sve: ptrace and ELF coredump support
  arm64/sve: Preserve SVE registers around EFI runtime service calls
  arm64/sve: Preserve SVE registers around kernel-mode NEON use
  arm64/sve: Probe SVE capabilities and usable vector lengths
  arm64: cpufeature: Move sys_caps_initialised declarations
  arm64/sve: Backend logic for setting the vector length
  arm64/sve: Signal handling support
  arm64/sve: Support vector length resetting for new processes
  arm64/sve: Core task context handling
  arm64/sve: Low-level CPU setup
  ...
2017-11-15 10:56:56 -08:00
Suzuki K Poulose
19b4aff202 perf: arm_spe: Prevent module unload while the PMU is in use
When the PMU driver is built as a module, the perf expects the
pmu->module to be valid, so that the driver is prevented from
being unloaded while it is in use. Fix the SPE pmu driver to
fill in this field.

Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:23:55 +00:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Will Deacon
1e0c661f05 Merge branch 'for-next/perf' into aarch64/for-next/core
Merge in ARM PMU and perf updates for 4.15:

  - Support for the Statistical Profiling Extension
  - Support for Hisilicon's SoC PMU

Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-24 16:06:56 +01:00
Julien Thierry
611479c79a arm/arm64: pmu: Distinguish percpu irq and percpu_devid irq
arm_pmu interrupts are maked as PERCPU even when these are not local
physical interrupts to a single CPU. When using non-local interrupts,
interrupts marked as PERCPU will not get freed not disabled properly
by the PMU driver.

Check if interrupts are local to a single CPU with PERCPU_DEVID since
this is what the PMU driver really needs to know.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-24 16:04:05 +01:00
Shaokun Zhang
904dcf03f0 perf: hisi: Add support for HiSilicon SoC DDRC PMU driver
This patch adds support for DDRC PMU driver in HiSilicon SoC chip, Each
DDRC has own control, counter and interrupt registers and is an separate
PMU. For each DDRC PMU, it has 8-fixed-purpose counters which have been
mapped to 8-events by hardware, it assumes that counter index is equal
to event code (0 - 7) in DDRC PMU driver. Interrupt is supported to
handle counter (32-bits) overflow.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:35 +01:00
Shaokun Zhang
2bab3cf910 perf: hisi: Add support for HiSilicon SoC HHA PMU driver
L3 cache coherence is maintained by Hydra Home Agent (HHA) in HiSilicon
SoC. This patch adds support for HHA PMU driver, Each HHA has own
control, counter and interrupt registers and is an separate PMU. For
each HHA PMU, it has 16-programable counters and each counter is
free-running. Interrupt is supported to handle counter (48-bits)
overflow.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:35 +01:00
Shaokun Zhang
2940bc4333 perf: hisi: Add support for HiSilicon SoC L3C PMU driver
This patch adds support for L3C PMU driver in HiSilicon SoC chip, Each
L3C has own control, counter and interrupt registers and is an separate
PMU. For each L3C PMU, it has 8-programable counters and each counter
is free-running. Interrupt is supported to handle counter (48-bits)
overflow.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:34 +01:00
Shaokun Zhang
6ce4ef9419 perf: hisi: Add support for HiSilicon SoC uncore PMU driver
This patch adds support HiSilicon SoC uncore PMU driver framework and
interfaces.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
[will: Fix leader accounting in uncore group validation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:34 +01:00
Will Deacon
d5d9696b03 drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension
The ARMv8.2 architecture introduces the optional Statistical Profiling
Extension (SPE).

SPE can be used to profile a population of operations in the CPU pipeline
after instruction decode. These are either architected instructions (i.e.
a dynamic instruction trace) or CPU-specific uops and the choice is fixed
statically in the hardware and advertised to userspace via caps/. Sampling
is controlled using a sampling interval, similar to a regular PMU counter,
but also with an optional random perturbation to avoid falling into patterns
where you continuously profile the same instruction in a hot loop.

After each operation is decoded, the interval counter is decremented. When
it hits zero, an operation is chosen for profiling and tracked within the
pipeline until it retires. Along the way, information such as TLB lookups,
cache misses, time spent to issue etc is captured in the form of a sample.
The sample is then filtered according to certain criteria (e.g. load
latency) that can be specified in the event config (described under
format/) and, if the sample satisfies the filter, it is written out to
memory as a record, otherwise it is discarded. Only one operation can
be sampled at a time.

The in-memory buffer is linear and virtually addressed, raising an
interrupt when it fills up. The PMU driver handles these interrupts to
give the appearance of a ring buffer, as expected by the AUX code.

The in-memory trace-like format is self-describing (though not parseable
in reverse) and written as a series of records, with each record
corresponding to a sample and consisting of a sequence of packets. These
packets are defined by the architecture, although some have CPU-specific
fields for recording information specific to the microarchitecture.

As a simple example, a record generated for a branch instruction may
consist of the following packets:

  0 (Address) : Virtual PC of the branch instruction
  1 (Type)    : Conditional direct branch
  2 (Counter) : Number of cycles taken from Dispatch to Issue
  3 (Address) : Virtual branch target + condition flags
  4 (Counter) : Number of cycles taken from Dispatch to Complete
  5 (Events)  : Mispredicted as not-taken
  6 (END)     : End of record

It is also possible to toggle properties such as timestamp packets in
each record.

This patch adds support for SPE in the form of a new perf driver.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:34 +01:00
Shaokun Zhang
d1809d0e64 drivers/perf: arm_pmu_acpi: drop redundant acpi_disabled check
acpi_disabled has been checked in armv8_pmu_driver_init and it shall
be ZERO in arm_pmu_acpi_probe, clean up this unnecessary check.

Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-04 13:42:59 +01:00
Neil Leeder
b65423ed46 perf: qcom_l2_pmu: add event names
Add event names so that common events can be
specified symbolically, for example:

l2cache_0/total-reads/,l2cache_0/cycles/

Event names are displayed in 'perf list'.

Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-02 10:13:06 +01:00
Arvind Yadav
a88dc7ba15 drivers/perf: arm_pmu_acpi: Release memory obtained by kasprintf
Free memory region, if arm_pmu_acpi_probe is not successful.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-22 15:11:46 +01:00
Will Deacon
6c833bb924 arm64: perf: Allow standard PMUv3 events to be extended by the CPU type
Rather than continue adding CPU-specific event maps, instead look up by
default in the PMUv3 event map and only fallback to the CPU-specific maps
if either the event isn't described by PMUv3, or it is described but
the PMCEID registers say that it is unsupported by the current CPU.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-08 17:12:34 +01:00
Tai Nguyen
c1be2ddb1e perf: xgene: Remove unnecessary managed resources cleanup
Managed resources in the driver should be automatically cleaned up on
driver detach. It's unnecessary to manually free/unmmap these resources.
One of the manual cleanup causes static checkers to complain.
The bug is reported by Dan Carpenter <dan.carpenter@oracle.com> in [1]

[1] https://www.spinics.net/lists/arm-kernel/msg593012.html

This patch gets rid of all the unnecessary manual cleanup and properly
unregister all the registered PMU devices by the driver on driver detach.

Signed-off-by: Tai Nguyen <ttnguyen@apm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-08 14:33:13 +01:00
Will Deacon
a3287c41ff drivers/perf: arm_pmu: Request PMU SPIs with IRQF_PER_CPU
Since the PMU register interface is banked per CPU, CPU PMU interrrupts
cannot be handled by a CPU other than the one with the PMU asserting the
interrupt. This means that migrating PMU SPIs, as we do during a CPU
hotplug operation doesn't make any sense and can lead to the IRQ being
disabled entirely if we route a spurious IRQ to the new affinity target.

This has been observed in practice on AMD Seattle, where CPUs on the
non-boot cluster appear to take a spurious PMU IRQ when coming online,
which is routed to CPU0 where it cannot be handled.

This patch passes IRQF_PERCPU for PMU SPIs and forcefully sets their
affinity prior to requesting them, ensuring that they cannot
be migrated during hotplug events. This interacts badly with the DB8500
erratum workaround that ping-pongs the interrupt affinity from the handler,
so we avoid passing IRQF_PERCPU in that case by allowing the IRQ flags
to be overridden in the platdata.

Fixes: 3cf7ee98b8 ("drivers/perf: arm_pmu: move irq request/free into probe")
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-27 13:43:22 +01:00
Neil Leeder
6c17c1c309 perf: qcom_l2: fix column exclusion check
The check for column exclusion did not verify that the event being
checked was an L2 event, and not a software event.
Software events should not be checked for column exclusion.
This resulted in a group with both software and L2 events sometimes
incorrectly rejecting the L2 event for column exclusion and
not counting it.

Add a check for PMU type before applying column exclusion logic.

Fixes: 21bdbb7102 ("perf: add qcom l2 cache perf events driver")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-26 09:27:43 +01:00
Rob Herring
5f4216f49b perf: Convert to using %pOF instead of full_name
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-20 10:28:41 +01:00
Will Deacon
3edb1dd13c Merge branch 'aarch64/for-next/ras-apei' into aarch64/for-next/core
Merge in arm64 ACPI RAS support (APEI/GHES) from Tyler Baicar.
2017-06-26 10:54:27 +01:00
Hoan Tran
c0f7f7acde perf: xgene: Add support for SoC PMU version 3
This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
Unit version 3.

It can support up to
 - 2 IOB PMU instances
 - 8 L3C PMU instances
 - 2 MCB PMU instances
 - 8 MCU PMU instances
and these PMUs support 64 bit counter

Signed-off-by: Hoan Tran <hotran@apm.com>
[Mark: stop counters in _xgene_pmu_isr()]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: make xgene_pmu_v3_ops static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-22 20:16:34 +01:00
Hoan Tran
e35e0a04a9 perf: xgene: Move PMU leaf functions into function pointer structure
This patch moves PMU leaf functions into a function pointer structure.
It helps code maintain and expasion easier.

Signed-off-by: Hoan Tran <hotran@apm.com>
[Mark: remove redundant cast]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: make xgene_pmu_ops static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-22 20:16:18 +01:00
Hoan Tran
838955e2a3 perf: xgene: Parse PMU subnode from the match table
This patch parses PMU Subnode from a match table.

Signed-off-by: Hoan Tran <hotran@apm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-22 19:30:09 +01:00
Mark Rutland
bddb9b68d3 drivers/perf: commonise PERF_EVENTS dependency
All PMU drivers are going to depend on PERF_EVENTS, so let's make this
dependency common and simplify the individual Kconfig entries.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 11:10:33 +01:00
Wei Huang
477c50e8dc drivers/perf: arm_pmu_acpi: avoid perf IRQ init when guest PMU is off
We saw perf IRQ init failures when running Linux kernel in an ACPI
guest without PMU (i.e. pmu=off). This is because perf IRQ is not
present when pmu=off, but arm_pmu_acpi still tries to register
or unregister GSI. This patch addresses the problem by checking
gicc->performance_interrupt. If it is 0, which is the value set
by qemu when pmu=off, we skip the IRQ register/unregister process.

[    4.069470] bc00: 0000000000040b00 ffff0000089db190
[    4.070267] [<ffff000008134f80>] enable_percpu_irq+0xdc/0xe4
[    4.071192] [<ffff000008667cc4>] arm_perf_starting_cpu+0x108/0x10c
[    4.072200] [<ffff0000080cbdd4>] cpuhp_invoke_callback+0x14c/0x4ac
[    4.073210] [<ffff0000080ccd3c>] cpuhp_thread_fun+0xd4/0x11c
[    4.074132] [<ffff0000080f1394>] smpboot_thread_fn+0x1b4/0x1c4
[    4.075081] [<ffff0000080ec90c>] kthread+0x10c/0x138
[    4.075921] [<ffff0000080833c0>] ret_from_fork+0x10/0x50
[    4.076947] genirq: Setting trigger mode 4 for irq 43 failed
(gic_set_type+0x0/0x74)

Signed-off-by: Wei Huang <wei@redhat.com>
[will: add comment justifying deviation from ACPI spec, removed redundant hunk]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-05-30 12:40:03 +01:00
Mark Rutland
45736a72fb drivers/perf: arm_pmu: add ACPI framework
This patch adds framework code to handle parsing PMU data out of the
MADT, sanity checking this, and managing the association of CPUs (and
their interrupts) with appropriate logical PMUs.

For the time being, we expect that only one PMU driver (PMUv3) will make
use of this, and we simply pass in a single probe function.

This is based on an earlier patch from Jeremy Linton.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:54 +01:00
Mark Rutland
18bfcfe51b drivers/perf: arm_pmu: split out platform device probe logic
Now that we've split the pdev and DT probing logic from the runtime
management, let's move the former into its own file. We gain a few lines
due to the copyright header and includes, but this should keep the logic
clearly separated, and paves the way for adding ACPI support in a
similar fashion.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
[will: rename nr_irqs to avoid conflict with global variable]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:54 +01:00
Mark Rutland
3cf7ee98b8 drivers/perf: arm_pmu: move irq request/free into probe
Currently we request (and potentially free) all IRQs for a given PMU in
cpu_pmu_init(). This works for platform/DT probing today, but it doesn't
fit ACPI well as we don't have all our affinity data up-front.

In preparation for ACPI support, fold the IRQ request/free into
arm_pmu_device_probe(), which will remain specific to platform/DT
probing.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
0e2663d921 drivers/perf: arm_pmu: split cpu-local irq request/free
Currently we have functions to request/free all IRQs for a given PMU.
While this works today, this won't work for ACPI, where we don't know
the full set of IRQs up front, and need to request them separately.

To enable supporting ACPI, this patch splits out the cpu-local
request/free into new functions, allowing us to request/free individual
IRQs.

As this makes it possible/necessary to request a PPI once per cpu, an
additional check is added to detect mismatched PPIs. This shouldn't
matter for the DT / platform case, as we check this when parsing.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
3cf6111025 drivers/perf: arm_pmu: rename irq request/free functions
For historical reasons, portions of the arm_pmu code use a cpu_pmu_
prefix rather than an armpmu_ prefix. While a minor annoyance, this
hasn't been a problem thusfar.

However, to enable ACPI support, we'll need to expose a few things in
header files, and we should aim to keep those consistently namespaced.
In preparation for exporting our IRQ request/free functions, rename
these to have an armpmu_ prefix. For consistency, the 'cpu_pmu'
parameter is also renamed to 'armpmu'.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
7654137071 drivers/perf: arm_pmu: handle no platform_device
In armpmu_dispatch_irq() we look at arm_pmu::plat_device to acquire
platdata, so that we can defer to platform-specific IRQ handling,
required on some 32-bit parts. With the advent of ACPI we won't always
have a platform_device, and so we must avoid trying to dereference
fields from it.

This patch fixes up armpmu_dispatch_irq() to avoid doing so, introducing
a new armpmu_get_platdata() helper.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
3a5a89d30e drivers/perf: arm_pmu: simplify cpu_pmu_request_irqs()
The ARM PMU framework code always uses armpmu_dispatch_irq as its common
IRQ handler. Passing this down from cpu_pmu_init() is somewhat
pointless, and gets in the way of refactoring.

This patch makes cpu_pmu_request_irqs() always use armpmu_dispatch_irq
as the handler when requesting IRQs, and removes the handler parameter
from its prototype.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
74a2b3ea2d drivers/perf: arm_pmu: factor out pmu registration
Currently arm_pmu_device_probe contains probing logic specific to the
platform_device infrastructure, and some logic required to safely
register the PMU with various systems.

This patch factors out the logic relating to the registration of the
PMU. This makes arm_pmu_device_probe a little easier to read, and will
make it easier to reuse the logic for an ACPI-specific probing
mechanism.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
70cd908a18 drivers/perf: arm_pmu: fold init into alloc
Given we always want to initialise common fields on an allocated PMU,
this patch folds this common initialisation into armpmu_alloc(). This
will make it simpler to reuse this code for an ACPI-specific probe path.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
083c52144a drivers/perf: arm_pmu: define armpmu_init_fn
We expect an ARM PMU's init function to have a particular prototype,
which we open-code in a few places. This is less than ideal, considering
that we cast a void value to this type in one location, and a mismatch
could easily be missed.

Add a typedef so that we can ensure this is consistent.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Mark Rutland
a9e469d1c8 drivers/perf: arm_pmu: remove pointless PMU disabling
We currently disable the PMU temporarily in armpmu_add(). We may have
required this historically, but the perf core always disables an event's
PMU when calling event::pmu::add(), so this is not necessary.

We don't do similarly in armpmu_del(), or elsewhere, so this is
unnecessary and inconsistent, and only serves to confuse the reader.

Remove the pointless disable, simplifying armpmu_add() in the process.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:53 +01:00
Agustin Vega-Frias
3071f13d75 perf: qcom: Add L3 cache PMU driver
This adds a new dynamic PMU to the Perf Events framework to program
and control the L3 cache PMUs in some Qualcomm Technologies SOCs.

The driver supports a distributed cache architecture where the overall
cache for a socket is comprised of multiple slices each with its own PMU.
Access to each individual PMU is provided even though all CPUs share all
the slices. User space needs to aggregate to individual counts to provide
a global picture.

The driver exports formatting and event information to sysfs so it can
be used by the perf user space tools with the syntaxes:
   perf stat -a -e l3cache_0_0/read-miss/
   perf stat -a -e l3cache_0_0/event=0x21/

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
[will: fixed sparse issues]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-03 18:53:50 +01:00
Mark Rutland
c09adab01e drivers/perf: arm_pmu: split irq request from enable
For historical reasons, we lazily request and free interrupts in the
arm pmu driver. This requires us to refcount use of the pmu (by way of
counting the active events) in order to request/free interrupts at the
correct times, which complicates the driver somewhat.

The existing logic is flawed, as it only considers currently online CPUs
when requesting, freeing, or managing the affinity of interrupts.
Intervening hotplug events can result in erroneous IRQ affinity, online
CPUs for which interrupts have not been requested, or offline CPUs whose
interrupts are still requested.

To fix this, this patch splits the requesting of interrupts from any
per-cpu management (i.e. per-cpu enable/disable, and configuration of
cpu affinity). We now request all interrupts up-front at probe time (and
never free them, since we never unregister PMUs).

The management of affinity, and per-cpu enable/disable now happens in
our cpu hotplug callback, ensuring it occurs consistently. This means
that we must now invoke the CPU hotplug callback at boot time in order
to configure IRQs, and since the callback also resets the PMU hardware,
we can remove the duplicate reset in the probe path.

This rework renders our event refcounting unnecessary, so this is
removed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: make armpmu_get_cpu_irq static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-03-31 18:20:29 +01:00
Mark Rutland
7ed98e0168 drivers/perf: arm_pmu: manage interrupts per-cpu
When requesting or freeing interrupts, we use platform_get_irq() to find
relevant irqs, backing this up with additional information in an
optional irq_affinity table.

This means that our irq request and free paths are tied to a
platform_device, and our request path must jump through a number of
hoops in order to determine the required affinity of each interrupt.

Given that the affinity must be static, we can compute the affinity once
up-front at probe time, simplifying the irq request and free paths. By
recording interrupts in a per-cpu data structure, we simplify a few
paths, and permit a subsequent rework of the request and free paths.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: rename local nr_irqs variable to avoid conflict with global]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-03-31 18:20:02 +01:00
Mark Rutland
2681f01842 drivers/perf: arm_pmu: rework per-cpu allocation
For historical reasons, we allocate per-cpu data associated with a PMU
rather late, in cpu_pmu_init, after we've parsed whatever hardware
information we were provided with.

In order to allow use to store some per-cpu data early in the probe
path, we need to allocate (and initialise) the per-cpu data earlier.
This patch reworks the way we allocate the pmu and associated per-cpu
data in order to make that possible.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: make armpmu_{alloc,free} static
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-03-31 18:19:45 +01:00
Ingo Molnar
e601757102 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h>
We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which
will have to be picked up from other headers and .c files.

Create a trivial placeholder <linux/sched/clock.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:27 +01:00
Neil Leeder
21bdbb7102 perf: add qcom l2 cache perf events driver
Adds perf events support for L2 cache PMU.

The L2 cache PMU driver is named 'l2cache_0' and can be used
with perf events to profile L2 events such as cache hits
and misses on Qualcomm Technologies processors.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
[will: minimise nesting in l2_cache_associate_cpu_with_cluster]
[will: use kstrtoul for unsigned long, remove redunant .owner setting]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-08 19:32:24 +00:00
Stephen Boyd
c0bfc549e9 perf: xgene: Include module.h
I ran into a build error when I disabled CONFIG_ACPI and tried to
compile this driver:

drivers/perf/xgene_pmu.c:1242:1: warning: data definition has no type or storage class
 MODULE_DEVICE_TABLE(of, xgene_pmu_of_match);
 ^
drivers/perf/xgene_pmu.c:1242:1: error: type defaults to 'int' in declaration of 'MODULE_DEVICE_TABLE' [-Werror=implicit-int]

Include module.h for the MODULE_DEVICE_TABLE macro that's
implicitly included through ACPI.

Tested-by: Tai Nguyen <ttnguyen@apm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-03 18:46:47 +00:00
Thomas Gleixner
73c1b41e63 cpu/hotplug: Cleanup state names
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-25 10:47:44 +01:00
Tai Nguyen
9a1a1f404b perf: xgene: Remove bogus IS_ERR() check
In acpi_get_pmu_hw_inf we pass the address of a local variable to IS_ERR(),
which doesn't make sense, as the pointer must be a real, valid pointer.
This doesn't cause a functional problem, as IS_ERR() will evaluate as
false, but the check is bogus and causes static checkers to complain.

Remove the bogus check.

The bug is reported by Dan Carpenter <dan.carpenter@oracle.com> in [1]

[1] https://www.spinics.net/lists/arm-kernel/msg535957.html

Signed-off-by: Tai Nguyen <ttnguyen@apm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-17 15:50:07 +01:00
Linus Torvalds
6afd563d4b ARM: SoC driver updates for v4.9
Driver updates for ARM SoCs, including a couple of newly added drivers:
 
 - The Qualcomm external bus interface 2 (EBI2), used in some of their
   mobile phone chips for connecting flash memory, LCD displays or
   other peripherals
 
 - Secure monitor firmware for Amlogic SoCs, and an NVMEM driver for the
   EFUSE based on that firmware interface.
 
 - Perf support for the AppliedMicro X-Gene performance monitor unit
 
 - Reset driver for STMicroelectronics STM32
 
 - Reset driver for SocioNext UniPhier SoCs
 
 Aside from these, there are minor updates to SoC-specific bus,
 clocksource, firmware, pinctrl, reset, rtc and pmic drivers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAV/gaimCrR//JCVInAQJaOQ/6A++YfLVmdF4wxgcu/0ti28lA7SkQIGJV
 UAsfCmqMEutbeDvnloVGmTV2K2NS7mzxdxsJGbVB7Oe/zdOFN+T9sf9hAlId01QA
 oVkoagpofoxlyKoKJ/l+heuEEZMa0Ekk3XXRTGv/Ovymo7252o4tEdGu9c+gyaMJ
 KqgixcrQRzxuWDgPpHUPUez2vY1iRMvvdcb0EmfiHcIgPOEJc6MIxulsqEIrkoMz
 WYeGFIeqRJxnrur3QD8WnD+aZD6bV01wkFTkWXGWg4H87QfEESgVBu5A7TL+5sL8
 1SlX/b7S5/ZJbrOiOS2IUyvbK7NiA/Q+NunHW2rMVnUWuEvJ9HAQB1kVSQH5LIYO
 6OBokjcijm6m/j6O6fdDfvNd6PLsIEUqfWVws7O+uofMMqKPxqak4VBTRdFM+aeF
 ZtK7mEbzteCX0bnC+XblZrseAlkIehYnP80CLDbtDTerTWP4gsjxGVt3U6MO0NzB
 K0ACWZOclzrcFscNKrmP6uPCpfZriiPV/XMCEHcylA/X2iYsVmpqKzdLuNs5aeUr
 uPzQbNWu9ygg/bDRXMYY2E3Kzjsc0eIOKEOPyhLaZdSo4e1FQxud6L2V2Vj0RLB/
 iMA7/CyQZqn6Yzgs0VMZm/bnh+hIdHioGFl5K5j6Fcw9VZRkNmnEQJzX4VU5efGO
 g1+5av0vFXg=
 =GvTq
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Arnd Bergmann:
 "Driver updates for ARM SoCs, including a couple of newly added
  drivers:

   - The Qualcomm external bus interface 2 (EBI2), used in some of their
     mobile phone chips for connecting flash memory, LCD displays or
     other peripherals

   - Secure monitor firmware for Amlogic SoCs, and an NVMEM driver for
     the EFUSE based on that firmware interface.

   - Perf support for the AppliedMicro X-Gene performance monitor unit

   - Reset driver for STMicroelectronics STM32

   - Reset driver for SocioNext UniPhier SoCs

  Aside from these, there are minor updates to SoC-specific bus,
  clocksource, firmware, pinctrl, reset, rtc and pmic drivers"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (50 commits)
  bus: qcom-ebi2: depend on HAS_IOMEM
  pinctrl: mvebu: orion5x: Generalise mv88f5181l support for 88f5181
  clk: mvebu: Add clk support for the orion5x SoC mv88f5181
  dt-bindings: EXYNOS: Add Exynos5433 PMU compatible
  clocksource: exynos_mct: Add the support for ARM64
  perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver
  Documentation: Add documentation for APM X-Gene SoC PMU DTS binding
  MAINTAINERS: Add entry for APM X-Gene SoC PMU driver
  bus: qcom: add EBI2 driver
  bus: qcom: add EBI2 device tree bindings
  rtc: rtc-pm8xxx: Add support for pm8018 rtc
  nvmem: amlogic: Add Amlogic Meson EFUSE driver
  firmware: Amlogic: Add secure monitor driver
  soc: qcom: smd: Reset rx tail rather than tx
  memory: atmel-sdramc: fix a possible NULL dereference
  reset: hi6220: allow to compile test driver on other architectures
  reset: zynq: add driver Kconfig option
  reset: sunxi: add driver Kconfig option
  reset: stm32: add driver Kconfig option
  reset: socfpga: add driver Kconfig option
  ...
2016-10-07 21:23:40 -07:00
Linus Torvalds
597f03f9d1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
 "Yet another batch of cpu hotplug core updates and conversions:

   - Provide core infrastructure for multi instance drivers so the
     drivers do not have to keep custom lists.

   - Convert custom lists to the new infrastructure. The block-mq custom
     list conversion comes through the block tree and makes the diffstat
     tip over to more lines removed than added.

   - Handle unbalanced hotplug enable/disable calls more gracefully.

   - Remove the obsolete CPU_STARTING/DYING notifier support.

   - Convert another batch of notifier users.

   The relayfs changes which conflicted with the conversion have been
   shipped to me by Andrew.

   The remaining lot is targeted for 4.10 so that we finally can remove
   the rest of the notifiers"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  cpufreq: Fix up conversion to hotplug state machine
  blk/mq: Reserve hotplug states for block multiqueue
  x86/apic/uv: Convert to hotplug state machine
  s390/mm/pfault: Convert to hotplug state machine
  mips/loongson/smp: Convert to hotplug state machine
  mips/octeon/smp: Convert to hotplug state machine
  fault-injection/cpu: Convert to hotplug state machine
  padata: Convert to hotplug state machine
  cpufreq: Convert to hotplug state machine
  ACPI/processor: Convert to hotplug state machine
  virtio scsi: Convert to hotplug state machine
  oprofile/timer: Convert to hotplug state machine
  block/softirq: Convert to hotplug state machine
  lib/irq_poll: Convert to hotplug state machine
  x86/microcode: Convert to hotplug state machine
  sh/SH-X3 SMP: Convert to hotplug state machine
  ia64/mca: Convert to hotplug state machine
  ARM/OMAP/wakeupgen: Convert to hotplug state machine
  ARM/shmobile: Convert to hotplug state machine
  arm64/FP/SIMD: Convert to hotplug state machine
  ...
2016-10-03 19:43:08 -07:00
Linus Torvalds
7af8a0f808 arm64 updates for 4.9:
- Support for execute-only page permissions
 - Support for hibernate and DEBUG_PAGEALLOC
 - Support for heterogeneous systems with mismatches cache line sizes
 - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug)
 - arm64 PMU perf updates, including cpumasks for heterogeneous systems
 - Set UTS_MACHINE for building rpm packages
 - Yet another head.S tidy-up
 - Some cleanups and refactoring, particularly in the NUMA code
 - Lots of random, non-critical fixes across the board
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJX7k31AAoJELescNyEwWM0XX0H/iOaWCfKlWOhvBsStGUCsLrK
 XryTzQT2KjdnLKf3jwP+1ateCuBR5ROurYxoDCX5/7mD63c5KiI338Vbv61a1lE1
 AAwjt1stmQVUg/j+kqnuQwB/0DYg+2C8se3D3q5Iyn7zc19cDZJEGcBHNrvLMufc
 XgHrgHgl/rzBDDlHJXleknDFge/MfhU5/Q1vJMRRb4JYrpAtmIokzCO75CYMRcCT
 ND2QbmppKtsyuFPGUTVbAFzJlP6dGKb3eruYta7/ct5d0pJQxav3u98D2yWGfjdM
 YaYq1EmX5Pol7rWumqLtk0+mA9yCFcKLLc+PrJu20Vx0UkvOq8G8Xt70sHNvZU8=
 =gdPM
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "It's a bit all over the place this time with no "killer feature" to
  speak of.  Support for mismatched cache line sizes should help people
  seeing whacky JIT failures on some SoCs, and the big.LITTLE perf
  updates have been a long time coming, but a lot of the changes here
  are cleanups.

  We stray outside arch/arm64 in a few areas: the arch/arm/ arch_timer
  workaround is acked by Russell, the DT/OF bits are acked by Rob, the
  arch_timer clocksource changes acked by Marc, CPU hotplug by tglx and
  jump_label by Peter (all CC'd).

  Summary:

   - Support for execute-only page permissions
   - Support for hibernate and DEBUG_PAGEALLOC
   - Support for heterogeneous systems with mismatches cache line sizes
   - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug)
   - arm64 PMU perf updates, including cpumasks for heterogeneous systems
   - Set UTS_MACHINE for building rpm packages
   - Yet another head.S tidy-up
   - Some cleanups and refactoring, particularly in the NUMA code
   - Lots of random, non-critical fixes across the board"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (100 commits)
  arm64: tlbflush.h: add __tlbi() macro
  arm64: Kconfig: remove SMP dependence for NUMA
  arm64: Kconfig: select OF/ACPI_NUMA under NUMA config
  arm64: fix dump_backtrace/unwind_frame with NULL tsk
  arm/arm64: arch_timer: Use archdata to indicate vdso suitability
  arm64: arch_timer: Work around QorIQ Erratum A-008585
  arm64: arch_timer: Add device tree binding for A-008585 erratum
  arm64: Correctly bounds check virt_addr_valid
  arm64: migrate exception table users off module.h and onto extable.h
  arm64: pmu: Hoist pmu platform device name
  arm64: pmu: Probe default hw/cache counters
  arm64: pmu: add fallback probe table
  MAINTAINERS: Update ARM PMU PROFILING AND DEBUGGING entry
  arm64: Improve kprobes test for atomic sequence
  arm64/kvm: use alternative auto-nop
  arm64: use alternative auto-nop
  arm64: alternative: add auto-nop infrastructure
  arm64: lse: convert lse alternatives NOP padding to use __nops
  arm64: barriers: introduce nops and __nops macros for NOP sequences
  arm64: sysreg: replace open-coded mrs_s/msr_s with {read,write}_sysreg_s
  ...
2016-10-03 08:58:35 -07:00
Arnd Bergmann
c55b2d989a X-Gene driver changes queued for v4.9
This patch set includes:
 + X-Gene SoC Performance Monitoring Unit (PMU) driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX24bQAAoJEB11UG/BVQ/gMZEP/3408fNJfE0LiUdcT+pq/4Iq
 udqJwGXAp3yIKUy1Ewl82N9HlyhWI5AFD7lsLbyeKghPDB33atva4h+ywg4h9bDK
 /Iug+E635JLj64w54qLCcenTCzKQQGVMcHNO7QZsKhHyvJbjKjFT1ZhlQiON8DxU
 uovSRergt8MVNwKbPaoE8kWsKIP/WEJWnxu9F4diWo0/W7a3rEta2urqvB2xwaot
 ZaPfm9/DY2rew6rXiUpAEyFCq72zOTHFzMLsU5HRafdANcdYBbF6XwflyaNqgysK
 KcXMNEDtCq6MQYj+B5fcKUrOY0TQLdajNQg/yaOqHwWlIdpFSFzU3ROGgLffPHiX
 t0b6w0dZJOQ2oZ6MwZNLqLDFloSn7uIovDrSQYrWn5AXcB+IU692AoBX2vKCUp95
 0P8EUuISrdgcX1IHGbHySK/zP0Y2s3skXn3+RHbNBvuTCljpa6Qo/5iytrIDW1N7
 8FjtqmqiIVUFrEjZwOxNj5LQMB3B5kQm7EBKtDN+ML3cKE1Bn/RTS8+wpDqoo0Xg
 NVNMSo7BIKDlkJStG1DWy2FlLReO1O1p1DbanSQsWywC9j357W5VuU8X+NjRgBHR
 NJBvpZpJewbOKItnoK4SQa2YrwOrSSPy1JlTWBQICnRvgHH17WXmtkhkjSfWAIFD
 e4ctgF/vMDaH26oKcW6z
 =OqRW
 -----END PGP SIGNATURE-----

Merge tag 'xgene-drivers-for-4.9' of https://github.com/AppliedMicro/xgene-next into next/drivers

Pull "X-Gene driver changes queued for v4.9" from Duc Dang:

This patch set includes:
+ X-Gene SoC Performance Monitoring Unit (PMU) driver

* tag 'xgene-drivers-for-4.9' of https://github.com/AppliedMicro/xgene-next:
  perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver
  Documentation: Add documentation for APM X-Gene SoC PMU DTS binding
  MAINTAINERS: Add entry for APM X-Gene SoC PMU driver
2016-09-19 17:05:14 +02:00
Mark Salter
dbee3a74ef arm64: pmu: add fallback probe table
In preparation for ACPI support, add a pmu_probe_info table to
the arm_pmu_device_probe() call. This table gets used when
probing in the absence of a devicetree node for PMU.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-16 17:11:33 +01:00
Tai Nguyen
832c927d11 perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver
This patch adds a driver for the SoC-wide (AKA uncore) PMU hardware
found in APM X-Gene SoCs.

Signed-off-by: Tai Nguyen <ttnguyen@apm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
2016-09-15 11:20:55 -07:00
Mark Rutland
48538b5863 drivers/perf: arm_pmu: expose a cpumask in sysfs
In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
each of which covers a subset of CPUs in the system. In some cases
userspace needs to know which CPUs a given logical PMU covers, so we'd
like to expose a cpumask under sysfs, similar to what is done for uncore
PMUs.

Unfortunately, prior to commit 00e727bb38 ("perf stat: Balance
opening and reading events"), perf stat only correctly handled a cpumask
holding a single CPU, and only when profiling in system-wide mode. In
other cases, the presence of a cpumask file could cause perf stat to
behave erratically.

Thus, exposing a cpumask file would break older perf binaries in cases
where they would otherwise work.

To avoid this issue while still providing userspace with the information
it needs, this patch exposes a differently-named file (cpus) under
sysfs. New tools can look for this and operate correctly, while older
tools will not be adversely affected by its presence.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Mark Rutland
1589680da6 drivers/perf: arm_pmu: only use common attr_groups
Now that the 32-bit and 64-bit perf backends use the common groups
directly, remove the fallback and no longer allow the groups array to be
overridden.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Mark Rutland
86cdd72af9 drivers/perf: arm_pmu: add common attr group fields
In preparation for adding common attribute groups, add an array of
attribute group pointers to arm_pmu, which will be used if the
backend hasn't already set pmu::attr_groups.

Subsequent patches will move backends over to using these, before adding
common fields.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Marc Zyngier
282b879635 drivers/perf: arm_pmu: Always consider IRQ0 as an error
As declared by the chief penguin, and enforced by the NO_IRQ brigade,
IRQ0 doesn't exist, and is considered as an error (no irq).

Unfortunately, the arm_pmu driver still considers it as valid in
a large number of cases. Let's fix this.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-06 18:26:33 +01:00
Sebastian Andrzej Siewior
6e103c0cfe arm/perf: Use multi instance instead of custom list
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160817171420.sdwk2qivxunzryz4@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-02 20:05:06 +02:00
Stefan Wahren
63fb0a9516 drivers/perf: arm_pmu: Fix NULL pointer dereference during probe
Patch 7f1d642fbb ("drivers/perf: arm-pmu: Fix handling of SPI lacking
interrupt-affinity property") unintended also fixes perf_event support
for bcm2835 which doesn't have PMU interrupts. Unfortunately this change
introduce a NULL pointer dereference on bcm2835, because irq_is_percpu
always expected to be called with a valid IRQ. So fix this regression
by validating the IRQ before.

Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 7f1d642fbb ("drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-02 17:17:52 +01:00
Stefan Wahren
753246840d drivers/perf: arm_pmu: Fix leak in error path
In case of a IRQ type mismatch in of_pmu_irq_cfg() the
device node for interrupt affinity isn't freed. So fix this
issue by calling of_node_put().

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: fa8ad7889d ("arm: perf: factor arm_pmu core out to drivers")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-02 17:17:52 +01:00
Marc Zyngier
7f1d642fbb drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property
Patch 19a469a587 ("drivers/perf: arm-pmu: Handle per-interrupt
affinity mask") added support for partitionned PPI setups, but
inadvertently broke setups using SPIs without the "interrupt-affinity"
property (which is the case for UP platforms).

This patch restore the broken functionnality by testing whether the
interrupt is percpu or not instead of relying on the using_spi flag
that really means "SPI *and* interrupt-affinity property".

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 19a469a587 ("drivers/perf: arm-pmu: Handle per-interrupt affinity mask")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-09 17:57:39 +01:00
Sudeep Holla
a026bb12cc drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock
arm_pmu_mutex is never held long and we don't want to sleep while the
lock is being held as it's executed in the context of hotplug notifiers.
So it can be converted to a simple spinlock instead.

Without this patch we get the following warning:

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/2
no locks held by swapper/2/0.
irq event stamp: 381314
hardirqs last  enabled at (381313): _raw_spin_unlock_irqrestore+0x7c/0x88
hardirqs last disabled at (381314): cpu_die+0x28/0x48
softirqs last  enabled at (381294): _local_bh_enable+0x28/0x50
softirqs last disabled at (381293): irq_enter+0x58/0x78
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.7.0 #12
Call trace:
 dump_backtrace+0x0/0x220
 show_stack+0x24/0x30
 dump_stack+0xb4/0xf0
 ___might_sleep+0x1d8/0x1f0
 __might_sleep+0x5c/0x98
 mutex_lock_nested+0x54/0x400
 arm_perf_starting_cpu+0x34/0xb0
 cpuhp_invoke_callback+0x88/0x3d8
 notify_cpu_starting+0x78/0x98
 secondary_start_kernel+0x108/0x1a8

This patch converts the mutex to spinlock to eliminate the above
warnings. This constraints pmu->reset to be non-blocking call which is
the case with all the ARM PMU backends.

Cc: Stephen Boyd <sboyd@codeaurora.org>
Fixes: 37b502f121 ("arm/perf: Fix hotplug state machine conversion")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-09 17:33:38 +01:00
Linus Torvalds
a6408f6cb6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the next part of the hotplug rework.

   - Convert all notifiers with a priority assigned

   - Convert all CPU_STARTING/DYING notifiers

     The final removal of the STARTING/DYING infrastructure will happen
     when the merge window closes.

  Another 700 hundred line of unpenetrable maze gone :)"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  timers/core: Correct callback order during CPU hot plug
  leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
  powerpc/numa: Convert to hotplug state machine
  arm/perf: Fix hotplug state machine conversion
  irqchip/armada: Avoid unused function warnings
  ARC/time: Convert to hotplug state machine
  clocksource/atlas7: Convert to hotplug state machine
  clocksource/armada-370-xp: Convert to hotplug state machine
  clocksource/exynos_mct: Convert to hotplug state machine
  clocksource/arm_global_timer: Convert to hotplug state machine
  rcu: Convert rcutree to hotplug state machine
  KVM/arm/arm64/vgic-new: Convert to hotplug state machine
  smp/cfd: Convert core to hotplug state machine
  x86/x2apic: Convert to CPU hotplug state machine
  profile: Convert to hotplug state machine
  timers/core: Convert to hotplug state machine
  hrtimer: Convert to hotplug state machine
  x86/tboot: Convert to hotplug state machine
  arm64/armv8 deprecated: Convert to hotplug state machine
  hwtracing/coresight-etm4x: Convert to hotplug state machine
  ...
2016-07-29 13:55:30 -07:00
Sebastian Andrzej Siewior
37b502f121 arm/perf: Fix hotplug state machine conversion
Mark Rutland pointed out that this commit is incomplete:

  7d88eb695a ("arm/perf: Convert to hotplug state machine")

The problem is that:

 > We may have multiple PMUs (e.g. two in big.LITTLE systems), and
 > __oprofile_cpu_pmu only contains one of these. So this conversion is not
 > correct.
 >
 > We were relying on the notifier list implicitly containing a list of
 > those PMUs. It seems like we need an explicit list here.
 >
 > We keep __oprofile_cpu_pmu around for legacy 32-bit users of OProfile
 > (on non-hetereogeneous systems), and that's all that the variable should
 > be used for.

Introduce arm_pmu_list to correctly handle multiple PMUs in the system.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-tip-commits@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160719111733.GA22911@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-20 09:57:34 +02:00
Thomas Gleixner
7d88eb695a arm/perf: Convert to hotplug state machine
Straight forward conversion w/o bells and whistles.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153335.794097159@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15 10:40:23 +02:00
Marc Zyngier
19a469a587 drivers/perf: arm-pmu: Handle per-interrupt affinity mask
On a big-little system, PMUs can be wired to CPUs using per CPU
interrups (PPI). In this case, it is important to make sure that
the enable/disable do happen on the right set of CPUs.

So instead of relying on the interrupt-affinity property, we can
use the actual percpu affinity that DT exposes as part of the
interrupt specifier. The DT binding is also updated to reflect
the fact that the interrupt-affinity property shouldn't be used
in that case.

Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-08 17:39:55 +01:00
Mark Salter
f7a6c1492a arm: pmu: Fix non-devicetree probing
There is a problem in the non-devicetree PMU probing where some
probe functions may get the number of supported events through
smp_call_function_any() using the arm_pmu supported_cpus mask.
But at the time the probe function is called, the supported_cpus
mask is empty so the call fails. This patch makes sure the mask
is set before calling the init function rather than after.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-15 09:51:35 +01:00
Julien Grall
5988a363ed drivers/perf: arm_pmu: Avoid leaking pmu->irq_affinity on error
pmu->irq_affinity will not be freed if an error occurred within
arm_pmu_device_probe after of_pmu_irq_cfg has been called.

Note that in the case of_pmu_irq_cfg is returning an error,
pmu->irq_affinity will not be set, but it should be NULL as pmu was
kzalloc'd. Therefore the result kfree(NULL) is benign.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03 10:16:21 +01:00
Julien Grall
0f254c7671 drivers/perf: arm_pmu: Defer the setting of __oprofile_cpu_pmu
The global variable __oprofile_cpu_pmu is set before the PMU is fully
initialized. If an error occurs before the end of the initialization,
the PMU will be freed and the variable will contain an invalid pointer.

This will result in a kernel crash when perf will be used.

Fix it by moving the setting of __oprofile_cpu_pmu when the PMU is fully
initialized (i.e when it is no longer possible to fail).

Cc: <stable@vger.kernel.org>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03 10:16:21 +01:00
Julien Grall
121323ae66 drivers/perf: arm_pmu: Fix reference count of a device_node in of_pmu_irq_cfg
The only function called by of_pmu_irq_cfg that will increment the
reference count on dn is of_parse_phandle.

Each time we successfully parse a possible CPU from an
interrupt-affinity property, we increment the refcount of that CPU node
once via of_parse_handle. After validating the CPU is possible, we
decrement the refcount once. Subsequently, we decrement the refcount
again, either as part of an early break if we don't have a matching SPI,
or as part of the end of the loop body.

This will lead to decrementing twice the refcounnt.
Remove the second pairs of call to of_node_put as nobody is using dn
between the first and second call to of_node_put.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03 10:16:20 +01:00
Mark Rutland
5101ef20f0 perf/arm: Special-case hetereogeneous CPUs
Commit:

  2665784850 ("perf/core: Verify we have a single perf_hw_context PMU")

forcefully prevents multiple PMUs from sharing perf_hw_context, as this
generally doesn't make sense. It is a common bug for uncore PMUs to
use perf_hw_context rather than perf_invalid_context, which this detects.

However, systems exist with heterogeneous CPUs (and hence heterogeneous
HW PMUs), for which sharing perf_hw_context is necessary, and possible
in some limited cases.

To make this work we have to perform some gymnastics, as we did in these
commits:

  66eb579e66 ("perf: allow for PMU-specific event filtering")
  c904e32a69 ("arm: perf: filter unschedulable events")

To allow those systems to work, we must allow PMUs for heterogeneous
CPUs to share perf_hw_context, though we must still disallow sharing
otherwise to detect the common misuse of perf_hw_context.

This patch adds a new PERF_PMU_CAP_HETEROGENEOUS_CPUS for this, updates
the core logic to account for this, and makes use of it in the arm_pmu
code that is used for systems with heterogeneous CPUs. Comments are
added to make the rationale clear and hopefully avoid accidental abuse.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20160426103346.GA20836@leverpostej
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:13:59 +02:00
Lorenzo Pieralisi
cbcc72e037 drivers/perf: arm-pmu: fix RCU usage on pmu resume from low-power
Commit da4e4f18af ("drivers/perf: arm_pmu: implement CPU_PM notifier")
added code in the arm perf infrastructure that allows the kernel to
save/restore perf counters whenever the CPU enters a low-power
state. The kernel saves/restores the counters for each active event
through the armpmu_{stop/start} ARM pmu API, so that the low-power state
enter/exit cycle is emulated through pmu start/stop operations for each
event in use.

However, calling armpmu_start() for each active event on power up
executes code that requires RCU locking (perf_event_update_userpage())
to be functional, so, given that the core may call the CPU_PM notifiers
while running the idle thread in an quiescent RCU state this is not
allowed as detected through the following splat when kernel is run with
CONFIG_PROVE_LOCKING enabled:

[   49.293286]
[   49.294761] ===============================
[   49.298895] [ INFO: suspicious RCU usage. ]
[   49.303031] 4.6.0-rc3+ #421 Not tainted
[   49.306821] -------------------------------
[   49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
illegally while idle!
[   49.318530]
[   49.318530] other info that might help us debug this:
[   49.318530]
[   49.326451]
[   49.326451] RCU used illegally from idle CPU!
[   49.326451] rcu_scheduler_active = 1, debug_locks = 0
[   49.337209] RCU used illegally from extended quiescent state!
[   49.342892] 2 locks held by swapper/2/0:
[   49.346768]  #0:  (cpu_pm_notifier_lock){......}, at:
[<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
[   49.355492]  #1:  (rcu_read_lock){......}, at: [<ffffff800816dc38>]
perf_event_update_userpage+0x0/0x260

This patch wraps the armpmu_start() call (that indirectly calls
perf_event_update_userpage()) on CPU_PM notifier power state exit (or
failed entry) within the RCU_NONIDLE() macro so that the RCU subsystem
is made aware the calling cpu is not idle from an RCU perspective for
the armpmu_start() call duration, therefore fixing the issue.

Fixes: da4e4f18af ("drivers/perf: arm_pmu: implement CPU_PM notifier")
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reported-by: James Morse <james.morse@arm.com>
Suggested-by: Kevin Hilman <khilman@baylibre.com>
Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-04-21 15:03:06 +01:00
Will Deacon
357b565d5d drivers/perf: arm_pmu: avoid NULL dereference when not using devicetree
Commit c6b90653f1 ("drivers/perf: arm_pmu: make info messages more
verbose") breaks booting on systems where the PMU is probed without
devicetree (e.g by inspecting the MIDR of the current CPU). In this case,
pdev->dev.of_node is NULL and we shouldn't try to access its ->fullname
field when printing probe error messages.

This patch fixes the probing code to use of_node_full_name, which safely
handles NULL nodes and removes the "Error %i" part of the string, since
it's not terribly useful.

Reported-by: Guenter Roeck <private@roeck-us.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-21 11:36:17 +00:00
Lorenzo Pieralisi
da4e4f18af drivers/perf: arm_pmu: implement CPU_PM notifier
When a CPU is suspended (either through suspend-to-RAM or CPUidle),
its PMU registers content can be lost, which means that counters
registers values that were initialized on power down entry have to be
reprogrammed on power-up to make sure the counters set-up is preserved
(ie on power-up registers take the reset values on Cold or Warm reset,
which can be architecturally UNKNOWN).

To guarantee seamless profiling conditions across a core power down
this patch adds a CPU PM notifier to ARM pmus, that upon CPU PM
entry/exit from low-power states saves/restores the pmu registers
set-up (by using the ARM perf API), so that the power-down/up cycle does
not affect the perf behaviour (apart from a black-out period between
power-up/down CPU PM notifications that is unavoidable).

Cc: Will Deacon <will.deacon@arm.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-26 14:37:06 +00:00
Dirk Behme
c6b90653f1 drivers/perf: arm_pmu: make info messages more verbose
On a big.LITTLE system e.g. with Cortex A57 and A53 in case not all cores
are online at PMU probe time we might get

hw perfevents: failed to probe PMU!
hw perfevents: failed to register PMU devices!

making it unclear which cores failed, here.

Add the device tree full name which failed and the error value resulting
in a more verbose and helpful message like

hw perfevents: /soc/pmu_a53: failed to probe PMU! Error -6
hw perfevents: /soc/pmu_a53: failed to register PMU devices! Error -6

Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-09 11:18:39 +00:00
Martin Fuzzey
8d1a0ae724 ARM: perf: Set ARMv7 SDER SUNIDEN bit
ARMv7 counters other than the CPU cycle counter only work if the Secure
Debug Enable Register (SDER) SUNIDEN bit is set.

Since access to the SDER is only possible in secure state, it will
only be done if the device tree property "secure-reg-access" is set.

Without this:

 Performance counter stats for 'sleep 1':

          14606094 cycles                    #    0.000 GHz
                 0 instructions              #    0.00  insns per cycle

After applying:

 Performance counter stats for 'sleep 1':

           5843809 cycles
           2566484 instructions              #    0.44  insns per cycle

       1.020144000 seconds time elapsed

Some platforms (eg i.MX53) may also need additional platform specific
setup.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Signed-off-by: George G. Davis <george_davis@mentor.com>
[will: add warning if property is found on arm64]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-01-25 18:37:44 +00:00
Mark Rutland
b916b785af drivers/perf: kill armpmu_register
Nothing outside of drivers/perf/arm_pmu.c should call armpmu_register
any more, so it no longer needs to be in include/linux/perf/arm_pmu.h.
Additionally, by folding it in to arm_pmu_device_probe we can allow
drivers to override struct pmu fields without getting blatted by the
armpmu code.

This patch folds armpmu_register into arm_pmu_device_probe. The logging
to the console is moved to after the PMU is successfully registered with
the core perf code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Cc: Drew Richardson <drew.richardson@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-11-16 15:41:49 +00:00
Linus Torvalds
2dc10ad81f arm64 updates for 4.4:
- "genirq: Introduce generic irq migration for cpu hotunplugged" patch
   merged from tip/irq/for-arm to allow the arm64-specific part to be
   upstreamed via the arm64 tree
 
 - CPU feature detection reworked to cope with heterogeneous systems
   where CPUs may not have exactly the same features. The features
   reported by the kernel via internal data structures or ELF_HWCAP are
   delayed until all the CPUs are up (and before user space starts)
 
 - Support for 16KB pages, with the additional bonus of a 36-bit VA
   space, though the latter only depending on EXPERT
 
 - Implement native {relaxed, acquire, release} atomics for arm64
 
 - New ASID allocation algorithm which avoids IPI on roll-over, together
   with TLB invalidation optimisations (using local vs global where
   feasible)
 
 - KASan support for arm64
 
 - EFI_STUB clean-up and isolation for the kernel proper (required by
   KASan)
 
 - copy_{to,from,in}_user optimisations (sharing the memcpy template)
 
 - perf: moving arm64 to the arm32/64 shared PMU framework
 
 - L1_CACHE_BYTES increased to 128 to accommodate Cavium hardware
 
 - Support for the contiguous PTE hint on kernel mapping (16 consecutive
   entries may be able to use a single TLB entry)
 
 - Generic CONFIG_HZ now used on arm64
 
 - defconfig updates
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWOkmIAAoJEGvWsS0AyF7x4GgQAINU3NePjFFvWZNCkqobeH9+
 jFKwtXamIudhTSdnXNXyYWmtRL9Krg3qI4zDQf68dvDFAZAze2kVuOi1yPpCbpFZ
 /j/afNyQc7+PoyqRAzmT+EMPZlcuOA84Prrl1r3QWZ58QaFeVk/6ZxrHunTHxN0x
 mR9PIXfWx73MTo+UnG8FChkmEY6LmV4XpemgTaMR9FqFhdT51OZSxDDAYXOTm4JW
 a5HdN9OWjjJ2rhLlFEaC7tszG9B5doHdy2tr5ge/YERVJzIPDogHkMe8ZhfAJc+x
 SQU5tKN6Pg4MOi+dLhxlk0/mKCvHLiEQ5KVREJnt8GxupAR54Bat+DQ+rP9cSnpq
 dRQTcARIOyy9LGgy+ROAsSo+NiyM5WuJ0/WJUYKmgWTJOfczRYoZv6TMKlwNOUYb
 tGLCZHhKPM3yBHJlWbQykl3xmSuudxCMmjlZzg7B+MVfTP6uo0CRSPmYl+v67q+J
 bBw/Z2RYXWYGnvlc6OfbMeImI6prXeE36+5ytyJFga0m+IqcTzRGzjcLxKEvdbiU
 pr8n9i+hV9iSsT/UwukXZ8ay6zH7PrTLzILWQlieutfXlvha7MYeGxnkbLmdYcfe
 GCj374io5cdImHcVKmfhnOMlFOLuOHphl9cmsd/O2LmCIqBj9BIeNH2Om8mHVK2F
 YHczMdpESlJApE7kUc1e
 =3six
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - "genirq: Introduce generic irq migration for cpu hotunplugged" patch
   merged from tip/irq/for-arm to allow the arm64-specific part to be
   upstreamed via the arm64 tree

 - CPU feature detection reworked to cope with heterogeneous systems
   where CPUs may not have exactly the same features.  The features
   reported by the kernel via internal data structures or ELF_HWCAP are
   delayed until all the CPUs are up (and before user space starts)

 - Support for 16KB pages, with the additional bonus of a 36-bit VA
   space, though the latter only depending on EXPERT

 - Implement native {relaxed, acquire, release} atomics for arm64

 - New ASID allocation algorithm which avoids IPI on roll-over, together
   with TLB invalidation optimisations (using local vs global where
   feasible)

 - KASan support for arm64

 - EFI_STUB clean-up and isolation for the kernel proper (required by
   KASan)

 - copy_{to,from,in}_user optimisations (sharing the memcpy template)

 - perf: moving arm64 to the arm32/64 shared PMU framework

 - L1_CACHE_BYTES increased to 128 to accommodate Cavium hardware

 - Support for the contiguous PTE hint on kernel mapping (16 consecutive
   entries may be able to use a single TLB entry)

 - Generic CONFIG_HZ now used on arm64

 - defconfig updates

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (91 commits)
  arm64/efi: fix libstub build under CONFIG_MODVERSIONS
  ARM64: Enable multi-core scheduler support by default
  arm64/efi: move arm64 specific stub C code to libstub
  arm64: page-align sections for DEBUG_RODATA
  arm64: Fix build with CONFIG_ZONE_DMA=n
  arm64: Fix compat register mappings
  arm64: Increase the max granular size
  arm64: remove bogus TASK_SIZE_64 check
  arm64: make Timer Interrupt Frequency selectable
  arm64/mm: use PAGE_ALIGNED instead of IS_ALIGNED
  arm64: cachetype: fix definitions of ICACHEF_* flags
  arm64: cpufeature: declare enable_cpu_capabilities as static
  genirq: Make the cpuhotplug migration code less noisy
  arm64: Constify hwcap name string arrays
  arm64/kvm: Make use of the system wide safe values
  arm64/debug: Make use of the system wide safe value
  arm64: Move FP/ASIMD hwcap handling to common code
  arm64/HWCAP: Use system wide safe values
  arm64/capabilities: Make use of system wide safe value
  arm64: Delay cpu feature capability checks
  ...
2015-11-04 14:47:13 -08:00
Will Deacon
fb659882cc drivers/perf: arm_pmu: avoid CPU device_node reference leak
of_cpu_device_node_get increments the reference count on the CPU
device_node, so we must take care to of_node_put once we've finished
with it.

This patch fixes the perf IRQ probing code to avoid the leak.

Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-10-15 17:11:23 +02:00
Mark Rutland
6475b2d846 arm64: perf: move to shared arm_pmu framework
Now that the arm_pmu framework has been factored out to drivers/perf we
can make use of it for arm64, gaining support for heterogeneous PMUs
and unifying the two codebases before they diverge further.

The as yet unused PMU name for PMUv3 is changed to armv8_pmuv3, matching
the style previously applied to the 32-bit PMUs.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-10-07 14:24:48 +01:00
Mark Rutland
fa8ad7889d arm: perf: factor arm_pmu core out to drivers
To enable sharing of the arm_pmu code with arm64, this patch factors it
out to drivers/perf/. A new drivers/perf directory is added for
performance monitor drivers to live under.

MAINTAINERS is updated accordingly. Files added previously without a
corresponsing MAINTAINERS update (perf_regs.c, perf_callchain.c, and
perf_event.h) are also added.

Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: augmented Kconfig help slightly]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-31 15:01:14 +01:00