Commit Graph

818 Commits

Author SHA1 Message Date
Linus Torvalds
c150b809f7 RISC-V Patches for the 6.9 Merge Window
* Support for various vector-accelerated crypto routines.
 * Hibernation is now enabled for portable kernel builds.
 * mmap_rnd_bits_max is larger on systems with larger VAs.
 * Support for fast GUP.
 * Support for membarrier-based instruction cache synchronization.
 * Support for the Andes hart-level interrupt controller and PMU.
 * Some cleanups around unaligned access speed probing and Kconfig
   settings.
 * Support for ACPI LPI and CPPC.
 * Various cleanus related to barriers.
 * A handful of fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmX9icgTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYib+UD/4xyL6UMixx6A06BVBL9UT4vOrxRvNr
 JIihG5y5QNMjes9DHWL35mZTMqFtQ0tq94ViWFLmJWloV/8KRVM2C9R9KX7vplf3
 M/OwvP106spxgvNHoeQbycgs42RU1t2mpqT7N1iK2hCjqieP3vLn6hsSLXWTAG0L
 3gQbQw6XCLC3hPyLq+nbFY2i4faeCmpXWmixoy/IvQ5calZQrRU0LNlP6lcMBhVo
 uocjG0uGAhrahw2s81jxcMZcxa3AvUCiplapdD5H5v9rBM85SkYJj2Q9SqdSorkb
 xzuimRnKPI5s47yM3pTfZY0qnQUYHV7PXXuw4WujpCQVQdhaG+Ggq63UUZA61J9t
 IzZK2zdcfHqICrGTtXImUzRT3dcc3oq+IFq4tTY+rEJm29hrXkAtx+qBm5xtMvax
 fJz5feJ/iT0u7MDj4Oq24n+Kpl+Olm+MJaZX3m5Ovi/9V6a9iK9HXqxg9/Fs0fMO
 +J/0kTgd8Vu9CYH7KNWz3uztcO9eMAH3VyzuXuab4BGj1i1Y/9EjpALQi7rDN73S
 OsYQX6NnzMkBV4dvElJVLXiPlvNlMHZZwdak5CqPb48jaJu6iiIZAuvOrG6/naGP
 wnQSLVA2WWWoOkl3AJhxfpa11CLhbMl9E2gYm1VtNvASXoSFIxlAq1Yv3sG8yjty
 4ZT0rYFJOstYiQ==
 =3dL5
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for various vector-accelerated crypto routines

 - Hibernation is now enabled for portable kernel builds

 - mmap_rnd_bits_max is larger on systems with larger VAs

 - Support for fast GUP

 - Support for membarrier-based instruction cache synchronization

 - Support for the Andes hart-level interrupt controller and PMU

 - Some cleanups around unaligned access speed probing and Kconfig
   settings

 - Support for ACPI LPI and CPPC

 - Various cleanus related to barriers

 - A handful of fixes

* tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits)
  riscv: Fix syscall wrapper for >word-size arguments
  crypto: riscv - add vector crypto accelerated AES-CBC-CTS
  crypto: riscv - parallelize AES-CBC decryption
  riscv: Only flush the mm icache when setting an exec pte
  riscv: Use kcalloc() instead of kzalloc()
  riscv/barrier: Add missing space after ','
  riscv/barrier: Consolidate fence definitions
  riscv/barrier: Define RISCV_FULL_BARRIER
  riscv/barrier: Define __{mb,rmb,wmb}
  RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ
  cpufreq: Move CPPC configs to common Kconfig and add RISC-V
  ACPI: RISC-V: Add CPPC driver
  ACPI: Enable ACPI_PROCESSOR for RISC-V
  ACPI: RISC-V: Add LPI driver
  cpuidle: RISC-V: Move few functions to arch/riscv
  riscv: Introduce set_compat_task() in asm/compat.h
  riscv: Introduce is_compat_thread() into compat.h
  riscv: add compile-time test into is_compat_task()
  riscv: Replace direct thread flag check with is_compat_task()
  riscv: Improve arch_get_mmap_end() macro
  ...
2024-03-22 10:41:13 -07:00
Sunil V L
6649182a38
cpuidle: RISC-V: Move few functions to arch/riscv
To support ACPI Low Power Idle (LPI), few functions are required which
are currently static functions in the DT based cpuidle driver. Hence,
move them under arch/riscv so that ACPI driver also can use them. Since
they are no longer static functions, append "riscv_" prefix to the
function name.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20240118062930.245937-2-sunilvl@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-19 17:51:38 -07:00
Linus Torvalds
902861e34c - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
from hotplugged memory rather than only from main memory.  Series
   "implement "memmap on memory" feature on s390".
 
 - More folio conversions from Matthew Wilcox in the series
 
 	"Convert memcontrol charge moving to use folios"
 	"mm: convert mm counter to take a folio"
 
 - Chengming Zhou has optimized zswap's rbtree locking, providing
   significant reductions in system time and modest but measurable
   reductions in overall runtimes.  The series is "mm/zswap: optimize the
   scalability of zswap rb-tree".
 
 - Chengming Zhou has also provided the series "mm/zswap: optimize zswap
   lru list" which provides measurable runtime benefits in some
   swap-intensive situations.
 
 - And Chengming Zhou further optimizes zswap in the series "mm/zswap:
   optimize for dynamic zswap_pools".  Measured improvements are modest.
 
 - zswap cleanups and simplifications from Yosry Ahmed in the series "mm:
   zswap: simplify zswap_swapoff()".
 
 - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has
   contributed several DAX cleanups as well as adding a sysfs tunable to
   control the memmap_on_memory setting when the dax device is hotplugged
   as system memory.
 
 - Johannes Weiner has added the large series "mm: zswap: cleanups",
   which does that.
 
 - More DAMON work from SeongJae Park in the series
 
 	"mm/damon: make DAMON debugfs interface deprecation unignorable"
 	"selftests/damon: add more tests for core functionalities and corner cases"
 	"Docs/mm/damon: misc readability improvements"
 	"mm/damon: let DAMOS feeds and tame/auto-tune itself"
 
 - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs
   extension" Rakie Kim has developed a new mempolicy interleaving policy
   wherein we allocate memory across nodes in a weighted fashion rather
   than uniformly.  This is beneficial in heterogeneous memory environments
   appearing with CXL.
 
 - Christophe Leroy has contributed some cleanup and consolidation work
   against the ARM pagetable dumping code in the series "mm: ptdump:
   Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute".
 
 - Luis Chamberlain has added some additional xarray selftesting in the
   series "test_xarray: advanced API multi-index tests".
 
 - Muhammad Usama Anjum has reworked the selftest code to make its
   human-readable output conform to the TAP ("Test Anything Protocol")
   format.  Amongst other things, this opens up the use of third-party
   tools to parse and process out selftesting results.
 
 - Ryan Roberts has added fork()-time PTE batching of THP ptes in the
   series "mm/memory: optimize fork() with PTE-mapped THP".  Mainly
   targeted at arm64, this significantly speeds up fork() when the process
   has a large number of pte-mapped folios.
 
 - David Hildenbrand also gets in on the THP pte batching game in his
   series "mm/memory: optimize unmap/zap with PTE-mapped THP".  It
   implements batching during munmap() and other pte teardown situations.
   The microbenchmark improvements are nice.
 
 - And in the series "Transparent Contiguous PTEs for User Mappings" Ryan
   Roberts further utilizes arm's pte's contiguous bit ("contpte
   mappings").  Kernel build times on arm64 improved nicely.  Ryan's series
   "Address some contpte nits" provides some followup work.
 
 - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has
   fixed an obscure hugetlb race which was causing unnecessary page faults.
   He has also added a reproducer under the selftest code.
 
 - In the series "selftests/mm: Output cleanups for the compaction test",
   Mark Brown did what the title claims.
 
 - Kinsey Ho has added the series "mm/mglru: code cleanup and refactoring".
 
 - Even more zswap material from Nhat Pham.  The series "fix and extend
   zswap kselftests" does as claimed.
 
 - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX
   regression" Mathieu Desnoyers has cleaned up and fixed rather a mess in
   our handling of DAX on archiecctures which have virtually aliasing data
   caches.  The arm architecture is the main beneficiary.
 
 - Lokesh Gidra's series "per-vma locks in userfaultfd" provides dramatic
   improvements in worst-case mmap_lock hold times during certain
   userfaultfd operations.
 
 - Some page_owner enhancements and maintenance work from Oscar Salvador
   in his series
 
 	"page_owner: print stacks and their outstanding allocations"
 	"page_owner: Fixup and cleanup"
 
 - Uladzislau Rezki has contributed some vmalloc scalability improvements
   in his series "Mitigate a vmap lock contention".  It realizes a 12x
   improvement for a certain microbenchmark.
 
 - Some kexec/crash cleanup work from Baoquan He in the series "Split
   crash out from kexec and clean up related config items".
 
 - Some zsmalloc maintenance work from Chengming Zhou in the series
 
 	"mm/zsmalloc: fix and optimize objects/page migration"
 	"mm/zsmalloc: some cleanup for get/set_zspage_mapping()"
 
 - Zi Yan has taught the MM to perform compaction on folios larger than
   order=0.  This a step along the path to implementaton of the merging of
   large anonymous folios.  The series is named "Enable >0 order folio
   memory compaction".
 
 - Christoph Hellwig has done quite a lot of cleanup work in the
   pagecache writeback code in his series "convert write_cache_pages() to
   an iterator".
 
 - Some modest hugetlb cleanups and speedups in Vishal Moola's series
   "Handle hugetlb faults under the VMA lock".
 
 - Zi Yan has changed the page splitting code so we can split huge pages
   into sizes other than order-0 to better utilize large folios.  The
   series is named "Split a folio to any lower order folios".
 
 - David Hildenbrand has contributed the series "mm: remove
   total_mapcount()", a cleanup.
 
 - Matthew Wilcox has sought to improve the performance of bulk memory
   freeing in his series "Rearrange batched folio freeing".
 
 - Gang Li's series "hugetlb: parallelize hugetlb page init on boot"
   provides large improvements in bootup times on large machines which are
   configured to use large numbers of hugetlb pages.
 
 - Matthew Wilcox's series "PageFlags cleanups" does that.
 
 - Qi Zheng's series "minor fixes and supplement for ptdesc" does that
   also.  S390 is affected.
 
 - Cleanups to our pagemap utility functions from Peter Xu in his series
   "mm/treewide: Replace pXd_large() with pXd_leaf()".
 
 - Nico Pache has fixed a few things with our hugepage selftests in his
   series "selftests/mm: Improve Hugepage Test Handling in MM Selftests".
 
 - Also, of course, many singleton patches to many things.  Please see
   the individual changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZfJpPQAKCRDdBJ7gKXxA
 joxeAP9TrcMEuHnLmBlhIXkWbIR4+ki+pA3v+gNTlJiBhnfVSgD9G55t1aBaRplx
 TMNhHfyiHYDTx/GAV9NXW84tasJSDgA=
 =TG55
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
   from hotplugged memory rather than only from main memory. Series
   "implement "memmap on memory" feature on s390".

 - More folio conversions from Matthew Wilcox in the series

	"Convert memcontrol charge moving to use folios"
	"mm: convert mm counter to take a folio"

 - Chengming Zhou has optimized zswap's rbtree locking, providing
   significant reductions in system time and modest but measurable
   reductions in overall runtimes. The series is "mm/zswap: optimize the
   scalability of zswap rb-tree".

 - Chengming Zhou has also provided the series "mm/zswap: optimize zswap
   lru list" which provides measurable runtime benefits in some
   swap-intensive situations.

 - And Chengming Zhou further optimizes zswap in the series "mm/zswap:
   optimize for dynamic zswap_pools". Measured improvements are modest.

 - zswap cleanups and simplifications from Yosry Ahmed in the series
   "mm: zswap: simplify zswap_swapoff()".

 - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has
   contributed several DAX cleanups as well as adding a sysfs tunable to
   control the memmap_on_memory setting when the dax device is
   hotplugged as system memory.

 - Johannes Weiner has added the large series "mm: zswap: cleanups",
   which does that.

 - More DAMON work from SeongJae Park in the series

	"mm/damon: make DAMON debugfs interface deprecation unignorable"
	"selftests/damon: add more tests for core functionalities and corner cases"
	"Docs/mm/damon: misc readability improvements"
	"mm/damon: let DAMOS feeds and tame/auto-tune itself"

 - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs
   extension" Rakie Kim has developed a new mempolicy interleaving
   policy wherein we allocate memory across nodes in a weighted fashion
   rather than uniformly. This is beneficial in heterogeneous memory
   environments appearing with CXL.

 - Christophe Leroy has contributed some cleanup and consolidation work
   against the ARM pagetable dumping code in the series "mm: ptdump:
   Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute".

 - Luis Chamberlain has added some additional xarray selftesting in the
   series "test_xarray: advanced API multi-index tests".

 - Muhammad Usama Anjum has reworked the selftest code to make its
   human-readable output conform to the TAP ("Test Anything Protocol")
   format. Amongst other things, this opens up the use of third-party
   tools to parse and process out selftesting results.

 - Ryan Roberts has added fork()-time PTE batching of THP ptes in the
   series "mm/memory: optimize fork() with PTE-mapped THP". Mainly
   targeted at arm64, this significantly speeds up fork() when the
   process has a large number of pte-mapped folios.

 - David Hildenbrand also gets in on the THP pte batching game in his
   series "mm/memory: optimize unmap/zap with PTE-mapped THP". It
   implements batching during munmap() and other pte teardown
   situations. The microbenchmark improvements are nice.

 - And in the series "Transparent Contiguous PTEs for User Mappings"
   Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte
   mappings"). Kernel build times on arm64 improved nicely. Ryan's
   series "Address some contpte nits" provides some followup work.

 - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has
   fixed an obscure hugetlb race which was causing unnecessary page
   faults. He has also added a reproducer under the selftest code.

 - In the series "selftests/mm: Output cleanups for the compaction
   test", Mark Brown did what the title claims.

 - Kinsey Ho has added the series "mm/mglru: code cleanup and
   refactoring".

 - Even more zswap material from Nhat Pham. The series "fix and extend
   zswap kselftests" does as claimed.

 - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX
   regression" Mathieu Desnoyers has cleaned up and fixed rather a mess
   in our handling of DAX on archiecctures which have virtually aliasing
   data caches. The arm architecture is the main beneficiary.

 - Lokesh Gidra's series "per-vma locks in userfaultfd" provides
   dramatic improvements in worst-case mmap_lock hold times during
   certain userfaultfd operations.

 - Some page_owner enhancements and maintenance work from Oscar Salvador
   in his series

	"page_owner: print stacks and their outstanding allocations"
	"page_owner: Fixup and cleanup"

 - Uladzislau Rezki has contributed some vmalloc scalability
   improvements in his series "Mitigate a vmap lock contention". It
   realizes a 12x improvement for a certain microbenchmark.

 - Some kexec/crash cleanup work from Baoquan He in the series "Split
   crash out from kexec and clean up related config items".

 - Some zsmalloc maintenance work from Chengming Zhou in the series

	"mm/zsmalloc: fix and optimize objects/page migration"
	"mm/zsmalloc: some cleanup for get/set_zspage_mapping()"

 - Zi Yan has taught the MM to perform compaction on folios larger than
   order=0. This a step along the path to implementaton of the merging
   of large anonymous folios. The series is named "Enable >0 order folio
   memory compaction".

 - Christoph Hellwig has done quite a lot of cleanup work in the
   pagecache writeback code in his series "convert write_cache_pages()
   to an iterator".

 - Some modest hugetlb cleanups and speedups in Vishal Moola's series
   "Handle hugetlb faults under the VMA lock".

 - Zi Yan has changed the page splitting code so we can split huge pages
   into sizes other than order-0 to better utilize large folios. The
   series is named "Split a folio to any lower order folios".

 - David Hildenbrand has contributed the series "mm: remove
   total_mapcount()", a cleanup.

 - Matthew Wilcox has sought to improve the performance of bulk memory
   freeing in his series "Rearrange batched folio freeing".

 - Gang Li's series "hugetlb: parallelize hugetlb page init on boot"
   provides large improvements in bootup times on large machines which
   are configured to use large numbers of hugetlb pages.

 - Matthew Wilcox's series "PageFlags cleanups" does that.

 - Qi Zheng's series "minor fixes and supplement for ptdesc" does that
   also. S390 is affected.

 - Cleanups to our pagemap utility functions from Peter Xu in his series
   "mm/treewide: Replace pXd_large() with pXd_leaf()".

 - Nico Pache has fixed a few things with our hugepage selftests in his
   series "selftests/mm: Improve Hugepage Test Handling in MM
   Selftests".

 - Also, of course, many singleton patches to many things. Please see
   the individual changelogs for details.

* tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits)
  mm/zswap: remove the memcpy if acomp is not sleepable
  crypto: introduce: acomp_is_async to expose if comp drivers might sleep
  memtest: use {READ,WRITE}_ONCE in memory scanning
  mm: prohibit the last subpage from reusing the entire large folio
  mm: recover pud_leaf() definitions in nopmd case
  selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements
  selftests/mm: skip uffd hugetlb tests with insufficient hugepages
  selftests/mm: dont fail testsuite due to a lack of hugepages
  mm/huge_memory: skip invalid debugfs new_order input for folio split
  mm/huge_memory: check new folio order when split a folio
  mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure
  mm: add an explicit smp_wmb() to UFFDIO_CONTINUE
  mm: fix list corruption in put_pages_list
  mm: remove folio from deferred split list before uncharging it
  filemap: avoid unnecessary major faults in filemap_fault()
  mm,page_owner: drop unnecessary check
  mm,page_owner: check for null stack_record before bumping its refcount
  mm: swap: fix race between free_swap_and_cache() and swapoff()
  mm/treewide: align up pXd_leaf() retval across archs
  mm/treewide: drop pXd_large()
  ...
2024-03-14 17:43:30 -07:00
Yosry Ahmed
7dbbc8f57d x86/mm: delete unused cpu argument to leave_mm()
The argument is unused since commit 3d28ebceaf ("x86/mm: Rework lazy
TLB to track the actual loaded mm"), delete it.

Link: https://lkml.kernel.org/r/20240126080644.1714297-1-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 10:24:41 -08:00
C Cheng
88390dd788 cpuidle: Avoid potential overflow in integer multiplication
In detail:

In C language, when you perform a multiplication operation, if
both operands are of int type, the multiplication operation is
performed on the int type, and then the result is converted to
the target type. This means that if the product of int type
multiplication exceeds the range that int type can represent,
an overflow will occur even if you store the result in a
variable of int64_t type.

For a multiplication of two int values, it is better to use
mul_u32_u32() rather than s->exit_latency_ns = s->exit_latency *
NSEC_PER_USEC to avoid potential overflow happenning.

Signed-off-by: C Cheng <C.Cheng@mediatek.com>
Signed-off-by: Bo Ye <bo.ye@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
[ rjw: New subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 17:07:49 +01:00
Parshuram Sangle
496d0a6485 cpuidle: haltpoll: do not shrink guest poll_limit_ns below grow_start
While adjusting guest halt poll limit, grow block starts at
guest_halt_poll_grow_start without taking intermediate values.
Similar behavior is expected while shrinking the value. This
avoids short interval values which are really not required.

VCPU1 trace (guest_halt_poll_shrink equals 2):

VCPU1 grow 10000
VCPU1 shrink 5000
VCPU1 shrink 2500
VCPU1 shrink 1250
VCPU1 shrink 625
VCPU1 shrink 312
VCPU1 shrink 156
VCPU1 shrink 78
VCPU1 shrink 39
VCPU1 shrink 19
VCPU1 shrink 9
VCPU1 shrink 4

Similar change is done in KVM halt poll flow with below patch:
Link: https://lore.kernel.org/kvm/20211006133021.271905-3-sashal@kernel.org/

Co-developed-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com>
Signed-off-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com>
Signed-off-by: Parshuram Sangle <parshuram.sangle@intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 17:02:51 +01:00
Borislav Petkov (AMD)
c8f5caec3d cpuidle: haltpoll: Do not enable interrupts when entering idle
The cpuidle drivers' ->enter() methods are supposed to be IRQ invariant:

  5e26aa9339 ("cpuidle/poll: Ensure IRQs stay disabled after cpuidle_state::enter() calls")
  bb7b112585 ("cpuidle: Move IRQ state validation")

Do that in the haltpoll driver too.

Fixes: 5e26aa9339 ("cpuidle/poll: Ensure IRQs stay disabled after cpuidle_state::enter() calls")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218245
Reported-by: <forza@tnonline.net>
Tested-by: <forza@tnonline.net>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:08:18 +01:00
Justin Stitt
b545465e22 cpuidle: dt: Replace deprecated strncpy() with strscpy()
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

We should prefer more robust and less ambiguous string interfaces.

A suitable replacement is `strscpy` [2] due to the fact that it guarantees
NUL-termination on the destination buffer. With this, we can also drop
the now unnecessary `CPUIDLE_(NAME|DESC)_LEN - 1` pieces.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230913-strncpy-drivers-cpuidle-dt_idle_states-c-v1-1-d16a0dbe5658@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-09-29 14:48:31 -07:00
Linus Torvalds
4ad0a4c234 powerpc updates for 6.6
- Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
    configured SMT state when hotplugging CPUs into the system.
 
  - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the Radix
    MMU to avoid a broadcast TLBIE flush on exit.
 
  - Drop the exclusion between ptrace/perf watchpoints, and drop the now unused
    associated arch hooks.
 
  - Add support for the "nohlt" command line option to disable CPU idle.
 
  - Add support for -fpatchable-function-entry for ftrace, with GCC >= 13.1.
 
  - Rework memory block size determination, and support 256MB size on systems
    with GPUs that have hotpluggable memory.
 
  - Various other small features and fixes.
 
 Thanks to: Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev,
 Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam Menghani, Geoff Levand,
 Hari Bathini, Immad Mir, Jialin Zhang, Joel Stanley, Jordan Niethe, Justin
 Stitt, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He,
 Linus Walleij, Mahesh Salgaonkar, Masahiro Yamada, Michal Suchanek, Nageswara
 R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick
 Desaulniers, Omar Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell
 Currey, Sourabh Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav
 Jain, Xiongfeng Wang, Yuan Tan, Zhang Rui, Zheng Zengkai.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmTwgbwTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFmpD/432vipeoqvkAYsyK0xi/Y3GcY0wcyd
 WJApLXXadEbtKQrgXQ6sowWqalg5thYnQCRarg/tXKK/po3KfgwkPjGDpOL+cIdr
 12QVN2XJm9VmJ1wYJxzk+yXx4F43AdmMdr94qWAGufbTHezwb4UpzVR1NxtFrOE/
 X5TNsC2+2mdZY/ZaNHS5vsTIFv3EhQfqgjZPlIAdLn6CGc8xWT514Q/uHA8+ytM/
 HL7Hqs33DoPSvgTa5TT/2E0d0k5nO3P5KObzAjpYlireTPaBi51mpKGewcrtm0o2
 v3cBlbfx3C7pe9ZhKBK9BH8cjynfiqsVZ9/lCw/7eBNdm9tHuzG0jeS7Db9tCZXS
 fM7G2R7SoIusPTqxlBmkU5DpYslwrHiVgCyy3ijxkoA/fakVwh/GgTcMsRt73IY6
 n6DsUvWwuYHCIeIiHmHQJqCqCRtV+aMzU3AbbBHOjtdIanhlW16M686dEsgCirh7
 akRVRD5VqKaqXs34PpkRL89Xv3wZRjl6XZ3hZFfCjSYXfpXDXhgSToIskpHYhKL8
 gpY7WtG9YQP05Xz5HRCx6EluaZVeKe0lZi6fezX7Mi9AygJQO8FfXqP1mHBlEq40
 ThWtvL9D89RV6lADqqFN20XepgvKNOyAXcE4szvsnIZYUSPmZQZSPxx+DHtROaLP
 jX3ifxtxJp92pQ==
 =5g7K
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
   configured SMT state when hotplugging CPUs into the system

 - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the
   Radix MMU to avoid a broadcast TLBIE flush on exit

 - Drop the exclusion between ptrace/perf watchpoints, and drop the now
   unused associated arch hooks

 - Add support for the "nohlt" command line option to disable CPU idle

 - Add support for -fpatchable-function-entry for ftrace, with GCC >=
   13.1

 - Rework memory block size determination, and support 256MB size on
   systems with GPUs that have hotpluggable memory

 - Various other small features and fixes

Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam
Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel
Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof
Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar,
Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor,
Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar
Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh
Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain,
Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai.

* tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits)
  macintosh/ams: linux/platform_device.h is needed
  powerpc/xmon: Reapply "Relax frame size for clang"
  powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
  powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
  powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
  powerpc/mpc5xxx: Add missing fwnode_handle_put()
  powerpc/config: Disable SLAB_DEBUG_ON in skiroot
  powerpc/pseries: Remove unused hcall tracing instruction
  powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
  powerpc: dts: add missing space before {
  powerpc/eeh: Use pci_dev_id() to simplify the code
  powerpc/64s: Move CPU -mtune options into Kconfig
  powerpc/powermac: Fix unused function warning
  powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
  powerpc: Don't include lppaca.h in paca.h
  powerpc/pseries: Move hcall_vphn() prototype into vphn.h
  powerpc/pseries: Move VPHN constants into vphn.h
  cxl: Drop unused detach_spa()
  powerpc: Drop zalloc_maybe_bootmem()
  powerpc/powernv: Use struct opal_prd_msg in more places
  ...
2023-08-31 12:43:10 -07:00
Rafael J. Wysocki
1201c50c1e Merge branches 'pm-cpuidle' and 'pm-cpufreq'
Merge CPU power management updates for 6.6-rc1:

 - Rework the menu and teo cpuidle governors to avoid calling
   tick_nohz_get_sleep_length(), which is likely to become quite
   expensive going forward, too often and improve making decisions
   regarding whether or not to stop the scheduler tick in the teo
   governor (Rafael Wysocki).

 - Improve the performance of cpufreq_stats_create_table() in some
   cases (Liao Chang).

 - Fix two issues in the amd-pstate-ut cpufreq driver (Swapnil Sapkal).

 - Use clamp() helper macro to improve the code readability in
   cpufreq_verify_within_limits() (Liao Chang).

 - Set stale CPU frequency to minimum in intel_pstate (Doug Smythies).

* pm-cpuidle:
  cpuidle: teo: Avoid unnecessary variable assignments
  cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases
  cpuidle: teo: Gather statistics regarding whether or not to stop the tick
  cpuidle: teo: Skip tick_nohz_get_sleep_length() call in some cases
  cpuidle: teo: Do not call tick_nohz_get_sleep_length() upfront
  cpuidle: teo: Drop utilized from struct teo_cpu
  cpuidle: teo: Avoid stopping the tick unnecessarily when bailing out
  cpuidle: teo: Update idle duration estimate when choosing shallower state

* pm-cpufreq:
  cpufreq: amd-pstate-ut: Fix kernel panic when loading the driver
  cpufreq: amd-pstate-ut: Remove module parameter access
  cpufreq: Use clamp() helper macro to improve the code readability
  cpufreq: intel_pstate: set stale CPU frequency to minimum
  cpufreq: stats: Improve the performance of cpufreq_stats_create_table()
2023-08-25 21:15:56 +02:00
Russell Currey
eac030b22e powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
lppaca_shared_proc() takes a pointer to the lppaca which is typically
accessed through get_lppaca().  With DEBUG_PREEMPT enabled, this leads
to checking if preemption is enabled, for example:

  BUG: using smp_processor_id() in preemptible [00000000] code: grep/10693
  caller is lparcfg_data+0x408/0x19a0
  CPU: 4 PID: 10693 Comm: grep Not tainted 6.5.0-rc3 #2
  Call Trace:
    dump_stack_lvl+0x154/0x200 (unreliable)
    check_preemption_disabled+0x214/0x220
    lparcfg_data+0x408/0x19a0
    ...

This isn't actually a problem however, as it does not matter which
lppaca is accessed, the shared proc state will be the same.
vcpudispatch_stats_procfs_init() already works around this by disabling
preemption, but the lparcfg code does not, erroring any time
/proc/powerpc/lparcfg is accessed with DEBUG_PREEMPT enabled.

Instead of disabling preemption on the caller side, rework
lppaca_shared_proc() to not take a pointer and instead directly access
the lppaca, bypassing any potential preemption checks.

Fixes: f13c13a005 ("powerpc: Stop using non-architected shared_proc field in lppaca")
Signed-off-by: Russell Currey <ruscur@russell.cc>
[mpe: Rework to avoid needing a definition in paca.h and lppaca.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055317.751786-4-mpe@ellerman.id.au
2023-08-24 22:33:17 +10:00
Rafael J. Wysocki
78aabcb321 cpuidle: teo: Avoid unnecessary variable assignments
Notice that it is not necessary to assign tick_intercept_sum in every
iteration of the first loop over idle states in teo_select(), because
the intercept_sum value does not change after the assignment in a
given iteration of the loop, so its value after the last iteration of
the loop can be used for computing the tick_intercept_sum value
directly.

Modify the code accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-23 18:25:04 +02:00
Rafael J. Wysocki
5484e31bbb cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases
Because the cost of calling tick_nohz_get_sleep_length() may increase
in the future, reorder the code in menu_select() so it first uses the
statistics to determine the expected idle duration.  If that value is
higher than RESIDENCY_THRESHOLD_NS, tick_nohz_get_sleep_length() will
be called to obtain the time till the closest timer and refine the
idle duration prediction if necessary.

This causes the governor to always take the full overhead of
get_typical_interval() with the assumption that the cost will be
amortized by skipping the tick_nohz_get_sleep_length() call in the
cases when the predicted idle duration is relatively very small.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Doug Smythies <dsmythies@telus.net>
2023-08-17 11:28:38 +02:00
Rafael J. Wysocki
2662342079 cpuidle: teo: Gather statistics regarding whether or not to stop the tick
Currently, if the target residency of the deepest idle state is less than
the tick period length, which is quite likely for HZ=100, and the deepest
idle state is about to be selected by the TEO idle governor, the decision
on whether or not to stop the scheduler tick is based entirely on the
time till the closest timer.  This is often insufficient, because timers
may not be in heavy use and there may be a plenty of other CPU wakeup
events between the deepest idle state's target residency and the closest
tick.

Allow the governor to count those events by making the deepest idle
state's bin effectively end at TICK_NSEC and introducing an additional
"bin" for collecting "hit" events (ie. the ones in which the measured
idle duration falls into the same bin as the time till the closest
timer) with idle duration values past TICK_NSEC.

This way the "intercepts" metric for the deepest idle state's bin
becomes nonzero in general, and so it can influence the decision on
whether or not to stop the tick possibly increasing the governor's
accuracy in that respect.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Tested-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
2023-08-09 19:58:53 +02:00
Rafael J. Wysocki
6da8f9ba5a cpuidle: teo: Skip tick_nohz_get_sleep_length() call in some cases
Make teo_select() avoid calling tick_nohz_get_sleep_length() if the
candidate idle state to return is state 0 or if state 0 is a polling
one and the target residency of the current candidate one is below
a certain threshold, in which cases it may be assumed that the CPU will
be woken up immediately by a non-timer wakeup source and the timers
are not likely to matter.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Tested-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
2023-08-09 19:58:46 +02:00
Rafael J. Wysocki
21d28cd2fa cpuidle: teo: Do not call tick_nohz_get_sleep_length() upfront
Because the cost of calling tick_nohz_get_sleep_length() may increase
in the future, reorder the code in teo_select() so it first uses the
statistics to pick up a candidate idle state and applies the utilization
heuristic to it and only then calls tick_nohz_get_sleep_length() to
obtain the sleep length value and refine the selection if necessary.

This change by itself does not cause tick_nohz_get_sleep_length() to
be called less often, but it prepares the code for subsequent changes
that will do so.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Tested-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
2023-08-09 19:58:05 +02:00
Maulik Shah
12acb348fa cpuidle: psci: Move enabling OSI mode after power domains creation
A switch from OSI to PC mode is only possible if all CPUs other than the
calling one are OFF, either through a call to CPU_OFF or not yet booted.

Currently OSI mode is enabled before power domains are created. In cases
where CPUidle states are not using hierarchical CPU topology the bail out
path tries to switch back to PC mode which gets denied by firmware since
other CPUs are online at this point and creates inconsistent state as
firmware is in OSI mode and Linux in PC mode.

This change moves enabling OSI mode after power domains are created,
this would makes sure that hierarchical CPU topology is used before
switching firmware to OSI mode.

Cc: stable@vger.kernel.org
Fixes: 70c179b498 ("cpuidle: psci: Allow PM domain to be initialized even if no OSI mode")
Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-08-08 16:07:01 +02:00
Maulik Shah
9a8fa00dad cpuidle: dt_idle_genpd: Add helper function to remove genpd topology
Genpd parent and child domain topology created using dt_idle_pd_init_topology()
needs to be removed during error cases.

Add new helper function dt_idle_pd_remove_topology() for same.

Cc: stable@vger.kernel.org
Reviewed-by: Ulf Hanssson <ulf.hansson@linaro.org>
Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-08-08 16:06:20 +02:00
Rafael J. Wysocki
9a41e16f11 cpuidle: teo: Drop utilized from struct teo_cpu
Because the utilized field in struct teo_cpu is only used locally in
teo_select(), replace it with a local variable in that function.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2023-08-03 22:37:11 +02:00
Rafael J. Wysocki
04bae4e226 cpuidle: teo: Avoid stopping the tick unnecessarily when bailing out
When teo_select() is going to return early in some special cases, make
it avoid stopping the tick if the idle state to be returned is shallow.

In particular, never stop the tick if state 0 is to be returned.

Link: https://lore.kernel.org/linux-pm/CAJZ5v0jJxHj65r2HXBTd3wfbZtsg=_StzwO1kA5STDnaPe_dWA@mail.gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2023-08-03 22:37:03 +02:00
Rafael J. Wysocki
3f0b0966b3 cpuidle: teo: Update idle duration estimate when choosing shallower state
The TEO governor takes CPU utilization into account by refining idle state
selection when the utilization is above a certain threshold.  This is done by
choosing an idle state shallower than the previously selected one.

However, when doing this, the idle duration estimate needs to be
adjusted so as to prevent the scheduler tick from being stopped when the
candidate idle state is shallow, which may lead to excessive energy
usage if the CPU is not woken up quickly enough going forward.
Moreover, if the scheduler tick has been stopped already and the new
idle duration estimate is too small, the replacement candidate state
cannot be used.

Modify the relevant code to take the above observations into account.

Fixes: 9ce0f7c4bc ("cpuidle: teo: Introduce util-awareness")
Link: https://lore.kernel.org/linux-pm/CAJZ5v0jJxHj65r2HXBTd3wfbZtsg=_StzwO1kA5STDnaPe_dWA@mail.gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2023-08-03 22:36:09 +02:00
Peter Zijlstra
e6a15fa9ea cpuidle: Use local_clock_noinstr()
With the introduction of local_clock_noinstr(), local_clock() itself
is no longer marked noinstr, use the correct function.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>  # Hyper-V
Link: https://lore.kernel.org/r/20230519102716.045980863@infradead.org
2023-06-05 21:11:09 +02:00
Andrew Jones
41cad8284d
RISC-V: Align SBI probe implementation with spec
sbi_probe_extension() is specified with "Returns 0 if the given SBI
extension ID (EID) is not available, or 1 if it is available unless
defined as any other non-zero value by the implementation."
Additionally, sbiret.value is a long. Fix the implementation to
ensure any nonzero long value is considered a success, rather
than only positive int values.

Fixes: b9dcd9e415 ("RISC-V: Add basic support for SBI v0.2")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230427163626.101042-1-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29 13:04:50 -07:00
Linus Torvalds
70cc1b5307 powerpc updates for 6.4
- Add support for building the kernel using PC-relative addressing on Power10.
 
  - Allow HV KVM guests on Power10 to use prefixed instructions.
 
  - Unify support for the P2020 CPU (85xx) into a single machine description.
 
  - Always build the 64-bit kernel with 128-bit long double.
 
  - Drop support for several obsolete 2000's era development boards as
    identified by Paul Gortmaker.
 
  - A series fixing VFIO on Power since some generic changes.
 
  - Various other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Benjamin Gray, Bo Liu,
 Christophe Leroy, Dan Carpenter, David Binderman, Ira Weiny, Joel Stanley,
 Kajol Jain, Kautuk Consul, Liang He, Luis Chamberlain, Masahiro Yamada, Michael
 Neuling, Nathan Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin,
 Nick Desaulniers, Nysal Jan K.A, Pali Rohár, Paul Gortmaker, Paul Mackerras,
 Petr Vaněk, Randy Dunlap, Rob Herring, Sachin Sant, Sean Christopherson, Segher
 Boessenkool, Timothy Pearson.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmRLdD8THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgPrED/93Hwi1/8udXYV6OAFBnLLLDX9RZWNx
 sj6W7PC/yY7MGUTIGcayrAURt5xFfQZMBdq9nhhH46Wyd8pSUe1IlpXIEgqeH64Y
 5uaSe6u4OZ5dDmiYz8bM+Y4Ixkfq1xMO0Rj27FIRJyU4Pp6gyMQ6/8W+iPU2vYIb
 jl4frVO8PpmCbW8euOyT/b9YB2h79+5nLgZT4RvmYJblKQwgzRZWaiCAU0wYf+xT
 xYsMQcqJlPstizTnXIr+GC08VrMPQR51kJCurnNUMXoAQY6toEXebveWRNZ3sH39
 K0BRQ036NNGPS4GlHVbjgLIdoWr4pUEROZ48Jy9WeiDh6OwO2vb2zrm7ZLtKlGXI
 LCQ08T9diPbAAJyVJaKjpsNXhTffuLPRhIOr1o2vqGvY+Fqbw96bPQAyFbxpJtrw
 rGTTc5e93fEdZV+eaGR8kfdNM75WM6lgiteaIbUnpirYTh/0j6YEO1Ijc4d9e5ux
 aoHN2eXQDjMm9foZf1D6vaCTRyN8LwLa9hcydKCn6+qULUQzoZ2r4cRt6eSD7+fx
 Wni1Zg7vD6xx294nVmhg5dWuD4qVIAZj3BZPFHHR2SPqy+UbEJLk/NKgznmrhhgQ
 HBcZAEFqIBZkA4e0w6LJKh/1j1rxpZUokgjMotOJq4VRivq82W1P7SioFDY3/6w5
 UvGtIgGq4Qsa2Q==
 =89WH
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add support for building the kernel using PC-relative addressing on
   Power10.

 - Allow HV KVM guests on Power10 to use prefixed instructions.

 - Unify support for the P2020 CPU (85xx) into a single machine
   description.

 - Always build the 64-bit kernel with 128-bit long double.

 - Drop support for several obsolete 2000's era development boards as
   identified by Paul Gortmaker.

 - A series fixing VFIO on Power since some generic changes.

 - Various other small features and fixes.

Thanks to Alexey Kardashevskiy, Andrew Donnellan, Benjamin Gray, Bo Liu,
Christophe Leroy, Dan Carpenter, David Binderman, Ira Weiny, Joel
Stanley, Kajol Jain, Kautuk Consul, Liang He, Luis Chamberlain, Masahiro
Yamada, Michael Neuling, Nathan Chancellor, Nathan Lynch, Nicholas
Miehlbradt, Nicholas Piggin, Nick Desaulniers, Nysal Jan K.A, Pali
Rohár, Paul Gortmaker, Paul Mackerras, Petr Vaněk, Randy Dunlap, Rob
Herring, Sachin Sant, Sean Christopherson, Segher Boessenkool, and
Timothy Pearson.

* tag 'powerpc-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (156 commits)
  powerpc/64s: Disable pcrel code model on Clang
  powerpc: Fix merge conflict between pcrel and copy_thread changes
  powerpc/configs/powernv: Add IGB=y
  powerpc/configs/64s: Drop JFS Filesystem
  powerpc/configs/64s: Use EXT4 to mount EXT2 filesystems
  powerpc/configs: Make pseries_defconfig an alias for ppc64le_guest
  powerpc/configs: Make pseries_le an alias for ppc64le_guest
  powerpc/configs: Incorporate generic kvm_guest.config into guest configs
  powerpc/configs: Add IBMVETH=y and IBMVNIC=y to guest configs
  powerpc/configs/64s: Enable Device Mapper options
  powerpc/configs/64s: Enable PSTORE
  powerpc/configs/64s: Enable VLAN support
  powerpc/configs/64s: Enable BLK_DEV_NVME
  powerpc/configs/64s: Drop REISERFS
  powerpc/configs/64s: Use SHA512 for module signatures
  powerpc/configs/64s: Enable IO_STRICT_DEVMEM
  powerpc/configs/64s: Enable SCHEDSTATS
  powerpc/configs/64s: Enable DEBUG_VM & other options
  powerpc/configs/64s: Enable EMULATED_STATS
  powerpc/configs/64s: Enable KUNIT and most tests
  ...
2023-04-28 16:24:32 -07:00
Linus Torvalds
556eb8b791 Driver core changes for 6.4-rc1
Here is the large set of driver core changes for 6.4-rc1.
 
 Once again, a busy development cycle, with lots of changes happening in
 the driver core in the quest to be able to move "struct bus" and "struct
 class" into read-only memory, a task now complete with these changes.
 
 This will make the future rust interactions with the driver core more
 "provably correct" as well as providing more obvious lifetime rules for
 all busses and classes in the kernel.
 
 The changes required for this did touch many individual classes and
 busses as many callbacks were changed to take const * parameters
 instead.  All of these changes have been submitted to the various
 subsystem maintainers, giving them plenty of time to review, and most of
 them actually did so.
 
 Other than those changes, included in here are a small set of other
 things:
   - kobject logging improvements
   - cacheinfo improvements and updates
   - obligatory fw_devlink updates and fixes
   - documentation updates
   - device property cleanups and const * changes
   - firwmare loader dependency fixes.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
 LEGadNS38k5fs+73UaxV
 =7K4B
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.4-rc1.

  Once again, a busy development cycle, with lots of changes happening
  in the driver core in the quest to be able to move "struct bus" and
  "struct class" into read-only memory, a task now complete with these
  changes.

  This will make the future rust interactions with the driver core more
  "provably correct" as well as providing more obvious lifetime rules
  for all busses and classes in the kernel.

  The changes required for this did touch many individual classes and
  busses as many callbacks were changed to take const * parameters
  instead. All of these changes have been submitted to the various
  subsystem maintainers, giving them plenty of time to review, and most
  of them actually did so.

  Other than those changes, included in here are a small set of other
  things:

   - kobject logging improvements

   - cacheinfo improvements and updates

   - obligatory fw_devlink updates and fixes

   - documentation updates

   - device property cleanups and const * changes

   - firwmare loader dependency fixes.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
  device property: make device_property functions take const device *
  driver core: update comments in device_rename()
  driver core: Don't require dynamic_debug for initcall_debug probe timing
  firmware_loader: rework crypto dependencies
  firmware_loader: Strip off \n from customized path
  zram: fix up permission for the hot_add sysfs file
  cacheinfo: Add use_arch[|_cache]_info field/function
  arch_topology: Remove early cacheinfo error message if -ENOENT
  cacheinfo: Check cache properties are present in DT
  cacheinfo: Check sib_leaf in cache_leaves_are_shared()
  cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
  cacheinfo: Add arm64 early level initializer implementation
  cacheinfo: Add arch specific early level initializer
  tty: make tty_class a static const structure
  driver core: class: remove struct class_interface * from callbacks
  driver core: class: mark the struct class in struct class_interface constant
  driver core: class: make class_register() take a const *
  driver core: class: mark class_release() as taking a const *
  driver core: remove incorrect comment for device_create*
  MIPS: vpe-cmp: remove module owner pointer from struct class usage.
  ...
2023-04-27 11:53:57 -07:00
Linus Torvalds
cb6fe2ceb6 Devicetree updates for v6.4, part 2:
- First part of DT header detangling dropping cpu.h from of_device.h
   and replacing some includes with forward declarations. A handful of
   drivers needed some adjustment to their includes as a result.
 
 - Refactor of_device.h to be used by bus drivers rather than various
   device drivers. This moves non-bus related functions out of
   of_device.h. The end goal is for of_platform.h and of_device.h to stop
   including each other.
 
 - Refactor open coded parsing of "ranges" in some bus drivers to use DT
   address parsing functions
 
 - Add some new address parsing functions of_property_read_reg(),
   of_range_count(), and of_range_to_resource() in preparation to convert
   more open coded parsing of DT addresses to use them.
 
 - Treewide clean-ups to use of_property_read_bool() and
   of_property_present() as appropriate. The ones here are the ones
   that didn't get picked up elsewhere.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmRIOrkACgkQ+vtdtY28
 YcN9WA//R+QrmSPExhfgio5y+aOJDWucqnAcyAusPctLcF7h7j0CdzpwaSRkdaH4
 KiLjeyt6tKn8wt8w7m/+SmCsSYXPn81GH/Y5I2F40x6QMrY3cVOXUsulKQA+6ZjZ
 PmW3bMcz0Dw9IhUK3R/WX96+9UdoytKg5qoTzNzPTKpvKA1yHa/ogl2FnHJS5W+8
 Rxz+1oJ70VMIWGpBOc0acHuB2S0RHZ46kPKkPTBgFYEwtmJ8qobvV3r3uQapNaIP
 2jnamPu0tAaQoSaJKKSulToziT+sd1sNB+9oyu/kP+t3PXzq4qwp2Gr4jzUYKs4A
 ZF3DPhMR3YLLN41g/L3rtB0T/YIS287sZRuaLhCqldNpRerSDk4b0HRAksGk1XrI
 HqYXjWPbRxqYiIUWkInfregSTYJfGPxeLfLKrawNO34/eEV4JrkSKy8d0AJn04EK
 jTRqI3L7o23ZPxs29uH/3+KK90J3emPZkF7GWVJTEAMsM8jYZduGh7EpsttJLaz/
 QnxbTBm9295ahIdCfo/OQhqjWnaNhpbTzf31pyrBZ/itXV7gQ0xjwqPwiyFwI+o/
 F/r81xqdwQ3Ni8MKt2c7zLyVA95JHPe95KQ3GrDXR68aByJr4RuhKG8Y2Pj1VOb3
 V+Hsu5uhwKrK7Yqe+rHDnJBO00OCO8nwbWhMy2xVxoTkSFCjDmo=
 =89Zj
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull more devicetree updates from Rob Herring:

 - First part of DT header detangling dropping cpu.h from of_device.h
   and replacing some includes with forward declarations. A handful of
   drivers needed some adjustment to their includes as a result.

 - Refactor of_device.h to be used by bus drivers rather than various
   device drivers. This moves non-bus related functions out of
   of_device.h. The end goal is for of_platform.h and of_device.h to
   stop including each other.

 - Refactor open coded parsing of "ranges" in some bus drivers to use DT
   address parsing functions

 - Add some new address parsing functions of_property_read_reg(),
   of_range_count(), and of_range_to_resource() in preparation to
   convert more open coded parsing of DT addresses to use them.

 - Treewide clean-ups to use of_property_read_bool() and
   of_property_present() as appropriate. The ones here are the ones that
   didn't get picked up elsewhere.

* tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (34 commits)
  bus: tegra-gmi: Replace of_platform.h with explicit includes
  hte: Use of_property_present() for testing DT property presence
  w1: w1-gpio: Use of_property_read_bool() for boolean properties
  virt: fsl: Use of_property_present() for testing DT property presence
  soc: fsl: Use of_property_present() for testing DT property presence
  sbus: display7seg: Use of_property_read_bool() for boolean properties
  sparc: Use of_property_read_bool() for boolean properties
  sparc: Use of_property_present() for testing DT property presence
  bus: mvebu-mbus: Remove open coded "ranges" parsing
  of/address: Add of_property_read_reg() helper
  of/address: Add of_range_count() helper
  of/address: Add support for 3 address cell bus
  of/address: Add of_range_to_resource() helper
  of: unittest: Add bus address range parsing tests
  of: Drop cpu.h include from of_device.h
  OPP: Adjust includes to remove of_device.h
  irqchip: loongson-eiointc: Add explicit include for cpuhotplug.h
  cpuidle: Adjust includes to remove of_device.h
  cpufreq: sun50i: Add explicit include for cpu.h
  cpufreq: Adjust includes to remove of_device.h
  ...
2023-04-27 10:09:05 -07:00
Michael Ellerman
88990745c9 cpuidle: pseries: Mark ->enter() functions as __cpuidle
Code in the idle path is not allowed to be instrumented because RCU is
disabled, see commit 0e985e9d22 ("cpuidle: Add comments about
noinstr/__cpuidle usage").

Mark the cpuidle ->enter() callbacks as __cpuidle and use the
raw_local_irq_*() routines to ensure that is the case.

Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Link: https://lore.kernel.org/all/4C073F6A-C812-4C4A-BB7A-ECD10B75FB88@linux.ibm.com/
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406144535.3786008-3-mpe@ellerman.id.au
2023-04-20 13:21:49 +10:00
Rob Herring
24760e43c6 cpuidle: Adjust includes to remove of_device.h
Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
implicitly including other includes, and is no longer needed. Adjust the
include files with what was implicitly included by of_device.h (cpu.h,
cpuhotplug.h, of.h, and of_platform.h) and drop including of_device.h.

Acked-by: Anup Patel <anup@brainfault.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230329-dt-cpu-header-cleanups-v1-16-581e2605fe47@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
2023-04-13 17:46:35 -05:00
Greg Kroah-Hartman
cd8fe5b6db Merge 6.3-rc5 into driver-core-next
We need the fixes in here for testing, as well as the driver core
changes for documentation updates to build on.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-03 09:33:30 +02:00
Rob Herring
f914bfdd7f cpuidle: Use of_property_present() for testing DT property presence
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property/of_find_property functions for reading properties. As
part of this, convert of_get_property/of_find_property calls to the
recently added of_property_present() helper when we just want to test
for presence of a property and nothing more.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-27 19:08:12 +02:00
Greg Kroah-Hartman
bf6479dbe7 cpuidle: move to use bus_get_dev_root()
Direct access to the struct bus_type dev_root pointer is going away soon
so replace that with a call to bus_get_dev_root() instead, which is what
it is there for.

This allows us to clean up the cpuidle_add_interface() call a bit as it
was only called in one place, with the same argument so just put that
into the function itself.  Note that cpuidle_remove_interface() should
also probably be removed in the future as there are no callers of it for
some reason.

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230322090557.2943479-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 20:25:23 +01:00
Shawn Guo
6b0313c2fa cpuidle: psci: Iterate backwards over list in psci_pd_remove()
In case that psci_pd_init_topology() fails for some reason,
psci_pd_remove() will be responsible for deleting provider and removing
genpd from psci_pd_providers list.  There will be a failure when removing
the cluster PD, because the cpu (child) PDs haven't been removed.

[    0.050232] CPUidle PSCI: init PM domain cpu0
[    0.050278] CPUidle PSCI: init PM domain cpu1
[    0.050329] CPUidle PSCI: init PM domain cpu2
[    0.050370] CPUidle PSCI: init PM domain cpu3
[    0.050422] CPUidle PSCI: init PM domain cpu-cluster0
[    0.050475] PM: genpd_remove: unable to remove cpu-cluster0
[    0.051412] PM: genpd_remove: removed cpu3
[    0.051449] PM: genpd_remove: removed cpu2
[    0.051499] PM: genpd_remove: removed cpu1
[    0.051546] PM: genpd_remove: removed cpu0

Fix the problem by iterating the provider list reversely, so that parent
PD gets removed after child's PDs like below.

[    0.029052] CPUidle PSCI: init PM domain cpu0
[    0.029076] CPUidle PSCI: init PM domain cpu1
[    0.029103] CPUidle PSCI: init PM domain cpu2
[    0.029124] CPUidle PSCI: init PM domain cpu3
[    0.029151] CPUidle PSCI: init PM domain cpu-cluster0
[    0.029647] PM: genpd_remove: removed cpu0
[    0.029666] PM: genpd_remove: removed cpu1
[    0.029690] PM: genpd_remove: removed cpu2
[    0.029714] PM: genpd_remove: removed cpu3
[    0.029738] PM: genpd_remove: removed cpu-cluster0

Fixes: a65a397f24 ("cpuidle: psci: Add support for PM domains by using genpd")
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rjw@rjwysocki.net>
2023-03-07 14:04:13 +01:00
Linus Torvalds
11c7052998 ARM: SoC drivers for 6.3
As usual, there are lots of minor driver changes across SoC platforms
 from  NXP, Amlogic, AMD Zynq, Mediatek, Qualcomm, Apple and Samsung.
 These usually add support for additional chip variations in existing
 drivers, but also add features or bugfixes.
 
 The SCMI firmware subsystem gains a unified raw userspace interface
 through debugfs, which can be used for validation purposes.
 
 Newly added drivers include:
 
  - New power management drivers for StarFive JH7110, Allwinner D1 and
    Renesas RZ/V2M
 
  - A driver for Qualcomm battery and power supply status
 
  - A SoC device driver for identifying Nuvoton WPCM450 chips
 
  - A regulator coupler driver for Mediatek MT81xxv
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmPtSN8ACgkQmmx57+YA
 GNkOSw/+JS5tElm/ZP7c3uWYp6uwvcb0jUlKW/U3aCtPiPEcYDLEqIEXwcNdaDMh
 m4rW3GYlW0IRL3FsyuYkSLx+EIIUIfs40wldYXJOqRDj0XasndiloIwltOQJGfd9
 C/UVM0FpJdxMJrcBMFgwLLQCIbAVnhHP34i6ppDRgxW/MfTeiCaaG6fnS70iv6mC
 oh2N7FoZSKDtTrFtlR5TqFiK5v/W1CgNJVuglkFB0ceFpjyBpp/8AT0FGS887xCz
 IYSTqm4Q/79vaZXI1Y2oog257cgdwsVqgPrnK5CuSFhTnAcJMCekiFelHq8Yhyuk
 Rw7j/B3KO3AOaxmR75c6SZdeZ+VHgUMRC/RKe3fay0sm3Zea2kAIPXA6Zn+r/cxb
 8M94V59qBz+f8XmpXRTK1UR3s3EbwFIuNyuDIkeorMtpSKtvqJXmZxGDwNIfXr2F
 /voo++MKjzdtdxdW/D/5Tc9DC0Pyb4HLi0EYj2QCzA03njmfLDF1w73NfzMec+GD
 R1zAd3FEbiJQx8Hin0PSPjYXpfMnkjkGAEcE9N9Ralg4ewNWAxfOFsAhHKTZNssL
 pitTAvHR/+dXtvkX7FUi2l/6fqn8nJUrg/xRazPPp3scRbpuk8m6P4MNr3/lsaHk
 HTQ/hYwDdecWLvKXjw5y9yIr3yhLmPPcloTVIIFFjsM0t8b+d9E=
 =p6Xp
 -----END PGP SIGNATURE-----

Merge tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC driver updates from Arnd Bergmann:
 "As usual, there are lots of minor driver changes across SoC platforms
  from NXP, Amlogic, AMD Zynq, Mediatek, Qualcomm, Apple and Samsung.
  These usually add support for additional chip variations in existing
  drivers, but also add features or bugfixes.

  The SCMI firmware subsystem gains a unified raw userspace interface
  through debugfs, which can be used for validation purposes.

  Newly added drivers include:

   - New power management drivers for StarFive JH7110, Allwinner D1 and
     Renesas RZ/V2M

   - A driver for Qualcomm battery and power supply status

   - A SoC device driver for identifying Nuvoton WPCM450 chips

   - A regulator coupler driver for Mediatek MT81xxv"

* tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits)
  power: supply: Introduce Qualcomm PMIC GLINK power supply
  soc: apple: rtkit: Do not copy the reg state structure to the stack
  soc: sunxi: SUN20I_PPU should depend on PM
  memory: renesas-rpc-if: Remove redundant division of dummy
  soc: qcom: socinfo: Add IDs for IPQ5332 and its variant
  dt-bindings: arm: qcom,ids: Add IDs for IPQ5332 and its variant
  dt-bindings: power: qcom,rpmpd: add RPMH_REGULATOR_LEVEL_LOW_SVS_L1
  firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/
  MAINTAINERS: Update qcom CPR maintainer entry
  dt-bindings: firmware: document Qualcomm SM8550 SCM
  dt-bindings: firmware: qcom,scm: add qcom,scm-sa8775p compatible
  soc: qcom: socinfo: Add Soc IDs for IPQ8064 and variants
  dt-bindings: arm: qcom,ids: Add Soc IDs for IPQ8064 and variants
  soc: qcom: socinfo: Add support for new field in revision 17
  soc: qcom: smd-rpm: Add IPQ9574 compatible
  soc: qcom: pmic_glink: remove redundant calculation of svid
  soc: qcom: stats: Populate all subsystem debugfs files
  dt-bindings: soc: qcom,rpmh-rsc: Update to allow for generic nodes
  soc: qcom: pmic_glink: add CONFIG_NET/CONFIG_OF dependencies
  soc: qcom: pmic_glink: Introduce altmode support
  ...
2023-02-27 10:04:49 -08:00
Linus Torvalds
2504ba8b01 Power management updates for 6.3-rc1
- Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
    Karny, Arnd Bergmann, Bagas Sanjaya).
 
  - Drop the custom cpufreq driver for loongson1 that is not necessary
    any more and the corresponding cpufreq platform device (Keguang
    Zhang).
 
  - Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
    entries (Paul E. McKenney).
 
  - Enable thermal cooling for Tegra194 (Yi-Wei Wang).
 
  - Register module device table and add missing compatibles for
    cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss).
 
  - Various dt binding updates for qcom-cpufreq-nvmem and opp-v2-kryo-cpu
    (Christian Marangi).
 
  - Make kobj_type structure in the cpufreq core constant (Thomas
    Weißschuh).
 
  - Make cpufreq_unregister_driver() return void (Uwe Kleine-König).
 
  - Make the TEO cpuidle governor check CPU utilization in order to refine
   idle state selection (Kajetan Puchalski).
 
  - Make Kconfig select the haltpoll cpuidle governor when the haltpoll
    cpuidle driver is selected and replace a default_idle() call in that
    driver with arch_cpu_idle() to allow MWAIT to be used (Li RongQing).
 
  - Add Emerald Rapids Xeon support to the intel_idle driver (Artem
    Bityutskiy).
 
  - Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
    avoid randconfig build failures (Arnd Bergmann).
 
  - Make kobj_type structures used in the cpuidle sysfs interface
    constant (Thomas Weißschuh).
 
  - Make the cpuidle driver registration code update microsecond values
    of idle state parameters in accordance with their nanosecond values
    if they are provided (Rafael Wysocki).
 
  - Make the PSCI cpuidle driver prevent topology CPUs from being
    suspended on PREEMPT_RT (Krzysztof Kozlowski).
 
  - Document that pm_runtime_force_suspend() cannot be used with
    DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald).
 
  - Add EXPORT macros for exporting PM functions from drivers (Richard
    Fitzgerald).
 
  - Remove /** from non-kernel-doc comments in hibernation code (Randy
    Dunlap).
 
  - Fix possible name leak in powercap_register_zone() (Yang Yingliang).
 
  - Add Meteor Lake and Emerald Rapids support to the intel_rapl power
    capping driver (Zhang Rui).
 
  - Modify the idle_inject power capping facility to support 100% idle
    injection (Srinivas Pandruvada).
 
  - Fix large time windows handling in the intel_rapl power capping
    driver (Zhang Rui).
 
  - Fix memory leaks with using debugfs_lookup() in the generic PM
    domains and Energy Model code (Greg Kroah-Hartman).
 
  - Add missing 'cache-unified' property in the example for kryo OPP
    bindings (Rob Herring).
 
  - Fix error checking in opp_migrate_dentry() (Qi Zheng).
 
  - Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
    Dybcio).
 
  - Modify some power management utilities to use the canonical ftrace
    path (Ross Zwisler).
 
  - Correct spelling problems for Documentation/power/ as reported by
    codespell (Randy Dunlap).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmPuJfMSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx/5kQAJNOVImLEPLerLP8xufw30//LuDU5Gi0
 STsyDOMql/I2MpkeqeCcgrSbpy6NlEglOvg16gfpQ3qqTCLF9ypENxs9E5BGGvW0
 aEdCzvaoqmvi9PCr/jmj0EPP70/U+rIX5m/k0QdjLh9x0aLoAEe3uRJTfR9QVqXf
 I7JX0N9kjKi7YxpA5DlkHrS7J7GPPiWlesJ3p4wXuHMo3jf+6fgkoPFt8yRrGWeh
 AHzGT2BLrsy7aAUjGZB65Qx9q3fnSXMmXOjmn0Xh2njQah+zRZDwrNzwoY2HTLL/
 KQ6/Ww16USYRZtCS1fmGwAj9I+ddq6AOvhPCMn0vLXXmKVAMUrVVWnQS/0+vpm9y
 suUMK9Tndkgxd1vjby2246ThJn27uDd/ERFan4ouQo2j22uICY+SDo3osj2hMXka
 wq4zthXkY8KgjZ+MuXnZxPhcOvo8KRvfxAU0fy5efQnSkbtwY9UlMvjPBMBHm/RA
 21/6kjQNtq5vMmI37oC8DH+oPrRQ7sUKuY7HNqwO9P3QNKWVmNe7cF5UtXXxME7Q
 ULvP1d+u+TNNdHFLryPwCSzBO34wQEccdRZBjalZ8tBe6JiDWUFHC3giSURZSuzZ
 GDvzVaNX6PkgToyv4inBTB8lTp6pAuUjaWNvNJzVvUXiEKHB0ihzg5vpJW5NdwlH
 15Tn8cjH7pp0
 =lZLx
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These add EPP support to the AMD P-state cpufreq driver, add support
  for new platforms to the Intel RAPL power capping driver, intel_idle
  and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194,
  drop the custom cpufreq driver for loongson1 that is not necessary any
  more (and the corresponding cpufreq platform device), fix assorted
  issues and clean up code.

  Specifics:

   - Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
     Karny, Arnd Bergmann, Bagas Sanjaya)

   - Drop the custom cpufreq driver for loongson1 that is not necessary
     any more and the corresponding cpufreq platform device (Keguang
     Zhang)

   - Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
     entries (Paul E. McKenney)

   - Enable thermal cooling for Tegra194 (Yi-Wei Wang)

   - Register module device table and add missing compatibles for
     cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss)

   - Various dt binding updates for qcom-cpufreq-nvmem and
     opp-v2-kryo-cpu (Christian Marangi)

   - Make kobj_type structure in the cpufreq core constant (Thomas
     Weißschuh)

   - Make cpufreq_unregister_driver() return void (Uwe Kleine-König)

   - Make the TEO cpuidle governor check CPU utilization in order to
     refine idle state selection (Kajetan Puchalski)

   - Make Kconfig select the haltpoll cpuidle governor when the haltpoll
     cpuidle driver is selected and replace a default_idle() call in
     that driver with arch_cpu_idle() to allow MWAIT to be used (Li
     RongQing)

   - Add Emerald Rapids Xeon support to the intel_idle driver (Artem
     Bityutskiy)

   - Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
     avoid randconfig build failures (Arnd Bergmann)

   - Make kobj_type structures used in the cpuidle sysfs interface
     constant (Thomas Weißschuh)

   - Make the cpuidle driver registration code update microsecond values
     of idle state parameters in accordance with their nanosecond values
     if they are provided (Rafael Wysocki)

   - Make the PSCI cpuidle driver prevent topology CPUs from being
     suspended on PREEMPT_RT (Krzysztof Kozlowski)

   - Document that pm_runtime_force_suspend() cannot be used with
     DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald)

   - Add EXPORT macros for exporting PM functions from drivers (Richard
     Fitzgerald)

   - Remove /** from non-kernel-doc comments in hibernation code (Randy
     Dunlap)

   - Fix possible name leak in powercap_register_zone() (Yang Yingliang)

   - Add Meteor Lake and Emerald Rapids support to the intel_rapl power
     capping driver (Zhang Rui)

   - Modify the idle_inject power capping facility to support 100% idle
     injection (Srinivas Pandruvada)

   - Fix large time windows handling in the intel_rapl power capping
     driver (Zhang Rui)

   - Fix memory leaks with using debugfs_lookup() in the generic PM
     domains and Energy Model code (Greg Kroah-Hartman)

   - Add missing 'cache-unified' property in the example for kryo OPP
     bindings (Rob Herring)

   - Fix error checking in opp_migrate_dentry() (Qi Zheng)

   - Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
     Dybcio)

   - Modify some power management utilities to use the canonical ftrace
     path (Ross Zwisler)

   - Correct spelling problems for Documentation/power/ as reported by
     codespell (Randy Dunlap)"

* tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
  Documentation: amd-pstate: disambiguate user space sections
  cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
  dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum
  dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional
  dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables
  PM: Add EXPORT macros for exporting PM functions
  cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
  MIPS: loongson32: Drop obsolete cpufreq platform device
  powercap: intel_rapl: Fix handling for large time window
  cpuidle: driver: Update microsecond values of state parameters as needed
  cpuidle: sysfs: make kobj_type structures constant
  cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
  PM: EM: fix memory leak with using debugfs_lookup()
  PM: domains: fix memory leak with using debugfs_lookup()
  cpufreq: Make kobj_type structure constant
  cpufreq: davinci: Fix clk use after free
  cpufreq: amd-pstate: avoid uninitialized variable use
  cpufreq: Make cpufreq_unregister_driver() return void
  OPP: fix error checking in opp_migrate_dentry()
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible
  ...
2023-02-21 12:13:58 -08:00
Krzysztof Kozlowski
f9901f6453 cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
The runtime Power Management of CPU topology is not compatible with
PREEMPT_RT:

 1. Core cpuidle path disables IRQs.
 2. Core cpuidle calls cpuidle-psci.
 3. cpuidle-psci in __psci_enter_domain_idle_state() calls
    pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
    spinlocks (which are sleeping on PREEMPT_RT).

Deep sleep modes are not a priority of Realtime kernels because the
latencies might become unpredictable.  On the other hand the PSCI CPU
idle power domain is a parent of other devices and power domain
controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).

Disable the idle callbacks in cpuidle-psci and mark the domain as
always on.  This is a trade-off between making PREEMPT_RT working and
still having a proper power domain hierarchy in the system.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Adrien Thierry <athierry@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-13 17:15:42 +01:00
Rafael J. Wysocki
41204a6076 cpuidle: driver: Update microsecond values of state parameters as needed
If the cpuidle driver provides the target residency and exit latency in
nanoseconds, the corresponding values in microseconds need to be set to
reflect the provided numbers in order for the sysfs interface to show
them correctly, so make __cpuidle_driver_init() do that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2023-02-13 16:08:44 +01:00
Thomas Weißschuh
e898b07deb cpuidle: sysfs: make kobj_type structures constant
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 21:04:01 +01:00
Arnd Bergmann
7787943a3a cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
Some ARMv4 processors don't support suspend, which leads
to a build failure with the tegra and qualcomm cpuidle driver:

WARNING: unmet direct dependencies detected for ARM_CPU_SUSPEND
  Depends on [n]: ARCH_SUSPEND_POSSIBLE [=n]
  Selected by [y]:
  - ARM_TEGRA_CPUIDLE [=y] && CPU_IDLE [=y] && (ARM [=y] || ARM64) && (ARCH_TEGRA [=n] || COMPILE_TEST [=y]) && !ARM64 && MMU [=y]

arch/arm/kernel/sleep.o: in function `__cpu_suspend':
(.text+0x68): undefined reference to `cpu_sa110_suspend_size'
(.text+0x68): undefined reference to `cpu_fa526_suspend_size'

Add an explicit dependency to make randconfig builds avoid
this combination.

Fixes: faae6c9f2e ("cpuidle: tegra: Enable compile testing")
Fixes: a871be6b8e ("cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver")
Link: https://lore.kernel.org/all/20211013160125.772873-1-arnd@kernel.org/
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 20:50:39 +01:00
Elliot Berman
3bf90eca76 firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/
Move include/linux/qcom_scm.h to include/linux/firmware/qcom/qcom_scm.h.
This removes 1 of a few remaining Qualcomm-specific headers into a more
approciate subdirectory under include/.

Suggested-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Reviewed-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
Acked-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230203210956.3580811-1-quic_eberman@quicinc.com
2023-02-08 19:15:16 -08:00
Peter Zijlstra
4d627628d7 cpuidle: Fix poll_idle() noinstr annotation
The instrumentation_begin()/end() annotations in poll_idle() were
complete nonsense. Specifically they caused tracing to happen in the
middle of noinstr code, resulting in RCU splats.

Now that local_clock() is noinstr, mark up the rest and let it rip.

Fixes: 00717eb8c9 ("cpuidle: Annotate poll_idle()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/oe-lkp/202301192148.58ece903-oliver.sang@intel.com
Link: https://lore.kernel.org/r/20230126151323.819534689@infradead.org
2023-01-31 15:01:47 +01:00
Li RongQing
716ff71ae2 cpuidle-haltpoll: Replace default_idle() with arch_cpu_idle()
When a KVM guest has MWAIT, mwait_idle() is used as the default idle
function.

However, the cpuidle-haltpoll driver calls default_idle() from
default_enter_idle() directly and that one uses HLT instead of MWAIT,
which may affect performance adversely, because MWAIT is preferred to
HLT as explained by the changelog of commit aebef63cf7 ("x86: Remove
vendor checks from prefer_mwait_c1_over_halt").

Make default_enter_idle() call arch_cpu_idle(), which can use MWAIT,
instead of default_idle() to address this issue.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
[ rjw: Changelog rewrite ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-20 17:33:52 +01:00
Peter Zijlstra
19235e4727 cpuidle, arm64: Fix the ARM64 cpuidle logic
The recent cpuidle changes started triggering RCU splats on
Juno development boards:

  | =============================
  | WARNING: suspicious RCU usage
  | -----------------------------
  | include/trace/events/ipi.h:19 suspicious rcu_dereference_check() usage!

Fix cpuidle on ARM64:

 - ... by introducing a new 'is_rcu' flag to the cpuidle helpers & make
   ARM64 use it, as ARM64 wants to keep RCU active longer and wants to
   do the ct_cpuidle_enter()/exit() dance itself.

 - Also update the PSCI driver accordingly.

 - This also removes the last known RCU_NONIDLE() user as a bonus.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/Y8Z31UbzG3LJgAXE@hirez.programming.kicks-ass.net

--
2023-01-18 12:27:17 +01:00
Arnd Bergmann
3b8645e9ec cpuidle: mvebu: Fix duplicate flags assignment
The added '.flags' value is sometimes ignored here because
it gets overwritten by another initialization:

  drivers/cpuidle/cpuidle-mvebu-v7.c:24:33: error: initialized field overwritten [-Werror=override-init]
     24 | #define MVEBU_V7_FLAG_DEEP_IDLE 0x10000
        |                                 ^~~~~~~
  drivers/cpuidle/cpuidle-mvebu-v7.c:69:43: note: in expansion of macro 'MVEBU_V7_FLAG_DEEP_IDLE'
  ...

Merge the two fields into one.

Fixes: 4ce40e9dbe ("cpuidle, armada: Push RCU-idle into driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230117164642.1672784-1-arnd@kernel.org
2023-01-18 12:03:54 +01:00
Li RongQing
4edc13ae89 cpuidle-haltpoll: select haltpoll governor
The haltpoll cpuidle driver should select the haltpoll governor, so
as to ensure that they work together.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-13 20:50:46 +01:00
Peter Zijlstra
0e985e9d22 cpuidle: Add comments about noinstr/__cpuidle usage
Add a few words on noinstr / __cpuidle usage.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112195542.397238052@infradead.org
2023-01-13 11:48:18 +01:00
Peter Zijlstra
69e26b4f43 cpuidle, arch: Mark all ct_cpuidle_enter() callers __cpuidle
For all cpuidle drivers that use CPUIDLE_FLAG_RCU_IDLE, ensure that
all functions that call ct_cpuidle_enter() are marked __cpuidle.

( due to lack of noinstr validation on these platforms it is entirely
  possible this isn't complete )

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112195542.274096325@infradead.org
2023-01-13 11:48:17 +01:00
Peter Zijlstra
17cc2b5525 cpuidle: Ensure ct_cpuidle_enter() is always called from noinstr/__cpuidle
Tracing (kprobes included) and other compiler instrumentation relies
on a normal kernel runtime. Therefore all functions that disable RCU
should be noinstr, as should all functions that are called while RCU
is disabled.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112195542.212914195@infradead.org
2023-01-13 11:48:17 +01:00
Peter Zijlstra
00717eb8c9 cpuidle: Annotate poll_idle()
The __cpuidle functions will become a noinstr class, as such they need
explicit annotations.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.312601331@infradead.org
2023-01-13 11:48:15 +01:00
Peter Zijlstra
a01353cf18 cpuidle: Fix ct_idle_*() usage
The whole disable-RCU, enable-IRQS dance is very intricate since
changing IRQ state is traced, which depends on RCU.

Add two helpers for the cpuidle case that mirror the entry code:

  ct_cpuidle_enter()
  ct_cpuidle_exit()

And fix all the cases where the enter/exit dance was buggy.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.130014793@infradead.org
2023-01-13 11:48:15 +01:00
Peter Zijlstra
0c5ffc3d7b cpuidle, dt: Push RCU-idle into driver
Doing RCU-idle outside the driver, only to then temporarily enable it
again before going idle is suboptimal.

Notably: this converts all dt_init_idle_driver() and
__CPU_PM_CPU_IDLE_ENTER() users for they are inextrably intertwined.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.068981667@infradead.org
2023-01-13 11:48:00 +01:00