Commit Graph

32097 Commits

Author SHA1 Message Date
Linus Torvalds
c677124e63 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "These were the main changes in this cycle:

   - More -rt motivated separation of CONFIG_PREEMPT and
     CONFIG_PREEMPTION.

   - Add more low level scheduling topology sanity checks and warnings
     to filter out nonsensical topologies that break scheduling.

   - Extend uclamp constraints to influence wakeup CPU placement

   - Make the RT scheduler more aware of asymmetric topologies and CPU
     capacities, via uclamp metrics, if CONFIG_UCLAMP_TASK=y

   - Make idle CPU selection more consistent

   - Various fixes, smaller cleanups, updates and enhancements - please
     see the git log for details"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
  sched/fair: Define sched_idle_cpu() only for SMP configurations
  sched/topology: Assert non-NUMA topology masks don't (partially) overlap
  idle: fix spelling mistake "iterrupts" -> "interrupts"
  sched/fair: Remove redundant call to cpufreq_update_util()
  sched/psi: create /proc/pressure and /proc/pressure/{io|memory|cpu} only when psi enabled
  sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP
  sched/fair: calculate delta runnable load only when it's needed
  sched/cputime: move rq parameter in irqtime_account_process_tick
  stop_machine: Make stop_cpus() static
  sched/debug: Reset watchdog on all CPUs while processing sysrq-t
  sched/core: Fix size of rq::uclamp initialization
  sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
  sched/fair: Load balance aggressively for SCHED_IDLE CPUs
  sched/fair : Improve update_sd_pick_busiest for spare capacity case
  watchdog: Remove soft_lockup_hrtimer_cnt and related code
  sched/rt: Make RT capacity-aware
  sched/fair: Make EAS wakeup placement consider uclamp restrictions
  sched/fair: Make task_fits_capacity() consider uclamp restrictions
  sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()
  sched/uclamp: Make uclamp util helpers use and return UL values
  ...
2020-01-28 10:07:09 -08:00
Linus Torvalds
c0e809e244 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - Ftrace is one of the last W^X violators (after this only KLP is
     left). These patches move it over to the generic text_poke()
     interface and thereby get rid of this oddity. This requires a
     surprising amount of surgery, by Peter Zijlstra.

   - x86/AMD PMUs: add support for 'Large Increment per Cycle Events' to
     count certain types of events that have a special, quirky hw ABI
     (by Kim Phillips)

   - kprobes fixes by Masami Hiramatsu

  Lots of tooling updates as well, the following subcommands were
  updated: annotate/report/top, c2c, clang, record, report/top TUI,
  sched timehist, tests; plus updates were done to the gtk ui, libperf,
  headers and the parser"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  perf/x86/amd: Add support for Large Increment per Cycle Events
  perf/x86/amd: Constrain Large Increment per Cycle events
  perf/x86/intel/rapl: Add Comet Lake support
  tracing: Initialize ret in syscall_enter_define_fields()
  perf header: Use last modification time for timestamp
  perf c2c: Fix return type for histogram sorting comparision functions
  perf beauty sockaddr: Fix augmented syscall format warning
  perf/ui/gtk: Fix gtk2 build
  perf ui gtk: Add missing zalloc object
  perf tools: Use %define api.pure full instead of %pure-parser
  libperf: Setup initial evlist::all_cpus value
  perf report: Fix no libunwind compiled warning break s390 issue
  perf tools: Support --prefix/--prefix-strip
  perf report: Clarify in help that --children is default
  tools build: Fix test-clang.cpp with Clang 8+
  perf clang: Fix build with Clang 9
  kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic
  tools lib: Fix builds when glibc contains strlcpy()
  perf report/top: Make 'e' visible in the help and make it toggle showing callchains
  perf report/top: Do not offer annotation for symbols without samples
  ...
2020-01-28 09:44:15 -08:00
Linus Torvalds
2180f214f4 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "Just a handful of changes in this cycle: an ARM64 performance
  optimization, a comment fix and a debug output fix"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/osq: Use optimized spinning loop for arm64
  locking/qspinlock: Fix inaccessible URL of MCS lock paper
  locking/lockdep: Fix lockdep_stats indentation problem
2020-01-28 09:33:25 -08:00
Linus Torvalds
d99391ec2b Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The RCU changes in this cycle were:
   - Expedited grace-period updates
   - kfree_rcu() updates
   - RCU list updates
   - Preemptible RCU updates
   - Torture-test updates
   - Miscellaneous fixes
   - Documentation updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
  rcu: Remove unused stop-machine #include
  powerpc: Remove comment about read_barrier_depends()
  .mailmap: Add entries for old paulmck@kernel.org addresses
  srcu: Apply *_ONCE() to ->srcu_last_gp_end
  rcu: Switch force_qs_rnp() to for_each_leaf_node_cpu_mask()
  rcu: Move rcu_{expedited,normal} definitions into rcupdate.h
  rcu: Move gp_state_names[] and gp_state_getname() to tree_stall.h
  rcu: Remove the declaration of call_rcu() in tree.h
  rcu: Fix tracepoint tracking RCU CPU kthread utilization
  rcu: Fix harmless omission of "CONFIG_" from #if condition
  rcu: Avoid tick_dep_set_cpu() misordering
  rcu: Provide wrappers for uses of ->rcu_read_lock_nesting
  rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special()
  rcu: Clear ->rcu_read_unlock_special only once
  rcu: Clear .exp_hint only when deferred quiescent state has been reported
  rcu: Rename some instance of CONFIG_PREEMPTION to CONFIG_PREEMPT_RCU
  rcu: Remove kfree_call_rcu_nobatch()
  rcu: Remove kfree_rcu() special casing and lazy-callback handling
  rcu: Add support for debug_objects debugging for kfree_rcu()
  rcu: Add multiple in-flight batches of kfree_rcu() work
  ...
2020-01-28 08:46:13 -08:00
Ingo Molnar
0cc4bd8f70 Merge branch 'core/kprobes' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-28 07:59:05 +01:00
Linus Torvalds
3d3b44a61a The interrupt departement provides:
- A mechanism to shield isolated tasks from managed interrupts:
 
    The affinity of managed interrupts is completely controlled by the
    kernel and user space has no influence on them. The reason is that
    the automatically assigned affinity correlates to the multi-queue
    CPU handling of block devices.
 
    If the generated affinity mask spaws both housekeeping and isolated CPUs
    the interrupt could be routed to an isolated CPU which would then be
    disturbed by I/O submitted by a housekeeping CPU.
 
    The new mechamism ensures that as long as one housekeeping CPU is online
    in the assigned affinity mask the interrupt is routed to a housekeeping
    CPU.
 
    If there is no online housekeeping CPU in the affinity mask, then the
    interrupt is routed to an isolated CPU to keep the device queue intact,
    but unless the isolated CPU submits I/O by itself these interrupts are
    not raised.
 
  - A small addon to the device tree irqdomain core code to avoid
    duplication in irq chip drivers
 
  - Conversion of the SiFive PLIC to hierarchical domains
 
  - The usual pile of new irq chip drivers: SiFive GPIO, Aspeed SCI, NXP
    INTMUX, Meson A1 GPIO
 
  - The first cut of support for the new ARM GICv4.1
 
  - The usual pile of fixes and improvements in core and driver code
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl4vcbETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoezyEADBPf0ipu5+KeTtCR+DjRAO8o0wM0J/
 JNkRkSrS/qENSda/d6pZE2AWpqlDOs6apg+SNGkv0knM+1Xy94nLOf4zJBsR+GW0
 w2jw68egnyB2QZtm/BvOJL+qCoixcObg5sLt0165pDdKzyDNWeCMtRU+QAw42T/l
 WC2QrhjKKqYST1m+UgDf1UXz8TDGIW4muRP9UiG0Uwc0LU6cG2H4OmGn0bYissaT
 JTG75pzGqUH3kZ1a1qD28nGyoY85BXz1iV5/IvIPaQbkQARbvfMbh1KvAnGhJj7N
 96rjMpOGv2/kv1FI+4FUy6w5Wn4EyW2OaCtB/oUCFNcZvrNNgvglxCRQkkO8yb3D
 VOOm595ICm3EnIfxBpSzhgvVl5MY39g6qRb6Rpnna+8eRtrYnytMBdvhY0OGlG8/
 cZYZDay0nzhY6vq023iw1YMDKqft7TR1R+6w1iPL7nXHXW99Dhv87d1Fjt0CqphD
 NIoNDgxciIyfMbMBvcg1qPe/g3L8+cAKNzGsIwIU9GneEZFBk3/piGcBlFpoEEOK
 2QKvks3QRXMx+qVWkIqy3LZKV9EAQlb9Lpjaa1ec5d4m/EdACm19OpZpqoCljPtw
 9vdaMz4ZxvUbwjih3VnVPklZCiVGiKj1j0iw5v3FCHh4MUljzCrxNMqK/U9CR8H0
 uid3EX8YMi+DXA==
 =E2VR
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "The interrupt departement provides:

   - A mechanism to shield isolated tasks from managed interrupts:

     The affinity of managed interrupts is completely controlled by the
     kernel and user space has no influence on them. The reason is that
     the automatically assigned affinity correlates to the multi-queue
     CPU handling of block devices.

     If the generated affinity mask spaws both housekeeping and isolated
     CPUs the interrupt could be routed to an isolated CPU which would
     then be disturbed by I/O submitted by a housekeeping CPU.

     The new mechamism ensures that as long as one housekeeping CPU is
     online in the assigned affinity mask the interrupt is routed to a
     housekeeping CPU.

     If there is no online housekeeping CPU in the affinity mask, then
     the interrupt is routed to an isolated CPU to keep the device queue
     intact, but unless the isolated CPU submits I/O by itself these
     interrupts are not raised.

   - A small addon to the device tree irqdomain core code to avoid
     duplication in irq chip drivers

   - Conversion of the SiFive PLIC to hierarchical domains

   - The usual pile of new irq chip drivers: SiFive GPIO, Aspeed SCI,
     NXP INTMUX, Meson A1 GPIO

   - The first cut of support for the new ARM GICv4.1

   - The usual pile of fixes and improvements in core and driver code"

* tag 'irq-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  genirq, sched/isolation: Isolate from handling managed interrupts
  irqchip/gic-v4.1: Allow direct invalidation of VLPIs
  irqchip/gic-v4.1: Suppress per-VLPI doorbell
  irqchip/gic-v4.1: Add VPE INVALL callback
  irqchip/gic-v4.1: Add VPE eviction callback
  irqchip/gic-v4.1: Add VPE residency callback
  irqchip/gic-v4.1: Add mask/unmask doorbell callbacks
  irqchip/gic-v4.1: Plumb skeletal VPE irqchip
  irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP
  irqchip/gic-v4.1: Don't use the VPE proxy if RVPEID is set
  irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP
  irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation
  irqchip/gic-v3: Add GICv4.1 VPEID size discovery
  irqchip/gic-v3: Detect GICv4.1 supporting RVPEID
  irqchip/gic-v3-its: Fix get_vlpi_map() breakage with doorbells
  irqdomain: Fix a memory leak in irq_domain_push_irq()
  irqchip: Add NXP INTMUX interrupt multiplexer support
  dt-bindings: interrupt-controller: Add binding for NXP INTMUX interrupt multiplexer
  irqchip: Define EXYNOS_IRQ_COMBINER
  irqchip/meson-gpio: Add support for meson a1 SoCs
  ...
2020-01-27 17:22:21 -08:00
Linus Torvalds
ab67f60025 A small set of SMP core code changes:
- Rework the smp function call core code to avoid the allocation of an
    additional cpumask.
 
  - Remove the not longer required GFP argument from on_each_cpu_cond() and
    on_each_cpu_cond_mask() and fixup the callers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl4vcrATHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYocr1D/4ptWrZKsgBxGKBP34lvJAjd0KRqVoz
 J9dLAN+AAs6YZSnOmRBX1b9d9IL2PrccOEF+J/Ja3ZkB+PAoAQ9W3uCHkZ77WUph
 xx5eJahZCo+3nZ6amGgS2cPdG8WjxSK3enxPcU4pJhV/QaaP7R9BZt5YQgreYAQO
 kRi0qyt10AExLqLd+077GX5DKcEOXwwVG/qckUQK2h8Kkd68vTbjDxggvsHwmpSE
 MHaszv85UpE+YQbT6DyG5Hi4kK3AJeODBy/fKr2VODIBLZpKiuQ5kK4lbNHYPpVB
 wXw0umXHLQggrKoPKo58ayoCXD0bAG9JT0rvapjUJIz1/9YejQ6lB/t5f0dPbSrU
 al4CJq/pfNky4H6uLWFVbAXJabJuBcB/eG1csaM88Yw0pEXkbnHCOkJAdosoDhhl
 qNQYg4yaE9tTuy1chXDMntH0R0Qztqry6+DMsczJxT21TgERsHCRJV+mGLV46/ZN
 GXJEoJ/cnjNJlqj8GirjbksPRbxuvmQNHRVrTh8qOSxbPKUQZfZocp9HHNmFsBaN
 Q07VgWMHXzYj1L4r3cbJ/ONpOCo66lw7F//MNGk0eIWdeL6H7XZvJQPX+YUrLsZc
 tVlZh8mZOGbRiM8g1dN0BSJO7QrVYmJWGb0oQQtv5tVSRN/V8Y9VZ8YX8lpYlF1e
 ETkrZLGhTJWp4A==
 =M4aK
 -----END PGP SIGNATURE-----

Merge tag 'smp-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core SMP updates from Thomas Gleixner:
 "A small set of SMP core code changes:

   - Rework the smp function call core code to avoid the allocation of
     an additional cpumask

   - Remove the not longer required GFP argument from on_each_cpu_cond()
     and on_each_cpu_cond_mask() and fixup the callers"

* tag 'smp-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp: Remove allocation mask from on_each_cpu_cond.*()
  smp: Add a smp_cond_func_t argument to smp_call_function_many()
  smp: Use smp_cond_func_t as type for the conditional function
2020-01-27 17:04:51 -08:00
Linus Torvalds
e279160f49 The timekeeping and timers departement provides:
- Time namespace support:
 
     If a container migrates from one host to another then it expects that
     clocks based on MONOTONIC and BOOTTIME are not subject to
     disruption. Due to different boot time and non-suspended runtime these
     clocks can differ significantly on two hosts, in the worst case time
     goes backwards which is a violation of the POSIX requirements.
 
     The time namespace addresses this problem. It allows to set offsets for
     clock MONOTONIC and BOOTTIME once after creation and before tasks are
     associated with the namespace. These offsets are taken into account by
     timers and timekeeping including the VDSO.
 
     Offsets for wall clock based clocks (REALTIME/TAI) are not provided by
     this mechanism. While in theory possible, the overhead and code
     complexity would be immense and not justified by the esoteric potential
     use cases which were discussed at Plumbers '18.
 
     The overhead for tasks in the root namespace (host time offsets = 0) is
     in the noise and great effort was made to ensure that especially in the
     VDSO. If time namespace is disabled in the kernel configuration the
     code is compiled out.
 
     Kudos to Andrei Vagin and Dmitry Sofanov who implemented this feature
     and kept on for more than a year addressing review comments, finding
     better solutions. A pleasant experience.
 
   - Overhaul of the alarmtimer device dependency handling to ensure that
     the init/suspend/resume ordering is correct.
 
   - A new clocksource/event driver for Microchip PIT64
 
   - Suspend/resume support for the Hyper-V clocksource
 
   - The usual pile of fixes, updates and improvements mostly in the
     driver code.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl4vbTcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoXT2D/96iJ3G9Snn2khEQP3XS2rYmtDGw7NO
 m1n96falwWeGe6zreU80R2Jge5nLxQtNhRoMPLLee1GpHwRC6lvqEqgdZ4LMBrD2
 JqV7Gzg8Urmdh+hpDsyTCpeEWEzoMKxiFOX8PxwctqUhM4szEe5iQg2YQsg85Jw2
 vG6M93N2xwDILh4rhEMbKjo+5ZmYn7c1RQvpGOSmpKOj940W/N7H2HBsFhdaJ1Kw
 FW5pFv1211PaU5RV2YNb2dMeeMTT1N3e2VN4Dkadoxp47pb+725gNHEBEjmV9poG
 Lp4IhzGAPnj8zVD88icQZSTaK3gUHMClxprJ0Pf84WEtiH7SeGu8BPYyu77+oNDe
 yzcctDJNyCWXkzmaP/fe/HLc0TStbvNAJ5Tagp4BC75gzebeb4/n8RtRT0fKeDYL
 pxpDPKDAPU7p1JSjxiWAtshqjBycWNY3Z49bA7/VhKBhnv8BDyBPGlYd7/4xrbGr
 RK7DQNXJwaJaiNJ7p5PiaFxGzNyB0B9sThD/slSlEInIKb4h9YzWr0TV+NB62VnB
 sDcN+tpLbRPz5/5cHGGfxR0+zKWpfyai8pzbmmaXEaKssjRYwyvcac5EZdgbWpbK
 k7CqAjoWLA2P+tGeePNJOf5JYK6Vmdyh4clmuwM0zOiRJ9NlWUyMf3z7QYILs4RO
 UAI+6opYlZEPAw==
 =x3qT
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "The timekeeping and timers departement provides:

   - Time namespace support:

     If a container migrates from one host to another then it expects
     that clocks based on MONOTONIC and BOOTTIME are not subject to
     disruption. Due to different boot time and non-suspended runtime
     these clocks can differ significantly on two hosts, in the worst
     case time goes backwards which is a violation of the POSIX
     requirements.

     The time namespace addresses this problem. It allows to set offsets
     for clock MONOTONIC and BOOTTIME once after creation and before
     tasks are associated with the namespace. These offsets are taken
     into account by timers and timekeeping including the VDSO.

     Offsets for wall clock based clocks (REALTIME/TAI) are not provided
     by this mechanism. While in theory possible, the overhead and code
     complexity would be immense and not justified by the esoteric
     potential use cases which were discussed at Plumbers '18.

     The overhead for tasks in the root namespace (ie where host time
     offsets = 0) is in the noise and great effort was made to ensure
     that especially in the VDSO. If time namespace is disabled in the
     kernel configuration the code is compiled out.

     Kudos to Andrei Vagin and Dmitry Sofanov who implemented this
     feature and kept on for more than a year addressing review
     comments, finding better solutions. A pleasant experience.

   - Overhaul of the alarmtimer device dependency handling to ensure
     that the init/suspend/resume ordering is correct.

   - A new clocksource/event driver for Microchip PIT64

   - Suspend/resume support for the Hyper-V clocksource

   - The usual pile of fixes, updates and improvements mostly in the
     driver code"

* tag 'timers-core-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  alarmtimer: Make alarmtimer_get_rtcdev() a stub when CONFIG_RTC_CLASS=n
  alarmtimer: Use wakeup source from alarmtimer platform device
  alarmtimer: Make alarmtimer platform device child of RTC device
  alarmtimer: Update alarmtimer_get_rtcdev() docs to reflect reality
  hrtimer: Add missing sparse annotation for __run_timer()
  lib/vdso: Only read hrtimer_res when needed in __cvdso_clock_getres()
  MIPS: vdso: Define BUILD_VDSO32 when building a 32bit kernel
  clocksource/drivers/hyper-v: Set TSC clocksource as default w/ InvariantTSC
  clocksource/drivers/hyper-v: Untangle stimers and timesync from clocksources
  clocksource/drivers/timer-microchip-pit64b: Fix sparse warning
  clocksource/drivers/exynos_mct: Rename Exynos to lowercase
  clocksource/drivers/timer-ti-dm: Fix uninitialized pointer access
  clocksource/drivers/timer-ti-dm: Switch to platform_get_irq
  clocksource/drivers/timer-ti-dm: Convert to devm_platform_ioremap_resource
  clocksource/drivers/em_sti: Fix variable declaration in em_sti_probe
  clocksource/drivers/em_sti: Convert to devm_platform_ioremap_resource
  clocksource/drivers/bcm2835_timer: Fix memory leak of timer
  clocksource/drivers/cadence-ttc: Use ttc driver as platform driver
  clocksource/drivers/timer-microchip-pit64b: Add Microchip PIT64B support
  clocksource/drivers/hyper-v: Reserve PAGE_SIZE space for tsc page
  ...
2020-01-27 16:47:05 -08:00
Linus Torvalds
b11c89a158 A set of watchdog/softlockup related improvements:
- Enforce that the watchdog timestamp is always valid on boot. The
    original implementation caused a watchdog disabled gap of one second in
    the boot process due to truncation of the underlying sched clock. The
    sched clock is divided by 1e9 to convert nanoseconds to seconds. So for
    the first second of the boot process the result is 0 which is at the
    same time the indicator to disable the watchdog. The trivial fix is to
    change the disabled indicator to ULONG_MAX.
 
  - Two cleanup patches removing unused and redundant code which got
    forgotten to be cleaned up in previous changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl4vbrQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTQHD/9ONyg9VQLjk6aH94H1Sjik/K7zvxoC
 aMGY2onZ6PddVrcTgJoMmWteQlQ2YScCSVnfVedmxTRU8laEHU/LQnMntTAbuHWj
 VUkK8X/AI5l+VY6p0Sr1iCyxcFezoC2VMqOKntuQl3080mK7R7/fQ+ZVmimiPihr
 46qMikIfBN7w2od7Ger3dZRttbnRj5YsmLBenX/HtBY/HPdhoDx6lfW/5AbAgUH5
 qnAmM0yPZ/VUSfo45z+exESUezxByIkGsrROBtPSRwql3Oqbyrza2UC48dRjsuIQ
 vO0coorlhqJGF72WW45DiLvg4Hew/vVyzcYrIiOSQPZpeTtPzL23zk/cqcqpKy6N
 pCuiSgimzbPgzqTHs6WQR/D0Dn76rruUqXqteuD5zirC9Kjf2TWeIMPTgPfy8irt
 2RwT1+5Ao/SNkdm/Pxk0S/+Y99uRJSqeNTV3lroYGC7IFMAnG4P0S9uyFJ6ZFIMz
 nOvEOhUlFXWw/w7WPZv+ytx40sRkqFVIePSRtzq+cjlDEYCgLhuveE2A4/6IGPMP
 Ej6vsGh3lMyHieRhmymESG8uLU2P/L7hhPexUPJJu4QSxKbKQNfWx+0z7bm86Ic7
 0uDSNZZl7UDYq6tioS1DBTq9ybly9vn1WDe5tHMJDllPe9TIEnqynvVLIg6MMGdm
 GjbTNysDPx85yw==
 =WMiM
 -----END PGP SIGNATURE-----

Merge tag 'core-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull watchdog updates from Thomas Gleixner:
 "A set of watchdog/softlockup related improvements:

   - Enforce that the watchdog timestamp is always valid on boot. The
     original implementation caused a watchdog disabled gap of one
     second in the boot process due to truncation of the underlying
     sched clock.

     The sched clock is divided by 1e9 to convert nanoseconds to
     seconds. So for the first second of the boot process the result is
     0 which is at the same time the indicator to disable the watchdog.

     The trivial fix is to change the disabled indicator to ULONG_MAX.

   - Two cleanup patches removing unused and redundant code which got
     forgotten to be cleaned up in previous changes"

* tag 'core-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  watchdog/softlockup: Enforce that timestamp is valid on boot
  watchdog/softlockup: Remove obsolete check of last reported task
  watchdog: Remove soft_lockup_hrtimer_cnt and related code
2020-01-27 16:42:11 -08:00
Linus Torvalds
a56c41e5d7 Two fixes for the generic VDSO code which missed 5.5:
- Make the update to the coarse timekeeper unconditional. This is required
    because the coarse timekeeper interfaces in the VDSO do not depend on a
    VDSO capable clocksource. If the system does not have a VDSO capable
    clocksource and the update is depending on the VDSO capable clocksource,
    the coarse VDSO interfaces would operate on stale data forever.
 
  - Invert the logic of __arch_update_vdso_data() to avoid further head
    scratching. Tripped over this several times while analyzing the update
    problem above.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl4vXzUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodbPD/4km+XOhsbefcn1Xo6SAQV9akPhKSHY
 h1gfjpe4UD+Uj4WfmpERHcCJA3sYtZSjNyEWkwagH1XjB+rcLc3JE8XvhPCZTXCx
 g/OQlww1ef6mBZ5nslpPUZs8i0HppoV7Sa955QxR/jWuOIEssg5c+XGqP8xX8AhX
 TqBOUcJd0LhqCGt76Gb6LHnOEshE8e6ptZ0xayzMZsab3LJTEaJCrsoDpADQ1q8A
 hMjiL3CG9/e12qKYhODFTbyc/wgyGQYK8g6sb9E1Twd2Tw2+ikRbtZuQd3HQv4jV
 SiVtmMqLu6IH+G608zeNIn/67/WX9zYqUZ3fZgSjBwXWoB84Gyj11KLnjmCgS6SH
 0ddOQKPn8VyQc2anG4obRtMNB+TjJvGnB4QSL2ROJB7Zx6EYMsduhXwIbaNZDDro
 nIh6Xvl6iyb0lkhd9zCR7ak7UHJg4ECJsVKK3kAMIHJM4f53d/DwT+ZaHbJZa/2a
 OLoBGpBkJoE1X40dXou+0FUyUFRla42+ho99nCU580EyK/ZAuZEqKjjez9QIh4vN
 L/I6uEHGBw9myB40nb0DFhRIFR97BUkRTRA3VhyX0CYIE3gUL43zNFsdvcugsxRy
 4/Cf7tqhQcSjYjJxpLTRRWt2t6QvDoWfTnrwiPqSepcO17uV8WHLrxK4mT2i8Vjc
 PIq7OgZlp09gQA==
 =ONO4
 -----END PGP SIGNATURE-----

Merge tag 'timers-urgent-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "Two fixes for the generic VDSO code which missed 5.5:

   - Make the update to the coarse timekeeper unconditional.

     This is required because the coarse timekeeper interfaces in the
     VDSO do not depend on a VDSO capable clocksource. If the system
     does not have a VDSO capable clocksource and the update is
     depending on the VDSO capable clocksource, the coarse VDSO
     interfaces would operate on stale data forever.

   - Invert the logic of __arch_update_vdso_data() to avoid further head
     scratching.

     Tripped over this several times while analyzing the update problem
     above"

* tag 'timers-urgent-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/vdso: Update coarse timekeeper unconditionally
  lib/vdso: Make __arch_update_vdso_data() logic understandable
2020-01-27 16:37:40 -08:00
Linus Torvalds
07e309a972 audit/stable-5.6 PR 20200127
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl4vRtMUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXM6rw//RXPHJ+U1gjtC5kWQX66/HxEwSY3c
 M236UiJD+xbEHKWpViFd6S7YzHQCkqEO2UvMSwMFP0aL2D56nhkEIKblQJ5sLSK9
 3kNq/7wmxZgCj+/YrGeCiFFWpgSj/PiNB+VDouUkEkT5ZtKamA63qzhqEAUY995L
 vlZVgE8Cpu92JKJKZXKOnlJ+gYh3icFXKbWp0Lk9mmte4RiJ/zsFo+rRou5TzrMm
 30D3A9p9A7sC3jMeRQCowE5UwTkdOeknRi1b4obAGAajuaA+/HtL7bUj8rVwjJXl
 bpX/wShrZDb+dc0NGLQikhzDV/i3qn1DzMbSMuJL/1tf9Jv5lzoJ0/14RkBzd5sm
 pPFA/tUs/3NlPKEyZluA7W21LOUdWk4UxeOJkysJLjfYvsVDg02yFS3qYaZRPaSa
 B3Ex36drCfQfMpMH4Nglh1iDl5oOIoAwn4mSCtirAw6YYG/sW6YnBEnloNYFfahs
 b4/xPhzKfzLtKdc+4yUSbTlIUU+GAdCLxPlp2IvRgqfa9oTATIRP9DY70//V3myN
 PGnCLCu10ag47fJWV4mNetYUv6BR22dvLLX8igcfYmIS3zYM0lEWEz7SOaRuPBdf
 QqAHMNaDCY6z8aEFr+aXW6kr2SP3ycqdvv+b+CbfX1Z7R7wZ8iG3uRyaQHEGPvN2
 zje4VYJQcJs+EXE=
 =tPy4
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20200127' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit update from Paul Moore:
 "One small audit patch for the Linux v5.6 merge window, and
  unsurprisingly it passes our test suite with flying colors"

* tag 'audit-pr-20200127' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: Add __rcu annotation to RCU pointer
2020-01-27 15:35:50 -08:00
Linus Torvalds
03aa8c8cfa Merge branch 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:

 - cgroup2 interface for hugetlb controller. I think this was the last
   remaining bit which was missing from cgroup2

 - fixes for race and a spurious warning in threaded cgroup handling

 - other minor changes

* 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  iocost: Fix iocost_monitor.py due to helper type mismatch
  cgroup: Prevent double killing of css when enabling threaded cgroup
  cgroup: fix function name in comment
  mm: hugetlb controller for cgroups v2
2020-01-27 15:18:25 -08:00
Linus Torvalds
16d06120d7 Merge branch 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
 "Just a couple tracepoint patches"

* 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: remove workqueue_work event class
  workqueue: add worker function to workqueue_execute_end tracepoint
2020-01-27 15:16:52 -08:00
Linus Torvalds
6d277aca48 Power management updates for 5.6-rc1
- Update the ACPI processor driver in order to export
    acpi_processor_evaluate_cst() to the code outside of it, add
    ACPI support to the intel_idle driver based on that and clean
    up that driver somewhat (Rafael Wysocki).
 
  - Add an admin guide document for the intel_idle driver (Rafael
    Wysocki).
 
  - Clean up cpuidle core and drivers, enable compilation testing
    for some of them (Benjamin Gaignard, Krzysztof Kozlowski, Rafael
    Wysocki, Yangtao Li).
 
  - Fix reference counting of OPP (operating performance points) table
    structures (Viresh Kumar).
 
  - Add support for CPR (Core Power Reduction) to the AVS (Adaptive
    Voltage Scaling) subsystem (Niklas Cassel, Colin Ian King,
    YueHaibing).
 
  - Add support for TigerLake Mobile and JasperLake to the Intel RAPL
    power capping driver (Zhang Rui).
 
  - Update cpufreq drivers:
 
    * Add i.MX8MP support to imx-cpufreq-dt (Anson Huang).
 
    * Fix usage of a macro in loongson2_cpufreq (Alexandre Oliva).
 
    * Fix cpufreq policy reference counting issues in s3c and
      brcmstb-avs (chenqiwu).
 
    * Fix ACPI table reference counting issue and HiSilicon quirk
      handling in the CPPC driver (Hanjun Guo).
 
    * Clean up spelling mistake in intel_pstate (Harry Pan).
 
    * Convert the kirkwood and tegra186 drivers to using
      devm_platform_ioremap_resource() (Yangtao Li).
 
  - Update devfreq core:
 
    * Add 'name' sysfs attribute for devfreq devices (Chanwoo Choi).
 
    * Clean up the handing of transition statistics and allow them
      to be reset by writing 0 to the 'trans_stat' devfreq device
      attribute in sysfs (Kamil Konieczny).
 
    * Add 'devfreq_summary' to debugfs (Chanwoo Choi).
 
    * Clean up kerneldoc comments and Kconfig indentation (Krzysztof
      Kozlowski, Randy Dunlap).
 
  - Update devfreq drivers:
 
    * Add dynamic scaling for the imx8m DDR controller and clean up
      imx8m-ddrc (Leonard Crestez, YueHaibing).
 
    * Fix DT node reference counting and nitialization error code path
      in rk3399_dmc and add COMPILE_TEST and HAVE_ARM_SMCCC dependency
      for it (Chanwoo Choi, Yangtao Li).
 
    * Fix DT node reference counting in rockchip-dfi and make it use
      devm_platform_ioremap_resource() (Yangtao Li).
 
    * Fix excessive stack usage in exynos-ppmu (Arnd Bergmann).
 
    * Fix initialization error code paths in exynos-bus (Yangtao Li).
 
    * Clean up exynos-bus and exynos somewhat (Artur Świgoń, Krzysztof
      Kozlowski).
 
  - Add tracepoints for tracking usage_count updates unrelated to
    status changes in PM-runtime (Michał Mirosław).
 
  - Add sysfs attribute to control the "sync on suspend" behavior
    during system-wide suspend (Jonas Meurer).
 
  - Switch system-wide suspend tests over to 64-bit time (Alexandre
    Belloni).
 
  - Make wakeup sources statistics in debugfs cover deleted ones which
    used to be the case some time ago (zhuguangqing).
 
  - Clean up computations carried out during hibernation, update
    messages related to hibernation and fix a spelling mistake in one
    of them (Wen Yang, Luigi Semenzato, Colin Ian King).
 
  - Add mailmap entry for maintainer e-mail address that has not been
    functional for several years (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl4u2fESHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxvlkP/j5vDzyNUNJjnD6+897c8W+z5dwdiQfU
 QNtoopFXgw/fpOhGXRdj2mA4e6RtpU9aCCiHR6/qdh3/1qSnR5Y9R/51/gmdkwhY
 YakSxmgpgGrOJru94ApI1o/35eWwN/GxjajbfNY5ScrPQl/L0DF3iJWRsAOR5534
 p9e2gQqKecoE+MEn5JcGAXApA5xBLXuUmtWPUn5UGyhaz+jdmsf1zkDEOEvxREay
 hLGH1y6BY8HS/jytyNzISs9iDeBvg2fHmG8SskDiXVMke5sHBTU9MilgpnCFfQ0l
 OF/eNnTXTU7mAJhlnjBUt2rIe5peGSuhgg+Ur7s86xYqbj2SfsVM4UHjU0A6t9Jm
 sauWQh/Nbzw6XaCNzYKxP+dREAg0g/aq7xFqQi3bWx7YvzLk/hvNWi2+bv3adzx7
 Z3fvOki4xMXzLLrh0f1ipC8BKTsdioDZPAy06B80a0luv6ROdr6bPL7did14mWt2
 eCuPuZyXKhdV+PkjZHF+c4XT7N9NfGtE0WUQf54Q4VT00hDagGDliwXpm4ht1pjJ
 iO7uUJevXKSxMaV2xPZ+nWZaOeCVrMMTA1Ec1ELgC1n8WROZJ+SfhehgMQGp7BHS
 Hz4QO1HjTsCDnT+OU7JFeCRrkyXIlh75MOndWOOH6eTEXCAI9PihstB+UGXeNsK0
 BesNQz1sYY1O
 =g48u
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These add ACPI support to the intel_idle driver along with an admin
  guide document for it, add support for CPR (Core Power Reduction) to
  the AVS (Adaptive Voltage Scaling) subsystem, add new hardware support
  in a few places, add some new sysfs attributes, debugfs files and
  tracepoints, fix bugs and clean up a bunch of things all over.

  Specifics:

   - Update the ACPI processor driver in order to export
     acpi_processor_evaluate_cst() to the code outside of it, add ACPI
     support to the intel_idle driver based on that and clean up that
     driver somewhat (Rafael Wysocki).

   - Add an admin guide document for the intel_idle driver (Rafael
     Wysocki).

   - Clean up cpuidle core and drivers, enable compilation testing for
     some of them (Benjamin Gaignard, Krzysztof Kozlowski, Rafael
     Wysocki, Yangtao Li).

   - Fix reference counting of OPP (operating performance points) table
     structures (Viresh Kumar).

   - Add support for CPR (Core Power Reduction) to the AVS (Adaptive
     Voltage Scaling) subsystem (Niklas Cassel, Colin Ian King,
     YueHaibing).

   - Add support for TigerLake Mobile and JasperLake to the Intel RAPL
     power capping driver (Zhang Rui).

   - Update cpufreq drivers:
      - Add i.MX8MP support to imx-cpufreq-dt (Anson Huang).
      - Fix usage of a macro in loongson2_cpufreq (Alexandre Oliva).
      - Fix cpufreq policy reference counting issues in s3c and
        brcmstb-avs (chenqiwu).
      - Fix ACPI table reference counting issue and HiSilicon quirk
        handling in the CPPC driver (Hanjun Guo).
      - Clean up spelling mistake in intel_pstate (Harry Pan).
      - Convert the kirkwood and tegra186 drivers to using
        devm_platform_ioremap_resource() (Yangtao Li).

   - Update devfreq core:
      - Add 'name' sysfs attribute for devfreq devices (Chanwoo Choi).
      - Clean up the handing of transition statistics and allow them to
        be reset by writing 0 to the 'trans_stat' devfreq device
        attribute in sysfs (Kamil Konieczny).
      - Add 'devfreq_summary' to debugfs (Chanwoo Choi).
      - Clean up kerneldoc comments and Kconfig indentation (Krzysztof
        Kozlowski, Randy Dunlap).

   - Update devfreq drivers:
      - Add dynamic scaling for the imx8m DDR controller and clean up
        imx8m-ddrc (Leonard Crestez, YueHaibing).
      - Fix DT node reference counting and nitialization error code path
        in rk3399_dmc and add COMPILE_TEST and HAVE_ARM_SMCCC dependency
        for it (Chanwoo Choi, Yangtao Li).
      - Fix DT node reference counting in rockchip-dfi and make it use
        devm_platform_ioremap_resource() (Yangtao Li).
      - Fix excessive stack usage in exynos-ppmu (Arnd Bergmann).
      - Fix initialization error code paths in exynos-bus (Yangtao Li).
      - Clean up exynos-bus and exynos somewhat (Artur Świgoń, Krzysztof
        Kozlowski).

   - Add tracepoints for tracking usage_count updates unrelated to
     status changes in PM-runtime (Michał Mirosław).

   - Add sysfs attribute to control the "sync on suspend" behavior
     during system-wide suspend (Jonas Meurer).

   - Switch system-wide suspend tests over to 64-bit time (Alexandre
     Belloni).

   - Make wakeup sources statistics in debugfs cover deleted ones which
     used to be the case some time ago (zhuguangqing).

   - Clean up computations carried out during hibernation, update
     messages related to hibernation and fix a spelling mistake in one
     of them (Wen Yang, Luigi Semenzato, Colin Ian King).

   - Add mailmap entry for maintainer e-mail address that has not been
     functional for several years (Rafael Wysocki)"

* tag 'pm-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits)
  cpufreq: loongson2_cpufreq: adjust cpufreq uses of LOONGSON_CHIPCFG
  intel_idle: Clean up irtl_2_usec()
  intel_idle: Move 3 functions closer to their callers
  intel_idle: Annotate initialization code and data structures
  intel_idle: Move and clean up intel_idle_cpuidle_devices_uninit()
  intel_idle: Rearrange intel_idle_cpuidle_driver_init()
  intel_idle: Clean up NULL pointer check in intel_idle_init()
  intel_idle: Fold intel_idle_probe() into intel_idle_init()
  intel_idle: Eliminate __setup_broadcast_timer()
  cpuidle: fix cpuidle_find_deepest_state() kerneldoc warnings
  cpuidle: sysfs: fix warnings when compiling with W=1
  cpuidle: coupled: fix warnings when compiling with W=1
  cpufreq: brcmstb-avs: fix imbalance of cpufreq policy refcount
  PM: suspend: Add sysfs attribute to control the "sync on suspend" behavior
  PM / devfreq: Add debugfs support with devfreq_summary file
  Documentation: admin-guide: PM: Add intel_idle document
  cpuidle: arm: Enable compile testing for some of drivers
  PM-runtime: add tracepoints for usage_count changes
  cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether"
  PM: hibernate: fix spelling mistake "shapshot" -> "snapshot"
  ...
2020-01-27 11:23:54 -08:00
Linus Torvalds
0238d3c753 arm64 updates for 5.6
- New architecture features
 	* Support for Armv8.5 E0PD, which benefits KASLR in the same way as
 	  KPTI but without the overhead. This allows KPTI to be disabled on
 	  CPUs that are not affected by Meltdown, even is KASLR is enabled.
 
 	* Initial support for the Armv8.5 RNG instructions, which claim to
 	  provide access to a high bandwidth, cryptographically secure hardware
 	  random number generator. As well as exposing these to userspace, we
 	  also use them as part of the KASLR seed and to seed the crng once
 	  all CPUs have come online.
 
 	* Advertise a bunch of new instructions to userspace, including support
 	  for Data Gathering Hint, Matrix Multiply and 16-bit floating point.
 
 - Kexec
 	* Cleanups in preparation for relocating with the MMU enabled
 	* Support for loading crash dump kernels with kexec_file_load()
 
 - Perf and PMU drivers
 	* Cleanups and non-critical fixes for a couple of system PMU drivers
 
 - FPU-less (aka broken) CPU support
 	* Considerable fixes to support CPUs without the FP/SIMD extensions,
 	  including their presence in heterogeneous systems. Good luck finding
 	  a 64-bit userspace that handles this.
 
 - Modern assembly function annotations
 	* Start migrating our use of ENTRY() and ENDPROC() over to the
 	  new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended to
 	  aid debuggers
 
 - Kbuild
 	* Cleanup detection of LSE support in the assembler by introducing
 	  'as-instr'
 
 	* Remove compressed Image files when building clean targets
 
 - IP checksumming
 	* Implement optimised IPv4 checksumming routine when hardware offload
 	  is not in use. An IPv6 version is in the works, pending testing.
 
 - Hardware errata
 	* Work around Cortex-A55 erratum #1530923
 
 - Shadow call stack
 	* Work around some issues with Clang's integrated assembler not liking
 	  our perfectly reasonable assembly code
 
 	* Avoid allocating the X18 register, so that it can be used to hold the
 	  shadow call stack pointer in future
 
 - ACPI
 	* Fix ID count checking in IORT code. This may regress broken firmware
 	  that happened to work with the old implementation, in which case we'll
 	  have to revert it and try something else
 
 	* Fix DAIF corruption on return from GHES handler with pseudo-NMIs
 
 - Miscellaneous
 	* Whitelist some CPUs that are unaffected by Spectre-v2
 
 	* Reduce frequency of ASID rollover when KPTI is compiled in but
 	  inactive
 
 	* Reserve a couple of arch-specific PROT flags that are already used by
 	  Sparc and PowerPC and are planned for later use with BTI on arm64
 
 	* Preparatory cleanup of our entry assembly code in preparation for
 	  moving more of it into C later on
 
 	* Refactoring and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl4oY+IQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNNfRB/4p3vax0hqaOnLRvmJPRXF31B8oPlivnr2u
 6HCA9LkdU5IlrgaTNOJ/sQEqJAPOPCU7v49Ol0iYw0iKL1suUE7Ikui5VB6Uybqt
 YbfF5UNzfXAMs2A86TF/hzqhxw+W+lpnZX8NVTuQeAODfHEGUB1HhTLfRi9INsER
 wKEAuoZyuSUibxTFvji+DAq7nVRniXX7CM7tE385pxDisCMuu/7E5wOl+3EZYXWz
 DTGzTbHXuVFL+UFCANFEUlAtmr3dQvPFIqAwVl/CxjRJjJ7a+/G3cYLsHFPrQCjj
 qYX4kfhAeeBtqmHL7YFNWFwFs5WaT5UcQquFO665/+uCTWSJpORY
 =AIh/
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The changes are a real mixed bag this time around.

  The only scary looking one from the diffstat is the uapi change to
  asm-generic/mman-common.h, but this has been acked by Arnd and is
  actually just adding a pair of comments in an attempt to prevent
  allocation of some PROT values which tend to get used for
  arch-specific purposes. We'll be using them for Branch Target
  Identification (a CFI-like hardening feature), which is currently
  under review on the mailing list.

  New architecture features:

   - Support for Armv8.5 E0PD, which benefits KASLR in the same way as
     KPTI but without the overhead. This allows KPTI to be disabled on
     CPUs that are not affected by Meltdown, even is KASLR is enabled.

   - Initial support for the Armv8.5 RNG instructions, which claim to
     provide access to a high bandwidth, cryptographically secure
     hardware random number generator. As well as exposing these to
     userspace, we also use them as part of the KASLR seed and to seed
     the crng once all CPUs have come online.

   - Advertise a bunch of new instructions to userspace, including
     support for Data Gathering Hint, Matrix Multiply and 16-bit
     floating point.

  Kexec:

   - Cleanups in preparation for relocating with the MMU enabled

   - Support for loading crash dump kernels with kexec_file_load()

  Perf and PMU drivers:

   - Cleanups and non-critical fixes for a couple of system PMU drivers

  FPU-less (aka broken) CPU support:

   - Considerable fixes to support CPUs without the FP/SIMD extensions,
     including their presence in heterogeneous systems. Good luck
     finding a 64-bit userspace that handles this.

  Modern assembly function annotations:

   - Start migrating our use of ENTRY() and ENDPROC() over to the
     new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended
     to aid debuggers

  Kbuild:

   - Cleanup detection of LSE support in the assembler by introducing
     'as-instr'

   - Remove compressed Image files when building clean targets

  IP checksumming:

   - Implement optimised IPv4 checksumming routine when hardware offload
     is not in use. An IPv6 version is in the works, pending testing.

  Hardware errata:

   - Work around Cortex-A55 erratum #1530923

  Shadow call stack:

   - Work around some issues with Clang's integrated assembler not
     liking our perfectly reasonable assembly code

   - Avoid allocating the X18 register, so that it can be used to hold
     the shadow call stack pointer in future

  ACPI:

   - Fix ID count checking in IORT code. This may regress broken
     firmware that happened to work with the old implementation, in
     which case we'll have to revert it and try something else

   - Fix DAIF corruption on return from GHES handler with pseudo-NMIs

  Miscellaneous:

   - Whitelist some CPUs that are unaffected by Spectre-v2

   - Reduce frequency of ASID rollover when KPTI is compiled in but
     inactive

   - Reserve a couple of arch-specific PROT flags that are already used
     by Sparc and PowerPC and are planned for later use with BTI on
     arm64

   - Preparatory cleanup of our entry assembly code in preparation for
     moving more of it into C later on

   - Refactoring and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (73 commits)
  arm64: acpi: fix DAIF manipulation with pNMI
  arm64: kconfig: Fix alignment of E0PD help text
  arm64: Use v8.5-RNG entropy for KASLR seed
  arm64: Implement archrandom.h for ARMv8.5-RNG
  arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'
  arm64: entry: Avoid empty alternatives entries
  arm64: Kconfig: select HAVE_FUTEX_CMPXCHG
  arm64: csum: Fix pathological zero-length calls
  arm64: entry: cleanup sp_el0 manipulation
  arm64: entry: cleanup el0 svc handler naming
  arm64: entry: mark all entry code as notrace
  arm64: assembler: remove smp_dmb macro
  arm64: assembler: remove inherit_daif macro
  ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()
  mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use
  arm64: Use macros instead of hard-coded constants for MAIR_EL1
  arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list
  arm64: kernel: avoid x18 in __cpu_soft_restart
  arm64: kvm: stop treating register x18 as caller save
  arm64/lib: copy_page: avoid x18 register in assembler code
  ...
2020-01-27 08:58:19 -08:00
Rafael J. Wysocki
245224d1cb Merge branches 'pm-cpufreq' and 'pm-sleep'
* pm-cpufreq:
  cpufreq: loongson2_cpufreq: adjust cpufreq uses of LOONGSON_CHIPCFG
  cpufreq: brcmstb-avs: fix imbalance of cpufreq policy refcount
  cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether"
  cpufreq: s3c: fix unbalances of cpufreq policy refcount
  cpufreq: imx-cpufreq-dt: Add i.MX8MP support
  cpufreq: Use imx-cpufreq-dt for i.MX8MP's speed grading
  cpufreq: tegra186: convert to devm_platform_ioremap_resource
  cpufreq: kirkwood: convert to devm_platform_ioremap_resource
  cpufreq: CPPC: put ACPI table after using it
  cpufreq : CPPC: Break out if HiSilicon CPPC workaround is matched

* pm-sleep:
  PM: suspend: Add sysfs attribute to control the "sync on suspend" behavior
  PM: hibernate: fix spelling mistake "shapshot" -> "snapshot"
  PM: hibernate: Add more logging on hibernation failure
  PM: hibernate: improve arithmetic division in preallocate_highmem_fraction()
  PM: wakeup: Show statistics for deleted wakeup sources again
  PM: sleep: Switch to rtc_time64_to_tm()/rtc_tm_to_time64()
2020-01-27 11:29:09 +01:00
Ingo Molnar
f8a4bb6bfa Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

 - Expedited grace-period updates
 - kfree_rcu() updates
 - RCU list updates
 - Preemptible RCU updates
 - Torture-test updates
 - Miscellaneous fixes
 - Documentation updates

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-25 10:05:23 +01:00
Stephen Boyd
fd928f3e32 alarmtimer: Make alarmtimer_get_rtcdev() a stub when CONFIG_RTC_CLASS=n
The stubbed version of alarmtimer_get_rtcdev() is not exported.
so this won't work if this function is used in a module when
CONFIG_RTC_CLASS=n.

Move the stub function to the header file and make it inline so that
callers don't have to worry about linking against this symbol.

rtcdev isn't used outside of this ifdef so it's not required to be
redefined to NULL. Drop that while touching this area.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200124055849.154411-4-swboyd@chromium.org
2020-01-24 21:03:53 +01:00
Stephen Boyd
7c94caca87 alarmtimer: Use wakeup source from alarmtimer platform device
Use the wakeup source that can be associated with the 'alarmtimer'
platform device instead of registering another one by hand.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200124055849.154411-3-swboyd@chromium.org
2020-01-24 21:00:21 +01:00
Stephen Boyd
c79108bd19 alarmtimer: Make alarmtimer platform device child of RTC device
The alarmtimer_suspend() function will fail if an RTC device is on a bus
such as SPI or i2c and that RTC device registers and probes after
alarmtimer_init() registers and probes the 'alarmtimer' platform device.

This is because system wide suspend suspends devices in the reverse order
of their probe. When alarmtimer_suspend() attempts to program the RTC for a
wakeup it will try to program an RTC device on a bus that has already been
suspended.

Move the alarmtimer device registration to happen when the RTC which is
used for wakeup is registered. Register the 'alarmtimer' platform device as
a child of the RTC device too, so that it can be guaranteed that the RTC
device won't be suspended when alarmtimer_suspend() is called.

Reported-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200124055849.154411-2-swboyd@chromium.org
2020-01-24 21:00:20 +01:00
Stephen Boyd
6b088cefbe alarmtimer: Update alarmtimer_get_rtcdev() docs to reflect reality
This function doesn't do anything like this comment says when an RTC device
hasn't been chosen. It looks like we used to do something like that before
commit 8bc0dafb5c ("alarmtimers: Rework RTC device selection using class
interface") but that's long gone now. Remove this sentence to avoid
confusing the reader.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200124055849.154411-5-swboyd@chromium.org
2020-01-24 21:00:20 +01:00
Sebastian Andrzej Siewior
cb923159bb smp: Remove allocation mask from on_each_cpu_cond.*()
The allocation mask is no longer used by on_each_cpu_cond() and
on_each_cpu_cond_mask() and can be removed.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200117090137.1205765-4-bigeasy@linutronix.de
2020-01-24 20:40:09 +01:00
Sebastian Andrzej Siewior
67719ef25e smp: Add a smp_cond_func_t argument to smp_call_function_many()
on_each_cpu_cond_mask() allocates a new CPU mask. The newly allocated
mask is a subset of the provided mask based on the conditional function.

This memory allocation can be avoided by extending smp_call_function_many()
with the conditional function and performing the remote function call based
on the mask and the conditional function.

Rename smp_call_function_many() to smp_call_function_many_cond() and add
the smp_cond_func_t argument. If smp_cond_func_t is provided then it is
used before invoking the function.  Provide smp_call_function_many() with
cond_func set to NULL.  Let on_each_cpu_cond_mask() use
smp_call_function_many_cond().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200117090137.1205765-3-bigeasy@linutronix.de
2020-01-24 20:40:09 +01:00
Sebastian Andrzej Siewior
5671d814db smp: Use smp_cond_func_t as type for the conditional function
Use a typdef for the conditional function instead defining it each time in
the function prototype.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200117090137.1205765-2-bigeasy@linutronix.de
2020-01-24 20:40:08 +01:00
Thomas Gleixner
43ee74487b irqchip updates for Linux 5.6:
- Conversion of the SiFive PLIC to hierarchical domains
 - New SiFive GPIO irqchip driver
 - New Aspeed SCI irqchip driver
 - New NXP INTMUX irqchip driver
 - Additional support for the Meson A1 GPIO irqchip
 - First part of the GICv4.1 support
 - Assorted fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl4rKn0PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDHVoQALTTYQol+5Gz5pLxnROYEAdFjzrVrCarsK/b
 Cl4uVa5efOTCItSO3L9cEo1zoB++aJxPSOaKqX9hryPwPLTZzDiHYtVQ870tZB+k
 233cTvtT8+iw7/JPKnA8706TYDk1FUkJQ87V0gMLrnVH00dmJ8LvjW1bCdXV8iIa
 Ln78XIF+Ass+qJjSpCDRaOukDm6Qs+sZKAY0+nLXM8Ge564fdX7bPkDGN4tq9DLz
 74ZxY6s3rI5FoPceS270dtDf4Ib8gH+T8Bqd5AYSj/tcRE23s4muGb/O3Kez5Oko
 eEiuSadpep/kPQhgZlpX0tJgtEqHNfi6K8AIMscQQDFmJyuCqgR9/5as+UKX1V0M
 kPlOQtYCAVZmTnlOP6rA2V3RUFurVkFPkwUGzVYlCYxxrARvsH+vPxYqAPH/EEFq
 lGUo+2Z7Z+1ubPsnR8WKs8heC6qJidegGUtKoKYWroJl+tiuT6EtCP3J0QZPhdXT
 lVOBVnR6DHNIURuAEmag/eNYsBIj7PdmlByoMkBFn9LPE7Fn+OExJgbyVsu1IaTe
 AcUHmXR9QpcAKnDLmNSqFvhWsLo8CJ607rH3tL8vqnfijOHyt4AvKeE1R4QSavPx
 0F3FFNdo7Y1FAlJ9Ibw0gLvoIa6uP6FpdI3rht0iRaOZJlnDTbn+B8UayY0Ajvyp
 aGIjx7tY
 =8iz1
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

- Conversion of the SiFive PLIC to hierarchical domains
- New SiFive GPIO irqchip driver
- New Aspeed SCI irqchip driver
- New NXP INTMUX irqchip driver
- Additional support for the Meson A1 GPIO irqchip
- First part of the GICv4.1 support
- Assorted fixes
2020-01-24 20:08:51 +01:00
Paul E. McKenney
0e247386d9 Merge branches 'doc.2019.12.10a', 'exp.2019.12.09a', 'fixes.2020.01.24a', 'kfree_rcu.2020.01.24a', 'list.2020.01.10a', 'preempt.2020.01.24a' and 'torture.2019.12.09a' into HEAD
doc.2019.12.10a: Documentations updates
exp.2019.12.09a: Expedited grace-period updates
fixes.2020.01.24a: Miscellaneous fixes
kfree_rcu.2020.01.24a: Batch kfree_rcu() work
list.2020.01.10a: RCU-protected-list updates
preempt.2020.01.24a: Preemptible RCU updates
torture.2019.12.09a: Torture-test updates
2020-01-24 10:37:27 -08:00
Paul E. McKenney
f6105fc2a9 rcu: Remove unused stop-machine #include
Long ago, RCU used the stop-machine mechanism to implement expedited
grace periods, but no longer does so.  This commit therefore removes
the no-longer-needed #includes of linux/stop_machine.h.

Link: https://lwn.net/Articles/805317/
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:52 -08:00
Paul E. McKenney
844a378de3 srcu: Apply *_ONCE() to ->srcu_last_gp_end
The ->srcu_last_gp_end field is accessed from any CPU at any time
by synchronize_srcu(), so non-initialization references need to use
READ_ONCE() and WRITE_ONCE().  This commit therefore makes that change.

Reported-by: syzbot+08f3e9d26e5541e1ecf2@syzkaller.appspotmail.com
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:51 -08:00
Paul E. McKenney
7441e7661d rcu: Switch force_qs_rnp() to for_each_leaf_node_cpu_mask()
Currently, force_qs_rnp() uses a for_each_leaf_node_possible_cpu()
loop containing a check of the current CPU's bit in ->qsmask.
This works, but this commit saves three lines by instead using
for_each_leaf_node_cpu_mask(), which combines the functionality of
for_each_leaf_node_possible_cpu() and leaf_node_cpu_bit().  This commit
also replaces the use of the local variable "bit" with rdp->grpmask.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:51 -08:00
Ben Dooks
e1350e8e0e rcu: Move rcu_{expedited,normal} definitions into rcupdate.h
This commit moves the rcu_{expedited,normal} definitions from
kernel/rcu/update.c to include/linux/rcupdate.h to make sure they are
in sync, and also to avoid the following warning from sparse:

kernel/ksysfs.c:150:5: warning: symbol 'rcu_expedited' was not declared. Should it be static?
kernel/ksysfs.c:167:5: warning: symbol 'rcu_normal' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:50 -08:00
Lai Jiangshan
e2167b38c8 rcu: Move gp_state_names[] and gp_state_getname() to tree_stall.h
Only tree_stall.h needs to get name from GP state, so this commit
moves the gp_state_names[] array and the gp_state_getname()
from kernel/rcu/tree.h and kernel/rcu/tree.c, respectively, to
kernel/rcu/tree_stall.h.  While moving gp_state_names[], this commit
uses the GCC syntax to ensure that the right string is associated with
the right CPP macro.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:45 -08:00
Lai Jiangshan
4778339df0 rcu: Remove the declaration of call_rcu() in tree.h
The call_rcu() function is an external RCU API that is declared in
include/linux/rcupdate.h.  There is thus no point in redeclaring it
in kernel/rcu/tree.h, so this commit removes that redundant declaration.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:38 -08:00
Lai Jiangshan
2488a5e695 rcu: Fix tracepoint tracking RCU CPU kthread utilization
In the call to trace_rcu_utilization() at the start of the loop in
rcu_cpu_kthread(), "rcu_wait" is incorrect, plus this trace event needs
to be hoisted above the loop to balance with either the "rcu_wait" or
"rcu_yield", depending on how the loop exits.  This commit therefore
makes these changes.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:31 -08:00
Lai Jiangshan
822175e729 rcu: Fix harmless omission of "CONFIG_" from #if condition
The C preprocessor macros SRCU and TINY_RCU should instead be CONFIG_SRCU
and CONFIG_TINY_RCU, respectively in the #f in kernel/rcu/rcu.h. But
there is no harm when "TINY_RCU" is wrongly used, which are always
non-defined, which makes "!defined(TINY_RCU)" always true, which means
the code block is always included, and the included code block doesn't
cause any compilation error so far in CONFIG_TINY_RCU builds.  It is
also the reason this change should not be taken in -stable.

This commit adds the needed "CONFIG_" prefix to both macros.

Not for -stable.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:33:13 -08:00
Paul E. McKenney
5b14557b07 rcu: Avoid tick_dep_set_cpu() misordering
In the current code, rcu_nmi_enter_common() might decide to turn on
the tick using tick_dep_set_cpu(), but be delayed just before doing so.
Then the grace-period kthread might notice that the CPU in question had
in fact gone through a quiescent state, thus turning off the tick using
tick_dep_clear_cpu().  The later invocation of tick_dep_set_cpu() would
then incorrectly leave the tick on.

This commit therefore enlists the aid of the leaf rcu_node structure's
->lock to ensure that decisions to enable or disable the tick are
carried out before they can be reversed.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:27:33 -08:00
Lai Jiangshan
77339e61aa rcu: Provide wrappers for uses of ->rcu_read_lock_nesting
This commit provides wrapper functions for uses of ->rcu_read_lock_nesting
to improve readability and to ease future changes to support inlining
of __rcu_read_lock() and __rcu_read_unlock().

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:27:33 -08:00
Paul E. McKenney
c51f83c315 rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special()
The rcu_node structure's ->expmask field is updated only when holding the
->lock, but is also accessed locklessly.  This means that all ->expmask
updates must use WRITE_ONCE() and all reads carried out without holding
->lock must use READ_ONCE().  This commit therefore changes the lockless
->expmask read in rcu_read_unlock_special() to use READ_ONCE().

Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Marco Elver <elver@google.com>
2020-01-24 10:27:33 -08:00
Lai Jiangshan
3717e1e9f2 rcu: Clear ->rcu_read_unlock_special only once
In rcu_preempt_deferred_qs_irqrestore(), ->rcu_read_unlock_special is
cleared one piece at a time.  Given that the "if" statements in this
function use the copy in "special", this commit removes the clearing
of the individual pieces in favor of clearing ->rcu_read_unlock_special
in one go just after it has been determined to be non-zero.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:27:33 -08:00
Lai Jiangshan
2eeba5838f rcu: Clear .exp_hint only when deferred quiescent state has been reported
Currently, the .exp_hint flag is cleared in rcu_read_unlock_special(),
which works, but which can also prevent subsequent rcu_read_unlock() calls
from helping expedite the quiescent state needed by an ongoing expedited
RCU grace period.  This commit therefore defers clearing of .exp_hint
from rcu_read_unlock_special() to rcu_preempt_deferred_qs_irqrestore(),
thus ensuring that intervening calls to rcu_read_unlock() have a chance
to help end the expedited grace period.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:27:33 -08:00
Lai Jiangshan
c130d2dc93 rcu: Rename some instance of CONFIG_PREEMPTION to CONFIG_PREEMPT_RCU
CONFIG_PREEMPTION and CONFIG_PREEMPT_RCU are always identical,
but some code depends on CONFIG_PREEMPTION to access to
rcu_preempt functionality. This patch changes CONFIG_PREEMPTION
to CONFIG_PREEMPT_RCU in these cases.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:26:28 -08:00
Joel Fernandes (Google)
189a6883dc rcu: Remove kfree_call_rcu_nobatch()
Now that the kfree_rcu() special-casing has been removed from tree RCU,
this commit removes kfree_call_rcu_nobatch() since it is no longer needed.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:24:31 -08:00
Joel Fernandes (Google)
77a40f9703 rcu: Remove kfree_rcu() special casing and lazy-callback handling
This commit removes kfree_rcu() special-casing and the lazy-callback
handling from Tree RCU.  It moves some of this special casing to Tiny RCU,
the removal of which will be the subject of later commits.

This results in a nice negative delta.

Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Add slab.h #include, thanks to kbuild test robot <lkp@intel.com>. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:24:31 -08:00
Joel Fernandes (Google)
e99637becb rcu: Add support for debug_objects debugging for kfree_rcu()
This commit applies RCU's debug_objects debugging to the new batched
kfree_rcu() implementations.  The object is queued at the kfree_rcu()
call and dequeued during reclaim.

Tested that enabling CONFIG_DEBUG_OBJECTS_RCU_HEAD successfully detects
double kfree_rcu() calls.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Fix IRQ per kbuild test robot <lkp@intel.com> feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:24:31 -08:00
Joel Fernandes (Google)
0392bebebf rcu: Add multiple in-flight batches of kfree_rcu() work
During testing, it was observed that amount of memory consumed due
kfree_rcu() batching is 300-400MB. Previously we had only a single
head_free pointer pointing to the list of rcu_head(s) that are to be
freed after a grace period. Until this list is drained, we cannot queue
any more objects on it since such objects may not be ready to be
reclaimed when the worker thread eventually gets to drainin g the
head_free list.

We can do better by maintaining multiple lists as done by this patch.
Testing shows that memory consumption came down by around 100-150MB with
just adding another list. Adding more than 1 additional list did not
show any improvement.

Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Code style and initialization handling. ]
[ paulmck: Fix field name, reported by kbuild test robot <lkp@intel.com>. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:24:31 -08:00
Joel Fernandes
569d767087 rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
Because the ->monitor_todo field is always protected by krcp->lock,
this commit downgrades from xchg() to non-atomic unmarked assignment
statements.

Signed-off-by: Joel Fernandes <joel@joelfernandes.org>
[ paulmck: Update to include early-boot kick code. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:24:31 -08:00
Joel Fernandes (Google)
e6e78b004f rcuperf: Add kfree_rcu() performance Tests
This test runs kfree_rcu() in a loop to measure performance of the new
kfree_rcu() batching functionality.

The following table shows results when booting with arguments:
rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_rcu_test=1 rcuperf.kfree_no_batch=X

rcuperf.kfree_no_batch=X    # Grace Periods	Test Duration (s)
  X=1 (old behavior)              9133                 11.5
  X=0 (new behavior)              1732                 12.5

On a 16 CPU system with the above boot parameters, we see that the total
number of grace periods that elapse during the test drops from 9133 when
not batching to 1732 when batching (a 5X improvement). The kfree_rcu()
flood itself slows down a bit when batching, though, as shown.

Note that the active memory consumption during the kfree_rcu() flood
does increase to around 200-250MB due to the batching (from around 50MB
without batching). However, this memory consumption is relatively
constant. In other words, the system is able to keep up with the
kfree_rcu() load. The memory consumption comes down considerably if
KFREE_DRAIN_JIFFIES is increased from HZ/50 to HZ/80. A later patch will
reduce memory consumption further by using multiple lists.

Also, when running the test, please disable CONFIG_DEBUG_PREEMPT and
CONFIG_PROVE_RCU for realistic comparisons with/without batching.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:24:31 -08:00
Byungchul Park
a35d16905e rcu: Add basic support for kfree_rcu() batching
Recently a discussion about stability and performance of a system
involving a high rate of kfree_rcu() calls surfaced on the list [1]
which led to another discussion how to prepare for this situation.

This patch adds basic batching support for kfree_rcu(). It is "basic"
because we do none of the slab management, dynamic allocation, code
moving or any of the other things, some of which previous attempts did
[2]. These fancier improvements can be follow-up patches and there are
different ideas being discussed in those regards. This is an effort to
start simple, and build up from there. In the future, an extension to
use kfree_bulk and possibly per-slab batching could be done to further
improve performance due to cache-locality and slab-specific bulk free
optimizations. By using an array of pointers, the worker thread
processing the work would need to read lesser data since it does not
need to deal with large rcu_head(s) any longer.

Torture tests follow in the next patch and show improvements of around
5x reduction in number of  grace periods on a 16 CPU system. More
details and test data are in that patch.

There is an implication with rcu_barrier() with this patch. Since the
kfree_rcu() calls can be batched, and may not be handed yet to the RCU
machinery in fact, the monitor may not have even run yet to do the
queue_rcu_work(), there seems no easy way of implementing rcu_barrier()
to wait for those kfree_rcu()s that are already made. So this means a
kfree_rcu() followed by an rcu_barrier() does not imply that memory will
be freed once rcu_barrier() returns.

Another implication is higher active memory usage (although not
run-away..) until the kfree_rcu() flooding ends, in comparison to
without batching. More details about this are in the second patch which
adds an rcuperf test.

Finally, in the near future we will get rid of kfree_rcu() special casing
within RCU such as in rcu_do_batch and switch everything to just
batching. Currently we don't do that since timer subsystem is not yet up
and we cannot schedule the kfree_rcu() monitor as the timer subsystem's
lock are not initialized. That would also mean getting rid of
kfree_call_rcu_nobatch() entirely.

[1] http://lore.kernel.org/lkml/20190723035725-mutt-send-email-mst@kernel.org
[2] https://lkml.org/lkml/2017/12/19/824

Cc: kernel-team@android.com
Cc: kernel-team@lge.com
Co-developed-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Applied 0day and Paul Walmsley feedback on ->monitor_todo. ]
[ paulmck: Make it work during early boot. ]
[ paulmck: Add a crude early boot self-test. ]
[ paulmck: Style adjustments and experimental docbook structure header. ]
Link: https://lore.kernel.org/lkml/alpine.DEB.2.21.9999.1908161931110.32497@viisi.sifive.com/T/#me9956f66cb611b95d26ae92700e1d901f46e8c59
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24 10:17:03 -08:00
Linus Torvalds
34597c85be Various tracing fixes:
- Fix a function comparison warning for a xen trace event macro
  - Fix a double perf_event linking to a trace_uprobe_filter for multiple events
  - Fix suspicious RCU warnings in trace event code for using
     list_for_each_entry_rcu() when the "_rcu" portion wasn't needed.
  - Fix a bug in the histogram code when using the same variable
  - Fix a NULL pointer dereference when tracefs lockdown enabled and calling
     trace_set_default_clock()
 
 This v2 version contains:
 
  - A fix to a bug found with the double perf_event linking patch
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXinakBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qhNZAQCi86p9eW3f3w7hM2hZcirC+mQKVZgp
 2rO4zIAK5V6G7gEAh6I7VZa50a6AE647ZjryE7ufTRUhmSFMWoG0kcJ7OAk=
 =/J9n
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.5-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Various tracing fixes:

   - Fix a function comparison warning for a xen trace event macro

   - Fix a double perf_event linking to a trace_uprobe_filter for
     multiple events

   - Fix suspicious RCU warnings in trace event code for using
     list_for_each_entry_rcu() when the "_rcu" portion wasn't needed.

   - Fix a bug in the histogram code when using the same variable

   - Fix a NULL pointer dereference when tracefs lockdown enabled and
     calling trace_set_default_clock()

   - A fix to a bug found with the double perf_event linking patch"

* tag 'trace-v5.5-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/uprobe: Fix to make trace_uprobe_filter alignment safe
  tracing: Do not set trace clock if tracefs lockdown is in effect
  tracing: Fix histogram code when expression has same var as value
  tracing: trigger: Replace unneeded RCU-list traversals
  tracing/uprobe: Fix double perf_event linking on multiprobe uprobe
  tracing: xen: Ordered comparison of function pointers
2020-01-23 11:23:37 -08:00
Linus Torvalds
3a83c8c81c Power management fix for 5.5-rc8
Prevent the kernel from crashing during resume from hibernation
 if free pages contain leftover data from the restore kernel and
 init_on_free is set (Alexander Potapenko).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl4psjQSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxpOcP/1UTGUr+VdGfBBjG5WlgCY0Jrd50Y78b
 RoVNDR/NvSVSuIs44AgrnfyQiz2Y8jG6qY1iSAbIXmFl37/4+kGhkNmd6pV/xFUc
 TdZZotFwFlRQjmeQxxH0kNXuAY6nJ2RwELWrjXqM8PuNjNKIEpfS+0fSaWexHqIm
 MDArxcDHkvZU5SnnRQM+LkT/EmbEheB7tgm7vGGqMLsSKc0gUsBmVCURe/lLAH5o
 EUKX4FI2jCy+LlmSdZ3EDjf1cstm3YXLiegTLSq1Jh3mFHXkFTwJMmidiz21qXJh
 Hc4r3iG0NZ37J8HXpwuq++KlhvNbhHJz+ZgC1IYls16RNYh5mUzxtMHWdSyyqlrW
 +z8gBVUyeJUYos5Kjb/NKSt43gnz7Uhy0UVbQXD66hgajXe71CQZQq/D5CMeTdJL
 jWNaeGYnhskz3IW2vnrs9Ucf6RHHWezXk51kVsyJXadiLhTdOv7DKahDKVwC/Hvf
 kyN1W0F5PZpF50yYmnhJgqDfxkGBNKpwXxTAGk6X0WQFaWeh/2FkX045UdJzBbHu
 fa1taTM/5RfPlbWq0wLPeHHSP4M2I0ndeWXZk88vUwwMfm9Wo+FNBhgs/EZfeuKD
 16sVMsX0r7R1bG+hEj+mvNeLWqfgz7MpGludCkV1dHIDkn2esxx6JqaWxR/UnLHb
 D3fZM/cKGdpY
 =kDFG
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Prevent the kernel from crashing during resume from hibernation if
  free pages contain leftover data from the restore kernel and
  init_on_free is set (Alexander Potapenko)"

* tag 'pm-5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: hibernate: fix crashes with init_on_free=1
2020-01-23 11:10:21 -08:00
Rafael J. Wysocki
322e929d19 Merge back new material related to system-wide PM for v5.6. 2020-01-23 16:00:56 +01:00