linux

Author	SHA1	Message	Date
Linus Torvalds	336622e9fc	NOHZ full updates: - Remove TIF_NOHZ from 3 architectures These architectures use a static key to decide whether context tracking needs to be invoked and the TIF_NOHZ flag just causes a pointless slowpath execution for nothing. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6B+bITHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoZjpD/9PkXE/zQoVmPLhOOcEBXB4i0rQQV41 mR8F83aswch+qtT1g7A00G5j49CWkLh/hj5PX7ajS9nSTQCHOQ9jdZuxPrjW8CGZ gMHCyd5o9C98sKOORylR2nuCKhVdOq0/HleRjBBDsqcO0T5KlhVPUrtuJ878kX8d 1SnoZnZMx+Ro0+4+Ehp39CmZJ0pV6o5ypT469esa2MB1xw389AQCmLt4rk99FNMo LDbKAB+7XBwNAu/rqD0hIv7YyvaSlcdlWBAXBLeCrwVIKQG3VfT9CpgwTtGoNFhY 9KBkzr0z+lvHS9eKWyWzpXYrgVU1u28gUVvpaavv+Ma5V8STqNunoMBs7hKanJqV mPh+4ABACtFieKlwkj2PwUrGEgH+y/SAfStliOFsimVz/w2udC0S777/EjjzfKaN NS13mP19s5/P1q3y/6BSrOxYD0inicROO+UfetHNPOgMePY+Gp/xzluefPnhTagX CnJxndA3Fbjh9rXFbSZ5TMlf97kTxVVJE+qtrh5Upw1AWpo/qvkLsIFsamgyW2jR 7t3MbHzKYnLkUJlwOLPJimvZeN4hZOx05ra/RZOkVaxri7xtVsDCkaEvhgEqLWYj Gbt2mGnNccawwN0bVPd2hgkKmUBqO8u5llhQcM2BBG4CJgZMaB8LjIkS+F6FsSME xMnY+tS3c7Q8TQ== =0lHP -----END PGP SIGNATURE----- Merge tag 'timers-nohz-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull NOHZ update from Thomas Gleixner: "Remove TIF_NOHZ from three architectures These architectures use a static key to decide whether context tracking needs to be invoked and the TIF_NOHZ flag just causes a pointless slowpath execution for nothing" * tag 'timers-nohz-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64: Remove TIF_NOHZ arm: Remove TIF_NOHZ x86: Remove TIF_NOHZ context-tracking: Introduce CONFIG_HAVE_TIF_NOHZ x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY	2020-03-30 18:29:05 -07:00
Linus Torvalds	642e53ead6	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle are: - Various NUMA scheduling updates: harmonize the load-balancer and NUMA placement logic to not work against each other. The intended result is better locality, better utilization and fewer migrations. - Introduce Thermal Pressure tracking and optimizations, to improve task placement on thermally overloaded systems. - Implement frequency invariant scheduler accounting on (some) x86 CPUs. This is done by observing and sampling the 'recent' CPU frequency average at ~tick boundaries. The CPU provides this data via the APERF/MPERF MSRs. This hopefully makes our capacity estimates more precise and keeps tasks on the same CPU better even if it might seem overloaded at a lower momentary frequency. (As usual, turbo mode is a complication that we resolve by observing the maximum frequency and renormalizing to it.) - Add asymmetric CPU capacity wakeup scan to improve capacity utilization on asymmetric topologies. (big.LITTLE systems) - PSI fixes and optimizations. - RT scheduling capacity awareness fixes & improvements. - Optimize the CONFIG_RT_GROUP_SCHED constraints code. - Misc fixes, cleanups and optimizations - see the changelog for details" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits) threads: Update PID limit comment according to futex UAPI change sched/fair: Fix condition of avg_load calculation sched/rt: cpupri_find: Trigger a full search as fallback kthread: Do not preempt current task if it is going to call schedule() sched/fair: Improve spreading of utilization sched: Avoid scale real weight down to zero psi: Move PF_MEMSTALL out of task->flags MAINTAINERS: Add maintenance information for psi psi: Optimize switching tasks inside shared cgroups psi: Fix cpu.pressure for cpu.max and competing cgroups sched/core: Distribute tasks within affinity masks sched/fair: Fix enqueue_task_fair warning thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code sched/rt: Remove unnecessary push for unfit tasks sched/rt: Allow pulling unfitting task sched/rt: Optimize cpupri_find() on non-heterogenous systems sched/rt: Re-instate old behavior in select_task_rq_rt() sched/rt: cpupri_find: Implement fallback mechanism for !fit case sched/fair: Fix reordering of enqueue/dequeue_task_fair() sched/fair: Fix runnable_avg for throttled cfs ...	2020-03-30 17:01:51 -07:00
Linus Torvalds	9b82f05f86	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main changes in this cycle were: Kernel side changes: - A couple of x86/cpu cleanups and changes were grandfathered in due to patch dependencies. These clean up the set of CPU model/family matching macros with a consistent namespace and C99 initializer style. - A bunch of updates to various low level PMU drivers: * AMD Family 19h L3 uncore PMU * Intel Tiger Lake uncore support * misc fixes to LBR TOS sampling - optprobe fixes - perf/cgroup: optimize cgroup event sched-in processing - misc cleanups and fixes Tooling side changes are to: - perf {annotate,expr,record,report,stat,test} - perl scripting - libapi, libperf and libtraceevent - vendor events on Intel and S390, ARM cs-etm - Intel PT updates - Documentation changes and updates to core facilities - misc cleanups, fixes and other enhancements" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (89 commits) cpufreq/intel_pstate: Fix wrong macro conversion x86/cpu: Cleanup the now unused CPU match macros hwrng: via_rng: Convert to new X86 CPU match macros crypto: Convert to new CPU match macros ASoC: Intel: Convert to new X86 CPU match macros powercap/intel_rapl: Convert to new X86 CPU match macros PCI: intel-mid: Convert to new X86 CPU match macros mmc: sdhci-acpi: Convert to new X86 CPU match macros intel_idle: Convert to new X86 CPU match macros extcon: axp288: Convert to new X86 CPU match macros thermal: Convert to new X86 CPU match macros hwmon: Convert to new X86 CPU match macros platform/x86: Convert to new CPU match macros EDAC: Convert to new X86 CPU match macros cpufreq: Convert to new X86 CPU match macros ACPI: Convert to new X86 CPU match macros x86/platform: Convert to new CPU match macros x86/kernel: Convert to new CPU match macros x86/kvm: Convert to new CPU match macros x86/perf/events: Convert to new CPU match macros ...	2020-03-30 16:40:08 -07:00
Linus Torvalds	4b9fd8a829	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Continued user-access cleanups in the futex code. - percpu-rwsem rewrite that uses its own waitqueue and atomic_t instead of an embedded rwsem. This addresses a couple of weaknesses, but the primary motivation was complications on the -rt kernel. - Introduce raw lock nesting detection on lockdep (CONFIG_PROVE_RAW_LOCK_NESTING=y), document the raw_lock vs. normal lock differences. This too originates from -rt. - Reuse lockdep zapped chain_hlocks entries, to conserve RAM footprint on distro-ish kernels running into the "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!" depletion of the lockdep chain-entries pool. - Misc cleanups, smaller fixes and enhancements - see the changelog for details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t Documentation/locking/locktypes: Minor copy editor fixes Documentation/locking/locktypes: Further clarifications and wordsmithing m68knommu: Remove mm.h include from uaccess_no.h x86: get rid of user_atomic_cmpxchg_inatomic() generic arch_futex_atomic_op_inuser() doesn't need access_ok() x86: don't reload after cmpxchg in unsafe_atomic_op2() loop x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end() objtool: whitelist __sanitizer_cov_trace_switch() [parisc, s390, sparc64] no need for access_ok() in futex handling sh: no need of access_ok() in arch_futex_atomic_op_inuser() futex: arch_futex_atomic_op_inuser() calling conventions change completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() lockdep: Add posixtimer context tracing bits lockdep: Annotate irq_work lockdep: Add hrtimer context tracing bits lockdep: Introduce wait-type checks completion: Use simple wait queues sched/swait: Prepare usage in completions ...	2020-03-30 16:17:15 -07:00
Linus Torvalds	a776c270a0	Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The EFI changes in this cycle are much larger than usual, for two (positive) reasons: - The GRUB project is showing signs of life again, resulting in the introduction of the generic Linux/UEFI boot protocol, instead of x86 specific hacks which are increasingly difficult to maintain. There's hope that all future extensions will now go through that boot protocol. - Preparatory work for RISC-V EFI support. The main changes are: - Boot time GDT handling changes - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of 'struct efi', and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. - Changes to load device firmware from EFI boot service memory regions - Various documentation updates and minor code cleanups and fixes" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) efi/libstub/arm: Fix spurious message that an initrd was loaded efi/libstub/arm64: Avoid image_base value from efi_loaded_image partitions/efi: Fix partition name parsing in GUID partition entry efi/x86: Fix cast of image argument efi/libstub/x86: Use ULONG_MAX as upper bound for all allocations efi: Fix a mistype in comments mentioning efivar_entry_iter_begin() efi/libstub: Avoid linking libstub/lib-ksyms.o into vmlinux efi/x86: Preserve %ebx correctly in efi_set_virtual_address_map() efi/x86: Ignore the memory attributes table on i386 efi/x86: Don't relocate the kernel unless necessary efi/x86: Remove extra headroom for setup block efi/x86: Add kernel preferred address to PE header efi/x86: Decompress at start of PE image load address x86/boot/compressed/32: Save the output address instead of recalculating it efi/libstub/x86: Deal with exit() boot service returning x86/boot: Use unsigned comparison for addresses efi/x86: Avoid using code32_start efi/x86: Make efi32_pe_entry() more readable efi/x86: Respect 32-bit ABI in efi32_pe_entry() efi/x86: Annotate the LOADED_IMAGE_PROTOCOL_GUID with SYM_DATA ...	2020-03-30 16:13:08 -07:00
Linus Torvalds	ff7b862a4c	* Do not report spurious MCEs on some Intel platforms caused by errata; by Prarit Bhargava. * Change dev-mcelog's hardcoded limit of 32 error records to a dynamic one, controlled by the number of logical CPUs, by Tony Luck. * Add support for the processor identification number (PPIN) on AMD, by Wei Huang. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl6BseQACgkQEsHwGGHe VUqIMg/+KtZsFOHRKZD1dc0Jyo8O0BTzqMIif5J7AzRWv6DPLzfEFBjGFmVY10gN aovhRIF1TrUI8Em5as4FlczH8l328n1ZQhhy6YoCHcrT03LsKHXE46bcvm5msj9n 0s0uZyDei6ly4k6hnNn5NPMjlkpNKS4/A1dkT3Ir25zlS+3Agds4nj5iNzFfOE19 67bFSVw+KuEt4iihfX/uT0HtmcW5T5byDlwrxgMUC3s0EzMLIx4y+hqROzrJfIau NI3edpD0olhfkT9vz5NyZI7hNVAUOoWfYhoxZEJlAxjC+0MRKwR2A539YGsqzgJ9 kFN5h6400xDmG5C5FUVULAEHG8O/AV+0AzMoH0c4xamalB64CJe6BehYJggFbyXB bH9bSZKasesZUSTP+v92dOrMK2ZtJnvhU5hhEDYbtRL4ERyIb/q9/AsJfpb299HJ JD1t4lMhURYr5qu/nck48yVnsHw0yqPju1qRDxqkbmRCkKNDi2t1ph7XUb7okSba AekWUomTliTm83rsX/lH6OJQ1uCtM7QOp6YULr8Zjb4TJcSAfuEsbAcnulUSrxan hreIKqC2A2RMpRVnX9IflKDHAGNWmT5Ag6tLpQ0/TfeaazxT2gdEw8YS4EU18cq6 mMiJyIKmH2nGT7Mf65A0Lg0uJXFPFrtnKfFoSlb0kDsGlx3PEic= =3/4h -----END PGP SIGNATURE----- Merge tag 'ras_updates_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Do not report spurious MCEs on some Intel platforms caused by errata; by Prarit Bhargava. - Change dev-mcelog's hardcoded limit of 32 error records to a dynamic one, controlled by the number of logical CPUs, by Tony Luck. - Add support for the processor identification number (PPIN) on AMD, by Wei Huang. * tag 'ras_updates_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/amd: Add PPIN support for AMD MCE x86/mce/dev-mcelog: Dynamically allocate space for machine check records x86/mce: Do not log spurious corrected mce errors	2020-03-30 13:17:50 -07:00
Thomas Gleixner	cf226c42b2	Merge branch 'uaccess.futex' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into locking/core Pull uaccess futex cleanups for Al Viro: Consolidate access_ok() usage and the futex uaccess function zoo.	2020-03-28 11:59:24 +01:00
Thomas Gleixner	a215032725	Merge branch 'next.uaccess-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into x86/cleanups Pull uaccess cleanups from Al Viro: Consolidate the user access areas and get rid of uaccess_try(), user_ex() and other warts.	2020-03-28 11:57:02 +01:00
Al Viro	f5544ba712	x86: get rid of user_atomic_cmpxchg_inatomic() Only one user left; the thing had been made polymorphic back in 2013 for the sake of MPX. No point keeping it now that MPX is gone. Convert futex_atomic_cmpxchg_inatomic() to user_access_{begin,end}() while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-27 23:58:55 -04:00
Al Viro	8aef36dacb	x86: don't reload after cmpxchg in unsafe_atomic_op2() loop lock cmpxchg leaves the current value in eax; no need to reload it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-27 23:58:54 -04:00
Al Viro	0ec33c0171	x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end() Lift stac/clac pairs from __futex_atomic_op{1,2} into arch_futex_atomic_op_inuser(), fold them with access_ok() in there. The switch in arch_futex_atomic_op_inuser() is what has required the previous (objtool) commit... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-27 23:58:53 -04:00
Al Viro	a08971e948	futex: arch_futex_atomic_op_inuser() calling conventions change Move access_ok() in and pagefault_enable()/pagefault_disable() out. Mechanical conversion only - some instances don't really need a separate access_ok() at all (e.g. the ones only using get_user()/put_user(), or architectures where access_ok() is always true); we'll deal with that in followups. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-27 23:58:51 -04:00
Benjamin Thiel	5bacdc0982	x86/mm/set_memory: Fix -Wmissing-prototypes warnings Add missing includes and move prototypes into the header set_memory.h in order to fix -Wmissing-prototypes warnings. [ bp: Add ifdeffery around arch_invalidate_pmem() ] Signed-off-by: Benjamin Thiel <b.thiel@posteo.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200320145028.6013-1-b.thiel@posteo.de	2020-03-27 11:26:06 +01:00
Benjamin Thiel	01bd18624d	x86/platform/uv: Add a missing prototype for uv_bau_message_interrupt() ... in order to fix a -Wmissing-prototypes warning: arch/x86/platform/uv/tlb_uv.c:1275:6: warning: no previous prototype for ‘uv_bau_message_interrupt’ [-Wmissing-prototypes] \ void uv_bau_message_interrupt(struct pt_regs *regs) Signed-off-by: Benjamin Thiel <b.thiel@posteo.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200327072621.2255-1-b.thiel@posteo.de	2020-03-27 10:54:52 +01:00
Al Viro	cf122cfba5	kill uaccess_try() finally Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-26 15:02:14 -04:00
Ingo Molnar	629b3df7ec	Merge branch 'x86/cpu' into perf/core, to resolve conflict Conflicts: arch/x86/events/intel/uncore.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-03-25 15:20:44 +01:00
Brian Gerst	290a4474d0	x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS Add missing semicolon. Fixes: `a74d187c2d` ("x86/entry: Refactor SYS_NI macros") Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200324143520.898733-1-brgerst@gmail.com	2020-03-25 10:06:20 +01:00
Thomas Gleixner	1826d56bce	x86/cpu: Cleanup the now unused CPU match macros No more users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/20200320131510.900226233@linutronix.de	2020-03-24 21:37:23 +01:00
Thomas Gleixner	20d437447c	x86/cpu: Add consistent CPU match macros Finding all places which build x86_cpu_id match tables is tedious and the logic is hidden in lots of differently named macro wrappers. Most of these initializer macros use plain C89 initializers which rely on the ordering of the struct members. So new members could only be added at the end of the struct, but that's ugly as hell and C99 initializers are really the right thing to use. Provide a set of macros which: - Have a proper naming scheme, starting with X86_MATCH_ - Use C99 initializers The set of provided macros are all subsets of the base macro X86_MATCH_VENDOR_FAM_MODEL_FEATURE() which allows to supply all possible selection criteria: vendor, family, model, feature The other macros shorten this to avoid typing all arguments when they are not needed and would require one of the _ANY constants. They have been created due to the requirements of the existing usage sites. Also add a few model constants for Centaur CPUs and QUARK. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/20200320131508.826011988@linutronix.de	2020-03-24 21:17:50 +01:00
Thomas Gleixner	ba5bade4cc	x86/devicetable: Move x86 specific macro out of generic code There is no reason that this gunk is in a generic header file. The wildcard defines need to stay as they are required by file2alias. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/20200320131508.736205164@linutronix.de	2020-03-24 21:02:47 +01:00
Vincenzo Frascino	1c1a18b00d	um: Fix header inclusion User Mode Linux is a flavor of x86 that from the vDSO prospective always falls back on system calls. This implies that it does not require any of the unified vDSO definitions and their inclusion causes side effects like this: In file included from include/vdso/processor.h:10:0, from include/vdso/datapage.h:17, from arch/x86/include/asm/vgtod.h:7, from arch/x86/um/../kernel/sys_ia32.c:49: >> arch/x86/include/asm/vdso/processor.h:11:29: error: redefinition of 'rep_nop' static __always_inline void rep_nop(void) ^~~~~~~ In file included from include/linux/rcupdate.h:30:0, from include/linux/rculist.h:11, from include/linux/pid.h:5, from include/linux/sched.h:14, from arch/x86/um/../kernel/sys_ia32.c:25: arch/x86/um/asm/processor.h:24:20: note: previous definition of 'rep_nop' was here static inline void rep_nop(void) Make sure that the unnecessary headers are not included when um is built to address the problem. Fixes: `abc22418db` ("x86/vdso: Enable x86 to use common headers") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200323124109.7104-1-vincenzo.frascino@arm.com	2020-03-23 18:45:14 +01:00
Anshuman Khandual	31a9122058	x86/mm: Drop pud_mknotpresent() There is an inconsistency between PMD and PUD-based THP page table helpers like the following, as pud_present() does not test for _PAGE_PSE. pmd_present(pmd_mknotpresent(pmd)) : True pud_present(pud_mknotpresent(pud)) : False Drop pud_mknotpresent() as there are no current users. If/when needed back later, pud_present() will also have to be fixed to accommodate _PAGE_PSE. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Baoquan He <bhe@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/1584925542-13034-1-git-send-email-anshuman.khandual@arm.com	2020-03-23 09:11:48 +01:00
Wei Huang	077168e241	x86/mce/amd: Add PPIN support for AMD MCE Newer AMD CPUs support a feature called protected processor identification number (PPIN). This feature can be detected via CPUID_Fn80000008_EBX[23]. However, CPUID alone is not enough to read the processor identification number - MSR_AMD_PPIN_CTL also needs to be configured properly. If, for any reason, MSR_AMD_PPIN_CTL[PPIN_EN] can not be turned on, such as disabled in BIOS, the CPU capability bit X86_FEATURE_AMD_PPIN needs to be cleared. When the X86_FEATURE_AMD_PPIN capability is available, the identification number is issued together with the MCE error info in order to keep track of the source of MCE errors. [ bp: Massage. ] Co-developed-by: Smita Koralahalli Channabasappa <smita.koralahallichannabasappa@amd.com> Signed-off-by: Smita Koralahalli Channabasappa <smita.koralahallichannabasappa@amd.com> Signed-off-by: Wei Huang <wei.huang2@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200321193800.3666964-1-wei.huang2@amd.com	2020-03-22 11:03:47 +01:00
Peter Zijlstra	46db36abc3	x86/entry: Rename ___preempt_schedule Because moar '_' isn't always moar readable. git grep -l "___preempt_schedule$_notrace$" \| while read file; do sed -ie 's/___preempt_schedule$_notrace$/preempt_schedule\1_thunk/g' $file; done Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200320115858.995685950@infradead.org	2020-03-21 16:03:53 +01:00
Brian Gerst	ffd75b373f	x86: Remove unneeded includes Clean up includes of and in <asm/syscalls.h> Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-19-brgerst@gmail.com	2020-03-21 16:03:25 +01:00
Brian Gerst	0f78ff1711	x86/entry: Drop asmlinkage from syscalls asmlinkage is no longer required since the syscall ABI is now fully under x86 architecture control. This makes the 32-bit native syscalls a bit more effecient by passing in regs via EAX instead of on the stack. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-18-brgerst@gmail.com	2020-03-21 16:03:25 +01:00
Brian Gerst	25c619e59b	x86/entry/32: Enable pt_regs based syscalls Enable pt_regs based syscalls for 32-bit. This makes the 32-bit native kernel consistent with the 64-bit kernel, and improves the syscall interface by not needing to push all 6 potential arguments onto the stack. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/20200313195144.164260-17-brgerst@gmail.com	2020-03-21 16:03:24 +01:00
Brian Gerst	0872098804	x86/entry: Move max syscall number calculation to syscallhdr.sh Instead of using an array in asm-offsets to calculate the max syscall number, calculate it when writing out the syscall headers. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-9-brgerst@gmail.com	2020-03-21 16:03:21 +01:00
Brian Gerst	cc42c045af	x86/entry/64: Move sys_ni_syscall stub to common.c so it can be available to multiple syscall tables. Also directly return -ENOSYS instead of bouncing to the generic sys_ni_syscall(). Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200313195144.164260-7-brgerst@gmail.com	2020-03-21 16:03:20 +01:00
Brian Gerst	27dd84fafc	x86/entry/64: Use syscall wrappers for x32_rt_sigreturn Add missing syscall wrapper for x32_rt_sigreturn(). Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-6-brgerst@gmail.com	2020-03-21 16:03:20 +01:00
Brian Gerst	a74d187c2d	x86/entry: Refactor SYS_NI macros Pull the common code out from the SYS_NI macros into a new __SYS_NI macro. Also conditionalize the X64 version in preparation for enabling syscall wrappers on 32-bit native kernels. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-5-brgerst@gmail.com	2020-03-21 16:03:20 +01:00
Brian Gerst	6cc8d2b286	x86/entry: Refactor COND_SYSCALL macros Pull the common code out from the COND_SYSCALL macros into a new __COND_SYSCALL macro. Also conditionalize the X64 version in preparation for enabling syscall wrappers on 32-bit native kernels. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-4-brgerst@gmail.com	2020-03-21 16:03:19 +01:00
Brian Gerst	d2b5de495e	x86/entry: Refactor SYSCALL_DEFINE0 macros Pull the common code out from the SYSCALL_DEFINE0 macros into a new __SYS_STUB0 macro. Also conditionalize the X64 version in preparation for enabling syscall wrappers on 32-bit native kernels. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-3-brgerst@gmail.com	2020-03-21 16:03:19 +01:00
Brian Gerst	4399e0cf49	x86/entry: Refactor SYSCALL_DEFINEx macros Pull the common code out from the SYSCALL_DEFINEx macros into a new __SYS_STUBx macro. Also conditionalize the X64 version in preparation for enabling syscall wrappers on 32-bit native kernels. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200313195144.164260-2-brgerst@gmail.com	2020-03-21 16:03:18 +01:00
Vincenzo Frascino	abc22418db	x86/vdso: Enable x86 to use common headers Enable x86 to use only the common headers in the implementation of the vDSO library. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200320145351.32292-24-vincenzo.frascino@arm.com	2020-03-21 15:24:02 +01:00
Vincenzo Frascino	659a9faa3f	x86: Introduce asm/vdso/clocksource.h The vDSO library should only include the necessary headers required for a userspace library (UAPI and a minimal set of kernel headers). To make this possible it is necessary to isolate from the kernel headers the common parts that are strictly necessary to build the library. Introduce asm/vdso/clocksource.h to contain all the arm64 specific functions that are suitable for vDSO inclusion. This header will be required by a future patch that will generalize vdso/clocksource.h. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200320145351.32292-5-vincenzo.frascino@arm.com	2020-03-21 15:23:54 +01:00
Peter Zijlstra	d8a7386897	x86/optprobe: Fix OPTPROBE vs UACCESS While looking at an objtool UACCESS warning, it suddenly occurred to me that it is entirely possible to have an OPTPROBE right in the middle of an UACCESS region. In this case we must of course clear FLAGS.AC while running the KPROBE. Luckily the trampoline already saves/restores [ER]FLAGS, so all we need to do is inject a CLAC. Unfortunately we cannot use ALTERNATIVE() in the trampoline text, so we have to frob that manually. Fixes: ca0bbc70f147 ("sched/x86_64: Don't save flags on context switch") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200305092130.GU2596@hirez.programming.kicks-ass.net	2020-03-20 13:06:22 +01:00
Ingo Molnar	409e1a3140	Merge branch 'perf/urgent' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-03-19 15:01:45 +01:00
Guenter Roeck	bac59d18c7	x86/setup: Fix static memory detection When booting x86 images in qemu, the following warning is seen randomly if DEBUG_LOCKDEP is enabled. WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:1119 lockdep_register_key+0xc0/0x100 static_obj() returns true if an address is between _stext and _end. On x86, this includes the brk memory space. Problem is that this memory block is not static on x86; its unused portions are released after init and can be allocated. This results in the observed warning if a lockdep object is allocated from this memory. Solve the problem by implementing arch_is_kernel_initmem_freed() for x86 and have it return true if an address is within the released memory range. The same problem was solved for s390 with commit `7a5da02de8` ("locking/lockdep: check for freed initmem in static_obj()"), which introduced arch_is_kernel_initmem_freed(). Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200131021159.9178-1-linux@roeck-us.net	2020-03-19 11:58:13 +01:00
Al Viro	9f855c085f	x86: switch setup_sigcontext() to unsafe_put_user() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-18 20:39:02 -04:00
Al Viro	77f3c6166d	x86: kill get_user_{try,catch,ex} no users left Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-18 20:35:35 -04:00
Al Viro	4b842e4e25	x86: get rid of small constant size cases in raw_copy_{to,from}_user() Very few call sites where that would be triggered remain, and none of those is anywhere near hot enough to bother. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-18 15:53:25 -04:00
Al Viro	71c3313a38	x86: switch sigframe sigset handling to explict __get_user()/__put_user() ... and consolidate the definition of sigframe_ia32->extramask - it's always a 1-element array of 32bit unsigned. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-03-18 15:29:54 -04:00
Jesse Brandeburg	1651e70066	x86: Fix bitops.h warning with a moved cast Fix many sparse warnings when building with C=1. These are useless noise from the bitops.h file and getting rid of them helps developers make more use of the tools and possibly find real bugs. When the kernel is compiled with C=1, there are lots of messages like: arch/x86/include/asm/bitops.h:77:37: warning: cast truncates bits from constant value (ffffff7f becomes 7f) CONST_MASK() is using a signed integer "1" to create the mask which is later cast to (u8), in order to yield an 8-bit value for the assembly instructions to use. Simplify the expressions used to clearly indicate they are working on 8-bit values only, which still keeps sparse happy without an accidental promotion to a 32 bit integer. The warning was occurring because certain bitmasks that end with a bit set next to a natural boundary like 7, 15, 23, 31, end up with a mask like 0x7f, which then results in sign extension due to the integer type promotion rules[1]. It was really only clear_bit() that was having problems, and it was only on some bit checks that resulted in a mask like 0xffffff7f being generated after the inversion. Verify with a test module (see next patch) and assembly inspection that the fix doesn't introduce any change in generated code. [ bp: Massage. ] Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Acked-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://stackoverflow.com/questions/46073295/implicit-type-promotion-rules [1] Link: https://lkml.kernel.org/r/20200310221747.2848474-1-jesse.brandeburg@intel.com	2020-03-18 12:30:19 +01:00
Kim Phillips	e48667b865	perf/amd/uncore: Add support for Family 19h L3 PMU Family 19h introduces change in slice, core and thread specification in its L3 Performance Event Select (ChL3PmcCfg) h/w register. The change is incompatible with Family 17h's version of the register. Introduce a new path in l3_thread_slice_mask() to do things differently for Family 19h vs. Family 17h, otherwise the new hardware doesn't get programmed correctly. Instead of a linear core--thread bitmask, Family 19h takes an encoded core number, and a separate thread mask. There are new bits that are set for all cores and all slices, of which only the latter is used, since the driver counts events for all slices on behalf of the specified CPU. Also update amd_uncore_init() to base its L2/NB vs. L3/Data Fabric mode decision based on Family 17h or above, not just 17h and 18h: the Family 19h Data Fabric PMC is compatible with the Family 17h DF PMC. [ bp: Touchups. ] Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200313231024.17601-3-kim.phillips@amd.com	2020-03-17 13:01:03 +01:00
Thomas Hellstrom	6db73f17c5	x86: Don't let pgprot_modify() change the page encryption bit When SEV or SME is enabled and active, vm_get_page_prot() typically returns with the encryption bit set. This means that users of pgprot_modify(, vm_get_page_prot()) (mprotect_fixup(), do_mmap()) end up with a value of vma->vm_pg_prot that is not consistent with the intended protection of the PTEs. This is also important for fault handlers that rely on the VMA vm_page_prot to set the page protection. Fix this by not allowing pgprot_modify() to change the encryption bit, similar to how it's done for PAT bits. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lkml.kernel.org/r/20200304114527.3636-2-thomas_os@shipmail.org	2020-03-17 11:48:31 +01:00
Borislav Petkov	19d33357ec	x86/amd_nb, char/amd64-agp: Use amd_nb_num() accessor ... to find whether there are northbridges present on the system. Convert the last forgotten user and therefore, unexport amd_nb_misc_ids[] too. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Michal Kubecek <mkubecek@suse.cz> Cc: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lkml.kernel.org/r/20200316150725.925-1-bp@alien8.de	2020-03-17 10:25:58 +01:00
Paolo Bonzini	727a7e27cf	KVM: x86: rename set_cr3 callback and related flags to load_mmu_pgd The set_cr3 callback is not setting the guest CR3, it is setting the root of the guest page tables, either shadow or two-dimensional. To make this clearer as well as to indicate that the MMU calls it via kvm_mmu_load_cr3, rename it to load_mmu_pgd. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:52 +01:00
Paolo Bonzini	689f3bf216	KVM: x86: unify callbacks to load paging root Similar to what kvm-intel.ko is doing, provide a single callback that merges svm_set_cr3, set_tdp_cr3 and nested_svm_set_tdp_cr3. This lets us unify the set_cr3 and set_tdp_cr3 entries in kvm_x86_ops. I'm doing that in this same patch because splitting it adds quite a bit of churn due to the need for forward declarations. For the same reason the assignment to vcpu->arch.mmu->set_cr3 is moved to kvm_init_shadow_mmu from init_kvm_softmmu and nested_svm_init_mmu_context. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:51 +01:00
Sean Christopherson	257038745c	KVM: x86: Move nSVM CPUID 0x8000000A handling into common x86 code Handle CPUID 0x8000000A in the main switch in __do_cpuid_func() and drop ->set_supported_cpuid() now that both VMX and SVM implementations are empty. Like leaf 0x14 (Intel PT) and leaf 0x8000001F (SEV), leaf 0x8000000A is is (obviously) vendor specific but can be queried in common code while respecting SVM's wishes by querying kvm_cpu_cap_has(). Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:45 +01:00
Sean Christopherson	91661989d1	KVM: x86: Move VMX's host_efer to common x86 code Move host_efer to common x86 code and use it for CPUID's is_efer_nx() to avoid constantly re-reading the MSR. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:42 +01:00
Sean Christopherson	703c335d06	KVM: x86/mmu: Configure max page level during hardware setup Configure the max page level during hardware setup to avoid a retpoline in the page fault handler. Drop ->get_lpage_level() as the page fault handler was the last user. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:40 +01:00
Sean Christopherson	bde7723559	KVM: x86/mmu: Merge kvm_{enable,disable}_tdp() into a common function Combine kvm_enable_tdp() and kvm_disable_tdp() into a single function, kvm_configure_mmu(), in preparation for doing additional configuration during hardware setup. And because having separate helpers is silly. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:39 +01:00
Sean Christopherson	a1bead2aba	KVM: VMX: Directly query Intel PT mode when refreshing PMUs Use vmx_pt_mode_is_host_guest() in intel_pmu_refresh() instead of bouncing through kvm_x86_ops->pt_supported, and remove ->pt_supported() as the PMU code was the last remaining user. Opportunistically clean up the wording of a comment that referenced kvm_x86_ops->pt_supported(). No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:38 +01:00
Sean Christopherson	139085101f	KVM: x86: Use KVM cpu caps to detect MSR_TSC_AUX virt support Check for MSR_TSC_AUX virtualization via kvm_cpu_cap_has() and drop ->rdtscp_supported(). Note, vmx_rdtscp_supported() needs to hang around a tiny bit longer due other usage in VMX code. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:35 +01:00
Sean Christopherson	90d2f60f41	KVM: x86: Use KVM cpu caps to track UMIP emulation Set UMIP in kvm_cpu_caps when it is emulated by VMX, even though the bit will effectively be dropped by do_host_cpuid(). This allows checking for UMIP emulation via kvm_cpu_caps instead of a dedicated kvm_x86_ops callback. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:28 +01:00
Sean Christopherson	b3d895d5c4	KVM: x86: Move XSAVES CPUID adjust to VMX's KVM cpu cap update Move the clearing of the XSAVES CPUID bit into VMX, which has a separate VMCS control to enable XSAVES in non-root, to eliminate the last ugly renmant of the undesirable "unsigned f_* = _supported ? F() : 0" pattern in the common CPUID handling code. Drop ->xsaves_supported(), CPUID adjustment was the only user. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:25 +01:00
Sean Christopherson	d64d83d1e0	KVM: x86: Handle PKU CPUID adjustment in VMX code Move the setting of the PKU CPUID bit into VMX to eliminate an instance of the undesirable "unsigned f_* = _supported ? F() : 0" pattern in the common CPUID handling code. Drop ->pku_supported(), CPUID adjustment was the only user. Note, some AMD CPUs now support PKU, but SVM doesn't yet support exposing it to a guest. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:19 +01:00
Sean Christopherson	5ffec6f910	KVM: x86: Handle INVPCID CPUID adjustment in VMX code Move the INVPCID CPUID adjustments into VMX to eliminate an instance of the undesirable "unsigned f_* = _supported ? F() : 0" pattern in the common CPUID handling code. Drop ->invpcid_supported(), CPUID adjustment was the only user. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:17 +01:00
Sean Christopherson	160b486f65	KVM: x86: Drop explicit @func param from ->set_supported_cpuid() Drop the explicit @func param from ->set_supported_cpuid() and instead pull the CPUID function from the relevant entry. This sets the stage for hardening guest CPUID updates in future patches, e.g. allows adding run-time assertions that the CPUID feature being changed is actually a bit in the referenced CPUID entry. No functional change intended. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:12 +01:00
Sean Christopherson	7f5581f592	KVM: x86: Use supported_xcr0 to detect MPX support Query supported_xcr0 when checking for MPX support instead of invoking ->mpx_supported() and drop ->mpx_supported() as kvm_mpx_supported() was its last user. Rename vmx_mpx_supported() to cpu_has_vmx_mpx() to better align with VMX/VMCS nomenclature. Modify VMX's adjustment of xcr0 to call cpus_has_vmx_mpx() (renamed from vmx_mpx_supported()) directly to avoid reading supported_xcr0 before it's fully configured. No functional change intended. Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> [Test that all bits are set. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:58:10 +01:00
Sean Christopherson	2f728d66e8	KVM: x86: Move kvm_emulate.h into KVM's private directory Now that the emulation context is dynamically allocated and not embedded in struct kvm_vcpu, move its header, kvm_emulate.h, out of the public asm directory and into KVM's private x86 directory. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:52 +01:00
Sean Christopherson	c9b8b07cde	KVM: x86: Dynamically allocate per-vCPU emulation context Allocate the emulation context instead of embedding it in struct kvm_vcpu_arch. Dynamic allocation provides several benefits: - Shrinks the size x86 vcpus by ~2.5k bytes, dropping them back below the PAGE_ALLOC_COSTLY_ORDER threshold. - Allows for dropping the include of kvm_emulate.h from asm/kvm_host.h and moving kvm_emulate.h into KVM's private directory. - Allows a reducing KVM's attack surface by shrinking the amount of vCPU data that is exposed to usercopy. - Allows a future patch to disable the emulator entirely, which may or may not be a realistic endeavor. Mark the entire struct as valid for usercopy to maintain existing behavior with respect to hardened usercopy. Future patches can shrink the usercopy range to cover only what is necessary. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:52 +01:00
Sean Christopherson	21f1b8f29e	KVM: x86: Explicitly pass an exception struct to check_intercept Explicitly pass an exception struct when checking for intercept from the emulator, which eliminates the last reference to arch.emulate_ctxt in vendor specific code. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:50 +01:00
Sean Christopherson	d8dd54e063	KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_pgd() Rename kvm_mmu->get_cr3() to call out that it is retrieving a guest value, as opposed to kvm_mmu->set_cr3(), which sets a host value, and to note that it will return something other than CR3 when nested EPT is in use. Hopefully the new name will also make it more obvious that L1's nested_cr3 is returned in SVM's nested NPT case. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:46 +01:00
Sean Christopherson	bb1fcc70d9	KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT Add support for 5-level nested EPT, and advertise said support in the EPT capabilities MSR. KVM's MMU can already handle 5-level legacy page tables, there's no reason to force an L1 VMM to use shadow paging if it wants to employ 5-level page tables. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:44 +01:00
Sean Christopherson	8053f924ca	KVM: x86/mmu: Drop kvm_mmu_extended_role.cr4_la57 hack Drop kvm_mmu_extended_role.cr4_la57 now that mmu_role doesn't mask off level, which already incorporates the guest's CR4.LA57 for a shadow MMU by querying is_la57_mode(). Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:43 +01:00
Sean Christopherson	a1c77abb8d	KVM: nVMX: Properly handle userspace interrupt window request Return true for vmx_interrupt_allowed() if the vCPU is in L2 and L1 has external interrupt exiting enabled. IRQs are never blocked in hardware if the CPU is in the guest (L2 from L1's perspective) when IRQs trigger VM-Exit. The new check percolates up to kvm_vcpu_ready_for_interrupt_injection() and thus vcpu_run(), and so KVM will exit to userspace if userspace has requested an interrupt window (to inject an IRQ into L1). Remove the @external_intr param from vmx_check_nested_events(), which is actually an indicator that userspace wants an interrupt window, e.g. it's named @req_int_win further up the stack. Injecting a VM-Exit into L1 to try and bounce out to L0 userspace is all kinds of broken and is no longer necessary. Remove the hack in nested_vmx_vmexit() that attempted to workaround the breakage in vmx_check_nested_events() by only filling interrupt info if there's an actual interrupt pending. The hack actually made things worse because it caused KVM to _never_ fill interrupt info when the LAPIC resides in userspace (kvm_cpu_has_interrupt() queries interrupt.injected, which is always cleared by prepare_vmcs12() before reaching the hack in nested_vmx_vmexit()). Fixes: `6550c4df7e` ("KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"") Cc: stable@vger.kernel.org Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:40 +01:00
Wanpeng Li	4abaffce4d	KVM: LAPIC: Recalculate apic map in batch In the vCPU reset and set APIC_BASE MSR path, the apic map will be recalculated several times, each time it will consume 10+ us observed by ftrace in my non-overcommit environment since the expensive memory allocate/mutex/rcu etc operations. This patch optimizes it by recaluating apic map in batch, I hope this can benefit the serverless scenario which can frequently create/destroy VMs. Before patch: kvm_lapic_reset ~27us After patch: kvm_lapic_reset ~14us Observed by ftrace, improve ~48%. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:39 +01:00
Jay Zhou	3c9bd4006b	KVM: x86: enable dirty log gradually in small chunks It could take kvm->mmu_lock for an extended period of time when enabling dirty log for the first time. The main cost is to clear all the D-bits of last level SPTEs. This situation can benefit from manual dirty log protect as well, which can reduce the mmu_lock time taken. The sequence is like this: 1. Initialize all the bits of the dirty bitmap to 1 when enabling dirty log for the first time 2. Only write protect the huge pages 3. KVM_GET_DIRTY_LOG returns the dirty bitmap info 4. KVM_CLEAR_DIRTY_LOG will clear D-bit for each of the leaf level SPTEs gradually in small chunks Under the Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz environment, I did some tests with a 128G windows VM and counted the time taken of memory_global_dirty_log_start, here is the numbers: VM Size Before After optimization 128G 460ms 10ms Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:37 +01:00
Oliver Upton	cc7f5577ad	KVM: SVM: Inhibit APIC virtualization for X2APIC guest The AVIC does not support guest use of the x2APIC interface. Currently, KVM simply chooses to squash the x2APIC feature in the guest's CPUID If the AVIC is enabled. Doing so prevents KVM from running a guest with greater than 255 vCPUs, as such a guest necessitates the use of the x2APIC interface. Instead, inhibit AVIC enablement on a per-VM basis whenever the x2APIC feature is set in the guest's CPUID. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:35 +01:00
Sean Christopherson	562b6b089d	KVM: x86: Consolidate VM allocation and free for VMX and SVM Move the VM allocation and free code to common x86 as the logic is more or less identical across SVM and VMX. Note, although hyperv.hv_pa_pg is part of the common kvm->arch, it's (currently) only allocated by VMX VMs. But, since kfree() plays nice when passed a NULL pointer, the superfluous call for SVM is harmless and avoids future churn if SVM gains support for HyperV's direct TLB flush. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> [Make vm_size a field instead of a function. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:33 +01:00
Sean Christopherson	e96c81ee89	KVM: Simplify kvm_free_memslot() and all its descendents Now that all callers of kvm_free_memslot() pass NULL for @dont, remove the param from the top-level routine and all arch's implementations. No functional change intended. Tested-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:22 +01:00
Sean Christopherson	744e699c7e	KVM: x86: Move gpa_val and gpa_available into the emulator context Move the GPA tracking into the emulator context now that the context is guaranteed to be initialized via __init_emulate_ctxt() prior to dereferencing gpa_{available,val}, i.e. now that seeing a stale gpa_available will also trigger a WARN due to an invalid context. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:12 +01:00
Sean Christopherson	92daa48b34	KVM: x86: Add EMULTYPE_PF when emulation is triggered by a page fault Add a new emulation type flag to explicitly mark emulation related to a page fault. Move the propation of the GPA into the emulator from the page fault handler into x86_emulate_instruction, using EMULTYPE_PF as an indicator that cr2 is valid. Similarly, don't propagate cr2 into the exception.address when it's not valid. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 17:57:12 +01:00
Linus Torvalds	6693075e0f	Bugfixes, x86+s390. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJebMXnAAoJEL/70l94x66D3fYIAJ1r+o2qgzadwEqoXTvlihjB ujX1jOs20EJJ56VhTtXF/wZQc+7VeKCjpIqNv4WaeSYPUhzFGyL9t5tw1YdRDCwY u6gklxruIzZodgp+vCoTkPyyUylVmY50sY/yBIJ4F8qOaMxhTEE1aXzGuaOrYqVO MmIlAltEKQzdXPO1SVPD7triGPgUTj+DRxrlyRrGt2ItiMUincCz9K6TDyXFib0r SSCVFNYtYmzu/bV/E4/Sphi2BxCQEem5DIFWLcngzN8Wy5oCoRVzPGugT4Q9eXWt ZtWIDh473JGiXBLYmDq4REJsRSca+7s/YiiLSiQwYfByhIPJpVEoy54fcdaZflo= =T4AD -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Bugfixes for x86 and s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nVMX: avoid NULL pointer dereference with incorrect EVMCS GPAs KVM: x86: Initializing all kvm_lapic_irq fields in ioapic_write_indirect KVM: VMX: Condition ENCLS-exiting enabling on CPU support for SGX1 KVM: s390: Also reset registers in sync regs for initial cpu reset KVM: fix Kconfig menu text for -Werror KVM: x86: remove stale comment from struct x86_emulate_ctxt KVM: x86: clear stale x86_emulate_ctxt->intercept value KVM: SVM: Fix the svm vmexit code for WRMSR KVM: X86: Fix dereference null cpufreq policy	2020-03-14 15:45:26 -07:00
Kim Phillips	753039ef8b	x86/cpu/amd: Call init_amd_zn() om Family 19h processors too Family 19h CPUs are Zen-based and still share most architectural features with Family 17h CPUs, and therefore still need to call init_amd_zn() e.g., to set the RECLAIM_DISTANCE override. init_amd_zn() also sets X86_FEATURE_ZEN, which today is only used in amd_set_core_ssb_state(), which isn't called on some late model Family 17h CPUs, nor on any Family 19h CPUs: X86_FEATURE_AMD_SSBD replaces X86_FEATURE_LS_CFG_SSBD on those later model CPUs, where the SSBD mitigation is done via the SPEC_CTRL MSR instead of the LS_CFG MSR. Family 19h CPUs also don't have the erratum where the CPB feature bit isn't set, but that code can stay unchanged and run safely on Family 19h. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200311191451.13221-1-kim.phillips@amd.com	2020-03-12 12:13:44 +01:00
Tony Luck	d8ecca4043	x86/mce/dev-mcelog: Dynamically allocate space for machine check records We have had a hard coded limit of 32 machine check records since the dawn of time. But as numbers of cores increase, it is possible for more than 32 errors to be reported before a user process reads from /dev/mcelog. In this case the additional errors are lost. Keep 32 as the minimum. But tune the maximum value up based on the number of processors. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200218184408.GA23048@agluck-desk2.amr.corp.intel.com	2020-03-10 10:25:14 +01:00
Boqun Feng	1cf106d932	PCI: hv: Introduce hv_msi_entry Add a new structure (hv_msi_entry), which is also defined in the TLFS, to describe the msi entry for HVCALL_RETARGET_INTERRUPT. The structure is needed because its layout may be different from architecture to architecture. Also add a new generic interface hv_set_msi_entry_from_desc() to allow different archs to set the msi entry from msi_desc. No functional change, only preparation for the future support of virtual PCI on non-x86 architectures. Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Dexuan Cui <decui@microsoft.com>	2020-03-09 14:51:31 +00:00
Boqun Feng	61bfd920ab	PCI: hv: Move retarget related structures into tlfs header Currently, retarget_msi_interrupt and other structures it relys on are defined in pci-hyperv.c. However, those structures are actually defined in Hypervisor Top-Level Functional Specification [1] and may be different in sizes of fields or layout from architecture to architecture. Let's move those definitions into x86's tlfs header file to support virtual PCI on non-x86 architectures in the future. Note that "__packed" attribute is added to these structures during the movement for the same reason as we use the attribute for other TLFS structures in the header file: make sure the structures meet the specification and avoid anything unexpected from the compilers. Additionally, rename struct retarget_msi_interrupt to hv_retarget_msi_interrupt for the consistent naming convention, also mirroring the name in TLFS. [1]: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Dexuan Cui <decui@microsoft.com>	2020-03-09 14:50:53 +00:00
Boqun Feng	b00f80fcfa	PCI: hv: Move hypercall related definitions into tlfs header Currently HVCALL_RETARGET_INTERRUPT and HV_PARTITION_ID_SELF are defined in pci-hyperv.c. However, similar to other hypercall related definitions, it makes more sense to put them in the tlfs header file. Besides, these definitions are arch-dependent, so for the support of virtual PCI on non-x86 archs in the future, move them into arch-specific tlfs header file. Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <amurray@thegoodpenguin.co.uk> Reviewed-by: Dexuan Cui <decui@microsoft.com>	2020-03-09 14:50:39 +00:00
Ingo Molnar	6120681bdf	Merge branch 'efi/urgent' into efi/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-03-08 09:57:58 +01:00
Ingo Molnar	1b10d388d0	Merge branch 'linus' into sched/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-03-06 12:49:56 +01:00
Vitaly Kuznetsov	d718fdc3e7	KVM: x86: remove stale comment from struct x86_emulate_ctxt Commit `c44b4c6ab8` ("KVM: emulate: clean up initializations in init_decode_cache") did some field shuffling and instead of [opcode_len, _regs) started clearing [has_seg_override, modrm). The comment about clearing fields altogether is not true anymore. Fixes: `c44b4c6ab8` ("KVM: emulate: clean up initializations in init_decode_cache") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-03 17:38:22 +01:00
Linus Torvalds	2873dc2547	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: a pkeys fix for a bug that triggers with weird BIOS settings, and two Xen PV fixes: a paravirt interface fix, and pagetable dumping fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix dump_pagetables with Xen PV x86/ioperm: Add new paravirt function update_io_bitmap() x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes	2020-03-02 06:54:54 -06:00
Juergen Gross	99bcd4a6e5	x86/ioperm: Add new paravirt function update_io_bitmap() Commit `111e7b15cf` ("x86/ioperm: Extend IOPL config to control ioperm() as well") reworked the iopl syscall to use I/O bitmaps. Unfortunately this broke Xen PV domains using that syscall as there is currently no I/O bitmap support in PV domains. Add I/O bitmap support via a new paravirt function update_io_bitmap which Xen PV domains can use to update their I/O bitmaps via a hypercall. Fixes: `111e7b15cf` ("x86/ioperm: Extend IOPL config to control ioperm() as well") Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> # 5.5 Link: https://lkml.kernel.org/r/20200218154712.25490-1-jgross@suse.com	2020-02-29 12:43:09 +01:00
Thomas Gleixner	17dbedb5da	x86/irq: Remove useless return value from do_IRQ() Nothing is using it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200225220216.826870369@linutronix.de	2020-02-27 14:48:40 +01:00
Thomas Gleixner	3ba4f0a633	x86/traps: Remove redundant declaration of do_double_fault() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200225220216.720335354@linutronix.de	2020-02-27 14:48:40 +01:00
Thomas Gleixner	840371bea1	x86/entry/32: Force MCE through do_mce() Remove the pointless difference between 32 and 64 bit to make further unifications simpler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200225220216.428188397@linutronix.de	2020-02-27 14:48:39 +01:00
Andy Lutomirski	55ba18d6ed	x86/mce: Disable tracing and kprobes on do_machine_check() do_machine_check() can be raised in almost any context including the most fragile ones. Prevent kprobes and tracing. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200225220216.315548935@linutronix.de	2020-02-27 14:48:39 +01:00
Ingo Molnar	e9765680a3	EFI updates for v5.7: This time, the set of changes for the EFI subsystem is much larger than usual. The main reasons are: - Get things cleaned up before EFI support for RISC-V arrives, which will increase the size of the validation matrix, and therefore the threshold to making drastic changes, - After years of defunct maintainership, the GRUB project has finally started to consider changes from the distros regarding UEFI boot, some of which are highly specific to the way x86 does UEFI secure boot and measured boot, based on knowledge of both shim internals and the layout of bootparams and the x86 setup header. Having this maintenance burden on other architectures (which don't need shim in the first place) is hard to justify, so instead, we are introducing a generic Linux/UEFI boot protocol. Summary of changes: - Boot time GDT handling changes (Arvind) - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of struct efi, and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Various documentation updates and minor code cleanups (Heinrich) - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. Note that these patches were deliberately put at the beginning so they can be used as a stable branch that will be shared with a PR containing the complete fix, which I will send to the ARM tree. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEnNKg2mrY9zMBdeK7wjcgfpV0+n0FAl5S7WYACgkQwjcgfpV0 +n1jmQgAmwV3V8pbgB4mi4P2Mv8w5Zj5feUe6uXnTR2AFv5nygLcTzdxN+TU/6lc OmZv2zzdsAscYlhuUdI/4t4cXIjHAZI39+UBoNRuMqKbkbvXCFscZANLxvJjHjZv FFbgUk0DXkF0BCEDuSLNavidAv4b3gZsOmHbPfwuB8xdP05LbvbS2mf+2tWVAi2z ULfua/0o9yiwl19jSS6iQEPCvvLBeBzTLW7x5Rcm/S0JnotzB59yMaeqD7jO8JYP 5PvI4WM/l5UfVHnZp2k1R76AOjReALw8dQgqAsT79Q7+EH3sNNuIjU6omdy+DFf4 tnpwYfeLOaZ1SztNNfU87Hsgnn2CGw== =pDE3 -----END PGP SIGNATURE----- Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/core Pull EFI updates for v5.7 from Ard Biesheuvel: This time, the set of changes for the EFI subsystem is much larger than usual. The main reasons are: - Get things cleaned up before EFI support for RISC-V arrives, which will increase the size of the validation matrix, and therefore the threshold to making drastic changes, - After years of defunct maintainership, the GRUB project has finally started to consider changes from the distros regarding UEFI boot, some of which are highly specific to the way x86 does UEFI secure boot and measured boot, based on knowledge of both shim internals and the layout of bootparams and the x86 setup header. Having this maintenance burden on other architectures (which don't need shim in the first place) is hard to justify, so instead, we are introducing a generic Linux/UEFI boot protocol. Summary of changes: - Boot time GDT handling changes (Arvind) - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of struct efi, and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Various documentation updates and minor code cleanups (Heinrich) - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. Note that these patches were deliberately put at the beginning so they can be used as a stable branch that will be shared with a PR containing the complete fix, which I will send to the ARM tree. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-02-26 15:21:22 +01:00
Arvind Sankar	6f8f0dc980	x86/vmlinux: Drop unneeded linker script discard of .eh_frame Now that .eh_frame sections for the files in setup.elf and realmode.elf are not generated anymore, the linker scripts don't need the special output section name /DISCARD/ any more. Remove the one in the main kernel linker script as well, since there are no .eh_frame sections already, and fix up a comment referencing .eh_frame. Update the comment in asm/dwarf2.h referring to .eh_frame so it continues to make sense, as well as being more specific. [ bp: Touch up commit message. ] Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://lkml.kernel.org/r/20200224232129.597160-3-nivedita@alum.mit.edu	2020-02-25 14:51:29 +01:00
Linus Torvalds	63623fd449	Bugfixes, including the fix for CVE-2020-2732 and a few issues found by "make W=1". -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJeVBwcAAoJEL/70l94x66DB9AH/AxWhmtf6YVXMNyZjXydxa1f hYVm9wg9GCsZS+7cktMhq0/uDEu5IjaCv7d+bzIcYZdFAOcs5nBUUjn1LtVl9w1y 48vobyOa8pXpORerBtZtaO1kt4sfFR63zm7uau32DzXrz3qpHlMUjPdL08A1e35V cSSPAHHsl9S1TbDryc/VUNCOgauJes6LHbd3CdeAXU6lzMBW8JWbF2b/MAkvHG6n Hw5LpicWSeTxoPjR4Oi0Yx3VKvWfS9608netSJmuCNsv36wrhzKR1iuyb3kNCkAy AIlALn4PZq1Y5i1INi/XIkpC8d9yTqt5heRxYwp+yHadWO6E7ZMlITfxLZii+mM= =7EpO -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Bugfixes, including the fix for CVE-2020-2732 and a few issues found by 'make W=1'" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: rstify new ioctls in api.rst KVM: nVMX: Check IO instruction VM-exit conditions KVM: nVMX: Refactor IO bitmap checks into helper function KVM: nVMX: Don't emulate instructions in guest mode KVM: nVMX: Emulate MTF when performing instruction emulation KVM: fix error handling in svm_hardware_setup KVM: SVM: Fix potential memory leak in svm_cpu_init() KVM: apic: avoid calculating pending eoi from an uninitialized val KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1 kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI kvm/emulate: fix a -Werror=cast-function-type KVM: x86: fix incorrect comparison in trace event KVM: nVMX: Fix some obsolete comments and grammar error KVM: x86: fix missing prototypes KVM: x86: enable -Werror	2020-02-24 11:48:17 -08:00
Dave Hansen	16171bffc8	x86/pkeys: Add check for pkey "overflow" Alex Shi reported the pkey macros above arch_set_user_pkey_access() to be unused. They are unused, and even refer to a nonexistent CONFIG option. But, they might have served a good use, which was to ensure that the code does not try to set values that would not fit in the PKRU register. As it stands, a too-large 'pkey' value would be likely to silently overflow the u32 new_pkru_bits. Add a check to look for overflows. Also add a comment to remind any future developer to closely examine the types used to store pkey values if arch_max_pkey() ever changes. This boots and passes the x86 pkey selftests. Reported-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200122165346.AD4DA150@viggo.jf.intel.com	2020-02-24 20:25:21 +01:00
Ingo Molnar	546121b65f	Linux 5.6-rc3 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5TFjYeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGikYIAIhI4C8R87wyj/0m b2NWk6TZ5AFmiZLYSbsPYxdSC9OLdUmlGFKgL2SyLTwZCiHChm+cNBrngp3hJ6gz x1YH99HdjzkiaLa0hCc2+a/aOt8azGU2RiWEP8rbo0gFSk28wE6FjtzSxR95jyPz FRKo/sM+dHBMFXrthJbr+xHZ1De28MITzS2ddstr/10ojoRgm43I3qo1JKhjoDN5 9GGb6v0Md5eo+XZjjB50CvgF5GhpiqW7+HBB7npMsgTk37GdsR5RlosJ/TScLVC9 dNeanuqk8bqMGM0u2DFYdDqjcqAlYbt8aobuWWCB5xgPBXr5G2nox+IgF/f9G6UH EShA/xs= =OFPc -----END PGP SIGNATURE----- Merge tag 'v5.6-rc3' into sched/core, to pick up fixes and dependent patches Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-02-24 11:36:09 +01:00
Ard Biesheuvel	3b8f44fc08	efi/libstub/x86: Use Exit() boot service to exit the stub on errors Currently, we either return with an error [from efi_pe_entry()] or enter a deadloop [in efi_main()] if any fatal errors occur during execution of the EFI stub. Let's switch to calling the Exit() EFI boot service instead in both cases, so that we a) can get rid of the deadloop, and simply return to the boot manager if any errors occur during execution of the stub, including during the call to ExitBootServices(), b) can also return cleanly from efi_pe_entry() or efi_main() in mixed mode, once we introduce support for LoadImage/StartImage based mixed mode in the next patch. Note that on systems running downstream GRUBs [which do not use LoadImage or StartImage to boot the kernel, and instead, pass their own image handle as the loaded image handle], calling Exit() will exit from GRUB rather than from the kernel, but this is a tolerable side effect. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-02-23 21:59:42 +01:00
Ard Biesheuvel	59f2a619a2	efi: Add 'runtime' pointer to struct efi Instead of going through the EFI system table each time, just copy the runtime services table pointer into struct efi directly. This is the last use of the system table pointer in struct efi, allowing us to drop it in a future patch, along with a fair amount of quirky handling of the translated address. Note that usually, the runtime services pointer changes value during the call to SetVirtualAddressMap(), so grab the updated value as soon as that call returns. (Mixed mode uses a 1:1 mapping, and kexec boot enters with the updated address in the system table, so in those cases, we don't need to do anything here) Tested-by: Tony Luck <tony.luck@intel.com> # arch/ia64 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-02-23 21:59:42 +01:00
Ard Biesheuvel	9cd437ac0e	efi/x86: Make fw_vendor, config_table and runtime sysfs nodes x86 specific There is some code that exposes physical addresses of certain parts of the EFI firmware implementation via sysfs nodes. These nodes are only used on x86, and are of dubious value to begin with, so let's move their handling into the x86 arch code. Tested-by: Tony Luck <tony.luck@intel.com> # arch/ia64 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-02-23 21:59:42 +01:00
Ard Biesheuvel	0a67361dcd	efi/x86: Remove runtime table address from kexec EFI setup data Since commit `33b85447fa` ("efi/x86: Drop two near identical versions of efi_runtime_init()"), we no longer map the EFI runtime services table before calling SetVirtualAddressMap(), which means we don't need the 1:1 mapped physical address of this table, and so there is no point in passing the address via EFI setup data on kexec boot. Note that the kexec tools will still look for this address in sysfs, so we still need to provide it. Tested-by: Tony Luck <tony.luck@intel.com> # arch/ia64 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-02-23 21:59:42 +01:00
Ard Biesheuvel	2931d526d5	efi/libstub: Make the LoadFile EFI protocol accessible Add the protocol definitions, GUIDs and mixed mode glue so that the EFI loadfile protocol can be used from the stub. This will be used in a future patch to load the initrd. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-02-23 21:57:15 +01:00

1 2 3 4 5 ...

9559 Commits