linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-16 09:02:00 +00:00

Author	SHA1	Message	Date
Dave Hansen	814564a0a1	x86, mpx: Explicitly disable 32-bit MPX support on 64-bit kernels We had originally planned on submitting MPX support in one patch set. We eventually broke it up in to two pieces for easier review. One of the features that didn't make the first round was supporting 32-bit binaries on 64-bit kernels. Once we split the set up, we never added code to restrict 32-bit binaries from _using_ MPX on 64-bit kernels. The 32-bit bounds tables are a different format than the 64-bit ones. Without this patch, the kernel will try to read a 32-bit binary's tables as if they were the 64-bit version. They will likely be noticed as being invalid rather quickly and the app will get killed, but that's kinda mean. This patch adds an explicit check, and will make a 64-bit kernel essentially behave as if it has no MPX support when called from a 32-bit binary. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20150108223020.9E9AA511@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-01-22 21:11:06 +01:00
K. Y. Srinivasan	32c6590d12	x86, hyperv: Mark the Hyper-V clocksource as being continuous The Hyper-V clocksource is continuous; mark it accordingly. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: jasowang@redhat.com Cc: gregkh@linuxfoundation.org Cc: devel@linuxdriverproject.org Cc: olaf@aepfle.de Cc: apw@canonical.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1421108762-3331-1-git-send-email-kys@microsoft.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-01-20 14:36:25 +01:00
Juergen Gross	9d34cfdf47	x86: Don't rely on VMWare emulating PAT MSR correctly VMWare seems not to emulate the PAT MSR correctly: reaeding MSR_IA32_CR_PAT returns 0 even after writing another value to it. Commit `bd809af16e` triggers this VMWare bug when the kernel is booted as a VMWare guest. Detect this bug and don't use the read value if it is 0. Fixes: `bd809af16e` "x86: Enable PAT to use cache mode translation tables" Reported-and-tested-by: Jongman Heo <jongman.heo@samsung.com> Acked-by: Alok N Kataria <akataria@vmware.com> Signed-off-by: Juergen Gross <jgross@suse.com> Link: http://lkml.kernel.org/r/1421039745-14335-1-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-01-20 14:33:45 +01:00
Jan Beulich	4a0d3107d6	x86, irq: Properly tag virtualization entry in /proc/interrupts The mis-naming likely was a copy-and-paste effect. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/54B9408B0200007800055E8B@mail.emea.novell.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-01-20 12:37:23 +01:00
Kees Cook	f285f4a21c	x86, boot: Skip relocs when load address unchanged On 64-bit, relocation is not required unless the load address gets changed. Without this, relocations do unexpected things when the kernel is above 4G. Reported-by: Baoquan He <bhe@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Thomas D. <whissi@whissi.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Junjie Mao <eternal.n08@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20150116005146.GA4212@www.outflux.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-01-20 12:37:23 +01:00
Jiang Liu	8abb850a03	x86/xen: Override ACPI IRQ management callback __acpi_unregister_gsi Xen overrides __acpi_register_gsi and leaves __acpi_unregister_gsi as is. That means, an IRQ allocated by acpi_register_gsi_xen_hvm() or acpi_register_gsi_xen() will be freed by acpi_unregister_gsi_ioapic(), which may cause undesired effects. So override __acpi_unregister_gsi to NULL for safety. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Cc: Tony Luck <tony.luck@intel.com> Cc: xen-devel@lists.xenproject.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Graeme Gregory <graeme.gregory@linaro.org> Cc: Lv Zheng <lv.zheng@intel.com> Link: http://lkml.kernel.org/r/1421720467-7709-4-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-01-20 11:44:41 +01:00
Jiang Liu	b568b8601f	x86/xen: Treat SCI interrupt as normal GSI interrupt Currently Xen Domain0 has special treatment for ACPI SCI interrupt, that is initialize irq for ACPI SCI at early stage in a special way as: xen_init_IRQ() ->pci_xen_initial_domain() ->xen_setup_acpi_sci() Allocate and initialize irq for ACPI SCI Function xen_setup_acpi_sci() calls acpi_gsi_to_irq() to get an irq number for ACPI SCI. But unfortunately acpi_gsi_to_irq() depends on IOAPIC irqdomains through following path acpi_gsi_to_irq() ->mp_map_gsi_to_irq() ->mp_map_pin_to_irq() ->check IOAPIC irqdomain For PV domains, it uses Xen event based interrupt manangement and doesn't make uses of native IOAPIC, so no irqdomains created for IOAPIC. This causes Xen domain0 fail to install interrupt handler for ACPI SCI and all ACPI events will be lost. Please refer to: https://lkml.org/lkml/2014/12/19/178 So the fix is to get rid of special treatment for ACPI SCI, just treat ACPI SCI as normal GSI interrupt as: acpi_gsi_to_irq() ->acpi_register_gsi() ->acpi_register_gsi_xen() ->xen_register_gsi() With above change, there's no need for xen_setup_acpi_sci() anymore. The above change also works with bare metal kernel too. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Cc: Tony Luck <tony.luck@intel.com> Cc: xen-devel@lists.xenproject.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1421720467-7709-2-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-01-20 11:44:40 +01:00
Linus Torvalds	59b2858f57	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, but also two PMU driver fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools powerpc: Use dwfl_report_elf() instead of offline. perf tools: Fix segfault for symbol annotation on TUI perf test: Fix dwarf unwind using libunwind. perf tools: Avoid build splat for syscall numbers with uclibc perf tools: Elide strlcpy warning with uclibc perf tools: Fix statfs.f_type data type mismatch build error with uclibc tools: Remove bitops/hweight usage of bits in tools/perf perf machine: Fix __machine__findnew_thread() error path perf tools: Fix building error in x86_64 when dwarf unwind is on perf probe: Propagate error code when write(2) failed perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLM perf/rapl: Fix sysfs_show() initialization for RAPL PMU	2015-01-18 06:24:30 +12:00
Linus Torvalds	23aa4b416a	This holds a few fixes to the ftrace infrastructure as well as the mixture of function graph tracing and kprobes. When jprobes and function graph tracing is enabled at the same time it will crash the system. # modprobe jprobe_example # echo function_graph > /sys/kernel/debug/tracing/current_tracer After the first fork (jprobe_example probes it), the system will crash. This is due to the way jprobes copies the stack frame and does not do a normal function return. This messes up with the function graph tracing accounting which hijacks the return address from the stack and replaces it with a hook function. It saves the return addresses in a separate stack to put back the correct return address when done. But because the jprobe functions do not do a normal return, their stack addresses are not put back until the function they probe is called, which means that the probed function will get the return address of the jprobe handler instead of its own. The simple fix here was to disable function graph tracing while the jprobe handler is being called. While debugging this I found two minor bugs with the function graph tracing. The first was about the function graph tracer sharing its function hash with the function tracer (they both get filtered by the same input). The changing of the set_ftrace_filter would not sync the function recording records after a change if the function tracer was disabled but the function graph tracer was enabled. This was due to the update only checking one of the ops instead of the shared ops to see if they were enabled and should perform the sync. This caused the ftrace accounting to break and a ftrace_bug() would be triggered, disabling ftrace until a reboot. The second was that the check to update records only checked one of the filter hashes. It needs to test both the "filter" and "notrace" hashes. The "filter" hash determines what functions to trace where as the "notrace" hash determines what functions not to trace (trace all but these). Both hashes need to be passed to the update code to find out what change is being done during the update. This also broke the ftrace record accounting and triggered a ftrace_bug(). This patch set also include two more fixes that were reported separately from the kprobe issue. One was that init_ftrace_syscalls() was called twice at boot up. This is not a major bug, but that call performed a rather large kmalloc (NR_syscalls * sizeof(syscalls_metadata)). The second call made the first one a memory leak, and wastes memory. The other fix is a regression caused by an update in the v3.19 merge window. The moving to enable events early, moved the enabling before PID 1 was created. The syscall events require setting the TIF_SYSCALL_TRACEPOINT for all tasks. But for_each_process_thread() does not include the swapper task (PID 0), and ended up being a nop. A suggested fix was to add the init_task() to have its flag set, but I didn't really want to mess with PID 0 for this minor bug. Instead I disable and re-enable events again at early_initcall() where it use to be enabled. This also handles any other event that might have its own reg function that could break at early boot up. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUt9vmAAoJEEjnJuOKh9ldLHEIAJ9XrPW2xMIY5yI69jT1F7pv PkSRqENnOK0l4UulD52SvIBecQTTBcEEjao4yVGkc7DCJBOws/1LZ5gW8OfNlKjq rMB8yaosL1tXJ1ARVPMjcQVy+228zkgTXznwEZCjku1g7LuScQ28qyXsXO7B6yiK xKoHqKjygmM/a2aVn+8tdiVKiDp6jdmkbYicbaFT4xP7XB5DaMmIiXRHxdvW6xdR azKrVfYiMyJqTZNt/EVSWUk2WjeaYhoXyNtvgPx515wTo/llCnzhjcsocXBtH2P/ YOtwl+1L7Z89ukV9oXqrtrUJZ6Ps7+g7I1flJuL7/1FlNGnklcP9JojD+t6HeT8= =vkec -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "This holds a few fixes to the ftrace infrastructure as well as the mixture of function graph tracing and kprobes. When jprobes and function graph tracing is enabled at the same time it will crash the system: # modprobe jprobe_example # echo function_graph > /sys/kernel/debug/tracing/current_tracer After the first fork (jprobe_example probes it), the system will crash. This is due to the way jprobes copies the stack frame and does not do a normal function return. This messes up with the function graph tracing accounting which hijacks the return address from the stack and replaces it with a hook function. It saves the return addresses in a separate stack to put back the correct return address when done. But because the jprobe functions do not do a normal return, their stack addresses are not put back until the function they probe is called, which means that the probed function will get the return address of the jprobe handler instead of its own. The simple fix here was to disable function graph tracing while the jprobe handler is being called. While debugging this I found two minor bugs with the function graph tracing. The first was about the function graph tracer sharing its function hash with the function tracer (they both get filtered by the same input). The changing of the set_ftrace_filter would not sync the function recording records after a change if the function tracer was disabled but the function graph tracer was enabled. This was due to the update only checking one of the ops instead of the shared ops to see if they were enabled and should perform the sync. This caused the ftrace accounting to break and a ftrace_bug() would be triggered, disabling ftrace until a reboot. The second was that the check to update records only checked one of the filter hashes. It needs to test both the "filter" and "notrace" hashes. The "filter" hash determines what functions to trace where as the "notrace" hash determines what functions not to trace (trace all but these). Both hashes need to be passed to the update code to find out what change is being done during the update. This also broke the ftrace record accounting and triggered a ftrace_bug(). This patch set also include two more fixes that were reported separately from the kprobe issue. One was that init_ftrace_syscalls() was called twice at boot up. This is not a major bug, but that call performed a rather large kmalloc (NR_syscalls sizeof(syscalls_metadata)). The second call made the first one a memory leak, and wastes memory. The other fix is a regression caused by an update in the v3.19 merge window. The moving to enable events early, moved the enabling before PID 1 was created. The syscall events require setting the TIF_SYSCALL_TRACEPOINT for all tasks. But for_each_process_thread() does not include the swapper task (PID 0), and ended up being a nop. A suggested fix was to add the init_task() to have its flag set, but I didn't really want to mess with PID 0 for this minor bug. Instead I disable and re-enable events again at early_initcall() where it use to be enabled. This also handles any other event that might have its own reg function that could break at early boot up" tag 'trace-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix enabling of syscall events on the command line tracing: Remove extra call to init_ftrace_syscalls() ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing ftrace: Check both notrace and filter for old hash ftrace: Fix updating of filters for shared global_ops filters	2015-01-17 07:55:52 +13:00
Kan Liang	33636732dc	perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLM cycles:p and cycles:pp do not work on SLM since commit: `86a04461a9` ("perf/x86: Revamp PEBS event selection") UOPS_RETIRED.ALL is not a PEBS capable event, so it should not be used to count cycle number. Actually SLM calls intel_pebs_aliases_core2() which uses INST_RETIRED.ANY_P to count the number of cycles. It's a PEBS capable event. But inv and cmask must be set to count cycles. Considering SLM allows all events as PEBS with no flags, only INST_RETIRED.ANY_P, inv=1, cmask=16 needs to handled specially. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421084541-31639-1-git-send-email-kan.liang@intel.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-01-16 09:06:59 +01:00
Stephane Eranian	433678bdc6	perf/rapl: Fix sysfs_show() initialization for RAPL PMU This patch fixes a problem with the initialization of the sysfs_show() routine for the RAPL PMU. The current code was wrongly relying on the EVENT_ATTR_STR() macro which uses the events_sysfs_show() function in the x86 PMU code. That function itself was relying on the x86_pmu data structure. Yet RAPL and the core PMU (x86_pmu) have nothing to do with each other. They should therefore not interact with each other. The x86_pmu structure is initialized at boot time based on the host CPU model. When the host CPU is not supported, the x86_pmu remains uninitialized and some of the callbacks it contains are NULL. The false dependency with x86_pmu could potentially cause crashes in case the x86_pmu is not initialized while the RAPL PMU is. This may, for instance, be the case in virtualized environments. This patch fixes the problem by using a private sysfs_show() routine for exporting the RAPL PMU events. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150113225953.GA21525@thinkpad Cc: vincent.weaver@maine.edu Cc: jolsa@redhat.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-01-16 09:06:58 +01:00
Steven Rostedt (Red Hat)	237d28db03	ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing If the function graph tracer traces a jprobe callback, the system will crash. This can easily be demonstrated by compiling the jprobe sample module that is in the kernel tree, loading it and running the function graph tracer. # modprobe jprobe_example.ko # echo function_graph > /sys/kernel/debug/tracing/current_tracer # ls The first two commands end up in a nice crash after the first fork. (do_fork has a jprobe attached to it, so "ls" just triggers that fork) The problem is caused by the jprobe_return() that all jprobe callbacks must end with. The way jprobes works is that the function a jprobe is attached to has a breakpoint placed at the start of it (or it uses ftrace if fentry is supported). The breakpoint handler (or ftrace callback) will copy the stack frame and change the ip address to return to the jprobe handler instead of the function. The jprobe handler must end with jprobe_return() which swaps the stack and does an int3 (breakpoint). This breakpoint handler will then put back the saved stack frame, simulate the instruction at the beginning of the function it added a breakpoint to, and then continue on. For function tracing to work, it hijakes the return address from the stack frame, and replaces it with a hook function that will trace the end of the call. This hook function will restore the return address of the function call. If the function tracer traces the jprobe handler, the hook function for that handler will not be called, and its saved return address will be used for the next function. This will result in a kernel crash. To solve this, pause function tracing before the jprobe handler is called and unpause it before it returns back to the function it probed. Some other updates: Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the code look a bit cleaner and easier to understand (various tries to fix this bug required this change). Note, if fentry is being used, jprobes will change the ip address before the function graph tracer runs and it will not be able to trace the function that the jprobe is probing. Link: http://lkml.kernel.org/r/20150114154329.552437962@goodmis.org Cc: stable@vger.kernel.org # 2.6.30+ Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2015-01-15 09:39:18 -05:00
Linus Torvalds	613d4cefbb	xen: bug fixes for 3.19-rc4 - Several critical linear p2m fixes that prevented some hosts from booting. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJUtQHNAAoJEFxbo/MsZsTR/qgH/iiW4k2T8dBGZ7TPyzt88iyT 4caWjuujp2OUaRqhBQdY7z05uai6XxgJLwDyqiO+qHaRUj+ZWCrjh/ZFPU1+09hK GdwPMWU7xMRs/7F2ANO03jJ/ktvsYXtazcVrV89Q3t+ZZJIQ/THovDkaoa+dF2lh W8d5H7N2UNCJLe9w2fm5iOq4SKoTsJOq6pVQ6gUBqJcgkSDWavd6bowXnTlcepZN tNaSMZsOt4CAvYQIa0nKPJo6Q4QN3buRQMWEOAOmGVT/RkVi68wirwk59uNzcS7E HjhqxFjhXYamNTuwHYZlchBrZutdbymSlucVucb1wAoxRAX+Wd1jk5EPl6zLv4w= =kFSE -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: "Several critical linear p2m fixes that prevented some hosts from booting" * tag 'stable/for-linus-3.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: properly retrieve NMI reason xen: check for zero sized area when invalidating memory xen: use correct type for physical addresses xen: correct race in alloc_p2m_pmd() xen: correct error for building p2m list on 32 bits x86/xen: avoid freeing static 'name' when kasprintf() fails x86/xen: add extra memory for remapped frames during setup x86/xen: don't count how many PFNs are identity mapped x86/xen: Free bootmem in free_p2m_page() during early boot x86/xen: Remove unnecessary BUG_ON(preemptible()) in xen_setup_timer()	2015-01-14 08:07:42 +13:00
Jan Beulich	f221b04fe0	x86/xen: properly retrieve NMI reason Using the native code here can't work properly, as the hypervisor would normally have cleared the two reason bits by the time Dom0 gets to see the NMI (if passed to it at all). There's a shared info field for this, and there's an existing hook to use - just fit the two together. This is particularly relevant so that NMIs intended to be handled by APEI / GHES actually make it to the respective handler. Note that the hook can (and should) be used irrespective of whether being in Dom0, as accessing port 0x61 in a DomU would be even worse, while the shared info field would just hold zero all the time. Note further that hardware NMI handling for PVH doesn't currently work anyway due to missing code in the hypervisor (but it is expected to work the native rather than the PV way). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-13 09:39:50 +00:00
Juergen Gross	9a17ad7f3d	xen: check for zero sized area when invalidating memory With the introduction of the linear mapped p2m list setting memory areas to "invalid" had to be delayed. When doing the invalidation make sure no zero sized areas are processed. Signed-off-by: Juegren Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:55 +00:00
Juergen Gross	e86f949667	xen: use correct type for physical addresses When converting a pfn to a physical address be sure to use 64 bit wide types or convert the physical address to a pfn if possible. Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:48 +00:00
Juergen Gross	f241b0b891	xen: correct race in alloc_p2m_pmd() When allocating a new pmd for the linear mapped p2m list a check is done for not introducing another pmd when this just happened on another cpu. In this case the old pte pointer was returned which points to the p2m_missing or p2m_identity page. The correct value would be the pointer to the found new page. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:40 +00:00
Juergen Gross	82c92ed135	xen: correct error for building p2m list on 32 bits In xen_rebuild_p2m_list() for large areas of invalid or identity mapped memory the pmd entries on 32 bit systems are initialized wrong. Correct this error. Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:34 +00:00
Linus Torvalds	505569d208	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: two vdso fixes, two kbuild fixes and a boot failure fix with certain odd memory mappings" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, vdso: Use asm volatile in __getcpu x86/build: Clean auto-generated processor feature files x86: Fix mkcapflags.sh bash-ism x86: Fix step size adjustment during initial memory mapping x86_64, vdso: Fix the vdso address randomization algorithm	2015-01-11 11:53:46 -08:00
Linus Torvalds	ddb321a8dd	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, but also some kernel side fixes: uncore PMU driver fix, user regs sampling fix and an instruction decoder fix that unbreaks PEBS precise sampling" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes perf/x86_64: Improve user regs sampling perf: Move task_pt_regs sampling into arch code x86: Fix off-by-one in instruction decoder perf hists browser: Fix segfault when showing callchain perf callchain: Free callchains when hist entries are deleted perf hists: Fix children sort key behavior perf diff: Fix to sort by baseline field by default perf list: Fix --raw-dump option perf probe: Fix crash in dwarf_getcfi_elf perf probe: Fix to fall back to find probe point in symbols perf callchain: Append callchains only when requested perf ui/tui: Print backtrace symbols when segfault occurs perf report: Show progress bar for output resorting	2015-01-11 11:47:45 -08:00
Andi Kleen	5306c31c57	perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes There was another report of a boot failure with a #GP fault in the uncore SBOX initialization. The earlier work around was not enough for this system. The boot was failing while trying to initialize the third SBOX. This patch detects parts with only two SBOXes and limits the number of SBOX units to two there. Stable material, as it affects boot problems on 3.18. Tested-by: Andreas Oehler <andreas@oehler-net.de> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Yan, Zheng <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1420583675-9163-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-01-09 11:12:30 +01:00
Andy Lutomirski	86c269fea3	perf/x86_64: Improve user regs sampling Perf reports user regs for kernel-mode samples so that samples can be backtraced through user code. The old code was very broken in syscall context, resulting in useless backtraces. The new code, in contrast, is still dangerously racy, but it should at least work most of the time. Tested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: chenggang.qcg@taobao.com Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/243560c26ff0f739978e2459e203f6515367634d.1420396372.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-01-09 11:12:29 +01:00
Andy Lutomirski	88a7c26af8	perf: Move task_pt_regs sampling into arch code On x86_64, at least, task_pt_regs may be only partially initialized in many contexts, so x86_64 should not use it without extra care from interrupt context, let alone NMI context. This will allow x86_64 to override the logic and will supply some scratch space to use to make a cleaner copy of user regs. Tested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: chenggang.qcg@taobao.com Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/e431cd4c18c2e1c44c774f10758527fb2d1025c4.1420396372.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-01-09 11:12:28 +01:00
Peter Zijlstra	0f363b250b	x86: Fix off-by-one in instruction decoder Stephane reported that the PEBS fixup was broken by the recent commit to the instruction decoder. The thing had an off-by-one which resulted in not being able to decode the last instruction and always bail. Reported-by: Stephane Eranian <eranian@google.com> Fixes: `6ba48ff46f` ("x86: Remove arbitrary instruction size limit in instruction decoder") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org # 3.18 Cc: <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Liang Kan <kan.liang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20141216104614.GV3337@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-01-09 11:12:26 +01:00
Linus Torvalds	74e59ea05c	Power management and ACPI material for 3.19-rc4 - Fix ACPI power management intialization for device objects corresponding to devices that are not present at the init time (the _STA control method returns 0 for them) and therefore should not be regarded as power manageable (Rafael J Wysocki). - Rename a structure field and two functions used by the ACPI processor driver to make them less tied to architectures that use APICs (both x86 and ia64) and more suitable for ARM64 processors (Hanjun Guo). - Add a disable_native_backlight quirk for Dell XPS15 L521X designed in an unusual way preventing native backlight from working on that machine (Hans de Goede). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJUrcCTAAoJEILEb/54YlRxeMUQAKOHS8rlq0XtOxieufCks0Rq e96ZExiTdLI/KYZCgqwkSJxR7w2981hbFVIcFccPpo1k9Z1YowSOvWcHvmn1RwHQ lXDfCEqWssIXMn0zctH3Ob/uBggvWFx9g5y8slOZ6W2LaUzzNdA+ZqxIbNsgwNSN vji2E1m2ZhcwgThPYeXNsvvWJzYyC3LI4nZ8UIKE8kUMM5oxYoIlrW/qjoHc8KNV AlUv+e0Z43XHy5jHaS8sQrKGrAPrdUroDRcByw1MG/V8r4vSXepvNjOXXwEZ9SF9 xPp/bzLLRPK8FW4MQ2VEhTcO0FcjblDTQDhenDHKyjEonWOse5A9VbAyZ2dORfZ+ juDsO3xalYk+OoHjRDdzxkQLIyQOATTcxkyGxyJf+HjcD6nfsdibWzisM6nl7E9h hpwGRhb5sgbWV9QRwmkkvFBP/uCsF3rHA/qCZQCFsxs0n3Ty5wiAg2JEHEL/kPF9 6vPsX4ttukw5MbNzTyHfXOm41Ula3u4EfJvTdQjWNdx9uifBjsosetw3ho6q4XmP e6mCBFIU0Fxxfnmgij5x6ufGCBCrkpbLuZsC63HyaRDiRDN4DuftDLesVEySfJix +FiqilOyiOmSXHHC17cF8YNYsxMdFo1mGJDpV1PS1gnh/xwmEgfzwdWrjso7Y76L 9MilW/AXspuXxegu6JDu =NEY0 -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are an ACPI device power management initialization fix (-stable material), two commits renaming stuff in the ACPI processor driver to make it more suitable for ARM64 processors and a new ACPI backlight blacklist entry. Specifics: - Fix ACPI power management intialization for device objects corresponding to devices that are not present at the init time (the _STA control method returns 0 for them) and therefore should not be regarded as power manageable (Rafael J Wysocki). - Rename a structure field and two functions used by the ACPI processor driver to make them less tied to architectures that use APICs (both x86 and ia64) and more suitable for ARM64 processors (Hanjun Guo). - Add a disable_native_backlight quirk for Dell XPS15 L521X designed in an unusual way preventing native backlight from working on that machine (Hans de Goede)" * tag 'pm+acpi-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / video: Add disable_native_backlight quirk for Dell XPS15 L521X ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu() ACPI / processor: Convert apic_id to phys_id to make it arch agnostic ACPI / PM: Fix PM initialization for devices that are not present	2015-01-08 14:11:03 -08:00
Linus Torvalds	716c13a817	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a build problem with sha-mb with old toolchains and an implementation bug in the ctr(aes)/by8 branch of aesni-intel that's enabled when AVX is available" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: sha-mb - Add avx2_supported check. crypto: aesni - fix "by8" variant for 128 bit keys	2015-01-08 11:33:51 -08:00
Vitaly Kuznetsov	7be0772d19	x86/xen: avoid freeing static 'name' when kasprintf() fails In case kasprintf() fails in xen_setup_timer() we assign name to the static string "<timer kasprintf failed>". We, however, don't check that fact before issuing kfree() in xen_teardown_timer(), kernel is supposed to crash with 'kernel BUG at mm/slub.c:3341!' Solve the issue by making name a fixed length string inside struct xen_clock_event_device. 16 bytes should be enough. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-08 13:55:25 +00:00
David Vrabel	a97dae1a2e	x86/xen: add extra memory for remapped frames during setup If the non-RAM regions in the e820 memory map are larger than the size of the initial balloon, a BUG was triggered as the frames are remaped beyond the limit of the linear p2m. The frames are remapped into the initial balloon area (xen_extra_mem) but not enough of this is available. Ensure enough extra memory regions are added for these remapped frames. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>	2015-01-08 13:52:37 +00:00
David Vrabel	bc7142cf79	x86/xen: don't count how many PFNs are identity mapped This accounting is just used to print a diagnostic message that isn't very useful. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>	2015-01-08 13:52:36 +00:00
Boris Ostrovsky	701a261ad6	x86/xen: Free bootmem in free_p2m_page() during early boot With recent changes in p2m we now have legitimate cases when p2m memory needs to be freed during early boot (i.e. before slab is initialized). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-08 13:52:36 +00:00
Rafael J. Wysocki	794c3a0a93	Merge branches 'acpi-pm', 'acpi-processor' and 'acpi-video' * acpi-pm: ACPI / PM: Fix PM initialization for devices that are not present * acpi-processor: ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu() ACPI / processor: Convert apic_id to phys_id to make it arch agnostic * acpi-video: ACPI / video: Add disable_native_backlight quirk for Dell XPS15 L521X	2015-01-06 23:35:43 +01:00
Hanjun Guo	d02dc27db0	ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu() acpi_map_lsapic() will allocate a logical CPU number and map it to physical CPU id (such as APIC id) for the hot-added CPU, it will also do some mapping for NUMA node id and etc, acpi_unmap_lsapic() will do the reverse. We can see that the name of the function is a little bit confusing and arch (IA64) dependent so rename them as acpi_(un)map_cpu() to make arch agnostic and explicit. Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-01-05 23:34:26 +01:00
Vinson Lee	0b8c960cf6	crypto: sha-mb - Add avx2_supported check. This patch fixes this allyesconfig target build error with older binutils. LD arch/x86/crypto/built-in.o ld: arch/x86/crypto/sha-mb/built-in.o: No such file: No such file or directory Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Vinson Lee <vlee@twitter.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-01-05 21:35:02 +11:00
Mathias Krause	0b1e95b2fa	crypto: aesni - fix "by8" variant for 128 bit keys The "by8" counter mode optimization is broken for 128 bit keys with input data longer than 128 bytes. It uses the wrong key material for en- and decryption. The key registers xkey0, xkey4, xkey8 and xkey12 need to be preserved in case we're handling more than 128 bytes of input data -- they won't get reloaded after the initial load. They must therefore be (a) loaded on the first iteration and (b) be preserved for the latter ones. The implementation for 128 bit keys does not comply with (a) nor (b). Fix this by bringing the implementation back to its original source and correctly load the key registers and preserve their values by not re-using the registers for other purposes. Kudos to James for reporting the issue and providing a test case showing the discrepancies. Reported-by: James Yonan <james@openvpn.net> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.18 Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-01-05 21:35:02 +11:00
Daniel Borkmann	b485342bd7	x86, um: actually mark system call tables readonly Commit `a074335a37` ("x86, um: Mark system call tables readonly") was supposed to mark the sys_call_table in UML as RO by adding the const, but it doesn't have the desired effect as it's nevertheless being placed into the data section since __cacheline_aligned enforces sys_call_table being placed into .data..cacheline_aligned instead. We need to use the ____cacheline_aligned version instead to fix this issue. Before: $ nm -v arch/x86/um/sys_call_table_64.o \| grep -1 "sys_call_table" U sys_writev 0000000000000000 D sys_call_table 0000000000000000 D syscall_table_size After: $ nm -v arch/x86/um/sys_call_table_64.o \| grep -1 "sys_call_table" U sys_writev 0000000000000000 R sys_call_table 0000000000000000 D syscall_table_size Fixes: `a074335a37` ("x86, um: Mark system call tables readonly") Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2015-01-04 14:21:25 +01:00
Ingo Molnar	2aba73a614	This is hopefully the last vdso fix for 3.19. It should be very safe (it just adds a volatile). I don't think it fixes an actual bug (the __getcpu calls in the pvclock code may not have been needed in the first place), but discussion on that point is ongoing. It also fixes a big performance issue in 3.18 and earlier in which the lsl instructions in vclock_gettime got hoisted so far up the function that they happened even when the function they were in was never called. n 3.19, the performance issue seems to be gone due to the whims of my compiler and some interaction with a branch that's now gone. I'll hopefully have a much bigger overhaul of the pvclock code for 3.20, but it needs careful review. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUmduGAAoJEK9N98ZeDfrk874H/RkkP+y6/DmdKVR1dTOUQW4u f1wPU0/sc5xywGjNcfR3XwUuyBJyd3s81WVaE5XXHfCHnbjG2Z4CNTqga27hL1D0 io01Q2s3dh1Y5c0cccVmJmyw//YVzMUOzGTNM9R0NKQNXmYUz6jgQaqk+wWORdD6 JXCU3/LI5VT0fjNPLj1M9l59eC2Qg/V4GqY2xRJ1AfbwkX1CFZTcWUPb+4FScVYv 9gds/vOoFg54MypVJD4SeIC9I8U0qcim9gV7gGFdzyDNCXS5J4P+02sEOFNu8oYy HVK1B0LXhswT08Ho1yRxXUhFxpqEGeGJvTlDTvwy+r/yuKE2AVBtlhLQBqMPhnY= =u3d2 -----END PGP SIGNATURE----- Merge tag 'pr-20141223-x86-vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux into x86/urgent Pull VDSO fix from Andy Lutomirski: "This is hopefully the last vdso fix for 3.19. It should be very safe (it just adds a volatile). I don't think it fixes an actual bug (the __getcpu calls in the pvclock code may not have been needed in the first place), but discussion on that point is ongoing. It also fixes a big performance issue in 3.18 and earlier in which the lsl instructions in vclock_gettime got hoisted so far up the function that they happened even when the function they were in was never called. n 3.19, the performance issue seems to be gone due to the whims of my compiler and some interaction with a branch that's now gone. I'll hopefully have a much bigger overhaul of the pvclock code for 3.20, but it needs careful review." Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-01-01 22:21:22 +01:00
Paolo Bonzini	a629df7ead	kvm: x86: drop severity of "generation wraparound" message Since most virtual machines raise this message once, it is a bit annoying. Make it KERN_DEBUG severity. Cc: stable@vger.kernel.org Fixes: `7a2e8aaf0f` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-12-27 21:52:28 +01:00
Tiejun Chen	baa035227b	kvm: x86: vmx: reorder some msr writing The commit `34a1cd60d1`, "x86: vmx: move some vmx setting from vmx_init() to hardware_setup()", tried to refactor some codes specific to vmx hardware setting into hardware_setup(), but some msr writing should depend on our previous setting condition like enable_apicv, enable_ept and so on. Reported-by: Jamie Heilman <jamie@audible.transient.net> Tested-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-12-27 21:52:10 +01:00
Andy Lutomirski	1ddf0b1b11	x86, vdso: Use asm volatile in __getcpu In Linux 3.18 and below, GCC hoists the lsl instructions in the pvclock code all the way to the beginning of __vdso_clock_gettime, slowing the non-paravirt case significantly. For unknown reasons, presumably related to the removal of a branch, the performance issue is gone as of `e76b027e64` x86,vdso: Use LSL unconditionally for vgetcpu but I don't trust GCC enough to expect the problem to stay fixed. There should be no correctness issue, because the __getcpu calls in __vdso_vlock_gettime were never necessary in the first place. Note to stable maintainers: In 3.18 and below, depending on configuration, gcc 4.9.2 generates code like this: 9c3: 44 0f 03 e8 lsl %ax,%r13d 9c7: 45 89 eb mov %r13d,%r11d 9ca: 0f 03 d8 lsl %ax,%ebx This patch won't apply as is to any released kernel, but I'll send a trivial backported version if needed. Fixes: `51c19b4f59` x86: vdso: pvclock gettime support Cc: stable@vger.kernel.org # 3.8+ Cc: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net>	2014-12-23 13:05:30 -08:00
Bjørn Mork	280dbc5723	x86/build: Clean auto-generated processor feature files Commit `9def39be4e` ("x86: Support compiling out human-friendly processor feature names") made two source file targets conditional. Such conditional targets will not be cleaned automatically by make mrproper. Fix by adding explicit clean-files targets for the two files. Fixes: `9def39be4e` ("x86: Support compiling out human-friendly processor feature names") Signed-off-by: Bjørn Mork <bjorn@mork.no> Cc: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/1419335863-10608-1-git-send-email-bjorn@mork.no Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-12-23 15:37:06 +01:00
Sylvain BERTRAND	ea174f4c4f	x86: Fix mkcapflags.sh bash-ism Chocked while compiling linux with dash shell instead of bash shell. See: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html Signed-off-by: Sylvain BERTRAND <sylvain.bertrand@gmail.com> Link: http://lkml.kernel.org/r/20141223123912.GA1386@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-12-23 15:34:57 +01:00
Jan Beulich	132978b94e	x86: Fix step size adjustment during initial memory mapping The old scheme can lead to failure in certain cases - the problem is that after bumping step_size the next (non-final) iteration is only guaranteed to make available a memory block the size of what step_size was before. E.g. for a memory block [0,3004600000) we'd have: iter start end step amount 1 3004400000 30045fffff 2M 2M 2 3004000000 30043fffff 64M 4M 3 3000000000 3003ffffff 2G 64M 4 2000000000 2fffffffff 64G 64G Yet to map 64G with 4k pages (as happens e.g. under PV Xen) we need slightly over 128M, but the first three iterations made only about 70M available. The condition (new_mapped_ram_size > mapped_ram_size) for bumping step_size is just not suitable. Instead we want to bump it when we know we have enough memory available to cover a block of the new step_size. And rather than making that condition more complicated than needed, simply adjust step_size by the largest possible factor we know we can cover at that point - which is shifting it left by one less than the difference between page table level shifts. (Interestingly the original STEP_SIZE_SHIFT definition had a comment hinting at that having been the intention, just that it should have been PUD_SHIFT-PMD_SHIFT-1 instead of (PUD_SHIFT-PMD_SHIFT)/2, and of course for non-PAE 32-bit we can't really use these two constants as they're equal there.) Furthermore the comment in get_new_step_size() didn't get updated when the bottom-down mapping logic got added. Yet while an overflow (flushing step_size to zero) of the shift doesn't matter for the top-down method, it does for bottom-up because round_up(x, 0) = 0, and an upper range boundary of zero can't really work well. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/54945C1E020000780005114E@mail.emea.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-12-23 11:39:34 +01:00
Boris Ostrovsky	8b8cd8a367	x86/xen: Remove unnecessary BUG_ON(preemptible()) in xen_setup_timer() There is no reason for having it and, with commit `250a1ac685` ("x86, smpboot: Remove pointless preempt_disable() in native_smp_prepare_cpus()"), it prevents HVM guests from booting. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2014-12-23 10:31:55 +00:00
Ingo Molnar	fbe1bf1406	One vdso fix for a longstanding ASLR bug that's been in the news lately. The vdso base address has always been randomized, and I don't think there's anything particularly wrong with the range over which it's randomized, but the implementation seems to have been buggy since the very beginning. This fixes the implementation to remove a large bias that caused a small fraction of possible vdso load addresess to be vastly more likely than the rest of the possible addresses. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUlhwwAAoJEK9N98ZeDfrklIcH/2/4P/ffRcCp0qYo7gFpXwPh bWem3ygB4xmBgiivwGJOx/GgE6/QGmedZmD6EDLoLmpuOvGjdp4iRmvU1dCSDWOE bMOBa1cxC4TPGzqlbGgjmyHgMPuihJq6GInAqmpJk/hZJ8W7JfnXyoZbt9pj4UBW gYKMLa0gSF/rMTZ5hkDQ6mVH65M7jJnmHLRydTpK8Ryfap2lu01MIr6mC6xaobVc 5NfbI8bZexQECGLmPsFRnFWrYXNz86PKtN0j4YnPRVPRbBSTi8nHGu+zK2CJXpgu 9hyZ0eOdqSEi4wwQgI9g9lZSsa1C+5QBc6BwiDy5XjToPJOx5EnT6+m9O+fy9Q8= =BVMi -----END PGP SIGNATURE----- Merge tag 'pr-20141220-x86-vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux into x86/urgent Pull a VDSO fix from Andy Lutomirski: "One vdso fix for a longstanding ASLR bug that's been in the news lately. The vdso base address has always been randomized, and I don't think there's anything particularly wrong with the range over which it's randomized, but the implementation seems to have been buggy since the very beginning. This fixes the implementation to remove a large bias that caused a small fraction of possible vdso load addresess to be vastly more likely than the rest of the possible addresses." Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-12-21 11:16:49 +01:00
Andy Lutomirski	394f56fe48	x86_64, vdso: Fix the vdso address randomization algorithm The theory behind vdso randomization is that it's mapped at a random offset above the top of the stack. To avoid wasting a page of memory for an extra page table, the vdso isn't supposed to extend past the lowest PMD into which it can fit. Other than that, the address should be a uniformly distributed address that meets all of the alignment requirements. The current algorithm is buggy: the vdso has about a 50% probability of being at the very end of a PMD. The current algorithm also has a decent chance of failing outright due to incorrect handling of the case where the top of the stack is near the top of its PMD. This fixes the implementation. The paxtest estimate of vdso "randomisation" improves from 11 bits to 18 bits. (Disclaimer: I don't know what the paxtest code is actually calculating.) It's worth noting that this algorithm is inherently biased: the vdso is more likely to end up near the end of its PMD than near the beginning. Ideally we would either nix the PMD sharing requirement or jointly randomize the vdso and the stack to reduce the bias. In the mean time, this is a considerable improvement with basically no risk of compatibility issues, since the allowed outputs of the algorithm are unchanged. As an easy test, doing this: for i in `seq 10000` do grep -P vdso /proc/self/maps \|cut -d- -f1 done \|sort \|uniq -d used to produce lots of output (1445 lines on my most recent run). A tiny subset looks like this: 7fffdfffe000 7fffe01fe000 7fffe05fe000 7fffe07fe000 7fffe09fe000 7fffe0bfe000 7fffe0dfe000 Note the suspicious fe000 endings. With the fix, I get a much more palatable 76 repeated addresses. Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@amacapital.net>	2014-12-20 16:56:57 -08:00
Linus Torvalds	60815cf2e0	kernel: Provide READ_ONCE and ASSIGN_ONCE As discussed on LKML http://marc.info/?i=54611D86.4040306%40de.ibm.com ACCESS_ONCE might fail with specific compilers for non-scalar accesses. Here is a set of patches to tackle that problem. The first patch introduce READ_ONCE and ASSIGN_ONCE. If the data structure is larger than the machine word size memcpy is used and a warning is emitted. The next patches fix up several in-tree users of ACCESS_ONCE on non-scalar types. This merge does not yet contain a patch that forces ACCESS_ONCE to work only on scalar types. This is targetted for the next merge window as Linux next already contains new offenders regarding ACCESS_ONCE vs. non-scalar types. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJUkrVGAAoJEBF7vIC1phx8stkP/2LmN5y6LOseoEW06xa5MX4m cbIKsZNtsGHl7EDcTzzuWs6Sq5/Cj7V3yzeBF7QGbUKOqvFWU3jvpUBCCfjMg37C 77/Vf0ZPrxTXXxeJ4Ykdy2CGvuMtuYY9TWkrRNKmLU0xex7lGblEzCt9z6+mZviw 26/DN8ctjkHRvIUAi+7RfQBBc3oSMYAC1mzxYKBAsAFLV+LyFmsGU/4iofZMAsdt XFyVXlrLn0Bjx/MeceGkOlMDiVx4FnfccfFaD4hhuTLBJXWitkUK/MRa4JBiXWzH agY8942A8/j9wkI2DFp/pqZYqA/sTXLndyOWlhE//ZSti0n0BSJaOx3S27rTLkAc 5VmZEVyIrS3hyOpyyAi0sSoPkDnjeCHmQg9Rqn34/poKLd7JDrW2UkERNCf/T3eh GI2rbhAlZz3v5mIShn8RrxzslWYmOObpMr3HYNUdRk8YUfTf6d6aZ3txHp2nP4mD VBAEzsvP9rcVT2caVhU2dnBzeaZAj3zeDxBtjcb3X2osY9tI7qgLc9Fa/fWKgILk 2evkLcctsae2mlLNGHyaK3Dm/ZmYJv+57MyaQQEZNfZZgeB1y4k0DkxH4w1CFmCi s8XlH5voEHgnyjSQXXgc/PNVlkPAKr78ZyTiAfiKmh8rpe41/W4hGcgao7L9Lgiu SI0uSwKibuZt4dHGxQuG =IQ5o -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux Pull ACCESS_ONCE cleanup preparation from Christian Borntraeger: "kernel: Provide READ_ONCE and ASSIGN_ONCE As discussed on LKML http://marc.info/?i=54611D86.4040306%40de.ibm.com ACCESS_ONCE might fail with specific compilers for non-scalar accesses. Here is a set of patches to tackle that problem. The first patch introduce READ_ONCE and ASSIGN_ONCE. If the data structure is larger than the machine word size memcpy is used and a warning is emitted. The next patches fix up several in-tree users of ACCESS_ONCE on non-scalar types. This does not yet contain a patch that forces ACCESS_ONCE to work only on scalar types. This is targetted for the next merge window as Linux next already contains new offenders regarding ACCESS_ONCE vs. non-scalar types" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux: s390/kvm: REPLACE barrier fixup with READ_ONCE arm/spinlock: Replace ACCESS_ONCE with READ_ONCE arm64/spinlock: Replace ACCESS_ONCE READ_ONCE mips/gup: Replace ACCESS_ONCE with READ_ONCE x86/gup: Replace ACCESS_ONCE with READ_ONCE x86/spinlock: Replace ACCESS_ONCE with READ_ONCE mm: replace ACCESS_ONCE with READ_ONCE or barriers kernel: Provide READ_ONCE and ASSIGN_ONCE	2014-12-20 16:48:59 -08:00
Linus Torvalds	e589c9e13a	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Thomas Gleixner: "After stopping the full x86/apic branch, I took some time to go through the first block of patches again, which are mostly cleanups and preparatory work for the irqdomain conversion and ioapic hotplug support. Unfortunaly one of the real problematic commits was right at the beginning, so I rebased this portion of the pending patches without the offenders. It would be great to get this into 3.19. That makes reworking the problematic parts simpler. The usual tip testing did not unearth any issues and it is fully bisectible now. I'm pretty confident that this wont affect the calmness of the xmas season. Changes: - Split the convoluted io_apic.c code into domain specific parts (vector, ioapic, msi, htirq) - Introduce proper helper functions to retrieve irq specific data instead of open coded dereferencing of pointers - Preparatory work for ioapic hotplug and irqdomain conversion - Removal of the non functional pci-ioapic driver - Removal of unused irq entry stubs - Make native_smp_prepare_cpus() preemtible to avoid GFP_ATOMIC allocations for everything which is called from there. - Small cleanups and fixes" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) iommu/amd: Use helpers to access irq_cfg data structure associated with IRQ iommu/vt-d: Use helpers to access irq_cfg data structure associated with IRQ x86: irq_remapping: Use helpers to access irq_cfg data structure associated with IRQ x86, irq: Use helpers to access irq_cfg data structure associated with IRQ x86, irq: Make MSI and HT_IRQ indepenent of X86_IO_APIC x86, irq: Move IRQ initialization routines from io_apic.c into vector.c x86, irq: Move IOAPIC related declarations from hw_irq.h into io_apic.h x86, irq: Move HT IRQ related code from io_apic.c into htirq.c x86, irq: Move PCI MSI related code from io_apic.c into msi.c x86, irq: Replace printk(KERN_LVL) with pr_lvl() utilities x86, irq: Make UP version of irq_complete_move() an inline stub x86, irq: Move local APIC related code from io_apic.c into vector.c x86, irq: Introduce helpers to access struct irq_cfg x86, irq: Protect __clear_irq_vector() with vector_lock x86, irq: Rename local APIC related functions in io_apic.c as apic_xxx() x86, irq: Refine hw_irq.h to prepare for irqdomain support x86, irq: Convert irq_2_pin list to generic list x86, irq: Kill useless parameter 'irq_attr' of IO_APIC_get_PCI_irq_vector() x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI x86, irq: Introduce helper to check whether an IOAPIC has been registered ...	2014-12-19 14:02:02 -08:00
Linus Torvalds	a54455766b	Merge branch 'x86-mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 MPX fixes from Thomas Gleixner: "Three updates for the new MPX infrastructure: - Use the proper error check in the trap handler - Add a proper config option for it - Bring documentation up to date" * 'x86-mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, mpx: Give MPX a real config option prompt x86, mpx: Update documentation x86_64/traps: Fix always true condition	2014-12-19 13:22:42 -08:00
Linus Torvalds	1092b596a5	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Ingo Molnar: "This contains a single TLS ABI validation fix from Andy Lutomirski" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tls: Don't validate lm in set_thread_area() after all	2014-12-19 13:18:31 -08:00
Linus Torvalds	88a57667f2	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes and cleanups from Ingo Molnar: "A kernel fix plus mostly tooling fixes, but also some tooling restructuring and cleanups" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) perf: Fix building warning on ARM 32 perf symbols: Fix use after free in filename__read_build_id perf evlist: Use roundup_pow_of_two tools: Adopt roundup_pow_of_two perf tools: Make the mmap length autotuning more robust tools: Adopt rounddown_pow_of_two and deps tools: Adopt fls_long and deps tools: Move bitops.h from tools/perf/util to tools/ tools: Introduce asm-generic/bitops.h tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/lib tools: Whitespace prep patches for moving bitops.h tools: Move code originally from asm-generic/atomic.h into tools/include/asm-generic/ tools: Move code originally from linux/log2.h to tools/include/linux/ tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.h perf evlist: Do not use hard coded value for a mmap_pages default perf trace: Let the perf_evlist__mmap autosize the number of pages to use perf evlist: Improve the strerror_mmap method perf evlist: Clarify sterror_mmap variable names perf evlist: Fixup brown paper bag on "hint" for --mmap-pages cmdline arg perf trace: Provide a better explanation when mmap fails ...	2014-12-19 13:15:24 -08:00

1 2 3 4 5 ...

20565 Commits