linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-15 08:31:55 +00:00

Author	SHA1	Message	Date
James Hogan	c838e72a35	metag: export clear_page and copy_page Various file systems use clear_page() and copy_page(), so when they're built as modules we get build errors like the following: ERROR: "clear_page" [fs/ntfs/ntfs.ko] undefined! ERROR: "copy_page" [fs/nilfs2/nilfs2.ko] undefined! Therefore export these functions to modules from metag_ksyms.c to fix the errors. This was hit by a randconfig build. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:11:13 +00:00
James Hogan	f626dc704e	metag: export metag_code_cache_flush_all Various file systems indirectly use metag_code_cache_flush_all(), so when they're built as modules we get build errors like the following: ERROR: "metag_code_cache_flush_all" [fs/xfs/xfs.ko] undefined! Therefore export this function to modules to fix the errors. This was hit by a randconfig build. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:11:12 +00:00
James Hogan	3d6b7bb0a2	metag: protect more non-MMU memory regions Rename setup_txprivext() to setup_priv() and add initialisation of some more per-thread privilege protection registers: - TxPRIVSYSR: 0x04400000-0x047fffff 0x05000000-0x07ffffff 0x84000000-0x87ffffff - TxPIOREG: 0x02000000-0x02ffffff 0x04800000-0x048fffff - TxSYREG: 0x04000000-0x04000fff (except write fetch system event) Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:58 +00:00
James Hogan	c787c2d62f	metag: make TXPRIVEXT bits explicit Define PRIV_BITS using explicit constants from <asm/metag_regs.h> rather than with a hard coded value. This also adds a couple of missing definitions for the TXPRIVEXT priv bits for protecting writes to TXTIMER and the trace registers. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:58 +00:00
James Hogan	82f0167aa4	metag: kernel/setup.c: sort includes Sort includes in kernel/setup.c. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:57 +00:00
James Hogan	883a635591	metag: add boot time LNKGET/LNKSET check Add boot time check for whether LNKGET/LNKSET go through or around the cache. Depending on the configuration an info message (no harm), warning (technically wrong but no harm), or big WARN (expect failure in either kernel or userland) may be emitted if the behaviour is not as expected: Configuration Hardware Response ------------------------------------------ -------- -------- AROUND_CACHE through pr_info !AROUND_CACHE && ATOMICITY_LNKGET around WARN (kernel) " && !ATOMICITY_LNKGET && SMP around WARN (user) " " && !SMP around pr_warn Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:56 +00:00
James Hogan	0a38a8adc5	metag: add __init to metag_cache_probe() metag_cache_probe() is only called from setup_arch(), so add the __init attribute to it. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:56 +00:00
James Hogan	ae85ac71b7	metag: Add JTAG Debug Adapter (DA) support Add basic JTAG Debug Adapter (DA) support so that drivers which communicate with the DA can detect whether one is actually present (otherwise the target will halt indefinitely). This allows the metag_da TTY driver and imgdafs filesystem driver to be built, updates defconfigs, and sets up the metag_da console early if it's configured in. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:56 +00:00
James Hogan	00512bdd45	metag: ftrace support Add ftrace support for metag. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>	2013-03-02 20:09:55 +00:00
James Hogan	903b20ad68	metag: Perf Add Perf support for metag. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2013-03-02 20:09:54 +00:00
James Hogan	5633004cc2	metag: Build infrastructure Add metag build infrastructure. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:54 +00:00
James Hogan	1e57372eac	metag: Various other headers Add the remaining metag header files: - byteorder.h, swab.h (byte order and swapping) - barrier.h, cpu.h. hwthread.h, processor.h (hardware thread related) - bug.h, elf.h, gpio.h, linkage.h, resource.h (other) Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:52 +00:00
James Hogan	e8de3486a4	metag: Stack unwinding Add stack unwinding support for metag. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:52 +00:00
James Hogan	086e9dc0e2	metag: Optimised library functions Add optimised library functions for metag. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:52 +00:00
James Hogan	f507758ccb	metag: DMA Add DMA mapping code. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:51 +00:00
James Hogan	42682c6c42	metag: SMP support Add SMP support for metag. This allows Linux to take control of multiple hardware threads on a single Meta core, treating them as separate Linux CPUs. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:51 +00:00
James Hogan	6006c0d8ce	metag: Atomics, locks and bitops Add header files to implement Meta hardware thread locks (used by some other atomic operations), atomics, spinlocks, and bitops. There are 2 main types of atomic primitives for metag (in addition to IRQs off on UP): - LOCK instructions provide locking between hardware threads. - LNKGET/LNKSET instructions provide load-linked/store-conditional operations allowing for lighter weight atomics on Meta2 LOCK instructions allow for hardware threads to acquire voluntary or exclusive hardware thread locks: - LOCK0 releases exclusive and voluntary lock from the running hardware thread. - LOCK1 acquires the voluntary hardware lock, blocking until it becomes available. - LOCK2 implies LOCK1, and additionally acquires the exclusive hardware lock, blocking all other hardware threads from executing. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:50 +00:00
James Hogan	9b802d1f43	metag: Module support Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:49 +00:00
James Hogan	44dea393cf	metag: Scheduling/Process management Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:49 +00:00
James Hogan	26025bbfbb	metag: System Calls Add metag system call and gateway page interfaces. The metag architecture port uses the generic system call numbers from asm-generic/unistd.h, as well as a user gateway page mapped at 0x6ffff000 which contains fast atomic primitives (depending on SMP) and a fast method of accessing TLS data. System calls use the SWITCH instruction with the immediate 0x440001 to signal a system call. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:49 +00:00
James Hogan	5698c50d9d	metag: Internal and external irqchips Meta core internal interrupts (from HWSTATMETA and friends) are vectored onto the TR1 core trigger for the current thread. This is demultiplexed in irq-metag.c to individual Linux IRQs for each internal interrupt. External SoC interrupts (from HWSTATEXT and friends) are vectored onto the TR2 core trigger for the current thread. This is demultiplexed in irq-metag-ext.c to individual Linux IRQs for each external SoC interrupt. The external irqchip has devicetree bindings for configuring the number of irq banks and the type of masking available. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Rob Landley <rob@landley.net> Cc: Dom Cobley <popcornmix@gmail.com> Cc: Simon Arlott <simon@fire.lp0.eu> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: devicetree-discuss@lists.ozlabs.org Cc: linux-doc@vger.kernel.org	2013-03-02 20:09:48 +00:00
James Hogan	63047ea360	metag: IRQ handling Add core IRQ handling for metag. The code in irq.c exposes the TBX signal numbers as Linux IRQs. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:48 +00:00
James Hogan	ac919f0883	metag: Traps Add trap code for metag. At the lowest level Meta traps (and return from interrupt instruction - RTI) simply swap the PC and PCX registers and optionally toggle the interrupt status bit (ISTAT). Low level TBX code in tbipcx.S handles the core context save, determine the TBX signal number based on the core trigger that fired (using the TXSTATI status register), and call TBX signal handlers (mostly in traps.c) via a vector table. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Al Viro <viro@zeniv.linux.org.uk>	2013-03-02 20:09:45 +00:00
James Hogan	a2c5d4ed92	metag: Time keeping Add time keeping code for metag. Meta hardware threads have 2 timers. The background timer (TXTIMER) is used as a free-running time base, and the interrupt timer (TXTIMERI) is used for the timer interrupt. Both counters traditionally count at approximately 1MHz. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de>	2013-03-02 20:09:22 +00:00
James Hogan	bc3966bf15	metag: ptrace The ptrace interface for metag provides access to some core register sets using the PTRACE_GETREGSET and PTRACE_SETREGSET operations. The details of the internal context structures is abstracted into user API structures to both ease use and allow flexibility to change the internal context layouts. Copyin and copyout functions for these register sets are exposed to allow signal handling code to use them to copy to and from the signal context. struct user_gp_regs (NT_PRSTATUS) provides access to the core general purpose register context. struct user_cb_regs (NT_METAG_CBUF) provides access to the TXCATCH* registers which contains information abuot a memory fault, unaligned access error or watchpoint. This can be modified to alter the way the fault is replayed on resume ("catch replay"), or to prevent the replay taking place. struct user_rp_state (NT_METAG_RPIPE) provides access to the state of the Meta read pipeline which can be used to hide memory latencies in hand optimised data loops. Extended DSP register state, DSP RAM, and hardware breakpoint registers aren't yet exposed through ptrace. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Tony Lindgren <tony@atomide.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>	2013-03-02 20:09:22 +00:00
James Hogan	29dd78cf0b	metag: Device tree Add device tree files to arch/metag. Signed-off-by: James Hogan <james.hogan@imgtec.com> Reviewed-by: Vineet Gupta <vgupta@synopsys.com>	2013-03-02 20:09:22 +00:00
James Hogan	262d96b0de	metag: Signal handling Add signal handling code for metag. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Al Viro <viro@zeniv.linux.org.uk>	2013-03-02 20:09:21 +00:00
James Hogan	c438b58e65	metag: TCM support Add some TCM support Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:21 +00:00
James Hogan	bbc17704d5	metag: Highmem support Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:20 +00:00
James Hogan	e624e95bd8	metag: Huge TLB Add huge TLB support to the metag architecture. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:20 +00:00
James Hogan	373cd784d0	metag: Memory handling Meta has instructions for accessing: - bytes - GETB (1 byte) - words - GETW (2 bytes) - doublewords - GETD (4 bytes) - longwords - GETL (8 bytes) All accesses must be aligned. Unaligned accesses can be detected and made to fault on Meta2, however it isn't possible to fix up unaligned writes so we don't bother fixing up reads either. This patch adds metag memory handling code including: - I/O memory (io.h, ioremap.c): Actually any virtual memory can be accessed with these helpers. A part of the non-MMUable address space is used for memory mapped I/O. The ioremap() function is implemented one to one for non-MMUable addresses. - User memory (uaccess.h, usercopy.c): User memory is directly accessible from privileged code. - Kernel memory (maccess.c): probe_kernel_write() needs to be overwridden to use the I/O functions when doing a simple aligned write to non-writecombined memory, otherwise the write may be split by the generic version. Note that due to the fact that a portion of the virtual address space is non-MMUable, and therefore always maps directly to the physical address space, metag specific I/O functions are made available (metag_in32, metag_out32 etc). These cast the address argument to a pointer so that they can be used with raw physical addresses. These accessors are only to be used for accessing fixed core Meta architecture registers in the non-MMU region, and not for any SoC/peripheral registers. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:19 +00:00
James Hogan	f5df8e268f	metag: Memory management Add memory management files for metag. Meta's 32bit virtual address space is split into two halves: - local (0x08000000-0x7fffffff): traditionally local to a hardware thread and incoherent between hardware threads. Each hardware thread has it's own local MMU table. On Meta2 the local space can be globally coherent (GCOn) if the cache partitions coincide. - global (0x88000000-0xffff0000): coherent and traditionally global between hardware threads. On Meta2, each hardware thread has it's own global MMU table. The low 128MiB of each half is non-MMUable and maps directly to the physical address space: - 0x00010000-0x07ffffff: contains Meta core registers and maps SoC bus - 0x80000000-0x87ffffff: contains low latency global core memories Linux usually further splits the local virtual address space like this: - 0x08000000-0x3fffffff: user mappings - 0x40000000-0x7fffffff: kernel mappings Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:19 +00:00
James Hogan	99ef7c2ac1	metag: Cache/TLB handling Add cache and TLB handling code for metag, including the required callbacks used by MM switches and DMA operations. Caches can be partitioned between the hardware threads and the global space, however this is usually configured by the bootloader so Linux doesn't make any changes to this configuration. TLBs aren't configurable, so only need consideration to flush them. On Meta1 the L1 cache was VIVT which required a full flush on MM switch. Meta2 has a VIPT L1 cache so it doesn't require the full flush on MM switch. Meta2 can also have a writeback L2 with hardware prefetch which requires some special handling. Support is optional, and the L2 can be detected and initialised by Linux. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:19 +00:00
James Hogan	027f891f76	metag: TBX source Add source files from the Thread Binary Interface (TBI) library which provides useful low level operations and traps/context management. Among other things it handles interrupt/exception/syscall entry (in tbipcx.S). Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:18 +00:00
James Hogan	4ca151b208	metag: TBX header Add the main header for the Thread Binary Interface (TBI) library which provides useful low level operations and trap/context management. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:18 +00:00
James Hogan	85d9d7a920	metag: Boot Add boot code for metag. Due to the multi-threaded nature of Meta it is not uncommon for an RTOS or bare metal application to be started on other hardware threads by the bootloader. Since there is a single MMU switch which affects all threads, the MMU is traditionally configured by the bootloader prior to starting Linux. The bootloader passes a structure to Linux which among other things contains information about memory regions which have been mapped. Linux then assumes control of the local heap memory region. A kernel arguments string pointer or a flattened device tree pointer can be provided in the third argument. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:17 +00:00
James Hogan	87aa1328f2	metag: Header for core memory mapped registers Add the header <asm/metag_mem.h> describing addresses, fields, and bits of various core memory mapped registers in the low non-MMU region. Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:17 +00:00
James Hogan	af8a10493e	metag: Headers for core arch constants Add a couple of header files containing core architecture constants. The first (<asm/metag_isa.h>) contains some constants relating to the instruction set, such as values to give to the CACHEW and CACHER instructions. The second (<asm/metag_regs.h>) contains constants for the core register units directly accessible to various instructions, and for the registers, fields, and bits in those units. The main units described are the control unit (CT.), the trigger unit (TR.), and the run-time trace unit (TT.*). Signed-off-by: James Hogan <james.hogan@imgtec.com>	2013-03-02 20:09:16 +00:00
James Hogan	c19fa94a8f	Add HAVE_64BIT_ALIGNED_ACCESS On 64 bit architectures with no efficient unaligned access, padding and explicit alignment must be added in various places to prevent unaligned 64bit accesses (such as taskstats and trace ring buffer). However this also needs to apply to 32 bit architectures with 64 bit accesses requiring alignment such as metag. This is solved by adding a new Kconfig symbol HAVE_64BIT_ALIGNED_ACCESS which defaults to 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS, and can be explicitly selected by METAG and any other relevant architectures. This can be used in various places to determine whether 64bit alignment is required. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Paris <eparis@redhat.com> Cc: Will Drewry <wad@chromium.org>	2013-03-02 20:09:15 +00:00
Linus Torvalds	f741656d64	Revert the PVonHVM kexec. The patch introduces a regression with older hypervisor stacks, such as Xen 4.1. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQEcBAABAgAGBQJRHZ7eAAoJEFjIrFwIi8fJZ+sH/ieMkzdBB6aqbFMcNr7mkfBo i3swjO2JQI7REYIHfKEVoR3IgHfqKEuABdeEQrceE0XqDepFh84YiKGI2QpPRWEA 903vUV4DXVdcBrypbL45tSFZ1Jxsrzx+F7WfV/f9WHyeiwOyaZTGVQH0VuOzpcum RvPTT7MmC7g8MJDi66SDYBaX/pBQzifQ81nMWWjXNw0w4CwWX7le1cScZEP42MR6 jTEHzYMLDojdO+2aQM5pt/0CGI5tzBHtX5nNRl6tovlPI3ckknYYx6a7RfxkfZzF IkMIuGS32yLfsswPPIiMs47/Qgiq3BN6eSTJXMZKUwQokL9yEs8LodcnRDYfgyQ= =fqcJ -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two fixes: - A simple bug-fix for redundant NULL check. - CVE-2013-0228/XSA-42: x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS and two reverts: - Revert the PVonHVM kexec. The patch introduces a regression with older hypervisor stacks, such as Xen 4.1." * tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: Revert "xen PVonHVM: use E820_Reserved area for shared_info" Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info" xen: remove redundant NULL check before unregister_and_remove_pcpu(). x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.	2013-02-15 12:12:55 -08:00
Linus Torvalds	11e7651432	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc Pull sparc fixes from David Miller: "A couple small fixes for sparc including some THP brown-paper-bag material: 1) During the merging of all the THP support for various architectures, sparc missed adding a HAVE_ARCH_TRANSPARENT_HUGEPAGE to it's Kconfig, oops. 2) Sparc needs to be mindful of hugepages in get_user_pages_fast(). 3) Fix memory leak in SBUS probe, from Cong Ding. 4) The sunvdc virtual disk client driver has a test of the bitmask of vdisk server supported operations which was off by one bit" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sunvdc: Fix off-by-one in generic_request(). sparc64: Fix get_user_pages_fast() wrt. THP. sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE. sparc: kernel/sbus.c: fix memory leakage	2013-02-15 12:05:57 -08:00
Konrad Rzeszutek Wilk	e9daff24a2	Revert "xen PVonHVM: use E820_Reserved area for shared_info" This reverts commit `9d02b43dee`. We are doing this b/c on 32-bit PVonHVM with older hypervisors (Xen 4.1) it ends up bothing up the start_info. This is bad b/c we use it for the time keeping, and the timekeeping code loops forever - as the version field never changes. Olaf says to revert it, so lets do that. Acked-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-02-14 21:29:31 -05:00
Konrad Rzeszutek Wilk	5eb65be2d9	Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info" This reverts commit `a7be94ac8d`. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-02-14 21:29:27 -05:00
Satoru Takeuchi	1de63d60cd	efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter There was a serious problem in samsung-laptop that its platform driver is designed to run under BIOS and running under EFI can cause the machine to become bricked or can cause Machine Check Exceptions. Discussion about this problem: https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557 https://bugzilla.kernel.org/show_bug.cgi?id=47121 The patches to fix this problem: efi: Make 'efi_enabled' a function to query EFI facilities `83e6818974` samsung-laptop: Disable on EFI hardware `e0094244e4` Unfortunately this problem comes back again if users specify "noefi" option. This parameter clears EFI_BOOT and that driver continues to run even if running under EFI. Refer to the document, this parameter should clear EFI_RUNTIME_SERVICES instead. Documentation/kernel-parameters.txt: =============================================================================== ... noefi [X86] Disable EFI runtime services support. ... =============================================================================== Documentation/x86/x86_64/uefi.txt: =============================================================================== ... - If some or all EFI runtime services don't work, you can try following kernel command line parameters to turn off some or all EFI runtime services. noefi turn off all EFI runtime services ... =============================================================================== Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Link: http://lkml.kernel.org/r/511C2C04.2070108@jp.fujitsu.com Cc: Matt Fleming <matt.fleming@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-02-13 17:24:11 -08:00
Jan Beulich	13d2b4d11d	x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS. This fixes CVE-2013-0228 / XSA-42 Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user in 32bit PV guest can use to crash the > guest with the panic like this: ------------- general protection fault: 0000 [#1] SMP last sysfs file: /sys/devices/vbd-51712/block/xvda/dev Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1 EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0 EIP is at xen_iret+0x12/0x2b EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010 ESI: 00000000 EDI: 003d0f00 EBP: `b77f8388` ESP: eb8d1fe0 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069 Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000) Stack: 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000 Call Trace: Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02 EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0 general protection fault: 0000 [#2] ---[ end trace ab0d29a492dcd330 ]--- Kernel panic - not syncing: Fatal exception Pid: 1250, comm: r Tainted: G D --------------- 2.6.32-356.el6.i686 #1 Call Trace: [<c08476df>] ? panic+0x6e/0x122 [<c084b63c>] ? oops_end+0xbc/0xd0 [<c084b260>] ? do_general_protection+0x0/0x210 [<c084a9b7>] ? error_code+0x73/ ------------- Petr says: " I've analysed the bug and I think that xen_iret() cannot cope with mangled DS, in this case zeroed out (null selector/descriptor) by either xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT entry was invalidated by the reproducer. " Jan took a look at the preliminary patch and came up a fix that solves this problem: "This code gets called after all registers other than those handled by IRET got already restored, hence a null selector in %ds or a non-null one that got loaded from a code or read-only data descriptor would cause a kernel mode fault (with the potential of crashing the kernel as a whole, if panic_on_oops is set)." The way to fix this is to realize that the we can only relay on the registers that IRET restores. The two that are guaranteed are the %cs and %ss as they are always fixed GDT selectors. Also they are inaccessible from user mode - so they cannot be altered. This is the approach taken in this patch. Another alternative option suggested by Jan would be to relay on the subtle realization that using the %ebp or %esp relative references uses the %ss segment. In which case we could switch from using %eax to %ebp and would not need the %ss over-rides. That would also require one extra instruction to compensate for the one place where the register is used as scaled index. However Andrew pointed out that is too subtle and if further work was to be done in this code-path it could escape folks attention and lead to accidents. Reviewed-by: Petr Matousek <pmatouse@redhat.com> Reported-by: Petr Matousek <pmatouse@redhat.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-02-13 15:40:30 -05:00
David S. Miller	89a77915e0	sparc64: Fix get_user_pages_fast() wrt. THP. Mostly mirrors the s390 logic, as unlike x86 we don't need the SetPageReferenced() bits. On sparc64 we also lack a user/privileged bit in the huge PMDs. In order to make this work for THP and non-THP builds, some header file adjustments were necessary. Namely, provide the PMD_HUGE_* bit defines and the pmd_large() inline unconditionally rather than protected by TRANSPARENT_HUGEPAGE. Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-13 12:22:14 -08:00
David S. Miller	b9156ebb7b	sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE. This got missed in the cleanups done for the S390 THP support. CC: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-13 12:22:13 -08:00
Linus Torvalds	42976ad0b2	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "One (hopefully) last batch of x86 fixes. You asked for the patch by patch justifications, so here they are: x86, MCE: Retract most UAPI exports This one unexports from userspace a bunch of definitions which should never have been exported. We really don't want to create an accidental legacy here. x86, doc: Add a bootloader ID for OVMF This is a documentation-only patch, just recording the official assignment of a boot loader ID. x86: Do not leak kernel page mapping locations Security: avoid making it needlessly easy for user space to probe the kernel memory layout. x86/mm: Check if PUD is large when validating a kernel address Prevent failures using /proc/kcore when using 1G pages. x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems Works around a BIOS problem causing boot failures on affected hardware." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Check if PUD is large when validating a kernel address x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems x86, doc: Add a bootloader ID for OVMF x86: Do not leak kernel page mapping locations x86, MCE: Retract most UAPI exports	2013-02-13 12:19:49 -08:00
Mel Gorman	0ee364eb31	x86/mm: Check if PUD is large when validating a kernel address A user reported the following oops when a backup process reads /proc/kcore: BUG: unable to handle kernel paging request at ffffbb00ff33b000 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110 [...] Call Trace: [<ffffffff811b8aaa>] read_kcore+0x17a/0x370 [<ffffffff811ad847>] proc_reg_read+0x77/0xc0 [<ffffffff81151687>] vfs_read+0xc7/0x130 [<ffffffff811517f3>] sys_read+0x53/0xa0 [<ffffffff81449692>] system_call_fastpath+0x16/0x1b Investigation determined that the bug triggered when reading system RAM at the 4G mark. On this system, that was the first address using 1G pages for the virt->phys direct mapping so the PUD is pointing to a physical address, not a PMD page. The problem is that the page table walker in kern_addr_valid() is not checking pud_large() and treats the physical address as if it was a PMD. If it happens to look like pmd_none then it'll silently fail, probably returning zeros instead of real data. If the data happens to look like a present PMD though, it will be walked resulting in the oops above. This patch adds the necessary pud_large() check. Unfortunately the problem was not readily reproducible and now they are running the backup program without accessing /proc/kcore so the patch has not been validated but I think it makes sense. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.coM> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: stable@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-02-13 10:02:55 +01:00
Linus Torvalds	a0e5056e3b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into akpm Pull s390 regression fix from Martin Schwidefsky: "The recent fix for the s390 sched_clock() function uncovered yet another bug in s390_next_ktime which causes an endless loop in KVM. This regression should be fixed before v3.8. I keep the fingers crossed that this is the last one for v3.8." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/timer: avoid overflow when programming clock comparator	2013-02-12 16:16:29 -08:00

1 2 3 4 5 ...

77822 Commits