memblock_enforce_memory_limit() takes the desired maximum quantity of memory
to end up with, not an address above which memory will not be used.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The rtas_event_scan() function uses smp_processor_id() to select a
starting point in cpu_online_mask, and does so under the protection
of get_online_cpus(). This might not select the current processor
in any case, so switch to raw_smp_processor_id().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the new functions and free the descriptor when the virq is
destroyed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Define the ARCH_IRQ_INIT_FLAGS instead of fixing it up in a loop.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix the error in spelling the config option for hw-breakpoints and fix
the build issue that follows.
Signed-off by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kyle Moffett points out that mpc85xx has started using the
ppc_md.machine_kexec hook. As such, revert patch c94868788c
(powerpc/kexec: Remove ppc_md.machine_kexec).
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Get rid of old users of of_platform_driver in arch/powerpc. Most
of_platform_driver users can be converted to use the platform_bus
directly.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
arch/powerpc/kernel/ibmebus.c is the only remaining user of the
of_bus_type support code for initializing the bus and registering
drivers. All others have either been switched to the vanilla platform
bus or already have their own infrastructure.
This patch moves the functionality that ibmebus is using out of
drivers/of/{platform,device}.c and into ibmebus.c where it is actually
used. Also renames the moved symbols from of_platform_* to
ibmebus_bus_* to reflect the actual usage.
This patch is part of moving all of the of_platform_bus_type users
over to the platform_bus_type.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reason: Import mainline device tree changes on which further patches
depend on or conflict.
Trivial conflict in: drivers/spi/pxa2xx_spi_pci.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some of those functions try to adjust the CPU features, for example
to remove NAP support on some revisions. However, they seem to use
r5 as an index into the CPU table entry, which might have been right
a long time ago but no longer is. r4 is the right register to use.
This probably caused some off behaviours on some PowerMac variants
using 750cx or 7455 processor revisions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@kernel.org
When calling setup_cpu() on 64-bit, we pass a pointer to the
cputable entry we have found. This used to be fine when cur_cpu_spec
was a pointer to that entry, but nowadays, we copy the entry into
a separate variable, and we do so before we call the setup_cpu()
callback. That means that any attempt by that callback at patching
the CPU table entry (to adjust CPU features for example) will patch
the wrong table.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, ppc32 uses sysdata for the pci_controller pointer, and
ppc64 uses it to hold the device_node pointer. This patch moves the
of_node pointer into (struct pci_bus*)->dev.of_node and
(struct pci_dev*)->dev.of_node so that sysdata can be converted to always
use the pci_controller pointer instead. It also fixes up the
allocating of pci devices so that the of_node pointer gets assigned
consistently and increments the ref count.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
There is a tiny difference between PPC32 and PPC64. Microblaze uses the
PPC32 variant.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
[grant.likely@secretlab.ca: Added comment to #endif, moved documentation
block to function implementation, fixed for non ppc and microblaze
compiles]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The DD2 core still has some unstability. Define CPU_FTR_476_DD2 to
enable workarounds in later patches.
This is based on an earlier, unreleased patch for DD1 by Ben Herrenschmidt.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Currently percpu readmostly subsection may share cachelines with other
percpu subsections which may result in unnecessary cacheline bounce
and performance degradation.
This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
linker macros, makes each arch linker scripts specify its cacheline
size and use it to align percpu subsections.
This is based on Shaohua's x86 only patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix time function double declaration with glibc
perf tools: Fix build by checking if extra warnings are supported
perf tools: Fix build when using gcc 3.4.6
perf tools: Add missing header, fixes build
perf tools: Fix 64 bit integer format strings
perf test: Fix build on older glibcs
perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/
perf test: Use cpu_map->[cpu] when setting affinity
perf symbols: Fix annotation of thumb code
perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat
powerpc, perf: Fix frequency calculation for overflowing counters (FSL version)
perf: Fix perf_event_init_task()/perf_event_free_task() interaction
perf: Fix find_get_context() vs perf_event_exit_task() race
Decoding machine checks is CPU specific and so machine_check_generic doesn't
do the right thing on 64bit chips. Luckily we never call into this code
because we call ppc_md.machine_check_exception instead if available.
Since we check cur_cpu_spec->machine_check before calling it, we may as
well remove machine_check_generic from 64bit archs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The spec suggests we should first check the extended log flag before checking
the length field.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If a machine check comes from userspace we send a SIGBUS to the task and
fail to printk anything.
If we are taking machine checks due to bad hardware we want to know about
it right away. Furthermore if we don't complain loudly then it will look
a lot like a bug in the userspace application, potentially causing a lot
of confusion.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We are calling debugger_fault_handler twice in machine_check_exception.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We should never force MSR_RI on. If we take a machine check with MSR_RI off
then we have no chance of recovering safely.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We were printing 64 bits of DSISR in show_regs even though it is 32 bit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We should disable ftrace during kexec, some of the tracers are very invasive
and we do not want them going off while doing the low level work of swapping
one kernel out for another. This mirrors what we do on x86.
Even though we cannot return from a kexec on powerpc (since we do not implement
CONFIG_KEXEC_JUMP), add the restore code in case we do one day.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the crash handler hooks to run the SPU stop code, just like we do for
ehea and cell RAS code.
While I'm here I noticed "CPUSs reliabally"
so fix the spelling MISTAKESs reliabally.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_crash_shutdown, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_kexec, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_kexec_cleanup, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With cmwq, there's no reason to use a separate workqueue in
cpufreq_spudemand. Use system_wq instead. The work items are already
sync canceled on stop, so it's already guaranteed that no work is
running when spu_gov_exit() is entered.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Dave Jones <davej@redhat.com>
Cc: cpufreq@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Simplify read file operation for /proc/powerpc/rtas/* interface
by using simple_read_from_buffer.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
32-bit variant of the previous patch for 64-bit:
<<
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
With one level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack....
>>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When fixing the frequency calculations for perf on powerpc I
forgot to fix the FSL version.
If we dont set event->hw.last_period the frequency to period
calculations in perf go haywire and we continually
throttle/unthrottle the PMU.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110118214404.2f42e634@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix tracepoint id to string perf.data header table
perf tools: Fix handling of wildcards in tracepoint event selectors
powerpc: perf: Fix frequency calculation for overflowing counters
When profiling a benchmark that is almost 100% userspace, I noticed some wildly
inaccurate profiles that showed almost all time spent in the kernel.
Closer examination shows we were programming a tiny number of cycles into the
PMU after each overflow (about ~200 away from the next overflow). This gets us
stuck in a loop which we eventually break out of by throttling the PMU (there
are regular throttle/unthrottle events in the log).
It looks like we aren't setting event->hw.last_period to something same and the
frequency to period calculations in perf are going haywire.
With the following patch we find the correct period after a few interrupts and
stay there. I also see no more throttle events.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <20110117161742.5feb3761@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The physical address is never used by the device tree code when
allocating memory for unflattening. Change the architecture's alloc
hook to return the virutal address instead.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Moved setting of RFXE bit so we get machine checks on RIO errors into
cpu_setup so that the RIO code isn't core specific.
Signed-off-by: Shaohui Xie <b21989@freescale.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (72 commits)
powerpc/pseries: Fix build of topology stuff without CONFIG_NUMA
powerpc/pseries: Fix VPHN build errors on non-SMP systems
powerpc/83xx: add mpc8308_p1m DMA controller device-tree node
powerpc/83xx: add DMA controller to mpc8308 device-tree node
powerpc/512x: try to free dma descriptors in case of allocation failure
powerpc/512x: add MPC8308 dma support
powerpc/512x: fix the hanged dma transfer issue
powerpc/512x: scatter/gather dma fix
powerpc/powermac: Make auto-loading of therm_pm72 possible
of/address: Use propper endianess in get_flags
powerpc/pci: Use printf extension %pR for struct resource
powerpc: Remove unnecessary casts of void ptr
powerpc: Disable VPHN polling during a suspend operation
powerpc/pseries: Poll VPA for topology changes and update NUMA maps
powerpc: iommu: Add device name to iommu error printks
powerpc: Record vma->phys_addr in ioremap()
powerpc: Update compat_arch_ptrace
powerpc: Fix PPC_PTRACE_SETHWDEBUG on PPC_BOOK3S
powerpc/time: printk time stamp init not correct
powerpc: Minor cleanups for machdep.h
...
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (29 commits)
of/flattree: forward declare struct device_node in of_fdt.h
ipmi: explicitly include of_address.h and of_irq.h
sparc: explicitly cast negative phandle checks to s32
powerpc/405: Fix missing #{address,size}-cells in i2c node
powerpc/5200: dts: refactor dts files
powerpc/5200: dts: Change combatible strings on localbus
powerpc/5200: dts: remove unused properties
powerpc/5200: dts: rename nodes to prepare for refactoring dts files
of/flattree: Update dtc to current mainline.
of/device: Don't register disabled devices
powerpc/dts: fix syntax bugs in bluestone.dts
of: Fixes for OF probing on little endian systems
of: make drivers depend on CONFIG_OF instead of CONFIG_PPC_OF
of/flattree: Add of_flat_dt_match() helper function
of_serial: explicitly include of_irq.h
of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree
of/flattree: Reorder unflatten_dt_node
of/flattree: Refactor unflatten_dt_node
of/flattree: Add non-boottime device tree functions
of/flattree: Add Kconfig for EARLY_FLATTREE
...
Fix up trivial conflict in arch/sparc/prom/tree_32.c as per Grant.
Extend the perf_pmu_register() interface to allow for named and
dynamic pmu types.
Because we need to support the existing static types we cannot use
dynamic types for everything, hence provide a type argument.
If we want to enumerate the PMUs they need a name, provide one.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.259707703@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using %pR standardizes the struct resource output.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tie the polling mechanism into the ibm,suspend-me rtas call to
stop/restart polling before/after a suspend, hibernate, migrate,
or checkpoint restart operation. This ensures that the system has a
chance to disable the polling if the partition is migrated to a system
that does not support VPHN (and vice versa).
Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Right now its difficult to see which device is running out of iommu space:
iommu_alloc failed, tbl c00000076e096660 vaddr c000000768806600 npages 1
Use dev_info() so we get the device name and location:
ipr 0000:00:01.0: iommu_alloc failed, tbl c00000076e096660 vaddr c000000768806600 npages 1
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update compat_arch_ptrace to follow recent changes in
PTRACE_GET_DEBUGREG and the addition of
PPC_PTRACE_{GETHWDBGINFO|{SET|DEL}HWDEBUG}. The latter three can be
forwarded to arch_ptrace unchanged.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Properly set the DABR_TRANSLATION/DABR_DATA_READ/DABR_DATA_READ bits in
the dabr when setting the debug register via PPC_PTRACE_SETHWDEBUG. Also
don't reject trigger type of PPC_BREAKPOINT_TRIGGER_READ.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
problem:
I see sometimes on my mpc5200 based board such printk timing
information:
[ 0.000000] NR_IRQS:512 nr_irqs:512 16
[ 0.000000] MPC52xx PIC is up and running!
[ 0.000000] clocksource: timebase mult[79364d9] shift[22] registered
[ 0.000000] console [ttyPSC0] enabled
[ 130.300633] pid_max: default: 32768 minimum: 301
[ 130.305647] Mount-cache hash table entries: 512
[ 130.315818] NET: Registered protocol family 16
reason:
if the tbu not starts from 0 when linux boots, boot_tb
maybe could not store the real 64 bit tbu value, because
boot_tp is only a 32 bit unsigned long.
solution:
change boot_tb to u64
[BenH: Made it u64 instead of unsigned long long]
Signed-off-by: Heiko Schocher <hs@denx.de>
cc: Wolfgang Denk <wd@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix head_64.S so that we can build a relocatable kernel
that isn't necessarily a crash-dump kernel
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We now allow interrupt stacks anywhere in the first segment which can be
256M or 1TB. Fix the comment.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The IOMMU code has been passing the dma-mask instead of the
coherent_dma_mask to the iommu allocator. Coherent allocations should
be made using the coherent_dma_mask.
Also update the vio code to ensure the coherent_dma_mask is set. Without
this change drivers, such as ibmvscsi, fail to load with the corrected
dma_iommu_alloc_coherent().
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The name field in the nvram_header can be < 12 chars, null-terminated,
or 12 chars without the null. Handle this safely.
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Simplify creation and use of the NVRAM partition list.
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The nvram log partition stuff currently in nvram_64.c is really
pseries specific. It isn't actually used on anything else (despite
the fact that we ran the code to setup the partition on anything
except powermac) and the log format is specific to pseries RTAS
implementation. So move it where it belongs
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This changes the function to use nvram_find_partition() instead
of doing the lookup "by hand". It also makes some of the logic
clearer and prints out more useful diagnostic information.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Existing code is nasty, has bugs etc... rewrite the function
more simply, and make it take the signature and optional
name of the partitions to remove as arguments, thus making
it a more generic utility.
We also try to remove a log partition that we find and is too
small rather than creating a duplicate.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This error log stuff is really pseries specific. As a first step we move
the initialization of these variables to the caller of
nvram_create_partition(), which is also slightly reorganized so we
setup the free partition before we clear the new partition, so the
chance of an error during clear leaving us with invalid headers
is lessened.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When creating a partition, we clear it entirely rather than
just the first two words since the previous code was rather
specific to the pseries log partition format.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use BUILD_BUG_ON to ensure the structure representing a partition
header have the right size.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This converts nvram_create_partition() to use a size in bytes
rather than blocks. It does the appropriate alignment internally
The size passed is also the data size (ie. doesn't include the
header anymore).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Replace nvram_create_os_partition() with a variant that takes
the partition name, signature and size as arguments.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This moves a bunch of definitions out of asm/nvram.h to the files
that use them or just outright remove completely unused stuff.
We leave the partition signatures definitions, they will be useful
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since STACK_FRAME_OVERHEAD is defined in asm/ptrace.h and that
is ASSEMBER safe, we can just include that instead of going via
asm-offsets.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds the POWER7+ cputable entry for the PVR 0x004a0000. Rest is
the same as vanilla POWER7.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These are not needed on POWER7 so remove them.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These are not needed so just remove them
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No need to have three of them.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the set_dma_ops helper. Instead of modifying vio_dma_mapping_ops,
just create a trivial wrapper for dma_supported.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These APIs take logical cpu number as input
Change cpu_first_thread_in_core() to cpu_first_thread_sibling()
Change cpu_last_thread_in_core() to cpu_last_thread_sibling()
These APIs convert core number (index) to logical cpu/thread numbers
Add cpu_first_thread_of_core(int core)
Changed cpu_thread_to_core() to cpu_core_index_of_thread(int cpu)
The goal is to make 'threads_per_core' accessible to the
pseries_energy module. Instead of making an API to read
threads_per_core, this is a higher level wrapper function to
convert from logical cpu number to core number.
The current APIs cpu_first_thread_in_core() and
cpu_last_thread_in_core() returns logical CPU number while
cpu_thread_to_core() returns core number or index which is
not a logical CPU number. The new APIs are now clearly named to
distinguish 'core number' versus first and last 'logical cpu
number' in that core.
The new APIs cpu_{first,last}_thread_sibling() work on
logical cpu numbers. While cpu_first_thread_of_core() and
cpu_core_index_of_thread() work on core index.
Example usage: (4 threads per core system)
cpu_first_thread_sibling(5) = 4
cpu_last_thread_sibling(5) = 7
cpu_core_index_of_thread(5) = 1
cpu_first_thread_of_core(1) = 4
cpu_core_index_of_thread() is used in cpu_to_drc_index() in the
module and cpu_first_thread_of_core() is used in
drc_index_to_cpu() in the module.
Make API changes to few callers. Export symbols for use in modules.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The crashkernel region will almost always overlap RTAS. If we free the
crashkernel region via "echo 0 > /sys/kernel/kexec_crash_size" then we will
free RTAS and the machine will crash in confusing and exciting ways.
Override crash_free_reserved_phys_range and check for overlap with RTAS.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.
The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The perf hardware pmu got initialized at various points in the boot,
some before early_initcall() some after (notably arch_initcall).
The problem is that the NMI lockup detector is ran from early_initcall()
and expects the hardware pmu to be present.
Sanitize this by moving all architecture hardware pmu implementations to
initialize at early_initcall() and move the lockup detector to an explicit
initcall right after that.
Cc: paulus <paulus@samba.org>
Cc: davem <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290707759.2145.119.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
kgdb,ppc: Fix regression in evr register handling
kgdb,x86: fix regression in detach handling
kdb: fix crash when KDB_BASE_CMD_MAX is exceeded
kdb: fix memory leak in kdb_main.c
The commit 5e3d20a remove bkl from startup code so setup_arch() it isn't called
with bkl held anymore. Update the comment on top of that function.
Fix also a typo.
This work was supported by a hardware donation from the CE Linux Forum.
Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit ff10b88b5a (kgdb,ppc: Individual
register get/set for ppc) introduced a problem where memcpy was used
incorrectly to read and write the evr registers with a kernel that
has:
CONFIG_FSL_BOOKE=y
CONFIG_SPE=y
CONFIG_KGDB=y
This patch also fixes the following compilation problems:
arch/powerpc/kernel/kgdb.c: In function 'dbg_get_reg':
arch/powerpc/kernel/kgdb.c:341: error: passing argument 2 of 'memcpy' makes pointer from integer without a cast
arch/powerpc/kernel/kgdb.c: In function 'dbg_set_reg':
arch/powerpc/kernel/kgdb.c:366: error: passing argument 1 of 'memcpy' makes pointer from integer without a cast
[jason.wessel@windriver.com: Remove void * casts and fix patch header]
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
CC: linuxppc-dev@lists.ozlabs.org
The big kernel lock has been removed from all these files at some point,
leaving only the #include.
Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix an unresolved symbol with CONFIG_KVM_GUEST plus CONFIG_RELOCATABLE on
Book E.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
There are two identical implementations of of_get_mac_address(), one
each in arch/powerpc/kernel/prom_parse.c and
arch/microblaze/kernel/prom_parse.c. Move this function to a new
common file of_net.{c,h} and adjust all the callers to include the new
header.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
[grant.likely@secretlab.ca: protect header with #ifdef]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
commit 534af1082329392bc29f6badf815e69ae2ae0f4c(kgdb,kdb: individual
register set and and get API) introduce dbg_get_reg/dbg_set_reg API
for individual register get and set.
This patch implement those APIs for ppc.
Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Use new 'datavp' and 'datalp' variables in order to remove unnecessary
castings.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix up the arguments to arch_ptrace() to take account of the fact that
@addr and @data are now unsigned long rather than long as of a preceding
patch in this series.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the new {max,min}3 macros to save some cycles and bytes on the stack.
This patch substitutes trivial nested macros with their counterpart.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
mtd/m25p80: add support to parse the partitions by OF node
of/irq: of_irq.c needs to include linux/irq.h
of/mips: Cleanup some include directives/files.
of/mips: Add device tree support to MIPS
of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch
of/device: Rework to use common platform_device_alloc() for allocating devices
of/xsysace: Fix OF probing on little-endian systems
of: use __be32 types for big-endian device tree data
of/irq: remove references to NO_IRQ in drivers/of/platform.c
of/promtree: add package-to-path support to pdt
of/promtree: add of_pdt namespace to pdt code
of/promtree: no longer call prom_ functions directly; use an ops structure
of/promtree: make drivers/of/pdt.c no longer sparc-only
sparc: break out some PROM device-tree building code out into drivers/of
of/sparc: convert various prom_* functions to use phandle
sparc: stop exporting openprom.h header
powerpc, of_serial: Endianness issues setting up the serial ports
of: MTD: Fix OF probing on little-endian systems
of: GPIO: Fix OF probing on little-endian systems
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
KVM: Fix signature of kvm_iommu_map_pages stub
KVM: MCE: Send SRAR SIGBUS directly
KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
KVM: fix typo in copyright notice
KVM: Disable interrupts around get_kernel_ns()
KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
KVM: MMU: move access code parsing to FNAME(walk_addr) function
KVM: MMU: audit: check whether have unsync sps after root sync
KVM: MMU: audit: introduce audit_printk to cleanup audit code
KVM: MMU: audit: unregister audit tracepoints before module unloaded
KVM: MMU: audit: fix vcpu's spte walking
KVM: MMU: set access bit for direct mapping
KVM: MMU: cleanup for error mask set while walk guest page table
KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
KVM: x86: Fix constant type in kvm_get_time_scale
KVM: VMX: Add AX to list of registers clobbered by guest switch
KVM guest: Move a printk that's using the clock before it's ready
KVM: x86: TSC catchup mode
...
Before I incorrectly enabled napping also for BookE, which would result in
needless dcache flushes. Since we only need to force enable napping on
Book3s_64 because it doesn't go into MSR_POW otherwise, we can just #ifdef
that code to this particular platform.
Reported-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
There are some heuristics in the PPC power management code that try to find
out if the particular hardware we're running on supports proper power management
or just hangs the machine when going into nap mode.
Since we know that KVM is safe with nap, let's force enable it in the PV code
once we're certain that we are on a KVM VM.
Signed-off-by: Alexander Graf <agraf@suse.de>
We had an arbitrary limitation in mtmsrd L=1 that kept us from using r30 and
r31 as input registers. Let's get rid of that and get more potential speedups!
Signed-off-by: Alexander Graf <agraf@suse.de>
So far we've been restricting ourselves to r0-r29 as registers an mtmsr
instruction could use. This was bad, as there are some code paths in
Linux actually using r30.
So let's instead handle all registers gracefully and get rid of that
stupid limitation
Signed-off-by: Alexander Graf <agraf@suse.de>
This is the guest side of the mtsr acceleration. Using this a guest can now
call mtsrin with almost no overhead as long as it ensures that it only uses
it with (MSR_IR|MSR_DR) == 0. Linux does that, so we're good.
Signed-off-by: Alexander Graf <agraf@suse.de>
We will soon add SR PV support to the shared page, so we need some
infrastructure that allows the guest to query for features KVM exports.
This patch adds a second return value to the magic mapping that
indicated to the guest which features are available.
Signed-off-by: Alexander Graf <agraf@suse.de>
When CONFIG_KVM_GUEST is selected, but CONFIG_KVM is not, we were missing
some defines in asm-offsets.c and included too many headers at other places.
This patch makes above configuration work.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
When using a relocatable kernel we need to make sure that the trampline code
and the interrupt handlers are both copied to low memory. The only way to do
this reliably is to put them in the copied section.
This patch should make relocated kernels work with KVM.
KVM-Stable-Tag
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
On BookE the preferred way to write the EE bit is the wrteei instruction. It
already encodes the EE bit in the instruction.
So in order to get BookE some speedups as well, let's also PV'nize thati
instruction.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
There is also a form of mtmsr where all bits need to be addressed. While the
PPC64 Linux kernel behaves resonably well here, on PPC32 we do not have an
L=1 form. It does mtmsr even for simple things like only changing EE.
So we need to hook into that one as well and check for a mask of bits that we
deem safe to change from within guest context.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
The PowerPC ISA has a special instruction for mtmsr that only changes the EE
and RI bits, namely the L=1 form.
Since that one is reasonably often occuring and simple to implement, let's
go with this first. Writing EE=0 is always just a store. Doing EE=1 also
requires us to check for pending interrupts and if necessary exit back to the
hypervisor.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
When we hook an instruction we need to make sure we don't clobber any of
the registers at that point. So we write them out to scratch space in the
magic page. To make sure we don't fall into a race with another piece of
hooked code, we need to disable interrupts.
To make the later patches and code in general easier readable, let's introduce
a set of defines that save and restore r30, r31 and cr. Let's also define some
helpers to read the lower 32 bits of a 64 bit field on 32 bit systems.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
We will need to patch several instruction streams over to a different
code path, so we need a way to patch a single instruction with a branch
somewhere else.
This patch adds a helper to facilitate this patching.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
We will soon require more sophisticated methods to replace single instructions
with multiple instructions. We do that by branching to a memory region where we
write replacement code for the instruction to.
This region needs to be within 32 MB of the patched instruction though, because
that's the furthest we can jump with immediate branches.
So we keep 1MB of free space around in bss. After we're done initing we can just
tell the mm system that the unused pages are free, but until then we have enough
space to fit all our code in.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>