Fix the "siblings" field value in /proc/cpuinfo so that it now shows the
number of siblings as seen by OS, instead of what is available from
hardware perspective.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The simscsi code at present overflows an int if it's given a large
disk image. The attached patch increases the possible size to 128G.
While it's unlikely that anyone will want to use SKI with such a
large drive, the same framework is currently being used for various
virtualisation experiments.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This introduces a limit parameter to the core bootmem allocator; The new
parameter indicates that physical memory allocated by the bootmem
allocator should be within the requested limit.
We also introduce alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit,
alloc_bootmem_low_pages_node_limit apis, but alloc_bootmem_low_pages_limit
is the only api used for swiotlb.
The existing alloc_bootmem_low_pages() api could instead have been
changed and made to pass right limit to the core allocator. But that
would make the patch more intrusive for 2.6.14, as other arches use
alloc_bootmem_low_pages(). We may be done that post 2.6.14 as a
cleanup.
With this, swiotlb gets memory within 4G for both x86_64 and ia64
arches.
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I've noticed a kernel hang during a storm of CMC interrupts, which was
tracked down to the continual execution of the interrupt handler.
There's code in the CMC handler that's supposed to disable CMC
interrupts and switch to polling mode when it sees a bunch of CMCs.
Because disabling CMCs across all CPUs isn't safe in interrupt context,
the disable is done with a schedule_work(). But with continual CMC
interrupts, the schedule_work() never gets executed.
The following patch immediately disables CMC interrupts for the current
CPU. This then allows (at least) one CPU to ignore CMC interrupts,
execute the schedule_work() code, and disable CMC interrupts on the rest
of the CPUs.
Acked-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Bryan Sutula <Bryan.Sutula@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
gensparse_defconfig is a config generated file for SPARSEMEM and GENERIC
kernel configuration (defconfig).
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch is the minimal set of changes required by ia64 to use SPARSEMEM.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
For FLATMEM contig_page_data has been made transparent to the arch code.
This patch conforms to that change.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The patch modifies the Kconfig file to introduce the new memory model
options and other related SPARSEMEM changes. There is also a minor change
in the Makefile.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
/proc/iomem describes a block of memory as "Kernel data",
but the end address is derived from "_edata". The kernel
actually has many other sections beyond _edata. Get the
real end address from _end.
Acked-by: Khalid Aziz <khalid_aziz@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch adds a #define for SN_SAL_IOIF_PCI_SAFE and makes that the
preferred method of implementing sn_pci_legacy_read() and
sn_pci_legacy_write().
This SAL call has been present in SGI proms since version 4.10. If the
SN_SAL_IOIF_PCI_SAFE call fails, revert to the previous code for compatability
with older proms.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Address space resources for ACPI devices have a producer/consumer
flag. All devices "consume" the indicated address space. If the
resource is marked as a "producer", the range is also passed on
to child devices.
We currently ignore this flag when setting up MMIO and I/O port
windows for PCI root bridges, so we could mistakenly interpret
a "consumed-only" range, like CSR space for the device itself,
as a window that is routed to children.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Verify the pfn is valid before calling pfn_to_page(),
and cut isolation message if nothing was done.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Wire the MCA/INIT handler stacks into DTR[2] and track them in
IA64_KR(CURRENT_STACK). This gives the MCA/INIT handler stacks the
same TLB status as normal kernel stacks. Reload the old CURRENT_STACK
data on return from OS to SAL.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The sd driver now uses scsi_execute_req() for almost everything.
scsi_execute_req() converts requests into scatterlists.
Fix the HP SCSI disk simulator to understand scatterlists for
more commands.
Without this patch the current kernel will not boot on the simulator
(the disks are always detected as having no sectors, and so cannot be
mounted).
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Move acpi_map_iosapics() from pci.c to acpi.c, since it doesn't
have anything to do with PCI.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Update comment about how ar.k0 is used. Make the initialization the
same as in start_secondary() (no functional change, just make it look
more similar).
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
User mode kexec tools expect to find information about physical
memory in /proc/iomem (as they do on x86) to validate the addresses
that the new kernel will use.
Signed-off-by: Khalid Aziz <khalid.aziz@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
With the new fdtable locking rules, you have to protect fdtable with either
->file_lock or rcu_read_lock/unlock(). There are some places where we
aren't doing either. This patch fixes those places.
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch increases the maximum number of cpus supported on IA64
to 1024. No changes are made to the default SSI size. The patch
simply allows specifying up to 1024p.
There are certainly scaling (& other) issues that also need to be
addressed!!! Additional patches will follow.....
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There were some trailing white spaces, long lines, brackets in
weird style etc. This patch cleans them up.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch removes some compilation warnings, mostly
trivially. acpi.c fix also noted by Kenji Kaneshige.
Signed-off-by; Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Some of the SN code & #defines related to compact nodes & IO discovery
have gotten stale over the years. This patch attempts to clean them up.
Some of the various SN MAX_xxx #defines were also unclear & misused.
The primary changes are:
- use MAX_NUMNODES. This is the generic linux #define for the number
of nodes that are known to the generic kernel. Arrays & loops
for constructs that are 1:1 with linux-defined nodes should
use the linux #define - not an SN equivalent.
- use MAX_COMPACT_NODES for MAX_NUMNODES + NUM_TIOS. This is the
number of nodes in the SSI system. Compact nodes are a hack to
get around the IA64 architectural limit of 256 nodes. Large SGI
systems have more than 256 nodes. When we upgrade to ACPI3.0,
I _hope_ that all nodes will be real nodes that are known to
the generic kernel. That will allow us to delete the notion
of "compact nodes".
- add MAX_NUMALINK_NODES for the total number of nodes that
are in the numalink domain - all partitions.
- simplified (understandable) scan_for_ionodes()
- small amount of cleanup related to cnodes
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
PNP and PNPACPI turned on
i8042 recently changed from ACPI to PNP detection. Without PNP, it
probes legacy I/O ports for the keyboard controller, which causes an
MCA on HP boxes.
Also, I'm about to remove 8250_acpi.c, so we'll need PNP to detect
non-PCI serial ports. Until 8250_acpi.c is removed, some systems
will see serial ports reported twice (once from 8250_acpi.c and again
from 8250_pnp.c). This is harmless.
PNPACPI is still marked EXPERIMENTAL, but I'm not aware of any
outstanding issues on ia64.
IDE_GENERIC turned off (except for SGI simulator, all ia64 IDE is PCI)
ide-generic probes compiled-in legacy I/O ports for IDE devices, which
again causes an MCA. It would be nicer to just get rid of all the
legacy junk from include/asm-ia64/ide.h, but that is a bit riskier
because it could break ide-cs and the HDIO_REGISTER_HWIF ioctl
(http://www.ussg.iu.edu/hypermail/linux/kernel/0508.2/0049.html).
Here's the essence of the patch:
-# CONFIG_PNP is not set
+CONFIG_PNP=y
+CONFIG_PNPACPI=y
-CONFIG_IDE_GENERIC=y
+# CONFIG_IDE_GENERIC is not set
Tested on tiger, bigsur, and zx1.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Several implementations were essentialy a common piece of C code using
the cmpxchg() macro. Put the implementation in one spot that everyone
can share, and convert sparc64 over to using this.
Alpha is the lone arch-specific implementation, which codes up a
special fast path for the common case in order to avoid GP reloading
which a pure C version would require.
Signed-off-by: David S. Miller <davem@davemloft.net>
Machine vector selection has always been a bit of a hack given how
early in system boot it needs to be done. Services like ACPI namespace
are not available and there are non-trivial problems to moving them to
early boot. However, there's no reason we can't change to a different
machvec later in boot when the services we need are available. By
adding a entry point for later initialization of the swiotlb, we can add
an error path for the hpzx1 machevec initialization and fall back to the
DIG machine vector if IOMMU hardware isn't found in the system. Since
ia64 uses 4GB for zone DMA (no ISA support), it's trivial to allocate a
contiguous range from the slab for bounce buffer usage.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
security_vm_enough_memory tend to forget to vm_unacct_memory when a
failure occurs further down (typically in setup_arg_pages variants).
These are all users of insert_vm_struct, and that reservation will only
be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.
So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
Committed_AS each time they're run. But don't add VM_ACCOUNT to them,
it's inappropriate to reserve against the very unlikely case that gdb
be used to COW a vDSO page - we ought to do something about that in
do_wp_page, but there are yet other inconsistencies to be resolved.
The safe and economical way to fix this is to let insert_vm_struct do
the security_vm_enough_memory check when it finds VM_ACCOUNT is set.
And the MIPS irix_brk has been calling security_vm_enough_memory before
calling do_brk which repeats it, doubly accounting and so also leaking.
Remove that, and all the fs and arch calls to security_vm_enough_memory:
give it a less misleading name later on.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If we use 64bit kernel on ia64/x86_64/s390 architecture, and we run
32bit binary on 32bit compatibility mode, sendfile system call seems be
not set offset argument.
This is because sendfile's return value is not zero but the code regards
the result by return value is zero or not.
This problem will be affect to ia64/x86_64/s390 and not affect to other
architecture does not affect other architecture (mips/parisc/ppc64/sparc64).
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Delete the special case unwind code that was only used by the old
MCA/INIT handler.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The bulk of the change. Use per cpu MCA/INIT stacks. Change the SAL
to OS state (sos) to be per process. Do all the assembler work on the
MCA/INIT stacks, leaving the original stack alone. Pass per cpu state
data to the C handlers for MCA and INIT, which also means changing the
mca_drv interfaces slightly. Lots of verification on whether the
original stack is usable before converting it to a sleeping process.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reading the INIT record from SAL during the INIT event has proved to be
unreliable, and a source of hangs during INIT processing. The new
MCA/INIT handlers remove the need to get the INIT record from SAL.
Change salinfo.c so mca.c can just flag that a new record is available,
without having to read the record during INIT processing. This patch
can be applied without the new MCA/INIT handlers.
Also clean up some usage of NR_CPUS which should have been using
cpu_online().
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>