AMD64 specific boot options

There are many others (usually documented in driver documentation), but
only the AMD64 specific ones are listed here.

Machine check

   Please see Documentation/x86/x86_64/machinecheck for sysfs runtime tunables.

   mce=off
                Disable machine check
   mce=no_cmci
                Disable CMCI (Corrected Machine Check Interrupt), which
                Intel processors support. Disabling it is usually not
                recommended, but it might be handy if your hardware
                is misbehaving.
                Note that you'll get more problems without CMCI than with
                it because of the shared banks, i.e. you might get
                duplicated error logs.
   mce=dont_log_ce
                Don't log corrected errors. All events reported
                as corrected are silently cleared by the OS.
                This option is useful if you have no interest in
                corrected errors.
   mce=ignore_ce
                Disable features for corrected errors, e.g. the polling
                timer and CMCI. Events reported as corrected are not
                cleared by the OS and remain in their error banks.
                Disabling these features is usually not recommended.
                However, if an agent that checks and clears corrected
                errors (e.g. BIOS or a hardware monitoring application)
                conflicts with the OS's error handling and you cannot
                deactivate the agent, this option can help.
   mce=no_lmce
                Do not opt-in to Local MCE delivery. Use legacy method
                to broadcast MCEs.
   mce=bootlog
                Enable logging of machine checks left over from booting.
                Disabled by default on AMD because some BIOSes leave bogus
                ones. If your BIOS doesn't do that, it's a good idea to
                enable it, to make sure you log even machine check events
                that result in a reboot. On Intel systems it is enabled
                by default.
   mce=nobootlog
                Disable boot machine check logging.
   mce=tolerancelevel[,monarchtimeout] (number,number)
                tolerance levels:
                0: always panic on uncorrected errors, log corrected errors
                1: panic or SIGBUS on uncorrected errors, log corrected errors
                2: SIGBUS or log uncorrected errors, log corrected errors
                3: never panic or SIGBUS, log all errors (for testing only)
                Default is 1.
                Can also be set using sysfs, which is preferable.
                monarchtimeout:
                Sets the time in microseconds to wait for other CPUs on
                machine checks. 0 to disable.
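
The numeric form of the mce= option can be illustrated with a small parser.
This is only a sketch of the documented syntax, not the kernel's own parsing
code: the function name parse_mce_numeric is hypothetical, and rejecting
tolerance values outside 0..3 is a choice made here for illustration.

```python
# Tolerance levels exactly as listed in the text above.
TOLERANCE_LEVELS = {
    0: "always panic on uncorrected errors, log corrected errors",
    1: "panic or SIGBUS on uncorrected errors, log corrected errors",
    2: "SIGBUS or log uncorrected errors, log corrected errors",
    3: "never panic or SIGBUS, log all errors (for testing only)",
}


def parse_mce_numeric(value):
    """Split 'tolerancelevel[,monarchtimeout]' into two numbers.

    Returns (tolerance, monarch_timeout_us). The timeout is None when the
    optional second field is omitted. Hypothetical helper, not a kernel API.
    """
    parts = value.split(",")
    if len(parts) > 2:
        raise ValueError("expected tolerancelevel[,monarchtimeout]")
    tolerance = int(parts[0])
    if tolerance not in TOLERANCE_LEVELS:
        # Assumption of this sketch: only the documented levels are accepted.
        raise ValueError("tolerance level must be 0, 1, 2 or 3")
    timeout = int(parts[1]) if len(parts) == 2 else None
    if timeout is not None and timeout < 0:
        raise ValueError("monarchtimeout must not be negative")
    return tolerance, timeout


if __name__ == "__main__":
    level, timeout_us = parse_mce_numeric("2,1000")
    print(level, TOLERANCE_LEVELS[level], timeout_us)
```

For example, "mce=2,1000" would select tolerance level 2 with a wait of
1000 microseconds for other CPUs, while "mce=1" leaves the timeout unset.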
   mce=bios_cmci_threshold
                Don't overwrite the BIOS-set CMCI threshold. This boot option
                prevents Linux from overwriting the CMCI threshold set by the
                BIOS. Without this option, Linux always sets the CMCI
                threshold to 1. Enabling this may make memory predictive
                failure analysis less effective if the BIOS sets thresholds
                for memory errors, since we will not see details for all
                errors.
   mce=recovery
                Force-enable recoverable machine check code paths

   nomce (for compatibility with i386): same as mce=off

   Everything else is in sysfs now.
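
To see which machine check options a given command line carries, a short
script can pick them out. The sketch below assumes the command line is
passed in as a plain string; collect_mce_options is a hypothetical helper
name, and the only alias it knows is the documented nomce.

```python
def collect_mce_options(cmdline):
    """Return the machine check settings found on a kernel command line.

    Each 'mce=<value>' token contributes <value>; the documented alias
    'nomce' is reported as 'off'. Other tokens are ignored.
    """
    found = []
    for token in cmdline.split():
        if token == "nomce":
            # Documented alias: same as mce=off.
            found.append("off")
        elif token.startswith("mce="):
            found.append(token[len("mce="):])
    return found


if __name__ == "__main__":
    sample = "ro quiet mce=no_cmci mce=2,1000 nomce"
    print(collect_mce_options(sample))
```

On a running Linux system the string could be read from /proc/cmdline
instead of the sample used here.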

APICs

   apic            Use IO-APIC. Default

   noapic          Don't use the IO-APIC.

   disableapic     Don't use the local APIC

   nolapic         Don't use the local APIC (alias for i386 compatibility)

   pirq=...        See Documentation/x86/i386/IO-APIC.txt

   noapictimer     Don't set up the APIC timer

   no_timer_check  Don't check the IO-APIC timer. This can work around
                   problems with incorrect timer initialization on some boards.

   apicpmtimer     Do APIC timer calibration using the pmtimer. Implies
                   apicmaintimer. Useful when your PIT timer is totally
                   broken.

Timing

  notsc
  Don't use the CPU time stamp counter to read the wall time.
  This can be used to work around timing problems on multiprocessor systems
  with CPUs that are not properly synchronized.

  nohpet
  Don't use the HPET timer.

Idle loop

  idle=poll
  Don't do power saving in the idle loop using HLT, but poll for a
  rescheduling event. This will make the CPUs eat a lot more power, but may
  be useful to get slightly better performance in multiprocessor benchmarks.
  It also makes some profiling using performance counters more accurate.
  Please note that on systems with MONITOR/MWAIT support (like Intel EM64T
  CPUs) this option has no performance advantage over the normal idle loop.
  It may also interact badly with hyperthreading.

Rebooting

   reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old]
   bios   Use the CPU reboot vector for warm reset
   warm   Don't set the cold reboot flag
   cold   Set the cold reboot flag
   triple Force a triple fault (init)
   kbd    Use the keyboard controller. cold reset (default)
   acpi   Use the ACPI RESET_REG in the FADT. If ACPI is not configured or the
          ACPI reset does not work, the reboot path attempts the reset using
          the keyboard controller.
   efi    Use efi reset_system runtime service. If EFI is not configured or the
          EFI reset does not work, the reboot path attempts the reset using
          the keyboard controller.

   Using warm reset will be much faster, especially on big memory
   systems, because the BIOS will not go through the memory check.
   The disadvantage is that not all hardware will be completely reinitialized
   on reboot, so there may be boot problems on some systems.

   reboot=force

   Don't stop other CPUs on reboot. This can make reboot more reliable
   in some cases.
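
The reboot= syntax abbreviates each keyword to its first letter. The sketch
below shows one way to read such a value under that rule; it is not the
kernel's parser, parse_reboot is a hypothetical name, and treating a leading
"f" as reboot=force is an assumption based on the option documented above.

```python
# Keyword tables taken from the syntax line above.
METHODS = {"b": "bios", "t": "triple", "k": "kbd", "a": "acpi", "e": "efi"}
MODES = {"w": "warm", "c": "cold"}


def parse_reboot(value):
    """Interpret a reboot= value by the first letter of each token.

    Returns a dict with 'method', 'mode' and 'force'. Fields that the
    value does not mention stay at None (or False for 'force').
    """
    result = {"method": None, "mode": None, "force": False}
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        key = token[0]
        if key in METHODS:
            result["method"] = METHODS[key]
        elif key in MODES:
            result["mode"] = MODES[key]
        elif key == "f":
            # Assumption: 'force' is recognised by its first letter as well.
            result["force"] = True
        else:
            raise ValueError("unknown reboot token: " + token)
    return result


if __name__ == "__main__":
    print(parse_reboot("a,w"))
    print(parse_reboot("bios,cold"))
```

Here "reboot=a,w" and "reboot=acpi,warm" give the same result, which is the
point of the bracketed abbreviations in the syntax line.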

Non Executable Mappings

  noexec=on|off

  on      Enable (default)
  off     Disable

NUMA

  numa=off      Only set up a single NUMA node spanning all memory.

  numa=noacpi   Don't parse the SRAT table for NUMA setup

  numa=fake=<size>[MG]
                If given as a memory unit, fills all system RAM with nodes of
                size <size>, interleaved over physical nodes.

  numa=fake=<N>
                If given as an integer, fills all system RAM with N fake nodes
                interleaved over physical nodes.
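
The two numa=fake forms differ only in the trailing unit letter. A minimal
sketch of that distinction follows; parse_numa_fake is a hypothetical name,
and reading M and G as binary units (2^20 and 2^30 bytes) is an assumption
of this example.

```python
import re


def parse_numa_fake(value):
    """Classify a numa=fake value as a node count or a node size.

    Returns ('count', N) for a bare integer and ('size_bytes', bytes) when
    the number carries an M or G suffix.
    """
    match = re.fullmatch(r"(\d+)([MG])?", value)
    if match is None:
        raise ValueError("expected <N> or <size>[MG], got: " + value)
    number = int(match.group(1))
    unit = match.group(2)
    if unit is None:
        return ("count", number)
    # Assumption: M and G are binary multiples.
    scale = {"M": 1 << 20, "G": 1 << 30}[unit]
    return ("size_bytes", number * scale)


if __name__ == "__main__":
    print(parse_numa_fake("4"))
    print(parse_numa_fake("512M"))
```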

ACPI

  acpi=off      Don't enable ACPI
  acpi=ht       Use ACPI boot table parsing, but don't enable ACPI
                interpreter
  acpi=force    Force ACPI on (currently not needed)

  acpi=strict   Disable out of spec ACPI workarounds.

  acpi_sci={edge,level,high,low}  Set up ACPI SCI interrupt.

  acpi=noirq    Don't route interrupts

  acpi=nocmcff  Disable firmware first mode for corrected errors. This
                disables parsing the HEST CMC error source to check if
                firmware has set the FF flag. This may result in
                duplicate corrected error reports.

PCI

  pci=off               Don't use PCI
  pci=conf1             Use conf1 access.
  pci=conf2             Use conf2 access.
  pci=rom               Assign ROMs.
  pci=assign-busses     Assign busses
  pci=irqmask=MASK      Set PCI interrupt mask to MASK
  pci=lastbus=NUMBER    Scan up to NUMBER busses, no matter what the mptable says.
  pci=noacpi            Don't use ACPI to set up PCI interrupt routing.

IOMMU (input/output memory management unit)

 Currently four x86-64 PCI-DMA mapping implementations exist:

   1. <arch/x86_64/kernel/pci-nommu.c>: use no hardware/software IOMMU at all
      (e.g. because you have < 3 GB memory).
      Kernel boot message: "PCI-DMA: Disabling IOMMU"

   2. <arch/x86/kernel/amd_gart_64.c>: AMD GART based hardware IOMMU.
      Kernel boot message: "PCI-DMA: using GART IOMMU"

   3. <arch/x86_64/kernel/pci-swiotlb.c> : Software IOMMU implementation. Used
      e.g. if there is no hardware IOMMU in the system and it is needed because
      you have >3GB memory or told the kernel to use it (iommu=soft).
      Kernel boot message: "PCI-DMA: Using software bounce buffering
      for IO (SWIOTLB)"

   4. <arch/x86_64/pci-calgary.c> : IBM Calgary hardware IOMMU. Used in IBM
      pSeries and xSeries servers. This hardware IOMMU supports DMA address
      mapping with memory protection, etc.
      Kernel boot message: "PCI-DMA: Using Calgary IOMMU"

 iommu=[<size>][,noagp][,off][,force][,noforce][,leak[=<nr_of_leak_pages>]]
       [,memaper[=<order>]][,merge][,forcesac][,fullflush][,nomerge]
       [,noaperture][,calgary]

  General iommu options:
    off                Don't initialize and use any kind of IOMMU.
    noforce            Don't force hardware IOMMU usage when it is not needed.
                       (default).
    force              Force the use of the hardware IOMMU even when it is
                       not actually needed (e.g. because < 3 GB memory).
    soft               Use software bounce buffering (SWIOTLB) (default for
                       Intel machines). This can be used to prevent the usage
                       of an available hardware IOMMU.
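
The iommu= value is a comma-separated list in which a bare number is the
size and some keywords take an "=<value>" argument. The sketch below splits
such a string into a dictionary. It only mirrors the syntax line above and
does not validate keyword names; parse_iommu is a hypothetical helper.

```python
def parse_iommu(value):
    """Split an iommu= value into a dict of options.

    A bare number becomes the 'size' entry, 'key=value' keeps the value
    as a string, and a bare keyword maps to True.
    """
    options = {}
    for token in value.split(","):
        if not token:
            continue
        name, separator, argument = token.partition("=")
        if not separator and name.isdigit():
            options["size"] = int(name)
        elif separator:
            options[name] = argument
        else:
            options[name] = True
    return options


if __name__ == "__main__":
    print(parse_iommu("force,memaper=2,leak"))
```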

  iommu options only relevant to the AMD GART hardware IOMMU:
    <size>             Set the size of the remapping area in bytes.
    allowed            Override iommu off workarounds for specific chipsets.
    fullflush          Flush IOMMU on each allocation (default).
    nofullflush        Don't use IOMMU fullflush.
    leak               Turn on simple iommu leak tracing (only when
                       CONFIG_IOMMU_LEAK is on). Default number of leak pages
                       is 20.
    memaper[=<order>]  Allocate its own aperture over RAM with size
                       32MB<<order. (default: order=1, i.e. 64MB)
    merge              Do scatter-gather (SG) merging. Implies "force"
                       (experimental).
    nomerge            Don't do scatter-gather (SG) merging.
    noaperture         Ask the IOMMU not to touch the aperture for AGP.
    forcesac           Force single-address cycle (SAC) mode for masks <40bits
                       (experimental).
    noagp              Don't initialize the AGP driver and use full aperture.
    allowdac           Allow double-address cycle (DAC) mode, i.e. DMA >4GB.
                       DAC is used with 32-bit PCI to push a 64-bit address in
                       two cycles. When off, all DMA above 4GB is forced
                       through an IOMMU or software bounce buffering.
    nodac              Forbid DAC mode, i.e. DMA >4GB.
    panic              Always panic when IOMMU overflows.
    calgary            Use the Calgary IOMMU if it is available
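
The memaper size rule (32MB<<order) is easy to tabulate. The following
sketch just evaluates that formula; memaper_size_mb is a hypothetical name
and the default argument mirrors the documented default of order=1.

```python
def memaper_size_mb(order=1):
    """Return the aperture size in MB for iommu=memaper=<order>.

    Implements the documented rule: size = 32MB << order.
    """
    if order < 0:
        raise ValueError("order must not be negative")
    return 32 << order


if __name__ == "__main__":
    for order in range(4):
        print("memaper=%d -> %d MB" % (order, memaper_size_mb(order)))
```

With the default order the result is 64 MB, matching the text; each
increment of the order doubles the aperture.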

  iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU
  implementation:
    swiotlb=<pages>[,force]
    <pages>            Prereserve that many 128K pages for the software IO
                       bounce buffering.
    force              Force all IO through the software TLB.

  Settings for the IBM Calgary hardware IOMMU currently found in IBM
  pSeries and xSeries machines:

    calgary=[64k,128k,256k,512k,1M,2M,4M,8M]
    calgary=[translate_empty_slots]
    calgary=[disable=<PCI bus number>]
    panic              Always panic when IOMMU overflows

    64k,...,8M - Set the size of each PCI slot's translation table
    when using the Calgary IOMMU. This is the size of the translation
    table itself in main memory. The smallest table, 64k, covers an IO
    space of 32MB; the largest, 8MB table, can cover an IO space of
    4GB. Normally the kernel will make the right choice by itself.

    translate_empty_slots - Enable translation even on slots that have
    no devices attached to them, in case a device will be hotplugged
    in the future.

    disable=<PCI bus number> - Disable translation on a given PHB. For
    example, the built-in graphics adapter resides on the first bridge
    (PCI bus number 0); if translation (isolation) is enabled on this
    bridge, X servers that access the hardware directly from user
    space might stop working. Use this option if you have devices that
    are accessed from userspace directly on some PCI host bridge.
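
The two data points given for the Calgary translation table (64k covers
32MB, 8M covers 4GB) both correspond to a factor of 512 between table size
and IO space. The sketch below assumes the same factor holds for the sizes
in between, which the text does not state explicitly; the helper names are
hypothetical.

```python
# Table sizes accepted by calgary=, in bytes.
TABLE_SIZES = {
    "64k": 64 << 10, "128k": 128 << 10, "256k": 256 << 10, "512k": 512 << 10,
    "1M": 1 << 20, "2M": 2 << 20, "4M": 4 << 20, "8M": 8 << 20,
}

# Ratio derived from the documented 64k -> 32MB data point (equals 512).
COVERAGE_RATIO = (32 << 20) // (64 << 10)


def io_space_bytes(table):
    """IO space covered by a translation table of the given size."""
    return TABLE_SIZES[table] * COVERAGE_RATIO


def smallest_table_for(io_bytes):
    """Smallest calgary= table size that covers io_bytes of IO space."""
    for name, size in sorted(TABLE_SIZES.items(), key=lambda item: item[1]):
        if size * COVERAGE_RATIO >= io_bytes:
            return name
    raise ValueError("no table size covers that much IO space")


if __name__ == "__main__":
    for name in TABLE_SIZES:
        print(name, io_space_bytes(name) >> 20, "MB")
```

As the text notes, the kernel normally picks a suitable size by itself, so
this calculation matters only when overriding that choice.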

Miscellaneous

        nogbpages
                Do not use GB pages for kernel direct mappings.
        gbpages
                Use GB pages for kernel direct mappings.