HP nx6125/nx6325/... machines have a _GPE handler with an infinite
loop sending Notify() events to different ACPI subsystems.
Notify handler in ACPI driver is a C-routine, which may call ACPI
interpreter again to get access to some ACPI variables
(acpi_evaluate_xxx).
On these HP machines such an evaluation changes state of some variable
and lets the loop above break.
In the current ACPI implementation Notify requests are being deferred
to the same kacpid workqueue on which the above GPE handler with
infinite loop is executing. Thus we have a deadlock -- loop will
continue to spin, sending notify events, and at the same time
preventing these notify events from being run on a workqueue. All
notify events are deferred, thus we see increase in memory consumption
noticed by author of the thread. Also as GPE handling is bloked,
machines overheat. Eventually by external poll of the same
acpi_evaluate, kacpid is released and all the queued notify events are
free to run, thus 100% cpu utilization by kacpid for several seconds
or more.
To prevent all these horrors it's needed to not put notify events to
kacpid workqueue by either executing them immediately or putting them
on some other thread. It's dangerous to execute notify events in
place, as it will put several ACPI interpreter stacks on top of each
other (at least 4 in case of nx6125), thus causing kernel stack
overflow.
First attempt to create a new thread was done by Peter Wainwright
He created a bunch of threads, which were stealing work from a kacpid
workqueue.
This patch appeared in 2.6.15 kernel shipped with Ubuntu 6.06 LTS.
Second attempt was done by me, I created a new thread for each Notify
event. This worked OK on HP nx machines, but broke Linus' Compaq
n620c, by producing threads with a speed what they stopped the machine
completely. Thus this patch was reverted from 18-rc2 as I remember.
I re-made the patch to create second workqueue just for notify events,
thus hopping it will not break Linus' machine. Patch was tested on the
same HP nx machines in #5534 and #7122, but I did not received reply
from Linus on a test patch sent to him.
Patch went to 19-rc and was rejected with much fanfare again.
There was 4th patch, which inserted schedule_timeout(1) into deferred
execution of kacpid, if we had any notify requests pending, but Linus
decided that it was too complex (involved either changes to workqueue
to see if it's empty or atomic inc/dec).
Now you see last variant which adds yield() to every GPE execution.
http://bugzilla.kernel.org/show_bug.cgi?id=5534http://bugzilla.kernel.org/show_bug.cgi?id=8385
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.
Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use NULL for pointers
drivers/acpi/osl.c:208:10: warning: Using plain integer as NULL pointer
drivers/acpi/tables/tbxface.c:411:49: warning: Using plain integer as NULL pointer
drivers/acpi/processor_core.c:1008:10: warning: Using plain integer as NULL pointer
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com>
cosmetic only
Make "module name" actually match the file name.
Invoke with ';' as leaving it off confuses Lindent and gcc doesn't care.
Fix indentation where Lindent did get confused.
Signed-off-by: Len Brown <len.brown@intel.com>
Resources described by the FADT aren't really a good fit for the
ACPI motherboard driver.
The motherboard driver cares about PNP0C01 and PNP0C02 devices and
their resources.
The FADT describes some resources used by the ACPI core. Often, they
are also described by by the _CRS of a motherboard device, but I think
it's better to reserve them specifically in the ACPI osl.c because
(a) the motherboard driver is optional and ACPI uses the resources even
if the driver is absent, and (b) I want to remove the ACPI motherboard
driver because it's mostly redundant with the PNP system.c driver.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.
For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.
To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.
Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated.. This is a
problem if the auxiliary data is taken away (as done by the last patch).
However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().
In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).
Signed-Off-By: David Howells <dhowells@redhat.com>
This reverts commit 37605a6900.
Again.
This same bug has now been introduced twice: it was done earlier by
commit b8d35192c5, only to be reverted
last time in commit 72945b2b90.
We must NOT try to queue up notify handlers to another thread than the
normal ACPI execution thread, because the notifications on some systems
seem to just keep on accumulating until we run out of memory and/or
threads.
Keeping events within the one deferred execution thread automatically
throttles the events properly.
At least the Compaq N620c will lock up completely on the first thermal
event without this patch reverted.
Cc: David Brownell <david-b@pacbell.net>
Cc: Len Brown <len.brown@intel.com>
Cc: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
In timer_interrupt(), this sort of change will be necessary:
- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
* Rougly half of callers already do it by not checking return value
* Code in drivers/acpi/osl.c does the following to be sure:
(void)kmem_cache_destroy(cache);
* Those who check it printk something, however, slab_error already printed
the name of failed cache.
* XFS BUGs on failed kmem_cache_destroy which is not the decision
low-level filesystem driver should make. Converted to ignore.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This effectively reverts commit b8d35192c5
by reverts acpi_os_queue_for_execution() to what it was before that,
except it changes the name to acpi_os_execute() to match ACPICA
20060512.
Signed-off-by: Len Brown <len.brown@intel.com>
[ The thread execution doesn't actually solve the bug it set out to
solve (see
http://bugzilla.kernel.org/show_bug.cgi?id=5534
for more details) because the new events can get caught behind the AML
semaphore or other serialization. And when that happens, the notify
threads keep on piling up until the system dies. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace acpi_in_resume with a more general hack
to check irqs_disabled() on any kmalloc() from ACPI.
While setting (system_state != SYSTEM_RUNNING) on resume
seemed more general, Andrew Morton preferred this approach.
http://bugzilla.kernel.org/show_bug.cgi?id=3469
Make acpi_os_allocate() into an inline function to
allow /proc/slab_allocators to work.
Delete some memset() that could fault on allocation failure.
Signed-off-by: Len Brown <len.brown@intel.com>
Implemented a new acpi_spinlock type for the OSL lock
interfaces. This allows the type to be customized to
the host OS for improved efficiency (since a spinlock is
usually a very small object.)
Implemented support for "ignored" bits in the ACPI
registers. According to the ACPI specification, these
bits should be preserved when writing the registers via
a read/modify/write cycle. There are 3 bits preserved
in this manner: PM1_CONTROL[0] (SCI_EN), PM1_CONTROL[9],
and PM1_STATUS[11].
http://bugzilla.kernel.org/show_bug.cgi?id=3691
Implemented the initial deployment of new OSL mutex
interfaces. Since some host operating systems have
separate mutex and semaphore objects, this feature was
requested. The base code now uses mutexes (and the new
mutex interfaces) wherever a binary semaphore was used
previously. However, for the current release, the mutex
interfaces are defined as macros to map them to the
existing semaphore interfaces.
Fixed several problems with the support for the control
method SyncLevel parameter. The SyncLevel now works
according to the ACPI specification and in concert with the
Mutex SyncLevel parameter, since the current SyncLevel is
a property of the executing thread. Mutual exclusion for
control methods is now implemented with a mutex instead
of a semaphore.
Fixed three instances of the use of the C shift operator
in the bitfield support code (exfldio.c) to avoid the use
of a shift value larger than the target data width. The
behavior of C compilers is undefined in this case and can
cause unpredictable results, and therefore the case must
be detected and avoided. (Fiodor Suietov)
Added an info message whenever an SSDT or OEM table
is loaded dynamically via the Load() or LoadTable()
ASL operators. This should improve debugging capability
since it will show exactly what tables have been loaded
(beyond the tables present in the RSDT/XSDT.)
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The ASL Acquire operator (17.5.1 in ACPI 3.0 spec) is allowed to time out
and return True without acquiring the semaphore. There's no indication in
the spec that this is an actual error, so this message should be
debug-only, as the message for successful acquisition is.
This used to be an ACPI_DEBUG_PRINT, but it was mis-classified as
ACPI_DB_ERROR rather than ACPI_DB_MUTEX, so it got swept up in Thomas'
recent patch to enable ACPI error messages even without CONFIG_ACPI_DEBUG.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Converted the locking mutex used for the ACPI hardware
to a spinlock. This change should eliminate all problems
caused by attempting to acquire a semaphore at interrupt
level, and it means that all ACPICA external interfaces
that directly access the ACPI hardware can be safely
called from interrupt level.
Fixed a regression introduced in 20060526 where the ACPI
device initialization could be prematurely aborted with
an AE_NOT_FOUND if a device did not have an optional
_INI method.
Fixed an IndexField issue where a write to the Data
Register should be limited in size to the AccessSize
(width) of the IndexField itself. (BZ 433, Fiodor Suietov)
Fixed problem reports (Valery Podrezov) integrated: - Allow
store of ThermalZone objects to Debug object.
http://bugzilla.kernel.org/show_bug.cgi?id=5369http://bugzilla.kernel.org/show_bug.cgi?id=5370
Fixed problem reports (Fiodor Suietov) integrated: -
acpi_get_table_header() doesn't handle multiple instances
correctly (BZ 364)
Removed four global mutexes that were obsolete and were
no longer being used.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
http://bugzilla.kernel.org/show_bug.cgi?id=5534
Thanks to Peter Wainwright for isolating the issue.
Thanks to Andi Kleen and Bob Moore for feedback.
Thanks to Richard Mace and others for testing.
Updates by Konstantin Karasyov.
Signed-off-by: Konstantin Karasyov <konstantin.a.karasyov@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Removed a device initialization optimization introduced in
20051216 where the _STA method was not run unless an _INI
was also present for the same device. This optimization
could cause problems because it could allow _INI methods
to be run within a not-present device subtree (If a
not-present device had no _INI, _STA would not be run,
the not-present status would not be discovered, and the
children of the device would be incorrectly traversed.)
Implemented a new _STA optimization where namespace
subtrees that do not contain _INI are identified and
ignored during device initialization. Selectively running
_STA can significantly improve boot time on large machines
(with assistance from Len Brown.)
Implemented support for the device initialization case
where the returned _STA flags indicate a device not-present
but functioning. In this case, _INI is not run, but the
device children are examined for presence, as per the
ACPI specification.
Implemented an additional change to the IndexField support
in order to conform to MS behavior. The value written to
the Index Register is not simply a byte offset, it is a
byte offset in units of the access width of the parent
Index Field. (Fiodor Suietov)
Defined and deployed a new OSL interface,
acpi_os_validate_address(). This interface is called during
the creation of all AML operation regions, and allows
the host OS to exert control over what addresses it will
allow the AML code to access. Operation Regions whose
addresses are disallowed will cause a runtime exception
when they are actually accessed (will not affect or abort
table loading.)
Defined and deployed a new OSL interface,
acpi_os_validate_interface(). This interface allows the host OS
to match the various "optional" interface/behavior strings
for the _OSI predefined control method as appropriate
(with assistance from Bjorn Helgaas.)
Restructured and corrected various problems in the
exception handling code paths within DsCallControlMethod
and DsTerminateControlMethod in dsmethod (with assistance
from Takayoshi Kochi.)
Modified the Linux source converter to ignore quoted string
literals while converting identifiers from mixed to lower
case. This will correct problems with the disassembler
and other areas where such strings must not be modified.
The ACPI_FUNCTION_* macros no longer require quotes around
the function name. This allows the Linux source converter
to convert the names, now that the converter ignores
quoted strings.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Implemented the use of a cache object for all internal
namespace nodes. Since there are about 1000 static nodes
in a typical system, this will decrease memory use for
cache implementations that minimize per-allocation overhead
(such as a slab allocator.)
Removed the reference count mechanism for internal
namespace nodes, since it was deemed unnecessary. This
reduces the size of each namespace node by about 5%-10%
on all platforms. Nodes are now 20 bytes for the 32-bit
case, and 32 bytes for the 64-bit case.
Optimized several internal data structures to reduce
object size on 64-bit platforms by packing data within
the 64-bit alignment. This includes the frequently used
ACPI_OPERAND_OBJECT, of which there can be ~1000 static
instances corresponding to the namespace objects.
Added two new strings for the predefined _OSI method:
"Windows 2001.1 SP1" and "Windows 2006".
Split the allocation tracking mechanism out to a separate
file, from utalloc.c to uttrack.c. This mechanism appears
to be only useful for application-level code. Kernels may
wish to not include uttrack.c in distributions.
Removed all remnants of the obsolete ACPI_REPORT_* macros
and the associated code. (These macros have been replaced
by the ACPI_ERROR and ACPI_WARNING macros.)
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
ia64 ioremap is now smart enough to use the correct memory attributes, so
remove the EFI checks from osl.c.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Almost all users of the table addresses from the EFI system table want
physical addresses. So rather than doing the pa->va->pa conversion, just keep
physical addresses in struct efi.
This fixes a DMI bug: the efi structure contained the physical SMBIOS address
on x86 but the virtual address on ia64, so dmi_scan_machine() used ioremap()
on a virtual address on ia64.
This is essentially the same as an earlier patch by Matt Tolentino:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112130292316281&w=2
except that this changes all table addresses, not just ACPI addresses.
Matt's original patch was backed out because it caused MCAs on HP sx1000
systems. That problem is resolved by the ioremap() attribute checking added
for ia64.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Implemented support in the Resource Manager to allow
unresolved namestring references within resource package
objects for the _PRT method. This support is in addition
to the previously implemented unresolved reference
support within the AML parser. If the interpreter slack
mode is enabled (true on Linux unless acpi=strict),
these unresolved references will be passed through
to the caller as a NULL package entry.
http://bugzilla.kernel.org/show_bug.cgi?id=5741
Implemented and deployed new macros and functions for
error and warning messages across the subsystem. These
macros are simpler and generate less code than their
predecessors. The new macros ACPI_ERROR, ACPI_EXCEPTION,
ACPI_WARNING, and ACPI_INFO replace the ACPI_REPORT_*
macros.
Implemented the acpi_cpu_flags type to simplify host OS
integration of the Acquire/Release Lock OSL interfaces.
Suggested by Steven Rostedt and Andrew Morton.
Fixed a problem where Alias ASL operators are sometimes
not correctly resolved. causing AE_AML_INTERNAL
http://bugzilla.kernel.org/show_bug.cgi?id=5189http://bugzilla.kernel.org/show_bug.cgi?id=5674
Fixed several problems with the implementation of the
ConcatenateResTemplate ASL operator. As per the ACPI
specification, zero length buffers are now treated as a
single EndTag. One-length buffers always cause a fatal
exception. Non-zero length buffers that do not end with
a full 2-byte EndTag cause a fatal exception.
Fixed a possible structure overwrite in the
AcpiGetObjectInfo external interface. (With assistance
from Thomas Renninger)
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
According to the TCG specifications measurements or hashes of the BIOS code
and data are extended into TPM PCRS and a log is kept in an ACPI table of
these extensions for later validation if desired. This patch exports the
values in the ACPI table through a security-fs seq_file.
Signed-off-by: Seiji Munetoh <munetoh@jp.ibm.com>
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
Signed-off-by: Reiner Sailer <sailer@us.ibm.com>
Signed-off-by: Kylene Hall <kjhall@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Before this fix, the finite timeout case
behaved like the no-timeout (trylock) case.
http://bugzilla.kernel.org/show_bug.cgi?id=4588
Signed-off-by: Luming Yu <luming.yu@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Implemented support for the EM64T and other x86_64
processors. This essentially entails recognizing
that these processors support non-aligned memory
transfers. Previously, all 64-bit processors were assumed
to lack hardware support for non-aligned transfers.
Completed conversion of the Resource Manager to nearly
full table-driven operation. Specifically, the resource
conversion code (convert AML to internal format and the
reverse) and the debug code to dump internal resource
descriptors are fully table-driven, reducing code and data
size and improving maintainability.
The OSL interfaces for Acquire and Release Lock now use a
64-bit flag word on 64-bit processors instead of a fixed
32-bit word. (Alexey Starikovskiy)
Implemented support within the resource conversion code
for the Type-Specific byte within the various ACPI 3.0
*WordSpace macros.
Fixed some issues within the resource conversion code for
the type-specific flags for both Memory and I/O address
resource descriptors. For Memory, implemented support
for the MTP and TTP flags. For I/O, split the TRS and TTP
flags into two separate fields.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Use schedule_timeout_interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size. Also use
msecs_to_jiffies() instead of direct HZ division to avoid rounding errors.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>