Commit be9d73e649 ("platform/x86: hp-wmi: Fix 0x05 error code reported by
several WMI calls") and commit 12b19f14a2 ("platform/x86: hp-wmi: Fix
hp_wmi_read_int() reporting error (0x05)") cause the following error:
ACPI BIOS Error (bug): Attempt to CreateField of length zero
(20211217/dsopcode-133)
This happens because the ACPI method HWMC unconditionally creates a Field
of size (insize*8) bits:
CreateField (Arg1, 0x80, (Local5 * 0x08), DAIN)
When args->insize is 0, the Field size is 0, resulting in
an error.
Fix this by using a zero insize only if the 0x05 error code is returned.
Tested on Omen 15 AMD (2020) board ID: 8786.
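The idea can be sketched in userspace C as follows. This is only an
illustration, not the driver code: fake_firmware_query(), probe_zero_insize(),
read_insize() and the firmware_wants_zero_insize flag are hypothetical
stand-ins. Only the 0x05 return code is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RET_INVALID_PARAMETERS 0x05 /* error code named in the commit message */

/* Hypothetical firmware model: some BIOSes reject a non-zero insize on reads. */
static bool firmware_wants_zero_insize;

/* Hypothetical stand-in for a WMI read query; returns 0 on success. */
static int fake_firmware_query(size_t insize)
{
	if (firmware_wants_zero_insize && insize != 0)
		return RET_INVALID_PARAMETERS;
	return 0;
}

/* Set once by probe_zero_insize(); false means "always pass a real insize". */
static bool zero_insize_support;

/*
 * Probe with a non-zero insize first, so CreateField never sees a length of
 * zero unless the firmware asked for it by returning 0x05.
 */
static void probe_zero_insize(void)
{
	int ret = fake_firmware_query(sizeof(int));

	zero_insize_support = (ret == RET_INVALID_PARAMETERS);
}

/* Pick the insize to use for later read queries. */
static size_t read_insize(size_t wanted)
{
	assert(wanted > 0);
	return zero_insize_support ? 0 : wanted;
}
```

With this shape, a BIOS that never returns 0x05 is never handed a zero-length
input buffer.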
Fixes: be9d73e649 ("platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls")
Signed-off-by: Bedant Patnaik <bedant.patnaik@gmail.com>
Tested-by: Jorge Lopez <jorge.lopez2@hp.com>
Link: https://lore.kernel.org/r/41be46743d21c78741232a47bbb5f1cdbcc3d21e.camel@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
WMI queries fail on some devices where the ACPI method HWMC
unconditionally attempts to create Fields beyond the end of the
buffer when the buffer is too small. This breaks essential features
such as power profiles:
CreateByteField (Arg1, 0x10, D008)
CreateByteField (Arg1, 0x11, D009)
CreateByteField (Arg1, 0x12, D010)
CreateDWordField (Arg1, 0x10, D032)
CreateField (Arg1, 0x80, 0x0400, D128)
In cases where args->data had zero length, the following errors
were reported:
ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D008] at bit
offset/length 128/8 exceeds size of target Buffer (128 bits)
(20211217/dsopcode-198)
ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D009] at bit
offset/length 136/8 exceeds size of target Buffer (136bits)
(20211217/dsopcode-198)
The original code allocated a 128-byte buffer regardless of whether
the WMI call required a smaller one. The failing behavior occurs with
older BIOS versions and was reproduced on OMEN laptops. Newer BIOS
versions handle buffer sizes properly and meet the latest specification
requirements, which is why testing with a dynamically allocated buffer
did not uncover any failures on the test systems at hand.
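The arithmetic behind the fixed-size buffer can be checked with a small
userspace sketch. The struct below is an assumed layout, not copied from the
driver: a 16-byte header is inferred from the error message above (a buffer
with no data is 128 bits long), and the field names and fill_args() helper are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DATA_BYTES 128 /* fixed payload size described above */

/* Assumed layout: four 32-bit header words followed by the payload. */
struct bios_args_sketch {
	uint32_t signature;
	uint32_t command;
	uint32_t commandtype;
	uint32_t datasize;
	uint8_t data[DATA_BYTES];
};

/* Fill the fixed-size buffer; payload bytes beyond insize stay zero. */
static void fill_args(struct bios_args_sketch *args, const void *in,
		      size_t insize)
{
	assert(args != NULL && insize <= DATA_BYTES);
	memset(args, 0, sizeof(*args));
	args->datasize = (uint32_t)insize;
	if (insize)
		memcpy(args->data, in, insize);
}

/* Bits the AML touches: CreateField (Arg1, 0x80, 0x0400, D128). */
static size_t aml_required_bits(void)
{
	return 0x80 + 0x0400;
}
```

Under these assumptions the payload starts at bit offset 0x80 and the whole
structure is exactly 0x80 + 0x0400 bits long, so every Field the AML creates
fits even when the caller passes no input data.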
This patch was tested on several OMEN, Elite, and Zbook notebooks. It
was confirmed that the patch resolves HPWMI_FAN GET/SET calls on an OMEN
Laptop 15-ek0xxx. No problems were reported when testing on several Elite
and Zbook notebooks.
Fixes: 4b4967cbd2 ("platform/x86: hp-wmi: Changing bios_args.data to be dynamically allocated")
Signed-off-by: Jorge Lopez <jorge.lopez2@hp.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220608212923.8585-2-jorge.lopez2@hp.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The driver uses functions from a compilation unit which is enabled
by CONFIG_CPU_SUP_INTEL. Add that dependency to Kconfig explicitly,
otherwise the build fails with:
drivers/platform/x86/intel/ifs/load.o: in function `ifs_load_firmware':
load.c:(.text+0x3b8): undefined reference to `intel_cpu_collect_info'
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/YoZay8YR0zRGyVu+@zn.tnic
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add documentation for In-Field Scan (IFS). This documentation
describes the basics of IFS, loading the IFS image, chunk
authentication, running a scan, and how to check the result via sysfs.
The CORE_CAPABILITIES MSR enumerates whether IFS is supported.
The full github location for distributing the IFS images is
still being decided, so only a placeholder is included in the
documentation for now.
Future CPUs will support more than one type of test. Plan for
that now by using a "_0" suffix on the ABI directory names.
Additional test types will use "_1", etc.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-13-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Implement sysfs interface to trigger ifs test for a specific cpu.
Additional interfaces related to checking the status of the
scan test and seeing the version of the loaded IFS binary
are also added.
The basic usage is as below.
- To start test, for example on cpu5:
echo 5 > /sys/devices/platform/intel_ifs/run_test
- To see the status of the last test:
cat /sys/devices/platform/intel_ifs/status
- To see the version of the loaded scan binary
cat /sys/devices/platform/intel_ifs/image_version
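The same steps can be driven from a program instead of the shell. The helpers
below are a minimal userspace sketch with hypothetical names
(write_cpu_to_attr(), read_attr()); they only wrap ordinary file I/O and make
no assumption about the contents of the status file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a CPU number to an attribute file, as "echo 5 > .../run_test" does. */
static int write_cpu_to_attr(const char *path, int cpu)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", cpu) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the first line of an attribute file into buf, without the newline. */
static int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL && len > 0);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would pass /sys/devices/platform/intel_ifs/run_test to the first
helper and the status or image_version path to the second.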
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-10-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
In a core, the scan engine is shared between sibling cpus.
When a Scan test (for a particular core) is triggered by the user,
the scan chunks are executed on all the threads on the core using
stop_core_cpuslocked.
A scan test may be aborted in certain circumstances, such as when an
interrupt occurs or the cpu does not have enough power budget for the
scan. In this case, the kernel restarts the scan from the chunk where it
stopped. The scan is also aborted when the test fails. In that case, the
test is stopped immediately without retry.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-9-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The IFS image contains hashes that will be used to authenticate the ifs
test chunks. First, use WRMSR to copy the hashes and enumerate the number
of test chunks, chunk size and the maximum number of cores that can run
scan test simultaneously.
Next, use WRMSR to authenticate each and every scan test chunk which is
stored in the IFS image. The CPU will check if the test chunks match
the hashes, otherwise failure is indicated to system software. If the test
chunk is authenticated, it is automatically copied to secured memory.
Use schedule_work_on() to perform the hash copy and authentication. Note
this needs only be done on the first logical cpu of each socket.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-8-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Driver probe routine allocates structure to communicate status
and parameters between functions in the driver. Also call
load_ifs_binary() to load the scan image file.
There is a separate scan image file for each processor family,
model, stepping combination. This is read from the static path:
/lib/firmware/intel/ifs/{ff-mm-ss}.scan
Step 1 in loading is to generate the correct path and use
request_firmware_direct() to load into memory.
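Path generation can be sketched in userspace C as below. The helper name
ifs_scan_path() is hypothetical, and formatting each field as two lowercase
hex digits is an assumption about the {ff-mm-ss} pattern. The path is built
relative to the firmware directory, on the assumption that the firmware
loader prepends /lib/firmware.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build "intel/ifs/ff-mm-ss.scan" from family, model and stepping.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int ifs_scan_path(char *buf, size_t len, unsigned int family,
			 unsigned int model, unsigned int stepping)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "intel/ifs/%02x-%02x-%02x.scan",
		     family, model, stepping);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```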
Subsequent patches will use the IFS MSR interfaces to copy
the image to BIOS reserved memory and validate the SHA256
checksums.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-6-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cloud Service Providers that operate fleets of servers have reported
[1] occasions where they can detect that a CPU has gone bad due to
effects like electromigration, or isolated manufacturing defects.
However, that detection method is A/B testing seemingly random
application failures looking for a pattern. In-Field Scan (IFS) is
a driver for a platform capability to load a crafted 'scan image'
to run targeted low level diagnostics outside of the CPU's architectural
error detection capabilities.
This stub version of the driver only performs the initial part of the
check for the IFS feature: MSR_IA32_CORE_CAPS must enumerate the
presence of the MSR_INTEGRITY_CAPS MSR.
[1]: https://www.youtube.com/watch?v=QMF3rqhjYuM
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-5-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
If neither CONFIG_SUSPEND nor CONFIG_DEBUG_FS is set, the build
fails with:
drivers/platform/x86/amd-pmc.c:323:12: error: ‘get_metrics_table’ defined but not used [-Werror=unused-function]
static int get_metrics_table(struct amd_pmc_dev *pdev, struct smu_metrics *table)
^~~~~~~~~~~~~~~~~
drivers/platform/x86/amd-pmc.c:298:12: error: ‘amd_pmc_idlemask_read’ defined but not used [-Werror=unused-function]
static int amd_pmc_idlemask_read(struct amd_pmc_dev *pdev, struct device *dev,
^~~~~~~~~~~~~~~~~~~~~
drivers/platform/x86/amd-pmc.c:196:12: error: ‘amd_pmc_get_smu_version’ defined but not used [-Werror=unused-function]
static int amd_pmc_get_smu_version(struct amd_pmc_dev *dev)
^~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
To fix the build failure, wrap all related code with CONFIG_SUSPEND or CONFIG_DEBUG_FS.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ren Zhijie <renzhijie2@huawei.com>
Link: https://lore.kernel.org/r/20220505121958.138905-1-renzhijie2@huawei.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
There was an issue with the dual fan probe whereby the probe was
failing because it assumed that second_fan support was not available.
Corrected the logic so the probe works correctly. Cleaned up so that
quirks are only used if the 2nd fan is not detected.
Tested on X1 Carbon 10 (2 fans), X1 Carbon 9 (2 fans) and T490 (1 fan)
Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20220502191200.63470-1-markpearson@lenovo.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Lenovo laptops that contain NVME SSDs across a variety of generations have
trouble resuming from suspend to idle when the IOMMU translation layer is
active for the NVME storage device.
This generally manifests as a large resume delay or page faults. These
delays and page faults occur as a result of a Lenovo BIOS specific SMI
that runs during the D3->D0 transition on NVME devices.
This SMI occurs because of a flag that is set during resume by Lenovo
firmware:
```
OperationRegion (PM80, SystemMemory, 0xFED80380, 0x10)
Field (PM80, AnyAcc, NoLock, Preserve)
{
SI3R, 1
}
Method (_ON, 0, NotSerialized) // _ON_: Power On
{
TPST (0x60D0)
If ((DAS3 == 0x00))
{
If (SI3R)
{
TPST (0x60E0)
M020 (NBRI, 0x00, 0x00, 0x04, (NCMD | 0x06))
M020 (NBRI, 0x00, 0x00, 0x10, NBAR)
APMC = HDSI /* \HDSI */
SLPS = 0x01
SI3R = 0x00
TPST (0x60E1)
}
D0NV = 0x01
}
}
```
Create a quirk that will run early in the resume process to prevent this
SMI from running. As any of these machines are fixed, they can be peeled
back from this quirk or narrowed down to individual firmware versions.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1910
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1689
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20220429030501.1909-3-mario.limonciello@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The dcdbas driver is used to call SMI handlers for both dcdbas and
dell-smbios-smm. Both drivers allocate a buffer for communicating
with the SMI handler. The physical buffer address is then passed to
the called SMI handler via %ebx.
Unfortunately this doesn't work when running in Xen dom0, as the
physical address obtained via virt_to_phys() is only a guest physical
address, and not a machine physical address as needed by SMI.
The problem in dcdbas is easy to correct, as dcdbas is using
dma_alloc_coherent() for allocating the buffer, and the machine
physical address is available via the DMA address returned in the DMA
handle.
In order to avoid duplicating the buffer allocation code in
dell-smbios-smm, add a generic buffer allocation function to dcdbas
and use it for both drivers. This adds no new driver dependency, as
dell-smbios-smm is already calling dcdbas to generate the SMI
request.
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220318150950.16843-1-jgross@suse.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Sparse reports this issue:
core.c: note: in included file:
core.h:239:12: warning: symbol 'pmc_lpm_modes' was not declared. Should it be static?
Global variables should not be defined in headers. This only works
because core.h is only included by core.c. Single file use
variables should be static, so change its storage-class specifier
to static.
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20220423123048.591405-1-trix@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Before this commit fan_curve_check_present() was trying to not cause
the probe to fail on devices without fan curve control by testing for
known error codes returned by asus_wmi_evaluate_method_buf().
It checked for ENODATA or ENODEV, with the latter being returned by this
function when an ACPI integer with a value of ASUS_WMI_UNSUPPORTED_METHOD
is returned. But for other ACPI integer returns this function just returns
them as is, including the ASUS_WMI_DSTS_UNKNOWN_BIT value of 2.
On the Asus U36SD ASUS_WMI_DSTS_UNKNOWN_BIT gets returned, leading to:
asus-nb-wmi: probe of asus-nb-wmi failed with error 2
Instead of playing whack-a-mole with error codes here, simply treat all
errors as there not being any fan curves, fixing the driver no longer
loading on the Asus U36SD laptop.
Fixes: e3d13da7f7 ("platform/x86: asus-wmi: Fix regression when probing for fan curve control")
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2079125
Cc: Luke D. Jones <luke@ljones.dev>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220427114956.332919-1-hdegoede@redhat.com
When CONFIG_DEBUG_FS is disabled, amd_pmc_get_smu_version() is unused:
drivers/platform/x86/amd-pmc.c:196:12: warning: unused function 'amd_pmc_get_smu_version' [-Wunused-function]
static int amd_pmc_get_smu_version(struct amd_pmc_dev *dev)
^
1 warning generated.
Eliminate the warning by moving amd_pmc_get_smu_version() to the
CONFIG_DEBUG_FS block where it is used.
Fixes: b0c07116c8 ("platform/x86: amd-pmc: Avoid reading SMU version at probe time")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20220418213800.21257-1-nathan@kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
SMU logging is currently set up when the device is probed.
In analyzing boot performance it was observed that amd_pmc_probe is
taking ~116800us on startup on an OEM platform. This is longer than
expected, and is caused by enabling SMU logging at startup.
As the SMU logging is only needed for debugging, initialize it only upon
use. This decreases the time for amd_pmc_probe to ~28800us.
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20220411143820.13971-1-mario.limonciello@amd.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>