Provide different names for cooling devices registration to allow
binding each cooling devices to relevant thermal zone. Thus, specific
cooling device can be associated with related thermal sensor by setting
thermal cooling device type for example to "mlxreg_fan2" and passing
this type to thermal_zone_bind_cooling_device() through 'cdev->type'.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210926053541.1806937-3-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Validate PWM connectivity only for additional PWM - "pwm1" is connected
on all systems, while "pwm2" - "pwm4" are optional. Validate
connectivity only for optional attributes by reading of related "pwm{n}"
registers - in case "pwm{n}" is not connected, register value is
supposed to be 0xff.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210926053541.1806937-2-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
ASUS Pro WS X570-ACE board has got an nct6775 chip, but by default
there's no use of it because of resource conflict:
```
ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20210604/utaddress-204
)
ACPI: OSL: Resource conflict; ACPI support missing from driver?
ACPI: OSL: Resource conflict: System may be unstable or behave erratically
```
A workaround is to use `acpi_enforce_resources=lax`, but a proper
support needs to be added instead.
This commit adds Pro WS X570-ACE to the list of boards that can be monitored
using ASUS WMI.
Tested by me on this hardware:
```
Base Board Information
Manufacturer: ASUSTeK COMPUTER INC.
Product Name: Pro WS X570-ACE
BIOS Information
Vendor: American Megatrends Inc.
Version: 3801
Release Date: 07/30/2021
```
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Link: https://lore.kernel.org/r/20211003133344.9036-2-oleksandr@natalenko.name
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The appropriate mantissa values for the lm25066 family's direct-format
current and power readings are a function of the sense resistor
employed between the SENSE and VIN pins of the chip. Instead of
assuming that resistance is always the same 1mOhm as used in the
datasheet, allow it to be configured via a device-tree property
("shunt-resistor-micro-ohms").
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Link: https://lore.kernel.org/r/20210928092242.30036-8-zev@bewilderbeest.net
[groeck: Fixed checkpatch warnings]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The driver doesn't have a struct of_device_id table but supported devices
are registered via Device Trees. This is working on the assumption that a
I2C device registered via OF will always match a legacy I2C device ID and
that the MODALIAS reported will always be of the form i2c:<device>.
But this could change in the future so the correct approach is to have an
OF device ID table if the devices are registered via OF.
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Link: https://lore.kernel.org/r/20210928092242.30036-7-zev@bewilderbeest.net
[groeck: Replaced reference to reasoning with reasoning, fixed checkpatch
warnings, fixed compile warning comparing of_id->data w/ i2c_id->driver_data]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Array fan->pwm[] is MLXREG_FAN_MAX_PWM elements in size, however the
for-loop has a off-by-one error causing index i to be out of range
causing an out of bounds read on the array. Fix this by replacing
the <= operator with < in the for-loop.
Addresses-Coverity: ("Out-of-bounds read")
Reported-by: Vadim Pasternak <vadimp@nvidia.com>
Fixes: 35edbaab3bbf ("hwmon: (mlxreg-fan) Extend driver to support multiply cooling devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210920180921.16246-1-colin.king@canonical.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Add support for additional cooling devices in order to support the
systems, which can be equipped with up-to four PWM controllers.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
I got memory leak as follows when doing fault injection test:
unreferenced object 0xffff888102740438 (size 8):
comm "27", pid 859, jiffies 4295031351 (age 143.992s)
hex dump (first 8 bytes):
68 77 6d 6f 6e 30 00 00 hwmon0..
backtrace:
[<00000000544b5996>] __kmalloc_track_caller+0x1a6/0x300
[<00000000df0d62b9>] kvasprintf+0xad/0x140
[<00000000d3d2a3da>] kvasprintf_const+0x62/0x190
[<000000005f8f0f29>] kobject_set_name_vargs+0x56/0x140
[<00000000b739e4b9>] dev_set_name+0xb0/0xe0
[<0000000095b69c25>] __hwmon_device_register+0xf19/0x1e50 [hwmon]
[<00000000a7e65b52>] hwmon_device_register_with_info+0xcb/0x110 [hwmon]
[<000000006f181e86>] devm_hwmon_device_register_with_info+0x85/0x100 [hwmon]
[<0000000081bdc567>] tmp421_probe+0x2d2/0x465 [tmp421]
[<00000000502cc3f8>] i2c_device_probe+0x4e1/0xbb0
[<00000000f90bda3b>] really_probe+0x285/0xc30
[<000000007eac7b77>] __driver_probe_device+0x35f/0x4f0
[<000000004953d43d>] driver_probe_device+0x4f/0x140
[<000000002ada2d41>] __device_attach_driver+0x24c/0x330
[<00000000b3977977>] bus_for_each_drv+0x15d/0x1e0
[<000000005bf2a8e3>] __device_attach+0x267/0x410
When device_register() returns an error, the name allocated in
dev_set_name() will be leaked, the put_device() should be used
instead of calling hwmon_dev_release() to give up the device
reference, then the name will be freed in kobject_cleanup().
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: bab2243ce1 ("hwmon: Introduce hwmon_device_register_with_groups")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012112758.2681084-1-yangyingliang@huawei.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
If driver read tmp value sufficient for
(tmp & 0x08) && (!(tmp & 0x80)) && ((tmp & 0x7) == ((tmp >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-3-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multi-line alignments]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
If driver read val value sufficient for
(val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-2-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multipline alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
If driver read val value sufficient for
(val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-1-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multi-line alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The bytes for max_power_out from the ibm-cffps devices differ in byte
order for some power supplies.
The Witherspoon power supply returns the bytes in MSB/LSB order.
The Rainier power supply returns the bytes in LSB/MSB order.
The Witherspoon power supply uses version cffps1. The Rainier power
supply should use version cffps2. If version is cffps1, swap the bytes
before output to max_power_out.
Tested:
Witherspoon before: 3148. Witherspoon after: 3148.
Rainier before: 53255. Rainier after: 2000.
Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20210928205051.1222815-1-bjwyman@gmail.com
[groeck: Replaced yoda programming]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Old code produces -24999 for 0b1110011100000000 input in standard format due to
always rounding up rather than "away from zero".
Use the common macro for division, unify and simplify the conversion code along
the way.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-3-fercerpav@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
For both local and remote sensors all the supported ICs can report an
"undervoltage lockout" condition which means the conversion wasn't
properly performed due to insufficient power supply voltage and so the
measurement results can't be trusted.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-2-fercerpav@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Function i2c_smbus_read_byte_data() can return a negative error number
instead of the data read if I2C transaction failed for whatever reason.
Lack of error checking can lead to serious issues on production
hardware, e.g. errors treated as temperatures produce spurious critical
temperature-crossed-threshold errors in BMC logs for OCP server
hardware. The patch was tested with Mellanox OCP Mezzanine card
emulating TMP421 protocol for temperature sensing which sometimes leads
to I2C protocol error during early boot up stage.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-1-fercerpav@gmail.com
[groeck: dropped unnecessary line breaks]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Fan speed minimum can be enforced from sysfs. For example, setting
current fan speed to 20 is used to enforce fan speed to be at 100%
speed, 19 - to be not below 90% speed, etcetera. This feature provides
ability to limit fan speed according to some system wise
considerations, like absence of some replaceable units or high system
ambient temperature.
Request for changing fan minimum speed is configuration request and can
be set only through 'sysfs' write procedure. In this situation value of
argument 'state' is above nominal fan speed maximum.
Return non-zero code in this case to avoid
thermal_cooling_device_stats_update() call, because in this case
statistics update violates thermal statistics table range.
The issues is observed in case kernel is configured with option
CONFIG_THERMAL_STATISTICS.
Here is the trace from KASAN:
[ 159.506659] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.516016] Read of size 4 at addr ffff888116163840 by task hw-management.s/7444
[ 159.545625] Call Trace:
[ 159.548366] dump_stack+0x92/0xc1
[ 159.552084] ? thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.635869] thermal_zone_device_update+0x345/0x780
[ 159.688711] thermal_zone_device_set_mode+0x7d/0xc0
[ 159.694174] mlxsw_thermal_modules_init+0x48f/0x590 [mlxsw_core]
[ 159.700972] ? mlxsw_thermal_set_cur_state+0x5a0/0x5a0 [mlxsw_core]
[ 159.731827] mlxsw_thermal_init+0x763/0x880 [mlxsw_core]
[ 160.070233] RIP: 0033:0x7fd995909970
[ 160.074239] Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ..
[ 160.095242] RSP: 002b:00007fff54f5d938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 160.103722] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fd995909970
[ 160.111710] RDX: 0000000000000013 RSI: 0000000001906008 RDI: 0000000000000001
[ 160.119699] RBP: 0000000001906008 R08: 00007fd995bc9760 R09: 00007fd996210700
[ 160.127687] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000013
[ 160.135673] R13: 0000000000000001 R14: 00007fd995bc8600 R15: 0000000000000013
[ 160.143671]
[ 160.145338] Allocated by task 2924:
[ 160.149242] kasan_save_stack+0x19/0x40
[ 160.153541] __kasan_kmalloc+0x7f/0xa0
[ 160.157743] __kmalloc+0x1a2/0x2b0
[ 160.161552] thermal_cooling_device_setup_sysfs+0xf9/0x1a0
[ 160.167687] __thermal_cooling_device_register+0x1b5/0x500
[ 160.173833] devm_thermal_of_cooling_device_register+0x60/0xa0
[ 160.180356] mlxreg_fan_probe+0x474/0x5e0 [mlxreg_fan]
[ 160.248140]
[ 160.249807] The buggy address belongs to the object at ffff888116163400
[ 160.249807] which belongs to the cache kmalloc-1k of size 1024
[ 160.263814] The buggy address is located 64 bytes to the right of
[ 160.263814] 1024-byte region [ffff888116163400, ffff888116163800)
[ 160.277536] The buggy address belongs to the page:
[ 160.282898] page:0000000012275840 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888116167000 pfn:0x116160
[ 160.294872] head:0000000012275840 order:3 compound_mapcount:0 compound_pincount:0
[ 160.303251] flags: 0x200000000010200(slab|head|node=0|zone=2)
[ 160.309694] raw: 0200000000010200 ffffea00046f7208 ffffea0004928208 ffff88810004dbc0
[ 160.318367] raw: ffff888116167000 00000000000a0006 00000001ffffffff 0000000000000000
[ 160.327033] page dumped because: kasan: bad access detected
[ 160.333270]
[ 160.334937] Memory state around the buggy address:
[ 160.356469] >ffff888116163800: fc ..
Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210916183151.869427-1-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Commit id "b00647c46c9d7f6ee1ff6aaf335906101755e614",
adds reporting current and voltage to k10temp.c
The commit id "0a4e668b5d52eed8026f5d717196b02b55fb2dc6",
removed reporting current and voltage from k10temp.c
The curr and in(voltage) entries are not removed from
"k10temp_info" structure. Removing those residue entries.
while at it, update k10temp driver documentation
Signed-off-by: suma hegde <suma.hegde@amd.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210902174155.7365-2-nchatrad@amd.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Merge more updates from Andrew Morton:
"147 patches, based on 7d2a07b769.
Subsystems affected by this patch series: mm (memory-hotplug, rmap,
ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
selftests, ipc, and scripts"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
scripts: check_extable: fix typo in user error message
mm/workingset: correct kernel-doc notations
ipc: replace costly bailout check in sysvipc_find_ipc()
selftests/memfd: remove unused variable
Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
configs: remove the obsolete CONFIG_INPUT_POLLDEV
prctl: allow to setup brk for et_dyn executables
pid: cleanup the stale comment mentioning pidmap_init().
kernel/fork.c: unexport get_{mm,task}_exe_file
coredump: fix memleak in dump_vma_snapshot()
fs/coredump.c: log if a core dump is aborted due to changed file permissions
nilfs2: use refcount_dec_and_lock() to fix potential UAF
nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
nilfs2: fix NULL pointer in nilfs_##name##_attr_release
nilfs2: fix memory leak in nilfs_sysfs_create_device_group
trap: cleanup trap_init()
init: move usermodehelper_enable() to populate_rootfs()
...
This driver exposes hardware sensors of the Aquacomputer D5 Next
watercooling pump, which communicates through a proprietary USB HID
protocol.
Available sensors are pump and fan speed, power, voltage and current, as
well as coolant temperature. Also available through debugfs are the serial
number, firmware version and power-on count.
Attaching a fan is optional and allows it to be controlled using
temperature curves directly from the pump. If it's not connected,
the fan-related sensors will report zeroes.
The pump can be configured either through software or via its physical
interface. Configuring the pump through this driver is not implemented,
as it seems to require sending it a complete configuration. That
includes addressable RGB LEDs, for which there is no standard sysfs
interface. Thus, that task is better suited for userspace tools.
This driver has been tested on x86_64, both in-kernel and as a module.
Signed-off-by: Aleksa Savic <savicaleksa83@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tdie is an offset calculation that should only be shown when temp_offset
is actually put into a table. This is useless to show for all CPU/APU.
Show it only when necessary.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
These follow the rest of the existing codepaths for families
17h and 19h.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>