Array fan->pwm[] is MLXREG_FAN_MAX_PWM elements in size, however the
for-loop has a off-by-one error causing index i to be out of range
causing an out of bounds read on the array. Fix this by replacing
the <= operator with < in the for-loop.
Addresses-Coverity: ("Out-of-bounds read")
Reported-by: Vadim Pasternak <vadimp@nvidia.com>
Fixes: 35edbaab3bbf ("hwmon: (mlxreg-fan) Extend driver to support multiply cooling devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210920180921.16246-1-colin.king@canonical.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Add support for additional cooling devices in order to support the
systems, which can be equipped with up-to four PWM controllers.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
I got memory leak as follows when doing fault injection test:
unreferenced object 0xffff888102740438 (size 8):
comm "27", pid 859, jiffies 4295031351 (age 143.992s)
hex dump (first 8 bytes):
68 77 6d 6f 6e 30 00 00 hwmon0..
backtrace:
[<00000000544b5996>] __kmalloc_track_caller+0x1a6/0x300
[<00000000df0d62b9>] kvasprintf+0xad/0x140
[<00000000d3d2a3da>] kvasprintf_const+0x62/0x190
[<000000005f8f0f29>] kobject_set_name_vargs+0x56/0x140
[<00000000b739e4b9>] dev_set_name+0xb0/0xe0
[<0000000095b69c25>] __hwmon_device_register+0xf19/0x1e50 [hwmon]
[<00000000a7e65b52>] hwmon_device_register_with_info+0xcb/0x110 [hwmon]
[<000000006f181e86>] devm_hwmon_device_register_with_info+0x85/0x100 [hwmon]
[<0000000081bdc567>] tmp421_probe+0x2d2/0x465 [tmp421]
[<00000000502cc3f8>] i2c_device_probe+0x4e1/0xbb0
[<00000000f90bda3b>] really_probe+0x285/0xc30
[<000000007eac7b77>] __driver_probe_device+0x35f/0x4f0
[<000000004953d43d>] driver_probe_device+0x4f/0x140
[<000000002ada2d41>] __device_attach_driver+0x24c/0x330
[<00000000b3977977>] bus_for_each_drv+0x15d/0x1e0
[<000000005bf2a8e3>] __device_attach+0x267/0x410
When device_register() returns an error, the name allocated in
dev_set_name() will be leaked, the put_device() should be used
instead of calling hwmon_dev_release() to give up the device
reference, then the name will be freed in kobject_cleanup().
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: bab2243ce1 ("hwmon: Introduce hwmon_device_register_with_groups")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012112758.2681084-1-yangyingliang@huawei.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
If driver read tmp value sufficient for
(tmp & 0x08) && (!(tmp & 0x80)) && ((tmp & 0x7) == ((tmp >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-3-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multi-line alignments]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
If driver read val value sufficient for
(val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-2-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multipline alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
If driver read val value sufficient for
(val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-1-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multi-line alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The bytes for max_power_out from the ibm-cffps devices differ in byte
order for some power supplies.
The Witherspoon power supply returns the bytes in MSB/LSB order.
The Rainier power supply returns the bytes in LSB/MSB order.
The Witherspoon power supply uses version cffps1. The Rainier power
supply should use version cffps2. If version is cffps1, swap the bytes
before output to max_power_out.
Tested:
Witherspoon before: 3148. Witherspoon after: 3148.
Rainier before: 53255. Rainier after: 2000.
Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20210928205051.1222815-1-bjwyman@gmail.com
[groeck: Replaced yoda programming]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Old code produces -24999 for 0b1110011100000000 input in standard format due to
always rounding up rather than "away from zero".
Use the common macro for division, unify and simplify the conversion code along
the way.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-3-fercerpav@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
For both local and remote sensors all the supported ICs can report an
"undervoltage lockout" condition which means the conversion wasn't
properly performed due to insufficient power supply voltage and so the
measurement results can't be trusted.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-2-fercerpav@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Function i2c_smbus_read_byte_data() can return a negative error number
instead of the data read if I2C transaction failed for whatever reason.
Lack of error checking can lead to serious issues on production
hardware, e.g. errors treated as temperatures produce spurious critical
temperature-crossed-threshold errors in BMC logs for OCP server
hardware. The patch was tested with Mellanox OCP Mezzanine card
emulating TMP421 protocol for temperature sensing which sometimes leads
to I2C protocol error during early boot up stage.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-1-fercerpav@gmail.com
[groeck: dropped unnecessary line breaks]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Fan speed minimum can be enforced from sysfs. For example, setting
current fan speed to 20 is used to enforce fan speed to be at 100%
speed, 19 - to be not below 90% speed, etcetera. This feature provides
ability to limit fan speed according to some system wise
considerations, like absence of some replaceable units or high system
ambient temperature.
Request for changing fan minimum speed is configuration request and can
be set only through 'sysfs' write procedure. In this situation value of
argument 'state' is above nominal fan speed maximum.
Return non-zero code in this case to avoid
thermal_cooling_device_stats_update() call, because in this case
statistics update violates thermal statistics table range.
The issues is observed in case kernel is configured with option
CONFIG_THERMAL_STATISTICS.
Here is the trace from KASAN:
[ 159.506659] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.516016] Read of size 4 at addr ffff888116163840 by task hw-management.s/7444
[ 159.545625] Call Trace:
[ 159.548366] dump_stack+0x92/0xc1
[ 159.552084] ? thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.635869] thermal_zone_device_update+0x345/0x780
[ 159.688711] thermal_zone_device_set_mode+0x7d/0xc0
[ 159.694174] mlxsw_thermal_modules_init+0x48f/0x590 [mlxsw_core]
[ 159.700972] ? mlxsw_thermal_set_cur_state+0x5a0/0x5a0 [mlxsw_core]
[ 159.731827] mlxsw_thermal_init+0x763/0x880 [mlxsw_core]
[ 160.070233] RIP: 0033:0x7fd995909970
[ 160.074239] Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ..
[ 160.095242] RSP: 002b:00007fff54f5d938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 160.103722] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fd995909970
[ 160.111710] RDX: 0000000000000013 RSI: 0000000001906008 RDI: 0000000000000001
[ 160.119699] RBP: 0000000001906008 R08: 00007fd995bc9760 R09: 00007fd996210700
[ 160.127687] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000013
[ 160.135673] R13: 0000000000000001 R14: 00007fd995bc8600 R15: 0000000000000013
[ 160.143671]
[ 160.145338] Allocated by task 2924:
[ 160.149242] kasan_save_stack+0x19/0x40
[ 160.153541] __kasan_kmalloc+0x7f/0xa0
[ 160.157743] __kmalloc+0x1a2/0x2b0
[ 160.161552] thermal_cooling_device_setup_sysfs+0xf9/0x1a0
[ 160.167687] __thermal_cooling_device_register+0x1b5/0x500
[ 160.173833] devm_thermal_of_cooling_device_register+0x60/0xa0
[ 160.180356] mlxreg_fan_probe+0x474/0x5e0 [mlxreg_fan]
[ 160.248140]
[ 160.249807] The buggy address belongs to the object at ffff888116163400
[ 160.249807] which belongs to the cache kmalloc-1k of size 1024
[ 160.263814] The buggy address is located 64 bytes to the right of
[ 160.263814] 1024-byte region [ffff888116163400, ffff888116163800)
[ 160.277536] The buggy address belongs to the page:
[ 160.282898] page:0000000012275840 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888116167000 pfn:0x116160
[ 160.294872] head:0000000012275840 order:3 compound_mapcount:0 compound_pincount:0
[ 160.303251] flags: 0x200000000010200(slab|head|node=0|zone=2)
[ 160.309694] raw: 0200000000010200 ffffea00046f7208 ffffea0004928208 ffff88810004dbc0
[ 160.318367] raw: ffff888116167000 00000000000a0006 00000001ffffffff 0000000000000000
[ 160.327033] page dumped because: kasan: bad access detected
[ 160.333270]
[ 160.334937] Memory state around the buggy address:
[ 160.356469] >ffff888116163800: fc ..
Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210916183151.869427-1-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Commit id "b00647c46c9d7f6ee1ff6aaf335906101755e614",
adds reporting current and voltage to k10temp.c
The commit id "0a4e668b5d52eed8026f5d717196b02b55fb2dc6",
removed reporting current and voltage from k10temp.c
The curr and in(voltage) entries are not removed from
"k10temp_info" structure. Removing those residue entries.
while at it, update k10temp driver documentation
Signed-off-by: suma hegde <suma.hegde@amd.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210902174155.7365-2-nchatrad@amd.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Merge more updates from Andrew Morton:
"147 patches, based on 7d2a07b769.
Subsystems affected by this patch series: mm (memory-hotplug, rmap,
ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
selftests, ipc, and scripts"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
scripts: check_extable: fix typo in user error message
mm/workingset: correct kernel-doc notations
ipc: replace costly bailout check in sysvipc_find_ipc()
selftests/memfd: remove unused variable
Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
configs: remove the obsolete CONFIG_INPUT_POLLDEV
prctl: allow to setup brk for et_dyn executables
pid: cleanup the stale comment mentioning pidmap_init().
kernel/fork.c: unexport get_{mm,task}_exe_file
coredump: fix memleak in dump_vma_snapshot()
fs/coredump.c: log if a core dump is aborted due to changed file permissions
nilfs2: use refcount_dec_and_lock() to fix potential UAF
nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
nilfs2: fix NULL pointer in nilfs_##name##_attr_release
nilfs2: fix memory leak in nilfs_sysfs_create_device_group
trap: cleanup trap_init()
init: move usermodehelper_enable() to populate_rootfs()
...
This driver exposes hardware sensors of the Aquacomputer D5 Next
watercooling pump, which communicates through a proprietary USB HID
protocol.
Available sensors are pump and fan speed, power, voltage and current, as
well as coolant temperature. Also available through debugfs are the serial
number, firmware version and power-on count.
Attaching a fan is optional and allows it to be controlled using
temperature curves directly from the pump. If it's not connected,
the fan-related sensors will report zeroes.
The pump can be configured either through software or via its physical
interface. Configuring the pump through this driver is not implemented,
as it seems to require sending it a complete configuration. That
includes addressable RGB LEDs, for which there is no standard sysfs
interface. Thus, that task is better suited for userspace tools.
This driver has been tested on x86_64, both in-kernel and as a module.
Signed-off-by: Aleksa Savic <savicaleksa83@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tdie is an offset calculation that should only be shown when temp_offset
is actually put into a table. This is useless to show for all CPU/APU.
Show it only when necessary.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
These follow the rest of the existing codepaths for families
17h and 19h.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
In the initial implementation a number of PMBUS_x_WARN_LIMITs were
mapped to MFR fields. This was incorrect as these MFR limits reflect the
rated limit as opposed to a limit which will generate warning. Instead
return -ENXIO like we were already doing for other WARN_LIMITs.
Subsequently these rated limits have been exposed generically as new
fields in the sysfs ABI so the values are still available.
Fixes: 15b2703e5e ("hwmon: (pmbus) Add driver for BluTek BPA-RS600")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Link: https://lore.kernel.org/r/20210812014000.26293-2-chris.packham@alliedtelesis.co.nz
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The HW has some predefined points where it will associate a PWM value.
However some users might want to better set these points to their
usecases. This patch exposes these points as pwm auto_points:
* pwm1_auto_point1_temp_hyst: temperature threshold below which PWM should
be 0%;
* pwm1_auto_point1_temp: temperature threshold above which PWM should be
25%;
* pwm1_auto_point2_temp_hyst: temperature threshold below which PWM should
be 25%;
* pwm1_auto_point2_temp: temperature threshold above which PWM should be
50%;
* pwm1_auto_point3_temp_hyst: temperature threshold below which PWM should
be 50%;
* pwm1_auto_point3_temp: temperature threshold above which PWM should be
75%;
* pwm1_auto_point4_temp_hyst: temperature threshold below which PWM should
be 75%;
* pwm1_auto_point4_temp: temperature threshold above which PWM should be
100%;
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20210811114853.159298-4-nuno.sa@analog.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The core will now start out of reset at boot as soon as clocking is
available. Hence, by the time we unmask the interrupts we already might
have some of them set. Thus, it's important to handle them in the
natural order the core generates them. Otherwise, we could process
'ADI_IRQ_SRC_PWM_CHANGED' before 'ADI_IRQ_SRC_TEMP_INCREASE' and
erroneously set 'update_tacho_params' to true.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20210811114853.159298-3-nuno.sa@analog.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>