Testing the DAMON debugfs files while DAMON is running makes no sense,
as any write to the debugfs files will fail. This commit makes the test
be skipped in this case.
Link: https://lkml.kernel.org/r/20211201150440.1088-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A couple of test functions in DAMON virtual address space monitoring
primitives implementation has unnecessary damon_ctx variables. This
commit removes those.
Link: https://lkml.kernel.org/r/20211201150440.1088-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On some configuration[1], 'damon_test_split_evenly()' kunit test
function has >1024 bytes frame size, so below build warning is
triggered:
CC mm/damon/vaddr.o
In file included from mm/damon/vaddr.c:672:
mm/damon/vaddr-test.h: In function 'damon_test_split_evenly':
mm/damon/vaddr-test.h:309:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
309 | }
| ^
This commit fixes the warning by separating the common logic in the
function.
[1] https://lore.kernel.org/linux-mm/202111182146.OV3C4uGr-lkp@intel.com/
Link: https://lkml.kernel.org/r/20211201150440.1088-6-sj@kernel.org
Fixes: 17ccae8bb5 ("mm/damon: add kunit tests")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The DAMON virtual address space monitoring primitive prints a warning
message for wrong DAMOS action. However, it is not essential as the
code returns appropriate failure in the case. This commit removes the
message to make the log clean.
Link: https://lkml.kernel.org/r/20211201150440.1088-5-sj@kernel.org
Fixes: 6dea8add4d ("mm/damon/vaddr: support DAMON-based Operation Schemes")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
DAMON core prints error messages when damon_target object creation is
failed or wrong monitoring attributes are given. Because appropriate
error code is returned for each case, the messages are not essential.
Also, because the code path can be triggered with user-specified input,
this could result in kernel log mistakenly being messy. To avoid the
case, this commit removes the messages.
Link: https://lkml.kernel.org/r/20211201150440.1088-4-sj@kernel.org
Fixes: 4bc05954d0 ("mm/damon: implement a debugfs-based user space interface")
Fixes: b9a6ac4e4e ("mm/damon: adaptively adjust regions")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When wrong scheme action is requested via the debugfs interface, DAMON
prints an error message. Because the function returns error code, this
is not really needed. Because the code path is triggered by the user
specified input, this can result in kernel log mistakenly being messy.
To avoid the case, this commit removes the message.
Link: https://lkml.kernel.org/r/20211201150440.1088-3-sj@kernel.org
Fixes: af122dd8f3 ("mm/damon/dbgfs: support DAMON-based Operation Schemes")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm/damon: Trivial fixups and improvements".
This patchset contains trivial fixups and improvements for DAMON and its
kunit/kselftest tests.
This patch (of 11):
DAMON is using hrtimer if requested sleep time is <=100ms, while the
suggested threshold[1] is <=20ms. This commit applies the threshold.
[1] Documentation/timers/timers-howto.rst
Link: https://lkml.kernel.org/r/20211201150440.1088-2-sj@kernel.org
Fixes: ee801b7dd7 ("mm/damon/schemes: activate schemes based on a watermarks mechanism")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because DAMON sleeps in uninterruptible mode, /proc/loadavg reports fake
load while DAMON is turned on, though it is doing nothing. This can
confuse users[1]. To avoid the case, this commit makes DAMON sleeps in
idle mode.
[1] https://lore.kernel.org/all/11868371.O9o76ZdvQC@natalenko.name/
Link: https://lkml.kernel.org/r/20211126145015.15862-3-sj@kernel.org
Fixes: 2224d84854 ("mm: introduce Data Access MONitor (DAMON)")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: SeongJae Park <sj@kernel.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm/damon: Fix fake /proc/loadavg reports", v3.
This patchset fixes DAMON's fake load report issue. The first patch
makes yet another variant of usleep_range() for this fix, and the second
patch fixes the issue of DAMON by making it using the newly introduced
function.
This patch (of 2):
Some kernel threads such as DAMON could need to repeatedly sleep in
micro seconds level. Because usleep_range() sleeps in uninterruptible
state, however, such threads would make /proc/loadavg reports fake load.
To help such cases, this commit implements a variant of usleep_range()
called usleep_idle_range(). It is same to usleep_range() but sets the
state of the current task as TASK_IDLE while sleeping.
Link: https://lkml.kernel.org/r/20211126145015.15862-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211126145015.15862-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: John Stultz <john.stultz@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pages are individually marked as suffering from hardware poisoning.
Checking that the head page is not hardware poisoned doesn't make
sense; we might be after a subpage. We check each page individually
before we use it, so this was an optimisation gone wrong. It will
cause us to fall back to the slow path when there was no need to do
that
Link: https://lkml.kernel.org/r/20211120174429.2596303-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove myself from kdump maintainers as I have no enough time to maintain
it now. But I can review patches on demand though.
Link: https://lkml.kernel.org/r/YZyKilzKFsWJYdgn@dhcp-128-65.nay.redhat.com
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This limit has not been updated since 2008, when it was increased to 64
KiB at the request of GnuPG. Until recently, the main use-cases for this
feature were (1) preventing sensitive memory from being swapped, as in
GnuPG's use-case; and (2) real-time use-cases. In the first case, little
memory is called for, and in the second case, the user is generally in a
position to increase it if they need more.
The introduction of IOURING_REGISTER_BUFFERS adds a third use-case:
preparing fixed buffers for high-performance I/O. This use-case will take
as much of this memory as it can get, but is still limited to 64 KiB by
default, which is very little. This increases the limit to 8 MB, which
was chosen fairly arbitrarily as a more generous, but still conservative,
default value.
It is also possible to raise this limit in userspace. This is easily
done, for example, in the use-case of a network daemon: systemd, for
instance, provides for this via LimitMEMLOCK in the service file; OpenRC
via the rc_ulimit variables. However, there is no established userspace
facility for configuring this outside of daemons: end-user applications do
not presently have access to a convenient means of raising their limits.
The buck, as it were, stops with the kernel. It's much easier to address
it here than it is to bring it to hundreds of distributions, and it can
only realistically be relied upon to be high-enough by end-user software
if it is more-or-less ubiquitous. Most distros don't change this
particular rlimit from the kernel-supplied default value, so a change here
will easily provide that ubiquity.
Link: https://lkml.kernel.org/r/20211028080813.15966-1-sir@cmpwn.com
Signed-off-by: Drew DeVault <sir@cmpwn.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Cyril Hrubis <chrubis@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andrew Dona-Couch <andrew@donacou.ch>
Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the definition of one of the Tiger Lake MMIO registers in the
int340x thermal driver (Sumeet Pawnikar).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGzrbMSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxVokQAIpCQJ0m7qrG61hcrDHEKjOJFVrWs3CF
ombKEUZHTqqjvzzETMfiJQjq8+FXdpWFMQPzUdNm5NuhOkJY711O8N6WWKwStC9P
HGPDAoI5mzR8TpdZjeCMSFgf7sVunrZfWROfdxd/L5ySJby6sCGOJhJ11Cre/shY
3Pq3WhjJj6WA7eK6eyv+dTibD+l8qdkhNZmuaJJoi6/gVF/T7w+r+6FNbT/cyJc+
rT3ETCs5vVK9D5Xv1/URj5km+6mCVHXl/KLZ4gzAo2h6WXfbF+Sh2VTaml9a36Vv
plzJh5S/3Npg1ijcswiHzCgTqlS7ba4raxv4PpMS35fQbjELCHDiqcYJ/C911lpx
FYZtJ9+UMRKsAc42pFJ9lvso6nyhYxAO5fJkcCqX9JqmslpAGTfBtyx6tHp8X6ey
oIccz3qrdfnsADfGYftML0D4zWBzeZ9NZewaElUU2cj6xNs30fQ+VAeApA/1OIHA
C9a2d1S8CwkDCOFFW0j2/wLjgeQ+99VPRf3xKiZbG2tpecw/It+npKcGjR6UOUMi
EJ0pdiwu7htGNTnJTwtfhKQuzvAe/Dzcru55an/i4uJz+ARu4ZlypEOj0PIax50C
ZAQ9FqLmA4fF9CbFR2kaEkrdgDkrb7eUqCRUdawPzSmWyULL6Sr30eHHTyfQkDBH
G6ltCwAkNQW5
=pl8B
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Fix the definition of one of the Tiger Lake MMIO registers in the
int340x thermal driver (Sumeet Pawnikar)"
* tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: Fix VCoRefLow MMIO bit offset for TGL
Create the output directory for the ACPI tools during build if it has
not been present before and prevent the compilation from failing in
that case (Chen Yu).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGzrMQSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxEZAP/RUAlFyiYOgL8iyRJxFkRNL0VdzS7ATY
2p+rT53mPRkcUN0e4ZdrUtNr50p1rC2Qx614Tm++2QLsQ08tz7IOb2PDX7GcKx34
Gs6z/tYQzjgwy+bElfSa+lnbczGndkm1m4hWgIcs0MkIQzZwaGRfQ3mc5TE7FfyJ
F1ALLsMM8fY2K/3akbQ5ysPLXA3AYEmnC/V1tw4b5sxH7Mi1vkfFYyTmk5Te5VyW
Zp9u63DoEmaNYKZeWsfkHUJM9QjvUOAhdIGK+OVBjlF9/qvSiLOTL5jYuRt/74ab
7vOSWWzEcihW5xa2mPWCz3YWQzZyB6qNM1b1DVN3sDZvv7zLhd1ntcZ5B/M/sT+I
SuRB1AMj92t8bgvp+U21ehUIygQdJuJ0Ny16q89yDgDv0qY0m8ZDq28IrXEkMVc6
g5iAwI7LlYovD3wnSGXL391sG6dSUQzWxuLp+h3bcJJZkSTiYYW9M2iQHVJgDI1h
T4Scdm0aUPr6lO1o/J8Chg8G9dMtPX89+WK5+6JA0P2PwHYHd9s79KDzq14zrltZ
vZWG7ishOqMoRJhxd5mI9Y63DZesO4kcuIzCADqO/K2WVCEgJ2sooNQAF0Ys440R
HypRBRs8FZmohS/gznFDmtrS/lhXfacXQVS+CKhM9XbNI+E8AQq9F6AmqHotqZRF
0MZ8HPOg3kKP
=lLS7
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Create the output directory for the ACPI tools during build if it has
not been present before and prevent the compilation from failing in
that case (Chen Yu)"
* tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: tools: Fix compilation when output directory is not present
Fix a kernedoc comment that doesn't match the behavior of the
function documented by it.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGzq+ISHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxHRQP/09O03Tm3A9phiOaeHulxMwxxJu/t6E9
+0ZbzM4vSmkg0mPdlIAJt7/gAd6hWiS9XEOrrLZ+Q9/v00pIz4nF2510E7MbLxLO
9LPYO9PBw7tTG1sxMf9zULSvw3SzLcBEdgsuzicQO+QUf8uyYRXblUrlppgyW+YN
XEl6SjBYSLWPjuQYs5ZZx1JIic57zfj1g7XUyFmvIKrN0tx1GKo4YQsedhcAiW8O
523V+9H4acQ70tQfNHzwrkyBF7GImocZFxWBdb5c+JhfJgz0TigCXFB84D+3U9WW
3dMrfIKafygYS0PvrPk0ovk3SgmXj85i/x5v41DZA22gmoYFAbKNc3Uk4T6Z1D08
scjKtydpALnMagEBbkJ378Oa4Q4Tq5pvVW/sf78RdOjF3mTrfpe6yMAQCE3+GTNK
IbnidQGunEFJObaGoh4/U9XQS6cfjIr5FVw83hNxYvSq7YoWtzsHarBNHxUFTgML
E5DZaiLWeEIoVaFBo3tkxTsP5AKZo2NYxD/cX85cXLxNygY64CRSHvJ+rC6bQRae
7w5ZwgWisNaypwblO741WPxxqs16uD4ZpRz2dqDVmTjsSP4EJCZe6C8aECzbGNDD
Wqy+3edya/fHdV7qra+yMRNgK4po3/rzr7CZ7WqRaNZ5dH7o3wDBPKBhhHwKtK/w
ymPawTBfYNQR
=Hvf1
-----END PGP SIGNATURE-----
Merge tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a kernedoc comment that doesn't match the behavior of the function
documented by it"
* tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: runtime: Fix pm_runtime_active() kerneldoc comment
- In the pwm-fan driver, ensure that the internal pwm state matches
the state assumed by the pwm code.
- Avoid EREMOTEIO errors in sht4 driver
- In the nct6775 driver, make it explicit that the register value
passed to nct6775_asuswmi_read() is an 8-bit value
- Avoid WARNing in dell-smm driver removal after failing to create
/proc/i8k
- Stop using a plain integer as NULL pointer in corsair-psu driver
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAmGzuzMACgkQyx8mb86f
mYFA7xAAo9Bt0sjsRqgu9ZD+0cC6wKQV6HVkq1Gd5CnvG/gnS98/Yog7AGdpvGdp
x4FssJ8Xgr6CZJfAwZA2koX60aEyl/VT7QORFZdeg0yhUHNxnU9W/INH7OZ2Bs3q
n+ceE1MHFyyesRgqD9mrMEiiOodQxOch2RmGpXercYyVfpZciPL5ITYVcwS0xU1C
ESCSi5bQGxfgw3lcH92fDFrmtXYBwxNYbq19qJvzRlT9bqvxUjyQD69Mxl/yKfev
NXUFizBAMmXbe7d7Uqg3HKWQzLfyz7FCIor37suclYoiWY/Nn4Fx8oBLq7PuYf51
7/jFX2xCB/GZFz5zjDj/83/Kxj9KR2yyHu1IyWqBGhGGQlFTiu11m308CIBJpKL8
LkoIt9N95oyDI66YEDrxDRQvx+uW35igMlxRmk+E4WQLwJcOdfpEFHkvMAey3/D5
qYybXKMRQ/UlNSzpgDJSpUFu34RdaJJczYi68yCpCVy30PbTLL+8jLUW5RhrS6H4
g6Rupx0ggJFWPs2gIjgoA0ERaoW7pO9BKmqxqbSdafje5xzYUngiXxsAc9qSl5fI
p8Gzq4tQNogvC35SFJKjjtJy8m43fKxPFD4O6ShwZYn5i7J+ugm9Q75U4hcCCeeQ
hW0tNC6aK17NJE2F/RkWVnE9nWpwKULeRDTUqLJuvP3pVZJVyXQ=
=Fjl2
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- In the pwm-fan driver, ensure that the internal pwm state matches the
state assumed by the pwm code.
- Avoid EREMOTEIO errors in sht4 driver
- In the nct6775 driver, make it explicit that the register value
passed to nct6775_asuswmi_read() is an 8-bit value
- Avoid WARNing in dell-smm driver removal after failing to create
/proc/i8k
- Stop using a plain integer as NULL pointer in corsair-psu driver
* tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pwm-fan) Ensure the fan going on in .probe()
hwmon: (sht4x) Fix EREMOTEIO errors
hwmon: (nct6775) mask out bank number in nct6775_wmi_read_value()
hwmon: (dell-smm) Fix warning on /proc/i8k creation error
hwmon: (corsair-psu) fix plain integer used as NULL pointer
- Have tracefs honor the gid mount option
- Have new files in tracefs inherit the parent ownership
- Have direct_ops unregister when it has no more functions
- Properly clean up the ops when unregistering multi direct ops
- Add a sample module to test the multiple direct ops
- Fix memory leak in error path of __create_synth_event()
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYbOgPBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qgOtAP0YD+cRLxnRKA376oQVB8zmuZ3mZ/4x
6M1hqruSDlno3AEA19PyHpxl7flFwnBb6Gnfo9VeefcMS5ENDH9p1aHX4wU=
=Tr6t
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Tracing, ftrace and tracefs fixes:
- Have tracefs honor the gid mount option
- Have new files in tracefs inherit the parent ownership
- Have direct_ops unregister when it has no more functions
- Properly clean up the ops when unregistering multi direct ops
- Add a sample module to test the multiple direct ops
- Fix memory leak in error path of __create_synth_event()"
* tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix possible memory leak in __create_synth_event() error path
ftrace/samples: Add module to test multi direct modify interface
ftrace: Add cleanup to unregister_ftrace_direct_multi
ftrace: Use direct_ops hash in unregister_ftrace_direct
tracefs: Set all files to the same group ownership as the mount option
tracefs: Have new files inherit the ownership of their parent
Fix three bugs in aio poll, and one issue with POLLFREE more broadly:
- aio poll didn't handle POLLFREE, causing a use-after-free.
- aio poll could block while the file is ready.
- aio poll called eventfd_signal() when it isn't allowed.
- POLLFREE didn't handle multiple exclusive waiters correctly.
This has been tested with the libaio test suite, as well as with test
programs I wrote that reproduce the first two bugs. I am sending this
pull request myself as no one seems to be maintaining this code.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCYbObthQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK+3mAQC9W8ApzBleEPI6FXzIIo5AiQT/2jGl
7FbO1MtkdUBU4QEAzf+VWl4Z4BJTgxl44avRdVDpXGAMnbWkd7heY+e3HwA=
=mp+r
-----END PGP SIGNATURE-----
Merge tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull aio poll fixes from Eric Biggers:
"Fix three bugs in aio poll, and one issue with POLLFREE more broadly:
- aio poll didn't handle POLLFREE, causing a use-after-free.
- aio poll could block while the file is ready.
- aio poll called eventfd_signal() when it isn't allowed.
- POLLFREE didn't handle multiple exclusive waiters correctly.
This has been tested with the libaio test suite, as well as with test
programs I wrote that reproduce the first two bugs. I am sending this
pull request myself as no one seems to be maintaining this code"
* tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
aio: Fix incorrect usage of eventfd_signal_allowed()
aio: fix use-after-free due to missing POLLFREE handling
aio: keep poll requests on waitqueue until completed
signalfd: use wake_up_pollfree()
binder: use wake_up_pollfree()
wait: add wake_up_pollfree()
* Logic bugs in CR0 writes and Hyper-V hypercalls
* Don't use Enlightened MSR Bitmap for L3
* Remove user-triggerable WARN
Plus a few selftest fixes and a regression test for the
user-triggerable WARN.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGzZuIUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMD4Qf+Im7Q0XNRZGzK6x4Blu3ZZJSuIIkW
gEK5mDX/BWSYxoGhRN0IOkyf1Tx/A5qYwbZts87wZSvKONG2MuVzdeQ0mkDxgKc3
cYwvvIPxCKaW/dQLD2OKVlqdAv6YbeJiFURWXgszMkrcgHvw39H5Tn6ldi0B5nvg
Gvpj8LtbPDXGXab//Xrhia3+1F9TKOrcOG+obGC5G2mrGKTkG2+pi9L6LohvENhd
sOSWdpmvQTU4PeqGlhW8RCwcN+vpa+NasHT2i2tHcWZA9Lqp4P81+4ZyQLIBsRB3
psANG0c40EW+lfjFGbLL/6VR5kypxa6zy9RgX+QiRcj6C0+dJOgnwNY35A==
=cREz
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"More x86 fixes:
- Logic bugs in CR0 writes and Hyper-V hypercalls
- Don't use Enlightened MSR Bitmap for L3
- Remove user-triggerable WARN
Plus a few selftest fixes and a regression test for the
user-triggerable WARN"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit
KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode
selftests: KVM: avoid failures due to reserved HyperTransport region
KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
Maxime points out that the polling code in mpc_i2c_isr should use the
_atomic API because it is called in an irq context and that the
behaviour of the MCF bit is that it is 1 when the byte transfer is
complete. All of this means the original code was effectively a
udelay(100).
Fix this by using readb_poll_timeout_atomic() and removing the negation
of the break condition.
Fixes: 4a8ac5e45c ("i2c: mpc: Poll for MCF")
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Tested-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
We check IO_WQ_BIT_EXIT before attempting to create a new worker, and
wq exit cancels pending work if we have any. But it's possible to have
a race between the two, where creation checks exit finding it not set,
but we're in the process of exiting. The exit side will cancel pending
creation task_work, but there's a gap where we add task_work after we've
canceled existing creations at exit time.
Fix this by checking the EXIT bit post adding the creation task_work.
If it's set, run the same cancelation that exit does.
Reported-and-tested-by: syzbot+b60c982cb0efc5e05a47@syzkaller.appspotmail.com
Reviewed-by: Hao Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If we successfully cancel a work item but that work item needs to be
processed through task_work, then we can be sleeping uninterruptibly
in io_uring_cancel_generic() and never process it. Hence we don't
make forward progress and we end up with an uninterruptible sleep
warning.
While in there, correct a comment that should be IFF, not IIF.
Reported-and-tested-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmGzf4IUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vwjcBAAiWel9P5H947jR9sTbz4ya6wH1biD
k2w97VDa65DyH/LBJSgNwmblnXs7yIUuGTd+mRq9bhlpE8CQi9BfeCehP1vCfTeQ
JtMH62dW8KBLkvIHU83H1SSZZNKQgDn7hqUsrrMa0HD+Z+ovbuQYp4M1Oh6xRAEM
TTBTKb0KivA8bFwvtgj/mu7K7sVJH+cVMilD9ABoVeGmCWfUSO48ovEjWB+vmBFs
UyTCU5CUg/FkjvVmZTOv5GY4EL83FA9Jdtzy8inRA+hSWY6ImXHTzmQlAzvA+Rkv
k344ZQM9GNvbvwKfBa9iW2g+B2y/OJXafGoVL0NBUcj/eiY5dnAX0/tZHvx0aXFy
G1Txy2utaG2MSkfZzchEKbRvS0tV7kiFiTmqp9lNmffTZiP72k4+kFJHQC5AzvZb
O7Ce/XSQifQ1Z3f5B+Ymx6EOgKYJUaWO9B1U1KF0EKGMe5GB0TBiXh/tS2EmV1O8
1hkUJm032Bbf1Bv5R6BLdgKVz4I3UsqmGKH5gg3blyylAQ1oHsioaUKeV6iHSq40
u9rNZaKGC3SweYZVISNE1uoII4qzEgLOHggHpZvWxhQy35cFBz8ZsNfLwBD3/8z9
UfFuLSLHjx+hv3Ev5mgDWH1mzAlzyq5KkDT0bodBix07s5mviDH+57yyw3JtHUuL
F+tMrYjUHK9ArC8=
=BWkT
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Revert emulation of Marvell Armada A3720 expansion ROM because it
doesn't work as expected (Marek Behún)
- Assert PERST# in Apple M1 driver to fix initialization when booting
from bootloaders using PCIe, such as U-Boot (Marc Zyngier)
- Describe PERST# as active low in Apple T8103 DT and update driver to
match (Marc Zyngier)
* tag 'pci-v5.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: apple: Fix PERST# polarity
arm64: dts: apple: t8103: Mark PCIe PERST# polarity active low in DT
PCI: apple: Follow the PCIe specifications when resetting the port
Revert "PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge"
* Fix a sparse warning in the ahci_ceva driver, from me
* Disable the ASMedia 1092 non-functional device, from Hannes
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCYbMACQAKCRDdoc3SxdoY
dt1wAQDbewJv2zf5eCUkAF2/NJaRJvrT8HmcbihsGic+NzfdwAD+PvF6XS4DMbr5
ee6g7c0SYOb5ZtLfVGwVaZQgNNvJAQQ=
=vV6H
-----END PGP SIGNATURE-----
Merge tag 'libata-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull libata fixes from Damien Le Moal:
- Fix a sparse warning in the ahci_ceva driver (me)
- Disable the ASMedia 1092 non-functional device (Hannes)
* tag 'libata-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
libata: add horkage for ASMedia 1092
ata: ahci_ceva: Fix id array access in ceva_ahci_read_id()
Another collection of small fixes. It's still not quite calm yet,
but nothing looks scary.
ALSA core got a few fixes for covering the issues detected by fuzzer
and the 32bit compat problem of control API, while the rest are all
device-specific small fixes, including the continued fixes for Tegra.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmGyKGsOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE+56w/+ME2YljtbD2RhYzJs2X8ARvusOO3mX1Zjr/36
+7wBRi8WwRVH3jn59CnhgCDXSxTaM8GYbFje8kAyYv0Ib+f9bAvao7pgFAPFEd//
ZRZBQ0bCF18Pp4oNqbR/F6K2XyLyzQeRQPWl2z0oZq4zuWXtK59pQnXbEYV/UGx8
MMciRA4aj7qYaaQj4juBNKuxgixAyCatcOJh6t5O4dy2N9naQi0TShMF49ca8uRR
nSOq1YeEBpIOd4DVto4P6sQ7tpyfffj4qPhXGvemYnhBfwMhUVJyWxFjXXGJY2rT
KrFtuOHlS7NlScvT36GowbQdB5wgXJ7eLJg/JXVi3HBCrV4zHlp7Jn/Nbr62SYIu
h5gkgNN04Hjgel5lTvsJPiirxfxWpbVeF84HOrkrx6teOsGWZtW10Zms0YkovKmg
hR23YRNbX4qko6evBvg4lRlSlbTznOvzHKY323joebjSYp4kSJyNdqc+8fgVpK3E
Fx9DJmBSyGp/n2gkKZEhDVSgcWZyGvPkFqondCjwxqWV+jvJWSnScTjUyMeOfUCt
lFV4tlIMQ58t5u6BRaMGTenTxQ6Dqf5nOR1hwK5EPR5RQwu3chFfYDsm0C9ZKfsG
mCMe3BTvdl3W2nShwIH11B/ukieqAVZ7uugSFAarYamDfupPcwO69lPIeFAEuzjw
N+us0rg=
=g4gf
-----END PGP SIGNATURE-----
Merge tag 'sound-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Another collection of small fixes. It's still not quite calm yet, but
nothing looks scary.
ALSA core got a few fixes for covering the issues detected by fuzzer
and the 32bit compat problem of control API, while the rest are all
device-specific small fixes, including the continued fixes for Tegra"
* tag 'sound-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform
ALSA: usb-audio: Reorder snd_djm_devices[] entries
ALSA: hda/realtek: Fix quirk for TongFang PHxTxX1
ALSA: ctl: Fix copy of updated id with element read/write
ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*()
ALSA: pcm: oss: Limit the period size to 16MB
ALSA: pcm: oss: Fix negative period/buffer sizes
ASoC: codecs: wsa881x: fix return values from kcontrol put
ASoC: codecs: wcd934x: return correct value from mixer put
ASoC: codecs: wcd934x: handle channel mappping list correctly
ASoC: qdsp6: q6routing: Fix return value from msm_routing_put_audio_mixer
ASoC: SOF: Intel: Retry codec probing if it fails
ASoC: amd: fix uninitialized variable in snd_acp6x_probe()
ASoC: rockchip: i2s_tdm: Dup static DAI template
ASoC: rt5682s: Fix crash due to out of scope stack vars
ASoC: rt5682: Fix crash due to out of scope stack vars
ASoC: tegra: Use normal system sleep for ADX
ASoC: tegra: Use normal system sleep for AMX
ASoC: tegra: Use normal system sleep for Mixer
ASoC: tegra: Use normal system sleep for MVC
...
This reverts commit 776b54e97a.
Looks like a last minute edit snuck into this patch, and as a result,
it doesn't even compile. Revert the change for now.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
do_each_pid_thread(PIDTYPE_PGID) can race with a concurrent
change_pid(PIDTYPE_PGID) that can move the task from one hlist
to another while iterating. Serialize ioprio_get to take
the tasklist_lock in this case, just like it's set counterpart.
Fixes: d69b78ba1d (ioprio: grab rcu_read_lock in sys_ioprio_{set,get}())
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Link: https://lore.kernel.org/r/20211210182058.43417-1-dave@stgolabs.net
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch adds tests for the verifier's tracking for spilled, <8B
registers. The first two test cases ensure the verifier doesn't
incorrectly prune states in case of <8B spill/fills. The last one simply
checks that a filled u64 register is marked unknown if the register
spilled in the same slack slot was less than 8B.
The map value access at the end of the first program is only incorrect
for the path R6=32. If the precision bit for register R8 isn't
backtracked through the u32 spill/fill, the R6=32 path is pruned at
instruction 9 and the program is incorrectly accepted. The second
program is a variation of the same with u32 spills and a u64 fill.
The additional instructions to introduce the first pruning point may be
a bit fragile as they depend on the heuristics for pruning points in the
verifier (currently at least 8 instructions and 2 jumps). If the
heuristics are changed, the pruning point may move (e.g., to the
subsequent jump) or disappear, which would cause the test to always pass.
Signed-off-by: Paul Chaignon <paul@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Commit 354e8f1970 ("bpf: Support <8-byte scalar spill and refill")
introduced support in the verifier to track <8B spill/fills of scalars.
The backtracking logic for the precision bit was however skipping
spill/fills of less than 8B. That could cause state pruning to consider
two states equivalent when they shouldn't be.
As an example, consider the following bytecode snippet:
0: r7 = r1
1: call bpf_get_prandom_u32
2: r6 = 2
3: if r0 == 0 goto pc+1
4: r6 = 3
...
8: [state pruning point]
...
/* u32 spill/fill */
10: *(u32 *)(r10 - 8) = r6
11: r8 = *(u32 *)(r10 - 8)
12: r0 = 0
13: if r8 == 3 goto pc+1
14: r0 = 1
15: exit
The verifier first walks the path with R6=3. Given the support for <8B
spill/fills, at instruction 13, it knows the condition is true and skips
instruction 14. At that point, the backtracking logic kicks in but stops
at the fill instruction since it only propagates the precision bit for
8B spill/fill. When the verifier then walks the path with R6=2, it will
consider it safe at instruction 8 because R6 is not marked as needing
precision. Instruction 14 is thus never walked and is then incorrectly
removed as 'dead code'.
It's also possible to lead the verifier to accept e.g. an out-of-bound
memory access instead of causing an incorrect dead code elimination.
This regression was found via Cilium's bpf-next CI where it was causing
a conntrack map update to be silently skipped because the code had been
removed by the verifier.
This commit fixes it by enabling support for <8B spill/fills in the
bactracking logic. In case of a <8B spill/fill, the full 8B stack slot
will be marked as needing precision. Then, in __mark_chain_precision,
any tracked register spilled in a marked slot will itself be marked as
needing precision, regardless of the spill size. This logic makes two
assumptions: (1) only 8B-aligned spill/fill are tracked and (2) spilled
registers are only tracked if the spill and fill sizes are equal. Commit
ef979017b8 ("bpf: selftest: Add verifier tests for <8-byte scalar
spill and refill") covers the first assumption and the next commit in
this patchset covers the second.
Fixes: 354e8f1970 ("bpf: Support <8-byte scalar spill and refill")
Signed-off-by: Paul Chaignon <paul@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In driver/md/md.c, if the function autorun_array() is called,
the problem of double free may occur.
In function autorun_array(), when the function do_md_run() returns an
error, the function do_md_stop() will be called.
The function do_md_run() called function md_run(), but in function
md_run(), the pointer mddev->private may be freed.
The function do_md_stop() called the function __md_stop(), but in
function __md_stop(), the pointer mddev->private also will be freed
without judging null.
At this time, the pointer mddev->private will be double free, so it
needs to be judged null or not.
Signed-off-by: zhangyue <zhangyue1@kylinos.cn>
Signed-off-by: Song Liu <songliubraving@fb.com>
The superblock of version 1.0 doesn't get moved to the new position on a
device size change. This leads to a rdev without a superblock on a known
position, the raid can't be re-assembled.
The line was removed by mistake and is re-added by this patch.
Fixes: d9c0fa509e ("md: fix max sectors calculation for super 1.0")
Cc: stable@vger.kernel.org
Signed-off-by: Markus Hochholdinger <markus@hochholdinger.net>
Reviewed-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
A delegation break could arrive as soon as we've called vfs_setlease. A
delegation break runs a callback which immediately (in
nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. If we
then exit nfs4_set_delegation without hashing the delegation, it will be
freed as soon as the callback is done with it, without ever being
removed from del_recall_lru.
Symptoms show up later as use-after-free or list corruption warnings,
usually in the laundromat thread.
I suspect aba2072f45 "nfsd: grant read delegations to clients holding
writes" made this bug easier to hit, but I looked as far back as v3.0
and it looks to me it already had the same problem. So I'm not sure
where the bug was introduced; it may have been there from the beginning.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Commit bd5ae9288d ("nfsd: register pernet ops last, unregister first")
has re-opened rpc_pipefs_event() race against nfsd_net_id registration
(register_pernet_subsys()) which has been fixed by commit bb7ffbf29e
("nfsd: fix nsfd startup race triggering BUG_ON").
Restore the order of register_pernet_subsys() vs register_cld_notifier().
Add WARN_ON() to prevent a future regression.
Crash info:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000012
CPU: 8 PID: 345 Comm: mount Not tainted 5.4.144-... #1
pc : rpc_pipefs_event+0x54/0x120 [nfsd]
lr : rpc_pipefs_event+0x48/0x120 [nfsd]
Call trace:
rpc_pipefs_event+0x54/0x120 [nfsd]
blocking_notifier_call_chain
rpc_fill_super
get_tree_keyed
rpc_fs_get_tree
vfs_get_tree
do_mount
ksys_mount
__arm64_sys_mount
el0_svc_handler
el0_svc
Fixes: bd5ae9288d ("nfsd: register pernet ops last, unregister first")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
With some specific kernel configuration and Clang, the kernel fails
to like with something like:
ld.lld: error: undefined symbol: __compiletime_assert_200
>>> referenced by arch_timer.h:156 (./arch/arm64/include/asm/arch_timer.h:156)
>>> clocksource/arm_arch_timer.o:(erratum_set_next_event_generic) in archive drivers/built-in.a
ld.lld: error: undefined symbol: __compiletime_assert_197
>>> referenced by arch_timer.h:133 (./arch/arm64/include/asm/arch_timer.h:133)
>>> clocksource/arm_arch_timer.o:(erratum_set_next_event_generic) in archive drivers/built-in.a
make: *** [Makefile:1161: vmlinux] Error 1
These are due to the BUILD_BUG() macros contained in the low-level
accessors (arch_timer_reg_{write,read}_cp15) being emitted, as the
access type wasn't known at compile time.
Fix this by making erratum_set_next_event_generic() __force_inline,
resulting in the 'access' parameter to be resolved at compile time,
similarly to what is already done for set_next_event().
Fixes: 4775bc63f8 ("Add build-time guards for unhandled register accesses")
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211117113532.3895208-1-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The driver refuses to probe with -EINVAL since the commit 5d9814df0a
("clocksource/drivers/dw_apb_timer_of: Add error handling if no clock
available").
Before the driver used to probe successfully if either "clock-freq" or
"clock-frequency" properties has been specified in the device tree.
That commit changed
if (A && B)
panic("No clock nor clock-frequency property");
into
if (!A && !B)
return 0;
That's a bug: the reverse of `A && B` is '!A || !B', not '!A && !B'
Signed-off-by: Vadim V. Vlasov <vadim.vlasov@elpitech.ru>
Signed-off-by: Alexey Sheplyakov <asheplyakov@basealt.ru>
Fixes: 5d9814df0a ("clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available").
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vadim V. Vlasov <vadim.vlasov@elpitech.ru>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lore.kernel.org/r/20211109153401.157491-1-asheplyakov@basealt.ru
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The HiperSockets Converged Interface (HSCI) introduced with commit
4e20e73e63 ("s390/qeth: Switchdev event handler") requires
CONFIG_SWITCHDEV=y to be usable. Similarly when using Linux controlled
SR-IOV capable PF devices with the mlx5_core driver CONFIG_SWITCHDEV=y
as well as CONFIG_MLX5_ESWITCH=y are necessary to actually get link on
the created VFs. So let's add these to the defconfig to make both types
of devices usable. Note also that these options are already enabled in
most current distribution kernels.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Starting with gcc 11.3, the C compiler will generate PLT-relative function
calls even if they are local and do not require it. Later on during linking,
the linker will replace all PLT-relative calls to local functions with
PC-relative ones. Unfortunately, the purgatory code of kexec/kdump is
not being linked as a regular executable or shared library would have been,
and therefore, all PLT-relative addresses remain in the generated purgatory
object code unresolved. This leads to the situation where the purgatory
code is being executed during kdump with all PLT-relative addresses
unresolved. And this results in endless loops within the purgatory code.
Furthermore, the clang C compiler has always behaved like described above
and this commit should fix kdump for kernels built with the latter.
Because the purgatory code is no regular executable or shared library,
contains only calls to local functions and has no PLT, all R_390_PLT32DBL
relocation entries can be resolved just like a R_390_PC32DBL one.
* https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x1633.html#AEN1699
Relocation entries of purgatory code generated with gcc 11.3
------------------------------------------------------------
$ readelf -r linux/arch/s390/purgatory/purgatory.o
Relocation section '.rela.text' at offset 0x370 contains 5 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000005c 000c00000013 R_390_PC32DBL 0000000000000000 purgatory_sha_regions + 2
00000000007a 000d00000014 R_390_PLT32DBL 0000000000000000 sha256_update + 2
00000000008c 000e00000014 R_390_PLT32DBL 0000000000000000 sha256_final + 2
000000000092 000800000013 R_390_PC32DBL 0000000000000000 .LC0 + 2
0000000000a0 000f00000014 R_390_PLT32DBL 0000000000000000 memcmp + 2
Relocation entries of purgatory code generated with gcc 11.2
------------------------------------------------------------
$ readelf -r linux/arch/s390/purgatory/purgatory.o
Relocation section '.rela.text' at offset 0x368 contains 5 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000005c 000c00000013 R_390_PC32DBL 0000000000000000 purgatory_sha_regions + 2
00000000007a 000d00000013 R_390_PC32DBL 0000000000000000 sha256_update + 2
00000000008c 000e00000013 R_390_PC32DBL 0000000000000000 sha256_final + 2
000000000092 000800000013 R_390_PC32DBL 0000000000000000 .LC0 + 2
0000000000a0 000f00000013 R_390_PC32DBL 0000000000000000 memcmp + 2
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Tao Liu <ltao@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211209073817.82196-1-egorenar@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
It looks like commit ce5e48036c ("ftrace: disable preemption
when recursion locked") missed a spot in kprobe_ftrace_handler() in
arch/s390/kernel/ftrace.c.
Remove the superfluous preempt_disable/enable_notrace() there too.
Fixes: ce5e48036c ("ftrace: disable preemption when recursion locked")
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Link: https://lore.kernel.org/r/20211208151503.1510381-1-jmarchan@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
arch_kexec_apply_relocations_add currently ignores all errors returned
by arch_kexec_do_relocs. This means that every unknown relocation is
silently skipped causing unpredictable behavior while the relocated code
runs. Fix this by checking for errors and fail kexec_file_load if an
unknown relocation type is encountered.
The problem was found after gcc changed its behavior and used
R_390_PLT32DBL relocations for brasl instruction and relied on ld to
resolve the relocations in the final link in case direct calls are
possible. As the purgatory code is only linked partially (option -r)
ld didn't resolve the relocations leaving them for arch_kexec_do_relocs.
But arch_kexec_do_relocs doesn't know how to handle R_390_PLT32DBL
relocations so they were silently skipped. This ultimately caused an
endless loop in the purgatory as the brasl instructions kept branching
to itself.
Fixes: 71406883fd ("s390/kexec_file: Add kexec_file_load system call")
Reported-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Link: https://lore.kernel.org/r/20211208130741.5821-3-prudo@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Add an x86 selftest to verify that KVM doesn't WARN or otherwise explode
if userspace modifies RCX during a userspace exit to handle string I/O.
This is a regression test for a user-triggerable WARN introduced by
commit 3b27de2718 ("KVM: x86: split the two parts of emulator_pio_in").
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211025201311.1881846-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace a WARN with a comment to call out that userspace can modify RCX
during an exit to userspace to handle string I/O. KVM doesn't actually
support changing the rep count during an exit, i.e. the scenario can be
ignored, but the WARN needs to go as it's trivial to trigger from
userspace.
Cc: stable@vger.kernel.org
Fixes: 3b27de2718 ("KVM: x86: split the two parts of emulator_pio_in")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211025201311.1881846-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the SDM:
If the logical processor is in 64-bit mode or if CR4.PCIDE = 1, an
attempt to clear CR0.PG causes a general-protection exception (#GP).
Software should transition to compatibility mode and clear CR4.PCIDE
before attempting to disable paging.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211207095230.53437-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make xhci_disable_slot() synchronous, thus ensuring it, and
xhci_free_dev() calling it return after xHC controller completes
the disable slot command.
Otherwise the roothub and xHC host may runtime suspend, and clear the
command ring while the disable slot command is being processed.
This causes a command completion mismatch as the completion event can't
be mapped to the correct command.
Command ring gets out of sync and commands time out.
Driver finally assumes host is unresponsive and bails out.
usb 2-4: USB disconnect, device number 10
xhci_hcd 0000:00:0d.0: ERROR mismatched command completion event
...
xhci_hcd 0000:00:0d.0: xHCI host controller not responding, assume dead
xhci_hcd 0000:00:0d.0: HC died; cleaning up
Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211210141735.1384209-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the xHCI is quirked with XHCI_RESET_ON_RESUME, runtime resume
routine also resets the controller.
This is bad for USB drivers without reset_resume callback, because
there's no subsequent call of usb_dev_complete() ->
usb_resume_complete() to force rebinding the driver to the device. For
instance, btusb device stops working after xHCI controller is runtime
resumed, if the controlled is quirked with XHCI_RESET_ON_RESUME.
So always take XHCI_RESET_ON_RESUME into account to solve the issue.
Cc: <stable@vger.kernel.org>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211210141735.1384209-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>