The rtla timerlat tool is an interface for the timerlat tracer.
The timerlat tracer dispatches a kernel thread per-cpu. These threads set a
periodic timer to wake themselves up and go back to sleep. After the
wakeup, they collect and generate useful information for the debugging of
operating system timer latency.
The timerlat tracer outputs information in two ways. It periodically
prints the timer latency at the timer IRQ handler and the Thread handler.
It also provides information for each noise via the osnoise tracepoints.
The rtla timerlat top mode displays a summary of the periodic output from
the timerlat tracer.
Here is one example of the rtla timerlat tool output:
---------- %< ----------
[root@alien ~]# rtla timerlat top -c 0-3 -d 1m
                                     Timer Latency
  0 00:01:00   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
  0 #60001     |        0         0         0         3 |        1         1         1         6
  1 #60001     |        0         0         0         3 |        2         1         1         5
  2 #60001     |        0         0         1         6 |        1         1         2         7
  3 #60001     |        0         0         0         7 |        1         1         1        11
---------- >% ----------
Running:
# rtla timerlat --help
# rtla timerlat top --help
provides information about the available options.
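As a rough illustration of the cur/min/avg/max columns above, the
following is a minimal userspace sketch, not the tracer's code. It
assumes the latency is the difference between the programmed timer
expiry and the moment the handler actually ran. The struct and function
names (lat_stats, timer_latency_us, ...) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-CPU summary mirroring the cur/min/avg/max columns. */
struct lat_stats {
	uint64_t count;
	uint64_t cur;
	uint64_t min;
	uint64_t max;
	uint64_t sum;
};

static void lat_stats_init(struct lat_stats *s)
{
	s->count = 0;
	s->cur = 0;
	s->min = UINT64_MAX;
	s->max = 0;
	s->sum = 0;
}

/* Assumed definition: how late the handler ran after the expected expiry. */
static uint64_t timer_latency_us(uint64_t expected_us, uint64_t actual_us)
{
	return actual_us > expected_us ? actual_us - expected_us : 0;
}

/* Fold one sample into the running summary. */
static void lat_stats_update(struct lat_stats *s, uint64_t lat_us)
{
	s->cur = lat_us;
	if (lat_us < s->min)
		s->min = lat_us;
	if (lat_us > s->max)
		s->max = lat_us;
	s->sum += lat_us;
	s->count++;
}

/* Integer average, as whole microseconds. */
static uint64_t lat_stats_avg(const struct lat_stats *s)
{
	return s->count ? s->sum / s->count : 0;
}
```

Calling lat_stats_update() once per timer activation, separately for the
IRQ and the thread samples, would yield one row of a table like the one
above.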
Link: https://lkml.kernel.org/r/e95032e20c2b88c962195bf7693bb53c9ebcced8.1639158831.git.bristot@kernel.org
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: linux-rt-users@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The rtla osnoise tool is an interface for the osnoise tracer. The
osnoise tracer dispatches a kernel thread per-cpu. These threads read
the time in a loop with preemption, softirqs and IRQs enabled,
thus allowing all the sources of osnoise during its execution. The
osnoise threads take note of the entry and exit point of any source
of interference, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source
of interference.
The rtla osnoise top mode displays information about the periodic
summary from the osnoise tracer.
One example of rtla osnoise top output is:
[root@alien ~]# rtla osnoise top -c 0-3 -d 1m -q -r 900000 -P F:1
                                         Operating System Noise
duration:   0 00:01:00 | time is in us
CPU Period       Runtime        Noise  % CPU Aval   Max Noise   Max Single          HW          NMI          IRQ      Softirq       Thread
  0 #58         52200000         1031    99.99802          91           60           0            0        52285            0          101
  1 #59         53100000            5    99.99999           5            5           0            9        53122            0           18
  2 #59         53100000            7    99.99998           7            7           0            8        53115            0           18
  3 #59         53100000         8274    99.98441         277           23           0            9        53778            0          660
"rtla osnoise top --help" works and provides information about the
available options.
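The "% CPU Aval" column in the example is consistent with the share of
the runtime that was free of noise. The sketch below assumes that
relation (it is inferred from the numbers shown, not taken from the
tool's source), and cpu_available_pct() is a hypothetical name.

```c
/*
 * Assumed relation: available % = (runtime - noise) / runtime * 100,
 * with both values in microseconds.
 */
static double cpu_available_pct(unsigned long long runtime_us,
				unsigned long long noise_us)
{
	if (runtime_us == 0)
		return 0.0;
	if (noise_us > runtime_us)
		noise_us = runtime_us;
	return 100.0 * (double)(runtime_us - noise_us) / (double)runtime_us;
}
```

For CPU 0 in the example, a runtime of 52200000 us with 1031 us of noise
gives a value just above 99.998, in line with the printed column.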
Link: https://lkml.kernel.org/r/0d796993abf587ae5a170bb8415c49368d4999e1.1639158831.git.bristot@kernel.org
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: linux-rt-users@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
If start_per_cpu_kthreads(), called from osnoise_workload_start(), returns
an error, the event hooks are left in a broken state: unhook_irq_events()
is called, but unhook_thread_events() and unhook_softirq_events() are not,
and the trace_osnoise_callback_enabled flag is not cleared.
On the next tracer enable, the hooks are not installed because the
trace_osnoise_callback_enabled flag is still set.
On the subsequent tracer disable, an attempt is made to remove the
non-installed hooks, hitting a WARN_ON_ONCE() in tracepoint_remove_func().
Fix the error path by adding the missing part of the cleanup.
While at it, introduce osnoise_unhook_events() to avoid code
duplication between this error path and the normal tracer disable.
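The shape of the fix can be sketched in plain C as follows. This is a
simplified userspace model, not the kernel code: struct hooks, the
boolean fields and the start_threads_ret parameter (which simulates the
return value of starting the per-cpu threads) are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the tracer state; not the kernel's data. */
struct hooks {
	bool irq;
	bool softirq;
	bool thread;
	bool callback_enabled;
};

/* One teardown helper shared by the error path and the normal stop path. */
static void unhook_events(struct hooks *h)
{
	h->thread = false;
	h->softirq = false;
	h->irq = false;
}

/* start_threads_ret simulates the result of starting the worker threads. */
static int workload_start(struct hooks *h, int start_threads_ret)
{
	h->irq = true;
	h->softirq = true;
	h->thread = true;
	h->callback_enabled = true;

	if (start_threads_ret) {
		/* Undo everything done above, not just a part of it. */
		h->callback_enabled = false;
		unhook_events(h);
		return start_threads_ret;
	}
	return 0;
}

static void workload_stop(struct hooks *h)
{
	if (!h->callback_enabled)
		return;
	h->callback_enabled = false;
	unhook_events(h);
}
```

Because both paths go through the same helper, a failed start leaves the
state exactly as it was before the call, so a later start or stop sees
nothing half-installed.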
Link: https://lkml.kernel.org/r/20220109153459.3701773-1-nikita.yushchenko@virtuozzo.com
Cc: stable@vger.kernel.org
Fixes: bce29ac9ce ("trace: Add osnoise tracer")
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pingfan reported that the following causes a fault:
echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
echo 1 > events/syscalls/sys_enter_openat/enable
The reason is that trace event filter treats the user space pointer
defined by "filename" as a normal pointer to compare against the "cpu"
string. The following bug happened:
kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
#PF: supervisor read access in kernel mode
#PF: error_code(0x0001) - permissions violation
PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
Oops: 0001 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:strlen+0x0/0x20
Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
FS: 00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
filter_pred_pchar+0x18/0x40
filter_match_preds+0x31/0x70
ftrace_syscall_enter+0x27a/0x2c0
syscall_trace_enter.constprop.0+0x1aa/0x1d0
do_syscall_64+0x16/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa861d88664
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory might not even be loaded yet, and a SEGFAULT could happen as
well. The same could be true for kernel space accesses.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare to TASK_SIZE may always
pick user space or kernel space. If it gets it wrong, the only thing is that
the filter will fail to match. In the future, this needs to be fixed to have
the event denote which should be used. But failing a filter is much better
than panicking the machine, and that can be solved later.
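The decision logic described above can be modelled in userspace as
follows. This is only a sketch under stated assumptions: the boundary
argument plays the role of TASK_SIZE, the reader callbacks stand in for
the nofault copy helpers, a plain substring test stands in for the real
comparison, and every name in the block is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STR_BUF_LEN 64

/*
 * Hypothetical reader type: copies a NUL-terminated string of at most
 * len - 1 bytes into dst and returns its length, or -1 if src cannot
 * be read.
 */
typedef long (*str_reader_t)(char *dst, const void *src, size_t len);

/* Stub reader that always succeeds (ordinary readable memory). */
static long reader_ok(char *dst, const void *src, size_t len)
{
	strncpy(dst, (const char *)src, len - 1);
	dst[len - 1] = '\0';
	return (long)strlen(dst);
}

/* Stub reader that always fails (simulates an unreadable address). */
static long reader_fault(char *dst, const void *src, size_t len)
{
	(void)dst;
	(void)src;
	(void)len;
	return -1;
}

/* Pick a reader by address range; an unreadable string never matches. */
static int filter_str_match(const void *ptr, uintptr_t boundary,
			    str_reader_t read_user, str_reader_t read_kernel,
			    const char *pattern)
{
	char buf[STR_BUF_LEN];
	str_reader_t rd = ((uintptr_t)ptr < boundary) ? read_user : read_kernel;

	if (rd(buf, ptr, sizeof(buf)) < 0)
		return 0;
	return strstr(buf, pattern) != NULL;
}
```

If the boundary check picks the wrong reader, the read fails and the
filter simply does not match, which is the fallback behaviour the text
describes.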
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Reported-by: Pingfan Liu <kernelfans@gmail.com>
Tested-by: Pingfan Liu <kernelfans@gmail.com>
Fixes: 87a342f5db ("tracing/filters: Support filtering for char * strings")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
When UBSAN is enabled, it reports an invalid value in __pagevec_release()
when accessing pvec->percpu_pvec_drained, which is simply whatever
garbage was on the stack. Initialise it when initialising the rest of
the folio_batch.
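A minimal sketch of the idea, using a simplified stand-in structure
rather than the real folio_batch. Only percpu_pvec_drained is named in
the text above; struct batch, the nr field and batch_init() are
assumptions made for illustration.

```c
#include <stdbool.h>

/* Simplified stand-in for the batch structure. */
struct batch {
	unsigned char nr;		/* assumed: number of entries in use */
	bool percpu_pvec_drained;
};

/* Initialise every field that is read later, not only the counter. */
static void batch_init(struct batch *b)
{
	b->nr = 0;
	b->percpu_pvec_drained = false;	/* previously left as stack garbage */
}
```

With the flag set explicitly, a batch declared on the stack no longer
carries an indeterminate value into the release path.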
Fixes: 10331795fb ("pagevec: Add folio_batch")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Disabling only bottom halves via local_bh_disable() also disables
preemption, but this remains invisible to tracing. On a CONFIG_PREEMPT
kernel one might wonder why there is no scheduling happening despite the
N flag in the trace. The reason might be an rcu_read_lock_bh()
section.
Add a 'b' to the tracing output if in task context with disabled bottom
halves.
Link: https://lkml.kernel.org/r/YbcbtdtC/bjCKo57@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The variable n is being bit-wise or'd with a value and reassigned
before being returned. The update of n is redundant; replace
the |= operator with | instead. Cleans up clang scan warning:
drivers/block/aoe/aoecmd.c:125:9: warning: Although the value stored
to 'n' is used in the enclosing expression, the value is never
actually read from 'n' [deadcode.DeadStores]
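The pattern can be illustrated with a small stand-alone example. The
functions below are hypothetical and only mimic the shape of such a
return statement; the masks and shift are illustrative values.

```c
/* Before: the compound assignment stores into n, but n is never read again. */
static unsigned long make_tag_old(unsigned long ticks, unsigned int lasttag)
{
	unsigned long n = ticks & 0xffff;

	return n |= (unsigned long)(lasttag & 0x7fff) << 16;
}

/* After: same returned value, without the dead store. */
static unsigned long make_tag_new(unsigned long ticks, unsigned int lasttag)
{
	unsigned long n = ticks & 0xffff;

	return n | ((unsigned long)(lasttag & 0x7fff) << 16);
}
```

Both functions return the same value for every input, so the change is
purely a cleanup.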
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220113000545.1307091-1-colin.i.king@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The pointer node is being initialized with a value that is never
read; it is re-assigned the same value a little further on.
Remove the redundant initialization. Cleans up clang scan warning:
drivers/block/loop.c:823:19: warning: Value stored to 'node' during
its initialization is never read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220113001432.1331871-1-colin.i.king@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In case of shared tags, there might be more than one hctx which
allocates from the same tags, and each hctx is limited to allocate at
most:
hctx_max_depth = max((bt->sb.depth + users - 1) / users, 4U);
Tag idle detection is lazy and may be delayed for 30 seconds, so there
could be just one really active hctx (queue) while all the others are
actually idle but still accounted as active because of the lazy idle
detection. Then, if wake_batch is > hctx_max_depth, driver tag allocation
may wait forever on this really active hctx.
Fix this by recalculating wake_batch whenever active_queues is incremented
or decremented.
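To make the arithmetic concrete, here is a small sketch of the limit
quoted above and of the stall condition. The formula is the one from the
text; the function names and the guard for users == 0 are assumptions
made for this example, and the fixed code is not reproduced here.

```c
/* Per-hctx limit from the text: max(ceil(depth / users), 4). */
static unsigned int hctx_max_depth(unsigned int depth, unsigned int users)
{
	unsigned int share;

	if (users == 0)
		return depth;	/* assumed: no sharing, whole depth usable */
	share = (depth + users - 1) / users;
	return share > 4U ? share : 4U;
}

/*
 * A waiter is only woken after wake_batch tags have been freed. If one
 * hctx can never hold that many tags, the wakeup may never come.
 */
static int wake_batch_can_stall(unsigned int depth, unsigned int users,
				unsigned int wake_batch)
{
	return wake_batch > hctx_max_depth(depth, users);
}
```

For example, with a depth of 32 and 32 queues still accounted as active,
each hctx is limited to 4 tags, so a wake_batch of 8 could never be
satisfied by the single queue that is really active.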
Fixes: 0d2602ca30 ("blk-mq: improve support for shared tags maps")
Suggested-by: Ming Lei <ming.lei@redhat.com>
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220113025536.1479653-1-qiulaibin@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull rdma updates from Jason Gunthorpe:
"Another small cycle. Mostly cleanups and bug fixes, quite a bit
assisted from bots. There are a few new syzkaller splats that haven't
been solved yet but they should get into the rcs in a few weeks, I
think.
Summary:
- Update drivers to use common helpers for GUIDs, pkeys, bitmaps,
memset_startat, and others
- General code cleanups from bots
- Simplify some of the rxe pool code in preparation for a larger
rework
- Clean out old stuff from hns, including all support for hip06
devices
- Fix a bug where GID table entries could be missed if the table had
holes in it
- Rename paths and sessions in rtrs for better understandability
- Consolidate the roce source port selection code
- NDR speed support in mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (83 commits)
RDMA/irdma: Remove the redundant return
RDMA/rxe: Use the standard method to produce udp source port
RDMA/irdma: Make the source udp port vary
RDMA/hns: Replace get_udp_sport with rdma_get_udp_sport
RDMA/core: Calculate UDP source port based on flow label or lqpn/rqpn
IB/qib: Fix typos
RDMA/rtrs-clt: Rename rtrs_clt to rtrs_clt_sess
RDMA/rtrs-srv: Rename rtrs_srv to rtrs_srv_sess
RDMA/rtrs-clt: Rename rtrs_clt_sess to rtrs_clt_path
RDMA/rtrs-srv: Rename rtrs_srv_sess to rtrs_srv_path
RDMA/rtrs: Rename rtrs_sess to rtrs_path
RDMA/hns: Modify the hop num of HIP09 EQ to 1
IB/iser: Align coding style across driver
IB/iser: Remove un-needed casting to/from void pointer
IB/iser: Don't suppress send completions
IB/iser: Rename ib_ret local variable
IB/iser: Fix RNR errors
IB/iser: Remove deprecated pi_guard module param
IB/mlx5: Expose NDR speed through MAD
RDMA/cxgb4: Set queue pair state when being queried
...
commit 56b765b79e ("htb: improved accuracy at high rates") broke
"overhead X", "linklayer atm" and "mpu X" attributes.
"overhead X" and "linklayer atm" have already been fixed. This restores
the "mpu X" handling, as might be used by DOCSIS or Ethernet shaping:
tc class add ... htb rate X overhead 4 mpu 64
The code being fixed is used by htb, tbf and act_police. Cake has its
own mpu handling. qdisc_calculate_pkt_len still uses the size table
containing values adjusted for mpu by user space.
iproute2 tc has always passed mpu into the kernel via a tc_ratespec
structure, but the kernel never directly acted on it, merely stored it
so that it could be read back by `tc class show`.
Rather, tc would generate length-to-time tables that included the mpu
(and linklayer) in their construction, and the kernel used those tables.
Since v3.7, the tables were no longer used. Along with "mpu", this also
broke "overhead" and "linklayer" which were fixed in 01cb71d2d4
("net_sched: restore "overhead xxx" handling", v3.10) and 8a8e3d84b1
("net_sched: restore "linklayer atm" handling", v3.11).
"overhead" was fixed by simply restoring use of tc_ratespec::overhead -
this had originally been used by the kernel but was initially omitted
from the new non-table-based calculations.
"linklayer" had been handled in the table like "mpu", but the mode was
not originally passed in tc_ratespec. The new implementation was made to
handle it by getting new versions of tc to pass the mode in an extended
tc_ratespec, and for older versions of tc the table contents were analysed
at load time to deduce linklayer.
As "mpu" has always been given to the kernel in tc_ratespec,
accompanying the mpu-based table, we can restore system functionality
with no userspace change by making the kernel act on the tc_ratespec
value.
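The intended effect on packet length accounting can be sketched as
follows. This is an illustration only: struct rate_cfg and
effective_len() are hypothetical names, and applying the overhead before
the mpu floor is an assumption of this sketch.

```c
/* Hypothetical subset of the rate parameters passed from user space. */
struct rate_cfg {
	unsigned int overhead;	/* bytes added to every packet */
	unsigned int mpu;	/* minimum packet unit, in bytes */
};

/* Length used for rate calculations: add overhead, then apply the floor. */
static unsigned int effective_len(const struct rate_cfg *r, unsigned int len)
{
	len += r->overhead;
	if (len < r->mpu)
		len = r->mpu;
	return len;
}
```

With "overhead 4 mpu 64" as in the tc example above, a 40 byte packet
would be accounted as 64 bytes, while a 100 byte packet would be
accounted as 104 bytes.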
Fixes: 56b765b79e ("htb: improved accuracy at high rates")
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Vimalkumar <j.vimal@gmail.com>
Link: https://lore.kernel.org/r/20220112170210.1014351-1-kevin@bracey.fi
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some architectures support a self-extracting kernel, which embeds the
compressed vmlinux.
It has 4 bytes of data at the end so the decompressor can know the vmlinux
size beforehand.
GZIP natively has it in the trailer, but for the other compression
algorithms, a hand-crafted trailer is added.
There is no need to generate such _corrupted_ compressed files because
it is possible to pass the size data as a separate file.
For example, the assembly code:
.incbin "compressed-vmlinux-with-size-data"
can be transformed to:
.incbin "compressed-vmlinux"
.incbin "size-data"
My hope is, after some reworks of the decompressors, the macros
cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}_with_size will go away.
This new macro, cmd_file_size, will be useful to generate a separate
size-data file.
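For illustration, the 4 bytes of size data can be produced and consumed
as shown below. The sketch assumes little-endian byte order, as in the
gzip trailer, and the function names are hypothetical; it does not show
the Kbuild macro itself.

```c
#include <stdint.h>

/* Encode a 32-bit size as 4 bytes, least significant byte first. */
static void encode_size_le32(uint32_t size, unsigned char out[4])
{
	out[0] = (unsigned char)(size & 0xff);
	out[1] = (unsigned char)((size >> 8) & 0xff);
	out[2] = (unsigned char)((size >> 16) & 0xff);
	out[3] = (unsigned char)((size >> 24) & 0xff);
}

/* Decode the same 4 bytes back into a 32-bit size. */
static uint32_t decode_size_le32(const unsigned char in[4])
{
	return (uint32_t)in[0] |
	       ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) |
	       ((uint32_t)in[3] << 24);
}
```

Keeping these 4 bytes in their own file leaves the compressed stream in
its proper format, and the decompressor can still learn the size by
including both files back to back.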
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Presumably, arch/{parisc,s390,sh}/boot/compressed/Makefile copied
arch/x86/boot/compressed/Makefile, but vmlinux.bin.all-y is useless
here because it is the same as $(obj)/vmlinux.bin.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
GZIP-compressed files end with 4 bytes of data that represent the size
of the original input. The decompressors (the self-extracting kernel)
exploit it to know the vmlinux size beforehand. To mimic GZIP's
trailer, Kbuild provides cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}.
Unfortunately these macros are used everywhere even though the appended
size data is only useful for the decompressors.
There is no guarantee that such hand-crafted trailers are safely ignored.
In fact, the kernel refuses compressed initramfs with the garbage data.
That is why usr/Makefile overrides size_append to make it a no-op.
To limit the use of such broken compressed files, this commit renames
the existing macros as follows:
cmd_bzip2 --> cmd_bzip2_with_size
cmd_lzma --> cmd_lzma_with_size
cmd_lzo --> cmd_lzo_with_size
cmd_lz4 --> cmd_lz4_with_size
cmd_xzkern --> cmd_xzkern_with_size
cmd_zstd22 --> cmd_zstd22_with_size
To keep the decompressors working, I updated the following Makefiles
accordingly:
arch/arm/boot/compressed/Makefile
arch/h8300/boot/compressed/Makefile
arch/mips/boot/compressed/Makefile
arch/parisc/boot/compressed/Makefile
arch/s390/boot/compressed/Makefile
arch/sh/boot/compressed/Makefile
arch/x86/boot/compressed/Makefile
I reused the current macro names for the normal usecases; they produce
the compressed data in the proper format.
I did not touch the following:
arch/arc/boot/Makefile
arch/arm64/boot/Makefile
arch/csky/boot/Makefile
arch/mips/boot/Makefile
arch/riscv/boot/Makefile
arch/sh/boot/Makefile
kernel/Makefile
This means those Makefiles will stop appending the size data.
I dropped the 'override size_append' hack from usr/Makefile.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
The appended file size is only used by the decompressors, which some
architectures support.
As the comment "zstd22 is used for kernel compression" says, cmd_zstd22
is used in arch/{mips,s390,x86}/boot/compressed/Makefile.
On the other hand, there is no good reason to append the file size to
cmd_zstd since it is used for other purposes.
Actually cmd_zstd is only used in usr/Makefile, where the appended file
size is rather harmful.
The initramfs with its file size appended is considered as corrupted
data, so commit 65e00e04e5 ("initramfs: refactor the initramfs build
rules") added 'override size_append := :' to make it no-op.
In conclusion, this $(size_append) should not exist here.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
'export suffix-y' does not work reliably because hyphens are disallowed
in shell variables.
A similar issue was fixed by commit 2bfbe7881e ("kbuild: Do not use
hyphen in exported variable name").
If I do something similar in dash, ARCH=sh fails to build.
$ mv linux linux~
$ cd linux~
$ dash
$ make O=foo/bar ARCH=sh CROSS_COMPILE=sh4-linux-gnu- defconfig all
make[1]: Entering directory '/home/masahiro/linux~/foo/bar'
[ snip ]
make[4]: *** No rule to make target 'arch/sh/boot/compressed/vmlinux.bin.', needed by 'arch/sh/boot/compressed/piggy.o'. Stop.
make[3]: *** [/home/masahiro/linux~/arch/sh/boot/Makefile:40: arch/sh/boot/compressed/vmlinux] Error 2
make[2]: *** [/home/masahiro/linux~/arch/sh/Makefile:194: zImage] Error 2
make[1]: *** [/home/masahiro/linux~/Makefile:350: __build_one_by_one] Error 2
make[1]: Leaving directory '/home/masahiro/linux~/foo/bar'
make: *** [Makefile:219: __sub-make] Error 2
The maintainer of GNU Make stated that there is no consistent way to
export variables that do not meet the shell's naming criteria.
(https://savannah.gnu.org/bugs/?55719)
Consequently, you cannot use hyphens in exported variables.
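The naming rule behind this can be checked with a few lines of C. The
sketch below follows the POSIX definition of a shell name (letters,
digits and underscores, not starting with a digit); is_shell_var_name()
is a hypothetical helper, not something Kbuild provides.

```c
/* ASCII-only checks, independent of the current locale. */
static int is_name_start(char c)
{
	return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

static int is_name_char(char c)
{
	return is_name_start(c) || (c >= '0' && c <= '9');
}

/* Returns 1 if s can be used as a shell variable name, 0 otherwise. */
static int is_shell_var_name(const char *s)
{
	if (!s || !is_name_start(*s))
		return 0;
	for (s++; *s; s++) {
		if (!is_name_char(*s))
			return 0;
	}
	return 1;
}
```

Under this rule "suffix-y" is rejected while "suffix_y" is accepted,
which is why renaming the variable avoids the problem.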
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Pull MSI irq updates from Thomas Gleixner:
"Rework of the MSI interrupt infrastructure.
This is a treewide cleanup and consolidation of MSI interrupt handling
in preparation for further changes in this area which are necessary
to:
- address existing shortcomings in the VFIO area
- support the upcoming Interrupt Message Store functionality which
decouples the message store from the PCI config/MMIO space"
* tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits)
genirq/msi: Populate sysfs entry only once
PCI/MSI: Unbreak pci_irq_get_affinity()
genirq/msi: Convert storage to xarray
genirq/msi: Simplify sysfs handling
genirq/msi: Add abuse prevention comment to msi header
genirq/msi: Mop up old interfaces
genirq/msi: Convert to new functions
genirq/msi: Make interrupt allocation less convoluted
platform-msi: Simplify platform device MSI code
platform-msi: Let core code handle MSI descriptors
bus: fsl-mc-msi: Simplify MSI descriptor handling
soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs()
soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation
NTB/msi: Convert to msi_on_each_desc()
PCI: hv: Rework MSI handling
powerpc/mpic_u3msi: Use msi_for_each-desc()
powerpc/fsl_msi: Use msi_for_each_desc()
powerpc/pasemi/msi: Convert to msi_on_each_dec()
powerpc/cell/axon_msi: Convert to msi_on_each_desc()
powerpc/4xx/hsta: Rework MSI handling
...
Pull timer updates from Thomas Gleixner:
"Updates for the time(r) subsystem:
Core:
- Make the clocksource watchdog more robust by better validation
checks of the measurement.
Drivers:
- New drivers for MStar and SSD20xd SOCs
- The usual cleanups and improvements all over the place"
* tag 'timers-core-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
dt-bindings: timer: Add Mstar MSC313e timer devicetree bindings documentation
clocksource/drivers/msc313e: Add support for ssd20xd-based platforms
clocksource/drivers: Add MStar MSC313e timer support
clocksource/drivers/pistachio: Fix -Wunused-but-set-variable warning
clocksource/drivers/timer-imx-sysctr: Set cpumask to cpu_possible_mask
clocksource/drivers/imx-sysctr: Mark two variable with __ro_after_init
clocksource/drivers/renesas,ostm: Make RENESAS_OSTM symbol visible
clocksource/drivers/renesas-ostm: Add RZ/G2L OSTM support
dt-bindings: timer: renesas: ostm: Document Renesas RZ/G2L OSTM
clocksource/drivers/exynos_mct: Fix silly typo resulting in checkpatch warning
clocksource: Reduce the default clocksource_watchdog() retries to 2
clocksource: Avoid accidental unstable marking of clocksources
dt-bindings: timer: tpm-timer: Add imx8ulp compatible string
reset: Add of_reset_control_get_optional_exclusive()
clocksource/drivers/exynos_mct: Refactor resources allocation
dt-bindings: timer: remove rockchip,rk3066-timer compatible string from rockchip,rk-timer.yaml
dt-bindings: timer: cadence_ttc: Add power-domains
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt subsystem:
Core:
- Provide a new interface for affinity hints to provide a separation
between hint and actual affinity change which has become a hidden
property of the current interface
- Fix up the in tree usage of the affinity hint interfaces
Drivers:
- No new irqchip drivers!
- Fix GICv3 redistributor table reservation with RT across kexec
- Fix GICv4.1 redistributor view of the VPE table across kexec
- Add support for extra interrupts on spear-shirq
- Make obtaining some interrupts optional for the Renesas drivers
- Various cleanups and bug fixes"
* tag 'irq-core-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
irqchip/renesas-intc-irqpin: Use platform_get_irq_optional() to get the interrupt
irqchip/renesas-irqc: Use platform_get_irq_optional() to get the interrupt
irqchip/gic-v4: Disable redistributors' view of the VPE table at boot time
irqchip/ingenic-tcu: Use correctly sized arguments for bit field
irqchip/gic-v2m: Add const to of_device_id
irqchip/imx-gpcv2: Mark imx_gpcv2_instance with __ro_after_init
irqchip/spear-shirq: Add support for IRQ 0..6
irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime
irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve
irqchip/gic-v3-its: Give the percpu rdist struct its own flags field
net/mlx4: Use irq_update_affinity_hint()
net/mlx5: Use irq_set_affinity_and_hint()
hinic: Use irq_set_affinity_and_hint()
scsi: lpfc: Use irq_set_affinity()
mailbox: Use irq_update_affinity_hint()
ixgbe: Use irq_update_affinity_hint()
be2net: Use irq_update_affinity_hint()
enic: Use irq_update_affinity_hint()
RDMA/irdma: Use irq_update_affinity_hint()
scsi: mpt3sas: Use irq_set_affinity_and_hint()
...
- Add PCI_ERROR_RESPONSE and related definitions for signaling and checking
for transaction errors on PCI (Naveen Naidu)
- Fabricate PCI_ERROR_RESPONSE data (~0) in config read wrappers, instead
of in host controller drivers, when transactions fail on PCI (Naveen
Naidu)
- Use PCI_POSSIBLE_ERROR() to check for possible failure of config reads
(Naveen Naidu)
* pci/errors:
PCI: xgene: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: hv: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: keystone: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: cpqphp: Use PCI_POSSIBLE_ERROR() to check config reads
PCI/PME: Use PCI_POSSIBLE_ERROR() to check config reads
PCI/DPC: Use PCI_POSSIBLE_ERROR() to check config reads
PCI: pciehp: Use PCI_POSSIBLE_ERROR() to check config reads
PCI: vmd: Use PCI_POSSIBLE_ERROR() to check config reads
PCI/ERR: Use PCI_POSSIBLE_ERROR() to check config reads
PCI: rockchip-host: Drop error data fabrication when config read fails
PCI: rcar-host: Drop error data fabrication when config read fails
PCI: altera: Drop error data fabrication when config read fails
PCI: mvebu: Drop error data fabrication when config read fails
PCI: aardvark: Drop error data fabrication when config read fails
PCI: kirin: Drop error data fabrication when config read fails
PCI: histb: Drop error data fabrication when config read fails
PCI: exynos: Drop error data fabrication when config read fails
PCI: mediatek: Drop error data fabrication when config read fails
PCI: iproc: Drop error data fabrication when config read fails
PCI: thunder: Drop error data fabrication when config read fails
PCI: Drop error data fabrication when config read fails
PCI: Use PCI_SET_ERROR_RESPONSE() for disconnected devices
PCI: Set error response data when config read fails
PCI: Add PCI_ERROR_RESPONSE and related definitions
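The pattern in the pci/errors series can be sketched in stand-alone C as
below. The lowercase helpers and the ERROR_RESPONSE_32 constant are
hypothetical 32-bit stand-ins, not the kernel macros, and the hw_status
and hw_value parameters simulate what a host controller would return.

```c
#include <stdbool.h>
#include <stdint.h>

/* All ones (~0): the data seen when a config read fails on PCI. */
#define ERROR_RESPONSE_32 UINT32_MAX

/*
 * Wrapper around a simulated low-level read. On failure the wrapper
 * fabricates the error response, so individual controller drivers do
 * not have to.
 */
static int config_read32(int hw_status, uint32_t hw_value, uint32_t *val)
{
	if (hw_status != 0) {
		*val = ERROR_RESPONSE_32;
		return hw_status;
	}
	*val = hw_value;
	return 0;
}

/* All ones may be valid data, so this only signals a possible error. */
static bool possible_error32(uint32_t val)
{
	return val == ERROR_RESPONSE_32;
}
```

Callers that ignore the return value still see all ones after a failed
read, and callers that want to detect failures can test the value in one
place instead of open-coding the comparison.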
- Sort Intel Device IDs by value (Andy Shevchenko)
- Change Capability offsets to hex to match spec (Baruch Siach)
- Correct misspellings (Krzysztof Wilczyński)
- Terminate statement with semicolon in pci_endpoint_test.c (Ming Wang)
* pci/misc:
misc: pci_endpoint_test: Terminate statement with semicolon
PCI: Correct misspelled words
PCI: Change capability register offsets to hex
PCI: Sort Intel Device IDs by value
- Make emulated ROM BAR read-only by default (Pali Rohár)
- Make some emulated legacy PCI bits read-only for PCIe devices (Pali
Rohár)
- Update reserved bits in emulated PCIe Capability (Pali Rohár)
- Allow drivers to emulate different PCIe Capability versions (Pali Rohár)
- Set emulated Capabilities List bit for all PCIe devices, since they must
have at least a PCIe Capability (Pali Rohár)
* remotes/lorenzo/pci/bridge-emul:
PCI: pci-bridge-emul: Set PCI_STATUS_CAP_LIST for PCIe device
PCI: pci-bridge-emul: Correctly set PCIe capabilities
PCI: pci-bridge-emul: Fix definitions of reserved bits
PCI: pci-bridge-emul: Properly mark reserved PCIe bits in PCI config space
PCI: pci-bridge-emul: Make expansion ROM Base Address register read-only
- Declare bitmap correctly and as part of struct nwl_msi managed resource
(Christophe JAILLET)
* remotes/lorenzo/pci/xilinx-nwl:
PCI: xilinx-nwl: Simplify code and fix a memory leak
- Use bitmap ops for MSI allocator (Christophe JAILLET)
- Fix IB window setup, which was broken by the fact that IB resources are
now sorted in address order instead of DT dma-ranges order (Rob Herring)
* remotes/lorenzo/pci/xgene:
PCI: xgene: Fix IB window setup
PCI: xgene-msi: Use bitmap_zalloc() when applicable
- Reset everything below VMD before enumerating to work around failure to
enumerate NVMe devices when guest OS reboots (Nirmal Patel)
- Honor platform ACPI _OSC feature negotiation for Root Ports below VMD
(Kai-Heng Feng)
- Add support for Raptor Lake SKUs (Karthik L Gopalakrishnan)
* remotes/lorenzo/pci/vmd:
PCI: vmd: Add DID 8086:A77F for all Intel Raptor Lake SKU's
PCI: vmd: Honor ACPI _OSC on PCIe features
PCI: vmd: Clean up domain before enumeration
- Fix aarch32 abort handler so it doesn't check the wrong bus clock before
accessing the host controller (Marek Vasut)
* remotes/lorenzo/pci/rcar:
PCI: rcar: Check if device is runtime suspended instead of __clk_is_enabled()
- Undo PM setup in qcom_pcie_probe() error handling path (Christophe
JAILLET)
- Use __be16 type to store return value from cpu_to_be16() (Manivannan
Sadhasivam)
- Constify static dw_pcie_ep_ops (Rikard Falkeborn)
* remotes/lorenzo/pci/qcom:
PCI: qcom-ep: Constify static dw_pcie_ep_ops
PCI: qcom: Use __be16 type to store return value from cpu_to_be16()
PCI: qcom: Fix an error handling path in 'qcom_pcie_probe()'