mirror of
https://github.com/torvalds/linux.git
synced 2024-12-25 20:32:22 +00:00
9f0c253ddd
- Implement per-PMU context rescheduling to significantly improve single-PMU performance, and related cleanups/fixes. (by Peter Zijlstra and Namhyung Kim) - Fix ancient bug resulting in a lot of events being dropped erroneously at higher sampling frequencies. (by Luo Gengkun) - uprobes enhancements: - Implement RCU-protected hot path optimizations for better performance: "For baseline vs SRCU, peak througput increased from 3.7 M/s (million uprobe triggerings per second) up to about 8 M/s. For uretprobes it's a bit more modest with bump from 2.4 M/s to 5 M/s. For SRCU vs RCU Tasks Trace, peak throughput for uprobes increases further from 8 M/s to 10.3 M/s (+28%!), and for uretprobes from 5.3 M/s to 5.8 M/s (+11%), as we have more work to do on uretprobes side. Even single-thread (no contention) performance is slightly better: 3.276 M/s to 3.396 M/s (+3.5%) for uprobes, and 2.055 M/s to 2.174 M/s (+5.8%) for uretprobes." (by Andrii Nakryiko et al) - Document mmap_lock, don't abuse get_user_pages_remote(). (by Oleg Nesterov) - Cleanups & fixes to prepare for future work: - Remove uprobe_register_refctr() - Simplify error handling for alloc_uprobe() - Make uprobe_register() return struct uprobe * - Fold __uprobe_unregister() into uprobe_unregister() - Shift put_uprobe() from delete_uprobe() to uprobe_unregister() - BPF: Fix use-after-free in bpf_uprobe_multi_link_attach() (by Oleg Nesterov) - New feature & ABI extension: allow events to use PERF_SAMPLE READ with inheritance, enabling sample based profiling of a group of counters over a hierarchy of processes or threads. (by Ben Gainey) - Intel uncore & power events updates: - Add Arrow Lake and Lunar Lake support - Add PERF_EV_CAP_READ_SCOPE - Clean up and enhance cpumask and hotplug support (by Kan Liang) - Add LNL uncore iMC freerunning support - Use D0:F0 as a default device (by Zhenyu Wang) - Intel PT: fix AUX snapshot handling race. (by Adrian Hunter) - Misc fixes and cleanups. (by James Clark, Jiri Olsa, Oleg Nesterov and Peter Zijlstra) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJEBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmbqxEwRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iusw/43UAcAZVof6Qs+j6bVAxSabF66fFfE9Wh jc+F4yZ2MGl9x6a1f392+CPcTdVsYp6G2QtRGMipD+trmi/lhDhmRrhxxD1KWIwP zVGSBx9CSFl0UpCXdGiVrGzT5xpIpJ4qqW2XUVr32n8SxTT5X/vM5ySm6KUXsIrD 2/KXwucT9a7grkl3pvy/A/FUHxaF7oAMJjcIPSvLBveQjQSHUrZoCZdHsRGT9rjS HjzxG6gDy97172z5XV1ej3HJOfFlFTQ1RcoxNqdLfiZ6n3hD4hfmtsXWB5zTzRjT xHaCOmWLhEp5v+fK2+RCFiWUbDBsmW/mecZdrjGb3C1RIDWQhLCXXc95XtrobTvk BkW9QEC/XRB+vU6Ssdv3ugN7yRWxih0BsLU5sy4nlzmwoYt9qOy8fgjRvSBKHr5K Mu1RIFu+KXq++sa7+ZJjUMY70PHQCp2m4AHprG/Y98t93CQMhDXzGVpPzWyQuW/V lqYFjd/CAoCIVGF4Jxq7sqOdZ1emDN+P0WSnnFWssJ0ZJFvxN9ZDPH2AaMk4lwo7 NFW6u3+0Vx9P0m/H6xRQj00Iye2JLMqJNCIA8QtjnB7L6upgVvcIPjgcG58fpV1o xfJekOR1A7T2aQUDlX5t9Cu36ZUImDRmwHj2m1p84s5AANlbD7/fOmffR1Hn9uFj wCTqSpi8Hg== =E3s3 -----END PGP SIGNATURE----- Merge tag 'perf-core-2024-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Implement per-PMU context rescheduling to significantly improve single-PMU performance, and related cleanups/fixes (Peter Zijlstra and Namhyung Kim) - Fix ancient bug resulting in a lot of events being dropped erroneously at higher sampling frequencies (Luo Gengkun) - uprobes enhancements: - Implement RCU-protected hot path optimizations for better performance: "For baseline vs SRCU, peak througput increased from 3.7 M/s (million uprobe triggerings per second) up to about 8 M/s. For uretprobes it's a bit more modest with bump from 2.4 M/s to 5 M/s. For SRCU vs RCU Tasks Trace, peak throughput for uprobes increases further from 8 M/s to 10.3 M/s (+28%!), and for uretprobes from 5.3 M/s to 5.8 M/s (+11%), as we have more work to do on uretprobes side. Even single-thread (no contention) performance is slightly better: 3.276 M/s to 3.396 M/s (+3.5%) for uprobes, and 2.055 M/s to 2.174 M/s (+5.8%) for uretprobes." (Andrii Nakryiko et al) - Document mmap_lock, don't abuse get_user_pages_remote() (Oleg Nesterov) - Cleanups & fixes to prepare for future work: - Remove uprobe_register_refctr() - Simplify error handling for alloc_uprobe() - Make uprobe_register() return struct uprobe * - Fold __uprobe_unregister() into uprobe_unregister() - Shift put_uprobe() from delete_uprobe() to uprobe_unregister() - BPF: Fix use-after-free in bpf_uprobe_multi_link_attach() (Oleg Nesterov) - New feature & ABI extension: allow events to use PERF_SAMPLE READ with inheritance, enabling sample based profiling of a group of counters over a hierarchy of processes or threads (Ben Gainey) - Intel uncore & power events updates: - Add Arrow Lake and Lunar Lake support - Add PERF_EV_CAP_READ_SCOPE - Clean up and enhance cpumask and hotplug support (Kan Liang) - Add LNL uncore iMC freerunning support - Use D0:F0 as a default device (Zhenyu Wang) - Intel PT: fix AUX snapshot handling race (Adrian Hunter) - Misc fixes and cleanups (James Clark, Jiri Olsa, Oleg Nesterov and Peter Zijlstra) * tag 'perf-core-2024-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) dmaengine: idxd: Clean up cpumask and hotplug for perfmon iommu/vt-d: Clean up cpumask and hotplug for perfmon perf/x86/intel/cstate: Clean up cpumask and hotplug perf: Add PERF_EV_CAP_READ_SCOPE perf: Generic hotplug support for a PMU with a scope uprobes: perform lockless SRCU-protected uprobes_tree lookup rbtree: provide rb_find_rcu() / rb_find_add_rcu() perf/uprobe: split uprobe_unregister() uprobes: travers uprobe's consumer list locklessly under SRCU protection uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks uprobes: protected uprobe lifetime with SRCU uprobes: revamp uprobe refcounting and lifetime management bpf: Fix use-after-free in bpf_uprobe_multi_link_attach() perf/core: Fix small negative period being ignored perf: Really fix event_function_call() locking perf: Optimize __pmu_ctx_sched_out() perf: Add context time freeze perf: Fix event_function_call() locking perf: Extract a few helpers perf: Optimize context reschedule for single PMU cases ... |
||
---|---|---|
.. | ||
bestcomm | ||
dw | ||
dw-axi-dmac | ||
dw-edma | ||
fsl-dpaa2-qdma | ||
hsu | ||
idxd | ||
ioat | ||
lgm | ||
mediatek | ||
ppc4xx | ||
ptdma | ||
qcom | ||
sf-pdma | ||
sh | ||
stm32 | ||
ti | ||
xilinx | ||
acpi-dma.c | ||
altera-msgdma.c | ||
amba-pl08x.c | ||
apple-admac.c | ||
at_hdmac.c | ||
at_xdmac.c | ||
bcm2835-dma.c | ||
bcm-sba-raid.c | ||
dma-axi-dmac.c | ||
dma-jz4780.c | ||
dmaengine.c | ||
dmaengine.h | ||
dmatest.c | ||
ep93xx_dma.c | ||
fsl_raid.c | ||
fsl_raid.h | ||
fsl-edma-common.c | ||
fsl-edma-common.h | ||
fsl-edma-main.c | ||
fsl-edma-trace.c | ||
fsl-edma-trace.h | ||
fsl-qdma.c | ||
fsldma.c | ||
fsldma.h | ||
hisi_dma.c | ||
idma64.c | ||
idma64.h | ||
img-mdc-dma.c | ||
imx-dma.c | ||
imx-sdma.c | ||
k3dma.c | ||
Kconfig | ||
lpc18xx-dmamux.c | ||
ls2x-apb-dma.c | ||
Makefile | ||
mcf-edma-main.c | ||
milbeaut-hdmac.c | ||
milbeaut-xdmac.c | ||
mmp_pdma.c | ||
mmp_tdma.c | ||
moxart-dma.c | ||
mpc512x_dma.c | ||
mv_xor_v2.c | ||
mv_xor.c | ||
mv_xor.h | ||
mxs-dma.c | ||
nbpfaxi.c | ||
of-dma.c | ||
owl-dma.c | ||
pch_dma.c | ||
pl330.c | ||
plx_dma.c | ||
pxa_dma.c | ||
sa11x0-dma.c | ||
sprd-dma.c | ||
st_fdma.c | ||
st_fdma.h | ||
ste_dma40_ll.c | ||
ste_dma40_ll.h | ||
ste_dma40.c | ||
ste_dma40.h | ||
sun4i-dma.c | ||
sun6i-dma.c | ||
tegra20-apb-dma.c | ||
tegra186-gpc-dma.c | ||
tegra210-adma.c | ||
timb_dma.c | ||
TODO | ||
txx9dmac.c | ||
txx9dmac.h | ||
uniphier-mdmac.c | ||
uniphier-xdmac.c | ||
virt-dma.c | ||
virt-dma.h | ||
xgene-dma.c |