linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-10 06:01:57 +00:00

Author	SHA1	Message	Date
Lu Baolu	6e02a277f1	iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices Previously, the domain_context_clear() function incorrectly called pci_for_each_dma_alias() to set up context entries for non-PCI devices. This could lead to kernel hangs or other unexpected behavior. Add a check to only call pci_for_each_dma_alias() for PCI devices. For non-PCI devices, domain_context_clear_one() is called directly. Reported-by: Todd Brandt <todd.e.brandt@intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219363 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219349 Fixes: `9a16ab9d64` ("iommu/vt-d: Make context clearing consistent with context mapping") Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20241014013744.102197-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-10-15 10:17:54 +02:00
Chen Ni	7de7d35429	iommu/arm-smmu-v3: Convert comma to semicolon Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Fixes: `e3b1be2e73` ("iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20240923021557.3432068-1-nichen@iscas.ac.cn Signed-off-by: Will Deacon <will@kernel.org>	2024-10-08 18:38:34 +01:00
Daniel Mentz	f63237f54c	iommu/arm-smmu-v3: Fix last_sid_idx calculation for sid_bits==32 The function arm_smmu_init_strtab_2lvl uses the expression ((1 << smmu->sid_bits) - 1) to calculate the largest StreamID value. However, this fails for the maximum allowed value of SMMU_IDR1.SIDSIZE which is 32. The C standard states: "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined." With smmu->sid_bits being 32, the prerequisites for undefined behavior are met. We observed that the value of (1 << 32) is 1 and not 0 as we initially expected. Similar bit shift operations in arm_smmu_init_strtab_linear seem to not be affected, because it appears to be unlikely for an SMMU to have SMMU_IDR1.SIDSIZE set to 32 but then not support 2-level Stream tables This issue was found by Ryan Huang <tzukui@google.com> on our team. Fixes: `ce410410f1` ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()") Signed-off-by: Daniel Mentz <danielmentz@google.com> Link: https://lore.kernel.org/r/20241002015357.1766934-1-danielmentz@google.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-08 18:37:48 +01:00
Robin Murphy	0dfe314cdd	iommu/arm-smmu: Clarify MMU-500 CPRE workaround CPRE workarounds are implicated in at least 5 MMU-500 errata, some of which remain unfixed. The comment and warning message have proven to be unhelpfully misleading about this scope, so reword them to get the point across with less risk of going out of date or confusing users. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/dfa82171b5248ad7cf1f25592101a6eec36b8c9a.1728400877.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-08 18:35:41 +01:00
Al Viro	cb787f4ac0	[tree-wide] finally take no_llseek out no_llseek had been defined to NULL two years ago, in commit `868941b144` ("fs: remove no_llseek") To quote that commit, At -rc1 we'll need do a mechanical removal of no_llseek - git grep -l -w no_llseek \| grep -v porting.rst \| while read i; do sed -i '/\<no_llseek\>/d' $i done would do it. Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-09-27 08:18:43 -07:00
Linus Torvalds	4491b85480	dma-mapping fixes for Linux 6.12 - sort out a few issues with the direct calls to iommu-dma (Christoph Hellwig, Leon Romanovsky) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmbyWtkLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYPToRAAlChDftSXmz9Isxds5VyzFy8El6hcutPiRfB8mUze 9aq7qgCeLVq6L0tPvNTgl2iYhS8KjTxVKSooYSQ0eAfb3FgNHKggmfV9GO8ddqBe ojf9s9EJzJ3BatkVPBDPin88Sv0Vfwkb6n78gEeuTGPReScMdR5WZfmyl5QfMWpt 5W16gSA666DcIpxIDPo6zEPZT/lXsPcmvXrZJhFZe7hITNp/CihVxdxy9spLRG2Z Ntei8AtXUvjn3ziU4ymcKOfP6fmERlVAiq65CqlXMUELs3Eh4knH9eEiXHtu0JAD qMhuHSB5b2Dd2gjwvJDoISRPPtpy0sGnx81lAZJPIGnvVKKJSI+Lp7wDP/w66Rxx 7vF7tU9uyAb4Qmb5h0s9BqJ3mCNN9WXsAKrNX3rLr2ZLCjSuQFDne0oAYgVl10J8 ezALiv+mG4C4Q5lxzmZm4PU8f1CYjLMe0KppDMLsuwdWpliOlGigCKYddqkOj0ct Cvg4lMHNcBo8xADyaqm/9iJYIkvLl5o6JGmVBbSCikUT3GMssBf/0KdITS4D+/cj 150LGxiZOQaOx1Ef8eACYslKJq8hIz0OmAjgDGpdLM7sBxa62FuZJrcujkKW9UA6 8DrYp+XqKd5R/ZOcjAIMbNAlZEa5CytgB9r4dXvrAC1dk//Dd23G873KunAE1DJJ cy0= =3hPK -----END PGP SIGNATURE----- Merge tag 'dma-mapping-6.12-2024-09-24' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - sort out a few issues with the direct calls to iommu-dma (Christoph Hellwig, Leon Romanovsky) * tag 'dma-mapping-6.12-2024-09-24' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: report unlimited DMA addressing in IOMMU DMA path iommu/dma: remove most stubs in iommu-dma.h dma-mapping: fix vmap and mmap of noncontiougs allocations	2024-09-24 12:00:37 -07:00
Linus Torvalds	db78436bed	iommufd 6.12 merge window pull Collection of small cleanup and one fix: - Sort headers and struct forward declarations - Fix random selftest failures in some cases due to dirty tracking tests - Have the reserved IOVA regions mechanism work when a HWPT is used as a nesting parent. This updates the nesting parent's IOAS with the reserved regions of the device and will also install the ITS doorbell page on ARM. - Add missed validation of parent domain ops against the current iommu - Fix a syzkaller bug related to integer overflow during ALIGN() - Tidy two iommu_domain attach paths -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZvGksgAKCRCFwuHvBreF YbSfAP931gRT85t0r7z6tH1GJVIviX2mg5TYGsb9SkrxVKcKAwD9H65T7tJRzTyP K1oYBY7wtpHbR38hjFbnRPD7ZM+k8A4= =r7jm -----END PGP SIGNATURE----- Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "Collection of small cleanup and one fix: - Sort headers and struct forward declarations - Fix random selftest failures in some cases due to dirty tracking tests - Have the reserved IOVA regions mechanism work when a HWPT is used as a nesting parent. This updates the nesting parent's IOAS with the reserved regions of the device and will also install the ITS doorbell page on ARM. - Add missed validation of parent domain ops against the current iommu - Fix a syzkaller bug related to integer overflow during ALIGN() - Tidy two iommu_domain attach paths" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommu: Set iommu_attach_handle->domain in core iommufd: Avoid duplicated __iommu_group_set_core_domain() call iommufd: Protect against overflow of ALIGN() during iova allocation iommufd: Reorder struct forward declarations iommufd: Check the domain owner of the parent before creating a nesting domain iommufd/device: Enforce reserved IOVA also when attached to hwpt_nested iommufd/selftest: Fix buffer read overrrun in the dirty test iommufd: Reorder include files	2024-09-24 11:55:26 -07:00
Christoph Hellwig	bb0e391975	dma-mapping: fix vmap and mmap of noncontiougs allocations Commit `b5c58b2fdc` ("dma-mapping: direct calls for dma-iommu") switched to use direct calls to dma-iommu, but missed the dma_vmap_noncontiguous, dma_vunmap_noncontiguous and dma_mmap_noncontiguous behavior keyed off the presence of the alloc_noncontiguous method. Fix this by removing the now unused alloc_noncontiguous and free_noncontiguous methods and moving the vmapping and mmaping of the noncontiguous allocations into the iommu code, as it is the only provider of actually noncontiguous allocations. Fixes: `b5c58b2fdc` ("dma-mapping: direct calls for dma-iommu") Reported-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Leon Romanovsky <leon@kernel.org> Tested-by: Xi Ruoyao <xry111@xry111.site>	2024-09-22 18:47:51 +02:00
Linus Torvalds	7856a56541	Many singleton patches - please see the various changelogs for details. Quite a lot of nilfs2 work this time around. Notable patch series in this pull request are: "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with assistance from Uwe Kleine-König. Reimplement mul_u64_u64_div_u64() to provide (much) more accurate results. The current implementation was causing Uwe some issues in the PWM drivers. "xz: Updates to license, filters, and compression options" from Lasse Collin. Miscellaneous maintenance and kinor feature work to the xz decompressor. "Fix some GDB command error and add some GDB commands" from Kuan-Ying Lee. Fixes and enhancements to the gdb scripts. "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff Johnson. Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of warnings about this. "nilfs2: add support for some common ioctls" from Ryusuke Konishi. Adds various commonly-available ioctls to nilfs2. "This series fixes a number of formatting issues in kernel doc comments" from Ryusuke Konishi does that. "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke Konishi. Fix issues where -ENOENT was being unintentionally and inappropriately returned to userspace. "nilfs2: assorted cleanups" from Huang Xiaojia. "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke Konishi fixes some issues which can occur on corrupted nilfs2 filesystems. "scripts/decode_stacktrace.sh: improve error reporting and usability" from Luca Ceresoli does those things. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu7dpAAKCRDdBJ7gKXxA jsPqAPwMDEZyKlfSw7QioEHNHDkmkbP7VYCYR0CbUnppbztwpAD8D37aVbWQ+UzM 3nnOq3W2Pc2o/20zqi8Upf1mnvUrygQ= =/NWE -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Many singleton patches - please see the various changelogs for details. Quite a lot of nilfs2 work this time around. Notable patch series in this pull request are: - "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with assistance from Uwe Kleine-König. Reimplement mul_u64_u64_div_u64() to provide (much) more accurate results. The current implementation was causing Uwe some issues in the PWM drivers. - "xz: Updates to license, filters, and compression options" from Lasse Collin. Miscellaneous maintenance and kinor feature work to the xz decompressor. - "Fix some GDB command error and add some GDB commands" from Kuan-Ying Lee. Fixes and enhancements to the gdb scripts. - "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff Johnson. Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of warnings about this. - "nilfs2: add support for some common ioctls" from Ryusuke Konishi. Adds various commonly-available ioctls to nilfs2. - "This series fixes a number of formatting issues in kernel doc comments" from Ryusuke Konishi does that. - "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke Konishi. Fix issues where -ENOENT was being unintentionally and inappropriately returned to userspace. - "nilfs2: assorted cleanups" from Huang Xiaojia. - "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke Konishi fixes some issues which can occur on corrupted nilfs2 filesystems. - "scripts/decode_stacktrace.sh: improve error reporting and usability" from Luca Ceresoli does those things" * tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (103 commits) list: test: increase coverage of list_test_list_replace*() list: test: fix tests for list_cut_position() proc: use __auto_type more treewide: correct the typo 'retun' ocfs2: cleanup return value and mlog in ocfs2_global_read_info() nilfs2: remove duplicate 'unlikely()' usage nilfs2: fix potential oob read in nilfs_btree_check_delete() nilfs2: determine empty node blocks as corrupted nilfs2: fix potential null-ptr-deref in nilfs_btree_insert() user_namespace: use kmemdup_array() instead of kmemdup() for multiple allocation tools/mm: rm thp_swap_allocator_test when make clean squashfs: fix percpu address space issues in decompressor_multi_percpu.c lib: glob.c: added null check for character class nilfs2: refactor nilfs_segctor_thread() nilfs2: use kthread_create and kthread_stop for the log writer thread nilfs2: remove sc_timer_task nilfs2: do not repair reserved inode bitmap in nilfs_new_inode() nilfs2: eliminate the shared counter and spinlock for i_generation nilfs2: separate inode type information from i_state field nilfs2: use the BITS_PER_LONG macro ...	2024-09-21 08:20:50 -07:00
Linus Torvalds	726e2d0cf2	dma-mapping updates for linux 6.12 - support DMA zones for arm64 systems where memory starts at > 4GB (Baruch Siach, Catalin Marinas) - support direct calls into dma-iommu and thus obsolete dma_map_ops for many common configurations (Leon Romanovsky) - add DMA-API tracing (Sean Anderson) - remove the not very useful return value from various dma_set_* APIs (Christoph Hellwig) - misc cleanups and minor optimizations (Chen Y, Yosry Ahmed, Christoph Hellwig) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmbr2BALHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYNheA/6A453SQy2kFvspFRvEp8ztEqtvxwxGLAUMIyvmU+a 9b37KlMwUnpbMsXK5+KtYdTLRoIvtl89uIkdZq7pYYKj0uoPZvF9QVnKtrJWAvqK fFuauokZznuD3ZSd6v6uY4ijb29ImGfx5kZopQf1zWoYLENxM7mWqRU+eqxDozev FbyfYhJzMBhpHveen9+Q7PEfi/90ZdEqtJhSK2AOzuV9ZvbYiSFCrcnT/4wM30DS 2OxjGa8tKcGYZ9ah0rF2V5hboaRuYedTFgXoKfUSJINJkzmBlTXdxVx5Xr3kQtyC 7S/xv2y79CXkDKck2+IY7xkhwwBsXPrTAyTzWAIJqOEmaMJ4KqEW54JOsK+VHfmO 29UKBnASOK0xvfCzakm2631iOzEZF743RgpQiOGeMcnph789Mwu8EUCcqeEW/fJy Xh7B0z3/XgJz8BtTG/64IhmqO63Cwa/o7DSQdLr9dh5F/mPBzqrnRov97KL7mH1q VSO0Z7+8J0x9ALcYutpth/IzG/lXtXn/pfR1sj6dBHvjf5SwjuT8MKUHgh0l6N+C BWZn8swwrZaJ2Li2Gv3CpnCzVQZCkL6ns9VqAWiWq7VfGhDLndMqfi/jHCyGH83i E3dMtqf81XaQ7JRDPCs7Jx/4Zkn/iNkkZe8IQsByMc1BY4oeD7/Z2s8mkK8MbNla /CA= =DZVc -----END PGP SIGNATURE----- Merge tag 'dma-mapping-6.12-2024-09-19' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - support DMA zones for arm64 systems where memory starts at > 4GB (Baruch Siach, Catalin Marinas) - support direct calls into dma-iommu and thus obsolete dma_map_ops for many common configurations (Leon Romanovsky) - add DMA-API tracing (Sean Anderson) - remove the not very useful return value from various dma_set_* APIs (Christoph Hellwig) - misc cleanups and minor optimizations (Chen Y, Yosry Ahmed, Christoph Hellwig) * tag 'dma-mapping-6.12-2024-09-19' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: reflow dma_supported dma-mapping: reliably inform about DMA support for IOMMU dma-mapping: add tracing for dma-mapping API calls dma-mapping: use IOMMU DMA calls for common alloc/free page calls dma-direct: optimize page freeing when it is not addressable dma-mapping: clearly mark DMA ops as an architecture feature vdpa_sim: don't select DMA_OPS arm64: mm: keep low RAM dma zone dma-mapping: don't return errors from dma_set_max_seg_size dma-mapping: don't return errors from dma_set_seg_boundary dma-mapping: don't return errors from dma_set_min_align_mask scsi: check that busses support the DMA API before setting dma parameters arm64: mm: fix DMA zone when dma-ranges is missing dma-mapping: direct calls for dma-iommu dma-mapping: call ->unmap_page and ->unmap_sg unconditionally arm64: support DMA zone above 4GB dma-mapping: replace zone_dma_bits by zone_dma_limit dma-mapping: use bit masking to check VM_DMA_COHERENT	2024-09-19 11:12:49 +02:00
Linus Torvalds	9f0c253ddd	Performance events changes for v6.12: - Implement per-PMU context rescheduling to significantly improve single-PMU performance, and related cleanups/fixes. (by Peter Zijlstra and Namhyung Kim) - Fix ancient bug resulting in a lot of events being dropped erroneously at higher sampling frequencies. (by Luo Gengkun) - uprobes enhancements: - Implement RCU-protected hot path optimizations for better performance: "For baseline vs SRCU, peak througput increased from 3.7 M/s (million uprobe triggerings per second) up to about 8 M/s. For uretprobes it's a bit more modest with bump from 2.4 M/s to 5 M/s. For SRCU vs RCU Tasks Trace, peak throughput for uprobes increases further from 8 M/s to 10.3 M/s (+28%!), and for uretprobes from 5.3 M/s to 5.8 M/s (+11%), as we have more work to do on uretprobes side. Even single-thread (no contention) performance is slightly better: 3.276 M/s to 3.396 M/s (+3.5%) for uprobes, and 2.055 M/s to 2.174 M/s (+5.8%) for uretprobes." (by Andrii Nakryiko et al) - Document mmap_lock, don't abuse get_user_pages_remote(). (by Oleg Nesterov) - Cleanups & fixes to prepare for future work: - Remove uprobe_register_refctr() - Simplify error handling for alloc_uprobe() - Make uprobe_register() return struct uprobe * - Fold __uprobe_unregister() into uprobe_unregister() - Shift put_uprobe() from delete_uprobe() to uprobe_unregister() - BPF: Fix use-after-free in bpf_uprobe_multi_link_attach() (by Oleg Nesterov) - New feature & ABI extension: allow events to use PERF_SAMPLE READ with inheritance, enabling sample based profiling of a group of counters over a hierarchy of processes or threads. (by Ben Gainey) - Intel uncore & power events updates: - Add Arrow Lake and Lunar Lake support - Add PERF_EV_CAP_READ_SCOPE - Clean up and enhance cpumask and hotplug support (by Kan Liang) - Add LNL uncore iMC freerunning support - Use D0:F0 as a default device (by Zhenyu Wang) - Intel PT: fix AUX snapshot handling race. (by Adrian Hunter) - Misc fixes and cleanups. (by James Clark, Jiri Olsa, Oleg Nesterov and Peter Zijlstra) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJEBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmbqxEwRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iusw/43UAcAZVof6Qs+j6bVAxSabF66fFfE9Wh jc+F4yZ2MGl9x6a1f392+CPcTdVsYp6G2QtRGMipD+trmi/lhDhmRrhxxD1KWIwP zVGSBx9CSFl0UpCXdGiVrGzT5xpIpJ4qqW2XUVr32n8SxTT5X/vM5ySm6KUXsIrD 2/KXwucT9a7grkl3pvy/A/FUHxaF7oAMJjcIPSvLBveQjQSHUrZoCZdHsRGT9rjS HjzxG6gDy97172z5XV1ej3HJOfFlFTQ1RcoxNqdLfiZ6n3hD4hfmtsXWB5zTzRjT xHaCOmWLhEp5v+fK2+RCFiWUbDBsmW/mecZdrjGb3C1RIDWQhLCXXc95XtrobTvk BkW9QEC/XRB+vU6Ssdv3ugN7yRWxih0BsLU5sy4nlzmwoYt9qOy8fgjRvSBKHr5K Mu1RIFu+KXq++sa7+ZJjUMY70PHQCp2m4AHprG/Y98t93CQMhDXzGVpPzWyQuW/V lqYFjd/CAoCIVGF4Jxq7sqOdZ1emDN+P0WSnnFWssJ0ZJFvxN9ZDPH2AaMk4lwo7 NFW6u3+0Vx9P0m/H6xRQj00Iye2JLMqJNCIA8QtjnB7L6upgVvcIPjgcG58fpV1o xfJekOR1A7T2aQUDlX5t9Cu36ZUImDRmwHj2m1p84s5AANlbD7/fOmffR1Hn9uFj wCTqSpi8Hg== =E3s3 -----END PGP SIGNATURE----- Merge tag 'perf-core-2024-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Implement per-PMU context rescheduling to significantly improve single-PMU performance, and related cleanups/fixes (Peter Zijlstra and Namhyung Kim) - Fix ancient bug resulting in a lot of events being dropped erroneously at higher sampling frequencies (Luo Gengkun) - uprobes enhancements: - Implement RCU-protected hot path optimizations for better performance: "For baseline vs SRCU, peak througput increased from 3.7 M/s (million uprobe triggerings per second) up to about 8 M/s. For uretprobes it's a bit more modest with bump from 2.4 M/s to 5 M/s. For SRCU vs RCU Tasks Trace, peak throughput for uprobes increases further from 8 M/s to 10.3 M/s (+28%!), and for uretprobes from 5.3 M/s to 5.8 M/s (+11%), as we have more work to do on uretprobes side. Even single-thread (no contention) performance is slightly better: 3.276 M/s to 3.396 M/s (+3.5%) for uprobes, and 2.055 M/s to 2.174 M/s (+5.8%) for uretprobes." (Andrii Nakryiko et al) - Document mmap_lock, don't abuse get_user_pages_remote() (Oleg Nesterov) - Cleanups & fixes to prepare for future work: - Remove uprobe_register_refctr() - Simplify error handling for alloc_uprobe() - Make uprobe_register() return struct uprobe * - Fold __uprobe_unregister() into uprobe_unregister() - Shift put_uprobe() from delete_uprobe() to uprobe_unregister() - BPF: Fix use-after-free in bpf_uprobe_multi_link_attach() (Oleg Nesterov) - New feature & ABI extension: allow events to use PERF_SAMPLE READ with inheritance, enabling sample based profiling of a group of counters over a hierarchy of processes or threads (Ben Gainey) - Intel uncore & power events updates: - Add Arrow Lake and Lunar Lake support - Add PERF_EV_CAP_READ_SCOPE - Clean up and enhance cpumask and hotplug support (Kan Liang) - Add LNL uncore iMC freerunning support - Use D0:F0 as a default device (Zhenyu Wang) - Intel PT: fix AUX snapshot handling race (Adrian Hunter) - Misc fixes and cleanups (James Clark, Jiri Olsa, Oleg Nesterov and Peter Zijlstra) * tag 'perf-core-2024-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) dmaengine: idxd: Clean up cpumask and hotplug for perfmon iommu/vt-d: Clean up cpumask and hotplug for perfmon perf/x86/intel/cstate: Clean up cpumask and hotplug perf: Add PERF_EV_CAP_READ_SCOPE perf: Generic hotplug support for a PMU with a scope uprobes: perform lockless SRCU-protected uprobes_tree lookup rbtree: provide rb_find_rcu() / rb_find_add_rcu() perf/uprobe: split uprobe_unregister() uprobes: travers uprobe's consumer list locklessly under SRCU protection uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks uprobes: protected uprobe lifetime with SRCU uprobes: revamp uprobe refcounting and lifetime management bpf: Fix use-after-free in bpf_uprobe_multi_link_attach() perf/core: Fix small negative period being ignored perf: Really fix event_function_call() locking perf: Optimize __pmu_ctx_sched_out() perf: Add context time freeze perf: Fix event_function_call() locking perf: Extract a few helpers perf: Optimize context reschedule for single PMU cases ...	2024-09-18 15:03:58 +02:00
Linus Torvalds	eec91e22fe	IOMMU Updates for Linux v6.12 Including: - Core changes: - Allow ATS on VF when parent device is identity mapped. - Optimize unmap path on ARM io-pagetable implementation. - Use of_property_present(). - ARM-SMMU changes: - SMMUv2: - Devicetree binding updates for Qualcomm MMU-500 implementations. - Extend workarounds for broken Qualcomm hypervisor to avoid touching features that are not available (e.g. 16KiB page support, reserved context banks). - SMMUv3: - Support for NVIDIA's custom virtual command queue hardware. - Fix Stage-2 stall configuration and extend tests to cover this area. - A bunch of driver cleanups, including simplification of the master rbtree code. - Plus minor cleanups and fixes across both drivers. - Intel VT-d changes: - Retire si_domain and convert to use static identity domain. - Batched IOTLB/dev-IOTLB invalidation. - Small code refactoring and cleanups. - AMD-Vi changes: - Cleanup and refactoring of io-pagetable code. - Add parameter to limit the used io-pagesizes. - Other cleanups and fixes. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmboAtoACgkQK/BELZcB GuNidQ//WOhwVQZmdS6vnU2vu//LwFE7Q7PsRYPW2QhFri1eurKo6jxMNBtgUsXu fPTSEBM7/lhagRgb29ycrbOYoavkEnUiIMX7vRsjl9tVkqd/GKNTrMuUC+QPiBYQ ASkStmEUW6Zvye4rWyUxiCJIFIA5wm74wSOOQ6X2Wg3WMo51njrj1DK/k2H5JenJ RTmIA9Ynef2py38xWDd0UE/psvKrzA5uug4IP0E0v014i36cSEVrH7hjztMfd8Sc 2dUuJ8eUUtLTo1ffTcmxoTvUBjBzJOzeSQrFfaDZDgyqayt6JoSKeX1DV/nCI8kc ftg0pe37Zr3mndgQC7wNyUO1GOmkJl+GpMFyJTG8wpnBc0tr+TDn1o6QERymcRxA kn62n4vxxjWoRSKt3di7hNM0Uuwj8/z/cIbDSTNbSov4fDuuz0xppdcA/ewKATv0 VgmpP5OyIFZXM+mR4Vem2hZQQ3wPOsJAFVWS1ROtYQFgiimrGf+w9et8rEU4pmp5 Ve4rSmka60NLdE6i1JNqx4sRrRsdJJ55knI77nHrt0TZkbMzA/JG1UT3TbbMJTtd v5dviMMOXLpcKQLgqlde8QWOEjT6VUw/fbU640iyzhrWAm8fWDBefrSv6JLhevQ4 fBajoaej89cd9DkkEJiSTiyGig8QkY3HFaqDo3u5g/sBBrMBZas= =1QvI -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core changes: - Allow ATS on VF when parent device is identity mapped - Optimize unmap path on ARM io-pagetable implementation - Use of_property_present() ARM-SMMU changes: - SMMUv2: - Devicetree binding updates for Qualcomm MMU-500 implementations - Extend workarounds for broken Qualcomm hypervisor to avoid touching features that are not available (e.g. 16KiB page support, reserved context banks) - SMMUv3: - Support for NVIDIA's custom virtual command queue hardware - Fix Stage-2 stall configuration and extend tests to cover this area - A bunch of driver cleanups, including simplification of the master rbtree code - Minor cleanups and fixes across both drivers Intel VT-d changes: - Retire si_domain and convert to use static identity domain - Batched IOTLB/dev-IOTLB invalidation - Small code refactoring and cleanups AMD-Vi changes: - Cleanup and refactoring of io-pagetable code - Add parameter to limit the used io-pagesizes - Other cleanups and fixes" * tag 'iommu-updates-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (77 commits) dt-bindings: arm-smmu: Add compatible for QCS8300 SoC iommu/amd: Test for PAGING domains before freeing a domain iommu/amd: Fix argument order in amd_iommu_dev_flush_pasid_all() iommu/amd: Add kernel parameters to limit V1 page-sizes iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg iommu/arm-smmu-v3: Add types for each level of the CD table iommu/arm-smmu-v3: Shrink the cdtab l1_desc array iommu/arm-smmu-v3: Do not use devm for the cd table allocations iommu/arm-smmu-v3: Remove strtab_base/cfg iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg iommu/arm-smmu-v3: Add types for each level of the 2 level stream table iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx() iommu/arm-smmu-qcom: apply num_context_bank fixes for SDM630 / SDM660 iommu/arm-smmu-v3: Use the new rb tree helpers dt-bindings: arm-smmu: document the support on SA8255p iommu/tegra241-cmdqv: Do not allocate vcmdq until dma_set_mask_and_coherent iommu/tegra241-cmdqv: Drop static at local variable iommu/tegra241-cmdqv: Fix ioremap() error handling in probe() iommu/amd: Do not set the D bit on AMD v2 table entries iommu/amd: Correct the reported page sizes from the V1 table ...	2024-09-18 12:45:52 +02:00
Linus Torvalds	61d1ea914b	Updates for the x86 APIC code: - Handle an allocation failure in the IO/APIC code gracefully instead of crashing the machine. - Remove support for APIC local destination mode on 64bit Logical destination mode of the local APIC is used for systems with up to 8 CPUs. It has an advantage over physical destination mode as it allows to target multiple CPUs at once with IPIs. That advantage was definitely worth it when systems with up to 8 CPUs were state of the art for servers and workstations, but that's history. In the recent past there were quite some reports of new laptops failing to boot with logical destination mode, but they work fine with physical destination mode. That's not a suprise because physical destination mode is guaranteed to work as it's the only way to get a CPU up and running via the INIT/INIT/STARTUP sequence. Some of the affected systems were cured by BIOS updates, but not all OEMs provide them. As the number of CPUs keep increasing, logical destination mode becomes less used and the benefit for small systems, like laptops, is not really worth the trouble. So just remove logical destination mode support for 64bit and be done with it. - Code and comment cleanups in the APIC area. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbpL0gTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYob/VD/984H4Ku5/Djq9HkhBO11hfRTIVz/uf 1/b5ogd3eN0dK5nAv79/Gj7E/zntVsvCjuCYckXz51xPxkQH2LxUhDKqeUwg5lmz xQV0mKK4fIS/g5yymQGplKc7FfjRAnVL9ZZRRvMkvtqbr1+dA665XrfjFAPkp929 zLaBUbNC6YxYfSddsV+fE8711QP6NzCYdeEBIdZ3NuBrlGfiLy1g1OWCk8za7zjM cLJfGnU63MNXI4smrZWrQwJDBOiQl1wPbJYWL216OPHofLzLNGNZFXm4y8OJcyN0 WPWn1TliAwpRYx18Z/cEPgkoES8mXqqpPcoo0yBjOmPLl31J6QYU7QQhDb3HOnM/ ALgnnuhoWll5YjNBPJkONAa7lpnmfTbEg82WxaipEscz9CyEBoeOLvYBGPl/YqV+ B8wMOZHDH+BchJ6rYXDA1AmkD+9q86F+ddbiVOKj09dVm/QeLrGjwox1O7yGALGZ hZPQx9MsTOJqQIh40PsqFko6OiMKuMBIebacFb4NqmVA2/WbRbcmkzRyxk+kkBFv UMZX5O6sQhat615WZkxTnjmdnXETTIlv4nRQURBd/LF6ECRkXXG11dWaZfTXZ9iW 8NNlHw8mIbGmzn7wWXHlhk7N7vuhWCikAf7V2y+eZUVtE56qGM2volJNCmTZacP2 rrjmltwEGR+5gg== =Y3a/ -----END PGP SIGNATURE----- Merge tag 'x86-apic-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 APIC updates from Thomas Gleixner: - Handle an allocation failure in the IO/APIC code gracefully instead of crashing the machine. - Remove support for APIC local destination mode on 64bit Logical destination mode of the local APIC is used for systems with up to 8 CPUs. It has an advantage over physical destination mode as it allows to target multiple CPUs at once with IPIs. That advantage was definitely worth it when systems with up to 8 CPUs were state of the art for servers and workstations, but that's history. In the recent past there were quite some reports of new laptops failing to boot with logical destination mode, but they work fine with physical destination mode. That's not a suprise because physical destination mode is guaranteed to work as it's the only way to get a CPU up and running via the INIT/INIT/STARTUP sequence. Some of the affected systems were cured by BIOS updates, but not all OEMs provide them. As the number of CPUs keep increasing, logical destination mode becomes less used and the benefit for small systems, like laptops, is not really worth the trouble. So just remove logical destination mode support for 64bit and be done with it. - Code and comment cleanups in the APIC area. * tag 'x86-apic-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Fix comment on IRQ vector layout x86/apic: Remove unused extern declarations x86/apic: Remove logical destination mode for 64-bit x86/apic: Remove unused inline function apic_set_eoi_cb() x86/ioapic: Cleanup remaining coding style issues x86/ioapic: Cleanup line breaks x86/ioapic: Cleanup bracket usage x86/ioapic: Cleanup comments x86/ioapic: Move replace_pin_at_irq_node() to the call site iommu/vt-d: Cleanup apic_printk() x86/mpparse: Cleanup apic_printk()s x86/ioapic: Cleanup guarded debug printk()s x86/ioapic: Cleanup apic_printk()s x86/apic: Cleanup apic_printk()s x86/apic: Provide apic_printk() helpers x86/ioapic: Use guard() for locking where applicable x86/ioapic: Cleanup structs x86/ioapic: Mark mp_alloc_timer_irq() __init x86/ioapic: Handle allocation failures gracefully	2024-09-17 13:09:49 +02:00
Linus Torvalds	1636f57c78	ARM development updates for v6.12-rc1 - clean up TTBCR magic numbers and use u32 for this register - fix clang issue in VFP code leading to kernel oops, caused by compiler instruction scheduling. - switch 32-bit Arm to use GENERIC_CPU_DEVICES and use the arch_cpu_is_hotpluggable() hook. - pass struct device to arm_iommu_create_mapping() and move over to use iommu_paging_domain_alloc() rather than iommu_domain_alloc() - make amba_bustype constant -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmbZziAACgkQ9OeQG+St rGRC4g//WolL3+djg71a1uDj4X3XjuDStlCoj2WVrOqmoQ57BcNR+yR/H89tXtNa kWypAr6GZP9aFIS1Liprce2ypnsbNTKZjOnU4zqTiG9XTq5ddBvGw71RiA74R8Gt hDt27XpvzwAmkoLiTRK9KPrHUFRAL6jj++lq9HJBnHzIV2VD4wlUHt4P535Wn26p IO2vBWIHxsE/OBprO92nMeEskDHj3bwHGMBMdogz4mQvoet55m9q1JLtWlc5Fb5j yrHPoZ/4kYyu8ReIZMvudULMvg9XwF8LT4ferwk7qrKj/aua0xglK3h6FBzKj0EU Tsp9ic2TwVoDFg9xoV9US7+axVikxN6ZgVCqlGUZ+0jDArHlskV++cncsqfAxLw1 ga4Tb9JG3oPAj+SThLgsk8PzKK+RWXpW+n4DQsgJaI7iJhSyB6mLFwr5fud3j3mZ pMVMeXAh3GVjcdfwDtWmrUmwSLh9KMaOLIKCQm4ooL3ELVsa+UUbEHLb71ZKMa8T /H9bNkABhJHqTk7T2tBIVKV/hSnHddjg2lGLOppuo5z3DKYi6oIlbBcqBSqz69PK afkSrhWk8u7tvHI31uY9qncyjY8mQPkvXu1mYxz9t0D3w/+YM5ts45y4bLUg5wMr 8eSkmFPVBQWqeQurs490YwVcmGNL9v4gSsArI7RJzvxnj9UUVVY= =hhvp -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux Pull ARM updates from Russell King: - clean up TTBCR magic numbers and use u32 for this register - fix clang issue in VFP code leading to kernel oops, caused by compiler instruction scheduling. - switch 32-bit Arm to use GENERIC_CPU_DEVICES and use the arch_cpu_is_hotpluggable() hook. - pass struct device to arm_iommu_create_mapping() and move over to use iommu_paging_domain_alloc() rather than iommu_domain_alloc() - make amba_bustype constant * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9418/1: dma-mapping: Use iommu_paging_domain_alloc() ARM: 9417/1: dma-mapping: Pass device to arm_iommu_create_mapping() ARM: 9416/1: amba: make amba_bustype constant ARM: 9412/1: Convert to arch_cpu_is_hotpluggable() ARM: 9411/1: Switch over to GENERIC_CPU_DEVICES using arch_register_cpu() ARM: 9410/1: vfp: Use asm volatile in fmrx/fmxr macros ARM: 9409/1: mmu: Do not use magic number for TTBCR settings	2024-09-16 06:32:08 +02:00
Joerg Roedel	97162f6093	Merge branches 'fixes', 'arm/smmu', 'intel/vt-d', 'amd/amd-vi' and 'core' into next	2024-09-13 12:53:05 +02:00
Jason Gunthorpe	3ab9d8d1b5	iommu/amd: Test for PAGING domains before freeing a domain This domain free function can be called for IDENTITY and SVA domains too, and they don't have page tables. For now protect against this by checking the type. Eventually the different types should have their own free functions. Fixes: `485534bfcc` ("iommu/amd: Remove conditions from domain free paths") Reported-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/0-v1-ad9884ee5f5b+da-amd_iopgtbl_fix_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-12 09:21:40 +02:00
Eliav Bar-ilan	8386207f37	iommu/amd: Fix argument order in amd_iommu_dev_flush_pasid_all() An incorrect argument order calling amd_iommu_dev_flush_pasid_pages() causes improper flushing of the IOMMU, leaving the old value of GCR3 from a previous process attached to the same PASID. The function has the signature: void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data, ioasid_t pasid, u64 address, size_t size) Correct the argument order. Cc: stable@vger.kernel.org Fixes: `474bf01ed9` ("iommu/amd: Add support for device based TLB invalidation") Signed-off-by: Eliav Bar-ilan <eliavb@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/0-v1-fc6bc37d8208+250b-amd_pasid_flush_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-12 09:20:18 +02:00
Yi Liu	79805c1bbb	iommu: Set iommu_attach_handle->domain in core The IOMMU core sets the iommu_attach_handle->domain for the iommu_attach_group_handle() path, while the iommu_replace_group_handle() sets it on the caller side. Make the two paths aligned on it. Link: https://patch.msgid.link/r/20240908114256.979518-3-yi.l.liu@intel.com Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-09-11 20:14:07 -03:00
Yi Liu	d9dfb5e622	iommufd: Avoid duplicated __iommu_group_set_core_domain() call For the fault-capable hwpts, the iommufd_hwpt_detach_device() calls both iommufd_fault_domain_detach_dev() and iommu_detach_group(). This would have duplicated __iommu_group_set_core_domain() call since both functions call it in the end. This looks no harm as the __iommu_group_set_core_domain() returns if the new domain equals to the existing one. But it makes sense to avoid such duplicated calls in caller side. Link: https://patch.msgid.link/r/20240908114256.979518-2-yi.l.liu@intel.com Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-09-11 20:14:07 -03:00
Joerg Roedel	f0295913c4	iommu/amd: Add kernel parameters to limit V1 page-sizes Add two new kernel command line parameters to limit the page-sizes used for v1 page-tables: nohugepages - Limits page-sizes to 4KiB v2_pgsizes_only - Limits page-sizes to 4Kib/2Mib/1GiB; The same as the sizes used with v2 page-tables This is needed for multiple scenarios. When assigning devices to SEV-SNP guests the IOMMU page-sizes need to match the sizes in the RMP table, otherwise the device will not be able to access all shared memory. Also, some ATS devices do not work properly with arbitrary IO page-sizes as supported by AMD-Vi, so limiting the sizes used by the driver is a suitable workaround. All-in-all, these parameters are only workarounds until the IOMMU core and related APIs gather the ability to negotiate the page-sizes in a better way. Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240905072240.253313-1-joro@8bytes.org	2024-09-10 11:48:57 +02:00
Kan Liang	a8c73b82f7	iommu/vt-d: Clean up cpumask and hotplug for perfmon The iommu PMU is system-wide scope, which is supported by the generic perf_event subsystem now. Set the scope for the iommu PMU and remove all the cpumask and hotplug codes. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20240802151643.1691631-5-kan.liang@linux.intel.com	2024-09-10 11:44:13 +02:00
Jason Gunthorpe	e3b1be2e73	iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg The members here are being used for both the linear and the 2 level case, with the meaning of each item slightly different in the two cases. Split it into a clean union where both cases have their own struct with their own logical names and correct types. Adjust all the users to detect linear/2lvl and use the right sub structure and types consistently. Remove CTXDESC_CD_DWORDS by changing the last places to use sizeof(struct arm_smmu_cd). Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/8-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:15 +01:00
Jason Gunthorpe	7c567eb1e1	iommu/arm-smmu-v3: Add types for each level of the CD table As well as indexing helpers arm_smmu_cdtab_l1/2_idx(). Remove CTXDESC_L1_DESC_DWORDS and CTXDESC_CD_DWORDS replacing them all with type specific calculations. Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:15 +01:00
Jason Gunthorpe	c0a25a96de	iommu/arm-smmu-v3: Shrink the cdtab l1_desc array The top of the 2 level CD table is (at most) 1024 entries big, and two high order allocations are required. One of __le64 which is programmed into the HW (8k) and one of struct arm_smmu_l1_ctx_desc which holds the CPU pointer (16k). There are two copies of the l2ptr_dma, one is stored in the struct arm_smmu_l1_ctx_desc, and another is encoded in the __le64 for the HW to use. Instead of storing two copies just decode the value from the __le64. Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:15 +01:00
Jason Gunthorpe	47b2de35ca	iommu/arm-smmu-v3: Do not use devm for the cd table allocations The master->cd_table is entirely contained within the struct arm_smmu_master which is guaranteed to be freed by the core code under arm_smmu_release_device(). There is no reason to use devm here, arm_smmu_free_cd_tables() is reliably called to free the CD related memory. Remove it and save some memory. Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:14 +01:00
Jason Gunthorpe	8c153ef956	iommu/arm-smmu-v3: Remove strtab_base/cfg These values can be computed from the other values already stored in the config. Move the calculation to arm_smmu_write_strtab() and do it directly before writing the registers. This moves all the logic to calculate the two registers into one function from three and saves an unimportant 16 bytes from the arm_smmu_device. Suggested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:14 +01:00
Jason Gunthorpe	85196f5474	iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg The members here are being used for both the linear and the 2 level case, with the meaning of each item slightly different in the two cases. Split it into a clean union where both cases have their own struct with their own logical names and correct types. Adjust all the users to detect linear/2lvl and use the right sub structure and types consistently. Remove STRTAB_STE_DWORDS by changing the last places to use sizeof(struct arm_smmu_ste). Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:14 +01:00
Jason Gunthorpe	abb4f9d323	iommu/arm-smmu-v3: Add types for each level of the 2 level stream table Add types struct arm_smmu_strtab_l1 and l2 to represent the HW layout of the descriptors, and use them in most places, following patches will get the remaing places. The size of the l1 and l2 HW allocations are sizeof(struct arm_smmu_strtab_l1/2). This provides some more clarity than having raw __le64 *'s and sizes computed via macros. Remove STRTAB_L1_DESC_DWORDS. Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:14 +01:00
Jason Gunthorpe	ce410410f1	iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx() Don't open code the calculations of the indexes for each level, provide two functions to do that math and call them in all the places. Update all the places computing indexes. Calculate the L1 table size directly based on the max required index from the cap. Remove STRTAB_L1_SZ_SHIFT in favour of STRTAB_NUM_L2_STES. Use STRTAB_NUM_L2_STES to replace remaining open coded 1 << STRTAB_SPLIT. Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:47:14 +01:00
Dmitry Baryshkov	19eb465c96	iommu/arm-smmu-qcom: apply num_context_bank fixes for SDM630 / SDM660 The Qualcomm SDM630 / SDM660 platform requires the same kind of workaround as MSM8998: some IOMMUs have context banks reserved by firmware / TZ, touching those banks resets the board. Apply the num_context_bank workaround to those two SMMU devices in order to allow them to be used by Linux. Fixes: `b812834b53` ("iommu: arm-smmu-qcom: Add sdm630/msm8998 compatibles for qcom quirks") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20240907-sdm660-wifi-v1-1-e316055142f8@linaro.org Signed-off-by: Will Deacon <will@kernel.org>	2024-09-09 15:21:56 +01:00
Jason Gunthorpe	a2bb820e86	iommu/arm-smmu-v3: Use the new rb tree helpers Since v5.12 the rbtree has gained some simplifying helpers aimed at making rb tree users write less convoluted boiler plate code. Instead the caller provides a single comparison function and the helpers generate the prior open-coded stuff. Update smmu->streams to use rb_find_add() and rb_find(). Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v3-9fef8cdc2ff6+150d1-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-06 14:23:49 +01:00
Jason Gunthorpe	8f6887349b	iommufd: Protect against overflow of ALIGN() during iova allocation Userspace can supply an iova and uptr such that the target iova alignment becomes really big and ALIGN() overflows which corrupts the selected area range during allocation. CONFIG_IOMMUFD_TEST can detect this: WARNING: CPU: 1 PID: 5092 at drivers/iommu/iommufd/io_pagetable.c:268 iopt_alloc_area_pages drivers/iommu/iommufd/io_pagetable.c:268 [inline] WARNING: CPU: 1 PID: 5092 at drivers/iommu/iommufd/io_pagetable.c:268 iopt_map_pages+0xf95/0x1050 drivers/iommu/iommufd/io_pagetable.c:352 Modules linked in: CPU: 1 PID: 5092 Comm: syz-executor294 Not tainted 6.10.0-rc5-syzkaller-00294-g3ffea9a7a6f7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024 RIP: 0010:iopt_alloc_area_pages drivers/iommu/iommufd/io_pagetable.c:268 [inline] RIP: 0010:iopt_map_pages+0xf95/0x1050 drivers/iommu/iommufd/io_pagetable.c:352 Code: fc e9 a4 f3 ff ff e8 1a 8b 4c fc 41 be e4 ff ff ff e9 8a f3 ff ff e8 0a 8b 4c fc 90 0f 0b 90 e9 37 f5 ff ff e8 fc 8a 4c fc 90 <0f> 0b 90 e9 68 f3 ff ff 48 c7 c1 ec 82 ad 8f 80 e1 07 80 c1 03 38 RSP: 0018:ffffc90003ebf9e0 EFLAGS: 00010293 RAX: ffffffff85499fa4 RBX: 00000000ffffffef RCX: ffff888079b49e00 RDX: 0000000000000000 RSI: 00000000ffffffef RDI: 0000000000000000 RBP: ffffc90003ebfc50 R08: ffffffff85499b30 R09: ffffffff85499942 R10: 0000000000000002 R11: ffff888079b49e00 R12: ffff8880228e0010 R13: 0000000000000000 R14: 1ffff920007d7f68 R15: ffffc90003ebfd00 FS: 000055557d760380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000005fdeb8 CR3: 000000007404a000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> iommufd_ioas_copy+0x610/0x7b0 drivers/iommu/iommufd/ioas.c:274 iommufd_fops_ioctl+0x4d9/0x5a0 drivers/iommu/iommufd/main.c:421 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Cap the automatic alignment to the huge page size, which is probably a better idea overall. Huge automatic alignments can fragment and chew up the available IOVA space without any reason. Link: https://patch.msgid.link/r/0-v1-8009738b9891+1f7-iommufd_align_overflow_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: `51fe6141f0` ("iommufd: Data structure to provide IOVA to PFN mapping") Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reported-by: syzbot+16073ebbc4c64b819b47@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/000000000000388410061a74f014@google.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-09-05 13:54:41 -03:00
Nicolin Chen	483e0bd888	iommu/tegra241-cmdqv: Do not allocate vcmdq until dma_set_mask_and_coherent It's observed that, when the first 4GB of system memory was reserved, all VCMDQ allocations failed (even with the smallest qsz in the last attempt): arm-smmu-v3: found companion CMDQV device: NVDA200C:00 arm-smmu-v3: option mask 0x10 arm-smmu-v3: failed to allocate queue (0x8000 bytes) for vcmdq0 acpi NVDA200C:00: tegra241_cmdqv: Falling back to standard SMMU CMDQ arm-smmu-v3: ias 48-bit, oas 48-bit (features 0x001e1fbf) arm-smmu-v3: allocated 524288 entries for cmdq arm-smmu-v3: allocated 524288 entries for evtq arm-smmu-v3: allocated 524288 entries for priq This is because the 4GB reserved memory shifted the entire DMA zone from a lower 32-bit range (on a system without the 4GB carveout) to higher range, while the dev->coherent_dma_mask was set to DMA_BIT_MASK(32) by default. The dma_set_mask_and_coherent() call is done in arm_smmu_device_hw_probe() of the SMMU driver. So any DMA allocation from tegra241_cmdqv_probe() must wait until the coherent_dma_mask is correctly set. Move the vintf/vcmdq structure initialization routine into a different op, "init_structures". Call it at the end of arm_smmu_init_structures(), where standard SMMU queues get allocated. Most of the impl_ops aren't ready until vintf/vcmdq structure are init-ed. So replace the full impl_ops with an init_ops in __tegra241_cmdqv_probe(). And switch to tegra241_cmdqv_impl_ops later in arm_smmu_init_structures(). Note that tegra241_cmdqv_impl_ops does not link to the new init_structures op after this switch, since there is no point in having it once it's done. Fixes: `918eb5c856` ("iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV") Reported-by: Matt Ochs <mochs@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/530993c3aafa1b0fc3d879b8119e13c629d12e2b.1725503154.git.nicolinc@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-05 16:07:38 +01:00
Nicolin Chen	2408b81f81	iommu/tegra241-cmdqv: Drop static at local variable This is likely a typo. Drop it. Fixes: `918eb5c856` ("iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV") Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/13fd3accb5b7ed6ec11cc6b7435f79f84af9f45f.1725503154.git.nicolinc@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2024-09-05 16:07:38 +01:00
Jason Gunthorpe	73183ad6ea	iommufd: Check the domain owner of the parent before creating a nesting domain This check was missed, before we can pass a struct iommu_domain to a driver callback we need to validate that the domain was created by that driver. Fixes: `bd529dbb66` ("iommufd: Add a nested HW pagetable object") Link: https://patch.msgid.link/r/0-v1-c8770519edde+1a-iommufd_nesting_ops_jgg@nvidia.com Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-09-05 11:25:40 -03:00
Dan Carpenter	086a3c40eb	iommu/tegra241-cmdqv: Fix ioremap() error handling in probe() The ioremap() function doesn't return error pointers, it returns NULL on error so update the error handling. Also just return directly instead of calling iounmap() on the NULL pointer. Calling iounmap(NULL) doesn't cause a problem on ARM but on other architectures it can trigger a warning so it'a bad habbit. Fixes: `918eb5c856` ("iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/5a6c1e9a-0724-41b1-86d4-36335d3768ea@stanley.mountain Signed-off-by: Will Deacon <will@kernel.org>	2024-09-04 16:42:49 +01:00
Jason Gunthorpe	9e8354b399	ARM: 9417/1: dma-mapping: Pass device to arm_iommu_create_mapping() All users of ARM IOMMU mappings create them for a particular device, so change the interface to accept the device rather than forcing a vague indirection through a bus type. This prepares for making a similar change to iommu_domain_alloc() itself. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	2024-09-04 15:02:07 +01:00
Jason Gunthorpe	2910a7fa1b	iommu/amd: Do not set the D bit on AMD v2 table entries The manual says that bit 6 is IGN for all Page-Table Base Address pointers, don't set it. Fixes: `aaac38f614` ("iommu/amd: Initial support for AMD IOMMU v2 page table") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/14-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:39:03 +02:00
Jason Gunthorpe	7e51586629	iommu/amd: Correct the reported page sizes from the V1 table The HW only has 52 bits of physical address support, the supported page sizes should not have bits set beyond this. Further the spec says that the 6th level does not support any "default page size for translation entries" meaning leafs in the 6th level are not allowed too. Rework the definition to use GENMASK to build the range of supported pages from the top of physical to 4k. Nothing ever uses such large pages, so this is a cosmetic/documentation improvement only. Reported-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/13-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:39:03 +02:00
Jason Gunthorpe	c435209f72	iommu/amd: Remove the confusing dummy iommu_flush_ops tlb ops The iommu driver is supposed to provide these ops to its io_pgtable implementation so that it can hook the invalidations and do the right thing. They are called by wrapper functions like io_pgtable_tlb_add_page() etc, which the AMD code never calls. Instead it directly calls the AMD IOMMU invalidation functions by casting to the struct protection_domain. Remove it all. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/12-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:39:02 +02:00
Jason Gunthorpe	a06dcb6b78	iommu/amd: Fix typo of , instead of ; Generates the same code, but is not the expected C style. Fixes: `aaac38f614` ("iommu/amd: Initial support for AMD IOMMU v2 page table") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/11-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:39:02 +02:00
Jason Gunthorpe	485534bfcc	iommu/amd: Remove conditions from domain free paths Don't use tlb as some flag to indicate if protection_domain_alloc() completed. Have protection_domain_alloc() unwind itself in the normal kernel style and require protection_domain_free() only be called on successful results of protection_domain_alloc(). Also, the amd_iommu_domain_free() op is never called by the core code with a NULL argument, so remove all the NULL tests as well. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/10-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:39:01 +02:00
Jason Gunthorpe	9ac0b3380a	iommu/amd: Narrow the use of struct protection_domain to invalidation The AMD io_pgtable stuff doesn't implement the tlb ops callbacks, instead it invokes the invalidation ops directly on the struct protection_domain. Narrow the use of struct protection_domain to only those few code paths. Make everything else properly use struct amd_io_pgtable through the call chains, which is the correct modular type for an io-pgtable module. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/9-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:39:00 +02:00
Jason Gunthorpe	47f218d108	iommu/amd: Store the nid in io_pgtable_cfg instead of the domain We already have memory in the union here that is being wasted in AMD's case, use it to store the nid. Putting the nid here further isolates the io_pgtable code from the struct protection_domain. Fixup protection_domain_alloc so that the NID from the device is provided, at this point dev is never NULL for AMD so this will now allocate the first table pointer on the correct NUMA node. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/8-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:38:34 +02:00
Jason Gunthorpe	977fc27ca7	iommu/amd: Remove amd_io_pgtable::pgtbl_cfg This struct is already in iop.cfg, we don't need two. AMD is using this API sort of wrong, the cfg is supposed to be passed in and then the allocation function will allocate ops memory and copy the passed config into the new memory. Keep it kind of wrong and pass in the cfg memory that is already part of the pagetable struct. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:38:33 +02:00
Jason Gunthorpe	670b57796c	iommu/amd: Rename struct amd_io_pgtable iopt to pgtbl There is struct protection_domain iopt and struct amd_io_pgtable iopt. Next patches are going to want to write domain.iopt.iopt.xx which is quite unnatural to read. Give one of them a different name, amd_io_pgtable has fewer references so call it pgtbl, to match pgtbl_cfg, instead. Suggested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:38:32 +02:00
Jason Gunthorpe	1ed2d21d47	iommu/amd: Remove the amd_iommu_domain_set_pt_root() and related Looks like many refactorings here have left this confused. There is only one storage of the root/mode, it is in the iop struct. increase_address_space() calls amd_iommu_domain_set_pgtable() with values that it already stored in iop a few lines above. amd_iommu_domain_clr_pt_root() is zero'ing memory we are about to free. It used to protect against a double free of root, but that is gone now. Remove amd_iommu_domain_set_pgtable(), amd_iommu_domain_set_pt_root(), amd_iommu_domain_clr_pt_root() as they are all pointless. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:38:31 +02:00
Jason Gunthorpe	322d889ae7	iommu/amd: Remove amd_iommu_domain_update() from page table freeing It is a serious bug if the domain is still mapped to any DTEs when it is freed as we immediately start freeing page table memory, so any remaining HW touch will UAF. If it is not mapped then dev_list is empty and amd_iommu_domain_update() does nothing. Remove it and add a WARN_ON() to catch this class of bug. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:37:43 +02:00
Jason Gunthorpe	7a41dcb52f	iommu/amd: Set the pgsize_bitmap correctly When using io_pgtable the correct pgsize_bitmap is stored in the cfg, both v1_alloc_pgtable() and v2_alloc_pgtable() set it correctly. This fixes a bug where the v2 pgtable had the wrong pgsize as protection_domain_init_v2() would set it and then do_iommu_domain_alloc() immediately resets it. Remove the confusing ops.pgsize_bitmap since that is not used if the driver sets domain.pgsize_bitmap. Fixes: `134288158a` ("iommu/amd: Add domain_alloc_user based domain allocation") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:37:42 +02:00
Jason Gunthorpe	b0a6c883bc	iommu/amd: Allocate the page table root using GFP_KERNEL Domain allocation is always done under a sleepable context, the v1 path and other drivers use GFP_KERNEL already. Fix the v2 path to also use GFP_KERNEL. Fixes: `0d571dcbe7` ("iommu/amd: Allocate page table using numa locality info") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-09-04 11:37:42 +02:00

1 2 3 4 5 ...

5428 Commits