mirror of
https://github.com/torvalds/linux.git
synced 2024-11-10 22:21:40 +00:00
8d3ae46c64
6050 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Linus Torvalds
|
61307b7be4 |
The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs. Notable series include: - Lucas Stach has provided some page-mapping cleanup/consolidation/maintainability work in the series "mm/treewide: Remove pXd_huge() API". - In the series "Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one test. - In their series "Memory allocation profiling" Kent Overstreet and Suren Baghdasaryan have contributed a means of determining (via /proc/allocinfo) whereabouts in the kernel memory is being allocated: number of calls and amount of memory. - Matthew Wilcox has provided the series "Various significant MM patches" which does a number of rather unrelated things, but in largely similar code sites. - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes Weiner has fixed the page allocator's handling of migratetype requests, with resulting improvements in compaction efficiency. - In the series "make the hugetlb migration strategy consistent" Baolin Wang has fixed a hugetlb migration issue, which should improve hugetlb allocation reliability. - Liu Shixin has hit an I/O meltdown caused by readahead in a memory-tight memcg. Addressed in the series "Fix I/O high when memory almost met memcg limit". - In the series "mm/filemap: optimize folio adding and splitting" Kairui Song has optimized pagecache insertion, yielding ~10% performance improvement in one test. - Baoquan He has cleaned up and consolidated the early zone initialization code in the series "mm/mm_init.c: refactor free_area_init_core()". - Baoquan has also redone some MM initializatio code in the series "mm/init: minor clean up and improvement". - MM helper cleanups from Christoph Hellwig in his series "remove follow_pfn". - More cleanups from Matthew Wilcox in the series "Various page->flags cleanups". - Vlastimil Babka has contributed maintainability improvements in the series "memcg_kmem hooks refactoring". - More folio conversions and cleanups in Matthew Wilcox's series "Convert huge_zero_page to huge_zero_folio" "khugepaged folio conversions" "Remove page_idle and page_young wrappers" "Use folio APIs in procfs" "Clean up __folio_put()" "Some cleanups for memory-failure" "Remove page_mapping()" "More folio compat code removal" - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb functions to work on folis". - Code consolidation and cleanup work related to GUP's handling of hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2". - Rick Edgecombe has developed some fixes to stack guard gaps in the series "Cover a guard gap corner case". - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series "mm/ksm: fix ksm exec support for prctl". - Baolin Wang has implemented NUMA balancing for multi-size THPs. This is a simple first-cut implementation for now. The series is "support multi-size THP numa balancing". - Cleanups to vma handling helper functions from Matthew Wilcox in the series "Unify vma_address and vma_pgoff_address". - Some selftests maintenance work from Dev Jain in the series "selftests/mm: mremap_test: Optimizations and style fixes". - Improvements to the swapping of multi-size THPs from Ryan Roberts in the series "Swap-out mTHP without splitting". - Kefeng Wang has significantly optimized the handling of arm64's permission page faults in the series "arch/mm/fault: accelerate pagefault when badaccess" "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS" - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it GUP-fast". - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to use struct vm_fault". - selftests build fixes from John Hubbard in the series "Fix selftests/mm build without requiring "make headers"". - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the series "Improved Memory Tier Creation for CPUless NUMA Nodes". Fixes the initialization code so that migration between different memory types works as intended. - David Hildenbrand has improved follow_pte() and fixed an errant driver in the series "mm: follow_pte() improvements and acrn follow_pte() fixes". - David also did some cleanup work on large folio mapcounts in his series "mm: mapcount for large folios + page_mapcount() cleanups". - Folio conversions in KSM in Alex Shi's series "transfer page to folio in KSM". - Barry Song has added some sysfs stats for monitoring multi-size THP's in the series "mm: add per-order mTHP alloc and swpout counters". - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled and limit checking cleanups". - Matthew Wilcox has been looking at buffer_head code and found the documentation to be lacking. The series is "Improve buffer head documentation". - Multi-size THPs get more work, this time from Lance Yang. His series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes the freeing of these things. - Kemeng Shi has added more userspace-visible writeback instrumentation in the series "Improve visibility of writeback". - Kemeng Shi then sent some maintenance work on top in the series "Fix and cleanups to page-writeback". - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the series "Improve anon_vma scalability for anon VMAs". Intel's test bot reported an improbable 3x improvement in one test. - SeongJae Park adds some DAMON feature work in the series "mm/damon: add a DAMOS filter type for page granularity access recheck" "selftests/damon: add DAMOS quota goal test" - Also some maintenance work in the series "mm/damon/paddr: simplify page level access re-check for pageout" "mm/damon: misc fixes and improvements" - David Hildenbrand has disabled some known-to-fail selftests ni the series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL". - memcg metadata storage optimizations from Shakeel Butt in "memcg: reduce memory consumption by memcg stats". - DAX fixes and maintenance work from Vishal Verma in the series "dax/bus.c: Fixups for dax-bus locking". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB nvA4E0DcPrUAFy144FNM0NTCb7u9vAw= =V3R/ -----END PGP SIGNATURE----- Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull mm updates from Andrew Morton: "The usual shower of singleton fixes and minor series all over MM, documented (hopefully adequately) in the respective changelogs. Notable series include: - Lucas Stach has provided some page-mapping cleanup/consolidation/ maintainability work in the series "mm/treewide: Remove pXd_huge() API". - In the series "Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one test. - In their series "Memory allocation profiling" Kent Overstreet and Suren Baghdasaryan have contributed a means of determining (via /proc/allocinfo) whereabouts in the kernel memory is being allocated: number of calls and amount of memory. - Matthew Wilcox has provided the series "Various significant MM patches" which does a number of rather unrelated things, but in largely similar code sites. - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes Weiner has fixed the page allocator's handling of migratetype requests, with resulting improvements in compaction efficiency. - In the series "make the hugetlb migration strategy consistent" Baolin Wang has fixed a hugetlb migration issue, which should improve hugetlb allocation reliability. - Liu Shixin has hit an I/O meltdown caused by readahead in a memory-tight memcg. Addressed in the series "Fix I/O high when memory almost met memcg limit". - In the series "mm/filemap: optimize folio adding and splitting" Kairui Song has optimized pagecache insertion, yielding ~10% performance improvement in one test. - Baoquan He has cleaned up and consolidated the early zone initialization code in the series "mm/mm_init.c: refactor free_area_init_core()". - Baoquan has also redone some MM initializatio code in the series "mm/init: minor clean up and improvement". - MM helper cleanups from Christoph Hellwig in his series "remove follow_pfn". - More cleanups from Matthew Wilcox in the series "Various page->flags cleanups". - Vlastimil Babka has contributed maintainability improvements in the series "memcg_kmem hooks refactoring". - More folio conversions and cleanups in Matthew Wilcox's series: "Convert huge_zero_page to huge_zero_folio" "khugepaged folio conversions" "Remove page_idle and page_young wrappers" "Use folio APIs in procfs" "Clean up __folio_put()" "Some cleanups for memory-failure" "Remove page_mapping()" "More folio compat code removal" - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb functions to work on folis". - Code consolidation and cleanup work related to GUP's handling of hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2". - Rick Edgecombe has developed some fixes to stack guard gaps in the series "Cover a guard gap corner case". - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series "mm/ksm: fix ksm exec support for prctl". - Baolin Wang has implemented NUMA balancing for multi-size THPs. This is a simple first-cut implementation for now. The series is "support multi-size THP numa balancing". - Cleanups to vma handling helper functions from Matthew Wilcox in the series "Unify vma_address and vma_pgoff_address". - Some selftests maintenance work from Dev Jain in the series "selftests/mm: mremap_test: Optimizations and style fixes". - Improvements to the swapping of multi-size THPs from Ryan Roberts in the series "Swap-out mTHP without splitting". - Kefeng Wang has significantly optimized the handling of arm64's permission page faults in the series "arch/mm/fault: accelerate pagefault when badaccess" "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS" - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it GUP-fast". - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to use struct vm_fault". - selftests build fixes from John Hubbard in the series "Fix selftests/mm build without requiring "make headers"". - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the series "Improved Memory Tier Creation for CPUless NUMA Nodes". Fixes the initialization code so that migration between different memory types works as intended. - David Hildenbrand has improved follow_pte() and fixed an errant driver in the series "mm: follow_pte() improvements and acrn follow_pte() fixes". - David also did some cleanup work on large folio mapcounts in his series "mm: mapcount for large folios + page_mapcount() cleanups". - Folio conversions in KSM in Alex Shi's series "transfer page to folio in KSM". - Barry Song has added some sysfs stats for monitoring multi-size THP's in the series "mm: add per-order mTHP alloc and swpout counters". - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled and limit checking cleanups". - Matthew Wilcox has been looking at buffer_head code and found the documentation to be lacking. The series is "Improve buffer head documentation". - Multi-size THPs get more work, this time from Lance Yang. His series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes the freeing of these things. - Kemeng Shi has added more userspace-visible writeback instrumentation in the series "Improve visibility of writeback". - Kemeng Shi then sent some maintenance work on top in the series "Fix and cleanups to page-writeback". - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the series "Improve anon_vma scalability for anon VMAs". Intel's test bot reported an improbable 3x improvement in one test. - SeongJae Park adds some DAMON feature work in the series "mm/damon: add a DAMOS filter type for page granularity access recheck" "selftests/damon: add DAMOS quota goal test" - Also some maintenance work in the series "mm/damon/paddr: simplify page level access re-check for pageout" "mm/damon: misc fixes and improvements" - David Hildenbrand has disabled some known-to-fail selftests ni the series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL". - memcg metadata storage optimizations from Shakeel Butt in "memcg: reduce memory consumption by memcg stats". - DAX fixes and maintenance work from Vishal Verma in the series "dax/bus.c: Fixups for dax-bus locking"" * tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits) memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault selftests: cgroup: add tests to verify the zswap writeback path mm: memcg: make alloc_mem_cgroup_per_node_info() return bool mm/damon/core: fix return value from damos_wmark_metric_value mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED selftests: cgroup: remove redundant enabling of memory controller Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT Docs/mm/damon/design: use a list for supported filters Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file selftests/damon: classify tests for functionalities and regressions selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None' selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts selftests/damon/_damon_sysfs: check errors from nr_schemes file reads mm/damon/core: initialize ->esz_bp from damos_quota_init_priv() selftests/damon: add a test for DAMOS quota goal ... |
||
Linus Torvalds
|
ff9a79307f |
Kbuild updates for v6.10
- Avoid 'constexpr', which is a keyword in C23 - Allow 'dtbs_check' and 'dt_compatible_check' run independently of 'dt_binding_check' - Fix weak references to avoid GOT entries in position-independent code generation - Convert the last use of 'optional' property in arch/sh/Kconfig - Remove support for the 'optional' property in Kconfig - Remove support for Clang's ThinLTO caching, which does not work with the .incbin directive - Change the semantics of $(src) so it always points to the source directory, which fixes Makefile inconsistencies between upstream and downstream - Fix 'make tar-pkg' for RISC-V to produce a consistent package - Provide reasonable default coverage for objtool, sanitizers, and profilers - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc. - Remove the last use of tristate choice in drivers/rapidio/Kconfig - Various cleanups and fixes in Kconfig -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmZFlGcVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsG8voQALC8NtFpduWVfLRj2Qg6Ll/xf1vX 2igcTJEOFHkeqXLGoT8dTDKLEipUBUvKyguPq66CGwVTe2g6zy/nUSXeVtFrUsIa msLTi8FqhqUo5lodNvGMRf8qqmuqcvnXoiQwIocF92jtsFy14bhiFY+n4HfcFNjj GOKwqBZYQUwY/VVb090efc7RfS9c7uwABJSBelSoxg3AGZriwjGy7Pw5aSKGgVYi inqL1eR6qwPP6z7CgQWM99soP+zwybFZmnQrsD9SniRBI4rtAat8Ih5jQFaSUFUQ lk2w0NQBRFN88/uR2IJ2GWuIlQ74WeJ+QnCqVuQ59tV5zw90wqSmLzngfPD057Dv JjNuhk0UyXVtpIg3lRtd4810ppNSTe33b9OM4O2H846W/crju5oDRNDHcflUXcwm Rmn5ho1rb5QVzDVejJbgwidnUInSgJ9PZcvXQ/RJVZPhpgsBzAY9pQexG1G3hviw y9UDrt6KP6bF9tHjmolmtdIes9Pj0c4dN6/Rdj4HS4hIQ/GDar0tnwvOvtfUctNL orJlBsA6GeMmDVXKkR0ytOCWRYqWWbyt8g70RVKQJfuHX7/hGyAQPaQ2/u4mQhC2 aevYfbNJMj0VDfGz81HDBKFtkc5n+Ite8l157dHEl2LEabkOkRdNVcn7SNbOvZmd ZCSnZ31h7woGfNho =D5B/ -----END PGP SIGNATURE----- Merge tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Avoid 'constexpr', which is a keyword in C23 - Allow 'dtbs_check' and 'dt_compatible_check' run independently of 'dt_binding_check' - Fix weak references to avoid GOT entries in position-independent code generation - Convert the last use of 'optional' property in arch/sh/Kconfig - Remove support for the 'optional' property in Kconfig - Remove support for Clang's ThinLTO caching, which does not work with the .incbin directive - Change the semantics of $(src) so it always points to the source directory, which fixes Makefile inconsistencies between upstream and downstream - Fix 'make tar-pkg' for RISC-V to produce a consistent package - Provide reasonable default coverage for objtool, sanitizers, and profilers - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc. - Remove the last use of tristate choice in drivers/rapidio/Kconfig - Various cleanups and fixes in Kconfig * tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (46 commits) kconfig: use sym_get_choice_menu() in sym_check_prop() rapidio: remove choice for enumeration kconfig: lxdialog: remove initialization with A_NORMAL kconfig: m/nconf: merge two item_add_str() calls kconfig: m/nconf: remove dead code to display value of bool choice kconfig: m/nconf: remove dead code to display children of choice members kconfig: gconf: show checkbox for choice correctly kbuild: use GCOV_PROFILE and KCSAN_SANITIZE in scripts/Makefile.modfinal Makefile: remove redundant tool coverage variables kbuild: provide reasonable defaults for tool coverage modules: Drop the .export_symbol section from the final modules kconfig: use menu_list_for_each_sym() in sym_check_choice_deps() kconfig: use sym_get_choice_menu() in conf_write_defconfig() kconfig: add sym_get_choice_menu() helper kconfig: turn defaults and additional prompt for choice members into error kconfig: turn missing prompt for choice members into error kconfig: turn conf_choice() into void function kconfig: use linked list in sym_set_changed() kconfig: gconf: use MENU_CHANGED instead of SYMBOL_CHANGED kconfig: gconf: remove debug code ... |
||
Linus Torvalds
|
2fc0e7892c |
Landlock updates for v6.10-rc1
-----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZkYGahAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSaxUBAL1USJI2DUo+hJTDHqXdVZGEE6VfYnmEhIur NZfsti4UAQCabAej7+NHzh1jAnnKMJFbcyl4LRy99anJGlEWq9ExDA== =fViz -----END PGP SIGNATURE----- Merge tag 'landlock-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This brings ioctl control to Landlock, contributed by Günther Noack. This also adds him as a Landlock reviewer, and fixes an issue in the sample" * tag 'landlock-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: MAINTAINERS: Add Günther Noack as Landlock reviewer fs/ioctl: Add a comment to keep the logic in sync with LSM policies MAINTAINERS: Notify Landlock maintainers about changes to fs/ioctl.c landlock: Document IOCTL support samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL_DEV selftests/landlock: Exhaustive test for the IOCTL allow-list selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets selftests/landlock: Test IOCTLs on named pipes selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH) selftests/landlock: Test IOCTL with memfds selftests/landlock: Test IOCTL support landlock: Add IOCTL access right for character and block devices samples/landlock: Fix incorrect free in populate_ruleset_net |
||
Linus Torvalds
|
353ad6c083 |
integrity-v6.10
-----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZkQD2xQcem9oYXJAbGlu dXguaWJtLmNvbQAKCRDLwZzRsCrn5b81APwINCoiLDB0L9KkyUhQjqvLLZCS85u9 Qo3/wdUGAb2tygD/RYKJgtGxvqO1Ap1IysytKHId0yo+vokdXG5stpMPJQE= =dIrw -----END PGP SIGNATURE----- Merge tag 'integrity-v6.10' of ssh://ra.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Two IMA changes, one EVM change, a use after free bug fix, and a code cleanup to address "-Wflex-array-member-not-at-end" warnings: - The existing IMA {ascii, binary}_runtime_measurements lists include a hard coded SHA1 hash. To address this limitation, define per TPM enabled hash algorithm {ascii, binary}_runtime_measurements lists - Close an IMA integrity init_module syscall measurement gap by defining a new critical-data record - Enable (partial) EVM support on stacked filesystems (overlayfs). Only EVM portable & immutable file signatures are copied up, since they do not contain filesystem specific metadata" * tag 'integrity-v6.10' of ssh://ra.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: add crypto agility support for template-hash algorithm evm: Rename is_unsupported_fs to is_unsupported_hmac_fs fs: Rename SB_I_EVM_UNSUPPORTED to SB_I_EVM_HMAC_UNSUPPORTED evm: Enforce signatures on unsupported filesystem for EVM_INIT_X509 ima: re-evaluate file integrity on file metadata change evm: Store and detect metadata inode attributes changes ima: Move file-change detection variables into new structure evm: Use the metadata inode to calculate metadata hash evm: Implement per signature type decision in security_inode_copy_up_xattr security: allow finer granularity in permitting copy-up of security xattrs ima: Rename backing_inode to real_inode integrity: Avoid -Wflex-array-member-not-at-end warnings ima: define an init_module critical data record ima: Fix use-after-free on a dentry's dname.name |
||
Linus Torvalds
|
ccae19c623 |
selinux/stable-6.10 PR 20240513
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmZCWd4UHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMFMQ/8DlQhbe+NzeAjBg2O7RqNKoE/UU2l JAkzgQ7h35/sgsfDIEhPv0VOzq2cwLSaw5/qSGGrQDWR4AKz55HvBSJQDElgzPo2 wtL2lozjJ03+P391aeXEHdkqzgp2F3c6oNthIcFvRmi3EVcWr7eqnjWU+bJzFtMA z4IgX01yiUF6EMdTJcWjSLclk3dOqHWPV657RadYyp2pr3u9rc9gqgxichs0dQW0 X/vexUBg81TdSeHS+u9Jgrz+TCsQdP3CHSFibXDv1rX89bqCbG9wYNTyTBHH9f0j LPpV2bKaesxIZV6dZyIKKSqCtHGBcjcJEjUbl7JFkQOS/a8NPMDH2ZYkHqbyq4BH SX+kRhthmXZiZ2IWhMRikaE4FBIWI7JAC4riKB2Dd0RzJrNEgaawU0wM+evKai9F s3GCk/P7wnTT4GlDHSkYEiiRF9c3DNkfVb9mqG78PRMR4X+z/0dsX5iaJTwlyMxo 69JSoTNzMxVdDtvD+K89V7GCm8NmGFjLwQUqb282L/6ibzC/5CCUQXn8ZWvNRPql IUoQhKbveCWR3m9vAMeGpE61oNgFp0mHj3Zb/vE5Yu7HxOn2y1lFv2XGxR6bGyTF T53AaHM15uFomhqhG+iM3b4FDhNIJpbUc4MJsq3iMgFC941n1ceXyeBEgf31+Z2x k1ZlxqLAWn05auA= =2Wbi -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20240513' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Attempt to pre-allocate the SELinux status page so it doesn't appear to userspace that we are skipping SELinux policy sequence numbers - Reject invalid SELinux policy bitmaps with an error at policy load time - Consistently use the same type, u32, for ebitmap offsets - Improve the "symhash" hash function for better distribution on common policies - Correct a number of printk format specifiers in the ebitmap code - Improved error checking in sel_write_load() - Ensure we have a proper return code in the filename_trans_read_helper_compat() function - Make better use of the current_sid() helper function - Allow for more hash table statistics when debugging is enabled - Migrate from printk_ratelimit() to pr_warn_ratelimited() - Miscellaneous cleanups and tweaks to selinux_lsm_getattr() - More consitification work in the conditional policy space * tag 'selinux-pr-20240513' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: constify source policy in cond_policydb_dup() selinux: avoid printk_ratelimit() selinux: pre-allocate the status page selinux: clarify return code in filename_trans_read_helper_compat() selinux: use u32 as bit position type in ebitmap code selinux: improve symtab string hashing selinux: dump statistics for more hash tables selinux: make more use of current_sid() selinux: update numeric format specifiers for ebitmaps selinux: improve error checking in sel_write_load() selinux: cleanup selinux_lsm_getattr() selinux: reject invalid ebitmaps |
||
Linus Torvalds
|
4cd4e4b881 |
lsm/stable-6.10 PR 20240513
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmZCWeoUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOb/Q//V68LTU4pJF/2DaxC4rA9mDXC/iey AF1v8ajzWOuPA9X0xTkYRKkXY7llgqVCjJmKhtqA9elPEfZyRXhvZikoBnfb7TvS dh3nEh/3u2I09PnewBZdj4Bq1owQ+JDfxVzTpMQ433lNri23zcuszI8IB3ePvHe2 CoMs/UxbB5D6JEBOIiK1GFqiv2gt+293cw0RorD4DmUinAPU1WOqk9pdxuIwwONM PyJ4kHcQJ7aU6nAfSPQbFzrqaZCvoy2gKaCtslQ9ASlOWRtFbSbccKu29g8vbPcg Pql70xzYcjr5iXcSpD3d84LHzFOYBuSWrdqRFeP4w9tZzCAYGTlfvltAnQImbg2B GbDFogNB2hZbUq2zf9hhXcmhGVr4tqlxeujsGPCiivZfVSPR/9VyWvL9dH6R/ooE YZbmc1DNSNPkvozbejJ48rcI+nb5iCe7FVMeKODSvQFMpzAn4WsCb+svMhRzQ1h4 f1XkJ9TGxNEkBoVd0LpOfiPgB+HsFWRy/h9I5lQdAYr/H1MU4bAnKT761k8zPdXr Ne03zv0uvoOPHp5ObeKc5krWjJsuRffT/r9hegUiWfww/sANo7dfx2Oo22YoTo86 siK27qa/tiWn11/WGM8ozW3Ni8GcU496gO4n7RkVrVjdzpl6JG42CtRwaNYM7q6v 1Ntmg9H/wZllMg4= =F3Tz -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20240513' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - The security/* portion of the effort to remove the empty sentinel elements at the end of the ctl_table arrays - Update the file list associated with the LSM / "SECURITY SUBSYSTEM" entry in the MAINTAINERS file (and then fix a typo in then update) * tag 'lsm-pr-20240513' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: MAINTAINERS: repair file entry in SECURITY SUBSYSTEM MAINTAINERS: update the LSM file list lsm: remove the now superfluous sentinel element from ctl_table array |
||
Linus Torvalds
|
1b294a1f35 |
Networking changes for 6.10.
Core & protocols ---------------- - Complete rework of garbage collection of AF_UNIX sockets. AF_UNIX is prone to forming reference count cycles due to fd passing functionality. New method based on Tarjan's Strongly Connected Components algorithm should be both faster and remove a lot of workarounds we accumulated over the years. - Add TCP fraglist GRO support, allowing chaining multiple TCP packets and forwarding them together. Useful for small switches / routers which lack basic checksum offload in some scenarios (e.g. PPPoE). - Support using SMP threads for handling packet backlog i.e. packet processing from software interfaces and old drivers which don't use NAPI. This helps move the processing out of the softirq jumble. - Continue work of converting from rtnl lock to RCU protection. Don't require rtnl lock when reading: IPv6 routing FIB, IPv6 address labels, netdev threaded NAPI sysfs files, bonding driver's sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics, TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot of the link information available via rtnetlink. - Small optimizations from Eric to UDP wake up handling, memory accounting, RPS/RFS implementation, TCP packet sizing etc. - Allow direct page recycling in the bulk API used by XDP, for +2% PPS. - Support peek with an offset on TCP sockets. - Add MPTCP APIs for querying last time packets were received/sent/acked, and whether MPTCP "upgrade" succeeded on a TCP socket. - Add intra-node communication shortcut to improve SMC performance. - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol driver. - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver. - Add reset reasons for tracing what caused a TCP reset to be sent. - Introduce direction attribute for xfrm (IPSec) states. State can be used either for input or output packet processing. Things we sprinkled into general kernel code -------------------------------------------- - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS(). This required touch-ups and renaming of a few existing users. - Add Endian-dependent __counted_by_{le,be} annotations. - Make building selftests "quieter" by printing summaries like "CC object.o" rather than full commands with all the arguments. Netfilter --------- - Use GFP_KERNEL to clone elements, to deal better with OOM situations and avoid failures in the .commit step. BPF --- - Add eBPF JIT for ARCv2 CPUs. - Support attaching kprobe BPF programs through kprobe_multi link in a session mode, meaning, a BPF program is attached to both function entry and return, the entry program can decide if the return program gets executed and the entry program can share u64 cookie value with return program. "Session mode" is a common use-case for tetragon and bpftrace. - Add the ability to specify and retrieve BPF cookie for raw tracepoint programs in order to ease migration from classic to raw tracepoints. - Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86, ARM64 and RISC-V JITs. This allows inlining functions which need to access per-CPU state. - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction. Support BPF arena on ARM64. - Add a new bpf_wq API for deferring events and refactor process-context bpf_timer code to keep common code where possible. - Harden the BPF verifier's and/or/xor value tracking. - Introduce crypto kfuncs to let BPF programs call kernel crypto APIs. - Support bpf_tail_call_static() helper for BPF programs with GCC 13. - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled. Driver API ---------- - Skip software TC processing completely if all installed rules are marked as HW-only, instead of checking the HW-only flag rule by rule. - Add support for configuring PoE (Power over Ethernet), similar to the already existing support for PoDL (Power over Data Line) config. - Initial bits of a queue control API, for now allowing a single queue to be reset without disturbing packet flow to other queues. - Common (ethtool) statistics for hardware timestamping. Tests and tooling ----------------- - Remove the need to create a config file to run the net forwarding tests so that a naive "make run_tests" can exercise them. - Define a method of writing tests which require an external endpoint to communicate with (to send/receive data towards the test machine). Add a few such tests. - Create a shared code library for writing Python tests. Expose the YAML Netlink library from tools/ to the tests for easy Netlink access. - Move netfilter tests under net/, extend them, separate performance tests from correctness tests, and iron out issues found by running them "on every commit". - Refactor BPF selftests to use common network helpers. - Further work filling in YAML definitions of Netlink messages for: nftables, team driver, bonding interfaces, vlan interfaces, VF info, TC u32 mark, TC police action. - Teach Python YAML Netlink to decode attribute policies. - Extend the definition of the "indexed array" construct in the specs to cover arrays of scalars rather than just nests. - Add hyperlinks between definitions in generated Netlink docs. Drivers ------- - Make sure unsupported flower control flags are rejected by drivers, and make more drivers report errors directly to the application rather than dmesg (large number of driver changes from Asbjørn Sloth Tønnesen). - Ethernet high-speed NICs: - Broadcom (bnxt): - support multiple RSS contexts and steering traffic to them - support XDP metadata - make page pool allocations more NUMA aware - Intel (100G, ice, idpf): - extract datapath code common among Intel drivers into a library - use fewer resources in switchdev by sharing queues with the PF - add PFCP filter support - add Ethernet filter support - use a spinlock instead of HW lock in PTP clock ops - support 5 layer Tx scheduler topology - nVidia/Mellanox: - 800G link modes and 100G SerDes speeds - per-queue IRQ coalescing configuration - Marvell Octeon: - support offloading TC packet mark action - Ethernet NICs consumer, embedded and virtual: - stop lying about skb->truesize in USB Ethernet drivers, it messes up TCP memory calculations - Google cloud vNIC: - support changing ring size via ethtool - support ring reset using the queue control API - VirtIO net: - expose flow hash from RSS to XDP - per-queue statistics - add selftests - Synopsys (stmmac): - support controllers which require an RX clock signal from the MII bus to perform their hardware initialization - TI: - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices - icssg_prueth: add SW TX / RX Coalescing based on hrtimers - cpsw: minimal XDP support - Renesas (ravb): - support describing the MDIO bus - Realtek (r8169): - add support for RTL8168M - Microchip Sparx5: - matchall and flower actions mirred and redirect - Ethernet switches: - nVidia/Mellanox: - improve events processing performance - Marvell: - add support for MV88E6250 family internal PHYs - Microchip: - add DCB and DSCP mapping support for KSZ switches - vsc73xx: convert to PHYLINK - Realtek: - rtl8226b/rtl8221b: add C45 instances and SerDes switching - Many driver changes related to PHYLIB and PHYLINK deprecated API cleanup. - Ethernet PHYs: - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY. - micrel: lan8814: add support for PPS out and external timestamp trigger - WiFi: - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices drivers. Modern devices can only be configured using nl80211. - mac80211/cfg80211 - handle color change per link for WiFi 7 Multi-Link Operation - Intel (iwlwifi): - don't support puncturing in 5 GHz - support monitor mode on passive channels - BZ-W device support - P2P with HE/EHT support - re-add support for firmware API 90 - provide channel survey information for Automatic Channel Selection - MediaTek (mt76): - mt7921 LED control - mt7925 EHT radiotap support - mt7920e PCI support - Qualcomm (ath11k): - P2P support for QCA6390, WCN6855 and QCA2066 - support hibernation - ieee80211-freq-limit Device Tree property support - Qualcomm (ath12k): - refactoring in preparation of multi-link support - suspend and hibernation support - ACPI support - debugfs support, including dfs_simulate_radar support - RealTek: - rtw88: RTL8723CS SDIO device support - rtw89: RTL8922AE Wi-Fi 7 PCI device support - rtw89: complete features of new WiFi 7 chip 8922AE including BT-coexistence and Wake-on-WLAN - rtw89: use BIOS ACPI settings to set TX power and channels - rtl8xxxu: enable Management Frame Protection (MFP) support - Bluetooth: - support for Intel BlazarI and Filmore Peak2 (BE201) - support for MediaTek MT7921S SDIO - initial support for Intel PCIe BT driver - remove HCI_AMP support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZD6sQACgkQMUZtbf5S IrtLYw/+I73ePGIye37o2jpbodcLAUZVfF3r6uYUzK8hokEcKD0QVJa9w7PizLZ3 UO45ClOXFLJCkfP4reFenLfxGCel2AJI+F7VFl2xaO2XgrcH/lnVrHqKZEAEXjls KoYMnShIolv7h2MKP6hHtyTi2j1wvQUKsZC71o9/fuW+4fUT8gECx1YtYcL73wrw gEMdlUgBYC3jiiCUHJIFX6iPJ2t/TC+q1eIIF2K/Osrk2kIqQhzoozcL4vpuAZQT 99ljx/qRelXa8oppDb7nM5eulg7WY8ZqxEfFZphTMC5nLEGzClxuOTTl2kDYI/D/ UZmTWZDY+F5F0xvNk2gH84qVJXBOVDoobpT7hVA/tDuybobc/kvGDzRayEVqVzKj Q0tPlJs+xBZpkK5TVnxaFLJVOM+p1Xosxy3kNVXmuYNBvT/R89UbJiCrUKqKZF+L z/1mOYUv8UklHqYAeuJSptHvqJjTGa/fsEYP7dAUBbc1N2eVB8mzZ4mgU5rYXbtC E6UXXiWnoSRm8bmco9QmcWWoXt5UGEizHSJLz6t1R5Df/YmXhWlytll5aCwY1ksf FNoL7S4u7AZThL1Nwi7yUs4CAjhk/N4aOsk+41S0sALCx30BJuI6UdesAxJ0lu+Z fwCQYbs27y4p7mBLbkYwcQNxAxGm7PSK4yeyRIy2njiyV4qnLf8= =EsC2 -----END PGP SIGNATURE----- Merge tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Complete rework of garbage collection of AF_UNIX sockets. AF_UNIX is prone to forming reference count cycles due to fd passing functionality. New method based on Tarjan's Strongly Connected Components algorithm should be both faster and remove a lot of workarounds we accumulated over the years. - Add TCP fraglist GRO support, allowing chaining multiple TCP packets and forwarding them together. Useful for small switches / routers which lack basic checksum offload in some scenarios (e.g. PPPoE). - Support using SMP threads for handling packet backlog i.e. packet processing from software interfaces and old drivers which don't use NAPI. This helps move the processing out of the softirq jumble. - Continue work of converting from rtnl lock to RCU protection. Don't require rtnl lock when reading: IPv6 routing FIB, IPv6 address labels, netdev threaded NAPI sysfs files, bonding driver's sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics, TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot of the link information available via rtnetlink. - Small optimizations from Eric to UDP wake up handling, memory accounting, RPS/RFS implementation, TCP packet sizing etc. - Allow direct page recycling in the bulk API used by XDP, for +2% PPS. - Support peek with an offset on TCP sockets. - Add MPTCP APIs for querying last time packets were received/sent/acked and whether MPTCP "upgrade" succeeded on a TCP socket. - Add intra-node communication shortcut to improve SMC performance. - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol driver. - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver. - Add reset reasons for tracing what caused a TCP reset to be sent. - Introduce direction attribute for xfrm (IPSec) states. State can be used either for input or output packet processing. Things we sprinkled into general kernel code: - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS(). This required touch-ups and renaming of a few existing users. - Add Endian-dependent __counted_by_{le,be} annotations. - Make building selftests "quieter" by printing summaries like "CC object.o" rather than full commands with all the arguments. Netfilter: - Use GFP_KERNEL to clone elements, to deal better with OOM situations and avoid failures in the .commit step. BPF: - Add eBPF JIT for ARCv2 CPUs. - Support attaching kprobe BPF programs through kprobe_multi link in a session mode, meaning, a BPF program is attached to both function entry and return, the entry program can decide if the return program gets executed and the entry program can share u64 cookie value with return program. "Session mode" is a common use-case for tetragon and bpftrace. - Add the ability to specify and retrieve BPF cookie for raw tracepoint programs in order to ease migration from classic to raw tracepoints. - Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86, ARM64 and RISC-V JITs. This allows inlining functions which need to access per-CPU state. - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction. Support BPF arena on ARM64. - Add a new bpf_wq API for deferring events and refactor process-context bpf_timer code to keep common code where possible. - Harden the BPF verifier's and/or/xor value tracking. - Introduce crypto kfuncs to let BPF programs call kernel crypto APIs. - Support bpf_tail_call_static() helper for BPF programs with GCC 13. - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled. Driver API: - Skip software TC processing completely if all installed rules are marked as HW-only, instead of checking the HW-only flag rule by rule. - Add support for configuring PoE (Power over Ethernet), similar to the already existing support for PoDL (Power over Data Line) config. - Initial bits of a queue control API, for now allowing a single queue to be reset without disturbing packet flow to other queues. - Common (ethtool) statistics for hardware timestamping. Tests and tooling: - Remove the need to create a config file to run the net forwarding tests so that a naive "make run_tests" can exercise them. - Define a method of writing tests which require an external endpoint to communicate with (to send/receive data towards the test machine). Add a few such tests. - Create a shared code library for writing Python tests. Expose the YAML Netlink library from tools/ to the tests for easy Netlink access. - Move netfilter tests under net/, extend them, separate performance tests from correctness tests, and iron out issues found by running them "on every commit". - Refactor BPF selftests to use common network helpers. - Further work filling in YAML definitions of Netlink messages for: nftables, team driver, bonding interfaces, vlan interfaces, VF info, TC u32 mark, TC police action. - Teach Python YAML Netlink to decode attribute policies. - Extend the definition of the "indexed array" construct in the specs to cover arrays of scalars rather than just nests. - Add hyperlinks between definitions in generated Netlink docs. Drivers: - Make sure unsupported flower control flags are rejected by drivers, and make more drivers report errors directly to the application rather than dmesg (large number of driver changes from Asbjørn Sloth Tønnesen). - Ethernet high-speed NICs: - Broadcom (bnxt): - support multiple RSS contexts and steering traffic to them - support XDP metadata - make page pool allocations more NUMA aware - Intel (100G, ice, idpf): - extract datapath code common among Intel drivers into a library - use fewer resources in switchdev by sharing queues with the PF - add PFCP filter support - add Ethernet filter support - use a spinlock instead of HW lock in PTP clock ops - support 5 layer Tx scheduler topology - nVidia/Mellanox: - 800G link modes and 100G SerDes speeds - per-queue IRQ coalescing configuration - Marvell Octeon: - support offloading TC packet mark action - Ethernet NICs consumer, embedded and virtual: - stop lying about skb->truesize in USB Ethernet drivers, it messes up TCP memory calculations - Google cloud vNIC: - support changing ring size via ethtool - support ring reset using the queue control API - VirtIO net: - expose flow hash from RSS to XDP - per-queue statistics - add selftests - Synopsys (stmmac): - support controllers which require an RX clock signal from the MII bus to perform their hardware initialization - TI: - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices - icssg_prueth: add SW TX / RX Coalescing based on hrtimers - cpsw: minimal XDP support - Renesas (ravb): - support describing the MDIO bus - Realtek (r8169): - add support for RTL8168M - Microchip Sparx5: - matchall and flower actions mirred and redirect - Ethernet switches: - nVidia/Mellanox: - improve events processing performance - Marvell: - add support for MV88E6250 family internal PHYs - Microchip: - add DCB and DSCP mapping support for KSZ switches - vsc73xx: convert to PHYLINK - Realtek: - rtl8226b/rtl8221b: add C45 instances and SerDes switching - Many driver changes related to PHYLIB and PHYLINK deprecated API cleanup - Ethernet PHYs: - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY. - micrel: lan8814: add support for PPS out and external timestamp trigger - WiFi: - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices drivers. Modern devices can only be configured using nl80211. - mac80211/cfg80211 - handle color change per link for WiFi 7 Multi-Link Operation - Intel (iwlwifi): - don't support puncturing in 5 GHz - support monitor mode on passive channels - BZ-W device support - P2P with HE/EHT support - re-add support for firmware API 90 - provide channel survey information for Automatic Channel Selection - MediaTek (mt76): - mt7921 LED control - mt7925 EHT radiotap support - mt7920e PCI support - Qualcomm (ath11k): - P2P support for QCA6390, WCN6855 and QCA2066 - support hibernation - ieee80211-freq-limit Device Tree property support - Qualcomm (ath12k): - refactoring in preparation of multi-link support - suspend and hibernation support - ACPI support - debugfs support, including dfs_simulate_radar support - RealTek: - rtw88: RTL8723CS SDIO device support - rtw89: RTL8922AE Wi-Fi 7 PCI device support - rtw89: complete features of new WiFi 7 chip 8922AE including BT-coexistence and Wake-on-WLAN - rtw89: use BIOS ACPI settings to set TX power and channels - rtl8xxxu: enable Management Frame Protection (MFP) support - Bluetooth: - support for Intel BlazarI and Filmore Peak2 (BE201) - support for MediaTek MT7921S SDIO - initial support for Intel PCIe BT driver - remove HCI_AMP support" * tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits) selftests: netfilter: fix packetdrill conntrack testcase net: gro: fix napi_gro_cb zeroed alignment Bluetooth: btintel_pcie: Refactor and code cleanup Bluetooth: btintel_pcie: Fix warning reported by sparse Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1 Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config Bluetooth: btintel_pcie: Fix compiler warnings Bluetooth: btintel_pcie: Add *setup* function to download firmware Bluetooth: btintel_pcie: Add support for PCIe transport Bluetooth: btintel: Export few static functions Bluetooth: HCI: Remove HCI_AMP support Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init() Bluetooth: qca: Fix error code in qca_read_fw_build_info() Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning Bluetooth: btintel: Add support for Filmore Peak2 (BE201) Bluetooth: btintel: Add support for BlazarI LE Create Connection command timeout increased to 20 secs dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth Bluetooth: compute LE flow credits based on recvbuf space Bluetooth: hci_sync: Use cmd->num_cis instead of magic number ... |
||
Davide Caratti
|
8ec9897ec2 |
netlabel: fix RCU annotation for IPv4 options on socket creation
Xiumei reports the following splat when netlabel and TCP socket are used:
=============================
WARNING: suspicious RCU usage
6.9.0-rc2+ #637 Not tainted
-----------------------------
net/ipv4/cipso_ipv4.c:1880 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ncat/23333:
#0: ffffffff906030c0 (rcu_read_lock){....}-{1:2}, at: netlbl_sock_setattr+0x25/0x1b0
stack backtrace:
CPU: 11 PID: 23333 Comm: ncat Kdump: loaded Not tainted 6.9.0-rc2+ #637
Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0 07/26/2013
Call Trace:
<TASK>
dump_stack_lvl+0xa9/0xc0
lockdep_rcu_suspicious+0x117/0x190
cipso_v4_sock_setattr+0x1ab/0x1b0
netlbl_sock_setattr+0x13e/0x1b0
selinux_netlbl_socket_post_create+0x3f/0x80
selinux_socket_post_create+0x1a0/0x460
security_socket_post_create+0x42/0x60
__sock_create+0x342/0x3a0
__sys_socket_create.part.22+0x42/0x70
__sys_socket+0x37/0xb0
__x64_sys_socket+0x16/0x20
do_syscall_64+0x96/0x180
? do_user_addr_fault+0x68d/0xa30
? exc_page_fault+0x171/0x280
? asm_exc_page_fault+0x22/0x30
entry_SYSCALL_64_after_hwframe+0x71/0x79
RIP: 0033:0x7fbc0ca3fc1b
Code: 73 01 c3 48 8b 0d 05 f2 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 29 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 f1 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007fff18635208 EFLAGS: 00000246 ORIG_RAX: 0000000000000029
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fbc0ca3fc1b
RDX: 0000000000000006 RSI: 0000000000000001 RDI: 0000000000000002
RBP: 000055d24f80f8a0 R08: 0000000000000003 R09: 0000000000000001
R10: 0000000000020000 R11: 0000000000000246 R12: 000055d24f80f8a0
R13: 0000000000000000 R14: 000055d24f80fb88 R15: 0000000000000000
</TASK>
The current implementation of cipso_v4_sock_setattr() replaces IP options
under the assumption that the caller holds the socket lock; however, such
assumption is not true, nor needed, in selinux_socket_post_create() hook.
Let all callers of cipso_v4_sock_setattr() specify the "socket lock held"
condition, except selinux_socket_post_create() _ where such condition can
safely be set as true even without holding the socket lock.
Fixes:
|
||
Linus Torvalds
|
25c73642cc |
Hi
2nd trial of the earlier PR with more appropriate tag: 1. Do no overwrite the key expiration once it is set. 2. Early to quota updates for keys to key_put(), instead of updating them in key_gc_unused_keys(). [1] Earlier PR: https://lore.kernel.org/linux-integrity/20240326143838.15076-1-jarkko@kernel.org/ BR, Jarkko -----BEGIN PGP SIGNATURE----- iJYEABYKAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZjzYqSAcamFya2tvLnNh a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0nQJAQDN4r2K/QgryPyS fSfVPeeIPkBIOat2sTq8IOQLKlCybwD8Duo6gJuZWmQmdbn2m2Pn/Dg3xUy4ljba IwHRro/kcwY= =3V25 -----END PGP SIGNATURE----- Merge tag 'keys-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull keys updates from Jarkko Sakkinen: - do not overwrite the key expiration once it is set - move key quota updates earlier into key_put(), instead of updating them in key_gc_unused_keys() * tag 'keys-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: keys: Fix overwrite of key expiration on instantiation keys: update key quotas in key_put() |
||
Linus Torvalds
|
b19239143e |
Hi,
These are the changes for the TPM driver with a single major new feature: TPM bus encryption and integrity protection. The key pair on TPM side is generated from so called null random seed per power on of the machine [1]. This supports the TPM encryption of the hard drive by adding layer of protection against bus interposer attacks. Other than the pull request a few minor fixes and documentation for tpm_tis to clarify basics of TPM localities for future patch review discussions (will be extended and refined over times, just a seed). [1] https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/ BR, Jarkko -----BEGIN PGP SIGNATURE----- iJYEABYKAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZj0l2iAcamFya2tvLnNh a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0m8yAP4hBjMtpgAJZ4eZ 5o9tEQJrh/1JFZJ+8HU5IKPc4RU8BAEAyyYOCtxtS/C5B95iP+LvNla0KWi0pprU HsCLULnV2Aw= =RTXJ -----END PGP SIGNATURE----- Merge tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull TPM updates from Jarkko Sakkinen: "These are the changes for the TPM driver with a single major new feature: TPM bus encryption and integrity protection. The key pair on TPM side is generated from so called null random seed per power on of the machine [1]. This supports the TPM encryption of the hard drive by adding layer of protection against bus interposer attacks. Other than that, a few minor fixes and documentation for tpm_tis to clarify basics of TPM localities for future patch review discussions (will be extended and refined over times, just a seed)" Link: https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/ [1] * tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (28 commits) Documentation: tpm: Add TPM security docs toctree entry tpm: disable the TPM if NULL name changes Documentation: add tpm-security.rst tpm: add the null key name as a sysfs export KEYS: trusted: Add session encryption protection to the seal/unseal path tpm: add session encryption protection to tpm2_get_random() tpm: add hmac checks to tpm2_pcr_extend() tpm: Add the rest of the session HMAC API tpm: Add HMAC session name/handle append tpm: Add HMAC session start and end functions tpm: Add TCG mandated Key Derivation Functions (KDFs) tpm: Add NULL primary creation tpm: export the context save and load commands tpm: add buffer function to point to returned parameters crypto: lib - implement library version of AES in CFB mode KEYS: trusted: tpm2: Use struct tpm_buf for sized buffers tpm: Add tpm_buf_read_{u8,u16,u32} tpm: TPM2B formatted buffers tpm: Store the length of the tpm_buf data separately. tpm: Update struct tpm_buf documentation comments ... |
||
Günther Noack
|
b25f7415eb
|
landlock: Add IOCTL access right for character and block devices
Introduces the LANDLOCK_ACCESS_FS_IOCTL_DEV right and increments the Landlock ABI version to 5. This access right applies to device-custom IOCTL commands when they are invoked on block or character device files. Like the truncate right, this right is associated with a file descriptor at the time of open(2), and gets respected even when the file descriptor is used outside of the thread which it was originally opened in. Therefore, a newly enabled Landlock policy does not apply to file descriptors which are already open. If the LANDLOCK_ACCESS_FS_IOCTL_DEV right is handled, only a small number of safe IOCTL commands will be permitted on newly opened device files. These include FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC, as well as other IOCTL commands for regular files which are implemented in fs/ioctl.c. Noteworthy scenarios which require special attention: TTY devices are often passed into a process from the parent process, and so a newly enabled Landlock policy does not retroactively apply to them automatically. In the past, TTY devices have often supported IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which were letting callers control the TTY input buffer (and simulate keypresses). This should be restricted to CAP_SYS_ADMIN programs on modern kernels though. Known limitations: The LANDLOCK_ACCESS_FS_IOCTL_DEV access right is a coarse-grained control over IOCTL commands. Landlock users may use path-based restrictions in combination with their knowledge about the file system layout to control what IOCTLs can be done. Cc: Paul Moore <paul@paul-moore.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20240419161122.2023765-2-gnoack@google.com Signed-off-by: Mickaël Salaün <mic@digikod.net> |
||
Masahiro Yamada
|
b1992c3772 |
kbuild: use $(src) instead of $(srctree)/$(src) for source directory
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for checked-in source files. It is merely a convention without any functional difference. In fact, $(obj) and $(src) are exactly the same, as defined in scripts/Makefile.build: src := $(obj) When the kernel is built in a separate output directory, $(src) does not accurately reflect the source directory location. While Kbuild resolves this discrepancy by specifying VPATH=$(srctree) to search for source files, it does not cover all cases. For example, when adding a header search path for local headers, -I$(srctree)/$(src) is typically passed to the compiler. This introduces inconsistency between upstream and downstream Makefiles because $(src) is used instead of $(srctree)/$(src) for the latter. To address this inconsistency, this commit changes the semantics of $(src) so that it always points to the directory in the source tree. Going forward, the variables used in Makefiles will have the following meanings: $(obj) - directory in the object tree $(src) - directory in the source tree (changed by this commit) $(objtree) - the top of the kernel object tree $(srctree) - the top of the kernel source tree Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced with $(src). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> |
||
James Bottomley
|
52ce7d9731 |
KEYS: trusted: Add session encryption protection to the seal/unseal path
If some entity is snooping the TPM bus, the can see the data going in to be sealed and the data coming out as it is unsealed. Add parameter and response encryption to these cases to ensure that no secrets are leaked even if the bus is snooped. As part of doing this conversion it was discovered that policy sessions can't work with HMAC protected authority because of missing pieces (the tpm Nonce). I've added code to work the same way as before, which will result in potential authority exposure (while still adding security for the command and the returned blob), and a fixme to redo the API to get rid of this security hole. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> |
||
Jarkko Sakkinen
|
40813f1879 |
KEYS: trusted: tpm2: Use struct tpm_buf for sized buffers
Take advantage of the new sized buffer (TPM2B) mode of struct tpm_buf in tpm2_seal_trusted(). This allows to add robustness to the command construction without requiring to calculate buffer sizes manually. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> |
||
Jarkko Sakkinen
|
e1b72e1b11 |
tpm: Store the length of the tpm_buf data separately.
TPM2B buffers, or sized buffers, have a two byte header, which contains the length of the payload as a 16-bit big-endian number, without counting in the space taken by the header. This differs from encoding in the TPM header where the length includes also the bytes taken by the header. Unbound the length of a tpm_buf from the value stored to the TPM command header. A separate encoding and decoding step so that different buffer types can be supported, with variant header format and length encoding. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> |
||
Jarkko Sakkinen
|
4f0feb5463 |
tpm: Remove tpm_send()
Open code the last remaining call site for tpm_send(). Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> |
||
David Gstir
|
28c5f596ae |
docs: trusted-encrypted: add DCP as new trust source
Update the documentation for trusted and encrypted KEYS with DCP as new trust source: - Describe security properties of DCP trust source - Describe key usage - Document blob format Co-developed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Richard Weinberger <richard@nod.at> Co-developed-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> Signed-off-by: David Gstir <david@sigma-star.at> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> |
||
David Gstir
|
2e8a0f40a3 |
KEYS: trusted: Introduce NXP DCP-backed trusted keys
DCP (Data Co-Processor) is the little brother of NXP's CAAM IP. Beside of accelerated crypto operations, it also offers support for hardware-bound keys. Using this feature it is possible to implement a blob mechanism similar to what CAAM offers. Unlike on CAAM, constructing and parsing the blob has to happen in software (i.e. the kernel). The software-based blob format used by DCP trusted keys encrypts the payload using AES-128-GCM with a freshly generated random key and nonce. The random key itself is AES-128-ECB encrypted using the DCP unique or OTP key. The DCP trusted key blob format is: /* * struct dcp_blob_fmt - DCP BLOB format. * * @fmt_version: Format version, currently being %1 * @blob_key: Random AES 128 key which is used to encrypt @payload, * @blob_key itself is encrypted with OTP or UNIQUE device key in * AES-128-ECB mode by DCP. * @nonce: Random nonce used for @payload encryption. * @payload_len: Length of the plain text @payload. * @payload: The payload itself, encrypted using AES-128-GCM and @blob_key, * GCM auth tag of size AES_BLOCK_SIZE is attached at the end of it. * * The total size of a DCP BLOB is sizeof(struct dcp_blob_fmt) + @payload_len + * AES_BLOCK_SIZE. */ struct dcp_blob_fmt { __u8 fmt_version; __u8 blob_key[AES_KEYSIZE_128]; __u8 nonce[AES_KEYSIZE_128]; __le32 payload_len; __u8 payload[]; } __packed; By default the unique key is used. It is also possible to use the OTP key. While the unique key should be unique it is not documented how this key is derived. Therefore selection the OTP key is supported as well via the use_otp_key module parameter. Co-developed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Richard Weinberger <richard@nod.at> Co-developed-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> Signed-off-by: David Gstir <david@sigma-star.at> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> |
||
David Gstir
|
633cb72fb6 |
KEYS: trusted: improve scalability of trust source config
Enabling trusted keys requires at least one trust source implementation (currently TPM, TEE or CAAM) to be enabled. Currently, this is done by checking each trust source's config option individually. This does not scale when more trust sources like the one for DCP are added, because the condition will get long and hard to read. Add config HAVE_TRUSTED_KEYS which is set to true by each trust source once its enabled and adapt the check for having at least one active trust source to use this option. Whenever a new trust source is added, it now needs to select HAVE_TRUSTED_KEYS. Signed-off-by: David Gstir <david@sigma-star.at> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> # for TRUSTED_KEYS_TPM Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> |
||
Silvio Gissi
|
9da27fb65a |
keys: Fix overwrite of key expiration on instantiation
The expiry time of a key is unconditionally overwritten during
instantiation, defaulting to turn it permanent. This causes a problem
for DNS resolution as the expiration set by user-space is overwritten to
TIME64_MAX, disabling further DNS updates. Fix this by restoring the
condition that key_set_expiry is only called when the pre-parser sets a
specific expiry.
Fixes:
|
||
Luis Henriques
|
9578e327b2 |
keys: update key quotas in key_put()
Delaying key quotas update when key's refcount reaches 0 in key_put() has been causing some issues in fscrypt testing, specifically in fstest generic/581. This commit fixes this test flakiness by dealing with the quotas immediately, and leaving all the other clean-ups to the key garbage collector. This is done by moving the updates to the qnkeys and qnbytes fields in struct key_user from key_gc_unused_keys() into key_put(). Unfortunately, this also means that we need to switch to the irq-version of the spinlock that protects these fields and use spin_lock_{irqsave,irqrestore} in all the code that touches these fields. Signed-off-by: Luis Henriques <lhenriques@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@kernel.org> |
||
Christian Göttsche
|
581646c3fb |
selinux: constify source policy in cond_policydb_dup()
cond_policydb_dup() duplicates conditional parts of an existing policy. Declare the source policy const, since it should not be modified. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: various line length fixups] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Christian Göttsche
|
851541709a |
selinux: avoid printk_ratelimit()
The usage of printk_ratelimit() is discouraged, see include/linux/printk.h, thus use pr_warn_ratelimited(). While editing this line address the following checkpatch warning: WARNING: Integer promotion: Using 'h' in '%hu' is unnecessary Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Christian Göttsche
|
fc983171e4 |
selinux: pre-allocate the status page
Since the status page is currently only allocated on first use, the sequence number of the initial policyload (i.e. 1) is not stored, leading to the observable sequence of 0, 2, 3, 4, ... Try to pre-allocate the status page during the initialization of the selinuxfs, so selinux_status_update_policyload() will set the sequence number. This brings the status page to return the actual sequence number for the initial policy load, which is also observable via the netlink socket. I could not find any occurrence where userspace depends on the actual value returned by selinux_status_policyload(3), thus the breakage should be unnoticed. Closes: https://lore.kernel.org/selinux/87o7fmua12.fsf@redhat.com/ Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: trimmed 'reported-by' that was missing an email] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
York Jasper Niebuhr
|
ba42b524a0 |
mm: init_mlocked_on_free_v3
Implements the "init_mlocked_on_free" boot option. When this boot option
is enabled, any mlock'ed pages are zeroed on free. If
the pages are munlock'ed beforehand, no initialization takes place.
This boot option is meant to combat the performance hit of
"init_on_free" as reported in commit
|
||
Joel Granados
|
74560bb368 |
lsm: remove the now superfluous sentinel element from ctl_table array
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove the sentinel from all files under security/ that register a sysctl table. Signed-off-by: Joel Granados <j.granados@samsung.com> Acked-by: Kees Cook <keescook@chromium.org> # loadpin & yama Tested-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Valentin Schneider <vschneid@redhat.com> [PM: subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Enrico Bravi
|
9fa8e76250 |
ima: add crypto agility support for template-hash algorithm
The template hash showed by the ascii_runtime_measurements and binary_runtime_measurements is the one calculated using sha1 and there is no possibility to change this value, despite the fact that the template hash is calculated using the hash algorithms corresponding to all the PCR banks configured in the TPM. Add the support to retrieve the ima log with the template data hash calculated with a specific hash algorithm. Add a new file in the securityfs ima directory for each hash algo configured in a PCR bank of the TPM. Each new file has the name with the following structure: {binary, ascii}_runtime_measurements_<hash_algo_name> Legacy files are kept, to avoid breaking existing applications, but as symbolic links which point to {binary, ascii}_runtime_measurements_sha1 files. These two files are created even if a TPM chip is not detected or the sha1 bank is not configured in the TPM. As example, in the case a TPM chip is present and sha256 is the only configured PCR bank, the listing of the securityfs ima directory is the following: lr--r--r-- [...] ascii_runtime_measurements -> ascii_runtime_measurements_sha1 -r--r----- [...] ascii_runtime_measurements_sha1 -r--r----- [...] ascii_runtime_measurements_sha256 lr--r--r-- [...] binary_runtime_measurements -> binary_runtime_measurements_sha1 -r--r----- [...] binary_runtime_measurements_sha1 -r--r----- [...] binary_runtime_measurements_sha256 --w------- [...] policy -r--r----- [...] runtime_measurements_count -r--r----- [...] violations Signed-off-by: Enrico Bravi <enrico.bravi@polito.it> Signed-off-by: Silvia Sisinni <silvia.sisinni@polito.it> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
5e2e4d0ea5 |
evm: Rename is_unsupported_fs to is_unsupported_hmac_fs
Rename is_unsupported_fs to is_unsupported_hmac_fs since now only HMAC is unsupported. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
1f65e57dc5 |
fs: Rename SB_I_EVM_UNSUPPORTED to SB_I_EVM_HMAC_UNSUPPORTED
Now that EVM supports RSA signatures for previously completely unsupported filesystems rename the flag SB_I_EVM_UNSUPPORTED to SB_I_EVM_HMAC_UNSUPPORTED to reflect that only HMAC is not supported. Suggested-by: Amir Goldstein <amir73il@gmail.com> Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
47add87ad1 |
evm: Enforce signatures on unsupported filesystem for EVM_INIT_X509
Unsupported filesystems currently do not enforce any signatures. Add support for signature enforcement of the "original" and "portable & immutable" signatures when EVM_INIT_X509 is enabled. The "original" signature type contains filesystem specific metadata. Thus it cannot be copied up and verified. However with EVM_INIT_X509 and EVM_ALLOW_METADATA_WRITES enabled, the "original" file signature may be written. When EVM_ALLOW_METADATA_WRITES is not set or once it is removed from /sys/kernel/security/evm by setting EVM_INIT_HMAC for example, it is not possible to write or remove xattrs on the overlay filesystem. This change still prevents EVM from writing HMAC signatures on unsupported filesystem when EVM_INIT_HMAC is enabled. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
cd9b909a11 |
ima: re-evaluate file integrity on file metadata change
Force a file's integrity to be re-evaluated on file metadata change by resetting both the IMA and EVM status flags. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
a652aa5906 |
evm: Store and detect metadata inode attributes changes
On stacked filesystem the metadata inode may be different than the one file data inode and therefore changes to it need to be detected independently. Therefore, store the i_version, device number, and inode number associated with the file metadata inode. Implement a function to detect changes to the inode and if a change is detected reset the evm_status. This function will be called by IMA when IMA detects that the metadata inode is different from the file's inode. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
309e2b775d |
ima: Move file-change detection variables into new structure
Move all the variables used for file change detection into a structure that can be used by IMA and EVM. Implement an inline function for storing the identification of an inode and one for detecting changes to an inode based on this new structure. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
faf994811e |
evm: Use the metadata inode to calculate metadata hash
Changes to file attributes (mode bits, uid, gid) on the lower layer are not taken into account when d_backing_inode() is used when a file is accessed on the overlay layer and this file has not yet been copied up. This is because d_backing_inode() does not return the real inode of the lower layer but instead returns the backing inode which in this case holds wrong file attributes. Further, when CONFIG_OVERLAY_FS_METACOPY is enabled and a copy-up is triggered due to file metadata changes, then the metadata are held by the backing inode while the data are still held by the real inode. Therefore, use d_inode(d_real(dentry, D_REAL_METADATA)) to get to the file's metadata inode and use it to calculate the metadata hash with. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
f2b3fc42f6 |
evm: Implement per signature type decision in security_inode_copy_up_xattr
To support "portable and immutable signatures" on otherwise unsupported filesystems, determine the EVM signature type by the content of a file's xattr. If the file has the appropriate signature type then allow it to be copied up. All other signature types are discarded as before. "Portable and immutable" EVM signatures can be copied up by stacked file- system since the metadata their signature covers does not include file- system-specific data such as a file's inode number, generation, and UUID. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
3253804773 |
security: allow finer granularity in permitting copy-up of security xattrs
Copying up xattrs is solely based on the security xattr name. For finer granularity add a dentry parameter to the security_inode_copy_up_xattr hook definition, allowing decisions to be based on the xattr content as well. Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM,SELinux) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
c21632b668 |
ima: Rename backing_inode to real_inode
Rename the backing_inode variable to real_inode since it gets its value from real_inode(). Suggested-by: Amir Goldstein <amir73il@gmail.com> Co-developed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Gustavo A. R. Silva
|
38aa3f5ac6 |
integrity: Avoid -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally. There is currently an object (`hdr)` in `struct ima_max_digest_data` that contains a flexible structure (`struct ima_digest_data`): struct ima_max_digest_data { struct ima_digest_data hdr; u8 digest[HASH_MAX_DIGESTSIZE]; } __packed; So, in order to avoid ending up with a flexible-array member in the middle of a struct, we use the `__struct_group()` helper to separate the flexible array from the rest of the members in the flexible structure: struct ima_digest_data { __struct_group(ima_digest_data_hdr, hdr, __packed, ... the rest of the members ); u8 digest[]; } __packed; And similarly for `struct evm_ima_xattr_data`. With the change described above, we can now declare an object of the type of the tagged `struct ima_digest_data_hdr`, without embedding the flexible array in the middle of another struct: struct ima_max_digest_data { struct ima_digest_data_hdr hdr; u8 digest[HASH_MAX_DIGESTSIZE]; } __packed; And similarly for `struct evm_digest` and `struct evm_xattr`. We also use `container_of()` whenever we need to retrieve a pointer to the flexible structure. So, with these changes, fix the following warnings: security/integrity/evm/evm.h:64:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/evm/../integrity.h:40:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/evm/../integrity.h:68:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/ima/../integrity.h:40:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/ima/../integrity.h:68:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/integrity.h:40:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/integrity.h:68:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/platform_certs/../integrity.h:40:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] security/integrity/platform_certs/../integrity.h:68:32: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Link: https://github.com/KSPP/linux/issues/202 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Mimi Zohar
|
cc293c8466 |
ima: define an init_module critical data record
The init_module syscall loads an ELF image into kernel space without measuring the buffer containing the ELF image. To close this kernel module integrity gap, define a new critical-data record which includes the hash of the ELF image. Instead of including the buffer data in the IMA measurement list, include the hash of the buffer data to avoid large IMA measurement list records. The buffer data hash would be the same value as the finit_module syscall file hash. To enable measuring the init_module buffer and other critical data from boot, define "ima_policy=critical_data" on the boot command line. Since builtin policies are not persistent, a custom IMA policy must include the rule as well: measure func=CRITICAL_DATA label=modules To verify the template data hash value, first convert the buffer data hash to binary: grep "init_module" \ /sys/kernel/security/integrity/ima/ascii_runtime_measurements | \ tail -1 | cut -d' ' -f 6 | xxd -r -p | sha256sum Reported-by: Ken Goldman <kgold@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Stefan Berger
|
be84f32bb2 |
ima: Fix use-after-free on a dentry's dname.name
->d_name.name can change on rename and the earlier value can be freed; there are conditions sufficient to stabilize it (->d_lock on dentry, ->d_lock on its parent, ->i_rwsem exclusive on the parent's inode, rename_lock), but none of those are met at any of the sites. Take a stable snapshot of the name instead. Link: https://lore.kernel.org/all/20240202182732.GE2087318@ZenIV/ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Ondrej Mosnacek
|
4e551db042 |
selinux: clarify return code in filename_trans_read_helper_compat()
For the "conflicting/duplicate rules" branch in
filename_trans_read_helper_compat() the Smatch static checker reports:
security/selinux/ss/policydb.c:1953 filename_trans_read_helper_compat()
warn: missing error code 'rc'
While the value of rc will already always be zero here, it is not
obvious that it's the case and that it's the intended return value
(Smatch expects rc to be assigned within 5 lines from the goto).
Therefore, add an explicit assignment just before the goto to make the
intent more clear and the code less error-prone.
Fixes:
|
||
Roberto Sassu
|
701b38995e |
security: Place security_path_post_mknod() where the original IMA call was
Commit |
||
Christian Göttsche
|
37801a36b4 |
selinux: avoid dereference of garbage after mount failure
In case kern_mount() fails and returns an error pointer return in the
error branch instead of continuing and dereferencing the error pointer.
While on it drop the never read static variable selinuxfs_mount.
Cc: stable@vger.kernel.org
Fixes:
|
||
Christian Göttsche
|
abb0f43fcd |
selinux: use u32 as bit position type in ebitmap code
The extensible bitmap supports bit positions up to U32_MAX due to the type of the member highbit being u32. Use u32 consistently as the type for bit positions to announce to callers what range of values is supported. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: merge fuzz, subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Christian Göttsche
|
32db469edf |
selinux: improve symtab string hashing
The number of buckets is calculated by performing a binary AND against the mask of the hash table, which is one less than its size (which is a power of two). This leads to all top bits being discarded, requiring for short or similar inputs a hash function with a good avalanche effect. Use djb2a: # current common prefixes: 7 entries and 5/8 buckets used, longest chain length 2, sum of chain length^2 11 classes: 134 entries and 100/256 buckets used, longest chain length 5, sum of chain length^2 234 roles: 15 entries and 6/16 buckets used, longest chain length 5, sum of chain length^2 57 types: 4448 entries and 3016/8192 buckets used, longest chain length 41, sum of chain length^2 14922 users: 7 entries and 3/8 buckets used, longest chain length 3, sum of chain length^2 17 bools: 306 entries and 221/512 buckets used, longest chain length 4, sum of chain length^2 524 levels: 1 entries and 1/1 buckets used, longest chain length 1, sum of chain length^2 1 categories: 1024 entries and 400/1024 buckets used, longest chain length 4, sum of chain length^2 2740 # patch common prefixes: 7 entries and 5/8 buckets used, longest chain length 2, sum of chain length^2 11 classes: 134 entries and 101/256 buckets used, longest chain length 3, sum of chain length^2 210 roles: 15 entries and 9/16 buckets used, longest chain length 3, sum of chain length^2 31 types: 4448 entries and 3459/8192 buckets used, longest chain length 5, sum of chain length^2 6778 users: 7 entries and 5/8 buckets used, longest chain length 3, sum of chain length^2 13 bools: 306 entries and 236/512 buckets used, longest chain length 5, sum of chain length^2 470 levels: 1 entries and 1/1 buckets used, longest chain length 1, sum of chain length^2 1 categories: 1024 entries and 518/1024 buckets used, longest chain length 7, sum of chain length^2 2992 Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: line length fixes in the commit message] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Christian Göttsche
|
0fd0b4fefa |
selinux: dump statistics for more hash tables
Dump in the SELinux debug configuration the statistics for the conditional rules avtab, the role transition, and class and common permission hash tables. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: style fixes] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Christian Göttsche
|
cdc12eb412 |
selinux: make more use of current_sid()
Use the internal helper current_sid() where applicable. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Christian Göttsche
|
4b3124de63 |
selinux: update numeric format specifiers for ebitmaps
Use the correct, according to Documentation/core-api/printk-formats.rst,
format specifiers for numeric arguments in string formatting.
The general bit type is u32 thus use %u, EBITMAP_SIZE is a constant
computed via sizeof() thus use %zu.
Fixes:
|
||
Paul Moore
|
42c7732380 |
selinux: improve error checking in sel_write_load()
Move our existing input sanity checking to the top of sel_write_load() and add a check to ensure the buffer size is non-zero. Move a local variable initialization from the declaration to before it is used. Minor style adjustments. Reported-by: Sam Sun <samsun1006219@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
Paul Moore
|
e6b5ebca41 |
selinux: cleanup selinux_lsm_getattr()
A number of small changes to selinux_lsm_getattr() to improve the quality and readability of the code: * Explicitly set the `value` parameter to NULL in the case where an attribute has not been set. * Rename the `__tsec` variable to `tsec` to better fit the SELinux code. * Rename `bad` to `err_unlock` to better indicate the jump target drops the RCU lock. Signed-off-by: Paul Moore <paul@paul-moore.com> |