2019-05-27 06:55:01 +00:00
// SPDX-License-Identifier: GPL-2.0-or-later
2005-04-16 22:20:36 +00:00
/*
 * Security plug functions
 *
 * Copyright (C) 2001 WireX Communications, Inc <chris@wirex.com>
 * Copyright (C) 2001-2002 Greg Kroah-Hartman <greg@kroah.com>
 * Copyright (C) 2001 Networks Associates Technology, Inc <ssmalley@nai.com>
IB/core: Enforce PKey security on QPs
Add new LSM hooks to allocate and free security contexts and check for
permission to access a PKey.
Allocate and free a security context when creating and destroying a QP.
This context is used for controlling access to PKeys.
When a request is made to modify a QP that changes the port, PKey index,
or alternate path, check that the QP has permission for the PKey in the
PKey table index on the subnet prefix of the port. If the QP is shared
make sure all handles to the QP also have access.
Store which port and PKey index a QP is using. After the reset to init
transition the user can modify the port, PKey index and alternate path
independently. So port and PKey settings changes can be a merge of the
previous settings and the new ones.
In order to maintain access control if there is a PKey table or subnet
prefix change, keep a list of all QPs that are using each PKey index on
each port. If a change occurs, all QPs using that device and port must
have access enforced for the new cache settings.
These changes add a transaction to the QP modify process. Association
with the old port and PKey index must be maintained if the modify fails,
and must be removed if it succeeds. Association with the new port and
PKey index must be established prior to the modify and removed if the
modify fails.
1. When a QP is modified to a particular Port, PKey index or alternate
path insert that QP into the appropriate lists.
2. Check permission to access the new settings.
3. If step 2 grants access attempt to modify the QP.
4a. If steps 2 and 3 succeed remove any prior associations.
4b. If either fails, remove the new setting associations.
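The numbered steps above can be read as a small transaction. The following is a minimal userspace sketch of that flow, not the IB/core implementation: `struct qp_sketch`, `NUM_SETTINGS`, `step_fn`, `step_ok()`, `step_fail()` and `modify_qp_sketch()` are all hypothetical names, and a single integer stands in for the port, PKey index and alternate path.

```c
#include <stdbool.h>

#define NUM_SETTINGS 4	/* hypothetical number of port/PKey index slots */

/* Hypothetical model of a QP; not the kernel's ib_qp_security. */
struct qp_sketch {
	int active;			/* setting the QP currently uses */
	bool on_list[NUM_SETTINGS];	/* on_list[s]: QP is listed under s */
};

/* A step that either succeeds (true) or fails (false) for a setting. */
typedef bool (*step_fn)(int setting);

static bool step_ok(int setting)   { (void)setting; return true; }
static bool step_fail(int setting) { (void)setting; return false; }

static int modify_qp_sketch(struct qp_sketch *qp, int new_setting,
			    step_fn permitted, step_fn hw_modify)
{
	int old = qp->active;

	if (new_setting < 0 || new_setting >= NUM_SETTINGS)
		return -1;

	/* Step 1: list the QP under the new setting first. */
	qp->on_list[new_setting] = true;

	/* Steps 2 and 3: permission check, then the modify itself. */
	if (!permitted(new_setting) || !hw_modify(new_setting)) {
		/* Step 4b: undo only the new association. */
		if (new_setting != old)
			qp->on_list[new_setting] = false;
		return -1;
	}

	/* Step 4a: success, so drop the prior association. */
	if (new_setting != old)
		qp->on_list[old] = false;
	qp->active = new_setting;
	return 0;
}
```

Listing the QP under the new setting before the permission check means a concurrent cache change would already see it, which is the reason the text gives for establishing the new association prior to the modify.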
If a PKey table or subnet prefix changes walk the list of QPs and
check that they have permission. If not send the QP to the error state
and raise a fatal error event. If it's a shared QP make sure all the
QPs that share the real_qp have permission as well. If the QP that
owns a security structure is denied access the security structure is
marked as such and the QP is added to an error_list. Once moving the
QP to error is complete, the security structure mark is cleared.
Maintaining the lists correctly turns QP destroy into a transaction.
The hardware driver for the device frees the ib_qp structure, so while
the destroy is in progress the ib_qp pointer in the ib_qp_security
struct is undefined. When the destroy process begins the ib_qp_security
structure is marked as destroying. This prevents any action from being
taken on the QP pointer. After the QP is destroyed successfully it
could still be listed on an error_list; wait for it to be processed by
that flow before cleaning up the structure.
If the destroy fails, the QP's port and PKey settings are reinserted into
the appropriate lists, the destroying flag is cleared, and access control
is enforced, in case there were any cache changes during the destroy
flow.
To keep the security changes isolated a new file is used to hold security
related functionality.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
[PM: merge fixup in ib_verbs.h and uverbs_cmd.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-19 12:48:52 +00:00
 * Copyright (C) 2016 Mellanox Technologies
2023-02-07 22:06:51 +00:00
 * Copyright (C) 2023 Microsoft Corporation <paul@paul-moore.com>
2005-04-16 22:20:36 +00:00
 */

2018-10-11 00:18:25 +00:00
#define pr_fmt(fmt) "LSM: " fmt

2017-10-18 20:00:24 +00:00
#include <linux/bpf.h>
2006-01-11 20:17:46 +00:00
#include <linux/capability.h>
2013-05-22 16:50:34 +00:00
#include <linux/dcache.h>
2018-12-09 20:36:29 +00:00
#include <linux/export.h>
2005-04-16 22:20:36 +00:00
#include <linux/init.h>
#include <linux/kernel.h>
2020-10-02 17:38:15 +00:00
#include <linux/kernel_read_file.h>
2015-05-02 22:10:46 +00:00
#include <linux/lsm_hooks.h>
2012-02-13 03:58:52 +00:00
#include <linux/fsnotify.h>
2012-05-30 21:11:23 +00:00
#include <linux/mman.h>
#include <linux/mount.h>
#include <linux/personality.h>
2012-07-02 05:34:11 +00:00
#include <linux/backing-dev.h>
LSM: Enable multiple calls to security_add_hooks() for the same LSM
Commit d69dece5f5b6 ("LSM: Add /sys/kernel/security/lsm") extended
security_add_hooks() with a new parameter to register the LSM name,
which may be useful to make the list of currently loaded LSMs available
to userspace. However, there is no clean way for an LSM to split its
hook declarations into multiple files; doing so could reduce the mess
with all the included files (needed for LSM hook argument types) and
make the source code easier to review and maintain.
This change allows an LSM to register its hooks multiple times while
keeping a consistent list of LSM names as described in
Documentation/security/LSM.txt . The list reflects the order in which
checks are made. This patch only checks for the last registered LSM. If
an LSM registers its hooks multiple times, interleaved with other LSM
registrations (which should not happen), its name will still appear in
the same order that the hooks are called, hence multiple times.
To sum up, "capability,selinux,foo,foo" will be replaced with
"capability,selinux,foo", however "capability,foo,selinux,foo" will
remain as is.
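The "only check the last registered LSM" rule can be illustrated with a minimal userspace sketch. It is not the kernel's code: `last_name_matches()` and `append_lsm_name()` are hypothetical helpers, and a fixed-size caller-supplied buffer is assumed in place of whatever allocation the kernel uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* True if the last comma-separated entry of list equals name. */
static bool last_name_matches(const char *list, const char *name)
{
	const char *last = strrchr(list, ',');

	last = last ? last + 1 : list;
	return strcmp(last, name) == 0;
}

/*
 * Append name to the comma-separated list unless it is already the
 * last entry. list must be NUL-terminated inside a buffer of size
 * bytes. Returns 0 on success, -1 if the name does not fit.
 */
static int append_lsm_name(char *list, size_t size, const char *name)
{
	size_t used = strlen(list);
	int n;

	/* Only the most recent registration is compared. */
	if (used > 0 && last_name_matches(list, name))
		return 0;

	n = snprintf(list + used, size - used, "%s%s",
		     used ? "," : "", name);
	if (n < 0 || (size_t)n >= size - used) {
		list[used] = '\0';	/* undo a truncated append */
		return -1;
	}
	return 0;
}
```

Registering "capability", "selinux", "foo", "foo" in that order leaves "capability,selinux,foo", while "capability", "foo", "selinux", "foo" keeps all four entries, matching the two cases in the summary above.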
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-05-10 20:48:48 +00:00
#include <linux/string.h>
evm: Move to LSM infrastructure
As for IMA, move hardcoded EVM function calls from various places in the
kernel to the LSM infrastructure, by introducing a new LSM named 'evm'
(last and always enabled like 'ima'). The order in the Makefile ensures
that 'evm' hooks are executed after 'ima' ones.
Make EVM functions static (except for evm_inode_init_security(), which
is exported), and register them as hook implementations in init_evm_lsm().
Also move the inline functions evm_inode_remove_acl(),
evm_inode_post_remove_acl(), and evm_inode_post_set_acl() from the public
evm.h header to evm_main.c.
Unlike before (see commit to move IMA to the LSM infrastructure),
evm_inode_post_setattr(), evm_inode_post_set_acl(),
evm_inode_post_remove_acl(), and evm_inode_post_removexattr() are not
executed for private inodes.
Finally, add the LSM_ID_EVM case in lsm_list_modules_test.c
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-02-15 10:31:10 +00:00
#include <linux/xattr.h>
2018-11-20 19:55:02 +00:00
#include <linux/msg.h>
lsm: fix integer overflow in lsm_set_self_attr() syscall
security_setselfattr() has an integer overflow bug that leads to
out-of-bounds access when userspace provides bogus input:
`lctx->ctx_len + sizeof(*lctx)` is checked against `lctx->len` (and,
redundantly, also against `size`), but there are no checks on
`lctx->ctx_len`.
Therefore, userspace can provide an `lsm_ctx` with `->ctx_len` set to a
value between `-sizeof(struct lsm_ctx)` and -1, and this bogus `->ctx_len`
will then be passed to an LSM module as a buffer length, causing LSM
modules to perform out-of-bounds accesses.
The following reproducer will demonstrate this under ASAN (if AppArmor is
loaded as an LSM):
```
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

struct lsm_ctx {
	uint64_t id;
	uint64_t flags;
	uint64_t len;
	uint64_t ctx_len;
	char ctx[];
};

int main(void) {
	size_t size = sizeof(struct lsm_ctx);
	struct lsm_ctx *ctx = malloc(size);

	ctx->id = 104/*LSM_ID_APPARMOR*/;
	ctx->flags = 0;
	ctx->len = size;
	/* Wraps to a huge unsigned value, so ctx_len + sizeof(*ctx) is 0. */
	ctx->ctx_len = -sizeof(struct lsm_ctx);
	syscall(
		460/*__NR_lsm_set_self_attr*/,
		/*attr=*/ 100/*LSM_ATTR_CURRENT*/,
		/*ctx=*/ ctx,
		/*size=*/ size,
		/*flags=*/ 0
	);
}
```
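The wraparound described above can be shown in isolation. The following is a userspace sketch under stated assumptions, not the kernel fix itself: `struct lsm_ctx_sketch`, `len_ok_unchecked()` and `len_ok_checked()` are hypothetical names, and the GCC/Clang builtin `__builtin_add_overflow` stands in for whatever overflow helper the kernel code uses.

```
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-size fields of the lsm_ctx layout shown in the reproducer. */
struct lsm_ctx_sketch {
	uint64_t id;
	uint64_t flags;
	uint64_t len;
	uint64_t ctx_len;
};

/* Unchecked comparison: the sum can wrap around and look small. */
static bool len_ok_unchecked(const struct lsm_ctx_sketch *c)
{
	return c->ctx_len + sizeof(*c) <= c->len;
}

/* Checked comparison: reject the input if the addition overflows. */
static bool len_ok_checked(const struct lsm_ctx_sketch *c)
{
	uint64_t total;

	if (__builtin_add_overflow(c->ctx_len, (uint64_t)sizeof(*c), &total))
		return false;
	return total <= c->len;
}
```

With `ctx_len` set to `-sizeof(struct lsm_ctx_sketch)`, the unchecked sum wraps to 0 and passes the comparison, while the checked variant rejects the input before any length is handed on.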
Fixes: a04a1198088a ("LSM: syscalls for current process attributes")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, removed ref to ASAN splat that isn't included]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-02-14 16:05:38 +00:00
#include <linux/overflow.h>
2024-07-10 21:32:30 +00:00
#include <linux/perf_event.h>
block,lsm: add LSM blob and new LSM hooks for block devices
This patch introduces a new LSM blob to the block_device structure,
enabling the security subsystem to store security-sensitive data related
to block devices. Currently, for a device mapper's mapped device containing
a dm-verity target, critical security information such as the roothash and
its signing state are not readily accessible. Specifically, while the
dm-verity volume creation process passes the dm-verity roothash and its
signature from userspace to the kernel, the roothash is stored privately
within the dm-verity target, and its signature is discarded
post-verification. This makes it extremely hard for the security subsystem
to utilize these data.
With the addition of the LSM blob to the block_device structure, the
security subsystem can now retain and manage important security metadata
such as the roothash and the signing state of a dm-verity volume by storing them
inside the blob. Access decisions can then be based on these stored data.
The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct superblock.
The initialization of the security blob occurs after the creation of the
struct block_device, performed by the security subsystem. Similarly, the
security blob is freed by the security subsystem before the struct
block_device is deallocated or freed.
This patch also introduces a new hook security_bdev_setintegrity() to save
a block device's integrity data to the new LSM blob. For example,
dm-verity can use this hook to expose its roothash and signing state
to LSMs, which can then save these data into the LSM blob.
Please note that the new hook should be invoked every time the security
information is updated to keep these data current. For example, in
dm-verity, if the mapping table is reloaded and configured to use a
different dm-verity target with a new roothash and signing information,
the previously stored data in the LSM blob will become obsolete. It is
crucial to re-invoke the hook to refresh these data and ensure they are up
to date. This necessity arises from the design of device-mapper, where a
device-mapper device is first created, and then targets are subsequently
loaded into it. These targets can be modified multiple times during the
device's lifetime. Therefore, while the LSM blob is allocated during the
creation of the block device, its actual contents are not initialized at
this stage and can change substantially over time. This includes
alterations from data that the LSM 'trusts' to those it does not, making
it essential to handle these changes correctly. Failure to address this
dynamic aspect could potentially allow for bypassing LSM checks.
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: merge fuzz, subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-03 06:08:25 +00:00
#include <linux/fs.h>
2012-02-13 03:58:52 +00:00
#include <net/flow.h>
2024-07-10 21:32:25 +00:00
#include <net/sock.h>
2005-04-16 22:20:36 +00:00

2018-09-19 23:58:31 +00:00
/* How many LSMs were built into the kernel? */
#define LSM_COUNT (__end_lsm_info - __start_lsm_info)

2023-09-12 20:56:47 +00:00
/*
 * How many LSMs are built into the kernel as determined at
 * build time. Used to determine fixed array sizes.
 * The capability module is accounted for by CONFIG_SECURITY
 */
#define LSM_CONFIG_COUNT ( \
	(IS_ENABLED(CONFIG_SECURITY) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_SELINUX) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_SMACK) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_TOMOYO) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_APPARMOR) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_YAMA) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_LOADPIN) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_SAFESETID) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_LOCKDOWN_LSM) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_BPF_LSM) ? 1 : 0) + \
ima: Move to LSM infrastructure
Move hardcoded IMA function calls (not appraisal-specific functions) from
various places in the kernel to the LSM infrastructure, by introducing a
new LSM named 'ima' (at the end of the LSM list and always enabled like
'integrity').
Having IMA before EVM in the Makefile is sufficient to preserve the
relative order of the new 'ima' LSM in respect to the upcoming 'evm' LSM,
and thus the order of IMA and EVM function calls as when they were
hardcoded.
Make the moved functions static (except ima_post_key_create_or_update(),
which is not in ima_main.c), and register them as implementation of the
respective hooks in the new function init_ima_lsm().
Select CONFIG_SECURITY_PATH, to ensure that the path-based LSM hook
path_post_mknod is always available and ima_post_path_mknod() is always
executed to mark files as new, as before the move.
A slight difference is that IMA and EVM functions registered for the
inode_post_setattr, inode_post_removexattr, path_post_mknod,
inode_post_create_tmpfile, inode_post_set_acl and inode_post_remove_acl
won't be executed for private inodes. Since those inodes are supposed to be
fs-internal, they should not be of interest to IMA or EVM. The S_PRIVATE
flag is used for anonymous inodes, hugetlbfs, reiserfs xattrs, XFS scrub
and kernel-internal tmpfs files.
Conditionally register ima_post_key_create_or_update() if
CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS is enabled. Also, conditionally register
ima_kernel_module_request() if CONFIG_INTEGRITY_ASYMMETRIC_KEYS is enabled.
Finally, add the LSM_ID_IMA case in lsm_list_modules_test.c.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-02-15 10:31:08 +00:00
	(IS_ENABLED(CONFIG_SECURITY_LANDLOCK) ? 1 : 0) + \
2024-02-15 10:31:10 +00:00
	(IS_ENABLED(CONFIG_IMA) ? 1 : 0) + \
2024-08-03 06:08:15 +00:00
	(IS_ENABLED(CONFIG_EVM) ? 1 : 0) + \
	(IS_ENABLED(CONFIG_SECURITY_IPE) ? 1 : 0))
2023-09-12 20:56:47 +00:00

lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just due
to extra instructions but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook
is likely to be not present. In most cases this is still a better choice,
as even when an LSM with one hook is added, empty slots are created for
all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks with BPF LSM
and SELinux enabled with default policies, with and without these
patches.
Benchmark                                       Delta(%): (+ is better)
==========================================================================
Execl Throughput                                +1.9356
File Write 1024 bufsize 2000 maxblocks          +6.5953
Pipe Throughput                                 +9.5499
Pipe-based Context Switching                    +3.0209
Process Creation                                +2.3246
Shell Scripts (1 concurrent)                    +1.4975
System Call Overhead                            +2.7815
System Benchmarks Index Score (Partial Only):   +3.4859
In the best case, some syscalls like eventfd_create benefited by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
#define SECURITY_HOOK_ACTIVE_KEY(HOOK, IDX) security_hook_active_##HOOK##_##IDX

/*
 * Identifier for the LSM static calls.
 * HOOK is an LSM hook as defined in linux/lsm_hook_defs.h
 * IDX is the index of the static call. 0 <= IDX < MAX_LSM_COUNT
 */
#define LSM_STATIC_CALL(HOOK, IDX) lsm_static_call_##HOOK##_##IDX

/*
 * Call the macro M for each LSM hook MAX_LSM_COUNT times.
 */
#define LSM_LOOP_UNROLL(M, ...) \
do { \
	UNROLL(MAX_LSM_COUNT, M, __VA_ARGS__) \
} while (0)

#define LSM_DEFINE_UNROLL(M, ...) UNROLL(MAX_LSM_COUNT, M, __VA_ARGS__)

security,lockdown,selinux: implement SELinux lockdown
Implement a SELinux hook for lockdown. If the lockdown module is also
enabled, then a denial by the lockdown module will take precedence over
SELinux, so SELinux can only further restrict lockdown decisions.
The SELinux hook only distinguishes at the granularity of integrity
versus confidentiality similar to the lockdown module, but includes the
full lockdown reason as part of the audit record as a hint in diagnosing
what triggered the denial. To support this auditing, move the
lockdown_reasons[] string array from being private to the lockdown
module to the security framework so that it can be used by the lsm audit
code and so that it is always available even when the lockdown module
is disabled.
Note that the SELinux implementation allows the integrity and
confidentiality reasons to be controlled independently from one another.
Thus, in an SELinux policy, one could allow operations that specify
an integrity reason while blocking operations that specify a
confidentiality reason. The SELinux hook implementation is
stricter than the lockdown module in validating the provided reason value.
Sample AVC audit output from denials:
avc: denied { integrity } for pid=3402 comm="fwupd"
lockdown_reason="/dev/mem,kmem,port" scontext=system_u:system_r:fwupd_t:s0
tcontext=system_u:system_r:fwupd_t:s0 tclass=lockdown permissive=0
avc: denied { confidentiality } for pid=4628 comm="cp"
lockdown_reason="/proc/kcore access"
scontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
tcontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
tclass=lockdown permissive=0
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
[PM: some merge fuzz due to the perf hooks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-11-27 17:04:36 +00:00
/*
 * These are descriptions of the reasons that can be passed to the
 * security_locked_down() LSM hook. Placing this array here allows
 * all security modules to use the same descriptions for auditing
 * purposes.
 */
2023-02-17 02:33:20 +00:00
const char *const lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX + 1] = {
2019-11-27 17:04:36 +00:00
	[LOCKDOWN_NONE] = "none",
	[LOCKDOWN_MODULE_SIGNATURE] = "unsigned module loading",
	[LOCKDOWN_DEV_MEM] = "/dev/mem,kmem,port",
	[LOCKDOWN_EFI_TEST] = "/dev/efi_test access",
	[LOCKDOWN_KEXEC] = "kexec of unsigned images",
	[LOCKDOWN_HIBERNATION] = "hibernation",
	[LOCKDOWN_PCI_ACCESS] = "direct PCI access",
	[LOCKDOWN_IOPORT] = "raw io port access",
	[LOCKDOWN_MSR] = "raw MSR access",
	[LOCKDOWN_ACPI_TABLES] = "modifying ACPI tables",
2022-09-26 13:16:42 +00:00
	[LOCKDOWN_DEVICE_TREE] = "modifying device tree contents",
2019-11-27 17:04:36 +00:00
	[LOCKDOWN_PCMCIA_CIS] = "direct PCMCIA CIS storage",
	[LOCKDOWN_TIOCSSERIAL] = "reconfiguration of serial port IO",
	[LOCKDOWN_MODULE_PARAMETERS] = "unsafe module parameters",
	[LOCKDOWN_MMIOTRACE] = "unsafe mmio",
	[LOCKDOWN_DEBUGFS] = "debugfs access",
	[LOCKDOWN_XMON_WR] = "xmon write access",
bpf: Add lockdown check for probe_write_user helper
Back then, commit 96ae52279594 ("bpf: Add bpf_probe_write_user BPF helper
to be called in tracers") added the bpf_probe_write_user() helper in order
to allow overwriting user space memory. Its original goal was to have a
facility to "debug, divert, and manipulate execution of semi-cooperative
processes" under CAP_SYS_ADMIN. Write to kernel was explicitly disallowed
since it would otherwise tamper with its integrity.
One use case was shown in cf9b1199de27 ("samples/bpf: Add test/example of
using bpf_probe_write_user bpf helper") where the program DNATs traffic
at the time of connect(2) syscall, meaning, it rewrites the arguments to
a syscall while they're still in userspace, and before the syscall has a
chance to copy the argument into kernel space. These days we have better
mechanisms in BPF for achieving the same (e.g. for load-balancers), but
without having to write to userspace memory.
Of course the bpf_probe_write_user() helper can also be used to abuse
many other things for both good or bad purpose. Outside of BPF, there is
a similar mechanism for ptrace(2) such as PTRACE_PEEK{TEXT,DATA} and
PTRACE_POKE{TEXT,DATA}, but would likely require some more effort.
Commit 96ae52279594 explicitly dedicated the helper for experimentation
purpose only. Thus, move the helper's availability behind a newly added
LOCKDOWN_BPF_WRITE_USER lockdown knob so that the helper is disabled under
the "integrity" mode. More fine-grained control can be implemented also
from LSM side with this change.
Fixes: 96ae52279594 ("bpf: Add bpf_probe_write_user BPF helper to be called in tracers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 10:43:17 +00:00
	[LOCKDOWN_BPF_WRITE_USER] = "use of bpf to write user RAM",
2022-05-23 18:11:02 +00:00
	[LOCKDOWN_DBG_WRITE_KERNEL] = "use of kgdb/kdb to write kernel RAM",
2022-09-26 13:16:43 +00:00
	[LOCKDOWN_RTAS_ERROR_INJECTION] = "RTAS error injection",
security,lockdown,selinux: implement SELinux lockdown
Implement a SELinux hook for lockdown. If the lockdown module is also
enabled, then a denial by the lockdown module will take precedence over
SELinux, so SELinux can only further restrict lockdown decisions.
The SELinux hook only distinguishes at the granularity of integrity
versus confidentiality similar to the lockdown module, but includes the
full lockdown reason as part of the audit record as a hint in diagnosing
what triggered the denial. To support this auditing, move the
lockdown_reasons[] string array from being private to the lockdown
module to the security framework so that it can be used by the lsm audit
code and so that it is always available even when the lockdown module
is disabled.
Note that the SELinux implementation allows the integrity and
confidentiality reasons to be controlled independently from one another.
Thus, in an SELinux policy, one could allow operations that specify
an integrity reason while blocking operations that specify a
confidentiality reason. The SELinux hook implementation is
stricter than the lockdown module in validating the provided reason value.
Sample AVC audit output from denials:
avc: denied { integrity } for pid=3402 comm="fwupd"
lockdown_reason="/dev/mem,kmem,port" scontext=system_u:system_r:fwupd_t:s0
tcontext=system_u:system_r:fwupd_t:s0 tclass=lockdown permissive=0
avc: denied { confidentiality } for pid=4628 comm="cp"
lockdown_reason="/proc/kcore access"
scontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
tcontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
tclass=lockdown permissive=0
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
[PM: some merge fuzz due to the perf hooks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-11-27 17:04:36 +00:00
|
|
|
[LOCKDOWN_INTEGRITY_MAX] = "integrity",
|
|
|
|
[LOCKDOWN_KCORE] = "/proc/kcore access",
|
|
|
|
[LOCKDOWN_KPROBES] = "use of kprobes",
|
2021-08-09 19:45:32 +00:00
|
|
|
[LOCKDOWN_BPF_READ_KERNEL] = "use of bpf to read kernel RAM",
|
2022-05-23 18:11:02 +00:00
|
|
|
[LOCKDOWN_DBG_READ_KERNEL] = "use of kgdb/kdb to read kernel RAM",
|
2019-11-27 17:04:36 +00:00
|
|
|
[LOCKDOWN_PERF] = "unsafe use of perf",
|
|
|
|
[LOCKDOWN_TRACEFS] = "use of tracefs",
|
|
|
|
[LOCKDOWN_XMON_RW] = "xmon read and write access",
|
2020-11-17 16:47:23 +00:00
|
|
|
[LOCKDOWN_XFRM_SECRET] = "xfrm SA secret",
|
2019-11-27 17:04:36 +00:00
|
|
|
[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
|
|
|
|
};
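The way the reason table above is split into an integrity half and a confidentiality half can be illustrated with a small userspace model. This is only a sketch: the `demo_` names and the reduced enum are hypothetical, and the comparison `level >= what` is an assumption about how a lockdown-style check could use the ordering, not a copy of the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Reduced, hypothetical reason list. Integrity reasons sit below
 * DEMO_INTEGRITY_MAX, confidentiality reasons sit between the two markers.
 */
enum demo_reason {
	DEMO_NONE,
	DEMO_BPF_WRITE_USER,
	DEMO_INTEGRITY_MAX,
	DEMO_KCORE,
	DEMO_CONFIDENTIALITY_MAX,
};

/* Designated initializers keep each string tied to its enum value. */
static const char *const demo_reasons[DEMO_CONFIDENTIALITY_MAX + 1] = {
	[DEMO_NONE] = "none",
	[DEMO_BPF_WRITE_USER] = "use of bpf to write user RAM",
	[DEMO_INTEGRITY_MAX] = "integrity",
	[DEMO_KCORE] = "/proc/kcore access",
	[DEMO_CONFIDENTIALITY_MAX] = "confidentiality",
};

/*
 * Assumed check: an operation is refused when the configured level is at
 * or above the reason it reports. Returns 0 if allowed, -EPERM if denied.
 */
static int demo_is_locked_down(enum demo_reason level, enum demo_reason what)
{
	assert(what > DEMO_NONE && what <= DEMO_CONFIDENTIALITY_MAX);
	return level >= what ? -EPERM : 0;
}
```

With this ordering, a level of `DEMO_INTEGRITY_MAX` refuses `DEMO_BPF_WRITE_USER` but still allows `DEMO_KCORE`, while `DEMO_CONFIDENTIALITY_MAX` refuses both. The string table is what an audit record would print as the lockdown reason.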
|
|
|
|
|
2019-06-14 12:20:14 +00:00
|
|
|
static BLOCKING_NOTIFIER_HEAD(blocking_lsm_notifier_chain);
|
2017-05-19 12:48:53 +00:00
|
|
|
|
2018-11-12 20:02:49 +00:00
|
|
|
static struct kmem_cache *lsm_file_cache;
|
2018-09-22 00:19:29 +00:00
|
|
|
static struct kmem_cache *lsm_inode_cache;
|
2018-11-12 20:02:49 +00:00
|
|
|
|
2017-01-19 01:09:05 +00:00
|
|
|
char *lsm_names;
|
selinux: remove the runtime disable functionality
After working with the larger SELinux-based distros for several
years, we're finally at a place where we can disable the SELinux
runtime disable functionality. The existing kernel deprecation
notice explains the functionality and why we want to remove it:
The selinuxfs "disable" node allows SELinux to be disabled at
runtime prior to a policy being loaded into the kernel. If
disabled via this mechanism, SELinux will remain disabled until
the system is rebooted.
The preferred method of disabling SELinux is via the "selinux=0"
boot parameter, but the selinuxfs "disable" node was created to
make it easier for systems with primitive bootloaders that did not
allow for easy modification of the kernel command line.
Unfortunately, allowing for SELinux to be disabled at runtime makes
it difficult to secure the kernel's LSM hooks using the
"__ro_after_init" feature.
It is that last sentence, mentioning the '__ro_after_init' hardening,
which is the real motivation for this change, and if you look at the
diffstat you'll see that the impact of this patch reaches across all
the different LSMs, helping prevent tampering at the LSM hook level.
From a SELinux perspective, it is important to note that if you
continue to disable SELinux via "/etc/selinux/config" it may appear
that SELinux is disabled, but it is simply in an uninitialized state.
If you load a policy with `load_policy -i`, you will see SELinux
come alive just as if you had loaded the policy during early-boot.
It is also worth noting that the "/sys/fs/selinux/disable" file is
always writable now, regardless of the Kconfig settings, but writing
to the file has no effect on the system, other than to display an
error on the console if a non-zero/true value is written.
Finally, in the several years where we have been working on
deprecating this functionality, there has only been one instance of
someone mentioning any user visible breakage. In this particular
case it was an individual's kernel test system, and the workaround
documented in the deprecation notice ("selinux=0" on the kernel
command line) resolved the issue without problem.
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-17 16:43:07 +00:00
|
|
|
static struct lsm_blob_sizes blob_sizes __ro_after_init;
|
2018-11-12 17:30:56 +00:00
|
|
|
|
2008-03-06 16:09:10 +00:00
|
|
|
/* Boot-time LSM user choice */
|
2018-09-20 00:30:09 +00:00
|
|
|
static __initdata const char *chosen_lsm_order;
|
2018-09-19 20:11:41 +00:00
|
|
|
static __initdata const char *chosen_major_lsm;
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2023-02-17 02:33:20 +00:00
|
|
|
static __initconst const char *const builtin_lsm_order = CONFIG_LSM;
|
2018-10-09 21:27:46 +00:00
|
|
|
|
2018-09-19 23:58:31 +00:00
|
|
|
/* Ordered list of LSMs to initialize. */
|
|
|
|
static __initdata struct lsm_info **ordered_lsms;
|
2018-09-20 02:57:06 +00:00
|
|
|
static __initdata struct lsm_info *exclusive;
|
2018-09-19 23:58:31 +00:00
|
|
|
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just from the
extra instructions but also from branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook is
likely to be not present. In most cases this is still the better choice,
because even when an LSM with a single hook is added, empty slots are created
for all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks, with BPF LSM
and SELinux enabled with default policies, comparing runs with and without
these patches.
Benchmark                                         Delta(%): (+ is better)
==========================================================================
Execl Throughput                                  +1.9356
File Write 1024 bufsize 2000 maxblocks            +6.5953
Pipe Throughput                                   +9.5499
Pipe-based Context Switching                      +3.0209
Process Creation                                  +2.3246
Shell Scripts (1 concurrent)                      +1.4975
System Call Overhead                              +2.7815
System Benchmarks Index Score (Partial Only):     +3.4859
In the best case, some syscalls like eventfd_create benefited by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
#ifdef CONFIG_HAVE_STATIC_CALL
|
|
|
|
#define LSM_HOOK_TRAMP(NAME, NUM) \
|
|
|
|
&STATIC_CALL_TRAMP(LSM_STATIC_CALL(NAME, NUM))
|
|
|
|
#else
|
|
|
|
#define LSM_HOOK_TRAMP(NAME, NUM) NULL
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Define static calls and static keys for each LSM hook.
|
|
|
|
*/
|
|
|
|
#define DEFINE_LSM_STATIC_CALL(NUM, NAME, RET, ...) \
|
|
|
|
DEFINE_STATIC_CALL_NULL(LSM_STATIC_CALL(NAME, NUM), \
|
|
|
|
*((RET(*)(__VA_ARGS__))NULL)); \
|
|
|
|
DEFINE_STATIC_KEY_FALSE(SECURITY_HOOK_ACTIVE_KEY(NAME, NUM));
|
|
|
|
|
|
|
|
#define LSM_HOOK(RET, DEFAULT, NAME, ...) \
|
|
|
|
LSM_DEFINE_UNROLL(DEFINE_LSM_STATIC_CALL, NAME, RET, __VA_ARGS__)
|
|
|
|
#include <linux/lsm_hook_defs.h>
|
|
|
|
#undef LSM_HOOK
|
|
|
|
#undef DEFINE_LSM_STATIC_CALL
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Initialise a table of static calls for each LSM hook.
|
|
|
|
* DEFINE_STATIC_CALL_NULL invocation above generates a key (STATIC_CALL_KEY)
|
|
|
|
* and a trampoline (STATIC_CALL_TRAMP) which are used to call
|
|
|
|
* __static_call_update when updating the static call.
|
|
|
|
*
|
|
|
|
* The static calls table is used by early LSMs, some architectures can fault on
|
|
|
|
* unaligned accesses and the fault handling code may not be ready by then.
|
|
|
|
* Thus, the static calls table should be aligned to avoid any unhandled faults
|
|
|
|
* in early init.
|
|
|
|
*/
|
|
|
|
struct lsm_static_calls_table
|
|
|
|
static_calls_table __ro_after_init __aligned(sizeof(u64)) = {
|
|
|
|
#define INIT_LSM_STATIC_CALL(NUM, NAME) \
|
|
|
|
(struct lsm_static_call) { \
|
|
|
|
.key = &STATIC_CALL_KEY(LSM_STATIC_CALL(NAME, NUM)), \
|
|
|
|
.trampoline = LSM_HOOK_TRAMP(NAME, NUM), \
|
|
|
|
.active = &SECURITY_HOOK_ACTIVE_KEY(NAME, NUM), \
|
|
|
|
},
|
|
|
|
#define LSM_HOOK(RET, DEFAULT, NAME, ...) \
|
|
|
|
.NAME = { \
|
|
|
|
LSM_DEFINE_UNROLL(INIT_LSM_STATIC_CALL, NAME) \
|
|
|
|
},
|
|
|
|
#include <linux/lsm_hook_defs.h>
|
|
|
|
#undef LSM_HOOK
|
|
|
|
#undef INIT_LSM_STATIC_CALL
|
|
|
|
};
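The table above gives every hook a fixed number of slots, each with a call target and an "active" key. A userspace model of that layout is sketched below. It uses ordinary function pointers and a boolean flag, so it does not reproduce the code patching that real static calls and static keys perform; all `demo_` names and the slot count are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SLOTS 2	/* hypothetical number of slots per hook */

typedef int (*demo_ioctl_hook)(unsigned int cmd);

/* One slot per potential LSM: a call target plus an "active" flag. */
struct demo_slot {
	demo_ioctl_hook fn;
	bool active;
};

static struct demo_slot demo_file_ioctl[DEMO_SLOTS];

/* Fill the first free slot, in registration order. False if all are used. */
static bool demo_register(demo_ioctl_hook fn)
{
	assert(fn);
	for (size_t i = 0; i < DEMO_SLOTS; i++) {
		if (!demo_file_ioctl[i].active) {
			demo_file_ioctl[i].fn = fn;
			demo_file_ioctl[i].active = true;
			return true;
		}
	}
	return false;
}

/* Walk the slots in order and stop at the first non-zero result. */
static int demo_security_file_ioctl(unsigned int cmd)
{
	for (size_t i = 0; i < DEMO_SLOTS; i++) {
		int rc;

		if (!demo_file_ioctl[i].active)
			continue;
		rc = demo_file_ioctl[i].fn(cmd);
		if (rc)
			return rc;
	}
	return 0;
}

/* Two toy callbacks: one always allows, one denies odd commands. */
static int demo_allow(unsigned int cmd) { (void)cmd; return 0; }
static int demo_deny_odd(unsigned int cmd) { return (cmd & 1) ? -1 : 0; }
```

Registering `demo_allow` and then `demo_deny_odd` makes `demo_security_file_ioctl(3)` return -1 while even commands pass. In the kernel the loop is unrolled by macros at compile time, so each slot becomes a separate branch and direct call rather than an indexed pointer load.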
|
|
|
|
|
2018-10-11 00:18:25 +00:00
|
|
|
static __initdata bool debug;
|
|
|
|
#define init_debug(...) \
|
|
|
|
do { \
|
|
|
|
if (debug) \
|
|
|
|
pr_info(__VA_ARGS__); \
|
|
|
|
} while (0)
|
|
|
|
|
2018-09-14 06:17:50 +00:00
|
|
|
static bool __init is_enabled(struct lsm_info *lsm)
|
|
|
|
{
|
2018-10-09 21:42:57 +00:00
|
|
|
if (!lsm->enabled)
|
|
|
|
return false;
|
2018-09-14 06:17:50 +00:00
|
|
|
|
2018-10-09 21:42:57 +00:00
|
|
|
return *lsm->enabled;
|
2018-09-14 06:17:50 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Mark an LSM's enabled flag. */
|
|
|
|
static int lsm_enabled_true __initdata = 1;
|
|
|
|
static int lsm_enabled_false __initdata = 0;
|
|
|
|
static void __init set_enabled(struct lsm_info *lsm, bool enabled)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* When an LSM hasn't configured an enable variable, we can use
|
|
|
|
* a hard-coded location for storing the default enabled state.
|
|
|
|
*/
|
|
|
|
if (!lsm->enabled) {
|
|
|
|
if (enabled)
|
|
|
|
lsm->enabled = &lsm_enabled_true;
|
|
|
|
else
|
|
|
|
lsm->enabled = &lsm_enabled_false;
|
|
|
|
} else if (lsm->enabled == &lsm_enabled_true) {
|
|
|
|
if (!enabled)
|
|
|
|
lsm->enabled = &lsm_enabled_false;
|
|
|
|
} else if (lsm->enabled == &lsm_enabled_false) {
|
|
|
|
if (enabled)
|
|
|
|
lsm->enabled = &lsm_enabled_true;
|
|
|
|
} else {
|
|
|
|
*lsm->enabled = enabled;
|
|
|
|
}
|
|
|
|
}
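The pointer juggling in set_enabled() exists so that the two shared default flags are never written through. A condensed userspace model of the same idea follows; the `demo_` names are hypothetical and the three pointer cases are folded into one branch, which is a restructuring for illustration rather than the kernel's exact code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_lsm {
	const char *name;
	int *enabled;	/* NULL: the LSM supplied no flag of its own */
};

/* Shared defaults. These are only pointed at, never modified. */
static int demo_enabled_true = 1;
static int demo_enabled_false = 0;

static bool demo_is_enabled(const struct demo_lsm *lsm)
{
	return lsm->enabled ? *lsm->enabled != 0 : false;
}

static void demo_set_enabled(struct demo_lsm *lsm, bool enabled)
{
	assert(lsm);
	if (!lsm->enabled ||
	    lsm->enabled == &demo_enabled_true ||
	    lsm->enabled == &demo_enabled_false) {
		/* No private flag: repoint at the matching shared default. */
		lsm->enabled = enabled ? &demo_enabled_true
				       : &demo_enabled_false;
	} else {
		/* Private flag: store the new state in the LSM's variable. */
		*lsm->enabled = enabled;
	}
}
```

Disabling an LSM that points at the shared "true" default swaps its pointer instead of writing 0 into the default, so other LSMs that share that default are unaffected.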
|
|
|
|
|
2018-09-19 23:58:31 +00:00
|
|
|
/* Is an LSM already listed in the ordered LSMs list? */
|
|
|
|
static bool __init exists_ordered_lsm(struct lsm_info *lsm)
|
|
|
|
{
|
|
|
|
struct lsm_info **check;
|
|
|
|
|
|
|
|
for (check = ordered_lsms; *check; check++)
|
|
|
|
if (*check == lsm)
|
|
|
|
return true;
|
|
|
|
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Append an LSM to the list of ordered LSMs to initialize. */
|
|
|
|
static int last_lsm __initdata;
|
|
|
|
static void __init append_ordered_lsm(struct lsm_info *lsm, const char *from)
|
|
|
|
{
|
|
|
|
/* Ignore duplicate selections. */
|
|
|
|
if (exists_ordered_lsm(lsm))
|
|
|
|
return;
|
|
|
|
|
2024-08-16 15:43:07 +00:00
|
|
|
if (WARN(last_lsm == LSM_COUNT, "%s: out of LSM static calls!?\n", from))
|
2018-09-19 23:58:31 +00:00
|
|
|
return;
|
|
|
|
|
2018-10-09 21:42:57 +00:00
|
|
|
/* Enable this LSM, if it is not already set. */
|
|
|
|
if (!lsm->enabled)
|
|
|
|
lsm->enabled = &lsm_enabled_true;
|
2018-09-19 23:58:31 +00:00
|
|
|
ordered_lsms[last_lsm++] = lsm;
|
2018-10-09 21:42:57 +00:00
|
|
|
|
LSM: Better reporting of actual LSMs at boot
Enhance the details reported by "lsm.debug" in several ways:
- report contents of "security="
- report contents of "CONFIG_LSM"
- report contents of "lsm="
- report any early LSM details
- whitespace-align the output of similar phases for easier visual parsing
- change "disabled" to more accurate "skipped"
- explain what "skipped" and "ignored" mean in a parenthetical
Upgrade the "security= is ignored" warning from pr_info to pr_warn,
and include full arguments list to make the cause even more clear.
Replace static "Security Framework initializing" pr_info with specific
list of the resulting order of enabled LSMs.
For example, if the kernel is built with:
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_APPARMOR=y
CONFIG_SECURITY_LOADPIN=y
CONFIG_SECURITY_YAMA=y
CONFIG_SECURITY_SAFESETID=y
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_SECURITY_LANDLOCK=y
CONFIG_INTEGRITY=y
CONFIG_BPF_LSM=y
CONFIG_DEFAULT_SECURITY_APPARMOR=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,
smack,tomoyo,apparmor,bpf"
Booting without options will show:
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
landlock: Up and running.
Yama: becoming mindful.
LoadPin: ready to pin (currently not enforcing)
SELinux: Initializing.
LSM support for eBPF active
Boot with "lsm.debug" will show:
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (enabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: exclusive disabled: apparmor
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
LSM: cred blob size = 32
LSM: file blob size = 16
LSM: inode blob size = 72
LSM: ipc blob size = 8
LSM: msg_msg blob size = 4
LSM: superblock blob size = 80
LSM: task blob size = 8
LSM: initializing capability
LSM: initializing landlock
landlock: Up and running.
LSM: initializing yama
Yama: becoming mindful.
LSM: initializing loadpin
LoadPin: ready to pin (currently not enforcing)
LSM: initializing safesetid
LSM: initializing integrity
LSM: initializing selinux
SELinux: Initializing.
LSM: initializing bpf
LSM support for eBPF active
And some examples of how the lsm.debug ordering report changes...
With "lsm.debug security=selinux":
LSM: legacy security=selinux
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: security=selinux disabled: apparmor (only one legacy major LSM)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (disabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
With "lsm.debug lsm=integrity,selinux,loadpin,crabability,bpf,
loadpin,loadpin":
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm=integrity,selinux,loadpin,crabability,bpf,loadpin,
loadpin
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: cmdline ordered: integrity (enabled)
LSM: cmdline ordered: selinux (enabled)
LSM: cmdline ordered: loadpin (enabled)
LSM: cmdline ignored: crabability (not built into kernel)
LSM: cmdline ordered: bpf (enabled)
LSM: cmdline skipped: apparmor (not in requested order)
LSM: cmdline skipped: yama (not in requested order)
LSM: cmdline skipped: safesetid (not in requested order)
LSM: cmdline skipped: landlock (not in requested order)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,integrity,selinux,loadpin,bpf
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Mickaël Salaün <mic@digikod.net>
[PM: line wrapped commit description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-11-02 00:05:29 +00:00
|
|
|
init_debug("%s ordered: %s (%s)\n", from, lsm->name,
|
|
|
|
is_enabled(lsm) ? "enabled" : "disabled");
|
2018-09-19 23:58:31 +00:00
|
|
|
}
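The duplicate check and the capacity check in append_ordered_lsm() can be modelled in userspace as below. This is a sketch under assumptions: the `demo_` names are hypothetical, `DEMO_LSM_COUNT` stands in for LSM_COUNT, and the array is given one extra slot so that it stays NULL-terminated when full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_LSM_COUNT 3	/* hypothetical capacity */

struct demo_lsm {
	const char *name;
};

/* Zero-initialized, so the list is NULL-terminated from the start. */
static struct demo_lsm *demo_ordered[DEMO_LSM_COUNT + 1];
static int demo_last;

/* Linear scan up to the NULL terminator. */
static bool demo_exists(const struct demo_lsm *lsm)
{
	struct demo_lsm **check;

	for (check = demo_ordered; *check; check++)
		if (*check == lsm)
			return true;
	return false;
}

/* Returns true if appended, false for a duplicate or a full list. */
static bool demo_append(struct demo_lsm *lsm)
{
	assert(lsm);
	if (demo_exists(lsm))
		return false;
	if (demo_last == DEMO_LSM_COUNT)
		return false;
	demo_ordered[demo_last++] = lsm;
	return true;
}
```

Appending the same entry twice leaves the list unchanged, which is the behaviour behind repeated names on the command line being listed only once in the resulting order.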
|
|
|
|
|
2018-09-14 06:17:50 +00:00
|
|
|
/* Is an LSM allowed to be initialized? */
|
|
|
|
static bool __init lsm_allowed(struct lsm_info *lsm)
|
|
|
|
{
|
|
|
|
/* Skip if the LSM is disabled. */
|
|
|
|
if (!is_enabled(lsm))
|
|
|
|
return false;
|
|
|
|
|
2018-09-20 02:57:06 +00:00
|
|
|
/* Not allowed if another exclusive LSM already initialized. */
|
|
|
|
if ((lsm->flags & LSM_FLAG_EXCLUSIVE) && exclusive) {
|
|
|
|
init_debug("exclusive disabled: %s\n", lsm->name);
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2018-09-14 06:17:50 +00:00
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2018-11-12 17:30:56 +00:00
|
|
|
static void __init lsm_set_blob_size(int *need, int *lbs)
|
|
|
|
{
|
|
|
|
int offset;
|
|
|
|
|
2022-10-18 18:22:09 +00:00
|
|
|
if (*need <= 0)
|
|
|
|
return;
|
|
|
|
|
|
|
|
offset = ALIGN(*lbs, sizeof(void *));
|
|
|
|
*lbs = offset + *need;
|
|
|
|
*need = offset;
|
2018-11-12 17:30:56 +00:00
|
|
|
}
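lsm_set_blob_size() turns each LSM's requested size into an offset inside a shared blob. The userspace sketch below mirrors that arithmetic; `ALIGN_UP` and `set_blob_size` are hypothetical stand-ins for the kernel's ALIGN() macro and the function above.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/*
 * On entry *need is the number of bytes one LSM wants. On exit it holds
 * that LSM's offset into the shared blob, and *total has grown to cover
 * the new region. Requests of zero or less are left untouched.
 */
static void set_blob_size(int *need, int *total)
{
	int offset;

	assert(need && total);
	if (*need <= 0)
		return;

	offset = ALIGN_UP(*total, (int)sizeof(void *));
	*total = offset + *need;
	*need = offset;
}
```

For example, on a system with 8-byte pointers, a first LSM asking for 12 bytes gets offset 0 and the total becomes 12. A second LSM asking for 4 bytes gets offset 16, because 12 is rounded up to pointer alignment, and the total becomes 20.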
|
|
|
|
|
|
|
|
static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
|
|
|
|
{
|
|
|
|
if (!needed)
|
|
|
|
return;
|
|
|
|
|
|
|
|
lsm_set_blob_size(&needed->lbs_cred, &blob_sizes.lbs_cred);
|
2018-11-12 20:02:49 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_file, &blob_sizes.lbs_file);
|
2024-07-10 21:32:29 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_ib, &blob_sizes.lbs_ib);
|
2018-09-22 00:19:29 +00:00
|
|
|
/*
|
|
|
|
* The inode blob gets an rcu_head in addition to
|
|
|
|
* what the modules might need.
|
|
|
|
*/
|
|
|
|
if (needed->lbs_inode && blob_sizes.lbs_inode == 0)
|
|
|
|
blob_sizes.lbs_inode = sizeof(struct rcu_head);
|
|
|
|
lsm_set_blob_size(&needed->lbs_inode, &blob_sizes.lbs_inode);
|
2018-11-20 19:55:02 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_ipc, &blob_sizes.lbs_ipc);
|
2024-07-10 21:32:26 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_key, &blob_sizes.lbs_key);
|
2018-11-20 19:55:02 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
|
2024-07-10 21:32:30 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_perf_event, &blob_sizes.lbs_perf_event);
|
2024-07-10 21:32:25 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_sock, &blob_sizes.lbs_sock);
|
2021-04-22 15:41:15 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_superblock, &blob_sizes.lbs_superblock);
|
2018-09-22 00:19:37 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
|
2024-07-10 21:32:28 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_tun_dev, &blob_sizes.lbs_tun_dev);
|
security: Allow all LSMs to provide xattrs for inode_init_security hook
Currently, the LSM infrastructure supports only one LSM providing an xattr
and EVM calculating the HMAC on that xattr, plus other inode metadata.
Allow all LSMs to provide one or multiple xattrs, by extending the security
blob reservation mechanism. Introduce the new lbs_xattr_count field of the
lsm_blob_sizes structure, so that each LSM can specify how many xattrs it
needs, and the LSM infrastructure knows how many xattr slots it should
allocate.
Modify the inode_init_security hook definition, by passing the full
xattr array allocated in security_inode_init_security(), and the current
number of xattr slots in that array filled by LSMs. The first parameter
would allow EVM to access and calculate the HMAC on xattrs supplied by
other LSMs, the second to not leave gaps in the xattr array, when an LSM
requested but did not provide xattrs (e.g. if it is not initialized).
Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the
number specified in the lbs_xattr_count field of the lsm_blob_sizes
structure. During each call, lsm_get_xattr_slot() increments the number of
filled xattrs, so that at the next invocation it returns the next xattr
slot to fill.
Clean up security_inode_init_security(). Unify the !initxattrs and
initxattrs case by simply not allocating the new_xattrs array in the
former. Update the documentation to reflect the changes, and fix the
description of the xattr name, as it is not allocated anymore.
Adapt both SELinux and Smack to use the new definition of the
inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and
fill the reserved slots in the xattr array.
Move the xattr->name assignment after the xattr->value one, so that it is
done only in case of successful memory allocation.
Finally, change the default return value of the inode_init_security hook
from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook
conventions.
Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org>
Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: minor comment and variable tweaks, approved by RS]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-10 07:57:35 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_xattr_count,
|
|
|
|
&blob_sizes.lbs_xattr_count);
|
block,lsm: add LSM blob and new LSM hooks for block devices
This patch introduces a new LSM blob to the block_device structure,
enabling the security subsystem to store security-sensitive data related
to block devices. Currently, for a device mapper's mapped device containing
a dm-verity target, critical security information such as the roothash and
its signing state are not readily accessible. Specifically, while the
dm-verity volume creation process passes the dm-verity roothash and its
signature from userspace to the kernel, the roothash is stored privately
within the dm-verity target, and its signature is discarded
post-verification. This makes it extremely hard for the security subsystem
to utilize these data.
With the addition of the LSM blob to the block_device structure, the
security subsystem can now retain and manage important security metadata
such as the roothash and the signing state of a dm-verity by storing them
inside the blob. Access decisions can then be based on these stored data.
The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct superblock.
The initialization of the security blob occurs after the creation of the
struct block_device, performed by the security subsystem. Similarly, the
security blob is freed by the security subsystem before the struct
block_device is deallocated or freed.
This patch also introduces a new hook security_bdev_setintegrity() to save
block device's integrity data to the new LSM blob. For example, for
dm-verity, it can use this hook to expose its roothash and signing state
to LSMs, then LSMs can save these data into the LSM blob.
Please note that the new hook should be invoked every time the security
information is updated to keep these data current. For example, in
dm-verity, if the mapping table is reloaded and configured to use a
different dm-verity target with a new roothash and signing information,
the previously stored data in the LSM blob will become obsolete. It is
crucial to re-invoke the hook to refresh these data and ensure they are up
to date. This necessity arises from the design of device-mapper, where a
device-mapper device is first created, and then targets are subsequently
loaded into it. These targets can be modified multiple times during the
device's lifetime. Therefore, while the LSM blob is allocated during the
creation of the block device, its actual contents are not initialized at
this stage and can change substantially over time. This includes
alterations from data that the LSM 'trusts' to those it does not, making
it essential to handle these changes correctly. Failure to address this
dynamic aspect could potentially allow for bypassing LSM checks.
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: merge fuzz, subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-03 06:08:25 +00:00
|
|
|
lsm_set_blob_size(&needed->lbs_bdev, &blob_sizes.lbs_bdev);
|
2018-11-12 17:30:56 +00:00
|
|
|
}
|
|
|
|
|
2018-10-10 22:45:22 +00:00
|
|
|
/* Prepare LSM for initialization. */
|
|
|
|
static void __init prepare_lsm(struct lsm_info *lsm)
|
2018-09-14 06:17:50 +00:00
|
|
|
{
|
|
|
|
int enabled = lsm_allowed(lsm);
|
|
|
|
|
|
|
|
/* Record enablement (to handle any following exclusive LSMs). */
|
|
|
|
set_enabled(lsm, enabled);
|
|
|
|
|
2018-10-10 22:45:22 +00:00
|
|
|
/* If enabled, do pre-initialization work. */
|
2018-09-14 06:17:50 +00:00
|
|
|
if (enabled) {
|
2018-09-20 02:57:06 +00:00
|
|
|
if ((lsm->flags & LSM_FLAG_EXCLUSIVE) && !exclusive) {
|
|
|
|
exclusive = lsm;
|
LSM: Better reporting of actual LSMs at boot
Enhance the details reported by "lsm.debug" in several ways:
- report contents of "security="
- report contents of "CONFIG_LSM"
- report contents of "lsm="
- report any early LSM details
- whitespace-align the output of similar phases for easier visual parsing
- change "disabled" to more accurate "skipped"
- explain what "skipped" and "ignored" mean in a parenthetical
Upgrade the "security= is ignored" warning from pr_info to pr_warn,
and include full arguments list to make the cause even more clear.
Replace static "Security Framework initializing" pr_info with specific
list of the resulting order of enabled LSMs.
For example, if the kernel is built with:
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_APPARMOR=y
CONFIG_SECURITY_LOADPIN=y
CONFIG_SECURITY_YAMA=y
CONFIG_SECURITY_SAFESETID=y
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_SECURITY_LANDLOCK=y
CONFIG_INTEGRITY=y
CONFIG_BPF_LSM=y
CONFIG_DEFAULT_SECURITY_APPARMOR=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,
smack,tomoyo,apparmor,bpf"
Booting without options will show:
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
landlock: Up and running.
Yama: becoming mindful.
LoadPin: ready to pin (currently not enforcing)
SELinux: Initializing.
LSM support for eBPF active
Boot with "lsm.debug" will show:
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (enabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: exclusive disabled: apparmor
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
LSM: cred blob size = 32
LSM: file blob size = 16
LSM: inode blob size = 72
LSM: ipc blob size = 8
LSM: msg_msg blob size = 4
LSM: superblock blob size = 80
LSM: task blob size = 8
LSM: initializing capability
LSM: initializing landlock
landlock: Up and running.
LSM: initializing yama
Yama: becoming mindful.
LSM: initializing loadpin
LoadPin: ready to pin (currently not enforcing)
LSM: initializing safesetid
LSM: initializing integrity
LSM: initializing selinux
SELinux: Initializing.
LSM: initializing bpf
LSM support for eBPF active
And some examples of how the lsm.debug ordering report changes...
With "lsm.debug security=selinux":
LSM: legacy security=selinux
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: security=selinux disabled: apparmor (only one legacy major LSM)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (disabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
With "lsm.debug lsm=integrity,selinux,loadpin,crabability,bpf,
loadpin,loadpin":
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm=integrity,selinux,loadpin,crabability,bpf,loadpin,
loadpin
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: cmdline ordered: integrity (enabled)
LSM: cmdline ordered: selinux (enabled)
LSM: cmdline ordered: loadpin (enabled)
LSM: cmdline ignored: crabability (not built into kernel)
LSM: cmdline ordered: bpf (enabled)
LSM: cmdline skipped: apparmor (not in requested order)
LSM: cmdline skipped: yama (not in requested order)
LSM: cmdline skipped: safesetid (not in requested order)
LSM: cmdline skipped: landlock (not in requested order)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,integrity,selinux,loadpin,bpf
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Mickaël Salaün <mic@digikod.net>
[PM: line wrapped commit description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-11-02 00:05:29 +00:00
|
|
|
init_debug("exclusive chosen: %s\n", lsm->name);
|
2018-09-20 02:57:06 +00:00
|
|
|
}
|
2018-11-12 17:30:56 +00:00
|
|
|
|
|
|
|
lsm_set_blob_sizes(lsm->blobs);
|
2018-10-10 22:45:22 +00:00
|
|
|
}
|
|
|
|
}
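The interaction between lsm_allowed() and prepare_lsm() above, where the first enabled exclusive LSM wins and later exclusive LSMs are disabled, can be modelled in userspace roughly like this. It is a sketch under stated assumptions: `struct lsm_model`, `FLAG_EXCLUSIVE`, `exclusive_model`, `model_allowed()`, `model_prepare()` and `model_demo()` are hypothetical stand-ins, and the blob sizing and debug output of the real functions are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_EXCLUSIVE 0x1u /* stand-in for LSM_FLAG_EXCLUSIVE */

/* Reduced model of struct lsm_info: only what the decision needs. */
struct lsm_model {
	const char *name;
	unsigned int flags;
	bool enabled;
};

/* The exclusive LSM chosen so far, or NULL if none yet. */
static struct lsm_model *exclusive_model;

/* Mirrors lsm_allowed(): disabled LSMs and late exclusive LSMs lose. */
static bool model_allowed(const struct lsm_model *lsm)
{
	if (!lsm->enabled)
		return false;
	if ((lsm->flags & FLAG_EXCLUSIVE) && exclusive_model)
		return false;
	return true;
}

/* Mirrors prepare_lsm(): record the verdict, claim the exclusive slot. */
static void model_prepare(struct lsm_model *lsm)
{
	bool enabled = model_allowed(lsm);

	lsm->enabled = enabled;
	if (enabled && (lsm->flags & FLAG_EXCLUSIVE) && !exclusive_model)
		exclusive_model = lsm;
}

/* Prepare n LSMs in list order and return how many stay enabled. */
static int model_demo(struct lsm_model *list, int n)
{
	int i, count = 0;

	for (i = 0; i < n; i++) {
		model_prepare(&list[i]);
		if (list[i].enabled)
			count++;
	}
	assert(count <= n);
	return count;
}
```

Because the verdict is written back before the next LSM is examined, list order decides which exclusive LSM is kept, which matches the "exclusive chosen" and "exclusive disabled" lines in the lsm.debug output quoted in the commit message above.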
|
|
|
|
|
|
|
|
/* Initialize a given LSM, if it is enabled. */
|
|
|
|
static void __init initialize_lsm(struct lsm_info *lsm)
|
|
|
|
{
|
|
|
|
if (is_enabled(lsm)) {
|
|
|
|
int ret;
|
2018-09-20 02:57:06 +00:00
|
|
|
|
2018-09-14 06:17:50 +00:00
|
|
|
init_debug("initializing %s\n", lsm->name);
|
|
|
|
ret = lsm->init();
|
|
|
|
WARN(ret, "%s failed to initialize: %d\n", lsm->name, ret);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2023-09-12 20:56:47 +00:00
|
|
|
/*
|
|
|
|
* Current index to use while initializing the lsm id list.
|
|
|
|
*/
|
|
|
|
u32 lsm_active_cnt __ro_after_init;
|
|
|
|
const struct lsm_id *lsm_idlist[LSM_CONFIG_COUNT];
|
|
|
|
|
2018-10-09 21:27:46 +00:00
|
|
|
/* Populate ordered LSMs list from comma-separated LSM name list. */
|
2018-09-19 23:58:31 +00:00
|
|
|
static void __init ordered_lsm_parse(const char *order, const char *origin)
|
2018-09-19 23:16:55 +00:00
|
|
|
{
|
|
|
|
struct lsm_info *lsm;
|
2018-10-09 21:27:46 +00:00
|
|
|
char *sep, *name, *next;
|
|
|
|
|
2018-09-20 00:48:21 +00:00
|
|
|
/* LSM_ORDER_FIRST is always first. */
|
|
|
|
for (lsm = __start_lsm_info; lsm < __end_lsm_info; lsm++) {
|
|
|
|
if (lsm->order == LSM_ORDER_FIRST)
|
2022-11-02 00:05:29 +00:00
|
|
|
append_ordered_lsm(lsm, " first");
|
2018-09-20 00:48:21 +00:00
|
|
|
}
|
|
|
|
|
2018-09-19 20:32:15 +00:00
|
|
|
/* Process "security=", if given. */
|
|
|
|
if (chosen_major_lsm) {
|
|
|
|
struct lsm_info *major;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* To match the original "security=" behavior, this
|
|
|
|
* explicitly does NOT fall back to another Legacy Major
|
|
|
|
* if the selected one was separately disabled: disable
|
|
|
|
* all non-matching Legacy Major LSMs.
|
|
|
|
*/
|
|
|
|
for (major = __start_lsm_info; major < __end_lsm_info;
|
|
|
|
major++) {
|
|
|
|
if ((major->flags & LSM_FLAG_LEGACY_MAJOR) &&
|
|
|
|
strcmp(major->name, chosen_major_lsm) != 0) {
|
|
|
|
set_enabled(major, false);
|
2022-11-02 00:05:29 +00:00
|
|
|
init_debug("security=%s disabled: %s (only one legacy major LSM)\n",
|
2018-09-19 20:32:15 +00:00
|
|
|
chosen_major_lsm, major->name);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2018-09-19 20:11:41 +00:00
|
|
|
|
2018-10-09 21:27:46 +00:00
|
|
|
sep = kstrdup(order, GFP_KERNEL);
|
|
|
|
next = sep;
|
|
|
|
/* Walk the list, looking for matching LSMs. */
|
|
|
|
while ((name = strsep(&next, ",")) != NULL) {
|
|
|
|
bool found = false;
|
|
|
|
|
|
|
|
for (lsm = __start_lsm_info; lsm < __end_lsm_info; lsm++) {
|
security: Introduce LSM_ORDER_LAST and set it for the integrity LSM
Introduce LSM_ORDER_LAST, to satisfy the requirement of LSMs needing to be
last, e.g. the 'integrity' LSM, without changing the kernel command line or
configuration.
Also, set this order for the 'integrity' LSM. While not enforced, this is
the only LSM expected to use it.
Similarly to LSM_ORDER_FIRST, LSMs with LSM_ORDER_LAST are always enabled
and put at the end of the LSM list, if selected in the kernel
configuration. Setting one of these orders alone does not cause the LSMs
to be selected and built into the kernel.
Finally, for LSM_ORDER_MUTABLE LSMs, set the found variable to true if an
LSM is found, regardless of its order. In this way, the kernel would not
wrongly report that the LSM is not built-in in the kernel if its order is
LSM_ORDER_LAST.
Fixes: 79f7865d844c ("LSM: Introduce "lsm=" for boottime LSM selection")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-10 08:53:59 +00:00
|
|
|
if (strcmp(lsm->name, name) == 0) {
|
|
|
|
if (lsm->order == LSM_ORDER_MUTABLE)
|
|
|
|
append_ordered_lsm(lsm, origin);
|
2018-10-09 21:27:46 +00:00
|
|
|
found = true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!found)
|
2022-11-02 00:05:29 +00:00
|
|
|
init_debug("%s ignored: %s (not built into kernel)\n",
|
|
|
|
origin, name);
|
2018-09-19 23:16:55 +00:00
|
|
|
}
|
2018-11-20 02:04:32 +00:00
|
|
|
|
|
|
|
/* Process "security=", if given. */
|
|
|
|
if (chosen_major_lsm) {
|
|
|
|
for (lsm = __start_lsm_info; lsm < __end_lsm_info; lsm++) {
|
|
|
|
if (exists_ordered_lsm(lsm))
|
|
|
|
continue;
|
|
|
|
if (strcmp(lsm->name, chosen_major_lsm) == 0)
|
|
|
|
append_ordered_lsm(lsm, "security=");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2023-03-10 08:53:59 +00:00
|
|
|
	/* LSM_ORDER_LAST is always last. */
	for (lsm = __start_lsm_info; lsm < __end_lsm_info; lsm++) {
		if (lsm->order == LSM_ORDER_LAST)
			append_ordered_lsm(lsm, " last");
	}
|
|
|
|
|
2018-11-20 02:04:32 +00:00
|
|
|
	/* Disable all LSMs not in the ordered list. */
	for (lsm = __start_lsm_info; lsm < __end_lsm_info; lsm++) {
		if (exists_ordered_lsm(lsm))
			continue;
		set_enabled(lsm, false);
|
LSM: Better reporting of actual LSMs at boot
Enhance the details reported by "lsm.debug" in several ways:
- report contents of "security="
- report contents of "CONFIG_LSM"
- report contents of "lsm="
- report any early LSM details
- whitespace-align the output of similar phases for easier visual parsing
- change "disabled" to more accurate "skipped"
- explain what "skipped" and "ignored" mean in a parenthetical
Upgrade the "security= is ignored" warning from pr_info to pr_warn,
and include full arguments list to make the cause even more clear.
Replace static "Security Framework initializing" pr_info with specific
list of the resulting order of enabled LSMs.
For example, if the kernel is built with:
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_APPARMOR=y
CONFIG_SECURITY_LOADPIN=y
CONFIG_SECURITY_YAMA=y
CONFIG_SECURITY_SAFESETID=y
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_SECURITY_LANDLOCK=y
CONFIG_INTEGRITY=y
CONFIG_BPF_LSM=y
CONFIG_DEFAULT_SECURITY_APPARMOR=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,
smack,tomoyo,apparmor,bpf"
Booting without options will show:
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
landlock: Up and running.
Yama: becoming mindful.
LoadPin: ready to pin (currently not enforcing)
SELinux: Initializing.
LSM support for eBPF active
Boot with "lsm.debug" will show:
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (enabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: exclusive disabled: apparmor
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
LSM: cred blob size = 32
LSM: file blob size = 16
LSM: inode blob size = 72
LSM: ipc blob size = 8
LSM: msg_msg blob size = 4
LSM: superblock blob size = 80
LSM: task blob size = 8
LSM: initializing capability
LSM: initializing landlock
landlock: Up and running.
LSM: initializing yama
Yama: becoming mindful.
LSM: initializing loadpin
LoadPin: ready to pin (currently not enforcing)
LSM: initializing safesetid
LSM: initializing integrity
LSM: initializing selinux
SELinux: Initializing.
LSM: initializing bpf
LSM support for eBPF active
And some examples of how the lsm.debug ordering report changes...
With "lsm.debug security=selinux":
LSM: legacy security=selinux
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: security=selinux disabled: apparmor (only one legacy major LSM)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (disabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
With "lsm.debug lsm=integrity,selinux,loadpin,crabability,bpf,
loadpin,loadpin":
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm=integrity,selinux,loadpin,crabability,bpf,loadpin,
loadpin
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: cmdline ordered: integrity (enabled)
LSM: cmdline ordered: selinux (enabled)
LSM: cmdline ordered: loadpin (enabled)
LSM: cmdline ignored: crabability (not built into kernel)
LSM: cmdline ordered: bpf (enabled)
LSM: cmdline skipped: apparmor (not in requested order)
LSM: cmdline skipped: yama (not in requested order)
LSM: cmdline skipped: safesetid (not in requested order)
LSM: cmdline skipped: landlock (not in requested order)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,integrity,selinux,loadpin,bpf
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Mickaël Salaün <mic@digikod.net>
[PM: line wrapped commit description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-11-02 00:05:29 +00:00
|
|
|
		init_debug("%s skipped: %s (not in requested order)\n",
			   origin, lsm->name);
|
2018-11-20 02:04:32 +00:00
|
|
|
}
|
|
|
|
|
2018-10-09 21:27:46 +00:00
|
|
|
kfree(sep);
|
2018-09-19 23:16:55 +00:00
|
|
|
}
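The ordering phases above (first, requested list, last, then disabling whatever was never ordered) can be sketched in userspace C. This is only an illustration under stated assumptions: `toy_lsm`, `toy_order`, `toy_parse` and the `TOY_*` names are hypothetical and are not kernel identifiers, the "security=" phase is left out, and the real code prints its decisions through `init_debug()` rather than returning a count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for LSM_ORDER_FIRST / _MUTABLE / _LAST. */
enum toy_order_kind { TOY_FIRST, TOY_MUTABLE, TOY_LAST };

struct toy_lsm {
	const char *name;
	enum toy_order_kind order;
	bool enabled;
};

#define TOY_MAX 16

struct toy_order {
	struct toy_lsm *slot[TOY_MAX];
	size_t count;
};

static bool toy_exists(const struct toy_order *o, const struct toy_lsm *lsm)
{
	for (size_t i = 0; i < o->count; i++)
		if (o->slot[i] == lsm)
			return true;
	return false;
}

/* Append once; a repeated name such as "loadpin,loadpin" is a no-op. */
static void toy_append(struct toy_order *o, struct toy_lsm *lsm)
{
	assert(lsm != NULL);
	if (toy_exists(o, lsm) || o->count >= TOY_MAX)
		return;
	o->slot[o->count++] = lsm;
}

/* Build the order; returns how many requested names matched nothing. */
size_t toy_parse(struct toy_order *o, struct toy_lsm *all, size_t n,
		 const char *list)
{
	size_t ignored = 0;

	o->count = 0;

	/* Phase 1: "first" entries always lead the list. */
	for (size_t i = 0; i < n; i++)
		if (all[i].order == TOY_FIRST)
			toy_append(o, &all[i]);

	/* Phase 2: walk the comma-separated request left to right. */
	while (*list) {
		size_t len = strcspn(list, ",");
		bool found = false;

		for (size_t i = 0; i < n; i++) {
			if (strlen(all[i].name) != len ||
			    strncmp(all[i].name, list, len) != 0)
				continue;
			if (all[i].order == TOY_MUTABLE)
				toy_append(o, &all[i]);
			found = true;	/* known name, whatever its order */
		}
		if (len > 0 && !found)
			ignored++;	/* e.g. the misspelled "crabability" */
		list += len;
		if (*list == ',')
			list++;
	}

	/* Phase 3: "last" entries always close the list. */
	for (size_t i = 0; i < n; i++)
		if (all[i].order == TOY_LAST)
			toy_append(o, &all[i]);

	/* Phase 4: anything that was never ordered gets disabled. */
	for (size_t i = 0; i < n; i++)
		if (!toy_exists(o, &all[i]))
			all[i].enabled = false;

	return ignored;
}
```

Setting `found` for any matching name, not only for mutable entries, is what keeps a "last" LSM named in the request from being counted as unknown.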
|
|
|
|
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls,
which use retpolines as a mitigation for speculative attacks (Branch
History / Target injection). This adds overhead, which is especially
costly in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just from the
extra instructions but also from branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook is
likely to be absent. In most cases this is still the better choice,
because even when an LSM with a single hook is added, empty slots are
created for all LSM hooks (especially when many LSMs that do not
initialize most hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks with BPF LSM
and SELinux enabled with default policies, with and without these
patches.
Benchmark                                       Delta(%): (+ is better)
==========================================================================
Execl Throughput                                +1.9356
File Write 1024 bufsize 2000 maxblocks          +6.5953
Pipe Throughput                                 +9.5499
Pipe-based Context Switching                    +3.0209
Process Creation                                +2.3246
Shell Scripts (1 concurrent)                    +1.4975
System Call Overhead                            +2.7815
System Benchmarks Index Score (Partial Only):   +3.4859
In the best case, some syscalls such as eventfd_create improved by
about 10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
static void __init lsm_static_call_init(struct security_hook_list *hl)
{
	struct lsm_static_call *scall = hl->scalls;
	int i;

	for (i = 0; i < MAX_LSM_COUNT; i++) {
		/* Update the first static call that is not used yet */
		if (!scall->hl) {
			__static_call_update(scall->key, scall->trampoline,
					     hl->hook.lsm_func_addr);
			scall->hl = hl;
			static_branch_enable(scall->active);
			return;
		}
		scall++;
	}
	panic("%s - Ran out of static slots.\n", __func__);
}
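The slot bookkeeping in lsm_static_call_init() can be mirrored in a small userspace sketch. It is only an analogy under stated assumptions: every `toy_*` name is hypothetical, plain function pointers stand in for the patched static call targets (so these calls are still indirect, unlike the kernel's), and a boolean stands in for the static key. What it does show is the "claim the first unused slot" rule and the call loop that skips inactive slots and stops at the first non-zero result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LSM_COUNT 3	/* hypothetical; stands in for MAX_LSM_COUNT */

typedef int (*toy_ioctl_hook)(unsigned int cmd);

struct toy_slot {
	toy_ioctl_hook fn;	/* stands in for the static call target */
	bool active;		/* stands in for the static key */
};

static struct toy_slot toy_file_ioctl_slots[TOY_MAX_LSM_COUNT];

/* Two sample hooks: one always allows, one rejects odd command numbers. */
int toy_allow_all(unsigned int cmd)
{
	(void)cmd;
	return 0;
}

int toy_deny_odd(unsigned int cmd)
{
	return (cmd & 1u) ? -1 : 0;
}

/* Claim the first unused slot; returns its index, or -1 if none is left. */
int toy_register(toy_ioctl_hook fn)
{
	assert(fn != NULL);
	for (int i = 0; i < TOY_MAX_LSM_COUNT; i++) {
		struct toy_slot *slot = &toy_file_ioctl_slots[i];

		if (!slot->fn) {
			slot->fn = fn;
			slot->active = true;
			return i;
		}
	}
	return -1;	/* the kernel code panics at this point instead */
}

/* Call active slots in registration order; stop at the first failure. */
int toy_security_file_ioctl(unsigned int cmd)
{
	for (int i = 0; i < TOY_MAX_LSM_COUNT; i++) {
		const struct toy_slot *slot = &toy_file_ioctl_slots[i];
		int rc;

		if (!slot->active)
			continue;	/* empty slot: nothing to call */
		rc = slot->fn(cmd);
		if (rc != 0)
			return rc;
	}
	return 0;
}
```

Because slots are filled in registration order, the order in which hooks are registered is also the order in which they run.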
|
|
|
|
|
2019-01-18 10:15:59 +00:00
|
|
|
static void __init lsm_early_cred(struct cred *cred);
static void __init lsm_early_task(struct task_struct *task);
|
|
|
|
|
2019-08-20 00:17:37 +00:00
|
|
|
static int lsm_append(const char *new, char **result);
|
|
|
|
|
2022-11-02 00:05:29 +00:00
|
|
|
static void __init report_lsm_order(void)
{
	struct lsm_info **lsm, *early;
	int first = 0;

	pr_info("initializing lsm=");

	/* Report each enabled LSM name, comma separated. */
|
2023-02-17 02:33:20 +00:00
|
|
|
	for (early = __start_early_lsm_info;
	     early < __end_early_lsm_info; early++)
|
2022-11-02 00:05:29 +00:00
|
|
|
		if (is_enabled(early))
			pr_cont("%s%s", first++ == 0 ? "" : ",", early->name);
	for (lsm = ordered_lsms; *lsm; lsm++)
		if (is_enabled(*lsm))
			pr_cont("%s%s", first++ == 0 ? "" : ",", (*lsm)->name);

	pr_cont("\n");
}
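The `first++ == 0 ? "" : ","` idiom used by report_lsm_order() emits a separator before every name except the first, with one counter shared by both loops so the early and ordered lists join into a single comma-separated string. A minimal userspace sketch, assuming hypothetical `toy_entry`, `toy_emit` and `toy_report` helpers that write into a buffer where the kernel uses `pr_cont()`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct toy_entry {
	const char *name;
	bool enabled;
};

/* Append sep+name at pos; returns the new end position (clamped). */
static size_t toy_emit(char *out, size_t outsz, size_t pos,
		       const char *sep, const char *name)
{
	int n;

	assert(pos < outsz);
	n = snprintf(out + pos, outsz - pos, "%s%s", sep, name);
	if (n < 0)
		return pos;		/* encoding error: leave as is */
	if ((size_t)n >= outsz - pos)
		return outsz - 1;	/* output was truncated */
	return pos + (size_t)n;
}

/* Write "lsm=a,b,c" for the enabled entries; returns the string length. */
size_t toy_report(char *out, size_t outsz,
		  const struct toy_entry *early, size_t n_early,
		  const struct toy_entry *ordered, size_t n_ordered)
{
	size_t pos;
	int first = 0;	/* shared by both loops, as in the kernel code */

	assert(outsz > 0);
	out[0] = '\0';
	pos = toy_emit(out, outsz, 0, "", "lsm=");

	for (size_t i = 0; i < n_early; i++)
		if (early[i].enabled)
			pos = toy_emit(out, outsz, pos,
				       first++ == 0 ? "" : ",", early[i].name);
	for (size_t i = 0; i < n_ordered; i++)
		if (ordered[i].enabled)
			pos = toy_emit(out, outsz, pos,
				       first++ == 0 ? "" : ",", ordered[i].name);
	return pos;
}
```

Disabled entries never touch the counter, so skipping one cannot produce a leading or doubled comma.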
|
|
|
|
|
2018-09-19 23:58:31 +00:00
|
|
|
static void __init ordered_lsm_init(void)
{
	struct lsm_info **lsm;

	ordered_lsms = kcalloc(LSM_COUNT + 1, sizeof(*ordered_lsms),
|
2023-02-17 02:33:20 +00:00
|
|
|
GFP_KERNEL);
|
2018-09-19 23:58:31 +00:00
|
|
|
|
2019-02-12 18:23:18 +00:00
|
|
|
	if (chosen_lsm_order) {
		if (chosen_major_lsm) {
|
2022-11-02 00:05:29 +00:00
|
|
|
			pr_warn("security=%s is ignored because it is superseded by lsm=%s\n",
				chosen_major_lsm, chosen_lsm_order);
|
2019-02-12 18:23:18 +00:00
|
|
|
			chosen_major_lsm = NULL;
		}
|
2018-09-20 00:30:09 +00:00
|
|
|
ordered_lsm_parse(chosen_lsm_order, "cmdline");
|
2019-02-12 18:23:18 +00:00
|
|
|
} else
|
2018-09-20 00:30:09 +00:00
|
|
|
ordered_lsm_parse(builtin_lsm_order, "builtin");
|
2018-09-19 23:58:31 +00:00
|
|
|
|
|
|
|
for (lsm = ordered_lsms; *lsm; lsm++)
|
2018-10-10 22:45:22 +00:00
|
|
|
prepare_lsm(*lsm);
|
|
|
|
|
LSM: Better reporting of actual LSMs at boot
Enhance the details reported by "lsm.debug" in several ways:
- report contents of "security="
- report contents of "CONFIG_LSM"
- report contents of "lsm="
- report any early LSM details
- whitespace-align the output of similar phases for easier visual parsing
- change "disabled" to more accurate "skipped"
- explain what "skipped" and "ignored" mean in a parenthetical
Upgrade the "security= is ignored" warning from pr_info to pr_warn,
and include full arguments list to make the cause even more clear.
Replace static "Security Framework initializing" pr_info with specific
list of the resulting order of enabled LSMs.
For example, if the kernel is built with:
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_APPARMOR=y
CONFIG_SECURITY_LOADPIN=y
CONFIG_SECURITY_YAMA=y
CONFIG_SECURITY_SAFESETID=y
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_SECURITY_LANDLOCK=y
CONFIG_INTEGRITY=y
CONFIG_BPF_LSM=y
CONFIG_DEFAULT_SECURITY_APPARMOR=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,
smack,tomoyo,apparmor,bpf"
Booting without options will show:
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
landlock: Up and running.
Yama: becoming mindful.
LoadPin: ready to pin (currently not enforcing)
SELinux: Initializing.
LSM support for eBPF active
Boot with "lsm.debug" will show:
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (enabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: exclusive disabled: apparmor
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
LSM: cred blob size = 32
LSM: file blob size = 16
LSM: inode blob size = 72
LSM: ipc blob size = 8
LSM: msg_msg blob size = 4
LSM: superblock blob size = 80
LSM: task blob size = 8
LSM: initializing capability
LSM: initializing landlock
landlock: Up and running.
LSM: initializing yama
Yama: becoming mindful.
LSM: initializing loadpin
LoadPin: ready to pin (currently not enforcing)
LSM: initializing safesetid
LSM: initializing integrity
LSM: initializing selinux
SELinux: Initializing.
LSM: initializing bpf
LSM support for eBPF active
And some examples of how the lsm.debug ordering report changes...
With "lsm.debug security=selinux":
LSM: legacy security=selinux
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm= *unspecified*
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: security=selinux disabled: apparmor (only one legacy major LSM)
LSM: builtin ordered: landlock (enabled)
LSM: builtin ignored: lockdown (not built into kernel)
LSM: builtin ordered: yama (enabled)
LSM: builtin ordered: loadpin (enabled)
LSM: builtin ordered: safesetid (enabled)
LSM: builtin ordered: integrity (enabled)
LSM: builtin ordered: selinux (enabled)
LSM: builtin ignored: smack (not built into kernel)
LSM: builtin ignored: tomoyo (not built into kernel)
LSM: builtin ordered: apparmor (disabled)
LSM: builtin ordered: bpf (enabled)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
safesetid,integrity,selinux,bpf
With "lsm.debug lsm=integrity,selinux,loadpin,crabability,bpf,
loadpin,loadpin":
LSM: legacy security= *unspecified*
LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
selinux,smack,tomoyo,apparmor,bpf
LSM: boot arg lsm=integrity,selinux,loadpin,crabability,bpf,loadpin,
loadpin
LSM: early started: lockdown (enabled)
LSM: first ordered: capability (enabled)
LSM: cmdline ordered: integrity (enabled)
LSM: cmdline ordered: selinux (enabled)
LSM: cmdline ordered: loadpin (enabled)
LSM: cmdline ignored: crabability (not built into kernel)
LSM: cmdline ordered: bpf (enabled)
LSM: cmdline skipped: apparmor (not in requested order)
LSM: cmdline skipped: yama (not in requested order)
LSM: cmdline skipped: safesetid (not in requested order)
LSM: cmdline skipped: landlock (not in requested order)
LSM: exclusive chosen: selinux
LSM: initializing lsm=lockdown,capability,integrity,selinux,loadpin,bpf
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Mickaël Salaün <mic@digikod.net>
[PM: line wrapped commit description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
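The "cmdline ordered / ignored / skipped" report above can be approximated in userspace. The following is a minimal sketch, not the kernel's parser: the builtin_lsms table, order_lsms() and in_list() are hypothetical names, and early LSMs, the "first"/"last" ordering classes, exclusive LSM selection and the legacy security= option are all left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LSMS 16

/* Hypothetical set of LSMs compiled in (illustration only, not a real config). */
static const char *const builtin_lsms[] = {
	"landlock", "yama", "loadpin", "safesetid",
	"integrity", "selinux", "apparmor", "bpf",
};
#define N_BUILTIN (sizeof(builtin_lsms) / sizeof(builtin_lsms[0]))

/* Return 1 if name is already present in list[0..n-1]. */
static int in_list(const char *const *list, int n, const char *name)
{
	for (int i = 0; i < n; i++)
		if (strcmp(list[i], name) == 0)
			return 1;
	return 0;
}

/*
 * Fill ordered[] from a comma-separated request and return the entry count.
 * Unknown names are reported as "ignored", repeated names are dropped
 * silently, and built-in LSMs missing from the request are "skipped".
 */
static int order_lsms(const char *request, const char *ordered[MAX_LSMS])
{
	char buf[256];
	int n = 0;

	/* strtok() modifies its input, so work on a bounded copy. */
	snprintf(buf, sizeof(buf), "%s", request);

	for (char *name = strtok(buf, ","); name; name = strtok(NULL, ",")) {
		size_t i;

		for (i = 0; i < N_BUILTIN; i++)
			if (strcmp(builtin_lsms[i], name) == 0)
				break;

		if (i == N_BUILTIN) {
			printf("cmdline ignored: %s (not built into kernel)\n", name);
		} else if (!in_list(ordered, n, builtin_lsms[i])) {
			ordered[n++] = builtin_lsms[i];
			printf("cmdline ordered: %s (enabled)\n", builtin_lsms[i]);
		}
	}

	for (size_t i = 0; i < N_BUILTIN; i++)
		if (!in_list(ordered, n, builtin_lsms[i]))
			printf("cmdline skipped: %s (not in requested order)\n",
			       builtin_lsms[i]);

	return n;
}
```

Calling order_lsms() with the request string from the last example yields four ordered entries (integrity, selinux, loadpin, bpf), one ignored name and four skipped LSMs. The skipped lines follow the sketch's table order, which need not match the kernel's link order.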
2022-11-02 00:05:29 +00:00
|
|
|
report_lsm_order();
|
|
|
|
|
2021-04-22 15:41:15 +00:00
|
|
|
init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
|
|
|
|
init_debug("file blob size = %d\n", blob_sizes.lbs_file);
|
2024-07-10 21:32:29 +00:00
|
|
|
init_debug("ib blob size = %d\n", blob_sizes.lbs_ib);
|
2021-04-22 15:41:15 +00:00
|
|
|
init_debug("inode blob size = %d\n", blob_sizes.lbs_inode);
|
|
|
|
init_debug("ipc blob size = %d\n", blob_sizes.lbs_ipc);
|
2024-07-10 21:32:26 +00:00
|
|
|
#ifdef CONFIG_KEYS
|
|
|
|
init_debug("key blob size = %d\n", blob_sizes.lbs_key);
|
|
|
|
#endif /* CONFIG_KEYS */
|
2021-04-22 15:41:15 +00:00
|
|
|
init_debug("msg_msg blob size = %d\n", blob_sizes.lbs_msg_msg);
|
2024-07-10 21:32:25 +00:00
|
|
|
init_debug("sock blob size = %d\n", blob_sizes.lbs_sock);
|
2021-04-22 15:41:15 +00:00
|
|
|
init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
|
2024-07-10 21:32:30 +00:00
|
|
|
init_debug("perf event blob size = %d\n", blob_sizes.lbs_perf_event);
|
2021-04-22 15:41:15 +00:00
|
|
|
init_debug("task blob size = %d\n", blob_sizes.lbs_task);
|
2024-07-10 21:32:28 +00:00
|
|
|
init_debug("tun device blob size = %d\n", blob_sizes.lbs_tun_dev);
|
security: Allow all LSMs to provide xattrs for inode_init_security hook
Currently, the LSM infrastructure supports only one LSM providing an xattr
and EVM calculating the HMAC on that xattr, plus other inode metadata.
Allow all LSMs to provide one or multiple xattrs, by extending the security
blob reservation mechanism. Introduce the new lbs_xattr_count field of the
lsm_blob_sizes structure, so that each LSM can specify how many xattrs it
needs, and the LSM infrastructure knows how many xattr slots it should
allocate.
Modify the inode_init_security hook definition, by passing the full
xattr array allocated in security_inode_init_security(), and the current
number of xattr slots in that array filled by LSMs. The first parameter
would allow EVM to access and calculate the HMAC on xattrs supplied by
other LSMs, the second to not leave gaps in the xattr array, when an LSM
requested but did not provide xattrs (e.g. if it is not initialized).
Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the
number specified in the lbs_xattr_count field of the lsm_blob_sizes
structure. During each call, lsm_get_xattr_slot() increments the number of
filled xattrs, so that at the next invocation it returns the next xattr
slot to fill.
Clean up security_inode_init_security(). Unify the !initxattrs and
initxattrs case by simply not allocating the new_xattrs array in the
former. Update the documentation to reflect the changes, and fix the
description of the xattr name, as it is not allocated anymore.
Adapt both SELinux and Smack to use the new definition of the
inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and
fill the reserved slots in the xattr array.
Move the xattr->name assignment after the xattr->value one, so that it is
done only in case of successful memory allocation.
Finally, change the default return value of the inode_init_security hook
from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook
conventions.
Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org>
Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: minor comment and variable tweaks, approved by RS]
Signed-off-by: Paul Moore <paul@paul-moore.com>
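The slot mechanism described in this commit message can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: struct demo_xattr, demo_get_xattr_slot() and the three demo_lsm_* functions are hypothetical stand-ins, and the real struct xattr and hook signature differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified xattr record (the kernel's struct xattr differs). */
struct demo_xattr {
	const char *name;
	const char *value;
};

/*
 * Hand out the next free slot and bump the fill count.
 * Returns NULL when the caller did not allocate an array.
 */
static struct demo_xattr *demo_get_xattr_slot(struct demo_xattr *xattrs,
					      int *count)
{
	if (!xattrs)
		return NULL;
	return &xattrs[(*count)++];
}

/* Hypothetical LSM "a": reserved one slot and fills it. */
static int demo_lsm_a_init_security(struct demo_xattr *xattrs, int *count)
{
	struct demo_xattr *slot = demo_get_xattr_slot(xattrs, count);

	if (slot) {
		/* Value first, name last, mirroring the ordering in the message. */
		slot->value = "label-a";
		slot->name = "security.demo_a";
	}
	return 0;
}

/* Hypothetical LSM "b": reserved a slot but is not initialized, so it takes none. */
static int demo_lsm_b_init_security(struct demo_xattr *xattrs, int *count)
{
	(void)xattrs;
	(void)count;
	return -EOPNOTSUPP;
}

/* Hypothetical LSM "c": fills the next slot, directly after "a". */
static int demo_lsm_c_init_security(struct demo_xattr *xattrs, int *count)
{
	struct demo_xattr *slot = demo_get_xattr_slot(xattrs, count);

	if (slot) {
		slot->value = "label-c";
		slot->name = "security.demo_c";
	}
	return 0;
}
```

Because the fill count only advances when a slot is actually taken, an LSM that reserved a slot but provided nothing leaves no gap: with three reserved slots, "a" and "c" end up in slots 0 and 1 and slot 2 stays empty.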
2023-06-10 07:57:35 +00:00
|
|
|
init_debug("xattr slots = %d\n", blob_sizes.lbs_xattr_count);
|
block,lsm: add LSM blob and new LSM hooks for block devices
This patch introduces a new LSM blob to the block_device structure,
enabling the security subsystem to store security-sensitive data related
to block devices. Currently, for a device mapper's mapped device containing
a dm-verity target, critical security information such as the roothash and
its signing state are not readily accessible. Specifically, while the
dm-verity volume creation process passes the dm-verity roothash and its
signature from userspace to the kernel, the roothash is stored privately
within the dm-verity target, and its signature is discarded
post-verification. This makes it extremely hard for the security subsystem
to utilize these data.
With the addition of the LSM blob to the block_device structure, the
security subsystem can now retain and manage important security metadata
such as the roothash and the signing state of a dm-verity volume by storing
them inside the blob. Access decisions can then be based on these stored data.
The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct superblock.
The initialization of the security blob occurs after the creation of the
struct block_device, performed by the security subsystem. Similarly, the
security blob is freed by the security subsystem before the struct
block_device is deallocated or freed.
This patch also introduces a new hook security_bdev_setintegrity() to save
block device's integrity data to the new LSM blob. For example, for
dm-verity, it can use this hook to expose its roothash and signing state
to LSMs, then LSMs can save these data into the LSM blob.
Please note that the new hook should be invoked every time the security
information is updated to keep these data current. For example, in
dm-verity, if the mapping table is reloaded and configured to use a
different dm-verity target with a new roothash and signing information,
the previously stored data in the LSM blob will become obsolete. It is
crucial to re-invoke the hook to refresh these data and ensure they are up
to date. This necessity arises from the design of device-mapper, where a
device-mapper device is first created, and then targets are subsequently
loaded into it. These targets can be modified multiple times during the
device's lifetime. Therefore, while the LSM blob is allocated during the
creation of the block device, its actual contents are not initialized at
this stage and can change substantially over time. This includes
alterations from data that the LSM 'trusts' to those it does not, making
it essential to handle these changes correctly. Failure to address this
dynamic aspect could potentially allow for bypassing LSM checks.
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: merge fuzz, subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-03 06:08:25 +00:00
|
|
|
init_debug("bdev blob size = %d\n", blob_sizes.lbs_bdev);
|
2018-11-12 20:02:49 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Create any kmem_caches needed for blobs
|
|
|
|
*/
|
|
|
|
if (blob_sizes.lbs_file)
|
|
|
|
lsm_file_cache = kmem_cache_create("lsm_file_cache",
|
|
|
|
blob_sizes.lbs_file, 0,
|
|
|
|
SLAB_PANIC, NULL);
|
2018-09-22 00:19:29 +00:00
|
|
|
if (blob_sizes.lbs_inode)
|
|
|
|
lsm_inode_cache = kmem_cache_create("lsm_inode_cache",
|
|
|
|
blob_sizes.lbs_inode, 0,
|
|
|
|
SLAB_PANIC, NULL);
|
2018-11-12 17:30:56 +00:00
|
|
|
|
2019-01-18 10:15:59 +00:00
|
|
|
lsm_early_cred((struct cred *) current->cred);
|
|
|
|
lsm_early_task(current);
|
2018-10-10 22:45:22 +00:00
|
|
|
for (lsm = ordered_lsms; *lsm; lsm++)
|
|
|
|
initialize_lsm(*lsm);
|
2018-09-19 23:58:31 +00:00
|
|
|
|
|
|
|
kfree(ordered_lsms);
|
|
|
|
}
|
|
|
|
|
2019-08-20 00:17:37 +00:00
|
|
|
int __init early_security_init(void)
|
|
|
|
{
|
|
|
|
struct lsm_info *lsm;
|
|
|
|
|
|
|
|
for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
|
|
|
|
if (!lsm->enabled)
|
|
|
|
lsm->enabled = &lsm_enabled_true;
|
|
|
|
prepare_lsm(lsm);
|
|
|
|
initialize_lsm(lsm);
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/**
|
|
|
|
* security_init - initializes the security framework
|
|
|
|
*
|
|
|
|
* This should be called early in the kernel initialization sequence.
|
|
|
|
*/
|
|
|
|
int __init security_init(void)
|
|
|
|
{
|
2019-08-20 00:17:37 +00:00
|
|
|
struct lsm_info *lsm;
|
2017-03-22 10:46:19 +00:00
|
|
|
|
2023-02-17 02:33:20 +00:00
|
|
|
init_debug("legacy security=%s\n", chosen_major_lsm ? : " *unspecified*");
|
2022-11-02 00:05:29 +00:00
|
|
|
init_debug(" CONFIG_LSM=%s\n", builtin_lsm_order);
|
2023-02-17 02:33:20 +00:00
|
|
|
init_debug("boot arg lsm=%s\n", chosen_lsm_order ? : " *unspecified*");
|
2018-10-11 00:18:17 +00:00
|
|
|
|
2019-08-20 00:17:37 +00:00
|
|
|
/*
|
|
|
|
* Append the names of the early LSM modules now that kmalloc() is
|
|
|
|
* available
|
|
|
|
*/
|
|
|
|
for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
|
2022-11-02 00:05:29 +00:00
|
|
|
init_debug(" early started: %s (%s)\n", lsm->name,
|
|
|
|
is_enabled(lsm) ? "enabled" : "disabled");
|
2019-08-20 00:17:37 +00:00
|
|
|
if (lsm->enabled)
|
|
|
|
lsm_append(lsm->name, &lsm_names);
|
|
|
|
}
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2018-09-19 23:16:55 +00:00
|
|
|
/* Load LSMs in specified order. */
|
|
|
|
ordered_lsm_init();
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2008-03-06 16:09:10 +00:00
|
|
|
/* Save user chosen LSM */
|
2018-09-19 20:11:41 +00:00
|
|
|
static int __init choose_major_lsm(char *str)
|
2008-03-06 16:09:10 +00:00
|
|
|
{
|
2018-09-19 20:11:41 +00:00
|
|
|
chosen_major_lsm = str;
|
2008-03-06 16:09:10 +00:00
|
|
|
return 1;
|
|
|
|
}
|
2018-09-19 20:11:41 +00:00
|
|
|
__setup("security=", choose_major_lsm);
|
2008-03-06 16:09:10 +00:00
|
|
|
|
2018-09-20 00:30:09 +00:00
|
|
|
/* Explicitly choose LSM initialization order. */
|
|
|
|
static int __init choose_lsm_order(char *str)
|
|
|
|
{
|
|
|
|
chosen_lsm_order = str;
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
__setup("lsm=", choose_lsm_order);
|
|
|
|
|
2018-10-11 00:18:25 +00:00
|
|
|
/* Enable LSM order debugging. */
|
|
|
|
static int __init enable_debug(char *str)
|
|
|
|
{
|
|
|
|
debug = true;
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
__setup("lsm.debug", enable_debug);
|
|
|
|
|
LSM: Enable multiple calls to security_add_hooks() for the same LSM
Commit d69dece5f5b6 ("LSM: Add /sys/kernel/security/lsm") extended
security_add_hooks() with a new parameter to register the LSM name,
which may be useful to make the list of currently loaded LSMs available
to userspace. However, there is no clean way for an LSM to split its
hook declarations into multiple files, which would reduce the mess of
included files (needed for LSM hook argument types) and make the
source code easier to review and maintain.
This change allows an LSM to register its hooks multiple times while
keeping a consistent list of LSM names as described in
Documentation/security/LSM.txt. The list reflects the order in which
checks are made. This patch only checks the last registered LSM. If
an LSM registers its hooks multiple times, interleaved with other LSM
registrations (which should not happen), its name will still appear in
the same order that the hooks are called, hence multiple times.
To sum up, "capability,selinux,foo,foo" will be replaced with
"capability,selinux,foo", however "capability,foo,selinux,foo" will
remain as is.
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-05-10 20:48:48 +00:00
|
|
|
static bool match_last_lsm(const char *list, const char *lsm)
|
|
|
|
{
|
|
|
|
const char *last;
|
|
|
|
|
|
|
|
if (WARN_ON(!list || !lsm))
|
|
|
|
return false;
|
|
|
|
last = strrchr(list, ',');
|
|
|
|
if (last)
|
|
|
|
/* Skip past the comma; strcmp() will check for '\0' */
|
|
|
|
last++;
|
|
|
|
else
|
|
|
|
last = list;
|
|
|
|
return !strcmp(last, lsm);
|
|
|
|
}
|
|
|
|
|
2019-08-20 00:17:37 +00:00
|
|
|
static int lsm_append(const char *new, char **result)
|
2017-01-19 01:09:05 +00:00
|
|
|
{
|
|
|
|
char *cp;
|
|
|
|
|
|
|
|
if (*result == NULL) {
|
|
|
|
*result = kstrdup(new, GFP_KERNEL);
|
2018-07-17 17:36:04 +00:00
|
|
|
if (*result == NULL)
|
|
|
|
return -ENOMEM;
|
2017-01-19 01:09:05 +00:00
|
|
|
} else {
|
2017-05-10 20:48:48 +00:00
|
|
|
/* Check if it is the last registered name */
|
|
|
|
if (match_last_lsm(*result, new))
|
|
|
|
return 0;
|
2017-01-19 01:09:05 +00:00
|
|
|
cp = kasprintf(GFP_KERNEL, "%s,%s", *result, new);
|
|
|
|
if (cp == NULL)
|
|
|
|
return -ENOMEM;
|
|
|
|
kfree(*result);
|
|
|
|
*result = cp;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
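The behaviour of match_last_lsm() and lsm_append() can be exercised outside the kernel. The sketch below is a userspace approximation that swaps kstrdup()/kasprintf()/kfree() for malloc()/snprintf()/free(). The names ends_with_name() and append_name() are hypothetical, and it returns -1 rather than -ENOMEM on allocation failure.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for match_last_lsm(): is name the last list entry? */
static int ends_with_name(const char *list, const char *name)
{
	const char *last = strrchr(list, ',');

	/* Skip the comma, or compare against the whole list if there is none. */
	last = last ? last + 1 : list;
	return strcmp(last, name) == 0;
}

/* Userspace stand-in for lsm_append(): 0 on success, -1 if out of memory. */
static int append_name(const char *name, char **result)
{
	char *joined;
	size_t len;

	if (*result == NULL) {
		/* First entry: the list is just a copy of the name. */
		*result = malloc(strlen(name) + 1);
		if (*result == NULL)
			return -1;
		strcpy(*result, name);
		return 0;
	}

	/* Only the most recently appended name is checked for duplication. */
	if (ends_with_name(*result, name))
		return 0;

	/* Old list + ',' + new name + terminating NUL. */
	len = strlen(*result) + 1 + strlen(name) + 1;
	joined = malloc(len);
	if (joined == NULL)
		return -1;
	snprintf(joined, len, "%s,%s", *result, name);
	free(*result);
	*result = joined;
	return 0;
}
```

Appending "capability", "selinux", "foo", "foo" in that order produces "capability,selinux,foo", while "capability", "foo", "selinux", "foo" keeps both occurrences of "foo", matching the two cases described in the commit message.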
|
|
|
|
|
|
|
|
/**
|
|
|
|
* security_add_hooks - Add a module's hooks to the hook lists.
|
|
|
|
* @hooks: the hooks to add
|
|
|
|
* @count: the number of hooks to add
|
2023-09-12 20:56:46 +00:00
|
|
|
* @lsmid: the identification information for the security module
|
2017-01-19 01:09:05 +00:00
|
|
|
*
|
|
|
|
* Each LSM has to register its hooks with the infrastructure.
|
|
|
|
*/
|
|
|
|
void __init security_add_hooks(struct security_hook_list *hooks, int count,
|
2023-09-12 20:56:46 +00:00
|
|
|
const struct lsm_id *lsmid)
|
2017-01-19 01:09:05 +00:00
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
2023-09-12 20:56:47 +00:00
|
|
|
/*
|
|
|
|
* A security module may call security_add_hooks() more
|
|
|
|
* than once during initialization, and LSM initialization
|
|
|
|
* is serialized. Landlock is one such case.
|
|
|
|
* Look at the previous entry, if there is one, for duplication.
|
|
|
|
*/
|
|
|
|
if (lsm_active_cnt == 0 || lsm_idlist[lsm_active_cnt - 1] != lsmid) {
|
|
|
|
if (lsm_active_cnt >= LSM_CONFIG_COUNT)
|
|
|
|
panic("%s Too many LSMs registered.\n", __func__);
|
|
|
|
lsm_idlist[lsm_active_cnt++] = lsmid;
|
|
|
|
}
|
|
|
|
|
2017-01-19 01:09:05 +00:00
|
|
|
for (i = 0; i < count; i++) {
|
2023-09-12 20:56:46 +00:00
|
|
|
hooks[i].lsmid = lsmid;
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls.
These calls go through retpolines as a mitigation for speculative
attacks (Branch History / Target injection), which adds overhead that
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just due
to extra instructions but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. A subsequent patch changes this to
default to false to avoid the existing side-effect issues of BPF LSM
[1].
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook is
likely not to be present. In most cases this is still the better choice,
because even when an LSM with a single hook is added, empty slots are
created for all LSM hooks (especially when many LSMs that do not
initialize most hooks are present on the system).
There are some hooks that don't use call_int_hook or call_void_hook.
These hooks are updated to use a new macro called lsm_for_each_hook,
where the lsm_callback is invoked directly as an indirect call.
Below are results of the relevant Unixbench system benchmarks, run with
BPF LSM and SELinux enabled under their default policies, with and
without these patches.
Benchmark                                         Delta(%): (+ is better)
==========================================================================
Execl Throughput                                  +1.9356
File Write 1024 bufsize 2000 maxblocks            +6.5953
Pipe Throughput                                   +9.5499
Pipe-based Context Switching                      +3.0209
Process Creation                                  +2.3246
Shell Scripts (1 concurrent)                      +1.4975
System Call Overhead                              +2.7815
System Benchmarks Index Score (Partial Only):     +3.4859
In the best case, some syscalls such as eventfd_create improved by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
lsm_static_call_init(&hooks[i]);
|
2017-01-19 01:09:05 +00:00
|
|
|
}
|
2019-08-20 00:17:37 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Don't try to append during early_security_init(), we'll come back
|
|
|
|
* and fix this up afterwards.
|
|
|
|
*/
|
|
|
|
if (slab_is_available()) {
|
2023-09-12 20:56:46 +00:00
|
|
|
if (lsm_append(lsmid->name, &lsm_names) < 0)
|
2019-08-20 00:17:37 +00:00
|
|
|
panic("%s - Cannot get early memory.\n", __func__);
|
|
|
|
}
|
2017-01-19 01:09:05 +00:00
|
|
|
}
|
|
|
|
|
2019-06-14 12:20:14 +00:00
|
|
|
int call_blocking_lsm_notifier(enum lsm_event event, void *data)
|
2017-05-19 12:48:53 +00:00
|
|
|
{
|
2019-06-14 12:20:14 +00:00
|
|
|
return blocking_notifier_call_chain(&blocking_lsm_notifier_chain,
|
|
|
|
event, data);
|
2017-05-19 12:48:53 +00:00
|
|
|
}
|
2019-06-14 12:20:14 +00:00
|
|
|
EXPORT_SYMBOL(call_blocking_lsm_notifier);
|
2017-05-19 12:48:53 +00:00
|
|
|
|
2019-06-14 12:20:14 +00:00
|
|
|
int register_blocking_lsm_notifier(struct notifier_block *nb)
|
2017-05-19 12:48:53 +00:00
|
|
|
{
|
2019-06-14 12:20:14 +00:00
|
|
|
return blocking_notifier_chain_register(&blocking_lsm_notifier_chain,
|
|
|
|
nb);
|
2017-05-19 12:48:53 +00:00
|
|
|
}
|
2019-06-14 12:20:14 +00:00
|
|
|
EXPORT_SYMBOL(register_blocking_lsm_notifier);
|
2017-05-19 12:48:53 +00:00
|
|
|
|
2019-06-14 12:20:14 +00:00
|
|
|
int unregister_blocking_lsm_notifier(struct notifier_block *nb)
|
2017-05-19 12:48:53 +00:00
|
|
|
{
|
2019-06-14 12:20:14 +00:00
|
|
|
return blocking_notifier_chain_unregister(&blocking_lsm_notifier_chain,
|
|
|
|
nb);
|
2017-05-19 12:48:53 +00:00
|
|
|
}
|
2019-06-14 12:20:14 +00:00
|
|
|
EXPORT_SYMBOL(unregister_blocking_lsm_notifier);
|
2017-05-19 12:48:53 +00:00
|
|
|
|
2018-11-12 17:30:56 +00:00
|
|
|
/**
|
2024-07-10 21:32:27 +00:00
|
|
|
* lsm_blob_alloc - allocate a composite blob
|
|
|
|
* @dest: the destination for the blob
|
|
|
|
* @size: the size of the blob
|
2018-11-12 17:30:56 +00:00
|
|
|
* @gfp: allocation type
|
|
|
|
*
|
2024-07-10 21:32:27 +00:00
|
|
|
* Allocate a blob for all the modules
|
2018-11-12 17:30:56 +00:00
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
2024-07-10 21:32:27 +00:00
|
|
|
static int lsm_blob_alloc(void **dest, size_t size, gfp_t gfp)
|
2018-11-12 17:30:56 +00:00
|
|
|
{
|
2024-07-10 21:32:27 +00:00
|
|
|
if (size == 0) {
|
|
|
|
*dest = NULL;
|
2018-11-12 17:30:56 +00:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2024-07-10 21:32:27 +00:00
|
|
|
*dest = kzalloc(size, gfp);
|
|
|
|
if (*dest == NULL)
|
2018-11-12 17:30:56 +00:00
|
|
|
return -ENOMEM;
|
|
|
|
return 0;
|
|
|
|
}
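The zero-size shortcut in lsm_blob_alloc() can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: blob_alloc() is a hypothetical name, calloc() stands in for kzalloc(), and the gfp argument is dropped because it has no userspace equivalent.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of lsm_blob_alloc().
 * calloc() plays the role of kzalloc(): the returned memory is zeroed.
 */
static int blob_alloc(void **dest, size_t size)
{
	if (size == 0) {
		/* No module requested space, so the pointer is simply cleared. */
		*dest = NULL;
		return 0;
	}

	*dest = calloc(1, size);
	if (*dest == NULL)
		return -ENOMEM;
	return 0;
}
```

A caller therefore has to treat a NULL blob pointer as a valid result when the requested size is zero, and only a negative return value as a failure.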
|
|
|
|
|
2024-07-10 21:32:27 +00:00
|
|
|
/**
|
|
|
|
* lsm_cred_alloc - allocate a composite cred blob
|
|
|
|
* @cred: the cred that needs a blob
|
|
|
|
* @gfp: allocation type
|
|
|
|
*
|
|
|
|
* Allocate the cred blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
|
|
|
static int lsm_cred_alloc(struct cred *cred, gfp_t gfp)
|
|
|
|
{
|
|
|
|
return lsm_blob_alloc(&cred->security, blob_sizes.lbs_cred, gfp);
|
|
|
|
}
|
|
|
|
|
2018-11-12 17:30:56 +00:00
|
|
|
/**
|
|
|
|
* lsm_early_cred - during initialization allocate a composite cred blob
|
|
|
|
* @cred: the cred that needs a blob
|
|
|
|
*
|
2019-01-18 10:15:59 +00:00
|
|
|
* Allocate the cred blob for all the modules
|
2018-11-12 17:30:56 +00:00
|
|
|
*/
|
2019-01-18 10:15:59 +00:00
|
|
|
static void __init lsm_early_cred(struct cred *cred)
|
2018-11-12 17:30:56 +00:00
|
|
|
{
|
2019-01-18 10:15:59 +00:00
|
|
|
int rc = lsm_cred_alloc(cred, GFP_KERNEL);
|
2018-11-12 17:30:56 +00:00
|
|
|
|
|
|
|
if (rc)
|
|
|
|
panic("%s: Early cred alloc failed.\n", __func__);
|
|
|
|
}
|
|
|
|
|
2018-11-12 20:02:49 +00:00
|
|
|
/**
|
|
|
|
* lsm_file_alloc - allocate a composite file blob
|
|
|
|
* @file: the file that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the file blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
|
|
|
static int lsm_file_alloc(struct file *file)
|
|
|
|
{
|
|
|
|
if (!lsm_file_cache) {
|
|
|
|
file->f_security = NULL;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
file->f_security = kmem_cache_zalloc(lsm_file_cache, GFP_KERNEL);
|
|
|
|
if (file->f_security == NULL)
|
|
|
|
return -ENOMEM;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-09-22 00:19:29 +00:00
|
|
|
/**
|
|
|
|
* lsm_inode_alloc - allocate a composite inode blob
|
|
|
|
* @inode: the inode that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the inode blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
2024-07-16 01:22:51 +00:00
|
|
|
static int lsm_inode_alloc(struct inode *inode)
|
2018-09-22 00:19:29 +00:00
|
|
|
{
|
|
|
|
if (!lsm_inode_cache) {
|
|
|
|
inode->i_security = NULL;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
inode->i_security = kmem_cache_zalloc(lsm_inode_cache, GFP_NOFS);
|
|
|
|
if (inode->i_security == NULL)
|
|
|
|
return -ENOMEM;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-09-22 00:19:37 +00:00
|
|
|
/**
|
|
|
|
* lsm_task_alloc - allocate a composite task blob
|
|
|
|
* @task: the task that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the task blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
2019-01-16 05:44:32 +00:00
|
|
|
static int lsm_task_alloc(struct task_struct *task)
|
2018-09-22 00:19:37 +00:00
|
|
|
{
|
2024-07-10 21:32:27 +00:00
|
|
|
return lsm_blob_alloc(&task->security, blob_sizes.lbs_task, GFP_KERNEL);
|
2018-09-22 00:19:37 +00:00
|
|
|
}
|
|
|
|
|
2018-11-20 19:55:02 +00:00
|
|
|
/**
|
|
|
|
* lsm_ipc_alloc - allocate a composite ipc blob
|
|
|
|
* @kip: the ipc that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the ipc blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
2019-01-16 05:44:32 +00:00
|
|
|
static int lsm_ipc_alloc(struct kern_ipc_perm *kip)
|
2018-11-20 19:55:02 +00:00
|
|
|
{
|
2024-07-10 21:32:27 +00:00
|
|
|
return lsm_blob_alloc(&kip->security, blob_sizes.lbs_ipc, GFP_KERNEL);
|
2018-11-20 19:55:02 +00:00
|
|
|
}
|
|
|
|
|
2024-07-10 21:32:26 +00:00
|
|
|
#ifdef CONFIG_KEYS
|
|
|
|
/**
|
|
|
|
* lsm_key_alloc - allocate a composite key blob
|
|
|
|
* @key: the key that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the key blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
|
|
|
static int lsm_key_alloc(struct key *key)
|
|
|
|
{
|
2024-07-10 21:32:27 +00:00
|
|
|
return lsm_blob_alloc(&key->security, blob_sizes.lbs_key, GFP_KERNEL);
|
2024-07-10 21:32:26 +00:00
|
|
|
}
|
|
|
|
#endif /* CONFIG_KEYS */
|
|
|
|
|
2018-11-20 19:55:02 +00:00
|
|
|
/**
|
|
|
|
* lsm_msg_msg_alloc - allocate a composite msg_msg blob
|
|
|
|
* @mp: the msg_msg that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the msg_msg blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
2019-01-16 05:44:32 +00:00
|
|
|
static int lsm_msg_msg_alloc(struct msg_msg *mp)
|
2018-11-20 19:55:02 +00:00
|
|
|
{
|
2024-07-10 21:32:27 +00:00
|
|
|
return lsm_blob_alloc(&mp->security, blob_sizes.lbs_msg_msg,
|
|
|
|
GFP_KERNEL);
|
2018-11-20 19:55:02 +00:00
|
|
|
}
|
|
|
|
|
block,lsm: add LSM blob and new LSM hooks for block devices
This patch introduces a new LSM blob to the block_device structure,
enabling the security subsystem to store security-sensitive data related
to block devices. Currently, for a device-mapper mapped device containing
a dm-verity target, critical security information such as the roothash and
its signing state is not readily accessible. Specifically, while the
dm-verity volume creation process passes the dm-verity roothash and its
signature from userspace to the kernel, the roothash is stored privately
within the dm-verity target, and its signature is discarded
post-verification. This makes it extremely hard for the security subsystem
to utilize these data.
With the addition of the LSM blob to the block_device structure, the
security subsystem can now retain and manage important security metadata
such as the roothash and the signing state of a dm-verity volume by storing
them inside the blob. Access decisions can then be based on these stored data.
The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct superblock.
The initialization of the security blob occurs after the creation of the
struct block_device, performed by the security subsystem. Similarly, the
security blob is freed by the security subsystem before the struct
block_device is deallocated or freed.
This patch also introduces a new hook security_bdev_setintegrity() to save
block device's integrity data to the new LSM blob. For example, for
dm-verity, it can use this hook to expose its roothash and signing state
to LSMs, then LSMs can save these data into the LSM blob.
Please note that the new hook should be invoked every time the security
information is updated to keep these data current. For example, in
dm-verity, if the mapping table is reloaded and configured to use a
different dm-verity target with a new roothash and signing information,
the previously stored data in the LSM blob will become obsolete. It is
crucial to re-invoke the hook to refresh these data and ensure they are up
to date. This necessity arises from the design of device-mapper, where a
device-mapper device is first created, and then targets are subsequently
loaded into it. These targets can be modified multiple times during the
device's lifetime. Therefore, while the LSM blob is allocated during the
creation of the block device, its actual contents are not initialized at
this stage and can change substantially over time. This includes
alterations from data that the LSM 'trusts' to those it does not, making
it essential to handle these changes correctly. Failure to address this
dynamic aspect could potentially allow for bypassing LSM checks.
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: merge fuzz, subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-03 06:08:25 +00:00
|
|
|
/**
|
|
|
|
* lsm_bdev_alloc - allocate a composite block_device blob
|
|
|
|
* @bdev: the block_device that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the block_device blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
|
|
|
static int lsm_bdev_alloc(struct block_device *bdev)
|
|
|
|
{
|
|
|
|
if (blob_sizes.lbs_bdev == 0) {
|
|
|
|
bdev->bd_security = NULL;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
bdev->bd_security = kzalloc(blob_sizes.lbs_bdev, GFP_KERNEL);
|
|
|
|
if (!bdev->bd_security)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-09-22 00:19:37 +00:00
|
|
|
/**
|
|
|
|
* lsm_early_task - during initialization allocate a composite task blob
|
|
|
|
* @task: the task that needs a blob
|
|
|
|
*
|
2019-01-18 10:15:59 +00:00
|
|
|
* Allocate the task blob for all the modules
|
2018-09-22 00:19:37 +00:00
|
|
|
*/
|
2019-01-18 10:15:59 +00:00
|
|
|
static void __init lsm_early_task(struct task_struct *task)
|
2018-09-22 00:19:37 +00:00
|
|
|
{
|
2019-01-18 10:15:59 +00:00
|
|
|
int rc = lsm_task_alloc(task);
|
2018-09-22 00:19:37 +00:00
|
|
|
|
|
|
|
if (rc)
|
|
|
|
panic("%s: Early task alloc failed.\n", __func__);
|
|
|
|
}
|
|
|
|
|
2021-04-22 15:41:15 +00:00
|
|
|
/**
|
|
|
|
* lsm_superblock_alloc - allocate a composite superblock blob
|
|
|
|
* @sb: the superblock that needs a blob
|
|
|
|
*
|
|
|
|
* Allocate the superblock blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
|
|
|
static int lsm_superblock_alloc(struct super_block *sb)
|
|
|
|
{
|
2024-07-10 21:32:27 +00:00
|
|
|
return lsm_blob_alloc(&sb->s_security, blob_sizes.lbs_superblock,
|
|
|
|
GFP_KERNEL);
|
2021-04-22 15:41:15 +00:00
|
|
|
}
|
|
|
|
|
2023-09-12 20:56:52 +00:00
|
|
|
/**
|
|
|
|
* lsm_fill_user_ctx - Fill a user space lsm_ctx structure
|
2023-10-24 18:44:00 +00:00
|
|
|
* @uctx: a userspace LSM context to be filled
|
|
|
|
* @uctx_len: available uctx size (input), used uctx size (output)
|
|
|
|
* @val: the new LSM context value
|
|
|
|
* @val_len: the size of the new LSM context value
|
2023-09-12 20:56:52 +00:00
|
|
|
* @id: LSM id
|
|
|
|
* @flags: LSM defined flags
|
|
|
|
*
|
2024-03-14 01:37:48 +00:00
|
|
|
* Fill all of the fields in a userspace lsm_ctx structure. If @uctx is NULL
|
|
|
|
* simply calculate the required size to output via @uctx_len and return
|
|
|
|
* success.
|
2023-09-12 20:56:52 +00:00
|
|
|
*
|
2023-10-24 18:44:00 +00:00
|
|
|
* Returns 0 on success, -E2BIG if userspace buffer is not large enough,
|
|
|
|
* -EFAULT on a copyout error, -ENOMEM if memory can't be allocated.
|
2023-09-12 20:56:52 +00:00
|
|
|
*/
|
2024-03-14 15:31:26 +00:00
|
|
|
int lsm_fill_user_ctx(struct lsm_ctx __user *uctx, u32 *uctx_len,
|
2023-10-24 18:44:00 +00:00
|
|
|
void *val, size_t val_len,
|
|
|
|
u64 id, u64 flags)
|
2023-09-12 20:56:52 +00:00
|
|
|
{
|
2023-10-24 18:44:00 +00:00
|
|
|
struct lsm_ctx *nctx = NULL;
|
|
|
|
size_t nctx_len;
|
2023-09-12 20:56:52 +00:00
|
|
|
int rc = 0;
|
|
|
|
|
2023-11-01 21:39:44 +00:00
|
|
|
nctx_len = ALIGN(struct_size(nctx, ctx, val_len), sizeof(void *));
|
2023-10-24 18:44:00 +00:00
|
|
|
if (nctx_len > *uctx_len) {
|
|
|
|
rc = -E2BIG;
|
|
|
|
goto out;
|
|
|
|
}
|
2023-09-12 20:56:52 +00:00
|
|
|
|
2024-03-14 01:37:48 +00:00
|
|
|
/* no buffer - return success/0 and set @uctx_len to the req size */
|
|
|
|
if (!uctx)
|
|
|
|
goto out;
|
|
|
|
|
2023-10-24 18:44:00 +00:00
|
|
|
nctx = kzalloc(nctx_len, GFP_KERNEL);
|
|
|
|
if (nctx == NULL) {
|
|
|
|
rc = -ENOMEM;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
nctx->id = id;
|
|
|
|
nctx->flags = flags;
|
|
|
|
nctx->len = nctx_len;
|
|
|
|
nctx->ctx_len = val_len;
|
|
|
|
memcpy(nctx->ctx, val, val_len);
|
2023-09-12 20:56:52 +00:00
|
|
|
|
2023-10-24 18:44:00 +00:00
|
|
|
if (copy_to_user(uctx, nctx, nctx_len))
|
2023-09-12 20:56:52 +00:00
|
|
|
rc = -EFAULT;
|
|
|
|
|
2023-10-24 18:44:00 +00:00
|
|
|
out:
|
|
|
|
kfree(nctx);
|
|
|
|
*uctx_len = nctx_len;
|
2023-09-12 20:56:52 +00:00
|
|
|
return rc;
|
|
|
|
}
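The size handling in lsm_fill_user_ctx() (aligned record size, -E2BIG when the buffer is too small, NULL buffer as a size query) can be illustrated with a userspace sketch. Everything named here is an assumption made for illustration: struct ctx_rec is a hypothetical stand-in for struct lsm_ctx, ALIGN_UP approximates the kernel's ALIGN(), and the record is written directly into the caller's buffer instead of going through kzalloc() and copy_to_user().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct lsm_ctx: fixed header plus payload. */
struct ctx_rec {
	uint64_t id;
	uint64_t flags;
	uint64_t len;      /* total record size, including padding */
	uint64_t ctx_len;  /* payload size */
	uint8_t ctx[];
};

/* Round x up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

static int fill_ctx(struct ctx_rec *buf, uint32_t *buf_len,
		    const void *val, size_t val_len,
		    uint64_t id, uint64_t flags)
{
	size_t need = ALIGN_UP(sizeof(struct ctx_rec) + val_len, sizeof(void *));
	int rc = 0;

	if (need > *buf_len) {
		rc = -E2BIG;
		goto out;
	}

	/* No buffer: report success and hand back the required size. */
	if (!buf)
		goto out;

	memset(buf, 0, need);
	buf->id = id;
	buf->flags = flags;
	buf->len = need;
	buf->ctx_len = val_len;
	memcpy(buf->ctx, val, val_len);

out:
	/* The required size is reported on every path, including errors. */
	*buf_len = (uint32_t)need;
	return rc;
}
```

In this sketch a caller can pass a NULL buffer with a large length to learn the required size, allocate that much, and call again. A too-small length yields -E2BIG, and the length is still updated so the caller can retry.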
|
|
|
|
|
2020-03-29 00:43:50 +00:00
|
|
|
/*
|
|
|
|
* The default value of the LSM hook is defined in linux/lsm_hook_defs.h and
|
|
|
|
* can be accessed with:
|
|
|
|
*
|
|
|
|
* LSM_RET_DEFAULT(<hook_name>)
|
|
|
|
*
|
|
|
|
* The macros below define static constants for the default value of each
|
|
|
|
* LSM hook.
|
|
|
|
*/
|
|
|
|
#define LSM_RET_DEFAULT(NAME) (NAME##_default)
|
|
|
|
#define DECLARE_LSM_RET_DEFAULT_void(DEFAULT, NAME)
|
|
|
|
#define DECLARE_LSM_RET_DEFAULT_int(DEFAULT, NAME) \
|
LSM: Avoid warnings about potentially unused hook variables
Building with W=1 shows many unused const variable warnings. These can
be silenced, as we're well aware of their being potentially unused:
./include/linux/lsm_hook_defs.h:36:18: error: 'ptrace_access_check_default' defined but not used [-Werror=unused-const-variable=]
36 | LSM_HOOK(int, 0, ptrace_access_check, struct task_struct *child,
| ^~~~~~~~~~~~~~~~~~~
security/security.c:706:32: note: in definition of macro 'LSM_RET_DEFAULT'
706 | #define LSM_RET_DEFAULT(NAME) (NAME##_default)
| ^~~~
security/security.c:711:9: note: in expansion of macro 'DECLARE_LSM_RET_DEFAULT_int'
711 | DECLARE_LSM_RET_DEFAULT_##RET(DEFAULT, NAME)
| ^~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/lsm_hook_defs.h:36:1: note: in expansion of macro 'LSM_HOOK'
36 | LSM_HOOK(int, 0, ptrace_access_check, struct task_struct *child,
| ^~~~~~~~
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: linux-security-module@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-mm/202110131608.zms53FPR-lkp@intel.com/
Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks")
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-13 17:28:48 +00:00
|
|
|
static const int __maybe_unused LSM_RET_DEFAULT(NAME) = (DEFAULT);
|
2020-03-29 00:43:50 +00:00
|
|
|
#define LSM_HOOK(RET, DEFAULT, NAME, ...) \
|
|
|
|
DECLARE_LSM_RET_DEFAULT_##RET(DEFAULT, NAME)
|
|
|
|
|
|
|
|
#include <linux/lsm_hook_defs.h>
|
|
|
|
#undef LSM_HOOK
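The LSM_HOOK / DECLARE_LSM_RET_DEFAULT_##RET pattern above is an X-macro: the hook table is expanded once per use, and token pasting on the return type selects which helper macro runs. A minimal single-file sketch follows. The DEMO_HOOKS table and the demo_* hook names are hypothetical, and the table is a macro here only because the sketch cannot re-include a header the way the kernel re-includes linux/lsm_hook_defs.h.

```c
#include <errno.h>

/* Hypothetical hook table: (return type, default value, name). */
#define DEMO_HOOKS(X) \
	X(int, 0, demo_open) \
	X(int, -EOPNOTSUPP, demo_getattr) \
	X(void, ((void)0), demo_free)

#define RET_DEFAULT(NAME) (NAME##_default)

/* void hooks have no return value, so they get no constant at all. */
#define DECLARE_RET_DEFAULT_void(DEFAULT, NAME)
/* int hooks get one static constant named <hook>_default. */
#define DECLARE_RET_DEFAULT_int(DEFAULT, NAME) \
	static const int RET_DEFAULT(NAME) = (DEFAULT);

/* Pasting RET onto the helper name picks the _int or _void variant. */
#define HOOK(RET, DEFAULT, NAME) \
	DECLARE_RET_DEFAULT_##RET(DEFAULT, NAME)

DEMO_HOOKS(HOOK)
#undef HOOK

/* Look up the generated defaults by hook name. */
static int demo_open_result(void)    { return RET_DEFAULT(demo_open); }
static int demo_getattr_result(void) { return RET_DEFAULT(demo_getattr); }
```

Since demo_free is a void hook, no demo_free_default symbol exists, which mirrors why the kernel version defines DECLARE_LSM_RET_DEFAULT_void with an empty body.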
|
|
|
|
|
2015-05-02 22:11:29 +00:00
|
|
|
/*
|
2015-05-02 22:11:42 +00:00
|
|
|
* Hook list operation macros.
|
2005-04-16 22:20:36 +00:00
|
|
|
*
|
2015-05-02 22:11:29 +00:00
|
|
|
* call_void_hook:
|
|
|
|
* This is a hook that does not return a value.
|
2005-04-16 22:20:36 +00:00
|
|
|
*
|
2015-05-02 22:11:29 +00:00
|
|
|
* call_int_hook:
|
|
|
|
* This is a hook that returns a value.
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls.
These calls go through retpolines as a mitigation for speculative
attacks (Branch History / Target injection), which adds extra overhead
that is especially costly in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just due
to the extra instructions but also due to branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. A subsequent patch changes this key to
default to false, to avoid the existing side-effect issues of BPF LSM
[1].
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook is
likely not to be present. In most cases this is still the better choice,
because even when an LSM with a single hook is added, empty slots are
created for all LSM hooks (especially when many LSMs that do not
initialize most hooks are present on the system).
There are some hooks that don't use call_int_hook or call_void_hook.
These hooks are updated to use a new macro called lsm_for_each_hook,
where the lsm_callback is invoked directly as an indirect call.
Below are results of the relevant Unixbench system benchmarks, run with
BPF LSM and SELinux enabled under their default policies, with and
without these patches.
Benchmark                                         Delta(%): (+ is better)
==========================================================================
Execl Throughput                                  +1.9356
File Write 1024 bufsize 2000 maxblocks            +6.5953
Pipe Throughput                                   +9.5499
Pipe-based Context Switching                      +3.0209
Process Creation                                  +2.3246
Shell Scripts (1 concurrent)                      +1.4975
System Call Overhead                              +2.7815
System Benchmarks Index Score (Partial Only):     +3.4859
In the best case, some syscalls such as eventfd_create improved by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
#define __CALL_STATIC_VOID(NUM, HOOK, ...) \
|
|
|
|
do { \
|
|
|
|
if (static_branch_unlikely(&SECURITY_HOOK_ACTIVE_KEY(HOOK, NUM))) { \
|
|
|
|
static_call(LSM_STATIC_CALL(HOOK, NUM))(__VA_ARGS__); \
|
|
|
|
} \
|
|
|
|
} while (0);
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2024-08-16 15:43:07 +00:00
|
|
|
#define call_void_hook(HOOK, ...) \
|
|
|
|
do { \
|
|
|
|
LSM_LOOP_UNROLL(__CALL_STATIC_VOID, HOOK, __VA_ARGS__); \
|
2015-05-02 22:11:42 +00:00
|
|
|
} while (0)
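The unrolled dispatch in call_void_hook can be approximated in plain C. This is a sketch under stated assumptions, not the kernel mechanism: a bool flag stands in for the static key, a function pointer stands in for the static call target (so the calls here remain indirect, unlike the patched direct calls in the kernel), and MAX_SLOTS, UNROLL_3 and the demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 3 /* hypothetical fixed slot count */

typedef void (*void_hook_fn)(int *acc);

struct hook_slot {
	bool active;     /* stands in for the per-slot static key */
	void_hook_fn fn; /* stands in for the per-slot static call */
};

static struct hook_slot demo_slots[MAX_SLOTS];

/*
 * One slot check. The trailing semicolon lets the unroll macro place
 * several of these back to back, as __CALL_STATIC_VOID does.
 */
#define CALL_SLOT_VOID(NUM, ...) \
	do { \
		if (demo_slots[NUM].active) \
			demo_slots[NUM].fn(__VA_ARGS__); \
	} while (0);

/* Expand the per-slot macro once for each slot index. */
#define UNROLL_3(M, ...) \
	M(0, __VA_ARGS__) M(1, __VA_ARGS__) M(2, __VA_ARGS__)

#define call_void_demo(...) \
	do { \
		UNROLL_3(CALL_SLOT_VOID, __VA_ARGS__) \
	} while (0)

static void add_one(int *acc) { *acc += 1; }
static void add_ten(int *acc) { *acc += 10; }

/* Register two callbacks, leave the third slot empty, then dispatch. */
static int run_demo(void)
{
	int acc = 0;

	demo_slots[0] = (struct hook_slot){ .active = true, .fn = add_one };
	demo_slots[1] = (struct hook_slot){ .active = true, .fn = add_ten };
	demo_slots[2] = (struct hook_slot){ .active = false, .fn = NULL };

	call_void_demo(&acc);
	return acc;
}
```

Because the slot count is fixed at compile time, the dispatch compiles to a straight sequence of checks with no loop and no list walk. The inactive third slot is skipped by its flag, which is the role the static key plays in the kernel macros.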
|
|
|
|
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines leading to overhead, not just due
to extra instruction but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook
is likely to be not present. In most cases this is still a better choice
as even when an LSM with one hook is added, empty slots are created for
all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks with BPF LSM
and SELinux enabled with default policies enabled with and without these
patches.
Benchmark Delta(%): (+ is better)
==========================================================================
Execl Throughput +1.9356
File Write 1024 bufsize 2000 maxblocks +6.5953
Pipe Throughput +9.5499
Pipe-based Context Switching +3.0209
Process Creation +2.3246
Shell Scripts (1 concurrent) +1.4975
System Call Overhead +2.7815
System Benchmarks Index Score (Partial Only): +3.4859
In the best case, some syscalls like eventfd_create benefitted by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
|
|
|
|
#define __CALL_STATIC_INT(NUM, R, HOOK, LABEL, ...) \
|
|
|
|
do { \
|
|
|
|
if (static_branch_unlikely(&SECURITY_HOOK_ACTIVE_KEY(HOOK, NUM))) { \
|
|
|
|
R = static_call(LSM_STATIC_CALL(HOOK, NUM))(__VA_ARGS__); \
|
|
|
|
if (R != LSM_RET_DEFAULT(HOOK)) \
|
|
|
|
goto LABEL; \
|
|
|
|
} \
|
|
|
|
} while (0);
|
|
|
|
|
|
|
|
#define call_int_hook(HOOK, ...) \
|
|
|
|
({ \
|
|
|
|
__label__ OUT; \
|
|
|
|
int RC = LSM_RET_DEFAULT(HOOK); \
|
|
|
|
\
|
|
|
|
LSM_LOOP_UNROLL(__CALL_STATIC_INT, RC, HOOK, OUT, __VA_ARGS__); \
|
|
|
|
OUT: \
|
|
|
|
RC; \
|
2015-05-02 22:11:42 +00:00
|
|
|
})
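The control flow of call_int_hook above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names sketch_slot, sketch_call_int_hook, SKETCH_MAX_SLOTS and SKETCH_RET_DEFAULT are hypothetical, a plain bool stands in for the static key, and an ordinary function pointer stands in for the patched static call, so none of the kernel's code patching is reproduced. What it does show is the ordering rule: slots are visited in order, inactive slots are skipped, and the first result that differs from the default ends the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_SLOTS 3     /* hypothetical stand-in for MAX_LSM_COUNT */
#define SKETCH_RET_DEFAULT 0   /* hypothetical stand-in for LSM_RET_DEFAULT(HOOK) */

typedef int (*sketch_hook_fn)(int arg);

/* One slot per possible LSM; 'active' plays the role of the static key. */
struct sketch_slot {
	bool active;
	sketch_hook_fn fn;
};

/* Example callbacks: one always allows, one denies odd arguments. */
static int sketch_allow(int arg) { (void)arg; return 0; }
static int sketch_deny_odd(int arg) { return (arg % 2) ? -13 : 0; }

/* Walk the slots in order; the first non-default result ends the walk. */
static int sketch_call_int_hook(const struct sketch_slot *slots, size_t n,
				int arg)
{
	int rc = SKETCH_RET_DEFAULT;
	size_t i;

	assert(n <= SKETCH_MAX_SLOTS);
	for (i = 0; i < n; i++) {
		if (!slots[i].active)
			continue;	/* inactive slot: nothing is called */
		rc = slots[i].fn(arg);
		if (rc != SKETCH_RET_DEFAULT)
			break;		/* corresponds to "goto OUT" in the macro */
	}
	return rc;
}
```

With a table holding sketch_allow in the first slot and sketch_deny_odd in a later active slot, an even argument yields 0 and an odd argument yields the denying callback's value. The kernel macro unrolls this loop at compile time; the loop here is kept only for readability.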
|
2005-04-16 22:20:36 +00:00
|
|
|
|
lsm: replace indirect LSM hook calls with static calls
2024-08-16 15:43:07 +00:00
|
|
|
#define lsm_for_each_hook(scall, NAME) \
|
|
|
|
for (scall = static_calls_table.NAME; \
|
|
|
|
scall - static_calls_table.NAME < MAX_LSM_COUNT; scall++) \
|
|
|
|
if (static_key_enabled(&scall->active->key))
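The lsm_for_each_hook macro above combines a for header with a trailing if, so the statement written after the macro becomes the body of that if. A minimal userspace sketch of the same idiom follows; demo_slot, demo_for_each_enabled and DEMO_MAX_SLOTS are hypothetical names, and a bool field stands in for the static key test.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_SLOTS 4	/* hypothetical stand-in for MAX_LSM_COUNT */

struct demo_slot {
	bool enabled;	/* stands in for static_key_enabled(&scall->active->key) */
	int value;
};

/*
 * A for header followed by an if: the statement that follows the macro
 * invocation runs only for enabled slots. The table must hold exactly
 * DEMO_MAX_SLOTS entries.
 */
#define demo_for_each_enabled(slot, table)				\
	for ((slot) = (table);						\
	     (slot) - (table) < DEMO_MAX_SLOTS; (slot)++)		\
		if ((slot)->enabled)

/* Sum the values of the enabled slots only. */
static int demo_sum_enabled(struct demo_slot *table)
{
	struct demo_slot *slot;
	int sum = 0;

	demo_for_each_enabled(slot, table) {
		sum += slot->value;
	}
	return sum;
}
```

One consequence of this shape is that an else placed directly after the loop body would bind to the if hidden inside the macro, so callers of such a macro normally keep the body in braces and avoid a following else.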
|
|
|
|
|
2007-10-17 06:31:32 +00:00
|
|
|
/* Security operations */
|
|
|
|
|
2023-02-16 21:39:08 +00:00
|
|
|
/**
|
|
|
|
* security_binder_set_context_mgr() - Check if becoming binder ctx mgr is ok
|
|
|
|
* @mgr: task credentials of current binder process
|
|
|
|
*
|
|
|
|
* Check whether @mgr is allowed to be the binder context manager.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2021-10-12 16:56:13 +00:00
|
|
|
int security_binder_set_context_mgr(const struct cred *mgr)
|
2015-01-21 15:54:10 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(binder_set_context_mgr, mgr);
|
2015-01-21 15:54:10 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 21:39:08 +00:00
|
|
|
/**
|
|
|
|
* security_binder_transaction() - Check if a binder transaction is allowed
|
|
|
|
* @from: sending process
|
|
|
|
* @to: receiving process
|
|
|
|
*
|
|
|
|
* Check whether @from is allowed to invoke a binder transaction call to @to.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2021-10-12 16:56:13 +00:00
|
|
|
int security_binder_transaction(const struct cred *from,
|
|
|
|
const struct cred *to)
|
2015-01-21 15:54:10 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(binder_transaction, from, to);
|
2015-01-21 15:54:10 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 21:39:08 +00:00
|
|
|
/**
|
|
|
|
* security_binder_transfer_binder() - Check if a binder transfer is allowed
|
|
|
|
* @from: sending process
|
|
|
|
* @to: receiving process
|
|
|
|
*
|
|
|
|
* Check whether @from is allowed to transfer a binder reference to @to.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2021-10-12 16:56:13 +00:00
|
|
|
int security_binder_transfer_binder(const struct cred *from,
|
|
|
|
const struct cred *to)
|
2015-01-21 15:54:10 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(binder_transfer_binder, from, to);
|
2015-01-21 15:54:10 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 21:39:08 +00:00
|
|
|
/**
|
|
|
|
* security_binder_transfer_file() - Check if a binder file xfer is allowed
|
|
|
|
* @from: sending process
|
|
|
|
* @to: receiving process
|
|
|
|
* @file: file being transferred
|
|
|
|
*
|
|
|
|
* Check whether @from is allowed to transfer @file to @to.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2021-10-12 16:56:13 +00:00
|
|
|
int security_binder_transfer_file(const struct cred *from,
|
2023-08-12 15:31:08 +00:00
|
|
|
const struct cred *to, const struct file *file)
|
2015-01-21 15:54:10 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(binder_transfer_file, from, to, file);
|
2015-01-21 15:54:10 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_ptrace_access_check() - Check if tracing is allowed
|
|
|
|
* @child: target process
|
|
|
|
* @mode: PTRACE_MODE flags
|
|
|
|
*
|
|
|
|
* Check permission before allowing the current process to trace the @child
|
|
|
|
* process. Security modules may also want to perform a process tracing check
|
|
|
|
* during an execve in the bprm_set_creds hook of binprm_security_ops if the
|
|
|
|
* process is being traced and its security attributes would be changed by the
|
|
|
|
* execve.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2009-05-07 09:26:19 +00:00
|
|
|
int security_ptrace_access_check(struct task_struct *child, unsigned int mode)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(ptrace_access_check, child, mode);
|
security: Fix setting of PF_SUPERPRIV by __capable()
Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags of
the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.
__capable() is using neither atomic ops nor locking to protect t->flags. This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.
This patch further splits security_ptrace() in two:
(1) security_ptrace_may_access(). This passes judgement on whether one
process may access another only (PTRACE_MODE_ATTACH for ptrace() and
PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
current is the parent.
(2) security_ptrace_traceme(). This passes judgement on PTRACE_TRACEME only,
and takes only a pointer to the parent process. current is the child.
In Smack and commoncap, this uses has_capability() to determine whether
the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
This does not set PF_SUPERPRIV.
Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().
Of the places that were using __capable():
(1) The OOM killer calls __capable() thrice when weighing the killability of a
process. All of these now use has_capability().
(2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
whether the parent was allowed to trace any process. As mentioned above,
these have been split. For PTRACE_ATTACH and /proc, capable() is now
used, and for PTRACE_TRACEME, has_capability() is used.
(3) cap_safe_nice() only ever saw current, so now uses capable().
(4) smack_setprocattr() rejected accesses to tasks other than current just
after calling __capable(), so the order of these two tests has been
switched and capable() is used instead.
(5) In smack_file_send_sigiotask(), we need to allow privileged processes to
receive SIGIO on files they're manipulating.
(6) In smack_task_wait(), we let a process wait for a privileged process,
whether or not the process doing the waiting is privileged.
I've tested this with the LTP SELinux and syscalls testscripts.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-14 10:37:28 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_ptrace_traceme() - Check if tracing is allowed
|
|
|
|
* @parent: tracing process
|
|
|
|
*
|
|
|
|
* Check that the @parent process has sufficient permission to trace the
|
|
|
|
* current process before allowing the current process to present itself to the
|
|
|
|
* @parent process for tracing.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
security: Fix setting of PF_SUPERPRIV by __capable()
2008-08-14 10:37:28 +00:00
|
|
|
int security_ptrace_traceme(struct task_struct *parent)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(ptrace_traceme, parent);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_capget() - Get the capability sets for a process
|
|
|
|
* @target: target process
|
|
|
|
* @effective: effective capability set
|
|
|
|
* @inheritable: inheritable capability set
|
|
|
|
* @permitted: permitted capability set
|
|
|
|
*
|
|
|
|
* Get the @effective, @inheritable, and @permitted capability sets for the
|
|
|
|
* @target process. The hook may also perform permission checking to determine
|
|
|
|
* if the current process is allowed to see the capability sets of the @target
|
|
|
|
* process.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if the capability sets were successfully obtained.
|
|
|
|
*/
|
2023-08-07 06:59:29 +00:00
|
|
|
int security_capget(const struct task_struct *target,
|
2023-02-17 02:33:20 +00:00
|
|
|
kernel_cap_t *effective,
|
|
|
|
kernel_cap_t *inheritable,
|
|
|
|
kernel_cap_t *permitted)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(capget, target, effective, inheritable, permitted);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_capset() - Set the capability sets for a process
|
|
|
|
* @new: new credentials for the target process
|
|
|
|
* @old: current credentials of the target process
|
|
|
|
* @effective: effective capability set
|
|
|
|
* @inheritable: inheritable capability set
|
|
|
|
* @permitted: permitted capability set
|
|
|
|
*
|
|
|
|
* Set the @effective, @inheritable, and @permitted capability sets for the
|
|
|
|
* current process.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 and updates @new if permission is granted.
|
|
|
|
*/
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may be incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
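The prepare, modify, commit-or-abort sequence described above can be sketched in userspace C. This is a simplified illustration, not the kernel implementation: demo_cred, demo_prepare_creds, demo_abort_creds, demo_commit_creds and demo_set_uid are hypothetical names, and reference counting and RCU are left out entirely. The point shown is that the published credentials are reachable only through a const pointer, and a change becomes visible only when a fully prepared copy replaces it.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_cred {
	unsigned int uid;
};

static struct demo_cred demo_init_cred = { 1000 };

/* Published credentials are reachable only through a const pointer. */
static const struct demo_cred *demo_current = &demo_init_cred;

/* Make a private, writable copy of the current credentials. */
static struct demo_cred *demo_prepare_creds(void)
{
	struct demo_cred *new = malloc(sizeof(*new));

	if (new)
		*new = *demo_current;
	return new;
}

/* Discard a prepared copy that was never published. */
static void demo_abort_creds(struct demo_cred *new)
{
	free(new);
}

/* Publish the prepared copy and release the one it replaces. */
static int demo_commit_creds(struct demo_cred *new)
{
	const struct demo_cred *old = demo_current;

	demo_current = new;
	if (old != &demo_init_cred)
		free((void *)old);	/* a real implementation would drop a reference */
	return 0;
}

/* Change the uid through the copy; the policy check here is made up. */
static int demo_set_uid(unsigned int uid)
{
	struct demo_cred *new = demo_prepare_creds();

	if (!new)
		return -ENOMEM;
	if (uid == 0) {
		demo_abort_creds(new);	/* failure leaves demo_current untouched */
		return -EPERM;
	}
	new->uid = uid;
	return demo_commit_creds(new);
}
```

A rejected request leaves the published credentials exactly as they were, which is the property the copy-on-write scheme is meant to provide.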
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather than pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-13 23:39:23 +00:00
|
|
|
int security_capset(struct cred *new, const struct cred *old,
|
|
|
|
const kernel_cap_t *effective,
|
|
|
|
const kernel_cap_t *inheritable,
|
|
|
|
const kernel_cap_t *permitted)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(capset, new, old, effective, inheritable,
|
|
|
|
permitted);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_capable() - Check if a process has the necessary capability
|
|
|
|
* @cred: credentials to examine
|
|
|
|
* @ns: user namespace
|
|
|
|
* @cap: capability requested
|
|
|
|
* @opts: capability check options
|
|
|
|
*
|
|
|
|
* Check whether the credentials @cred grant the @cap capability in the user
|
|
|
|
* namespace @ns. @cap contains the capability <include/linux/capability.h>.
|
|
|
|
* @opts contains options for the capable check <include/linux/security.h>.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if the capability is granted.
|
|
|
|
*/
|
2019-01-08 00:10:53 +00:00
|
|
|
int security_capable(const struct cred *cred,
|
|
|
|
struct user_namespace *ns,
|
|
|
|
int cap,
|
|
|
|
unsigned int opts)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(capable, cred, ns, cap, opts);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_quotactl() - Check if a quotactl() syscall is allowed for this fs
|
|
|
|
* @cmds: commands
|
|
|
|
* @type: type
|
|
|
|
* @id: id
|
|
|
|
* @sb: filesystem
|
|
|
|
*
|
|
|
|
* Check whether the quotactl syscall is allowed for this @sb.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-08-23 06:44:41 +00:00
|
|
|
int security_quotactl(int cmds, int type, int id, const struct super_block *sb)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(quotactl, cmds, type, id, sb);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_quota_on() - Check if QUOTAON is allowed for a dentry
|
|
|
|
* @dentry: dentry
|
|
|
|
*
|
|
|
|
* Check whether QUOTAON is allowed for @dentry.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_quota_on(struct dentry *dentry)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(quota_on, dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_syslog() - Check if accessing the kernel message ring is allowed
|
|
|
|
* @type: SYSLOG_ACTION_* type
|
|
|
|
*
|
|
|
|
* Check permission before accessing the kernel message ring or changing
|
|
|
|
* logging to the console. See the syslog(2) manual page for an explanation of
|
|
|
|
* the @type values.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2010-11-15 23:36:29 +00:00
|
|
|
int security_syslog(int type)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(syslog, type);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_settime64() - Check if changing the system time is allowed
|
|
|
|
* @ts: new time
|
|
|
|
* @tz: timezone
|
|
|
|
*
|
|
|
|
* Check permission to change the system time. struct timespec64 is defined in
|
|
|
|
* <include/linux/time64.h> and timezone is defined in <include/linux/time.h>.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-04-08 06:02:11 +00:00
|
|
|
int security_settime64(const struct timespec64 *ts, const struct timezone *tz)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(settime, ts, tz);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_vm_enough_memory_mm() - Check if allocating a new mem map is allowed
|
|
|
|
* @mm: mm struct
|
|
|
|
* @pages: number of pages
|
|
|
|
*
|
|
|
|
* Check permissions for allocating a new virtual mapping. If all LSMs return
|
|
|
|
* 0, __vm_enough_memory() will be called with cap_sys_admin set. If at least
|
|
|
|
* one LSM returns a negative value, __vm_enough_memory() will be called with
|
|
|
|
* cap_sys_admin cleared.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted by the LSM infrastructure to the
|
|
|
|
* caller.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_vm_enough_memory_mm(struct mm_struct *mm, long pages)
|
|
|
|
{
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines leading to overhead, not just due
to extra instruction but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
While this patch uses static_branch_unlikely indicating that an LSM hook
is likely to be not present. In most cases this is still a better choice
as even when an LSM with one hook is added, empty slots are created for
all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks with BPF LSM
and SELinux enabled with default policies enabled with and without these
patches.
Benchmark Delta(%): (+ is better)
==========================================================================
Execl Throughput +1.9356
File Write 1024 bufsize 2000 maxblocks +6.5953
Pipe Throughput +9.5499
Pipe-based Context Switching +3.0209
Process Creation +2.3246
Shell Scripts (1 concurrent) +1.4975
System Call Overhead +2.7815
System Benchmarks Index Score (Partial Only): +3.4859
In the best case, some syscalls like eventfd_create benefitted to about
~10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
2015-05-02 22:11:42 +00:00
|
|
|
int cap_sys_admin = 1;
|
|
|
|
int rc;
|
|
|
|
|
|
|
|
/*
|
2024-07-24 02:06:58 +00:00
|
|
|
* The module will respond with 0 if it thinks the __vm_enough_memory()
|
|
|
|
* call should be made with the cap_sys_admin set. If all of the modules
|
|
|
|
* agree that it should be set it will. If any module thinks it should
|
|
|
|
* not be set it won't.
|
2015-05-02 22:11:42 +00:00
|
|
|
*/
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, vm_enough_memory) {
|
|
|
|
rc = scall->hl->hook.vm_enough_memory(mm, pages);
|
2024-07-24 02:06:58 +00:00
|
|
|
if (rc < 0) {
|
2015-05-02 22:11:42 +00:00
|
|
|
cap_sys_admin = 0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return __vm_enough_memory(mm, pages, cap_sys_admin);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
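
The loop above only decides which cap_sys_admin value is passed on to __vm_enough_memory(). A minimal userspace sketch of that "every module must agree" aggregation follows; the names vm_hook_fn, decide_cap_sys_admin, agree and object_to_large are hypothetical, and the 1024-page threshold is an arbitrary illustration, not kernel policy:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for one LSM's vm_enough_memory hook. */
typedef int (*vm_hook_fn)(long pages);

/*
 * Start from cap_sys_admin = 1 and clear it as soon as any hook
 * returns a negative value; the remaining hooks are not consulted.
 */
static int decide_cap_sys_admin(const vm_hook_fn *hooks, size_t n, long pages)
{
	int cap_sys_admin = 1;

	assert(hooks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (hooks[i](pages) < 0) {
			cap_sys_admin = 0;
			break;
		}
	}
	return cap_sys_admin;
}

/* Toy hooks: one always agrees, one objects to requests above 1024 pages. */
static int agree(long pages) { (void)pages; return 0; }
static int object_to_large(long pages) { return pages > 1024 ? -1 : 0; }
```

Note that the sketch returns only the flag. In the function above the final verdict still comes from __vm_enough_memory(), so a cleared flag does not by itself mean the allocation is refused.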
|
|
|
|
|
2023-02-07 22:06:51 +00:00
|
|
|
/**
|
|
|
|
* security_bprm_creds_for_exec() - Prepare the credentials for exec()
|
|
|
|
* @bprm: binary program information
|
|
|
|
*
|
|
|
|
* If the setup in prepare_exec_creds did not set up @bprm->cred->security
|
|
|
|
* properly for executing @bprm->file, update the LSM's portion of
|
|
|
|
* @bprm->cred->security to be what commit_creds needs to install for the new
|
|
|
|
* program. This hook may also optionally check permissions (e.g. for
|
|
|
|
* transitions between security domains). The hook must set @bprm->secureexec
|
|
|
|
* to 1 if AT_SECURE should be set to request libc enable secure mode. @bprm
|
|
|
|
* contains the linux_binprm structure.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if the hook is successful and permission is granted.
|
|
|
|
*/
|
2020-03-22 20:46:24 +00:00
|
|
|
int security_bprm_creds_for_exec(struct linux_binprm *bprm)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(bprm_creds_for_exec, bprm);
|
2020-03-22 20:46:24 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:06:51 +00:00
|
|
|
/**
|
|
|
|
* security_bprm_creds_from_file() - Update linux_binprm creds based on file
|
|
|
|
* @bprm: binary program information
|
|
|
|
* @file: associated file
|
|
|
|
*
|
|
|
|
* If @file is setpcap, suid, sgid or otherwise marked to change privilege upon
|
|
|
|
* exec, update @bprm->cred to reflect that change. This is called after
|
|
|
|
* finding the binary that will be executed without an interpreter. This
|
|
|
|
* ensures that the credentials will not be derived from a script that the
|
|
|
|
* binary will need to reopen, which when reopened may end up being a completely
|
|
|
|
* different file. This hook may also optionally check permissions (e.g. for
|
|
|
|
* transitions between security domains). The hook must set @bprm->secureexec
|
|
|
|
* to 1 if AT_SECURE should be set to request libc enable secure mode. The
|
|
|
|
* hook must add to @bprm->per_clear any personality flags that should be
|
|
|
|
* cleared from current->personality. @bprm contains the linux_binprm
|
|
|
|
* structure.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if the hook is successful and permission is granted.
|
|
|
|
*/
|
2023-08-23 07:17:29 +00:00
|
|
|
int security_bprm_creds_from_file(struct linux_binprm *bprm, const struct file *file)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(bprm_creds_from_file, bprm, file);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:06:51 +00:00
|
|
|
/**
|
|
|
|
* security_bprm_check() - Mediate binary handler search
|
|
|
|
* @bprm: binary program information
|
|
|
|
*
|
|
|
|
* This hook mediates the point when a search for a binary handler will begin.
|
|
|
|
* It allows a check against the @bprm->cred->security value which was set in
|
|
|
|
* the preceding creds_for_exec call. The argv list and envp list are reliably
|
|
|
|
* available in @bprm. This hook may be called multiple times during a single
|
|
|
|
* execve. @bprm contains the linux_binprm structure.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if the hook is successful and permission is granted.
|
|
|
|
*/
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-13 23:39:24 +00:00
|
|
|
int security_bprm_check(struct linux_binprm *bprm)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(bprm_check_security, bprm);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:06:51 +00:00
|
|
|
/**
|
|
|
|
* security_bprm_committing_creds() - Install creds for a process during exec()
|
|
|
|
* @bprm: binary program information
|
|
|
|
*
|
|
|
|
* Prepare to install the new security attributes of a process being
|
|
|
|
* transformed by an execve operation, based on the old credentials pointed to
|
|
|
|
* by @current->cred and the information set in @bprm->cred by the
|
|
|
|
* bprm_creds_for_exec hook. @bprm points to the linux_binprm structure. This
|
|
|
|
* hook is a good place to perform state changes on the process such as closing
|
|
|
|
* open file descriptors to which access will no longer be granted when the
|
|
|
|
* attributes are changed. This is called immediately before commit_creds().
|
|
|
|
*/
|
2023-08-23 07:47:56 +00:00
|
|
|
void security_bprm_committing_creds(const struct linux_binprm *bprm)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(bprm_committing_creds, bprm);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:06:51 +00:00
|
|
|
/**
|
|
|
|
* security_bprm_committed_creds() - Tidy up after cred install during exec()
|
|
|
|
* @bprm: binary program information
|
|
|
|
*
|
|
|
|
* Tidy up after the installation of the new security attributes of a process
|
|
|
|
* being transformed by an execve operation. The new credentials have, by this
|
|
|
|
* point, been set to @current->cred. @bprm points to the linux_binprm
|
|
|
|
* structure. This hook is a good place to perform state changes on the
|
|
|
|
* process such as clearing out non-inheritable signal state. This is called
|
|
|
|
* immediately after commit_creds().
|
|
|
|
*/
|
2023-08-23 08:16:40 +00:00
|
|
|
void security_bprm_committed_creds(const struct linux_binprm *bprm)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(bprm_committed_creds, bprm);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
When NFS superblocks are created by automounting, their LSM parameters
aren't set in the fs_context struct prior to sget_fc() being called,
leading to failure to match existing superblocks.
This bug leads to messages like the following appearing in dmesg when
fscache is enabled:
NFS: Cache volume key already in use (nfs,4.2,2,108,106a8c0,1,,,,100000,100000,2ee,3a98,1d4c,3a98,1)
Fix this by adding a new LSM hook to load fc->security for submount
creation.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/165962680944.3334508.6610023900349142034.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165962729225.3357250.14350728846471527137.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/165970659095.2812394.6868894171102318796.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166133579016.3678898.6283195019480567275.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/217595.1662033775@warthog.procyon.org.uk/ # v5
Fixes: 9bc61ab18b1d ("vfs: Introduce fs_context, switch vfs_kern_mount() to it.")
Fixes: 779df6a5480f ("NFS: Ensure security label is set for root inode")
Tested-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: "Christian Brauner (Microsoft)" <brauner@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Message-Id: <20230808-master-v9-1-e0ecde888221@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-08 11:34:20 +00:00
|
|
|
/**
|
|
|
|
* security_fs_context_submount() - Initialise fc->security
|
|
|
|
* @fc: new filesystem context
|
|
|
|
* @reference: reference superblock for submount/remount
|
|
|
|
*
|
|
|
|
* Fill out the ->security field for a new fs_context.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success or negative error code on failure.
|
|
|
|
*/
|
|
|
|
int security_fs_context_submount(struct fs_context *fc, struct super_block *reference)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(fs_context_submount, fc, reference);
|
2023-08-08 11:34:20 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:33:31 +00:00
|
|
|
/**
|
|
|
|
* security_fs_context_dup() - Duplicate a fs_context LSM blob
|
|
|
|
* @fc: destination filesystem context
|
|
|
|
* @src_fc: source filesystem context
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to @fc->security. This pointer is
|
|
|
|
* initialised to NULL by the caller. @fc indicates the new filesystem context.
|
|
|
|
* @src_fc indicates the original filesystem context.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success or a negative error code on failure.
|
|
|
|
*/
|
2018-12-23 21:02:47 +00:00
|
|
|
int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(fs_context_dup, fc, src_fc);
|
2018-12-23 21:02:47 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:33:31 +00:00
|
|
|
/**
|
|
|
|
* security_fs_context_parse_param() - Configure a filesystem context
|
|
|
|
* @fc: filesystem context
|
|
|
|
* @param: filesystem parameter
|
|
|
|
*
|
|
|
|
* Userspace provided a parameter to configure a superblock. The LSM can
|
|
|
|
* consume the parameter or return it to the caller for use elsewhere.
|
|
|
|
*
|
|
|
|
* Return: If the parameter is used by the LSM it should return 0, if it is
|
|
|
|
* returned to the caller -ENOPARAM is returned, otherwise a negative
|
|
|
|
* error code is returned.
|
|
|
|
*/
|
2022-01-27 04:51:00 +00:00
|
|
|
int security_fs_context_parse_param(struct fs_context *fc,
|
|
|
|
struct fs_parameter *param)
|
2018-11-01 23:07:24 +00:00
|
|
|
{
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines leading to overhead, not just due
to extra instruction but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook
is likely to be not present. In most cases this is still the better choice,
as even when an LSM with one hook is added, empty slots are created for
all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks, run with
and without these patches, with BPF LSM and SELinux enabled using their
default policies.
Benchmark                                      Delta(%) (+ is better)
==========================================================================
Execl Throughput                               +1.9356
File Write 1024 bufsize 2000 maxblocks         +6.5953
Pipe Throughput                                +9.5499
Pipe-based Context Switching                   +3.0209
Process Creation                               +2.3246
Shell Scripts (1 concurrent)                   +1.4975
System Call Overhead                           +2.7815
System Benchmarks Index Score (Partial Only)   +3.4859
In the best case, some syscalls like eventfd_create improved by about 10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
2022-01-27 04:51:00 +00:00
|
|
|
int trc;
|
|
|
|
int rc = -ENOPARAM;
|
|
|
|
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, fs_context_parse_param) {
|
|
|
|
trc = scall->hl->hook.fs_context_parse_param(fc, param);
|
2022-01-27 04:51:00 +00:00
|
|
|
if (trc == 0)
|
|
|
|
rc = 0;
|
|
|
|
else if (trc != -ENOPARAM)
|
|
|
|
return trc;
|
|
|
|
}
|
|
|
|
return rc;
|
2018-11-01 23:07:24 +00:00
|
|
|
}
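The return-value handling in security_fs_context_parse_param() can be sketched in userspace as follows. This is only an illustration under stated assumptions: the demo_* names, the hook typedef, and the array-based iteration are hypothetical stand-ins for lsm_for_each_hook, and DEMO_ENOPARAM is a local constant chosen for the demo rather than the kernel header definition.

```c
#include <assert.h>
#include <stddef.h>

/* Demo-only error values; assumptions, not taken from kernel headers. */
#define DEMO_ENOPARAM 519
#define DEMO_EINVAL   22

/* Hypothetical hook signature: each "LSM" inspects one parameter key. */
typedef int (*demo_parse_hook)(const char *key);

/* Hypothetical hook that does not recognize the parameter. */
static int demo_hook_ignores(const char *key) { (void)key; return -DEMO_ENOPARAM; }
/* Hypothetical hook that consumes the parameter. */
static int demo_hook_claims(const char *key) { (void)key; return 0; }
/* Hypothetical hook that rejects the parameter with a hard error. */
static int demo_hook_rejects(const char *key) { (void)key; return -DEMO_EINVAL; }

/*
 * Mirrors the loop above: start from -ENOPARAM, remember a 0 from any
 * hook, and return immediately on any error other than -ENOPARAM.
 */
static int demo_parse_param(const demo_parse_hook *hooks, size_t n,
			    const char *key)
{
	int rc = -DEMO_ENOPARAM;

	for (size_t i = 0; i < n; i++) {
		int trc = hooks[i](key);

		if (trc == 0)
			rc = 0;
		else if (trc != -DEMO_ENOPARAM)
			return trc;
	}
	return rc;
}
```

Note that a hook returning 0 does not stop the walk: later hooks still see the parameter, and a later hard error overrides the earlier success.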
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_alloc() - Allocate a super_block LSM blob
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to the sb->s_security field. The
|
|
|
|
* s_security field is initialized to NULL when the structure is allocated.
|
|
|
|
* @sb contains the super_block structure to be modified.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if operation was successful.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_sb_alloc(struct super_block *sb)
|
|
|
|
{
|
2021-04-22 15:41:15 +00:00
|
|
|
int rc = lsm_superblock_alloc(sb);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
2024-01-30 12:56:59 +00:00
|
|
|
rc = call_int_hook(sb_alloc_security, sb);
|
2021-04-22 15:41:15 +00:00
|
|
|
if (unlikely(rc))
|
|
|
|
security_sb_free(sb);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
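The allocate-then-roll-back shape of security_sb_alloc() can be sketched in userspace as follows. This is a minimal sketch under stated assumptions: struct demo_sb, the demo_* helpers, and the numeric error values are hypothetical, and calloc()/free() stand in for the kernel allocator. The real security_sb_free() additionally calls the sb_free_security hook before freeing, which this sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct super_block; only the blob pointer. */
struct demo_sb {
	void *security;
};

/* Stand-in for lsm_superblock_alloc(): attach a zeroed blob. */
static int demo_blob_alloc(struct demo_sb *sb, size_t size)
{
	sb->security = calloc(1, size);
	return sb->security ? 0 : -12; /* -12 plays the role of -ENOMEM */
}

/* Stand-in for security_sb_free(): release the blob and clear the field. */
static void demo_sb_free(struct demo_sb *sb)
{
	free(sb->security);
	sb->security = NULL;
}

/* Hypothetical hooks: one succeeds, one fails. */
static int demo_hook_ok(struct demo_sb *sb) { (void)sb; return 0; }
static int demo_hook_fail(struct demo_sb *sb) { (void)sb; return -13; }

/*
 * Same order as above: allocate the blob, run the hook, and undo the
 * allocation if the hook reports an error.
 */
static int demo_sb_alloc(struct demo_sb *sb, size_t size,
			 int (*hook)(struct demo_sb *))
{
	int rc = demo_blob_alloc(sb, size);

	if (rc)
		return rc;
	rc = hook(sb);
	if (rc)
		demo_sb_free(sb);
	return rc;
}
```

On the failure path the caller sees the hook's error code and a NULL blob pointer, so nothing is left for the caller to clean up.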
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_delete() - Release super_block LSM associated objects
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
*
|
|
|
|
* Release objects tied to a superblock (e.g. inodes). @sb contains the
|
|
|
|
* super_block structure being released.
|
|
|
|
*/
|
2021-04-22 15:41:16 +00:00
|
|
|
void security_sb_delete(struct super_block *sb)
|
|
|
|
{
|
|
|
|
call_void_hook(sb_delete, sb);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_free() - Free a super_block LSM blob
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
*
|
|
|
|
* Deallocate and clear the sb->s_security field. @sb contains the super_block
|
|
|
|
* structure to be modified.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_sb_free(struct super_block *sb)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(sb_free_security, sb);
|
2021-04-22 15:41:15 +00:00
|
|
|
kfree(sb->s_security);
|
|
|
|
sb->s_security = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_free_mnt_opts() - Free memory associated with mount options
|
2023-03-08 17:31:03 +00:00
|
|
|
* @mnt_opts: LSM processed mount options
|
2023-02-07 22:44:01 +00:00
|
|
|
*
|
|
|
|
* Free memory associated with @mnt_opts.
|
|
|
|
*/
|
2018-12-13 18:41:47 +00:00
|
|
|
void security_free_mnt_opts(void **mnt_opts)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2018-12-13 18:41:47 +00:00
|
|
|
if (!*mnt_opts)
|
|
|
|
return;
|
|
|
|
call_void_hook(sb_free_mnt_opts, *mnt_opts);
|
|
|
|
*mnt_opts = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2018-12-13 18:41:47 +00:00
|
|
|
EXPORT_SYMBOL(security_free_mnt_opts);
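security_free_mnt_opts() takes a pointer to the caller's pointer so that it can clear the caller's copy after the hook has released the options. A userspace sketch of that pattern follows. The demo_* names and the call counter are hypothetical, and free() stands in for whatever the LSM's sb_free_mnt_opts hook actually does.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts hook invocations so the early-return path can be observed. */
static int demo_free_calls;

/* Hypothetical stand-in for the sb_free_mnt_opts hook. */
static void demo_free_hook(void *opts)
{
	demo_free_calls++;
	free(opts);
}

/*
 * Same shape as above: do nothing when no options are attached,
 * otherwise release them and clear the caller's pointer so that a
 * second call becomes a no-op.
 */
static void demo_free_mnt_opts(void **mnt_opts)
{
	if (!*mnt_opts)
		return;
	demo_free_hook(*mnt_opts);
	*mnt_opts = NULL;
}
```

Clearing the pointer makes the function safe to call twice on the same variable, which simplifies error paths in callers.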
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_eat_lsm_opts() - Consume LSM mount options
|
|
|
|
* @options: mount options
|
2023-03-08 17:31:03 +00:00
|
|
|
* @mnt_opts: LSM processed mount options
|
2023-02-07 22:44:01 +00:00
|
|
|
*
|
|
|
|
* Scan @options for LSM mount options and save them in @mnt_opts.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, negative values on failure.
|
|
|
|
*/
|
2018-12-13 18:41:47 +00:00
|
|
|
int security_sb_eat_lsm_opts(char *options, void **mnt_opts)
|
2011-03-03 21:09:14 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_eat_lsm_opts, options, mnt_opts);
|
2011-03-03 21:09:14 +00:00
|
|
|
}
|
2018-11-17 17:09:18 +00:00
|
|
|
EXPORT_SYMBOL(security_sb_eat_lsm_opts);
|
2011-03-03 21:09:14 +00:00
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_mnt_opts_compat() - Check if new mount options are allowed
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
* @mnt_opts: new mount options
|
|
|
|
*
|
|
|
|
* Determine if the new mount options in @mnt_opts are allowed given the
|
|
|
|
* existing mounted filesystem at @sb. @sb is the superblock being compared.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if options are compatible.
|
|
|
|
*/
|
2021-02-27 03:37:55 +00:00
|
|
|
int security_sb_mnt_opts_compat(struct super_block *sb,
|
|
|
|
void *mnt_opts)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_mnt_opts_compat, sb, mnt_opts);
|
2021-02-27 03:37:55 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sb_mnt_opts_compat);
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_remount() - Verify no incompatible mount changes during remount
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
* @mnt_opts: (re)mount options
|
|
|
|
*
|
|
|
|
* Extracts security system specific mount options and verifies no changes are
|
|
|
|
* being made to those options.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-12-02 04:06:57 +00:00
|
|
|
int security_sb_remount(struct super_block *sb,
|
2018-12-13 18:41:47 +00:00
|
|
|
void *mnt_opts)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_remount, sb, mnt_opts);
|
2011-03-03 21:09:14 +00:00
|
|
|
}
|
2018-12-10 22:19:21 +00:00
|
|
|
EXPORT_SYMBOL(security_sb_remount);
|
2011-03-03 21:09:14 +00:00
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_kern_mount() - Check if a kernel mount is allowed
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
*
|
|
|
|
* Mount this @sb if allowed by permissions.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-08-23 09:01:28 +00:00
|
|
|
int security_sb_kern_mount(const struct super_block *sb)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_kern_mount, sb);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_show_options() - Output the mount options for a superblock
|
|
|
|
* @m: output file
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
*
|
|
|
|
* Show (print on @m) mount options for this @sb.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, negative values on failure.
|
|
|
|
*/
|
2008-07-03 23:47:13 +00:00
|
|
|
int security_sb_show_options(struct seq_file *m, struct super_block *sb)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_show_options, m, sb);
|
2008-07-03 23:47:13 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_statfs() - Check if accessing fs stats is allowed
|
|
|
|
* @dentry: superblock handle
|
|
|
|
*
|
|
|
|
* Check permission before obtaining filesystem statistics for the given
|
|
|
|
* mountpoint. @dentry is a handle on the superblock for the filesystem.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_sb_statfs(struct dentry *dentry)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_statfs, dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_mount() - Check permission for mounting a filesystem
|
|
|
|
* @dev_name: filesystem backing device
|
|
|
|
* @path: mount point
|
|
|
|
* @type: filesystem type
|
|
|
|
* @flags: mount flags
|
|
|
|
* @data: filesystem specific data
|
|
|
|
*
|
|
|
|
* Check permission before an object specified by @dev_name is mounted on the
|
|
|
|
* mount point named by @path. For an ordinary mount, @dev_name identifies a
|
|
|
|
* device if the file system type requires a device. For a remount
|
|
|
|
* (@flags & MS_REMOUNT), @dev_name is irrelevant. For a loopback/bind mount
|
|
|
|
* (@flags & MS_BIND), @dev_name identifies the pathname of the object being
|
|
|
|
* mounted.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 18:52:53 +00:00
|
|
|
int security_sb_mount(const char *dev_name, const struct path *path,
|
2023-02-17 02:33:20 +00:00
|
|
|
const char *type, unsigned long flags, void *data)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_mount, dev_name, path, type, flags, data);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_umount() - Check permission for unmounting a filesystem
|
|
|
|
* @mnt: mounted filesystem
|
|
|
|
* @flags: unmount flags
|
|
|
|
*
|
|
|
|
* Check permission before the @mnt file system is unmounted.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_sb_umount(struct vfsmount *mnt, int flags)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_umount, mnt, flags);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_pivotroot() - Check permissions for pivoting the rootfs
|
|
|
|
* @old_path: new location for current rootfs
|
|
|
|
* @new_path: location of the new rootfs
|
|
|
|
*
|
|
|
|
* Check permission before pivoting the root filesystem.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_sb_pivotroot(const struct path *old_path,
|
|
|
|
const struct path *new_path)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_pivotroot, old_path, new_path);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_set_mnt_opts() - Set the mount options for a filesystem
|
|
|
|
* @sb: filesystem superblock
|
|
|
|
* @mnt_opts: binary mount options
|
|
|
|
* @kern_flags: kernel flags (in)
|
|
|
|
* @set_kern_flags: kernel flags (out)
|
|
|
|
*
|
|
|
|
* Set the security relevant mount options used for a superblock.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2007-11-30 18:00:35 +00:00
|
|
|
int security_sb_set_mnt_opts(struct super_block *sb,
|
2023-02-17 02:33:20 +00:00
|
|
|
void *mnt_opts,
|
|
|
|
unsigned long kern_flags,
|
|
|
|
unsigned long *set_kern_flags)
|
2007-11-30 18:00:35 +00:00
|
|
|
{
|
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
2024-01-30 12:56:59 +00:00
|
|
|
int rc = mnt_opts ? -EOPNOTSUPP : LSM_RET_DEFAULT(sb_set_mnt_opts);
|
|
|
|
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, sb_set_mnt_opts) {
|
|
|
|
rc = scall->hl->hook.sb_set_mnt_opts(sb, mnt_opts, kern_flags,
|
2024-01-30 12:56:59 +00:00
|
|
|
set_kern_flags);
|
|
|
|
if (rc != LSM_RET_DEFAULT(sb_set_mnt_opts))
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
return rc;
|
2007-11-30 18:00:35 +00:00
|
|
|
}
|
2008-03-05 15:31:54 +00:00
|
|
|
EXPORT_SYMBOL(security_sb_set_mnt_opts);
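The loop in security_sb_set_mnt_opts() stops at the first hook whose result differs from the hook's default return value, and it reports -EOPNOTSUPP only when options were supplied but no hook is registered. A userspace sketch of that logic follows. The demo_* names and the array-based iteration are hypothetical, and DEMO_RET_DEFAULT stands in for LSM_RET_DEFAULT(sb_set_mnt_opts); its value of 0 here is an assumption made for the demo.

```c
#include <assert.h>
#include <stddef.h>

/* Demo-only constants; the default of 0 is an assumption. */
#define DEMO_RET_DEFAULT 0
#define DEMO_EOPNOTSUPP  95
#define DEMO_EINVAL      22

/* Hypothetical hook signature, reduced to the options pointer. */
typedef int (*demo_set_opts_hook)(void *mnt_opts);

/* Hypothetical hook that has no opinion and returns the default. */
static int demo_hook_default(void *mnt_opts)
{
	(void)mnt_opts;
	return DEMO_RET_DEFAULT;
}

/* Hypothetical hook that rejects the options. */
static int demo_hook_einval(void *mnt_opts)
{
	(void)mnt_opts;
	return -DEMO_EINVAL;
}

/*
 * Same shape as above: the initial value covers the "no hooks" case,
 * and the first non-default result ends the walk.
 */
static int demo_set_mnt_opts(const demo_set_opts_hook *hooks, size_t n,
			     void *mnt_opts)
{
	int rc = mnt_opts ? -DEMO_EOPNOTSUPP : DEMO_RET_DEFAULT;

	for (size_t i = 0; i < n; i++) {
		rc = hooks[i](mnt_opts);
		if (rc != DEMO_RET_DEFAULT)
			break;
	}
	return rc;
}
```

Once at least one hook runs, the initial -EOPNOTSUPP is overwritten, so a set of hooks that all return the default yields the default even when options were passed.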
|
2007-11-30 18:00:35 +00:00
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_sb_clone_mnt_opts() - Duplicate superblock mount options
|
2023-03-08 17:31:03 +00:00
|
|
|
* @oldsb: source superblock
|
|
|
|
* @newsb: destination superblock
|
2023-02-07 22:44:01 +00:00
|
|
|
* @kern_flags: kernel flags (in)
|
|
|
|
* @set_kern_flags: kernel flags (out)
|
|
|
|
*
|
|
|
|
* Copy all security options from a given superblock to another.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
selinux: make security_sb_clone_mnt_opts return an error on context mismatch
I had the following problem reported a while back. If you mount the
same filesystem twice using NFSv4 with different contexts, then the
second context= option is ignored. For instance:
# mount server:/export /mnt/test1
# mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
# ls -dZ /mnt/test1
drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test1
# ls -dZ /mnt/test2
drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test2
When we call into SELinux to set the context of a "cloned" superblock,
it will currently just bail out when it notices that we're reusing an
existing superblock. Since the existing superblock is already set up and
presumably in use, we can't go overwriting its context with the one from
the "original" sb. Because of this, the second context= option in this
case cannot take effect.
This patch fixes this by turning security_sb_clone_mnt_opts into an int
return operation. When it finds that the "new" superblock that it has
been handed is already set up, it checks to see whether the contexts on
the old superblock match it. If it does, then it will just return
success, otherwise it'll return -EBUSY and emit a printk to tell the
admin why the second mount failed.
Note that this patch may cause casualties. The NFSv4 code relies on
being able to walk down to an export from the pseudoroot. If you mount
filesystems that are nested within one another with different contexts,
then this patch will make those mounts fail in new and "exciting" ways.
For instance, suppose that /export is a separate filesystem on the
server:
# mount server:/ /mnt/test1
# mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
mount.nfs: an incorrect mount option was specified
...with the printk in the ring buffer. Because we *might* eventually
walk down to /mnt/test1/export, the mount is denied due to this patch.
The second mount needs the pseudoroot superblock, but that's already
present with the wrong context.
OTOH, if we mount these in the reverse order, then both mounts work,
because the pseudoroot superblock created when mounting /export is
discarded once that mount is done. If we then try to walk into that
directory, however, the automount fails for similar reasons:
# cd /mnt/test1/scratch/
-bash: cd: /mnt/test1/scratch: Device or resource busy
The story I've gotten from the SELinux folks that I've talked to is that
this is desirable behavior. In SELinux-land, mounting the same data
under different contexts is wrong -- there can be only one.
Cc: Steve Dickson <steved@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-04-01 12:14:24 +00:00
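The decision described in the commit message above (an already set up destination must match, otherwise -EBUSY) can be modeled in user space. This is a sketch under stated assumptions: `struct sb_model`, its fields, and `clone_mnt_opts_model()` are hypothetical names and are not the kernel's or SELinux's data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the per-superblock security state. */
struct sb_model {
	bool initialized;	/* true once a context has been applied */
	char context[64];	/* e.g. "system_u:object_r:nfs_t:s0" */
};

/*
 * A fresh destination inherits the source context. A destination that is
 * already set up is left untouched and must carry the same context;
 * a mismatch is reported as -EBUSY instead of being silently ignored.
 */
static int clone_mnt_opts_model(const struct sb_model *oldsb,
				struct sb_model *newsb)
{
	assert(oldsb != NULL && newsb != NULL);

	if (newsb->initialized) {
		if (strcmp(oldsb->context, newsb->context) == 0)
			return 0;
		return -EBUSY;
	}

	strncpy(newsb->context, oldsb->context, sizeof(newsb->context) - 1);
	newsb->context[sizeof(newsb->context) - 1] = '\0';
	newsb->initialized = true;
	return 0;
}
```

Returning an error here is what lets the caller fail the second mount rather than hand back a superblock with a different label than the one requested.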
|
|
|
int security_sb_clone_mnt_opts(const struct super_block *oldsb,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct super_block *newsb,
|
|
|
|
unsigned long kern_flags,
|
|
|
|
unsigned long *set_kern_flags)
|
2007-11-30 18:00:35 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sb_clone_mnt_opts, oldsb, newsb,
|
2023-02-17 02:33:20 +00:00
|
|
|
kern_flags, set_kern_flags);
|
2007-11-30 18:00:35 +00:00
|
|
|
}
|
2008-03-05 15:31:54 +00:00
|
|
|
EXPORT_SYMBOL(security_sb_clone_mnt_opts);
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_move_mount() - Check permissions for moving a mount
|
|
|
|
* @from_path: source mount point
|
|
|
|
* @to_path: destination mount point
|
|
|
|
*
|
|
|
|
* Check permission before a mount is moved.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_move_mount(const struct path *from_path,
|
|
|
|
const struct path *to_path)
|
2018-11-05 17:40:30 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(move_mount, from_path, to_path);
|
2018-11-05 17:40:30 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_notify() - Check if setting a watch is allowed
|
|
|
|
* @path: file path
|
|
|
|
* @mask: event mask
|
|
|
|
* @obj_type: file path type
|
|
|
|
*
|
|
|
|
* Check permissions before setting a watch on events as defined by @mask, on
|
|
|
|
* an object at @path, whose type is defined by @obj_type.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
fanotify, inotify, dnotify, security: add security hook for fs notifications
As of now, setting watches on filesystem objects has, at most, applied a
check for read access to the inode, and in the case of fanotify, requires
CAP_SYS_ADMIN. No specific security hook or permission check has been
provided to control the setting of watches. Using any of inotify, dnotify,
or fanotify, it is possible to observe, not only write-like operations, but
even read access to a file. Modeling the watch as being merely a read from
the file is insufficient for the needs of SELinux. This is due to the fact
that read access should not necessarily imply access to information about
when another process reads from a file. Furthermore, fanotify watches grant
more power to an application in the form of permission events. While
notification events are solely unidirectional (i.e. they only pass
information to the receiving application), permission events are blocking.
Permission events make a request to the receiving application which will
then reply with a decision as to whether or not that action may be
completed. This causes the issue of the watching application having the
ability to exercise control over the triggering process. Without drawing a
distinction within the permission check, the ability to read would imply
the greater ability to control an application. Additionally, mount and
superblock watches apply to all files within the same mount or superblock.
Read access to one file should not necessarily imply the ability to watch
all files accessed within a given mount or superblock.
In order to solve these issues, a new LSM hook is implemented and has been
placed within the system calls for marking filesystem objects with inotify,
fanotify, and dnotify watches. These calls to the hook are placed at the
point at which the target path has been resolved and are provided with the
path struct, the mask of requested notification events, and the type of
object on which the mark is being set (inode, superblock, or mount). The
mask and obj_type have already been translated into common FS_* values
shared by the entirety of the fs notification infrastructure. The path
struct is passed rather than just the inode so that the mount is available,
particularly for mount watches. This also allows for use of the hook by
pathname-based security modules. However, since the hook is intended for
use even by inode based security modules, it is not placed under the
CONFIG_SECURITY_PATH conditional. Otherwise, the inode-based security
modules would need to enable all of the path hooks, even though they do not
use any of them.
This only provides a hook at the point of setting a watch, and presumes
that permission to set a particular watch implies the ability to receive
all notifications about that object that match the mask. This is all that
is required for SELinux. If other security modules require additional hooks
or infrastructure to control delivery of notification, these can be added
by them. It does not make sense for us to propose hooks for which we have
no implementation. The understanding that all notifications received by the
requesting application are all strictly of a type for which the application
has been granted permission shows that this implementation is sufficient in
its coverage.
Security modules wishing to provide complete control over fanotify must
also implement a security_file_open hook that validates that the access
requested by the watching application is authorized. Fanotify has the issue
that it returns a file descriptor with the file mode specified during
fanotify_init() to the watching process on event. This is already covered
by the LSM security_file_open hook if the security module implements
checking of the requested file mode there. Otherwise, a watching process
can obtain escalated access to a file for which it has not been authorized.
The selinux_path_notify hook implementation works by adding five new file
permissions: watch, watch_mount, watch_sb, watch_reads, and watch_with_perm
(descriptions about which will follow), and one new filesystem permission:
watch (which is applied to superblock checks). The hook then decides which
subset of these permissions must be held by the requesting application
based on the contents of the provided mask and the obj_type. The
selinux_file_open hook already checks the requested file mode and therefore
ensures that a watching process cannot escalate its access through
fanotify.
The watch, watch_mount, and watch_sb permissions are the baseline
permissions for setting a watch on an object and each is a requirement for
any watch to be set on a file, mount, or superblock respectively. It should
be noted that having either of the other two permissions (watch_reads and
watch_with_perm) does not imply the watch, watch_mount, or watch_sb
permission. Superblock watches further require the filesystem watch
permission to the superblock. As there is no labeled object in view for
mounts, there is no specific check for mount watches beyond watch_mount to
the inode. Such a check could be added in the future, if a suitable labeled
object existed representing the mount.
The watch_reads permission is required to receive notifications from
read-exclusive events on filesystem objects. These events include accessing
a file for the purpose of reading and closing a file which has been opened
read-only. This distinction has been drawn in order to provide a direct
indication in the policy for this otherwise not obvious capability. Read
access to a file should not necessarily imply the ability to observe read
events on a file.
Finally, watch_with_perm only applies to fanotify masks since it is the
only way to set a mask which allows for blocking permission events.
This permission is needed for any watch which is of this type. Though
fanotify requires CAP_SYS_ADMIN, this is insufficient as it gives implicit
trust to root, which we do not do, and does not support least privilege.
Signed-off-by: Aaron Goidel <acgoide@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-12 15:20:00 +00:00
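The permission selection described in the commit message above can be sketched as a small pure function. Everything named here is an assumption made for illustration: the `EV_*` and `PERM_*` values, `enum obj_type_model`, and `required_perms_model()` are hypothetical and do not match the kernel's FS_* constants or SELinux's permission bits. The additional filesystem watch check for superblock watches is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical object types for the mark being set. */
enum obj_type_model { OBJ_INODE, OBJ_MOUNT, OBJ_SB };

/* Hypothetical event bits. */
#define EV_MODIFY	0x01u	/* write-like notification */
#define EV_ACCESS	0x02u	/* read-exclusive: file was read */
#define EV_CLOSE_NOWR	0x04u	/* read-exclusive: read-only open closed */
#define EV_OPEN_PERM	0x08u	/* blocking permission event */

/* Hypothetical permission bits, named after the commit message. */
#define PERM_WATCH		0x01u
#define PERM_WATCH_MOUNT	0x02u
#define PERM_WATCH_SB		0x04u
#define PERM_WATCH_READS	0x08u
#define PERM_WATCH_WITH_PERM	0x10u

/*
 * Start from the baseline permission for the object type, then add
 * watch_reads for read-exclusive events and watch_with_perm for
 * blocking permission events.
 */
static uint32_t required_perms_model(uint64_t mask, enum obj_type_model type)
{
	uint32_t perms;

	assert(type == OBJ_INODE || type == OBJ_MOUNT || type == OBJ_SB);

	switch (type) {
	case OBJ_MOUNT:
		perms = PERM_WATCH_MOUNT;
		break;
	case OBJ_SB:
		perms = PERM_WATCH_SB;
		break;
	default:
		perms = PERM_WATCH;
		break;
	}

	if (mask & (EV_ACCESS | EV_CLOSE_NOWR))
		perms |= PERM_WATCH_READS;
	if (mask & EV_OPEN_PERM)
		perms |= PERM_WATCH_WITH_PERM;

	return perms;
}
```

Note that the extra permissions are only ever added on top of a baseline, which mirrors the statement that watch_reads and watch_with_perm do not imply watch, watch_mount, or watch_sb.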
|
|
|
int security_path_notify(const struct path *path, u64 mask,
|
2023-02-17 02:33:20 +00:00
|
|
|
unsigned int obj_type)
|
2019-08-12 15:20:00 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_notify, path, mask, obj_type);
|
2019-08-12 15:20:00 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_alloc() - Allocate an inode LSM blob
|
|
|
|
* @inode: the inode
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to @inode->i_security. The
|
|
|
|
* i_security field is initialized to NULL when the inode structure is
|
|
|
|
* allocated.
|
|
|
|
*
|
|
|
|
 * Return: Returns 0 if the operation was successful.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_alloc(struct inode *inode)
|
|
|
|
{
|
2018-09-22 00:19:29 +00:00
|
|
|
int rc = lsm_inode_alloc(inode);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
2024-01-30 12:56:59 +00:00
|
|
|
rc = call_int_hook(inode_alloc_security, inode);
|
2018-09-22 00:19:29 +00:00
|
|
|
if (unlikely(rc))
|
|
|
|
security_inode_free(inode);
|
|
|
|
return rc;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void inode_free_by_rcu(struct rcu_head *head)
|
|
|
|
{
|
2024-07-09 23:43:06 +00:00
|
|
|
/* The rcu head is at the start of the inode blob */
|
|
|
|
call_void_hook(inode_free_security_rcu, head);
|
2018-09-22 00:19:29 +00:00
|
|
|
kmem_cache_free(lsm_inode_cache, head);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_free() - Free an inode's LSM blob
|
|
|
|
* @inode: the inode
|
|
|
|
*
|
2024-07-09 23:43:06 +00:00
|
|
|
* Release any LSM resources associated with @inode, although due to the
|
|
|
|
* inode's RCU protections it is possible that the resources will not be
|
|
|
|
* fully released until after the current RCU grace period has elapsed.
|
|
|
|
*
|
|
|
|
* It is important for LSMs to note that despite being present in a call to
|
|
|
|
* security_inode_free(), @inode may still be referenced in a VFS path walk
|
|
|
|
* and calls to security_inode_permission() may be made during, or after,
|
|
|
|
* a call to security_inode_free(). For this reason the inode->i_security
|
|
|
|
* field is released via a call_rcu() callback and any LSMs which need to
|
|
|
|
* retain inode state for use in security_inode_permission() should only
|
|
|
|
* release that state in the inode_free_security_rcu() LSM hook callback.
|
2023-02-08 21:31:55 +00:00
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_inode_free(struct inode *inode)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(inode_free_security, inode);
|
2024-07-09 23:43:06 +00:00
|
|
|
if (!inode->i_security)
|
|
|
|
return;
|
|
|
|
call_rcu((struct rcu_head *)inode->i_security, inode_free_by_rcu);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_dentry_init_security() - Perform dentry initialization
|
|
|
|
* @dentry: the dentry to initialize
|
|
|
|
* @mode: mode used to determine resource type
|
|
|
|
* @name: name of the last path component
|
|
|
|
* @xattr_name: name of the security/LSM xattr
|
|
|
|
* @ctx: pointer to the resulting LSM context
|
|
|
|
* @ctxlen: length of @ctx
|
|
|
|
*
|
|
|
|
* Compute a context for a dentry as the inode is not yet available since NFSv4
|
|
|
|
* has no label backed by an EA anyway. It is important to note that
|
|
|
|
 * @xattr_name does not need to be freed by the caller; it is a static string.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, negative values on failure.
|
|
|
|
*/
|
2013-05-22 16:50:34 +00:00
|
|
|
int security_dentry_init_security(struct dentry *dentry, int mode,
|
2021-10-12 13:23:07 +00:00
|
|
|
const struct qstr *name,
|
|
|
|
const char **xattr_name, void **ctx,
|
|
|
|
u32 *ctxlen)
|
2013-05-22 16:50:34 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(dentry_init_security, dentry, mode, name,
|
|
|
|
xattr_name, ctx, ctxlen);
|
2013-05-22 16:50:34 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_dentry_init_security);
|
|
|
|
|
2023-02-07 22:44:01 +00:00
|
|
|
/**
|
|
|
|
* security_dentry_create_files_as() - Perform dentry initialization
|
|
|
|
* @dentry: the dentry to initialize
|
|
|
|
* @mode: mode used to determine resource type
|
|
|
|
* @name: name of the last path component
|
|
|
|
* @old: creds to use for LSM context calculations
|
|
|
|
* @new: creds to modify
|
|
|
|
*
|
|
|
|
* Compute a context for a dentry as the inode is not yet available and set
|
|
|
|
* that context in passed in creds so that new files are created using that
|
|
|
|
* context. Context is calculated using the passed in creds and not the creds
|
|
|
|
* of the caller.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2016-07-13 14:44:52 +00:00
|
|
|
int security_dentry_create_files_as(struct dentry *dentry, int mode,
|
|
|
|
struct qstr *name,
|
|
|
|
const struct cred *old, struct cred *new)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(dentry_create_files_as, dentry, mode,
|
2023-02-17 02:33:20 +00:00
|
|
|
name, old, new);
|
2016-07-13 14:44:52 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_dentry_create_files_as);
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_init_security() - Initialize an inode's LSM context
|
|
|
|
* @inode: the inode
|
|
|
|
* @dir: parent directory
|
|
|
|
* @qstr: last component of the pathname
|
|
|
|
* @initxattrs: callback function to write xattrs
|
|
|
|
* @fs_data: filesystem specific data
|
|
|
|
*
|
|
|
|
* Obtain the security attribute name suffix and value to set on a newly
|
|
|
|
* created inode and set up the incore security field for the new inode. This
|
|
|
|
* hook is called by the fs code as part of the inode creation transaction and
|
|
|
|
* provides for atomic labeling of the inode, unlike the post_create/mkdir/...
|
security: Allow all LSMs to provide xattrs for inode_init_security hook
Currently, the LSM infrastructure supports only one LSM providing an xattr
and EVM calculating the HMAC on that xattr, plus other inode metadata.
Allow all LSMs to provide one or multiple xattrs, by extending the security
blob reservation mechanism. Introduce the new lbs_xattr_count field of the
lsm_blob_sizes structure, so that each LSM can specify how many xattrs it
needs, and the LSM infrastructure knows how many xattr slots it should
allocate.
Modify the inode_init_security hook definition, by passing the full
xattr array allocated in security_inode_init_security(), and the current
number of xattr slots in that array filled by LSMs. The first parameter
would allow EVM to access and calculate the HMAC on xattrs supplied by
other LSMs, the second to not leave gaps in the xattr array, when an LSM
requested but did not provide xattrs (e.g. if it is not initialized).
Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the
number specified in the lbs_xattr_count field of the lsm_blob_sizes
structure. During each call, lsm_get_xattr_slot() increments the number of
filled xattrs, so that at the next invocation it returns the next xattr
slot to fill.
Cleanup security_inode_init_security(). Unify the !initxattrs and
initxattrs case by simply not allocating the new_xattrs array in the
former. Update the documentation to reflect the changes, and fix the
description of the xattr name, as it is not allocated anymore.
Adapt both SELinux and Smack to use the new definition of the
inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and
fill the reserved slots in the xattr array.
Move the xattr->name assignment after the xattr->value one, so that it is
done only in case of successful memory allocation.
Finally, change the default return value of the inode_init_security hook
from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook
conventions.
Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org>
Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: minor comment and variable tweaks, approved by RS]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-10 07:57:35 +00:00
|
|
|
* hooks called by the VFS.
|
|
|
|
*
|
|
|
|
* The hook function is expected to populate the xattrs array, by calling
|
|
|
|
* lsm_get_xattr_slot() to retrieve the slots reserved by the security module
|
|
|
|
* with the lbs_xattr_count field of the lsm_blob_sizes structure. For each
|
|
|
|
* slot, the hook function should set ->name to the attribute name suffix
|
|
|
|
* (e.g. selinux), to allocate ->value (will be freed by the caller) and set it
|
|
|
|
* to the attribute value, to set ->value_len to the length of the value. If
|
|
|
|
* the security module does not use security attributes or does not wish to put
|
|
|
|
* a security attribute on this particular inode, then it should return
|
|
|
|
* -EOPNOTSUPP to skip this processing.
|
2023-02-08 21:31:55 +00:00
|
|
|
*
|
2023-07-26 07:39:05 +00:00
|
|
|
* Return: Returns 0 if the LSM successfully initialized all of the inode
|
|
|
|
* security attributes that are required, negative values otherwise.
|
2023-02-08 21:31:55 +00:00
|
|
|
*/
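The slot mechanism described in the comment above (each call hands out the next reserved slot and bumps the filled count) can be sketched in user space. The names `struct xattr_model` and `get_xattr_slot_model()` are hypothetical stand-ins and are not the kernel's definitions; the NULL-array case models a caller that did not allocate an xattr array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one xattr slot filled by a security module. */
struct xattr_model {
	const char *name;	/* attribute name suffix, e.g. "selinux" */
	void *value;		/* allocated by the module, freed by the caller */
	size_t value_len;	/* length of value */
};

/*
 * Return the next unfilled slot and increment the filled count, so the
 * following call returns the slot after it. Returns NULL, leaving the
 * count unchanged, when no array was provided.
 */
static struct xattr_model *get_xattr_slot_model(struct xattr_model *xattrs,
						int *xattr_count)
{
	assert(xattr_count != NULL);

	if (!xattrs)
		return NULL;
	return &xattrs[(*xattr_count)++];
}
```

Because the count only advances when a slot is actually taken, a module that reserved slots but provides no xattr leaves no gap in the array.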
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_init_security(struct inode *inode, struct inode *dir,
|
2011-06-06 19:29:25 +00:00
|
|
|
const struct qstr *qstr,
|
|
|
|
const initxattrs initxattrs, void *fs_data)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just due
to the extra instructions but also due to branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
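The idea of a fixed per-hook array of call slots can be modelled in user space. The following is only a rough sketch: it cannot patch call sites the way kernel static calls do, so an `enabled` flag stands in for the static key and a plain function pointer stands in for the patched call target. All names (`MAX_LSM_SLOTS`, `hook_slot`, `register_file_ioctl`, `model_file_ioctl`, `allow_all`, `deny_cmd_42`) are hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_LSM_SLOTS 3	/* hypothetical: one slot per LSM that may be enabled */

typedef int (*file_ioctl_hook_t)(int fd, unsigned int cmd);

struct hook_slot {
	int enabled;		/* stands in for the per-slot static key */
	file_ioctl_hook_t fn;	/* stands in for the patched call target */
};

static struct hook_slot file_ioctl_slots[MAX_LSM_SLOTS];

/* Fill the next free slot; meant to run once at "boot", in LSM order. */
int register_file_ioctl(file_ioctl_hook_t fn)
{
	assert(fn != NULL);
	for (size_t i = 0; i < MAX_LSM_SLOTS; i++) {
		if (!file_ioctl_slots[i].enabled) {
			file_ioctl_slots[i].fn = fn;
			file_ioctl_slots[i].enabled = 1;
			return 0;
		}
	}
	return -1;	/* no free slot left */
}

/* Walk the slots in order; the first non-zero result ends the walk. */
int model_file_ioctl(int fd, unsigned int cmd)
{
	for (size_t i = 0; i < MAX_LSM_SLOTS; i++) {
		int rc;

		if (!file_ioctl_slots[i].enabled)
			continue;	/* empty slot: nothing to call */
		rc = file_ioctl_slots[i].fn(fd, cmd);
		if (rc)
			return rc;
	}
	return 0;
}

/* Two toy callbacks used to exercise the dispatch order. */
int allow_all(int fd, unsigned int cmd)
{
	(void)fd;
	(void)cmd;
	return 0;
}

int deny_cmd_42(int fd, unsigned int cmd)
{
	(void)fd;
	return cmd == 42 ? -EACCES : 0;
}
```

Registering `allow_all` and then `deny_cmd_42` and calling `model_file_ioctl(fd, 42)` walks both slots in registration order and stops at the first non-zero result, which mirrors the test/jne pattern visible in the disassembly.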
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook
is likely to be not present. In most cases this is still a better choice,
as even when an LSM with one hook is added, empty slots are created for
all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are the results of the relevant Unixbench system benchmarks, run with
BPF LSM and SELinux enabled (default policies), with and without these
patches.
Benchmark                                        Delta(%): (+ is better)
==========================================================================
Execl Throughput                                  +1.9356
File Write 1024 bufsize 2000 maxblocks            +6.5953
Pipe Throughput                                   +9.5499
Pipe-based Context Switching                      +3.0209
Process Creation                                  +2.3246
Shell Scripts (1 concurrent)                      +1.4975
System Call Overhead                              +2.7815
System Benchmarks Index Score (Partial Only):     +3.4859
In the best case, some syscalls like eventfd_create improved by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
security: Allow all LSMs to provide xattrs for inode_init_security hook
Currently, the LSM infrastructure supports only one LSM providing an xattr
and EVM calculating the HMAC on that xattr, plus other inode metadata.
Allow all LSMs to provide one or multiple xattrs, by extending the security
blob reservation mechanism. Introduce the new lbs_xattr_count field of the
lsm_blob_sizes structure, so that each LSM can specify how many xattrs it
needs, and the LSM infrastructure knows how many xattr slots it should
allocate.
Modify the inode_init_security hook definition, by passing the full
xattr array allocated in security_inode_init_security(), and the current
number of xattr slots in that array filled by LSMs. The first parameter
would allow EVM to access and calculate the HMAC on xattrs supplied by
other LSMs, the second to not leave gaps in the xattr array, when an LSM
requested but did not provide xattrs (e.g. if it is not initialized).
Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the
number specified in the lbs_xattr_count field of the lsm_blob_sizes
structure. During each call, lsm_get_xattr_slot() increments the number of
filled xattrs, so that at the next invocation it returns the next xattr
slot to fill.
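The slot mechanism described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel code: `struct xattr_model`, `get_xattr_slot()` and `fill_label_slot()` are hypothetical stand-ins, and `malloc()` replaces the kernel allocator. It also follows the ordering mentioned later in this message, setting the name only after the value has been allocated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct xattr_model {		/* hypothetical stand-in for struct xattr */
	const char *name;	/* attribute name suffix, not allocated */
	void *value;		/* allocated by the hook, freed by the caller */
	size_t value_len;
};

/*
 * Return the next free slot and bump the fill count, so that the next
 * call returns the following slot. A NULL array means that the caller
 * did not reserve any slots.
 */
struct xattr_model *get_xattr_slot(struct xattr_model *xattrs, int *xattr_count)
{
	assert(xattr_count != NULL);
	if (!xattrs)
		return NULL;
	return &xattrs[(*xattr_count)++];
}

/* Model of a hook filling one reserved slot with a label. */
int fill_label_slot(struct xattr_model *xattrs, int *xattr_count,
		    const char *suffix, const char *label)
{
	struct xattr_model *slot = get_xattr_slot(xattrs, xattr_count);
	size_t len = strlen(label);
	void *copy;

	if (!slot)
		return 0;	/* no array reserved: nothing to fill */

	copy = malloc(len + 1);
	if (!copy)
		return -ENOMEM;
	memcpy(copy, label, len + 1);

	slot->value = copy;
	slot->value_len = len;
	slot->name = suffix;	/* set last, only after allocation succeeded */
	return 0;
}
```

Calling `fill_label_slot()` twice on the same array fills slots 0 and 1 and leaves the count at 2, so no gaps appear between filled slots.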
Clean up security_inode_init_security(). Unify the !initxattrs and
initxattrs case by simply not allocating the new_xattrs array in the
former. Update the documentation to reflect the changes, and fix the
description of the xattr name, as it is not allocated anymore.
Adapt both SELinux and Smack to use the new definition of the
inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and
fill the reserved slots in the xattr array.
Move the xattr->name assignment after the xattr->value one, so that it is
done only in case of successful memory allocation.
Finally, change the default return value of the inode_init_security hook
from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook
conventions.
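The return-value convention can be sketched in isolation. The model below is an assumption-laden simplification: it keeps only the loop over hooks and the final mapping of -EOPNOTSUPP to 0, and leaves out the initxattrs() call and the freeing of values. `init_hook_t`, `run_init_hooks()` and the three toy hooks are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*init_hook_t)(int *xattr_count);

/*
 * Call every hook in order. -EOPNOTSUPP means "this module provides no
 * xattr" and is not an error, so the walk continues. Any other non-zero
 * value aborts the walk and is returned to the caller.
 */
int run_init_hooks(const init_hook_t *hooks, size_t n, int *xattr_count)
{
	int ret = -EOPNOTSUPP;

	assert(hooks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		ret = hooks[i](xattr_count);
		if (ret && ret != -EOPNOTSUPP)
			return ret;
	}
	/* "Nobody provided an xattr" is reported as success. */
	return (ret == -EOPNOTSUPP) ? 0 : ret;
}

/* Toy hooks: one opts out, one fills a slot, one fails. */
int hook_opts_out(int *xattr_count)
{
	(void)xattr_count;
	return -EOPNOTSUPP;
}

int hook_provides_one(int *xattr_count)
{
	(*xattr_count)++;
	return 0;
}

int hook_fails(int *xattr_count)
{
	(void)xattr_count;
	return -ENOMEM;
}
```

With a default of -EOPNOTSUPP, a module that implements nothing behaves the same as one that explicitly opts out, which is the convention the last paragraph above refers to.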
Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org>
Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: minor comment and variable tweaks, approved by RS]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-10 07:57:35 +00:00
|
|
|
struct xattr *new_xattrs = NULL;
|
|
|
|
int ret = -EOPNOTSUPP, xattr_count = 0;
|
2011-06-06 19:29:25 +00:00
|
|
|
|
2007-10-17 06:31:32 +00:00
|
|
|
if (unlikely(IS_PRIVATE(inode)))
|
2011-08-15 14:13:18 +00:00
|
|
|
return 0;
|
2011-06-06 19:29:25 +00:00
|
|
|
|
2023-06-10 07:57:35 +00:00
|
|
|
if (!blob_sizes.lbs_xattr_count)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
if (initxattrs) {
|
evm: Make it independent from 'integrity' LSM
Define a new structure for EVM-specific metadata, called evm_iint_cache,
and embed it in the inode security blob. Introduce evm_iint_inode() to
retrieve metadata, and register evm_inode_alloc_security() for the
inode_alloc_security LSM hook, to initialize the structure (before
splitting metadata, this task was done by iint_init_always()).
Keep the non-NULL checks after calling evm_iint_inode() except in
evm_inode_alloc_security(), to take into account inodes for which
security_inode_alloc() was not called. When using shared metadata,
obtaining a NULL pointer from integrity_iint_find() meant that the file
wasn't in the IMA policy. Now, because IMA and EVM use disjoint metadata,
the EVM status has to be stored for every inode regardless of the IMA
policy.
Given that from now on EVM relies on its own metadata, remove the iint
parameter from evm_verifyxattr(). Also, directly retrieve the iint in
evm_verify_hmac(), called by both evm_verifyxattr() and
evm_verify_current_integrity(), since now there is no performance penalty
in retrieving EVM metadata (constant time).
Replicate the management of the IMA_NEW_FILE flag, by introducing
evm_post_path_mknod() and evm_file_release() to respectively set and clear
the newly introduced flag EVM_NEW_FILE, at the same time IMA does. Like for
IMA, select CONFIG_SECURITY_PATH when EVM is enabled, to ensure that files
are marked as new.
Unlike ima_post_path_mknod(), evm_post_path_mknod() cannot check if a file
must be appraised. Thus, it marks all affected files. Also, it does not
clear EVM_NEW_FILE depending on i_version, but that is not a problem
because IMA_NEW_FILE is always cleared when set in ima_check_last_writer().
Move the EVM-specific flag EVM_IMMUTABLE_DIGSIG to
security/integrity/evm/evm.h, since that definition is now unnecessary in
the common integrity layer.
Finally, switch to the LSM reservation mechanism for the EVM xattr, and
consequently decrement by one the number of xattrs to allocate in
security_inode_init_security().
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-02-15 10:31:11 +00:00
|
|
|
/* Allocate +1 as terminator. */
|
|
|
|
new_xattrs = kcalloc(blob_sizes.lbs_xattr_count + 1,
|
2023-06-10 07:57:35 +00:00
|
|
|
sizeof(*new_xattrs), GFP_NOFS);
|
|
|
|
if (!new_xattrs)
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, inode_init_security) {
|
|
|
|
ret = scall->hl->hook.inode_init_security(inode, dir, qstr, new_xattrs,
|
2023-06-10 07:57:35 +00:00
|
|
|
&xattr_count);
|
|
|
|
if (ret && ret != -EOPNOTSUPP)
|
|
|
|
goto out;
|
|
|
|
/*
|
|
|
|
* As documented in lsm_hooks.h, -EOPNOTSUPP in this context
|
|
|
|
* means that the LSM is not willing to provide an xattr, not
|
|
|
|
* that it wants to signal an error. Thus, continue to invoke
|
|
|
|
* the remaining LSMs.
|
|
|
|
*/
|
|
|
|
}
|
|
|
|
|
|
|
|
/* If initxattrs() is NULL, xattr_count is zero, skip the call. */
|
|
|
|
if (!xattr_count)
|
2011-06-06 19:29:25 +00:00
|
|
|
goto out;
|
2011-06-16 01:19:10 +00:00
|
|
|
|
2011-06-06 19:29:25 +00:00
|
|
|
ret = initxattrs(inode, new_xattrs, fs_data);
|
|
|
|
out:
|
2023-06-10 07:57:35 +00:00
|
|
|
for (; xattr_count > 0; xattr_count--)
|
|
|
|
kfree(new_xattrs[xattr_count - 1].value);
|
|
|
|
kfree(new_xattrs);
|
2011-06-06 19:29:25 +00:00
|
|
|
return (ret == -EOPNOTSUPP) ? 0 : ret;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_init_security);
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_init_security_anon() - Initialize an anonymous inode
|
|
|
|
* @inode: the inode
|
|
|
|
* @name: the anonymous inode class
|
|
|
|
* @context_inode: an optional related inode
|
|
|
|
*
|
|
|
|
* Set up the incore security field for the new anonymous inode and return
|
|
|
|
* whether the inode creation is permitted by the security module or not.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, -EACCES if the security module denies the
|
|
|
|
* creation of this inode, or another -errno upon other errors.
|
|
|
|
*/
|
2021-01-08 22:22:20 +00:00
|
|
|
int security_inode_init_security_anon(struct inode *inode,
|
|
|
|
const struct qstr *name,
|
|
|
|
const struct inode *context_inode)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_init_security_anon, inode, name,
|
2021-01-08 22:22:20 +00:00
|
|
|
context_inode);
|
|
|
|
}
|
|
|
|
|
2008-12-17 04:24:15 +00:00
|
|
|
#ifdef CONFIG_SECURITY_PATH
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_mknod() - Check if creating a special file is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: new file
|
|
|
|
* @mode: new file mode
|
|
|
|
* @dev: device number
|
|
|
|
*
|
|
|
|
* Check permissions when creating a file. Note that this hook is called even
|
|
|
|
* if mknod operation is being done for a regular file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_path_mknod(const struct path *dir, struct dentry *dentry,
|
|
|
|
umode_t mode, unsigned int dev)
|
2008-12-17 04:24:15 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dir->dentry))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_mknod, dir, dentry, mode, dev);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_path_mknod);
|
|
|
|
|
2024-02-15 10:31:02 +00:00
|
|
|
/**
|
security: Place security_path_post_mknod() where the original IMA call was
Commit 08abce60d63f ("security: Introduce path_post_mknod hook")
introduced security_path_post_mknod(), to replace the IMA-specific call
to ima_post_path_mknod().
For symmetry with security_path_mknod(), security_path_post_mknod() was
called after a successful mknod operation, for any file type, rather
than only for regular files at the time there was the IMA call.
However, as reported by VFS maintainers, a successful mknod operation does
not mean that the dentry always has an inode attached to it (for
example, not for FIFOs on a SAMBA mount).
If that condition happens, the kernel crashes when
security_path_post_mknod() attempts to verify if the inode associated with
the dentry is private.
Move security_path_post_mknod() where the ima_post_path_mknod() call was,
which is obviously correct from the IMA/EVM perspective. IMA/EVM are the only
in-kernel users, and only need to inspect regular files.
Reported-by: Steve French <smfrench@gmail.com>
Closes: https://lore.kernel.org/linux-kernel/CAH2r5msAVzxCUHHG8VKrMPUKQHmBpE6K9_vjhgDa1uAvwx4ppw@mail.gmail.com/
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 08abce60d63f ("security: Introduce path_post_mknod hook")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-04-03 07:57:29 +00:00
|
|
|
* security_path_post_mknod() - Update inode security after reg file creation
|
2024-02-15 10:31:02 +00:00
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: new file
|
|
|
|
*
|
2024-04-03 07:57:29 +00:00
|
|
|
* Update inode security field after a regular file has been created.
|
2024-02-15 10:31:02 +00:00
|
|
|
*/
|
|
|
|
void security_path_post_mknod(struct mnt_idmap *idmap, struct dentry *dentry)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return;
|
|
|
|
call_void_hook(path_post_mknod, idmap, dentry);
|
|
|
|
}
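The commit message quoted above attributes the crash to the wrapper reading the inode of a dentry that has none. The following is a minimal userspace sketch of that situation, not kernel code: every `model_*` name and `MODEL_S_PRIVATE` is hypothetical, and the `assert()` stands in for the NULL dereference the kernel would hit. It assumes, as the message describes, that the caller invokes the wrapper only after creating a regular file.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_S_PRIVATE 0x1u	/* hypothetical flag bit for this sketch */

struct model_inode { unsigned int flags; };
struct model_dentry { struct model_inode *inode; };	/* inode may be NULL */

static int model_hook_calls;	/* counts how often the modelled hook ran */

/* Modelled hook body: only records that it was reached. */
static void model_path_post_mknod_hook(struct model_dentry *dentry)
{
	(void)dentry;
	model_hook_calls++;
}

/* Like the wrapper in the listing, this reads the inode unconditionally. */
static void model_security_path_post_mknod(struct model_dentry *dentry)
{
	assert(dentry->inode != NULL);	/* the kernel would oops here instead */
	if (dentry->inode->flags & MODEL_S_PRIVATE)
		return;
	model_path_post_mknod_hook(dentry);
}

/* Modelled caller: invoke the wrapper only after creating a regular file. */
static void model_after_mknod(struct model_dentry *dentry, int is_regular)
{
	if (is_regular)
		model_security_path_post_mknod(dentry);
}
```

In this sketch a FIFO dentry without an inode never reaches the wrapper, so the unconditional inode read stays safe without adding a NULL check to the wrapper itself.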
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_mkdir() - Check if creating a new directory is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: new directory
|
|
|
|
* @mode: new directory mode
|
|
|
|
*
|
|
|
|
* Check permissions to create a new directory in the existing directory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_path_mkdir(const struct path *dir, struct dentry *dentry,
|
|
|
|
umode_t mode)
|
2008-12-17 04:24:15 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dir->dentry))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_mkdir, dir, dentry, mode);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
2010-12-24 14:48:35 +00:00
|
|
|
EXPORT_SYMBOL(security_path_mkdir);
|
2008-12-17 04:24:15 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_rmdir() - Check if removing a directory is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: directory to remove
|
|
|
|
*
|
|
|
|
* Check the permission to remove a directory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 19:13:39 +00:00
|
|
|
int security_path_rmdir(const struct path *dir, struct dentry *dentry)
|
2008-12-17 04:24:15 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dir->dentry))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_rmdir, dir, dentry);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_unlink() - Check if removing a hard link is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: file
|
|
|
|
*
|
|
|
|
* Check the permission to remove a hard link to a file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 19:13:39 +00:00
|
|
|
int security_path_unlink(const struct path *dir, struct dentry *dentry)
|
2008-12-17 04:24:15 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dir->dentry))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_unlink, dir, dentry);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
2010-12-24 14:48:35 +00:00
|
|
|
EXPORT_SYMBOL(security_path_unlink);
|
2008-12-17 04:24:15 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_symlink() - Check if creating a symbolic link is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: symbolic link
|
|
|
|
* @old_name: file pathname
|
|
|
|
*
|
|
|
|
* Check the permission to create a symbolic link to a file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 19:21:09 +00:00
|
|
|
int security_path_symlink(const struct path *dir, struct dentry *dentry,
|
2008-12-17 04:24:15 +00:00
|
|
|
const char *old_name)
|
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dir->dentry))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_symlink, dir, dentry, old_name);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_link() - Check if creating a hard link is allowed
|
|
|
|
* @old_dentry: existing file
|
|
|
|
* @new_dir: new parent directory
|
|
|
|
* @new_dentry: new link
|
|
|
|
*
|
|
|
|
* Check permission before creating a new hard link to a file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 19:27:45 +00:00
|
|
|
int security_path_link(struct dentry *old_dentry, const struct path *new_dir,
|
2008-12-17 04:24:15 +00:00
|
|
|
struct dentry *new_dentry)
|
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(old_dentry))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_link, old_dentry, new_dir, new_dentry);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_rename() - Check if renaming a file is allowed
|
|
|
|
* @old_dir: parent directory of the old file
|
|
|
|
* @old_dentry: the old file
|
|
|
|
* @new_dir: parent directory of the new file
|
|
|
|
* @new_dentry: the new file
|
|
|
|
* @flags: flags
|
|
|
|
*
|
|
|
|
* Check for permission to rename a file or directory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 19:27:45 +00:00
|
|
|
int security_path_rename(const struct path *old_dir, struct dentry *old_dentry,
|
|
|
|
const struct path *new_dir, struct dentry *new_dentry,
|
2014-04-01 15:08:43 +00:00
|
|
|
unsigned int flags)
|
2008-12-17 04:24:15 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(old_dentry)) ||
|
2023-02-17 02:33:20 +00:00
|
|
|
(d_is_positive(new_dentry) &&
|
|
|
|
IS_PRIVATE(d_backing_inode(new_dentry)))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2014-04-01 15:08:43 +00:00
|
|
|
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_rename, old_dir, old_dentry, new_dir,
|
2023-02-17 02:33:20 +00:00
|
|
|
new_dentry, flags);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
2010-12-24 14:48:35 +00:00
|
|
|
EXPORT_SYMBOL(security_path_rename);
|
2008-12-17 04:24:15 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_truncate() - Check if truncating a file is allowed
|
|
|
|
* @path: file
|
|
|
|
*
|
|
|
|
* Check permission before truncating the file indicated by path. Note that
|
|
|
|
* truncation permissions may also be checked based on already opened files,
|
|
|
|
* using the security_file_truncate() hook.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 18:22:01 +00:00
|
|
|
int security_path_truncate(const struct path *path)
|
2008-12-17 04:24:15 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(path->dentry))))
|
2008-12-17 04:24:15 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_truncate, path);
|
2008-12-17 04:24:15 +00:00
|
|
|
}
|
2009-10-04 12:49:47 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_chmod() - Check if changing the file's mode is allowed
|
|
|
|
* @path: file
|
|
|
|
* @mode: new mode
|
|
|
|
*
|
|
|
|
* Check for permission to change a mode of the file @path. The new mode is
|
|
|
|
* specified in @mode which is a bitmask of constants from
|
|
|
|
* <include/uapi/linux/stat.h>.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 18:56:23 +00:00
|
|
|
int security_path_chmod(const struct path *path, umode_t mode)
|
2009-10-04 12:49:47 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(path->dentry))))
|
2009-10-04 12:49:47 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_chmod, path, mode);
|
2009-10-04 12:49:47 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_chown() - Check if changing the file's owner/group is allowed
|
|
|
|
* @path: file
|
|
|
|
* @uid: file owner
|
|
|
|
* @gid: file group
|
|
|
|
*
|
|
|
|
* Check for permission to change owner/group of a file or directory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 18:44:41 +00:00
|
|
|
int security_path_chown(const struct path *path, kuid_t uid, kgid_t gid)
|
2009-10-04 12:49:47 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(path->dentry))))
|
2009-10-04 12:49:47 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_chown, path, uid, gid);
|
2009-10-04 12:49:47 +00:00
|
|
|
}
|
2009-10-04 12:49:48 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_path_chroot() - Check if changing the root directory is allowed
|
|
|
|
* @path: directory
|
|
|
|
*
|
|
|
|
* Check for permission to change root directory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-03-25 19:28:43 +00:00
|
|
|
int security_path_chroot(const struct path *path)
|
2009-10-04 12:49:48 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(path_chroot, path);
|
2009-10-04 12:49:48 +00:00
|
|
|
}
|
2023-02-17 02:33:20 +00:00
|
|
|
#endif /* CONFIG_SECURITY_PATH */
|
2008-12-17 04:24:15 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_create() - Check if creating a file is allowed
|
|
|
|
* @dir: the parent directory
|
|
|
|
* @dentry: the file being created
|
|
|
|
* @mode: requested file mode
|
|
|
|
*
|
|
|
|
* Check permission to create a regular file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_inode_create(struct inode *dir, struct dentry *dentry,
|
|
|
|
umode_t mode)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(dir)))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_create, dir, dentry, mode);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2009-04-03 15:42:40 +00:00
|
|
|
EXPORT_SYMBOL_GPL(security_inode_create);
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2024-02-15 10:31:03 +00:00
|
|
|
/**
|
|
|
|
* security_inode_post_create_tmpfile() - Update inode security of new tmpfile
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @inode: inode of the new tmpfile
|
|
|
|
*
|
|
|
|
* Update inode security data after a tmpfile has been created.
|
|
|
|
*/
|
|
|
|
void security_inode_post_create_tmpfile(struct mnt_idmap *idmap,
|
|
|
|
struct inode *inode)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(inode)))
|
|
|
|
return;
|
|
|
|
call_void_hook(inode_post_create_tmpfile, idmap, inode);
|
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_link() - Check if creating a hard link is allowed
|
|
|
|
* @old_dentry: existing file
|
|
|
|
* @dir: new parent directory
|
|
|
|
* @new_dentry: new link
|
|
|
|
*
|
|
|
|
* Check permission before creating a new hard link to a file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_link(struct dentry *old_dentry, struct inode *dir,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct dentry *new_dentry)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(old_dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_link, old_dentry, dir, new_dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_unlink() - Check if removing a hard link is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: file
|
|
|
|
*
|
|
|
|
* Check the permission to remove a hard link to a file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_unlink(struct inode *dir, struct dentry *dentry)
|
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_unlink, dir, dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
2023-03-08 17:31:03 +00:00
|
|
|
* security_inode_symlink() - Check if creating a symbolic link is allowed
|
2023-02-08 21:31:55 +00:00
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: symbolic link
|
|
|
|
* @old_name: existing filename
|
|
|
|
*
|
|
|
|
* Check the permission to create a symbolic link to a file.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_symlink(struct inode *dir, struct dentry *dentry,
|
2023-02-17 02:33:20 +00:00
|
|
|
const char *old_name)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(dir)))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_symlink, dir, dentry, old_name);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_mkdir() - Check if creating a new directory is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: new directory
|
|
|
|
* @mode: new directory mode
|
|
|
|
*
|
|
|
|
* Check permissions to create a new directory in the existing directory
|
|
|
|
* associated with inode structure @dir.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2011-07-26 05:41:39 +00:00
|
|
|
int security_inode_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(dir)))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_mkdir, dir, dentry, mode);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2009-04-03 15:42:40 +00:00
|
|
|
EXPORT_SYMBOL_GPL(security_inode_mkdir);
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_rmdir() - Check if removing a directory is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: directory to be removed
|
|
|
|
*
|
|
|
|
* Check the permission to remove a directory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_rmdir(struct inode *dir, struct dentry *dentry)
|
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_rmdir, dir, dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_mknod() - Check if creating a special file is allowed
|
|
|
|
* @dir: parent directory
|
|
|
|
* @dentry: new file
|
|
|
|
* @mode: new file mode
|
|
|
|
* @dev: device number
|
|
|
|
*
|
|
|
|
* Check permissions when creating a special file (or a socket or a fifo file
|
|
|
|
* created via the mknod system call). Note that if mknod operation is being
|
|
|
|
* done for a regular file, then the create hook will be called and not this
|
|
|
|
* hook.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_inode_mknod(struct inode *dir, struct dentry *dentry,
|
|
|
|
umode_t mode, dev_t dev)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(dir)))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_mknod, dir, dentry, mode, dev);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_rename() - Check if renaming a file is allowed
|
|
|
|
* @old_dir: parent directory of the old file
|
|
|
|
* @old_dentry: the old file
|
|
|
|
* @new_dir: parent directory of the new file
|
|
|
|
* @new_dentry: the new file
|
|
|
|
* @flags: flags
|
|
|
|
*
|
|
|
|
* Check for permission to rename a file or directory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_rename(struct inode *old_dir, struct dentry *old_dentry,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct inode *new_dir, struct dentry *new_dentry,
|
|
|
|
unsigned int flags)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2023-02-17 02:33:20 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(old_dentry)) ||
|
|
|
|
(d_is_positive(new_dentry) &&
|
|
|
|
IS_PRIVATE(d_backing_inode(new_dentry)))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2014-04-01 15:08:43 +00:00
|
|
|
|
|
|
|
if (flags & RENAME_EXCHANGE) {
|
2024-01-30 12:56:59 +00:00
|
|
|
int err = call_int_hook(inode_rename, new_dir, new_dentry,
|
2023-02-17 02:33:20 +00:00
|
|
|
old_dir, old_dentry);
|
2014-04-01 15:08:43 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_rename, old_dir, old_dentry,
|
2023-02-17 02:33:20 +00:00
|
|
|
new_dir, new_dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
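With `RENAME_EXCHANGE`, security_inode_rename() above runs the `inode_rename` hook twice: first with the new and old arguments swapped, then in the normal order, so both directions of the exchange are checked. A minimal userspace sketch of that control flow follows. The `model_*` names, the `locked` field and `MODEL_RENAME_EXCHANGE` are hypothetical stand-ins, and the modelled hook looks only at its source object, which is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_RENAME_EXCHANGE 0x2u	/* hypothetical flag for this sketch */

/* Hypothetical object: a "locked" object may not be moved away. */
struct model_obj { int locked; };

/* Modelled hook: denies (-1) when the source object is locked. */
static int model_rename_hook(const struct model_obj *src,
			     const struct model_obj *dst)
{
	(void)dst;
	assert(src != NULL);
	return src->locked ? -1 : 0;
}

/* Mirrors the listing: on exchange, check the reverse direction first. */
static int model_rename(const struct model_obj *old_obj,
			const struct model_obj *new_obj, unsigned int flags)
{
	if (flags & MODEL_RENAME_EXCHANGE) {
		int err = model_rename_hook(new_obj, old_obj);

		if (err)
			return err;
	}
	return model_rename_hook(old_obj, new_obj);
}
```

A plain rename onto a locked target passes in this model because only the source is inspected, while an exchange fails because the locked object also acts as a source.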
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_readlink() - Check if reading a symbolic link is allowed
|
|
|
|
* @dentry: link
|
|
|
|
*
|
|
|
|
* Check the permission to read the symbolic link.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_readlink(struct dentry *dentry)
|
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_readlink, dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_follow_link() - Check if following a symbolic link is allowed
|
|
|
|
* @dentry: link dentry
|
|
|
|
* @inode: link inode
|
|
|
|
* @rcu: true if in RCU-walk mode
|
|
|
|
*
|
|
|
|
* Check permission to follow a symbolic link when looking up a pathname. If
|
|
|
|
* @rcu is true, @inode is not stable.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2015-03-23 02:37:39 +00:00
|
|
|
int security_inode_follow_link(struct dentry *dentry, struct inode *inode,
|
|
|
|
bool rcu)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-03-23 02:37:39 +00:00
|
|
|
if (unlikely(IS_PRIVATE(inode)))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_follow_link, dentry, inode, rcu);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_permission() - Check if accessing an inode is allowed
|
|
|
|
* @inode: inode
|
|
|
|
* @mask: access mask
|
|
|
|
*
|
|
|
|
* Check permission before accessing an inode. This hook is called by the
|
|
|
|
* existing Linux permission function, so a security module can use it to
|
|
|
|
* provide additional checking for existing Linux permission checks. Notice
|
|
|
|
* that this hook is called when a file is opened (as well as many other
|
|
|
|
* operations), whereas the file_security_ops permission hook is called when
|
|
|
|
* the actual read/write operations are performed.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2008-07-17 13:37:02 +00:00
|
|
|
int security_inode_permission(struct inode *inode, int mask)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(inode)))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_permission, inode, mask);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_setattr() - Check if setting file attributes is allowed
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @attr: new attributes
|
|
|
|
*
|
|
|
|
* Check permission before setting file attributes. Note that the kernel call
|
|
|
|
* to notify_change is performed from several locations, whenever file
|
|
|
|
* attributes change (such as when a file is truncated, chown/chmod operations,
|
|
|
|
* transferring disk quotas, etc).
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-01-13 11:49:11 +00:00
|
|
|
int security_inode_setattr(struct mnt_idmap *idmap,
|
2022-06-21 14:14:53 +00:00
|
|
|
struct dentry *dentry, struct iattr *attr)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_setattr, idmap, dentry, attr);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2008-07-01 13:01:28 +00:00
|
|
|
EXPORT_SYMBOL_GPL(security_inode_setattr);
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2024-02-15 10:30:58 +00:00
|
|
|
/**
|
|
|
|
* security_inode_post_setattr() - Update the inode after a setattr operation
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @ia_valid: file attributes set
|
|
|
|
*
|
|
|
|
* Update the inode security field after file attributes have been set successfully.
|
|
|
|
*/
|
|
|
|
void security_inode_post_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
|
|
|
|
int ia_valid)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return;
|
|
|
|
call_void_hook(inode_post_setattr, idmap, dentry, ia_valid);
|
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_getattr() - Check if getting file attributes is allowed
|
|
|
|
* @path: file
|
|
|
|
*
|
|
|
|
* Check permission before obtaining file attributes.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2015-03-08 23:28:30 +00:00
|
|
|
int security_inode_getattr(const struct path *path)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(path->dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_getattr, path);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_setxattr() - Check if setting file xattrs is allowed
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @name: xattr name
|
|
|
|
* @value: xattr value
|
2023-03-08 17:31:03 +00:00
|
|
|
* @size: size of xattr value
|
2023-02-08 21:31:55 +00:00
|
|
|
* @flags: flags
|
|
|
|
*
|
lsm: fixup the inode xattr capability handling
The current security_inode_setxattr() and security_inode_removexattr()
hooks rely on individual LSMs to either call into the associated
capability hooks (cap_inode_setxattr() or cap_inode_removexattr()), or
return a magic value of 1 to indicate that the LSM layer itself should
perform the capability checks. Unfortunately, with the default return
value for these LSM hooks being 0, an individual LSM hook returning a
1 will cause the LSM hook processing to exit early, potentially
skipping a LSM. Thankfully, with the exception of the BPF LSM, none
of the LSMs which currently register inode xattr hooks should end up
returning a value of 1, and in the BPF LSM case, with the BPF LSM hooks
executing last there should be no real harm in stopping processing of
the LSM hooks. However, the reliance on the individual LSMs to either
call the capability hooks themselves, or signal the LSM with a return
value of 1, is fragile and relies on a specific set of LSMs being
enabled. This patch is an effort to resolve, or minimize, these
issues.
Before we discuss the solution, there are a few observations and
considerations that we need to take into account:
* BPF LSM registers an implementation for every LSM hook, and that
implementation simply returns the hook's default return value, a
0 in this case. We want to ensure that the default BPF LSM behavior
results in the capability checks being called.
* SELinux and Smack do not expect the traditional capability checks
to be applied to the xattrs that they "own".
* SELinux and Smack are currently written in such a way that the
xattr capability checks happen before any additional LSM specific
access control checks. SELinux does apply SELinux specific access
controls to all xattrs, even those not "owned" by SELinux.
* IMA and EVM also register xattr hooks but assume that the LSM layer
and specific LSMs have already authorized the basic xattr operation.
In order to ensure we perform the capability based access controls
before the individual LSM access controls, perform only one capability
access control check for each operation, and clarify the logic around
applying the capability controls, we need a mechanism to determine if
any of the enabled LSMs "own" a particular xattr and want to take
responsibility for controlling access to that xattr. The solution in
this patch is to create a new LSM hook, 'inode_xattr_skipcap', that is
not exported to the rest of the kernel via a security_XXX() function,
but is used by the LSM layer to determine if a LSM wants to control
access to a given xattr and avoid the traditional capability controls.
Registering an inode_xattr_skipcap hook is optional, if a LSM declines
to register an implementation, or uses an implementation that simply
returns the default value (0), there is no effect as the LSM continues
to enforce the capability based controls (unless another LSM takes
ownership of the xattr). If none of the LSMs signal that the
capability checks should be skipped, the capability check is performed
and if access is granted the individual LSM xattr access control hooks
are executed, keeping with the DAC-before-LSM convention.
Cc: stable@vger.kernel.org
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-05-02 21:57:51 +00:00
|
|
|
* This hook performs the desired permission checks before setting the extended
|
|
|
|
* attributes (xattrs) on @dentry. It is important to note that we have some
|
|
|
|
* additional logic before the main LSM implementation calls to detect if we
|
|
|
|
* need to perform an additional capability check at the LSM layer.
|
|
|
|
*
|
|
|
|
* Normally we enforce a capability check prior to executing the various LSM
|
|
|
|
* hook implementations, but if a LSM wants to avoid this capability check,
|
|
|
|
* it can register an 'inode_xattr_skipcap' hook and return a value of 1 for
|
|
|
|
* xattrs that it wants to avoid the capability check, leaving the LSM fully
|
|
|
|
* responsible for enforcing the access control for the specific xattr. If all
|
|
|
|
* of the enabled LSMs refrain from registering an 'inode_xattr_skipcap' hook,
|
|
|
|
* or return a 0 (the default return value), the capability check is still
|
|
|
|
* performed. If no 'inode_xattr_skipcap' hooks are registered the capability
|
|
|
|
* check is performed.
|
2023-02-08 21:31:55 +00:00
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-01-13 11:49:23 +00:00
|
|
|
int security_inode_setxattr(struct mnt_idmap *idmap,
|
2021-01-21 13:19:29 +00:00
|
|
|
struct dentry *dentry, const char *name,
|
2008-04-29 07:59:41 +00:00
|
|
|
const void *value, size_t size, int flags)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-05-02 21:57:51 +00:00
|
|
|
int rc;
|
2011-03-09 19:38:26 +00:00
|
|
|
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2015-05-02 22:11:42 +00:00
|
|
|
|
2024-05-02 21:57:51 +00:00
|
|
|
/* enforce the capability checks at the lsm layer, if needed */
|
|
|
|
if (!call_int_hook(inode_xattr_skipcap, name)) {
|
|
|
|
rc = cap_inode_setxattr(dentry, name, value, size, flags);
|
|
|
|
if (rc)
|
|
|
|
return rc;
|
|
|
|
}
|
|
|
|
|
|
|
|
return call_int_hook(inode_setxattr, idmap, dentry, name, value, size,
|
|
|
|
flags);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
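The ordering in the function above (ask whether any LSM "owns" the xattr, run the capability check only if none does, then run the LSM hooks) can be sketched in userspace. This is a minimal illustration, not kernel code. Every `toy_`/`sketch_` name, the `security.toy` xattr, and the simplified capability rule are assumptions made up for the example; `call_hooks()` only imitates the "first non-zero result wins" behaviour that the commit message describes for hooks whose default return value is 0.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Imitates call_int_hook() for hooks whose default is 0:
 * return the first non-zero result, otherwise 0. */
static int call_hooks(int (*const *hooks)(const char *), size_t n,
		      const char *name)
{
	for (size_t i = 0; i < n; i++) {
		int rc = hooks[i](name);

		if (rc)
			return rc;
	}
	return 0;
}

/* Hypothetical LSM that "owns" one xattr and asks to skip the cap check. */
static int toy_skipcap(const char *name)
{
	return strcmp(name, "security.toy") == 0;
}

/* Hypothetical LSM that never claims ownership (returns the default, 0). */
static int default_skipcap(const char *name)
{
	(void)name;
	return 0;
}

/* Simplified stand-in for the capability check: "security." xattrs
 * require privilege, everything else is allowed. */
static int toy_cap_check(const char *name, int privileged)
{
	if (strncmp(name, "security.", 9) == 0 && !privileged)
		return -EPERM;
	return 0;
}

/* Hypothetical per-LSM access control hook that allows everything. */
static int toy_lsm_setxattr(const char *name)
{
	(void)name;
	return 0;
}

int sketch_inode_setxattr(const char *name, int privileged)
{
	static int (*const skipcap_hooks[])(const char *) = {
		toy_skipcap, default_skipcap,
	};
	static int (*const setxattr_hooks[])(const char *) = {
		toy_lsm_setxattr,
	};

	/* capability check first, unless some LSM took ownership */
	if (!call_hooks(skipcap_hooks, 2, name)) {
		int rc = toy_cap_check(name, privileged);

		if (rc)
			return rc;
	}

	/* then the individual LSM access controls */
	return call_hooks(setxattr_hooks, 1, name);
}
```

In this sketch an unprivileged write to `security.toy` succeeds because the owning LSM opted out of the capability check, while an unprivileged write to any other `security.` xattr is rejected before an LSM hook runs.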
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_set_acl() - Check if setting posix acls is allowed
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @acl_name: acl name
|
|
|
|
* @kacl: acl struct
|
|
|
|
*
|
|
|
|
* Check permission before setting posix acls, the posix acls in @kacl are
|
|
|
|
* identified by @acl_name.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-01-13 11:49:24 +00:00
|
|
|
int security_inode_set_acl(struct mnt_idmap *idmap,
|
2022-09-22 15:17:07 +00:00
|
|
|
struct dentry *dentry, const char *acl_name,
|
|
|
|
struct posix_acl *kacl)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_set_acl, idmap, dentry, acl_name, kacl);
|
2022-09-22 15:17:07 +00:00
|
|
|
}
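The `IS_PRIVATE()` guard that opens this wrapper, and most of the wrappers that follow, can be illustrated with a small userspace sketch. The struct, the flag value and the deny-all hook are hypothetical stand-ins chosen for the example. The only point is that a private inode returns 0 before any hook is consulted.

```c
#include <errno.h>

/* Hypothetical flag bit for the sketch; not the kernel's S_PRIVATE value. */
#define SKETCH_S_PRIVATE 0x1u

struct sketch_inode {
	unsigned int i_flags;
};

/* A deliberately strict stand-in hook that denies every request. */
static int sketch_deny_all(const struct sketch_inode *inode)
{
	(void)inode;
	return -EACCES;
}

/* Same shape as the wrapper above: private inodes bypass the hooks. */
int sketch_inode_set_acl(const struct sketch_inode *inode)
{
	if (inode->i_flags & SKETCH_S_PRIVATE)
		return 0;
	return sketch_deny_all(inode);
}
```

Even with a hook that refuses everything, the private inode is allowed through, which is why the guard sits ahead of the `call_int_hook()` invocation.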
|
|
|
|
|
2024-02-15 10:31:04 +00:00
|
|
|
/**
|
|
|
|
* security_inode_post_set_acl() - Update inode security from posix acls set
|
|
|
|
* @dentry: file
|
|
|
|
* @acl_name: acl name
|
|
|
|
* @kacl: acl struct
|
|
|
|
*
|
|
|
|
* Update inode security data after successfully setting posix acls on @dentry.
|
|
|
|
* The posix acls in @kacl are identified by @acl_name.
|
|
|
|
*/
|
|
|
|
void security_inode_post_set_acl(struct dentry *dentry, const char *acl_name,
|
|
|
|
struct posix_acl *kacl)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return;
|
|
|
|
call_void_hook(inode_post_set_acl, dentry, acl_name, kacl);
|
2022-09-22 15:17:07 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_get_acl() - Check if reading posix acls is allowed
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @acl_name: acl name
|
|
|
|
*
|
|
|
|
* Check permission before getting posix acls, the posix acls are identified by
|
|
|
|
* @acl_name.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-01-13 11:49:24 +00:00
|
|
|
int security_inode_get_acl(struct mnt_idmap *idmap,
|
2022-09-22 15:17:07 +00:00
|
|
|
struct dentry *dentry, const char *acl_name)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_get_acl, idmap, dentry, acl_name);
|
2022-09-22 15:17:07 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_remove_acl() - Check if removing a posix acl is allowed
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @acl_name: acl name
|
|
|
|
*
|
|
|
|
* Check permission before removing posix acls, the posix acls are identified
|
|
|
|
* by @acl_name.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-01-13 11:49:24 +00:00
|
|
|
int security_inode_remove_acl(struct mnt_idmap *idmap,
|
2022-09-22 15:17:07 +00:00
|
|
|
struct dentry *dentry, const char *acl_name)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_remove_acl, idmap, dentry, acl_name);
|
2022-09-22 15:17:07 +00:00
|
|
|
}
|
|
|
|
|
2024-02-15 10:31:05 +00:00
|
|
|
/**
|
|
|
|
* security_inode_post_remove_acl() - Update inode security after rm posix acls
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @acl_name: acl name
|
|
|
|
*
|
|
|
|
* Update inode security data after successfully removing posix acls on
|
|
|
|
* @dentry in @idmap. The posix acls are identified by @acl_name.
|
|
|
|
*/
|
|
|
|
void security_inode_post_remove_acl(struct mnt_idmap *idmap,
|
|
|
|
struct dentry *dentry, const char *acl_name)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return;
|
|
|
|
call_void_hook(inode_post_remove_acl, idmap, dentry, acl_name);
|
2022-09-22 15:17:07 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_post_setxattr() - Update the inode after a setxattr operation
|
|
|
|
* @dentry: file
|
|
|
|
* @name: xattr name
|
|
|
|
* @value: xattr value
|
|
|
|
* @size: xattr value size
|
|
|
|
* @flags: flags
|
|
|
|
*
|
|
|
|
* Update inode security field after successful setxattr operation.
|
|
|
|
*/
|
2008-04-29 07:59:41 +00:00
|
|
|
void security_inode_post_setxattr(struct dentry *dentry, const char *name,
|
|
|
|
const void *value, size_t size, int flags)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return;
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(inode_post_setxattr, dentry, name, value, size, flags);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_getxattr() - Check if xattr access is allowed
|
|
|
|
* @dentry: file
|
|
|
|
* @name: xattr name
|
|
|
|
*
|
|
|
|
* Check permission before obtaining the extended attributes identified by
|
|
|
|
* @name for @dentry.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2008-04-29 07:59:41 +00:00
|
|
|
int security_inode_getxattr(struct dentry *dentry, const char *name)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_getxattr, dentry, name);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_listxattr() - Check if listing xattrs is allowed
|
|
|
|
* @dentry: file
|
|
|
|
*
|
|
|
|
* Check permission before obtaining the list of extended attribute names for
|
|
|
|
* @dentry.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_inode_listxattr(struct dentry *dentry)
|
|
|
|
{
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_listxattr, dentry);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_removexattr() - Check if removing an xattr is allowed
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: file
|
|
|
|
* @name: xattr name
|
|
|
|
*
|
2024-05-02 21:57:51 +00:00
|
|
|
* This hook performs the desired permission checks before removing the extended
|
|
|
|
* attributes (xattrs) on @dentry. It is important to note that we have some
|
|
|
|
* additional logic before the main LSM implementation calls to detect if we
|
|
|
|
* need to perform an additional capability check at the LSM layer.
|
|
|
|
*
|
|
|
|
* Normally we enforce a capability check prior to executing the various LSM
|
|
|
|
* hook implementations, but if a LSM wants to avoid this capability check,
|
|
|
|
* it can register a 'inode_xattr_skipcap' hook and return a value of 1 for
|
|
|
|
* xattrs that it wants to avoid the capability check, leaving the LSM fully
|
|
|
|
* responsible for enforcing the access control for the specific xattr. If all
|
|
|
|
* of the enabled LSMs refrain from registering a 'inode_xattr_skipcap' hook,
|
|
|
|
* or return a 0 (the default return value), the capability check is still
|
|
|
|
* performed. If no 'inode_xattr_skipcap' hooks are registered the capability
|
|
|
|
* check is performed.
|
2023-02-08 21:31:55 +00:00
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-01-13 11:49:23 +00:00
|
|
|
int security_inode_removexattr(struct mnt_idmap *idmap,
|
2021-01-21 13:19:29 +00:00
|
|
|
struct dentry *dentry, const char *name)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-05-02 21:57:51 +00:00
|
|
|
int rc;
|
2011-03-09 19:38:26 +00:00
|
|
|
|
2015-03-17 22:26:22 +00:00
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
2007-10-17 06:31:32 +00:00
|
|
|
return 0;
|
2024-05-02 21:57:51 +00:00
|
|
|
|
|
|
|
/* enforce the capability checks at the lsm layer, if needed */
|
|
|
|
if (!call_int_hook(inode_xattr_skipcap, name)) {
|
|
|
|
rc = cap_inode_removexattr(idmap, dentry, name);
|
|
|
|
if (rc)
|
|
|
|
return rc;
|
|
|
|
}
|
|
|
|
|
|
|
|
return call_int_hook(inode_removexattr, idmap, dentry, name);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2024-02-15 10:30:59 +00:00
|
|
|
/**
|
|
|
|
* security_inode_post_removexattr() - Update the inode after a removexattr op
|
|
|
|
* @dentry: file
|
|
|
|
* @name: xattr name
|
|
|
|
*
|
|
|
|
* Update the inode after a successful removexattr operation.
|
|
|
|
*/
|
|
|
|
void security_inode_post_removexattr(struct dentry *dentry, const char *name)
|
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
|
|
|
|
return;
|
|
|
|
call_void_hook(inode_post_removexattr, dentry, name);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_need_killpriv() - Check if security_inode_killpriv() required
|
|
|
|
* @dentry: associated dentry
|
|
|
|
*
|
|
|
|
* Called when an inode has been changed to determine if
|
|
|
|
* security_inode_killpriv() should be called.
|
|
|
|
*
|
|
|
|
* Return: Return <0 on error to abort the inode change operation, return 0 if
|
|
|
|
* security_inode_killpriv() does not need to be called, return >0 if
|
|
|
|
* security_inode_killpriv() does need to be called.
|
|
|
|
*/
|
Implement file posix capabilities
Implement file posix capabilities. This allows programs to be given a
subset of root's powers regardless of who runs them, without having to use
setuid and giving the binary all of root's powers.
This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/index.php. For more information on how to use this
patch, Chris Friedhoff has posted a nice page at
http://www.friedhoff.org/fscaps.html.
Changelog:
Nov 27:
Incorporate fixes from Andrew Morton
(security-introduce-file-caps-tweaks and
security-introduce-file-caps-warning-fix)
Fix Kconfig dependency.
Fix change signaling behavior when file caps are not compiled in.
Nov 13:
Integrate comments from Alexey: Remove CONFIG_ ifdef from
capability.h, and use %zd for printing a size_t.
Nov 13:
Fix endianness warnings by sparse as suggested by Alexey
Dobriyan.
Nov 09:
Address warnings of unused variables at cap_bprm_set_security
when file capabilities are disabled, and simultaneously clean
up the code a little, by pulling the new code into a helper
function.
Nov 08:
For pointers to required userspace tools and how to use
them, see http://www.friedhoff.org/fscaps.html.
Nov 07:
Fix the calculation of the highest bit checked in
check_cap_sanity().
Nov 07:
Allow file caps to be enabled without CONFIG_SECURITY, since
capabilities are the default.
Hook cap_task_setscheduler when !CONFIG_SECURITY.
Move capable(TASK_KILL) to end of cap_task_kill to reduce
audit messages.
Nov 05:
Add secondary calls in selinux/hooks.c to task_setioprio and
task_setscheduler so that selinux and capabilities with file
cap support can be stacked.
Sep 05:
As Seth Arnold points out, uid checks are out of place
for capability code.
Sep 01:
Define task_setscheduler, task_setioprio, cap_task_kill, and
task_setnice to make sure a user cannot affect a process in which
they called a program with some fscaps.
One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?
It is a semantic change, as without fsccaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.
task_setscheduler
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit
Aug 21:
Updated secureexec implementation to reflect the fact that
euid and uid might be the same and nonzero, but the process
might still have elevated caps.
Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.
Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.
[morgan@kernel.org: file-caps clean up for linux/capability.h]
[bunk@kernel.org: unexport cap_inode_killpriv]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 06:31:36 +00:00
|
|
|
int security_inode_need_killpriv(struct dentry *dentry)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_need_killpriv, dentry);
|
2007-10-17 06:31:36 +00:00
|
|
|
}
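The three-way return convention documented for security_inode_need_killpriv() (negative aborts, zero means nothing to do, positive means the killpriv step is required) can be sketched from a caller's point of view. This is an assumption-laden userspace illustration: `sketch_change_inode()` is a hypothetical name, and its two integer arguments simply stand in for the results the two hooks would return.

```c
#include <errno.h>

/*
 * need_rc:     stand-in for the need_killpriv result
 * killpriv_rc: stand-in for the killpriv result, used only when needed
 *
 * Returns 0 if the inode change may proceed, or a negative errno.
 */
int sketch_change_inode(int need_rc, int killpriv_rc)
{
	/* <0: abort the inode change operation */
	if (need_rc < 0)
		return need_rc;

	/* >0: privileges must be stripped; a failure here fails the change */
	if (need_rc > 0 && killpriv_rc)
		return killpriv_rc;

	/* 0: killpriv is not called at all, so its result is irrelevant */
	return 0;
}
```

Note that when the first result is 0 the second argument is ignored, which mirrors the documented rule that security_inode_killpriv() does not need to be called in that case.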
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_killpriv() - The setuid bit is removed, update LSM state
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @dentry: associated dentry
|
|
|
|
*
|
|
|
|
* The @dentry's setuid bit is being removed. Remove similar security labels.
|
|
|
|
* Called with the dentry->d_inode->i_mutex held.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success. If an error is returned, the operation
|
|
|
|
* causing the setuid bit removal fails.
|
|
|
|
*/
|
2023-01-13 11:49:23 +00:00
|
|
|
int security_inode_killpriv(struct mnt_idmap *idmap,
|
2021-01-21 13:19:29 +00:00
|
|
|
struct dentry *dentry)
|
2007-10-17 06:31:36 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_killpriv, idmap, dentry);
|
2007-10-17 06:31:36 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_getsecurity() - Get the xattr security label of an inode
|
|
|
|
* @idmap: idmap of the mount
|
|
|
|
* @inode: inode
|
|
|
|
* @name: xattr name
|
|
|
|
* @buffer: security label buffer
|
|
|
|
* @alloc: allocation flag
|
|
|
|
*
|
|
|
|
* Retrieve a copy of the extended attribute representation of the security
|
|
|
|
* label associated with @name for @inode via @buffer. Note that @name is the
|
|
|
|
* remainder of the attribute name after the security prefix has been removed.
|
|
|
|
* @alloc is used to specify if the call should return a value via the buffer
|
|
|
|
* or just the value length.
|
|
|
|
*
|
|
|
|
* Return: Returns size of buffer on success.
|
|
|
|
*/
|
2023-01-13 11:49:22 +00:00
|
|
|
int security_inode_getsecurity(struct mnt_idmap *idmap,
|
2021-01-21 13:19:29 +00:00
|
|
|
struct inode *inode, const char *name,
|
|
|
|
void **buffer, bool alloc)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(inode)))
|
2020-03-29 00:43:50 +00:00
|
|
|
return LSM_RET_DEFAULT(inode_getsecurity);
|
2024-01-30 12:56:59 +00:00
|
|
|
|
|
|
|
return call_int_hook(inode_getsecurity, idmap, inode, name, buffer,
|
|
|
|
alloc);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_setsecurity() - Set the xattr security label of an inode
|
|
|
|
* @inode: inode
|
|
|
|
* @name: xattr name
|
|
|
|
* @value: security label
|
|
|
|
* @size: length of security label
|
|
|
|
* @flags: flags
|
|
|
|
*
|
|
|
|
* Set the security label associated with @name for @inode from the extended
|
|
|
|
* attribute value @value. @size indicates the size of the @value in bytes.
|
|
|
|
* @flags may be XATTR_CREATE, XATTR_REPLACE, or 0. Note that @name is the
|
|
|
|
* remainder of the attribute name after the "security." prefix has been removed.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_inode_setsecurity(struct inode *inode, const char *name,
|
|
|
|
const void *value, size_t size, int flags)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(inode)))
|
2020-03-29 00:43:50 +00:00
|
|
|
return LSM_RET_DEFAULT(inode_setsecurity);
|
2024-01-30 12:56:59 +00:00
|
|
|
|
|
|
|
return call_int_hook(inode_setsecurity, inode, name, value, size,
|
|
|
|
flags);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_listsecurity() - List the xattr security label names
|
|
|
|
* @inode: inode
|
|
|
|
* @buffer: buffer
|
|
|
|
* @buffer_size: size of buffer
|
|
|
|
*
|
|
|
|
* Copy the extended attribute names for the security labels associated with
|
|
|
|
* @inode into @buffer. The maximum size of @buffer is specified by
|
|
|
|
* @buffer_size. @buffer may be NULL to request the size of the buffer
|
|
|
|
* required.
|
|
|
|
*
|
|
|
|
* Return: Returns number of bytes used/required on success.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_inode_listsecurity(struct inode *inode,
|
|
|
|
char *buffer, size_t buffer_size)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
|
|
|
if (unlikely(IS_PRIVATE(inode)))
|
|
|
|
return 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_listsecurity, inode, buffer, buffer_size);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2013-05-22 16:50:45 +00:00
|
|
|
EXPORT_SYMBOL(security_inode_listsecurity);
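The kernel-doc for security_inode_listsecurity() says that @buffer may be NULL to request the required size. A minimal userspace sketch of that two-call convention follows. `list_names()` and its fixed name table are hypothetical stand-ins for illustration, not kernel code, and the -ERANGE result for a short buffer is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical label names, stored as one NUL-separated list like xattr names. */
static const char demo_names[] = "selinux\0smack"; /* sizeof counts the final NUL */

/*
 * Hypothetical stand-in for a listsecurity-style call.
 * Returns the number of bytes used/required, or -ERANGE (an assumption
 * of this sketch) if the supplied buffer is too small.
 * A NULL buffer only queries the required size.
 */
static ssize_t list_names(char *buffer, size_t buffer_size)
{
	size_t needed = sizeof(demo_names);

	if (!buffer)
		return (ssize_t)needed;
	if (buffer_size < needed)
		return -ERANGE;
	memcpy(buffer, demo_names, needed);
	return (ssize_t)needed;
}
```

A caller would typically call once with a NULL buffer, allocate the returned number of bytes, and then call again with the real buffer.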
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_getsecid() - Get an inode's secid
|
|
|
|
* @inode: inode
|
|
|
|
* @secid: secid to return
|
|
|
|
*
|
|
|
|
* Get the secid associated with the inode. In case of failure, @secid will be
|
|
|
|
* set to zero.
|
|
|
|
*/
|
2015-12-24 16:09:39 +00:00
|
|
|
void security_inode_getsecid(struct inode *inode, u32 *secid)
|
2008-03-01 19:51:09 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(inode_getsecid, inode, secid);
|
2008-03-01 19:51:09 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_copy_up() - Create new creds for an overlayfs copy-up op
|
|
|
|
* @src: union dentry of copy-up file
|
|
|
|
* @new: newly created creds
|
|
|
|
*
|
|
|
|
* A file is about to be copied up from lower layer to upper layer of overlay
|
|
|
|
* filesystem. A security module can prepare a set of new creds, modify them as
|
|
|
|
* needed, and return them. The caller switches to the new creds temporarily to
|
|
|
|
* create the new file, then releases the newly allocated creds.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success or a negative error code on error.
|
|
|
|
*/
|
2016-07-13 15:13:56 +00:00
|
|
|
int security_inode_copy_up(struct dentry *src, struct cred **new)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_copy_up, src, new);
|
2016-07-13 15:13:56 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_copy_up);
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_inode_copy_up_xattr() - Filter xattrs in an overlayfs copy-up op
|
2024-02-23 17:25:05 +00:00
|
|
|
* @src: union dentry of copy-up file
|
2023-02-08 21:31:55 +00:00
|
|
|
* @name: xattr name
|
|
|
|
*
|
|
|
|
* Filter the xattrs being copied up when a unioned file is copied up from a
|
|
|
|
* lower layer to the union/overlay layer. The caller is responsible for
|
|
|
|
* reading and writing the xattrs, this hook is merely a filter.
|
|
|
|
*
|
2024-07-24 02:06:59 +00:00
|
|
|
* Return: Returns 0 to accept the xattr, -ECANCELED to discard the xattr,
|
|
|
|
* -EOPNOTSUPP if the security module does not know about attribute,
|
|
|
|
* or a negative error code to abort the copy up.
|
2023-02-08 21:31:55 +00:00
|
|
|
*/
|
2024-02-23 17:25:05 +00:00
|
|
|
int security_inode_copy_up_xattr(struct dentry *src, const char *name)
|
2016-07-13 14:44:49 +00:00
|
|
|
{
|
2020-06-21 22:21:35 +00:00
|
|
|
int rc;
|
|
|
|
|
2024-02-23 17:25:05 +00:00
|
|
|
rc = call_int_hook(inode_copy_up_xattr, src, name);
|
2024-01-30 12:56:59 +00:00
|
|
|
if (rc != LSM_RET_DEFAULT(inode_copy_up_xattr))
|
|
|
|
return rc;
|
2020-06-21 22:21:35 +00:00
|
|
|
|
evm: Move to LSM infrastructure
As for IMA, move hardcoded EVM function calls from various places in the
kernel to the LSM infrastructure, by introducing a new LSM named 'evm'
(last and always enabled like 'ima'). The order in the Makefile ensures
that 'evm' hooks are executed after 'ima' ones.
Make EVM functions as static (except for evm_inode_init_security(), which
is exported), and register them as hook implementations in init_evm_lsm().
Also move the inline functions evm_inode_remove_acl(),
evm_inode_post_remove_acl(), and evm_inode_post_set_acl() from the public
evm.h header to evm_main.c.
Unlike before (see commit to move IMA to the LSM infrastructure),
evm_inode_post_setattr(), evm_inode_post_set_acl(),
evm_inode_post_remove_acl(), and evm_inode_post_removexattr() are not
executed for private inodes.
Finally, add the LSM_ID_EVM case in lsm_list_modules_test.c
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-02-15 10:31:10 +00:00
|
|
|
return LSM_RET_DEFAULT(inode_copy_up_xattr);
|
2016-07-13 14:44:49 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_copy_up_xattr);
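The Return section of security_inode_copy_up_xattr() lists four kinds of result. The sketch below shows one way a caller could classify them. The enum and `classify_copy_up_rc()` are hypothetical names, and treating -EOPNOTSUPP as "copy the xattr anyway" is an assumption of this sketch rather than something the comment states.

```c
#include <errno.h>

/* Hypothetical decision a copy-up loop could take for one xattr. */
enum xattr_action { XATTR_COPY, XATTR_SKIP, XATTR_ABORT };

static enum xattr_action classify_copy_up_rc(int rc)
{
	/* 0: accepted. -EOPNOTSUPP: no LSM knows the attribute (assumed: copy). */
	if (rc == 0 || rc == -EOPNOTSUPP)
		return XATTR_COPY;
	/* -ECANCELED: the LSM asked for this xattr to be discarded. */
	if (rc == -ECANCELED)
		return XATTR_SKIP;
	/* Any other negative error code aborts the whole copy-up. */
	return XATTR_ABORT;
}
```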
|
|
|
|
|
2024-08-03 06:08:28 +00:00
|
|
|
/**
|
|
|
|
* security_inode_setintegrity() - Set the inode's integrity data
|
|
|
|
* @inode: inode
|
|
|
|
* @type: type of integrity, e.g. hash digest, signature, etc
|
|
|
|
* @value: the integrity value
|
|
|
|
* @size: size of the integrity value
|
|
|
|
*
|
|
|
|
* Register a verified integrity measurement of an inode with LSMs.
|
|
|
|
* LSMs should free the previously saved data if @value is NULL.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, negative values on failure.
|
|
|
|
*/
|
|
|
|
int security_inode_setintegrity(const struct inode *inode,
|
|
|
|
enum lsm_integrity_type type, const void *value,
|
|
|
|
size_t size)
|
|
|
|
{
|
|
|
|
return call_int_hook(inode_setintegrity, inode, type, value, size);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_setintegrity);
|
|
|
|
|
2023-02-10 18:20:33 +00:00
|
|
|
/**
|
|
|
|
* security_kernfs_init_security() - Init LSM context for a kernfs node
|
|
|
|
* @kn_dir: parent kernfs node
|
|
|
|
* @kn: the kernfs node to initialize
|
|
|
|
*
|
|
|
|
* Initialize the security context of a newly created kernfs node based on its
|
|
|
|
* own and its parent's attributes.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2019-02-22 14:57:16 +00:00
|
|
|
int security_kernfs_init_security(struct kernfs_node *kn_dir,
|
|
|
|
struct kernfs_node *kn)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernfs_init_security, kn_dir, kn);
|
2019-02-22 14:57:16 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_permission() - Check file permissions
|
|
|
|
* @file: file
|
|
|
|
* @mask: requested permissions
|
|
|
|
*
|
|
|
|
* Check file permissions before accessing an open file. This hook is called
|
|
|
|
* by various operations that read or write files. A security module can use
|
|
|
|
* this hook to perform additional checking on these operations, e.g. to
|
|
|
|
* revalidate permissions on use to support privilege bracketing or policy
|
|
|
|
* changes. Notice that this hook is used when the actual read/write
|
|
|
|
* operations are performed, whereas the inode_security_ops hook is called when
|
|
|
|
* a file is opened (as well as many other operations). Although this hook can
|
|
|
|
* be used to revalidate permissions for various system call operations that
|
|
|
|
* read or write files, it does not address the revalidation of permissions for
|
|
|
|
* memory-mapped files. Security modules must handle this separately if they
|
|
|
|
* need such revalidation.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_permission(struct file *file, int mask)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_permission, file, mask);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_alloc() - Allocate and init a file's LSM blob
|
|
|
|
* @file: the file
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to the file->f_security field. The
|
|
|
|
* security field is initialized to NULL when the structure is first created.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if the hook is successful and permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_alloc(struct file *file)
|
|
|
|
{
|
2018-11-12 20:02:49 +00:00
|
|
|
int rc = lsm_file_alloc(file);
|
|
|
|
|
|
|
|
if (rc)
|
|
|
|
return rc;
|
2024-01-30 12:56:59 +00:00
|
|
|
rc = call_int_hook(file_alloc_security, file);
|
2018-11-12 20:02:49 +00:00
|
|
|
if (unlikely(rc))
|
|
|
|
security_file_free(file);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
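security_file_alloc() follows an allocate, call hook, roll back on failure pattern: the blob is allocated first, and if the hook rejects the file the blob is freed again before the error is returned. A minimal userspace sketch of that shape is below. All `demo_*` names are hypothetical, and `calloc()`/`free()` stand in for the kernel's slab cache.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct file and its f_security blob. */
struct demo_file {
	void *security;
};

typedef int (*demo_hook_t)(struct demo_file *);

/* Release the blob and clear the pointer, mirroring security_file_free(). */
static void demo_file_free(struct demo_file *f)
{
	free(f->security);
	f->security = NULL;
}

/* Allocate the blob, then let the hook veto; undo the allocation on veto. */
static int demo_file_alloc(struct demo_file *f, size_t blob_size,
			   demo_hook_t hook)
{
	int rc;

	f->security = calloc(1, blob_size);
	if (!f->security)
		return -ENOMEM;

	rc = hook ? hook(f) : 0;
	if (rc)
		demo_file_free(f);
	return rc;
}

/* Two trivial hooks for exercising both paths. */
static int demo_hook_ok(struct demo_file *f)   { (void)f; return 0; }
static int demo_hook_deny(struct demo_file *f) { (void)f; return -EACCES; }
```

The point of the rollback is that a caller who sees a non-zero return never has to free anything itself.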
|
|
|
|
|
2024-02-15 10:31:01 +00:00
|
|
|
/**
|
|
|
|
* security_file_release() - Perform actions before releasing the file ref
|
|
|
|
* @file: the file
|
|
|
|
*
|
|
|
|
* Perform actions before releasing the last reference to a file.
|
|
|
|
*/
|
|
|
|
void security_file_release(struct file *file)
|
|
|
|
{
|
|
|
|
call_void_hook(file_release, file);
|
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_free() - Free a file's LSM blob
|
|
|
|
* @file: the file
|
|
|
|
*
|
|
|
|
* Deallocate and free any security structures stored in file->f_security.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_file_free(struct file *file)
|
|
|
|
{
|
2018-11-12 20:02:49 +00:00
|
|
|
void *blob;
|
|
|
|
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(file_free_security, file);
|
2018-11-12 20:02:49 +00:00
|
|
|
|
|
|
|
blob = file->f_security;
|
|
|
|
if (blob) {
|
|
|
|
file->f_security = NULL;
|
|
|
|
kmem_cache_free(lsm_file_cache, blob);
|
|
|
|
}
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_ioctl() - Check if an ioctl is allowed
|
|
|
|
* @file: associated file
|
|
|
|
* @cmd: ioctl cmd
|
|
|
|
* @arg: ioctl arguments
|
|
|
|
*
|
|
|
|
* Check permission for an ioctl operation on @file. Note that @arg sometimes
|
|
|
|
* represents a user space pointer; in other cases, it may be a simple integer
|
|
|
|
* value. When @arg represents a user space pointer, it should never be used
|
|
|
|
* by the security module.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_ioctl, file, cmd, arg);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2020-06-02 20:20:26 +00:00
|
|
|
EXPORT_SYMBOL_GPL(security_file_ioctl);
|
2007-10-17 06:31:32 +00:00
|
|
|
|
lsm: new security_file_ioctl_compat() hook
Some ioctl commands do not require ioctl permission, but are routed to
other permissions such as FILE_GETATTR or FILE_SETATTR. This routing is
done by comparing the ioctl cmd to a set of 64-bit flags (FS_IOC_*).
However, if a 32-bit process is running on a 64-bit kernel, it emits
32-bit flags (FS_IOC32_*) for certain ioctl operations. These flags are
being checked erroneously, which leads to these ioctl operations being
routed to the ioctl permission, rather than the correct file
permissions.
This was also noted in a RED-PEN finding from a while back -
"/* RED-PEN how should LSM module know it's handling 32bit? */".
This patch introduces a new hook, security_file_ioctl_compat(), that is
called from the compat ioctl syscall. All current LSMs have been changed
to support this hook.
Reviewing the three places where we are currently using
security_file_ioctl(), it appears that only SELinux needs a dedicated
compat change; TOMOYO and SMACK appear to be functional without any
change.
Cc: stable@vger.kernel.org
Fixes: 0b24dcb7f2f7 ("Revert "selinux: simplify ioctl checking"")
Signed-off-by: Alfred Piccioni <alpic@google.com>
Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: subject tweak, line length fixes, and alignment corrections]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-12-19 09:09:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_ioctl_compat() - Check if an ioctl is allowed in compat mode
|
|
|
|
* @file: associated file
|
|
|
|
* @cmd: ioctl cmd
|
|
|
|
* @arg: ioctl arguments
|
|
|
|
*
|
|
|
|
* Compat version of security_file_ioctl() that correctly handles 32-bit
|
|
|
|
* processes running on 64-bit kernels.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
|
|
|
int security_file_ioctl_compat(struct file *file, unsigned int cmd,
|
|
|
|
unsigned long arg)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_ioctl_compat, file, cmd, arg);
|
2023-12-19 09:09:09 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(security_file_ioctl_compat);
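The commit message for security_file_ioctl_compat() notes that a 32-bit process emits different command values (FS_IOC32_*) than a 64-bit one (FS_IOC_*). The sketch below illustrates why, assuming the generic ioctl number layout used on x86 and arm64, where the argument size is encoded in the command itself. The `DEMO_*` macros and `demo_is_getflags()` are hypothetical re-creations for illustration, not the kernel's definitions.

```c
#include <stdint.h>

/* Generic ioctl layout (assumed): dir in bits 30-31, size in 16-29,
 * type in 8-15, nr in 0-7. */
#define DEMO_IOC_READ 2u
#define DEMO_IOC(dir, type, nr, size) \
	(((uint32_t)(dir) << 30) | ((uint32_t)(size) << 16) | \
	 ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* Same type ('f') and number (1); only the argument size differs. */
#define DEMO_GETFLAGS_64 DEMO_IOC(DEMO_IOC_READ, 'f', 1, sizeof(uint64_t))
#define DEMO_GETFLAGS_32 DEMO_IOC(DEMO_IOC_READ, 'f', 1, sizeof(uint32_t))

/* A check that only knows the 64-bit value misses the compat command. */
static int demo_is_getflags(uint32_t cmd, int compat)
{
	return cmd == (compat ? DEMO_GETFLAGS_32 : DEMO_GETFLAGS_64);
}
```

Because the two values differ only in the size field, a comparison against the 64-bit constant alone never matches a compat caller, which is the misrouting the commit describes.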
|
|
|
|
|
2012-05-30 23:58:30 +00:00
|
|
|
static inline unsigned long mmap_prot(struct file *file, unsigned long prot)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2012-05-30 21:11:23 +00:00
|
|
|
/*
|
2012-05-30 23:58:30 +00:00
|
|
|
* Do we have PROT_READ and does the application expect
|
|
|
|
* it to imply PROT_EXEC? If not, nothing to talk about...
|
2012-05-30 21:11:23 +00:00
|
|
|
*/
|
2012-05-30 23:58:30 +00:00
|
|
|
if ((prot & (PROT_READ | PROT_EXEC)) != PROT_READ)
|
|
|
|
return prot;
|
2012-05-30 21:11:23 +00:00
|
|
|
if (!(current->personality & READ_IMPLIES_EXEC))
|
2012-05-30 23:58:30 +00:00
|
|
|
return prot;
|
|
|
|
/*
|
|
|
|
* if that's an anonymous mapping, let it.
|
|
|
|
*/
|
|
|
|
if (!file)
|
|
|
|
return prot | PROT_EXEC;
|
|
|
|
/*
|
|
|
|
* ditto if it's not on noexec mount, except that on !MMU we need
|
2015-01-14 09:42:32 +00:00
|
|
|
* NOMMU_MAP_EXEC (== VM_MAYEXEC) in this case
|
2012-05-30 23:58:30 +00:00
|
|
|
*/
|
2015-06-29 19:42:03 +00:00
|
|
|
if (!path_noexec(&file->f_path)) {
|
2012-05-30 21:11:23 +00:00
|
|
|
#ifndef CONFIG_MMU
|
2015-01-14 09:42:32 +00:00
|
|
|
if (file->f_op->mmap_capabilities) {
|
|
|
|
unsigned caps = file->f_op->mmap_capabilities(file);
|
|
|
|
if (!(caps & NOMMU_MAP_EXEC))
|
|
|
|
return prot;
|
|
|
|
}
|
2012-05-30 21:11:23 +00:00
|
|
|
#endif
|
2012-05-30 23:58:30 +00:00
|
|
|
return prot | PROT_EXEC;
|
2012-05-30 21:11:23 +00:00
|
|
|
}
|
2012-05-30 23:58:30 +00:00
|
|
|
/* anything on noexec mount won't get PROT_EXEC */
|
|
|
|
return prot;
|
|
|
|
}
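The decision made by mmap_prot() can be summarised as a small truth table. The sketch below models it in userspace under stated assumptions: the personality flag, the presence of a file, and the noexec mount state are passed in as plain booleans, and the !MMU branch is left out. `demo_mmap_prot()` and the `DEMO_PROT_*` constants are hypothetical names.

```c
#include <stdbool.h>

#define DEMO_PROT_READ 0x1ul
#define DEMO_PROT_EXEC 0x4ul

/*
 * Model of the READ_IMPLIES_EXEC upgrade (MMU case only).
 * Returns the protection that would be applied.
 */
static unsigned long demo_mmap_prot(unsigned long prot, bool read_implies_exec,
				    bool has_file, bool noexec_mount)
{
	/* Only a request with PROT_READ and without PROT_EXEC is a candidate. */
	if ((prot & (DEMO_PROT_READ | DEMO_PROT_EXEC)) != DEMO_PROT_READ)
		return prot;
	/* The task's personality must ask for the upgrade. */
	if (!read_implies_exec)
		return prot;
	/* Anonymous mappings are always upgraded. */
	if (!has_file)
		return prot | DEMO_PROT_EXEC;
	/* File mappings are upgraded unless the mount is noexec. */
	if (!noexec_mount)
		return prot | DEMO_PROT_EXEC;
	return prot;
}
```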
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_mmap_file() - Check if mmap'ing a file is allowed
|
|
|
|
* @file: file
|
|
|
|
* @prot: protection applied by the kernel
|
|
|
|
* @flags: flags
|
|
|
|
*
|
|
|
|
* Check permissions for a mmap operation. The @file may be NULL, e.g. if
|
|
|
|
* mapping anonymous memory.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2012-05-30 23:58:30 +00:00
|
|
|
int security_mmap_file(struct file *file, unsigned long prot,
|
2023-02-17 02:33:20 +00:00
|
|
|
unsigned long flags)
|
2012-05-30 23:58:30 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(mmap_file, file, prot, mmap_prot(file, prot),
|
ima: Move to LSM infrastructure
Move hardcoded IMA function calls (not appraisal-specific functions) from
various places in the kernel to the LSM infrastructure, by introducing a
new LSM named 'ima' (at the end of the LSM list and always enabled like
'integrity').
Having IMA before EVM in the Makefile is sufficient to preserve the
relative order of the new 'ima' LSM in respect to the upcoming 'evm' LSM,
and thus the order of IMA and EVM function calls as when they were
hardcoded.
Make moved functions as static (except ima_post_key_create_or_update(),
which is not in ima_main.c), and register them as implementation of the
respective hooks in the new function init_ima_lsm().
Select CONFIG_SECURITY_PATH, to ensure that the path-based LSM hook
path_post_mknod is always available and ima_post_path_mknod() is always
executed to mark files as new, as before the move.
A slight difference is that IMA and EVM functions registered for the
inode_post_setattr, inode_post_removexattr, path_post_mknod,
inode_post_create_tmpfile, inode_post_set_acl and inode_post_remove_acl
won't be executed for private inodes. Since those inodes are supposed to be
fs-internal, they should not be of interest to IMA or EVM. The S_PRIVATE
flag is used for anonymous inodes, hugetlbfs, reiserfs xattrs, XFS scrub
and kernel-internal tmpfs files.
Conditionally register ima_post_key_create_or_update() if
CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS is enabled. Also, conditionally register
ima_kernel_module_request() if CONFIG_INTEGRITY_ASYMMETRIC_KEYS is enabled.
Finally, add the LSM_ID_IMA case in lsm_list_modules_test.c.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-02-15 10:31:08 +00:00
|
|
|
flags);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_mmap_addr() - Check if mmap'ing an address is allowed
|
|
|
|
* @addr: address
|
|
|
|
*
|
|
|
|
* Check permissions for a mmap operation at @addr.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2012-05-30 17:30:51 +00:00
|
|
|
int security_mmap_addr(unsigned long addr)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(mmap_addr, addr);
|
2012-05-30 17:30:51 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_mprotect() - Check if changing memory protections is allowed
|
|
|
|
* @vma: memory region
|
|
|
|
* @reqprot: application requested protection
|
2023-03-08 17:31:03 +00:00
|
|
|
* @prot: protection applied by the kernel
|
2023-02-10 18:23:09 +00:00
|
|
|
*
|
|
|
|
* Check permissions before changing memory access permissions.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
|
2023-02-17 02:33:20 +00:00
|
|
|
unsigned long prot)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_mprotect, vma, reqprot, prot);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_lock() - Check if a file lock is allowed
|
|
|
|
* @file: file
|
|
|
|
* @cmd: lock operation (e.g. F_RDLCK, F_WRLCK)
|
|
|
|
*
|
|
|
|
* Check permission before performing file locking operations. Note the hook
|
|
|
|
* mediates both flock and fcntl style locks.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_lock(struct file *file, unsigned int cmd)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_lock, file, cmd);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_fcntl() - Check if fcntl() op is allowed
|
|
|
|
* @file: file
|
2023-07-02 17:08:57 +00:00
|
|
|
* @cmd: fcntl command
|
2023-02-10 18:23:09 +00:00
|
|
|
* @arg: command argument
|
|
|
|
*
|
|
|
|
* Check permission before allowing the file operation specified by @cmd from
|
|
|
|
* being performed on the file @file. Note that @arg sometimes represents a
|
|
|
|
* user space pointer; in other cases, it may be a simple integer value. When
|
|
|
|
* @arg represents a user space pointer, it should never be used by the
|
|
|
|
* security module.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_fcntl, file, cmd, arg);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_set_fowner() - Set the file owner info in the LSM blob
|
|
|
|
* @file: the file
|
|
|
|
*
|
|
|
|
* Save owner security information (typically from current->security) in
|
|
|
|
* file->f_security for later use by the send_sigiotask hook.
|
|
|
|
*
|
|
|
|
* Return: None; this function returns void.
|
|
|
|
*/
|
2014-08-22 15:27:32 +00:00
|
|
|
void security_file_set_fowner(struct file *file)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(file_set_fowner, file);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_send_sigiotask() - Check if sending SIGIO/SIGURG is allowed
|
|
|
|
* @tsk: target task
|
|
|
|
* @fown: signal sender
|
|
|
|
* @sig: signal to be sent, SIGIO is sent if 0
|
|
|
|
*
|
|
|
|
* Check permission for the file owner @fown to send SIGIO or SIGURG to the
|
|
|
|
* process @tsk. Note that this hook is sometimes called from interrupt. Note
|
|
|
|
* that the fown_struct, @fown, is never outside the context of a struct file,
|
|
|
|
* so the file structure (and associated security information) can always be
|
|
|
|
* obtained: container_of(fown, struct file, f_owner).
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_send_sigiotask(struct task_struct *tsk,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct fown_struct *fown, int sig)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_send_sigiotask, tsk, fown, sig);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
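The comment on security_file_send_sigiotask() relies on container_of() to get from the fown_struct back to its enclosing file. A minimal userspace sketch of that pointer arithmetic follows. It assumes, as the comment describes, that the owner structure is embedded directly in the file structure. The `demo_*` types and the macro are hypothetical re-creations for illustration.

```c
#include <stddef.h>

/* Userspace re-creation of the container_of() idea. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for fown_struct and struct file. */
struct demo_fown {
	int signum;
};

struct demo_owned_file {
	void *security;
	struct demo_fown owner; /* embedded, not a pointer */
};

/* Recover the enclosing file from a pointer to its embedded owner. */
static struct demo_owned_file *demo_file_from_fown(struct demo_fown *fown)
{
	return demo_container_of(fown, struct demo_owned_file, owner);
}
```

This only works while the member is embedded; if the owner were reached through a pointer instead, the subtraction would not land on the enclosing structure.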
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
2024-02-17 13:35:04 +00:00
|
|
|
* security_file_receive() - Check if receiving a file via IPC is allowed
|
2023-02-10 18:23:09 +00:00
|
|
|
* @file: file being received
|
|
|
|
*
|
|
|
|
* This hook allows security modules to control the ability of a process to
|
|
|
|
* receive an open file descriptor via socket IPC.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_file_receive(struct file *file)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(file_receive, file);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-10 18:23:09 +00:00
|
|
|
/**
|
|
|
|
* security_file_open() - Save open() time state for later use by the LSM
|
|
|
|
* @file: file being opened
|
|
|
|
*
|
|
|
|
* Save open-time permission checking state for later use upon file_permission,
|
|
|
|
* and recheck access if anything has changed since inode_permission.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-07-10 17:25:29 +00:00
|
|
|
int security_file_open(struct file *file)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2009-12-18 02:24:34 +00:00
|
|
|
int ret;
|
|
|
|
|
2024-01-30 12:56:59 +00:00
|
|
|
ret = call_int_hook(file_open, file);
|
2009-12-18 02:24:34 +00:00
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
2023-12-12 09:44:38 +00:00
|
|
|
return fsnotify_open_perm(file);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2024-02-15 10:31:00 +00:00
|
|
|
/**
 * security_file_post_open() - Evaluate a file after it has been opened
 * @file: the file
 * @mask: access mask
 *
 * Evaluate an opened file and the access mask requested with open(). The hook
 * is useful for LSMs that require the file content to be available in order to
 * make decisions.
 *
 * Return: Returns 0 if permission is granted.
 */
int security_file_post_open(struct file *file, int mask)
{
	return call_int_hook(file_post_open, file, mask);
}
EXPORT_SYMBOL_GPL(security_file_post_open);

/**
 * security_file_truncate() - Check if truncating a file is allowed
 * @file: file
 *
 * Check permission before truncating a file, i.e. using ftruncate. Note that
 * truncation permission may also be checked based on the path, using the
 * @path_truncate hook.
 *
 * Return: Returns 0 if permission is granted.
 */
int security_file_truncate(struct file *file)
{
	return call_int_hook(file_truncate, file);
}

/**
 * security_task_alloc() - Allocate a task's LSM blob
 * @task: the task
 * @clone_flags: flags indicating what is being shared
 *
 * Handle allocation of task-related resources.
 *
 * Return: Returns 0 on success, negative values on failure.
 */
int security_task_alloc(struct task_struct *task, unsigned long clone_flags)
{
	int rc = lsm_task_alloc(task);

	if (rc)
		return rc;
	rc = call_int_hook(task_alloc, task, clone_flags);
	if (unlikely(rc))
		security_task_free(task);
	return rc;
}

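The allocate, call hooks, roll back on failure shape of security_task_alloc() can be sketched in plain userspace C. This is only an illustration under stated assumptions: every `demo_` name, the fixed 64-byte blob size, and the hook array are hypothetical stand-ins, not kernel APIs, and the loop that stops at the first non-zero hook result is a simplification of what call_int_hook() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct task_struct: only the LSM blob pointer. */
struct demo_task {
	void *security;
};

/* Hypothetical hook type: 0 allows, a negative errno-style value denies. */
typedef int (*demo_task_alloc_hook)(struct demo_task *task,
				    unsigned long clone_flags);

int demo_hook_allow(struct demo_task *task, unsigned long clone_flags)
{
	(void)task;
	(void)clone_flags;
	return 0;
}

int demo_hook_deny(struct demo_task *task, unsigned long clone_flags)
{
	(void)task;
	(void)clone_flags;
	return -13;	/* numeric value of -EACCES on Linux */
}

/* Allocate a zeroed blob; plays the role lsm_task_alloc() has in the source. */
int demo_blob_alloc(struct demo_task *task, size_t size)
{
	task->security = calloc(1, size);
	return task->security ? 0 : -12;	/* numeric value of -ENOMEM */
}

/* Release the blob and clear the pointer so a repeated call is harmless. */
void demo_task_free(struct demo_task *task)
{
	free(task->security);
	task->security = NULL;
}

/*
 * Allocate the blob, then run each hook in order, stopping at the first
 * non-zero result. If a hook fails, the blob is released before returning,
 * so the caller never sees a half-initialised task.
 */
int demo_task_alloc(struct demo_task *task, unsigned long clone_flags,
		    const demo_task_alloc_hook *hooks, size_t nhooks)
{
	size_t i;
	int rc;

	assert(task != NULL);
	rc = demo_blob_alloc(task, 64);	/* 64 is an arbitrary demo size */
	if (rc)
		return rc;
	for (i = 0; i < nhooks; i++) {
		rc = hooks[i](task, clone_flags);
		if (rc)
			break;
	}
	if (rc)
		demo_task_free(task);
	return rc;
}
```

Clearing the pointer in the free path is what makes the rollback safe to combine with a later, unconditional free of the same object.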
/**
 * security_task_free() - Free a task's LSM blob and related resources
 * @task: task
 *
 * Handle release of task-related resources.  Note that this can be called from
 * interrupt context.
 */
void security_task_free(struct task_struct *task)
{
	call_void_hook(task_free, task);

	kfree(task->security);
	task->security = NULL;
}

/**
 * security_cred_alloc_blank() - Allocate the min memory to allow cred_transfer
 * @cred: credentials
 * @gfp: gfp flags
 *
 * Only allocate sufficient memory and attach to @cred such that
 * cred_transfer() will not get ENOMEM.
 *
 * Return: Returns 0 on success, negative values on failure.
 */
KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
Add a keyctl to install a process's session keyring onto its parent. This
replaces the parent's session keyring. Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again. Normally this
will be after a wait*() syscall.
To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.
The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.
Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the
replacement to be performed at the point the parent process resumes userspace
execution.
This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership. However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.
This can be tested with the following program:
	#include <stdio.h>
	#include <stdlib.h>
	#include <keyutils.h>

	#define KEYCTL_SESSION_TO_PARENT	18

	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

	int main(int argc, char **argv)
	{
		key_serial_t keyring, key;
		long ret;

		keyring = keyctl_join_session_keyring(argv[1]);
		OSERROR(keyring, "keyctl_join_session_keyring");

		key = add_key("user", "a", "b", 1, keyring);
		OSERROR(key, "add_key");

		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

		return 0;
	}
Compiled and linked with -lkeyutils, you should see something like:
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
355907932 --alswrv 4043 -1 \_ keyring: _uid.4043
[dhowells@andromeda ~]$ /tmp/newpag
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
1055658746 --alswrv 4043 4043 \_ user: a
[dhowells@andromeda ~]$ /tmp/newpag hello
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: hello
340417692 --alswrv 4043 4043 \_ user: a
Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 08:14:21 +00:00
int security_cred_alloc_blank(struct cred *cred, gfp_t gfp)
{
	int rc = lsm_cred_alloc(cred, gfp);

	if (rc)
		return rc;

	rc = call_int_hook(cred_alloc_blank, cred, gfp);
	if (unlikely(rc))
		security_cred_free(cred);
	return rc;
}

/**
 * security_cred_free() - Free the cred's LSM blob and associated resources
 * @cred: credentials
 *
 * Deallocate and clear the cred->security field in a set of credentials.
 */
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
	struct cred *new = prepare_creds();
	int ret = blah(new);
	if (ret < 0) {
		abort_creds(new);
		return ret;
	}
	return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may be incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather than pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-13 23:39:23 +00:00
void security_cred_free(struct cred *cred)
{
	/*
	 * There is a failure case in prepare_creds() that
	 * may result in a call here with ->security being NULL.
	 */
	if (unlikely(cred->security == NULL))
		return;

	call_void_hook(cred_free, cred);

	kfree(cred->security);
	cred->security = NULL;
}

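The prepare, modify, then commit-or-abort sequence described in the "CRED: Inaugurate COW credentials" commit message can be illustrated with a small userspace sketch. It is not the kernel implementation: all `demo_` names and the two-field credential struct are hypothetical, and RCU, reference counting and the LSM hooks are deliberately left out. It only shows that the published credentials are reached through a const pointer and are replaced, never edited in place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced credential set. */
struct demo_cred {
	unsigned int uid;
	unsigned int gid;
};

/* Hypothetical task: published credentials are only reachable as const. */
struct demo_task {
	const struct demo_cred *cred;
};

/* Copy the task's current credentials into a private, writable duplicate. */
struct demo_cred *demo_prepare_creds(const struct demo_task *task)
{
	struct demo_cred *new;

	assert(task->cred != NULL);
	new = malloc(sizeof(*new));
	if (new)
		memcpy(new, task->cred, sizeof(*new));
	return new;
}

/* Discard a prepared set that was never published. */
void demo_abort_creds(struct demo_cred *new)
{
	free(new);
}

/* Publish the new set and hand back the old one, which is left unmodified. */
const struct demo_cred *demo_commit_creds(struct demo_task *task,
					   struct demo_cred *new)
{
	const struct demo_cred *old = task->cred;

	task->cred = new;
	return old;
}

/*
 * Prepare, modify, then commit or abort. Rejecting uid 0 is an arbitrary
 * rule chosen only to exercise the abort path.
 */
int demo_set_uid(struct demo_task *task, unsigned int uid)
{
	struct demo_cred *new = demo_prepare_creds(task);

	if (!new)
		return -12;	/* numeric value of -ENOMEM */
	if (uid == 0) {
		demo_abort_creds(new);
		return -1;	/* numeric value of -EPERM */
	}
	new->uid = uid;
	/* The sketch has no refcounts; the previous set stays with its owner. */
	demo_commit_creds(task, new);
	return 0;
}
```

Because a failed change only ever touches the private copy, the abort path leaves the task's visible credentials exactly as they were.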
/**
 * security_prepare_creds() - Prepare a new set of credentials
 * @new: new credentials
 * @old: original credentials
 * @gfp: gfp flags
 *
 * Prepare a new set of credentials by copying the data from the old set.
 *
 * Return: Returns 0 on success, negative values on failure.
 */
int security_prepare_creds(struct cred *new, const struct cred *old, gfp_t gfp)
{
	int rc = lsm_cred_alloc(new, gfp);

	if (rc)
		return rc;

	rc = call_int_hook(cred_prepare, new, old, gfp);
	if (unlikely(rc))
		security_cred_free(new);
	return rc;
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather than pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
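The prepare, modify, commit (or abort) sequence described in the message above can be sketched in userspace C. This is only an illustrative model under stated assumptions: struct demo_cred, demo_current_cred and every demo_* function are hypothetical stand-ins, not the kernel's struct cred or its prepare_creds(), commit_creds() and abort_creds(), and the RCU and locking aspects are left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct cred; not the kernel type. */
struct demo_cred {
	int usage;		/* reference count */
	unsigned int uid;
};

/* The "task" holds a const pointer, discouraging in-place edits. */
const struct demo_cred *demo_current_cred;

/* Install an initial credential set for the demo task. */
int demo_init_creds(unsigned int uid)
{
	struct demo_cred *c = malloc(sizeof(*c));

	if (!c)
		return -1;
	c->usage = 1;
	c->uid = uid;
	demo_current_cred = c;
	return 0;
}

/* Copy the active creds into a private, modifiable duplicate. */
struct demo_cred *demo_prepare_creds(void)
{
	struct demo_cred *new;

	assert(demo_current_cred != NULL);
	new = malloc(sizeof(*new));
	if (!new)
		return NULL;
	*new = *demo_current_cred;
	new->usage = 1;
	return new;
}

/* Discard a prepared set that was never published. */
void demo_abort_creds(struct demo_cred *new)
{
	free(new);
}

/* Publish the new set and drop the reference held on the old one. */
int demo_commit_creds(struct demo_cred *new)
{
	struct demo_cred *old = (struct demo_cred *)demo_current_cred;

	demo_current_cred = new;
	if (--old->usage == 0)
		free(old);
	return 0;
}

/* Change the uid via prepare/modify/commit; this demo policy refuses uid 0. */
int demo_setuid(unsigned int uid)
{
	struct demo_cred *new = demo_prepare_creds();

	if (!new)
		return -1;
	if (uid == 0) {
		demo_abort_creds(new);	/* active creds stay untouched */
		return -1;
	}
	new->uid = uid;
	return demo_commit_creds(new);
}
```

After demo_init_creds(1000), a call to demo_setuid(4043) replaces the published set, while a refused demo_setuid(0) leaves demo_current_cred exactly as it was, because only the private copy was ever modified.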
2008-11-13 23:39:23 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_transfer_creds() - Transfer creds
|
|
|
|
* @new: target credentials
|
|
|
|
* @old: original credentials
|
|
|
|
*
|
|
|
|
* Transfer data from original creds to new creds.
|
|
|
|
*/
|
KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
Add a keyctl to install a process's session keyring onto its parent. This
replaces the parent's session keyring. Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again. Normally this
will be after a wait*() syscall.
To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.
The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.
Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the
replacement to be performed at the point the parent process resumes userspace
execution.
This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership. However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.
This can be tested with the following program:
#include <stdio.h>
#include <stdlib.h>
#include <keyutils.h>

#define KEYCTL_SESSION_TO_PARENT	18

#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

int main(int argc, char **argv)
{
	key_serial_t keyring, key;
	long ret;

	keyring = keyctl_join_session_keyring(argv[1]);
	OSERROR(keyring, "keyctl_join_session_keyring");

	key = add_key("user", "a", "b", 1, keyring);
	OSERROR(key, "add_key");

	ret = keyctl(KEYCTL_SESSION_TO_PARENT);
	OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

	return 0;
}
Compiled and linked with -lkeyutils, you should see something like:
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
355907932 --alswrv 4043 -1 \_ keyring: _uid.4043
[dhowells@andromeda ~]$ /tmp/newpag
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
1055658746 --alswrv 4043 4043 \_ user: a
[dhowells@andromeda ~]$ /tmp/newpag hello
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: hello
340417692 --alswrv 4043 4043 \_ user: a
Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 08:14:21 +00:00
|
|
|
void security_transfer_creds(struct cred *new, const struct cred *old)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(cred_transfer, new, old);
|
2009-09-02 08:14:21 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_cred_getsecid() - Get the secid from a set of credentials
|
|
|
|
* @c: credentials
|
|
|
|
* @secid: secid value
|
|
|
|
*
|
|
|
|
* Retrieve the security identifier of the cred structure @c. In case of
|
|
|
|
* failure, @secid will be set to zero.
|
|
|
|
*/
|
2018-01-08 21:36:19 +00:00
|
|
|
void security_cred_getsecid(const struct cred *c, u32 *secid)
|
|
|
|
{
|
|
|
|
*secid = 0;
|
|
|
|
call_void_hook(cred_getsecid, c, secid);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_cred_getsecid);
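The "zero first, then call the void hook" shape of security_cred_getsecid() can be modelled in userspace to show why callers see a secid of zero when no module supplies one. This is a minimal sketch under stated assumptions: the hook table and every demo_* name are hypothetical, and the real call_void_hook() machinery is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: a module may fill in *secid or leave it alone. */
typedef void (*demo_getsecid_hook)(const void *cred, uint32_t *secid);

#define DEMO_MAX_HOOKS 4

demo_getsecid_hook demo_hooks[DEMO_MAX_HOOKS];
size_t demo_nr_hooks;

/* Append a module's hook to the table; returns -1 when the table is full. */
int demo_register_hook(demo_getsecid_hook fn)
{
	if (demo_nr_hooks == DEMO_MAX_HOOKS)
		return -1;
	demo_hooks[demo_nr_hooks++] = fn;
	return 0;
}

/* Same shape as the wrapper above: default to zero, then run every hook. */
void demo_cred_getsecid(const void *cred, uint32_t *secid)
{
	size_t i;

	assert(secid != NULL);
	*secid = 0;
	for (i = 0; i < demo_nr_hooks; i++)
		demo_hooks[i](cred, secid);
}

/* Example module that always reports secid 42. */
void demo_module_getsecid(const void *cred, uint32_t *secid)
{
	(void)cred;
	*secid = 42;
}
```

With an empty table, demo_cred_getsecid() overwrites whatever the caller had in the output variable with zero; once demo_module_getsecid is registered, the same call yields 42.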
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_kernel_act_as() - Set the kernel credentials to act as secid
|
|
|
|
* @new: credentials
|
|
|
|
* @secid: secid
|
|
|
|
*
|
|
|
|
* Set the credentials for a kernel service to act as (subjective context).
|
|
|
|
* The current task must be the one that nominated @secid.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if successful.
|
|
|
|
*/
|
2008-11-13 23:39:28 +00:00
|
|
|
int security_kernel_act_as(struct cred *new, u32 secid)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernel_act_as, new, secid);
|
2008-11-13 23:39:28 +00:00
|
|
|
}
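Wrappers such as security_kernel_act_as() simply return whatever call_int_hook() yields. The macro itself is not shown in this section, so the following is only a simplified userspace model of the assumed behaviour: registered hooks run in order, and the first result that differs from the default return value stops the walk and is returned. All demo_* names are hypothetical, and the default of 0 is an assumption for this sketch.

```c
#include <assert.h>	/* used by the checks in the usage note */
#include <errno.h>
#include <stddef.h>

#define DEMO_RET_DEFAULT	0	/* assumed default: "no objection" */
#define DEMO_MAX_INT_HOOKS	4

typedef int (*demo_int_hook)(int arg);

demo_int_hook demo_int_hooks[DEMO_MAX_INT_HOOKS];
size_t demo_nr_int_hooks;
int demo_audit_calls;	/* counts how often demo_audit() ran */

/* Append a hook; returns -ENOSPC when the table is full. */
int demo_register_int_hook(demo_int_hook fn)
{
	if (demo_nr_int_hooks == DEMO_MAX_INT_HOOKS)
		return -ENOSPC;
	demo_int_hooks[demo_nr_int_hooks++] = fn;
	return 0;
}

/* Run hooks in registration order; stop at the first non-default result. */
int demo_call_int_hook(int arg)
{
	int rc = DEMO_RET_DEFAULT;
	size_t i;

	for (i = 0; i < demo_nr_int_hooks; i++) {
		rc = demo_int_hooks[i](arg);
		if (rc != DEMO_RET_DEFAULT)
			break;
	}
	return rc;
}

/* Example module: refuses negative arguments. */
int demo_deny_negative(int arg)
{
	return arg < 0 ? -EPERM : 0;
}

/* Example module: never objects, only counts its invocations. */
int demo_audit(int arg)
{
	(void)arg;
	demo_audit_calls++;
	return 0;
}
```

Registering demo_deny_negative before demo_audit shows the short-circuit: a call with 5 reaches both modules and returns 0, whereas a call with -1 returns -EPERM without demo_audit ever running.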
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_kernel_create_files_as() - Set file creation context using an inode
|
|
|
|
* @new: target credentials
|
|
|
|
* @inode: reference inode
|
|
|
|
*
|
|
|
|
* Set the file creation context in a set of credentials to be the same as the
|
|
|
|
* objective context of the specified inode. The current task must be the one
|
|
|
|
* that nominated @inode.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if successful.
|
|
|
|
*/
|
2008-11-13 23:39:28 +00:00
|
|
|
int security_kernel_create_files_as(struct cred *new, struct inode *inode)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernel_create_files_as, new, inode);
|
2008-11-13 23:39:28 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
2024-02-17 13:35:04 +00:00
|
|
|
* security_kernel_module_request() - Check if loading a module is allowed
|
2023-02-12 00:27:58 +00:00
|
|
|
* @kmod_name: module name
|
|
|
|
*
|
|
|
|
* Check permission to have the kernel automatically upcall to userspace for
|
|
|
|
* userspace to load a kernel module with the given name.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if successful.
|
|
|
|
*/
|
2009-11-03 05:35:32 +00:00
|
|
|
int security_kernel_module_request(char *kmod_name)
|
2009-08-13 13:44:57 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernel_module_request, kmod_name);
|
2009-08-13 13:44:57 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_kernel_read_file() - Read a file specified by userspace
|
|
|
|
* @file: file
|
|
|
|
* @id: file identifier
|
|
|
|
* @contents: true if security_kernel_post_read_file() will be called
|
|
|
|
*
|
|
|
|
* Read a file specified by userspace.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2020-10-02 17:38:23 +00:00
|
|
|
int security_kernel_read_file(struct file *file, enum kernel_read_file_id id,
|
|
|
|
bool contents)
|
2016-01-31 03:23:26 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernel_read_file, file, id, contents);
|
2016-01-31 03:23:26 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(security_kernel_read_file);
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_kernel_post_read_file() - Read a file specified by userspace
|
|
|
|
* @file: file
|
|
|
|
* @buf: file contents
|
|
|
|
* @size: size of file contents
|
|
|
|
* @id: file identifier
|
|
|
|
*
|
|
|
|
* Read a file specified by userspace. This must be paired with a prior call
|
|
|
|
* to security_kernel_read_file() that indicated this hook would also be
|
|
|
|
* called, see security_kernel_read_file() for more information.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2016-01-24 15:07:32 +00:00
|
|
|
int security_kernel_post_read_file(struct file *file, char *buf, loff_t size,
|
|
|
|
enum kernel_read_file_id id)
|
2015-12-28 21:02:29 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernel_post_read_file, file, buf, size, id);
|
2015-12-28 21:02:29 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(security_kernel_post_read_file);
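The pairing rule in the two comments above (a post-read check is only made after a pre-read check that announced it through @contents) can be illustrated with a small userspace model. This is a sketch under stated assumptions: the demo_* hooks, the counters and demo_read() are hypothetical, and the way the real caller drives the two hooks is assumed rather than taken from this file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping so the pairing can be observed. */
int demo_pre_calls;
int demo_post_calls;
bool demo_post_expected;

/* Pre-read check: @contents says whether a post-read call will follow. */
int demo_pre_read_hook(int id, bool contents)
{
	(void)id;
	demo_pre_calls++;
	demo_post_expected = contents;
	return 0;
}

/*
 * Post-read check: only valid after a pre-read call with contents == true.
 * This demo policy rejects empty buffers.
 */
int demo_post_read_hook(const char *buf, size_t size, int id)
{
	(void)buf;
	(void)id;
	assert(demo_post_expected);
	demo_post_calls++;
	demo_post_expected = false;
	return size == 0 ? -1 : 0;
}

/*
 * Copy @src into @dst, bracketing the copy with the two checks.
 * When @whole is false (a partial read), the post hook is skipped.
 */
int demo_read(char *dst, const char *src, size_t len, int id, bool whole)
{
	int ret = demo_pre_read_hook(id, whole);

	if (ret)
		return ret;
	memcpy(dst, src, len);
	if (whole)
		ret = demo_post_read_hook(dst, len, id);
	return ret;
}
```

A whole read triggers both checks, a partial read only the first, and a module that can only judge complete contents would base its decision on the @contents value it saw in the pre-read check.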
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_kernel_load_data() - Load data provided by userspace
|
|
|
|
* @id: data identifier
|
|
|
|
* @contents: true if security_kernel_post_load_data() will be called
|
|
|
|
*
|
|
|
|
* Load data provided by userspace.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2020-10-02 17:38:20 +00:00
|
|
|
int security_kernel_load_data(enum kernel_load_data_id id, bool contents)
|
2018-07-13 18:05:56 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernel_load_data, id, contents);
|
2018-07-13 18:05:56 +00:00
|
|
|
}
|
2018-07-17 20:23:37 +00:00
|
|
|
EXPORT_SYMBOL_GPL(security_kernel_load_data);
|
2018-07-13 18:05:56 +00:00
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_kernel_post_load_data() - Load userspace data from a non-file source
|
|
|
|
* @buf: data
|
|
|
|
* @size: size of data
|
|
|
|
* @id: data identifier
|
|
|
|
* @description: text description of data, specific to the id value
|
|
|
|
*
|
|
|
|
* Load data provided by a non-file source (usually userspace buffer). This
|
|
|
|
* must be paired with a prior security_kernel_load_data() call that indicated
|
|
|
|
* this hook would also be called, see security_kernel_load_data() for more
|
|
|
|
* information.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2020-10-02 17:38:20 +00:00
|
|
|
int security_kernel_post_load_data(char *buf, loff_t size,
|
|
|
|
enum kernel_load_data_id id,
|
|
|
|
char *description)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(kernel_post_load_data, buf, size, id, description);
|
2020-10-02 17:38:20 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(security_kernel_post_load_data);
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_fix_setuid() - Update LSM with new user id attributes
|
|
|
|
* @new: updated credentials
|
|
|
|
* @old: credentials being replaced
|
|
|
|
* @flags: LSM_SETID_* flag values
|
|
|
|
*
|
|
|
|
* Update the module's state after setting one or more of the user identity
|
|
|
|
* attributes of the current process. The @flags parameter indicates which of
|
|
|
|
* the set*uid system calls invoked this hook. @new is the set of
|
|
|
|
* credentials that will be installed. Modifications should be made to this
|
|
|
|
* rather than to @current->cred.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success.
|
|
|
|
*/
|
2008-11-13 23:39:23 +00:00
|
|
|
int security_task_fix_setuid(struct cred *new, const struct cred *old,
|
|
|
|
int flags)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_fix_setuid, new, old, flags);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_fix_setgid() - Update LSM with new group id attributes
|
|
|
|
* @new: updated credentials
|
|
|
|
* @old: credentials being replaced
|
|
|
|
* @flags: LSM_SETID_* flag value
|
|
|
|
*
|
|
|
|
* Update the module's state after setting one or more of the group identity
|
|
|
|
* attributes of the current process. The @flags parameter indicates which of
|
|
|
|
* the set*gid system calls invoked this hook. @new is the set of credentials
|
|
|
|
* that will be installed. Modifications should be made to this rather than to
|
|
|
|
* @current->cred.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success.
|
|
|
|
*/
|
2020-06-09 17:22:13 +00:00
|
|
|
int security_task_fix_setgid(struct cred *new, const struct cred *old,
|
2023-02-17 02:33:20 +00:00
|
|
|
int flags)
|
2020-06-09 17:22:13 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_fix_setgid, new, old, flags);
|
2020-06-09 17:22:13 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_fix_setgroups() - Update LSM with new supplementary groups
|
|
|
|
* @new: updated credentials
|
|
|
|
* @old: credentials being replaced
|
|
|
|
*
|
|
|
|
* Update the module's state after setting the supplementary group identity
|
|
|
|
* attributes of the current process. @new is the set of credentials that will
|
|
|
|
* be installed. Modifications should be made to this rather than to
|
|
|
|
* @current->cred.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success.
|
|
|
|
*/
|
2022-06-08 20:57:11 +00:00
|
|
|
int security_task_fix_setgroups(struct cred *new, const struct cred *old)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_fix_setgroups, new, old);
|
2022-06-08 20:57:11 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_setpgid() - Check if setting the pgid is allowed
|
|
|
|
* @p: task being modified
|
|
|
|
* @pgid: new pgid
|
|
|
|
*
|
|
|
|
* Check permission before setting the process group identifier of the process
|
|
|
|
* @p to @pgid.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_setpgid(struct task_struct *p, pid_t pgid)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_setpgid, p, pgid);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_getpgid() - Check if getting the pgid is allowed
|
|
|
|
* @p: task
|
|
|
|
*
|
|
|
|
* Check permission before getting the process group identifier of the process
|
|
|
|
* @p.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_getpgid(struct task_struct *p)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_getpgid, p);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_getsid() - Check if getting the session id is allowed
|
|
|
|
* @p: task
|
|
|
|
*
|
|
|
|
* Check permission before getting the session identifier of the process @p.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_getsid(struct task_struct *p)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_getsid, p);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_current_getsecid_subj() - Get the current task's subjective secid
|
|
|
|
* @secid: secid value
|
|
|
|
*
|
|
|
|
* Retrieve the subjective security identifier of the current task and return
|
|
|
|
* it in @secid. In case of failure, @secid will be set to zero.
|
|
|
|
*/
|
2021-09-29 15:01:21 +00:00
|
|
|
void security_current_getsecid_subj(u32 *secid)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:42 +00:00
|
|
|
*secid = 0;
|
2021-09-29 15:01:21 +00:00
|
|
|
call_void_hook(current_getsecid_subj, secid);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2021-09-29 15:01:21 +00:00
|
|
|
EXPORT_SYMBOL(security_current_getsecid_subj);
|
2021-02-19 19:26:21 +00:00
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_getsecid_obj() - Get a task's objective secid
|
|
|
|
* @p: target task
|
|
|
|
* @secid: secid value
|
|
|
|
*
|
|
|
|
* Retrieve the objective security identifier of the task_struct in @p and
|
|
|
|
* return it in @secid. In case of failure, @secid will be set to zero.
|
|
|
|
*/
|
2021-02-19 19:26:21 +00:00
|
|
|
void security_task_getsecid_obj(struct task_struct *p, u32 *secid)
|
|
|
|
{
|
|
|
|
*secid = 0;
|
|
|
|
call_void_hook(task_getsecid_obj, p, secid);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_task_getsecid_obj);
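The two getsecid wrappers above share one shape: clear the output first, then let any registered hook overwrite it, so callers see zero when no hook runs. A minimal userspace sketch of that shape follows. The `sketch_*` names, the function-pointer array, and the value 42 are hypothetical stand-ins for illustration, not the kernel's hook machinery.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a registered LSM callback. */
typedef void (*sketch_getsecid_hook_t)(uint32_t *secid);

/* Clear the output, then let each hook (if any) write its identifier. */
static void sketch_getsecid(const sketch_getsecid_hook_t *hooks, size_t n,
			    uint32_t *secid)
{
	*secid = 0;	/* value the caller sees when no hook is registered */
	for (size_t i = 0; i < n; i++)
		hooks[i](secid);
}

/* Example hook that reports a fixed, made-up identifier. */
static void sketch_hook_fixed(uint32_t *secid)
{
	*secid = 42;
}
```

With an empty hook array the caller's variable is reset to zero regardless of its previous contents, which is the behaviour the kernel-doc comments describe for the failure case.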
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_setnice() - Check if setting a task's nice value is allowed
|
|
|
|
* @p: target task
|
|
|
|
* @nice: nice value
|
|
|
|
*
|
|
|
|
* Check permission before setting the nice value of @p to @nice.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_setnice(struct task_struct *p, int nice)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_setnice, p, nice);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_setioprio() - Check if setting a task's ioprio is allowed
|
|
|
|
* @p: target task
|
|
|
|
* @ioprio: ioprio value
|
|
|
|
*
|
|
|
|
* Check permission before setting the ioprio value of @p to @ioprio.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_setioprio(struct task_struct *p, int ioprio)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_setioprio, p, ioprio);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_getioprio() - Check if getting a task's ioprio is allowed
|
|
|
|
* @p: task
|
|
|
|
*
|
|
|
|
* Check permission before getting the ioprio value of @p.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_getioprio(struct task_struct *p)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_getioprio, p);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_prlimit() - Check if getting/setting resource limits is allowed
|
|
|
|
* @cred: current task credentials
|
|
|
|
* @tcred: target task credentials
|
|
|
|
* @flags: LSM_PRLIMIT_* flag bits indicating a get/set/both
|
|
|
|
*
|
|
|
|
* Check permission before getting and/or setting the resource limits of
|
|
|
|
* another task.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
prlimit,security,selinux: add a security hook for prlimit
When SELinux was first added to the kernel, a process could only get
and set its own resource limits via getrlimit(2) and setrlimit(2), so no
MAC checks were required for those operations, and thus no security hooks
were defined for them. Later, SELinux introduced a hook for setrlimit(2)
with a check if the hard limit was being changed in order to be able to
rely on the hard limit value as a safe reset point upon context
transitions.
Later on, when prlimit(2) was added to the kernel with the ability to get
or set resource limits (hard or soft) of another process, LSM/SELinux was
not updated other than to pass the target process to the setrlimit hook.
This resulted in incomplete control over both getting and setting the
resource limits of another process.
Add a new security_task_prlimit() hook to the check_prlimit_permission()
function to provide complete mediation. The hook is only called when
acting on another task, and only if the existing DAC/capability checks
would allow access. Pass flags down to the hook to indicate whether the
prlimit(2) call will read, write, or both read and write the resource
limits of the target process.
The existing security_task_setrlimit() hook is left alone; it continues
to serve a purpose in supporting the ability to make decisions based on
the old and/or new resource limit values when setting limits. This
is consistent with the DAC/capability logic, where
check_prlimit_permission() performs generic DAC/capability checks for
acting on another task, while do_prlimit() performs a capability check
based on a comparison of the old and new resource limits. Fix the
inline documentation for the hook to match the code.
Implement the new hook for SELinux. For setting resource limits, we
reuse the existing setrlimit permission. Note that this does overload
the setrlimit permission to mean the ability to set the resource limit
(soft or hard) of another process or the ability to change one's own
hard limit. For getting resource limits, a new getrlimit permission
is defined. This was not originally defined since getrlimit(2) could
only be used to obtain a process' own limits.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-02-17 12:57:00 +00:00
|
|
|
int security_task_prlimit(const struct cred *cred, const struct cred *tcred,
|
|
|
|
unsigned int flags)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_prlimit, cred, tcred, flags);
|
2017-02-17 12:57:00 +00:00
|
|
|
}
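The @flags argument tells the hook whether the prlimit(2) call reads, writes, or both reads and writes the target's limits. The sketch below shows one way a caller could derive such a flag word from the pointers it was given. The `SKETCH_PRLIMIT_*` names and their numeric values are assumptions made for illustration; the kernel defines its own LSM_PRLIMIT_* constants.

```c
#include <stddef.h>

/* Assumed bit values, for illustration only. */
#define SKETCH_PRLIMIT_READ  1u
#define SKETCH_PRLIMIT_WRITE 2u

/* Derive the flag word from which limit pointers the caller supplied. */
static unsigned int sketch_prlimit_flags(const void *old_rlim,
					 const void *new_rlim)
{
	unsigned int flags = 0;

	if (old_rlim)
		flags |= SKETCH_PRLIMIT_READ;	/* current limits are read back */
	if (new_rlim)
		flags |= SKETCH_PRLIMIT_WRITE;	/* new limits are being set */
	return flags;
}
```

A hook receiving both bits can then apply its "get" and "set" policy checks in a single call.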
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_setrlimit() - Check if setting a new rlimit value is allowed
|
|
|
|
* @p: target task's group leader
|
|
|
|
* @resource: resource whose limit is being set
|
|
|
|
* @new_rlim: new resource limit
|
|
|
|
*
|
|
|
|
* Check permission before setting the resource limits of process @p for
|
|
|
|
* @resource to @new_rlim. The old resource limit values can be examined by
|
|
|
|
* dereferencing (p->signal->rlim + resource).
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2009-08-26 16:41:16 +00:00
|
|
|
int security_task_setrlimit(struct task_struct *p, unsigned int resource,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct rlimit *new_rlim)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_setrlimit, p, resource, new_rlim);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_setscheduler() - Check if setting sched policy/param is allowed
|
|
|
|
* @p: target task
|
|
|
|
*
|
|
|
|
* Check permission before setting scheduling policy and/or parameters of
|
|
|
|
* process @p.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2010-10-14 19:21:18 +00:00
|
|
|
int security_task_setscheduler(struct task_struct *p)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_setscheduler, p);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_getscheduler() - Check if getting scheduling info is allowed
|
|
|
|
* @p: target task
|
|
|
|
*
|
|
|
|
* Check permission before obtaining scheduling information for process @p.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_getscheduler(struct task_struct *p)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_getscheduler, p);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_movememory() - Check if moving memory is allowed
|
|
|
|
* @p: task
|
|
|
|
*
|
|
|
|
* Check permission before moving memory owned by process @p.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_movememory(struct task_struct *p)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_movememory, p);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_kill() - Check if sending a signal is allowed
|
|
|
|
* @p: target process
|
|
|
|
* @info: signal information
|
|
|
|
* @sig: signal value
|
|
|
|
* @cred: credentials of the signal sender, NULL if @current
|
|
|
|
*
|
|
|
|
* Check permission before sending signal @sig to @p. @info can be NULL, the
|
|
|
|
* constant 1, or a pointer to a kernel_siginfo structure. If @info is 1 or
|
|
|
|
* SI_FROMKERNEL(info) is true, then the signal should be viewed as coming from
|
|
|
|
* the kernel and should typically be permitted. SIGIO signals are handled
|
|
|
|
* separately by the send_sigiotask hook in file_security_ops.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-09-25 09:27:20 +00:00
|
|
|
int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
|
2023-02-17 02:33:20 +00:00
|
|
|
int sig, const struct cred *cred)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(task_kill, p, info, sig, cred);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_prctl() - Check if a prctl op is allowed
|
|
|
|
* @option: operation
|
|
|
|
* @arg2: argument
|
|
|
|
* @arg3: argument
|
|
|
|
* @arg4: argument
|
|
|
|
* @arg5: argument
|
|
|
|
*
|
|
|
|
* Check permission before performing a process control operation on the
|
|
|
|
* current process.
|
|
|
|
*
|
|
|
|
* Return: Return -ENOSYS if no-one wanted to handle this op, any other value
|
|
|
|
* to cause prctl() to return immediately with that value.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
|
2023-02-17 02:33:20 +00:00
|
|
|
unsigned long arg4, unsigned long arg5)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:42 +00:00
|
|
|
int thisrc;
|
2020-03-29 00:43:50 +00:00
|
|
|
int rc = LSM_RET_DEFAULT(task_prctl);
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just due
to extra instructions but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook
is likely not to be present. In most cases this is still a better choice
as even when an LSM with one hook is added, empty slots are created for
all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks with BPF LSM
and SELinux enabled with default policies enabled with and without these
patches.
Benchmark Delta(%): (+ is better)
==========================================================================
Execl Throughput +1.9356
File Write 1024 bufsize 2000 maxblocks +6.5953
Pipe Throughput +9.5499
Pipe-based Context Switching +3.0209
Process Creation +2.3246
Shell Scripts (1 concurrent) +1.4975
System Call Overhead +2.7815
System Benchmarks Index Score (Partial Only): +3.4859
In the best case, some syscalls like eventfd_create benefited by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
2015-05-02 22:11:42 +00:00
|
|
|
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, task_prctl) {
|
|
|
|
thisrc = scall->hl->hook.task_prctl(option, arg2, arg3, arg4, arg5);
|
2020-03-29 00:43:50 +00:00
|
|
|
if (thisrc != LSM_RET_DEFAULT(task_prctl)) {
|
2015-05-02 22:11:42 +00:00
|
|
|
rc = thisrc;
|
|
|
|
if (thisrc != 0)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
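The loop in security_task_prctl() combines hook results differently from the plain call_int_hook() wrappers: a hook returning the default value is skipped, a hook returning 0 is recorded but later hooks still run, and any other value ends the walk. A userspace sketch of that rule follows, assuming the default is -ENOSYS as the kernel-doc comment states. The `sketch_*` names and the function-pointer array are hypothetical; the kernel iterates its static-call slots instead.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed default return, matching the documented -ENOSYS. */
#define SKETCH_RET_DEFAULT (-ENOSYS)

/* Hypothetical stand-in for one LSM's task_prctl callback. */
typedef int (*sketch_prctl_hook_t)(int option);

static int sketch_task_prctl(const sketch_prctl_hook_t *hooks, size_t n,
			     int option)
{
	int rc = SKETCH_RET_DEFAULT;

	for (size_t i = 0; i < n; i++) {
		int thisrc = hooks[i](option);

		if (thisrc != SKETCH_RET_DEFAULT) {
			rc = thisrc;		/* some hook handled the op */
			if (thisrc != 0)
				break;		/* non-zero result ends the walk */
		}
	}
	return rc;
}

/* Example hooks: not interested, handled successfully, refused. */
static int sketch_hook_ignore(int option) { (void)option; return -ENOSYS; }
static int sketch_hook_allow(int option)  { (void)option; return 0; }
static int sketch_hook_deny(int option)   { (void)option; return -EPERM; }
```

With the hooks ordered ignore, allow, deny, the result is -EPERM: the 0 from the second hook is recorded, but the third hook still runs and its non-zero value wins.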
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_task_to_inode() - Set the security attributes of a task's inode
|
|
|
|
* @p: task
|
|
|
|
* @inode: inode
|
|
|
|
*
|
|
|
|
* Set the security attributes for an inode based on an associated task's
|
|
|
|
* security attributes, e.g. for /proc/pid inodes.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_task_to_inode(struct task_struct *p, struct inode *inode)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(task_to_inode, p, inode);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 00:27:58 +00:00
|
|
|
/**
|
|
|
|
* security_create_user_ns() - Check if creating a new userns is allowed
|
|
|
|
* @cred: prepared creds
|
|
|
|
*
|
|
|
|
* Check permission prior to creating a new user namespace.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if successful, otherwise < 0 error code.
|
|
|
|
*/
|
security, lsm: Introduce security_create_user_ns()
User namespaces are an effective tool to allow programs to run with
permission without requiring the need for a program to run as root. User
namespaces may also be used as a sandboxing technique. However, attackers
sometimes leverage user namespaces as an initial attack vector to perform
some exploit. [1,2,3]
While it is not the unprivileged user namespace functionality, which
causes the kernel to be exploitable, users/administrators might want to
more granularly limit or at least monitor how various processes use this
functionality, while vulnerable kernel subsystems are being patched.
Preventing user namespace creation already comes in a few forms, in
order of granularity:
1. /proc/sys/user/max_user_namespaces sysctl
2. Distro specific patch(es)
3. CONFIG_USER_NS
To block a task based on its attributes, the LSM hook cred_prepare is a
decent candidate for use because it provides more granular control, and
it is called before create_user_ns():
cred = prepare_creds()
security_prepare_creds()
call_int_hook(cred_prepare, ...
if (cred)
create_user_ns(cred)
Since security_prepare_creds() is meant for LSMs to copy and prepare
credentials, access control is an unintended use of the hook. [4]
Further, security_prepare_creds() will always return a ENOMEM if the
hook returns any non-zero error code.
This hook also does not handle the clone3 case which requires us to
access a user space pointer to know if we're in the CLONE_NEW_USER
call path which may be subject to a TOCTTOU attack.
Lastly, cred_prepare is called in many call paths, and a targeted hook
further limits the frequency of calls which is a beneficial outcome.
Therefore introduce a new function security_create_user_ns() with an
accompanying userns_create LSM hook.
With the new userns_create hook, users will have more control over the
observability and access control over user namespace creation. Users
should expect that normal operation of user namespaces will behave as
usual, and only be impacted when controls are implemented by users or
administrators.
This hook takes the prepared creds for LSM authors to write policy
against. On success, the new namespace is applied to credentials,
otherwise an error is returned.
Links:
1. https://nvd.nist.gov/vuln/detail/CVE-2022-0492
2. https://nvd.nist.gov/vuln/detail/CVE-2022-25636
3. https://nvd.nist.gov/vuln/detail/CVE-2022-34918
4. https://lore.kernel.org/all/1c4b1c0d-12f6-6e9e-a6a3-cdce7418110c@schaufler-ca.com/
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Frederick Lawler <fred@cloudflare.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-15 16:20:25 +00:00
|
|
|
int security_create_user_ns(const struct cred *cred)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(userns_create, cred);
|
2022-08-15 16:20:25 +00:00
|
|
|
}
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_ipc_permission() - Check if sysv ipc access is allowed
|
|
|
|
* @ipcp: ipc permission structure
|
2023-03-08 17:31:03 +00:00
|
|
|
* @flag: requested permissions
|
2023-02-16 16:53:49 +00:00
|
|
|
*
|
|
|
|
* Check permissions for access to IPC.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(ipc_permission, ipcp, flag);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_ipc_getsecid() - Get the sysv ipc object's secid
|
|
|
|
* @ipcp: ipc permission structure
|
|
|
|
* @secid: secid pointer
|
|
|
|
*
|
|
|
|
* Get the secid associated with the ipc object. In case of failure, @secid
|
|
|
|
* will be set to zero.
|
|
|
|
*/
|
2008-03-01 19:51:09 +00:00
|
|
|
void security_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid)
|
|
|
|
{
|
2015-05-02 22:11:42 +00:00
|
|
|
*secid = 0;
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(ipc_getsecid, ipcp, secid);
|
2008-03-01 19:51:09 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_msg_alloc() - Allocate a sysv ipc message LSM blob
|
|
|
|
* @msg: message structure
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to the msg->security field. The
|
|
|
|
* security field is initialized to NULL when the structure is first created.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if operation was successful and permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_msg_msg_alloc(struct msg_msg *msg)
|
|
|
|
{
|
2018-11-20 19:55:02 +00:00
|
|
|
int rc = lsm_msg_msg_alloc(msg);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
2024-01-30 12:56:59 +00:00
|
|
|
rc = call_int_hook(msg_msg_alloc_security, msg);
|
2018-11-20 19:55:02 +00:00
|
|
|
if (unlikely(rc))
|
|
|
|
security_msg_msg_free(msg);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_msg_free() - Free a sysv ipc message LSM blob
|
|
|
|
* @msg: message structure
|
|
|
|
*
|
|
|
|
* Deallocate the security structure for this message.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_msg_msg_free(struct msg_msg *msg)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(msg_msg_free_security, msg);
|
2018-11-20 19:55:02 +00:00
|
|
|
kfree(msg->security);
|
|
|
|
msg->security = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
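/*
 * Illustration only: a minimal userspace sketch of the allocate, run hook,
 * undo-on-failure pattern used by security_msg_msg_alloc() and
 * security_msg_msg_free() above. Every demo_* name is hypothetical, and
 * calloc()/free() stand in for the kernel's blob allocator and kfree().
 */

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct msg_msg; only the blob pointer matters. */
struct demo_msg {
	void *security;
};

/* Hypothetical blob allocator, playing the role of lsm_msg_msg_alloc(). */
static int demo_blob_alloc(struct demo_msg *msg, size_t size)
{
	msg->security = calloc(1, size);
	return msg->security ? 0 : -ENOMEM;
}

/* Release the blob and clear the pointer so a second free is harmless. */
static void demo_msg_free(struct demo_msg *msg)
{
	free(msg->security);
	msg->security = NULL;
}

/* Allocate the blob, run the hook, and undo the allocation if it fails. */
static int demo_msg_alloc(struct demo_msg *msg,
			  int (*hook)(struct demo_msg *))
{
	int rc = demo_blob_alloc(msg, 16);

	if (rc)
		return rc;
	rc = hook(msg);
	if (rc)
		demo_msg_free(msg);
	return rc;
}

/* Two hypothetical hooks: one grants permission, one refuses it. */
static int demo_hook_allow(struct demo_msg *msg)
{
	(void)msg;
	return 0;
}

static int demo_hook_deny(struct demo_msg *msg)
{
	(void)msg;
	return -EACCES;
}
```

/*
 * With demo_hook_deny the caller gets -EACCES back and finds the security
 * pointer already reset to NULL, so no separate cleanup call is needed on
 * the error path.
 */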
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_queue_alloc() - Allocate a sysv ipc msg queue LSM blob
|
|
|
|
* @msq: sysv ipc permission structure
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to @msq. The security field is
|
|
|
|
* initialized to NULL when the structure is first created.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if operation was successful and permission is granted.
|
|
|
|
*/
|
2018-03-23 02:22:26 +00:00
|
|
|
int security_msg_queue_alloc(struct kern_ipc_perm *msq)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2018-11-20 19:55:02 +00:00
|
|
|
int rc = lsm_ipc_alloc(msq);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
2024-01-30 12:56:59 +00:00
|
|
|
rc = call_int_hook(msg_queue_alloc_security, msq);
|
2018-11-20 19:55:02 +00:00
|
|
|
if (unlikely(rc))
|
|
|
|
security_msg_queue_free(msq);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_queue_free() - Free a sysv ipc msg queue LSM blob
|
|
|
|
* @msq: sysv ipc permission structure
|
|
|
|
*
|
|
|
|
* Deallocate security field @msq->security for the message queue.
|
|
|
|
*/
|
2018-03-23 02:22:26 +00:00
|
|
|
void security_msg_queue_free(struct kern_ipc_perm *msq)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(msg_queue_free_security, msq);
|
2018-11-20 19:55:02 +00:00
|
|
|
kfree(msq->security);
|
|
|
|
msq->security = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_queue_associate() - Check if a msg queue operation is allowed
|
|
|
|
* @msq: sysv ipc permission structure
|
|
|
|
* @msqflg: operation flags
|
|
|
|
*
|
|
|
|
* Check permission when a message queue is requested through the msgget system
|
|
|
|
* call. This hook is only called when returning the message queue identifier
|
|
|
|
* for an existing message queue, not when a new message queue is created.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 02:22:26 +00:00
|
|
|
int security_msg_queue_associate(struct kern_ipc_perm *msq, int msqflg)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(msg_queue_associate, msq, msqflg);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_queue_msgctl() - Check if a msg queue operation is allowed
|
|
|
|
* @msq: sysv ipc permission structure
|
|
|
|
* @cmd: operation
|
|
|
|
*
|
|
|
|
* Check permission when a message control operation specified by @cmd is to be
|
|
|
|
* performed on the message queue with permissions specified in @msq.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 02:22:26 +00:00
|
|
|
int security_msg_queue_msgctl(struct kern_ipc_perm *msq, int cmd)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(msg_queue_msgctl, msq, cmd);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_queue_msgsnd() - Check if sending a sysv ipc message is allowed
|
|
|
|
* @msq: sysv ipc permission structure
|
|
|
|
* @msg: message
|
|
|
|
* @msqflg: operation flags
|
|
|
|
*
|
|
|
|
* Check permission before a message, @msg, is enqueued on the message queue
|
|
|
|
* with permissions specified in @msq.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 02:22:26 +00:00
|
|
|
int security_msg_queue_msgsnd(struct kern_ipc_perm *msq,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct msg_msg *msg, int msqflg)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(msg_queue_msgsnd, msq, msg, msqflg);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_msg_queue_msgrcv() - Check if receiving a sysv ipc msg is allowed
|
|
|
|
* @msq: sysv ipc permission structure
|
|
|
|
* @msg: message
|
|
|
|
* @target: target task
|
|
|
|
* @type: type of message requested
|
|
|
|
* @mode: operation flags
|
|
|
|
*
|
|
|
|
* Check permission before a message, @msg, is removed from the message queue.
|
|
|
|
* The @target task structure contains a pointer to the process that will be
|
|
|
|
* receiving the message (not equal to the current process when inline receives
|
|
|
|
* are being performed).
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 02:22:26 +00:00
|
|
|
int security_msg_queue_msgrcv(struct kern_ipc_perm *msq, struct msg_msg *msg,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct task_struct *target, long type, int mode)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(msg_queue_msgrcv, msq, msg, target, type, mode);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_shm_alloc() - Allocate a sysv shm LSM blob
|
|
|
|
* @shp: sysv ipc permission structure
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to the @shp security field. The
|
|
|
|
* security field is initialized to NULL when the structure is first created.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if operation was successful and permission is granted.
|
|
|
|
*/
|
2018-03-23 02:08:27 +00:00
|
|
|
int security_shm_alloc(struct kern_ipc_perm *shp)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2018-11-20 19:55:02 +00:00
|
|
|
int rc = lsm_ipc_alloc(shp);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
2024-01-30 12:56:59 +00:00
|
|
|
rc = call_int_hook(shm_alloc_security, shp);
|
2018-11-20 19:55:02 +00:00
|
|
|
if (unlikely(rc))
|
|
|
|
security_shm_free(shp);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_shm_free() - Free a sysv shm LSM blob
|
|
|
|
* @shp: sysv ipc permission structure
|
|
|
|
*
|
|
|
|
* Deallocate the security structure @shp->security for the memory segment.
|
|
|
|
*/
|
2018-03-23 02:08:27 +00:00
|
|
|
void security_shm_free(struct kern_ipc_perm *shp)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(shm_free_security, shp);
|
2018-11-20 19:55:02 +00:00
|
|
|
kfree(shp->security);
|
|
|
|
shp->security = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_shm_associate() - Check if a sysv shm operation is allowed
|
|
|
|
* @shp: sysv ipc permission structure
|
|
|
|
* @shmflg: operation flags
|
|
|
|
*
|
|
|
|
* Check permission when a shared memory region is requested through the shmget
|
|
|
|
* system call. This hook is only called when returning the shared memory
|
|
|
|
* region identifier for an existing region, not when a new shared memory
|
|
|
|
* region is created.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 02:08:27 +00:00
|
|
|
int security_shm_associate(struct kern_ipc_perm *shp, int shmflg)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(shm_associate, shp, shmflg);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_shm_shmctl() - Check if a sysv shm operation is allowed
|
|
|
|
* @shp: sysv ipc permission structure
|
|
|
|
* @cmd: operation
|
|
|
|
*
|
|
|
|
* Check permission when a shared memory control operation specified by @cmd is
|
|
|
|
* to be performed on the shared memory region with permissions in @shp.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 02:08:27 +00:00
|
|
|
int security_shm_shmctl(struct kern_ipc_perm *shp, int cmd)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(shm_shmctl, shp, cmd);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_shm_shmat() - Check if a sysv shm attach operation is allowed
|
|
|
|
* @shp: sysv ipc permission structure
|
|
|
|
* @shmaddr: address of memory region to attach
|
|
|
|
* @shmflg: operation flags
|
|
|
|
*
|
|
|
|
* Check permissions prior to allowing the shmat system call to attach the
|
|
|
|
* shared memory segment with permissions @shp to the data segment of the
|
|
|
|
* calling process. The attaching address is specified by @shmaddr.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_shm_shmat(struct kern_ipc_perm *shp,
|
|
|
|
char __user *shmaddr, int shmflg)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(shm_shmat, shp, shmaddr, shmflg);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_sem_alloc() - Allocate a sysv semaphore LSM blob
|
|
|
|
* @sma: sysv ipc permission structure
|
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to the @sma security field. The
|
|
|
|
* security field is initialized to NULL when the structure is first created.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if operation was successful and permission is granted.
|
|
|
|
*/
|
2018-03-23 01:52:43 +00:00
|
|
|
int security_sem_alloc(struct kern_ipc_perm *sma)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2018-11-20 19:55:02 +00:00
|
|
|
int rc = lsm_ipc_alloc(sma);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
2024-01-30 12:56:59 +00:00
|
|
|
rc = call_int_hook(sem_alloc_security, sma);
|
2018-11-20 19:55:02 +00:00
|
|
|
if (unlikely(rc))
|
|
|
|
security_sem_free(sma);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_sem_free() - Free a sysv semaphore LSM blob
|
|
|
|
* @sma: sysv ipc permission structure
|
|
|
|
*
|
|
|
|
* Deallocate security structure @sma->security for the semaphore.
|
|
|
|
*/
|
2018-03-23 01:52:43 +00:00
|
|
|
void security_sem_free(struct kern_ipc_perm *sma)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(sem_free_security, sma);
|
2018-11-20 19:55:02 +00:00
|
|
|
kfree(sma->security);
|
|
|
|
sma->security = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_sem_associate() - Check if a sysv semaphore operation is allowed
|
|
|
|
* @sma: sysv ipc permission structure
|
|
|
|
* @semflg: operation flags
|
|
|
|
*
|
|
|
|
* Check permission when a semaphore is requested through the semget system
|
|
|
|
* call. This hook is only called when returning the semaphore identifier for
|
|
|
|
* an existing semaphore, not when a new one must be created.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 01:52:43 +00:00
|
|
|
int security_sem_associate(struct kern_ipc_perm *sma, int semflg)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sem_associate, sma, semflg);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
2023-03-08 17:31:03 +00:00
|
|
|
* security_sem_semctl() - Check if a sysv semaphore operation is allowed
|
2023-02-16 16:53:49 +00:00
|
|
|
* @sma: sysv ipc permission structure
|
|
|
|
* @cmd: operation
|
|
|
|
*
|
|
|
|
* Check permission when a semaphore operation specified by @cmd is to be
|
|
|
|
* performed on the semaphore.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 01:52:43 +00:00
|
|
|
int security_sem_semctl(struct kern_ipc_perm *sma, int cmd)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sem_semctl, sma, cmd);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 16:53:49 +00:00
|
|
|
/**
|
|
|
|
* security_sem_semop() - Check if a sysv semaphore operation is allowed
|
|
|
|
* @sma: sysv ipc permission structure
|
|
|
|
* @sops: operations to perform
|
|
|
|
* @nsops: number of operations
|
|
|
|
* @alter: flag indicating changes will be made
|
|
|
|
*
|
|
|
|
* Check permissions before performing operations on members of the semaphore
|
|
|
|
* set. If the @alter flag is nonzero, the semaphore set may be modified.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2018-03-23 01:52:43 +00:00
|
|
|
int security_sem_semop(struct kern_ipc_perm *sma, struct sembuf *sops,
|
2023-02-17 02:33:20 +00:00
|
|
|
unsigned nsops, int alter)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sem_semop, sma, sops, nsops, alter);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_d_instantiate() - Populate an inode's LSM state based on a dentry
|
|
|
|
* @dentry: dentry
|
|
|
|
* @inode: inode
|
|
|
|
*
|
|
|
|
* Fill in @inode security information for a @dentry if allowed.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_d_instantiate(struct dentry *dentry, struct inode *inode)
|
|
|
|
{
|
|
|
|
if (unlikely(inode && IS_PRIVATE(inode)))
|
|
|
|
return;
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(d_instantiate, dentry, inode);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_d_instantiate);
|
|
|
|
|
2023-09-12 20:56:49 +00:00
|
|
|
/*
|
|
|
|
* Please keep this in sync with its counterpart in security/lsm_syscalls.c
|
|
|
|
*/
|
|
|
|
|
|
|
|
/**
|
|
|
|
* security_getselfattr - Read an LSM attribute of the current process.
|
|
|
|
* @attr: which attribute to return
|
|
|
|
* @uctx: the user-space destination for the information, or NULL
|
|
|
|
* @size: pointer to the size of space available to receive the data
|
|
|
|
* @flags: special handling options. LSM_FLAG_SINGLE indicates that only
|
|
|
|
* attributes associated with the LSM identified in the passed @uctx be
|
|
|
|
* reported.
|
|
|
|
*
|
|
|
|
* A NULL value for @uctx can be used to get both the number of attributes
|
|
|
|
* and the size of the data.
|
|
|
|
*
|
|
|
|
* Returns the number of attributes found on success, negative value
|
|
|
|
* on error. @size is reset to the total size of the data.
|
|
|
|
* If @size is insufficient to contain the data -E2BIG is returned.
|
|
|
|
*/
|
|
|
|
int security_getselfattr(unsigned int attr, struct lsm_ctx __user *uctx,
|
2024-03-14 15:31:26 +00:00
|
|
|
u32 __user *size, u32 flags)
|
2023-09-12 20:56:49 +00:00
|
|
|
{
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines leading to overhead, not just due
to extra instruction but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
While this patch uses static_branch_unlikely indicating that an LSM hook
is likely to be not present. In most cases this is still a better choice
as even when an LSM with one hook is added, empty slots are created for
all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
Below are results of the relevant Unixbench system benchmarks with BPF LSM
and SELinux enabled with default policies enabled with and without these
patches.
Benchmark Delta(%): (+ is better)
==========================================================================
Execl Throughput +1.9356
File Write 1024 bufsize 2000 maxblocks +6.5953
Pipe Throughput +9.5499
Pipe-based Context Switching +3.0209
Process Creation +2.3246
Shell Scripts (1 concurrent) +1.4975
System Call Overhead +2.7815
System Benchmarks Index Score (Partial Only): +3.4859
In the best case, some syscalls like eventfd_create benefitted to about
~10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
2023-09-12 20:56:49 +00:00
|
|
|
struct lsm_ctx lctx = { .id = LSM_ID_UNDEF, };
|
|
|
|
u8 __user *base = (u8 __user *)uctx;
|
2024-03-14 15:31:26 +00:00
|
|
|
u32 entrysize;
|
|
|
|
u32 total = 0;
|
|
|
|
u32 left;
|
2023-09-12 20:56:49 +00:00
|
|
|
bool toobig = false;
|
|
|
|
bool single = false;
|
|
|
|
int count = 0;
|
|
|
|
int rc;
|
|
|
|
|
|
|
|
if (attr == LSM_ATTR_UNDEF)
|
|
|
|
return -EINVAL;
|
|
|
|
if (size == NULL)
|
|
|
|
return -EINVAL;
|
|
|
|
if (get_user(left, size))
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
if (flags) {
|
|
|
|
/*
|
|
|
|
* Only flag supported is LSM_FLAG_SINGLE
|
|
|
|
*/
|
2023-10-24 16:42:38 +00:00
|
|
|
if (flags != LSM_FLAG_SINGLE || !uctx)
|
2023-09-12 20:56:49 +00:00
|
|
|
return -EINVAL;
|
2023-10-24 16:42:38 +00:00
|
|
|
if (copy_from_user(&lctx, uctx, sizeof(lctx)))
|
2023-09-12 20:56:49 +00:00
|
|
|
return -EFAULT;
|
|
|
|
/*
|
|
|
|
* If the LSM ID isn't specified it is an error.
|
|
|
|
*/
|
|
|
|
if (lctx.id == LSM_ID_UNDEF)
|
|
|
|
return -EINVAL;
|
|
|
|
single = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* In the usual case gather all the data from the LSMs.
|
|
|
|
* In the single case only get the data from the LSM specified.
|
|
|
|
*/
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, getselfattr) {
|
|
|
|
if (single && lctx.id != scall->hl->lsmid->id)
|
2023-09-12 20:56:49 +00:00
|
|
|
continue;
|
|
|
|
entrysize = left;
|
|
|
|
if (base)
|
|
|
|
uctx = (struct lsm_ctx __user *)(base + total);
|
2024-08-16 15:43:07 +00:00
		rc = scall->hl->hook.getselfattr(attr, uctx, &entrysize, flags);
2023-09-12 20:56:49 +00:00
		if (rc == -EOPNOTSUPP) {
			rc = 0;
			continue;
		}
		if (rc == -E2BIG) {
2023-10-24 16:38:40 +00:00
			rc = 0;
2023-09-12 20:56:49 +00:00
			left = 0;
2023-10-24 16:38:40 +00:00
			toobig = true;
2023-09-12 20:56:49 +00:00
		} else if (rc < 0)
			return rc;
		else
			left -= entrysize;

		total += entrysize;
		count += rc;
		if (single)
			break;
	}
	if (put_user(total, size))
		return -EFAULT;
	if (toobig)
		return -E2BIG;
	if (count == 0)
		return LSM_RET_DEFAULT(getselfattr);
	return count;
}

/*
 * Please keep this in sync with its counterpart in security/lsm_syscalls.c
 */

/**
 * security_setselfattr - Set an LSM attribute on the current process.
 * @attr: which attribute to set
 * @uctx: the user-space source for the information
 * @size: the size of the data
 * @flags: reserved for future use, must be 0
 *
 * Set an LSM attribute for the current process. The LSM, attribute
 * and new value are included in @uctx.
 *
 * Returns 0 on success, -EINVAL if the input is inconsistent, -EFAULT
 * if the user buffer is inaccessible, -E2BIG if size is too big, or an
 * LSM specific failure.
 */
int security_setselfattr(unsigned int attr, struct lsm_ctx __user *uctx,
2024-03-14 15:31:26 +00:00
			 u32 size, u32 flags)
2023-09-12 20:56:49 +00:00
{
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls,
which use retpolines as a mitigation for speculative attacks (Branch
History / Target injection). This adds extra overhead, which is
especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just due
to extra instructions but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
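The slot layout described above can be sketched in plain userspace C. This is
only an illustration, not the kernel implementation: the names hook_slot,
file_ioctl_slots, register_file_ioctl, call_file_ioctl and MAX_SLOTS are
hypothetical, a bool stands in for the per-slot static key, and an ordinary
function pointer stands in for the static call target (so the calls here are
still indirect, unlike the patched kernel code).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 3 /* hypothetical: one slot per LSM that could be enabled */

typedef int (*file_ioctl_fn)(int fd, unsigned int cmd, unsigned long arg);

/* One slot: `enabled` stands in for the static key, `fn` for the call target. */
struct hook_slot {
	bool enabled;
	file_ioctl_fn fn;
};

/* Fixed-size array per hook, replacing the linked list of callbacks. */
static struct hook_slot file_ioctl_slots[MAX_SLOTS];

/* "Boot time": fill the next free slot, so slot order is the LSM order. */
static int register_file_ioctl(file_ioctl_fn fn)
{
	assert(fn != NULL);
	for (size_t i = 0; i < MAX_SLOTS; i++) {
		if (!file_ioctl_slots[i].enabled) {
			file_ioctl_slots[i].fn = fn;
			file_ioctl_slots[i].enabled = true;
			return 0;
		}
	}
	return -1; /* no free slot left */
}

/* Walk the slots in order; the first non-zero return value wins. */
static int call_file_ioctl(int fd, unsigned int cmd, unsigned long arg)
{
	for (size_t i = 0; i < MAX_SLOTS; i++) {
		if (!file_ioctl_slots[i].enabled)
			continue; /* empty slot: skipped, like a disabled static key */
		int rc = file_ioctl_slots[i].fn(fd, cmd, arg);
		if (rc != 0)
			return rc;
	}
	return 0;
}

/* Two toy callbacks standing in for real LSM hooks. */
static int allow_all(int fd, unsigned int cmd, unsigned long arg)
{
	(void)fd; (void)cmd; (void)arg;
	return 0;
}

static int deny_cmd_42(int fd, unsigned int cmd, unsigned long arg)
{
	(void)fd; (void)arg;
	return cmd == 42 ? -1 : 0;
}
```

Registering allow_all and then deny_cmd_42 gives the same early-exit shape as
the disassembly: each enabled slot is called in order, and a non-zero result
returns immediately. With no slots registered, the walk falls through and
returns 0.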
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook is
likely not to be present. In most cases this is still the better choice,
because even when an LSM with one hook is added, empty slots are created
for all LSM hooks (especially when many LSMs that do not initialize most
hooks are present on the system).
Some hooks don't use call_int_hook or call_void_hook. These hooks are
updated to use a new macro called lsm_for_each_hook, in which the
lsm_callback is invoked through an indirect call.
Below are the results of the relevant Unixbench system benchmarks, with
BPF LSM and SELinux enabled with default policies, with and without these
patches.
Benchmark                                        Delta(%) (+ is better)
==========================================================================
Execl Throughput                                  +1.9356
File Write 1024 bufsize 2000 maxblocks            +6.5953
Pipe Throughput                                   +9.5499
Pipe-based Context Switching                      +3.0209
Process Creation                                  +2.3246
Shell Scripts (1 concurrent)                      +1.4975
System Call Overhead                              +2.7815
System Benchmarks Index Score (Partial Only):     +3.4859
In the best case, some syscalls like eventfd_create improved by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
	struct lsm_static_call *scall;
2023-09-12 20:56:49 +00:00
	struct lsm_ctx *lctx;
	int rc = LSM_RET_DEFAULT(setselfattr);
lsm: fix integer overflow in lsm_set_self_attr() syscall
security_setselfattr() has an integer overflow bug that leads to
out-of-bounds access when userspace provides bogus input:
`lctx->ctx_len + sizeof(*lctx)` is checked against `lctx->len` (and,
redundantly, also against `size`), but there are no checks on
`lctx->ctx_len`.
Therefore, userspace can provide an `lsm_ctx` with `->ctx_len` set to a
value between `-sizeof(struct lsm_ctx)` and -1, and this bogus `->ctx_len`
will then be passed to an LSM module as a buffer length, causing LSM
modules to perform out-of-bounds accesses.
The following reproducer will demonstrate this under ASAN (if AppArmor is
loaded as an LSM):
```
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

struct lsm_ctx {
  uint64_t id;
  uint64_t flags;
  uint64_t len;
  uint64_t ctx_len;
  char ctx[];
};
int main(void) {
  size_t size = sizeof(struct lsm_ctx);
  struct lsm_ctx *ctx = malloc(size);
  ctx->id = 104/*LSM_ID_APPARMOR*/;
  ctx->flags = 0;
  ctx->len = size;
  ctx->ctx_len = -sizeof(struct lsm_ctx);
  syscall(
    460/*__NR_lsm_set_self_attr*/,
    /*attr=*/ 100/*LSM_ATTR_CURRENT*/,
    /*ctx=*/ ctx,
    /*size=*/ size,
    /*flags=*/ 0
  );
}
```
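The wraparound itself can be shown in userspace without the syscall. The
sketch below is an illustration under stated assumptions, not the kernel code:
lsm_ctx_hdr, lengths_ok_unchecked and lengths_ok_checked are hypothetical
names, the unchecked variant is only a reconstruction of the shape of check
described above, and the GCC/Clang builtin __builtin_add_overflow stands in
for the kernel's check_add_overflow helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Header fields of the reproducer's struct, without the flexible array. */
struct lsm_ctx_hdr {
	uint64_t id;
	uint64_t flags;
	uint64_t len;
	uint64_t ctx_len;
};

static_assert(sizeof(struct lsm_ctx_hdr) == 4 * sizeof(uint64_t),
	      "header is expected to have no padding");

/* Shape of the unchecked test: the sum can wrap modulo 2^64. */
static bool lengths_ok_unchecked(const struct lsm_ctx_hdr *c, uint64_t size)
{
	uint64_t required = sizeof(*c) + c->ctx_len; /* may wrap to a small value */

	return size >= c->len && c->len >= required;
}

/* Overflow-checked variant: reject the input if the sum does not fit. */
static bool lengths_ok_checked(const struct lsm_ctx_hdr *c, uint64_t size)
{
	uint64_t required;

	if (__builtin_add_overflow((uint64_t)sizeof(*c), c->ctx_len, &required))
		return false;
	return size >= c->len && c->len >= required;
}
```

With the reproducer's values (len equal to the header size and ctx_len equal
to minus the header size), the unchecked sum wraps to 0, so the bogus ctx_len
is accepted. The checked variant rejects the same input.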
Fixes: a04a1198088a ("LSM: syscalls for current process attributes")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, removed ref to ASAN splat that isn't included]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-02-14 16:05:38 +00:00
	u64 required_len;
2023-09-12 20:56:49 +00:00

	if (flags)
		return -EINVAL;
	if (size < sizeof(*lctx))
		return -EINVAL;
	if (size > PAGE_SIZE)
		return -E2BIG;

2023-11-01 22:42:12 +00:00
	lctx = memdup_user(uctx, size);
	if (IS_ERR(lctx))
		return PTR_ERR(lctx);
2023-09-12 20:56:49 +00:00

2024-02-14 16:05:38 +00:00
	if (size < lctx->len ||
	    check_add_overflow(sizeof(*lctx), lctx->ctx_len, &required_len) ||
	    lctx->len < required_len) {
2023-09-12 20:56:49 +00:00
		rc = -EINVAL;
		goto free_out;
	}

2024-08-16 15:43:07 +00:00
	lsm_for_each_hook(scall, setselfattr)
		if ((scall->hl->lsmid->id) == lctx->id) {
			rc = scall->hl->hook.setselfattr(attr, lctx, size, flags);
2023-09-12 20:56:49 +00:00
			break;
		}

free_out:
	kfree(lctx);
	return rc;
}

2023-02-08 21:31:55 +00:00
/**
 * security_getprocattr() - Read an attribute for a task
 * @p: the task
2023-09-12 20:56:48 +00:00
 * @lsmid: LSM identification
2023-02-08 21:31:55 +00:00
 * @name: attribute name
 * @value: attribute value
 *
 * Read attribute @name for task @p and store it into @value if allowed.
 *
 * Return: Returns the length of @value on success, a negative value otherwise.
 */
2023-09-12 20:56:48 +00:00
int security_getprocattr(struct task_struct *p, int lsmid, const char *name,
			 char **value)
2007-10-17 06:31:32 +00:00
{
2024-08-16 15:43:07 +00:00
	struct lsm_static_call *scall;
2018-09-22 00:16:59 +00:00

2024-08-16 15:43:07 +00:00
	lsm_for_each_hook(scall, getprocattr) {
		if (lsmid != 0 && lsmid != scall->hl->lsmid->id)
2018-09-22 00:16:59 +00:00
			continue;
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines leading to overhead, not just due
to extra instruction but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook is
likely not to be present. In most cases this is still the better choice
because, even when an LSM with a single hook is added, empty slots are
created for all LSM hooks (especially when many LSMs that do not
initialize most hooks are present on the system).
Some hooks don't use call_int_hook or call_void_hook. These hooks are
updated to use a new macro called lsm_for_each_hook, in which the LSM
callback is invoked through an indirect call.
Below are results of the relevant Unixbench system benchmarks, run with
BPF LSM and SELinux enabled with their default policies, with and
without these patches.
Benchmark                                       Delta(%): (+ is better)
==========================================================================
Execl Throughput                                             +1.9356
File Write 1024 bufsize 2000 maxblocks                       +6.5953
Pipe Throughput                                              +9.5499
Pipe-based Context Switching                                 +3.0209
Process Creation                                             +2.3246
Shell Scripts (1 concurrent)                                 +1.4975
System Call Overhead                                         +2.7815
System Benchmarks Index Score (Partial Only):                +3.4859
In the best case, some syscalls such as eventfd_create improved by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
return scall->hl->hook.getprocattr(p, name, value);
|
2018-09-22 00:16:59 +00:00
|
|
|
}
|
2020-03-29 00:43:50 +00:00
|
|
|
return LSM_RET_DEFAULT(getprocattr);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-08 21:31:55 +00:00
|
|
|
/**
|
|
|
|
* security_setprocattr() - Set an attribute for a task
|
2023-09-12 20:56:48 +00:00
|
|
|
* @lsmid: LSM identification
|
2023-02-08 21:31:55 +00:00
|
|
|
* @name: attribute name
|
|
|
|
* @value: attribute value
|
|
|
|
* @size: attribute value size
|
|
|
|
*
|
|
|
|
* Write (set) the current task's attribute @name to @value, size @size if
|
|
|
|
* allowed.
|
|
|
|
*
|
|
|
|
* Return: Returns bytes written on success, a negative value otherwise.
|
|
|
|
*/
|
2023-09-12 20:56:48 +00:00
|
|
|
int security_setprocattr(int lsmid, const char *name, void *value, size_t size)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
2018-09-22 00:16:59 +00:00
|
|
|
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, setprocattr) {
|
|
|
|
if (lsmid != 0 && lsmid != scall->hl->lsmid->id)
|
2018-09-22 00:16:59 +00:00
|
|
|
continue;
|
2024-08-16 15:43:07 +00:00
|
|
|
return scall->hl->hook.setprocattr(name, value, size);
|
2018-09-22 00:16:59 +00:00
|
|
|
}
|
2020-03-29 00:43:50 +00:00
|
|
|
return LSM_RET_DEFAULT(setprocattr);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
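The lsmid filtering in security_setprocattr() can be illustrated with a small userspace model. This is a sketch, not the kernel implementation: `model_hook`, `model_setprocattr` and the sample callbacks are hypothetical, and the `-EINVAL` fallback is an assumption standing in for `LSM_RET_DEFAULT(setprocattr)`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one registered setprocattr callback. */
struct model_hook {
	int lsm_id;
	int (*setprocattr)(const char *name, void *value, size_t size);
};

/*
 * Mirror of the loop shape above: skip callbacks whose id does not match
 * a non-zero lsmid, and return the result of the first callback reached.
 */
static int model_setprocattr(const struct model_hook *hooks, size_t n,
			     int lsmid, const char *name, void *value,
			     size_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (lsmid != 0 && lsmid != hooks[i].lsm_id)
			continue;
		return hooks[i].setprocattr(name, value, size);
	}
	return -EINVAL;  /* assumed default when no callback matches */
}

/* Sample callbacks: one reports "bytes written", one refuses. */
static int accept_all(const char *name, void *value, size_t size)
{
	(void)name;
	(void)value;
	return (int)size;
}

static int reject_all(const char *name, void *value, size_t size)
{
	(void)name;
	(void)value;
	(void)size;
	return -EPERM;
}
```

With lsmid set to 0, the first registered callback handles the request. A non-zero lsmid selects exactly one callback, and an unknown id falls through to the default.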
|
|
|
|
|
2023-02-12 20:06:59 +00:00
|
|
|
/**
|
|
|
|
* security_netlink_send() - Save info and check if netlink sending is allowed
|
|
|
|
* @sk: sending socket
|
|
|
|
* @skb: netlink message
|
|
|
|
*
|
|
|
|
* Save security information for a netlink message so that permission checking
|
|
|
|
* can be performed when the message is processed. The security information
|
|
|
|
* can be saved using the eff_cap field of the netlink_skb_parms structure.
|
|
|
|
* Also may be used to provide fine grained control over message transmission.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if the information was successfully saved and message is
|
|
|
|
* allowed to be transmitted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_netlink_send(struct sock *sk, struct sk_buff *skb)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(netlink_send, sk, skb);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
2024-02-17 13:35:04 +00:00
|
|
|
* security_ismaclabel() - Check if the named attribute is a MAC label
|
2023-02-16 22:34:14 +00:00
|
|
|
* @name: full extended attribute name
|
|
|
|
*
|
|
|
|
* Check if the extended attribute specified by @name represents a MAC label.
|
|
|
|
*
|
|
|
|
* Return: Returns 1 if name is a MAC attribute otherwise returns 0.
|
|
|
|
*/
|
2013-05-22 16:50:35 +00:00
|
|
|
int security_ismaclabel(const char *name)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(ismaclabel, name);
|
2013-05-22 16:50:35 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_ismaclabel);
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_secid_to_secctx() - Convert a secid to a secctx
|
|
|
|
* @secid: secid
|
|
|
|
* @secdata: secctx
|
|
|
|
* @seclen: secctx length
|
|
|
|
*
|
|
|
|
* Convert secid to security context. If @secdata is NULL the length of the
|
|
|
|
* result will be returned in @seclen, but no @secdata will be returned. This
|
|
|
|
* does mean that the length could change between calls to check the length and
|
|
|
|
* the next call which actually allocates and returns the @secdata.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(secid_to_secctx, secid, secdata, seclen);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_secid_to_secctx);
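The NULL-@secdata length query described in the comment above can be sketched against a mock. Everything here is illustrative: `mock_secid_to_secctx`, `fetch_label` and the label string are hypothetical, and the mock only imitates the documented calling convention in userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mock following the documented convention, not kernel code. */
static int mock_secid_to_secctx(uint32_t secid, char **secdata,
				uint32_t *seclen)
{
	static const char label[] = "example_u:example_r:example_t";

	if (secid != 1)
		return -EINVAL;
	*seclen = (uint32_t)strlen(label);
	if (!secdata)
		return 0;  /* length-only query: nothing is allocated */
	*secdata = malloc(*seclen + 1);
	if (!*secdata)
		return -ENOMEM;
	memcpy(*secdata, label, *seclen + 1);
	return 0;
}

/*
 * Probe the length first, then fetch. The length reported by the second
 * call is the one to trust, because the context may change in between.
 */
static int fetch_label(uint32_t secid, char **out, uint32_t *len)
{
	uint32_t probed;
	int rc = mock_secid_to_secctx(secid, NULL, &probed);

	if (rc)
		return rc;
	return mock_secid_to_secctx(secid, out, len);
}
```

A caller that sizes its own buffer from the probed length would need to recheck the length after the second call. This sketch avoids the issue by letting the second call allocate.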
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_secctx_to_secid() - Convert a secctx to a secid
|
|
|
|
* @secdata: secctx
|
|
|
|
* @seclen: length of secctx
|
|
|
|
* @secid: secid
|
|
|
|
*
|
|
|
|
* Convert security context to secid.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2008-04-29 19:52:51 +00:00
|
|
|
int security_secctx_to_secid(const char *secdata, u32 seclen, u32 *secid)
|
2008-01-15 23:47:35 +00:00
|
|
|
{
|
2015-05-02 22:11:42 +00:00
|
|
|
*secid = 0;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(secctx_to_secid, secdata, seclen, secid);
|
2008-01-15 23:47:35 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_secctx_to_secid);
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_release_secctx() - Free a secctx buffer
|
|
|
|
* @secdata: secctx
|
|
|
|
* @seclen: length of secctx
|
|
|
|
*
|
|
|
|
* Release the security context.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_release_secctx(char *secdata, u32 seclen)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(release_secctx, secdata, seclen);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_release_secctx);
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_inode_invalidate_secctx() - Invalidate an inode's security label
|
|
|
|
* @inode: inode
|
|
|
|
*
|
|
|
|
* Notify the security module that it must revalidate the security context of
|
|
|
|
* an inode.
|
|
|
|
*/
|
2015-12-24 16:09:40 +00:00
|
|
|
void security_inode_invalidate_secctx(struct inode *inode)
|
|
|
|
{
|
|
|
|
call_void_hook(inode_invalidate_secctx, inode);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_invalidate_secctx);
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
2023-10-04 20:08:09 +00:00
|
|
|
* security_inode_notifysecctx() - Notify the LSM of an inode's security label
|
2023-02-16 22:34:14 +00:00
|
|
|
* @inode: inode
|
|
|
|
* @ctx: secctx
|
|
|
|
* @ctxlen: length of secctx
|
|
|
|
*
|
|
|
|
* Notify the security module of what the security context of an inode should
|
|
|
|
* be. Initializes the incore security context managed by the security module
|
|
|
|
* for this inode. Example usage: NFS client invokes this hook to initialize
|
|
|
|
* the security context in its incore inode to the value provided by the server
|
|
|
|
* for the file when the server returned the file's attributes to the client.
|
|
|
|
* Must be called with inode->i_mutex locked.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2009-09-03 18:25:57 +00:00
|
|
|
int security_inode_notifysecctx(struct inode *inode, void *ctx, u32 ctxlen)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_notifysecctx, inode, ctx, ctxlen);
|
2009-09-03 18:25:57 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_notifysecctx);
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_inode_setsecctx() - Change the security label of an inode
|
|
|
|
* @dentry: dentry of the inode
|
|
|
|
* @ctx: secctx
|
|
|
|
* @ctxlen: length of secctx
|
|
|
|
*
|
|
|
|
* Change the security context of an inode. Updates the incore security
|
|
|
|
* context managed by the security module and invokes the fs code as needed
|
|
|
|
* (via __vfs_setxattr_noperm) to update any backing xattrs that represent the
|
|
|
|
* context. Example usage: NFS server invokes this hook to change the security
|
|
|
|
* context in its incore inode and on the backing filesystem to a value
|
|
|
|
* provided by the client on a SETATTR operation. Must be called with
|
|
|
|
* inode->i_mutex locked.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2009-09-03 18:25:57 +00:00
|
|
|
int security_inode_setsecctx(struct dentry *dentry, void *ctx, u32 ctxlen)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_setsecctx, dentry, ctx, ctxlen);
|
2009-09-03 18:25:57 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_setsecctx);
|
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_inode_getsecctx() - Get the security label of an inode
|
|
|
|
* @inode: inode
|
|
|
|
* @ctx: secctx
|
|
|
|
* @ctxlen: length of secctx
|
|
|
|
*
|
|
|
|
* On success, returns 0 and fills out @ctx and @ctxlen with the security
|
|
|
|
* context for the given @inode.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2009-09-03 18:25:57 +00:00
|
|
|
int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inode_getsecctx, inode, ctx, ctxlen);
|
2009-09-03 18:25:57 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inode_getsecctx);
|
|
|
|
|
2020-02-12 13:58:35 +00:00
|
|
|
#ifdef CONFIG_WATCH_QUEUE
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_post_notification() - Check if a watch notification can be posted
|
|
|
|
* @w_cred: credentials of the task that set the watch
|
|
|
|
* @cred: credentials of the task which triggered the watch
|
|
|
|
* @n: the notification
|
|
|
|
*
|
|
|
|
* Check to see if a watch notification can be posted to a particular queue.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2020-02-12 13:58:35 +00:00
|
|
|
int security_post_notification(const struct cred *w_cred,
|
|
|
|
const struct cred *cred,
|
|
|
|
struct watch_notification *n)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(post_notification, w_cred, cred, n);
|
2020-02-12 13:58:35 +00:00
|
|
|
}
|
|
|
|
#endif /* CONFIG_WATCH_QUEUE */
|
|
|
|
|
2020-02-12 13:58:35 +00:00
|
|
|
#ifdef CONFIG_KEY_NOTIFICATIONS
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
|
|
|
|
* security_watch_key() - Check if a task is allowed to watch for key events
|
|
|
|
* @key: the key to watch
|
|
|
|
*
|
|
|
|
* Check to see if a process is allowed to watch for event notifications from
|
|
|
|
* a key or keyring.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2020-02-12 13:58:35 +00:00
|
|
|
int security_watch_key(struct key *key)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(watch_key, key);
|
2020-02-12 13:58:35 +00:00
|
|
|
}
|
2023-02-17 02:33:20 +00:00
|
|
|
#endif /* CONFIG_KEY_NOTIFICATIONS */
|
2020-02-12 13:58:35 +00:00
|
|
|
|
2007-10-17 06:31:32 +00:00
|
|
|
#ifdef CONFIG_SECURITY_NETWORK
|
2023-02-12 20:10:23 +00:00
|
|
|
/**
|
|
|
|
* security_unix_stream_connect() - Check if an AF_UNIX stream is allowed
|
|
|
|
* @sock: originating sock
|
|
|
|
* @other: peer sock
|
|
|
|
* @newsk: new sock
|
|
|
|
*
|
|
|
|
* Check permissions before establishing a Unix domain stream connection
|
|
|
|
* between @sock and @other.
|
|
|
|
*
|
|
|
|
* The @unix_stream_connect and @unix_may_send hooks were necessary because
|
|
|
|
* Linux provides an alternative to the conventional file name space for Unix
|
|
|
|
* domain sockets. Whereas binding and connecting to sockets in the file name
|
|
|
|
* space is mediated by the typical file permissions (and caught by the mknod
|
|
|
|
* and permission hooks in inode_security_ops), binding and connecting to
|
|
|
|
* sockets in the abstract name space is completely unmediated. Sufficient
|
|
|
|
* control of Unix domain sockets in the abstract name space isn't possible
|
|
|
|
* using only the socket layer hooks, since we need to know the actual target
|
|
|
|
* socket, which is not looked up until we are inside the af_unix code.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_unix_stream_connect(struct sock *sock, struct sock *other,
|
|
|
|
struct sock *newsk)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(unix_stream_connect, sock, other, newsk);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_unix_stream_connect);
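The abstract name space mentioned in the comment above can be seen from userspace on Linux. The sketch below binds a socket to an abstract name, which creates no filesystem entry and so is never checked against file permissions. The helper name `bind_abstract` is hypothetical; the socket calls are the standard ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Bind a stream socket to a name in the abstract name space (Linux only).
 * Returns the socket fd on success, or -1 on failure.
 */
static int bind_abstract(const char *name)
{
	struct sockaddr_un addr;
	size_t len = strlen(name);
	int fd;

	if (len == 0 || len > sizeof(addr.sun_path) - 1)
		return -1;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* A leading NUL byte in sun_path selects the abstract name space. */
	memcpy(addr.sun_path + 1, name, len);

	if (bind(fd, (struct sockaddr *)&addr,
		 (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

After a successful `bind_abstract("some-name")`, no file called `some-name` exists anywhere on disk, which is why path-based checks never see the socket.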
|
|
|
|
|
2023-02-12 20:10:23 +00:00
|
|
|
/**
|
|
|
|
* security_unix_may_send() - Check if AF_UNIX socket can send datagrams
|
|
|
|
* @sock: originating sock
|
|
|
|
* @other: peer sock
|
|
|
|
*
|
|
|
|
* Check permissions before connecting or sending datagrams from @sock to
|
|
|
|
* @other.
|
|
|
|
*
|
|
|
|
* The @unix_stream_connect and @unix_may_send hooks were necessary because
|
|
|
|
* Linux provides an alternative to the conventional file name space for Unix
|
|
|
|
* domain sockets. Whereas binding and connecting to sockets in the file name
|
|
|
|
* space is mediated by the typical file permissions (and caught by the mknod
|
|
|
|
* and permission hooks in inode_security_ops), binding and connecting to
|
|
|
|
* sockets in the abstract name space is completely unmediated. Sufficient
|
|
|
|
* control of Unix domain sockets in the abstract name space isn't possible
|
|
|
|
* using only the socket layer hooks, since we need to know the actual target
|
|
|
|
* socket, which is not looked up until we are inside the af_unix code.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_unix_may_send(struct socket *sock, struct socket *other)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(unix_may_send, sock, other);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_unix_may_send);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_create() - Check if creating a new socket is allowed
|
|
|
|
* @family: protocol family
|
|
|
|
* @type: communications type
|
|
|
|
* @protocol: requested protocol
|
|
|
|
* @kern: set to 1 if a kernel socket is requested
|
|
|
|
*
|
|
|
|
* Check permissions prior to creating a new socket.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_create(int family, int type, int protocol, int kern)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_create, family, type, protocol, kern);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
2023-03-08 17:31:03 +00:00
|
|
|
* security_socket_post_create() - Initialize a newly created socket
|
2023-02-12 20:15:29 +00:00
|
|
|
* @sock: socket
|
|
|
|
* @family: protocol family
|
|
|
|
* @type: communications type
|
|
|
|
* @protocol: requested protocol
|
|
|
|
* @kern: set to 1 if a kernel socket is requested
|
|
|
|
*
|
|
|
|
* This hook allows a module to update or allocate a per-socket security
|
|
|
|
* structure. Note that the security field was not added directly to the socket
|
|
|
|
* structure, but rather, the socket security information is stored in the
|
|
|
|
* associated inode. Typically, the inode alloc_security hook will allocate
|
|
|
|
* and attach security information to SOCK_INODE(sock)->i_security. This hook
|
|
|
|
* may be used to update the SOCK_INODE(sock)->i_security field with additional
|
|
|
|
* information that wasn't available when the inode was allocated.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_post_create(struct socket *sock, int family,
|
|
|
|
int type, int protocol, int kern)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_post_create, sock, family, type,
|
2023-02-17 02:33:20 +00:00
|
|
|
protocol, kern);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_socketpair() - Check if creating a socketpair is allowed
|
|
|
|
* @socka: first socket
|
|
|
|
* @sockb: second socket
|
|
|
|
*
|
|
|
|
* Check permissions before creating a fresh pair of sockets.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted and the connection was
|
|
|
|
* established.
|
|
|
|
*/
|
2018-05-04 14:28:19 +00:00
|
|
|
int security_socket_socketpair(struct socket *socka, struct socket *sockb)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_socketpair, socka, sockb);
|
2018-05-04 14:28:19 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_socket_socketpair);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_bind() - Check if a socket bind operation is allowed
|
|
|
|
* @sock: socket
|
|
|
|
* @address: requested bind address
|
|
|
|
* @addrlen: length of address
|
|
|
|
*
|
|
|
|
* Check permission before socket protocol layer bind operation is performed
|
|
|
|
* and the socket @sock is bound to the address specified in the @address
|
|
|
|
* parameter.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_socket_bind(struct socket *sock,
|
|
|
|
struct sockaddr *address, int addrlen)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_bind, sock, address, addrlen);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_connect() - Check if a socket connect operation is allowed
|
|
|
|
* @sock: socket
|
|
|
|
* @address: address of remote connection point
|
|
|
|
* @addrlen: length of address
|
|
|
|
*
|
|
|
|
* Check permission before socket protocol layer connect operation attempts to
|
|
|
|
* connect socket @sock to a remote address, @address.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_socket_connect(struct socket *sock,
|
|
|
|
struct sockaddr *address, int addrlen)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_connect, sock, address, addrlen);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_listen() - Check if a socket is allowed to listen
|
|
|
|
* @sock: socket
|
|
|
|
* @backlog: connection queue size
|
|
|
|
*
|
|
|
|
* Check permission before socket protocol layer listen operation.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_listen(struct socket *sock, int backlog)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_listen, sock, backlog);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_accept() - Check if a socket is allowed to accept connections
|
|
|
|
* @sock: listening socket
|
|
|
|
* @newsock: newly created connection socket
|
|
|
|
*
|
|
|
|
* Check permission before accepting a new connection. Note that the new
|
|
|
|
* socket, @newsock, has been created and some information copied to it, but
|
|
|
|
* the accept operation has not actually been performed.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_accept(struct socket *sock, struct socket *newsock)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_accept, sock, newsock);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
2024-02-17 13:35:04 +00:00
|
|
|
* security_socket_sendmsg() - Check if sending a message is allowed
|
2023-02-12 20:15:29 +00:00
|
|
|
* @sock: sending socket
|
|
|
|
* @msg: message to send
|
|
|
|
* @size: size of message
|
|
|
|
*
|
|
|
|
* Check permission before transmitting a message to another socket.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_sendmsg, sock, msg, size);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_recvmsg() - Check if receiving a message is allowed
|
|
|
|
* @sock: receiving socket
|
|
|
|
* @msg: message to receive
|
|
|
|
* @size: size of message
|
|
|
|
* @flags: operational flags
|
|
|
|
*
|
|
|
|
* Check permission before receiving a message from a socket.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
|
|
|
|
int size, int flags)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_recvmsg, sock, msg, size, flags);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_getsockname() - Check if reading the socket addr is allowed
|
|
|
|
* @sock: socket
|
|
|
|
*
|
|
|
|
* Check permission before reading the local address (name) of the socket
|
|
|
|
* object.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_getsockname(struct socket *sock)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_getsockname, sock);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_getpeername() - Check if reading the peer's addr is allowed
|
|
|
|
* @sock: socket
|
|
|
|
*
|
|
|
|
* Check permission before reading the remote address (name) of a socket object.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_getpeername(struct socket *sock)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_getpeername, sock);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_getsockopt() - Check if reading a socket option is allowed
|
|
|
|
* @sock: socket
|
|
|
|
* @level: option's protocol level
|
|
|
|
* @optname: option name
|
|
|
|
*
|
|
|
|
* Check permissions before retrieving the options associated with socket
|
|
|
|
* @sock.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_getsockopt(struct socket *sock, int level, int optname)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_getsockopt, sock, level, optname);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_setsockopt() - Check if setting a socket option is allowed
|
|
|
|
* @sock: socket
|
|
|
|
* @level: option's protocol level
|
|
|
|
* @optname: option name
|
|
|
|
*
|
|
|
|
* Check permissions before setting the options associated with socket @sock.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_setsockopt(struct socket *sock, int level, int optname)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_setsockopt, sock, level, optname);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_shutdown() - Checks if shutting down the socket is allowed
|
|
|
|
* @sock: socket
|
|
|
|
* @how: flag indicating how sends and receives are handled
|
|
|
|
*
|
|
|
|
* Checks permission before all or part of a connection on the socket @sock is
|
|
|
|
* shut down.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_socket_shutdown(struct socket *sock, int how)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_shutdown, sock, how);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_sock_rcv_skb() - Check if an incoming network packet is allowed
|
|
|
|
* @sk: destination sock
|
|
|
|
* @skb: incoming packet
|
|
|
|
*
|
|
|
|
* Check permissions on incoming network packets. This hook is distinct from
|
|
|
|
* Netfilter's IP input hooks since it is the first time that the incoming
|
|
|
|
* sk_buff @skb has been associated with a particular socket, @sk. Must not
|
|
|
|
* sleep inside this hook because some callers hold spinlocks.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_sock_rcv_skb(struct sock *sk, struct sk_buff *skb)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_sock_rcv_skb, sk, skb);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sock_rcv_skb);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_getpeersec_stream() - Get the remote peer label
|
|
|
|
* @sock: socket
|
|
|
|
* @optval: destination buffer
|
|
|
|
* @optlen: size of peer label copied into the buffer
|
|
|
|
* @len: maximum size of the destination buffer
|
|
|
|
*
|
|
|
|
* This hook allows the security module to provide peer socket security state
|
|
|
|
* for unix or connected tcp sockets to userspace via getsockopt SO_GETPEERSEC.
|
|
|
|
* For tcp sockets this can be meaningful if the socket is associated with an
|
|
|
|
* ipsec SA.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if all is well, otherwise, typical getsockopt return
|
|
|
|
* values.
|
|
|
|
*/
|
2022-10-10 16:31:21 +00:00
|
|
|
int security_socket_getpeersec_stream(struct socket *sock, sockptr_t optval,
|
|
|
|
sockptr_t optlen, unsigned int len)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_getpeersec_stream, sock, optval, optlen,
|
|
|
|
len);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_socket_getpeersec_dgram() - Get the remote peer label
|
|
|
|
* @sock: socket
|
|
|
|
* @skb: datagram packet
|
|
|
|
* @secid: remote peer label secid
|
|
|
|
*
|
|
|
|
* This hook allows the security module to provide peer socket security state
|
|
|
|
* for udp sockets on a per-packet basis to userspace via getsockopt
|
|
|
|
* SO_GETPEERSEC. The application must first have indicated the IP_PASSSEC
|
|
|
|
* option via setsockopt. It can then retrieve the security state returned by
|
|
|
|
* this hook for a packet via the SCM_SECURITY ancillary message type.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_socket_getpeersec_dgram(struct socket *sock,
|
|
|
|
struct sk_buff *skb, u32 *secid)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(socket_getpeersec_dgram, sock, skb, secid);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_socket_getpeersec_dgram);
|
|
|
|
|
2024-07-10 21:32:25 +00:00
|
|
|
/**
|
|
|
|
* lsm_sock_alloc - allocate a composite sock blob
|
|
|
|
* @sock: the sock that needs a blob
|
2024-07-10 21:32:27 +00:00
|
|
|
* @gfp: allocation mode
|
2024-07-10 21:32:25 +00:00
|
|
|
*
|
|
|
|
* Allocate the sock blob for all the modules
|
|
|
|
*
|
|
|
|
* Returns 0, or -ENOMEM if memory can't be allocated.
|
|
|
|
*/
|
2024-07-10 21:32:27 +00:00
|
|
|
static int lsm_sock_alloc(struct sock *sock, gfp_t gfp)
|
2024-07-10 21:32:25 +00:00
|
|
|
{
|
2024-07-10 21:32:27 +00:00
|
|
|
return lsm_blob_alloc(&sock->sk_security, blob_sizes.lbs_sock, gfp);
|
2024-07-10 21:32:25 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_sk_alloc() - Allocate and initialize a sock's LSM blob
|
|
|
|
* @sk: sock
|
|
|
|
* @family: protocol family
|
2023-03-08 17:31:03 +00:00
|
|
|
* @priority: gfp flags
|
2023-02-12 20:15:29 +00:00
|
|
|
*
|
|
|
|
* Allocate and attach a security structure to the sk->sk_security field, which
|
|
|
|
* is used to copy security attributes between local stream sockets.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_sk_alloc(struct sock *sk, int family, gfp_t priority)
|
|
|
|
{
|
2024-07-10 21:32:25 +00:00
|
|
|
int rc = lsm_sock_alloc(sk, priority);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
|
|
|
rc = call_int_hook(sk_alloc_security, sk, family, priority);
|
|
|
|
if (unlikely(rc))
|
|
|
|
security_sk_free(sk);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
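The error handling in security_sk_alloc() follows an allocate, initialize, undo-on-failure shape. The sketch below mirrors that shape in userspace under stated assumptions: `struct toy_sock`, `toy_sk_alloc()`, `toy_sk_free()` and the two init callbacks are hypothetical stand-ins, and `calloc()`/`free()` replace the kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sock; only the blob pointer is modelled. */
struct toy_sock {
	void *security;
};

typedef int (*init_hook_t)(struct toy_sock *sk);

/* Release the blob and clear the pointer so a second free is harmless. */
void toy_sk_free(struct toy_sock *sk)
{
	assert(sk != NULL);
	free(sk->security);
	sk->security = NULL;
}

/* Allocate a zeroed blob, then let the init callback populate it. */
int toy_sk_alloc(struct toy_sock *sk, size_t blob_size, init_hook_t init)
{
	int rc;

	assert(sk != NULL);
	sk->security = blob_size ? calloc(1, blob_size) : NULL;
	if (blob_size && !sk->security)
		return -ENOMEM;

	rc = init ? init(sk) : 0;
	if (rc)
		toy_sk_free(sk);	/* undo the allocation if init refused */
	return rc;
}

/* Toy init callbacks used only to exercise both paths. */
int init_ok(struct toy_sock *sk)   { (void)sk; return 0; }
int init_fail(struct toy_sock *sk) { (void)sk; return -EINVAL; }
```

The point of the pattern is that a caller never sees a half-initialized object: on any failure the blob pointer is back to NULL and the error code is passed through unchanged.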
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_sk_free() - Free the sock's LSM blob
|
|
|
|
* @sk: sock
|
|
|
|
*
|
|
|
|
* Deallocate security structure.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_sk_free(struct sock *sk)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(sk_free_security, sk);
|
2024-07-10 21:32:25 +00:00
|
|
|
kfree(sk->sk_security);
|
|
|
|
sk->sk_security = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_sk_clone() - Clone a sock's LSM state
|
|
|
|
* @sk: original sock
|
|
|
|
* @newsk: target sock
|
|
|
|
*
|
|
|
|
* Clone/copy security structure.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_sk_clone(const struct sock *sk, struct sock *newsk)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(sk_clone_security, sk, newsk);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2011-10-07 09:40:59 +00:00
|
|
|
EXPORT_SYMBOL(security_sk_clone);
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2023-07-31 14:36:47 +00:00
|
|
|
/**
|
|
|
|
* security_sk_classify_flow() - Set a flow's secid based on socket
|
|
|
|
* @sk: original socket
|
|
|
|
* @flic: target flow
|
|
|
|
*
|
|
|
|
* Set the target flow's secid to socket's secid.
|
|
|
|
*/
|
2023-07-11 13:06:08 +00:00
|
|
|
void security_sk_classify_flow(const struct sock *sk, struct flowi_common *flic)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2020-09-28 02:38:26 +00:00
|
|
|
call_void_hook(sk_getsecid, sk, &flic->flowic_secid);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sk_classify_flow);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_req_classify_flow() - Set a flow's secid based on request_sock
|
|
|
|
* @req: request_sock
|
|
|
|
* @flic: target flow
|
|
|
|
*
|
|
|
|
* Sets @flic's secid to @req's secid.
|
|
|
|
*/
|
2020-09-28 02:38:26 +00:00
|
|
|
void security_req_classify_flow(const struct request_sock *req,
|
|
|
|
struct flowi_common *flic)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2020-09-28 02:38:26 +00:00
|
|
|
call_void_hook(req_classify_flow, req, flic);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_req_classify_flow);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_sock_graft() - Reconcile LSM state when grafting a sock on a socket
|
|
|
|
* @sk: sock being grafted
|
2023-03-08 17:31:03 +00:00
|
|
|
* @parent: target parent socket
|
2023-02-12 20:15:29 +00:00
|
|
|
*
|
2023-03-08 17:31:03 +00:00
|
|
|
* Sets @parent's inode secid to @sk's secid and updates @sk with any necessary
|
|
|
|
* LSM state from @parent.
|
2023-02-12 20:15:29 +00:00
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_sock_graft(struct sock *sk, struct socket *parent)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(sock_graft, sk, parent);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sock_graft);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_inet_conn_request() - Set request_sock state using incoming connect
|
|
|
|
* @sk: parent listening sock
|
|
|
|
* @skb: incoming connection
|
|
|
|
* @req: new request_sock
|
|
|
|
*
|
|
|
|
* Initialize the @req LSM state based on @sk and the incoming connect in @skb.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2020-11-30 15:36:29 +00:00
|
|
|
int security_inet_conn_request(const struct sock *sk,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct sk_buff *skb, struct request_sock *req)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(inet_conn_request, sk, skb, req);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_inet_conn_request);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_inet_csk_clone() - Set new sock LSM state based on request_sock
|
|
|
|
* @newsk: new sock
|
|
|
|
* @req: connection request_sock
|
|
|
|
*
|
|
|
|
* Set the LSM state of @newsk using the LSM state from @req.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_inet_csk_clone(struct sock *newsk,
|
2023-02-17 02:33:20 +00:00
|
|
|
const struct request_sock *req)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(inet_csk_clone, newsk, req);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_inet_conn_established() - Update sock's LSM state with connection
|
|
|
|
* @sk: sock
|
|
|
|
* @skb: connection packet
|
|
|
|
*
|
|
|
|
* Update @sk's LSM state to represent a new connection from @skb.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_inet_conn_established(struct sock *sk,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct sk_buff *skb)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(inet_conn_established, sk, skb);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
2018-02-13 20:53:21 +00:00
|
|
|
EXPORT_SYMBOL(security_inet_conn_established);
|
2007-10-17 06:31:32 +00:00
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_secmark_relabel_packet() - Check if setting a secmark is allowed
|
|
|
|
* @secid: new secmark value
|
|
|
|
*
|
|
|
|
* Check if the process should be allowed to relabel packets to @secid.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2010-10-13 20:24:41 +00:00
|
|
|
int security_secmark_relabel_packet(u32 secid)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(secmark_relabel_packet, secid);
|
2010-10-13 20:24:41 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_secmark_relabel_packet);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_secmark_refcount_inc() - Increment the secmark labeling rule count
|
|
|
|
*
|
|
|
|
* Tells the LSM to increment the number of secmark labeling rules loaded.
|
|
|
|
*/
|
2010-10-13 20:24:41 +00:00
|
|
|
void security_secmark_refcount_inc(void)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(secmark_refcount_inc);
|
2010-10-13 20:24:41 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_secmark_refcount_inc);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_secmark_refcount_dec() - Decrement the secmark labeling rule count
|
|
|
|
*
|
|
|
|
* Tells the LSM to decrement the number of secmark labeling rules loaded.
|
|
|
|
*/
|
2010-10-13 20:24:41 +00:00
|
|
|
void security_secmark_refcount_dec(void)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(secmark_refcount_dec);
|
2010-10-13 20:24:41 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_secmark_refcount_dec);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_tun_dev_alloc_security() - Allocate a LSM blob for a TUN device
|
|
|
|
* @security: pointer to the LSM blob
|
|
|
|
*
|
|
|
|
* This hook allows a module to allocate a security structure for a TUN device,
|
|
|
|
* returning the pointer in @security.
|
|
|
|
*
|
|
|
|
* Return: Returns a zero on success, negative values on failure.
|
|
|
|
*/
|
2013-01-14 07:12:19 +00:00
|
|
|
int security_tun_dev_alloc_security(void **security)
|
|
|
|
{
|
2024-07-10 21:32:28 +00:00
|
|
|
int rc;
|
|
|
|
|
|
|
|
rc = lsm_blob_alloc(security, blob_sizes.lbs_tun_dev, GFP_KERNEL);
|
|
|
|
if (rc)
|
|
|
|
return rc;
|
|
|
|
|
|
|
|
rc = call_int_hook(tun_dev_alloc_security, *security);
|
|
|
|
if (rc) {
|
|
|
|
kfree(*security);
|
|
|
|
*security = NULL;
|
|
|
|
}
|
|
|
|
return rc;
|
2013-01-14 07:12:19 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_tun_dev_alloc_security);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_tun_dev_free_security() - Free a TUN device LSM blob
|
|
|
|
* @security: LSM blob
|
|
|
|
*
|
|
|
|
* This hook allows a module to free the security structure for a TUN device.
|
|
|
|
*/
|
2013-01-14 07:12:19 +00:00
|
|
|
void security_tun_dev_free_security(void *security)
|
|
|
|
{
|
2024-07-10 21:32:28 +00:00
|
|
|
kfree(security);
|
2013-01-14 07:12:19 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_tun_dev_free_security);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_tun_dev_create() - Check if creating a TUN device is allowed
|
|
|
|
*
|
|
|
|
* Check permissions prior to creating a new TUN device.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2009-08-28 22:12:43 +00:00
|
|
|
int security_tun_dev_create(void)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(tun_dev_create);
|
2009-08-28 22:12:43 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_tun_dev_create);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_tun_dev_attach_queue() - Check if attaching a TUN queue is allowed
|
|
|
|
* @security: TUN device LSM blob
|
|
|
|
*
|
|
|
|
* Check permissions prior to attaching to a TUN device queue.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2013-01-14 07:12:19 +00:00
|
|
|
int security_tun_dev_attach_queue(void *security)
|
2009-08-28 22:12:43 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(tun_dev_attach_queue, security);
|
2009-08-28 22:12:43 +00:00
|
|
|
}
|
2013-01-14 07:12:19 +00:00
|
|
|
EXPORT_SYMBOL(security_tun_dev_attach_queue);
|
2009-08-28 22:12:43 +00:00
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_tun_dev_attach() - Update TUN device LSM state on attach
|
|
|
|
* @sk: associated sock
|
|
|
|
* @security: TUN device LSM blob
|
|
|
|
*
|
|
|
|
* This hook can be used by the module to update any security state associated
|
|
|
|
* with the TUN device's sock structure.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2013-01-14 07:12:19 +00:00
|
|
|
int security_tun_dev_attach(struct sock *sk, void *security)
|
2009-08-28 22:12:43 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(tun_dev_attach, sk, security);
|
2009-08-28 22:12:43 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_tun_dev_attach);
|
|
|
|
|
2023-02-12 20:15:29 +00:00
|
|
|
/**
|
|
|
|
* security_tun_dev_open() - Update TUN device LSM state on open
|
|
|
|
* @security: TUN device LSM blob
|
|
|
|
*
|
|
|
|
* This hook can be used by the module to update any security state associated
|
|
|
|
* with the TUN device's security structure.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2013-01-14 07:12:19 +00:00
|
|
|
int security_tun_dev_open(void *security)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(tun_dev_open, security);
|
2013-01-14 07:12:19 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_tun_dev_open);
|
|
|
|
|
2023-02-15 22:47:10 +00:00
|
|
|
/**
|
|
|
|
* security_sctp_assoc_request() - Update the LSM on a SCTP association req
|
|
|
|
* @asoc: SCTP association
|
|
|
|
* @skb: packet requesting the association
|
|
|
|
*
|
|
|
|
* Passes @asoc and the @skb of the association INIT packet to the LSM.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_sctp_assoc_request(struct sctp_association *asoc,
|
|
|
|
struct sk_buff *skb)
|
2018-02-13 20:53:21 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sctp_assoc_request, asoc, skb);
|
2018-02-13 20:53:21 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sctp_assoc_request);
|
|
|
|
|
2023-02-15 22:47:10 +00:00
|
|
|
/**
|
|
|
|
* security_sctp_bind_connect() - Validate a list of addrs for a SCTP option
|
|
|
|
* @sk: socket
|
|
|
|
* @optname: SCTP option to validate
|
|
|
|
* @address: list of IP addresses to validate
|
|
|
|
* @addrlen: length of the address list
|
|
|
|
*
|
|
|
|
* Validate permissions required for each address associated with sock @sk.
|
|
|
|
* Depending on @optname, the addresses will be treated as either a connect or
|
|
|
|
* bind service. The @addrlen is calculated on each IPv4 and IPv6 address using
|
|
|
|
* sizeof(struct sockaddr_in) or sizeof(struct sockaddr_in6).
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2018-02-13 20:53:21 +00:00
|
|
|
int security_sctp_bind_connect(struct sock *sk, int optname,
|
|
|
|
struct sockaddr *address, int addrlen)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sctp_bind_connect, sk, optname, address, addrlen);
|
2018-02-13 20:53:21 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sctp_bind_connect);
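The comment above says the address list is sized per entry with sizeof(struct sockaddr_in) or sizeof(struct sockaddr_in6). A minimal userspace sketch of walking such a packed list follows; `count_packed_addrs()` is a hypothetical helper, and the layout (entries back to back, family field first inspected to pick the step) is an assumption made for illustration rather than code taken from an LSM.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Count the entries in a buffer of back-to-back sockaddr_in and
 * sockaddr_in6 structures. Returns the count, or -1 if an entry has an
 * unknown family or runs past the end of the buffer.
 */
int count_packed_addrs(const void *buf, size_t buflen)
{
	const unsigned char *p = buf;
	size_t off = 0;
	int count = 0;

	assert(buf != NULL || buflen == 0);
	while (off < buflen) {
		sa_family_t family;
		size_t step;

		/* Need enough bytes to read the family field at all. */
		if (buflen - off < offsetof(struct sockaddr, sa_family) + sizeof(family))
			return -1;
		/* memcpy avoids an unaligned read from the packed buffer. */
		memcpy(&family, p + off + offsetof(struct sockaddr, sa_family),
		       sizeof(family));

		switch (family) {
		case AF_INET:
			step = sizeof(struct sockaddr_in);
			break;
		case AF_INET6:
			step = sizeof(struct sockaddr_in6);
			break;
		default:
			return -1;
		}
		if (step > buflen - off)
			return -1;	/* truncated final entry */
		off += step;
		count++;
	}
	return count;
}
```

A per-address permission check would sit where `count++` is; the walk itself only establishes where each entry starts and ends.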
|
|
|
|
|
2023-02-15 22:47:10 +00:00
|
|
|
/**
|
|
|
|
* security_sctp_sk_clone() - Clone a SCTP sock's LSM state
|
|
|
|
* @asoc: SCTP association
|
|
|
|
* @sk: original sock
|
|
|
|
* @newsk: target sock
|
|
|
|
*
|
|
|
|
* Called whenever a new socket is created by accept(2) (i.e. a TCP style
|
|
|
|
* socket) or when a socket is 'peeled off', e.g. userspace calls
|
|
|
|
* sctp_peeloff(3).
|
|
|
|
*/
|
2021-11-02 12:02:47 +00:00
|
|
|
void security_sctp_sk_clone(struct sctp_association *asoc, struct sock *sk,
|
2018-02-13 20:53:21 +00:00
|
|
|
struct sock *newsk)
|
|
|
|
{
|
2021-11-02 12:02:47 +00:00
|
|
|
call_void_hook(sctp_sk_clone, asoc, sk, newsk);
|
2018-02-13 20:53:21 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sctp_sk_clone);
|
|
|
|
|
2023-02-15 22:47:10 +00:00
|
|
|
/**
|
|
|
|
* security_sctp_assoc_established() - Update LSM state when assoc established
|
|
|
|
* @asoc: SCTP association
|
|
|
|
* @skb: packet establishing the association
|
|
|
|
*
|
|
|
|
* Passes @asoc and the @skb of the association COOKIE_ACK packet to the
|
|
|
|
* security module.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2022-02-12 17:59:21 +00:00
|
|
|
int security_sctp_assoc_established(struct sctp_association *asoc,
|
|
|
|
struct sk_buff *skb)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(sctp_assoc_established, asoc, skb);
|
2022-02-12 17:59:21 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_sctp_assoc_established);
|
|
|
|
|
2023-04-20 17:17:13 +00:00
|
|
|
/**
|
|
|
|
* security_mptcp_add_subflow() - Inherit the LSM label from the MPTCP socket
|
|
|
|
* @sk: the owning MPTCP socket
|
|
|
|
* @ssk: the new subflow
|
|
|
|
*
|
|
|
|
* Update the labeling for the given MPTCP subflow, to match the one of the
|
|
|
|
* owning MPTCP socket. This hook has to be called after the socket creation and
|
|
|
|
* initialization via the security_socket_create() and
|
|
|
|
* security_socket_post_create() LSM hooks.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success or a negative error code on failure.
|
|
|
|
*/
|
|
|
|
int security_mptcp_add_subflow(struct sock *sk, struct sock *ssk)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(mptcp_add_subflow, sk, ssk);
|
2023-04-20 17:17:13 +00:00
|
|
|
}
|
|
|
|
|
2007-10-17 06:31:32 +00:00
|
|
|
#endif /* CONFIG_SECURITY_NETWORK */
|
|
|
|
|
2017-05-19 12:48:52 +00:00
|
|
|
#ifdef CONFIG_SECURITY_INFINIBAND
|
2023-02-15 23:07:41 +00:00
|
|
|
/**
|
|
|
|
* security_ib_pkey_access() - Check if access to an IB pkey is allowed
|
|
|
|
* @sec: LSM blob
|
|
|
|
* @subnet_prefix: subnet prefix of the port
|
|
|
|
* @pkey: IB pkey
|
|
|
|
*
|
2023-05-25 03:19:53 +00:00
|
|
|
* Check permission to access a pkey when modifying a QP.
|
2023-02-15 23:07:41 +00:00
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2017-05-19 12:48:52 +00:00
|
|
|
int security_ib_pkey_access(void *sec, u64 subnet_prefix, u16 pkey)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(ib_pkey_access, sec, subnet_prefix, pkey);
|
2017-05-19 12:48:52 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_ib_pkey_access);
|
|
|
|
|
2023-02-15 23:07:41 +00:00
|
|
|
/**
|
|
|
|
* security_ib_endport_manage_subnet() - Check if SMP traffic is allowed
|
|
|
|
* @sec: LSM blob
|
|
|
|
* @dev_name: IB device name
|
|
|
|
* @port_num: port number
|
|
|
|
*
|
|
|
|
* Check permissions to send and receive SMPs on an end port.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2023-02-17 02:33:20 +00:00
|
|
|
int security_ib_endport_manage_subnet(void *sec,
|
|
|
|
const char *dev_name, u8 port_num)
|
2017-05-19 12:48:54 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(ib_endport_manage_subnet, sec, dev_name, port_num);
|
2017-05-19 12:48:54 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_ib_endport_manage_subnet);
|
|
|
|
|
2023-02-15 23:07:41 +00:00
|
|
|
/**
|
|
|
|
* security_ib_alloc_security() - Allocate an Infiniband LSM blob
|
|
|
|
* @sec: LSM blob
|
|
|
|
*
|
|
|
|
* Allocate a security structure for Infiniband objects.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, non-zero on failure.
|
|
|
|
*/
|
2017-05-19 12:48:52 +00:00
|
|
|
int security_ib_alloc_security(void **sec)
|
|
|
|
{
|
2024-07-10 21:32:29 +00:00
|
|
|
int rc;
|
|
|
|
|
|
|
|
rc = lsm_blob_alloc(sec, blob_sizes.lbs_ib, GFP_KERNEL);
|
|
|
|
if (rc)
|
|
|
|
return rc;
|
|
|
|
|
|
|
|
rc = call_int_hook(ib_alloc_security, *sec);
|
|
|
|
if (rc) {
|
|
|
|
kfree(*sec);
|
|
|
|
*sec = NULL;
|
|
|
|
}
|
|
|
|
return rc;
|
2017-05-19 12:48:52 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_ib_alloc_security);
|
|
|
|
|
2023-02-15 23:07:41 +00:00
|
|
|
/**
|
|
|
|
* security_ib_free_security() - Free an Infiniband LSM blob
|
|
|
|
* @sec: LSM blob
|
|
|
|
*
|
|
|
|
* Deallocate an Infiniband security structure.
|
|
|
|
*/
|
2017-05-19 12:48:52 +00:00
|
|
|
void security_ib_free_security(void *sec)
|
|
|
|
{
|
2024-07-10 21:32:29 +00:00
|
|
|
kfree(sec);
|
2017-05-19 12:48:52 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_ib_free_security);
|
|
|
|
#endif /* CONFIG_SECURITY_INFINIBAND */
|
|
|
|
|
2007-10-17 06:31:32 +00:00
|
|
|
#ifdef CONFIG_SECURITY_NETWORK_XFRM
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_policy_alloc() - Allocate a xfrm policy LSM blob
|
|
|
|
* @ctxp: xfrm security context being added to the SPD
|
|
|
|
* @sec_ctx: security label provided by userspace
|
|
|
|
* @gfp: gfp flags
|
|
|
|
*
|
|
|
|
* Allocate a security structure to the xp->security field; the security field
|
|
|
|
* is initialized to NULL when the xfrm_policy is allocated.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if operation was successful.
|
|
|
|
*/
|
2014-03-07 11:44:19 +00:00
|
|
|
int security_xfrm_policy_alloc(struct xfrm_sec_ctx **ctxp,
|
|
|
|
struct xfrm_user_sec_ctx *sec_ctx,
|
|
|
|
gfp_t gfp)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_policy_alloc_security, ctxp, sec_ctx, gfp);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_xfrm_policy_alloc);
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_policy_clone() - Clone xfrm policy LSM state
|
|
|
|
* @old_ctx: xfrm security context
|
|
|
|
* @new_ctxp: target xfrm security context
|
|
|
|
*
|
|
|
|
* Allocate a security structure in new_ctxp that contains the information from
|
|
|
|
* the old_ctx structure.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if operation was successful.
|
|
|
|
*/
|
2008-04-13 02:07:52 +00:00
|
|
|
int security_xfrm_policy_clone(struct xfrm_sec_ctx *old_ctx,
|
2023-02-17 02:33:20 +00:00
|
|
|
struct xfrm_sec_ctx **new_ctxp)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_policy_clone_security, old_ctx, new_ctxp);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_policy_free() - Free a xfrm security context
|
|
|
|
* @ctx: xfrm security context
|
|
|
|
*
|
|
|
|
* Free LSM resources associated with @ctx.
|
|
|
|
*/
|
2008-04-13 02:07:52 +00:00
|
|
|
void security_xfrm_policy_free(struct xfrm_sec_ctx *ctx)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(xfrm_policy_free_security, ctx);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_xfrm_policy_free);
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_policy_delete() - Check if deleting a xfrm policy is allowed
|
|
|
|
* @ctx: xfrm security context
|
|
|
|
*
|
|
|
|
* Authorize deletion of a SPD entry.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2008-04-13 02:07:52 +00:00
|
|
|
int security_xfrm_policy_delete(struct xfrm_sec_ctx *ctx)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_policy_delete_security, ctx);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_state_alloc() - Allocate a xfrm state LSM blob
|
|
|
|
* @x: xfrm state being added to the SAD
|
|
|
|
* @sec_ctx: security label provided by userspace
|
|
|
|
*
|
|
|
|
* Allocate a security structure to the @x->security field; the security field
|
|
|
|
* is initialized to NULL when the xfrm_state is allocated. Set the context to
|
|
|
|
* correspond to @sec_ctx.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if operation was successful.
|
|
|
|
*/
|
2013-07-23 21:38:38 +00:00
|
|
|
int security_xfrm_state_alloc(struct xfrm_state *x,
|
|
|
|
struct xfrm_user_sec_ctx *sec_ctx)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_state_alloc, x, sec_ctx);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_xfrm_state_alloc);
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_state_alloc_acquire() - Allocate a xfrm state LSM blob
|
|
|
|
* @x: xfrm state being added to the SAD
|
|
|
|
* @polsec: associated policy's security context
|
|
|
|
* @secid: secid from the flow
|
|
|
|
*
|
|
|
|
* Allocate a security structure to the x->security field; the security field
|
|
|
|
* is initialized to NULL when the xfrm_state is allocated. Set the context to
|
|
|
|
* correspond to secid.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if operation was successful.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_xfrm_state_alloc_acquire(struct xfrm_state *x,
|
|
|
|
struct xfrm_sec_ctx *polsec, u32 secid)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_state_alloc_acquire, x, polsec, secid);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_state_delete() - Check if deleting a xfrm state is allowed
|
|
|
|
* @x: xfrm state
|
|
|
|
*
|
|
|
|
* Authorize deletion of x->security.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_xfrm_state_delete(struct xfrm_state *x)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_state_delete_security, x);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_xfrm_state_delete);
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_state_free() - Free a xfrm state
|
|
|
|
* @x: xfrm state
|
|
|
|
*
|
|
|
|
* Deallocate x->security.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_xfrm_state_free(struct xfrm_state *x)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(xfrm_state_free_security, x);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_policy_lookup() - Check if using a xfrm policy is allowed
|
|
|
|
* @ctx: target xfrm security context
|
|
|
|
* @fl_secid: flow secid used to authorize access
|
|
|
|
*
|
|
|
|
* Check permission when a flow selects a xfrm_policy for processing XFRMs on a
|
|
|
|
* packet. The hook is called when selecting either a per-socket policy or a
|
|
|
|
* generic xfrm policy.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if permission is granted, -ESRCH otherwise, or -errno on
|
|
|
|
* other errors.
|
|
|
|
*/
|
2021-04-09 05:48:41 +00:00
|
|
|
int security_xfrm_policy_lookup(struct xfrm_sec_ctx *ctx, u32 fl_secid)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_policy_lookup, ctx, fl_secid);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_state_pol_flow_match() - Check for a xfrm match
|
|
|
|
* @x: xfrm state to match
|
2023-03-08 17:31:03 +00:00
|
|
|
* @xp: xfrm policy to check for a match
|
2023-02-15 23:14:01 +00:00
|
|
|
* @flic: flow to check for a match.
|
|
|
|
*
|
|
|
|
* Check @xp and @flic for a match with @x.
|
|
|
|
*
|
|
|
|
* Return: Returns 1 if there is a match.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_xfrm_state_pol_flow_match(struct xfrm_state *x,
|
2011-02-23 02:13:15 +00:00
|
|
|
struct xfrm_policy *xp,
|
2020-09-28 02:38:26 +00:00
|
|
|
const struct flowi_common *flic)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
lsm: replace indirect LSM hook calls with static calls
LSM hooks are currently invoked from a linked list as indirect calls
which are invoked using retpolines as a mitigation for speculative
attacks (Branch History / Target injection) and add extra overhead which
is especially bad in kernel hot paths:
security_file_ioctl:
0xff...0320 <+0>: endbr64
0xff...0324 <+4>: push %rbp
0xff...0325 <+5>: push %r15
0xff...0327 <+7>: push %r14
0xff...0329 <+9>: push %rbx
0xff...032a <+10>: mov %rdx,%rbx
0xff...032d <+13>: mov %esi,%ebp
0xff...032f <+15>: mov %rdi,%r14
0xff...0332 <+18>: mov $0xff...7030,%r15
0xff...0339 <+25>: mov (%r15),%r15
0xff...033c <+28>: test %r15,%r15
0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56>
0xff...0341 <+33>: mov 0x18(%r15),%r11
0xff...0345 <+37>: mov %r14,%rdi
0xff...0348 <+40>: mov %ebp,%esi
0xff...034a <+42>: mov %rbx,%rdx
0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Indirect calls that use retpolines lead to overhead, not just due
to extra instructions but also branch misses.
0xff...0352 <+50>: test %eax,%eax
0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25>
0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58>
0xff...0358 <+56>: xor %eax,%eax
0xff...035a <+58>: pop %rbx
0xff...035b <+59>: pop %r14
0xff...035d <+61>: pop %r15
0xff...035f <+63>: pop %rbp
0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk>
The indirect calls are not really needed as one knows the addresses of
enabled LSM callbacks at boot time and only the order can possibly
change at boot time with the lsm= kernel command line parameter.
An array of static calls is defined per LSM hook and the static calls
are updated at boot time once the order has been determined.
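The slot layout described above can be illustrated in user-space C. This is only a sketch under stated assumptions: plain C cannot patch call sites, so ordinary function pointers and an enabled flag stand in for the kernel's static calls and static keys, and every name here (toy_slot, toy_register, toy_security_file_ioctl, toy_lsm_a, toy_lsm_b) is hypothetical. It shows a fixed per-hook array that is filled once, in order, at init time and then walked until a callback returns non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on the number of LSMs built into the image. */
#define TOY_MAX_LSMS 3

typedef int (*toy_file_ioctl_fn)(unsigned int cmd);

/* One slot per possible LSM for this hook; unused slots stay disabled. */
struct toy_slot {
	int enabled;		/* stand-in for a static key */
	toy_file_ioctl_fn fn;	/* stand-in for a static call target */
};

static struct toy_slot toy_file_ioctl_slots[TOY_MAX_LSMS];
static size_t toy_nr_registered;

/* Called once per LSM during init, in the order chosen at boot. */
static int toy_register(toy_file_ioctl_fn fn)
{
	assert(fn != NULL);
	if (toy_nr_registered >= TOY_MAX_LSMS)
		return -1;
	toy_file_ioctl_slots[toy_nr_registered].fn = fn;
	toy_file_ioctl_slots[toy_nr_registered].enabled = 1;
	toy_nr_registered++;
	return 0;
}

/* Walk the fixed slot array; the first non-zero return wins. */
static int toy_security_file_ioctl(unsigned int cmd)
{
	for (size_t i = 0; i < TOY_MAX_LSMS; i++) {
		int rc;

		if (!toy_file_ioctl_slots[i].enabled)
			continue;
		rc = toy_file_ioctl_slots[i].fn(cmd);
		if (rc)
			return rc;
	}
	return 0;
}

/* Two toy callbacks with arbitrary deny rules, for illustration only. */
static int toy_lsm_a(unsigned int cmd) { return cmd == 13 ? -1 : 0; }
static int toy_lsm_b(unsigned int cmd) { return cmd == 42 ? -2 : 0; }
```

Because the array has a fixed size known at build time, a real implementation can unroll the walk and patch each slot into a direct call, which is what removes the retpoline from the hot path.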
With the hook now exposed as a static call, one can see that the
retpolines are no longer there and the LSM callbacks are invoked
directly:
security_file_ioctl:
0xff...0ca0 <+0>: endbr64
0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1)
0xff...0ca9 <+9>: push %rbp
0xff...0caa <+10>: push %r14
0xff...0cac <+12>: push %rbx
0xff...0cad <+13>: mov %rdx,%rbx
0xff...0cb0 <+16>: mov %esi,%ebp
0xff...0cb2 <+18>: mov %rdi,%r14
0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for SELinux
0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Static key enabled for BPF LSM. This is something that is changed to
default to false to avoid the existing side effect issues of BPF LSM
[1] in a subsequent patch.
0xff...0cb9 <+25>: xor %eax,%eax
0xff...0cbb <+27>: xchg %ax,%ax
0xff...0cbd <+29>: pop %rbx
0xff...0cbe <+30>: pop %r14
0xff...0cc0 <+32>: pop %rbp
0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk>
0xff...0cc7 <+39>: endbr64
0xff...0ccb <+43>: mov %r14,%rdi
0xff...0cce <+46>: mov %ebp,%esi
0xff...0cd0 <+48>: mov %rbx,%rdx
0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to SELinux.
0xff...0cd8 <+56>: test %eax,%eax
0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23>
0xff...0cde <+62>: endbr64
0xff...0ce2 <+66>: mov %r14,%rdi
0xff...0ce5 <+69>: mov %ebp,%esi
0xff...0ce7 <+71>: mov %rbx,%rdx
0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Direct call to BPF LSM.
0xff...0cef <+79>: test %eax,%eax
0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29>
0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25>
0xff...0cf5 <+85>: endbr64
0xff...0cf9 <+89>: mov %r14,%rdi
0xff...0cfc <+92>: mov %ebp,%esi
0xff...0cfe <+94>: mov %rbx,%rdx
0xff...0d01 <+97>: pop %rbx
0xff...0d02 <+98>: pop %r14
0xff...0d04 <+100>: pop %rbp
0xff...0d05 <+101>: ret
0xff...0d06 <+102>: int3
0xff...0d07 <+103>: int3
0xff...0d08 <+104>: int3
0xff...0d09 <+105>: int3
This patch uses static_branch_unlikely, indicating that an LSM hook is
likely not to be present. In most cases this is still the better choice,
because even when an LSM with only one hook is added, empty slots are
created for all LSM hooks (especially when many LSMs that do not
initialize most hooks are present on the system).
There are some hooks that don't use the call_int_hook or
call_void_hook. These hooks are updated to use a new macro called
lsm_for_each_hook where the lsm_callback is directly invoked as an
indirect call.
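The slot layout described above can be approximated in user space. The following is a minimal sketch, not the kernel implementation: struct hook_slot, init_slots() and call_file_ioctl() are hypothetical names, the enabled flag only stands in for the static key, and the function pointers here are still ordinary indirect calls (the kernel patches the call sites so they become direct).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for one LSM hook callback. */
typedef int (*ioctl_hook_fn)(unsigned int cmd);

/* One slot per possible LSM; 'enabled' plays the role of the static key. */
struct hook_slot {
	bool enabled;
	ioctl_hook_fn fn;
};

#define MAX_LSM_SLOTS 3

/* Two toy callbacks: one always allows, one denies odd commands. */
static int allow_all(unsigned int cmd) { (void)cmd; return 0; }
static int deny_odd(unsigned int cmd) { return (cmd & 1) ? -EACCES : 0; }

/* A fixed array of slots for a single hook. */
static struct hook_slot file_ioctl_slots[MAX_LSM_SLOTS];

/* Fill the slots once, in the configured order; unused slots stay disabled. */
static void init_slots(const ioctl_hook_fn *ordered, size_t n)
{
	for (size_t i = 0; i < MAX_LSM_SLOTS; i++) {
		file_ioctl_slots[i].enabled = i < n;
		file_ioctl_slots[i].fn = i < n ? ordered[i] : NULL;
	}
}

/* Walk the fixed slots and stop at the first non-zero (deny) result. */
static int call_file_ioctl(unsigned int cmd)
{
	for (size_t i = 0; i < MAX_LSM_SLOTS; i++) {
		if (!file_ioctl_slots[i].enabled)
			continue;	/* empty slot: skipped */
		int rc = file_ioctl_slots[i].fn(cmd);
		if (rc != 0)
			return rc;
	}
	return 0;
}
```

Calling init_slots() with { allow_all, deny_odd } and then call_file_ioctl(3) walks both enabled slots and returns the denial from the second one; the third slot is never touched.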
Below are the results of the relevant Unixbench system benchmarks, run
with BPF LSM and SELinux enabled (default policies), comparing kernels
with and without these patches.
Benchmark                                          Delta(%) (+ is better)
==========================================================================
Execl Throughput                                   +1.9356
File Write 1024 bufsize 2000 maxblocks             +6.5953
Pipe Throughput                                    +9.5499
Pipe-based Context Switching                       +3.0209
Process Creation                                   +2.3246
Shell Scripts (1 concurrent)                       +1.4975
System Call Overhead                               +2.7815
System Benchmarks Index Score (Partial Only):      +3.4859
In the best case, some syscalls like eventfd_create benefited by about
10%.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-16 15:43:07 +00:00
|
|
|
struct lsm_static_call *scall;
|
2020-03-29 00:43:50 +00:00
|
|
|
int rc = LSM_RET_DEFAULT(xfrm_state_pol_flow_match);
|
2015-05-02 22:11:42 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Since this function is expected to return 0 or 1, the judgment
|
|
|
|
* becomes difficult if multiple LSMs supply this call. Fortunately,
|
|
|
|
* we can use the first LSM's judgment because currently only SELinux
|
|
|
|
* supplies this call.
|
|
|
|
*
|
|
|
|
* For speed optimization, we explicitly break the loop rather than
|
|
|
|
* using the macro
|
|
|
|
*/
|
2024-08-16 15:43:07 +00:00
|
|
|
lsm_for_each_hook(scall, xfrm_state_pol_flow_match) {
|
|
|
|
rc = scall->hl->hook.xfrm_state_pol_flow_match(x, xp, flic);
|
2015-05-02 22:11:42 +00:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-15 23:14:01 +00:00
|
|
|
/**
|
|
|
|
* security_xfrm_decode_session() - Determine the xfrm secid for a packet
|
|
|
|
* @skb: xfrm packet
|
|
|
|
* @secid: secid
|
|
|
|
*
|
|
|
|
* Decode the packet in @skb and return the security label in @secid.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if all xfrms used have the same secid.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(xfrm_decode_session, skb, secid, 1);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2020-09-28 02:38:26 +00:00
|
|
|
void security_skb_classify_flow(struct sk_buff *skb, struct flowi_common *flic)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
int rc = call_int_hook(xfrm_decode_session, skb, &flic->flowic_secid,
|
2023-02-17 02:33:20 +00:00
|
|
|
0);
|
2007-10-17 06:31:32 +00:00
|
|
|
|
|
|
|
BUG_ON(rc);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(security_skb_classify_flow);
|
|
|
|
#endif /* CONFIG_SECURITY_NETWORK_XFRM */
|
|
|
|
|
|
|
|
#ifdef CONFIG_KEYS
|
2023-02-16 02:46:31 +00:00
|
|
|
/**
|
|
|
|
* security_key_alloc() - Allocate and initialize a kernel key LSM blob
|
|
|
|
* @key: key
|
|
|
|
* @cred: credentials
|
|
|
|
* @flags: allocation flags
|
|
|
|
*
|
|
|
|
* Permit allocation of a key and assign security data. Note that key does not
|
|
|
|
* have a serial number assigned at this point.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if permission is granted, -ve error otherwise.
|
|
|
|
*/
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may be incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
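The prepare/modify/commit rule can be illustrated with a reduced user-space sketch. Everything below is an assumption made for illustration: struct cred_sketch, the *_sketch functions and UID_LIMIT_SKETCH are hypothetical, and the sketch has none of the RCU or reference counting that the real credentials code relies on.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced credential set. */
struct cred_sketch {
	unsigned int uid;
	unsigned int gid;
};

/* Arbitrary limit used only to give the sketch a failure path. */
#define UID_LIMIT_SKETCH 65535u

/* The published credentials; readers only ever see a const pointer. */
static const struct cred_sketch *current_cred_sketch;

/* Copy the live credentials into a private, writable duplicate. */
static struct cred_sketch *prepare_creds_sketch(void)
{
	struct cred_sketch *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	if (current_cred_sketch)
		*new = *current_cred_sketch;
	else
		new->uid = new->gid = 0;
	return new;
}

/* Discard a duplicate that was never published. */
static void abort_creds_sketch(struct cred_sketch *new)
{
	free(new);
}

/* Publish the duplicate and release the previous set. */
static int commit_creds_sketch(struct cred_sketch *new)
{
	const struct cred_sketch *old = current_cred_sketch;

	current_cred_sketch = new;
	free((void *)old);
	return 0;
}

/* Change the uid without ever writing to the published set. */
static int set_uid_sketch(unsigned int uid)
{
	struct cred_sketch *new = prepare_creds_sketch();

	if (!new)
		return -ENOMEM;
	if (uid > UID_LIMIT_SKETCH) {
		abort_creds_sketch(new);	/* published set is untouched */
		return -EINVAL;
	}
	new->uid = uid;
	return commit_creds_sketch(new);
}
```

A failed set_uid_sketch() call leaves current_cred_sketch exactly as it was, which is the property the const pointers are meant to encourage.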
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather than pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-13 23:39:23 +00:00
|
|
|
int security_key_alloc(struct key *key, const struct cred *cred,
|
|
|
|
unsigned long flags)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-07-10 21:32:26 +00:00
|
|
|
int rc = lsm_key_alloc(key);
|
|
|
|
|
|
|
|
if (unlikely(rc))
|
|
|
|
return rc;
|
|
|
|
rc = call_int_hook(key_alloc, key, cred, flags);
|
|
|
|
if (unlikely(rc))
|
|
|
|
security_key_free(key);
|
|
|
|
return rc;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
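The allocate, check, free-on-veto order used by security_key_alloc() can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: struct key_sketch, key_alloc_hook_sketch() and the deny parameter are hypothetical, and calloc() stands in for the blob allocation.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical reduced key: only the per-key security blob is modelled. */
struct key_sketch {
	void *security;
};

/* Hypothetical policy hook: a non-zero 'deny' simulates an LSM veto. */
static int key_alloc_hook_sketch(struct key_sketch *key, int deny)
{
	(void)key;
	return deny ? -EACCES : 0;
}

/* Release the blob and clear the pointer so a second free is harmless. */
static void key_free_sketch(struct key_sketch *key)
{
	free(key->security);
	key->security = NULL;
}

/* Allocate the blob first, then ask the hook; undo the allocation on veto. */
static int key_alloc_sketch(struct key_sketch *key, size_t blob_size, int deny)
{
	int rc;

	key->security = calloc(1, blob_size);
	if (!key->security)
		return -ENOMEM;
	rc = key_alloc_hook_sketch(key, deny);
	if (rc)
		key_free_sketch(key);
	return rc;
}
```

After a vetoed key_alloc_sketch() call the key carries no blob, so the caller does not need a separate cleanup path for the failure case.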
|
|
|
|
|
2023-02-16 02:46:31 +00:00
|
|
|
/**
|
|
|
|
* security_key_free() - Free a kernel key LSM blob
|
|
|
|
* @key: key
|
|
|
|
*
|
|
|
|
* Notification of destruction; free security data.
|
|
|
|
*/
|
2007-10-17 06:31:32 +00:00
|
|
|
void security_key_free(struct key *key)
|
|
|
|
{
|
2024-07-10 21:32:26 +00:00
|
|
|
kfree(key->security);
|
|
|
|
key->security = NULL;
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 02:46:31 +00:00
|
|
|
/**
|
|
|
|
* security_key_permission() - Check if a kernel key operation is allowed
|
|
|
|
* @key_ref: key reference
|
|
|
|
* @cred: credentials of actor requesting access
|
|
|
|
* @need_perm: requested permissions
|
|
|
|
*
|
|
|
|
* See whether a specific operational right is granted to a process on a key.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if permission is granted, -ve error otherwise.
|
|
|
|
*/
|
2020-05-12 14:16:29 +00:00
|
|
|
int security_key_permission(key_ref_t key_ref, const struct cred *cred,
|
|
|
|
enum key_need_perm need_perm)
|
2007-10-17 06:31:32 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(key_permission, key_ref, cred, need_perm);
|
2007-10-17 06:31:32 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 02:46:31 +00:00
|
|
|
/**
|
|
|
|
* security_key_getsecurity() - Get the key's security label
|
|
|
|
* @key: key
|
2023-03-08 18:28:18 +00:00
|
|
|
* @buffer: security label buffer
|
2023-02-16 02:46:31 +00:00
|
|
|
*
|
|
|
|
* Get a textual representation of the security context attached to a key for
|
|
|
|
* the purposes of honouring KEYCTL_GETSECURITY. This function allocates the
|
|
|
|
* storage for the NUL-terminated string and the caller should free it.
|
|
|
|
*
|
2023-03-08 18:28:18 +00:00
|
|
|
* Return: Returns the length of @buffer (including terminating NUL) or -ve if
|
2023-02-16 02:46:31 +00:00
|
|
|
* an error occurs. May also return 0 (and a NULL buffer pointer) if
|
|
|
|
* there is no security label assigned to the key.
|
|
|
|
*/
|
2023-03-08 18:28:18 +00:00
|
|
|
int security_key_getsecurity(struct key *key, char **buffer)
|
2008-04-29 08:01:26 +00:00
|
|
|
{
|
2023-03-08 18:28:18 +00:00
|
|
|
*buffer = NULL;
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(key_getsecurity, key, buffer);
|
2008-04-29 08:01:26 +00:00
|
|
|
}
|
2024-02-15 10:31:06 +00:00
|
|
|
|
|
|
|
/**
|
|
|
|
* security_key_post_create_or_update() - Notification of key create or update
|
|
|
|
* @keyring: keyring to which the key is linked
|
|
|
|
* @key: created or updated key
|
|
|
|
* @payload: data used to instantiate or update the key
|
|
|
|
* @payload_len: length of payload
|
|
|
|
* @flags: key flags
|
|
|
|
* @create: flag indicating whether the key was created or updated
|
|
|
|
*
|
|
|
|
* Notify the caller of a key creation or update.
|
|
|
|
*/
|
|
|
|
void security_key_post_create_or_update(struct key *keyring, struct key *key,
|
|
|
|
const void *payload, size_t payload_len,
|
|
|
|
unsigned long flags, bool create)
|
|
|
|
{
|
|
|
|
call_void_hook(key_post_create_or_update, keyring, key, payload,
|
|
|
|
payload_len, flags, create);
|
2008-04-29 08:01:26 +00:00
|
|
|
}
|
2007-10-17 06:31:32 +00:00
|
|
|
#endif /* CONFIG_KEYS */
|
2008-03-01 20:00:05 +00:00
|
|
|
|
|
|
|
#ifdef CONFIG_AUDIT
|
2023-02-16 22:00:01 +00:00
|
|
|
/**
|
|
|
|
* security_audit_rule_init() - Allocate and init an LSM audit rule struct
|
|
|
|
* @field: audit action
|
|
|
|
* @op: rule operator
|
|
|
|
* @rulestr: rule context
|
|
|
|
* @lsmrule: receive buffer for audit rule struct
|
2024-05-07 01:25:41 +00:00
|
|
|
* @gfp: GFP flag used for kmalloc
|
2023-02-16 22:00:01 +00:00
|
|
|
*
|
|
|
|
* Allocate and initialize an LSM audit rule structure.
|
|
|
|
*
|
|
|
|
* Return: Return 0 if @lsmrule has been successfully set, -EINVAL in case of
|
|
|
|
* an invalid rule.
|
|
|
|
*/
|
2024-05-07 01:25:41 +00:00
|
|
|
int security_audit_rule_init(u32 field, u32 op, char *rulestr, void **lsmrule,
|
|
|
|
gfp_t gfp)
|
2008-03-01 20:00:05 +00:00
|
|
|
{
|
2024-05-07 01:25:41 +00:00
|
|
|
return call_int_hook(audit_rule_init, field, op, rulestr, lsmrule, gfp);
|
2008-03-01 20:00:05 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:00:01 +00:00
|
|
|
/**
|
|
|
|
* security_audit_rule_known() - Check if an audit rule contains LSM fields
|
|
|
|
* @krule: audit rule
|
|
|
|
*
|
|
|
|
* Specifies whether given @krule contains any fields related to the current
|
|
|
|
* LSM.
|
|
|
|
*
|
|
|
|
* Return: Returns 1 in case of relation found, 0 otherwise.
|
|
|
|
*/
|
2008-03-01 20:00:05 +00:00
|
|
|
int security_audit_rule_known(struct audit_krule *krule)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(audit_rule_known, krule);
|
2008-03-01 20:00:05 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:00:01 +00:00
|
|
|
/**
|
|
|
|
* security_audit_rule_free() - Free an LSM audit rule struct
|
|
|
|
* @lsmrule: audit rule struct
|
|
|
|
*
|
|
|
|
* Deallocate the LSM audit rule structure previously allocated by
|
|
|
|
* audit_rule_init().
|
|
|
|
*/
|
2008-03-01 20:00:05 +00:00
|
|
|
void security_audit_rule_free(void *lsmrule)
|
|
|
|
{
|
2015-05-02 22:11:29 +00:00
|
|
|
call_void_hook(audit_rule_free, lsmrule);
|
2008-03-01 20:00:05 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:00:01 +00:00
|
|
|
/**
|
|
|
|
* security_audit_rule_match() - Check if a label matches an audit rule
|
|
|
|
* @secid: security label
|
|
|
|
* @field: LSM audit field
|
|
|
|
* @op: matching operator
|
|
|
|
* @lsmrule: audit rule
|
|
|
|
*
|
|
|
|
* Determine if given @secid matches a rule previously approved by
|
|
|
|
* security_audit_rule_known().
|
|
|
|
*
|
|
|
|
* Return: Returns 1 if secid matches the rule, 0 if it does not, -ERRNO on
|
|
|
|
* failure.
|
|
|
|
*/
|
2019-01-31 16:52:11 +00:00
|
|
|
int security_audit_rule_match(u32 secid, u32 field, u32 op, void *lsmrule)
|
2008-03-01 20:00:05 +00:00
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(audit_rule_match, secid, field, op, lsmrule);
|
2008-03-01 20:00:05 +00:00
|
|
|
}
|
2015-05-02 22:11:42 +00:00
|
|
|
#endif /* CONFIG_AUDIT */
|
2017-10-18 20:00:24 +00:00
|
|
|
|
|
|
|
#ifdef CONFIG_BPF_SYSCALL
|
2023-02-16 22:13:40 +00:00
|
|
|
/**
|
|
|
|
* security_bpf() - Check if the bpf syscall operation is allowed
|
|
|
|
* @cmd: command
|
|
|
|
* @attr: bpf attribute
|
|
|
|
* @size: size
|
|
|
|
*
|
|
|
|
* Do an initial check for all bpf syscalls after the attribute is copied into
|
|
|
|
* the kernel. The actual security module can implement their own rules to
|
|
|
|
* check the specific cmd they need.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2017-10-18 20:00:24 +00:00
|
|
|
int security_bpf(int cmd, union bpf_attr *attr, unsigned int size)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(bpf, cmd, attr, size);
|
2017-10-18 20:00:24 +00:00
|
|
|
}
|
2023-02-16 22:13:40 +00:00
|
|
|
|
|
|
|
/**
|
|
|
|
* security_bpf_map() - Check if access to a bpf map is allowed
|
|
|
|
* @map: bpf map
|
|
|
|
* @fmode: mode
|
|
|
|
*
|
|
|
|
* Do a check when the kernel generates and returns a file descriptor for eBPF
|
|
|
|
* maps.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2017-10-18 20:00:24 +00:00
|
|
|
int security_bpf_map(struct bpf_map *map, fmode_t fmode)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(bpf_map, map, fmode);
|
2017-10-18 20:00:24 +00:00
|
|
|
}
|
2023-02-16 22:13:40 +00:00
|
|
|
|
|
|
|
/**
|
|
|
|
* security_bpf_prog() - Check if access to a bpf program is allowed
|
|
|
|
* @prog: bpf program
|
|
|
|
*
|
|
|
|
* Do a check when the kernel generates and returns a file descriptor for eBPF
|
|
|
|
* programs.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2017-10-18 20:00:24 +00:00
|
|
|
int security_bpf_prog(struct bpf_prog *prog)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(bpf_prog, prog);
|
2017-10-18 20:00:24 +00:00
|
|
|
}
|
2023-02-16 22:13:40 +00:00
|
|
|
|
|
|
|
/**
|
bpf,lsm: Refactor bpf_map_alloc/bpf_map_free LSM hooks
Similarly to bpf_prog_alloc LSM hook, rename and extend bpf_map_alloc
hook into bpf_map_create, taking not just struct bpf_map, but also
bpf_attr and bpf_token, to give a fuller context to LSMs.
Unlike bpf_prog_alloc, there is no need to move the hook around, as it
currently is firing right before allocating BPF map ID and FD, which
seems to be a sweet spot.
But as with the bpf_prog_alloc/bpf_prog_free combo, make sure that the
bpf_map_free LSM hook is called even if the bpf_map_create hook returned
an error. If several LSMs are combined, one LSM may have successfully
allocated a security blob for its needs while a subsequent LSM rejected
BPF map creation. The former LSM would still need to free its LSM blob,
so we need to ensure security_bpf_map_free() is called regardless of the
outcome.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-11-andrii@kernel.org
2024-01-24 02:21:07 +00:00
|
|
|
* security_bpf_map_create() - Check if BPF map creation is allowed
|
|
|
|
* @map: BPF map object
|
|
|
|
* @attr: BPF syscall attributes used to create BPF map
|
|
|
|
* @token: BPF token used to grant user access
|
2023-02-16 22:13:40 +00:00
|
|
|
*
|
2024-01-24 02:21:07 +00:00
|
|
|
* Do a check when the kernel creates a new BPF map. This is also the
|
|
|
|
* point where the LSM blob is allocated for LSMs that need one.
|
2023-02-16 22:13:40 +00:00
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2024-01-24 02:21:07 +00:00
|
|
|
int security_bpf_map_create(struct bpf_map *map, union bpf_attr *attr,
|
|
|
|
struct bpf_token *token)
|
2017-10-18 20:00:24 +00:00
|
|
|
{
|
lsm/stable-6.9 PR 20240312
Merge tag 'lsm-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Promote IMA/EVM to a proper LSM
This is the bulk of the diffstat, and the source of all the changes
in the VFS code. Prior to the start of the LSM stacking work it was
important that IMA/EVM were separate from the rest of the LSMs,
complete with their own hooks, infrastructure, etc. as it was the
only way to enable IMA/EVM at the same time as a LSM.
However, now that the bulk of the LSM infrastructure supports
multiple simultaneous LSMs, we can simplify things greatly by
bringing IMA/EVM into the LSM infrastructure as proper LSMs. This is
something I've wanted to see happen for quite some time and Roberto
was kind enough to put in the work to make it happen.
- Use the LSM hook default values to simplify the call_int_hook() macro
Previously the call_int_hook() macro required callers to supply a
default return value, despite a default value being specified when
the LSM hook was defined.
This simplifies the macro by using the defined default return value
which makes life easier for callers and should also reduce the number
of return value bugs in the future (we've had a few pop up recently,
hence this work).
- Use the KMEM_CACHE() macro instead of kmem_cache_create()
The guidance appears to be to use the KMEM_CACHE() macro when
possible and there is no reason why we can't use the macro, so let's
use it.
- Fix a number of comment typos in the LSM hook comment blocks
Not much to say here, we fixed some questionable grammar decisions in
the LSM hook comment blocks.
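The call_int_hook() simplification described in the list above can be illustrated with a small user-space sketch. It assumes that the macro walks the registered hooks in order and stops at the first result that differs from the hook's defined default. Every name prefixed with demo_ or DEMO_ is hypothetical and exists only for this illustration; none of them are kernel identifiers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins: each hook records its default return value once,
 * at definition time, so callers never have to repeat it. */
#define DEMO_DEFINE_HOOK(name, def) enum { demo_default_##name = (def) }
#define DEMO_HOOK_DEFAULT(name) (demo_default_##name)

DEMO_DEFINE_HOOK(check_value, 0);
DEMO_DEFINE_HOOK(query_label, -EOPNOTSUPP);

typedef int (*demo_hook_fn)(int arg);

struct demo_hook_list {
	const demo_hook_fn *fns;
	size_t count;
};

/* Call every hook in order; stop at the first non-default result. */
static int demo_call_hooks(const struct demo_hook_list *list, int def, int arg)
{
	int rc = def;

	assert(list != NULL);
	for (size_t i = 0; i < list->count; i++) {
		rc = list->fns[i](arg);
		if (rc != def)
			break;
	}
	return rc;
}

/* The caller names the hook only; the default is looked up from it. */
#define DEMO_CALL_INT_HOOK(name, list, arg) \
	demo_call_hooks((list), DEMO_HOOK_DEFAULT(name), (arg))

static int demo_allow_all(int arg) { (void)arg; return 0; }
static int demo_deny_negative(int arg) { return arg < 0 ? -EPERM : 0; }

static const demo_hook_fn demo_check_fns[] = { demo_allow_all, demo_deny_negative };
static const struct demo_hook_list demo_check_list = { demo_check_fns, 2 };
static const struct demo_hook_list demo_empty_list = { NULL, 0 };
```

With these definitions, DEMO_CALL_INT_HOOK(check_value, &demo_check_list, -1) yields -EPERM from the second hook, while DEMO_CALL_INT_HOOK(query_label, &demo_empty_list, 5) falls back to that hook's own default of -EOPNOTSUPP without the caller having to spell it out.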
* tag 'lsm-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (28 commits)
cred: Use KMEM_CACHE() instead of kmem_cache_create()
lsm: use default hook return value in call_int_hook()
lsm: fix typos in security/security.c comment headers
integrity: Remove LSM
ima: Make it independent from 'integrity' LSM
evm: Make it independent from 'integrity' LSM
evm: Move to LSM infrastructure
ima: Move IMA-Appraisal to LSM infrastructure
ima: Move to LSM infrastructure
integrity: Move integrity_kernel_module_request() to IMA
security: Introduce key_post_create_or_update hook
security: Introduce inode_post_remove_acl hook
security: Introduce inode_post_set_acl hook
security: Introduce inode_post_create_tmpfile hook
security: Introduce path_post_mknod hook
security: Introduce file_release hook
security: Introduce file_post_open hook
security: Introduce inode_post_removexattr hook
security: Introduce inode_post_setattr hook
security: Align inode_setattr hook definition with EVM
...
2024-03-13 03:03:34 +00:00
|
|
|
	return call_int_hook(bpf_map_create, map, attr, token);
|
2017-10-18 20:00:24 +00:00
|
|
|
}
|
2023-02-16 22:13:40 +00:00
|
|
|
|
|
|
|
/**
|
bpf,lsm: Refactor bpf_prog_alloc/bpf_prog_free LSM hooks
Based on upstream discussion ([0]), rework existing
bpf_prog_alloc_security LSM hook. Rename it to bpf_prog_load and instead
of passing bpf_prog_aux, pass proper bpf_prog pointer for a full BPF
program struct. Also, we pass bpf_attr union with all the user-provided
arguments for BPF_PROG_LOAD command. This will give LSMs as much
information as we can basically provide.
The hook is also BPF token-aware now, and optional bpf_token struct is
passed as a third argument. bpf_prog_load LSM hook is called after
a bunch of sanity checks were performed, bpf_prog and bpf_prog_aux were
allocated and filled out, but right before performing full-fledged BPF
verification step.
bpf_prog_free LSM hook is now accepting struct bpf_prog argument, for
consistency. SELinux code is adjusted to all new names, types, and
signatures.
Note that the bpf_prog_load (previously bpf_prog_alloc) hook can be used
by some LSMs to allocate an extra security blob, but also by other LSMs
to reject BPF program loading, so we need to make sure that the
bpf_prog_free LSM hook is called after the bpf_prog_load/bpf_prog_alloc
one *even* if the hook itself returned an error. If we don't do that, we
run the risk of leaking memory. This seems to be possible today when
combining SELinux and BPF LSM, as one example, depending on their
relative ordering.
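The leak scenario in the preceding paragraph can be modelled in user space. This is a minimal sketch, assuming two stacked LSMs where the first allocates a blob and the second may reject the load. All demo_ names are hypothetical stand-ins and not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a BPF program carrying one LSM's security blob. */
struct demo_prog {
	void *blob;
};

static int demo_live_blobs;      /* blobs allocated and not yet freed */
static int demo_second_rejects;  /* when non-zero, the second LSM denies the load */

/* First stacked LSM: allocates its blob and allows the load. */
static int demo_lsm_a_load(struct demo_prog *prog)
{
	prog->blob = malloc(16);
	if (!prog->blob)
		return -ENOMEM;
	demo_live_blobs++;
	return 0;
}

/* First LSM's free hook; safe to call even if nothing was allocated. */
static void demo_lsm_a_free(struct demo_prog *prog)
{
	if (!prog->blob)
		return;
	free(prog->blob);
	prog->blob = NULL;
	demo_live_blobs--;
}

/* Second stacked LSM: allocates nothing, may reject the load. */
static int demo_lsm_b_load(struct demo_prog *prog)
{
	(void)prog;
	return demo_second_rejects ? -EPERM : 0;
}

/* Runs the load hooks in order and stops at the first error. */
static int demo_security_prog_load(struct demo_prog *prog)
{
	int rc = demo_lsm_a_load(prog);

	if (rc)
		return rc;
	return demo_lsm_b_load(prog);
}

static void demo_security_prog_free(struct demo_prog *prog)
{
	demo_lsm_a_free(prog);
}

/* Caller: the free hook runs on the error path as well as on success. */
static int demo_load_then_release(void)
{
	struct demo_prog prog = { .blob = NULL };
	int rc = demo_security_prog_load(&prog);

	demo_security_prog_free(&prog);
	return rc;
}
```

If demo_load_then_release() skipped demo_security_prog_free() when the load hook failed, the blob allocated by the first LSM would stay counted in demo_live_blobs after the second LSM's rejection, which is the leak the commit message describes.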
Also, for BPF LSM setup, add bpf_prog_load and bpf_prog_free to
sleepable LSM hooks list, as they are both executed in sleepable
context. Also drop bpf_prog_load hook from untrusted, as there is no
issue with refcount or anything else anymore, that originally forced us
to add it to untrusted list in c0c852dd1876 ("bpf: Do not mark certain LSM
hook arguments as trusted"). We now trigger this hook much later and it
should not be an issue anymore.
[0] https://lore.kernel.org/bpf/9fe88aef7deabbe87d3fc38c4aea3c69.paul@paul-moore.com/
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-10-andrii@kernel.org
2024-01-24 02:21:06 +00:00
|
|
|
* security_bpf_prog_load() - Check if loading of BPF program is allowed
|
|
|
|
* @prog: BPF program object
|
|
|
|
* @attr: BPF syscall attributes used to create BPF program
|
|
|
|
* @token: BPF token used to grant user access to BPF subsystem
|
2023-02-16 22:13:40 +00:00
|
|
|
*
|
2024-01-24 02:21:06 +00:00
|
|
|
* Perform an access control check when the kernel loads a BPF program and
|
|
|
|
* allocates the associated BPF program object. This hook is also responsible for
|
|
|
|
* allocating any required LSM state for the BPF program.
|
2023-02-16 22:13:40 +00:00
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
2024-01-24 02:21:06 +00:00
|
|
|
int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr,
|
|
|
|
			   struct bpf_token *token)
|
2017-10-18 20:00:24 +00:00
|
|
|
{
|
2024-03-13 03:03:34 +00:00
|
|
|
	return call_int_hook(bpf_prog_load, prog, attr, token);
|
2017-10-18 20:00:24 +00:00
|
|
|
}
|
2023-02-16 22:13:40 +00:00
|
|
|
|
2024-01-24 02:21:08 +00:00
|
|
|
/**
|
|
|
|
* security_bpf_token_create() - Check if creation of a BPF token is allowed
|
|
|
|
* @token: BPF token object
|
|
|
|
* @attr: BPF syscall attributes used to create BPF token
|
|
|
|
* @path: path pointing to BPF FS mount point from which BPF token is created
|
|
|
|
*
|
|
|
|
* Do a check when the kernel instantiates a new BPF token object from a BPF FS
|
|
|
|
* instance. This is also the point where an LSM blob can be allocated for LSMs.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
|
|
|
int security_bpf_token_create(struct bpf_token *token, union bpf_attr *attr,
|
|
|
|
			      struct path *path)
|
|
|
|
{
|
2024-03-13 03:03:34 +00:00
|
|
|
	return call_int_hook(bpf_token_create, token, attr, path);
|
2024-01-24 02:21:08 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* security_bpf_token_cmd() - Check if BPF token is allowed to delegate
|
|
|
|
* requested BPF syscall command
|
|
|
|
* @token: BPF token object
|
|
|
|
* @cmd: BPF syscall command requested to be delegated by BPF token
|
|
|
|
*
|
|
|
|
* Do a check when the kernel decides whether the provided BPF token should allow
|
|
|
|
* delegation of requested BPF syscall command.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
|
|
|
int security_bpf_token_cmd(const struct bpf_token *token, enum bpf_cmd cmd)
|
|
|
|
{
|
2024-03-13 03:03:34 +00:00
|
|
|
	return call_int_hook(bpf_token_cmd, token, cmd);
|
2024-01-24 02:21:08 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* security_bpf_token_capable() - Check if BPF token is allowed to delegate
|
|
|
|
* requested BPF-related capability
|
|
|
|
* @token: BPF token object
|
|
|
|
* @cap: capability requested to be delegated by BPF token
|
|
|
|
*
|
|
|
|
* Do a check when the kernel decides whether the provided BPF token should allow
|
|
|
|
* delegation of requested BPF-related capabilities.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 on success, error on failure.
|
|
|
|
*/
|
|
|
|
int security_bpf_token_capable(const struct bpf_token *token, int cap)
|
|
|
|
{
|
2024-03-13 03:03:34 +00:00
|
|
|
	return call_int_hook(bpf_token_capable, token, cap);
|
2024-01-24 02:21:08 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:13:40 +00:00
|
|
|
/**
|
|
|
|
* security_bpf_map_free() - Free a bpf map's LSM blob
|
|
|
|
* @map: bpf map
|
|
|
|
*
|
|
|
|
* Clean up the security information stored inside the BPF map.
|
|
|
|
*/
|
2017-10-18 20:00:24 +00:00
|
|
|
void security_bpf_map_free(struct bpf_map *map)
|
|
|
|
{
|
2024-01-24 02:21:07 +00:00
|
|
|
	call_void_hook(bpf_map_free, map);
|
2017-10-18 20:00:24 +00:00
|
|
|
}
|
2023-02-16 22:13:40 +00:00
|
|
|
|
|
|
|
/**
|
2024-01-24 02:21:06 +00:00
|
|
|
* security_bpf_prog_free() - Free a BPF program's LSM blob
|
|
|
|
* @prog: BPF program struct
|
2023-02-16 22:13:40 +00:00
|
|
|
*
|
2024-01-24 02:21:06 +00:00
|
|
|
* Clean up the security information stored inside BPF program.
|
2023-02-16 22:13:40 +00:00
|
|
|
*/
|
2024-01-24 02:21:06 +00:00
|
|
|
void security_bpf_prog_free(struct bpf_prog *prog)
|
2017-10-18 20:00:24 +00:00
|
|
|
{
|
2024-01-24 02:21:06 +00:00
|
|
|
call_void_hook(bpf_prog_free, prog);
|
2017-10-18 20:00:24 +00:00
|
|
|
}
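The commit message above notes that the free hook has to run even when the load hook returned an error, because one LSM may already have allocated state before another LSM rejected the program. The following is a minimal userspace sketch of that situation, not kernel code: `struct fake_prog`, `lsm_a_load`, `lsm_b_load`, `fake_prog_load`, `fake_prog_free` and `live_allocs` are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical program object with a slot for one LSM's private state. */
struct fake_prog {
	void *lsm_state;
};

int live_allocs; /* number of outstanding LSM allocations */

/* First (hypothetical) LSM: allocates private state when a program loads. */
int lsm_a_load(struct fake_prog *prog)
{
	assert(prog != NULL);
	prog->lsm_state = malloc(32);
	if (!prog->lsm_state)
		return -ENOMEM;
	live_allocs++;
	return 0;
}

/* Second (hypothetical) LSM: allocates nothing and rejects the load. */
int lsm_b_load(struct fake_prog *prog)
{
	(void)prog;
	return -EPERM;
}

/* Load walk: stops at the first error, so earlier allocations stay in place. */
int fake_prog_load(struct fake_prog *prog)
{
	int rc = lsm_a_load(prog);

	if (rc)
		return rc;
	return lsm_b_load(prog);
}

/* Free walk: must run even after a failed load to release lsm_a's state. */
void fake_prog_free(struct fake_prog *prog)
{
	if (prog->lsm_state) {
		free(prog->lsm_state);
		prog->lsm_state = NULL;
		live_allocs--;
	}
}
```

In this model, a caller that skipped `fake_prog_free()` after the `-EPERM` result would leave `live_allocs` at 1, which is the leak the commit message describes.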
|
2024-01-24 02:21:08 +00:00
|
|
|
|
|
|
|
/**
 * security_bpf_token_free() - Free a BPF token's LSM blob
 * @token: BPF token struct
 *
 * Clean up the security information stored inside BPF token.
 */
void security_bpf_token_free(struct bpf_token *token)
{
	call_void_hook(bpf_token_free, token);
}
|
2017-10-18 20:00:24 +00:00
|
|
|
#endif /* CONFIG_BPF_SYSCALL */
|
2019-08-20 00:17:38 +00:00
|
|
|
|
2023-02-16 22:34:14 +00:00
|
|
|
/**
 * security_locked_down() - Check if a kernel feature is allowed
 * @what: requested kernel feature
 *
 * Determine whether a kernel feature that potentially enables arbitrary code
 * execution in kernel space should be permitted.
 *
 * Return: Returns 0 if permission is granted.
 */
|
2019-08-20 00:17:38 +00:00
|
|
|
int security_locked_down(enum lockdown_reason what)
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(locked_down, what);
|
2019-08-20 00:17:38 +00:00
|
|
|
}
EXPORT_SYMBOL(security_locked_down);
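Wrappers such as security_locked_down() hand the decision to call_int_hook(). As a rough userspace model of that behaviour, and not the kernel macro itself, an integer hook walk can be pictured as calling each registered callback in order and stopping at the first one that returns something other than the default of 0. The names `hook_fn`, `model_call_int_hook`, `allow`, `deny` and `calls` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type standing in for one LSM's hook implementation. */
typedef int (*hook_fn)(int what);

int calls; /* counts how many callbacks actually ran */

int allow(int what) { (void)what; calls++; return 0; }
int deny(int what)  { (void)what; calls++; return -EPERM; }

/*
 * Userspace model of an integer hook walk: run the callbacks in registration
 * order and return the first non-zero result; 0 means nobody objected.
 */
int model_call_int_hook(const hook_fn *hooks, size_t n, int what)
{
	assert(hooks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int rc = hooks[i](what);

		if (rc != 0)
			return rc;
	}
	return 0;
}
```

With a stack of `allow, deny, allow`, the walk in this sketch returns `-EPERM` after two calls and never reaches the third callback.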
|
perf_event: Add support for LSM and SELinux checks
In current mainline, the degree of access to perf_event_open(2) system
call depends on the perf_event_paranoid sysctl. This has a number of
limitations:
1. The sysctl is only a single value. Many types of accesses are controlled
based on the single value thus making the control very limited and
coarse grained.
2. The sysctl is global, so if the sysctl is changed, then that means
all processes get access to perf_event_open(2) opening the door to
security issues.
This patch adds LSM and SELinux access checking which will be used in
Android to access perf_event_open(2) for the purposes of attaching BPF
programs to tracepoints, perf profiling and other operations from
userspace. These operations are intended for production systems.
5 new LSM hooks are added:
1. perf_event_open: This controls access during the perf_event_open(2)
syscall itself. The hook is called from all the places that the
perf_event_paranoid sysctl is checked to keep it consistent with the
sysctl. The hook gets passed a 'type' argument which controls CPU,
kernel and tracepoint accesses (in this context, CPU, kernel and
tracepoint have the same semantics as the perf_event_paranoid sysctl).
Additionally, I added an 'open' type which is similar to
perf_event_paranoid sysctl == 3 patch carried in Android and several other
distros but was rejected in mainline [1] in 2016.
2. perf_event_alloc: This allocates a new security object for the event
which stores the current SID within the event. It will be useful when
the perf event's FD is passed through IPC to another process which may
try to read the FD. Appropriate security checks will limit access.
3. perf_event_free: Called when the event is closed.
4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.
5. perf_event_write: Called from the ioctl(2) syscalls for the event.
[1] https://lwn.net/Articles/696240/
Since Peter had suggested LSM hooks in 2016 [1], I am adding his
Suggested-by tag below.
To use this patch, we set the perf_event_paranoid sysctl to -1 and then
apply selinux checking as appropriate (default deny everything, and then
add policy rules to give access to domains that need it). In the future
we can remove the perf_event_paranoid sysctl altogether.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: rostedt@goodmis.org
Cc: Yonghong Song <yhs@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: jeffv@google.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: primiano@google.com
Cc: Song Liu <songliubraving@fb.com>
Cc: rsavitski@google.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Matthew Garrett <matthewgarrett@google.com>
Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org
2019-10-14 17:03:08 +00:00
|
|
|
|
block,lsm: add LSM blob and new LSM hooks for block devices
This patch introduces a new LSM blob to the block_device structure,
enabling the security subsystem to store security-sensitive data related
to block devices. Currently, for a device mapper's mapped device containing
a dm-verity target, critical security information such as the roothash and
its signing state are not readily accessible. Specifically, while the
dm-verity volume creation process passes the dm-verity roothash and its
signature from userspace to the kernel, the roothash is stored privately
within the dm-verity target, and its signature is discarded
post-verification. This makes it extremely hard for the security subsystem
to utilize these data.
With the addition of the LSM blob to the block_device structure, the
security subsystem can now retain and manage important security metadata
such as the roothash and the signing state of a dm-verity by storing them
inside the blob. Access decisions can then be based on these stored data.
The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct superblock.
The initialization of the security blob occurs after the creation of the
struct block_device, performed by the security subsystem. Similarly, the
security blob is freed by the security subsystem before the struct
block_device is deallocated or freed.
This patch also introduces a new hook security_bdev_setintegrity() to save
block device's integrity data to the new LSM blob. For example, for
dm-verity, it can use this hook to expose its roothash and signing state
to LSMs, then LSMs can save these data into the LSM blob.
Please note that the new hook should be invoked every time the security
information is updated to keep these data current. For example, in
dm-verity, if the mapping table is reloaded and configured to use a
different dm-verity target with a new roothash and signing information,
the previously stored data in the LSM blob will become obsolete. It is
crucial to re-invoke the hook to refresh these data and ensure they are up
to date. This necessity arises from the design of device-mapper, where a
device-mapper device is first created, and then targets are subsequently
loaded into it. These targets can be modified multiple times during the
device's lifetime. Therefore, while the LSM blob is allocated during the
creation of the block device, its actual contents are not initialized at
this stage and can change substantially over time. This includes
alterations from data that the LSM 'trusts' to those it does not, making
it essential to handle these changes correctly. Failure to address this
dynamic aspect could potentially allow for bypassing LSM checks.
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: merge fuzz, subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-03 06:08:25 +00:00
|
|
|
/**
 * security_bdev_alloc() - Allocate a block device LSM blob
 * @bdev: block device
 *
 * Allocate and attach a security structure to @bdev->bd_security. The
 * security field is initialized to NULL when the bdev structure is
 * allocated.
 *
 * Return: Return 0 if operation was successful.
 */
int security_bdev_alloc(struct block_device *bdev)
{
	int rc = 0;

	rc = lsm_bdev_alloc(bdev);
	if (unlikely(rc))
		return rc;

	rc = call_int_hook(bdev_alloc_security, bdev);
	if (unlikely(rc))
		security_bdev_free(bdev);

	return rc;
}
EXPORT_SYMBOL(security_bdev_alloc);
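security_bdev_alloc() follows an allocate, consult the hooks, roll back on failure sequence. The sketch below imitates that sequence in userspace with `calloc()` and `free()`. It is only an illustration under stated assumptions: `struct fake_bdev`, `alloc_hook_fn`, `hook_ok`, `hook_fail`, `fake_bdev_alloc` and `fake_bdev_free` are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct block_device; only the blob pointer matters. */
struct fake_bdev {
	void *bd_security;
};

/* Hypothetical hook: an LSM may reject the device after the blob exists. */
typedef int (*alloc_hook_fn)(struct fake_bdev *bdev);

int hook_ok(struct fake_bdev *bdev)   { (void)bdev; return 0; }
int hook_fail(struct fake_bdev *bdev) { (void)bdev; return -ENOMEM; }

/* Mirrors the free wrapper: tolerate a missing blob, then clear the pointer. */
void fake_bdev_free(struct fake_bdev *bdev)
{
	if (!bdev->bd_security)
		return;
	free(bdev->bd_security);
	bdev->bd_security = NULL;
}

/* Allocate the blob first, then let the hook veto; roll back on a veto. */
int fake_bdev_alloc(struct fake_bdev *bdev, size_t blob_size, alloc_hook_fn hook)
{
	int rc;

	assert(bdev != NULL && hook != NULL);

	bdev->bd_security = calloc(1, blob_size);
	if (!bdev->bd_security)
		return -ENOMEM;

	rc = hook(bdev);
	if (rc)
		fake_bdev_free(bdev);

	return rc;
}
```

Because the free helper clears the pointer and tolerates a missing blob, calling it a second time after a failed allocation is harmless in this sketch.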
|
|
|
|
|
|
|
|
/**
 * security_bdev_free() - Free a block device's LSM blob
 * @bdev: block device
 *
 * Deallocate the bdev security structure and set @bdev->bd_security to NULL.
 */
void security_bdev_free(struct block_device *bdev)
{
	if (!bdev->bd_security)
		return;

	call_void_hook(bdev_free_security, bdev);

	kfree(bdev->bd_security);
	bdev->bd_security = NULL;
}
EXPORT_SYMBOL(security_bdev_free);
|
|
|
|
|
|
|
|
/**
 * security_bdev_setintegrity() - Set the device's integrity data
 * @bdev: block device
 * @type: type of integrity, e.g. hash digest, signature, etc
 * @value: the integrity value
 * @size: size of the integrity value
 *
 * Register a verified integrity measurement of a bdev with LSMs.
 * LSMs should free the previously saved data if @value is NULL.
 * Please note that the new hook should be invoked every time the security
 * information is updated to keep these data current. For example, in dm-verity,
 * if the mapping table is reloaded and configured to use a different dm-verity
|
2024-08-03 06:08:27 +00:00
|
|
|
 * target with a new roothash and signing information, the previously stored
 * data in the LSM blob will become obsolete. It is crucial to re-invoke the
 * hook to refresh these data and ensure they are up to date. This necessity
 * arises from the design of device-mapper, where a device-mapper device is
 * first created, and then targets are subsequently loaded into it. These
 * targets can be modified multiple times during the device's lifetime.
 * Therefore, while the LSM blob is allocated during the creation of the block
 * device, its actual contents are not initialized at this stage and can change
 * substantially over time. This includes alterations from data that the LSMs
 * 'trust' to those they do not, making it essential to handle these changes
 * correctly. Failure to address this dynamic aspect could potentially allow
 * for bypassing LSM checks.
|
2024-08-03 06:08:25 +00:00
|
|
|
 *
 * Return: Returns 0 on success, negative values on failure.
 */
int security_bdev_setintegrity(struct block_device *bdev,
			       enum lsm_integrity_type type, const void *value,
			       size_t size)
{
	return call_int_hook(bdev_setintegrity, bdev, type, value, size);
}
EXPORT_SYMBOL(security_bdev_setintegrity);
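The kernel-doc above asks LSMs to refresh their saved data on every call and to free it when @value is NULL. One way a callback could honour that contract is sketched below in userspace C. This is an assumption-laden illustration, not code from any in-tree LSM: `struct fake_blob` and `fake_setintegrity` are hypothetical, and a real implementation would also dispatch on @type.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-device blob holding one saved integrity value. */
struct fake_blob {
	void *value;
	size_t size;
};

/*
 * Store a private copy of @value, replacing whatever was saved before.
 * A NULL @value drops the saved data, as the kernel-doc asks of LSMs.
 */
int fake_setintegrity(struct fake_blob *blob, const void *value, size_t size)
{
	void *copy = NULL;

	assert(blob != NULL);

	if (value) {
		copy = malloc(size ? size : 1);
		if (!copy)
			return -ENOMEM;
		memcpy(copy, value, size);
	}

	free(blob->value);	/* free(NULL) is a no-op */
	blob->value = copy;
	blob->size = value ? size : 0;
	return 0;
}
```

The new copy is made before the old one is released, so a failed allocation leaves the previously saved data untouched. Whether stale data should instead be dropped on failure is a policy choice this sketch does not settle.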
|
|
|
|
|
2019-10-14 17:03:08 +00:00
|
|
|
#ifdef CONFIG_PERF_EVENTS
|
2023-02-16 22:22:36 +00:00
|
|
|
/**
 * security_perf_event_open() - Check if a perf event open is allowed
 * @attr: perf event attribute
 * @type: type of event
 *
 * Check whether the @type of perf_event_open syscall is allowed.
 *
 * Return: Returns 0 if permission is granted.
 */
|
2019-10-14 17:03:08 +00:00
|
|
|
int security_perf_event_open(struct perf_event_attr *attr, int type)
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(perf_event_open, attr, type);
|
2019-10-14 17:03:08 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:22:36 +00:00
|
|
|
/**
 * security_perf_event_alloc() - Allocate a perf event LSM blob
 * @event: perf event
 *
 * Allocate and save perf_event security info.
 *
 * Return: Returns 0 on success, error on failure.
 */
|
2019-10-14 17:03:08 +00:00
|
|
|
int security_perf_event_alloc(struct perf_event *event)
{
|
2024-07-10 21:32:30 +00:00
|
|
|
	int rc;

	rc = lsm_blob_alloc(&event->security, blob_sizes.lbs_perf_event,
			    GFP_KERNEL);
	if (rc)
		return rc;

	rc = call_int_hook(perf_event_alloc, event);
	if (rc) {
		kfree(event->security);
		event->security = NULL;
	}
	return rc;
|
perf_event: Add support for LSM and SELinux checks
In current mainline, the degree of access to perf_event_open(2) system
call depends on the perf_event_paranoid sysctl. This has a number of
limitations:
1. The sysctl is only a single value. Many types of accesses are controlled
based on the single value thus making the control very limited and
coarse grained.
2. The sysctl is global, so if the sysctl is changed, then that means
all processes get access to perf_event_open(2) opening the door to
security issues.
This patch adds LSM and SELinux access checking which will be used in
Android to access perf_event_open(2) for the purposes of attaching BPF
programs to tracepoints, perf profiling and other operations from
userspace. These operations are intended for production systems.
5 new LSM hooks are added:
1. perf_event_open: This controls access during the perf_event_open(2)
syscall itself. The hook is called from all the places that the
perf_event_paranoid sysctl is checked to keep it consistent with the
sysctl. The hook gets passed a 'type' argument which controls CPU,
kernel and tracepoint accesses (in this context, CPU, kernel and
tracepoint have the same semantics as the perf_event_paranoid sysctl).
Additionally, I added an 'open' type which is similar to the
perf_event_paranoid sysctl == 3 patch carried in Android and several other
distros but was rejected in mainline [1] in 2016.
2. perf_event_alloc: This allocates a new security object for the event
which stores the current SID within the event. It will be useful when
the perf event's FD is passed through IPC to another process which may
try to read the FD. Appropriate security checks will limit access.
3. perf_event_free: Called when the event is closed.
4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.
5. perf_event_write: Called from the ioctl(2) syscalls for the event.
[1] https://lwn.net/Articles/696240/
Since Peter had suggested LSM hooks in 2016 [1], I am adding his
Suggested-by tag below.
To use this patch, we set the perf_event_paranoid sysctl to -1 and then
apply selinux checking as appropriate (default deny everything, and then
add policy rules to give access to domains that need it). In the future
we can remove the perf_event_paranoid sysctl altogether.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: rostedt@goodmis.org
Cc: Yonghong Song <yhs@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: jeffv@google.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: primiano@google.com
Cc: Song Liu <songliubraving@fb.com>
Cc: rsavitski@google.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Matthew Garrett <matthewgarrett@google.com>
Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org
2019-10-14 17:03:08 +00:00
|
|
|
}
|
|
|
|
|
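The allocate, call hooks, roll back on failure sequence used by security_perf_event_alloc() above can be sketched in plain userspace C. This is a minimal mock under stated assumptions, not the kernel implementation: `struct mock_event`, `mock_blob_alloc()`, `mock_event_alloc()` and the two hook functions are hypothetical names invented for illustration, `calloc()`/`free()` stand in for the kernel allocator, and a single function pointer stands in for the LSM hook chain.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_event; only the blob pointer is modelled. */
struct mock_event {
	void *security;
};

/* Hypothetical hook type: 0 on success, negative errno on failure. */
typedef int (*mock_alloc_hook)(struct mock_event *event);

/* Allocate a zeroed blob; a zero size is assumed to mean "no blob needed". */
static int mock_blob_alloc(void **blob, size_t size)
{
	if (size == 0) {
		*blob = NULL;
		return 0;
	}
	*blob = calloc(1, size);
	return *blob ? 0 : -ENOMEM;
}

/* Allocate the blob, run the hook, and undo the allocation if the hook fails. */
static int mock_event_alloc(struct mock_event *event, size_t blob_size,
			    mock_alloc_hook hook)
{
	int rc;

	rc = mock_blob_alloc(&event->security, blob_size);
	if (rc)
		return rc;

	rc = hook ? hook(event) : 0;
	if (rc) {
		/* Roll back so the caller never sees a half-initialized event. */
		free(event->security);
		event->security = NULL;
	}
	return rc;
}

/* Example hooks: one that accepts the event and one that denies it. */
static int mock_hook_ok(struct mock_event *event)
{
	(void)event;
	return 0;
}

static int mock_hook_deny(struct mock_event *event)
{
	(void)event;
	return -EACCES;
}
```

Resetting the pointer to NULL after the rollback means a later free of the same event is harmless, which matches the shape of the free path shown for security_perf_event_free().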
2023-02-16 22:22:36 +00:00
|
|
|
/**
|
|
|
|
* security_perf_event_free() - Free a perf event LSM blob
|
|
|
|
* @event: perf event
|
|
|
|
*
|
|
|
|
* Release (free) perf_event security info.
|
|
|
|
*/
|
2019-10-14 17:03:08 +00:00
|
|
|
void security_perf_event_free(struct perf_event *event)
|
|
|
|
{
|
2024-07-10 21:32:30 +00:00
|
|
|
kfree(event->security);
|
|
|
|
event->security = NULL;
|
2019-10-14 17:03:08 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:22:36 +00:00
|
|
|
/**
|
|
|
|
* security_perf_event_read() - Check if reading a perf event label is allowed
|
|
|
|
* @event: perf event
|
|
|
|
*
|
|
|
|
* Read perf_event security info if allowed.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2019-10-14 17:03:08 +00:00
|
|
|
int security_perf_event_read(struct perf_event *event)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(perf_event_read, event);
|
2019-10-14 17:03:08 +00:00
|
|
|
}
|
|
|
|
|
2023-02-16 22:22:36 +00:00
|
|
|
/**
|
|
|
|
* security_perf_event_write() - Check if writing a perf event label is allowed
|
|
|
|
* @event: perf event
|
|
|
|
*
|
|
|
|
* Write perf_event security info if allowed.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2019-10-14 17:03:08 +00:00
|
|
|
int security_perf_event_write(struct perf_event *event)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(perf_event_write, event);
|
2019-10-14 17:03:08 +00:00
|
|
|
}
|
|
|
|
#endif /* CONFIG_PERF_EVENTS */
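The read and write wrappers above simply forward to call_int_hook(). As a rough mental model, and only as an assumption about its behaviour rather than a copy of the kernel macro, the call can be pictured as walking the registered hooks in order and returning the first non-zero result. The sketch below models that in userspace C; `mock_hook`, `mock_call_int_hook()` and the three example hooks are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook signature: 0 grants access, negative errno denies it. */
typedef int (*mock_hook)(const void *obj);

/*
 * Walk the hooks in registration order and stop at the first one that
 * returns a non-zero value. An empty chain grants access (returns 0).
 */
static int mock_call_int_hook(const mock_hook *hooks, size_t n, const void *obj)
{
	for (size_t i = 0; i < n; i++) {
		int rc = hooks[i](obj);

		if (rc != 0)
			return rc;
	}
	return 0;
}

/* Example hooks standing in for individual security modules. */
static int mock_allow(const void *obj)
{
	(void)obj;
	return 0;
}

static int mock_deny_access(const void *obj)
{
	(void)obj;
	return -EACCES;
}

static int mock_deny_perm(const void *obj)
{
	(void)obj;
	return -EPERM;
}
```

Under this model, any single module can veto an operation, and the error the caller sees comes from the first module that objects.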
|
lsm,io_uring: add LSM hooks to io_uring
A full explanation of io_uring is beyond the scope of this commit
description, but in summary it is an asynchronous I/O mechanism
which allows for I/O requests and the resulting data to be queued
in memory mapped "rings" which are shared between the kernel and
userspace. Optionally, io_uring offers the ability for applications
to spawn kernel threads to dequeue I/O requests from the ring and
submit the requests in the kernel, helping to minimize the syscall
overhead. Rings are accessed in userspace by memory mapping a file
descriptor provided by io_uring_setup(2), and can be shared
between applications as one might do with any open file descriptor.
Finally, process credentials can be registered with a given ring
and any process with access to that ring can submit I/O requests
using any of the registered credentials.
While the io_uring functionality is widely recognized as offering a
vastly improved, and high performing asynchronous I/O mechanism, its
ability to allow processes to submit I/O requests with credentials
other than its own presents a challenge to LSMs. When a process
creates a new io_uring ring the ring's credentials are inherited
from the calling process; if this ring is shared with another
process operating with different credentials there is the potential
to bypass the LSM's security policy. Similarly, registering
credentials with a given ring allows any process with access to that
ring to submit I/O requests with those credentials.
In an effort to allow LSMs to apply security policy to io_uring I/O
operations, this patch adds two new LSM hooks. These hooks, in
conjunction with the LSM anonymous inode support previously
submitted, allow an LSM to apply access control policy to the
sharing of io_uring rings as well as any io_uring credential changes
requested by a process.
The new LSM hooks are described below:
* int security_uring_override_creds(cred)
Controls if the current task, executing an io_uring operation,
is allowed to override its credentials with @cred. In cases
where the current task is a user application, the current
credentials will be those of the user application. In cases
where the current task is a kernel thread servicing io_uring
requests the current credentials will be those of the io_uring
ring (inherited from the process that created the ring).
* int security_uring_sqpoll(void)
Controls if the current task is allowed to create an io_uring
polling thread (IORING_SETUP_SQPOLL). Without a SQPOLL thread
in the kernel, processes must submit I/O requests via
io_uring_enter(2), which allows us to compare any requested
credential changes against the application making the request.
With a SQPOLL thread, we can no longer compare requested
credential changes against the application making the request;
the comparison is made against the ring's credentials.
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-02-02 00:56:49 +00:00
|
|
|
|
|
|
|
#ifdef CONFIG_IO_URING
|
2023-02-16 22:28:31 +00:00
|
|
|
/**
|
|
|
|
* security_uring_override_creds() - Check if overriding creds is allowed
|
|
|
|
* @new: new credentials
|
|
|
|
*
|
|
|
|
* Check if the current task, executing an io_uring operation, is allowed to
|
|
|
|
* override its credentials with @new.
|
|
|
|
*
|
|
|
|
* Return: Returns 0 if permission is granted.
|
|
|
|
*/
|
2021-02-02 00:56:49 +00:00
|
|
|
int security_uring_override_creds(const struct cred *new)
|
|
|
|
{
|
2024-01-30 12:56:59 +00:00
|
|
|
return call_int_hook(uring_override_creds, new);
|
lsm,io_uring: add LSM hooks to io_uring
A full expalantion of io_uring is beyond the scope of this commit
description, but in summary it is an asynchronous I/O mechanism
which allows for I/O requests and the resulting data to be queued
in memory mapped "rings" which are shared between the kernel and
userspace. Optionally, io_uring offers the ability for applications
to spawn kernel threads to dequeue I/O requests from the ring and
submit the requests in the kernel, helping to minimize the syscall
overhead. Rings are accessed in userspace by memory mapping a file
descriptor provided by the io_uring_setup(2), and can be shared
between applications as one might do with any open file descriptor.
Finally, process credentials can be registered with a given ring
and any process with access to that ring can submit I/O requests
using any of the registered credentials.
While the io_uring functionality is widely recognized as offering a
vastly improved, and high performing asynchronous I/O mechanism, its
ability to allow processes to submit I/O requests with credentials
other than its own presents a challenge to LSMs. When a process
creates a new io_uring ring the ring's credentials are inhertied
from the calling process; if this ring is shared with another
process operating with different credentials there is the potential
to bypass the LSMs security policy. Similarly, registering
credentials with a given ring allows any process with access to that
ring to submit I/O requests with those credentials.
In an effort to allow LSMs to apply security policy to io_uring I/O
operations, this patch adds two new LSM hooks. These hooks, in
conjunction with the LSM anonymous inode support previously
submitted, allow an LSM to apply access control policy to the
sharing of io_uring rings as well as any io_uring credential changes
requested by a process.
The new LSM hooks are described below:
* int security_uring_override_creds(cred)
  Controls if the current task, executing an io_uring operation,
  is allowed to override its credentials with @cred. In cases
  where the current task is a user application, the current
  credentials will be those of the user application. In cases
  where the current task is a kernel thread servicing io_uring
  requests the current credentials will be those of the io_uring
  ring (inherited from the process that created the ring).
* int security_uring_sqpoll(void)
  Controls if the current task is allowed to create an io_uring
  polling thread (IORING_SETUP_SQPOLL). Without a SQPOLL thread
  in the kernel, processes must submit I/O requests via
  io_uring_enter(2), which allows us to compare any requested
  credential changes against the application making the request.
  With a SQPOLL thread, we can no longer compare requested
  credential changes against the application making the request;
  the comparison is made against the ring's credentials instead.
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-02-02 00:56:49 +00:00
}
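The dispatch pattern behind these thin wrappers can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel's implementation: the hook table and the names `register_sqpoll_hook()`, `demo_security_uring_sqpoll()`, `allow_all()` and `deny_all()` are hypothetical, and the loop assumes a default return value of 0, with the first result that differs from that default ending the walk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an LSM callback: 0 allows, negative errno denies. */
typedef int (*uring_sqpoll_hook_t)(void);

#define MAX_HOOKS 8

/* Hypothetical hook table; the kernel keeps its own per-hook lists. */
static uring_sqpoll_hook_t sqpoll_hooks[MAX_HOOKS];
static size_t n_sqpoll_hooks;

/* Append a callback to the table (illustrative registration helper). */
static void register_sqpoll_hook(uring_sqpoll_hook_t fn)
{
	assert(n_sqpoll_hooks < MAX_HOOKS);
	sqpoll_hooks[n_sqpoll_hooks++] = fn;
}

/*
 * Walk the registered callbacks in order. Assumption: the default
 * result is 0 (permission granted) and the first callback returning
 * anything else decides the outcome.
 */
static int demo_security_uring_sqpoll(void)
{
	for (size_t i = 0; i < n_sqpoll_hooks; i++) {
		int rc = sqpoll_hooks[i]();

		if (rc != 0)
			return rc;
	}
	return 0;
}

/* Two toy policies used to exercise the dispatcher. */
static int allow_all(void)
{
	return 0;
}

static int deny_all(void)
{
	return -EACCES;
}
```

With no callbacks registered the sketch grants permission; registering `allow_all()` and then `deny_all()` makes `demo_security_uring_sqpoll()` return `-EACCES`, because a single denial is enough to refuse the request.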
2023-02-16 22:28:31 +00:00
/**
 * security_uring_sqpoll() - Check if IORING_SETUP_SQPOLL is allowed
 *
 * Check whether the current task is allowed to spawn an io_uring polling
 * thread (IORING_SETUP_SQPOLL).
 *
 * Return: Returns 0 if permission is granted.
 */
2021-02-02 00:56:49 +00:00
int security_uring_sqpoll(void)
{
2024-01-30 12:56:59 +00:00
	return call_int_hook(uring_sqpoll);
2021-02-02 00:56:49 +00:00
}
2023-02-16 22:28:31 +00:00

/**
 * security_uring_cmd() - Check if an io_uring passthrough command is allowed
 * @ioucmd: command
 *
 * Check whether the file_operations uring_cmd is allowed to run.
 *
 * Return: Returns 0 if permission is granted.
 */
2022-07-15 19:16:22 +00:00
int security_uring_cmd(struct io_uring_cmd *ioucmd)
{
2024-01-30 12:56:59 +00:00
	return call_int_hook(uring_cmd, ioucmd);
2022-07-15 19:16:22 +00:00
}
2021-02-02 00:56:49 +00:00
#endif /* CONFIG_IO_URING */
2024-08-03 06:08:19 +00:00

/**
 * security_initramfs_populated() - Notify LSMs that initramfs has been loaded
 *
 * Tells the LSMs the initramfs has been unpacked into the rootfs.
 */
void security_initramfs_populated(void)
{
	call_void_hook(initramfs_populated);
}