linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-13 15:41:39 +00:00

Author	SHA1	Message	Date
Ingo Molnar	b2d0910310	sched/headers: Prepare to use <linux/rcuupdate.h> instead of <linux/rculist.h> in <linux/sched.h> We don't actually need the full rculist.h header in sched.h anymore, we will be able to include the smaller rcupdate.h header instead. But first update code that relied on the implicit header inclusion. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:38 +01:00
Ingo Molnar	299300258d	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> We are going to split <linux/sched/task.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:35 +01:00
Ingo Molnar	5b825c3af1	sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> Add #include <linux/cred.h> dependencies to all .c files rely on sched.h doing that for them. Note that even if the count where we need to add extra headers seems high, it's still a net win, because <linux/sched.h> is included in over 2,200 files ... Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:31 +01:00
Ingo Molnar	8703e8a465	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> We are going to split <linux/sched/user.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/user.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:29 +01:00
Ingo Molnar	3f07c01441	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/signal.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:29 +01:00
Stephen Smalley	2651225b5e	selinux: wrap cgroup seclabel support with its own policy capability commit `1ea0ce4069` ("selinux: allow changing labels for cgroupfs") broke the Android init program, which looks up security contexts whenever creating directories and attempts to assign them via setfscreatecon(). When creating subdirectories in cgroup mounts, this would previously be ignored since cgroup did not support userspace setting of security contexts. However, after the commit, SELinux would attempt to honor the requested context on cgroup directories and fail due to permission denial. Avoid breaking existing userspace/policy by wrapping this change with a conditional on a new cgroup_seclabel policy capability. This preserves existing behavior until/unless a new policy explicitly enables this capability. Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-03-02 10:27:40 +11:00
David Howells	0837e49ab3	KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload() rcu_dereference_key() and user_key_payload() are currently being used in two different, incompatible ways: (1) As a wrapper to rcu_dereference() - when only the RCU read lock used to protect the key. (2) As a wrapper to rcu_dereference_protected() - when the key semaphor is used to protect the key and the may be being modified. Fix this by splitting both of the key wrappers to produce: (1) RCU accessors for keys when caller has the key semaphore locked: dereference_key_locked() user_key_payload_locked() (2) RCU accessors for keys when caller holds the RCU read lock: dereference_key_rcu() user_key_payload_rcu() This should fix following warning in the NFS idmapper =============================== [ INFO: suspicious RCU usage. ] 4.10.0 #1 Tainted: G W ------------------------------- ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 0 1 lock held by mount.nfs/5987: #0: (rcu_read_lock){......}, at: [<d000000002527abc>] nfs_idmap_get_key+0x15c/0x420 [nfsv4] stack backtrace: CPU: 1 PID: 5987 Comm: mount.nfs Tainted: G W 4.10.0 #1 Call Trace: dump_stack+0xe8/0x154 (unreliable) lockdep_rcu_suspicious+0x140/0x190 nfs_idmap_get_key+0x380/0x420 [nfsv4] nfs_map_name_to_uid+0x2a0/0x3b0 [nfsv4] decode_getfattr_attrs+0xfac/0x16b0 [nfsv4] decode_getfattr_generic.constprop.106+0xbc/0x150 [nfsv4] nfs4_xdr_dec_lookup_root+0xac/0xb0 [nfsv4] rpcauth_unwrap_resp+0xe8/0x140 [sunrpc] call_decode+0x29c/0x910 [sunrpc] __rpc_execute+0x140/0x8f0 [sunrpc] rpc_run_task+0x170/0x200 [sunrpc] nfs4_call_sync_sequence+0x68/0xa0 [nfsv4] _nfs4_lookup_root.isra.44+0xd0/0xf0 [nfsv4] nfs4_lookup_root+0xe0/0x350 [nfsv4] nfs4_lookup_root_sec+0x70/0xa0 [nfsv4] nfs4_find_root_sec+0xc4/0x100 [nfsv4] nfs4_proc_get_rootfh+0x5c/0xf0 [nfsv4] nfs4_get_rootfh+0x6c/0x190 [nfsv4] nfs4_server_common_setup+0xc4/0x260 [nfsv4] nfs4_create_server+0x278/0x3c0 [nfsv4] nfs4_remote_mount+0x50/0xb0 [nfsv4] mount_fs+0x74/0x210 vfs_kern_mount+0x78/0x220 nfs_do_root_mount+0xb0/0x140 [nfsv4] nfs4_try_mount+0x60/0x100 [nfsv4] nfs_fs_mount+0x5ec/0xda0 [nfs] mount_fs+0x74/0x210 vfs_kern_mount+0x78/0x220 do_mount+0x254/0xf70 SyS_mount+0x94/0x100 system_call+0x38/0xe0 Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-03-02 10:09:00 +11:00
Alexey Dobriyan	5b5e0928f7	lib/vsprintf.c: remove %Z support Now that %z is standartised in C99 there is no reason to support %Z. Unlike %L it doesn't even make format strings smaller. Use BUILD_BUG_ON in a couple ATM drivers. In case anyone didn't notice lib/vsprintf.o is about half of SLUB which is in my opinion is quite an achievement. Hopefully this patch inspires someone else to trim vsprintf.c more. Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Dave Jiang	11bac80004	mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf ->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-24 17:46:54 -08:00
Linus Torvalds	f1ef09fde1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "There is a lot here. A lot of these changes result in subtle user visible differences in kernel behavior. I don't expect anything will care but I will revert/fix things immediately if any regressions show up. From Seth Forshee there is a continuation of the work to make the vfs ready for unpriviled mounts. We had thought the previous changes prevented the creation of files outside of s_user_ns of a filesystem, but it turns we missed the O_CREAT path. Ooops. Pavel Tikhomirov and Oleg Nesterov worked together to fix a long standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only children that are forked after the prctl are considered and not children forked before the prctl. The only known user of this prctl systemd forks all children after the prctl. So no userspace regressions will occur. Holding earlier forked children to the same rules as later forked children creates a semantic that is sane enough to allow checkpoing of processes that use this feature. There is a long delayed change by Nikolay Borisov to limit inotify instances inside a user namespace. Michael Kerrisk extends the API for files used to maniuplate namespaces with two new trivial ioctls to allow discovery of the hierachy and properties of namespaces. Konstantin Khlebnikov with the help of Al Viro adds code that when a network namespace exits purges it's sysctl entries from the dcache. As in some circumstances this could use a lot of memory. Vivek Goyal fixed a bug with stacked filesystems where the permissions on the wrong inode were being checked. I continue previous work on ptracing across exec. Allowing a file to be setuid across exec while being ptraced if the tracer has enough credentials in the user namespace, and if the process has CAP_SETUID in it's own namespace. Proc files for setuid or otherwise undumpable executables are now owned by the root in the user namespace of their mm. Allowing debugging of setuid applications in containers to work better. A bug I introduced with permission checking and automount is now fixed. The big change is to mark the mounts that the kernel initiates as a result of an automount. This allows the permission checks in sget to be safely suppressed for this kind of mount. As the permission check happened when the original filesystem was mounted. Finally a special case in the mount namespace is removed preventing unbounded chains in the mount hash table, and making the semantics simpler which benefits CRIU. The vfs fix along with related work in ima and evm I believe makes us ready to finish developing and merge fully unprivileged mounts of the fuse filesystem. The cleanups of the mount namespace makes discussing how to fix the worst case complexity of umount. The stacked filesystem fixes pave the way for adding multiple mappings for the filesystem uids so that efficient and safer containers can be implemented" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc/sysctl: Don't grab i_lock under sysctl_lock. vfs: Use upper filesystem inode in bprm_fill_uid() proc/sysctl: prune stale dentries during unregistering mnt: Tuck mounts under others instead of creating shadow/side mounts. prctl: propagate has_child_subreaper flag to every descendant introduce the walk_process_tree() helper nsfs: Add an ioctl() to return owner UID of a userns fs: Better permission checking for submounts exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction vfs: open() with O_CREAT should not create inodes with unknown ids nsfs: Add an ioctl() to return the namespace type proc: Better ownership of files for non-dumpable tasks in user namespaces exec: Remove LSM_UNSAFE_PTRACE_CAP exec: Test the ptracer's saved cred to see if the tracee can gain caps exec: Don't reset euid and egid when the tracee has CAP_SETUID inotify: Convert to using per-namespace limits	2017-02-23 20:33:51 -08:00
Linus Torvalds	b2064617c7	driver core patches for 4.11-rc1 Here is the "small" driver core patches for 4.11-rc1. Not much here, some firmware documentation and self-test updates, a debugfs code formatting issue, and a new feature for call_usermodehelper to make it more robust on systems that want to lock it down in a more secure way. All of these have been linux-next for a while now with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWK2jKg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymCEACgozYuqZZ/TUGW0P3xVNi7fbfUWCEAn3nYExrc XgevqeYOSKp2We6X/2JX =aZ+5 -----END PGP SIGNATURE----- Merge tag 'driver-core-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the "small" driver core patches for 4.11-rc1. Not much here, some firmware documentation and self-test updates, a debugfs code formatting issue, and a new feature for call_usermodehelper to make it more robust on systems that want to lock it down in a more secure way. All of these have been linux-next for a while now with no reported issues" * tag 'driver-core-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kernfs: handle null pointers while printing node name and path Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper() Make static usermode helper binaries constant kmod: make usermodehelper path a const string firmware: revamp firmware documentation selftests: firmware: send expected errors to /dev/null selftests: firmware: only modprobe if driver is missing platform: Print the resource range if device failed to claim kref: prefer atomic_inc_not_zero to atomic_add_unless debugfs: improve formatting of debugfs_real_fops()	2017-02-22 11:44:32 -08:00
Linus Torvalds	3051bf36c2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini Varadhan. 2) Simplify classifier state on sk_buff in order to shrink it a bit. From Willem de Bruijn. 3) Introduce SIPHASH and it's usage for secure sequence numbers and syncookies. From Jason A. Donenfeld. 4) Reduce CPU usage for ICMP replies we are going to limit or suppress, from Jesper Dangaard Brouer. 5) Introduce Shared Memory Communications socket layer, from Ursula Braun. 6) Add RACK loss detection and allow it to actually trigger fast recovery instead of just assisting after other algorithms have triggered it. From Yuchung Cheng. 7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot. 8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert. 9) Export MPLS packet stats via netlink, from Robert Shearman. 10) Significantly improve inet port bind conflict handling, especially when an application is restarted and changes it's setting of reuseport. From Josef Bacik. 11) Implement TX batching in vhost_net, from Jason Wang. 12) Extend the dummy device so that VF (virtual function) features, such as configuration, can be more easily tested. From Phil Sutter. 13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric Dumazet. 14) Add new bpf MAP, implementing a longest prefix match trie. From Daniel Mack. 15) Packet sample offloading support in mlxsw driver, from Yotam Gigi. 16) Add new aquantia driver, from David VomLehn. 17) Add bpf tracepoints, from Daniel Borkmann. 18) Add support for port mirroring to b53 and bcm_sf2 drivers, from Florian Fainelli. 19) Remove custom busy polling in many drivers, it is done in the core networking since 4.5 times. From Eric Dumazet. 20) Support XDP adjust_head in virtio_net, from John Fastabend. 21) Fix several major holes in neighbour entry confirmation, from Julian Anastasov. 22) Add XDP support to bnxt_en driver, from Michael Chan. 23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan. 24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi. 25) Support GRO in IPSEC protocols, from Steffen Klassert" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits) Revert "ath10k: Search SMBIOS for OEM board file extension" net: socket: fix recvmmsg not returning error from sock_error bnxt_en: use eth_hw_addr_random() bpf: fix unlocking of jited image when module ronx not set arch: add ARCH_HAS_SET_MEMORY config net: napi_watchdog() can use napi_schedule_irqoff() tcp: Revert "tcp: tcp_probe: use spin_lock_bh()" net/hsr: use eth_hw_addr_random() net: mvpp2: enable building on 64-bit platforms net: mvpp2: switch to build_skb() in the RX path net: mvpp2: simplify MVPP2_PRS_RI_* definitions net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT net: mvpp2: remove unused register definitions net: mvpp2: simplify mvpp2_bm_bufs_add() net: mvpp2: drop useless fields in mvpp2_bm_pool and related code net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue' net: mvpp2: release reference to txq_cpu[] entry after unmapping net: mvpp2: handle too large value in mvpp2_rx_time_coal_set() net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set() net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set ...	2017-02-22 10:15:09 -08:00
Linus Torvalds	c9341ee0af	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer updates from James Morris: "Highlights: - major AppArmor update: policy namespaces & lots of fixes - add /sys/kernel/security/lsm node for easy detection of loaded LSMs - SELinux cgroupfs labeling support - SELinux context mounts on tmpfs, ramfs, devpts within user namespaces - improved TPM 2.0 support" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (117 commits) tpm: declare tpm2_get_pcr_allocation() as static tpm: Fix expected number of response bytes of TPM1.2 PCR Extend tpm xen: drop unneeded chip variable tpm: fix misspelled "facilitate" in module parameter description tpm_tis: fix the error handling of init_tis() KEYS: Use memzero_explicit() for secret data KEYS: Fix an error code in request_master_key() sign-file: fix build error in sign-file.c with libressl selinux: allow changing labels for cgroupfs selinux: fix off-by-one in setprocattr tpm: silence an array overflow warning tpm: fix the type of owned field in cap_t tpm: add securityfs support for TPM 2.0 firmware event log tpm: enhance read_log_of() to support Physical TPM event log tpm: enhance TPM 2.0 PCR extend to support multiple banks tpm: implement TPM 2.0 capability to get active PCR banks tpm: fix RC value check in tpm2_seal_trusted tpm_tis: fix iTPM probe via probe_itpm() function tpm: Begin the process to deprecate user_read_timer tpm: remove tpm_read_index and tpm_write_index from tpm.h ...	2017-02-21 12:49:56 -08:00
Linus Torvalds	42e1b14b6e	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Implement wraparound-safe refcount_t and kref_t types based on generic atomic primitives (Peter Zijlstra) - Improve and fix the ww_mutex code (Nicolai Hähnle) - Add self-tests to the ww_mutex code (Chris Wilson) - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr Bueso) - Micro-optimize the current-task logic all around the core kernel (Davidlohr Bueso) - Tidy up after recent optimizations: remove stale code and APIs, clean up the code (Waiman Long) - ... plus misc fixes, updates and cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) fork: Fix task_struct alignment locking/spinlock/debug: Remove spinlock lockup detection code lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS lkdtm: Convert to refcount_t testing kref: Implement 'struct kref' using refcount_t refcount_t: Introduce a special purpose refcount type sched/wake_q: Clarify queue reinit comment sched/wait, rcuwait: Fix typo in comment locking/mutex: Fix lockdep_assert_held() fail locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock() locking/rwsem: Reinit wake_q after use locking/rwsem: Remove unnecessary atomic_long_t casts jump_labels: Move header guard #endif down where it belongs locking/atomic, kref: Implement kref_put_lock() locking/ww_mutex: Turn off __must_check for now locking/atomic, kref: Avoid more abuse locking/atomic, kref: Use kref_get_unless_zero() more locking/atomic, kref: Kill kref_sub() locking/atomic, kref: Add kref_read() locking/atomic, kref: Add KREF_INIT() ...	2017-02-20 13:23:30 -08:00
David S. Miller	35eeacf182	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-02-11 02:31:11 -05:00
Dan Carpenter	5217660379	KEYS: Use memzero_explicit() for secret data I don't think GCC has figured out how to optimize the memset() away, but they might eventually so let's future proof this code a bit. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-02-10 12:43:51 +11:00
Dan Carpenter	57cb17e764	KEYS: Fix an error code in request_master_key() This function has two callers and neither are able to handle a NULL return. Really, -EINVAL is the correct thing return here anyway. This fixes some static checker warnings like: security/keys/encrypted-keys/encrypted.c:709 encrypted_key_decrypt() error: uninitialized symbol 'master_key'. Fixes: `7e70cb4978` ("keys: add new key-type encrypted") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-02-10 12:43:49 +11:00
James Morris	a2a15479d6	Merge branch 'stable-4.11' of git://git.infradead.org/users/pcmoore/selinux into next	2017-02-10 10:28:49 +11:00
Stephen Smalley	0c461cb727	selinux: fix off-by-one in setprocattr SELinux tries to support setting/clearing of /proc/pid/attr attributes from the shell by ignoring terminating newlines and treating an attribute value that begins with a NUL or newline as an attempt to clear the attribute. However, the test for clearing attributes has always been wrong; it has an off-by-one error, and this could further lead to reading past the end of the allocated buffer since commit `bb646cdb12` ("proc_pid_attr_write(): switch to memdup_user()"). Fix the off-by-one error. Even with this fix, setting and clearing /proc/pid/attr attributes from the shell is not straightforward since the interface does not support multiple write() calls (so shells that write the value and newline separately will set and then immediately clear the attribute, requiring use of echo -n to set the attribute), whereas trying to use echo -n "" to clear the attribute causes the shell to skip the write() call altogether since POSIX says that a zero-length write causes no side effects. Thus, one must use echo -n to set and echo without -n to clear, as in the following example: $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate unconfined_u:object_r:user_home_t:s0 $ echo "" > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate Note the use of /proc/$$ rather than /proc/self, as otherwise the cat command will read its own attribute value, not that of the shell. There are no users of this facility to my knowledge; possibly we should just get rid of it. UPDATE: Upon further investigation it appears that a local process with the process:setfscreate permission can cause a kernel panic as a result of this bug. This patch fixes CVE-2017-2618. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: added the update about CVE-2017-2618 to the commit description] Cc: stable@vger.kernel.org # 3.5: `d6ea83ec68` Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-02-08 19:09:53 +11:00
James Morris	e2241be62d	Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/selinux into next	2017-02-08 19:01:07 +11:00
Antonio Murdaca	1ea0ce4069	selinux: allow changing labels for cgroupfs This patch allows changing labels for cgroup mounts. Previously, running chcon on cgroupfs would throw an "Operation not supported". This patch specifically whitelist cgroupfs. The patch could also allow containers to write only to the systemd cgroup for instance, while the other cgroups are kept with cgroup_t label. Signed-off-by: Antonio Murdaca <runcom@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-02-07 22:17:47 -05:00
Stephen Smalley	a050a570db	selinux: fix off-by-one in setprocattr SELinux tries to support setting/clearing of /proc/pid/attr attributes from the shell by ignoring terminating newlines and treating an attribute value that begins with a NUL or newline as an attempt to clear the attribute. However, the test for clearing attributes has always been wrong; it has an off-by-one error, and this could further lead to reading past the end of the allocated buffer since commit `bb646cdb12` ("proc_pid_attr_write(): switch to memdup_user()"). Fix the off-by-one error. Even with this fix, setting and clearing /proc/pid/attr attributes from the shell is not straightforward since the interface does not support multiple write() calls (so shells that write the value and newline separately will set and then immediately clear the attribute, requiring use of echo -n to set the attribute), whereas trying to use echo -n "" to clear the attribute causes the shell to skip the write() call altogether since POSIX says that a zero-length write causes no side effects. Thus, one must use echo -n to set and echo without -n to clear, as in the following example: $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate unconfined_u:object_r:user_home_t:s0 $ echo "" > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate Note the use of /proc/$$ rather than /proc/self, as otherwise the cat command will read its own attribute value, not that of the shell. There are no users of this facility to my knowledge; possibly we should just get rid of it. UPDATE: Upon further investigation it appears that a local process with the process:setfscreate permission can cause a kernel panic as a result of this bug. This patch fixes CVE-2017-2618. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: added the update about CVE-2017-2618 to the commit description] Cc: stable@vger.kernel.org # 3.5: `d6ea83ec68` Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-02-07 17:31:38 -05:00
Lans Zhang	20f482ab9e	ima: allow to check MAY_APPEND Otherwise some mask and inmask tokens with MAY_APPEND flag may not work as expected. Signed-off-by: Lans Zhang <jia.zhang@windriver.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2017-01-27 14:17:21 -05:00
Mimi Zohar	bc15ed663e	ima: fix ima_d_path() possible race with rename On failure to return a pathname from ima_d_path(), a pointer to dname is returned, which is subsequently used in the IMA measurement list, the IMA audit records, and other audit logging. Saving the pointer to dname for later use has the potential to race with rename. Intead of returning a pointer to dname on failure, this patch returns a pointer to a copy of the filename. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: stable@vger.kernel.org	2017-01-27 14:16:02 -05:00
James Morris	710584b9da	Merge branch 'smack-for-4.11' of git://github.com/cschaufler/smack-next into next	2017-01-27 09:23:21 +11:00
Krister Johansen	4548b683b7	Introduce a sysctl that modifies the value of PROT_SOCK. Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl that denotes the first unprivileged inet port in the namespace. To disable all privileged ports set this to zero. It also checks for overlap with the local port range. The privileged and local range may not overlap. The use case for this change is to allow containerized processes to bind to priviliged ports, but prevent them from ever being allowed to modify their container's network configuration. The latter is accomplished by ensuring that the network namespace is not a child of the user namespace. This modification was needed to allow the container manager to disable a namespace's priviliged port restrictions without exposing control of the network namespace to processes in the user namespace. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-24 12:10:51 -05:00
Eric W. Biederman	9227dd2a84	exec: Remove LSM_UNSAFE_PTRACE_CAP With previous changes every location that tests for LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the LSM_UNSAFE_PTRACE_CAP redundant, so remove it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-01-24 12:03:08 +13:00
Eric W. Biederman	20523132ec	exec: Test the ptracer's saved cred to see if the tracee can gain caps Now that we have user namespaces and non-global capabilities verify the tracer has capabilities in the relevant user namespace instead of in the current_user_ns(). As the test for setting LSM_UNSAFE_PTRACE_CAP is currently ptracer_capable(p, current_user_ns()) and the new task credentials are in current_user_ns() this change does not have any user visible change and simply moves the test to where it is used, making the code easier to read. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-01-24 12:03:08 +13:00
Eric W. Biederman	70169420f5	exec: Don't reset euid and egid when the tracee has CAP_SETUID Don't reset euid and egid when the tracee has CAP_SETUID in it's user namespace. I punted on relaxing this permission check long ago but now that I have read this code closely it is clear it is safe to test against CAP_SETUID in the user namespace. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-01-24 12:03:07 +13:00
Greg Kroah-Hartman	64e90a8acb	Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper() Some usermode helper applications are defined at kernel build time, while others can be changed at runtime. To provide a sane way to filter these, add a new kernel option "STATIC_USERMODEHELPER". This option routes all call_usermodehelper() calls through this binary, no matter what the caller wishes to have called. The new binary (by default set to /sbin/usermode-helper, but can be changed through the STATIC_USERMODEHELPER_PATH option) can properly filter the requested programs to be run by the kernel by looking at the first argument that is passed to it. All other options should then be passed onto the proper program if so desired. To disable all call_usermodehelper() calls by the kernel, set STATIC_USERMODEHELPER_PATH to an empty string. Thanks to Neil Brown for the idea of this feature. Cc: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-01-19 12:59:45 +01:00
Greg Kroah-Hartman	377e7a27c0	Make static usermode helper binaries constant There are a number of usermode helper binaries that are "hard coded" in the kernel today, so mark them as "const" to make it harder for someone to change where the variables point to. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Sailer <t.sailer@alumni.ethz.ch> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Johan Hovold <johan@kernel.org> Cc: Alex Elder <elder@kernel.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-01-19 12:59:45 +01:00
Casey Schaufler	d69dece5f5	LSM: Add /sys/kernel/security/lsm I am still tired of having to find indirect ways to determine what security modules are active on a system. I have added /sys/kernel/security/lsm, which contains a comma separated list of the active security modules. No more groping around in /proc/filesystems or other clever hacks. Unchanged from previous versions except for being updated to the latest security next branch. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-01-19 13:18:29 +11:00
John Johansen	3ccb76c5df	apparmor: fix undefined reference to `aa_g_hash_policy' The kernel build bot turned up a bad config combination when CONFIG_SECURITY_APPARMOR is y and CONFIG_SECURITY_APPARMOR_HASH is n, resulting in the build error security/built-in.o: In function `aa_unpack': (.text+0x841e2): undefined reference to `aa_g_hash_policy' Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 13:21:27 -08:00
John Johansen	e6bfa25deb	apparmor: replace remaining BUG_ON() asserts with AA_BUG() AA_BUG() uses WARN and won't break the kernel like BUG_ON(). Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:56 -08:00
John Johansen	2c17cd3681	apparmor: fix restricted endian type warnings for policy unpack Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:55 -08:00
John Johansen	e6e8bf4188	apparmor: fix restricted endian type warnings for dfa unpack Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:54 -08:00
John Johansen	ca4bd5ae0a	apparmor: add check for apparmor enabled in module parameters missing it Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:53 -08:00
John Johansen	d4669f0b03	apparmor: add per cpu work buffers to avoid allocating buffers at every hook Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:53 -08:00
Tyler Hicks	e3ea1ca59a	apparmor: sysctl to enable unprivileged user ns AppArmor policy loading If this sysctl is set to non-zero and a process with CAP_MAC_ADMIN in the root namespace has created an AppArmor policy namespace, unprivileged processes will be able to change to a profile in the newly created AppArmor policy namespace and, if the profile allows CAP_MAC_ADMIN and appropriate file permissions, will be able to load policy in the respective policy namespace. Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:52 -08:00
William Hua	e025be0f26	apparmor: support querying extended trusted helper extra data Allow a profile to carry extra data that can be queried via userspace. This provides a means to store extra data in a profile that a trusted helper can extract and use from live policy. Signed-off-by: William Hua <william.hua@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:51 -08:00
John Johansen	12eb87d50b	apparmor: update cap audit to check SECURITY_CAP_NOAUDIT apparmor should be checking the SECURITY_CAP_NOAUDIT constant. Also in complain mode make it so apparmor can elect to log a message, informing of the check. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:50 -08:00
John Johansen	31f75bfecd	apparmor: make computing policy hashes conditional on kernel parameter Allow turning off the computation of the policy hashes via the apparmor.hash_policy kernel parameter. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:50 -08:00
John Johansen	aa9a39ad8f	apparmor: convert change_profile to use fqname later to give better control Moving the use of fqname to later allows learning profiles to be based on the fqname request instead of just the hname. It also allows cleaning up some of the name parsing and lookup by allowing the use of the fqlookupn_profile() lib fn. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:49 -08:00
John Johansen	c3e1e584ad	apparmor: fix change_hat debug output Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:48 -08:00
John Johansen	5ef50d014c	apparmor: remove unused op parameter from simple_write_to_buffer() Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:48 -08:00
John Johansen	ef88a7ac55	apparmor: change aad apparmor_audit_data macro to a fn macro The aad macro can replace aad strings when it is not intended to. Switch to a fn macro so it is only applied when intended. Also at the same time cleanup audit_data initialization by putting common boiler plate behind a macro, and dropping the gfp_t parameter which will become useless. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:47 -08:00
John Johansen	47f6e5cc73	apparmor: change op from int to const char * Having ops be an integer that is an index into an op name table is awkward and brittle. Every op change requires an edit for both the op constant and a string in the table. Instead switch to using const strings directly, eliminating the need for the table that needs to be kept in sync. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:46 -08:00
John Johansen	55a26ebf63	apparmor: rename context abreviation cxt to the more standard ctx Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:45 -08:00
John Johansen	a20aa95fbe	apparmor: fail task profile update if current_cred isn't real_cred Trying to update the task cred while the task current cred is not the real cred will result in an error at the cred layer. Avoid this by failing early and delaying the update. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:45 -08:00
John Johansen	b7fd2c0340	apparmor: add per policy ns .load, .replace, .remove interface files Having per policy ns interface files helps with containers restoring policy. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:44 -08:00
John Johansen	12dd7171d6	apparmor: pass the subject profile into profile replace/remove This is just setup for new ns specific .load, .replace, .remove interface files. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:43 -08:00
John Johansen	04dc715e24	apparmor: audit policy ns specified in policy load Verify that profiles in a load set specify the same policy ns and audit the name of the policy ns that policy is being loaded for. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:43 -08:00
John Johansen	5ac8c355ae	apparmor: allow introspecting the loaded policy pre internal transform Store loaded policy and allow introspecting it through apparmorfs. This has several uses from debugging, policy validation, and policy checkpoint and restore for containers. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:42 -08:00
John Johansen	fc1c9fd10a	apparmor: add ns name to the audit data for policy loads Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:41 -08:00
John Johansen	078c73c63f	apparmor: add profile and ns params to aa_may_manage_policy() Policy management will be expanded beyond traditional unconfined root. This will require knowning the profile of the task doing the management and the ns view. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:40 -08:00
John Johansen	fd2a80438d	apparmor: add ns being viewed as a param to policy_admin_capable() Prepare for a tighter pairing of user namespaces and apparmor policy namespaces, by making the ns to be viewed available. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:40 -08:00
John Johansen	2bd8dbbf22	apparmor: add ns being viewed as a param to policy_view_capable() Prepare for a tighter pairing of user namespaces and apparmor policy namespaces, by making the ns to be viewed available and checking that the user namespace level is the same as the policy ns level. This strict pairing will be relaxed once true support of user namespaces lands. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:39 -08:00
John Johansen	a6f233003b	apparmor: allow specifying the profile doing the management Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:38 -08:00
John Johansen	3e3e569539	apparmor: allow introspecting the policy namespace name Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:37 -08:00
John Johansen	b79473f2de	apparmor: Make aa_remove_profile() callable from a different view This is prep work for fs operations being able to remove namespaces. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:37 -08:00
John Johansen	ee2351e4b0	apparmor: track ns level so it can be used to help in view checks Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:36 -08:00
John Johansen	a71ada3058	apparmor: add special .null file used to "close" fds at exec Borrow the special null device file from selinux to "close" fds that don't have sufficient permissions at exec time. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:35 -08:00
John Johansen	34c426acb7	apparmor: provide userspace flag indicating binfmt_elf_mmap change Commit `9f834ec18d` ("binfmt_elf: switch to new creds when switching to new mm") changed when the creds are installed by the binfmt_elf handler. This affects which creds are used to mmap the executable into the address space. Which can have an affect on apparmor policy. Add a flag to apparmor at /sys/kernel/security/apparmor/features/domain/fix_binfmt_elf_mmap to make it possible to detect this semantic change so that the userspace tools and the regression test suite can correctly deal with the change. BugLink: http://bugs.launchpad.net/bugs/1630069 Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:35 -08:00
John Johansen	11c236b89d	apparmor: add a default null dfa Instead of testing whether a given dfa exists in every code path, have a default null dfa that is used when loaded policy doesn't provide a dfa. This will let us get rid of special casing and avoid dereference bugs when special casing is missed. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:34 -08:00
John Johansen	6604d4c1c1	apparmor: allow policydb to be used as the file dfa Newer policy will combine the file and policydb dfas, allowing for better optimizations. However to support older policy we need to keep the ability to address the "file" dfa separately. So dup the policydb as if it is the file dfa and set the appropriate start state. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:33 -08:00
John Johansen	293a4886f9	apparmor: add get_dfa() fn The dfa is currently setup to be shared (has the basis of refcounting) but currently can't be because the count can't be increased. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:32 -08:00
John Johansen	474d6b7510	apparmor: prepare to support newer versions of policy Newer policy encodes more than just version in the version tag, so add masking to make sure the comparison remains correct. Note: this is fully compatible with older policy as it will never set the bits being masked out. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:32 -08:00
John Johansen	5ebfb12822	apparmor: add support for force complain flag to support learning mode Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:31 -08:00
John Johansen	abbf873403	apparmor: remove paranoid load switch Policy should always under go a full paranoid verification. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:30 -08:00
John Johansen	181f7c9776	apparmor: name null-XXX profiles after the executable When possible its better to name a learning profile after the missing profile in question. This allows for both more informative names and for profile reuse. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:30 -08:00
John Johansen	30b026a8d1	apparmor: pass gfp_t parameter into profile allocation Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:29 -08:00
John Johansen	73688d1ed0	apparmor: refactor prepare_ns() and make usable from different views prepare_ns() will need to be called from alternate views, and namespaces will need to be created via different interfaces. So refactor and allow specifying the view ns. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:28 -08:00
John Johansen	5fd1b95fc9	apparmor: update policy_destroy to use new debug asserts Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:27 -08:00
John Johansen	d102d89571	apparmor: pass gfp param into aa_policy_init() Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:27 -08:00
John Johansen	bbe4a7c873	apparmor: constify policy name and hname Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:26 -08:00
John Johansen	6e474e3063	apparmor: rename hname_tail to basename Rename to the shorter and more familiar shell cmd name Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:25 -08:00
John Johansen	efeee83a70	apparmor: rename mediated_filesystem() to path_mediated_fs() Rename to indicate the test is only about whether path mediation is used, not whether other types of mediation might be used. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:24 -08:00
John Johansen	680cd62e91	apparmor: add debug assert AA_BUG and Kconfig to control debug info Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:24 -08:00
John Johansen	57e36bbd67	apparmor: add macro for bug asserts to check that a lock is held Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:23 -08:00
John Johansen	92b6d8eff5	apparmor: allow ns visibility question to consider subnses Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:22 -08:00
John Johansen	31617ddfdd	apparmor: add fn to lookup profiles by fqname Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:22 -08:00
John Johansen	3b0aaf5866	apparmor: add lib fn to find the "split" for fqnames Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:21 -08:00
John Johansen	9a2d40c12d	apparmor: add strn version of aa_find_ns Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:20 -08:00
John Johansen	1741e9eb8c	apparmor: add strn version of lookup_profile fn Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:19 -08:00
John Johansen	8399588a7f	apparmor: rename replacedby to proxy Proxy is shorter and a better fit than replaceby, so rename it. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:18:19 -08:00
John Johansen	d97d51d253	apparmor: rename PFLAG_INVALID to PFLAG_STALE Invalid does not convey the meaning of the flag anymore so rename it. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 01:16:37 -08:00
John Johansen	121d4a91e3	apparmor: rename sid to secid Move to common terminology with other LSMs and kernel infrastucture Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 00:42:17 -08:00
John Johansen	98849dff90	apparmor: rename namespace to ns to improve code line lengths Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 00:42:16 -08:00
John Johansen	cff281f686	apparmor: split apparmor policy namespaces code into its own file Policy namespaces will be diverging from profile management and expanding so put it in its own file. Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 00:42:15 -08:00
John Johansen	fe6bb31f59	apparmor: split out shared policy_XXX fns to lib Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 00:42:14 -08:00
John Johansen	12557dcba2	apparmor: move lib definitions into separate lib include Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-16 00:42:13 -08:00
Kees Cook	8486adf0d7	apparmor: use designated initializers Prepare to mark sensitive kernel structures for randomization by making sure they're using designated initializers. These were identified during allyesconfig builds of x86, arm, and arm64, with most initializer fixes extracted from grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-15 20:00:32 -08:00
Tetsuo Handa	a7f6c1b63b	AppArmor: Use GFP_KERNEL for __aa_kvmalloc(). Calling kmalloc(GFP_NOIO) with order == PAGE_ALLOC_COSTLY_ORDER is not recommended because it might fall into infinite retry loop without invoking the OOM killer. Since aa_dfa_unpack() is the only caller of kvzalloc() and aa_dfa_unpack() which is calling kvzalloc() via unpack_table() is doing kzalloc(GFP_KERNEL), it is safe to use GFP_KERNEL from __aa_kvmalloc(). Since aa_simple_write_to_buffer() is the only caller of kvmalloc() and aa_simple_write_to_buffer() is calling copy_from_user() which is GFP_KERNEL context (see memdup_user_nul()), it is safe to use GFP_KERNEL from __aa_kvmalloc(). Therefore, replace GFP_NOIO with GFP_KERNEL. Also, since we have vmalloc() fallback, add __GFP_NORETRY so that we don't invoke the OOM killer by kmalloc(GFP_KERNEL) with order == PAGE_ALLOC_COSTLY_ORDER. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: John Johansen <john.johansen@canonical.com>	2017-01-15 13:41:09 -08:00
Peter Zijlstra	6b1ffa06e5	locking/atomic, kref: Use kref_get_unless_zero() more For some obscure reason apparmor thinks its needs to locally implement kref primitives that already exist. Stop doing this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:37:20 +01:00
Stephen Smalley	3a2f5a59a6	security,selinux,smack: kill security_task_wait hook As reported by yangshukui, a permission denial from security_task_wait() can lead to a soft lockup in zap_pid_ns_processes() since it only expects sys_wait4() to return 0 or -ECHILD. Further, security_task_wait() can in general lead to zombies; in the absence of some way to automatically reparent a child process upon a denial, the hook is not useful. Remove the security hook and its implementations in SELinux and Smack. Smack already removed its check from its hook. Reported-by: yangshukui <yangshukui@huawei.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-12 11:10:57 -05:00
Stephen Smalley	b4ba35c75a	selinux: drop unused socket security classes Several of the extended socket classes introduced by commit `da69a5306a` ("selinux: support distinctions among all network address families") are never used because sockets can never be created with the associated address family. Remove these unused socket security classes. The removed classes are bridge_socket for PF_BRIDGE, ib_socket for PF_IB, and mpls_socket for PF_MPLS. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-12 11:10:24 -05:00
Seung-Woo Kim	83a1e53f39	Smack: ignore private inode for file functions The access to fd from anon_inode is always failed because there is no set xattr operations. So this patch fixes to ignore private inode including anon_inode for file functions. It was only ignored for smack_file_receive() to share dma-buf fd, but dma-buf has other functions like ioctl and mmap. Reference: https://lkml.org/lkml/2015/4/17/16 Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Rafal Krypa	805b65a80b	Smack: fix d_instantiate logic for sockfs and pipefs Since `4b936885a` (v2.6.32) all inodes on sockfs and pipefs are disconnected. It caused filesystem specific code in smack_d_instantiate to be skipped, because all inodes on those pseudo filesystems were treated as root inodes. As a result all sockfs inodes had the Smack label set to floor. In most cases access checks for sockets use socket_smack data so the inode label is not important. But there are special cases that were broken. One example would be calling fcntl with F_SETOWN command on a socket fd. Now smack_d_instantiate expects all pipefs and sockfs inodes to be disconnected and has the logic in appropriate place. Signed-off-by: Rafal Krypa <r.krypa@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Himanshu Shukla	c9d238a18b	SMACK: Use smk_tskacc() instead of smk_access() for proper logging smack_file_open() is first checking the capability of calling subject, this check will skip the SMACK logging for success case. Use smk_tskacc() for proper logging and SMACK access check. Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Vishal Goel	348dc288d4	Smack: Traverse the smack_known_list using list_for_each_entry_rcu macro In smack_from_secattr function,"smack_known_list" is being traversed using list_for_each_entry macro, although it is a rcu protected structure. So it should be traversed using "list_for_each_entry_rcu" macro to fetch the rcu protected entry. Signed-off-by: Vishal Goel <vishal.goel@samsung.com> Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Himanshu Shukla	3d4f673a69	SMACK: Free the i_security blob in inode using RCU There is race condition issue while freeing the i_security blob in SMACK module. There is existing condition where i_security can be freed while inode_permission is called from path lookup on second CPU. There has been observed the page fault with such condition. VFS code and Selinux module takes care of this condition by freeing the inode and i_security field using RCU via call_rcu(). But in SMACK directly the i_secuirty blob is being freed. Use call_rcu() to fix this race condition issue. Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Signed-off-by: Vishal Goel <vishal.goel@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Himanshu Shukla	d54a197964	SMACK: Delete list_head repeated initialization smk_copy_rules() and smk_copy_relabel() are initializing list_head though they have been initialized already in new_task_smack() function. Delete repeated initialization. Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Vishal Goel	2e962e2fec	SMACK: Add new lock for adding entry in smack master list "smk_set_access()" function adds a new rule entry in subject label specific list(rule_list) and in global rule list(smack_rule_list) both. Mutex lock (rule_lock) is used to avoid simultaneous updates. But this lock is subject label specific lock. If 2 processes tries to add different rules(i.e with different subject labels) simultaneously, then both the processes can take the "rule_lock" respectively. So it will cause a problem while adding entries in master rule list. Now a new mutex lock(smack_master_list_lock) has been taken to add entry in smack_rule_list to avoid simultaneous updates of different rules. Signed-off-by: Vishal Goel <vishal.goel@samsung.com> Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Vishal Goel	0c96d1f532	Smack: Fix the issue of wrong SMACK label update in socket bind fail case Fix the issue of wrong SMACK label (SMACK64IPIN) update when a second bind call is made to same IP address & port, but with different SMACK label (SMACK64IPIN) by second instance of server. In this case server returns with "Bind:Address already in use" error but before returning, SMACK label is updated in SMACK port-label mapping list inside smack_socket_bind() hook To fix this issue a new check has been added in smk_ipv6_port_label() function before updating the existing port entry. It checks whether the socket for matching port entry is closed or not. If it is closed then it means port is not bound and it is safe to update the existing port entry else return if port is still getting used. For checking whether socket is closed or not, one more field "smk_can_reuse" has been added in the "smk_port_label" structure. This field will be set to '1' in "smack_sk_free_security()" function which is called to free the socket security blob when the socket is being closed. In this function, port entry is searched in the SMACK port-label mapping list for the closing socket. If entry is found then "smk_can_reuse" field is set to '1'.Initially "smk_can_reuse" field is set to '0' in smk_ipv6_port_label() function after creating a new entry in the list which indicates that socket is in use. Signed-off-by: Vishal Goel <vishal.goel@samsung.com> Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Vishal Goel	9d44c97384	Smack: Fix the issue of permission denied error in ipv6 hook Permission denied error comes when 2 IPv6 servers are running and client tries to connect one of them. Scenario is that both servers are using same IP and port but different protocols(Udp and tcp). They are using different SMACK64IPIN labels.Tcp server is using "test" and udp server is using "test-in". When we try to run tcp client with SMACK64IPOUT label as "test", then connection denied error comes. It should not happen since both tcp server and client labels are same.This happens because there is no check for protocol in smk_ipv6_port_label() function while searching for the earlier port entry. It checks whether there is an existing port entry on the basis of port only. So it updates the earlier port entry in the list. Due to which smack label gets changed for earlier entry in the "smk_ipv6_port_list" list and permission denied error comes. Now a check is added for socket type also.Now if 2 processes use same port but different protocols (tcp or udp), then 2 different port entries will be added in the list. Similarly while checking smack access in smk_ipv6_port_check() function, port entry is searched on the basis of both port and protocol. Signed-off-by: Vishal Goel <vishal.goel@samsung.com> Signed-off-by: Himanshu Shukla <Himanshu.sh@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Vishal Goel	3c7ce34271	SMACK: Add the rcu synchronization mechanism in ipv6 hooks Add the rcu synchronization mechanism for accessing smk_ipv6_port_list in smack IPv6 hooks. Access to the port list is vulnerable to a race condition issue,it does not apply proper synchronization methods while working on critical section. It is possible that when one thread is reading the list, at the same time another thread is modifying the same port list, which can cause the major problems. To ensure proper synchronization between two threads, rcu mechanism has been applied while accessing and modifying the port list. RCU will also not affect the performance, as there are more accesses than modification where RCU is most effective synchronization mechanism. Signed-off-by: Vishal Goel <vishal.goel@samsung.com> Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2017-01-10 09:47:20 -08:00
Gary Tierney	900fde06cb	selinux: default to security isid in sel_make_bools() if no sid is found Use SECINITSID_SECURITY as the default SID for booleans which don't have a matching SID returned from security_genfs_sid(), also update the error message to a warning which matches this. This prevents the policy failing to load (and consequently the system failing to boot) when there is no default genfscon statement matched for the selinuxfs in the new policy. Signed-off-by: Gary Tierney <gary.tierney@gmx.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:32 -05:00
Gary Tierney	4262fb51c9	selinux: log errors when loading new policy Adds error logging to the code paths which can fail when loading a new policy in sel_write_load(). If the policy fails to be loaded from userspace then a warning message is printed, whereas if a failure occurs after loading policy from userspace an error message will be printed with details on where policy loading failed (recreating one of /classes/, /policy_capabilities/, /booleans/ in the SELinux fs). Also, if sel_make_bools() fails to obtain an SID for an entry in /booleans/* an error will be printed indicating the path of the boolean. Signed-off-by: Gary Tierney <gary.tierney@gmx.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	b21507e272	proc,security: move restriction on writing /proc/pid/attr nodes to proc Processes can only alter their own security attributes via /proc/pid/attr nodes. This is presently enforced by each individual security module and is also imposed by the Linux credentials implementation, which only allows a task to alter its own credentials. Move the check enforcing this restriction from the individual security modules to proc_pid_attr_write() before calling the security hook, and drop the unnecessary task argument to the security hook since it can only ever be the current task. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	be0554c9bf	selinux: clean up cred usage and simplify SELinux was sometimes using the task "objective" credentials when it could/should use the "subjective" credentials. This was sometimes hidden by the fact that we were unnecessarily passing around pointers to the current task, making it appear as if the task could be something other than current, so eliminate all such passing of current. Inline various permission checking helper functions that can be reduced to a single avc_has_perm() call. Since the credentials infrastructure only allows a task to alter its own credentials, we can always assume that current must be the same as the target task in selinux_setprocattr after the check. We likely should move this check from selinux_setprocattr() to proc_pid_attr_write() and drop the task argument to the security hook altogether; it can only serve to confuse things. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	01593d3299	selinux: allow context mounts on tmpfs, ramfs, devpts within user namespaces commit `aad82892af` ("selinux: Add support for unprivileged mounts from user namespaces") prohibited any use of context mount options within non-init user namespaces. However, this breaks use of context mount options for tmpfs mounts within user namespaces, which are being used by Docker/runc. There is no reason to block such usage for tmpfs, ramfs or devpts. Exempt these filesystem types from this restriction. Before: sh$ userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' bash sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp mount: tmpfs is write-protected, mounting read-only mount: cannot mount tmpfs read-only After: sh$ userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' bash sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp sh# ls -Zd /tmp unconfined_u:object_r:user_tmp_t:s0:c13 /tmp Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Stephen Smalley	ef37979a2c	selinux: handle ICMPv6 consistently with ICMP commit 79c8b348f215 ("selinux: support distinctions among all network address families") mapped datagram ICMP sockets to the new icmp_socket security class, but left ICMPv6 sockets unchanged. This change fixes that oversight to handle both kinds of sockets consistently. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:31 -05:00
Yongqin Liu	a2c7c6fbe5	selinux: add security in-core xattr support for tracefs Since kernel 4.1 ftrace is supported as a new separate filesystem. It gets automatically mounted by the kernel under the old path /sys/kernel/debug/tracing. Because it lives now on a separate filesystem SELinux needs to be updated to also support setting SELinux labels on tracefs inodes. This is required for compatibility in Android when moving to Linux 4.1 or newer. Signed-off-by: Yongqin Liu <yongqin.liu@linaro.org> Signed-off-by: William Roberts <william.c.roberts@intel.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:30 -05:00
Stephen Smalley	da69a5306a	selinux: support distinctions among all network address families Extend SELinux to support distinctions among all network address families implemented by the kernel by defining new socket security classes and mapping to them. Otherwise, many sockets are mapped to the generic socket class and are indistinguishable in policy. This has come up previously with regard to selectively allowing access to bluetooth sockets, and more recently with regard to selectively allowing access to AF_ALG sockets. Guido Trentalancia submitted a patch that took a similar approach to add only support for distinguishing AF_ALG sockets, but this generalizes his approach to handle all address families implemented by the kernel. Socket security classes are also added for ICMP and SCTP sockets. Socket security classes were not defined for AF_* values that are reserved but unimplemented in the kernel, e.g. AF_NETBEUI, AF_SECURITY, AF_ASH, AF_ECONET, AF_SNA, AF_WANPIPE. Backward compatibility is provided by only enabling the finer-grained socket classes if a new policy capability is set in the policy; older policies will behave as before. The legacy redhat1 policy capability that was only ever used in testing within Fedora for ptrace_child is reclaimed for this purpose; as far as I can tell, this policy capability is not enabled in any supported distro policy. Add a pair of conditional compilation guards to detect when new AF_* values are added so that we can update SELinux accordingly rather than having to belatedly update it long after new address families are introduced. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-01-09 10:07:30 -05:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Linus Torvalds	67327145c4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull SElinux fix from James Morris: "From Paul: 'A small SELinux patch to fix some clang/llvm compiler warnings and ensure the tools under scripts work well in the face of kernel changes'" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: selinux: use the kernel headers when building scripts/selinux	2016-12-22 10:03:52 -08:00
Paul Moore	bfc5e3a6af	selinux: use the kernel headers when building scripts/selinux Commit `3322d0d64f` ("selinux: keep SELinux in sync with new capability definitions") added a check on the defined capabilities without explicitly including the capability header file which caused problems when building genheaders for users of clang/llvm. Resolve this by using the kernel headers when building genheaders, which is arguably the right thing to do regardless, and explicitly including the kernel's capability.h header file in classmap.h. We also update the mdp build, even though it wasn't causing an error we really should be using the headers from the kernel we are building. Reported-by: Nicolas Iooss <nicolas.iooss@m4x.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-12-21 10:39:25 -05:00
Andreas Steffen	98e1d55d03	ima: platform-independent hash value For remote attestion it is important for the ima measurement values to be platform-independent. Therefore integer fields to be hashed must be converted to canonical format. Link: http://lkml.kernel.org/r/1480554346-29071-11-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Andreas Steffen <andreas.steffen@strongswan.org> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:46 -08:00
Mimi Zohar	d68a6fe9fc	ima: define a canonical binary_runtime_measurements list format The IMA binary_runtime_measurements list is currently in platform native format. To allow restoring a measurement list carried across kexec with a different endianness than the targeted kernel, this patch defines little-endian as the canonical format. For big endian systems wanting to save/restore the measurement list from a system with a different endianness, a new boot command line parameter named "ima_canonical_fmt" is defined. Considerations: use of the "ima_canonical_fmt" boot command line option will break existing userspace applications on big endian systems expecting the binary_runtime_measurements list to be in platform native format. Link: http://lkml.kernel.org/r/1480554346-29071-10-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:45 -08:00
Mimi Zohar	c7d0936770	ima: support restoring multiple template formats The configured IMA measurement list template format can be replaced at runtime on the boot command line, including a custom template format. This patch adds support for restoring a measuremement list containing multiple builtin/custom template formats. Link: http://lkml.kernel.org/r/1480554346-29071-9-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:45 -08:00
Mimi Zohar	3f23d624de	ima: store the builtin/custom template definitions in a list The builtin and single custom templates are currently stored in an array. In preparation for being able to restore a measurement list containing multiple builtin/custom templates, this patch stores the builtin and custom templates as a linked list. This will permit defining more than one custom template per boot. Link: http://lkml.kernel.org/r/1480554346-29071-8-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:45 -08:00
Mimi Zohar	7b8589cc29	ima: on soft reboot, save the measurement list The TPM PCRs are only reset on a hard reboot. In order to validate a TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list of the running kernel must be saved and restored on boot. This patch uses the kexec buffer passing mechanism to pass the serialized IMA binary_runtime_measurements to the next kernel. Link: http://lkml.kernel.org/r/1480554346-29071-7-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:44 -08:00
Mimi Zohar	d158847ae8	ima: maintain memory size needed for serializing the measurement list In preparation for serializing the binary_runtime_measurements, this patch maintains the amount of memory required. Link: http://lkml.kernel.org/r/1480554346-29071-5-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:44 -08:00
Mimi Zohar	dcfc56937b	ima: permit duplicate measurement list entries Measurements carried across kexec need to be added to the IMA measurement list, but should not prevent measurements of the newly booted kernel from being added to the measurement list. This patch adds support for allowing duplicate measurements. The "boot_aggregate" measurement entry is the delimiter between soft boots. Link: http://lkml.kernel.org/r/1480554346-29071-4-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:43 -08:00
Mimi Zohar	94c3aac567	ima: on soft reboot, restore the measurement list The TPM PCRs are only reset on a hard reboot. In order to validate a TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list of the running kernel must be saved and restored on boot. This patch restores the measurement list. Link: http://lkml.kernel.org/r/1480554346-29071-3-git-send-email-zohar@linux.vnet.ibm.com Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-20 09:48:43 -08:00
Linus Torvalds	9a19a6db37	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: - more ->d_init() stuff (work.dcache) - pathname resolution cleanups (work.namei) - a few missing iov_iter primitives - copy_from_iter_full() and friends. Either copy the full requested amount, advance the iterator and return true, or fail, return false and do _not_ advance the iterator. Quite a few open-coded callers converted (and became more readable and harder to fuck up that way) (work.iov_iter) - several assorted patches, the big one being logfs removal * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: logfs: remove from tree vfs: fix put_compat_statfs64() does not handle errors namei: fold should_follow_link() with the step into not-followed link namei: pass both WALK_GET and WALK_MORE to should_follow_link() namei: invert WALK_PUT logics namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link() namei: saner calling conventions for mountpoint_last() namei.c: get rid of user_path_parent() switch getfrag callbacks to ..._full() primitives make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success [iov_iter] new primitives - copy_from_iter_full() and friends don't open-code file_inode() ceph: switch to use of ->d_init() ceph: unify dentry_operations instances lustre: switch to use of ->d_init()	2016-12-16 10:24:44 -08:00
Al Viro	c4364f837c	Merge branches 'work.namei', 'work.dcache' and 'work.iov_iter' into for-linus	2016-12-15 01:07:29 -05:00
Linus Torvalds	a57cb1c1d7	Merge branch 'akpm' (patches from Andrew) Merge more updates from Andrew Morton: - a few misc things - kexec updates - DMA-mapping updates to better support networking DMA operations - IPC updates - various MM changes to improve DAX fault handling - lots of radix-tree changes, mainly to the test suite. All leading up to reimplementing the IDA/IDR code to be a wrapper layer over the radix-tree. However the final trigger-pulling patch is held off for 4.11. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits) radix tree test suite: delete unused rcupdate.c radix tree test suite: add new tag check radix-tree: ensure counts are initialised radix tree test suite: cache recently freed objects radix tree test suite: add some more functionality idr: reduce the number of bits per level from 8 to 6 rxrpc: abstract away knowledge of IDR internals tpm: use idr_find(), not idr_find_slowpath() idr: add ida_is_empty radix tree test suite: check multiorder iteration radix-tree: fix replacement for multiorder entries radix-tree: add radix_tree_split_preload() radix-tree: add radix_tree_split radix-tree: add radix_tree_join radix-tree: delete radix_tree_range_tag_if_tagged() radix-tree: delete radix_tree_locate_item() radix-tree: improve multiorder iterators btrfs: fix race in btrfs_free_dummy_fs_info() radix-tree: improve dump output radix-tree: make radix_tree_find_next_bit more useful ...	2016-12-14 17:25:18 -08:00
Lorenzo Stoakes	5b56d49fc3	mm: add locked parameter to get_user_pages_remote() Patch series "mm: unexport __get_user_pages_unlocked()". This patch series continues the cleanup of get_user_pages() functions taking advantage of the fact we can now pass gup_flags as we please. It firstly adds an additional 'locked' parameter to get_user_pages_remote() to allow for its callers to utilise VM_FAULT_RETRY functionality. This is necessary as the invocation of __get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of this and no other existing higher level function would allow it to do so. Secondly existing callers of __get_user_pages_unlocked() are replaced with the appropriate higher-level replacement - get_user_pages_unlocked() if the current task and memory descriptor are referenced, or get_user_pages_remote() if other task/memory descriptors are referenced (having acquiring mmap_sem.) This patch (of 2): Add a int locked parameter to get_user_pages_remote() to allow VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked(). Taking into account the previous adjustments to get_user_pages*() functions allowing for the passing of gup_flags, we are now in a position where __get_user_pages_unlocked() need only be exported for his ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to subsequently unexport __get_user_pages_unlocked() as well as allowing for future flexibility in the use of get_user_pages_remote(). [sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change] Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-14 16:04:08 -08:00
Linus Torvalds	412ac77a9d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "After a lot of discussion and work we have finally reachanged a basic understanding of what is necessary to make unprivileged mounts safe in the presence of EVM and IMA xattrs which the last commit in this series reflects. While technically it is a revert the comments it adds are important for people not getting confused in the future. Clearing up that confusion allows us to seriously work on unprivileged mounts of fuse in the next development cycle. The rest of the fixes in this set are in the intersection of user namespaces, ptrace, and exec. I started with the first fix which started a feedback cycle of finding additional issues during review and fixing them. Culiminating in a fix for a bug that has been present since at least Linux v1.0. Potentially these fixes were candidates for being merged during the rc cycle, and are certainly backport candidates but enough little things turned up during review and testing that I decided they should be handled as part of the normal development process just to be certain there were not any great surprises when it came time to backport some of these fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC" exec: Ensure mm->user_ns contains the execed files ptrace: Don't allow accessing an undumpable mm ptrace: Capture the ptracer's creds not PT_PTRACE_CAP mm: Add a user_ns owner to mm_struct and fix ptrace permission checks	2016-12-14 14:09:48 -08:00
Linus Torvalds	683b96f4d1	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Generally pretty quiet for this release. Highlights: Yama: - allow ptrace access for original parent after re-parenting TPM: - add documentation - many bugfixes & cleanups - define a generic open() method for ascii & bios measurements Integrity: - Harden against malformed xattrs SELinux: - bugfixes & cleanups Smack: - Remove unnecessary smack_known_invalid label - Do not apply star label in smack_setprocattr hook - parse mnt opts after privileges check (fixes unpriv DoS vuln)" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (56 commits) Yama: allow access for the current ptrace parent tpm: adjust return value of tpm_read_log tpm: vtpm_proxy: conditionally call tpm_chip_unregister tpm: Fix handling of missing event log tpm: Check the bios_dir entry for NULL before accessing it tpm: return -ENODEV if np is not set tpm: cleanup of printk error messages tpm: replace of_find_node_by_name() with dev of_node property tpm: redefine read_log() to handle ACPI/OF at runtime tpm: fix the missing .owner in tpm_bios_measurements_ops tpm: have event log use the tpm_chip tpm: drop tpm1_chip_register(/unregister) tpm: replace dynamically allocated bios_dir with a static array tpm: replace symbolic permission with octal for securityfs files char: tpm: fix kerneldoc tpm2_unseal_trusted name typo tpm_tis: Allow tpm_tis to be bound using DT tpm, tpm_vtpm_proxy: add kdoc comments for VTPM_PROXY_IOC_NEW_DEV tpm: Only call pm_runtime_get_sync if device has a parent tpm: define a generic open() method for ascii & bios measurements Documentation: tpm: add the Physical TPM device tree binding documentation ...	2016-12-14 13:57:44 -08:00
Linus Torvalds	9465d9cc31	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The time/timekeeping/timer folks deliver with this update: - Fix a reintroduced signed/unsigned issue and cleanup the whole signed/unsigned mess in the timekeeping core so this wont happen accidentaly again. - Add a new trace clock based on boot time - Prevent injection of random sleep times when PM tracing abuses the RTC for storage - Make posix timers configurable for real tiny systems - Add tracepoints for the alarm timer subsystem so timer based suspend wakeups can be instrumented - The usual pile of fixes and updates to core and drivers" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) timekeeping: Use mul_u64_u32_shr() instead of open coding it timekeeping: Get rid of pointless typecasts timekeeping: Make the conversion call chain consistently unsigned timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion alarmtimer: Add tracepoints for alarm timers trace: Update documentation for mono, mono_raw and boot clock trace: Add an option for boot clock as trace clock timekeeping: Add a fast and NMI safe boot clock timekeeping/clocksource_cyc2ns: Document intended range limitation timekeeping: Ignore the bogus sleep time if pm_trace is enabled selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous" clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map() arm64: dts: rockchip: Arch counter doesn't tick in system suspend clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend posix-timers: Make them configurable posix_cpu_timers: Move the add_device_randomness() call to a proper place timer: Move sys_alarm from timer.c to itimer.c ptp_clock: Allow for it to be optional Kconfig: Regenerate *.c_shipped files after previous changes ...	2016-12-12 19:56:15 -08:00
Al Viro	cbbd26b8b1	[iov_iter] new primitives - copy_from_iter_full() and friends copy_from_iter_full(), copy_from_iter_full_nocache() and csum_and_copy_from_iter_full() - counterparts of copy_from_iter() et.al., advancing iterator only in case of successful full copy and returning whether it had been successful or not. Convert some obvious users. NOTE - do not blindly assume that something is a good candidate for those unless you are sure that not advancing iov_iter in failure case is the right thing in this case. Anything that does short read/short write kind of stuff (or is in a loop, etc.) is unlikely to be a good one. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-05 14:33:36 -05:00
Josh Stone	50523a29d9	Yama: allow access for the current ptrace parent Under ptrace_scope=1, it's possible to have a tracee that is already ptrace-attached, but is no longer a direct descendant. For instance, a forking daemon will be re-parented to init, losing its ancestry to the tracer that launched it. The tracer can continue using ptrace in that state, but it will be denied other accesses that check PTRACE_MODE_ATTACH, like process_vm_rw and various procfs files. There's no reason to prevent such access for a tracer that already has ptrace control anyway. This patch adds a case to ptracer_exception_found to allow access for any task in the same thread group as the current ptrace parent. Signed-off-by: Josh Stone <jistone@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: linux-security-module@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-12-05 11:48:01 +11:00
Al Viro	450630975d	don't open-code file_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-04 18:29:28 -05:00
Eric W. Biederman	19339c2516	Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC" This reverts commit `0b3c9761d1`. Seth Forshee <seth.forshee@canonical.com> writes: > All right, I think `0b3c9761d1` should be > reverted then. EVM is a machine-local integrity mechanism, and so it > makes sense that the signature would be based on the kernel's notion of > the uid and not the filesystem's. I added a commment explaining why the EVM hmac needs to be in the kernel's notion of uid and gid, not the filesystems to prevent remounting the filesystem and gaining unwaranted trust in files. Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2016-12-02 20:58:41 -06:00
James Morris	0821e30cd2	Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/selinux into next	2016-11-24 11:21:25 +11:00
James Morris	b075361e91	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next	2016-11-23 09:52:11 +11:00
Andreas Gruenbacher	9287aed2ad	selinux: Convert isec->lock into a spinlock Convert isec->lock from a mutex into a spinlock. Instead of holding the lock while sleeping in inode_doinit_with_dentry, set isec->initialized to LABEL_PENDING and release the lock. Then, when the sid has been determined, re-acquire the lock. If isec->initialized is still set to LABEL_PENDING, set isec->sid; otherwise, the sid has been set by another task (LABEL_INITIALIZED) or invalidated (LABEL_INVALID) in the meantime. This fixes a deadlock on gfs2 where * one task is in inode_doinit_with_dentry -> gfs2_getxattr, holds isec->lock, and tries to acquire the inode's glock, and * another task is in do_xmote -> inode_go_inval -> selinux_inode_invalidate_secctx, holds the inode's glock, and tries to acquire isec->lock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> [PM: minor tweaks to keep checkpatch.pl happy] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-22 17:44:02 -05:00
James Morris	636e4625ad	Merge remote branch 'smack/smack-for-4.10' into next	2016-11-22 22:15:30 +11:00
Stephen Smalley	3322d0d64f	selinux: keep SELinux in sync with new capability definitions When a new capability is defined, SELinux needs to be updated. Trigger a build error if a new capability is defined without corresponding update to security/selinux/include/classmap.h's COMMON_CAP2_PERMS. This is similar to BUILD_BUG_ON() guards in the SELinux nlmsgtab code to ensure that SELinux tracks new netlink message types as needed. Note that there is already a similar build guard in security/selinux/hooks.c to detect when more than 64 capabilities are defined, since that will require adding a third capability class to SELinux. A nicer way to do this would be to extend scripts/selinux/genheaders or a similar tool to auto-generate the necessary definitions and code for SELinux capability checking from include/uapi/linux/capability.h. AppArmor does something similar in its Makefile, although it only needs to generate a single table of names. That is left as future work. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: reformat the description to keep checkpatch.pl happy] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-21 15:37:24 -05:00
John Johansen	3d40658c97	apparmor: fix change_hat not finding hat after policy replacement After a policy replacement, the task cred may be out of date and need to be updated. However change_hat is using the stale profiles from the out of date cred resulting in either: a stale profile being applied or, incorrect failure when searching for a hat profile as it has been migrated to the new parent profile. Fixes: `01e2b670aa` (failure to find hat) Fixes: `898127c34e` (stale policy being applied) Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000287 Cc: stable@vger.kernel.org Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-11-21 18:01:28 +11:00
Stephen Smalley	ea49d10eee	selinux: normalize input to /sys/fs/selinux/enforce At present, one can write any signed integer value to /sys/fs/selinux/enforce and it will be stored, e.g. echo -1 > /sys/fs/selinux/enforce or echo 2 > /sys/fs/selinux/enforce. This makes no real difference to the kernel, since it only ever cares if it is zero or non-zero, but some userspace code compares it with 1 to decide if SELinux is enforcing, and this could confuse it. Only a process that is already root and is allowed the setenforce permission in SELinux policy can write to /sys/fs/selinux/enforce, so this is not considered to be a security issue, but it should be fixed. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-20 17:13:19 -05:00
Nicolas Pitre	baa73d9e47	posix-timers: Make them configurable Some embedded systems have no use for them. This removes about 25KB from the kernel binary size when configured out. Corresponding syscalls are routed to a stub logging the attempt to use those syscalls which should be enough of a clue if they were disabled without proper consideration. They are: timer_create, timer_gettime: timer_getoverrun, timer_settime, timer_delete, clock_adjtime, setitimer, getitimer, alarm. The clock_settime, clock_gettime, clock_getres and clock_nanosleep syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME, CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast majority of use cases with very little code. Signed-off-by: Nicolas Pitre <nico@linaro.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: linux-kbuild@vger.kernel.org Cc: netdev@vger.kernel.org Cc: Michal Marek <mmarek@suse.com> Cc: Edward Cree <ecree@solarflare.com> Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-16 09:26:35 +01:00
Casey Schaufler	152f91d4d1	Smack: Remove unnecessary smack_known_invalid The invalid Smack label ("") and the Huh ("?") Smack label serve the same purpose and having both is unnecessary. While pulling out the invalid label it became clear that the use of smack_from_secid() was inconsistent, so that is repaired. The setting of inode labels to the invalid label could never happen in a functional system, has never been observed in the wild and is not what you'd really want for a failure behavior in any case. That is removed. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2016-11-15 09:34:39 -08:00
Tetsuo Handa	8c15d66e42	Smack: Use GFP_KERNEL for smack_parse_opts_str(). Since smack_parse_opts_str() is calling match_strdup() which uses GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is called by smack_parse_opts_str(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Casey Schaufler <casey@schaufler-ca.com>	2016-11-14 12:55:11 -08:00
Andreas Gruenbacher	13457d073c	selinux: Clean up initialization of isec->sclass Now that isec->initialized == LABEL_INITIALIZED implies that isec->sclass is valid, skip such inodes immediately in inode_doinit_with_dentry. For the remaining inodes, initialize isec->sclass at the beginning of inode_doinit_with_dentry to simplify the code. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:53:04 -05:00
Andreas Gruenbacher	db978da8fa	proc: Pass file mode to proc_pid_make_inode Pass the file mode of the proc inode to be created to proc_pid_make_inode. In proc_pid_make_inode, initialize inode->i_mode before calling security_task_to_inode. This allows selinux to set isec->sclass right away without introducing "half-initialized" inode security structs. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:39:48 -05:00
Andreas Gruenbacher	420591128c	selinux: Minor cleanups Fix the comment for function __inode_security_revalidate, which returns an integer. Use the LABEL_* constants consistently for isec->initialized. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:25:07 -05:00
Tetsuo Handa	8931c3bdb3	SELinux: Use GFP_KERNEL for selinux_parse_opts_str(). Since selinux_parse_opts_str() is calling match_strdup() which uses GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is called by selinux_parse_opts_str(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-11-14 15:03:38 -05:00
Seth Forshee	b4bfec7f4a	security/integrity: Harden against malformed xattrs In general the handling of IMA/EVM xattrs is good, but I found a few locations where either the xattr size or the value of the type field in the xattr are not checked. Add a few simple checks to these locations to prevent malformed or malicious xattrs from causing problems. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2016-11-13 22:50:11 -05:00
Mimi Zohar	064be15c52	ima: include the reason for TPM-bypass mode This patch includes the reason for going into TPM-bypass mode and not using the TPM. Signed-off-by: Mimi Zohar (zohar@linux.vnet.ibm>	2016-11-13 22:50:09 -05:00
Mimi Zohar	f5acb3dcba	Revert "ima: limit file hash setting by user to fix and log modes" Userspace applications have been modified to write security xattrs, but they are not context aware. In the case of security.ima, the security xattr can be either a file hash or a file signature. Permitting writing one, but not the other requires the application to be context aware. In addition, userspace applications might write files to a staging area, which might not be in policy, and then change some file metadata (eg. owner) making it in policy. As a result, these files are not labeled properly. This reverts commit `c68ed80c97`, which prevents writing file hashes as security.ima xattrs. Requested-by: Patrick Ohly <patrick.ohly@intel.com> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2016-11-13 22:50:09 -05:00
Eric Richter	9a11a18902	ima: fix memory leak in ima_release_policy When the "policy" securityfs file is opened for read, it is opened as a sequential file. However, when it is eventually released, there is no cleanup for the sequential file, therefore some memory is leaked. This patch adds a call to seq_release() in ima_release_policy() to clean up the memory when the file is opened for read. Fixes: `80eae209d6` IMA: allow reading back the current policy Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com> Tested-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2016-11-13 22:50:08 -05:00
Casey Schaufler	2e4939f702	Smack: ipv6 label match fix The check for a deleted entry in the list of IPv6 host addresses was being performed in the wrong place, leading to most peculiar results in some cases. This puts the check into the right place. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2016-11-10 11:22:18 -08:00
Himanshu Shukla	b437aba85b	SMACK: Fix the memory leak in smack_cred_prepare() hook Memory leak in smack_cred_prepare()function. smack_cred_prepare() hook returns error if there is error in allocating memory in smk_copy_rules() or smk_copy_relabel() function. If smack_cred_prepare() function returns error then the calling function should call smack_cred_free() function for cleanup. In smack_cred_free() function first credential is extracted and then all rules are deleted. In smack_cred_prepare() function security field is assigned in the end when all function return success. But this function may return before and memory will not be freed. Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com>	2016-11-10 11:22:06 -08:00
Himanshu Shukla	7128ea159d	SMACK: Do not apply star label in smack_setprocattr hook Smack prohibits processes from using the star ("") and web ("@") labels. Checks have been added in other functions. In smack_setprocattr() hook, only check for web ("@") label has been added and restricted from applying web ("@") label. Check for star ("") label should also be added in smack_setprocattr() hook. Return error should be "-EINVAL" not "-EPERM" as permission is there for setting label but not the label value as star ("*") or web ("@"). Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com>	2016-11-10 11:21:52 -08:00
Himanshu Shukla	2097f59920	smack: parse mnt opts after privileges check In smack_set_mnt_opts()first the SMACK mount options are being parsed and later it is being checked whether the user calling mount has CAP_MAC_ADMIN capability. This sequence of operationis will allow unauthorized user to add SMACK labels in label list and may cause denial of security attack by adding many labels by allocating kernel memory by unauthorized user. Superblock smack flag is also being set as initialized though function may return with EPERM error. First check the capability of calling user then set the SMACK attributes and smk_flags. Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com>	2016-11-10 11:21:32 -08:00
jooseong lee	08382c9f6e	Smack: Assign smack_known_web label for kernel thread's Assign smack_known_web label for kernel thread's socket Creating struct sock by sk_alloc function in various kernel subsystems like bluetooth doesn't call smack_socket_post_create(). In such case, received sock label is the floor('_') label and makes access deny. Signed-off-by: jooseong lee <jooseong.lee@samsung.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com>	2016-11-04 17:42:57 -07:00
Artem Savkov	31e6ec4519	security/keys: make BIG_KEYS dependent on stdrng. Since BIG_KEYS can't be compiled as module it requires one of the "stdrng" providers to be compiled into kernel. Otherwise big_key_crypto_init() fails on crypto_alloc_rng step and next dereference of big_key_skcipher (e.g. in big_key_preparse()) results in a NULL pointer dereference. Fixes: `13100a72f4` ('Security: Keys: Big keys stored encrypted') Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Stephan Mueller <smueller@chronox.de> cc: Kirill Marinushkin <k.marinushkin@gmail.com> cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-10-27 16:03:33 +11:00
David Howells	7df3e59c3d	KEYS: Sort out big_key initialisation big_key has two separate initialisation functions, one that registers the key type and one that registers the crypto. If the key type fails to register, there's no problem if the crypto registers successfully because there's no way to reach the crypto except through the key type. However, if the key type registers successfully but the crypto does not, big_key_rng and big_key_blkcipher may end up set to NULL - but the code neither checks for this nor unregisters the big key key type. Furthermore, since the key type is registered before the crypto, it is theoretically possible for the kernel to try adding a big_key before the crypto is set up, leading to the same effect. Fix this by merging big_key_crypto_init() and big_key_init() and calling the resulting function late. If they're going to be encrypted, we shouldn't be creating big_keys before we have the facilities to do the encryption available. The key type registration is also moved after the crypto initialisation. The fix also includes message printing on failure. If the big_key type isn't correctly set up, simply doing: dd if=/dev/zero bs=4096 count=1 \| keyctl padd big_key a @s ought to cause an oops. Fixes: `13100a72f4` ('Security: Keys: Big keys stored encrypted') Signed-off-by: David Howells <dhowells@redhat.com> cc: Peter Hlavaty <zer0mem@yahoo.com> cc: Kirill Marinushkin <k.marinushkin@gmail.com> cc: Artem Savkov <asavkov@redhat.com> cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-10-27 16:03:27 +11:00
David Howells	03dab869b7	KEYS: Fix short sprintf buffer in /proc/keys show function This fixes CVE-2016-7042. Fix a short sprintf buffer in proc_keys_show(). If the gcc stack protector is turned on, this can cause a panic due to stack corruption. The problem is that xbuf[] is not big enough to hold a 64-bit timeout rendered as weeks: (gdb) p 0xffffffffffffffffULL/(606024*7) $2 = 30500568904943 That's 14 chars plus NUL, not 11 chars plus NUL. Expand the buffer to 16 chars. I think the unpatched code apparently works if the stack-protector is not enabled because on a 32-bit machine the buffer won't be overflowed and on a 64-bit machine there's a 64-bit aligned pointer at one side and an int that isn't checked again on the other side. The panic incurred looks something like: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679 Call Trace: [<ffffffff813d941f>] dump_stack+0x63/0x84 [<ffffffff811b2cb6>] panic+0xde/0x22a [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0 [<ffffffff81350410>] ? key_validate+0x50/0x50 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20 [<ffffffff8126b31c>] seq_read+0x2cc/0x390 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70 [<ffffffff81244fc7>] __vfs_read+0x37/0x150 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0 [<ffffffff81246156>] vfs_read+0x96/0x130 [<ffffffff81247635>] SyS_read+0x55/0xc0 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4 Reported-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ondrej Kozina <okozina@redhat.com> cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-10-27 16:03:24 +11:00
Linus Torvalds	86c5bf7101	Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vmap stack fixes from Ingo Molnar: "This is fallout from CONFIG_HAVE_ARCH_VMAP_STACK=y on x86: stack accesses that used to be just somewhat questionable are now totally buggy. These changes try to do it without breaking the ABI: the fields are left there, they are just reporting zero, or reporting narrower information (the maps file change)" * 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mm: Change vm_is_stack_for_task() to vm_is_stack_for_current() fs/proc: Stop trying to report thread stacks fs/proc: Stop reporting eip and esp in /proc/PID/stat mm/numa: Remove duplicated include from mprotect.c	2016-10-22 09:39:10 -07:00
Andy Lutomirski	d17af5056c	mm: Change vm_is_stack_for_task() to vm_is_stack_for_current() Asking for a non-current task's stack can't be done without races unless the task is frozen in kernel mode. As far as I know, vm_is_stack_for_task() never had a safe non-current use case. The __unused annotation is because some KSTK_ESP implementations ignore their parameter, which IMO is further justification for this patch. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Jann Horn <jann@thejh.net> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux API <linux-api@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tycho Andersen <tycho.andersen@canonical.com> Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-20 09:21:41 +02:00
Lorenzo Stoakes	9beae1ea89	mm: replace get_user_pages_remote() write/force parameters with gup_flags This removes the 'write' and 'force' from get_user_pages_remote() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-19 08:12:02 -07:00
Linus Torvalds	101105b171	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning	2016-10-10 20:16:43 -07:00
Al Viro	3873691e5a	Merge remote-tracking branch 'ovl/rename2' into for-linus	2016-10-10 23:02:51 -04:00
Linus Torvalds	97d2116708	Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr updates from Al Viro: "xattr stuff from Andreas This completes the switch to xattr_handler ->get()/->set() from ->getxattr/->setxattr/->removexattr" * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Remove {get,set,remove}xattr inode operations xattr: Stop calling {get,set,remove}xattr inode operations vfs: Check for the IOP_XATTR flag in listxattr xattr: Add __vfs_{get,set,remove}xattr helpers libfs: Use IOP_XATTR flag for empty directory handling vfs: Use IOP_XATTR flag for bad-inode handling vfs: Add IOP_XATTR inode operations flag vfs: Move xattr_resolve_name to the front of fs/xattr.c ecryptfs: Switch to generic xattr handlers sockfs: Get rid of getxattr iop sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names kernfs: Switch to generic xattr handlers hfs: Switch to generic xattr handlers jffs2: Remove jffs2_{get,set,remove}xattr macros xattr: Remove unnecessary NULL attribute name check	2016-10-10 17:11:50 -07:00
Linus Torvalds	abb5a14fa2	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted misc bits and pieces. There are several single-topic branches left after this (rename2 series from Miklos, current_time series from Deepa Dinamani, xattr series from Andreas, uaccess stuff from from me) and I'd prefer to send those separately" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits) proc: switch auxv to use of __mem_open() hpfs: support FIEMAP cifs: get rid of unused arguments of CIFSSMBWrite() posix_acl: uapi header split posix_acl: xattr representation cleanups fs/aio.c: eliminate redundant loads in put_aio_ring_file fs/internal.h: add const to ns_dentry_operations declaration compat: remove compat_printk() fs/buffer.c: make __getblk_slow() static proc: unsigned file descriptors fs/file: more unsigned file descriptors fs: compat: remove redundant check of nr_segs cachefiles: Fix attempt to read i_blocks after deleting file [ver #2] cifs: don't use memcpy() to copy struct iov_iter get rid of separate multipage fault-in primitives fs: Avoid premature clearing of capabilities fs: Give dentry to inode_change_ok() instead of inode fuse: Propagate dentry down to inode_change_ok() ceph: Propagate dentry down to inode_change_ok() xfs: Propagate dentry down to inode_change_ok() ...	2016-10-10 13:04:49 -07:00
Linus Torvalds	563873318d	Merge branch 'printk-cleanups' Merge my system logging cleanups, triggered by the broken '\n' patches. The line continuation handling has been broken basically forever, and the code to handle the system log records was both confusing and dubious. And it would do entirely the wrong thing unless you always had a terminating newline, partly because it couldn't actually see whether a message was marked KERN_CONT or not (but partly because the LOG_CONT handling in the recording code was rather confusing too). This re-introduces a real semantically meaningful KERN_CONT, and fixes the few places I noticed where it was missing. There are probably more missing cases, since KERN_CONT hasn't actually had any semantic meaning for at least four years (other than the checkpatch meaning of "no log level necessary, this is a continuation line"). This also allows the combination of KERN_CONT and a log level. In that case the log level will be ignored if the merging with a previous line is successful, but if a new record is needed, that new record will now get the right log level. That also means that you can at least in theory combine KERN_CONT with the "pr_info()" style helpers, although any use of pr_fmt() prefixing would make that just result in a mess, of course (the prefix would end up in the middle of a continuing line). * printk-cleanups: printk: make reading the kernel log flush pending lines printk: re-organize log_output() to be more legible printk: split out core logging code into helper function printk: reinstate KERN_CONT for printing continuation lines	2016-10-10 09:29:50 -07:00
Linus Torvalds	4bcc595ccd	printk: reinstate KERN_CONT for printing continuation lines Long long ago the kernel log buffer was a buffered stream of bytes, very much like stdio in user space. It supported log levels by scanning the stream and noticing the log level markers at the beginning of each line, but if you wanted to print a partial line in multiple chunks, you just did multiple printk() calls, and it just automatically worked. Except when it didn't, and you had very confusing output when different lines got all mixed up with each other. Then you got fragment lines mixing with each other, or with non-fragment lines, because it was traditionally impossible to tell whether a printk() call was a continuation or not. To at least help clarify the issue of continuation lines, we added a KERN_CONT marker back in 2007 to mark continuation lines: `4749252776` ("printk: add KERN_CONT annotation"). That continuation marker was initially an empty string, and didn't actuall make any semantic difference. But it at least made it possible to annotate the source code, and have check-patch notice that a printk() didn't need or want a log level marker, because it was a continuation of a previous line. To avoid the ambiguity between a continuation line that had that KERN_CONT marker, and a printk with no level information at all, we then in 2009 made KERN_CONT be a real log level marker which meant that we could now reliably tell the difference between the two cases. `5fd29d6ccb` ("printk: clean up handling of log-levels and newlines") and we could take advantage of that to make sure we didn't mix up continuation lines with lines that just didn't have any loglevel at all. Then, in 2012, the kernel log buffer was changed to be a "record" based log, where each line was a record that has a loglevel and a timestamp. You can see the beginning of that conversion in commits `e11fea92e1` ("kmsg: export printk records to the /dev/kmsg interface") `7ff9554bb5` ("printk: convert byte-buffer to variable-length record buffer") with a number of follow-up commits to fix some painful fallout from that conversion. Over all, it took a couple of months to sort out most of it. But the upside was that you could have concurrent readers (and writers) of the kernel log and not have lines with mixed output in them. And one particular pain-point for the record-based kernel logging was exactly the fragmentary lines that are generated in smaller chunks. In order to still log them as one recrod, the continuation lines need to be attached to the previous record properly. However the explicit continuation record marker that is actually useful for this exact case was actually removed in aroundm the same time by commit `61e99ab8e3` ("printk: remove the now unnecessary "C" annotation for KERN_CONT") due to the incorrect belief that KERN_CONT wasn't meaningful. The ambiguity between "is this a continuation line" or "is this a plain printk with no log level information" was reintroduced, and in fact became an even bigger pain point because there was now the whole record-level merging of kernel messages going on. This patch reinstates the KERN_CONT as a real non-empty string marker, so that the ambiguity is fixed once again. But it's not a plain revert of that original removal: in the four years since we made KERN_CONT an empty string again, not only has the format of the log level markers changed, we've also had some usage changes in this area. For example, some ACPI code seems to use KERN_CONT _together_ with a log level, and now uses both the KERN_CONT marker and (for example) a KERN_INFO marker to show that it's an informational continuation of a line. Which is actually not a bad idea - if the continuation line cannot be attached to its predecessor, without the log level information we don't know what log level to assign to it (and we traditionally just assigned it the default loglevel). So having both a log level and the KERN_CONT marker is not necessarily a bad idea, but it does mean that we need to actually iterate over potentially multiple markers, rather than just a single one. Also, since KERN_CONT was still conceptually needed, and encouraged, but didn't actually _do_ anything, we've also had the reverse problem: rather than having too many annotations it has too few, and there is bit rot with code that no longer marks the continuation lines with the KERN_CONT marker. So this patch not only re-instates the non-empty KERN_CONT marker, it also fixes up the cases of bit-rot I noticed in my own logs. There are probably other cases where KERN_CONT will be needed to be added, either because it is new code that never dealt with the need for KERN_CONT, or old code that has bitrotted without anybody noticing. That said, we should strive to avoid the need for KERN_CONT. It does result in real problems for logging, and should generally not be seen as a good feature. If we some day can get rid of the feature entirely, because nobody does any fragmented printk calls, that would be lovely. But until that point, let's at mark the code that relies on the hacky multi-fragment kernel printk's. Not only does it avoid the ambiguity, it also annotates code as "maybe this would be good to fix some day". (That said, particularly during single-threaded bootup, the downsides of KERN_CONT are very limited. Things get much hairier when you have multiple threads going on and user level reading and writing logs too). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-09 12:23:38 -07:00
Al Viro	f334bcd94b	Merge remote-tracking branch 'ovl/misc' into work.misc	2016-10-08 11:00:01 -04:00
Andreas Gruenbacher	5d6c31910b	xattr: Add __vfs_{get,set,remove}xattr helpers Right now, various places in the kernel check for the existence of getxattr, setxattr, and removexattr inode operations and directly call those operations. Switch to helper functions and test for the IOP_XATTR flag instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-10-07 20:10:44 -04:00
Linus Torvalds	2ab704a47e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "The usual rocket science from the trivial tree" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: tracing/syscalls: fix multiline in error message text lib/Kconfig.debug: fix DEBUG_SECTION_MISMATCH description doc: vfs: fix fadvise() sycall name x86/entry: spell EBX register correctly in documentation securityfs: fix securityfs_create_dir comment irq: Fix typo in tracepoint.xml	2016-10-07 12:24:08 -07:00
Linus Torvalds	a3443cda55	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: SELinux/LSM: - overlayfs support, necessary for container filesystems LSM: - finally remove the kernel_module_from_file hook Smack: - treat signal delivery as an 'append' operation TPM: - lots of bugfixes & updates Audit: - new audit data type: LSM_AUDIT_DATA_FILE * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (47 commits) Revert "tpm/tpm_crb: implement tpm crb idle state" Revert "tmp/tpm_crb: fix Intel PTT hw bug during idle state" Revert "tpm/tpm_crb: open code the crb_init into acpi_add" Revert "tmp/tpm_crb: implement runtime pm for tpm_crb" lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE tmp/tpm_crb: implement runtime pm for tpm_crb tpm/tpm_crb: open code the crb_init into acpi_add tmp/tpm_crb: fix Intel PTT hw bug during idle state tpm/tpm_crb: implement tpm crb idle state tpm: add check for minimum buffer size in tpm_transmit() tpm: constify TPM 1.x header structures tpm/tpm_crb: fix the over 80 characters checkpatch warring tpm/tpm_crb: drop useless cpu_to_le32 when writing to registers tpm/tpm_crb: cache cmd_size register value. tmp/tpm_crb: drop include to platform_device tpm/tpm_tis: remove unused itpm variable tpm_crb: fix incorrect values of cmdReady and goIdle bits tpm_crb: refine the naming of constants tpm_crb: remove wmb()'s tpm_crb: fix crb_req_canceled behavior ...	2016-10-04 14:48:27 -07:00
Linus Torvalds	3cd013ab79	Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/audit Pull audit updates from Paul Moore: "Another relatively small pull request for v4.9 with just two patches. The patch from Richard updates the list of features we support and report back to userspace; this should have been sent earlier with the rest of the v4.8 patches but it got lost in my inbox. The second patch fixes a problem reported by our Android friends where we weren't very consistent in recording PIDs" * 'stable-4.9' of git://git.infradead.org/users/pcmoore/audit: audit: add exclude filter extension to feature bitmap audit: consistently record PIDs with task_tgid_nr()	2016-10-04 14:21:41 -07:00
Laurent Georget	1b4606511d	securityfs: fix securityfs_create_dir comment If there is an error creating a directory with securityfs_create_dir, the error is propagated via ERR_PTR but the function comment claims that NULL is returned. This is a similar commit to `88e6c94cda` ("fix long-broken securityfs_create_file comment") that did not fix securityfs_create_dir comment at the same time. Signed-off-by: Laurent Georget <laurent.georget@supelec.fr> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2016-09-29 10:07:01 +02:00
Deepa Dinamani	078cd8279e	fs: Replace CURRENT_TIME with current_time() for inode timestamps CURRENT_TIME macro is not appropriate for filesystems as it doesn't use the right granularity for filesystem timestamps. Use current_time() instead. CURRENT_TIME is also not y2038 safe. This is also in preparation for the patch that transitions vfs timestamps to use 64 bit time and hence make them y2038 safe. As part of the effort current_time() will be extended to do range checks. Hence, it is necessary for all file system timestamps to use current_time(). Also, current_time() will be transitioned along with vfs to be y2038 safe. Note that whenever a single call to current_time() is used to change timestamps in different inodes, it is because they share the same time granularity. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Felipe Balbi <balbi@kernel.org> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-09-27 21:06:21 -04:00
Miklos Szeredi	2773bf00ae	fs: rename "rename2" i_op to "rename" Generated patch: sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2` sed -i "s/\brename2\b/rename/g" `git grep -wl rename2` Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2016-09-27 11:03:58 +02:00
Miklos Szeredi	18fc84dafa	vfs: remove unused i_op->rename No in-tree uses remain. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2016-09-27 11:03:58 +02:00
Linus Torvalds	2ddfdd4289	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a regression in RSA that was only half-fixed earlier in the cycle. It also fixes an older regression that breaks the keyring subsystem" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: rsa-pkcs1pad - Handle leading zero for decryption KEYS: Fix skcipher IV clobbering	2016-09-23 11:28:04 -07:00
Herbert Xu	456bee986e	KEYS: Fix skcipher IV clobbering The IV must not be modified by the skcipher operation so we need to duplicate it. Fixes: `c3917fd9df` ("KEYS: Use skcipher") Cc: stable@vger.kernel.org Reported-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-09-22 17:42:07 +08:00
James Morris	8a17ef9d85	Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/selinux into next	2016-09-21 11:54:19 +10:00
Vivek Goyal	43af5de742	lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE Right now LSM_AUDIT_DATA_PATH type contains "struct path" in union "u" of common_audit_data. This information is used to print path of file at the same time it is also used to get to dentry and inode. And this inode information is used to get to superblock and device and print device information. This does not work well for layered filesystems like overlay where dentry contained in path is overlay dentry and not the real dentry of underlying file system. That means inode retrieved from dentry is also overlay inode and not the real inode. SELinux helpers like file_path_has_perm() are doing checks on inode retrieved from file_inode(). This returns the real inode and not the overlay inode. That means we are doing check on real inode but for audit purposes we are printing details of overlay inode and that can be confusing while debugging. Hence, introduce a new type LSM_AUDIT_DATA_FILE which carries file information and inode retrieved is real inode using file_inode(). That way right avc denied information is given to user. For example, following is one example avc before the patch. type=AVC msg=audit(1473360868.399:214): avc: denied { read open } for pid=1765 comm="cat" path="/root/.../overlay/container1/merged/readfile" dev="overlay" ino=21443 scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20 tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0 tclass=file permissive=0 It looks as follows after the patch. type=AVC msg=audit(1473360017.388:282): avc: denied { read open } for pid=2530 comm="cat" path="/root/.../overlay/container1/merged/readfile" dev="dm-0" ino=2377915 scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20 tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0 tclass=file permissive=0 Notice that now dev information points to "dm-0" device instead of "overlay" device. This makes it clear that check failed on underlying inode and not on the overlay inode. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> [PM: slight tweaks to the description to make checkpatch.pl happy] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-09-19 13:42:38 -04:00
James Morris	de2f4b3453	Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/selinux into next	2016-09-19 12:27:10 +10:00
Miklos Szeredi	e71b9dff06	ima: use file_dentry() Ima tries to call ->setxattr() on overlayfs dentry after having locked underlying inode, which results in a deadlock. Reported-by: Krisztian Litkey <kli@iki.fi> Fixes: `4bacc9c923` ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # v4.2 Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>	2016-09-16 12:44:20 +02:00
Wei Yongjun	9b6a9ecc2d	selinux: fix error return code in policydb_read() Fix to return error code -EINVAL from the error handling case instead of 0 (rc is overwrite to 0 when policyvers >= POLICYDB_VERSION_ROLETRANS), as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> [PM: normalize "selinux" in patch subject, description line wrap] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-09-13 17:14:43 -04:00
Casey Schaufler	c60b906673	Smack: Signal delivery as an append operation Under a strict subject/object security policy delivering a signal or delivering network IPC could be considered either a write or an append operation. The original choice to make both write operations leads to an issue where IPC delivery is desired under policy, but delivery of signals is not. This patch provides the option of making signal delivery an append operation, allowing Smack rules that deny signal delivery while allowing IPC. This was requested for Tizen. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2016-09-08 13:22:56 -07:00
Linus Torvalds	80a77045da	- force check_object_size() to be inline too - move page-spanning check behind a CONFIG since it's triggering false positives -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Kees Cook <kees@outflux.net> iQIcBAABCgAGBQJX0F3qAAoJEIly9N/cbcAmaxEP/R737i+XhP4VOv+rYW050NHB K+FDeLv5/Gx56JrHRRWwj80K5/9F09wS929AWcaTDjEb2NKp/o/6W/gN1eVdGlJF K3UQk+Kmncb44poVoHkjMRAl6+sfKm6mTWZBTjBECKwQuCFyDoDoqDXhn5IXTlw9 Ig+TTOSgNw9gRke3ECtFynbVnDWx/Ry/axfT9vGXhFOkWclMUFy2UOdDSTtFAB6x yw5hdrfGakk2BPscHLO1xNqRuVLRUSXZVUiJGIQ6AiUupm34Yqmm69mrMuxaOtPC Ai3zhNGDuYClcGJAiPJYX+7nRjgPCWAdlyzQqLp5hwx63TJ+gxvhmxoFOJxEmHE/ 99i2Ak073Es6WII532Eknk3vV+UJzQNT/HO+0LcrJFkOEp9EHfVUb19CngQTaX7Q UbfYdyFgp3y24cRp7v0tP8gE2LCrsRe0UEhUq2NGmrerw3caNqZGHS9Od5OYEM8D uIhaotWoOv9Z0r+DZMGkUjfqeLb6RWNcUoWc5wZ3VYG27BM/pfhRxKf/2aw6O9u0 2Jk1QJxBr+/8DQ500xu/IBOP9V7aGAc4nxKyqUlwA05/JEFGiAzCwqfZW5CKTxgD 5Ht994WbTEH3/VaAskKnwggeHvttiEpehBCdVA4bXuhBhJhmPjFiKHX7uRrcM2GV /yH3UTkPnr/VD/I2ndiX =r8BE -----END PGP SIGNATURE----- Merge tag 'usercopy-v4.8-rc6-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull more hardened usercopyfixes from Kees Cook: - force check_object_size() to be inline too - move page-spanning check behind a CONFIG since it's triggering false positives [ Changed the page-spanning config option to depend on EXPERT in the merge. That way it still gets build testing, and you can enable it if you want to, but is never enabled for "normal" configurations ] * tag 'usercopy-v4.8-rc6-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: usercopy: remove page-spanning test for now usercopy: force check_object_size() inline	2016-09-07 14:03:49 -07:00
Kees Cook	8e1f74ea02	usercopy: remove page-spanning test for now A custom allocator without __GFP_COMP that copies to userspace has been found in vmw_execbuf_process[1], so this disables the page-span checker by placing it behind a CONFIG for future work where such things can be tracked down later. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1373326 Reported-by: Vinson Lee <vlee@freedesktop.org> Fixes: `f5509cc18d` ("mm: Hardened usercopy") Signed-off-by: Kees Cook <keescook@chromium.org>	2016-09-07 11:33:26 -07:00
Paul Moore	fa2bea2f5c	audit: consistently record PIDs with task_tgid_nr() Unfortunately we record PIDs in audit records using a variety of methods despite the correct way being the use of task_tgid_nr(). This patch converts all of these callers, except for the case of AUDIT_SET in audit_receive_msg() (see the comment in the code). Reported-by: Jeff Vander Stoep <jeffv@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-30 17:19:13 -04:00
William Roberts	7c686af071	selinux: fix overflow and 0 length allocations Throughout the SELinux LSM, values taken from sepolicy are used in places where length == 0 or length == <saturated> matter, find and fix these. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-30 15:45:50 -04:00
William Roberts	3bc7bcf69b	selinux: initialize structures libsepol pointed out an issue where its possible to have an unitialized jmp and invalid dereference, fix this. While we're here, zero allocate all the *_val_to_struct structures. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-29 19:22:10 -04:00
William Roberts	74d977b65e	selinux: detect invalid ebitmap When count is 0 and the highbit is not zero, the ebitmap is not valid and the internal node is not allocated. This causes issues when routines, like mls_context_isvalid() attempt to use the ebitmap_for_each_bit() and ebitmap_node_get_bit() as they assume a highbit > 0 will have a node allocated. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-29 19:19:50 -04:00
Markus Elfring	63e24c4971	Smack: Use memdup_user() rather than duplicating its implementation Reuse existing functionality from memdup_user() instead of keeping duplicate source code. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Casey Schaufler <casey@schaufler-ca.com>	2016-08-23 09:58:21 -07:00
Linus Torvalds	6040e57658	Make the hardened user-copy code depend on having a hardened allocator The kernel test robot reported a usercopy failure in the new hardened sanity checks, due to a page-crossing copy of the FPU state into the task structure. This happened because the kernel test robot was testing with SLOB, which doesn't actually do the required book-keeping for slab allocations, and as a result the hardening code didn't realize that the task struct allocation was one single allocation - and the sanity checks fail. Since SLOB doesn't even claim to support hardening (and you really shouldn't use it), the straightforward solution is to just make the usercopy hardening code depend on the allocator supporting it. Reported-by: kernel test robot <xiaolong.ye@intel.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-19 12:47:01 -07:00
William Roberts	348a0db9e6	selinux: drop SECURITY_SELINUX_POLICYDB_VERSION_MAX Remove the SECURITY_SELINUX_POLICYDB_VERSION_MAX Kconfig option Per: https://github.com/SELinuxProject/selinux/wiki/Kernel-Todo This was only needed on Fedora 3 and 4 and just causes issues now, so drop it. The MAX and MIN should just be whatever the kernel can support. Signed-off-by: William Roberts <william.c.roberts@intel.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-18 20:01:15 -04:00
Vivek Goyal	a518b0a5b0	selinux: Implement dentry_create_files_as() hook Calculate what would be the label of newly created file and set that secid in the passed creds. Context of the task which is actually creating file is retrieved from set of creds passed in. (old->security). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-10 08:25:22 -04:00
Vivek Goyal	2602625b7e	security, overlayfs: Provide hook to correctly label newly created files During a new file creation we need to make sure new file is created with the right label. New file is created in upper/ so effectively file should get label as if task had created file in upper/. We switched to mounter's creds for actual file creation. Also if there is a whiteout present, then file will be created in work/ dir first and then renamed in upper. In none of the cases file will be labeled as we want it to be. This patch introduces a new hook dentry_create_files_as(), which determines the label/context dentry will get if it had been created by task in upper and modify passed set of creds appropriately. Caller makes use of these new creds for file creation. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: fix whitespace issues found with checkpatch.pl] [PM: changes to use stat->mode in ovl_create_or_link()] Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-08 20:46:46 -04:00
Vivek Goyal	c957f6df52	selinux: Pass security pointer to determine_inode_label() Right now selinux_determine_inode_label() works on security pointer of current task. Soon I need this to work on a security pointer retrieved from a set of creds. So start passing in a pointer and caller can decide where to fetch security pointer from. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>	2016-08-08 20:45:29 -04:00

... 2 3 4 5 6 ...

3286 Commits