Commit Graph

4790 Commits

Author SHA1 Message Date
John Johansen
a83bd86e83 apparmor: add label data availability to the feature set
gsettings mediation needs to be able to determine if apparmor supports
label data queries. A label data query can be done to test for support
but its failure is indistinguishable from other failures, making it an
unreliable indicator.

Fix by making support of label data queries available as a flag in the
apparmorfs features dir tree.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-10 17:11:28 -07:00
John Johansen
4ae47f3335 apparmor: add mkdir/rmdir interface to manage policy namespaces
When setting up namespaces for containers its easier for them to use
an fs interface to create the namespace for the containers
policy. Allow mkdir/rmdir under the policy/namespaces/ dir to be used
to create and remove namespaces.

BugLink: http://bugs.launchpad.net/bugs/1611078

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-10 17:11:27 -07:00
John Johansen
d9bf2c268b apparmor: add policy revision file interface
Add a policy revision file to find the current revision of a ns's policy.
There is a revision file per ns, as well as a virtualized global revision
file in the base apparmor fs directory. The global revision file when
opened will provide the revision of the opening task namespace.

The revision file can be waited on via select/poll to detect apparmor
policy changes from the last read revision of the opened file. This
means that the revision file must be read after the select/poll other
wise update data will remain ready for reading.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-10 17:11:27 -07:00
John Johansen
18e99f191a apparmor: provide finer control over policy management
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-10 17:11:20 -07:00
Scott Mayhew
0b4d3452b8 security/selinux: allow security_sb_clone_mnt_opts to enable/disable native labeling behavior
When an NFSv4 client performs a mount operation, it first mounts the
NFSv4 root and then does path walk to the exported path and performs a
submount on that, cloning the security mount options from the root's
superblock to the submount's superblock in the process.

Unless the NFS server has an explicit fsid=0 export with the
"security_label" option, the NFSv4 root superblock will not have
SBLABEL_MNT set, and neither will the submount superblock after cloning
the security mount options.  As a result, setxattr's of security labels
over NFSv4.2 will fail.  In a similar fashion, NFSv4.2 mounts mounted
with the context= mount option will not show the correct labels because
the nfs_server->caps flags of the cloned superblock will still have
NFS_CAP_SECURITY_LABEL set.

Allowing the NFSv4 client to enable or disable SECURITY_LSM_NATIVE_LABELS
behavior will ensure that the SBLABEL_MNT flag has the correct value
when the client traverses from an exported path without the
"security_label" option to one with the "security_label" option and
vice versa.  Similarly, checking to see if SECURITY_LSM_NATIVE_LABELS is
set upon return from security_sb_clone_mnt_opts() and clearing
NFS_CAP_SECURITY_LABEL if necessary will allow the correct labels to
be displayed for NFSv4.2 mounts mounted with the context= mount option.

Resolves: https://github.com/SELinuxProject/selinux-kernel/issues/35

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-06-09 16:17:47 -04:00
Junil Lee
b4958c892e selinux: use kmem_cache for ebitmap
The allocated size for each ebitmap_node is 192byte by kzalloc().
Then, ebitmap_node size is fixed, so it's possible to use only 144byte
for each object by kmem_cache_zalloc().
It can reduce some dynamic allocation size.

Signed-off-by: Junil Lee <junil0814.lee@lge.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-06-09 16:13:50 -04:00
John Johansen
e53cfe6c7c apparmor: rework perm mapping to a slightly broader set
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-09 05:59:22 -07:00
Mark Rutland
92347cfd62 KEYS: fix refcount_inc() on zero
If a key's refcount is dropped to zero between key_lookup() peeking at
the refcount and subsequently attempting to increment it, refcount_inc()
will see a zero refcount.  Here, refcount_inc() will WARN_ONCE(), and
will *not* increment the refcount, which will remain zero.

Once key_lookup() drops key_serial_lock, it is possible for the key to
be freed behind our back.

This patch uses refcount_inc_not_zero() to perform the peek and increment
atomically.

Fixes: fff292914d ("security, keys: convert key.usage from atomic_t to refcount_t")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:50 +10:00
Mat Martineau
7cbe0932c2 KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API
The initial Diffie-Hellman computation made direct use of the MPI
library because the crypto module did not support DH at the time. Now
that KPP is implemented, KEYCTL_DH_COMPUTE should use it to get rid of
duplicate code and leverage possible hardware acceleration.

This fixes an issue whereby the input to the KDF computation would
include additional uninitialized memory when the result of the
Diffie-Hellman computation was shorter than the input prime number.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:50 +10:00
Eric Biggers
0ddd9f1a6b KEYS: DH: ensure the KDF counter is properly aligned
Accessing a 'u8[4]' through a '__be32 *' violates alignment rules.  Just
make the counter a __be32 instead.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:49 +10:00
Eric Biggers
281590b422 KEYS: DH: don't feed uninitialized "otherinfo" into KDF
If userspace called KEYCTL_DH_COMPUTE with kdf_params containing NULL
otherinfo but nonzero otherinfolen, the kernel would allocate a buffer
for the otherinfo, then feed it into the KDF without initializing it.
Fix this by always doing the copy from userspace (which will fail with
EFAULT in this scenario).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:49 +10:00
Eric Biggers
bbe240454d KEYS: DH: forbid using digest_null as the KDF hash
Requesting "digest_null" in the keyctl_kdf_params caused an infinite
loop in kdf_ctr() because the "null" hash has a digest size of 0.  Fix
it by rejecting hash algorithms with a digest size of 0.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:49 +10:00
Eric Biggers
0620fddb56 KEYS: sanitize key structs before freeing
While a 'struct key' itself normally does not contain sensitive
information, Documentation/security/keys.txt actually encourages this:

     "Having a payload is not required; and the payload can, in fact,
     just be a value stored in the struct key itself."

In case someone has taken this advice, or will take this advice in the
future, zero the key structure before freeing it.  We might as well, and
as a bonus this could make it a bit more difficult for an adversary to
determine which keys have recently been in use.

This is safe because the key_jar cache does not use a constructor.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:48 +10:00
Eric Biggers
ee618b4619 KEYS: trusted: sanitize all key material
As the previous patch did for encrypted-keys, zero sensitive any
potentially sensitive data related to the "trusted" key type before it
is freed.  Notably, we were not zeroing the tpm_buf structures in which
the actual key is stored for TPM seal and unseal, nor were we zeroing
the trusted_key_payload in certain error paths.

Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:48 +10:00
Eric Biggers
a9dd74b252 KEYS: encrypted: sanitize all key material
For keys of type "encrypted", consistently zero sensitive key material
before freeing it.  This was already being done for the decrypted
payloads of encrypted keys, but not for the master key and the keys
derived from the master key.

Out of an abundance of caution and because it is trivial to do so, also
zero buffers containing the key payload in encrypted form, although
depending on how the encrypted-keys feature is used such information
does not necessarily need to be kept secret.

Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:48 +10:00
Eric Biggers
6966c74932 KEYS: user_defined: sanitize key payloads
Zero the payloads of user and logon keys before freeing them.  This
prevents sensitive key material from being kept around in the slab
caches after a key is released.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:48 +10:00
Eric Biggers
57070c850a KEYS: sanitize add_key() and keyctl() key payloads
Before returning from add_key() or one of the keyctl() commands that
takes in a key payload, zero the temporary buffer that was allocated to
hold the key payload copied from userspace.  This may contain sensitive
key material that should not be kept around in the slab caches.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:48 +10:00
Eric Biggers
63a0b0509e KEYS: fix freeing uninitialized memory in key_update()
key_update() freed the key_preparsed_payload even if it was not
initialized first.  This would cause a crash if userspace called
keyctl_update() on a key with type like "asymmetric" that has a
->preparse() method but not an ->update() method.  Possibly it could
even be triggered for other key types by racing with keyctl_setperm() to
make the KEY_NEED_WRITE check fail (the permission was already checked,
so normally it wouldn't fail there).

Reproducer with key type "asymmetric", given a valid cert.der:

keyctl new_session
keyid=$(keyctl padd asymmetric desc @s < cert.der)
keyctl setperm $keyid 0x3f000000
keyctl update $keyid data

[  150.686666] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
[  150.687601] IP: asymmetric_key_free_kids+0x12/0x30
[  150.688139] PGD 38a3d067
[  150.688141] PUD 3b3de067
[  150.688447] PMD 0
[  150.688745]
[  150.689160] Oops: 0000 [#1] SMP
[  150.689455] Modules linked in:
[  150.689769] CPU: 1 PID: 2478 Comm: keyctl Not tainted 4.11.0-rc4-xfstests-00187-ga9f6b6b8cd2f #742
[  150.690916] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
[  150.692199] task: ffff88003b30c480 task.stack: ffffc90000350000
[  150.692952] RIP: 0010:asymmetric_key_free_kids+0x12/0x30
[  150.693556] RSP: 0018:ffffc90000353e58 EFLAGS: 00010202
[  150.694142] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000004
[  150.694845] RDX: ffffffff81ee3920 RSI: ffff88003d4b0700 RDI: 0000000000000001
[  150.697569] RBP: ffffc90000353e60 R08: ffff88003d5d2140 R09: 0000000000000000
[  150.702483] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[  150.707393] R13: 0000000000000004 R14: ffff880038a4d2d8 R15: 000000000040411f
[  150.709720] FS:  00007fcbcee35700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
[  150.711504] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  150.712733] CR2: 0000000000000001 CR3: 0000000039eab000 CR4: 00000000003406e0
[  150.714487] Call Trace:
[  150.714975]  asymmetric_key_free_preparse+0x2f/0x40
[  150.715907]  key_update+0xf7/0x140
[  150.716560]  ? key_default_cmp+0x20/0x20
[  150.717319]  keyctl_update_key+0xb0/0xe0
[  150.718066]  SyS_keyctl+0x109/0x130
[  150.718663]  entry_SYSCALL_64_fastpath+0x1f/0xc2
[  150.719440] RIP: 0033:0x7fcbce75ff19
[  150.719926] RSP: 002b:00007ffd5d167088 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa
[  150.720918] RAX: ffffffffffffffda RBX: 0000000000404d80 RCX: 00007fcbce75ff19
[  150.721874] RDX: 00007ffd5d16785e RSI: 000000002866cd36 RDI: 0000000000000002
[  150.722827] RBP: 0000000000000006 R08: 000000002866cd36 R09: 00007ffd5d16785e
[  150.723781] R10: 0000000000000004 R11: 0000000000000206 R12: 0000000000404d80
[  150.724650] R13: 00007ffd5d16784d R14: 00007ffd5d167238 R15: 000000000040411f
[  150.725447] Code: 83 c4 08 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 ff 74 23 55 48 89 e5 53 48 89 fb <48> 8b 3f e8 06 21 c5 ff 48 8b 7b 08 e8 fd 20 c5 ff 48 89 df e8
[  150.727489] RIP: asymmetric_key_free_kids+0x12/0x30 RSP: ffffc90000353e58
[  150.728117] CR2: 0000000000000001
[  150.728430] ---[ end trace f7f8fe1da2d5ae8d ]---

Fixes: 4d8c0250b8 ("KEYS: Call ->free_preparse() even after ->preparse() returns an error")
Cc: stable@vger.kernel.org # 3.17+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:47 +10:00
Eric Biggers
5649645d72 KEYS: fix dereferencing NULL payload with nonzero length
sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a
NULL payload with nonzero length to be passed to the key type's
->preparse(), ->instantiate(), and/or ->update() methods.  Various key
types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did
not handle this case, allowing an unprivileged user to trivially cause a
NULL pointer dereference (kernel oops) if one of these key types was
present.  Fix it by doing the copy_from_user() when 'plen' is nonzero
rather than when '_payload' is non-NULL, causing the syscall to fail
with EFAULT as expected when an invalid buffer is specified.

Cc: stable@vger.kernel.org # 2.6.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:47 +10:00
Eric Biggers
0f534e4a13 KEYS: encrypted: use constant-time HMAC comparison
MACs should, in general, be compared using crypto_memneq() to prevent
timing attacks.

Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:47 +10:00
Eric Biggers
64d107d3ac KEYS: encrypted: fix race causing incorrect HMAC calculations
The encrypted-keys module was using a single global HMAC transform,
which could be rekeyed by multiple threads concurrently operating on
different keys, causing incorrect HMAC values to be calculated.  Fix
this by allocating a new HMAC transform whenever we need to calculate a
HMAC.  Also simplify things a bit by allocating the shash_desc's using
SHASH_DESC_ON_STACK() for both the HMAC and unkeyed hashes.

The following script reproduces the bug:

    keyctl new_session
    keyctl add user master "abcdefghijklmnop" @s
    for i in $(seq 2); do
        (
            set -e
            for j in $(seq 1000); do
                keyid=$(keyctl add encrypted desc$i "new user:master 25" @s)
                datablob="$(keyctl pipe $keyid)"
                keyctl unlink $keyid > /dev/null
                keyid=$(keyctl add encrypted desc$i "load $datablob" @s)
                keyctl unlink $keyid > /dev/null
            done
        ) &
    done

Output with bug:

    [  439.691094] encrypted_key: bad hmac (-22)
    add_key: Invalid argument
    add_key: Invalid argument

Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:47 +10:00
Eric Biggers
794b4bc292 KEYS: encrypted: fix buffer overread in valid_master_desc()
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'.  When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer.  Fix this by using strncmp() instead of memcmp().  [Also
clean up the code to deduplicate some logic.]

Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:46 +10:00
Eric Biggers
e9ff56ac35 KEYS: encrypted: avoid encrypting/decrypting stack buffers
Since v4.9, the crypto API cannot (normally) be used to encrypt/decrypt
stack buffers because the stack may be virtually mapped.  Fix this for
the padding buffers in encrypted-keys by using ZERO_PAGE for the
encryption padding and by allocating a temporary heap buffer for the
decryption padding.

Tested with CONFIG_DEBUG_SG=y:
	keyctl new_session
	keyctl add user master "abcdefghijklmnop" @s
	keyid=$(keyctl add encrypted desc "new user:master 25" @s)
	datablob="$(keyctl pipe $keyid)"
	keyctl unlink $keyid
	keyid=$(keyctl add encrypted desc "load $datablob" @s)
	datablob2="$(keyctl pipe $keyid)"
	[ "$datablob" = "$datablob2" ] && echo "Success!"

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:46 +10:00
Eric Biggers
d636bd9f12 KEYS: put keyring if install_session_keyring_to_cred() fails
In join_session_keyring(), if install_session_keyring_to_cred() were to
fail, we would leak the keyring reference, just like in the bug fixed by
commit 23567fd052 ("KEYS: Fix keyring ref leak in
join_session_keyring()").  Fortunately this cannot happen currently, but
we really should be more careful.  Do this by adding and using a new
error label at which the keyring reference is dropped.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:46 +10:00
Markus Elfring
41f1c53e0d KEYS: Delete an error message for a failed memory allocation in get_derived_key()
Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:46 +10:00
Davidlohr Bueso
381f20fceb security: use READ_ONCE instead of deprecated ACCESS_ONCE
With the new standardized functions, we can replace all ACCESS_ONCE()
calls across relevant security/keyrings/.

ACCESS_ONCE() does not work reliably on non-scalar types. For example
gcc 4.6 and 4.7 might remove the volatile tag for such accesses during
the SRA (scalar replacement of aggregates) step:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

Update the new calls regardless of if it is a scalar type, this is
cleaner than having three alternatives.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:45 +10:00
Bilal Amarni
47b2c3fff4 security/keys: add CONFIG_KEYS_COMPAT to Kconfig
CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for
several 64-bit architectures : mips, parisc, tile.

At the moment and for those architectures, calling in 32-bit userspace the
keyctl syscall would return an ENOSYS error.

This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to
make sure the compatibility wrapper is registered by default for any 64-bit
architecture as long as it is configured with CONFIG_COMPAT.

[DH: Modified to remove arm64 compat enablement also as requested by Eric
 Biggers]

Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
cc: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:45 +10:00
John Johansen
fc7e0b26b8 apparmor: move permissions into their own file to be more easily shared
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 12:51:53 -07:00
John Johansen
c961ee5f21 apparmor: convert from securityfs to apparmorfs for policy ns files
Virtualize the apparmor policy/ directory so that the current
namespace affects what part of policy is seen. To do this convert to
using apparmorfs for policy namespace files and setup a magic symlink
in the securityfs apparmor dir to access those files.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:52 -07:00
John Johansen
98407f0a0d apparmor: allow specifying an already created dir to create ns entries in
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:52 -07:00
John Johansen
c97204baf8 apparmor: rename apparmor file fns and data to indicate use
prefixes are used for fns/data that are not static to apparmorfs.c
with the prefixes being
  aafs   - special magic apparmorfs for policy namespace data
  aa_sfs - for fns/data that go into securityfs
  aa_fs  - for fns/data that may be used in the either of aafs or
           securityfs

Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:52 -07:00
John Johansen
a481f4d917 apparmor: add custom apparmorfs that will be used by policy namespace files
AppArmor policy needs to be able to be resolved based on the policy
namespace a task is confined by. Add a base apparmorfs filesystem that
(like nsfs) will exist as a kern mount and be accessed via jump_link
through a securityfs file.

Setup the base apparmorfs fns and data, but don't use it yet.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:51 -07:00
John Johansen
64c8697045 apparmor: use macro template to simplify namespace seq_files
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:51 -07:00
John Johansen
52b97de322 apparmor: use macro template to simplify profile seq_files
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:50 -07:00
John Johansen
5d5182cae4 apparmor: move to per loaddata files, instead of replicating in profiles
The loaddata sets cover more than just a single profile and should
be tracked at the ns level. Move the load data files under the namespace
and reference the files from the profiles via a symlink.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:49 -07:00
John Johansen
6623ec7c4d securityfs: add the ability to support symlinks
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
2017-06-08 12:51:43 -07:00
John Johansen
4227c333f6 apparmor: Move path lookup to using preallocated buffers
Dynamically allocating buffers is problematic and is an extra layer
that is a potntial point of failure and can slow down mediation.
Change path lookup to use the preallocated per cpu buffers.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:29:34 -07:00
John Johansen
72c8a76864 apparmor: allow profiles to provide info to disconnected paths
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:29:34 -07:00
John Johansen
b91deb9db1 apparmor: make internal lib fn skipn_spaces available to the rest of apparmor
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:29:33 -07:00
John Johansen
af7caa8f8d apparmor: move file context into file.h
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:29:33 -07:00
Thomas Schneider
651e54953b security/apparmor: Use POSIX-compatible "printf '%s'"
When using a strictly POSIX-compliant shell, "-n #define ..." gets
written into the file.  Use "printf '%s'" to avoid this.

Signed-off-by: Thomas Schneider <qsx@qsx.re>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:29:27 -07:00
Dan Carpenter
ffac1de6cf apparmor: Fix error cod in __aa_fs_profile_mkdir()
We can either return PTR_ERR(NULL) or a PTR_ERR(a valid pointer) here.
Returning NULL is probably not good, but since this happens at boot
then we are probably already toasted if we were to hit this bug in real
life.  In other words, it seems like a very low severity bug to me.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:21:05 -07:00
Markus Elfring
47dbd1cdbb apparmorfs: Use seq_putc() in two functions
Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:21:02 -07:00
Markus Elfring
0ff3d97f76 apparmorfs: Combine two function calls into one in aa_fs_seq_raw_abi_show()
A bit of data was put into a sequence by two separate function calls.
Print the same data by a single function call instead.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-06-08 11:20:58 -07:00
Christoph Hellwig
85787090a2 fs: switch ->s_uuid to uuid_t
For some file systems we still memcpy into it, but in various places this
already allows us to use the proper uuid helpers.  More to come..

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> (Changes to IMA/EVM)
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-06-05 16:59:12 +02:00
Christoph Hellwig
787d8c530a ima/policy: switch to use uuid_t
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-06-05 16:59:11 +02:00
Christoph Hellwig
1dd771eb0b block: remove blk_part_pack_uuid
This helper was only used by IMA of all things, which would get spurious
errors if CONFIG_BLOCK is disabled.  Just opencode the call there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-06-05 16:59:10 +02:00
Florian Westphal
8e71bf75ef selinux: use pernet operations for hook registration
It will allow us to remove the old netfilter hook api in the near future.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-06-02 10:27:46 -04:00
Casey Schaufler
f28e783ff6 Smack: Use cap_capable in privilege check
Use cap_capable() rather than capable() in the Smack privilege
check as the former does not invoke other security module
privilege check, while the later does. This becomes important
when stacking. It may be a problem even with minor modules.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-06-01 09:27:21 -07:00
Casey Schaufler
51d59af26f Smack: Safer check for a socket in file_receive
The check of S_ISSOCK() in smack_file_receive() is not
appropriate if the passed descriptor is a socket.

Reported-by: Stephen Smalley <sds@tyco.nsa.gov>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-06-01 09:27:12 -07:00
Florian Westphal
e661a58279 smack: use pernet operations for hook registration
It will allow us to remove the old netfilter hook api in the near future.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-06-01 09:26:43 -07:00
Al Viro
0b884d25f5 sel_write_validatetrans(): don't open-code memdup_user_nul()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-25 23:28:36 -04:00
Daniel Jurgens
409dcf3153 selinux: Add a cache for quicker retreival of PKey SIDs
It is likely that the SID for the same PKey will be requested many
times. To reduce the time to modify QPs and process MADs use a cache to
store PKey SIDs.

This code is heavily based on the "netif" and "netport" concept
originally developed by James Morris <jmorris@redhat.com> and Paul Moore
<paul@paul-moore.com> (see security/selinux/netif.c and
security/selinux/netport.c for more information)

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:28:12 -04:00
Daniel Jurgens
ab861dfca1 selinux: Add IB Port SMP access vector
Add a type for Infiniband ports and an access vector for subnet
management packets. Implement the ib_port_smp hook to check that the
caller has permission to send and receive SMPs on the end port specified
by the device name and port. Add interface to query the SID for a IB
port, which walks the IB_PORT ocontexts to find an entry for the
given name and port.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:28:02 -04:00
Daniel Jurgens
cfc4d882d4 selinux: Implement Infiniband PKey "Access" access vector
Add a type and access vector for PKeys. Implement the ib_pkey_access
hook to check that the caller has permission to access the PKey on the
given subnet prefix. Add an interface to get the PKey SID. Walk the PKey
ocontexts to find an entry for the given subnet prefix and pkey.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:27:50 -04:00
Daniel Jurgens
3a976fa676 selinux: Allocate and free infiniband security hooks
Implement and attach hooks to allocate and free Infiniband object
security structures.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:27:41 -04:00
Daniel Jurgens
a806f7a161 selinux: Create policydb version for Infiniband support
Support for Infiniband requires the addition of two new object contexts,
one for infiniband PKeys and another IB Ports. Added handlers to read
and write the new ocontext types when reading or writing a binary policy
representation.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:27:32 -04:00
Daniel Jurgens
47a2b338fe IB/core: Enforce security on management datagrams
Allocate and free a security context when creating and destroying a MAD
agent.  This context is used for controlling access to PKeys and sending
and receiving SMPs.

When sending or receiving a MAD check that the agent has permission to
access the PKey for the Subnet Prefix of the port.

During MAD and snoop agent registration for SMI QPs check that the
calling process has permission to access the manage the subnet  and
register a callback with the LSM to be notified of policy changes. When
notificaiton of a policy change occurs recheck permission and set a flag
indicating sending and receiving SMPs is allowed.

When sending and receiving MADs check that the agent has access to the
SMI if it's on an SMI QP.  Because security policy can change it's
possible permission was allowed when creating the agent, but no longer
is.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
[PM: remove the LSM hook init code]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:27:21 -04:00
Daniel Jurgens
8f408ab64b selinux lsm IB/core: Implement LSM notification system
Add a generic notificaiton mechanism in the LSM. Interested consumers
can register a callback with the LSM and security modules can produce
events.

Because access to Infiniband QPs are enforced in the setup phase of a
connection security should be enforced again if the policy changes.
Register infiniband devices for policy change notification and check all
QPs on that device when the notification is received.

Add a call to the notification mechanism from SELinux when the AVC
cache changes or setenforce is cleared.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:27:11 -04:00
Daniel Jurgens
d291f1a652 IB/core: Enforce PKey security on QPs
Add new LSM hooks to allocate and free security contexts and check for
permission to access a PKey.

Allocate and free a security context when creating and destroying a QP.
This context is used for controlling access to PKeys.

When a request is made to modify a QP that changes the port, PKey index,
or alternate path, check that the QP has permission for the PKey in the
PKey table index on the subnet prefix of the port. If the QP is shared
make sure all handles to the QP also have access.

Store which port and PKey index a QP is using. After the reset to init
transition the user can modify the port, PKey index and alternate path
independently. So port and PKey settings changes can be a merge of the
previous settings and the new ones.

In order to maintain access control if there are PKey table or subnet
prefix change keep a list of all QPs are using each PKey index on
each port. If a change occurs all QPs using that device and port must
have access enforced for the new cache settings.

These changes add a transaction to the QP modify process. Association
with the old port and PKey index must be maintained if the modify fails,
and must be removed if it succeeds. Association with the new port and
PKey index must be established prior to the modify and removed if the
modify fails.

1. When a QP is modified to a particular Port, PKey index or alternate
   path insert that QP into the appropriate lists.

2. Check permission to access the new settings.

3. If step 2 grants access attempt to modify the QP.

4a. If steps 2 and 3 succeed remove any prior associations.

4b. If ether fails remove the new setting associations.

If a PKey table or subnet prefix changes walk the list of QPs and
check that they have permission. If not send the QP to the error state
and raise a fatal error event. If it's a shared QP make sure all the
QPs that share the real_qp have permission as well. If the QP that
owns a security structure is denied access the security structure is
marked as such and the QP is added to an error_list. Once the moving
the QP to error is complete the security structure mark is cleared.

Maintaining the lists correctly turns QP destroy into a transaction.
The hardware driver for the device frees the ib_qp structure, so while
the destroy is in progress the ib_qp pointer in the ib_qp_security
struct is undefined. When the destroy process begins the ib_qp_security
structure is marked as destroying. This prevents any action from being
taken on the QP pointer. After the QP is destroyed successfully it
could still listed on an error_list wait for it to be processed by that
flow before cleaning up the structure.

If the destroy fails the QPs port and PKey settings are reinserted into
the appropriate lists, the destroying flag is cleared, and access control
is enforced, in case there were any cache changes during the destroy
flow.

To keep the security changes isolated a new file is used to hold security
related functionality.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
[PM: merge fixup in ib_verbs.h and uverbs_cmd.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 12:26:59 -04:00
Matthias Kaehlcke
270e857314 selinux: Remove redundant check for unknown labeling behavior
The check is already performed in ocontext_read() when the policy is
loaded. Removing the array also fixes the following warning when
building with clang:

security/selinux/hooks.c:338:20: error: variable 'labeling_behaviors'
    is not needed and will not be emitted
    [-Werror,-Wunneeded-internal-declaration]

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:24:06 -04:00
Stephen Smalley
4dc2fce342 selinux: log policy capability state when a policy is loaded
Log the state of SELinux policy capabilities when a policy is loaded.
For each policy capability known to the kernel, log the policy capability
name and the value set in the policy.  For policy capabilities that are
set in the loaded policy but unknown to the kernel, log the policy
capability index, since this is the only information presently available
in the policy.

Sample output with a policy created with a new capability defined
that is not known to the kernel:
SELinux:  policy capability network_peer_controls=1
SELinux:  policy capability open_perms=1
SELinux:  policy capability extended_socket_class=1
SELinux:  policy capability always_check_network=0
SELinux:  policy capability cgroup_seclabel=0
SELinux:  unknown policy capability 5

Resolves: https://github.com/SELinuxProject/selinux-kernel/issues/32

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:23:50 -04:00
Stephen Smalley
ccb544781d selinux: do not check open permission on sockets
open permission is currently only defined for files in the kernel
(COMMON_FILE_PERMS rather than COMMON_FILE_SOCK_PERMS). Construction of
an artificial test case that tries to open a socket via /proc/pid/fd will
generate a recvfrom avc denial because recvfrom and open happen to map to
the same permission bit in socket vs file classes.

open of a socket via /proc/pid/fd is not supported by the kernel regardless
and will ultimately return ENXIO. But we hit the permission check first and
can thus produce these odd/misleading denials.  Omit the open check when
operating on a socket.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:23:42 -04:00
Stephen Smalley
3ba4bf5f1e selinux: add a map permission check for mmap
Add a map permission check on mmap so that we can distinguish memory mapped
access (since it has different implications for revocation). When a file
is opened and then read or written via syscalls like read(2)/write(2),
we revalidate access on each read/write operation via
selinux_file_permission() and therefore can revoke access if the
process context, the file context, or the policy changes in such a
manner that access is no longer allowed. When a file is opened and then
memory mapped via mmap(2) and then subsequently read or written directly
in memory, we presently have no way to revalidate or revoke access.
The purpose of a separate map permission check on mmap(2) is to permit
policy to prohibit memory mapping of specific files for which we need
to ensure that every access is revalidated, particularly useful for
scenarios where we expect the file to be relabeled at runtime in order
to reflect state changes (e.g. cross-domain solution, assured pipeline
without data copying).

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:23:39 -04:00
Stephen Smalley
db59000ab7 selinux: only invoke capabilities and selinux for CAP_MAC_ADMIN checks
SELinux uses CAP_MAC_ADMIN to control the ability to get or set a raw,
uninterpreted security context unknown to the currently loaded security
policy. When performing these checks, we only want to perform a base
capabilities check and a SELinux permission check.  If any other
modules that implement a capable hook are stacked with SELinux, we do
not want to require them to also have to authorize CAP_MAC_ADMIN,
since it may have different implications for their security model.
Rework the CAP_MAC_ADMIN checks within SELinux to only invoke the
capabilities module and the SELinux permission checking.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:23:22 -04:00
Markus Elfring
46be14d2b6 selinux: Return an error code only as a constant in sidtab_insert()
* Return an error code without storing it in an intermediate variable.

* Delete the local variable "rc" and the jump label "out" which became
  unnecessary with this refactoring.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:23:17 -04:00
Markus Elfring
62934ffb9e selinux: Return directly after a failed memory allocation in policydb_index()
Replace five goto statements (and previous variable assignments) by
direct returns after a memory allocation failure in this function.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:23:12 -04:00
Tetsuo Handa
a79be23860 selinux: Use task_alloc hook rather than task_create hook
This patch is a preparation for getting rid of task_create hook because
task_alloc hook which can do what task_create hook can do was revived.

Creating a new thread is unlikely prohibited by security policy, for
fork()/execve()/exit() is fundamental of how processes are managed in
Unix. If a program is known to create a new thread, it is likely that
permission to create a new thread is given to that program. Therefore,
a situation where security_task_create() returns an error is likely that
the program was exploited and lost control. Even if SELinux failed to
check permission to create a thread at security_task_create(), SELinux
can later check it at security_task_alloc(). Since the new thread is not
yet visible from the rest of the system, nobody can do bad things using
the new thread. What we waste will be limited to some initialization
steps such as dup_task_struct(), copy_creds() and audit_alloc() in
copy_process(). We can tolerate these overhead for unlikely situation.

Therefore, this patch changes SELinux to use task_alloc hook rather than
task_create hook so that we can remove task_create hook.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-05-23 10:23:02 -04:00
James Morris
d68c51e0b3 Sync to mainline for security submaintainers to work against 2017-05-22 16:32:40 +10:00
Kees Cook
5395d312df doc: ReSTify keys-trusted-encrypted.txt
Adjusts for ReST markup and moves under keys security devel index.

Cc: David Howells <dhowells@redhat.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-05-18 10:33:56 -06:00
Kees Cook
3db38ed768 doc: ReSTify keys-request-key.txt
Adjusts for ReST markup and moves under keys security devel index.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-05-18 10:33:51 -06:00
Kees Cook
90bb766440 doc: ReSTify Yama.txt
Adjusts for ReST markup and moves under LSM admin guide.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-05-18 10:33:04 -06:00
Kees Cook
26fccd9ed2 doc: ReSTify apparmor.txt
Adjusts for ReST markup and moves under LSM admin guide.

Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-05-18 10:32:38 -06:00
Geert Uytterhoeven
99c55fb18f security: Grammar s/allocates/allocated/
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-05-15 10:02:13 +10:00
Mickaël Salaün
3bb857e47e LSM: Enable multiple calls to security_add_hooks() for the same LSM
The commit d69dece5f5 ("LSM: Add /sys/kernel/security/lsm") extend
security_add_hooks() with a new parameter to register the LSM name,
which may be useful to make the list of currently loaded LSM available
to userspace. However, there is no clean way for an LSM to split its
hook declarations into multiple files, which may reduce the mess with
all the included files (needed for LSM hook argument types) and make the
source code easier to review and maintain.

This change allows an LSM to register multiple times its hook while
keeping a consistent list of LSM names as described in
Documentation/security/LSM.txt . The list reflects the order in which
checks are made. This patch only check for the last registered LSM. If
an LSM register multiple times its hooks, interleaved with other LSM
registrations (which should not happen), its name will still appear in
the same order that the hooks are called, hence multiple times.

To sum up, "capability,selinux,foo,foo" will be replaced with
"capability,selinux,foo", however "capability,foo,selinux,foo" will
remain as is.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-05-15 09:35:53 +10:00
Linus Torvalds
11fbf53d66 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted bits and pieces from various people. No common topic in this
  pile, sorry"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs/affs: add rename exchange
  fs/affs: add rename2 to prepare multiple methods
  Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx()
  fs: don't set *REFERENCED on single use objects
  fs: compat: Remove warning from COMPATIBLE_IOCTL
  remove pointless extern of atime_need_update_rcu()
  fs: completely ignore unknown open flags
  fs: add a VALID_OPEN_FLAGS
  fs: remove _submit_bh()
  fs: constify tree_descr arrays passed to simple_fill_super()
  fs: drop duplicate header percpu-rwsem.h
  fs/affs: bugfix: Write files greater than page size on OFS
  fs/affs: bugfix: enable writes on OFS disks
  fs/affs: remove node generation check
  fs/affs: import amigaffs.h
  fs/affs: bugfix: make symbolic links work again
2017-05-09 09:12:53 -07:00
Deepa Dinamani
24d0d03c2e apparmorfs: replace CURRENT_TIME with current_time()
CURRENT_TIME macro is not y2038 safe on 32 bit systems.

The patch replaces all the uses of CURRENT_TIME by current_time().

This is also in preparation for the patch that transitions vfs
timestamps to use 64 bit time and hence make them y2038 safe.
current_time() is also planned to be transitioned to y2038 safe behavior
along with this change.

CURRENT_TIME macro will be deleted before merging the aforementioned
change.

Link: http://lkml.kernel.org/r/1491613030-11599-11-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:15 -07:00
Michal Hocko
752ade68cb treewide: use kv[mz]alloc* rather than opencoded variants
There are many code paths opencoding kvmalloc.  Let's use the helper
instead.  The main difference to kvmalloc is that those users are
usually not considering all the aspects of the memory allocator.  E.g.
allocation requests <= 32kB (with 4kB pages) are basically never failing
and invoke OOM killer to satisfy the allocation.  This sounds too
disruptive for something that has a reasonable fallback - the vmalloc.
On the other hand those requests might fallback to vmalloc even when the
memory allocator would succeed after several more reclaim/compaction
attempts previously.  There is no guarantee something like that happens
though.

This patch converts many of those places to kv[mz]alloc* helpers because
they are more conservative.

Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Santosh Raspatur <santosh@chelsio.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: "Yan, Zheng" <zyan@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:13 -07:00
Michal Hocko
a7c3e901a4 mm: introduce kv[mz]alloc helpers
Patch series "kvmalloc", v5.

There are many open coded kmalloc with vmalloc fallback instances in the
tree.  Most of them are not careful enough or simply do not care about
the underlying semantic of the kmalloc/page allocator which means that
a) some vmalloc fallbacks are basically unreachable because the kmalloc
part will keep retrying until it succeeds b) the page allocator can
invoke a really disruptive steps like the OOM killer to move forward
which doesn't sound appropriate when we consider that the vmalloc
fallback is available.

As it can be seen implementing kvmalloc requires quite an intimate
knowledge if the page allocator and the memory reclaim internals which
strongly suggests that a helper should be implemented in the memory
subsystem proper.

Most callers, I could find, have been converted to use the helper
instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
in the networking stack which I have converted as well and Eric Dumazet
was not opposed [2] to convert them as well.

[1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com

This patch (of 9):

Using kmalloc with the vmalloc fallback for larger allocations is a
common pattern in the kernel code.  Yet we do not have any common helper
for that and so users have invented their own helpers.  Some of them are
really creative when doing so.  Let's just add kv[mz]alloc and make sure
it is implemented properly.  This implementation makes sure to not make
a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
to not warn about allocation failures.  This also rules out the OOM
killer as the vmalloc is a more approapriate fallback than a disruptive
user visible action.

This patch also changes some existing users and removes helpers which
are specific for them.  In some cases this is not possible (e.g.
ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
require GFP_NO{FS,IO} context which is not vmalloc compatible in general
(note that the page table allocation is GFP_KERNEL).  Those need to be
fixed separately.

While we are at it, document that __vmalloc{_node} about unsupported gfp
mask because there seems to be a lot of confusion out there.
kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
superset) flags to catch new abusers.  Existing ones would have to die
slowly.

[sfr@canb.auug.org.au: f2fs fixup]
  Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:12 -07:00
Linus Torvalds
0302e28dee Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

  IMA:
   - provide ">" and "<" operators for fowner/uid/euid rules

  KEYS:
   - add a system blacklist keyring

   - add KEYCTL_RESTRICT_KEYRING, exposes keyring link restriction
     functionality to userland via keyctl()

  LSM:
   - harden LSM API with __ro_after_init

   - add prlmit security hook, implement for SELinux

   - revive security_task_alloc hook

  TPM:
   - implement contextual TPM command 'spaces'"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (98 commits)
  tpm: Fix reference count to main device
  tpm_tis: convert to using locality callbacks
  tpm: fix handling of the TPM 2.0 event logs
  tpm_crb: remove a cruft constant
  keys: select CONFIG_CRYPTO when selecting DH / KDF
  apparmor: Make path_max parameter readonly
  apparmor: fix parameters so that the permission test is bypassed at boot
  apparmor: fix invalid reference to index variable of iterator line 836
  apparmor: use SHASH_DESC_ON_STACK
  security/apparmor/lsm.c: set debug messages
  apparmor: fix boolreturn.cocci warnings
  Smack: Use GFP_KERNEL for smk_netlbl_mls().
  smack: fix double free in smack_parse_opts_str()
  KEYS: add SP800-56A KDF support for DH
  KEYS: Keyring asymmetric key restrict method with chaining
  KEYS: Restrict asymmetric key linkage using a specific keychain
  KEYS: Add a lookup_restriction function for the asymmetric key type
  KEYS: Add KEYCTL_RESTRICT_KEYRING
  KEYS: Consistent ordering for __key_link_begin and restrict check
  KEYS: Add an optional lookup_restriction hook to key_type
  ...
2017-05-03 08:50:52 -07:00
Linus Torvalds
8d65b08deb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Millar:
 "Here are some highlights from the 2065 networking commits that
  happened this development cycle:

   1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)

   2) Add a generic XDP driver, so that anyone can test XDP even if they
      lack a networking device whose driver has explicit XDP support
      (me).

   3) Sparc64 now has an eBPF JIT too (me)

   4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
      Starovoitov)

   5) Make netfitler network namespace teardown less expensive (Florian
      Westphal)

   6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)

   7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)

   8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)

   9) Multiqueue support in stmmac driver (Joao Pinto)

  10) Remove TCP timewait recycling, it never really could possibly work
      well in the real world and timestamp randomization really zaps any
      hint of usability this feature had (Soheil Hassas Yeganeh)

  11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
      Aleksandrov)

  12) Add socket busy poll support to epoll (Sridhar Samudrala)

  13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
      and several others)

  14) IPSEC hw offload infrastructure (Steffen Klassert)"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
  tipc: refactor function tipc_sk_recv_stream()
  tipc: refactor function tipc_sk_recvmsg()
  net: thunderx: Optimize page recycling for XDP
  net: thunderx: Support for XDP header adjustment
  net: thunderx: Add support for XDP_TX
  net: thunderx: Add support for XDP_DROP
  net: thunderx: Add basic XDP support
  net: thunderx: Cleanup receive buffer allocation
  net: thunderx: Optimize CQE_TX handling
  net: thunderx: Optimize RBDR descriptor handling
  net: thunderx: Support for page recycling
  ipx: call ipxitf_put() in ioctl error path
  net: sched: add helpers to handle extended actions
  qed*: Fix issues in the ptp filter config implementation.
  qede: Fix concurrency issue in PTP Tx path processing.
  stmmac: Add support for SIMATIC IOT2000 platform
  net: hns: fix ethtool_get_strings overflow in hns driver
  tcp: fix wraparound issue in tcp_lp
  bpf, arm64: fix jit branch offset related to ldimm64
  bpf, arm64: implement jiting of BPF_XADD
  ...
2017-05-02 16:40:27 -07:00
Linus Torvalds
c58d4055c0 A reasonably busy cycle for documentation this time around. There is a new
guide for user-space API documents, rather sparsely populated at the
 moment, but it's a start.  Markus improved the infrastructure for
 converting diagrams.  Mauro has converted much of the USB documentation
 over to RST.  Plus the usual set of fixes, improvements, and tweaks.
 
 There's a bit more than the usual amount of reaching out of Documentation/
 to fix comments elsewhere in the tree; I have acks for those where I could
 get them.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZB1elAAoJEI3ONVYwIuV6wUIQAJSM/4rNdj6z+GXeWhRfbsOo
 vqqVYluvXQIJaaqdsy9dgcfThhOXWYsPyVF6Xd+bDJpwF3BMZYbX1CI1Mo3kRD+7
 9+Pf68cYSHRoU3l/sFI8q0zfKbHtmFteIvnRQoFtRaExqgTR8glUfxNDyN9XuNAZ
 3naS4qMZivM4gjMcSpIB/wFOQpV+6qVIs6VTFLdCC8wodT3W/Wmb+bqrCVJ0twbB
 t8jJeYHt2wsiTdqrKU+VilAUAZ1Lby+DNfeWrO18rC1ohktPyUzOGg8JmTKUBpVO
 qj1OJwD6abuaNh/J9bXsh8u0OrVrBKWjVrhq9IFYDlm92fu3Bgr6YeoaVPEpcklt
 jdlgZnWs9/oXa6d32aMc9F7mP9a0Q1qikFTYINhaHQZCb4VDRuQ9hCSuqWm5jlVy
 lmVAoxLa0zSdOoXaYuO3HC99ku1cIn814CXMDz/IwKXkqUCV+zl+H3AGkvxGyQ5M
 eblw2TnQnc6e1LRcxt5bgpFR1JYMbCJhu0U5XrNFueQV8ReB15dvL7h4y21dWJKF
 2Sr83rwfG1rpZQiSqCjOXxIzuXbEGH3+a+zCDV5IHhQRt/VNDOt2hgmcyucSSJ5h
 5GRFYgTlGvoT/6LdIT39QooHB+4tSDRtEQ6lh0q2ZtVV2rfG/I6/PR5sUbWM65SN
 vAfctRm2afHLhdonSX5O
 =41m+
 -----END PGP SIGNATURE-----

Merge tag 'docs-4.12' of git://git.lwn.net/linux

Pull documentation update from Jonathan Corbet:
 "A reasonably busy cycle for documentation this time around. There is a
  new guide for user-space API documents, rather sparsely populated at
  the moment, but it's a start. Markus improved the infrastructure for
  converting diagrams. Mauro has converted much of the USB documentation
  over to RST. Plus the usual set of fixes, improvements, and tweaks.

  There's a bit more than the usual amount of reaching out of
  Documentation/ to fix comments elsewhere in the tree; I have acks for
  those where I could get them"

* tag 'docs-4.12' of git://git.lwn.net/linux: (74 commits)
  docs: Fix a couple typos
  docs: Fix a spelling error in vfio-mediated-device.txt
  docs: Fix a spelling error in ioctl-number.txt
  MAINTAINERS: update file entry for HSI subsystem
  Documentation: allow installing man pages to a user defined directory
  Doc/PM: Sync with intel_powerclamp code behavior
  zr364xx.rst: usb/devices is now at /sys/kernel/debug/
  usb.rst: move documentation from proc_usb_info.txt to USB ReST book
  convert philips.txt to ReST and add to media docs
  docs-rst: usb: update old usbfs-related documentation
  arm: Documentation: update a path name
  docs: process/4.Coding.rst: Fix a couple of document refs
  docs-rst: fix usb cross-references
  usb: gadget.h: be consistent at kernel doc macros
  usb: composite.h: fix two warnings when building docs
  usb: get rid of some ReST doc build errors
  usb.rst: get rid of some Sphinx errors
  usb/URB.txt: convert to ReST and update it
  usb/persist.txt: convert to ReST and add to driver-api book
  usb/hotplug.txt: convert to ReST and add to driver-api book
  ...
2017-05-02 10:21:17 -07:00
Linus Torvalds
5db6db0d40 Merge branch 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull uaccess unification updates from Al Viro:
 "This is the uaccess unification pile. It's _not_ the end of uaccess
  work, but the next batch of that will go into the next cycle. This one
  mostly takes copy_from_user() and friends out of arch/* and gets the
  zero-padding behaviour in sync for all architectures.

  Dealing with the nocache/writethrough mess is for the next cycle;
  fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
  sold on access_ok() in there, BTW; just not in this pile), same for
  reducing __copy_... callsites, strn*... stuff, etc. - there will be a
  pile about as large as this one in the next merge window.

  This one sat in -next for weeks. -3KLoC"

* 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
  HAVE_ARCH_HARDENED_USERCOPY is unconditional now
  CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
  m32r: switch to RAW_COPY_USER
  hexagon: switch to RAW_COPY_USER
  microblaze: switch to RAW_COPY_USER
  get rid of padding, switch to RAW_COPY_USER
  ia64: get rid of copy_in_user()
  ia64: sanitize __access_ok()
  ia64: get rid of 'segment' argument of __do_{get,put}_user()
  ia64: get rid of 'segment' argument of __{get,put}_user_check()
  ia64: add extable.h
  powerpc: get rid of zeroing, switch to RAW_COPY_USER
  esas2r: don't open-code memdup_user()
  alpha: fix stack smashing in old_adjtimex(2)
  don't open-code kernel_setsockopt()
  mips: switch to RAW_COPY_USER
  mips: get rid of tail-zeroing in primitives
  mips: make copy_from_user() zero tail explicitly
  mips: clean and reorder the forest of macros...
  mips: consolidate __invoke_... wrappers
  ...
2017-05-01 14:41:04 -07:00
Eric Biggers
cda37124f4 fs: constify tree_descr arrays passed to simple_fill_super()
simple_fill_super() is passed an array of tree_descr structures which
describe the files to create in the filesystem's root directory.  Since
these arrays are never modified intentionally, they should be 'const' so
that they are placed in .rodata and benefit from memory protection.
This patch updates the function signature and all users, and also
constifies tree_descr.name.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-26 23:54:06 -04:00
Al Viro
2fefc97b21 HAVE_ARCH_HARDENED_USERCOPY is unconditional now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-26 12:11:06 -04:00
David S. Miller
fb796707d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Both conflict were simple overlapping changes.

In the kaweth case, Eric Dumazet's skb_cow() bug fix overlapped the
conversion of the driver in net-next to use in-netdev stats.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 20:23:53 -07:00
James Morris
f65cc104c4 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2017-04-19 22:00:15 +10:00
James Morris
6859e21e81 Merge branch 'smack-for-4.12' of git://github.com/cschaufler/smack-next into next 2017-04-19 08:35:01 +10:00
James Morris
fa5b5b26e2 Merge branch 'stable-4.12' of git://git.infradead.org/users/pcmoore/selinux into next 2017-04-19 08:30:08 +10:00
Eric Biggers
c9f838d104 KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings
This fixes CVE-2017-7472.

Running the following program as an unprivileged user exhausts kernel
memory by leaking thread keyrings:

	#include <keyutils.h>

	int main()
	{
		for (;;)
			keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
	}

Fix it by only creating a new thread keyring if there wasn't one before.
To make things more consistent, make install_thread_keyring_to_cred()
and install_process_keyring_to_cred() both return 0 if the corresponding
keyring is already present.

Fixes: d84f4f992c ("CRED: Inaugurate COW credentials")
Cc: stable@vger.kernel.org # 2.6.29+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-04-18 15:31:49 +01:00
David Howells
c1644fe041 KEYS: Change the name of the dead type to ".dead" to prevent user access
This fixes CVE-2017-6951.

Userspace should not be able to do things with the "dead" key type as it
doesn't have some of the helper functions set upon it that the kernel
needs.  Attempting to use it may cause the kernel to crash.

Fix this by changing the name of the type to ".dead" so that it's rejected
up front on userspace syscalls by key_get_type_from_user().

Though this doesn't seem to affect recent kernels, it does affect older
ones, certainly those prior to:

	commit c06cfb08b8
	Author: David Howells <dhowells@redhat.com>
	Date:   Tue Sep 16 17:36:06 2014 +0100
	KEYS: Remove key_type::match in favour of overriding default by match_preparse

which went in before 3.18-rc1.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
2017-04-18 15:31:39 +01:00
David Howells
ee8f844e3c KEYS: Disallow keyrings beginning with '.' to be joined as session keyrings
This fixes CVE-2016-9604.

Keyrings whose name begin with a '.' are special internal keyrings and so
userspace isn't allowed to create keyrings by this name to prevent
shadowing.  However, the patch that added the guard didn't fix
KEYCTL_JOIN_SESSION_KEYRING.  Not only can that create dot-named keyrings,
it can also subscribe to them as a session keyring if they grant SEARCH
permission to the user.

This, for example, allows a root process to set .builtin_trusted_keys as
its session keyring, at which point it has full access because now the
possessor permissions are added.  This permits root to add extra public
keys, thereby bypassing module verification.

This also affects kexec and IMA.

This can be tested by (as root):

	keyctl session .builtin_trusted_keys
	keyctl add user a a @s
	keyctl list @s

which on my test box gives me:

	2 keys in keyring:
	180010936: ---lswrv     0     0 asymmetric: Build time autogenerated kernel key: ae3d4a31b82daa8e1a75b49dc2bba949fd992a05
	801382539: --alswrv     0     0 user: a


Fix this by rejecting names beginning with a '.' in the keyctl.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
cc: linux-ima-devel@lists.sourceforge.net
cc: stable@vger.kernel.org
2017-04-18 15:31:35 +01:00
James Morris
30a83251dd Merge tag 'keys-next-20170412' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2017-04-18 07:37:51 +10:00
Stephan Müller
4cd4ca7cc8 keys: select CONFIG_CRYPTO when selecting DH / KDF
Select CONFIG_CRYPTO in addition to CONFIG_HASH to ensure that
also CONFIG_HASH2 is selected. Both are needed for the shash
cipher support required for the KDF operation.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-04-11 23:18:09 +01:00
John Johansen
622f6e3265 apparmor: Make path_max parameter readonly
The path_max parameter determines the max size of buffers allocated
but it should  not be setable at run time. If can be used to cause an
oops

root@ubuntu:~# echo 16777216 > /sys/module/apparmor/parameters/path_max
root@ubuntu:~# cat /sys/module/apparmor/parameters/path_max
Killed

[  122.141911] BUG: unable to handle kernel paging request at ffff880080945fff
[  122.143497] IP: [<ffffffff81228844>] d_absolute_path+0x44/0xa0
[  122.144742] PGD 220c067 PUD 0
[  122.145453] Oops: 0002 [#1] SMP
[  122.146204] Modules linked in: vmw_vsock_vmci_transport vsock ppdev vmw_balloon snd_ens1371 btusb snd_ac97_codec gameport snd_rawmidi btrtl snd_seq_device ac97_bus btbcm btintel snd_pcm input_leds bluetooth snd_timer snd joydev soundcore serio_raw coretemp shpchp nfit parport_pc i2c_piix4 8250_fintek vmw_vmci parport mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd vmwgfx psmouse mptspi ttm mptscsih drm_kms_helper mptbase syscopyarea scsi_transport_spi sysfillrect
[  122.163365]  ahci sysimgblt e1000 fb_sys_fops libahci drm pata_acpi fjes
[  122.164747] CPU: 3 PID: 1501 Comm: bash Not tainted 4.4.0-59-generic #80-Ubuntu
[  122.166250] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[  122.168611] task: ffff88003496aa00 ti: ffff880076474000 task.ti: ffff880076474000
[  122.170018] RIP: 0010:[<ffffffff81228844>]  [<ffffffff81228844>] d_absolute_path+0x44/0xa0
[  122.171525] RSP: 0018:ffff880076477b90  EFLAGS: 00010206
[  122.172462] RAX: ffff880080945fff RBX: 0000000000000000 RCX: 0000000001000000
[  122.173709] RDX: 0000000000ffffff RSI: ffff880080946000 RDI: ffff8800348a1010
[  122.174978] RBP: ffff880076477bb8 R08: ffff880076477c80 R09: 0000000000000000
[  122.176227] R10: 00007ffffffff000 R11: ffff88007f946000 R12: ffff88007f946000
[  122.177496] R13: ffff880076477c80 R14: ffff8800348a1010 R15: ffff8800348a2400
[  122.178745] FS:  00007fd459eb4700(0000) GS:ffff88007b6c0000(0000) knlGS:0000000000000000
[  122.180176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  122.181186] CR2: ffff880080945fff CR3: 0000000073422000 CR4: 00000000001406e0
[  122.182469] Stack:
[  122.182843]  00ffffff00000001 ffff880080946000 0000000000000000 0000000000000000
[  122.184409]  00000000570f789c ffff880076477c30 ffffffff81385671 ffff88007a2e7a58
[  122.185810]  0000000000000000 ffff880076477c88 01000000008a1000 0000000000000000
[  122.187231] Call Trace:
[  122.187680]  [<ffffffff81385671>] aa_path_name+0x81/0x370
[  122.188637]  [<ffffffff813875dd>] profile_transition+0xbd/0xb80
[  122.190181]  [<ffffffff811af9bc>] ? zone_statistics+0x7c/0xa0
[  122.191674]  [<ffffffff81389b20>] apparmor_bprm_set_creds+0x9b0/0xac0
[  122.193288]  [<ffffffff812e1971>] ? ext4_xattr_get+0x81/0x220
[  122.194793]  [<ffffffff812e800c>] ? ext4_xattr_security_get+0x1c/0x30
[  122.196392]  [<ffffffff813449b9>] ? get_vfs_caps_from_disk+0x69/0x110
[  122.198004]  [<ffffffff81232d4f>] ? mnt_may_suid+0x3f/0x50
[  122.199737]  [<ffffffff81344b03>] ? cap_bprm_set_creds+0xa3/0x600
[  122.201377]  [<ffffffff81346e53>] security_bprm_set_creds+0x33/0x50
[  122.203024]  [<ffffffff81214ce5>] prepare_binprm+0x85/0x190
[  122.204515]  [<ffffffff81216545>] do_execveat_common.isra.33+0x485/0x710
[  122.206200]  [<ffffffff81216a6a>] SyS_execve+0x3a/0x50
[  122.207615]  [<ffffffff81838795>] stub_execve+0x5/0x5
[  122.208978]  [<ffffffff818384f2>] ? entry_SYSCALL_64_fastpath+0x16/0x71
[  122.210615] Code: f8 31 c0 48 63 c2 83 ea 01 48 c7 45 e8 00 00 00 00 48 01 c6 85 d2 48 c7 45 f0 00 00 00 00 48 89 75 e0 89 55 dc 78 0c 48 8d 46 ff <c6> 46 ff 00 48 89 45 e0 48 8d 55 e0 48 8d 4d dc 48 8d 75 e8 e8
[  122.217320] RIP  [<ffffffff81228844>] d_absolute_path+0x44/0xa0
[  122.218860]  RSP <ffff880076477b90>
[  122.219919] CR2: ffff880080945fff
[  122.220936] ---[ end trace 506cdbd85eb6c55e ]---

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-07 08:58:36 +10:00
John Johansen
545de8fe0f apparmor: fix parameters so that the permission test is bypassed at boot
Boot parameters are written before apparmor is ready to answer whether
the user is policy_view_capable(). Setting the parameters at boot results
in an oops and failure to boot. Setting the parameters at boot is
obviously allowed so skip the permission check when apparmor is not
initialized.

While we are at it move the more complicated check to last.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-07 08:58:36 +10:00
John Johansen
b9b144bcaf apparmor: fix invalid reference to index variable of iterator line 836
Once the loop on lines 836-853 is complete and exits normally, ent is a
pointer to the dummy list head value.  The derefernces accessible from eg
the goto fail on line 860 or the various goto fail_lock's afterwards thus
seem incorrect.

Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-07 08:58:36 +10:00
Nicolas Iooss
9814448da7 apparmor: use SHASH_DESC_ON_STACK
When building the kernel with clang, the compiler fails to build
security/apparmor/crypto.c with the following error:

    security/apparmor/crypto.c:36:8: error: fields must have a constant
    size: 'variable length array in structure' extension will never be
    supported
                    char ctx[crypto_shash_descsize(apparmor_tfm)];
                         ^

Since commit a0a77af141 ("crypto: LLVMLinux: Add macro to remove use
of VLAIS in crypto code"), include/crypto/hash.h defines
SHASH_DESC_ON_STACK to work around this issue. Use it in aa_calc_hash()
and aa_calc_profile_hash().

Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-07 08:58:35 +10:00
Valentin Rothberg
eea7a05f19 security/apparmor/lsm.c: set debug messages
Add the _APPARMOR substring to reference the intended Kconfig option.

Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-07 08:58:35 +10:00
kbuild test robot
b9c42ac76e apparmor: fix boolreturn.cocci warnings
security/apparmor/lib.c:132:9-10: WARNING: return of 0/1 in function 'aa_policy_init' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-07 08:58:35 +10:00
Tetsuo Handa
af96f0d639 Smack: Use GFP_KERNEL for smk_netlbl_mls().
Since all callers of smk_netlbl_mls() are GFP_KERNEL context
(smk_set_cipso() calls memdup_user_nul(), init_smk_fs() calls
__kernfs_new_node(), smk_import_entry() calls kzalloc(GFP_KERNEL)),
it is safe to use GFP_KERNEL from netlbl_catmap_setbit().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-04-04 15:41:15 -07:00
Tetsuo Handa
c3c8dc9f13 smack: fix double free in smack_parse_opts_str()
smack_parse_opts_str() calls kfree(opts->mnt_opts) when kcalloc() for
opts->mnt_opts_flags failed. But it should not have called it because
security_free_mnt_opts() will call kfree(opts->mnt_opts).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
fixes: 3bf2789cad ("smack: allow mount opts setting over filesystems with binary mount data")
Cc: Vivek Trivedi <t.vivek@samsung.com>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
2017-04-04 15:41:15 -07:00
Stephan Mueller
f1c316a3ab KEYS: add SP800-56A KDF support for DH
SP800-56A defines the use of DH with key derivation function based on a
counter. The input to the KDF is defined as (DH shared secret || other
information). The value for the "other information" is to be provided by
the caller.

The KDF is implemented using the hash support from the kernel crypto API.
The implementation uses the symmetric hash support as the input to the
hash operation is usually very small. The caller is allowed to specify
the hash name that he wants to use to derive the key material allowing
the use of all supported hashes provided with the kernel crypto API.

As the KDF implements the proper truncation of the DH shared secret to
the requested size, this patch fills the caller buffer up to its size.

The patch is tested with a new test added to the keyutils user space
code which uses a CAVS test vector testing the compliance with
SP800-56A.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-04-04 22:33:38 +01:00
Mat Martineau
6563c91fd6 KEYS: Add KEYCTL_RESTRICT_KEYRING
Keyrings recently gained restrict_link capabilities that allow
individual keys to be validated prior to linking.  This functionality
was only available using internal kernel APIs.

With the KEYCTL_RESTRICT_KEYRING command existing keyrings can be
configured to check the content of keys before they are linked, and
then allow or disallow linkage of that key to the keyring.

To restrict a keyring, call:

  keyctl(KEYCTL_RESTRICT_KEYRING, key_serial_t keyring, const char *type,
         const char *restriction)

where 'type' is the name of a registered key type and 'restriction' is a
string describing how key linkage is to be restricted. The restriction
option syntax is specific to each key type.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
2017-04-04 14:10:12 -07:00
Mat Martineau
4a420896f1 KEYS: Consistent ordering for __key_link_begin and restrict check
The keyring restrict callback was sometimes called before
__key_link_begin and sometimes after, which meant that the keyring
semaphores were not always held during the restrict callback.

If the semaphores are consistently acquired before checking link
restrictions, keyring contents cannot be changed after the restrict
check is complete but before the evaluated key is linked to the keyring.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
2017-04-04 14:10:11 -07:00
Mat Martineau
2b6aa412ff KEYS: Use structure to capture key restriction function and data
Replace struct key's restrict_link function pointer with a pointer to
the new struct key_restriction. The structure contains pointers to the
restriction function as well as relevant data for evaluating the
restriction.

The garbage collector checks restrict_link->keytype when key types are
unregistered. Restrictions involving a removed key type are converted
to use restrict_link_reject so that restrictions cannot be removed by
unregistering key types.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
2017-04-04 14:10:10 -07:00
Mat Martineau
aaf66c8838 KEYS: Split role of the keyring pointer for keyring restrict functions
The first argument to the restrict_link_func_t functions was a keyring
pointer. These functions are called by the key subsystem with this
argument set to the destination keyring, but restrict_link_by_signature
expects a pointer to the relevant trusted keyring.

Restrict functions may need something other than a single struct key
pointer to allow or reject key linkage, so the data used to make that
decision (such as the trust keyring) is moved to a new, fourth
argument. The first argument is now always the destination keyring.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
2017-04-03 10:24:56 -07:00
Mat Martineau
469ff8f7d4 KEYS: Use a typedef for restrict_link function pointers
This pointer type needs to be returned from a lookup function, and
without a typedef the syntax gets cumbersome.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
2017-04-03 10:24:55 -07:00
Elena Reshetova
ddb99e118e security, keys: convert key_user.usage from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-03 10:49:06 +10:00
Elena Reshetova
fff292914d security, keys: convert key.usage from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-03 10:49:05 +10:00
mchehab@s-opensource.com
0e056eb553 kernel-api.rst: fix a series of errors when parsing C files
./lib/string.c:134: WARNING: Inline emphasis start-string without end-string.
./mm/filemap.c:522: WARNING: Inline interpreted text or phrase reference start-string without end-string.
./mm/filemap.c:1283: ERROR: Unexpected indentation.
./mm/filemap.c:3003: WARNING: Inline interpreted text or phrase reference start-string without end-string.
./mm/vmalloc.c:1544: WARNING: Inline emphasis start-string without end-string.
./mm/page_alloc.c:4245: ERROR: Unexpected indentation.
./ipc/util.c:676: ERROR: Unexpected indentation.
./drivers/pci/irq.c:35: WARNING: Block quote ends without a blank line; unexpected unindent.
./security/security.c:109: ERROR: Unexpected indentation.
./security/security.c:110: WARNING: Definition list ends without a blank line; unexpected unindent.
./block/genhd.c:275: WARNING: Inline strong start-string without end-string.
./block/genhd.c:283: WARNING: Inline strong start-string without end-string.
./include/linux/clk.h:134: WARNING: Inline emphasis start-string without end-string.
./include/linux/clk.h:134: WARNING: Inline emphasis start-string without end-string.
./ipc/util.c:477: ERROR: Unknown target name: "s".

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-04-02 14:31:49 -06:00
Dan Carpenter
cae303df3f selinux: Fix an uninitialized variable bug
We removed this initialization as a cleanup but it is probably required.

The concern is that "nel" can be zero.  I'm not an expert on SELinux
code but I think it looks possible to write an SELinux policy which
triggers this bug.  GCC doesn't catch this, but my static checker does.

Fixes: 9c312e79d6 ("selinux: Delete an unnecessary variable initialisation in range_read()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-31 15:16:18 -04:00
Kees Cook
8291798dcf TOMOYO: Use designated initializers
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-30 17:37:45 +11:00
Matthias Kaehlcke
342e91578e selinux: Remove unnecessary check of array base in selinux_set_mapping()
'perms' will never be NULL since it isn't a plain pointer but an array
of u32 values.

This fixes the following warning when building with clang:

security/selinux/ss/services.c:158:16: error: address of array
'p_in->perms' will always evaluate to 'true'
[-Werror,-Wpointer-bool-conversion]
                while (p_in->perms && p_in->perms[k]) {

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 18:57:25 -04:00
Markus Elfring
710a0647ba selinuxfs: Use seq_puts() in sel_avc_stats_seq_show()
A string which did not contain data format specifications should be put
into a sequence. Thus use the corresponding function "seq_puts".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:52:30 -04:00
Markus Elfring
8ee4586ca5 selinux: Adjust two checks for null pointers
The script "checkpatch.pl" pointed information out like the following.

Comparison to NULL could be written !…

Thus fix affected source code places.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:46:10 -04:00
Markus Elfring
b380f78377 selinux: Use kmalloc_array() in sidtab_init()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kmalloc_array".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:44:10 -04:00
Markus Elfring
ebd2b47ba5 selinux: Return directly after a failed kzalloc() in roles_init()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:39:17 -04:00
Markus Elfring
7befb7514e selinux: Return directly after a failed kzalloc() in perm_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:30:51 -04:00
Markus Elfring
442ca4d656 selinux: Return directly after a failed kzalloc() in common_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:29:11 -04:00
Markus Elfring
df4a14dfb4 selinux: Return directly after a failed kzalloc() in class_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:24:58 -04:00
Markus Elfring
ea6e2f7d12 selinux: Return directly after a failed kzalloc() in role_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:22:12 -04:00
Markus Elfring
549fe69ee5 selinux: Return directly after a failed kzalloc() in type_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:20:07 -04:00
Markus Elfring
4bd9f07b89 selinux: Return directly after a failed kzalloc() in user_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 11:15:17 -04:00
Markus Elfring
b592119100 selinux: Improve another size determination in sens_read()
Replace the specification of a data type by a pointer dereference
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 10:51:09 -04:00
Markus Elfring
3c354d7d7b selinux: Return directly after a failed kzalloc() in sens_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 09:56:48 -04:00
Markus Elfring
7f6d0ad8b7 selinux: Return directly after a failed kzalloc() in cat_read()
Return directly after a call of the function "kzalloc" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-29 09:54:48 -04:00
David Ahern
983701eb06 rtnetlink: Add RTM_DELNETCONF
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-28 22:32:42 -07:00
Al Viro
db68ce10c4 new helper: uaccess_kernel()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 16:43:25 -04:00
Tetsuo Handa
e4e55b47ed LSM: Revive security_task_alloc() hook and per "struct task_struct" security blob.
We switched from "struct task_struct"->security to "struct cred"->security
in Linux 2.6.29. But not all LSM modules were happy with that change.
TOMOYO LSM module is an example which want to use per "struct task_struct"
security blob, for TOMOYO's security context is defined based on "struct
task_struct" rather than "struct cred". AppArmor LSM module is another
example which want to use it, for AppArmor is currently abusing the cred
a little bit to store the change_hat and setexeccon info. Although
security_task_free() hook was revived in Linux 3.4 because Yama LSM module
wanted to release per "struct task_struct" security blob,
security_task_alloc() hook and "struct task_struct"->security field were
not revived. Nowadays, we are getting proposals of lightweight LSM modules
which want to use per "struct task_struct" security blob.

We are already allowing multiple concurrent LSM modules (up to one fully
armored module which uses "struct cred"->security field or exclusive hooks
like security_xfrm_state_pol_flow_match(), plus unlimited number of
lightweight modules which do not use "struct cred"->security nor exclusive
hooks) as long as they are built into the kernel. But this patch does not
implement variable length "struct task_struct"->security field which will
become needed when multiple LSM modules want to use "struct task_struct"->
security field. Although it won't be difficult to implement variable length
"struct task_struct"->security field, let's think about it after we merged
this patch.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Tested-by: Djalal Harouni <tixxdz@gmail.com>
Acked-by: José Bollo <jobol@nonadev.net>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: James Morris <james.l.morris@oracle.com>
Cc: José Bollo <jobol@nonadev.net>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-28 11:05:14 +11:00
Tetsuo Handa
3dfc9b0286 LSM: Initialize security_hook_heads upon registration.
"struct security_hook_heads" is an array of "struct list_head"
where elements can be initialized just before registration.

There is no need to waste 350+ lines for initialization. Let's
initialize "struct security_hook_heads" just before registration.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-24 14:24:41 +11:00
Markus Elfring
9c312e79d6 selinux: Delete an unnecessary variable initialisation in range_read()
The local variable "rt" will be set to an appropriate pointer a bit later.
Thus omit the explicit initialisation at the beginning which became
unnecessary with a previous update step.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 18:16:55 -04:00
Markus Elfring
57152a5be0 selinux: Return directly after a failed next_entry() in range_read()
Return directly after a call of the function "next_entry" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 18:11:33 -04:00
Markus Elfring
02fcef27cc selinux: Delete an unnecessary variable assignment in filename_trans_read()
The local variable "ft" was set to a null pointer despite of an
immediate reassignment.
Thus remove this statement from the beginning of a loop.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 18:08:05 -04:00
Markus Elfring
315e01ada8 selinux: One function call less in genfs_read() after null pointer detection
Call the function "kfree" at the end only after it was determined
that the local variable "newgenfs" contained a non-null pointer.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 17:53:29 -04:00
Markus Elfring
3a0aa56518 selinux: Return directly after a failed next_entry() in genfs_read()
Return directly after a call of the function "next_entry" failed
at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 17:45:29 -04:00
Markus Elfring
b4e4686f65 selinux: Delete an unnecessary return statement in policydb_destroy()
The script "checkpatch.pl" pointed information out like the following.

WARNING: void function return statements are not generally useful

Thus remove such a statement in the affected function.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 17:21:26 -04:00
Markus Elfring
ad10a10567 selinux: Use kcalloc() in policydb_index()
Multiplications for the size determination of memory allocations
indicated that array data structures should be processed.
Thus use the corresponding function "kcalloc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 17:14:16 -04:00
Markus Elfring
cb8d21e364 selinux: Adjust four checks for null pointers
The script "checkpatch.pl" pointed information out like the following.

Comparison to NULL could be written !…

Thus fix affected source code places.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 16:36:38 -04:00
Markus Elfring
2f00e680fe selinux: Use kmalloc_array() in hashtab_create()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kmalloc_array".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 16:31:20 -04:00
Markus Elfring
fb13a312da selinux: Improve size determinations in four functions
Replace the specification of data structures by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 16:29:02 -04:00
Markus Elfring
e34cfef901 selinux: Delete an unnecessary return statement in cond_compute_av()
The script "checkpatch.pl" pointed information out like the following.

WARNING: void function return statements are not generally useful

Thus remove such a statement in the affected function.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 16:25:42 -04:00
Markus Elfring
f6076f704a selinux: Use kmalloc_array() in cond_init_bool_indexes()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kmalloc_array".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-23 16:21:41 -04:00
Mikhail Kurinnoi
3dd0c8d065 ima: provide ">" and "<" operators for fowner/uid/euid rules.
For now we have only "=" operator for fowner/uid/euid rules. This
patch provide two more operators - ">" and "<" in order to make
fowner/uid/euid rules more flexible.

Examples of usage.

 Appraise all files owned by special and system users (SYS_UID_MAX 999):
    appraise fowner<1000
 Don't appraise files owned by normal users (UID_MIN 1000):
    dont_appraise fowner>999
 Appraise all files owned by users with UID 1000-1010:
    dont_appraise fowner>1010
    appraise fowner>999

Changelog v3:
- Removed code duplication in ima_parse_rule().
- Fix ima_policy_show() - (Mimi)

Changelog v2:
- Fixed default policy rules.

Signed-off-by: Mikhail Kurinnoi <viewizard@viewizard.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>

 security/integrity/ima/ima_policy.c | 115 +++++++++++++++++++++++++++---------
 1 file changed, 87 insertions(+), 28 deletions(-)
2017-03-13 07:01:24 -04:00
Alexander Potapenko
e2f586bd83 selinux: check for address length in selinux_socket_bind()
KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in selinux_socket_bind():

==================================================================
BUG: KMSAN: use of unitialized memory
inter: 0
CPU: 3 PID: 1074 Comm: packet2 Tainted: G    B           4.8.0-rc6+ #1916
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 0000000000000000 ffff8800882ffb08 ffffffff825759c8 ffff8800882ffa48
 ffffffff818bf551 ffffffff85bab870 0000000000000092 ffffffff85bab550
 0000000000000000 0000000000000092 00000000bb0009bb 0000000000000002
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff825759c8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
 [<ffffffff818bdee6>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1008
 [<ffffffff818bf0fb>] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424
 [<ffffffff822dae71>] selinux_socket_bind+0xf41/0x1080 security/selinux/hooks.c:4288
 [<ffffffff8229357c>] security_socket_bind+0x1ec/0x240 security/security.c:1240
 [<ffffffff84265d98>] SYSC_bind+0x358/0x5f0 net/socket.c:1366
 [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
 [<ffffffff8518217c>] entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.o:?
chained origin: 00000000ba6009bb
 [<ffffffff810bb7a7>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67
 [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
 [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:337
 [<ffffffff818bd2b8>] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:530
 [<ffffffff818bf033>] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380
 [<ffffffff84265b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
 [<ffffffff84265a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff81005678>] do_syscall_64+0x58/0x70 arch/x86/entry/common.c:292
 [<ffffffff8518217c>] return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000b8c00900)
==================================================================

(the line numbers are relative to 4.8-rc6, but the bug persists upstream)

, when I run the following program as root:

=======================================================
  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>

  int main(int argc, char *argv[]) {
    struct sockaddr addr;
    int size = 0;
    if (argc > 1) {
      size = atoi(argv[1]);
    }
    memset(&addr, 0, sizeof(addr));
    int fd = socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP);
    bind(fd, &addr, size);
    return 0;
  }
=======================================================

(for different values of |size| other error reports are printed).

This happens because bind() unconditionally copies |size| bytes of
|addr| to the kernel, leaving the rest uninitialized. Then
security_socket_bind() reads the IP address bytes, including the
uninitialized ones, to determine the port, or e.g. pass them further to
sel_netnode_find(), which uses them to calculate a hash.

Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
[PM: fixed some whitespace damage]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-10 15:22:16 -05:00
Daniel Glöckner
1ac202e978 ima: accept previously set IMA_NEW_FILE
Modifying the attributes of a file makes ima_inode_post_setattr reset
the IMA cache flags. So if the file, which has just been created,
is opened a second time before the first file descriptor is closed,
verification fails since the security.ima xattr has not been written
yet. We therefore have to look at the IMA_NEW_FILE even if the file
already existed.

With this patch there should no longer be an error when cat tries to
open testfile:

$ rm -f testfile
$ ( echo test >&3 ; touch testfile ; cat testfile ) 3>testfile

A file being new is no reason to accept that it is missing a digital
signature demanded by the policy.

Signed-off-by: Daniel Glöckner <dg@emlix.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-03-07 07:06:10 -05:00
James Morris
bad4417b69 integrity: mark default IMA rules as __ro_after_init
The default IMA rules are loaded during init and then do not
change, so mark them as __ro_after_init.

Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-03-06 19:08:57 -05:00
James Morris
579fc0dc09 selinux: constify nlmsg permission tables
Constify nlmsg permission tables, which are initialized once
and then do not change.

Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-03-06 11:58:08 -05:00
James Morris
ca97d939db security: mark LSM hooks as __ro_after_init
Mark all of the registration hooks as __ro_after_init (via the
__lsm_ro_after_init macro).

Signed-off-by: James Morris <james.l.morris@oracle.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>
2017-03-06 11:00:15 +11:00
James Morris
dd0859dccb security: introduce CONFIG_SECURITY_WRITABLE_HOOKS
Subsequent patches will add RO hardening to LSM hooks, however, SELinux
still needs to be able to perform runtime disablement after init to handle
architectures where init-time disablement via boot parameters is not feasible.

Introduce a new kernel configuration parameter CONFIG_SECURITY_WRITABLE_HOOKS,
and a helper macro __lsm_ro_after_init, to handle this case.

Signed-off-by: James Morris <james.l.morris@oracle.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Kees Cook <keescook@chromium.org>
2017-03-06 11:00:12 +11:00
Stephen Smalley
84e6885e9e selinux: fix kernel BUG on prlimit(..., NULL, NULL)
commit 79bcf325e6b32b3c ("prlimit,security,selinux: add a security hook
for prlimit") introduced a security hook for prlimit() and implemented it
for SELinux.  However, if prlimit() is called with NULL arguments for both
the new limit and the old limit, then the hook is called with 0 for the
read/write flags, since the prlimit() will neither read nor write the
process' limits.  This would in turn lead to calling avc_has_perm() with 0
for the requested permissions, which triggers a BUG_ON() in
avc_has_perm_noaudit() since the kernel should never be invoking
avc_has_perm() with no permissions.  Fix this in the SELinux hook by
returning immediately if the flags are 0.  Arguably prlimit64() itself
ought to return immediately if both old_rlim and new_rlim are NULL since
it is effectively a no-op in that case.

Reported by the lkp-robot based on trinity testing.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-06 10:44:16 +11:00
Stephen Smalley
791ec491c3 prlimit,security,selinux: add a security hook for prlimit
When SELinux was first added to the kernel, a process could only get
and set its own resource limits via getrlimit(2) and setrlimit(2), so no
MAC checks were required for those operations, and thus no security hooks
were defined for them. Later, SELinux introduced a hook for setlimit(2)
with a check if the hard limit was being changed in order to be able to
rely on the hard limit value as a safe reset point upon context
transitions.

Later on, when prlimit(2) was added to the kernel with the ability to get
or set resource limits (hard or soft) of another process, LSM/SELinux was
not updated other than to pass the target process to the setrlimit hook.
This resulted in incomplete control over both getting and setting the
resource limits of another process.

Add a new security_task_prlimit() hook to the check_prlimit_permission()
function to provide complete mediation.  The hook is only called when
acting on another task, and only if the existing DAC/capability checks
would allow access.  Pass flags down to the hook to indicate whether the
prlimit(2) call will read, write, or both read and write the resource
limits of the target process.

The existing security_task_setrlimit() hook is left alone; it continues
to serve a purpose in supporting the ability to make decisions based on
the old and/or new resource limit values when setting limits.  This
is consistent with the DAC/capability logic, where
check_prlimit_permission() performs generic DAC/capability checks for
acting on another task, while do_prlimit() performs a capability check
based on a comparison of the old and new resource limits.  Fix the
inline documentation for the hook to match the code.

Implement the new hook for SELinux.  For setting resource limits, we
reuse the existing setrlimit permission.  Note that this does overload
the setrlimit permission to mean the ability to set the resource limit
(soft or hard) of another process or the ability to change one's own
hard limit.  For getting resource limits, a new getrlimit permission
is defined.  This was not originally defined since getrlimit(2) could
only be used to obtain a process' own limits.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-06 10:43:47 +11:00
Linus Torvalds
1827adb11a Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched.h split-up from Ingo Molnar:
 "The point of these changes is to significantly reduce the
  <linux/sched.h> header footprint, to speed up the kernel build and to
  have a cleaner header structure.

  After these changes the new <linux/sched.h>'s typical preprocessed
  size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
  lines), which is around 40% faster to build on typical configs.

  Not much changed from the last version (-v2) posted three weeks ago: I
  eliminated quirks, backmerged fixes plus I rebased it to an upstream
  SHA1 from yesterday that includes most changes queued up in -next plus
  all sched.h changes that were pending from Andrew.

  I've re-tested the series both on x86 and on cross-arch defconfigs,
  and did a bisectability test at a number of random points.

  I tried to test as many build configurations as possible, but some
  build breakage is probably still left - but it should be mostly
  limited to architectures that have no cross-compiler binaries
  available on kernel.org, and non-default configurations"

* 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
  sched/headers: Clean up <linux/sched.h>
  sched/headers: Remove #ifdefs from <linux/sched.h>
  sched/headers: Remove the <linux/topology.h> include from <linux/sched.h>
  sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h>
  sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h>
  sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h>
  sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/init.h>
  sched/core: Remove unused prefetch_stack()
  sched/headers: Remove <linux/rculist.h> from <linux/sched.h>
  sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h>
  sched/headers: Remove <linux/signal.h> from <linux/sched.h>
  sched/headers: Remove <linux/rwsem.h> from <linux/sched.h>
  sched/headers: Remove the runqueue_is_locked() prototype
  sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h>
  sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h>
  sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h>
  ...
2017-03-03 10:16:38 -08:00
Ingo Molnar
50d34394ce sched/headers: Prepare to remove the <linux/magic.h> include from <linux/sched/task_stack.h>
Update files that depend on the magic.h inclusion.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:40 +01:00
Ingo Molnar
b2d0910310 sched/headers: Prepare to use <linux/rcuupdate.h> instead of <linux/rculist.h> in <linux/sched.h>
We don't actually need the full rculist.h header in sched.h anymore,
we will be able to include the smaller rcupdate.h header instead.

But first update code that relied on the implicit header inclusion.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:38 +01:00
Ingo Molnar
299300258d sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h>
We are going to split <linux/sched/task.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:35 +01:00
Ingo Molnar
5b825c3af1 sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h>
Add #include <linux/cred.h> dependencies to all .c files rely on sched.h
doing that for them.

Note that even if the count where we need to add extra headers seems high,
it's still a net win, because <linux/sched.h> is included in over
2,200 files ...

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:31 +01:00
Ingo Molnar
8703e8a465 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h>
We are going to split <linux/sched/user.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/user.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Ingo Molnar
3f07c01441 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Stephen Smalley
2651225b5e selinux: wrap cgroup seclabel support with its own policy capability
commit 1ea0ce4069 ("selinux: allow
changing labels for cgroupfs") broke the Android init program,
which looks up security contexts whenever creating directories
and attempts to assign them via setfscreatecon().
When creating subdirectories in cgroup mounts, this would previously
be ignored since cgroup did not support userspace setting of security
contexts.  However, after the commit, SELinux would attempt to honor
the requested context on cgroup directories and fail due to permission
denial.  Avoid breaking existing userspace/policy by wrapping this change
with a conditional on a new cgroup_seclabel policy capability.  This
preserves existing behavior until/unless a new policy explicitly enables
this capability.

Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-02 10:27:40 +11:00
David Howells
0837e49ab3 KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()
rcu_dereference_key() and user_key_payload() are currently being used in
two different, incompatible ways:

 (1) As a wrapper to rcu_dereference() - when only the RCU read lock used
     to protect the key.

 (2) As a wrapper to rcu_dereference_protected() - when the key semaphor is
     used to protect the key and the may be being modified.

Fix this by splitting both of the key wrappers to produce:

 (1) RCU accessors for keys when caller has the key semaphore locked:

	dereference_key_locked()
	user_key_payload_locked()

 (2) RCU accessors for keys when caller holds the RCU read lock:

	dereference_key_rcu()
	user_key_payload_rcu()

This should fix following warning in the NFS idmapper

  ===============================
  [ INFO: suspicious RCU usage. ]
  4.10.0 #1 Tainted: G        W
  -------------------------------
  ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage!
  other info that might help us debug this:
  rcu_scheduler_active = 2, debug_locks = 0
  1 lock held by mount.nfs/5987:
    #0:  (rcu_read_lock){......}, at: [<d000000002527abc>] nfs_idmap_get_key+0x15c/0x420 [nfsv4]
  stack backtrace:
  CPU: 1 PID: 5987 Comm: mount.nfs Tainted: G        W       4.10.0 #1
  Call Trace:
    dump_stack+0xe8/0x154 (unreliable)
    lockdep_rcu_suspicious+0x140/0x190
    nfs_idmap_get_key+0x380/0x420 [nfsv4]
    nfs_map_name_to_uid+0x2a0/0x3b0 [nfsv4]
    decode_getfattr_attrs+0xfac/0x16b0 [nfsv4]
    decode_getfattr_generic.constprop.106+0xbc/0x150 [nfsv4]
    nfs4_xdr_dec_lookup_root+0xac/0xb0 [nfsv4]
    rpcauth_unwrap_resp+0xe8/0x140 [sunrpc]
    call_decode+0x29c/0x910 [sunrpc]
    __rpc_execute+0x140/0x8f0 [sunrpc]
    rpc_run_task+0x170/0x200 [sunrpc]
    nfs4_call_sync_sequence+0x68/0xa0 [nfsv4]
    _nfs4_lookup_root.isra.44+0xd0/0xf0 [nfsv4]
    nfs4_lookup_root+0xe0/0x350 [nfsv4]
    nfs4_lookup_root_sec+0x70/0xa0 [nfsv4]
    nfs4_find_root_sec+0xc4/0x100 [nfsv4]
    nfs4_proc_get_rootfh+0x5c/0xf0 [nfsv4]
    nfs4_get_rootfh+0x6c/0x190 [nfsv4]
    nfs4_server_common_setup+0xc4/0x260 [nfsv4]
    nfs4_create_server+0x278/0x3c0 [nfsv4]
    nfs4_remote_mount+0x50/0xb0 [nfsv4]
    mount_fs+0x74/0x210
    vfs_kern_mount+0x78/0x220
    nfs_do_root_mount+0xb0/0x140 [nfsv4]
    nfs4_try_mount+0x60/0x100 [nfsv4]
    nfs_fs_mount+0x5ec/0xda0 [nfs]
    mount_fs+0x74/0x210
    vfs_kern_mount+0x78/0x220
    do_mount+0x254/0xf70
    SyS_mount+0x94/0x100
    system_call+0x38/0xe0

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-02 10:09:00 +11:00
Alexey Dobriyan
5b5e0928f7 lib/vsprintf.c: remove %Z support
Now that %z is standartised in C99 there is no reason to support %Z.
Unlike %L it doesn't even make format strings smaller.

Use BUILD_BUG_ON in a couple ATM drivers.

In case anyone didn't notice lib/vsprintf.o is about half of SLUB which
is in my opinion is quite an achievement.  Hopefully this patch inspires
someone else to trim vsprintf.c more.

Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:47 -08:00
Dave Jiang
11bac80004 mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.

Remove the vma parameter to simplify things.

[arnd@arndb.de: fix ARM build]
  Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:54 -08:00
Linus Torvalds
f1ef09fde1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "There is a lot here. A lot of these changes result in subtle user
  visible differences in kernel behavior. I don't expect anything will
  care but I will revert/fix things immediately if any regressions show
  up.

  From Seth Forshee there is a continuation of the work to make the vfs
  ready for unpriviled mounts. We had thought the previous changes
  prevented the creation of files outside of s_user_ns of a filesystem,
  but it turns we missed the O_CREAT path. Ooops.

  Pavel Tikhomirov and Oleg Nesterov worked together to fix a long
  standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only
  children that are forked after the prctl are considered and not
  children forked before the prctl. The only known user of this prctl
  systemd forks all children after the prctl. So no userspace
  regressions will occur. Holding earlier forked children to the same
  rules as later forked children creates a semantic that is sane enough
  to allow checkpoing of processes that use this feature.

  There is a long delayed change by Nikolay Borisov to limit inotify
  instances inside a user namespace.

  Michael Kerrisk extends the API for files used to maniuplate
  namespaces with two new trivial ioctls to allow discovery of the
  hierachy and properties of namespaces.

  Konstantin Khlebnikov with the help of Al Viro adds code that when a
  network namespace exits purges it's sysctl entries from the dcache. As
  in some circumstances this could use a lot of memory.

  Vivek Goyal fixed a bug with stacked filesystems where the permissions
  on the wrong inode were being checked.

  I continue previous work on ptracing across exec. Allowing a file to
  be setuid across exec while being ptraced if the tracer has enough
  credentials in the user namespace, and if the process has CAP_SETUID
  in it's own namespace. Proc files for setuid or otherwise undumpable
  executables are now owned by the root in the user namespace of their
  mm. Allowing debugging of setuid applications in containers to work
  better.

  A bug I introduced with permission checking and automount is now
  fixed. The big change is to mark the mounts that the kernel initiates
  as a result of an automount. This allows the permission checks in sget
  to be safely suppressed for this kind of mount. As the permission
  check happened when the original filesystem was mounted.

  Finally a special case in the mount namespace is removed preventing
  unbounded chains in the mount hash table, and making the semantics
  simpler which benefits CRIU.

  The vfs fix along with related work in ima and evm I believe makes us
  ready to finish developing and merge fully unprivileged mounts of the
  fuse filesystem. The cleanups of the mount namespace makes discussing
  how to fix the worst case complexity of umount. The stacked filesystem
  fixes pave the way for adding multiple mappings for the filesystem
  uids so that efficient and safer containers can be implemented"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc/sysctl: Don't grab i_lock under sysctl_lock.
  vfs: Use upper filesystem inode in bprm_fill_uid()
  proc/sysctl: prune stale dentries during unregistering
  mnt: Tuck mounts under others instead of creating shadow/side mounts.
  prctl: propagate has_child_subreaper flag to every descendant
  introduce the walk_process_tree() helper
  nsfs: Add an ioctl() to return owner UID of a userns
  fs: Better permission checking for submounts
  exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
  vfs: open() with O_CREAT should not create inodes with unknown ids
  nsfs: Add an ioctl() to return the namespace type
  proc: Better ownership of files for non-dumpable tasks in user namespaces
  exec: Remove LSM_UNSAFE_PTRACE_CAP
  exec: Test the ptracer's saved cred to see if the tracee can gain caps
  exec: Don't reset euid and egid when the tracee has CAP_SETUID
  inotify: Convert to using per-namespace limits
2017-02-23 20:33:51 -08:00
Linus Torvalds
b2064617c7 driver core patches for 4.11-rc1
Here is the "small" driver core patches for 4.11-rc1.
 
 Not much here, some firmware documentation and self-test updates, a
 debugfs code formatting issue, and a new feature for call_usermodehelper
 to make it more robust on systems that want to lock it down in a more
 secure way.
 
 All of these have been linux-next for a while now with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWK2jKg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymCEACgozYuqZZ/TUGW0P3xVNi7fbfUWCEAn3nYExrc
 XgevqeYOSKp2We6X/2JX
 =aZ+5
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the "small" driver core patches for 4.11-rc1.

  Not much here, some firmware documentation and self-test updates, a
  debugfs code formatting issue, and a new feature for call_usermodehelper
  to make it more robust on systems that want to lock it down in a more
  secure way.

  All of these have been linux-next for a while now with no reported
  issues"

* tag 'driver-core-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kernfs: handle null pointers while printing node name and path
  Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()
  Make static usermode helper binaries constant
  kmod: make usermodehelper path a const string
  firmware: revamp firmware documentation
  selftests: firmware: send expected errors to /dev/null
  selftests: firmware: only modprobe if driver is missing
  platform: Print the resource range if device failed to claim
  kref: prefer atomic_inc_not_zero to atomic_add_unless
  debugfs: improve formatting of debugfs_real_fops()
2017-02-22 11:44:32 -08:00
Linus Torvalds
3051bf36c2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
      Varadhan.

   2) Simplify classifier state on sk_buff in order to shrink it a bit.
      From Willem de Bruijn.

   3) Introduce SIPHASH and it's usage for secure sequence numbers and
      syncookies. From Jason A. Donenfeld.

   4) Reduce CPU usage for ICMP replies we are going to limit or
      suppress, from Jesper Dangaard Brouer.

   5) Introduce Shared Memory Communications socket layer, from Ursula
      Braun.

   6) Add RACK loss detection and allow it to actually trigger fast
      recovery instead of just assisting after other algorithms have
      triggered it. From Yuchung Cheng.

   7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.

   8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.

   9) Export MPLS packet stats via netlink, from Robert Shearman.

  10) Significantly improve inet port bind conflict handling, especially
      when an application is restarted and changes it's setting of
      reuseport. From Josef Bacik.

  11) Implement TX batching in vhost_net, from Jason Wang.

  12) Extend the dummy device so that VF (virtual function) features,
      such as configuration, can be more easily tested. From Phil
      Sutter.

  13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
      Dumazet.

  14) Add new bpf MAP, implementing a longest prefix match trie. From
      Daniel Mack.

  15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.

  16) Add new aquantia driver, from David VomLehn.

  17) Add bpf tracepoints, from Daniel Borkmann.

  18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
      Florian Fainelli.

  19) Remove custom busy polling in many drivers, it is done in the core
      networking since 4.5 times. From Eric Dumazet.

  20) Support XDP adjust_head in virtio_net, from John Fastabend.

  21) Fix several major holes in neighbour entry confirmation, from
      Julian Anastasov.

  22) Add XDP support to bnxt_en driver, from Michael Chan.

  23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.

  24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.

  25) Support GRO in IPSEC protocols, from Steffen Klassert"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
  Revert "ath10k: Search SMBIOS for OEM board file extension"
  net: socket: fix recvmmsg not returning error from sock_error
  bnxt_en: use eth_hw_addr_random()
  bpf: fix unlocking of jited image when module ronx not set
  arch: add ARCH_HAS_SET_MEMORY config
  net: napi_watchdog() can use napi_schedule_irqoff()
  tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
  net/hsr: use eth_hw_addr_random()
  net: mvpp2: enable building on 64-bit platforms
  net: mvpp2: switch to build_skb() in the RX path
  net: mvpp2: simplify MVPP2_PRS_RI_* definitions
  net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
  net: mvpp2: remove unused register definitions
  net: mvpp2: simplify mvpp2_bm_bufs_add()
  net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
  net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
  net: mvpp2: release reference to txq_cpu[] entry after unmapping
  net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
  net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
  net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
  ...
2017-02-22 10:15:09 -08:00
Linus Torvalds
c9341ee0af Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "Highlights:

   - major AppArmor update: policy namespaces & lots of fixes

   - add /sys/kernel/security/lsm node for easy detection of loaded LSMs

   - SELinux cgroupfs labeling support

   - SELinux context mounts on tmpfs, ramfs, devpts within user
     namespaces

   - improved TPM 2.0 support"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (117 commits)
  tpm: declare tpm2_get_pcr_allocation() as static
  tpm: Fix expected number of response bytes of TPM1.2 PCR Extend
  tpm xen: drop unneeded chip variable
  tpm: fix misspelled "facilitate" in module parameter description
  tpm_tis: fix the error handling of init_tis()
  KEYS: Use memzero_explicit() for secret data
  KEYS: Fix an error code in request_master_key()
  sign-file: fix build error in sign-file.c with libressl
  selinux: allow changing labels for cgroupfs
  selinux: fix off-by-one in setprocattr
  tpm: silence an array overflow warning
  tpm: fix the type of owned field in cap_t
  tpm: add securityfs support for TPM 2.0 firmware event log
  tpm: enhance read_log_of() to support Physical TPM event log
  tpm: enhance TPM 2.0 PCR extend to support multiple banks
  tpm: implement TPM 2.0 capability to get active PCR banks
  tpm: fix RC value check in tpm2_seal_trusted
  tpm_tis: fix iTPM probe via probe_itpm() function
  tpm: Begin the process to deprecate user_read_timer
  tpm: remove tpm_read_index and tpm_write_index from tpm.h
  ...
2017-02-21 12:49:56 -08:00
Linus Torvalds
42e1b14b6e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Implement wraparound-safe refcount_t and kref_t types based on
     generic atomic primitives (Peter Zijlstra)

   - Improve and fix the ww_mutex code (Nicolai Hähnle)

   - Add self-tests to the ww_mutex code (Chris Wilson)

   - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr
     Bueso)

   - Micro-optimize the current-task logic all around the core kernel
     (Davidlohr Bueso)

   - Tidy up after recent optimizations: remove stale code and APIs,
     clean up the code (Waiman Long)

   - ... plus misc fixes, updates and cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  fork: Fix task_struct alignment
  locking/spinlock/debug: Remove spinlock lockup detection code
  lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS
  lkdtm: Convert to refcount_t testing
  kref: Implement 'struct kref' using refcount_t
  refcount_t: Introduce a special purpose refcount type
  sched/wake_q: Clarify queue reinit comment
  sched/wait, rcuwait: Fix typo in comment
  locking/mutex: Fix lockdep_assert_held() fail
  locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()
  locking/rwsem: Reinit wake_q after use
  locking/rwsem: Remove unnecessary atomic_long_t casts
  jump_labels: Move header guard #endif down where it belongs
  locking/atomic, kref: Implement kref_put_lock()
  locking/ww_mutex: Turn off __must_check for now
  locking/atomic, kref: Avoid more abuse
  locking/atomic, kref: Use kref_get_unless_zero() more
  locking/atomic, kref: Kill kref_sub()
  locking/atomic, kref: Add kref_read()
  locking/atomic, kref: Add KREF_INIT()
  ...
2017-02-20 13:23:30 -08:00
David S. Miller
35eeacf182 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-02-11 02:31:11 -05:00
Dan Carpenter
5217660379 KEYS: Use memzero_explicit() for secret data
I don't think GCC has figured out how to optimize the memset() away, but
they might eventually so let's future proof this code a bit.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-02-10 12:43:51 +11:00
Dan Carpenter
57cb17e764 KEYS: Fix an error code in request_master_key()
This function has two callers and neither are able to handle a NULL
return.  Really, -EINVAL is the correct thing return here anyway.  This
fixes some static checker warnings like:

	security/keys/encrypted-keys/encrypted.c:709 encrypted_key_decrypt()
	error: uninitialized symbol 'master_key'.

Fixes: 7e70cb4978 ("keys: add new key-type encrypted")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-02-10 12:43:49 +11:00
James Morris
a2a15479d6 Merge branch 'stable-4.11' of git://git.infradead.org/users/pcmoore/selinux into next 2017-02-10 10:28:49 +11:00
Stephen Smalley
0c461cb727 selinux: fix off-by-one in setprocattr
SELinux tries to support setting/clearing of /proc/pid/attr attributes
from the shell by ignoring terminating newlines and treating an
attribute value that begins with a NUL or newline as an attempt to
clear the attribute.  However, the test for clearing attributes has
always been wrong; it has an off-by-one error, and this could further
lead to reading past the end of the allocated buffer since commit
bb646cdb12 ("proc_pid_attr_write():
switch to memdup_user()").  Fix the off-by-one error.

Even with this fix, setting and clearing /proc/pid/attr attributes
from the shell is not straightforward since the interface does not
support multiple write() calls (so shells that write the value and
newline separately will set and then immediately clear the attribute,
requiring use of echo -n to set the attribute), whereas trying to use
echo -n "" to clear the attribute causes the shell to skip the
write() call altogether since POSIX says that a zero-length write
causes no side effects. Thus, one must use echo -n to set and echo
without -n to clear, as in the following example:
$ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate
unconfined_u:object_r:user_home_t:s0
$ echo "" > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate

Note the use of /proc/$$ rather than /proc/self, as otherwise
the cat command will read its own attribute value, not that of the shell.

There are no users of this facility to my knowledge; possibly we
should just get rid of it.

UPDATE: Upon further investigation it appears that a local process
with the process:setfscreate permission can cause a kernel panic as a
result of this bug.  This patch fixes CVE-2017-2618.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the update about CVE-2017-2618 to the commit description]
Cc: stable@vger.kernel.org # 3.5: d6ea83ec68
Signed-off-by: Paul Moore <paul@paul-moore.com>

Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-02-08 19:09:53 +11:00
James Morris
e2241be62d Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/selinux into next 2017-02-08 19:01:07 +11:00
Antonio Murdaca
1ea0ce4069 selinux: allow changing labels for cgroupfs
This patch allows changing labels for cgroup mounts. Previously, running
chcon on cgroupfs would throw an "Operation not supported". This patch
specifically whitelist cgroupfs.

The patch could also allow containers to write only to the systemd cgroup
for instance, while the other cgroups are kept with cgroup_t label.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-02-07 22:17:47 -05:00
Stephen Smalley
a050a570db selinux: fix off-by-one in setprocattr
SELinux tries to support setting/clearing of /proc/pid/attr attributes
from the shell by ignoring terminating newlines and treating an
attribute value that begins with a NUL or newline as an attempt to
clear the attribute.  However, the test for clearing attributes has
always been wrong; it has an off-by-one error, and this could further
lead to reading past the end of the allocated buffer since commit
bb646cdb12 ("proc_pid_attr_write():
switch to memdup_user()").  Fix the off-by-one error.

Even with this fix, setting and clearing /proc/pid/attr attributes
from the shell is not straightforward since the interface does not
support multiple write() calls (so shells that write the value and
newline separately will set and then immediately clear the attribute,
requiring use of echo -n to set the attribute), whereas trying to use
echo -n "" to clear the attribute causes the shell to skip the
write() call altogether since POSIX says that a zero-length write
causes no side effects. Thus, one must use echo -n to set and echo
without -n to clear, as in the following example:
$ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate
unconfined_u:object_r:user_home_t:s0
$ echo "" > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate

Note the use of /proc/$$ rather than /proc/self, as otherwise
the cat command will read its own attribute value, not that of the shell.

There are no users of this facility to my knowledge; possibly we
should just get rid of it.

UPDATE: Upon further investigation it appears that a local process
with the process:setfscreate permission can cause a kernel panic as a
result of this bug.  This patch fixes CVE-2017-2618.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the update about CVE-2017-2618 to the commit description]
Cc: stable@vger.kernel.org # 3.5: d6ea83ec68
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-02-07 17:31:38 -05:00
Lans Zhang
20f482ab9e ima: allow to check MAY_APPEND
Otherwise some mask and inmask tokens with MAY_APPEND flag may not work
as expected.

Signed-off-by: Lans Zhang <jia.zhang@windriver.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-01-27 14:17:21 -05:00
Mimi Zohar
bc15ed663e ima: fix ima_d_path() possible race with rename
On failure to return a pathname from ima_d_path(), a pointer to
dname is returned, which is subsequently used in the IMA measurement
list, the IMA audit records, and other audit logging.  Saving the
pointer to dname for later use has the potential to race with rename.

Intead of returning a pointer to dname on failure, this patch returns
a pointer to a copy of the filename.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
2017-01-27 14:16:02 -05:00
James Morris
710584b9da Merge branch 'smack-for-4.11' of git://github.com/cschaufler/smack-next into next 2017-01-27 09:23:21 +11:00
Krister Johansen
4548b683b7 Introduce a sysctl that modifies the value of PROT_SOCK.
Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
that denotes the first unprivileged inet port in the namespace.  To
disable all privileged ports set this to zero.  It also checks for
overlap with the local port range.  The privileged and local range may
not overlap.

The use case for this change is to allow containerized processes to bind
to priviliged ports, but prevent them from ever being allowed to modify
their container's network configuration.  The latter is accomplished by
ensuring that the network namespace is not a child of the user
namespace.  This modification was needed to allow the container manager
to disable a namespace's priviliged port restrictions without exposing
control of the network namespace to processes in the user namespace.

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 12:10:51 -05:00
Eric W. Biederman
9227dd2a84 exec: Remove LSM_UNSAFE_PTRACE_CAP
With previous changes every location that tests for
LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the
LSM_UNSAFE_PTRACE_CAP redundant, so remove it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-01-24 12:03:08 +13:00
Eric W. Biederman
20523132ec exec: Test the ptracer's saved cred to see if the tracee can gain caps
Now that we have user namespaces and non-global capabilities verify
the tracer has capabilities in the relevant user namespace instead
of in the current_user_ns().

As the test for setting LSM_UNSAFE_PTRACE_CAP is currently
ptracer_capable(p, current_user_ns()) and the new task credentials are
in current_user_ns() this change does not have any user visible change
and simply moves the test to where it is used, making the code easier
to read.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-01-24 12:03:08 +13:00
Eric W. Biederman
70169420f5 exec: Don't reset euid and egid when the tracee has CAP_SETUID
Don't reset euid and egid when the tracee has CAP_SETUID in
it's user namespace.  I punted on relaxing this permission check
long ago but now that I have read this code closely it is clear
it is safe to test against CAP_SETUID in the user namespace.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-01-24 12:03:07 +13:00
Greg Kroah-Hartman
64e90a8acb Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()
Some usermode helper applications are defined at kernel build time, while
others can be changed at runtime.  To provide a sane way to filter these, add a
new kernel option "STATIC_USERMODEHELPER".  This option routes all
call_usermodehelper() calls through this binary, no matter what the caller
wishes to have called.

The new binary (by default set to /sbin/usermode-helper, but can be changed
through the STATIC_USERMODEHELPER_PATH option) can properly filter the
requested programs to be run by the kernel by looking at the first argument
that is passed to it.  All other options should then be passed onto the proper
program if so desired.

To disable all call_usermodehelper() calls by the kernel, set
STATIC_USERMODEHELPER_PATH to an empty string.

Thanks to Neil Brown for the idea of this feature.

Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19 12:59:45 +01:00
Greg Kroah-Hartman
377e7a27c0 Make static usermode helper binaries constant
There are a number of usermode helper binaries that are "hard coded" in
the kernel today, so mark them as "const" to make it harder for someone
to change where the variables point to.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Johan Hovold <johan@kernel.org>
Cc: Alex Elder <elder@kernel.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19 12:59:45 +01:00
Casey Schaufler
d69dece5f5 LSM: Add /sys/kernel/security/lsm
I am still tired of having to find indirect ways to determine
what security modules are active on a system. I have added
/sys/kernel/security/lsm, which contains a comma separated
list of the active security modules. No more groping around
in /proc/filesystems or other clever hacks.

Unchanged from previous versions except for being updated
to the latest security next branch.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-01-19 13:18:29 +11:00
John Johansen
3ccb76c5df apparmor: fix undefined reference to `aa_g_hash_policy'
The kernel build bot turned up a bad config combination when
CONFIG_SECURITY_APPARMOR is y and CONFIG_SECURITY_APPARMOR_HASH is n,
resulting in the build error
   security/built-in.o: In function `aa_unpack':
   (.text+0x841e2): undefined reference to `aa_g_hash_policy'

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 13:21:27 -08:00
John Johansen
e6bfa25deb apparmor: replace remaining BUG_ON() asserts with AA_BUG()
AA_BUG() uses WARN and won't break the kernel like BUG_ON().

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:56 -08:00
John Johansen
2c17cd3681 apparmor: fix restricted endian type warnings for policy unpack
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:55 -08:00
John Johansen
e6e8bf4188 apparmor: fix restricted endian type warnings for dfa unpack
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:54 -08:00
John Johansen
ca4bd5ae0a apparmor: add check for apparmor enabled in module parameters missing it
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:53 -08:00
John Johansen
d4669f0b03 apparmor: add per cpu work buffers to avoid allocating buffers at every hook
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:53 -08:00
Tyler Hicks
e3ea1ca59a apparmor: sysctl to enable unprivileged user ns AppArmor policy loading
If this sysctl is set to non-zero and a process with CAP_MAC_ADMIN in
the root namespace has created an AppArmor policy namespace,
unprivileged processes will be able to change to a profile in the
newly created AppArmor policy namespace and, if the profile allows
CAP_MAC_ADMIN and appropriate file permissions, will be able to load
policy in the respective policy namespace.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:52 -08:00
William Hua
e025be0f26 apparmor: support querying extended trusted helper extra data
Allow a profile to carry extra data that can be queried via userspace.
This provides a means to store extra data in a profile that a trusted
helper can extract and use from live policy.

Signed-off-by: William Hua <william.hua@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:51 -08:00
John Johansen
12eb87d50b apparmor: update cap audit to check SECURITY_CAP_NOAUDIT
apparmor should be checking the SECURITY_CAP_NOAUDIT constant. Also
in complain mode make it so apparmor can elect to log a message,
informing of the check.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:50 -08:00
John Johansen
31f75bfecd apparmor: make computing policy hashes conditional on kernel parameter
Allow turning off the computation of the policy hashes via the
apparmor.hash_policy kernel parameter.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:50 -08:00
John Johansen
aa9a39ad8f apparmor: convert change_profile to use fqname later to give better control
Moving the use of fqname to later allows learning profiles to be based
on the fqname request instead of just the hname. It also allows cleaning
up some of the name parsing and lookup by allowing the use of
the fqlookupn_profile() lib fn.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:49 -08:00
John Johansen
c3e1e584ad apparmor: fix change_hat debug output
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:48 -08:00
John Johansen
5ef50d014c apparmor: remove unused op parameter from simple_write_to_buffer()
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:48 -08:00
John Johansen
ef88a7ac55 apparmor: change aad apparmor_audit_data macro to a fn macro
The aad macro can replace aad strings when it is not intended to. Switch
to a fn macro so it is only applied when intended.

Also at the same time cleanup audit_data initialization by putting
common boiler plate behind a macro, and dropping the gfp_t parameter
which will become useless.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:47 -08:00
John Johansen
47f6e5cc73 apparmor: change op from int to const char *
Having ops be an integer that is an index into an op name table is
awkward and brittle. Every op change requires an edit for both the
op constant and a string in the table. Instead switch to using const
strings directly, eliminating the need for the table that needs to
be kept in sync.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:46 -08:00
John Johansen
55a26ebf63 apparmor: rename context abreviation cxt to the more standard ctx
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:45 -08:00
John Johansen
a20aa95fbe apparmor: fail task profile update if current_cred isn't real_cred
Trying to update the task cred while the task current cred is not the
real cred will result in an error at the cred layer. Avoid this by
failing early and delaying the update.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:45 -08:00
John Johansen
b7fd2c0340 apparmor: add per policy ns .load, .replace, .remove interface files
Having per policy ns interface files helps with containers restoring
policy.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:44 -08:00
John Johansen
12dd7171d6 apparmor: pass the subject profile into profile replace/remove
This is just setup for new ns specific .load, .replace, .remove interface
files.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:43 -08:00
John Johansen
04dc715e24 apparmor: audit policy ns specified in policy load
Verify that profiles in a load set specify the same policy ns and
audit the name of the policy ns that policy is being loaded for.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:43 -08:00
John Johansen
5ac8c355ae apparmor: allow introspecting the loaded policy pre internal transform
Store loaded policy and allow introspecting it through apparmorfs. This
has several uses from debugging, policy validation, and policy checkpoint
and restore for containers.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:42 -08:00
John Johansen
fc1c9fd10a apparmor: add ns name to the audit data for policy loads
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:41 -08:00
John Johansen
078c73c63f apparmor: add profile and ns params to aa_may_manage_policy()
Policy management will be expanded beyond traditional unconfined root.
This will require knowning the profile of the task doing the management
and the ns view.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:40 -08:00
John Johansen
fd2a80438d apparmor: add ns being viewed as a param to policy_admin_capable()
Prepare for a tighter pairing of user namespaces and apparmor policy
namespaces, by making the ns to be viewed available.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:40 -08:00
John Johansen
2bd8dbbf22 apparmor: add ns being viewed as a param to policy_view_capable()
Prepare for a tighter pairing of user namespaces and apparmor policy
namespaces, by making the ns to be viewed available and checking
that the user namespace level is the same as the policy ns level.

This strict pairing will be relaxed once true support of user namespaces
lands.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:39 -08:00
John Johansen
a6f233003b apparmor: allow specifying the profile doing the management
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:38 -08:00
John Johansen
3e3e569539 apparmor: allow introspecting the policy namespace name
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:37 -08:00
John Johansen
b79473f2de apparmor: Make aa_remove_profile() callable from a different view
This is prep work for fs operations being able to remove namespaces.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:37 -08:00
John Johansen
ee2351e4b0 apparmor: track ns level so it can be used to help in view checks
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:36 -08:00
John Johansen
a71ada3058 apparmor: add special .null file used to "close" fds at exec
Borrow the special null device file from selinux to "close" fds that
don't have sufficient permissions at exec time.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:35 -08:00
John Johansen
34c426acb7 apparmor: provide userspace flag indicating binfmt_elf_mmap change
Commit 9f834ec18d ("binfmt_elf: switch to new creds when switching to new mm")
changed when the creds are installed by the binfmt_elf handler. This
affects which creds are used to mmap the executable into the address
space. Which can have an affect on apparmor policy.

Add a flag to apparmor at
/sys/kernel/security/apparmor/features/domain/fix_binfmt_elf_mmap

to make it possible to detect this semantic change so that the userspace
tools and the regression test suite can correctly deal with the change.

BugLink: http://bugs.launchpad.net/bugs/1630069
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:35 -08:00
John Johansen
11c236b89d apparmor: add a default null dfa
Instead of testing whether a given dfa exists in every code path, have
a default null dfa that is used when loaded policy doesn't provide a
dfa.

This will let us get rid of special casing and avoid dereference bugs
when special casing is missed.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:34 -08:00
John Johansen
6604d4c1c1 apparmor: allow policydb to be used as the file dfa
Newer policy will combine the file and policydb dfas, allowing for
better optimizations. However to support older policy we need to
keep the ability to address the "file" dfa separately. So dup
the policydb as if it is the file dfa and set the appropriate start
state.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:33 -08:00
John Johansen
293a4886f9 apparmor: add get_dfa() fn
The dfa is currently setup to be shared (has the basis of refcounting)
but currently can't be because the count can't be increased.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:32 -08:00
John Johansen
474d6b7510 apparmor: prepare to support newer versions of policy
Newer policy encodes more than just version in the version tag,
so add masking to make sure the comparison remains correct.

Note: this is fully compatible with older policy as it will never set
the bits being masked out.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:32 -08:00
John Johansen
5ebfb12822 apparmor: add support for force complain flag to support learning mode
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:31 -08:00
John Johansen
abbf873403 apparmor: remove paranoid load switch
Policy should always under go a full paranoid verification.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:30 -08:00
John Johansen
181f7c9776 apparmor: name null-XXX profiles after the executable
When possible its better to name a learning profile after the missing
profile in question. This allows for both more informative names and
for profile reuse.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:30 -08:00
John Johansen
30b026a8d1 apparmor: pass gfp_t parameter into profile allocation
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:29 -08:00
John Johansen
73688d1ed0 apparmor: refactor prepare_ns() and make usable from different views
prepare_ns() will need to be called from alternate views, and namespaces
will need to be created via different interfaces. So refactor and
allow specifying the view ns.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:28 -08:00
John Johansen
5fd1b95fc9 apparmor: update policy_destroy to use new debug asserts
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:27 -08:00
John Johansen
d102d89571 apparmor: pass gfp param into aa_policy_init()
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:27 -08:00
John Johansen
bbe4a7c873 apparmor: constify policy name and hname
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:26 -08:00
John Johansen
6e474e3063 apparmor: rename hname_tail to basename
Rename to the shorter and more familiar shell cmd name

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:25 -08:00
John Johansen
efeee83a70 apparmor: rename mediated_filesystem() to path_mediated_fs()
Rename to indicate the test is only about whether path mediation is used,
not whether other types of mediation might be used.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:24 -08:00
John Johansen
680cd62e91 apparmor: add debug assert AA_BUG and Kconfig to control debug info
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:24 -08:00
John Johansen
57e36bbd67 apparmor: add macro for bug asserts to check that a lock is held
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:23 -08:00
John Johansen
92b6d8eff5 apparmor: allow ns visibility question to consider subnses
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:22 -08:00
John Johansen
31617ddfdd apparmor: add fn to lookup profiles by fqname
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:22 -08:00
John Johansen
3b0aaf5866 apparmor: add lib fn to find the "split" for fqnames
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:21 -08:00
John Johansen
9a2d40c12d apparmor: add strn version of aa_find_ns
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:20 -08:00
John Johansen
1741e9eb8c apparmor: add strn version of lookup_profile fn
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:19 -08:00
John Johansen
8399588a7f apparmor: rename replacedby to proxy
Proxy is shorter and a better fit than replaceby, so rename it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:19 -08:00
John Johansen
d97d51d253 apparmor: rename PFLAG_INVALID to PFLAG_STALE
Invalid does not convey the meaning of the flag anymore so rename it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:16:37 -08:00
John Johansen
121d4a91e3 apparmor: rename sid to secid
Move to common terminology with other LSMs and kernel infrastucture

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:17 -08:00
John Johansen
98849dff90 apparmor: rename namespace to ns to improve code line lengths
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:16 -08:00
John Johansen
cff281f686 apparmor: split apparmor policy namespaces code into its own file
Policy namespaces will be diverging from profile management and
expanding so put it in its own file.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:15 -08:00
John Johansen
fe6bb31f59 apparmor: split out shared policy_XXX fns to lib
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:14 -08:00
John Johansen
12557dcba2 apparmor: move lib definitions into separate lib include
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:13 -08:00
Kees Cook
8486adf0d7 apparmor: use designated initializers
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-15 20:00:32 -08:00
Tetsuo Handa
a7f6c1b63b AppArmor: Use GFP_KERNEL for __aa_kvmalloc().
Calling kmalloc(GFP_NOIO) with order == PAGE_ALLOC_COSTLY_ORDER is not
recommended because it might fall into infinite retry loop without
invoking the OOM killer.

Since aa_dfa_unpack() is the only caller of kvzalloc() and
aa_dfa_unpack() which is calling kvzalloc() via unpack_table() is
doing kzalloc(GFP_KERNEL), it is safe to use GFP_KERNEL from
__aa_kvmalloc().

Since aa_simple_write_to_buffer() is the only caller of kvmalloc()
and aa_simple_write_to_buffer() is calling copy_from_user() which
is GFP_KERNEL context (see memdup_user_nul()), it is safe to use
GFP_KERNEL from __aa_kvmalloc().

Therefore, replace GFP_NOIO with GFP_KERNEL. Also, since we have
vmalloc() fallback, add __GFP_NORETRY so that we don't invoke the OOM
killer by kmalloc(GFP_KERNEL) with order == PAGE_ALLOC_COSTLY_ORDER.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-15 13:41:09 -08:00
Peter Zijlstra
6b1ffa06e5 locking/atomic, kref: Use kref_get_unless_zero() more
For some obscure reason apparmor thinks its needs to locally implement
kref primitives that already exist. Stop doing this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 11:37:20 +01:00
Stephen Smalley
3a2f5a59a6 security,selinux,smack: kill security_task_wait hook
As reported by yangshukui, a permission denial from security_task_wait()
can lead to a soft lockup in zap_pid_ns_processes() since it only expects
sys_wait4() to return 0 or -ECHILD. Further, security_task_wait() can
in general lead to zombies; in the absence of some way to automatically
reparent a child process upon a denial, the hook is not useful.  Remove
the security hook and its implementations in SELinux and Smack.  Smack
already removed its check from its hook.

Reported-by: yangshukui <yangshukui@huawei.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-12 11:10:57 -05:00
Stephen Smalley
b4ba35c75a selinux: drop unused socket security classes
Several of the extended socket classes introduced by
commit da69a5306a ("selinux: support distinctions
among all network address families") are never used because
sockets can never be created with the associated address family.
Remove these unused socket security classes.  The removed classes
are bridge_socket for PF_BRIDGE, ib_socket for PF_IB, and mpls_socket
for PF_MPLS.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-12 11:10:24 -05:00
Seung-Woo Kim
83a1e53f39 Smack: ignore private inode for file functions
The access to fd from anon_inode is always failed because there is
no set xattr operations. So this patch fixes to ignore private
inode including anon_inode for file functions.

It was only ignored for smack_file_receive() to share dma-buf fd,
but dma-buf has other functions like ioctl and mmap.

Reference: https://lkml.org/lkml/2015/4/17/16

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Rafal Krypa
805b65a80b Smack: fix d_instantiate logic for sockfs and pipefs
Since 4b936885a (v2.6.32) all inodes on sockfs and pipefs are disconnected.
It caused filesystem specific code in smack_d_instantiate to be skipped,
because all inodes on those pseudo filesystems were treated as root inodes.
As a result all sockfs inodes had the Smack label set to floor.

In most cases access checks for sockets use socket_smack data so the inode
label is not important. But there are special cases that were broken.
One example would be calling fcntl with F_SETOWN command on a socket fd.

Now smack_d_instantiate expects all pipefs and sockfs inodes to be
disconnected and has the logic in appropriate place.

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Himanshu Shukla
c9d238a18b SMACK: Use smk_tskacc() instead of smk_access() for proper logging
smack_file_open() is first checking the capability of calling subject,
this check will skip the SMACK logging for success case. Use smk_tskacc()
for proper logging and SMACK access check.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
348dc288d4 Smack: Traverse the smack_known_list using list_for_each_entry_rcu macro
In smack_from_secattr function,"smack_known_list" is being traversed
using list_for_each_entry macro, although it is a rcu protected
structure. So it should be traversed using "list_for_each_entry_rcu"
macro to fetch the rcu protected entry.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Himanshu Shukla
3d4f673a69 SMACK: Free the i_security blob in inode using RCU
There is race condition issue while freeing the i_security blob in SMACK
module. There is existing condition where i_security can be freed while
inode_permission is called from path lookup on second CPU. There has been
observed the page fault with such condition. VFS code and Selinux module
takes care of this condition by freeing the inode and i_security field
using RCU via call_rcu(). But in SMACK directly the i_secuirty blob is
being freed. Use call_rcu() to fix this race condition issue.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Himanshu Shukla
d54a197964 SMACK: Delete list_head repeated initialization
smk_copy_rules() and smk_copy_relabel() are initializing list_head though
they have been initialized already in new_task_smack() function. Delete
repeated initialization.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
2e962e2fec SMACK: Add new lock for adding entry in smack master list
"smk_set_access()" function adds a new rule entry in subject label specific
list(rule_list) and in global rule list(smack_rule_list) both. Mutex lock
(rule_lock) is used to avoid simultaneous updates. But this lock is subject
label specific lock. If 2 processes tries to add different rules(i.e with
different subject labels) simultaneously, then both the processes can take
the "rule_lock" respectively. So it will cause a problem while adding
entries in master rule list.
Now a new mutex lock(smack_master_list_lock) has been taken to add entry in
smack_rule_list to avoid simultaneous updates of different rules.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
0c96d1f532 Smack: Fix the issue of wrong SMACK label update in socket bind fail case
Fix the issue of wrong SMACK label (SMACK64IPIN) update when a second bind
call is made to same IP address & port, but with different SMACK label
(SMACK64IPIN) by second instance of server. In this case server returns
with "Bind:Address already in use" error but before returning, SMACK label
is updated in SMACK port-label mapping list inside smack_socket_bind() hook

To fix this issue a new check has been added in smk_ipv6_port_label()
function before updating the existing port entry. It checks whether the
socket for matching port entry is closed or not. If it is closed then it
means port is not bound and it is safe to update the existing port entry
else return if port is still getting used. For checking whether socket is
closed or not, one more field "smk_can_reuse" has been added in the
"smk_port_label" structure. This field will be set to '1' in
"smack_sk_free_security()" function which is called to free the socket
security blob when the socket is being closed. In this function, port entry
is searched in the SMACK port-label mapping list for the closing socket.
If entry is found then "smk_can_reuse" field is set to '1'.Initially
"smk_can_reuse" field is set to '0' in smk_ipv6_port_label() function after
creating a new entry in the list which indicates that socket is in use.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
9d44c97384 Smack: Fix the issue of permission denied error in ipv6 hook
Permission denied error comes when 2 IPv6 servers are running and client
tries to connect one of them. Scenario is that both servers are using same
IP and port but different protocols(Udp and tcp). They are using different
SMACK64IPIN labels.Tcp server is using "test" and udp server is using
"test-in". When we try to run tcp client with SMACK64IPOUT label as "test",
then connection denied error comes. It should not happen since both tcp
server and client labels are same.This happens because there is no check
for protocol in smk_ipv6_port_label() function while searching for the
earlier port entry. It checks whether there is an existing port entry on
the basis of port only. So it updates the earlier port entry in the list.
Due to which smack label gets changed for earlier entry in the
"smk_ipv6_port_list" list and permission denied error comes.

Now a check is added for socket type also.Now if 2 processes use same
port  but different protocols (tcp or udp), then 2 different port entries
will be  added in the list. Similarly while checking smack access in
smk_ipv6_port_check() function,  port entry is searched on the basis of
both port and protocol.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <Himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
3c7ce34271 SMACK: Add the rcu synchronization mechanism in ipv6 hooks
Add the rcu synchronization mechanism for accessing smk_ipv6_port_list
in smack IPv6 hooks. Access to the port list is vulnerable to a race
condition issue,it does not apply proper synchronization methods while
working on critical section. It is possible that when one thread is
reading the list, at the same time another thread is modifying the
same port list, which can cause the major problems.

To ensure proper synchronization between two threads, rcu mechanism
has been applied while accessing and modifying the port list. RCU will
also not affect the performance, as there are more accesses than
modification where RCU is most effective synchronization mechanism.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Gary Tierney
900fde06cb selinux: default to security isid in sel_make_bools() if no sid is found
Use SECINITSID_SECURITY as the default SID for booleans which don't have
a matching SID returned from security_genfs_sid(), also update the
error message to a warning which matches this.

This prevents the policy failing to load (and consequently the system
failing to boot) when there is no default genfscon statement matched for
the selinuxfs in the new policy.

Signed-off-by: Gary Tierney <gary.tierney@gmx.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:32 -05:00
Gary Tierney
4262fb51c9 selinux: log errors when loading new policy
Adds error logging to the code paths which can fail when loading a new
policy in sel_write_load().  If the policy fails to be loaded from
userspace then a warning message is printed, whereas if a failure occurs
after loading policy from userspace an error message will be printed
with details on where policy loading failed (recreating one of /classes/,
/policy_capabilities/, /booleans/ in the SELinux fs).

Also, if sel_make_bools() fails to obtain an SID for an entry in
/booleans/* an error will be printed indicating the path of the
boolean.

Signed-off-by: Gary Tierney <gary.tierney@gmx.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
b21507e272 proc,security: move restriction on writing /proc/pid/attr nodes to proc
Processes can only alter their own security attributes via
/proc/pid/attr nodes.  This is presently enforced by each individual
security module and is also imposed by the Linux credentials
implementation, which only allows a task to alter its own credentials.
Move the check enforcing this restriction from the individual
security modules to proc_pid_attr_write() before calling the security hook,
and drop the unnecessary task argument to the security hook since it can
only ever be the current task.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
be0554c9bf selinux: clean up cred usage and simplify
SELinux was sometimes using the task "objective" credentials when
it could/should use the "subjective" credentials.  This was sometimes
hidden by the fact that we were unnecessarily passing around pointers
to the current task, making it appear as if the task could be something
other than current, so eliminate all such passing of current.  Inline
various permission checking helper functions that can be reduced to a
single avc_has_perm() call.

Since the credentials infrastructure only allows a task to alter
its own credentials, we can always assume that current must be the same
as the target task in selinux_setprocattr after the check. We likely
should move this check from selinux_setprocattr() to proc_pid_attr_write()
and drop the task argument to the security hook altogether; it can only
serve to confuse things.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
01593d3299 selinux: allow context mounts on tmpfs, ramfs, devpts within user namespaces
commit aad82892af ("selinux: Add support for
unprivileged mounts from user namespaces") prohibited any use of context
mount options within non-init user namespaces.  However, this breaks
use of context mount options for tmpfs mounts within user namespaces,
which are being used by Docker/runc.  There is no reason to block such
usage for tmpfs, ramfs or devpts.  Exempt these filesystem types
from this restriction.

Before:
sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
mount: tmpfs is write-protected, mounting read-only
mount: cannot mount tmpfs read-only

After:
sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
sh# ls -Zd /tmp
unconfined_u:object_r:user_tmp_t:s0:c13 /tmp

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
ef37979a2c selinux: handle ICMPv6 consistently with ICMP
commit 79c8b348f215 ("selinux: support distinctions among all network
address families") mapped datagram ICMP sockets to the new icmp_socket
security class, but left ICMPv6 sockets unchanged.  This change fixes
that oversight to handle both kinds of sockets consistently.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Yongqin Liu
a2c7c6fbe5 selinux: add security in-core xattr support for tracefs
Since kernel 4.1 ftrace is supported as a new separate filesystem. It
gets automatically mounted by the kernel under the old path
/sys/kernel/debug/tracing. Because it lives now on a separate filesystem
SELinux needs to be updated to also support setting SELinux labels
on tracefs inodes.  This is required for compatibility in Android
when moving to Linux 4.1 or newer.

Signed-off-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: William Roberts <william.c.roberts@intel.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:30 -05:00
Stephen Smalley
da69a5306a selinux: support distinctions among all network address families
Extend SELinux to support distinctions among all network address families
implemented by the kernel by defining new socket security classes
and mapping to them. Otherwise, many sockets are mapped to the generic
socket class and are indistinguishable in policy.  This has come up
previously with regard to selectively allowing access to bluetooth sockets,
and more recently with regard to selectively allowing access to AF_ALG
sockets.  Guido Trentalancia submitted a patch that took a similar approach
to add only support for distinguishing AF_ALG sockets, but this generalizes
his approach to handle all address families implemented by the kernel.
Socket security classes are also added for ICMP and SCTP sockets.
Socket security classes were not defined for AF_* values that are reserved
but unimplemented in the kernel, e.g. AF_NETBEUI, AF_SECURITY, AF_ASH,
AF_ECONET, AF_SNA, AF_WANPIPE.

Backward compatibility is provided by only enabling the finer-grained
socket classes if a new policy capability is set in the policy; older
policies will behave as before.  The legacy redhat1 policy capability
that was only ever used in testing within Fedora for ptrace_child
is reclaimed for this purpose; as far as I can tell, this policy
capability is not enabled in any supported distro policy.

Add a pair of conditional compilation guards to detect when new AF_* values
are added so that we can update SELinux accordingly rather than having to
belatedly update it long after new address families are introduced.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:30 -05:00
Linus Torvalds
7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Linus Torvalds
67327145c4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull SElinux fix from James Morris:
 "From Paul:
   'A small SELinux patch to fix some clang/llvm compiler warnings and
    ensure the tools under scripts work well in the face of kernel
    changes'"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  selinux: use the kernel headers when building scripts/selinux
2016-12-22 10:03:52 -08:00
Paul Moore
bfc5e3a6af selinux: use the kernel headers when building scripts/selinux
Commit 3322d0d64f ("selinux: keep SELinux in sync with new capability
definitions") added a check on the defined capabilities without
explicitly including the capability header file which caused problems
when building genheaders for users of clang/llvm.  Resolve this by
using the kernel headers when building genheaders, which is arguably
the right thing to do regardless, and explicitly including the
kernel's capability.h header file in classmap.h.  We also update the
mdp build, even though it wasn't causing an error we really should
be using the headers from the kernel we are building.

Reported-by: Nicolas Iooss <nicolas.iooss@m4x.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-12-21 10:39:25 -05:00
Andreas Steffen
98e1d55d03 ima: platform-independent hash value
For remote attestion it is important for the ima measurement values to
be platform-independent.  Therefore integer fields to be hashed must be
converted to canonical format.

Link: http://lkml.kernel.org/r/1480554346-29071-11-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Andreas Steffen <andreas.steffen@strongswan.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:46 -08:00
Mimi Zohar
d68a6fe9fc ima: define a canonical binary_runtime_measurements list format
The IMA binary_runtime_measurements list is currently in platform native
format.

To allow restoring a measurement list carried across kexec with a
different endianness than the targeted kernel, this patch defines
little-endian as the canonical format.  For big endian systems wanting
to save/restore the measurement list from a system with a different
endianness, a new boot command line parameter named "ima_canonical_fmt"
is defined.

Considerations: use of the "ima_canonical_fmt" boot command line option
will break existing userspace applications on big endian systems
expecting the binary_runtime_measurements list to be in platform native
format.

Link: http://lkml.kernel.org/r/1480554346-29071-10-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:45 -08:00
Mimi Zohar
c7d0936770 ima: support restoring multiple template formats
The configured IMA measurement list template format can be replaced at
runtime on the boot command line, including a custom template format.
This patch adds support for restoring a measuremement list containing
multiple builtin/custom template formats.

Link: http://lkml.kernel.org/r/1480554346-29071-9-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:45 -08:00
Mimi Zohar
3f23d624de ima: store the builtin/custom template definitions in a list
The builtin and single custom templates are currently stored in an
array.  In preparation for being able to restore a measurement list
containing multiple builtin/custom templates, this patch stores the
builtin and custom templates as a linked list.  This will permit
defining more than one custom template per boot.

Link: http://lkml.kernel.org/r/1480554346-29071-8-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:45 -08:00
Mimi Zohar
7b8589cc29 ima: on soft reboot, save the measurement list
The TPM PCRs are only reset on a hard reboot.  In order to validate a
TPM's quote after a soft reboot (eg.  kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot.

This patch uses the kexec buffer passing mechanism to pass the
serialized IMA binary_runtime_measurements to the next kernel.

Link: http://lkml.kernel.org/r/1480554346-29071-7-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:44 -08:00
Mimi Zohar
d158847ae8 ima: maintain memory size needed for serializing the measurement list
In preparation for serializing the binary_runtime_measurements, this
patch maintains the amount of memory required.

Link: http://lkml.kernel.org/r/1480554346-29071-5-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:44 -08:00
Mimi Zohar
dcfc56937b ima: permit duplicate measurement list entries
Measurements carried across kexec need to be added to the IMA
measurement list, but should not prevent measurements of the newly
booted kernel from being added to the measurement list.  This patch adds
support for allowing duplicate measurements.

The "boot_aggregate" measurement entry is the delimiter between soft
boots.

Link: http://lkml.kernel.org/r/1480554346-29071-4-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:43 -08:00
Mimi Zohar
94c3aac567 ima: on soft reboot, restore the measurement list
The TPM PCRs are only reset on a hard reboot.  In order to validate a
TPM's quote after a soft reboot (eg.  kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot.  This
patch restores the measurement list.

Link: http://lkml.kernel.org/r/1480554346-29071-3-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:43 -08:00
Linus Torvalds
9a19a6db37 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:

 - more ->d_init() stuff (work.dcache)

 - pathname resolution cleanups (work.namei)

 - a few missing iov_iter primitives - copy_from_iter_full() and
   friends. Either copy the full requested amount, advance the iterator
   and return true, or fail, return false and do _not_ advance the
   iterator. Quite a few open-coded callers converted (and became more
   readable and harder to fuck up that way) (work.iov_iter)

 - several assorted patches, the big one being logfs removal

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  logfs: remove from tree
  vfs: fix put_compat_statfs64() does not handle errors
  namei: fold should_follow_link() with the step into not-followed link
  namei: pass both WALK_GET and WALK_MORE to should_follow_link()
  namei: invert WALK_PUT logics
  namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link()
  namei: saner calling conventions for mountpoint_last()
  namei.c: get rid of user_path_parent()
  switch getfrag callbacks to ..._full() primitives
  make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success
  [iov_iter] new primitives - copy_from_iter_full() and friends
  don't open-code file_inode()
  ceph: switch to use of ->d_init()
  ceph: unify dentry_operations instances
  lustre: switch to use of ->d_init()
2016-12-16 10:24:44 -08:00
Al Viro
c4364f837c Merge branches 'work.namei', 'work.dcache' and 'work.iov_iter' into for-linus 2016-12-15 01:07:29 -05:00
Linus Torvalds
a57cb1c1d7 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - a few misc things

 - kexec updates

 - DMA-mapping updates to better support networking DMA operations

 - IPC updates

 - various MM changes to improve DAX fault handling

 - lots of radix-tree changes, mainly to the test suite. All leading up
   to reimplementing the IDA/IDR code to be a wrapper layer over the
   radix-tree. However the final trigger-pulling patch is held off for
   4.11.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  radix tree test suite: delete unused rcupdate.c
  radix tree test suite: add new tag check
  radix-tree: ensure counts are initialised
  radix tree test suite: cache recently freed objects
  radix tree test suite: add some more functionality
  idr: reduce the number of bits per level from 8 to 6
  rxrpc: abstract away knowledge of IDR internals
  tpm: use idr_find(), not idr_find_slowpath()
  idr: add ida_is_empty
  radix tree test suite: check multiorder iteration
  radix-tree: fix replacement for multiorder entries
  radix-tree: add radix_tree_split_preload()
  radix-tree: add radix_tree_split
  radix-tree: add radix_tree_join
  radix-tree: delete radix_tree_range_tag_if_tagged()
  radix-tree: delete radix_tree_locate_item()
  radix-tree: improve multiorder iterators
  btrfs: fix race in btrfs_free_dummy_fs_info()
  radix-tree: improve dump output
  radix-tree: make radix_tree_find_next_bit more useful
  ...
2016-12-14 17:25:18 -08:00
Lorenzo Stoakes
5b56d49fc3 mm: add locked parameter to get_user_pages_remote()
Patch series "mm: unexport __get_user_pages_unlocked()".

This patch series continues the cleanup of get_user_pages*() functions
taking advantage of the fact we can now pass gup_flags as we please.

It firstly adds an additional 'locked' parameter to
get_user_pages_remote() to allow for its callers to utilise
VM_FAULT_RETRY functionality.  This is necessary as the invocation of
__get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of
this and no other existing higher level function would allow it to do
so.

Secondly existing callers of __get_user_pages_unlocked() are replaced
with the appropriate higher-level replacement -
get_user_pages_unlocked() if the current task and memory descriptor are
referenced, or get_user_pages_remote() if other task/memory descriptors
are referenced (having acquiring mmap_sem.)

This patch (of 2):

Add a int *locked parameter to get_user_pages_remote() to allow
VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked().

Taking into account the previous adjustments to get_user_pages*()
functions allowing for the passing of gup_flags, we are now in a
position where __get_user_pages_unlocked() need only be exported for his
ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to
subsequently unexport __get_user_pages_unlocked() as well as allowing
for future flexibility in the use of get_user_pages_remote().

[sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change]
  Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au
Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Linus Torvalds
412ac77a9d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "After a lot of discussion and work we have finally reachanged a basic
  understanding of what is necessary to make unprivileged mounts safe in
  the presence of EVM and IMA xattrs which the last commit in this
  series reflects. While technically it is a revert the comments it adds
  are important for people not getting confused in the future. Clearing
  up that confusion allows us to seriously work on unprivileged mounts
  of fuse in the next development cycle.

  The rest of the fixes in this set are in the intersection of user
  namespaces, ptrace, and exec. I started with the first fix which
  started a feedback cycle of finding additional issues during review
  and fixing them. Culiminating in a fix for a bug that has been present
  since at least Linux v1.0.

  Potentially these fixes were candidates for being merged during the rc
  cycle, and are certainly backport candidates but enough little things
  turned up during review and testing that I decided they should be
  handled as part of the normal development process just to be certain
  there were not any great surprises when it came time to backport some
  of these fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC"
  exec: Ensure mm->user_ns contains the execed files
  ptrace: Don't allow accessing an undumpable mm
  ptrace: Capture the ptracer's creds not PT_PTRACE_CAP
  mm: Add a user_ns owner to mm_struct and fix ptrace permission checks
2016-12-14 14:09:48 -08:00
Linus Torvalds
683b96f4d1 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Generally pretty quiet for this release. Highlights:

  Yama:
   - allow ptrace access for original parent after re-parenting

  TPM:
   - add documentation
   - many bugfixes & cleanups
   - define a generic open() method for ascii & bios measurements

  Integrity:
   - Harden against malformed xattrs

  SELinux:
   - bugfixes & cleanups

  Smack:
   - Remove unnecessary smack_known_invalid label
   - Do not apply star label in smack_setprocattr hook
   - parse mnt opts after privileges check (fixes unpriv DoS vuln)"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (56 commits)
  Yama: allow access for the current ptrace parent
  tpm: adjust return value of tpm_read_log
  tpm: vtpm_proxy: conditionally call tpm_chip_unregister
  tpm: Fix handling of missing event log
  tpm: Check the bios_dir entry for NULL before accessing it
  tpm: return -ENODEV if np is not set
  tpm: cleanup of printk error messages
  tpm: replace of_find_node_by_name() with dev of_node property
  tpm: redefine read_log() to handle ACPI/OF at runtime
  tpm: fix the missing .owner in tpm_bios_measurements_ops
  tpm: have event log use the tpm_chip
  tpm: drop tpm1_chip_register(/unregister)
  tpm: replace dynamically allocated bios_dir with a static array
  tpm: replace symbolic permission with octal for securityfs files
  char: tpm: fix kerneldoc tpm2_unseal_trusted name typo
  tpm_tis: Allow tpm_tis to be bound using DT
  tpm, tpm_vtpm_proxy: add kdoc comments for VTPM_PROXY_IOC_NEW_DEV
  tpm: Only call pm_runtime_get_sync if device has a parent
  tpm: define a generic open() method for ascii & bios measurements
  Documentation: tpm: add the Physical TPM device tree binding documentation
  ...
2016-12-14 13:57:44 -08:00
Linus Torvalds
9465d9cc31 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The time/timekeeping/timer folks deliver with this update:

   - Fix a reintroduced signed/unsigned issue and cleanup the whole
     signed/unsigned mess in the timekeeping core so this wont happen
     accidentaly again.

   - Add a new trace clock based on boot time

   - Prevent injection of random sleep times when PM tracing abuses the
     RTC for storage

   - Make posix timers configurable for real tiny systems

   - Add tracepoints for the alarm timer subsystem so timer based
     suspend wakeups can be instrumented

   - The usual pile of fixes and updates to core and drivers"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  timekeeping: Use mul_u64_u32_shr() instead of open coding it
  timekeeping: Get rid of pointless typecasts
  timekeeping: Make the conversion call chain consistently unsigned
  timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion
  alarmtimer: Add tracepoints for alarm timers
  trace: Update documentation for mono, mono_raw and boot clock
  trace: Add an option for boot clock as trace clock
  timekeeping: Add a fast and NMI safe boot clock
  timekeeping/clocksource_cyc2ns: Document intended range limitation
  timekeeping: Ignore the bogus sleep time if pm_trace is enabled
  selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous"
  clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap
  clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map()
  arm64: dts: rockchip: Arch counter doesn't tick in system suspend
  clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend
  posix-timers: Make them configurable
  posix_cpu_timers: Move the add_device_randomness() call to a proper place
  timer: Move sys_alarm from timer.c to itimer.c
  ptp_clock: Allow for it to be optional
  Kconfig: Regenerate *.c_shipped files after previous changes
  ...
2016-12-12 19:56:15 -08:00
Al Viro
cbbd26b8b1 [iov_iter] new primitives - copy_from_iter_full() and friends
copy_from_iter_full(), copy_from_iter_full_nocache() and
csum_and_copy_from_iter_full() - counterparts of copy_from_iter()
et.al., advancing iterator only in case of successful full copy
and returning whether it had been successful or not.

Convert some obvious users.  *NOTE* - do not blindly assume that
something is a good candidate for those unless you are sure that
not advancing iov_iter in failure case is the right thing in
this case.  Anything that does short read/short write kind of
stuff (or is in a loop, etc.) is unlikely to be a good one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-05 14:33:36 -05:00
Josh Stone
50523a29d9 Yama: allow access for the current ptrace parent
Under ptrace_scope=1, it's possible to have a tracee that is already
ptrace-attached, but is no longer a direct descendant.  For instance, a
forking daemon will be re-parented to init, losing its ancestry to the
tracer that launched it.

The tracer can continue using ptrace in that state, but it will be
denied other accesses that check PTRACE_MODE_ATTACH, like process_vm_rw
and various procfs files.  There's no reason to prevent such access for
a tracer that already has ptrace control anyway.

This patch adds a case to ptracer_exception_found to allow access for
any task in the same thread group as the current ptrace parent.

Signed-off-by: Josh Stone <jistone@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-12-05 11:48:01 +11:00
Al Viro
450630975d don't open-code file_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-04 18:29:28 -05:00
Eric W. Biederman
19339c2516 Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC"
This reverts commit 0b3c9761d1.

Seth Forshee <seth.forshee@canonical.com> writes:
> All right, I think 0b3c9761d1 should be
> reverted then. EVM is a machine-local integrity mechanism, and so it
> makes sense that the signature would be based on the kernel's notion of
> the uid and not the filesystem's.

I added a commment explaining why the EVM hmac needs to be in the
kernel's notion of uid and gid, not the filesystems to prevent
remounting the filesystem and gaining unwaranted trust in files.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-12-02 20:58:41 -06:00
James Morris
0821e30cd2 Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/selinux into next 2016-11-24 11:21:25 +11:00
James Morris
b075361e91 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2016-11-23 09:52:11 +11:00
Andreas Gruenbacher
9287aed2ad selinux: Convert isec->lock into a spinlock
Convert isec->lock from a mutex into a spinlock.  Instead of holding
the lock while sleeping in inode_doinit_with_dentry, set
isec->initialized to LABEL_PENDING and release the lock.  Then, when
the sid has been determined, re-acquire the lock.  If isec->initialized
is still set to LABEL_PENDING, set isec->sid; otherwise, the sid has
been set by another task (LABEL_INITIALIZED) or invalidated
(LABEL_INVALID) in the meantime.

This fixes a deadlock on gfs2 where

 * one task is in inode_doinit_with_dentry -> gfs2_getxattr, holds
   isec->lock, and tries to acquire the inode's glock, and

 * another task is in do_xmote -> inode_go_inval ->
   selinux_inode_invalidate_secctx, holds the inode's glock, and
   tries to acquire isec->lock.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
[PM: minor tweaks to keep checkpatch.pl happy]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-22 17:44:02 -05:00
James Morris
636e4625ad Merge remote branch 'smack/smack-for-4.10' into next 2016-11-22 22:15:30 +11:00
Stephen Smalley
3322d0d64f selinux: keep SELinux in sync with new capability definitions
When a new capability is defined, SELinux needs to be updated.
Trigger a build error if a new capability is defined without
corresponding update to security/selinux/include/classmap.h's
COMMON_CAP2_PERMS.  This is similar to BUILD_BUG_ON() guards
in the SELinux nlmsgtab code to ensure that SELinux tracks
new netlink message types as needed.

Note that there is already a similar build guard in
security/selinux/hooks.c to detect when more than 64
capabilities are defined, since that will require adding
a third capability class to SELinux.

A nicer way to do this would be to extend scripts/selinux/genheaders
or a similar tool to auto-generate the necessary definitions and code
for SELinux capability checking from include/uapi/linux/capability.h.
AppArmor does something similar in its Makefile, although it only
needs to generate a single table of names.  That is left as future
work.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: reformat the description to keep checkpatch.pl happy]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-21 15:37:24 -05:00
John Johansen
3d40658c97 apparmor: fix change_hat not finding hat after policy replacement
After a policy replacement, the task cred may be out of date and need
to be updated. However change_hat is using the stale profiles from
the out of date cred resulting in either: a stale profile being applied
or, incorrect failure when searching for a hat profile as it has been
migrated to the new parent profile.

Fixes: 01e2b670aa (failure to find hat)
Fixes: 898127c34e (stale policy being applied)
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000287
Cc: stable@vger.kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-11-21 18:01:28 +11:00
Stephen Smalley
ea49d10eee selinux: normalize input to /sys/fs/selinux/enforce
At present, one can write any signed integer value to
/sys/fs/selinux/enforce and it will be stored,
e.g. echo -1 > /sys/fs/selinux/enforce or echo 2 >
/sys/fs/selinux/enforce. This makes no real difference
to the kernel, since it only ever cares if it is zero or non-zero,
but some userspace code compares it with 1 to decide if SELinux
is enforcing, and this could confuse it. Only a process that is
already root and is allowed the setenforce permission in SELinux
policy can write to /sys/fs/selinux/enforce, so this is not considered
to be a security issue, but it should be fixed.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-20 17:13:19 -05:00
Nicolas Pitre
baa73d9e47 posix-timers: Make them configurable
Some embedded systems have no use for them.  This removes about
25KB from the kernel binary size when configured out.

Corresponding syscalls are routed to a stub logging the attempt to
use those syscalls which should be enough of a clue if they were
disabled without proper consideration. They are: timer_create,
timer_gettime: timer_getoverrun, timer_settime, timer_delete,
clock_adjtime, setitimer, getitimer, alarm.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep
syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast
majority of use cases with very little code.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: linux-kbuild@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Michal Marek <mmarek@suse.com>
Cc: Edward Cree <ecree@solarflare.com>
Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:26:35 +01:00
Casey Schaufler
152f91d4d1 Smack: Remove unnecessary smack_known_invalid
The invalid Smack label ("") and the Huh ("?") Smack label
serve the same purpose and having both is unnecessary.
While pulling out the invalid label it became clear that
the use of smack_from_secid() was inconsistent, so that
is repaired. The setting of inode labels to the invalid
label could never happen in a functional system, has
never been observed in the wild and is not what you'd
really want for a failure behavior in any case. That is
removed.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-15 09:34:39 -08:00
Tetsuo Handa
8c15d66e42 Smack: Use GFP_KERNEL for smack_parse_opts_str().
Since smack_parse_opts_str() is calling match_strdup() which uses
GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is
called by smack_parse_opts_str().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-14 12:55:11 -08:00
Andreas Gruenbacher
13457d073c selinux: Clean up initialization of isec->sclass
Now that isec->initialized == LABEL_INITIALIZED implies that
isec->sclass is valid, skip such inodes immediately in
inode_doinit_with_dentry.

For the remaining inodes, initialize isec->sclass at the beginning of
inode_doinit_with_dentry to simplify the code.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:53:04 -05:00
Andreas Gruenbacher
db978da8fa proc: Pass file mode to proc_pid_make_inode
Pass the file mode of the proc inode to be created to
proc_pid_make_inode.  In proc_pid_make_inode, initialize inode->i_mode
before calling security_task_to_inode.  This allows selinux to set
isec->sclass right away without introducing "half-initialized" inode
security structs.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:39:48 -05:00
Andreas Gruenbacher
420591128c selinux: Minor cleanups
Fix the comment for function __inode_security_revalidate, which returns
an integer.

Use the LABEL_* constants consistently for isec->initialized.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:25:07 -05:00
Tetsuo Handa
8931c3bdb3 SELinux: Use GFP_KERNEL for selinux_parse_opts_str().
Since selinux_parse_opts_str() is calling match_strdup() which uses
GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is
called by selinux_parse_opts_str().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:03:38 -05:00
Seth Forshee
b4bfec7f4a security/integrity: Harden against malformed xattrs
In general the handling of IMA/EVM xattrs is good, but I found
a few locations where either the xattr size or the value of the
type field in the xattr are not checked. Add a few simple checks
to these locations to prevent malformed or malicious xattrs from
causing problems.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-11-13 22:50:11 -05:00
Mimi Zohar
064be15c52 ima: include the reason for TPM-bypass mode
This patch includes the reason for going into TPM-bypass mode
and not using the TPM.

Signed-off-by: Mimi Zohar (zohar@linux.vnet.ibm>
2016-11-13 22:50:09 -05:00
Mimi Zohar
f5acb3dcba Revert "ima: limit file hash setting by user to fix and log modes"
Userspace applications have been modified to write security xattrs,
but they are not context aware.  In the case of security.ima, the
security xattr can be either a file hash or a file signature.
Permitting writing one, but not the other requires the application to
be context aware.

In addition, userspace applications might write files to a staging
area, which might not be in policy, and then change some file metadata
(eg. owner) making it in policy.  As a result, these files are not
labeled properly.

This reverts commit c68ed80c97, which
prevents writing file hashes as security.ima xattrs.

Requested-by: Patrick Ohly <patrick.ohly@intel.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-11-13 22:50:09 -05:00
Eric Richter
9a11a18902 ima: fix memory leak in ima_release_policy
When the "policy" securityfs file is opened for read, it is opened as a
sequential file. However, when it is eventually released, there is no
cleanup for the sequential file, therefore some memory is leaked.

This patch adds a call to seq_release() in ima_release_policy() to clean up
the memory when the file is opened for read.

Fixes: 80eae209d6 IMA: allow reading back the current policy
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-11-13 22:50:08 -05:00
Casey Schaufler
2e4939f702 Smack: ipv6 label match fix
The check for a deleted entry in the list of IPv6 host
addresses was being performed in the wrong place, leading
to most peculiar results in some cases. This puts the
check into the right place.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:22:18 -08:00
Himanshu Shukla
b437aba85b SMACK: Fix the memory leak in smack_cred_prepare() hook
Memory leak in smack_cred_prepare()function.
smack_cred_prepare() hook returns error if there is error in allocating
memory in smk_copy_rules() or smk_copy_relabel() function.
If smack_cred_prepare() function returns error then the calling
function should call smack_cred_free() function for cleanup.
In smack_cred_free() function first credential is  extracted and
then all rules are deleted. In smack_cred_prepare() function security
field is assigned in the end when all function return success. But this
function may return before and memory will not be freed.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:22:06 -08:00
Himanshu Shukla
7128ea159d SMACK: Do not apply star label in smack_setprocattr hook
Smack prohibits processes from using the star ("*") and web ("@") labels.
Checks have been added in other functions. In smack_setprocattr()
hook, only check for web ("@") label has been added and restricted
from applying web ("@") label.
Check for star ("*") label should also be added in smack_setprocattr()
hook. Return error should be "-EINVAL" not "-EPERM" as permission
is there for setting label but not the label value as star ("*") or
web ("@").

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:21:52 -08:00
Himanshu Shukla
2097f59920 smack: parse mnt opts after privileges check
In smack_set_mnt_opts()first the SMACK mount options are being
parsed and later it is being checked whether the user calling
mount has CAP_MAC_ADMIN capability.
This sequence of operationis will allow unauthorized user to add
SMACK labels in label list and may cause denial of security attack
by adding many labels by allocating kernel memory by unauthorized user.
Superblock smack flag is also being set as initialized though function
may return with EPERM error.
First check the capability of calling user then set the SMACK attributes
and smk_flags.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:21:32 -08:00
jooseong lee
08382c9f6e Smack: Assign smack_known_web label for kernel thread's
Assign smack_known_web label for kernel thread's socket

Creating struct sock by sk_alloc function in various kernel subsystems
like bluetooth doesn't call smack_socket_post_create(). In such case,
received sock label is the floor('_') label and makes access deny.

Signed-off-by: jooseong lee <jooseong.lee@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-04 17:42:57 -07:00
Artem Savkov
31e6ec4519 security/keys: make BIG_KEYS dependent on stdrng.
Since BIG_KEYS can't be compiled as module it requires one of the "stdrng"
providers to be compiled into kernel. Otherwise big_key_crypto_init() fails
on crypto_alloc_rng step and next dereference of big_key_skcipher (e.g. in
big_key_preparse()) results in a NULL pointer dereference.

Fixes: 13100a72f4 ('Security: Keys: Big keys stored encrypted')
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Stephan Mueller <smueller@chronox.de>
cc: Kirill Marinushkin <k.marinushkin@gmail.com>
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-10-27 16:03:33 +11:00
David Howells
7df3e59c3d KEYS: Sort out big_key initialisation
big_key has two separate initialisation functions, one that registers the
key type and one that registers the crypto.  If the key type fails to
register, there's no problem if the crypto registers successfully because
there's no way to reach the crypto except through the key type.

However, if the key type registers successfully but the crypto does not,
big_key_rng and big_key_blkcipher may end up set to NULL - but the code
neither checks for this nor unregisters the big key key type.

Furthermore, since the key type is registered before the crypto, it is
theoretically possible for the kernel to try adding a big_key before the
crypto is set up, leading to the same effect.

Fix this by merging big_key_crypto_init() and big_key_init() and calling
the resulting function late.  If they're going to be encrypted, we
shouldn't be creating big_keys before we have the facilities to do the
encryption available.  The key type registration is also moved after the
crypto initialisation.

The fix also includes message printing on failure.

If the big_key type isn't correctly set up, simply doing:

	dd if=/dev/zero bs=4096 count=1 | keyctl padd big_key a @s

ought to cause an oops.

Fixes: 13100a72f4 ('Security: Keys: Big keys stored encrypted')
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Peter Hlavaty <zer0mem@yahoo.com>
cc: Kirill Marinushkin <k.marinushkin@gmail.com>
cc: Artem Savkov <asavkov@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-10-27 16:03:27 +11:00
David Howells
03dab869b7 KEYS: Fix short sprintf buffer in /proc/keys show function
This fixes CVE-2016-7042.

Fix a short sprintf buffer in proc_keys_show().  If the gcc stack protector
is turned on, this can cause a panic due to stack corruption.

The problem is that xbuf[] is not big enough to hold a 64-bit timeout
rendered as weeks:

	(gdb) p 0xffffffffffffffffULL/(60*60*24*7)
	$2 = 30500568904943

That's 14 chars plus NUL, not 11 chars plus NUL.

Expand the buffer to 16 chars.

I think the unpatched code apparently works if the stack-protector is not
enabled because on a 32-bit machine the buffer won't be overflowed and on a
64-bit machine there's a 64-bit aligned pointer at one side and an int that
isn't checked again on the other side.

The panic incurred looks something like:

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe
CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f
 ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6
 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679
Call Trace:
 [<ffffffff813d941f>] dump_stack+0x63/0x84
 [<ffffffff811b2cb6>] panic+0xde/0x22a
 [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0
 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30
 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0
 [<ffffffff81350410>] ? key_validate+0x50/0x50
 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20
 [<ffffffff8126b31c>] seq_read+0x2cc/0x390
 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70
 [<ffffffff81244fc7>] __vfs_read+0x37/0x150
 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0
 [<ffffffff81246156>] vfs_read+0x96/0x130
 [<ffffffff81247635>] SyS_read+0x55/0xc0
 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4

Reported-by: Ondrej Kozina <okozina@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ondrej Kozina <okozina@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-10-27 16:03:24 +11:00
Linus Torvalds
86c5bf7101 Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vmap stack fixes from Ingo Molnar:
 "This is fallout from CONFIG_HAVE_ARCH_VMAP_STACK=y on x86: stack
  accesses that used to be just somewhat questionable are now totally
  buggy.

  These changes try to do it without breaking the ABI: the fields are
  left there, they are just reporting zero, or reporting narrower
  information (the maps file change)"

* 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  mm: Change vm_is_stack_for_task() to vm_is_stack_for_current()
  fs/proc: Stop trying to report thread stacks
  fs/proc: Stop reporting eip and esp in /proc/PID/stat
  mm/numa: Remove duplicated include from mprotect.c
2016-10-22 09:39:10 -07:00
Andy Lutomirski
d17af5056c mm: Change vm_is_stack_for_task() to vm_is_stack_for_current()
Asking for a non-current task's stack can't be done without races
unless the task is frozen in kernel mode.  As far as I know,
vm_is_stack_for_task() never had a safe non-current use case.

The __unused annotation is because some KSTK_ESP implementations
ignore their parameter, which IMO is further justification for this
patch.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-20 09:21:41 +02:00
Lorenzo Stoakes
9beae1ea89 mm: replace get_user_pages_remote() write/force parameters with gup_flags
This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19 08:12:02 -07:00
Linus Torvalds
101105b171 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 ">rename2() work from Miklos + current_time() from Deepa"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Replace current_fs_time() with current_time()
  fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
  fs: Replace CURRENT_TIME with current_time() for inode timestamps
  fs: proc: Delete inode time initializations in proc_alloc_inode()
  vfs: Add current_time() api
  vfs: add note about i_op->rename changes to porting
  fs: rename "rename2" i_op to "rename"
  vfs: remove unused i_op->rename
  fs: make remaining filesystems use .rename2
  libfs: support RENAME_NOREPLACE in simple_rename()
  fs: support RENAME_NOREPLACE for local filesystems
  ncpfs: fix unused variable warning
2016-10-10 20:16:43 -07:00
Al Viro
3873691e5a Merge remote-tracking branch 'ovl/rename2' into for-linus 2016-10-10 23:02:51 -04:00
Linus Torvalds
97d2116708 Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr updates from Al Viro:
 "xattr stuff from Andreas

  This completes the switch to xattr_handler ->get()/->set() from
  ->getxattr/->setxattr/->removexattr"

* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Remove {get,set,remove}xattr inode operations
  xattr: Stop calling {get,set,remove}xattr inode operations
  vfs: Check for the IOP_XATTR flag in listxattr
  xattr: Add __vfs_{get,set,remove}xattr helpers
  libfs: Use IOP_XATTR flag for empty directory handling
  vfs: Use IOP_XATTR flag for bad-inode handling
  vfs: Add IOP_XATTR inode operations flag
  vfs: Move xattr_resolve_name to the front of fs/xattr.c
  ecryptfs: Switch to generic xattr handlers
  sockfs: Get rid of getxattr iop
  sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
  kernfs: Switch to generic xattr handlers
  hfs: Switch to generic xattr handlers
  jffs2: Remove jffs2_{get,set,remove}xattr macros
  xattr: Remove unnecessary NULL attribute name check
2016-10-10 17:11:50 -07:00
Linus Torvalds
abb5a14fa2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted misc bits and pieces.

  There are several single-topic branches left after this (rename2
  series from Miklos, current_time series from Deepa Dinamani, xattr
  series from Andreas, uaccess stuff from from me) and I'd prefer to
  send those separately"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
  proc: switch auxv to use of __mem_open()
  hpfs: support FIEMAP
  cifs: get rid of unused arguments of CIFSSMBWrite()
  posix_acl: uapi header split
  posix_acl: xattr representation cleanups
  fs/aio.c: eliminate redundant loads in put_aio_ring_file
  fs/internal.h: add const to ns_dentry_operations declaration
  compat: remove compat_printk()
  fs/buffer.c: make __getblk_slow() static
  proc: unsigned file descriptors
  fs/file: more unsigned file descriptors
  fs: compat: remove redundant check of nr_segs
  cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
  cifs: don't use memcpy() to copy struct iov_iter
  get rid of separate multipage fault-in primitives
  fs: Avoid premature clearing of capabilities
  fs: Give dentry to inode_change_ok() instead of inode
  fuse: Propagate dentry down to inode_change_ok()
  ceph: Propagate dentry down to inode_change_ok()
  xfs: Propagate dentry down to inode_change_ok()
  ...
2016-10-10 13:04:49 -07:00
Linus Torvalds
563873318d Merge branch 'printk-cleanups'
Merge my system logging cleanups, triggered by the broken '\n' patches.

The line continuation handling has been broken basically forever, and
the code to handle the system log records was both confusing and
dubious.  And it would do entirely the wrong thing unless you always had
a terminating newline, partly because it couldn't actually see whether a
message was marked KERN_CONT or not (but partly because the LOG_CONT
handling in the recording code was rather confusing too).

This re-introduces a real semantically meaningful KERN_CONT, and fixes
the few places I noticed where it was missing.  There are probably more
missing cases, since KERN_CONT hasn't actually had any semantic meaning
for at least four years (other than the checkpatch meaning of "no log
level necessary, this is a continuation line").

This also allows the combination of KERN_CONT and a log level.  In that
case the log level will be ignored if the merging with a previous line
is successful, but if a new record is needed, that new record will now
get the right log level.

That also means that you can at least in theory combine KERN_CONT with
the "pr_info()" style helpers, although any use of pr_fmt() prefixing
would make that just result in a mess, of course (the prefix would end
up in the middle of a continuing line).

* printk-cleanups:
  printk: make reading the kernel log flush pending lines
  printk: re-organize log_output() to be more legible
  printk: split out core logging code into helper function
  printk: reinstate KERN_CONT for printing continuation lines
2016-10-10 09:29:50 -07:00
Linus Torvalds
4bcc595ccd printk: reinstate KERN_CONT for printing continuation lines
Long long ago the kernel log buffer was a buffered stream of bytes, very
much like stdio in user space.  It supported log levels by scanning the
stream and noticing the log level markers at the beginning of each line,
but if you wanted to print a partial line in multiple chunks, you just
did multiple printk() calls, and it just automatically worked.

Except when it didn't, and you had very confusing output when different
lines got all mixed up with each other.  Then you got fragment lines
mixing with each other, or with non-fragment lines, because it was
traditionally impossible to tell whether a printk() call was a
continuation or not.

To at least help clarify the issue of continuation lines, we added a
KERN_CONT marker back in 2007 to mark continuation lines:

  4749252776 ("printk: add KERN_CONT annotation").

That continuation marker was initially an empty string, and didn't
actuall make any semantic difference.  But it at least made it possible
to annotate the source code, and have check-patch notice that a printk()
didn't need or want a log level marker, because it was a continuation of
a previous line.

To avoid the ambiguity between a continuation line that had that
KERN_CONT marker, and a printk with no level information at all, we then
in 2009 made KERN_CONT be a real log level marker which meant that we
could now reliably tell the difference between the two cases.

  5fd29d6ccb ("printk: clean up handling of log-levels and newlines")

and we could take advantage of that to make sure we didn't mix up
continuation lines with lines that just didn't have any loglevel at all.

Then, in 2012, the kernel log buffer was changed to be a "record" based
log, where each line was a record that has a loglevel and a timestamp.

You can see the beginning of that conversion in commits

  e11fea92e1 ("kmsg: export printk records to the /dev/kmsg interface")
  7ff9554bb5 ("printk: convert byte-buffer to variable-length record buffer")

with a number of follow-up commits to fix some painful fallout from that
conversion.  Over all, it took a couple of months to sort out most of
it.  But the upside was that you could have concurrent readers (and
writers) of the kernel log and not have lines with mixed output in them.

And one particular pain-point for the record-based kernel logging was
exactly the fragmentary lines that are generated in smaller chunks.  In
order to still log them as one recrod, the continuation lines need to be
attached to the previous record properly.

However the explicit continuation record marker that is actually useful
for this exact case was actually removed in aroundm the same time by commit

  61e99ab8e3 ("printk: remove the now unnecessary "C" annotation for KERN_CONT")

due to the incorrect belief that KERN_CONT wasn't meaningful.  The
ambiguity between "is this a continuation line" or "is this a plain
printk with no log level information" was reintroduced, and in fact
became an even bigger pain point because there was now the whole
record-level merging of kernel messages going on.

This patch reinstates the KERN_CONT as a real non-empty string marker,
so that the ambiguity is fixed once again.

But it's not a plain revert of that original removal: in the four years
since we made KERN_CONT an empty string again, not only has the format
of the log level markers changed, we've also had some usage changes in
this area.

For example, some ACPI code seems to use KERN_CONT _together_ with a log
level, and now uses both the KERN_CONT marker and (for example) a
KERN_INFO marker to show that it's an informational continuation of a
line.

Which is actually not a bad idea - if the continuation line cannot be
attached to its predecessor, without the log level information we don't
know what log level to assign to it (and we traditionally just assigned
it the default loglevel).  So having both a log level and the KERN_CONT
marker is not necessarily a bad idea, but it does mean that we need to
actually iterate over potentially multiple markers, rather than just a
single one.

Also, since KERN_CONT was still conceptually needed, and encouraged, but
didn't actually _do_ anything, we've also had the reverse problem:
rather than having too many annotations it has too few, and there is bit
rot with code that no longer marks the continuation lines with the
KERN_CONT marker.

So this patch not only re-instates the non-empty KERN_CONT marker, it
also fixes up the cases of bit-rot I noticed in my own logs.

There are probably other cases where KERN_CONT will be needed to be
added, either because it is new code that never dealt with the need for
KERN_CONT, or old code that has bitrotted without anybody noticing.

That said, we should strive to avoid the need for KERN_CONT.  It does
result in real problems for logging, and should generally not be seen as
a good feature.  If we some day can get rid of the feature entirely,
because nobody does any fragmented printk calls, that would be lovely.

But until that point, let's at mark the code that relies on the hacky
multi-fragment kernel printk's.  Not only does it avoid the ambiguity,
it also annotates code as "maybe this would be good to fix some day".

(That said, particularly during single-threaded bootup, the downsides of
KERN_CONT are very limited.  Things get much hairier when you have
multiple threads going on and user level reading and writing logs too).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-09 12:23:38 -07:00
Al Viro
f334bcd94b Merge remote-tracking branch 'ovl/misc' into work.misc 2016-10-08 11:00:01 -04:00
Andreas Gruenbacher
5d6c31910b xattr: Add __vfs_{get,set,remove}xattr helpers
Right now, various places in the kernel check for the existence of
getxattr, setxattr, and removexattr inode operations and directly call
those operations.  Switch to helper functions and test for the IOP_XATTR
flag instead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07 20:10:44 -04:00
Linus Torvalds
2ab704a47e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina:
 "The usual rocket science from the trivial tree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  tracing/syscalls: fix multiline in error message text
  lib/Kconfig.debug: fix DEBUG_SECTION_MISMATCH description
  doc: vfs: fix fadvise() sycall name
  x86/entry: spell EBX register correctly in documentation
  securityfs: fix securityfs_create_dir comment
  irq: Fix typo in tracepoint.xml
2016-10-07 12:24:08 -07:00
Linus Torvalds
a3443cda55 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:

  SELinux/LSM:
   - overlayfs support, necessary for container filesystems

  LSM:
   - finally remove the kernel_module_from_file hook

  Smack:
   - treat signal delivery as an 'append' operation

  TPM:
   - lots of bugfixes & updates

  Audit:
   - new audit data type: LSM_AUDIT_DATA_FILE

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (47 commits)
  Revert "tpm/tpm_crb: implement tpm crb idle state"
  Revert "tmp/tpm_crb: fix Intel PTT hw bug during idle state"
  Revert "tpm/tpm_crb: open code the crb_init into acpi_add"
  Revert "tmp/tpm_crb: implement runtime pm for tpm_crb"
  lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE
  tmp/tpm_crb: implement runtime pm for tpm_crb
  tpm/tpm_crb: open code the crb_init into acpi_add
  tmp/tpm_crb: fix Intel PTT hw bug during idle state
  tpm/tpm_crb: implement tpm crb idle state
  tpm: add check for minimum buffer size in tpm_transmit()
  tpm: constify TPM 1.x header structures
  tpm/tpm_crb: fix the over 80 characters checkpatch warring
  tpm/tpm_crb: drop useless cpu_to_le32 when writing to registers
  tpm/tpm_crb: cache cmd_size register value.
  tmp/tpm_crb: drop include to platform_device
  tpm/tpm_tis: remove unused itpm variable
  tpm_crb: fix incorrect values of cmdReady and goIdle bits
  tpm_crb: refine the naming of constants
  tpm_crb: remove wmb()'s
  tpm_crb: fix crb_req_canceled behavior
  ...
2016-10-04 14:48:27 -07:00
Linus Torvalds
3cd013ab79 Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Another relatively small pull request for v4.9 with just two patches.

  The patch from Richard updates the list of features we support and
  report back to userspace; this should have been sent earlier with the
  rest of the v4.8 patches but it got lost in my inbox.

  The second patch fixes a problem reported by our Android friends where
  we weren't very consistent in recording PIDs"

* 'stable-4.9' of git://git.infradead.org/users/pcmoore/audit:
  audit: add exclude filter extension to feature bitmap
  audit: consistently record PIDs with task_tgid_nr()
2016-10-04 14:21:41 -07:00
Laurent Georget
1b4606511d securityfs: fix securityfs_create_dir comment
If there is an error creating a directory with securityfs_create_dir,
the error is propagated via ERR_PTR but the function comment claims that
NULL is returned.

This is a similar commit to 88e6c94cda
("fix long-broken securityfs_create_file comment") that did not fix
securityfs_create_dir comment at the same time.

Signed-off-by: Laurent Georget <laurent.georget@supelec.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-09-29 10:07:01 +02:00
Deepa Dinamani
078cd8279e fs: Replace CURRENT_TIME with current_time() for inode timestamps
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.

CURRENT_TIME is also not y2038 safe.

This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.

Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felipe Balbi <balbi@kernel.org>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 21:06:21 -04:00
Miklos Szeredi
2773bf00ae fs: rename "rename2" i_op to "rename"
Generated patch:

sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-27 11:03:58 +02:00
Miklos Szeredi
18fc84dafa vfs: remove unused i_op->rename
No in-tree uses remain.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-27 11:03:58 +02:00
Linus Torvalds
2ddfdd4289 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a regression in RSA that was only half-fixed earlier in the
  cycle.  It also fixes an older regression that breaks the keyring
  subsystem"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: rsa-pkcs1pad - Handle leading zero for decryption
  KEYS: Fix skcipher IV clobbering
2016-09-23 11:28:04 -07:00
Herbert Xu
456bee986e KEYS: Fix skcipher IV clobbering
The IV must not be modified by the skcipher operation so we need
to duplicate it.

Fixes: c3917fd9df ("KEYS: Use skcipher")
Cc: stable@vger.kernel.org
Reported-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-22 17:42:07 +08:00
James Morris
8a17ef9d85 Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/selinux into next 2016-09-21 11:54:19 +10:00
Vivek Goyal
43af5de742 lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE
Right now LSM_AUDIT_DATA_PATH type contains "struct path" in union "u"
of common_audit_data. This information is used to print path of file
at the same time it is also used to get to dentry and inode. And this
inode information is used to get to superblock and device and print
device information.

This does not work well for layered filesystems like overlay where dentry
contained in path is overlay dentry and not the real dentry of underlying
file system. That means inode retrieved from dentry is also overlay
inode and not the real inode.

SELinux helpers like file_path_has_perm() are doing checks on inode
retrieved from file_inode(). This returns the real inode and not the
overlay inode. That means we are doing check on real inode but for audit
purposes we are printing details of overlay inode and that can be
confusing while debugging.

Hence, introduce a new type LSM_AUDIT_DATA_FILE which carries file
information and inode retrieved is real inode using file_inode(). That
way right avc denied information is given to user.

For example, following is one example avc before the patch.

  type=AVC msg=audit(1473360868.399:214): avc:  denied  { read open } for
    pid=1765 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="overlay" ino=21443
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

It looks as follows after the patch.

  type=AVC msg=audit(1473360017.388:282): avc:  denied  { read open } for
    pid=2530 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="dm-0" ino=2377915
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

Notice that now dev information points to "dm-0" device instead of
"overlay" device. This makes it clear that check failed on underlying
inode and not on the overlay inode.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
[PM: slight tweaks to the description to make checkpatch.pl happy]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-09-19 13:42:38 -04:00
James Morris
de2f4b3453 Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/selinux into next 2016-09-19 12:27:10 +10:00
Miklos Szeredi
e71b9dff06 ima: use file_dentry()
Ima tries to call ->setxattr() on overlayfs dentry after having locked
underlying inode, which results in a deadlock.

Reported-by: Krisztian Litkey <kli@iki.fi>
Fixes: 4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-09-16 12:44:20 +02:00
Wei Yongjun
9b6a9ecc2d selinux: fix error return code in policydb_read()
Fix to return error code -EINVAL from the error handling case instead
of 0 (rc is overwrite to 0 when policyvers >=
POLICYDB_VERSION_ROLETRANS), as done elsewhere in this function.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
[PM: normalize "selinux" in patch subject, description line wrap]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-09-13 17:14:43 -04:00
Casey Schaufler
c60b906673 Smack: Signal delivery as an append operation
Under a strict subject/object security policy delivering a
signal or delivering network IPC could be considered either
a write or an append operation. The original choice to make
both write operations leads to an issue where IPC delivery
is desired under policy, but delivery of signals is not.
This patch provides the option of making signal delivery
an append operation, allowing Smack rules that deny signal
delivery while allowing IPC. This was requested for Tizen.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2016-09-08 13:22:56 -07:00
Linus Torvalds
80a77045da - force check_object_size() to be inline too
- move page-spanning check behind a CONFIG since it's triggering false positives
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJX0F3qAAoJEIly9N/cbcAmaxEP/R737i+XhP4VOv+rYW050NHB
 K+FDeLv5/Gx56JrHRRWwj80K5/9F09wS929AWcaTDjEb2NKp/o/6W/gN1eVdGlJF
 K3UQk+Kmncb44poVoHkjMRAl6+sfKm6mTWZBTjBECKwQuCFyDoDoqDXhn5IXTlw9
 Ig+TTOSgNw9gRke3ECtFynbVnDWx/Ry/axfT9vGXhFOkWclMUFy2UOdDSTtFAB6x
 yw5hdrfGakk2BPscHLO1xNqRuVLRUSXZVUiJGIQ6AiUupm34Yqmm69mrMuxaOtPC
 Ai3zhNGDuYClcGJAiPJYX+7nRjgPCWAdlyzQqLp5hwx63TJ+gxvhmxoFOJxEmHE/
 99i2Ak073Es6WII532Eknk3vV+UJzQNT/HO+0LcrJFkOEp9EHfVUb19CngQTaX7Q
 UbfYdyFgp3y24cRp7v0tP8gE2LCrsRe0UEhUq2NGmrerw3caNqZGHS9Od5OYEM8D
 uIhaotWoOv9Z0r+DZMGkUjfqeLb6RWNcUoWc5wZ3VYG27BM/pfhRxKf/2aw6O9u0
 2Jk1QJxBr+/8DQ500xu/IBOP9V7aGAc4nxKyqUlwA05/JEFGiAzCwqfZW5CKTxgD
 5Ht994WbTEH3/VaAskKnwggeHvttiEpehBCdVA4bXuhBhJhmPjFiKHX7uRrcM2GV
 /yH3UTkPnr/VD/I2ndiX
 =r8BE
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8-rc6-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull more hardened usercopyfixes from Kees Cook:

 - force check_object_size() to be inline too

 - move page-spanning check behind a CONFIG since it's triggering false
   positives

[ Changed the page-spanning config option to depend on EXPERT in the
  merge.  That way it still gets build testing, and you can enable it if
  you want to, but is never enabled for "normal" configurations ]

* tag 'usercopy-v4.8-rc6-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  usercopy: remove page-spanning test for now
  usercopy: force check_object_size() inline
2016-09-07 14:03:49 -07:00
Kees Cook
8e1f74ea02 usercopy: remove page-spanning test for now
A custom allocator without __GFP_COMP that copies to userspace has been
found in vmw_execbuf_process[1], so this disables the page-span checker
by placing it behind a CONFIG for future work where such things can be
tracked down later.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1373326

Reported-by: Vinson Lee <vlee@freedesktop.org>
Fixes: f5509cc18d ("mm: Hardened usercopy")
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-07 11:33:26 -07:00
Paul Moore
fa2bea2f5c audit: consistently record PIDs with task_tgid_nr()
Unfortunately we record PIDs in audit records using a variety of
methods despite the correct way being the use of task_tgid_nr().
This patch converts all of these callers, except for the case of
AUDIT_SET in audit_receive_msg() (see the comment in the code).

Reported-by: Jeff Vander Stoep <jeffv@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-30 17:19:13 -04:00
William Roberts
7c686af071 selinux: fix overflow and 0 length allocations
Throughout the SELinux LSM, values taken from sepolicy are
used in places where length == 0 or length == <saturated>
matter, find and fix these.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-30 15:45:50 -04:00
William Roberts
3bc7bcf69b selinux: initialize structures
libsepol pointed out an issue where its possible to have
an unitialized jmp and invalid dereference, fix this.
While we're here, zero allocate all the *_val_to_struct
structures.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-29 19:22:10 -04:00
William Roberts
74d977b65e selinux: detect invalid ebitmap
When count is 0 and the highbit is not zero, the ebitmap is not
valid and the internal node is not allocated. This causes issues
when routines, like mls_context_isvalid() attempt to use the
ebitmap_for_each_bit() and ebitmap_node_get_bit() as they assume
a highbit > 0 will have a node allocated.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-29 19:19:50 -04:00
Markus Elfring
63e24c4971 Smack: Use memdup_user() rather than duplicating its implementation
Reuse existing functionality from memdup_user() instead of keeping
duplicate source code.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-08-23 09:58:21 -07:00
Linus Torvalds
6040e57658 Make the hardened user-copy code depend on having a hardened allocator
The kernel test robot reported a usercopy failure in the new hardened
sanity checks, due to a page-crossing copy of the FPU state into the
task structure.

This happened because the kernel test robot was testing with SLOB, which
doesn't actually do the required book-keeping for slab allocations, and
as a result the hardening code didn't realize that the task struct
allocation was one single allocation - and the sanity checks fail.

Since SLOB doesn't even claim to support hardening (and you really
shouldn't use it), the straightforward solution is to just make the
usercopy hardening code depend on the allocator supporting it.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-19 12:47:01 -07:00
William Roberts
348a0db9e6 selinux: drop SECURITY_SELINUX_POLICYDB_VERSION_MAX
Remove the SECURITY_SELINUX_POLICYDB_VERSION_MAX Kconfig option

Per: https://github.com/SELinuxProject/selinux/wiki/Kernel-Todo

This was only needed on Fedora 3 and 4 and just causes issues now,
so drop it.

The MAX and MIN should just be whatever the kernel can support.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-18 20:01:15 -04:00
Vivek Goyal
a518b0a5b0 selinux: Implement dentry_create_files_as() hook
Calculate what would be the label of newly created file and set that
secid in the passed creds.

Context of the task which is actually creating file is retrieved from
set of creds passed in. (old->security).

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-10 08:25:22 -04:00
Vivek Goyal
2602625b7e security, overlayfs: Provide hook to correctly label newly created files
During a new file creation we need to make sure new file is created with the
right label. New file is created in upper/ so effectively file should get
label as if task had created file in upper/.

We switched to mounter's creds for actual file creation. Also if there is a
whiteout present, then file will be created in work/ dir first and then
renamed in upper. In none of the cases file will be labeled as we want it to
be.

This patch introduces a new hook dentry_create_files_as(), which determines
the label/context dentry will get if it had been created by task in upper
and modify passed set of creds appropriately. Caller makes use of these new
creds for file creation.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: fix whitespace issues found with checkpatch.pl]
[PM: changes to use stat->mode in ovl_create_or_link()]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:46:46 -04:00
Vivek Goyal
c957f6df52 selinux: Pass security pointer to determine_inode_label()
Right now selinux_determine_inode_label() works on security pointer of
current task. Soon I need this to work on a security pointer retrieved
from a set of creds. So start passing in a pointer and caller can
decide where to fetch security pointer from.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:45:29 -04:00
Vivek Goyal
19472b69d6 selinux: Implementation for inode_copy_up_xattr() hook
When a file is copied up in overlay, we have already created file on
upper/ with right label and there is no need to copy up selinux
label/xattr from lower file to upper file. In fact in case of context
mount, we don't want to copy up label as newly created file got its label
from context= option.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:43:59 -04:00
Vivek Goyal
121ab822ef security,overlayfs: Provide security hook for copy up of xattrs for overlay file
Provide a security hook which is called when xattrs of a file are being
copied up. This hook is called once for each xattr and LSM can return
0 if the security module wants the xattr to be copied up, 1 if the
security module wants the xattr to be discarded on the copy, -EOPNOTSUPP
if the security module does not handle/manage the xattr, or a -errno
upon an error.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: whitespace cleanup for checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:42:13 -04:00
Vivek Goyal
56909eb3f5 selinux: Implementation for inode_copy_up() hook
A file is being copied up for overlay file system. Prepare a new set of
creds and set create_sid appropriately so that new file is created with
appropriate label.

Overlay inode has right label for both context and non-context mount
cases. In case of non-context mount, overlay inode will have the label
of lower file and in case of context mount, overlay inode will have
the label from context= mount option.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:41:52 -04:00
Vivek Goyal
d8ad8b4961 security, overlayfs: provide copy up security hook for unioned files
Provide a security hook to label new file correctly when a file is copied
up from lower layer to upper layer of a overlay/union mount.

This hook can prepare a new set of creds which are suitable for new file
creation during copy up. Caller will use new creds to create file and then
revert back to old creds and release new creds.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: whitespace cleanup to appease checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:06:53 -04:00
Linus Torvalds
1eccfa090e Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user
bounds checking for most architectures on SLAB and SLUB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXl9tlAAoJEIly9N/cbcAm5BoP/ikTtDp2bFw1sn92yHTnIWzl
 O+dcKVAeRgjfnSvPfb1JITpaM58exQSaDsPBeR0DbVzU1zDdhLcwHHiQupFh98Ka
 vBZthbrlL/u4NB26enEEW0iyA32BsxYBMnIu0z5ux9RbZflmQwGQ0c0rvy3dJ7/b
 FzB5ayVST5y/a0m6/sImeeExh78GU9rsMb1XmJRMwlJAy6miDz/F9TP0LnuW6PhG
 J5XC99ygNJS1pQBLACRsrZw6ImgBxXnWCok6tWPMxFfD+rJBU2//wqS+HozyMWHL
 iYP7+ytVo/ZVok4114X/V4Oof3a6wqgpBuYrivJ228QO+UsLYbYLo6sZ8kRK7VFm
 9GgHo/8rWB1T9lBbSaa7UL5r0dVNNLjFGS42vwV+YlgUMQ1A35VRojO0jUnJSIQU
 Ug1IxKmylLd0nEcwD8/l3DXeQABsfL8GsoKW0OtdTZtW4RND4gzq34LK6t7hvayF
 kUkLg1OLNdUJwOi16M/rhugwYFZIMfoxQtjkRXKWN4RZ2QgSHnx2lhqNmRGPAXBG
 uy21wlzUTfLTqTpoeOyHzJwyF2qf2y4nsziBMhvmlrUvIzW1LIrYUKCNT4HR8Sh5
 lC2WMGYuIqaiu+NOF3v6CgvKd9UW+mxMRyPEybH8mEgfm+FLZlWABiBjIUpSEZuB
 JFfuMv1zlljj/okIQRg8
 =USIR
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull usercopy protection from Kees Cook:
 "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
  copy_from_user bounds checking for most architectures on SLAB and
  SLUB"

* tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mm: SLUB hardened usercopy support
  mm: SLAB hardened usercopy support
  s390/uaccess: Enable hardened usercopy
  sparc/uaccess: Enable hardened usercopy
  powerpc/uaccess: Enable hardened usercopy
  ia64/uaccess: Enable hardened usercopy
  arm64/uaccess: Enable hardened usercopy
  ARM: uaccess: Enable hardened usercopy
  x86/uaccess: Enable hardened usercopy
  mm: Hardened usercopy
  mm: Implement stack frame object validation
  mm: Add is_migrate_cma_page
2016-08-08 14:48:14 -07:00
William Roberts
8b31f456c7 selinux: print leading 0x on ioctlcmd audits
ioctlcmd is currently printing hex numbers, but their is no leading
0x. Thus things like ioctlcmd=1234 are misleading, as the base is
not evident.

Correct this by adding 0x as a prefix, so ioctlcmd=1234 becomes
ioctlcmd=0x1234.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 13:08:34 -04:00
Javier Martinez Canillas
1a93a6eac3 security: Use IS_ENABLED() instead of checking for built-in or module
The IS_ENABLED() macro checks if a Kconfig symbol has been enabled
either built-in or as a module, use that macro instead of open coding
the same.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 13:08:25 -04:00
Linus Torvalds
835c92d43b Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull qstr constification updates from Al Viro:
 "Fairly self-contained bunch - surprising lot of places passes struct
  qstr * as an argument when const struct qstr * would suffice; it
  complicates analysis for no good reason.

  I'd prefer to feed that separately from the assorted fixes (those are
  in #for-linus and with somewhat trickier topology)"

* 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  qstr: constify instances in adfs
  qstr: constify instances in lustre
  qstr: constify instances in f2fs
  qstr: constify instances in ext2
  qstr: constify instances in vfat
  qstr: constify instances in procfs
  qstr: constify instances in fuse
  qstr constify instances in fs/dcache.c
  qstr: constify instances in nfs
  qstr: constify instances in ocfs2
  qstr: constify instances in autofs4
  qstr: constify instances in hfs
  qstr: constify instances in hfsplus
  qstr: constify instances in logfs
  qstr: constify dentry_init_security
2016-08-06 09:49:02 -04:00
Linus Torvalds
7a1e8b80fb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - TPM core and driver updates/fixes
   - IPv6 security labeling (CALIPSO)
   - Lots of Apparmor fixes
   - Seccomp: remove 2-phase API, close hole where ptrace can change
     syscall #"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
  apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
  tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
  tpm: Factor out common startup code
  tpm: use devm_add_action_or_reset
  tpm2_i2c_nuvoton: add irq validity check
  tpm: read burstcount from TPM_STS in one 32-bit transaction
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  tpm_tis_core: convert max timeouts from msec to jiffies
  apparmor: fix arg_size computation for when setprocattr is null terminated
  apparmor: fix oops, validate buffer size in apparmor_setprocattr()
  apparmor: do not expose kernel stack
  apparmor: fix module parameters can be changed after policy is locked
  apparmor: fix oops in profile_unpack() when policy_db is not present
  apparmor: don't check for vmalloc_addr if kvzalloc() failed
  apparmor: add missing id bounds check on dfa verification
  apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
  apparmor: use list_next_entry instead of list_entry_next
  apparmor: fix refcount race when finding a child profile
  apparmor: fix ref count leak when profile sha1 hash is read
  apparmor: check that xindex is in trans_table bounds
  ...
2016-07-29 17:38:46 -07:00
Linus Torvalds
a867d7349e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns vfs updates from Eric Biederman:
 "This tree contains some very long awaited work on generalizing the
  user namespace support for mounting filesystems to include filesystems
  with a backing store.  The real world target is fuse but the goal is
  to update the vfs to allow any filesystem to be supported.  This
  patchset is based on a lot of code review and testing to approach that
  goal.

  While looking at what is needed to support the fuse filesystem it
  became clear that there were things like xattrs for security modules
  that needed special treatment.  That the resolution of those concerns
  would not be fuse specific.  That sorting out these general issues
  made most sense at the generic level, where the right people could be
  drawn into the conversation, and the issues could be solved for
  everyone.

  At a high level what this patchset does a couple of simple things:

   - Add a user namespace owner (s_user_ns) to struct super_block.

   - Teach the vfs to handle filesystem uids and gids not mapping into
     to kuids and kgids and being reported as INVALID_UID and
     INVALID_GID in vfs data structures.

  By assigning a user namespace owner filesystems that are mounted with
  only user namespace privilege can be detected.  This allows security
  modules and the like to know which mounts may not be trusted.  This
  also allows the set of uids and gids that are communicated to the
  filesystem to be capped at the set of kuids and kgids that are in the
  owning user namespace of the filesystem.

  One of the crazier corner casees this handles is the case of inodes
  whose i_uid or i_gid are not mapped into the vfs.  Most of the code
  simply doesn't care but it is easy to confuse the inode writeback path
  so no operation that could cause an inode write-back is permitted for
  such inodes (aka only reads are allowed).

  This set of changes starts out by cleaning up the code paths involved
  in user namespace permirted mounts.  Then when things are clean enough
  adds code that cleanly sets s_user_ns.  Then additional restrictions
  are added that are possible now that the filesystem superblock
  contains owner information.

  These changes should not affect anyone in practice, but there are some
  parts of these restrictions that are changes in behavior.

   - Andy's restriction on suid executables that does not honor the
     suid bit when the path is from another mount namespace (think
     /proc/[pid]/fd/) or when the filesystem was mounted by a less
     privileged user.

   - The replacement of the user namespace implicit setting of MNT_NODEV
     with implicitly setting SB_I_NODEV on the filesystem superblock
     instead.

     Using SB_I_NODEV is a stronger form that happens to make this state
     user invisible.  The user visibility can be managed but it caused
     problems when it was introduced from applications reasonably
     expecting mount flags to be what they were set to.

  There is a little bit of work remaining before it is safe to support
  mounting filesystems with backing store in user namespaces, beyond
  what is in this set of changes.

   - Verifying the mounter has permission to read/write the block device
     during mount.

   - Teaching the integrity modules IMA and EVM to handle filesystems
     mounted with only user namespace root and to reduce trust in their
     security xattrs accordingly.

   - Capturing the mounters credentials and using that for permission
     checks in d_automount and the like.  (Given that overlayfs already
     does this, and we need the work in d_automount it make sense to
     generalize this case).

  Furthermore there are a few changes that are on the wishlist:

   - Get all filesystems supporting posix acls using the generic posix
     acls so that posix_acl_fix_xattr_from_user and
     posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

   - Reducing the permission checks in places such as remount to allow
     the superblock owner to perform them.

   - Allowing the superblock owner to chown files with unmapped uids and
     gids to something that is mapped so the files may be treated
     normally.

  I am not considering even obvious relaxations of permission checks
  until it is clear there are no more corner cases that need to be
  locked down and handled generically.

  Many thanks to Seth Forshee who kept this code alive, and putting up
  with me rewriting substantial portions of what he did to handle more
  corner cases, and for his diligent testing and reviewing of my
  changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...
2016-07-29 15:54:19 -07:00
Linus Torvalds
6784725ab0 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  Probably the most interesting part long-term is ->d_init() - that will
  have a bunch of followups in (at least) ceph and lustre, but we'll
  need to sort the barrier-related rules before it can get used for
  really non-trivial stuff.

  Another fun thing is the merge of ->d_iput() callers (dentry_iput()
  and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
  except the one in __d_lookup_lru())"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
  fs/dcache.c: avoid soft-lockup in dput()
  vfs: new d_init method
  vfs: Update lookup_dcache() comment
  bdev: get rid of ->bd_inodes
  Remove last traces of ->sync_page
  new helper: d_same_name()
  dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
  vfs: clean up documentation
  vfs: document ->d_real()
  vfs: merge .d_select_inode() into .d_real()
  unify dentry_iput() and dentry_unlink_inode()
  binfmt_misc: ->s_root is not going anywhere
  drop redundant ->owner initializations
  ufs: get rid of redundant checks
  orangefs: constify inode_operations
  missed comment updates from ->direct_IO() prototype change
  file_inode(f)->i_mapping is f->f_mapping
  trim fsnotify hooks a bit
  9p: new helper - v9fs_parent_fid()
  debugfs: ->d_parent is never NULL or negative
  ...
2016-07-28 12:59:05 -07:00
Linus Torvalds
554828ee0d Merge branch 'salted-string-hash'
This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.

That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.

It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.

Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.

* merge branch 'salted-string-hash':
  fs/dcache.c: Save one 32-bit multiply in dcache lookup
  vfs: make the string hashes salt the hash
2016-07-28 12:26:31 -07:00
Arnd Bergmann
7616ac70d1 apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
The newly added Kconfig option could never work and just causes a build error
when disabled:

security/apparmor/lsm.c:675:25: error: 'CONFIG_SECURITY_APPARMOR_HASH_DEFAULT' undeclared here (not in a function)
 bool aa_g_hash_policy = CONFIG_SECURITY_APPARMOR_HASH_DEFAULT;

The problem is that the macro undefined in this case, and we need to use the IS_ENABLED()
helper to turn it into a boolean constant.

Another minor problem with the original patch is that the option is even offered
in sysfs when SECURITY_APPARMOR_HASH is not enabled, so this also hides the option
in that case.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6059f71f1e ("apparmor: add parameter to control whether policy hashing is used")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-27 17:39:26 +10:00
Kees Cook
f5509cc18d mm: Hardened usercopy
This is the start of porting PAX_USERCOPY into the mainline kernel. This
is the first set of features, controlled by CONFIG_HARDENED_USERCOPY. The
work is based on code by PaX Team and Brad Spengler, and an earlier port
from Casey Schaufler. Additional non-slab page tests are from Rik van Riel.

This patch contains the logic for validating several conditions when
performing copy_to_user() and copy_from_user() on the kernel object
being copied to/from:
- address range doesn't wrap around
- address range isn't NULL or zero-allocated (with a non-zero copy size)
- if on the slab allocator:
  - object size must be less than or equal to copy size (when check is
    implemented in the allocator, which appear in subsequent patches)
- otherwise, object must not span page allocations (excepting Reserved
  and CMA ranges)
- if on the stack
  - object must not extend before/after the current process stack
  - object must be contained by a valid stack frame (when there is
    arch/build support for identifying stack frames)
- object must not overlap with kernel text

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-26 14:41:47 -07:00
Linus Torvalds
bbce2ad2d7 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.8:

  API:
   - first part of skcipher low-level conversions
   - add KPP (Key-agreement Protocol Primitives) interface.

  Algorithms:
   - fix IPsec/cryptd reordering issues that affects aesni
   - RSA no longer does explicit leading zero removal
   - add SHA3
   - add DH
   - add ECDH
   - improve DRBG performance by not doing CTR by hand

  Drivers:
   - add x86 AVX2 multibuffer SHA256/512
   - add POWER8 optimised crc32c
   - add xts support to vmx
   - add DH support to qat
   - add RSA support to caam
   - add Layerscape support to caam
   - add SEC1 AEAD support to talitos
   - improve performance by chaining requests in marvell/cesa
   - add support for Araneus Alea I USB RNG
   - add support for Broadcom BCM5301 RNG
   - add support for Amlogic Meson RNG
   - add support Broadcom NSP SoC RNG"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
  crypto: vmx - Fix aes_p8_xts_decrypt build failure
  crypto: vmx - Ignore generated files
  crypto: vmx - Adding support for XTS
  crypto: vmx - Adding asm subroutines for XTS
  crypto: skcipher - add comment for skcipher_alg->base
  crypto: testmgr - Print akcipher algorithm name
  crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
  crypto: nx - off by one bug in nx_of_update_msc()
  crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
  crypto: scatterwalk - Inline start/map/done
  crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
  crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
  crypto: scatterwalk - Fix test in scatterwalk_done
  crypto: api - Optimise away crypto_yield when hard preemption is on
  crypto: scatterwalk - add no-copy support to copychunks
  crypto: scatterwalk - Remove scatterwalk_bytes_sglen
  crypto: omap - Stop using crypto scatterwalk_bytes_sglen
  crypto: skcipher - Remove top-level givcipher interface
  crypto: user - Remove crypto_lookup_skcipher call
  crypto: cts - Convert to skcipher
  ...
2016-07-26 13:40:17 -07:00
Al Viro
4f3ccd7657 qstr: constify dentry_init_security
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-20 23:30:06 -04:00
John Johansen
d4d03f74a7 apparmor: fix arg_size computation for when setprocattr is null terminated
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Vegard Nossum
e89b808132 apparmor: fix oops, validate buffer size in apparmor_setprocattr()
When proc_pid_attr_write() was changed to use memdup_user apparmor's
(interface violating) assumption that the setprocattr buffer was always
a single page was violated.

The size test is not strictly speaking needed as proc_pid_attr_write()
will reject anything larger, but for the sake of robustness we can keep
it in.

SMACK and SELinux look safe to me, but somebody else should probably
have a look just in case.

Based on original patch from Vegard Nossum <vegard.nossum@oracle.com>
modified for the case that apparmor provides null termination.

Fixes: bb646cdb12
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-12 08:43:10 -07:00
Heinrich Schuchardt
f4ee2def2d apparmor: do not expose kernel stack
Do not copy uninitalized fields th.td_hilen, th.td_data.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
58acf9d911 apparmor: fix module parameters can be changed after policy is locked
the policy_lock parameter is a one way switch that prevents policy
from being further modified. Unfortunately some of the module parameters
can effectively modify policy by turning off enforcement.

split policy_admin_capable into a view check and a full admin check,
and update the admin check to test the policy_lock parameter.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
5f20fdfed1 apparmor: fix oops in profile_unpack() when policy_db is not present
BugLink: http://bugs.launchpad.net/bugs/1592547

If unpack_dfa() returns NULL due to the dfa not being present,
profile_unpack() is not checking if the dfa is not present (NULL).

Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
3197f5adf5 apparmor: don't check for vmalloc_addr if kvzalloc() failed
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
15756178c6 apparmor: add missing id bounds check on dfa verification
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Jeff Mahoney
ff118479a7 apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
While using AppArmor, SYS_CAP_RESOURCE is insufficient to call prlimit
on another task. The only other example of a AppArmor mediating access to
another, already running, task (ignoring fork+exec) is ptrace.

The AppArmor model for ptrace is that one of the following must be true:
1) The tracer is unconfined
2) The tracer is in complain mode
3) The tracer and tracee are confined by the same profile
4) The tracer is confined but has SYS_CAP_PTRACE

1), 2, and 3) are already true for setrlimit.

We can match the ptrace model just by allowing CAP_SYS_RESOURCE.

We still test the values of the rlimit since it can always be overridden
using a value that means unlimited for a particular resource.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Geliang Tang
38dbd7d8be apparmor: use list_next_entry instead of list_entry_next
list_next_entry has been defined in list.h, so I replace list_entry_next
with it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
de7c4cc947 apparmor: fix refcount race when finding a child profile
When finding a child profile via an rcu critical section, the profile
may be put and scheduled for deletion after the child is found but
before its refcount is incremented.

Protect against this by repeating the lookup if the profiles refcount
is 0 and is one its way to deletion.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
0b938a2e2c apparmor: fix ref count leak when profile sha1 hash is read
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
23ca7b640b apparmor: check that xindex is in trans_table bounds
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f7da2de011 apparmor: ensure the target profile name is always audited
The target profile name was not being correctly audited in a few
cases because the target variable was not being set and gotos
passed the code to set it at apply:

Since it is always based on new_profile just drop the target var
and conditionally report based on new_profile.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
7ee6da25dc apparmor: fix audit full profile hname on successful load
Currently logging of a successful profile load only logs the basename
of the profile. This can result in confusion when a child profile has
the same name as the another profile in the set. Logging the hname
will ensure there is no confusion.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
bf15cf0c64 apparmor: fix log failures for all profiles in a set
currently only the profile that is causing the failure is logged. This
makes it more confusing than necessary about which profiles loaded
and which didn't. So make sure to log success and failure messages for
all profiles in the set being loaded.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f351841f8d apparmor: fix put() parent ref after updating the active ref
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
6059f71f1e apparmor: add parameter to control whether policy hashing is used
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
bd35db8b8c apparmor: internal paths should be treated as disconnected
Internal mounts are not mounted anywhere and as such should be treated
as disconnected paths.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f2e561d190 apparmor: fix disconnected bind mnts reconnection
Bind mounts can fail to be properly reconnected when PATH_CONNECT is
specified. Ensure that when PATH_CONNECT is specified the path has
a root.

BugLink: http://bugs.launchpad.net/bugs/1319984

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
d671e89020 apparmor: fix update the mtime of the profile file on replacement
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
9049a79221 apparmor: exec should not be returning ENOENT when it denies
The current behavior is confusing as it causes exec failures to report
the executable is missing instead of identifying that apparmor
caused the failure.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
b6b1b81b3a apparmor: fix uninitialized lsm_audit member
BugLink: http://bugs.launchpad.net/bugs/1268727

The task field in the lsm_audit struct needs to be initialized if
a change_hat fails, otherwise the following oops will occur

BUG: unable to handle kernel paging request at 0000002fbead7d08
IP: [<ffffffff8171153e>] _raw_spin_lock+0xe/0x50
PGD 1e3f35067 PUD 0
Oops: 0002 [#1] SMP
Modules linked in: pppox crc_ccitt p8023 p8022 psnap llc ax25 btrfs raid6_pq xor xfs libcrc32c dm_multipath scsi_dh kvm_amd dcdbas kvm microcode amd64_edac_mod joydev edac_core psmouse edac_mce_amd serio_raw k10temp sp5100_tco i2c_piix4 ipmi_si ipmi_msghandler acpi_power_meter mac_hid lp parport hid_generic usbhid hid pata_acpi mpt2sas ahci raid_class pata_atiixp bnx2 libahci scsi_transport_sas [last unloaded: tipc]
CPU: 2 PID: 699 Comm: changehat_twice Tainted: GF          O 3.13.0-7-generic #25-Ubuntu
Hardware name: Dell Inc. PowerEdge R415/08WNM9, BIOS 1.8.6 12/06/2011
task: ffff8802135c6000 ti: ffff880212986000 task.ti: ffff880212986000
RIP: 0010:[<ffffffff8171153e>]  [<ffffffff8171153e>] _raw_spin_lock+0xe/0x50
RSP: 0018:ffff880212987b68  EFLAGS: 00010006
RAX: 0000000000020000 RBX: 0000002fbead7500 RCX: 0000000000000000
RDX: 0000000000000292 RSI: ffff880212987ba8 RDI: 0000002fbead7d08
RBP: ffff880212987b68 R08: 0000000000000246 R09: ffff880216e572a0
R10: ffffffff815fd677 R11: ffffea0008469580 R12: ffffffff8130966f
R13: ffff880212987ba8 R14: 0000002fbead7d08 R15: ffff8800d8c6b830
FS:  00002b5e6c84e7c0(0000) GS:ffff880216e40000(0000) knlGS:0000000055731700
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000002fbead7d08 CR3: 000000021270f000 CR4: 00000000000006e0
Stack:
 ffff880212987b98 ffffffff81075f17 ffffffff8130966f 0000000000000009
 0000000000000000 0000000000000000 ffff880212987bd0 ffffffff81075f7c
 0000000000000292 ffff880212987c08 ffff8800d8c6b800 0000000000000026
Call Trace:
 [<ffffffff81075f17>] __lock_task_sighand+0x47/0x80
 [<ffffffff8130966f>] ? apparmor_cred_prepare+0x2f/0x50
 [<ffffffff81075f7c>] do_send_sig_info+0x2c/0x80
 [<ffffffff81075fee>] send_sig_info+0x1e/0x30
 [<ffffffff8130242d>] aa_audit+0x13d/0x190
 [<ffffffff8130c1dc>] aa_audit_file+0xbc/0x130
 [<ffffffff8130966f>] ? apparmor_cred_prepare+0x2f/0x50
 [<ffffffff81304cc2>] aa_change_hat+0x202/0x530
 [<ffffffff81308fc6>] aa_setprocattr_changehat+0x116/0x1d0
 [<ffffffff8130a11d>] apparmor_setprocattr+0x25d/0x300
 [<ffffffff812cee56>] security_setprocattr+0x16/0x20
 [<ffffffff8121fc87>] proc_pid_attr_write+0x107/0x130
 [<ffffffff811b7604>] vfs_write+0xb4/0x1f0
 [<ffffffff811b8039>] SyS_write+0x49/0xa0
 [<ffffffff8171a1bf>] tracesys+0xe1/0xe6

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
ec34fa24a9 apparmor: fix replacement bug that adds new child to old parent
When set atomic replacement is used and the parent is updated before the
child, and the child did not exist in the old parent so there is no
direct replacement then the new child is incorrectly added to the old
parent. This results in the new parent not having the child(ren) that
it should and the old parent when being destroyed asserting the
following error.

AppArmor: policy_destroy: internal error, policy '<profile/name>' still
contains profiles

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
dcda617a0c apparmor: fix refcount bug in profile replacement
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
James Morris
e1e5fa9616 Merge tag 'keys-misc-20160708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2016-07-09 12:49:00 +10:00
James Morris
c632809953 Merge branch 'smack-for-4.8' of https://github.com/cschaufler/smack-next into next 2016-07-08 10:41:15 +10:00
Vegard Nossum
30a46a4647 apparmor: fix oops, validate buffer size in apparmor_setprocattr()
When proc_pid_attr_write() was changed to use memdup_user apparmor's
(interface violating) assumption that the setprocattr buffer was always
a single page was violated.

The size test is not strictly speaking needed as proc_pid_attr_write()
will reject anything larger, but for the sake of robustness we can keep
it in.

SMACK and SELinux look safe to me, but somebody else should probably
have a look just in case.

Based on original patch from Vegard Nossum <vegard.nossum@oracle.com>
modified for the case that apparmor provides null termination.

Fixes: bb646cdb12
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-08 10:26:25 +10:00
James Morris
d011a4d861 Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/selinux into next 2016-07-07 10:15:34 +10:00
Seth Forshee
0b3c9761d1 evm: Translate user/group ids relative to s_user_ns when computing HMAC
The EVM HMAC should be calculated using the on disk user and
group ids, so the k[ug]ids in the inode must be translated
relative to the s_user_ns of the inode's super block.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-07-05 15:13:10 -05:00
Al Viro
b223f4e215 Merge branch 'd_real' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into work.misc 2016-06-30 23:34:49 -04:00
Eric Richter
544e1cea03 ima: extend the measurement entry specific pcr
Extend the PCR supplied as a parameter, instead of assuming that the
measurement entry uses the default configured PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:22 -04:00
Eric Richter
a422638d49 ima: change integrity cache to store measured pcr
IMA avoids re-measuring files by storing the current state as a flag in
the integrity cache. It will then skip adding a new measurement log entry
if the cache reports the file as already measured.

If a policy measures an already measured file to a new PCR, the measurement
will not be added to the list. This patch implements a new bitfield for
specifying which PCR the file was measured into, rather than if it was
measured.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:22 -04:00
Eric Richter
67696f6d79 ima: redefine duplicate template entries
Template entry duplicates are prevented from being added to the
measurement list by checking a hash table that contains the template
entry digests. However, the PCR value is not included in this comparison,
so duplicate template entry digests with differing PCRs may be dropped.

This patch redefines duplicate template entries as template entries with
the same digest and same PCR values.

Reported-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
5f6f027b50 ima: change ima_measurements_show() to display the entry specific pcr
IMA assumes that the same default Kconfig PCR is extended for each
entry. This patch replaces the default configured PCR with the policy
defined PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
14b1da85bb ima: include pcr for each measurement log entry
The IMA measurement list entries include the Kconfig defined PCR value.
This patch defines a new ima_template_entry field for including the PCR
as specified in the policy rule.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
725de7fabb ima: extend ima_get_action() to return the policy pcr
Different policy rules may extend different PCRs. This patch retrieves
the specific PCR for the matched rule.  Subsequent patches will include
the rule specific PCR in the measurement list and extend the appropriate
PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:20 -04:00
Eric Richter
0260643ce8 ima: add policy support for extending different pcrs
This patch defines a new IMA measurement policy rule option "pcr=",
which allows extending different PCRs on a per rule basis. For example,
the system independent files could extend the default IMA Kconfig
specified PCR, while the system dependent files could extend a different
PCR.

The following is an example of this usage with an SELinux policy; the
rule would extend PCR 11 with system configuration files:

  measure func=FILE_CHECK mask=MAY_READ obj_type=system_conf_t pcr=11

Changelog v3:
- FIELD_SIZEOF returns bytes, not bits. Fixed INVALID_PCR

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:20 -04:00
Eric Richter
96d450bbec integrity: add measured_pcrs field to integrity cache
To keep track of which measurements have been extended to which PCRs, this
patch defines a new integrity_iint_cache field named measured_pcrs. This
field is a bitmask of the PCRs measured. Each bit corresponds to a PCR
index. For example, bit 10 corresponds to PCR 10.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:19 -04:00
Huw Davies
4fee5242bf calipso: Add a label cache.
This works in exactly the same way as the CIPSO label cache.
The idea is to allow the lsm to cache the result of a secattr
lookup so that it doesn't need to perform the lookup for
every skbuff.

It introduces two sysctl controls:
 calipso_cache_enable - enables/disables the cache.
 calipso_cache_bucket_size - sets the size of a cache bucket.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:17 -04:00
Huw Davies
a04e71f631 netlabel: Pass a family parameter to netlbl_skbuff_err().
This makes it possible to route the error to the appropriate
labelling engine.  CALIPSO is far less verbose than CIPSO
when encountering a bogus packet, so there is no need for a
CALIPSO error handler.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:16 -04:00
Huw Davies
2917f57b6b calipso: Allow the lsm to label the skbuff directly.
In some cases, the lsm needs to add the label to the skbuff directly.
A NF_INET_LOCAL_OUT IPv6 hook is added to selinux to match the IPv4
behaviour.  This allows selinux to label the skbuffs that it requires.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:15 -04:00
Huw Davies
e1adea9270 calipso: Allow request sockets to be relabelled by the lsm.
Request sockets need to have a label that takes into account the
incoming connection as well as their parent's label.  This is used
for the outgoing SYN-ACK and for their child full-socket.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:05:29 -04:00
Huw Davies
1f440c99d3 netlabel: Prevent setsockopt() from changing the hop-by-hop option.
If a socket has a netlabel in place then don't let setsockopt() alter
the socket's IPv6 hop-by-hop option.  This is in the same spirit as
the existing check for IPv4.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:05:27 -04:00
Huw Davies
ceba1832b1 calipso: Set the calipso socket label to match the secattr.
CALIPSO is a hop-by-hop IPv6 option.  A lot of this patch is based on
the equivalent CISPO code.  The main difference is due to manipulating
the options in the hop-by-hop header.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:02:51 -04:00
Seth Forshee
aad82892af selinux: Add support for unprivileged mounts from user namespaces
Security labels from unprivileged mounts in user namespaces must
be ignored. Force superblocks from user namespaces whose labeling
behavior is to use xattrs to use mountpoint labeling instead.
For the mountpoint label, default to converting the current task
context into a form suitable for file objects, but also allow the
policy writer to specify a different label through policy
transition rules.

Pieced together from code snippets provided by Stephen Smalley.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 11:02:54 -05:00
Seth Forshee
809c02e091 Smack: Handle labels consistently in untrusted mounts
The SMACK64, SMACK64EXEC, and SMACK64MMAP labels are all handled
differently in untrusted mounts. This is confusing and
potentically problematic. Change this to handle them all the same
way that SMACK64 is currently handled; that is, read the label
from disk and check it at use time. For SMACK64 and SMACK64MMAP
access is denied if the label does not match smk_root. To be
consistent with suid, a SMACK64EXEC label which does not match
smk_root will still allow execution of the file but will not run
with the label supplied in the xattr.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 11:02:22 -05:00
Seth Forshee
9f50eda2a9 Smack: Add support for unprivileged mounts from user namespaces
Security labels from unprivileged mounts cannot be trusted.
Ideally for these mounts we would assign the objects in the
filesystem the same label as the inode for the backing device
passed to mount. Unfortunately it's currently impossible to
determine which inode this is from the LSM mount hooks, so we
settle for the label of the process doing the mount.

This label is assigned to s_root, and also to smk_default to
ensure that new inodes receive this label. The transmute property
is also set on s_root to make this behavior more explicit, even
though it is technically not necessary.

If a filesystem has existing security labels, access to inodes is
permitted if the label is the same as smk_root, otherwise access
is denied. The SMACK64EXEC xattr is completely ignored.

Explicit setting of security labels continues to require
CAP_MAC_ADMIN in init_user_ns.

Altogether, this ensures that filesystem objects are not
accessible to subjects which cannot already access the backing
store, that MAC is not violated for any objects in the fileystem
which are already labeled, and that a user cannot use an
unprivileged mount to gain elevated MAC privileges.

sysfs, tmpfs, and ramfs are already mountable from user
namespaces and support security labels. We can't rule out the
possibility that these filesystems may already be used in mounts
from user namespaces with security lables set from the init
namespace, so failing to trust lables in these filesystems may
introduce regressions. It is safe to trust labels from these
filesystems, since the unprivileged user does not control the
backing store and thus cannot supply security labels, so an
explicit exception is made to trust labels from these
filesystems.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:42 -05:00
Andy Lutomirski
380cf5ba6b fs: Treat foreign mounts as nosuid
If a process gets access to a mount from a different user
namespace, that process should not be able to take advantage of
setuid files or selinux entrypoints from that filesystem.  Prevent
this by treating mounts from other mount namespaces and those not
owned by current_user_ns() or an ancestor as nosuid.

This will make it safer to allow more complex filesystems to be
mounted in non-root user namespaces.

This does not remove the need for MNT_LOCK_NOSUID.  The setuid,
setgid, and file capability bits can no longer be abused if code in
a user namespace were to clear nosuid on an untrusted filesystem,
but this patch, by itself, is insufficient to protect the system
from abuse of files that, when execed, would increase MAC privilege.

As a more concrete explanation, any task that can manipulate a
vfsmount associated with a given user namespace already has
capabilities in that namespace and all of its descendents.  If they
can cause a malicious setuid, setgid, or file-caps executable to
appear in that mount, then that executable will only allow them to
elevate privileges in exactly the set of namespaces in which they
are already privileges.

On the other hand, if they can cause a malicious executable to
appear with a dangerous MAC label, running it could change the
caller's security context in a way that should not have been
possible, even inside the namespace in which the task is confined.

As a hardening measure, this would have made CVE-2014-5207 much
more difficult to exploit.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:41 -05:00
Seth Forshee
d07b846f62 fs: Limit file caps to the user namespace of the super block
Capability sets attached to files must be ignored except in the
user namespaces where the mounter is privileged, i.e. s_user_ns
and its descendants. Otherwise a vector exists for gaining
privileges in namespaces where a user is not already privileged.

Add a new helper function, current_in_user_ns(), to test whether a user
namespace is the same as or a descendant of another namespace.
Use this helper to determine whether a file's capability set
should be applied to the caps constructed during exec.

--EWB Replaced in_userns with the simpler current_in_userns.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:31 -05:00
Herbert Xu
d56d72c6a0 KEYS: Use skcipher for big keys
This patch replaces use of the obsolete blkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David Howells <dhowells@redhat.com>
2016-06-24 21:24:58 +08:00
Dan Carpenter
38327424b4 KEYS: potential uninitialized variable
If __key_link_begin() failed then "edit" would be uninitialized.  I've
added a check to fix that.

This allows a random user to crash the kernel, though it's quite
difficult to achieve.  There are three ways it can be done as the user
would have to cause an error to occur in __key_link():

 (1) Cause the kernel to run out of memory.  In practice, this is difficult
     to achieve without ENOMEM cropping up elsewhere and aborting the
     attempt.

 (2) Revoke the destination keyring between the keyring ID being looked up
     and it being tested for revocation.  In practice, this is difficult to
     time correctly because the KEYCTL_REJECT function can only be used
     from the request-key upcall process.  Further, users can only make use
     of what's in /sbin/request-key.conf, though this does including a
     rejection debugging test - which means that the destination keyring
     has to be the caller's session keyring in practice.

 (3) Have just enough key quota available to create a key, a new session
     keyring for the upcall and a link in the session keyring, but not then
     sufficient quota to create a link in the nominated destination keyring
     so that it fails with EDQUOT.

The bug can be triggered using option (3) above using something like the
following:

	echo 80 >/proc/sys/kernel/keys/root_maxbytes
	keyctl request2 user debug:fred negate @t

The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system.  Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.

Assuming the failure occurs, something like the following will be seen:

	kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
	------------[ cut here ]------------
	kernel BUG at ../mm/slab.c:2821!
	...
	RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
	RSP: 0018:ffff8804014a7de8  EFLAGS: 00010092
	RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
	RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
	RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
	R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
	R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
	...
	Call Trace:
	  kfree+0xde/0x1bc
	  assoc_array_cancel_edit+0x1f/0x36
	  __key_link_end+0x55/0x63
	  key_reject_and_link+0x124/0x155
	  keyctl_reject_key+0xb6/0xe0
	  keyctl_negate_key+0x10/0x12
	  SyS_keyctl+0x9f/0xe7
	  do_syscall_64+0x63/0x13a
	  entry_SYSCALL64_slow_path+0x25/0x25

Fixes: f70e2e0619 ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-16 17:15:04 -10:00
Heinrich Schuchardt
309c5fad5d selinux: fix type mismatch
avc_cache_threshold is of type unsigned int.  Do not use a signed
new_value in sscanf(page, "%u", &new_value).

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
[PM: subject prefix fix, description cleanup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-15 16:20:28 -04:00
David Howells
965475acca KEYS: Strip trailing spaces
Strip some trailing spaces.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-14 10:29:44 +01:00
Linus Torvalds
8387ff2577 vfs: make the string hashes salt the hash
We always mixed in the parent pointer into the dentry name hash, but we
did it late at lookup time.  It turns out that we can simplify that
lookup-time action by salting the hash with the parent pointer early
instead of late.

A few other users of our string hashes also wanted to mix in their own
pointers into the hash, and those are updated to use the same mechanism.

Hash users that don't have any particular initial salt can just use the
NULL pointer as a no-salt.

Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-10 20:21:46 -07:00
Paul Moore
8bebe88c09 selinux: import NetLabel category bitmaps correctly
The existing ebitmap_netlbl_import() code didn't correctly handle the
case where the ebitmap_node was not aligned/sized to a power of two,
this patch fixes this (on x86_64 ebitmap_node contains six bitmaps
making a range of 0..383).

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-09 10:40:37 -04:00
Rafal Krypa
18d872f77c Smack: ignore null signal in smack_task_kill
Kill with signal number 0 is commonly used for checking PID existence.
Smack treated such cases like any other kills, although no signal is
actually delivered when sig == 0.

Checking permissions when sig == 0 didn't prevent an unprivileged caller
from learning whether PID exists or not. When it existed, kernel returned
EPERM, when it didn't - ESRCH. The only effect of policy check in such
case is noise in audit logs.

This change lets Smack silently ignore kill() invocations with sig == 0.

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-06-08 13:52:31 -07:00
Mike Danese
40d273782f security: tomoyo: simplify the gc kthread creation
The code is doing the equivalent of the kthread_run macro.

Signed-off-by: Mike Danese <mikedanese@google.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-06-06 20:23:55 +10:00
Casey Schaufler
2885c1e3e0 LSM: Fix for security_inode_getsecurity and -EOPNOTSUPP
Serge Hallyn pointed out that the current implementation of
security_inode_getsecurity() works if there is only one hook
provided for it, but will fail if there is more than one and
the attribute requested isn't supplied by the first module.
This isn't a problem today, since only SELinux and Smack
provide this hook and there is (currently) no way to enable
both of those modules at the same time. Serge, however, wants
to introduce a capability attribute and an inode_getsecurity
hook in the capability security module to handle it. This
addresses that upcoming problem, will be required for "extreme
stacking" and is just a better implementation.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-06-06 19:59:18 +10:00
Stephan Mueller
4693fc734d KEYS: Add placeholder for KDF usage with DH
The values computed during Diffie-Hellman key exchange are often used
in combination with key derivation functions to create cryptographic
keys.  Add a placeholder for a later implementation to configure a
key derivation function that will transform the Diffie-Hellman
result returned by the KEYCTL_DH_COMPUTE command.

[This patch was stripped down from a patch produced by Mat Martineau that
 had a bug in the compat code - so for the moment Stephan's patch simply
 requires that the placeholder argument must be NULL]

Original-signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-06-03 16:14:34 +10:00
Stephen Smalley
7ea59202db selinux: Only apply bounds checking to source types
The current bounds checking of both source and target types
requires allowing any domain that has access to the child
domain to also have the same permissions to the parent, which
is undesirable.  Drop the target bounds checking.

KaiGai Kohei originally removed all use of target bounds in
commit 7d52a155e3 ("selinux: remove dead code in
type_attribute_bounds_av()") but this was reverted in
commit 2ae3ba3938 ("selinux: libsepol: remove dead code in
check_avtab_hierarchy_callback()") because it would have
required explicitly allowing the parent any permissions
to the child that the child is allowed to itself.

This change in contrast retains the logic for the case where both
source and target types are bounded, thereby allowing access
if the parent of the source is allowed the corresponding
permissions to the parent of the target.  Further, this change
reworks the logic such that we only perform a single computation
for each case and there is no ambiguity as to how to resolve
a bounds violation.

Under the new logic, if the source type and target types are both
bounded, then the parent of the source type must be allowed the same
permissions to the parent of the target type.  If only the source
type is bounded, then the parent of the source type must be allowed
the same permissions to the target type.

Examples of the new logic and comparisons with the old logic:
1. If we have:
	typebounds A B;
then:
	allow B self:process <permissions>;
will satisfy the bounds constraint iff:
	allow A self:process <permissions>;
is also allowed in policy.

Under the old logic, the allow rule on B satisfies the
bounds constraint if any of the following three are allowed:
	allow A B:process <permissions>; or
	allow B A:process <permissions>; or
	allow A self:process <permissions>;
However, either of the first two ultimately require the third to
satisfy the bounds constraint under the old logic, and therefore
this degenerates to the same result (but is more efficient - we only
need to perform one compute_av call).

2. If we have:
	typebounds A B;
	typebounds A_exec B_exec;
then:
	allow B B_exec:file <permissions>;
will satisfy the bounds constraint iff:
	allow A A_exec:file <permissions>;
is also allowed in policy.

This is essentially the same as #1; it is merely included as
an example of dealing with object types related to a bounded domain
in a manner that satisfies the bounds relationship.  Note that
this approach is preferable to leaving B_exec unbounded and having:
	allow A B_exec:file <permissions>;
in policy because that would allow B's entrypoints to be used to
enter A.  Similarly for _tmp or other related types.

3. If we have:
	typebounds A B;
and an unbounded type T, then:
	allow B T:file <permissions>;
will satisfy the bounds constraint iff:
	allow A T:file <permissions>;
is allowed in policy.

The old logic would have been identical for this example.

4. If we have:
	typebounds A B;
and an unbounded domain D, then:
	allow D B:unix_stream_socket <permissions>;
is not subject to any bounds constraints under the new logic
because D is not bounded.  This is desirable so that we can
allow a domain to e.g. connectto a child domain without having
to allow it to do the same to its parent.

The old logic would have required:
	allow D A:unix_stream_socket <permissions>;
to also be allowed in policy.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: re-wrapped description to appease checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-05-31 12:01:59 -04:00
Al Viro
4093d306a9 securityfs: ->d_parent is never NULL or negative
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-29 16:22:06 -04:00
Al Viro
07a8e62fde drbd: ->d_parent is never NULL or negative
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-29 16:21:55 -04:00
Linus Torvalds
d102a56edb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Followups to the parallel lookup work:

   - update docs

   - restore killability of the places that used to take ->i_mutex
     killably now that we have down_write_killable() merged

   - Additionally, it turns out that I missed a prerequisite for
     security_d_instantiate() stuff - ->getxattr() wasn't the only thing
     that could be called before dentry is attached to inode; with smack
     we needed the same treatment applied to ->setxattr() as well"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  switch ->setxattr() to passing dentry and inode separately
  switch xattr_handler->set() to passing dentry and inode separately
  restore killability of old mutex_lock_killable(&inode->i_mutex) users
  add down_write_killable_nested()
  update D/f/directory-locking
2016-05-27 17:14:05 -07:00
Al Viro
3767e255b3 switch ->setxattr() to passing dentry and inode separately
smack ->d_instantiate() uses ->setxattr(), so to be able to call it before
we'd hashed the new dentry and attached it to inode, we need ->setxattr()
instances getting the inode as an explicit argument rather than obtaining
it from dentry.

Similar change for ->getxattr() had been done in commit ce23e64.  Unlike
->getxattr() (which is used by both selinux and smack instances of
->d_instantiate()) ->setxattr() is used only by smack one and unfortunately
it got missed back then.

Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-27 20:09:16 -04:00
Jann Horn
dca6b41491 Yama: fix double-spinlock and user access in atomic context
Commit 8a56038c2a ("Yama: consolidate error reporting") causes lockups
when someone hits a Yama denial. Call chain:

process_vm_readv -> process_vm_rw -> process_vm_rw_core -> mm_access
-> ptrace_may_access
task_lock(...) is taken
__ptrace_may_access -> security_ptrace_access_check
-> yama_ptrace_access_check -> report_access -> kstrdup_quotable_cmdline
-> get_cmdline -> access_process_vm -> get_task_mm
task_lock(...) is taken again

task_lock(p) just calls spin_lock(&p->alloc_lock), so at this point,
spin_lock() is called on a lock that is already held by the current
process.

Also: Since the alloc_lock is a spinlock, sleeping inside
security_ptrace_access_check hooks is probably not allowed at all? So it's
not even possible to print the cmdline from in there because that might
involve paging in userspace memory.

It would be tempting to rewrite ptrace_may_access() to drop the alloc_lock
before calling the LSM, but even then, ptrace_may_access() itself might be
called from various contexts in which you're not allowed to sleep; for
example, as far as I understand, to be able to hold a reference to another
task, usually an RCU read lock will be taken (see e.g. kcmp() and
get_robust_list()), so that also prohibits sleeping. (And using e.g. FUSE,
a user can cause pagefault handling to take arbitrary amounts of time -
see https://bugs.chromium.org/p/project-zero/issues/detail?id=808.)

Therefore, AFAIK, in order to print the name of a process below
security_ptrace_access_check(), you'd have to either grab a reference to
the mm_struct and defer the access violation reporting or just use the
"comm" value that's stored in kernelspace and accessible without big
complications. (Or you could try to use some kind of atomic remote VM
access that fails if the memory isn't paged in, similar to
copy_from_user_inatomic(), and if necessary fall back to comm, but
that'd be kind of ugly because the comm/cmdline choice would look
pretty random to the user.)

Fix it by deferring reporting of the access violation until current
exits kernelspace the next time.

v2: Don't oops on PTRACE_TRACEME, call report_access under
task_lock(current). Also fix nonsensical comment. And don't use
GPF_ATOMIC for memory allocation with no locks held.
This patch is tested both for ptrace attach and ptrace traceme.

Fixes: 8a56038c2a ("Yama: consolidate error reporting")
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-05-26 09:56:18 +10:00
Andy Shevchenko
b8b572789c security/integrity/ima/ima_policy.c: use %pU to output UUID in printable format
Instead of open coded variant re-use extension that vsprintf.c provides
us for ages.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Linus Torvalds
f4f27d0028 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - A new LSM, "LoadPin", from Kees Cook is added, which allows forcing
     of modules and firmware to be loaded from a specific device (this
     is from ChromeOS, where the device as a whole is verified
     cryptographically via dm-verity).

     This is disabled by default but can be configured to be enabled by
     default (don't do this if you don't know what you're doing).

   - Keys: allow authentication data to be stored in an asymmetric key.
     Lots of general fixes and updates.

   - SELinux: add restrictions for loading of kernel modules via
     finit_module().  Distinguish non-init user namespace capability
     checks.  Apply execstack check on thread stacks"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits)
  LSM: LoadPin: provide enablement CONFIG
  Yama: use atomic allocations when reporting
  seccomp: Fix comment typo
  ima: add support for creating files using the mknodat syscall
  ima: fix ima_inode_post_setattr
  vfs: forbid write access when reading a file into memory
  fs: fix over-zealous use of "const"
  selinux: apply execstack check on thread stacks
  selinux: distinguish non-init user namespace capability checks
  LSM: LoadPin for kernel file loading restrictions
  fs: define a string representation of the kernel_read_file_id enumeration
  Yama: consolidate error reporting
  string_helpers: add kstrdup_quotable_file
  string_helpers: add kstrdup_quotable_cmdline
  string_helpers: add kstrdup_quotable
  selinux: check ss_initialized before revalidating an inode label
  selinux: delay inode label lookup as long as possible
  selinux: don't revalidate an inode's label when explicitly setting it
  selinux: Change bool variable name to index.
  KEYS: Add KEYCTL_DH_COMPUTE command
  ...
2016-05-19 09:21:36 -07:00
Linus Torvalds
a7fd20d1c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support SPI based w5100 devices, from Akinobu Mita.

   2) Partial Segmentation Offload, from Alexander Duyck.

   3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

   4) Allow cls_flower stats offload, from Amir Vadai.

   5) Implement bpf blinding, from Daniel Borkmann.

   6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
      actually using FASYNC these atomics are superfluous.  From Eric
      Dumazet.

   7) Run TCP more preemptibly, also from Eric Dumazet.

   8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
      driver, from Gal Pressman.

   9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

  10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

  11) Support tunneling offloads in qed, from Manish Chopra.

  12) aRFS offloading in mlx5e, from Maor Gottlieb.

  13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
      Leitner.

  14) Add MSG_EOR support to TCP, this allows controlling packet
      coalescing on application record boundaries for more accurate
      socket timestamp sampling.  From Martin KaFai Lau.

  15) Fix alignment of 64-bit netlink attributes across the board, from
      Nicolas Dichtel.

  16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

  17) Several conversions of drivers to ethtool ksettings, from Philippe
      Reynes.

  18) Checksum neutral ILA in ipv6, from Tom Herbert.

  19) Factorize all of the various marvell dsa drivers into one, from
      Vivien Didelot

  20) Add VF support to qed driver, from Yuval Mintz"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
  Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
  Revert "phy dp83867: Make rgmii parameters optional"
  r8169: default to 64-bit DMA on recent PCIe chips
  phy dp83867: Make rgmii parameters optional
  phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
  bpf: arm64: remove callee-save registers use for tmp registers
  asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
  switchdev: pass pointer to fib_info instead of copy
  net_sched: close another race condition in tcf_mirred_release()
  tipc: fix nametable publication field in nl compat
  drivers: net: Don't print unpopulated net_device name
  qed: add support for dcbx.
  ravb: Add missing free_irq() calls to ravb_close()
  qed: Remove a stray tab
  net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fec-mpc52xx: use phydev from struct net_device
  bpf, doc: fix typo on bpf_asm descriptions
  stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
  net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fs-enet: use phydev from struct net_device
  ...
2016-05-17 16:26:30 -07:00
Linus Torvalds
c52b76185b Merge branch 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull 'struct path' constification update from Al Viro:
 "'struct path' is passed by reference to a bunch of Linux security
  methods; in theory, there's nothing to stop them from modifying the
  damn thing and LSM community being what it is, sooner or later some
  enterprising soul is going to decide that it's a good idea.

  Let's remove the temptation and constify all of those..."

* 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify ima_d_path()
  constify security_sb_pivotroot()
  constify security_path_chroot()
  constify security_path_{link,rename}
  apparmor: remove useless checks for NULL ->mnt
  constify security_path_{mkdir,mknod,symlink}
  constify security_path_{unlink,rmdir}
  apparmor: constify common_perm_...()
  apparmor: constify aa_path_link()
  apparmor: new helper - common_path_perm()
  constify chmod_common/security_path_chmod
  constify security_sb_mount()
  constify chown_common/security_path_chown
  tomoyo: constify assorted struct path *
  apparmor_path_truncate(): path->mnt is never NULL
  constify vfs_truncate()
  constify security_path_truncate()
  [apparmor] constify struct path * in a bunch of helpers
2016-05-17 14:41:03 -07:00
Linus Torvalds
7f427d3a60 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull parallel filesystem directory handling update from Al Viro.

This is the main parallel directory work by Al that makes the vfs layer
able to do lookup and readdir in parallel within a single directory.
That's a big change, since this used to be all protected by the
directory inode mutex.

The inode mutex is replaced by an rwsem, and serialization of lookups of
a single name is done by a "in-progress" dentry marker.

The series begins with xattr cleanups, and then ends with switching
filesystems over to actually doing the readdir in parallel (switching to
the "iterate_shared()" that only takes the read lock).

A more detailed explanation of the process from Al Viro:
 "The xattr work starts with some acl fixes, then switches ->getxattr to
  passing inode and dentry separately.  This is the point where the
  things start to get tricky - that got merged into the very beginning
  of the -rc3-based #work.lookups, to allow untangling the
  security_d_instantiate() mess.  The xattr work itself proceeds to
  switch a lot of filesystems to generic_...xattr(); no complications
  there.

  After that initial xattr work, the series then does the following:

   - untangle security_d_instantiate()

   - convert a bunch of open-coded lookup_one_len_unlocked() to calls of
     that thing; one such place (in overlayfs) actually yields a trivial
     conflict with overlayfs fixes later in the cycle - overlayfs ended
     up switching to a variant of lookup_one_len_unlocked() sans the
     permission checks.  I would've dropped that commit (it gets
     overridden on merge from #ovl-fixes in #for-next; proper resolution
     is to use the variant in mainline fs/overlayfs/super.c), but I
     didn't want to rebase the damn thing - it was fairly late in the
     cycle...

   - some filesystems had managed to depend on lookup/lookup exclusion
     for *fs-internal* data structures in a way that would break if we
     relaxed the VFS exclusion.  Fixing hadn't been hard, fortunately.

   - core of that series - parallel lookup machinery, replacing
     ->i_mutex with rwsem, making lookup_slow() take it only shared.  At
     that point lookups happen in parallel; lookups on the same name
     wait for the in-progress one to be done with that dentry.

     Surprisingly little code, at that - almost all of it is in
     fs/dcache.c, with fs/namei.c changes limited to lookup_slow() -
     making it use the new primitive and actually switching to locking
     shared.

   - parallel readdir stuff - first of all, we provide the exclusion on
     per-struct file basis, same as we do for read() vs lseek() for
     regular files.  That takes care of most of the needed exclusion in
     readdir/readdir; however, these guys are trickier than lookups, so
     I went for switching them one-by-one.  To do that, a new method
     '->iterate_shared()' is added and filesystems are switched to it
     as they are either confirmed to be OK with shared lock on directory
     or fixed to be OK with that.  I hope to kill the original method
     come next cycle (almost all in-tree filesystems are switched
     already), but it's still not quite finished.

   - several filesystems get switched to parallel readdir.  The
     interesting part here is dealing with dcache preseeding by readdir;
     that needs minor adjustment to be safe with directory locked only
     shared.

     Most of the filesystems doing that got switched to in those
     commits.  Important exception: NFS.  Turns out that NFS folks, with
     their, er, insistence on VFS getting the fuck out of the way of the
     Smart Filesystem Code That Knows How And What To Lock(tm) have
     grown the locking of their own.  They had their own homegrown
     rwsem, with lookup/readdir/atomic_open being *writers* (sillyunlink
     is the reader there).  Of course, with VFS getting the fuck out of
     the way, as requested, the actual smarts of the smart filesystem
     code etc. had become exposed...

   - do_last/lookup_open/atomic_open cleanups.  As the result, open()
     without O_CREAT locks the directory only shared.  Including the
     ->atomic_open() case.  Backmerge from #for-linus in the middle of
     that - atomic_open() fix got brought in.

   - then comes NFS switch to saner (VFS-based ;-) locking, killing the
     homegrown "lookup and readdir are writers" kinda-sorta rwsem.  All
     exclusion for sillyunlink/lookup is done by the parallel lookups
     mechanism.  Exclusion between sillyunlink and rmdir is a real rwsem
     now - rmdir being the writer.

     Result: NFS lookups/readdirs/O_CREAT-less opens happen in parallel
     now.

   - the rest of the series consists of switching a lot of filesystems
     to parallel readdir; in a lot of cases ->llseek() gets simplified
     as well.  One backmerge in there (again, #for-linus - rockridge
     fix)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (74 commits)
  ext4: switch to ->iterate_shared()
  hfs: switch to ->iterate_shared()
  hfsplus: switch to ->iterate_shared()
  hostfs: switch to ->iterate_shared()
  hpfs: switch to ->iterate_shared()
  hpfs: handle allocation failures in hpfs_add_pos()
  gfs2: switch to ->iterate_shared()
  f2fs: switch to ->iterate_shared()
  afs: switch to ->iterate_shared()
  befs: switch to ->iterate_shared()
  befs: constify stuff a bit
  isofs: switch to ->iterate_shared()
  get_acorn_filename(): deobfuscate a bit
  btrfs: switch to ->iterate_shared()
  logfs: no need to lock directory in lseek
  switch ecryptfs to ->iterate_shared
  9p: switch to ->iterate_shared()
  fat: switch to ->iterate_shared()
  romfs, squashfs: switch to ->iterate_shared()
  more trivial ->iterate_shared conversions
  ...
2016-05-17 11:01:31 -07:00
Linus Torvalds
91e8d0cbc9 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "A rather small set of patches from the timer departement:

   - Some more y2038 work
   - Yet another new clocksource driver
   - The usual set of small fixes, cleanups and enhancements"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/tegra: Remove unused suspend/resume code
  clockevents/driversi/mps2: add MPS2 Timer driver
  dt-bindings: document the MPS2 timer bindings
  clocksource/drivers/mtk_timer: Add __init attribute
  clockevents/drivers/dw_apb_timer: Implement ->set_state_oneshot_stopped()
  time: Introduce do_sys_settimeofday64()
  security: Introduce security_settime64()
  clocksource: Add missing include of of.h.
2016-05-17 09:49:28 -07:00
Kees Cook
b937190c40 LSM: LoadPin: provide enablement CONFIG
Instead of being enabled by default when SECURITY_LOADPIN is selected,
provide an additional (default off) config to determine the boot time
behavior. As before, the "loadpin.enabled=0/1" kernel parameter remains
available.

Suggested-by: James Morris <jmorris@namei.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-05-17 20:10:30 +10:00
Al Viro
0e0162bb8c Merge branch 'ovl-fixes' into for-linus
Backmerge to resolve a conflict in ovl_lookup_real();
"ovl_lookup_real(): use lookup_one_len_unlocked()" instead,
but it was too late in the cycle to rebase.
2016-05-17 02:17:59 -04:00
David S. Miller
e800072c18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
In netdevice.h we removed the structure in net-next that is being
changes in 'net'.  In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.

The mlx5 conflicts have to do with vxlan support dependencies.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 15:59:24 -04:00
James Morris
a6926cc989 Merge branch 'stable-4.7' of git://git.infradead.org/users/pcmoore/selinux into next 2016-05-06 09:31:34 +10:00
James Morris
0250abcd72 Merge tag 'keys-next-20160505' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2016-05-06 09:29:00 +10:00
Sasha Levin
74f430cd0f Yama: use atomic allocations when reporting
Access reporting often happens from atomic contexes. Avoid
lockups when allocating memory for command lines.

Fixes: 8a56038c2a ("Yama: consolidate error reporting")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-05-04 10:54:05 -07:00
David Howells
d55201ce08 Merge branch 'keys-trust' into keys-next
Here's a set of patches that changes how certificates/keys are determined
to be trusted.  That's currently a two-step process:

 (1) Up until recently, when an X.509 certificate was parsed - no matter
     the source - it was judged against the keys in .system_keyring,
     assuming those keys to be trusted if they have KEY_FLAG_TRUSTED set
     upon them.

     This has just been changed such that any key in the .ima_mok keyring,
     if configured, may also be used to judge the trustworthiness of a new
     certificate, whether or not the .ima_mok keyring is meant to be
     consulted for whatever process is being undertaken.

     If a certificate is determined to be trustworthy, KEY_FLAG_TRUSTED
     will be set upon a key it is loaded into (if it is loaded into one),
     no matter what the key is going to be loaded for.

 (2) If an X.509 certificate is loaded into a key, then that key - if
     KEY_FLAG_TRUSTED gets set upon it - can be linked into any keyring
     with KEY_FLAG_TRUSTED_ONLY set upon it.  This was meant to be the
     system keyring only, but has been extended to various IMA keyrings.
     A user can at will link any key marked KEY_FLAG_TRUSTED into any
     keyring marked KEY_FLAG_TRUSTED_ONLY if the relevant permissions masks
     permit it.

These patches change that:

 (1) Trust becomes a matter of consulting the ring of trusted keys supplied
     when the trust is evaluated only.

 (2) Every keyring can be supplied with its own manager function to
     restrict what may be added to that keyring.  This is called whenever a
     key is to be linked into the keyring to guard against a key being
     created in one keyring and then linked across.

     This function is supplied with the keyring and the key type and
     payload[*] of the key being linked in for use in its evaluation.  It
     is permitted to use other data also, such as the contents of other
     keyrings such as the system keyrings.

     [*] The type and payload are supplied instead of a key because as an
         optimisation this function may be called whilst creating a key and
         so may reject the proposed key between preparse and allocation.

 (3) A default manager function is provided that permits keys to be
     restricted to only asymmetric keys that are vouched for by the
     contents of the system keyring.

     A second manager function is provided that just rejects with EPERM.

 (4) A key allocation flag, KEY_ALLOC_BYPASS_RESTRICTION, is made available
     so that the kernel can initialise keyrings with keys that form the
     root of the trust relationship.

 (5) KEY_FLAG_TRUSTED and KEY_FLAG_TRUSTED_ONLY are removed, along with
     key_preparsed_payload::trusted.

This change also makes it possible in future for userspace to create a private
set of trusted keys and then to have it sealed by setting a manager function
where the private set is wholly independent of the kernel's trust
relationships.

Further changes in the set involve extracting certain IMA special keyrings
and making them generally global:

 (*) .system_keyring is renamed to .builtin_trusted_keys and remains read
     only.  It carries only keys built in to the kernel.  It may be where
     UEFI keys should be loaded - though that could better be the new
     secondary keyring (see below) or a separate UEFI keyring.

 (*) An optional secondary system keyring (called .secondary_trusted_keys)
     is added to replace the IMA MOK keyring.

     (*) Keys can be added to the secondary keyring by root if the keys can
         be vouched for by either ring of system keys.

 (*) Module signing and kexec only use .builtin_trusted_keys and do not use
     the new secondary keyring.

 (*) Config option SYSTEM_TRUSTED_KEYS now depends on ASYMMETRIC_KEY_TYPE as
     that's the only type currently permitted on the system keyrings.

 (*) A new config option, IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY,
     is provided to allow keys to be added to IMA keyrings, subject to the
     restriction that such keys are validly signed by a key already in the
     system keyrings.

     If this option is enabled, but secondary keyrings aren't, additions to
     the IMA keyrings will be restricted to signatures verifiable by keys in
     the builtin system keyring only.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-05-04 17:20:20 +01:00
Mimi Zohar
cf90ea9340 ima: fix the string representation of the LSM/IMA hook enumeration ordering
This patch fixes the string representation of the LSM/IMA hook enumeration
ordering used for displaying the IMA policy.

Fixes: d9ddf077bb ("ima: support for kexec image and initramfs")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Tested-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-05-04 18:46:00 +10:00
Mimi Zohar
05d1a717ec ima: add support for creating files using the mknodat syscall
Commit 3034a14 "ima: pass 'opened' flag to identify newly created files"
stopped identifying empty files as new files.  However new empty files
can be created using the mknodat syscall.  On systems with IMA-appraisal
enabled, these empty files are not labeled with security.ima extended
attributes properly, preventing them from subsequently being opened in
order to write the file data contents.  This patch defines a new hook
named ima_post_path_mknod() to mark these empty files, created using
mknodat, as new in order to allow the file data contents to be written.

In addition, files with security.ima xattrs containing a file signature
are considered "immutable" and can not be modified.  The file contents
need to be written, before signing the file.  This patch relaxes this
requirement for new files, allowing the file signature to be written
before the file contents.

Changelog:
- defer identifying files with signatures stored as security.ima
  (based on Dmitry Rozhkov's comments)
- removing tests (eg. dentry, dentry->d_inode, inode->i_size == 0)
  (based on Al's review)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Al Viro <<viro@zeniv.linux.org.uk>
Tested-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
2016-05-01 09:23:52 -04:00
Mimi Zohar
42a4c60319 ima: fix ima_inode_post_setattr
Changing file metadata (eg. uid, guid) could result in having to
re-appraise a file's integrity, but does not change the "new file"
status nor the security.ima xattr.  The IMA_PERMIT_DIRECTIO and
IMA_DIGSIG_REQUIRED flags are policy rule specific.  This patch
only resets these flags, not the IMA_NEW_FILE or IMA_DIGSIG flags.

With this patch, changing the file timestamp will not remove the
file signature on new files.

Reported-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Tested-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
2016-05-01 09:23:52 -04:00
Stephen Smalley
c2316dbf12 selinux: apply execstack check on thread stacks
The execstack check was only being applied on the main
process stack.  Thread stacks allocated via mmap were
only subject to the execmem permission check.  Augment
the check to apply to the current thread stack as well.
Note that this does NOT prevent making a different thread's
stack executable.

Suggested-by: Nick Kralevich <nnk@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-26 15:47:57 -04:00
Stephen Smalley
8e4ff6f228 selinux: distinguish non-init user namespace capability checks
Distinguish capability checks against a target associated
with the init user namespace versus capability checks against
a target associated with a non-init user namespace by defining
and using separate security classes for the latter.

This is needed to support e.g. Chrome usage of user namespaces
for the Chrome sandbox without needing to allow Chrome to also
exercise capabilities on targets in the init user namespace.

Suggested-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-26 15:41:43 -04:00
Baolin Wang
457db29bfc security: Introduce security_settime64()
security_settime() uses a timespec, which is not year 2038 safe
on 32bit systems. Thus this patch introduces the security_settime64()
function with timespec64 type. We also convert the cap_settime() helper
function to use the 64bit types.

This patch then moves security_settime() to the header file as an
inline helper function so that existing users can be iteratively
converted.

None of the existing hooks is using the timespec argument and therefor
the patch is not making any functional changes.

Cc: Serge Hallyn <serge.hallyn@canonical.com>,
Cc: James Morris <james.l.morris@oracle.com>,
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
Cc: Paul Moore <pmoore@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Kees Cook <keescook@chromium.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
[jstultz: Reworded commit message]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-04-22 11:48:30 -07:00
Kees Cook
9b091556a0 LSM: LoadPin for kernel file loading restrictions
This LSM enforces that kernel-loaded files (modules, firmware, etc)
must all come from the same filesystem, with the expectation that
such a filesystem is backed by a read-only device such as dm-verity
or CDROM. This allows systems that have a verified and/or unchangeable
filesystem to enforce module and firmware loading restrictions without
needing to sign the files individually.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-21 10:47:27 +10:00
Kees Cook
8a56038c2a Yama: consolidate error reporting
Use a common error reporting function for Yama violation reports, and give
more detail into the process command lines.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-21 10:47:26 +10:00
Roopa Prabhu
10c9ead9f3 rtnetlink: add new RTM_GETSTATS message to dump link stats
This patch adds a new RTM_GETSTATS message to query link stats via netlink
from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
returns a lot more than just stats and is expensive in some cases when
frequent polling for stats from userspace is a common operation.

RTM_GETSTATS is an attempt to provide a light weight netlink message
to explicity query only link stats from the kernel on an interface.
The idea is to also keep it extensible so that new kinds of stats can be
added to it in the future.

This patch adds the following attribute for NETDEV stats:
struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
        [IFLA_STATS_LINK_64]  = { .len = sizeof(struct rtnl_link_stats64) },
};

Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
a single interface or all interfaces with NLM_F_DUMP.

Future possible new types of stat attributes:
link af stats:
    - IFLA_STATS_LINK_IPV6  (nested. for ipv6 stats)
    - IFLA_STATS_LINK_MPLS  (nested. for mpls/mdev stats)
extended stats:
    - IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge,
      vlan, vxlan etc)
    - IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are
      available via ethtool today)

This patch also declares a filter mask for all stat attributes.
User has to provide a mask of stats attributes to query. filter mask
can be specified in the new hdr 'struct if_stats_msg' for stats messages.
Other important field in the header is the ifindex.

This api can also include attributes for global stats (eg tcp) in the future.
When global stats are included in a stats msg, the ifindex in the header
must be zero. A single stats message cannot contain both global and
netdev specific stats. To easily distinguish them, netdev specific stat
attributes name are prefixed with IFLA_STATS_LINK_

Without any attributes in the filter_mask, no stats will be returned.

This patch has been tested with mofified iproute2 ifstat.

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-20 15:43:42 -04:00
Paul Moore
1ac4247626 selinux: check ss_initialized before revalidating an inode label
There is no point in trying to revalidate an inode's security label if
the security server is not yet initialized.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-19 16:37:27 -04:00
Paul Moore
20cdef8d57 selinux: delay inode label lookup as long as possible
Since looking up an inode's label can result in revalidation, delay
the lookup as long as possible to limit the performance impact.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-19 16:37:07 -04:00
Paul Moore
2c97165bef selinux: don't revalidate an inode's label when explicitly setting it
There is no point in attempting to revalidate an inode's security
label when we are in the process of setting it.

Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-19 16:36:28 -04:00
Prarit Bhargava
0fd71a620b selinux: Change bool variable name to index.
security_get_bool_value(int bool) argument "bool" conflicts with
in-kernel macros such as BUILD_BUG().  This patch changes this to
index which isn't a type.

Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Perepechko <anserper@ya.ru>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: selinux@tycho.nsa.gov
Cc: Eric Paris <eparis@redhat.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
[PM: wrapped description for checkpatch.pl, use "selinux:..." as subj]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-14 11:24:50 -04:00
Mat Martineau
ddbb411487 KEYS: Add KEYCTL_DH_COMPUTE command
This adds userspace access to Diffie-Hellman computations through a
new keyctl() syscall command to calculate shared secrets or public
keys using input parameters stored in the keyring.

Input key ids are provided in a struct due to the current 5-arg limit
for the keyctl syscall. Only user keys are supported in order to avoid
exposing the content of logon or encrypted keys.

The output is written to the provided buffer, based on the assumption
that the values are only needed in userspace.

Future support for other types of key derivation would involve a new
command, like KEYCTL_ECDH_COMPUTE.

Once Diffie-Hellman support is included in the crypto API, this code
can be converted to use the crypto API to take advantage of possible
hardware acceleration and reduce redundant code.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2016-04-12 19:54:58 +01:00
Kirill Marinushkin
13100a72f4 Security: Keys: Big keys stored encrypted
Solved TODO task: big keys saved to shmem file are now stored encrypted.
The encryption key is randomly generated and saved to payload[big_key_data].

Signed-off-by: Kirill Marinushkin <k.marinushkin@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2016-04-12 19:54:58 +01:00
David Howells
898de7d0f2 KEYS: user_update should use copy of payload made during preparsing
The payload preparsing routine for user keys makes a copy of the payload
provided by the caller and stashes it in the key_preparsed_payload struct for
->instantiate() or ->update() to use.  However, ->update() takes another copy
of this to attach to the keyring.  ->update() should be using this directly
and clearing the pointer in the preparse data.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-04-12 19:54:58 +01:00
Andreas Ziegler
93da17b185 security: integrity: Remove select to deleted option PUBLIC_KEY_ALGO_RSA
Commit d43de6c780 ("akcipher: Move the RSA DER encoding check to
the crypto layer") removed the Kconfig option PUBLIC_KEY_ALGO_RSA,
but forgot to remove a 'select' to this option in the definition of
INTEGRITY_ASYMMETRIC_KEYS.

Let's remove the select, as it's ineffective now.

Signed-off-by: Andreas Ziegler <andreas.ziegler@fau.de>
Signed-off-by: David Howells <dhowells@redhat.com>
2016-04-12 19:54:58 +01:00
David Howells
56104cf2b8 IMA: Use the the system trusted keyrings instead of .ima_mok
Add a config option (IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY)
that, when enabled, allows keys to be added to the IMA keyrings by
userspace - with the restriction that each must be signed by a key in the
system trusted keyrings.

EPERM will be returned if this option is disabled, ENOKEY will be returned if
no authoritative key can be found and EKEYREJECTED will be returned if the
signature doesn't match.  Other errors such as ENOPKG may also be returned.

If this new option is enabled, the builtin system keyring is searched, as is
the secondary system keyring if that is also enabled.  Intermediate keys
between the builtin system keyring and the key being added can be added to
the secondary keyring (which replaces .ima_mok) to form a trust chain -
provided they are also validly signed by a key in one of the trusted keyrings.

The .ima_mok keyring is then removed and the IMA blacklist keyring gets its
own config option (IMA_BLACKLIST_KEYRING).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-04-11 22:49:15 +01:00
David Howells
77f68bac94 KEYS: Remove KEY_FLAG_TRUSTED and KEY_ALLOC_TRUSTED
Remove KEY_FLAG_TRUSTED and KEY_ALLOC_TRUSTED as they're no longer
meaningful.  Also we can drop the trusted flag from the preparse structure.

Given this, we no longer need to pass the key flags through to
restrict_link().

Further, we can now get rid of keyring_restrict_trusted_only() also.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-04-11 22:44:15 +01:00
David Howells
a511e1af8b KEYS: Move the point of trust determination to __key_link()
Move the point at which a key is determined to be trustworthy to
__key_link() so that we use the contents of the keyring being linked in to
to determine whether the key being linked in is trusted or not.

What is 'trusted' then becomes a matter of what's in the keyring.

Currently, the test is done when the key is parsed, but given that at that
point we can only sensibly refer to the contents of the system trusted
keyring, we can only use that as the basis for working out the
trustworthiness of a new key.

With this change, a trusted keyring is a set of keys that once the
trusted-only flag is set cannot be added to except by verification through
one of the contained keys.

Further, adding a key into a trusted keyring, whilst it might grant
trustworthiness in the context of that keyring, does not automatically
grant trustworthiness in the context of a second keyring to which it could
be secondarily linked.

To accomplish this, the authentication data associated with the key source
must now be retained.  For an X.509 cert, this means the contents of the
AuthorityKeyIdentifier and the signature data.


If system keyrings are disabled then restrict_link_by_builtin_trusted()
resolves to restrict_link_reject().  The integrity digital signature code
still works correctly with this as it was previously using
KEY_FLAG_TRUSTED_ONLY, which doesn't permit anything to be added if there
is no system keyring against which trust can be determined.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-04-11 22:43:43 +01:00
David Howells
5ac7eace2d KEYS: Add a facility to restrict new links into a keyring
Add a facility whereby proposed new links to be added to a keyring can be
vetted, permitting them to be rejected if necessary.  This can be used to
block public keys from which the signature cannot be verified or for which
the signature verification fails.  It could also be used to provide
blacklisting.

This affects operations like add_key(), KEYCTL_LINK and KEYCTL_INSTANTIATE.

To this end:

 (1) A function pointer is added to the key struct that, if set, points to
     the vetting function.  This is called as:

	int (*restrict_link)(struct key *keyring,
			     const struct key_type *key_type,
			     unsigned long key_flags,
			     const union key_payload *key_payload),

     where 'keyring' will be the keyring being added to, key_type and
     key_payload will describe the key being added and key_flags[*] can be
     AND'ed with KEY_FLAG_TRUSTED.

     [*] This parameter will be removed in a later patch when
     	 KEY_FLAG_TRUSTED is removed.

     The function should return 0 to allow the link to take place or an
     error (typically -ENOKEY, -ENOPKG or -EKEYREJECTED) to reject the
     link.

     The pointer should not be set directly, but rather should be set
     through keyring_alloc().

     Note that if called during add_key(), preparse is called before this
     method, but a key isn't actually allocated until after this function
     is called.

 (2) KEY_ALLOC_BYPASS_RESTRICTION is added.  This can be passed to
     key_create_or_update() or key_instantiate_and_link() to bypass the
     restriction check.

 (3) KEY_FLAG_TRUSTED_ONLY is removed.  The entire contents of a keyring
     with this restriction emplaced can be considered 'trustworthy' by
     virtue of being in the keyring when that keyring is consulted.

 (4) key_alloc() and keyring_alloc() take an extra argument that will be
     used to set restrict_link in the new key.  This ensures that the
     pointer is set before the key is published, thus preventing a window
     of unrestrictedness.  Normally this argument will be NULL.

 (5) As a temporary affair, keyring_restrict_trusted_only() is added.  It
     should be passed to keyring_alloc() as the extra argument instead of
     setting KEY_FLAG_TRUSTED_ONLY on a keyring.  This will be replaced in
     a later patch with functions that look in the appropriate places for
     authoritative keys.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-04-11 22:37:37 +01:00
Al Viro
ce23e64013 ->getxattr(): pass dentry and inode as separate arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11 00:48:00 -04:00
Paolo Abeni
3c9d6296b7 security: drop the unused hook skb_owned_by
The skb_owned_by hook was added with the commit ca10b9e9a8
("selinux: add a skb_owned_by() hook") and later removed
when said commit was reverted.

Later on, when switching to list of hooks, a field named
'skb_owned_by' was included into the security_hook_head struct,
but without any users nor caller.

This commit removes the said left-over field.

Fixes: b1d9e6b064 ("LSM: Switch to lists of hooks")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Paul Moore <pmoore@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-11 12:21:43 +10:00
Al Viro
fc64005c93 don't bother with ->d_inode->i_sb - it's always equal to ->d_sb
... and neither can ever be NULL

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-10 17:11:51 -04:00
Jeff Vander Stoep
61d612ea73 selinux: restrict kernel module loading
Utilize existing kernel_read_file hook on kernel module load.
Add module_load permission to the system class.

Enforces restrictions on kernel module origin when calling the
finit_module syscall. The hook checks that source type has
permission module_load for the target type.
Example for finit_module:

allow foo bar_file:system module_load;

Similarly restrictions are enforced on kernel module loading when
calling the init_module syscall. The hook checks that source
type has permission module_load with itself as the target object
because the kernel module is sourced from the calling process.
Example for init_module:

allow foo foo:system module_load;

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
[PM: fixed return value of selinux_kernel_read_file()]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-05 16:11:56 -04:00
Paul Moore
0c6181cb30 selinux: consolidate the ptrace parent lookup code
We lookup the tracing parent in two places, using effectively the
same code, let's consolidate it.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-05 16:11:02 -04:00
Paul Moore
4b57d6bcd9 selinux: simply inode label states to INVALID and INITIALIZED
There really is no need for LABEL_MISSING as we really only care if
the inode's label is INVALID or INITIALIZED.  Also adjust the
revalidate code to reload the label whenever the label is not
INITIALIZED so we are less sensitive to label state in the future.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-05 16:10:55 -04:00
Paul Moore
899134f2f6 selinux: don't revalidate inodes in selinux_socket_getpeersec_dgram()
We don't have to worry about socket inodes being invalidated so
use inode_security_novalidate() to fetch the inode's security blob.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-05 16:10:52 -04:00
Al Viro
81cd8896a6 constify ima_d_path()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:52 -04:00
Al Viro
3b73b68c05 constify security_sb_pivotroot()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:52 -04:00
Al Viro
77b286c0d2 constify security_path_chroot()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:51 -04:00
Al Viro
3ccee46ab4 constify security_path_{link,rename}
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:36 -04:00
Al Viro
8db0185659 apparmor: remove useless checks for NULL ->mnt
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:28 -04:00
Al Viro
d360775217 constify security_path_{mkdir,mknod,symlink}
... as well as unix_mknod() and may_o_create()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:27 -04:00
Al Viro
989f74e050 constify security_path_{unlink,rmdir}
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:27 -04:00
Al Viro
d6b49f7ad2 apparmor: constify common_perm_...()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:26 -04:00
Al Viro
3539aaf670 apparmor: constify aa_path_link()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:26 -04:00
Al Viro
741aca71d6 apparmor: new helper - common_path_perm()
was open-coded in several places...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:25 -04:00
Al Viro
be01f9f28e constify chmod_common/security_path_chmod
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:25 -04:00
Al Viro
8a04c43b87 constify security_sb_mount()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:24 -04:00
Al Viro
7fd25dac9a constify chown_common/security_path_chown
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:24 -04:00
Al Viro
e6641eddf0 tomoyo: constify assorted struct path *
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:23 -04:00
Al Viro
928e1ebfb5 apparmor_path_truncate(): path->mnt is never NULL
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:47:23 -04:00
Al Viro
81f4c50607 constify security_path_truncate()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 00:46:54 -04:00
Al Viro
2c7661ff41 [apparmor] constify struct path * in a bunch of helpers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-27 23:48:14 -04:00
Linus Torvalds
643ad15d47 Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 protection key support from Ingo Molnar:
 "This tree adds support for a new memory protection hardware feature
  that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

  There's a background article at LWN.net:

      https://lwn.net/Articles/643797/

  The gist is that protection keys allow the encoding of
  user-controllable permission masks in the pte.  So instead of having a
  fixed protection mask in the pte (which needs a system call to change
  and works on a per page basis), the user can map a (handful of)
  protection mask variants and can change the masks runtime relatively
  cheaply, without having to change every single page in the affected
  virtual memory range.

  This allows the dynamic switching of the protection bits of large
  amounts of virtual memory, via user-space instructions.  It also
  allows more precise control of MMU permission bits: for example the
  executable bit is separate from the read bit (see more about that
  below).

  This tree adds the MM infrastructure and low level x86 glue needed for
  that, plus it adds a high level API to make use of protection keys -
  if a user-space application calls:

        mmap(..., PROT_EXEC);

  or

        mprotect(ptr, sz, PROT_EXEC);

  (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
  this special case, and will set a special protection key on this
  memory range.  It also sets the appropriate bits in the Protection
  Keys User Rights (PKRU) register so that the memory becomes unreadable
  and unwritable.

  So using protection keys the kernel is able to implement 'true'
  PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
  PROT_READ as well.  Unreadable executable mappings have security
  advantages: they cannot be read via information leaks to figure out
  ASLR details, nor can they be scanned for ROP gadgets - and they
  cannot be used by exploits for data purposes either.

  We know about no user-space code that relies on pure PROT_EXEC
  mappings today, but binary loaders could start making use of this new
  feature to map binaries and libraries in a more secure fashion.

  There is other pending pkeys work that offers more high level system
  call APIs to manage protection keys - but those are not part of this
  pull request.

  Right now there's a Kconfig that controls this feature
  (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
  (like most x86 CPU feature enablement code that has no runtime
  overhead), but it's not user-configurable at the moment.  If there's
  any serious problem with this then we can make it configurable and/or
  flip the default"

* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
  mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
  x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
  mm/core, x86/mm/pkeys: Add execute-only protection keys support
  x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
  x86/mm/pkeys: Allow kernel to modify user pkey rights register
  x86/fpu: Allow setting of XSAVE state
  x86/mm: Factor out LDT init from context init
  mm/core, x86/mm/pkeys: Add arch_validate_pkey()
  mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
  x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
  x86/mm/pkeys: Add Kconfig prompt to existing config option
  x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
  x86/mm/pkeys: Dump PKRU with other kernel registers
  mm/core, x86/mm/pkeys: Differentiate instruction fetches
  x86/mm/pkeys: Optimize fault handling in access_error()
  mm/core: Do not enforce PKEY permissions on remote mm access
  um, pkeys: Add UML arch_*_access_permitted() methods
  mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
  x86/mm/gup: Simplify get_user_pages() PTE bit handling
  ...
2016-03-20 19:08:56 -07:00
Linus Torvalds
96b9b1c956 TTY/Serial patches for 4.6-rc1
Here's the big tty/serial driver pull request for 4.6-rc1.
 
 Lots of changes in here, Peter has been on a tear again, with lots of
 refactoring and bugs fixes, many thanks to the great work he has been
 doing.  Lots of driver updates and fixes as well, full details in the
 shortlog.
 
 All have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlbp8z8ACgkQMUfUDdst+ym1vwCgnOOCORaZyeQ4QrcxPAK5pHFn
 VrMAoNHvDgNYtG+Hmzv25Lgp3HnysPin
 =MLRG
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial updates from Greg KH:
 "Here's the big tty/serial driver pull request for 4.6-rc1.

  Lots of changes in here, Peter has been on a tear again, with lots of
  refactoring and bugs fixes, many thanks to the great work he has been
  doing.  Lots of driver updates and fixes as well, full details in the
  shortlog.

  All have been in linux-next for a while with no reported issues"

* tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (220 commits)
  serial: 8250: describe CONFIG_SERIAL_8250_RSA
  serial: samsung: optimize UART rx fifo access routine
  serial: pl011: add mark/space parity support
  serial: sa1100: make sa1100_register_uart_fns a function
  tty: serial: 8250: add MOXA Smartio MUE boards support
  serial: 8250: convert drivers to use up_to_u8250p()
  serial: 8250/mediatek: fix building with SERIAL_8250=m
  serial: 8250/ingenic: fix building with SERIAL_8250=m
  serial: 8250/uniphier: fix modular build
  Revert "drivers/tty/serial: make 8250/8250_ingenic.c explicitly non-modular"
  Revert "drivers/tty/serial: make 8250/8250_mtk.c explicitly non-modular"
  serial: mvebu-uart: initial support for Armada-3700 serial port
  serial: mctrl_gpio: Add missing module license
  serial: ifx6x60: avoid uninitialized variable use
  tty/serial: at91: fix bad offset for UART timeout register
  tty/serial: at91: restore dynamic driver binding
  serial: 8250: Add hardware dependency to RT288X option
  TTY, devpts: document pty count limiting
  tty: goldfish: support platform_device with id -1
  drivers: tty: goldfish: Add device tree bindings
  ...
2016-03-17 13:53:25 -07:00
Linus Torvalds
bb7aeae3d6 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "There are a bunch of fixes to the TPM, IMA, and Keys code, with minor
  fixes scattered across the subsystem.

  IMA now requires signed policy, and that policy is also now measured
  and appraised"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (67 commits)
  X.509: Make algo identifiers text instead of enum
  akcipher: Move the RSA DER encoding check to the crypto layer
  crypto: Add hash param to pkcs1pad
  sign-file: fix build with CMS support disabled
  MAINTAINERS: update tpmdd urls
  MODSIGN: linux/string.h should be #included to get memcpy()
  certs: Fix misaligned data in extra certificate list
  X.509: Handle midnight alternative notation in GeneralizedTime
  X.509: Support leap seconds
  Handle ISO 8601 leap seconds and encodings of midnight in mktime64()
  X.509: Fix leap year handling again
  PKCS#7: fix unitialized boolean 'want'
  firmware: change kernel read fail to dev_dbg()
  KEYS: Use the symbol value for list size, updated by scripts/insert-sys-cert
  KEYS: Reserve an extra certificate symbol for inserting without recompiling
  modsign: hide openssl output in silent builds
  tpm_tis: fix build warning with tpm_tis_resume
  ima: require signed IMA policy
  ima: measure and appraise the IMA policy itself
  ima: load policy using path
  ...
2016-03-17 11:33:45 -07:00
Linus Torvalds
70477371dc Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "Here is the crypto update for 4.6:

  API:
   - Convert remaining crypto_hash users to shash or ahash, also convert
     blkcipher/ablkcipher users to skcipher.
   - Remove crypto_hash interface.
   - Remove crypto_pcomp interface.
   - Add crypto engine for async cipher drivers.
   - Add akcipher documentation.
   - Add skcipher documentation.

  Algorithms:
   - Rename crypto/crc32 to avoid name clash with lib/crc32.
   - Fix bug in keywrap where we zero the wrong pointer.

  Drivers:
   - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver.
   - Add PIC32 hwrng driver.
   - Support BCM6368 in bcm63xx hwrng driver.
   - Pack structs for 32-bit compat users in qat.
   - Use crypto engine in omap-aes.
   - Add support for sama5d2x SoCs in atmel-sha.
   - Make atmel-sha available again.
   - Make sahara hashing available again.
   - Make ccp hashing available again.
   - Make sha1-mb available again.
   - Add support for multiple devices in ccp.
   - Improve DMA performance in caam.
   - Add hashing support to rockchip"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
  crypto: qat - remove redundant arbiter configuration
  crypto: ux500 - fix checks of error code returned by devm_ioremap_resource()
  crypto: atmel - fix checks of error code returned by devm_ioremap_resource()
  crypto: qat - Change the definition of icp_qat_uof_regtype
  hwrng: exynos - use __maybe_unused to hide pm functions
  crypto: ccp - Add abstraction for device-specific calls
  crypto: ccp - CCP versioning support
  crypto: ccp - Support for multiple CCPs
  crypto: ccp - Remove check for x86 family and model
  crypto: ccp - memset request context to zero during import
  lib/mpi: use "static inline" instead of "extern inline"
  lib/mpi: avoid assembler warning
  hwrng: bcm63xx - fix non device tree compatibility
  crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode.
  crypto: qat - The AE id should be less than the maximal AE number
  lib/mpi: Endianness fix
  crypto: rockchip - add hash support for crypto engine in rk3288
  crypto: xts - fix compile errors
  crypto: doc - add skcipher API documentation
  crypto: doc - update AEAD AD handling
  ...
2016-03-17 11:22:54 -07:00
James Morris
88a1b564a2 Merge tag 'keys-next-20160303' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2016-03-04 11:39:53 +11:00
James Morris
5804602536 Merge branch 'stable-4.6' of git://git.infradead.org/users/pcmoore/selinux into next 2016-03-04 11:39:05 +11:00
David Howells
4e8ae72a75 X.509: Make algo identifiers text instead of enum
Make the identifier public key and digest algorithm fields text instead of
enum.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-03-03 21:49:27 +00:00
David Howells
d43de6c780 akcipher: Move the RSA DER encoding check to the crypto layer
Move the RSA EMSA-PKCS1-v1_5 encoding from the asymmetric-key public_key
subtype to the rsa crypto module's pkcs1pad template.  This means that the
public_key subtype no longer has any dependencies on public key type.

To make this work, the following changes have been made:

 (1) The rsa pkcs1pad template is now used for RSA keys.  This strips off the
     padding and returns just the message hash.

 (2) In a previous patch, the pkcs1pad template gained an optional second
     parameter that, if given, specifies the hash used.  We now give this,
     and pkcs1pad checks the encoded message E(M) for the EMSA-PKCS1-v1_5
     encoding and verifies that the correct digest OID is present.

 (3) The crypto driver in crypto/asymmetric_keys/rsa.c is now reduced to
     something that doesn't care about what the encryption actually does
     and and has been merged into public_key.c.

 (4) CONFIG_PUBLIC_KEY_ALGO_RSA is gone.  Module signing must set
     CONFIG_CRYPTO_RSA=y instead.

Thoughts:

 (*) Should the encoding style (eg. raw, EMSA-PKCS1-v1_5) also be passed to
     the padding template?  Should there be multiple padding templates
     registered that share most of the code?

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-03-03 21:49:27 +00:00
James Morris
34d47a7759 Merge branch 'stable-4.5' of git://git.infradead.org/users/pcmoore/selinux into for-linus 2016-02-26 19:32:16 +11:00
James Morris
481873d06f Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2016-02-26 15:06:41 +11:00
James Morris
6020944280 Merge branch 'smack-for-4.6' of https://github.com/cschaufler/smack-next into next 2016-02-22 13:27:12 +11:00
Mimi Zohar
95ee08fa37 ima: require signed IMA policy
Require the IMA policy to be signed when additional rules can be added.

v1:
- initialize the policy flag
- include IMA_APPRAISE_POLICY in the policy flag

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Petko Manolov <petkan@mip-labs.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-21 09:34:23 -05:00
Mimi Zohar
19f8a84713 ima: measure and appraise the IMA policy itself
Add support for measuring and appraising the IMA policy itself.

Changelog v4:
- use braces on both if/else branches, even if single line on one of the
branches - Dmitry
- Use the id mapping - Dmitry

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Petko Manolov <petkan@mip-labs.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-21 09:34:22 -05:00
Dmitry Kasatkin
7429b09281 ima: load policy using path
We currently cannot do appraisal or signature vetting of IMA policies
since we currently can only load IMA policies by writing the contents
of the policy directly in, as follows:

cat policy-file > <securityfs>/ima/policy

If we provide the kernel the path to the IMA policy so it can load
the policy itself it'd be able to later appraise or vet the file
signature if it has one.  This patch adds support to load the IMA
policy with a given path as follows:

echo /etc/ima/ima_policy > /sys/kernel/security/ima/policy

Changelog v4+:
- moved kernel_read_file_from_path() error messages to callers
v3:
- moved kernel_read_file_from_path() to a separate patch
v2:
- after re-ordering the patches, replace calling integrity_kernel_read()
  to read the file with kernel_read_file_from_path() (Mimi)
- Patch description re-written by Luis R. Rodriguez

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-02-21 09:34:05 -05:00
Mimi Zohar
d9ddf077bb ima: support for kexec image and initramfs
Add IMA policy support for measuring/appraising the kexec image and
initramfs. Two new IMA policy identifiers KEXEC_KERNEL_CHECK and
KEXEC_INITRAMFS_CHECK are defined.

Example policy rules:
measure func=KEXEC_KERNEL_CHECK
appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig
measure func=KEXEC_INITRAMFS_CHECK
appraise func=KEXEC_INITRAMFS_CHECK appraise_type=imasig

Moving the enumeration to the vfs layer simplified the patches, allowing
the IMA changes, for the most part, to be separated from the other
changes.  Unfortunately, passing either a kernel_read_file_id or a
ima_hooks enumeration within IMA is messy.

Option 1: duplicate kernel_read_file enumeration in ima_hooks

enum kernel_read_file_id {
	...
        READING_KEXEC_IMAGE,
        READING_KEXEC_INITRAMFS,
        READING_MAX_ID

enum ima_hooks {
	...
	KEXEC_KERNEL_CHECK
	KEXEC_INITRAMFS_CHECK

Option 2: define ima_hooks as extension of kernel_read_file
eg: enum ima_hooks {
        FILE_CHECK = READING_MAX_ID,
        MMAP_CHECK,

In order to pass both kernel_read_file_id and ima_hooks values, we
would need to specify a struct containing a union.

struct caller_id {
        union {
                enum ima_hooks func_id;
                enum kernel_read_file_id read_id;
        };
};

Option 3: incorportate the ima_hooks enumeration into kernel_read_file_id,
perhaps changing the enumeration name.

For now, duplicate the new READING_KEXEC_IMAGE/INITRAMFS in the ima_hooks.

Changelog v4:
- replaced switch statement with a kernel_read_file_id to an ima_hooks
id mapping array - Dmitry
- renamed ima_hook tokens KEXEC_CHECK and INITRAMFS_CHECK to
KEXEC_KERNEL_CHECK and KEXEC_INITRAMFS_CHECK respectively - Dave Young

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Petko Manolov <petkan@mip-labs.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Cc: Dave Young <dyoung@redhat.com>
2016-02-21 09:06:16 -05:00
Mimi Zohar
c6af8efe97 ima: remove firmware and module specific cached status info
Each time a file is read by the kernel, the file should be re-measured and
the file signature re-appraised, based on policy.  As there is no need to
preserve the status information, this patch replaces the firmware and
module specific cache status with a generic one named read_file.

This change simplifies adding support for other files read by the kernel.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Petko Manolov <petkan@mip-labs.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-21 09:06:13 -05:00
Mimi Zohar
a1db742094 module: replace copy_module_from_fd with kernel version
Replace copy_module_from_fd() with kernel_read_file_from_fd().

Although none of the upstreamed LSMs define a kernel_module_from_file
hook, IMA is called, based on policy, to prevent unsigned kernel modules
from being loaded by the original kernel module syscall and to
measure/appraise signed kernel modules.

The security function security_kernel_module_from_file() was called prior
to reading a kernel module.  Preventing unsigned kernel modules from being
loaded by the original kernel module syscall remains on the pre-read
kernel_read_file() security hook.  Instead of reading the kernel module
twice, once for measuring/appraising and again for loading the kernel
module, the signature validation is moved to the kernel_post_read_file()
security hook.

This patch removes the security_kernel_module_from_file() hook and security
call.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
2016-02-21 09:06:12 -05:00
Mimi Zohar
39eeb4fb97 security: define kernel_read_file hook
The kernel_read_file security hook is called prior to reading the file
into memory.

Changelog v4+:
- export security_kernel_read_file()

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-02-21 09:06:09 -05:00
Mimi Zohar
e40ba6d56b firmware: replace call to fw_read_file_contents() with kernel version
Replace the fw_read_file_contents with kernel_file_read_from_path().

Although none of the upstreamed LSMs define a kernel_fw_from_file hook,
IMA is called by the security function to prevent unsigned firmware from
being loaded and to measure/appraise signed firmware, based on policy.

Instead of reading the firmware twice, once for measuring/appraising the
firmware and again for reading the firmware contents into memory, the
kernel_post_read_file() security hook calculates the file hash based on
the in memory file buffer.  The firmware is read once.

This patch removes the LSM kernel_fw_from_file() hook and security call.

Changelog v4+:
- revert dropped buf->size assignment - reported by Sergey Senozhatsky
v3:
- remove kernel_fw_from_file hook
- use kernel_file_read_from_path() - requested by Luis
v2:
- reordered and squashed firmware patches
- fix MAX firmware size (Kees Cook)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
2016-02-21 09:03:44 -05:00
Mimi Zohar
cf22221786 ima: define a new hook to measure and appraise a file already in memory
This patch defines a new IMA hook ima_post_read_file() for measuring
and appraising files read by the kernel. The caller loads the file into
memory before calling this function, which calculates the hash followed by
the normal IMA policy based processing.

Changelog v5:
- fail ima_post_read_file() if either file or buf is NULL
v3:
- rename ima_hash_and_process_file() to ima_post_read_file()

v1:
- split patch

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-20 22:35:08 -05:00
Andreas Gruenbacher
e817c2f33e selinux: Don't sleep inside inode_getsecid hook
The inode_getsecid hook is called from contexts in which sleeping is not
allowed, so we cannot revalidate inode security labels from there. Use
the non-validating version of inode_security() instead.

Reported-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-02-19 16:29:19 -05:00
Mimi Zohar
98304bcf71 ima: calculate the hash of a buffer using aynchronous hash(ahash)
Setting up ahash has some overhead.  Only use ahash to calculate the
hash of a buffer, if the buffer is larger than ima_ahash_minsize.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-18 17:14:44 -05:00
Dmitry Kasatkin
11d7646df8 ima: provide buffer hash calculation function
This patch provides convenient buffer hash calculation function.

Changelog v3:
- fix while hash calculation - Dmitry
v1:
- rewrite to support loff_t sized buffers - Mimi
  (based on Fenguang Wu's testing)

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-02-18 17:14:28 -05:00
Mimi Zohar
bc8ca5b92d vfs: define kernel_read_file_id enumeration
To differentiate between the kernel_read_file() callers, this patch
defines a new enumeration named kernel_read_file_id and includes the
caller identifier as an argument.

Subsequent patches define READING_KEXEC_IMAGE, READING_KEXEC_INITRAMFS,
READING_FIRMWARE, READING_MODULE, and READING_POLICY.

Changelog v3:
- Replace the IMA specific enumeration with a generic one.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
2016-02-18 17:14:04 -05:00
Mimi Zohar
b44a7dfc6f vfs: define a generic function to read a file from the kernel
For a while it was looked down upon to directly read files from Linux.
These days there exists a few mechanisms in the kernel that do just
this though to load a file into a local buffer.  There are minor but
important checks differences on each.  This patch set is the first
attempt at resolving some of these differences.

This patch introduces a common function for reading files from the kernel
with the corresponding security post-read hook and function.

Changelog v4+:
- export security_kernel_post_read_file() - Fengguang Wu
v3:
- additional bounds checking - Luis
v2:
- To simplify patch review, re-ordered patches

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
2016-02-18 17:14:03 -05:00
Mimi Zohar
4ad87a3d74 ima: use "ima_hooks" enum as function argument
Cleanup the function arguments by using "ima_hooks" enumerator as needed.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Petko Manolov <petkan@mip-labs.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-18 17:13:58 -05:00
Mimi Zohar
b5269ab3e2 ima: refactor ima_policy_show() to display "ima_hooks" rules
Define and call a function to display the "ima_hooks" rules.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Petko Manolov <petkan@mip-labs.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-18 17:13:56 -05:00
Dmitry Kasatkin
1525b06d99 ima: separate 'security.ima' reading functionality from collect
Instead of passing pointers to pointers to ima_collect_measurent() to
read and return the 'security.ima' xattr value, this patch moves the
functionality to the calling process_measurement() to directly read
the xattr and pass only the hash algo to the ima_collect_measurement().

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-02-18 17:13:32 -05:00
Paul Gortmaker
a1f2bdf338 security/keys: make big_key.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

config BIG_KEYS
        bool "Large payload keys"

...meaning that it currently is not being built as a module by anyone.

Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We also delete the MODULE_LICENSE tag since all that information
is already contained at the top of the file in the comments.

Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: keyrings@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2016-02-18 15:15:59 +00:00
Tadeusz Struk
eb5798f2e2 integrity: convert digsig to akcipher api
Convert asymmetric_verify to akcipher api.

Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David Howells <dhowells@redhat.com>
2016-02-18 14:52:32 +00:00
José Bollo
8012495e17 smack: fix cache of access labels
Before this commit, removing the access property of
a file, aka, the extended attribute security.SMACK64
was not effictive until the cache had been cleaned.

This patch fixes that problem.

Signed-off-by: José Bollo <jobol@nonadev.net>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-02-16 09:56:35 -08:00
Dave Hansen
1e9877902d mm/gup: Introduce get_user_pages_remote()
For protection keys, we need to understand whether protections
should be enforced in software or not.  In general, we enforce
protections when working on our own task, but not when on others.
We call these "current" and "remote" operations.

This patch introduces a new get_user_pages() variant:

        get_user_pages_remote()

Which is a replacement for when get_user_pages() is called on
non-current tsk/mm.

We also introduce a new gup flag: FOLL_REMOTE which can be used
for the "__" gup variants to get this new behavior.

The uprobes is_trap_at_addr() location holds mmap_sem and
calls get_user_pages(current->mm) on an instruction address.  This
makes it a pretty unique gup caller.  Being an instruction access
and also really originating from the kernel (vs. the app), I opted
to consider this a 'remote' access where protection keys will not
be enforced.

Without protection keys, this patch should not change any behavior.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: jack@suse.cz
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160212210154.3F0E51EA@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-16 10:04:09 +01:00
Greg Kroah-Hartman
249f3c4fe4 Merge 4.5-rc4 into tty-next
We want the fixes in here, and this resolves a merge error in tty_io.c

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-14 14:36:04 -08:00
Ryan Ware
613317bd21 EVM: Use crypto_memneq() for digest comparisons
This patch fixes vulnerability CVE-2016-2085.  The problem exists
because the vm_verify_hmac() function includes a use of memcmp().
Unfortunately, this allows timing side channel attacks; specifically
a MAC forgery complexity drop from 2^128 to 2^12.  This patch changes
the memcmp() to the cryptographically safe crypto_memneq().

Reported-by: Xiaofei Rex Guo <xiaofei.rex.guo@intel.com>
Signed-off-by: Ryan Ware <ware@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-02-12 18:36:47 +11:00
Casey Schaufler
491a0b08d3 Smack: Remove pointless hooks
Prior to the 4.2 kernel there no no harm in providing
a security module hook that does nothing, as the default
hook would get called if the module did not supply one.
With the list based infrastructure an empty hook adds
overhead. This patch removes the three Smack hooks that
don't actually do anything.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2016-02-11 09:14:35 -08:00
David Howells
50d35015ff KEYS: CONFIG_KEYS_DEBUG_PROC_KEYS is no longer an option
CONFIG_KEYS_DEBUG_PROC_KEYS is no longer an option as /proc/keys is now
mandatory if the keyrings facility is enabled (it's used by libkeyutils in
userspace).

The defconfig references were removed with:

	perl -p -i -e 's/CONFIG_KEYS_DEBUG_PROC_KEYS=y\n//' \
	    `git grep -l CONFIG_KEYS_DEBUG_PROC_KEYS=y`

and the integrity Kconfig fixed by hand.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Andreas Ziegler <andreas.ziegler@fau.de>
cc: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
2016-02-10 10:13:27 +00:00
Jarkko Sakkinen
f3c82ade7c tpm: fix checks for policy digest existence in tpm2_seal_trusted()
In my original patch sealing with policy was done with dynamically
allocated buffer that I changed later into an array so the checks in
tpm2-cmd.c became invalid. This patch fixes the issue.

Fixes: 5beb0c435b ("keys, trusted: seal with a TPM2 authorization policy")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
2016-02-10 04:10:55 +02:00
David Howells
5d2787cf0b KEYS: Add an alloc flag to convey the builtinness of a key
Add KEY_ALLOC_BUILT_IN to convey that a key should have KEY_FLAG_BUILTIN
set rather than setting it after the fact.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-02-09 16:40:46 +00:00
Lorenzo Colitti
08ff924e7f selinux: nlmsgtab: add SOCK_DESTROY to the netlink mapping tables
Without this, using SOCK_DESTROY in enforcing mode results in:

  SELinux: unrecognized netlink message type=21 for sclass=32

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-09 04:55:05 -05:00
Herbert Xu
f75516a815 crypto: keys - Revert "convert public key to akcipher api"
This needs to go through the security tree so I'm reverting the
patches for now.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-09 16:18:01 +08:00
Colin Ian King
c75d8e96f3 IMA: fix non-ANSI declaration of ima_check_policy()
ima_check_policy() has no parameters, so use the normal void
parameter convention to make it match the prototype in the header file
security/integrity/ima/ima.h

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-02-08 18:17:38 -05:00
Tadeusz Struk
42bbaabb12 integrity: convert digsig to akcipher api
Convert asymmetric_verify to akcipher api.

Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-06 15:33:26 +08:00
Greg Kroah-Hartman
6e9131cc43 Merge 4.5-rc2 into tty-next
We want the tty/serial fixes in here as well to make merges easier.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-01 12:53:14 -08:00
Andy Shevchenko
9090a2d5e3 selinux: use absolute path to include directory
Compiler warns us a lot that it can't find include folder because it's
provided in relative form.

  CC      security/selinux/netlabel.o
cc1: warning: security/selinux/include: No such file or directory
cc1: warning: security/selinux/include: No such file or directory
cc1: warning: security/selinux/include: No such file or directory
cc1: warning: security/selinux/include: No such file or directory

Add $(srctree) prefix to the path.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[PM: minor description edits to fit under 80char width]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-01-28 10:37:15 -05:00
David Howells
eee045021f KEYS: Only apply KEY_FLAG_KEEP to a key if a parent keyring has it set
KEY_FLAG_KEEP should only be applied to a key if the keyring it is being
linked into has KEY_FLAG_KEEP set.

To this end, partially revert the following patch:

	commit 1d6d167c2e
	Author: Mimi Zohar <zohar@linux.vnet.ibm.com>
	Date:   Thu Jan 7 07:46:36 2016 -0500
	KEYS: refcount bug fix

to undo the change that made it unconditional (Mimi got it right the first
time).

Without undoing this change, it becomes impossible to delete, revoke or
invalidate keys added to keyrings through __key_instantiate_and_link()
where the keyring has itself been linked to.  To test this, run the
following command sequence:

    keyctl newring foo @s
    keyctl add user a a %:foo
    keyctl unlink %user:a %:foo
    keyctl clear %:foo

With the commit mentioned above the third and fourth commands fail with
EPERM when they should succeed.

Reported-by: Stephen Gallager <sgallagh@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by:  Mimi Zohar <zohar@linux.vnet.ibm.com>
cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
cc: keyrings@vger.kernel.org
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-01-28 10:48:40 +11:00
Peter Hurley
4a51096937 tty: Make tty_files_lock per-tty
Access to tty->tty_files list is always per-tty, never for all ttys
simultaneously. Replace global tty_files_lock spinlock with per-tty
->files_lock. Initialize when the ->tty_files list is inited, in
alloc_tty_struct().

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-27 15:13:28 -08:00
Herbert Xu
c3917fd9df KEYS: Use skcipher
This patch replaces uses of blkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-27 20:36:03 +08:00
Al Viro
5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Jann Horn
caaee6234d ptrace: use fsuid, fsgid, effective creds for fs access checks
By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

 /proc/$pid/stat - uses the check for determining whether pointers
     should be visible, useful for bypassing ASLR
 /proc/$pid/maps - also useful for bypassing ASLR
 /proc/$pid/cwd - useful for gaining access to restricted
     directories that contain files with lax permissions, e.g. in
     this scenario:
     lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
     drwx------ root root /root
     drwxr-xr-x root root /root/foobar
     -rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Jann Horn
3dfb7d8cdb security: let security modules use PTRACE_MODE_* with bitmasks
It looks like smack and yama weren't aware that the ptrace mode
can have flags ORed into it - PTRACE_MODE_NOAUDIT until now, but
only for /proc/$pid/stat, and with the PTRACE_MODE_*CREDS patch,
all modes have flags ORed into them.

Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Yevgeny Pats
23567fd052 KEYS: Fix keyring ref leak in join_session_keyring()
This fixes CVE-2016-0728.

If a thread is asked to join as a session keyring the keyring that's already
set as its session, we leak a keyring reference.

This can be tested with the following program:

	#include <stddef.h>
	#include <stdio.h>
	#include <sys/types.h>
	#include <keyutils.h>

	int main(int argc, const char *argv[])
	{
		int i = 0;
		key_serial_t serial;

		serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING,
				"leaked-keyring");
		if (serial < 0) {
			perror("keyctl");
			return -1;
		}

		if (keyctl(KEYCTL_SETPERM, serial,
			   KEY_POS_ALL | KEY_USR_ALL) < 0) {
			perror("keyctl");
			return -1;
		}

		for (i = 0; i < 100; i++) {
			serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING,
					"leaked-keyring");
			if (serial < 0) {
				perror("keyctl");
				return -1;
			}
		}

		return 0;
	}

If, after the program has run, there something like the following line in
/proc/keys:

3f3d898f I--Q---   100 perm 3f3f0000     0     0 keyring   leaked-keyring: empty

with a usage count of 100 * the number of times the program has been run,
then the kernel is malfunctioning.  If leaked-keyring has zero usages or
has been garbage collected, then the problem is fixed.

Reported-by: Yevgeny Pats <yevgeny@perception-point.io>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-01-20 10:50:48 +11:00
Linus Torvalds
5807fcaa9b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:

 - EVM gains support for loading an x509 cert from the kernel
   (EVM_LOAD_X509), into the EVM trusted kernel keyring.

 - Smack implements 'file receive' process-based permission checking for
   sockets, rather than just depending on inode checks.

 - Misc enhancments for TPM & TPM2.

 - Cleanups and bugfixes for SELinux, Keys, and IMA.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (41 commits)
  selinux: Inode label revalidation performance fix
  KEYS: refcount bug fix
  ima: ima_write_policy() limit locking
  IMA: policy can be updated zero times
  selinux: rate-limit netlink message warnings in selinux_nlmsg_perm()
  selinux: export validatetrans decisions
  gfs2: Invalid security labels of inodes when they go invalid
  selinux: Revalidate invalid inode security labels
  security: Add hook to invalidate inode security labels
  selinux: Add accessor functions for inode->i_security
  security: Make inode argument of inode_getsecid non-const
  security: Make inode argument of inode_getsecurity non-const
  selinux: Remove unused variable in selinux_inode_init_security
  keys, trusted: seal with a TPM2 authorization policy
  keys, trusted: select hash algorithm for TPM2 chips
  keys, trusted: fix: *do not* allow duplicate key options
  tpm_ibmvtpm: properly handle interrupted packet receptions
  tpm_tis: Tighten IRQ auto-probing
  tpm_tis: Refactor the interrupt setup
  tpm_tis: Get rid of the duplicate IRQ probing code
  ...
2016-01-17 19:13:15 -08:00
James Morris
acb2cfdb31 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into next 2016-01-14 12:11:58 +11:00
Linus Torvalds
33caf82acf Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "All kinds of stuff.  That probably should've been 5 or 6 separate
  branches, but by the time I'd realized how large and mixed that bag
  had become it had been too close to -final to play with rebasing.

  Some fs/namei.c cleanups there, memdup_user_nul() introduction and
  switching open-coded instances, burying long-dead code, whack-a-mole
  of various kinds, several new helpers for ->llseek(), assorted
  cleanups and fixes from various people, etc.

  One piece probably deserves special mention - Neil's
  lookup_one_len_unlocked().  Similar to lookup_one_len(), but gets
  called without ->i_mutex and tries to avoid ever taking it.  That, of
  course, means that it's not useful for any directory modifications,
  but things like getting inode attributes in nfds readdirplus are fine
  with that.  I really should've asked for moratorium on lookup-related
  changes this cycle, but since I hadn't done that early enough...  I
  *am* asking for that for the coming cycle, though - I'm going to try
  and get conversion of i_mutex to rwsem with ->lookup() done under lock
  taken shared.

  There will be a patch closer to the end of the window, along the lines
  of the one Linus had posted last May - mechanical conversion of
  ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
  inode_is_locked()/inode_lock_nested().  To quote Linus back then:

    -----
    |    This is an automated patch using
    |
    |        sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
    |        sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
    |        sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[     ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
    |        sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
    |        sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
    |
    |    with a very few manual fixups
    -----

  I'm going to send that once the ->i_mutex-affecting stuff in -next
  gets mostly merged (or when Linus says he's about to stop taking
  merges)"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  nfsd: don't hold i_mutex over userspace upcalls
  fs:affs:Replace time_t with time64_t
  fs/9p: use fscache mutex rather than spinlock
  proc: add a reschedule point in proc_readfd_common()
  logfs: constify logfs_block_ops structures
  fcntl: allow to set O_DIRECT flag on pipe
  fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
  fs: xattr: Use kvfree()
  [s390] page_to_phys() always returns a multiple of PAGE_SIZE
  nbd: use ->compat_ioctl()
  fs: use block_device name vsprintf helper
  lib/vsprintf: add %*pg format specifier
  fs: use gendisk->disk_name where possible
  poll: plug an unused argument to do_poll
  amdkfd: don't open-code memdup_user()
  cdrom: don't open-code memdup_user()
  rsxx: don't open-code memdup_user()
  mtip32xx: don't open-code memdup_user()
  [um] mconsole: don't open-code memdup_user_nul()
  [um] hostaudio: don't open-code memdup_user()
  ...
2016-01-12 17:11:47 -08:00
Linus Torvalds
ddf1d6238d Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr updates from Al Viro:
 "Andreas' xattr cleanup series.

  It's a followup to his xattr work that went in last cycle; -0.5KLoC"

* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  xattr handlers: Simplify list operation
  ocfs2: Replace list xattr handler operations
  nfs: Move call to security_inode_listsecurity into nfs_listxattr
  xfs: Change how listxattr generates synthetic attributes
  tmpfs: listxattr should include POSIX ACL xattrs
  tmpfs: Use xattr handler infrastructure
  btrfs: Use xattr handler infrastructure
  vfs: Distinguish between full xattr names and proper prefixes
  posix acls: Remove duplicate xattr name definitions
  gfs2: Remove gfs2_xattr_acl_chmod
  vfs: Remove vfs_xattr_cmp
2016-01-11 13:32:10 -08:00
James Morris
607259e17b Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into ra-next 2016-01-10 21:52:17 +11:00
Al Viro
6108209c4a Merge branch 'for-linus' into work.misc 2016-01-08 21:20:11 -05:00
Andreas Gruenbacher
b197367ed1 selinux: Inode label revalidation performance fix
Commit 5d226df4 has introduced a performance regression of about
10% in the UnixBench pipe benchmark.  It turns out that the call
to inode_security in selinux_file_permission can be moved below
the zero-mask test and that inode_security_revalidate can be
removed entirely, which brings us back to roughly the original
performance.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-01-08 16:24:27 -05:00
Mimi Zohar
1d6d167c2e KEYS: refcount bug fix
This patch fixes the key_ref leak, removes the unnecessary KEY_FLAG_KEEP
test before setting the flag, and cleans up the if/then brackets style
introduced in commit:
d3600bc KEYS: prevent keys from being removed from specified keyrings

Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
2016-01-07 12:56:42 -05:00
Al Viro
cc4e719e83 fix the leak in integrity_read_file()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-04 10:28:19 -05:00
Al Viro
8365a71946 selinuxfs: switch to memdup_user_nul()
Nothing in there gives a damn about the buffer alignment - it
just parses its contents.  So the use of get_zeroed_page()
doesn't buy us anything - might as well had been kmalloc(),
which makes that code equivalent to open-coded memdup_user_nul()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-04 10:27:39 -05:00
Al Viro
16e5c1fc36 convert a bunch of open-coded instances of memdup_user_nul()
A _lot_ of ->write() instances were open-coding it; some are
converted to memdup_user_nul(), a lot more remain...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-04 10:26:58 -05:00
Petko Manolov
6427e6c71c ima: ima_write_policy() limit locking
There is no need to hold the ima_write_mutex for so long.  We only need it
around ima_parse_add_rule().

Changelog:
- The return path now takes into account failed kmalloc() call.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Petko Manolov <petkan@mip-labs.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-01-03 13:22:38 -05:00
James Morris
aa98b942cb Merge branch 'smack-for-4.5' of https://github.com/cschaufler/smack-next into next 2015-12-26 16:11:13 +11:00
James Morris
37babe4ec6 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into next 2015-12-26 16:07:31 +11:00
James Morris
3cb92fe481 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2015-12-26 16:06:53 +11:00
Sasha Levin
0112721df4 IMA: policy can be updated zero times
Commit "IMA: policy can now be updated multiple times" assumed that the
policy would be updated at least once.

If there are zero updates, the temporary list head object will get added
to the policy list, and later dereferenced as an IMA policy object, which
means that invalid memory will be accessed.

Changelog:
- Move list_empty() test to ima_release_policy(), before audit msg - Mimi

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-24 18:56:45 -05:00
Vladis Dronov
76319946f3 selinux: rate-limit netlink message warnings in selinux_nlmsg_perm()
Any process is able to send netlink messages with invalid types.
Make the warning rate-limited to prevent too much log spam.

The warning is supposed to help to find misbehaving programs, so
print the triggering command name and pid.

Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
[PM: subject line tweak to make checkpatch.pl happy]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:41 -05:00
Andrew Perepechko
f9df645821 selinux: export validatetrans decisions
Make validatetrans decisions available through selinuxfs.
"/validatetrans" is added to selinuxfs for this purpose.
This functionality is needed by file system servers
implemented in userspace or kernelspace without the VFS
layer.

Writing "$oldcontext $newcontext $tclass $taskcontext"
to /validatetrans is expected to return 0 if the transition
is allowed and -EPERM otherwise.

Signed-off-by: Andrew Perepechko <anserper@ya.ru>
CC: andrew.perepechko@seagate.com
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:41 -05:00
Andreas Gruenbacher
5d226df4ed selinux: Revalidate invalid inode security labels
When fetching an inode's security label, check if it is still valid, and
try reloading it if it is not. Reloading will fail when we are in RCU
context which doesn't allow sleeping, or when we can't find a dentry for
the inode.  (Reloading happens via iop->getxattr which takes a dentry
parameter.)  When reloading fails, continue using the old, invalid
label.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:40 -05:00
Andreas Gruenbacher
6f3be9f562 security: Add hook to invalidate inode security labels
Add a hook to invalidate an inode's security label when the cached
information becomes invalid.

Add the new hook in selinux: set a flag when a security label becomes
invalid.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:40 -05:00
Andreas Gruenbacher
83da53c5a3 selinux: Add accessor functions for inode->i_security
Add functions dentry_security and inode_security for accessing
inode->i_security.  These functions initially don't do much, but they
will later be used to revalidate the security labels when necessary.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:39 -05:00
Andreas Gruenbacher
d6335d77a7 security: Make inode argument of inode_getsecid non-const
Make the inode argument of the inode_getsecid hook non-const so that we
can use it to revalidate invalid security labels.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:39 -05:00
Andreas Gruenbacher
ea861dfd9e security: Make inode argument of inode_getsecurity non-const
Make the inode argument of the inode_getsecurity hook non-const so that
we can use it to revalidate invalid security labels.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:39 -05:00
Andreas Gruenbacher
a44ca52ca6 selinux: Remove unused variable in selinux_inode_init_security
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-12-24 11:09:39 -05:00
Jarkko Sakkinen
5beb0c435b keys, trusted: seal with a TPM2 authorization policy
TPM2 supports authorization policies, which are essentially
combinational logic statements repsenting the conditions where the data
can be unsealed based on the TPM state. This patch enables to use
authorization policies to seal trusted keys.

Two following new options have been added for trusted keys:

* 'policydigest=': provide an auth policy digest for sealing.
* 'policyhandle=': provide a policy session handle for unsealing.

If 'hash=' option is supplied after 'policydigest=' option, this
will result an error because the state of the option would become
mixed.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
2015-12-20 15:27:13 +02:00
Jarkko Sakkinen
5ca4c20cfd keys, trusted: select hash algorithm for TPM2 chips
Added 'hash=' option for selecting the hash algorithm for add_key()
syscall and documentation for it.

Added entry for sm3-256 to the following tables in order to support
TPM_ALG_SM3_256:

* hash_algo_name
* hash_digest_size

Includes support for the following hash algorithms:

* sha1
* sha256
* sha384
* sha512
* sm3-256

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
2015-12-20 15:27:12 +02:00
Jarkko Sakkinen
5208cc8342 keys, trusted: fix: *do not* allow duplicate key options
The trusted keys option parsing allows specifying the same option
multiple times. The last option value specified is used.

This is problematic because:

* No gain.
* This makes complicated to specify options that are dependent on other
  options.

This patch changes the behavior in a way that option can be specified
only once.

Reported-by: James Morris James Morris <jmorris@namei.org>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
2015-12-20 15:27:12 +02:00
David Howells
b4a1b4f504 KEYS: Fix race between read and revoke
This fixes CVE-2015-7550.

There's a race between keyctl_read() and keyctl_revoke().  If the revoke
happens between keyctl_read() checking the validity of a key and the key's
semaphore being taken, then the key type read method will see a revoked key.

This causes a problem for the user-defined key type because it assumes in
its read method that there will always be a payload in a non-revoked key
and doesn't check for a NULL pointer.

Fix this by making keyctl_read() check the validity of a key after taking
semaphore instead of before.

I think the bug was introduced with the original keyrings code.

This was discovered by a multithreaded test program generated by syzkaller
(http://github.com/google/syzkaller).  Here's a cleaned up version:

	#include <sys/types.h>
	#include <keyutils.h>
	#include <pthread.h>
	void *thr0(void *arg)
	{
		key_serial_t key = (unsigned long)arg;
		keyctl_revoke(key);
		return 0;
	}
	void *thr1(void *arg)
	{
		key_serial_t key = (unsigned long)arg;
		char buffer[16];
		keyctl_read(key, buffer, 16);
		return 0;
	}
	int main()
	{
		key_serial_t key = add_key("user", "%", "foo", 3, KEY_SPEC_USER_KEYRING);
		pthread_t th[5];
		pthread_create(&th[0], 0, thr0, (void *)(unsigned long)key);
		pthread_create(&th[1], 0, thr1, (void *)(unsigned long)key);
		pthread_create(&th[2], 0, thr0, (void *)(unsigned long)key);
		pthread_create(&th[3], 0, thr1, (void *)(unsigned long)key);
		pthread_join(th[0], 0);
		pthread_join(th[1], 0);
		pthread_join(th[2], 0);
		pthread_join(th[3], 0);
		return 0;
	}

Build as:

	cc -o keyctl-race keyctl-race.c -lkeyutils -lpthread

Run as:

	while keyctl-race; do :; done

as it may need several iterations to crash the kernel.  The crash can be
summarised as:

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
	IP: [<ffffffff81279b08>] user_read+0x56/0xa3
	...
	Call Trace:
	 [<ffffffff81276aa9>] keyctl_read_key+0xb6/0xd7
	 [<ffffffff81277815>] SyS_keyctl+0x83/0xe0
	 [<ffffffff815dbb97>] entry_SYSCALL_64_fastpath+0x12/0x6f

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-12-19 12:34:43 +11:00
Roman Kubiak
81bd0d5629 Smack: type confusion in smak sendmsg() handler
Smack security handler for sendmsg() syscall
is vulnerable to type confusion issue what
can allow to privilege escalation into root
or cause denial of service.

A malicious attacker can create socket of one
type for example AF_UNIX and pass is into
sendmsg() function ensuring that this is
AF_INET socket.

Remedy
Do not trust user supplied data.
Proposed fix below.

Signed-off-by: Roman Kubiak <r.kubiak@samsung.com>
Signed-off-by: Mateusz Fruba <m.fruba@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-12-17 10:21:56 -08:00
Paul Gortmaker
92cc916638 security/integrity: make ima/ima_mok.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

ima/Kconfig:config IMA_MOK_KEYRING
ima/Kconfig: bool "Create IMA machine owner keys (MOK) and blacklist keyrings"

...meaning that it currently is not being built as a module by anyone.

Lets remove the couple of traces of modularity so that when reading the
driver there is no doubt it really is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-ima-devel@lists.sourceforge.net
Cc: linux-ima-user@lists.sourceforge.net
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 10:01:43 -05:00
Mimi Zohar
6ad6afa146 ima: update appraise flags after policy update completes
While creating a temporary list of new rules, the ima_appraise flag is
updated, but not reverted on failure to append the new rules to the
existing policy.  This patch defines temp_ima_appraise flag.  Only when
the new rules are appended to the policy is the flag updated.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Petko Manolov <petkan@mip-labs.com>
2015-12-15 10:01:43 -05:00
Mimi Zohar
501f1bde66 IMA: prevent keys on the .ima_blacklist from being removed
Set the KEY_FLAGS_KEEP on the .ima_blacklist to prevent userspace
from removing keys from the keyring.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 10:01:43 -05:00
Mimi Zohar
d3600bcf9d KEYS: prevent keys from being removed from specified keyrings
Userspace should not be allowed to remove keys from certain keyrings
(eg. blacklist), though the keys themselves can expire.

This patch defines a new key flag named KEY_FLAG_KEEP to prevent
userspace from being able to unlink, revoke, invalidate or timed
out a key on a keyring.  When this flag is set on the keyring, all
keys subsequently added are flagged.

In addition, when this flag is set, the keyring itself can not be
cleared.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
2015-12-15 10:01:43 -05:00
Petko Manolov
80eae209d6 IMA: allow reading back the current IMA policy
It is often useful to be able to read back the IMA policy.  It is
even more important after introducing CONFIG_IMA_WRITE_POLICY.
This option allows the root user to see the current policy rules.

Signed-off-by: Zbigniew Jasinski <z.jasinski@samsung.com>
Signed-off-by: Petko Manolov <petkan@mip-labs.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 10:01:43 -05:00
Petko Manolov
41c89b64d7 IMA: create machine owner and blacklist keyrings
This option creates IMA MOK and blacklist keyrings.  IMA MOK is an
intermediate keyring that sits between .system and .ima keyrings,
effectively forming a simple CA hierarchy.  To successfully import a key
into .ima_mok it must be signed by a key which CA is in .system keyring.
On turn any key that needs to go in .ima keyring must be signed by CA in
either .system or .ima_mok keyrings. IMA MOK is empty at kernel boot.

IMA blacklist keyring contains all revoked IMA keys.  It is consulted
before any other keyring.  If the search is successful the requested
operation is rejected and error is returned to the caller.

Signed-off-by: Petko Manolov <petkan@mip-labs.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 10:01:43 -05:00
Petko Manolov
38d859f991 IMA: policy can now be updated multiple times
The new rules get appended to the original policy, forming a queue.
The new rules are first added to a temporary list, which on error
get released without disturbing the normal IMA operations.  On
success both lists (the current policy and the new rules) are spliced.

IMA policy reads are many orders of magnitude more numerous compared to
writes, the match code is RCU protected.  The updater side also does
list splice in RCU manner.

Signed-off-by: Petko Manolov <petkan@mip-labs.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 10:01:43 -05:00
Arnd Bergmann
05d3884b1e evm: EVM_LOAD_X509 depends on EVM
The newly added EVM_LOAD_X509 code can be configured even if
CONFIG_EVM is disabled, but that causes a link error:

security/built-in.o: In function `integrity_load_keys':
digsig_asymmetric.c:(.init.text+0x400): undefined reference to `evm_load_x509'

This adds a Kconfig dependency to ensure it is only enabled when
CONFIG_EVM is set as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 2ce523eb89 ("evm: load x509 certificate from the kernel")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 09:57:21 -05:00
Dmitry Kasatkin
523b74b16b evm: reset EVM status when file attributes change
The EVM verification status is cached in iint->evm_status and if it
was successful, never re-verified again when IMA passes the 'iint' to
evm_verifyxattr().

When file attributes or extended attributes change, we may wish to
re-verify EVM integrity as well.  For example, after setting a digital
signature we may need to re-verify the signature and update the
iint->flags that there is an EVM signature.

This patch enables that by resetting evm_status to INTEGRITY_UKNOWN
state.

Changes in v2:
* Flag setting moved to EVM layer

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 09:56:57 -05:00
Dmitry Kasatkin
7626676320 evm: provide a function to set the EVM key from the kernel
A crypto HW kernel module can possibly initialize the EVM key from the
kernel __init code to enable EVM before calling the 'init' process.
This patch provides a function evm_set_key() to set the EVM key
directly without using the KEY subsystem.

Changes in v4:
* kernel-doc style for evm_set_key

Changes in v3:
* error reporting moved to evm_set_key
* EVM_INIT_HMAC moved to evm_set_key
* added bitop to prevent key setting race

Changes in v2:
* use size_t for key size instead of signed int
* provide EVM_MAX_KEY_SIZE macro in <linux/evm.h>
* provide EVM_MIN_KEY_SIZE macro in <linux/evm.h>

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 08:53:36 -05:00
Dmitry Kasatkin
26ddabfe96 evm: enable EVM when X509 certificate is loaded
In order to enable EVM before starting the 'init' process,
evm_initialized needs to be non-zero.  Previously non-zero indicated
that the HMAC key was loaded.  When EVM loads the X509 before calling
'init', with this patch it is now possible to enable EVM to start
signature based verification.

This patch defines bits to enable EVM if a key of any type is loaded.

Changes in v3:
* print error message if key is not set

Changes in v2:
* EVM_STATE_KEY_SET replaced by EVM_INIT_HMAC
* EVM_STATE_X509_SET replaced by EVM_INIT_X509

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 08:50:48 -05:00
Dmitry Kasatkin
2ce523eb89 evm: load an x509 certificate from the kernel
This patch defines a configuration option and the evm_load_x509() hook
to load an X509 certificate onto the EVM trusted kernel keyring.

Changes in v4:
* Patch description updated

Changes in v3:
* Removed EVM_X509_PATH definition. CONFIG_EVM_X509_PATH is used
  directly.

Changes in v2:
* default key patch changed to /etc/keys

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-12-15 08:31:19 -05:00
Andreas Gruenbacher
c4803c497f nfs: Move call to security_inode_listsecurity into nfs_listxattr
Add a nfs_listxattr operation.  Move the call to security_inode_listsecurity
from list operation of the "security.*" xattr handler to nfs_listxattr.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-13 19:45:47 -05:00
Casey Schaufler
79be093500 Smack: File receive for sockets
The existing file receive hook checks for access on
the file inode even for UDS. This is not right, as
the inode is not used by Smack to make access checks
for sockets. This change checks for an appropriate
access relationship between the receiving (current)
process and the socket. If the process can't write
to the socket's send label or the socket's receive
label can't write to the process fail.

This will allow the legitimate cases, where the
socket sender and socket receiver can freely communicate.
Only strangly set socket labels should cause a problem.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-12-09 16:10:55 -08:00
James Morris
6e37592900 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into for-linus2 2015-11-26 15:04:19 +11:00
David Howells
096fe9eaea KEYS: Fix handling of stored error in a negatively instantiated user key
If a user key gets negatively instantiated, an error code is cached in the
payload area.  A negatively instantiated key may be then be positively
instantiated by updating it with valid data.  However, the ->update key
type method must be aware that the error code may be there.

The following may be used to trigger the bug in the user key type:

    keyctl request2 user user "" @u
    keyctl add user user "a" @u

which manifests itself as:

	BUG: unable to handle kernel paging request at 00000000ffffff8a
	IP: [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
	PGD 7cc30067 PUD 0
	Oops: 0002 [#1] SMP
	Modules linked in:
	CPU: 3 PID: 2644 Comm: a.out Not tainted 4.3.0+ #49
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
	task: ffff88003ddea700 ti: ffff88003dd88000 task.ti: ffff88003dd88000
	RIP: 0010:[<ffffffff810a376f>]  [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280
	 [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
	RSP: 0018:ffff88003dd8bdb0  EFLAGS: 00010246
	RAX: 00000000ffffff82 RBX: 0000000000000000 RCX: 0000000000000001
	RDX: ffffffff81e3fe40 RSI: 0000000000000000 RDI: 00000000ffffff82
	RBP: ffff88003dd8bde0 R08: ffff88007d2d2da0 R09: 0000000000000000
	R10: 0000000000000000 R11: ffff88003e8073c0 R12: 00000000ffffff82
	R13: ffff88003dd8be68 R14: ffff88007d027600 R15: ffff88003ddea700
	FS:  0000000000b92880(0063) GS:ffff88007fd00000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
	CR2: 00000000ffffff8a CR3: 000000007cc5f000 CR4: 00000000000006e0
	Stack:
	 ffff88003dd8bdf0 ffffffff81160a8a 0000000000000000 00000000ffffff82
	 ffff88003dd8be68 ffff88007d027600 ffff88003dd8bdf0 ffffffff810a39e5
	 ffff88003dd8be20 ffffffff812a31ab ffff88007d027600 ffff88007d027620
	Call Trace:
	 [<ffffffff810a39e5>] kfree_call_rcu+0x15/0x20 kernel/rcu/tree.c:3136
	 [<ffffffff812a31ab>] user_update+0x8b/0xb0 security/keys/user_defined.c:129
	 [<     inline     >] __key_update security/keys/key.c:730
	 [<ffffffff8129e5c1>] key_create_or_update+0x291/0x440 security/keys/key.c:908
	 [<     inline     >] SYSC_add_key security/keys/keyctl.c:125
	 [<ffffffff8129fc21>] SyS_add_key+0x101/0x1e0 security/keys/keyctl.c:60
	 [<ffffffff8185f617>] entry_SYSCALL_64_fastpath+0x12/0x6a arch/x86/entry/entry_64.S:185

Note the error code (-ENOKEY) in EDX.

A similar bug can be tripped by:

    keyctl request2 trusted user "" @u
    keyctl add trusted user "a" @u

This should also affect encrypted keys - but that has to be correctly
parameterised or it will fail with EINVAL before getting to the bit that
will crashes.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-11-25 14:19:47 +11:00
Stephen Smalley
f3bef67992 selinux: fix bug in conditional rules handling
commit fa1aa143ac ("selinux: extended permissions for ioctls")
introduced a bug into the handling of conditional rules, skipping the
processing entirely when the caller does not provide an extended
permissions (xperms) structure.  Access checks from userspace using
/sys/fs/selinux/access do not include such a structure since that
interface does not presently expose extended permission information.
As a result, conditional rules were being ignored entirely on userspace
access requests, producing denials when access was allowed by
conditional rules in the policy.  Fix the bug by only skipping
computation of extended permissions in this situation, not the entire
conditional rules processing.

Reported-by: Laurent Bigonville <bigon@debian.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: fixed long lines in patch description]
Cc: stable@vger.kernel.org # 4.3
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-11-24 13:44:32 -05:00
Dmitry Kasatkin
f4dc37785e integrity: define '.evm' as a builtin 'trusted' keyring
Require all keys added to the EVM keyring be signed by an
existing trusted key on the system trusted keyring.

This patch also switches IMA to use integrity_init_keyring().

Changes in v3:
* Added 'init_keyring' config based variable to skip initializing
  keyring instead of using  __integrity_init_keyring() wrapper.
* Added dependency back to CONFIG_IMA_TRUSTED_KEYRING

Changes in v2:
* Replace CONFIG_EVM_TRUSTED_KEYRING with IMA and EVM common
  CONFIG_INTEGRITY_TRUSTED_KEYRING configuration option
* Deprecate CONFIG_IMA_TRUSTED_KEYRING but keep it for config
  file compatibility. (Mimi Zohar)

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-11-23 14:30:02 -05:00
Linus Torvalds
2df4ee78d0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix null deref in xt_TEE netfilter module, from Eric Dumazet.

 2) Several spots need to get to the original listner for SYN-ACK
    packets, most spots got this ok but some were not.  Whilst covering
    the remaining cases, create a helper to do this.  From Eric Dumazet.

 3) Missiing check of return value from alloc_netdev() in CAIF SPI code,
    from Rasmus Villemoes.

 4) Don't sleep while != TASK_RUNNING in macvtap, from Vlad Yasevich.

 5) Use after free in mvneta driver, from Justin Maggard.

 6) Fix race on dst->flags access in dst_release(), from Eric Dumazet.

 7) Add missing ZLIB_INFLATE dependency for new qed driver.  From Arnd
    Bergmann.

 8) Fix multicast getsockopt deadlock, from WANG Cong.

 9) Fix deadlock in btusb, from Kuba Pawlak.

10) Some ipv6_add_dev() failure paths were not cleaning up the SNMP6
    counter state.  From Sabrina Dubroca.

11) Fix packet_bind() race, which can cause lost notifications, from
    Francesco Ruggeri.

12) Fix MAC restoration in qlcnic driver during bonding mode changes,
    from Jarod Wilson.

13) Revert bridging forward delay change which broke libvirt and other
    userspace things, from Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
  Revert "bridge: Allow forward delay to be cfgd when STP enabled"
  bpf_trace: Make dependent on PERF_EVENTS
  qed: select ZLIB_INFLATE
  net: fix a race in dst_release()
  net: mvneta: Fix memory use after free.
  net: Documentation: Fix default value tcp_limit_output_bytes
  macvtap: Resolve possible __might_sleep warning in macvtap_do_read()
  mvneta: add FIXED_PHY dependency
  net: caif: check return value of alloc_netdev
  net: hisilicon: NET_VENDOR_HISILICON should depend on HAS_DMA
  drivers: net: xgene: fix RGMII 10/100Mb mode
  netfilter: nft_meta: use skb_to_full_sk() helper
  net_sched: em_meta: use skb_to_full_sk() helper
  sched: cls_flow: use skb_to_full_sk() helper
  netfilter: xt_owner: use skb_to_full_sk() helper
  smack: use skb_to_full_sk() helper
  net: add skb_to_full_sk() helper and use it in selinux_netlbl_skbuff_setsid()
  bpf: doc: correct arch list for supported eBPF JIT
  dwc_eth_qos: Delete an unnecessary check before the function call "of_node_put"
  bonding: fix panic on non-ARPHRD_ETHER enslave failure
  ...
2015-11-10 18:11:41 -08:00
Eric Dumazet
8827d90e29 smack: use skb_to_full_sk() helper
This module wants to access sk->sk_security, which is not
available for request sockets.

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-08 20:56:38 -05:00
Eric Dumazet
54abc686c2 net: add skb_to_full_sk() helper and use it in selinux_netlbl_skbuff_setsid()
Generalize selinux_skb_sk() added in commit 212cd08953
("selinux: fix random read in selinux_ip_postroute_compat()")
so that we can use it other contexts.

Use it right away in selinux_netlbl_skbuff_setsid()

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-08 20:56:38 -05:00
Mel Gorman
71baba4b92 mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM
__GFP_WAIT was used to signal that the caller was in atomic context and
could not sleep.  Now it is possible to distinguish between true atomic
context and callers that are not willing to sleep.  The latter should
clear __GFP_DIRECT_RECLAIM so kswapd will still wake.  As clearing
__GFP_WAIT behaves differently, there is a risk that people will clear the
wrong flags.  This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
indicate what it does -- setting it allows all reclaim activity, clearing
them prevents it.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Linus Torvalds
1873499e13 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem update from James Morris:
 "This is mostly maintenance updates across the subsystem, with a
  notable update for TPM 2.0, and addition of Jarkko Sakkinen as a
  maintainer of that"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (40 commits)
  apparmor: clarify CRYPTO dependency
  selinux: Use a kmem_cache for allocation struct file_security_struct
  selinux: ioctl_has_perm should be static
  selinux: use sprintf return value
  selinux: use kstrdup() in security_get_bools()
  selinux: use kmemdup in security_sid_to_context_core()
  selinux: remove pointless cast in selinux_inode_setsecurity()
  selinux: introduce security_context_str_to_sid
  selinux: do not check open perm on ftruncate call
  selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default
  KEYS: Merge the type-specific data with the payload data
  KEYS: Provide a script to extract a module signature
  KEYS: Provide a script to extract the sys cert list from a vmlinux file
  keys: Be more consistent in selection of union members used
  certs: add .gitignore to stop git nagging about x509_certificate_list
  KEYS: use kvfree() in add_key
  Smack: limited capability for changing process label
  TPM: remove unnecessary little endian conversion
  vTPM: support little endian guests
  char: Drop owner assignment from i2c_driver
  ...
2015-11-05 15:32:38 -08:00
Eric Dumazet
212cd08953 selinux: fix random read in selinux_ip_postroute_compat()
In commit e446f9dfe1 ("net: synack packets can be attached to request
sockets"), I missed one remaining case of invalid skb->sk->sk_security
access.

Dmitry Vyukov got a KASan report pointing to it.

Add selinux_skb_sk() helper that is responsible to get back to the
listener if skb is attached to a request socket, instead of
duplicating the logic.

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-05 16:45:51 -05:00
David S. Miller
b75ec3af27 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-01 00:15:30 -04:00
James Morris
ba94c3ff20 Merge tag 'keys-next-20151021' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2015-10-23 12:07:52 +11:00
James Morris
a47c7a6c8a Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into next 2015-10-22 11:17:50 +11:00
Arnd Bergmann
083c1290ca apparmor: clarify CRYPTO dependency
The crypto framework can be built as a loadable module, but the
apparmor hash code can only be built-in, which then causes a
link error:

security/built-in.o: In function `aa_calc_profile_hash':
integrity_audit.c:(.text+0x21610): undefined reference to `crypto_shash_update'
security/built-in.o: In function `init_profile_hash':
integrity_audit.c:(.init.text+0xb4c): undefined reference to `crypto_alloc_shash'

This changes Apparmor to use 'select CRYPTO' like a lot of other
subsystems do.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-10-22 11:11:28 +11:00
Sangwoo
63205654c0 selinux: Use a kmem_cache for allocation struct file_security_struct
The size of struct file_security_struct is 16byte at my setup.
But, the real allocation size for per each file_security_struct
is 64bytes in my setup that kmalloc min size is 64bytes
because ARCH_DMA_MINALIGN is 64.

This allocation is called every times at file allocation(alloc_file()).
So, the total slack memory size(allocated size - request size)
is increased exponentially.

E.g) Min Kmalloc Size : 64bytes, Unit : bytes
      Allocated Size | Request Size | Slack Size | Allocation Count
    ---------------------------------------------------------------
         770048      |    192512    |   577536   |      12032

At the result, this change reduce memory usage 42bytes per each
file_security_struct

Signed-off-by: Sangwoo <sangwoo2.park@lge.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: removed extra subject prefix]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:30 -04:00
Geliang Tang
1d2a168a08 selinux: ioctl_has_perm should be static
Fixes the following sparse warning:

 security/selinux/hooks.c:3242:5: warning: symbol 'ioctl_has_perm' was
 not declared. Should it be static?

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:27 -04:00
Rasmus Villemoes
9529c7886c selinux: use sprintf return value
sprintf returns the number of characters printed (excluding '\0'), so
we can use that and avoid duplicating the length computation.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:27 -04:00
Rasmus Villemoes
21b76f199e selinux: use kstrdup() in security_get_bools()
This is much simpler.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:26 -04:00
Rasmus Villemoes
aa736c36db selinux: use kmemdup in security_sid_to_context_core()
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:26 -04:00
Rasmus Villemoes
20ba96aeeb selinux: remove pointless cast in selinux_inode_setsecurity()
security_context_to_sid() expects a const char* argument, so there's
no point in casting away the const qualifier of value.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:26 -04:00
Rasmus Villemoes
44be2f65d9 selinux: introduce security_context_str_to_sid
There seems to be a little confusion as to whether the scontext_len
parameter of security_context_to_sid() includes the nul-byte or
not. Reading security_context_to_sid_core(), it seems that the
expectation is that it does not (both the string copying and the test
for scontext_len being zero hint at that).

Introduce the helper security_context_str_to_sid() to do the strlen()
call and fix all callers.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:25 -04:00
Jeff Vander Stoep
44d37ad360 selinux: do not check open perm on ftruncate call
Use the ATTR_FILE attribute to distinguish between truncate()
and ftruncate() system calls. The two other cases where
do_truncate is called with a filp (and therefore ATTR_FILE is set)
are for coredump files and for open(O_TRUNC). In both of those cases
the open permission has already been checked during file open and
therefore does not need to be repeated.

Commit 95dbf73931 ("SELinux: check OPEN on truncate calls")
fixed a major issue where domains were allowed to truncate files
without the open permission. However, it introduced a new bug where
a domain with the write permission can no longer ftruncate files
without the open permission, even when they receive an already open
file.

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:25 -04:00
Paul Moore
2a35d196c1 selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default
Change the SELinux checkreqprot default value to 0 so that SELinux
performs access control checking on the actual memory protections
used by the kernel and not those requested by the application.

Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-10-21 17:44:25 -04:00
David Howells
146aa8b145 KEYS: Merge the type-specific data with the payload data
Merge the type-specific data with the payload data into one four-word chunk
as it seems pointless to keep them separate.

Use user_key_payload() for accessing the payloads of overloaded
user-defined keys.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cifs@vger.kernel.org
cc: ecryptfs@vger.kernel.org
cc: linux-ext4@vger.kernel.org
cc: linux-f2fs-devel@lists.sourceforge.net
cc: linux-nfs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: linux-ima-devel@lists.sourceforge.net
2015-10-21 15:18:36 +01:00
Insu Yun
27720e75a7 keys: Be more consistent in selection of union members used
key->description and key->index_key.description are same because
they are unioned. But, for readability, using same name for
duplication and validation seems better.

Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2015-10-21 15:18:35 +01:00
Geliang Tang
d0e0eba043 KEYS: use kvfree() in add_key
There is no need to make a flag to tell that this memory is allocated by
kmalloc or vmalloc. Just use kvfree to free the memory.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2015-10-21 15:18:35 +01:00
James Morris
09302fd19e Merge branch 'smack-for-4.4' of https://github.com/cschaufler/smack-next into next 2015-10-21 10:49:29 +11:00
James Morris
fbf9826589 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2015-10-20 12:34:04 +11:00
Zbigniew Jasinski
38416e5393 Smack: limited capability for changing process label
This feature introduces new kernel interface:

- <smack_fs>/relabel-self - for setting transition labels list

This list is used to control smack label transition mechanism.
List is set by, and per process. Process can transit to new label only if
label is on the list. Only process with CAP_MAC_ADMIN capability can add
labels to this list. With this list, process can change it's label without
CAP_MAC_ADMIN but only once. After label changing, list is unset.

Changes in v2:
* use list_for_each_entry instead of _rcu during label write
* added missing description in security/Smack.txt

Changes in v3:
* squashed into one commit

Changes in v4:
* switch from global list to per-task list
* since the per-task list is accessed only by the task itself
  there is no need to use synchronization mechanisms on it

Changes in v5:
* change smackfs interface of relabel-self to the one used for onlycap
  multiple labels are accepted, separated by space, which
  replace the previous list upon write

Signed-off-by: Zbigniew Jasinski <z.jasinski@samsung.com>
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-10-19 12:06:47 -07:00
David Howells
911b79cde9 KEYS: Don't permit request_key() to construct a new keyring
If request_key() is used to find a keyring, only do the search part - don't
do the construction part if the keyring was not found by the search.  We
don't really want keyrings in the negative instantiated state since the
rejected/negative instantiation error value in the payload is unioned with
keyring metadata.

Now the kernel gives an error:

	request_key("keyring", "#selinux,bdekeyring", "keyring", KEY_SPEC_USER_SESSION_KEYRING) = -1 EPERM (Operation not permitted)

Signed-off-by: David Howells <dhowells@redhat.com>
2015-10-19 11:24:51 +01:00
Jarkko Sakkinen
0fe5480303 keys, trusted: seal/unseal with TPM 2.0 chips
Call tpm_seal_trusted() and tpm_unseal_trusted() for TPM 2.0 chips.
We require explicit 'keyhandle=' option because there's no a fixed
storage root key inside TPM2 chips.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Andreas Fuchs <andreas.fuchs@sit.fraunhofer.de>
Tested-by: Mimi Zohar <zohar@linux.vnet.ibm.com> (on TPM 1.2)
Tested-by: Chris J Arges <chris.j.arges@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Kevin Strasser <kevin.strasser@intel.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
2015-10-19 01:01:22 +02:00
Jarkko Sakkinen
fe351e8d4e keys, trusted: move struct trusted_key_options to trusted-type.h
Moved struct trusted_key_options to trustes-type.h so that the fields
can be accessed from drivers/char/tpm.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
2015-10-19 01:01:21 +02:00
Pablo Neira Ayuso
f0a0a978b6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
This merge resolves conflicts with 75aec9df3a ("bridge: Remove
br_nf_push_frag_xmit_sk") as part of Eric Biederman's effort to improve
netns support in the network stack that reached upstream via David's
net-next tree.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Conflicts:
	net/bridge/br_netfilter_hooks.c
2015-10-17 14:28:03 +02:00
Florian Westphal
2ffbceb2b0 netfilter: remove hook owner refcounting
since commit 8405a8fff3 ("netfilter: nf_qeueue: Drop queue entries on
nf_unregister_hook") all pending queued entries are discarded.

So we can simply remove all of the owner handling -- when module is
removed it also needs to unregister all its hooks.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 18:21:39 +02:00
David Howells
f05819df10 KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring
The following sequence of commands:

    i=`keyctl add user a a @s`
    keyctl request2 keyring foo bar @t
    keyctl unlink $i @s

tries to invoke an upcall to instantiate a keyring if one doesn't already
exist by that name within the user's keyring set.  However, if the upcall
fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
other error code.  When the key is garbage collected, the key destroy
function is called unconditionally and keyring_destroy() uses list_empty()
on keyring->type_data.link - which is in a union with reject_error.
Subsequently, the kernel tries to unlink the keyring from the keyring names
list - which oopses like this:

	BUG: unable to handle kernel paging request at 00000000ffffff8a
	IP: [<ffffffff8126e051>] keyring_destroy+0x3d/0x88
	...
	Workqueue: events key_garbage_collector
	...
	RIP: 0010:[<ffffffff8126e051>] keyring_destroy+0x3d/0x88
	RSP: 0018:ffff88003e2f3d30  EFLAGS: 00010203
	RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
	RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
	RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
	R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
	R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
	...
	CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
	...
	Call Trace:
	 [<ffffffff8126c756>] key_gc_unused_keys.constprop.1+0x5d/0x10f
	 [<ffffffff8126ca71>] key_garbage_collector+0x1fa/0x351
	 [<ffffffff8105ec9b>] process_one_work+0x28e/0x547
	 [<ffffffff8105fd17>] worker_thread+0x26e/0x361
	 [<ffffffff8105faa9>] ? rescuer_thread+0x2a8/0x2a8
	 [<ffffffff810648ad>] kthread+0xf3/0xfb
	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
	 [<ffffffff815f2ccf>] ret_from_fork+0x3f/0x70
	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2

Note the value in RAX.  This is a 32-bit representation of -ENOKEY.

The solution is to only call ->destroy() if the key was successfully
instantiated.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
2015-10-15 17:21:37 +01:00
Eric Dumazet
e446f9dfe1 net: synack packets can be attached to request sockets
selinux needs few changes to accommodate fact that SYNACK messages
can be attached to a request socket, lacking sk_security pointer

(Only syncookies are still attached to a TCP_LISTEN socket)

Adds a new sk_listener() helper, and use it in selinux and sch_fq

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported by: kernel test robot <ying.huang@linux.intel.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:05:06 -07:00
Roman Kubiak
8da4aba5bf Smack: pipefs fix in smack_d_instantiate
This fix writes the task label when
smack_d_instantiate is called, before the
label of the superblock was written on the
pipe's inode.

Signed-off-by: Roman Kubiak <r.kubiak@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-10-09 15:13:41 -07:00
José Bollo
d21b7b049c Smack: Minor initialisation improvement
This change has two goals:
 - delay the setting of 'smack_enabled' until
   it will be really effective
 - ensure that smackfs is valid only if 'smack_enabled'
   is set (it is already the case in smack_netfilter.c)

Signed-off-by: José Bollo <jose.bollo@iot.bzh>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-10-09 15:13:24 -07:00
Geliang Tang
8b549ef42a smack: smk_ipv6_port_list should be static
Fixes the following sparse warning:

 security/smack/smack_lsm.c:55:1: warning: symbol 'smk_ipv6_port_list'
 was not declared. Should it be static?

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-10-09 15:13:08 -07:00
Lukasz Pawelczyk
5f2bfe2f1d Smack: fix a NULL dereference in wrong smack_import_entry() usage
'commit e774ad683f ("smack: pass error code through pointers")'
made this function return proper error codes instead of NULL. Reflect that.

This is a fix for a NULL dereference introduced in
'commit 21abb1ec41 ("Smack: IPv6 host labeling")'

echo "$SOME_IPV6_ADDR \"test" > /smack/ipv6host
  (this should return EINVAL, it doesn't)
cat /smack/ipv6host
  (derefences 0x000a)

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-10-09 15:12:46 -07:00
Dmitry Kasatkin
72e1eed8ab integrity: prevent loading untrusted certificates on the IMA trusted keyring
If IMA_LOAD_X509 is enabled, either directly or indirectly via
IMA_APPRAISE_SIGNED_INIT, certificates are loaded onto the IMA
trusted keyring by the kernel via key_create_or_update(). When
the KEY_ALLOC_TRUSTED flag is provided, certificates are loaded
without first verifying the certificate is properly signed by a
trusted key on the system keyring.  This patch removes the
KEY_ALLOC_TRUSTED flag.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Cc:  <stable@vger.kernel.org> # 3.19+
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-10-09 15:31:18 -04:00
David S. Miller
f6d3125fa3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/dsa/slave.c

net/dsa/slave.c simply had overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-02 07:21:25 -07:00
David S. Miller
4963ed48f2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv4/arp.c

The net/ipv4/arp.c conflict was one commit adding a new
local variable while another commit was deleting one.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-26 16:08:27 -07:00
David Howells
94c4554ba0 KEYS: Fix race between key destruction and finding a keyring by name
There appears to be a race between:

 (1) key_gc_unused_keys() which frees key->security and then calls
     keyring_destroy() to unlink the name from the name list

 (2) find_keyring_by_name() which calls key_permission(), thus accessing
     key->security, on a key before checking to see whether the key usage is 0
     (ie. the key is dead and might be cleaned up).

Fix this by calling ->destroy() before cleaning up the core key data -
including key->security.

Reported-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2015-09-25 16:30:08 +01:00
Eric W. Biederman
06198b34a3 netfilter: Pass priv instead of nf_hook_ops to netfilter hooks
Only pass the void *priv parameter out of the nf_hook_ops.  That is
all any of the functions are interested now, and by limiting what is
passed it becomes simpler to change implementation details.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18 22:00:16 +02:00
Linus Torvalds
1b3dfde386 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU fix from Ingo Molnar:
 "Fix a false positive warning"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  security/device_cgroup: Fix RCU_LOCKDEP_WARN() condition
2015-09-17 08:44:27 -07:00
Ingo Molnar
31409c9764 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent
Pull RCU fix from Paul E. McKenney, fixing an inverted RCU_LOCKDEP_WARN() condition.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-12 10:26:24 +02:00
Kirill A. Shutemov
7cbea8dc01 mm: mark most vm_operations_struct const
With two exceptions (drm/qxl and drm/radeon) all vm_operations_struct
structs should be constant.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Linus Torvalds
b793c005ce Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - PKCS#7 support added to support signed kexec, also utilized for
     module signing.  See comments in 3f1e1bea.

     ** NOTE: this requires linking against the OpenSSL library, which
        must be installed, e.g.  the openssl-devel on Fedora **

   - Smack
      - add IPv6 host labeling; ignore labels on kernel threads
      - support smack labeling mounts which use binary mount data

   - SELinux:
      - add ioctl whitelisting (see
        http://kernsec.org/files/lss2015/vanderstoep.pdf)
      - fix mprotect PROT_EXEC regression caused by mm change

   - Seccomp:
      - add ptrace options for suspend/resume"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (57 commits)
  PKCS#7: Add OIDs for sha224, sha284 and sha512 hash algos and use them
  Documentation/Changes: Now need OpenSSL devel packages for module signing
  scripts: add extract-cert and sign-file to .gitignore
  modsign: Handle signing key in source tree
  modsign: Use if_changed rule for extracting cert from module signing key
  Move certificate handling to its own directory
  sign-file: Fix warning about BIO_reset() return value
  PKCS#7: Add MODULE_LICENSE() to test module
  Smack - Fix build error with bringup unconfigured
  sign-file: Document dependency on OpenSSL devel libraries
  PKCS#7: Appropriately restrict authenticated attributes and content type
  KEYS: Add a name for PKEY_ID_PKCS7
  PKCS#7: Improve and export the X.509 ASN.1 time object decoder
  modsign: Use extract-cert to process CONFIG_SYSTEM_TRUSTED_KEYS
  extract-cert: Cope with multiple X.509 certificates in a single file
  sign-file: Generate CMS message as signature instead of PKCS#7
  PKCS#7: Support CMS messages also [RFC5652]
  X.509: Change recorded SKID & AKID to not include Subject or Issuer
  PKCS#7: Check content type and versions
  MAINTAINERS: The keyrings mailing list has moved
  ...
2015-09-08 12:41:25 -07:00
Kees Cook
a068acf2ee fs: create and use seq_show_option for escaping
Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g.  new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else.  This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.

Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
of "sudo" is something more sneaky:

  $ BASE="ovl"
  $ MNT="$BASE/mnt"
  $ LOW="$BASE/lower"
  $ UP="$BASE/upper"
  $ WORK="$BASE/work/ 0 0
  none /proc fuse.pwn user_id=1000"
  $ mkdir -p "$LOW" "$UP" "$WORK"
  $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
  $ cat /proc/mounts
  none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
  none /proc fuse.pwn user_id=1000 0 0
  $ fusermount -u /proc
  $ cat /proc/mounts
  cat: /proc/mounts: No such file or directory

This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed.  Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.

[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andy Lutomirski
746bf6d642 capabilities: add a securebit to disable PR_CAP_AMBIENT_RAISE
Per Andrew Morgan's request, add a securebit to allow admins to disable
PR_CAP_AMBIENT_RAISE.  This securebit will prevent processes from adding
capabilities to their ambient set.

For simplicity, this disables PR_CAP_AMBIENT_RAISE entirely rather than
just disabling setting previously cleared bits.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andy Lutomirski
58319057b7 capabilities: ambient capabilities
Credit where credit is due: this idea comes from Christoph Lameter with
a lot of valuable input from Serge Hallyn.  This patch is heavily based
on Christoph's patch.

===== The status quo =====

On Linux, there are a number of capabilities defined by the kernel.  To
perform various privileged tasks, processes can wield capabilities that
they hold.

Each task has four capability masks: effective (pE), permitted (pP),
inheritable (pI), and a bounding set (X).  When the kernel checks for a
capability, it checks pE.  The other capability masks serve to modify
what capabilities can be in pE.

Any task can remove capabilities from pE, pP, or pI at any time.  If a
task has a capability in pP, it can add that capability to pE and/or pI.
If a task has CAP_SETPCAP, then it can add any capability to pI, and it
can remove capabilities from X.

Tasks are not the only things that can have capabilities; files can also
have capabilities.  A file can have no capabilty information at all [1].
If a file has capability information, then it has a permitted mask (fP)
and an inheritable mask (fI) as well as a single effective bit (fE) [2].
File capabilities modify the capabilities of tasks that execve(2) them.

A task that successfully calls execve has its capabilities modified for
the file ultimately being excecuted (i.e.  the binary itself if that
binary is ELF or for the interpreter if the binary is a script.) [3] In
the capability evolution rules, for each mask Z, pZ represents the old
value and pZ' represents the new value.  The rules are:

  pP' = (X & fP) | (pI & fI)
  pI' = pI
  pE' = (fE ? pP' : 0)
  X is unchanged

For setuid binaries, fP, fI, and fE are modified by a moderately
complicated set of rules that emulate POSIX behavior.  Similarly, if
euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
(primary, fP and fI usually end up being the full set).  For nonroot
users executing binaries with neither setuid nor file caps, fI and fP
are empty and fE is false.

As an extra complication, if you execute a process as nonroot and fE is
set, then the "secure exec" rules are in effect: AT_SECURE gets set,
LD_PRELOAD doesn't work, etc.

This is rather messy.  We've learned that making any changes is
dangerous, though: if a new kernel version allows an unprivileged
program to change its security state in a way that persists cross
execution of a setuid program or a program with file caps, this
persistent state is surprisingly likely to allow setuid or file-capped
programs to be exploited for privilege escalation.

===== The problem =====

Capability inheritance is basically useless.

If you aren't root and you execute an ordinary binary, fI is zero, so
your capabilities have no effect whatsoever on pP'.  This means that you
can't usefully execute a helper process or a shell command with elevated
capabilities if you aren't root.

On current kernels, you can sort of work around this by setting fI to
the full set for most or all non-setuid executable files.  This causes
pP' = pI for nonroot, and inheritance works.  No one does this because
it's a PITA and it isn't even supported on most filesystems.

If you try this, you'll discover that every nonroot program ends up with
secure exec rules, breaking many things.

This is a problem that has bitten many people who have tried to use
capabilities for anything useful.

===== The proposed change =====

This patch adds a fifth capability mask called the ambient mask (pA).
pA does what most people expect pI to do.

pA obeys the invariant that no bit can ever be set in pA if it is not
set in both pP and pI.  Dropping a bit from pP or pI drops that bit from
pA.  This ensures that existing programs that try to drop capabilities
still do so, with a complication.  Because capability inheritance is so
broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
then calling execve effectively drops capabilities.  Therefore,
setresuid from root to nonroot conditionally clears pA unless
SECBIT_NO_SETUID_FIXUP is set.  Processes that don't like this can
re-add bits to pA afterwards.

The capability evolution rules are changed:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (X & fP) | (pI & fI) | pA'
  pI' = pI
  pE' = (fE ? pP' : pA')
  X is unchanged

If you are nonroot but you have a capability, you can add it to pA.  If
you do so, your children get that capability in pA, pP, and pE.  For
example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
automatically bind low-numbered ports.  Hallelujah!

Unprivileged users can create user namespaces, map themselves to a
nonzero uid, and create both privileged (relative to their namespace)
and unprivileged process trees.  This is currently more or less
impossible.  Hallelujah!

You cannot use pA to try to subvert a setuid, setgid, or file-capped
program: if you execute any such program, pA gets cleared and the
resulting evolution rules are unchanged by this patch.

Users with nonzero pA are unlikely to unintentionally leak that
capability.  If they run programs that try to drop privileges, dropping
privileges will still work.

It's worth noting that the degree of paranoia in this patch could
possibly be reduced without causing serious problems.  Specifically, if
we allowed pA to persist across executing non-pA-aware setuid binaries
and across setresuid, then, naively, the only capabilities that could
leak as a result would be the capabilities in pA, and any attacker
*already* has those capabilities.  This would make me nervous, though --
setuid binaries that tried to privilege-separate might fail to do so,
and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
unexpected side effects.  (Whether these unexpected side effects would
be exploitable is an open question.) I've therefore taken the more
paranoid route.  We can revisit this later.

An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
ambient capabilities.  I think that this would be annoying and would
make granting otherwise unprivileged users minor ambient capabilities
(CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
it is with this patch.

===== Footnotes =====

[1] Files that are missing the "security.capability" xattr or that have
unrecognized values for that xattr end up with has_cap set to false.
The code that does that appears to be complicated for no good reason.

[2] The libcap capability mask parsers and formatters are dangerously
misleading and the documentation is flat-out wrong.  fE is *not* a mask;
it's a single bit.  This has probably confused every single person who
has tried to use file capabilities.

[3] Linux very confusingly processes both the script and the interpreter
if applicable, for reasons that elude me.  The results from thinking
about a script's file capabilities and/or setuid bits are mostly
discarded.

Preliminary userspace code is here, but it needs updating:
https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2

Here is a test program that can be used to verify the functionality
(from Christoph):

/*
 * Test program for the ambient capabilities. This program spawns a shell
 * that allows running processes with a defined set of capabilities.
 *
 * (C) 2015 Christoph Lameter <cl@linux.com>
 * Released under: GPL v3 or later.
 *
 *
 * Compile using:
 *
 *	gcc -o ambient_test ambient_test.o -lcap-ng
 *
 * This program must have the following capabilities to run properly:
 * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
 *
 * A command to equip the binary with the right caps is:
 *
 *	setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
 *
 *
 * To get a shell with additional caps that can be inherited by other processes:
 *
 *	./ambient_test /bin/bash
 *
 *
 * Verifying that it works:
 *
 * From the bash spawed by ambient_test run
 *
 *	cat /proc/$$/status
 *
 * and have a look at the capabilities.
 */

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <cap-ng.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Definitions from the kernel header files. These are going to be removed
 * when the /usr/include files have these defined.
 */
#define PR_CAP_AMBIENT 47
#define PR_CAP_AMBIENT_IS_SET 1
#define PR_CAP_AMBIENT_RAISE 2
#define PR_CAP_AMBIENT_LOWER 3
#define PR_CAP_AMBIENT_CLEAR_ALL 4

static void set_ambient_cap(int cap)
{
	int rc;

	capng_get_caps_process();
	rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
	if (rc) {
		printf("Cannot add inheritable cap\n");
		exit(2);
	}
	capng_apply(CAPNG_SELECT_CAPS);

	/* Note the two 0s at the end. Kernel checks for these */
	if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
		perror("Cannot set cap");
		exit(1);
	}
}

int main(int argc, char **argv)
{
	int rc;

	set_ambient_cap(CAP_NET_RAW);
	set_ambient_cap(CAP_NET_ADMIN);
	set_ambient_cap(CAP_SYS_NICE);

	printf("Ambient_test forking shell\n");
	if (execv(argv[1], argv + 1))
		perror("Cannot exec");

	return 0;
}

Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Paul E. McKenney
dc3a04d551 security/device_cgroup: Fix RCU_LOCKDEP_WARN() condition
f78f5b90c4 ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")
introduced a bug by incorrectly inverting the condition when moving from
rcu_lockdep_assert() to RCU_LOCKDEP_WARN().  This commit therefore fixes
the inversion.

Reported-by: Felipe Balbi <balbi@ti.com>
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Tested-by: Josh Boyer <jwboyer@fedoraproject.org>
2015-09-03 18:13:10 -07:00
Linus Torvalds
73b6fa8e49 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace updates from Eric Biederman:
 "This finishes up the changes to ensure proc and sysfs do not start
  implementing executable files, as the there are application today that
  are only secure because such files do not exist.

  It akso fixes a long standing misfeature of /proc/<pid>/mountinfo that
  did not show the proper source for files bind mounted from
  /proc/<pid>/ns/*.

  It also straightens out the handling of clone flags related to user
  namespaces, fixing an unnecessary failure of unshare(CLONE_NEWUSER)
  when files such as /proc/<pid>/environ are read while <pid> is calling
  unshare.  This winds up fixing a minor bug in unshare flag handling
  that dates back to the first version of unshare in the kernel.

  Finally, this fixes a minor regression caused by the introduction of
  sysfs_create_mount_point, which broke someone's in house application,
  by restoring the size of /sys/fs/cgroup to 0 bytes.  Apparently that
  application uses the directory size to determine if a tmpfs is mounted
  on /sys/fs/cgroup.

  The bind mount escape fixes are present in Al Viros for-next branch.
  and I expect them to come from there.  The bind mount escape is the
  last of the user namespace related security bugs that I am aware of"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  fs: Set the size of empty dirs to 0.
  userns,pidns: Force thread group sharing, not signal handler sharing.
  unshare: Unsharing a thread does not require unsharing a vm
  nsfs: Add a show_path method to fix mountinfo
  mnt: fs_fully_visible enforce noexec and nosuid  if !SB_I_NOEXEC
  vfs: Commit to never having exectuables on proc and sysfs.
2015-09-01 16:13:25 -07:00
Linus Torvalds
7073bc6612 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this cycle are:

   - the combination of tree geometry-initialization simplifications and
     OS-jitter-reduction changes to expedited grace periods.  These two
     are stacked due to the large number of conflicts that would
     otherwise result.

   - privatize smp_mb__after_unlock_lock().

     This commit moves the definition of smp_mb__after_unlock_lock() to
     kernel/rcu/tree.h, in recognition of the fact that RCU is the only
     thing using this, that nothing else is likely to use it, and that
     it is likely to go away completely.

   - documentation updates.

   - torture-test updates.

   - misc fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  rcu,locking: Privatize smp_mb__after_unlock_lock()
  rcu: Silence lockdep false positive for expedited grace periods
  rcu: Don't disable CPU hotplug during OOM notifiers
  scripts: Make checkpatch.pl warn on expedited RCU grace periods
  rcu: Update MAINTAINERS entry
  rcu: Clarify CONFIG_RCU_EQS_DEBUG help text
  rcu: Fix backwards RCU_LOCKDEP_WARN() in synchronize_rcu_tasks()
  rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()
  rcu: Make rcu_is_watching() really notrace
  cpu: Wait for RCU grace periods concurrently
  rcu: Create a synchronize_rcu_mult()
  rcu: Fix obsolete priority-boosting comment
  rcu: Use WRITE_ONCE in RCU_INIT_POINTER
  rcu: Hide RCU_NOCB_CPU behind RCU_EXPERT
  rcu: Add RCU-sched flavors of get-state and cond-sync
  rcu: Add fastpath bypassing funnel locking
  rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQS
  rcu: Pull out wait_event*() condition into helper function
  documentation: Describe new expedited stall warnings
  rcu: Add stall warnings to synchronize_sched_expedited()
  ...
2015-08-31 18:12:07 -07:00
Jan Beulich
e308fd3bb2 LSM: restore certain default error codes
While in most cases commit b1d9e6b064 ("LSM: Switch to lists of hooks")
retained previous error returns, in three cases it altered them without
any explanation in the commit message. Restore all of them - in the
security_old_inode_init_security() case this led to reiserfs using
uninitialized data, sooner or later crashing the system (the only other
user of this function - ocfs2 - was unaffected afaict, since it passes
pre-initialized structures).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-08-26 09:46:50 +10:00
James Morris
3e5f206c00 Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2015-08-15 13:29:57 +10:00
James Morris
0e38c35815 Merge branch 'smack-for-4.3' of https://github.com/cschaufler/smack-next into next 2015-08-14 17:35:10 +10:00
Casey Schaufler
3d04c92403 Smack - Fix build error with bringup unconfigured
The changes for mounting binary filesystems was allied
improperly, with the list of tokens being in an ifdef that
it shouldn't have been. Fix that, and a couple style issues
that were bothering me.

Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-08-12 18:10:01 -07:00
Ingo Molnar
9b9412dc70 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU changes from Paul E. McKenney:

  - The combination of tree geometry-initialization simplifications
    and OS-jitter-reduction changes to expedited grace periods.
    These two are stacked due to the large number of conflicts
    that would otherwise result.

    [ With one addition, a temporary commit to silence a lockdep false
      positive. Additional changes to the expedited grace-period
      primitives (queued for 4.4) remove the cause of this false
      positive, and therefore include a revert of this temporary commit. ]

  - Documentation updates.

  - Torture-test updates.

  - Miscellaneous fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 12:12:12 +02:00
James Morris
5ab1657902 Merge branch 'smack-for-4.3' of https://github.com/cschaufler/smack-next into next 2015-08-11 11:18:53 +10:00
Roman Kubiak
41a2d57516 Kernel threads excluded from smack checks
Adds an ignore case for kernel tasks,
so that they can access all resources.

Since kernel worker threads are spawned with
floor label, they are severely restricted by
Smack policy. It is not an issue without onlycap,
as these processes also run with root,
so CAP_MAC_OVERRIDE kicks in. But with onlycap
turned on, there is no way to change the label
for these processes.

Signed-off-by: Roman Kubiak <r.kubiak@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-08-10 15:15:50 -07:00
Salvatore Mesoraca
5413fcdbe9 Adding YAMA hooks also when YAMA is not stacked.
Without this patch YAMA will not work at all if it is chosen
as the primary LSM instead of being "stacked".

Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-08-04 01:36:18 +10:00
Casey Schaufler
1eddfe8edb Smack: Three symbols that should be static
The kbuild test robot reported a couple of these,
and the third showed up by inspection. Making the
symbols static is proper.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-07-31 12:12:17 -07:00
Casey Schaufler
21abb1ec41 Smack: IPv6 host labeling
IPv6 appears to be (finally) coming of age with the
influx of autonomous devices. In support of this, add
the ability to associate a Smack label with IPv6 addresses.

This patch also cleans up some of the conditional
compilation associated with the introduction of
secmark processing. It's now more obvious which bit
of code goes with which feature.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-07-28 06:35:21 -07:00
Kees Cook
730daa164e Yama: remove needless CONFIG_SECURITY_YAMA_STACKED
Now that minor LSMs can cleanly stack with major LSMs, remove the unneeded
config for Yama to be made to explicitly stack. Just selecting the main
Yama CONFIG will allow it to work, regardless of the major LSM. Since
distros using Yama are already forcing it to stack, this is effectively
a no-op change.

Additionally add MAINTAINERS entry.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-07-28 13:18:19 +10:00
Colin Ian King
ca4da5dd1f KEYS: ensure we free the assoc array edit if edit is valid
__key_link_end is not freeing the associated array edit structure
and this leads to a 512 byte memory leak each time an identical
existing key is added with add_key().

The reason the add_key() system call returns okay is that
key_create_or_update() calls __key_link_begin() before checking to see
whether it can update a key directly rather than adding/replacing - which
it turns out it can.  Thus __key_link() is not called through
__key_instantiate_and_link() and __key_link_end() must cancel the edit.

CVE-2015-1333

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-07-28 13:08:23 +10:00
Paul E. McKenney
f78f5b90c4 rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()
This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for
consistency with the WARN() series of macros.  This also requires
inverting the sense of the conditional, which this commit also does.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
2015-07-22 15:27:32 -07:00
kbuild test robot
ca70d27e44 sysfs: fix simple_return.cocci warnings
security/smack/smackfs.c:2251:1-4: WARNING: end returns can be
simpified and declaration on line 2250 can be dropped

 Simplify a trivial if-return sequence.  Possibly combine with a
 preceding function call.

Generated by: scripts/coccinelle/misc/simple_return.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-07-22 12:31:40 -07:00
Vivek Trivedi
3bf2789cad smack: allow mount opts setting over filesystems with binary mount data
Add support for setting smack mount labels(using smackfsdef, smackfsroot,
smackfshat, smackfsfloor, smackfstransmute) for filesystems with binary
mount data like NFS.

To achieve this, implement sb_parse_opts_str and sb_set_mnt_opts security
operations in smack LSM similar to SELinux.

Signed-off-by: Vivek Trivedi <t.vivek@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2015-07-22 12:31:28 -07:00
David Howells
c3c188b2c3 selinux: Create a common helper to determine an inode label [ver #3]
Create a common helper function to determine the label for a new inode.
This is then used by:

	- may_create()
	- selinux_dentry_init_security()
	- selinux_inode_init_security()

This will change the behaviour of the functions slightly, bringing them
all into line.

Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-13 13:31:59 -04:00
Stephen Smalley
bd1741f4cf selinux: Augment BUG_ON assertion for secclass_map.
Ensure that we catch any cases where tclass == 0.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-13 13:31:59 -04:00
Stephen Smalley
5dee25d08e selinux: initialize sock security class to default value
Initialize the security class of sock security structures
to the generic socket class.  This is similar to what is
already done in inode_alloc_security for files.  Generally
the sclass field will later by set by socket_post_create
or sk_clone or sock_graft, but for protocol implementations
that fail to call any of these for newly accepted sockets,
we want some sane default that will yield a legitimate
avc denied message with non-garbage values for class and
permission.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-13 13:31:59 -04:00
Waiman Long
9629d04ae0 selinux: reduce locking overhead in inode_free_security()
The inode_free_security() function just took the superblock's isec_lock
before checking and trying to remove the inode security struct from the
linked list. In many cases, the list was empty and so the lock taking
is wasteful as no useful work is done. On multi-socket systems with
a large number of CPUs, there can also be a fair amount of spinlock
contention on the isec_lock if many tasks are exiting at the same time.

This patch changes the code to check the state of the list first before
taking the lock and attempting to dequeue it. The list_del_init()
can be called more than once on the same list with no harm as long
as they are properly serialized. It should not be possible to have
inode_free_security() called concurrently with list_add(). For better
safety, however, we use list_empty_careful() here even though it is
still not completely safe in case that happens.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-13 13:31:59 -04:00
Jeff Vander Stoep
fa1aa143ac selinux: extended permissions for ioctls
Add extended permissions logic to selinux. Extended permissions
provides additional permissions in 256 bit increments. Extend the
generic ioctl permission check to use the extended permissions for
per-command filtering. Source/target/class sets including the ioctl
permission may additionally include a set of commands. Example:

allowxperm <source> <target>:<class> ioctl unpriv_app_socket_cmds
auditallowxperm <source> <target>:<class> ioctl priv_gpu_cmds

Where unpriv_app_socket_cmds and priv_gpu_cmds are macros
representing commonly granted sets of ioctl commands.

When ioctl commands are omitted only the permissions are checked.
This feature is intended to provide finer granularity for the ioctl
permission that may be too imprecise. For example, the same driver
may use ioctls to provide important and benign functionality such as
driver version or socket type as well as dangerous capabilities such
as debugging features, read/write/execute to physical memory or
access to sensitive data. Per-command filtering provides a mechanism
to reduce the attack surface of the kernel, and limit applications
to the subset of commands required.

The format of the policy binary has been modified to include ioctl
commands, and the policy version number has been incremented to
POLICYDB_VERSION_XPERMS_IOCTL=30 to account for the format
change.

The extended permissions logic is deliberately generic to allow
components to be reused e.g. netlink filters

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-13 13:31:58 -04:00
Jeff Vander Stoep
671a2781ff security: add ioctl specific auditing to lsm_audit
Add information about ioctl calls to the LSM audit data. Log the
file path and command number.

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-13 13:31:58 -04:00
James Morris
3dbbbe0eb6 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into for-linus2 2015-07-11 09:13:45 +10:00
Stephen Smalley
892e8cac99 selinux: fix mprotect PROT_EXEC regression caused by mm change
commit 66fc130394 ("mm: shmem_zero_setup
skip security check and lockdep conflict with XFS") caused a regression
for SELinux by disabling any SELinux checking of mprotect PROT_EXEC on
shared anonymous mappings.  However, even before that regression, the
checking on such mprotect PROT_EXEC calls was inconsistent with the
checking on a mmap PROT_EXEC call for a shared anonymous mapping.  On a
mmap, the security hook is passed a NULL file and knows it is dealing
with an anonymous mapping and therefore applies an execmem check and no
file checks.  On a mprotect, the security hook is passed a vma with a
non-NULL vm_file (as this was set from the internally-created shmem
file during mmap) and therefore applies the file-based execute check
and no execmem check.  Since the aforementioned commit now marks the
shmem zero inode with the S_PRIVATE flag, the file checks are disabled
and we have no checking at all on mprotect PROT_EXEC.  Add a test to
the mprotect hook logic for such private inodes, and apply an execmem
check in that case.  This makes the mmap and mprotect checking
consistent for shared anonymous mappings, as well as for /dev/zero and
ashmem.

Cc: <stable@vger.kernel.org> # 4.1.x
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-10 16:45:29 -04:00
Eric W. Biederman
90f8572b0f vfs: Commit to never having exectuables on proc and sysfs.
Today proc and sysfs do not contain any executable files.  Several
applications today mount proc or sysfs without noexec and nosuid and
then depend on there being no exectuables files on proc or sysfs.
Having any executable files show on proc or sysfs would cause
a user space visible regression, and most likely security problems.

Therefore commit to never allowing executables on proc and sysfs by
adding a new flag to mark them as filesystems without executables and
enforce that flag.

Test the flag where MNT_NOEXEC is tested today, so that the only user
visible effect will be that exectuables will be treated as if the
execute bit is cleared.

The filesystems proc and sysfs do not currently incoporate any
executable files so this does not result in any user visible effects.

This makes it unnecessary to vet changes to proc and sysfs tightly for
adding exectuable files or changes to chattr that would modify
existing files, as no matter what the individual file say they will
not be treated as exectuable files by the vfs.

Not having to vet changes to closely is important as without this we
are only one proc_create call (or another goof up in the
implementation of notify_change) from having problematic executables
on proc.  Those mistakes are all too easy to make and would create
a situation where there are security issues or the assumptions of
some program having to be broken (and cause userspace regressions).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-10 10:39:25 -05:00
Paul Moore
3324603524 selinux: don't waste ebitmap space when importing NetLabel categories
At present we don't create efficient ebitmaps when importing NetLabel
category bitmaps.  This can present a problem when comparing ebitmaps
since ebitmap_cmp() is very strict about these things and considers
these wasteful ebitmaps not equal when compared to their more
efficient counterparts, even if their values are the same.  This isn't
likely to cause problems on 64-bit systems due to a bit of luck on
how NetLabel/CIPSO works and the default ebitmap size, but it can be
a problem on 32-bit systems.

This patch fixes this problem by being a bit more intelligent when
importing NetLabel category bitmaps by skipping over empty sections
which should result in a nice, efficient ebitmap.

Cc: stable@vger.kernel.org # 3.17
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-09 14:20:36 -04:00
Linus Torvalds
1dc51b8288 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted VFS fixes and related cleanups (IMO the most interesting in
  that part are f_path-related things and Eric's descriptor-related
  stuff).  UFS regression fixes (it got broken last cycle).  9P fixes.
  fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups".  The
  file_table locking rule change by Eric Dumazet is a rather big and
  fundamental update even if the patch isn't huge.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
  9p: cope with bogus responses from server in p9_client_{read,write}
  p9_client_write(): avoid double p9_free_req()
  9p: forgetting to cancel request on interrupted zero-copy RPC
  dax: bdev_direct_access() may sleep
  block: Add support for DAX reads/writes to block devices
  dax: Use copy_from_iter_nocache
  dax: Add block size note to documentation
  fs/file.c: __fget() and dup2() atomicity rules
  fs/file.c: don't acquire files->file_lock in fd_install()
  fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
  vfs: avoid creation of inode number 0 in get_next_ino
  namei: make set_root_rcu() return void
  make simple_positive() public
  ufs: use dir_pages instead of ufs_dir_pages()
  pagemap.h: move dir_pages() over there
  remove the pointless include of lglock.h
  fs: cleanup slight list_entry abuse
  xfs: Correctly lock inode when removing suid and file capabilities
  fs: Call security_ops->inode_killpriv on truncate
  fs: Provide function telling whether file_remove_privs() will do anything
  ...
2015-07-04 19:36:06 -07:00
Linus Torvalds
0cbee99269 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace updates from Eric Biederman:
 "Long ago and far away when user namespaces where young it was realized
  that allowing fresh mounts of proc and sysfs with only user namespace
  permissions could violate the basic rule that only root gets to decide
  if proc or sysfs should be mounted at all.

  Some hacks were put in place to reduce the worst of the damage could
  be done, and the common sense rule was adopted that fresh mounts of
  proc and sysfs should allow no more than bind mounts of proc and
  sysfs.  Unfortunately that rule has not been fully enforced.

  There are two kinds of gaps in that enforcement.  Only filesystems
  mounted on empty directories of proc and sysfs should be ignored but
  the test for empty directories was insufficient.  So in my tree
  directories on proc, sysctl and sysfs that will always be empty are
  created specially.  Every other technique is imperfect as an ordinary
  directory can have entries added even after a readdir returns and
  shows that the directory is empty.  Special creation of directories
  for mount points makes the code in the kernel a smidge clearer about
  it's purpose.  I asked container developers from the various container
  projects to help test this and no holes were found in the set of mount
  points on proc and sysfs that are created specially.

  This set of changes also starts enforcing the mount flags of fresh
  mounts of proc and sysfs are consistent with the existing mount of
  proc and sysfs.  I expected this to be the boring part of the work but
  unfortunately unprivileged userspace winds up mounting fresh copies of
  proc and sysfs with noexec and nosuid clear when root set those flags
  on the previous mount of proc and sysfs.  So for now only the atime,
  read-only and nodev attributes which userspace happens to keep
  consistent are enforced.  Dealing with the noexec and nosuid
  attributes remains for another time.

  This set of changes also addresses an issue with how open file
  descriptors from /proc/<pid>/ns/* are displayed.  Recently readlink of
  /proc/<pid>/fd has been triggering a WARN_ON that has not been
  meaningful since it was added (as all of the code in the kernel was
  converted) and is not now actively wrong.

  There is also a short list of issues that have not been fixed yet that
  I will mention briefly.

  It is possible to rename a directory from below to above a bind mount.
  At which point any directory pointers below the renamed directory can
  be walked up to the root directory of the filesystem.  With user
  namespaces enabled a bind mount of the bind mount can be created
  allowing the user to pick a directory whose children they can rename
  to outside of the bind mount.  This is challenging to fix and doubly
  so because all obvious solutions must touch code that is in the
  performance part of pathname resolution.

  As mentioned above there is also a question of how to ensure that
  developers by accident or with purpose do not introduce exectuable
  files on sysfs and proc and in doing so introduce security regressions
  in the current userspace that will not be immediately obvious and as
  such are likely to require breaking userspace in painful ways once
  they are recognized"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  vfs: Remove incorrect debugging WARN in prepend_path
  mnt: Update fs_fully_visible to test for permanently empty directories
  sysfs: Create mountpoints with sysfs_create_mount_point
  sysfs: Add support for permanently empty directories to serve as mount points.
  kernfs: Add support for always empty directories.
  proc: Allow creating permanently empty directories that serve as mount points
  sysctl: Allow creating permanently empty directories that serve as mountpoints.
  fs: Add helper functions for permanently empty directories.
  vfs: Ignore unlocked mounts in fs_fully_visible
  mnt: Modify fs_fully_visible to deal with locked ro nodev and atime
  mnt: Refactor the logic for mounting sysfs and proc in a user namespace
2015-07-03 15:20:57 -07:00
Linus Torvalds
02201e3f1b Minor merge needed, due to function move.
Main excitement here is Peter Zijlstra's lockless rbtree optimization to
 speed module address lookup.  He found some abusers of the module lock
 doing that too.
 
 A little bit of parameter work here too; including Dan Streetman's breaking
 up the big param mutex so writing a parameter can load another module (yeah,
 really).  Unfortunately that broke the usual suspects, !CONFIG_MODULES and
 !CONFIG_SYSFS, so those fixes were appended too.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVkgKHAAoJENkgDmzRrbjxQpwQAJVmBN6jF3SnwbQXv9vRixjH
 58V33sb1G1RW+kXxQ3/e8jLX/4VaN479CufruXQp+IJWXsN/CH0lbC3k8m7u50d7
 b1Zeqd/Yrh79rkc11b0X1698uGCSMlzz+V54Z0QOTEEX+nSu2ZZvccFS4UaHkn3z
 rqDo00lb7rxQz8U25qro2OZrG6D3ub2q20TkWUB8EO4AOHkPn8KWP2r429Axrr0K
 wlDWDTTt8/IsvPbuPf3T15RAhq1avkMXWn9nDXDjyWbpLfTn8NFnWmtesgY7Jl4t
 GjbXC5WYekX3w2ZDB9KaT/DAMQ1a7RbMXNSz4RX4VbzDl+yYeSLmIh2G9fZb1PbB
 PsIxrOgy4BquOWsJPm+zeFPSC3q9Cfu219L4AmxSjiZxC3dlosg5rIB892Mjoyv4
 qxmg6oiqtc4Jxv+Gl9lRFVOqyHZrTC5IJ+xgfv1EyP6kKMUKLlDZtxZAuQxpUyxR
 HZLq220RYnYSvkWauikq4M8fqFM8bdt6hLJnv7bVqllseROk9stCvjSiE3A9szH5
 OgtOfYV5GhOeb8pCZqJKlGDw+RoJ21jtNCgOr6DgkNKV9CX/kL/Puwv8gnA0B0eh
 dxCeB7f/gcLl7Cg3Z3gVVcGlgak6JWrLf5ITAJhBZ8Lv+AtL2DKmwEWS/iIMRmek
 tLdh/a9GiCitqS0bT7GE
 =tWPQ
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Main excitement here is Peter Zijlstra's lockless rbtree optimization
  to speed module address lookup.  He found some abusers of the module
  lock doing that too.

  A little bit of parameter work here too; including Dan Streetman's
  breaking up the big param mutex so writing a parameter can load
  another module (yeah, really).  Unfortunately that broke the usual
  suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were
  appended too"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits)
  modules: only use mod->param_lock if CONFIG_MODULES
  param: fix module param locks when !CONFIG_SYSFS.
  rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE()
  module: add per-module param_lock
  module: make perm const
  params: suppress unused variable error, warn once just in case code changes.
  modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'.
  kernel/module.c: avoid ifdefs for sig_enforce declaration
  kernel/workqueue.c: remove ifdefs over wq_power_efficient
  kernel/params.c: export param_ops_bool_enable_only
  kernel/params.c: generalize bool_enable_only
  kernel/module.c: use generic module param operaters for sig_enforce
  kernel/params: constify struct kernel_param_ops uses
  sysfs: tightened sysfs permission checks
  module: Rework module_addr_{min,max}
  module: Use __module_address() for module_address_lookup()
  module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING
  module: Optimize __module_address() using a latched RB-tree
  rbtree: Implement generic latch_tree
  seqlock: Introduce raw_read_seqcount_latch()
  ...
2015-07-01 10:49:25 -07:00
Eric W. Biederman
f9bb48825a sysfs: Create mountpoints with sysfs_create_mount_point
This allows for better documentation in the code and
it allows for a simpler and fully correct version of
fs_fully_visible to be written.

The mount points converted and their filesystems are:
/sys/hypervisor/s390/       s390_hypfs
/sys/kernel/config/         configfs
/sys/kernel/debug/          debugfs
/sys/firmware/efi/efivars/  efivarfs
/sys/fs/fuse/connections/   fusectl
/sys/fs/pstore/             pstore
/sys/kernel/tracing/        tracefs
/sys/fs/cgroup/             cgroup
/sys/kernel/security/       securityfs
/sys/fs/selinux/            selinuxfs
/sys/fs/smackfs/            smackfs

Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-01 10:36:47 -05:00
Linus Torvalds
4a10a91756 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Four small audit patches for v4.2, all bug fixes.  Only 10 lines of
  change this time so very unremarkable, the patch subject lines pretty
  much tell the whole story"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
  audit: Fix check of return value of strnlen_user()
  audit: obsolete audit_context check is removed in audit_filter_rules()
  audit: fix for typo in comment to function audit_log_link_denied()
  lsm: rename duplicate labels in LSM_AUDIT_DATA_TASK audit message type
2015-06-27 13:53:16 -07:00
Linus Torvalds
e22619a29f Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "The main change in this kernel is Casey's generalized LSM stacking
  work, which removes the hard-coding of Capabilities and Yama stacking,
  allowing multiple arbitrary "small" LSMs to be stacked with a default
  monolithic module (e.g.  SELinux, Smack, AppArmor).

  See
        https://lwn.net/Articles/636056/

  This will allow smaller, simpler LSMs to be incorporated into the
  mainline kernel and arbitrarily stacked by users.  Also, this is a
  useful cleanup of the LSM code in its own right"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits)
  tpm, tpm_crb: fix le64_to_cpu conversions in crb_acpi_add()
  vTPM: set virtual device before passing to ibmvtpm_reset_crq
  tpm_ibmvtpm: remove unneccessary message level.
  ima: update builtin policies
  ima: extend "mask" policy matching support
  ima: add support for new "euid" policy condition
  ima: fix ima_show_template_data_ascii()
  Smack: freeing an error pointer in smk_write_revoke_subj()
  selinux: fix setting of security labels on NFS
  selinux: Remove unused permission definitions
  selinux: enable genfscon labeling for sysfs and pstore files
  selinux: enable per-file labeling for debugfs files.
  selinux: update netlink socket classes
  signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
  selinux: Print 'sclass' as string when unrecognized netlink message occurs
  Smack: allow multiple labels in onlycap
  Smack: fix seq operations in smackfs
  ima: pass iint to ima_add_violation()
  ima: wrap event related data to the new ima_event_data structure
  integrity: add validity checks for 'path' parameter
  ...
2015-06-27 13:26:03 -07:00
Linus Torvalds
e0456717e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add TX fast path in mac80211, from Johannes Berg.

 2) Add TSO/GRO support to ibmveth, from Thomas Falcon

 3) Move away from cached routes in ipv6, just like ipv4, from Martin
    KaFai Lau.

 4) Lots of new rhashtable tests, from Thomas Graf.

 5) Run ingress qdisc lockless, from Alexei Starovoitov.

 6) Allow servers to fetch TCP packet headers for SYN packets of new
    connections, for fingerprinting.  From Eric Dumazet.

 7) Add mode parameter to pktgen, for testing receive.  From Alexei
    Starovoitov.

 8) Cache access optimizations via simplifications of build_skb(), from
    Alexander Duyck.

 9) Move page frag allocator under mm/, also from Alexander.

10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

11) Add a counter guard in case we try to perform endless reclassify
    loops in the packet scheduler.

12) Extern flow dissector to be programmable and use it in new "Flower"
    classifier.  From Jiri Pirko.

13) AF_PACKET fanout rollover fixes, performance improvements, and new
    statistics.  From Willem de Bruijn.

14) Add netdev driver for GENEVE tunnels, from John W Linville.

15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
    Borkmann.

18) Add tail call support to BPF, from Alexei Starovoitov.

19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

21) Favor even port numbers for allocation to connect() requests, and
    odd port numbers for bind(0), in an effort to help avoid
    ip_local_port_range exhaustion.  From Eric Dumazet.

22) Add Cavium ThunderX driver, from Sunil Goutham.

23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata,
    from Alexei Starovoitov.

24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

25) Double TCP Small Queues default to 256K to accomodate situations
    like the XEN driver and wireless aggregation.  From Wei Liu.

26) Add more entropy inputs to flow dissector, from Tom Herbert.

27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
    Jonassen.

28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

29) Track and act upon link status of ipv4 route nexthops, from Andy
    Gospodarek.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
  bridge: vlan: flush the dynamically learned entries on port vlan delete
  bridge: multicast: add a comment to br_port_state_selection about blocking state
  net: inet_diag: export IPV6_V6ONLY sockopt
  stmmac: troubleshoot unexpected bits in des0 & des1
  net: ipv4 sysctl option to ignore routes when nexthop link is down
  net: track link-status of ipv4 nexthops
  net: switchdev: ignore unsupported bridge flags
  net: Cavium: Fix MAC address setting in shutdown state
  drivers: net: xgene: fix for ACPI support without ACPI
  ip: report the original address of ICMP messages
  net/mlx5e: Prefetch skb data on RX
  net/mlx5e: Pop cq outside mlx5e_get_cqe
  net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  net/mlx5e: Remove extra spaces
  net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
  net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
  ...
2015-06-24 16:49:49 -07:00
Al Viro
dc3f4198ea make simple_positive() public
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-06-23 18:02:01 -04:00
Eric W Biederman
8f481b50ea netfilter: Remove spurios included of netfilter.h
While testing my netfilter changes I noticed several files where
recompiling unncessarily because they unncessarily included
netfilter.h.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-18 21:14:32 +02:00
Mimi Zohar
24fd03c876 ima: update builtin policies
This patch defines a builtin measurement policy "tcb", similar to the
existing "ima_tcb", but with additional rules to also measure files
based on the effective uid and to measure files opened with the "read"
mode bit set (eg. read, read-write).

Changing the builtin "ima_tcb" policy could potentially break existing
users.  Instead of defining a new separate boot command line option each
time the builtin measurement policy is modified, this patch defines a
single generic boot command line option "ima_policy=" to specify the
builtin policy and deprecates the use of the builtin ima_tcb policy.

[The "ima_policy=" boot command line option is based on Roberto Sassu's
"ima: added new policy type exec" patch.]

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dr. Greg Wettstein <gw@idfusion.org>
Cc: stable@vger.kernel.org
2015-06-16 08:18:45 -04:00
Mimi Zohar
4351c294b8 ima: extend "mask" policy matching support
The current "mask" policy option matches files opened as MAY_READ,
MAY_WRITE, MAY_APPEND or MAY_EXEC.  This patch extends the "mask"
option to match files opened containing one of these modes.  For
example, "mask=^MAY_READ" would match files opened read-write.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dr. Greg Wettstein <gw@idfusion.org>
Cc: stable@vger.kernel.org
2015-06-16 08:18:44 -04:00
Mimi Zohar
139069eff7 ima: add support for new "euid" policy condition
The new "euid" policy condition measures files with the specified
effective uid (euid).  In addition, for CAP_SETUID files it measures
files with the specified uid or suid.

Changelog:
- fixed checkpatch.pl warnings
- fixed avc denied {setuid} messages - based on Roberto's feedback

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dr. Greg Wettstein <gw@idfusion.org>
Cc: stable@vger.kernel.org
2015-06-16 08:18:43 -04:00
Mimi Zohar
45b26133b9 ima: fix ima_show_template_data_ascii()
This patch fixes a bug introduced in "4d7aeee ima: define new template
ima-ng and template fields d-ng and n-ng".

Changelog:
- change int to uint32 (Roberto Sassu's suggestion)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Roberto Sassu <rsassu@suse.de>
Cc: stable@vger.kernel.org # 3.13
2015-06-16 08:18:21 -04:00
James Morris
d6f7aa27f4 Merge branch 'smack-for-4.2-stacked' of https://github.com/cschaufler/smack-next into next 2015-06-13 09:51:16 +10:00
Dan Carpenter
5430209497 Smack: freeing an error pointer in smk_write_revoke_subj()
This code used to rely on the fact that kfree(NULL) was a no-op, but
then we changed smk_parse_smack() to return error pointers on failure
instead of NULL.  Calling kfree() on an error pointer will oops.

I have re-arranged things a bit so that we only free things if they
have been allocated.

Fixes: e774ad683f ('smack: pass error code through pointers')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2015-06-12 11:59:11 -07:00
J. Bruce Fields
9fc2b4b436 selinux: fix setting of security labels on NFS
Before calling into the filesystem, vfs_setxattr calls
security_inode_setxattr, which ends up calling selinux_inode_setxattr in
our case.  That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set.
SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it
only if selinux_is_sblabel_mnt returns true.

The selinux_is_sblabel_mnt logic was broken by eadcabc697 "SELinux: do
all flags twiddling in one place", which didn't take into the account
the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs
with eb9ae68650 "SELinux: Add new labeling type native labels".

This caused setxattr's of security labels over NFSv4.2 to fail.

Cc: stable@kernel.org # 3.13
Cc: Eric Paris <eparis@redhat.com>
Cc: David Quigley <dpquigl@davequigley.com>
Reported-by: Richard Chan <rc556677@outlook.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the stable dependency]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-05 14:21:48 -04:00
Stephen Smalley
42a9699a9f selinux: Remove unused permission definitions
Remove unused permission definitions from SELinux.
Many of these were only ever used in pre-mainline
versions of SELinux, prior to Linux 2.6.0.  Some of them
were used in the legacy network or compat_net=1 checks
that were disabled by default in Linux 2.6.18 and
fully removed in Linux 2.6.30.

Permissions never used in mainline Linux:
file swapon
filesystem transition
tcp_socket { connectto newconn acceptfrom }
node enforce_dest
unix_stream_socket { newconn acceptfrom }

Legacy network checks, removed in 2.6.30:
socket { recv_msg send_msg }
node { tcp_recv tcp_send udp_recv udp_send rawip_recv rawip_send dccp_recv dccp_send }
netif { tcp_recv tcp_send udp_recv udp_send rawip_recv rawip_send dccp_recv dccp_send }

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:17 -04:00
Stephen Smalley
8e01472078 selinux: enable genfscon labeling for sysfs and pstore files
Support per-file labeling of sysfs and pstore files based on
genfscon policy entries.  This is safe because the sysfs
and pstore directory tree cannot be manipulated by userspace,
except to unlink pstore entries.
This provides an alternative method of assigning per-file labeling
to sysfs or pstore files without needing to set the labels from
userspace on each boot.  The advantages of this approach are that
the labels are assigned as soon as the dentry is first instantiated
and userspace does not need to walk the sysfs or pstore tree and
set the labels on each boot.  The limitations of this approach are
that the labels can only be assigned based on pathname prefix matching.
You can initially assign labels using this mechanism and then change
them at runtime via setxattr if allowed to do so by policy.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Suggested-by: Dominick Grift <dac.override@gmail.com>
Acked-by: Jeff Vander Stoep <jeffv@google.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:17 -04:00
Stephen Smalley
134509d54e selinux: enable per-file labeling for debugfs files.
Add support for per-file labeling of debugfs files so that
we can distinguish them in policy.  This is particularly
important in Android where certain debugfs files have to be writable
by apps and therefore the debugfs directory tree can be read and
searched by all.

Since debugfs is entirely kernel-generated, the directory tree is
immutable by userspace, and the inodes are pinned in memory, we can
simply use the same approach as with proc and label the inodes from
policy based on pathname from the root of the debugfs filesystem.
Generalize the existing labeling support used for proc and reuse it
for debugfs too.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:17 -04:00
Stephen Smalley
6c6d2e9bde selinux: update netlink socket classes
Update the set of SELinux netlink socket class definitions to match
the set of netlink protocols implemented by the kernel.  The
ip_queue implementation for the NETLINK_FIREWALL and NETLINK_IP6_FW protocols
was removed in d16cf20e2f, so we can remove
the corresponding class definitions as this is dead code.  Add new
classes for NETLINK_ISCSI, NETLINK_FIB_LOOKUP, NETLINK_CONNECTOR,
NETLINK_NETFILTER, NETLINK_GENERIC, NETLINK_SCSITRANSPORT, NETLINK_RDMA,
and NETLINK_CRYPTO so that we can distinguish among sockets created
for each of these protocols.  This change does not define the finer-grained
nlsmsg_read/write permissions or map specific nlmsg_type values to those
permissions in the SELinux nlmsgtab; if finer-grained control of these
sockets is desired/required, that can be added as a follow-on change.
We do not define a SELinux class for NETLINK_ECRYPTFS as the implementation
was removed in 624ae52845.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:16 -04:00
Oleg Nesterov
9e7c8f8c62 signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
selinux_bprm_committed_creds()->__flush_signals() is not right, we
shouldn't clear TIF_SIGPENDING unconditionally. There can be other
reasons for signal_pending(): freezing(), JOBCTL_PENDING_MASK, and
potentially more.

Also change this code to check fatal_signal_pending() rather than
SIGNAL_GROUP_EXIT, it looks a bit better.

Now we can kill __flush_signals() before it finds another buggy user.

Note: this code looks racy, we can flush a signal which was sent after
the task SID has been updated.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:16 -04:00
Marek Milkovic
cded3fffbe selinux: Print 'sclass' as string when unrecognized netlink message occurs
This prints the 'sclass' field as string instead of index in unrecognized netlink message.
The textual representation makes it easier to distinguish the right class.

Signed-off-by: Marek Milkovic <mmilkovi@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: 80-char width fixes]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:16 -04:00
James Morris
e6e29a4eae Merge branch 'smack-for-4.2-stacked' of https://github.com/cschaufler/smack-next into next 2015-06-03 19:10:29 +10:00
Rafal Krypa
c0d77c8844 Smack: allow multiple labels in onlycap
Smack onlycap allows limiting of CAP_MAC_ADMIN and CAP_MAC_OVERRIDE to
processes running with the configured label. But having single privileged
label is not enough in some real use cases. On a complex system like Tizen,
there maybe few programs that need to configure Smack policy in run-time
and running them all with a single label is not always practical.
This patch extends onlycap feature for multiple labels. They are configured
in the same smackfs "onlycap" interface, separated by spaces.

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2015-06-02 11:53:42 -07:00
Rafal Krypa
01fa8474fb Smack: fix seq operations in smackfs
Use proper RCU functions and read locking in smackfs seq_operations.

Smack gets away with not using proper RCU functions in smackfs, because
it never removes entries from these lists. But now one list will be
needed (with interface in smackfs) that will have both elements added and
removed to it.
This change will also help any future changes implementing removal of
unneeded entries from other Smack lists.

The patch also fixes handling of pos argument in smk_seq_start and
smk_seq_next. This fixes a bug in case when smackfs is read with a small
buffer:

Kernel panic - not syncing: Kernel mode fault at addr 0xfa0000011b
CPU: 0 PID: 1292 Comm: dd Not tainted 4.1.0-rc1-00012-g98179b8 #13
Stack:
 00000003 0000000d 7ff39e48 7f69fd00
 7ff39ce0 601ae4b0 7ff39d50 600e587b
 00000010 6039f690 7f69fd40 00612003
Call Trace:
 [<601ae4b0>] load2_seq_show+0x19/0x1d
 [<600e587b>] seq_read+0x168/0x331
 [<600c5943>] __vfs_read+0x21/0x101
 [<601a595e>] ? security_file_permission+0xf8/0x105
 [<600c5ec6>] ? rw_verify_area+0x86/0xe2
 [<600c5fc3>] vfs_read+0xa1/0x14c
 [<600c68e2>] SyS_read+0x57/0xa0
 [<6001da60>] handle_syscall+0x60/0x80
 [<6003087d>] userspace+0x442/0x548
 [<6001aa77>] ? interrupt_end+0x0/0x80
 [<6001daae>] ? copy_chunk_to_user+0x0/0x2b
 [<6002cb6b>] ? save_registers+0x1f/0x39
 [<60032ef7>] ? arch_prctl+0xf5/0x170
 [<6001a92d>] fork_handler+0x85/0x87

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2015-06-02 11:53:23 -07:00
Richard Guy Briggs
5c5bc97e2f lsm: rename duplicate labels in LSM_AUDIT_DATA_TASK audit message type
The LSM_AUDIT_DATA_TASK pid= and comm= labels are duplicates of those at the
start of this function with different values.  Rename them to their object
counterparts opid= and ocomm= to disambiguate.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: minor merging needed due to differences in the tree]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-05-29 15:09:54 -04:00
Luis R. Rodriguez
9c27847dda kernel/params: constify struct kernel_param_ops uses
Most code already uses consts for the struct kernel_param_ops,
sweep the kernel for the last offending stragglers. Other than
include/linux/moduleparam.h and kernel/params.c all other changes
were generated with the following Coccinelle SmPL patch. Merge
conflicts between trees can be handled with Coccinelle.

In the future git could get Coccinelle merge support to deal with
patch --> fail --> grammar --> Coccinelle --> new patch conflicts
automatically for us on patches where the grammar is available and
the patch is of high confidence. Consider this a feature request.

Test compiled on x86_64 against:

	* allnoconfig
	* allmodconfig
	* allyesconfig

@ const_found @
identifier ops;
@@

const struct kernel_param_ops ops = {
};

@ const_not_found depends on !const_found @
identifier ops;
@@

-struct kernel_param_ops ops = {
+const struct kernel_param_ops ops = {
};

Generated-by: Coccinelle SmPL
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: cocci@systeme.lip6.fr
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-05-28 11:32:10 +09:30
Roberto Sassu
8d94eb9b5c ima: pass iint to ima_add_violation()
This patch adds the iint associated to the current inode as a new
parameter of ima_add_violation(). The passed iint is always not NULL
if a violation is detected. This modification will be used to determine
the inode for which there is a violation.

Since the 'd' and 'd-ng' template field init() functions were detecting
a violation from the value of the iint pointer, they now check the new
field 'violation', added to the 'ima_event_data' structure.

Changelog:
 - v1:
   - modified an old comment (Roberto Sassu)

Signed-off-by: Roberto Sassu <rsassu@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:59:29 -04:00
Roberto Sassu
23b5741932 ima: wrap event related data to the new ima_event_data structure
All event related data has been wrapped into the new 'ima_event_data'
structure. The main benefit of this patch is that a new information
can be made available to template fields initialization functions
by simply adding a new field to the new structure instead of modifying
the definition of those functions.

Changelog:
 - v2:
   - f_dentry replaced with f_path.dentry (Roberto Sassu)
   - removed declaration of temporary variables in template field functions
     when possible (suggested by Dmitry Kasatkin)

Signed-off-by: Roberto Sassu <rsassu@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:59:28 -04:00
Dmitry Kasatkin
9d03a721a3 integrity: add validity checks for 'path' parameter
This patch adds validity checks for 'path' parameter and
makes it const.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:59:28 -04:00
Dmitry Kasatkin
7c51bb00c4 evm: fix potential race when removing xattrs
EVM needs to be atomically updated when removing xattrs.
Otherwise concurrent EVM verification may fail in between.
This patch fixes by moving i_mutex unlocking after calling
EVM hook. fsnotify_xattr() is also now called while locked
the same way as it is done in __vfs_setxattr_noperm.

Changelog:
- remove unused 'inode' variable.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:28:47 -04:00
Mimi Zohar
5101a1850b evm: labeling pseudo filesystems exception
To prevent offline stripping of existing file xattrs and relabeling of
them at runtime, EVM allows only newly created files to be labeled.  As
pseudo filesystems are not persistent, stripping of xattrs is not a
concern.

Some LSMs defer file labeling on pseudo filesystems.  This patch
permits the labeling of existing files on pseudo files systems.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:28:47 -04:00
Dmitry Kasatkin
a18d0cbfab ima: remove definition of IMA_X509_PATH
CONFIG_IMA_X509_PATH is always defined.  This patch removes the
IMA_X509_PATH definition and uses CONFIG_IMA_X509_PATH.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:28:47 -04:00
Dmitry Kasatkin
c68ed80c97 ima: limit file hash setting by user to fix and log modes
File hashes are automatically set and updated and should not be
manually set. This patch limits file hash setting to fix and log
modes.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:28:46 -04:00
Mimi Zohar
cd025f7f94 ima: do not measure or appraise the NSFS filesystem
Include don't appraise or measure rules for the NSFS filesystem
in the builtin ima_tcb and ima_appraise_tcb policies.

Changelog:
- Update documentation

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.19
2015-05-21 13:28:41 -04:00
Roberto Sassu
6438de9f3f ima: skip measurement of cgroupfs files and update documentation
This patch adds a rule in the default measurement policy to skip inodes
in the cgroupfs filesystem. Measurements for this filesystem can be
avoided, as all the digests collected have the same value of the digest of
an empty file.

Furthermore, this patch updates the documentation of IMA policies in
Documentation/ABI/testing/ima_policy to make it consistent with
the policies set in security/integrity/ima/ima_policy.c.

Signed-off-by: Roberto Sassu <rsassu@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-21 13:27:19 -04:00
Lukasz Pawelczyk
e774ad683f smack: pass error code through pointers
This patch makes the following functions to use ERR_PTR() and related
macros to pass the appropriate error code through returned pointers:

smk_parse_smack()
smk_import_entry()
smk_fetch()

It also makes all the other functions that use them to handle the
error cases properly. This ways correct error codes from places
where they happened can be propagated to the user space if necessary.

Doing this it fixes a bug in onlycap and unconfined files
handling. Previously their content was cleared on any error from
smk_import_entry/smk_parse_smack, be it EINVAL (as originally intended)
or ENOMEM. Right now it only reacts on EINVAL passing other codes
properly to userspace.

Comments have been updated accordingly.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
2015-05-15 08:36:03 -07:00
Seung-Woo Kim
9777582e8d Smack: ignore private inode for smack_file_receive
The dmabuf fd can be shared between processes via unix domain
socket. The file of dmabuf fd is came from anon_inode. The inode
has no set and get xattr operations, so it can not be shared
between processes with smack. This patch fixes just to ignore
private inode including anon_inode for smack_file_receive.

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
2015-05-15 08:35:50 -07:00
Dan Carpenter
5577857f8e ima: cleanup ima_init_policy() a little
It's a bit easier to read this if we split it up into two for loops.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2015-05-13 06:07:19 -04:00
Casey Schaufler
1ddd3b4e07 LSM: Remove unused capability.c
The stub functions in capability.c are no longer required
with the list based stacking mechanism. Remove the file.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-05-12 15:00:45 +10:00
Casey Schaufler
b1d9e6b064 LSM: Switch to lists of hooks
Instead of using a vector of security operations
with explicit, special case stacking of the capability
and yama hooks use lists of hooks with capability and
yama hooks included as appropriate.

The security_operations structure is no longer required.
Instead, there is a union of the function pointers that
allows all the hooks lists to use a common mechanism for
list management while retaining typing. Each module
supplies an array describing the hooks it provides instead
of a sparsely populated security_operations structure.
The description includes the element that gets put on
the hook list, avoiding the issues surrounding individual
element allocation.

The method for registering security modules is changed to
reflect the information available. The method for removing
a module, currently only used by SELinux, has also changed.
It should be generic now, however if there are potential
race conditions based on ordering of hook removal that needs
to be addressed by the calling module.

The security hooks are called from the lists and the first
failure is returned.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-05-12 15:00:41 +10:00
Casey Schaufler
e20b043a69 LSM: Add security module hook list heads
Add a list header for each security hook. They aren't used until
later in the patch series. They are grouped together in a structure
so that there doesn't need to be an external address for each.

Macro-ize the initialization of the security_operations
for each security module in anticipation of changing out
the security_operations structure.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-05-12 15:00:36 +10:00
Casey Schaufler
f25fce3e8f LSM: Introduce security hook calling Macros
Introduce two macros around calling the functions in the
security operations vector. The marco versions here do not
change any behavior.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-05-12 15:00:30 +10:00
Casey Schaufler
3c4ed7bdf5 LSM: Split security.h
The security.h header file serves two purposes,
interfaces for users of the security modules and
interfaces for security modules. Users of the
security modules don't need to know about what's
in the security_operations structure, so pull it
out into it's own header, lsm_hooks.h

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-05-12 15:00:16 +10:00
NeilBrown
bda0be7ad9 security: make inode_follow_link RCU-walk aware
inode_follow_link now takes an inode and rcu flag as well as the
dentry.

inode is used in preference to d_backing_inode(dentry), particularly
in RCU-walk mode.

selinux_inode_follow_link() gets dentry_has_perm() and
inode_has_perm() open-coded into it so that it can call
avc_has_perm_flags() in way that is safe if LOOKUP_RCU is set.

Calling avc_has_perm_flags() with rcu_read_lock() held means
that when avc_has_perm_noaudit calls avc_compute_av(), the attempt
to rcu_read_unlock() before calling security_compute_av() will not
actually drop the RCU read-lock.

However as security_compute_av() is completely in a read_lock()ed
region, it should be safe with the RCU read-lock held.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-11 08:13:11 -04:00
NeilBrown
7b20ea2579 security/selinux: pass 'flags' arg to avc_audit() and avc_has_perm_flags()
This allows MAY_NOT_BLOCK to be passed, in RCU-walk mode, through
the new avc_has_perm_flags() to avc_audit() and thence the slow_avc_audit.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-11 08:13:11 -04:00
NeilBrown
37882db054 SECURITY: remove nameidata arg from inode_follow_link.
No ->inode_follow_link() methods use the nameidata arg, and
it is about to become private to namei.c.
So remove from all inode_follow_link() functions.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-10 22:18:29 -04:00
Linus Torvalds
9ec3a646fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode->i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some ->d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26 17:22:07 -07:00
Davidlohr Bueso
d4144ea6e7 tomoyo: reduce mmap_sem hold for mm->exe_file
The mm->exe_file is currently serialized with mmap_sem (shared) in order
to both safely (1) read the file and (2) compute the realpath by calling
tomoyo_realpath_from_path, making it an absolute overkill.  Good users
will, on the other hand, make use of the more standard get_mm_exe_file(),
requiring only holding the mmap_sem to read the value, and relying on
reference

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:04:11 -04:00
Linus Torvalds
eea3a00264 Merge branch 'akpm' (patches from Andrew)
Merge second patchbomb from Andrew Morton:

 - the rest of MM

 - various misc bits

 - add ability to run /sbin/reboot at reboot time

 - printk/vsprintf changes

 - fiddle with seq_printf() return value

* akpm: (114 commits)
  parisc: remove use of seq_printf return value
  lru_cache: remove use of seq_printf return value
  tracing: remove use of seq_printf return value
  cgroup: remove use of seq_printf return value
  proc: remove use of seq_printf return value
  s390: remove use of seq_printf return value
  cris fasttimer: remove use of seq_printf return value
  cris: remove use of seq_printf return value
  openrisc: remove use of seq_printf return value
  ARM: plat-pxa: remove use of seq_printf return value
  nios2: cpuinfo: remove use of seq_printf return value
  microblaze: mb: remove use of seq_printf return value
  ipc: remove use of seq_printf return value
  rtc: remove use of seq_printf return value
  power: wakeup: remove use of seq_printf return value
  x86: mtrr: if: remove use of seq_printf return value
  linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK
  MAINTAINERS: CREDITS: remove Stefano Brivio from B43
  .mailmap: add Ricardo Ribalda
  CREDITS: add Ricardo Ribalda Delgado
  ...
2015-04-15 16:39:15 -07:00
Iulia Manda
2813893f8b kernel: conditionally support non-root users, groups and capabilities
There are a lot of embedded systems that run most or all of their
functionality in init, running as root:root.  For these systems,
supporting multiple users is not necessary.

This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
non-root users, non-root groups, and capabilities optional.  It is enabled
under CONFIG_EXPERT menu.

When this symbol is not defined, UID and GID are zero in any possible case
and processes always have all capabilities.

The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
getgroups, setfsuid, setfsgid, capget, capset.

Also, groups.c is compiled out completely.

In kernel/capability.c, capable function was moved in order to avoid
adding two ifdef blocks.

This change saves about 25 KB on a defconfig build.  The most minimal
kernels have total text sizes in the high hundreds of kB rather than
low MB.  (The 25k goes down a bit with allnoconfig, but not that much.

The kernel was booted in Qemu.  All the common functionalities work.
Adding users/groups is not possible, failing with -ENOSYS.

Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:22 -07:00
David Howells
ce0b16ddf1 VFS: security/: d_inode() annotations
... except where that code acts as a filesystem driver, rather than
working with dentries given to it.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:57 -04:00
David Howells
c6f493d631 VFS: security/: d_backing_inode() annotations
most of the ->d_inode uses there refer to the same inode IO would
go to, i.e. d_backing_inode()

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:56 -04:00
Linus Torvalds
d488d3a4ce Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights for this window:

   - improved AVC hashing for SELinux by John Brooks and Stephen Smalley

   - addition of an unconfined label to Smack

   - Smack documentation update

   - TPM driver updates"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits)
  lsm: copy comm before calling audit_log to avoid race in string printing
  tomoyo: Do not generate empty policy files
  tomoyo: Use if_changed when generating builtin-policy.h
  tomoyo: Use bin2c to generate builtin-policy.h
  selinux: increase avtab max buckets
  selinux: Use a better hash function for avtab
  selinux: convert avtab hash table to flex_array
  selinux: reconcile security_netlbl_secattr_to_sid() and mls_import_netlbl_cat()
  selinux: remove unnecessary pointer reassignment
  Smack: Updates for Smack documentation
  tpm/st33zp24/spi: Add missing device table for spi phy.
  tpm/st33zp24: Add proper wait for ordinal duration in case of irq mode
  smack: Fix gcc warning from unused smack_syslog_lock mutex in smackfs.c
  Smack: Allow an unconfined label in bringup mode
  Smack: getting the Smack security context of keys
  Smack: Assign smack_known_web as default smk_in label for kernel thread's socket
  tpm/tpm_infineon: Use struct dev_pm_ops for power management
  MAINTAINERS: Add Jason as designated reviewer for TPM
  tpm: Update KConfig text to include TPM2.0 FIFO chips
  tpm/st33zp24/dts/st33zp24-spi: Add dts documentation for st33zp24 spi phy
  ...
2015-04-15 11:08:27 -07:00
Linus Torvalds
6c373ca893 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add BQL support to via-rhine, from Tino Reichardt.

 2) Integrate SWITCHDEV layer support into the DSA layer, so DSA drivers
    can support hw switch offloading.  From Floria Fainelli.

 3) Allow 'ip address' commands to initiate multicast group join/leave,
    from Madhu Challa.

 4) Many ipv4 FIB lookup optimizations from Alexander Duyck.

 5) Support EBPF in cls_bpf classifier and act_bpf action, from Daniel
    Borkmann.

 6) Remove the ugly compat support in ARP for ugly layers like ax25,
    rose, etc.  And use this to clean up the neigh layer, then use it to
    implement MPLS support.  All from Eric Biederman.

 7) Support L3 forwarding offloading in switches, from Scott Feldman.

 8) Collapse the LOCAL and MAIN ipv4 FIB tables when possible, to speed
    up route lookups even further.  From Alexander Duyck.

 9) Many improvements and bug fixes to the rhashtable implementation,
    from Herbert Xu and Thomas Graf.  In particular, in the case where
    an rhashtable user bulk adds a large number of items into an empty
    table, we expand the table much more sanely.

10) Don't make the tcp_metrics hash table per-namespace, from Eric
    Biederman.

11) Extend EBPF to access SKB fields, from Alexei Starovoitov.

12) Split out new connection request sockets so that they can be
    established in the main hash table.  Much less false sharing since
    hash lookups go direct to the request sockets instead of having to
    go first to the listener then to the request socks hashed
    underneath.  From Eric Dumazet.

13) Add async I/O support for crytpo AF_ALG sockets, from Tadeusz Struk.

14) Support stable privacy address generation for RFC7217 in IPV6.  From
    Hannes Frederic Sowa.

15) Hash network namespace into IP frag IDs, also from Hannes Frederic
    Sowa.

16) Convert PTP get/set methods to use 64-bit time, from Richard
    Cochran.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1816 commits)
  fm10k: Bump driver version to 0.15.2
  fm10k: corrected VF multicast update
  fm10k: mbx_update_max_size does not drop all oversized messages
  fm10k: reset head instead of calling update_max_size
  fm10k: renamed mbx_tx_dropped to mbx_tx_oversized
  fm10k: update xcast mode before synchronizing multicast addresses
  fm10k: start service timer on probe
  fm10k: fix function header comment
  fm10k: comment next_vf_mbx flow
  fm10k: don't handle mailbox events in iov_event path and always process mailbox
  fm10k: use separate workqueue for fm10k driver
  fm10k: Set PF queues to unlimited bandwidth during virtualization
  fm10k: expose tx_timeout_count as an ethtool stat
  fm10k: only increment tx_timeout_count in Tx hang path
  fm10k: remove extraneous "Reset interface" message
  fm10k: separate PF only stats so that VF does not display them
  fm10k: use hw->mac.max_queues for stats
  fm10k: only show actual queues, not the maximum in hardware
  fm10k: allow creation of VLAN on default vid
  fm10k: fix unused warnings
  ...
2015-04-15 09:00:47 -07:00
Richard Guy Briggs
5deeb5cece lsm: copy comm before calling audit_log to avoid race in string printing
When task->comm is passed directly to audit_log_untrustedstring() without
getting a copy or using the task_lock, there is a race that could happen that
would output a NULL (\0) in the middle of the output string that would
effectively truncate the rest of the report text after the comm= field in the
audit log message, losing fields.

Using get_task_comm() to get a copy while acquiring the task_lock to prevent
this and to prevent the result from being a mixture of old and new values of
comm would incur potentially unacceptable overhead, considering that the value
can be influenced by userspace and therefore untrusted anyways.

Copy the value before passing it to audit_log_untrustedstring() ensures that a
local copy is used to calculate the length *and* subsequently printed.  Even if
this value contains a mix of old and new values, it will only calculate and
copy up to the first NULL, preventing the rest of the audit log message being
truncated.

Use a second local copy of comm to avoid a race between the first and second
calls to audit_log_untrustedstring() with comm.

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-04-15 10:13:20 +10:00
Linus Torvalds
ca2ec32658 Merge branch 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs update from Al Viro:
 "Part one:

   - struct filename-related cleanups

   - saner iov_iter_init() replacements (and switching the syscalls to
     use of those)

   - ntfs switch to ->write_iter() (Anton)

   - aio cleanups and splitting iocb into common and async parts
     (Christoph)

   - assorted fixes (me, bfields, Andrew Elble)

  There's a lot more, including the completion of switchover to
  ->{read,write}_iter(), d_inode/d_backing_inode annotations, f_flags
  race fixes, etc, but that goes after #for-davem merge.  David has
  pulled it, and once it's in I'll send the next vfs pull request"

* 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (35 commits)
  sg_start_req(): use import_iovec()
  sg_start_req(): make sure that there's not too many elements in iovec
  blk_rq_map_user(): use import_single_range()
  sg_io(): use import_iovec()
  process_vm_access: switch to {compat_,}import_iovec()
  switch keyctl_instantiate_key_common() to iov_iter
  switch {compat_,}do_readv_writev() to {compat_,}import_iovec()
  aio_setup_vectored_rw(): switch to {compat_,}import_iovec()
  vmsplice_to_user(): switch to import_iovec()
  kill aio_setup_single_vector()
  aio: simplify arguments of aio_setup_..._rw()
  aio: lift iov_iter_init() into aio_setup_..._rw()
  lift iov_iter into {compat_,}do_readv_writev()
  NFS: fix BUG() crash in notify_change() with patch to chown_common()
  dcache: return -ESTALE not -EBUSY on distributed fs race
  NTFS: Version 2.1.32 - Update file write from aio_write to write_iter.
  VFS: Add iov_iter_fault_in_multipages_readable()
  drop bogus check in file_open_root()
  switch security_inode_getattr() to struct path *
  constify tomoyo_realpath_from_path()
  ...
2015-04-14 15:31:03 -07:00
Nicolas Dichtel
cf89013808 selinux/nlmsg: add a build time check for rtnl/xfrm cmds
When a new rtnl or xfrm command is added, this part of the code is frequently
missing. Let's help the developer with a build time test.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13 13:09:44 -04:00
James Morris
3add594bf6 Merge branch 'tomoyo-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild into next 2015-04-13 12:07:47 +10:00
Nicolas Dichtel
bd2cba0738 selinux/nlmsg: add XFRM_MSG_MAPPING
This command is missing.

Fixes: 3a2dfbe8ac ("xfrm: Notify changes in UDP encapsulation via netlink")
CC: Martin Willi <martin@strongswan.org>
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:19:40 -04:00
Nicolas Dichtel
8d465bb777 selinux/nlmsg: add XFRM_MSG_MIGRATE
This command is missing.

Fixes: 5c79de6e79 ("[XFRM]: User interface for handling XFRM_MSG_MIGRATE")
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:19:40 -04:00
Nicolas Dichtel
b0b59b0056 selinux/nlmsg: add XFRM_MSG_REPORT
This command is missing.

Fixes: 97a64b4577 ("[XFRM]: Introduce XFRM_MSG_REPORT.")
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:19:40 -04:00
Al Viro
39c853ebfe Merge branch 'for-davem' into for-next 2015-04-11 22:27:19 -04:00
Al Viro
b353a1f7bb switch keyctl_instantiate_key_common() to iov_iter
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:12 -04:00
Al Viro
3f7036a071 switch security_inode_getattr() to struct path *
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:32 -04:00
Al Viro
2247386243 constify tomoyo_realpath_from_path()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:31 -04:00
Nicolas Dichtel
5b5800fad0 selinux/nlmsg: add XFRM_MSG_[NEW|GET]SADINFO
These commands are missing.

Fixes: 28d8909bc7 ("[XFRM]: Export SAD info.")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:19:17 -04:00
Nicolas Dichtel
5e6deebafb selinux/nlmsg: add XFRM_MSG_GETSPDINFO
This command is missing.

Fixes: ecfd6b1837 ("[XFRM]: Export SPD info")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:19:17 -04:00
Nicolas Dichtel
2b7834d3e1 selinux/nlmsg: add XFRM_MSG_NEWSPDINFO
This new command is missing.

Fixes: 880a6fab8f ("xfrm: configure policy hash table thresholds by netlink")
Reported-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:19:16 -04:00
Nicolas Dichtel
387f989a60 selinux/nlmsg: add RTM_GETNSID
This new command is missing.

Fixes: 9a9634545c ("netns: notify netns id events")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:19:16 -04:00
Nicolas Dichtel
5bdfbc1f19 selinux/nlmsg: add RTM_NEWNSID and RTM_GETNSID
These new commands are missing.

Fixes: 0c7aecd4bd ("netns: add rtnl cmd to add and get peer netns ids")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:19:15 -04:00
Michal Marek
f02dee2d14 tomoyo: Do not generate empty policy files
The Makefile automatically generates the tomoyo policy files, which are
not removed by make clean (because they could have been provided by the
user). Instead of generating the missing files, use /dev/null if a
given file is not provided. Store the default exception_policy in
exception_policy.conf.default.

Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2015-04-07 21:27:45 +02:00
Michal Marek
bf7a9ab43c tomoyo: Use if_changed when generating builtin-policy.h
Combine the generation of builtin-policy.h into a single command and use
if_changed, so that the file is regenerated each time the command
changes. The next patch will make use of this.

Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2015-04-07 21:27:45 +02:00
Michal Marek
7e114bbf51 tomoyo: Use bin2c to generate builtin-policy.h
Simplify the Makefile by using a readily available tool instead of a
custom sed script. The downside is that builtin-policy.h becomes
unreadable for humans, but it is only a generated file.

Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2015-04-07 21:27:45 +02:00
Stephen Smalley
cf7b6c0205 selinux: increase avtab max buckets
Now that we can safely increase the avtab max buckets without
triggering high order allocations and have a hash function that
will make better use of the larger number of buckets, increase
the max buckets to 2^16.

Original:
101421 entries and 2048/2048 buckets used, longest chain length 374

With new hash function:
101421 entries and 2048/2048 buckets used, longest chain length 81

With increased max buckets:
101421 entries and 31078/32768 buckets used, longest chain length 12

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-04-06 20:16:23 -04:00
John Brooks
33ebc1932a selinux: Use a better hash function for avtab
This function, based on murmurhash3, has much better distribution than
the original. Using the current default of 2048 buckets, there are many
fewer collisions:

Before:
101421 entries and 2048/2048 buckets used, longest chain length 374
After:
101421 entries and 2048/2048 buckets used, longest chain length 81

The difference becomes much more significant when buckets are increased.
A naive attempt to expand the current function to larger outputs doesn't
yield any significant improvement; so this function is a prerequisite
for increasing the bucket size.

sds:  Adapted from the original patches for libsepol to the kernel.

Signed-off-by: John Brooks <john.brooks@jolla.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-04-06 20:16:21 -04:00
Stephen Smalley
ba39db6e05 selinux: convert avtab hash table to flex_array
Previously we shrank the avtab max hash buckets to avoid
high order memory allocations, but this causes avtab lookups to
degenerate to very long linear searches for the Fedora policy. Convert to
using a flex_array instead so that we can increase the buckets
without such limitations.

This change does not alter the max hash buckets; that is left to a
separate follow-on change.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-04-06 20:16:20 -04:00
Paul Moore
da8026fa0f selinux: reconcile security_netlbl_secattr_to_sid() and mls_import_netlbl_cat()
Move the NetLabel secattr MLS category import logic into
mls_import_netlbl_cat() where it belongs, and use the
mls_import_netlbl_cat() function in security_netlbl_secattr_to_sid().

Reported-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-04-06 20:15:55 -04:00
Jeff Vander Stoep
83d4a806ae selinux: remove unnecessary pointer reassignment
Commit f01e1af445 ("selinux: don't pass in NULL avd to avc_has_perm_noaudit")
made this pointer reassignment unnecessary. Avd should continue to reference
the stack-based copy.

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: tweaked subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-04-06 20:04:54 -04:00
David S. Miller
238e54c9cb netfilter: Make nf_hookfn use nf_hook_state.
Pass the nf_hook_state all the way down into the hook
functions themselves.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:31:38 -04:00
David S. Miller
9f0d34bc34 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	drivers/net/usb/sr9800.c
	drivers/net/usb/usbnet.c
	include/linux/usb/usbnet.h
	net/ipv4/tcp_ipv4.c
	net/ipv6/tcp_ipv6.c

The TCP conflicts were overlapping changes.  In 'net' we added a
READ_ONCE() to the socket cached RX route read, whilst in 'net-next'
Eric Dumazet touched the surrounding code dealing with how mini
sockets are handled.

With USB, it's a case of the same bug fix first going into net-next
and then I cherry picked it back into net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:16:53 -04:00
James Morris
641628146c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into for-linus 2015-03-27 20:33:27 +11:00
Joe Perches
6436a123a1 selinux: fix sel_write_enforce broken return value
Return a negative error value like the rest of the entries in this function.

Cc: <stable@vger.kernel.org>
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
[PM: tweaked subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-03-25 16:55:06 -04:00
Paul Gortmaker
f43b65bad6 smack: Fix gcc warning from unused smack_syslog_lock mutex in smackfs.c
In commit 00f84f3f2e ("Smack: Make the
syslog control configurable") this mutex was added, but the rest of
the final commit never actually made use of it, resulting in:

 In file included from include/linux/mutex.h:29:0,
                  from include/linux/notifier.h:13,
                  from include/linux/memory_hotplug.h:6,
                  from include/linux/mmzone.h:821,
                  from include/linux/gfp.h:5,
                  from include/linux/slab.h:14,
                  from include/linux/security.h:27,
                  from security/smack/smackfs.c:21:
 security/smack/smackfs.c:63:21: warning: ‘smack_syslog_lock’ defined but not used [-Wunused-variable]
  static DEFINE_MUTEX(smack_syslog_lock);
                      ^

A git grep shows no other instances/references to smack_syslog_lock.
Delete it, assuming that the mutex addition was just a leftover from
an earlier work in progress version of the change.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2015-03-23 13:24:02 -07:00
Casey Schaufler
bf4b2fee99 Smack: Allow an unconfined label in bringup mode
I have vehemently opposed adding a "permissive" mode to Smack
for the simple reasons that it would be subject to massive abuse
and that developers refuse to turn it off come product release.
I still believe that this is true, and still refuse to add a
general "permissive mode". So don't ask again.

Bumjin Im suggested an approach that addresses most of the concerns,
and I have implemented it here. I still believe that we'd be better
off without this sort of thing, but it looks like this minimizes the
abuse potential.

Firstly, you have to configure Smack Bringup Mode. That allows
for "release" software to be ammune from abuse. Second, only one
label gets to be "permissive" at a time. You can use it for
debugging, but that's about it.

A label written to smackfs/unconfined is treated specially.
If either the subject or object label of an access check
matches the "unconfined" label, and the access would not
have been allowed otherwise an audit record and a console
message are generated. The audit record "request" string is
marked with either "(US)" or "(UO)", to indicate that the
request was granted because of an unconfined label. The
fact that an inode was accessed by an unconfined label is
remembered, and subsequent accesses to that "impure"
object are noted in the log. The impurity is not stored in
the filesystem, so a file mislabled as a side effect of
using an unconfined label may still cause concern after
a reboot.

So, it's there, it's dangerous, but so many application
developers seem incapable of living without it I have
given in. I've tried to make it as safe as I can, but
in the end it's still a chain saw.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-03-23 13:21:34 -07:00
José Bollo
7fc5f36e98 Smack: getting the Smack security context of keys
With this commit, the LSM Smack implements the LSM
side part of the system call keyctl with the action
code KEYCTL_GET_SECURITY.

It is now possible to get the context of, for example,
the user session key using the command "keyctl security @s".

The original patch has been modified for merge.

Signed-off-by: José Bollo <jose.bollo@open.eurogiciel.org>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-03-23 13:19:47 -07:00
Marcin Lis
7412301b76 Smack: Assign smack_known_web as default smk_in label for kernel thread's socket
This change fixes the bug associated with sockets owned by kernel threads. These
sockets, created usually by network devices' drivers tasks, received smk_in
label from the task that created them - the "floor" label in the most cases. The
result was that they were not able to receive data packets because of missing
smack rules. The main reason of the access deny is that the socket smk_in label
is placed as the object during smk check, kernel thread's capabilities are
omitted.

Signed-off-by: Marcin Lis <m.lis@samsung.com>
2015-03-23 13:19:37 -07:00
Eric Dumazet
d3593b5cef Revert "selinux: add a skb_owned_by() hook"
This reverts commit ca10b9e9a8.

No longer needed after commit eb8895debe
("tcp: tcp_make_synack() should use sock_wmalloc")

When under SYNFLOOD, we build lot of SYNACK and hit false sharing
because of multiple modifications done on sk_listener->sk_wmem_alloc

Since tcp_make_synack() uses sock_wmalloc(), there is no need
to call skb_set_owner_w() again, as this adds two atomic operations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20 21:36:53 -04:00
James Morris
74f0414b2f Merge tag 'yama-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into next 2015-03-03 19:56:08 +11:00
Stephen Smalley
44aa1d4413 security/yama: Remove unnecessary selects from Kconfig.
Yama selects SECURITYFS and SECURITY_PATH, but requires neither.
Remove them.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Kees Cook <keescook@chromium.org>
2015-02-27 16:53:10 -08:00
Kees Cook
41a4695ca4 Yama: do not modify global sysctl table entry
When the sysctl table is constified, we won't be able to directly modify
it. Instead, use a table copy that carries any needed changes.

Suggested-by: PaX Team <pageexec@freemail.hu>
Signed-off-by: Kees Cook <keescook@chromium.org>
2015-02-27 16:53:09 -08:00
Linus Torvalds
be5e6616dd Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted stuff from this cycle.  The big ones here are multilayer
  overlayfs from Miklos and beginning of sorting ->d_inode accesses out
  from David"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
  autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
  procfs: fix race between symlink removals and traversals
  debugfs: leave freeing a symlink body until inode eviction
  Documentation/filesystems/Locking: ->get_sb() is long gone
  trylock_super(): replacement for grab_super_passive()
  fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
  SELinux: Use d_is_positive() rather than testing dentry->d_inode
  Smack: Use d_is_positive() rather than testing dentry->d_inode
  TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
  Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
  Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
  VFS: Split DCACHE_FILE_TYPE into regular and special types
  VFS: Add a fallthrough flag for marking virtual dentries
  VFS: Add a whiteout dentry type
  VFS: Introduce inode-getting helpers for layered/unioned fs environments
  Infiniband: Fix potential NULL d_inode dereference
  posix_acl: fix reference leaks in posix_acl_create
  autofs4: Wrong format for printing dentry
  ...
2015-02-22 17:42:14 -08:00
David Howells
e36cb0b89c VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
Convert the following where appropriate:

 (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

 (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

 (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
     complicated than it appears as some calls should be converted to
     d_can_lookup() instead.  The difference is whether the directory in
     question is a real dir with a ->lookup op or whether it's a fake dir with
     a ->d_automount op.

In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided we the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).

Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer.  In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.

However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.

There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
intended for special directory entry types that don't have attached inodes.

The following perl+coccinelle script was used:

use strict;

my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
    die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
    print "No matches\n";
    exit(0);
}

my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);

foreach my $file (@callers) {
    chomp $file;
    print "Processing ", $file, "\n";
    system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
	die "spatch failed";
}

[AV: overlayfs parts skipped]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:41 -05:00
David Howells
2c616d4d88 SELinux: Use d_is_positive() rather than testing dentry->d_inode
Use d_is_positive() rather than testing dentry->d_inode in SELinux to get rid
of direct references to d_inode outside of the VFS.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:40 -05:00
David Howells
8802565b60 Smack: Use d_is_positive() rather than testing dentry->d_inode
Use d_is_positive() rather than testing dentry->d_inode in Smack to get rid of
direct references to d_inode outside of the VFS.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:40 -05:00
David Howells
e656a8eb2e TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
Use d_is_dir() rather than d_inode and S_ISDIR().  Note that this will include
fake directories such as automount triggers.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:39 -05:00
David Howells
729b8a3dee Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
Use d_is_positive(dentry) or d_is_negative(dentry) rather than testing
dentry->d_inode as the dentry may cover another layer that has an inode when
the top layer doesn't or may hold a 0,0 chardev that's actually a whiteout.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:39 -05:00
David Howells
7ac2856d99 Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
mediated_filesystem() should use dentry->d_sb not dentry->d_inode->i_sb and
should avoid file_inode() also since it is really dealing with the path.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:39 -05:00
Linus Torvalds
b11a278397 Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kconfig updates from Michal Marek:
 "Yann E Morin was supposed to take over kconfig maintainership, but
  this hasn't happened.  So I'm sending a few kconfig patches that I
  collected:

   - Fix for missing va_end in kconfig
   - merge_config.sh displays used if given too few arguments
   - s/boolean/bool/ in Kconfig files for consistency, with the plan to
     only support bool in the future"

* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kconfig: use va_end to match corresponding va_start
  merge_config.sh: Display usage if given too few arguments
  kconfig: use bool instead of boolean for type definition attributes
2015-02-19 10:36:45 -08:00
Linus Torvalds
50652963ea Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc VFS updates from Al Viro:
 "This cycle a lot of stuff sits on topical branches, so I'll be sending
  more or less one pull request per branch.

  This is the first pile; more to follow in a few.  In this one are
  several misc commits from early in the cycle (before I went for
  separate branches), plus the rework of mntput/dput ordering on umount,
  switching to use of fs_pin instead of convoluted games in
  namespace_unlock()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  switch the IO-triggering parts of umount to fs_pin
  new fs_pin killing logics
  allow attaching fs_pin to a group not associated with some superblock
  get rid of the second argument of acct_kill()
  take count and rcu_head out of fs_pin
  dcache: let the dentry count go down to zero without taking d_lock
  pull bumping refcount into ->kill()
  kill pin_put()
  mode_t whack-a-mole: chelsio
  file->f_path.dentry is pinned down for as long as the file is open...
  get rid of lustre_dump_dentry()
  gut proc_register() a bit
  kill d_validate()
  ncpfs: get rid of d_validate() nonsense
  selinuxfs: don't open-code d_genocide()
2015-02-17 14:56:45 -08:00
James Morris
0d309cbddb Merge branch 'smack-for-3.20-rebased' of git://git.gitorious.org/smack-next/kernel into for-linus 2015-02-16 13:47:53 +11:00
David Jeffery
d0709f1e66 Don't leak a key reference if request_key() tries to use a revoked keyring
If a request_key() call to allocate and fill out a key attempts to insert the
key structure into a revoked keyring, the key will leak, using memory and part
of the user's key quota until the system reboots. This is from a failure of
construct_alloc_key() to decrement the key's reference count after the attempt
to insert into the requested keyring is rejected.

key_put() needs to be called in the link_prealloc_failed callpath to ensure
the unused key is released.

Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-02-16 13:45:16 +11:00
Linus Torvalds
4ba63072b9 Char / Misc patches for 3.20-rc1
Here's the big char/misc driver update for 3.20-rc1.
 
 Lots of little things in here, all described in the changelog.  Nothing
 major or unusual, except maybe the binder selinux stuff, which was all
 acked by the proper selinux people and they thought it best to come
 through this tree.
 
 All of this has been in linux-next with no reported issues for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlTgs80ACgkQMUfUDdst+yn86gCeMLbxANGExVLd+PR46GNsAUQb
 SJ4AmgIqrkIz+5LCwZWM02ldbYhPeBVf
 =lfmM
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc patches from Greg KH:
 "Here's the big char/misc driver update for 3.20-rc1.

  Lots of little things in here, all described in the changelog.
  Nothing major or unusual, except maybe the binder selinux stuff, which
  was all acked by the proper selinux people and they thought it best to
  come through this tree.

  All of this has been in linux-next with no reported issues for a while"

* tag 'char-misc-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (90 commits)
  coresight: fix function etm_writel_cp14() parameter order
  coresight-etm: remove check for unknown Kconfig macro
  coresight: fixing CPU hwid lookup in device tree
  coresight: remove the unnecessary function coresight_is_bit_set()
  coresight: fix the debug AMBA bus name
  coresight: remove the extra spaces
  coresight: fix the link between orphan connection and newly added device
  coresight: remove the unnecessary replicator property
  coresight: fix the replicator subtype value
  pdfdocs: Fix 'make pdfdocs' failure for 'uio-howto.tmpl'
  mcb: Fix error path of mcb_pci_probe
  virtio/console: verify device has config space
  ti-st: clean up data types (fix harmless memory corruption)
  mei: me: release hw from reset only during the reset flow
  mei: mask interrupt set bit on clean reset bit
  extcon: max77693: Constify struct regmap_config
  extcon: adc-jack: Release IIO channel on driver remove
  extcon: Remove duplicated include from extcon-class.c
  Drivers: hv: vmbus: hv_process_timer_expiration() can be static
  Drivers: hv: vmbus: serialize Offer and Rescind offer
  ...
2015-02-15 10:48:44 -08:00
Linus Torvalds
6bec003528 Merge branch 'for-3.20/bdi' of git://git.kernel.dk/linux-block
Pull backing device changes from Jens Axboe:
 "This contains a cleanup of how the backing device is handled, in
  preparation for a rework of the life time rules.  In this part, the
  most important change is to split the unrelated nommu mmap flags from
  it, but also removing a backing_dev_info pointer from the
  address_space (and inode), and a cleanup of other various minor bits.

  Christoph did all the work here, I just fixed an oops with pages that
  have a swap backing.  Arnd fixed a missing export, and Oleg killed the
  lustre backing_dev_info from staging.  Last patch was from Al,
  unexporting parts that are now no longer needed outside"

* 'for-3.20/bdi' of git://git.kernel.dk/linux-block:
  Make super_blocks and sb_lock static
  mtd: export new mtd_mmap_capabilities
  fs: make inode_to_bdi() handle NULL inode
  staging/lustre/llite: get rid of backing_dev_info
  fs: remove default_backing_dev_info
  fs: don't reassign dirty inodes to default_backing_dev_info
  nfs: don't call bdi_unregister
  ceph: remove call to bdi_unregister
  fs: remove mapping->backing_dev_info
  fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info
  nilfs2: set up s_bdi like the generic mount_bdev code
  block_dev: get bdev inode bdi directly from the block device
  block_dev: only write bdev inode on close
  fs: introduce f_op->mmap_capabilities for nommu mmap support
  fs: kill BDI_CAP_SWAP_BACKED
  fs: deduplicate noop_backing_dev_info
2015-02-12 13:50:21 -08:00
Linus Torvalds
8cc748aa76 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "Highlights:

   - Smack adds secmark support for Netfilter
   - /proc/keys is now mandatory if CONFIG_KEYS=y
   - TPM gets its own device class
   - Added TPM 2.0 support
   - Smack file hook rework (all Smack users should review this!)"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (64 commits)
  cipso: don't use IPCB() to locate the CIPSO IP option
  SELinux: fix error code in policydb_init()
  selinux: add security in-core xattr support for pstore and debugfs
  selinux: quiet the filesystem labeling behavior message
  selinux: Remove unused function avc_sidcmp()
  ima: /proc/keys is now mandatory
  Smack: Repair netfilter dependency
  X.509: silence asn1 compiler debug output
  X.509: shut up about included cert for silent build
  KEYS: Make /proc/keys unconditional if CONFIG_KEYS=y
  MAINTAINERS: email update
  tpm/tpm_tis: Add missing ifdef CONFIG_ACPI for pnp_acpi_device
  smack: fix possible use after frees in task_security() callers
  smack: Add missing logging in bidirectional UDS connect check
  Smack: secmark support for netfilter
  Smack: Rework file hooks
  tpm: fix format string error in tpm-chip.c
  char/tpm/tpm_crb: fix build error
  smack: Fix a bidirectional UDS connect check typo
  smack: introduce a special case for tmpfs in smack_d_instantiate()
  ...
2015-02-11 20:25:11 -08:00
Casey Schaufler
7f368ad34f Smack: secmark connections
If the secmark is available us it on connection as
well as packet delivery.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-02-11 12:55:05 -08:00
Dan Carpenter
6eb4e2b41b SELinux: fix error code in policydb_init()
If hashtab_create() returns a NULL pointer then we should return -ENOMEM
but instead the current code returns success.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-02-04 11:34:30 -05:00
Mark Salyzyn
d5f3a5f6e7 selinux: add security in-core xattr support for pstore and debugfs
- add "pstore" and "debugfs" to list of in-core exceptions
- change fstype checks to boolean equation
- change from strncmp to strcmp for checking

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
[PM: tweaked the subject line prefix to "selinux"]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-02-04 11:34:30 -05:00
Paul Moore
2088d60e3b selinux: quiet the filesystem labeling behavior message
While the filesystem labeling method is only printed at the KERN_DEBUG
level, this still appears in dmesg and on modern Linux distributions
that create a lot of tmpfs mounts for session handling, the dmesg can
easily be filled with a lot of "SELinux: initialized (dev X ..."
messages.  This patch removes this notification for the normal case
but leaves the error message intact (displayed when mounting a
filesystem with an unknown labeling behavior).

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-02-04 11:34:30 -05:00
Rickard Strandqvist
e230f12c98 selinux: Remove unused function avc_sidcmp()
Remove the function avc_sidcmp() that is not used anywhere.

This was partially found by using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
[PM: rewrite the patch subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-02-04 11:34:30 -05:00
David Howells
11cd64a234 ima: /proc/keys is now mandatory
/proc/keys is now mandatory and its config option no longer exists, so it
doesn't need selecting.

Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-02-02 13:19:48 +11:00
James Morris
bfc8419670 Merge tag 'keys-next-20150123' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2015-01-29 11:03:24 +11:00
Al Viro
f4a4a8b125 file->f_path.dentry is pinned down for as long as the file is open...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-01-25 23:16:27 -05:00
Al Viro
ad52184b70 selinuxfs: don't open-code d_genocide()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-01-25 23:16:25 -05:00
Stephen Smalley
79af73079d Add security hooks to binder and implement the hooks for SELinux.
Add security hooks to the binder and implement the hooks for SELinux.
The security hooks enable security modules such as SELinux to implement
controls over binder IPC.  The security hooks include support for
controlling what process can become the binder context manager
(binder_set_context_mgr), controlling the ability of a process
to invoke a binder transaction/IPC to another process (binder_transaction),
controlling the ability of a process to transfer a binder reference to
another process (binder_transfer_binder), and controlling the ability
of a process to transfer an open file to another process (binder_transfer_file).

These hooks have been included in the Android kernel trees since Android 4.3.

(Updated to reflect upstream relocation and changes to the binder driver,
changes to the LSM audit data structures, coding style cleanups, and
to add inline documentation for the hooks).

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Nick Kralevich <nnk@google.com>
Acked-by: Jeffrey Vander Stoep <jeffv@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-25 09:17:57 -08:00
Casey Schaufler
82b0b2c2b1 Smack: Repair netfilter dependency
On 1/23/2015 8:20 AM, Jim Davis wrote:
> Building with the attached random configuration file,
>
> security/smack/smack_netfilter.c: In function ‘smack_ipv4_output’:
> security/smack/smack_netfilter.c:55:6: error: ‘struct sk_buff’ has no
> member named ‘secmark’
>    skb->secmark = skp->smk_secid;
>       ^
> make[2]: *** [security/smack/smack_netfilter.o] Error 1

The existing Makefile used the wrong configuration option to
determine if smack_netfilter should be built. This sets it right.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-01-23 10:08:19 -08:00
David Howells
dabd39cc2f KEYS: Make /proc/keys unconditional if CONFIG_KEYS=y
Now that /proc/keys is used by libkeyutils to look up a key by type and
description, we should make it unconditional and remove
CONFIG_DEBUG_PROC_KEYS.

Reported-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
2015-01-22 22:34:32 +00:00
Andrey Ryabinin
6d1cff2a88 smack: fix possible use after frees in task_security() callers
We hit use after free on dereferncing pointer to task_smack struct in
smk_of_task() called from smack_task_to_inode().

task_security() macro uses task_cred_xxx() to get pointer to the task_smack.
task_cred_xxx() could be used only for non-pointer members of task's
credentials. It cannot be used for pointer members since what they point
to may disapper after dropping RCU read lock.

Mainly task_security() used this way:
	smk_of_task(task_security(p))

Intead of this introduce function smk_of_task_struct() which
takes task_struct as argument and returns pointer to smk_known struct
and do this under RCU read lock.
Bogus task_security() macro is not used anymore, so remove it.

KASan's report for this:

	AddressSanitizer: use after free in smack_task_to_inode+0x50/0x70 at addr c4635600
	=============================================================================
	BUG kmalloc-64 (Tainted: PO): kasan error
	-----------------------------------------------------------------------------

	Disabling lock debugging due to kernel taint
	INFO: Allocated in new_task_smack+0x44/0xd8 age=39 cpu=0 pid=1866
		kmem_cache_alloc_trace+0x88/0x1bc
		new_task_smack+0x44/0xd8
		smack_cred_prepare+0x48/0x21c
		security_prepare_creds+0x44/0x4c
		prepare_creds+0xdc/0x110
		smack_setprocattr+0x104/0x150
		security_setprocattr+0x4c/0x54
		proc_pid_attr_write+0x12c/0x194
		vfs_write+0x1b0/0x370
		SyS_write+0x5c/0x94
		ret_fast_syscall+0x0/0x48
	INFO: Freed in smack_cred_free+0xc4/0xd0 age=27 cpu=0 pid=1564
		kfree+0x270/0x290
		smack_cred_free+0xc4/0xd0
		security_cred_free+0x34/0x3c
		put_cred_rcu+0x58/0xcc
		rcu_process_callbacks+0x738/0x998
		__do_softirq+0x264/0x4cc
		do_softirq+0x94/0xf4
		irq_exit+0xbc/0x120
		handle_IRQ+0x104/0x134
		gic_handle_irq+0x70/0xac
		__irq_svc+0x44/0x78
		_raw_spin_unlock+0x18/0x48
		sync_inodes_sb+0x17c/0x1d8
		sync_filesystem+0xac/0xfc
		vdfs_file_fsync+0x90/0xc0
		vfs_fsync_range+0x74/0x7c
	INFO: Slab 0xd3b23f50 objects=32 used=31 fp=0xc4635600 flags=0x4080
	INFO: Object 0xc4635600 @offset=5632 fp=0x  (null)

	Bytes b4 c46355f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
	Object c4635600: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
	Object c4635610: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
	Object c4635620: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
	Object c4635630: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
	Redzone c4635640: bb bb bb bb                                      ....
	Padding c46356e8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
	Padding c46356f8: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
	CPU: 5 PID: 834 Comm: launchpad_prelo Tainted: PBO 3.10.30 #1
	Backtrace:
	[<c00233a4>] (dump_backtrace+0x0/0x158) from [<c0023dec>] (show_stack+0x20/0x24)
	 r7:c4634010 r6:d3b23f50 r5:c4635600 r4:d1002140
	[<c0023dcc>] (show_stack+0x0/0x24) from [<c06d6d7c>] (dump_stack+0x20/0x28)
	[<c06d6d5c>] (dump_stack+0x0/0x28) from [<c01c1d50>] (print_trailer+0x124/0x144)
	[<c01c1c2c>] (print_trailer+0x0/0x144) from [<c01c1e88>] (object_err+0x3c/0x44)
	 r7:c4635600 r6:d1002140 r5:d3b23f50 r4:c4635600
	[<c01c1e4c>] (object_err+0x0/0x44) from [<c01cac18>] (kasan_report_error+0x2b8/0x538)
	 r6:d1002140 r5:d3b23f50 r4:c6429cf8 r3:c09e1aa7
	[<c01ca960>] (kasan_report_error+0x0/0x538) from [<c01c9430>] (__asan_load4+0xd4/0xf8)
	[<c01c935c>] (__asan_load4+0x0/0xf8) from [<c031e168>] (smack_task_to_inode+0x50/0x70)
	 r5:c4635600 r4:ca9da000
	[<c031e118>] (smack_task_to_inode+0x0/0x70) from [<c031af64>] (security_task_to_inode+0x3c/0x44)
	 r5:cca25e80 r4:c0ba9780
	[<c031af28>] (security_task_to_inode+0x0/0x44) from [<c023d614>] (pid_revalidate+0x124/0x178)
	 r6:00000000 r5:cca25e80 r4:cbabe3c0 r3:00008124
	[<c023d4f0>] (pid_revalidate+0x0/0x178) from [<c01db98c>] (lookup_fast+0x35c/0x43y4)
	 r9:c6429efc r8:00000101 r7:c079d940 r6:c6429e90 r5:c6429ed8 r4:c83c4148
	[<c01db630>] (lookup_fast+0x0/0x434) from [<c01deec8>] (do_last.isra.24+0x1c0/0x1108)
	[<c01ded08>] (do_last.isra.24+0x0/0x1108) from [<c01dff04>] (path_openat.isra.25+0xf4/0x648)
	[<c01dfe10>] (path_openat.isra.25+0x0/0x648) from [<c01e1458>] (do_filp_open+0x3c/0x88)
	[<c01e141c>] (do_filp_open+0x0/0x88) from [<c01ccb28>] (do_sys_open+0xf0/0x198)
	 r7:00000001 r6:c0ea2180 r5:0000000b r4:00000000
	[<c01cca38>] (do_sys_open+0x0/0x198) from [<c01ccc00>] (SyS_open+0x30/0x34)
	[<c01ccbd0>] (SyS_open+0x0/0x34) from [<c001db80>] (ret_fast_syscall+0x0/0x48)
	Read of size 4 by thread T834:
	Memory state around the buggy address:
	 c4635380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
	 c4635400: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
	 c4635480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
	 c4635500: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
	 c4635580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
	>c4635600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	           ^
	 c4635680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 c4635700: 00 00 00 00 04 fc fc fc fc fc fc fc fc fc fc fc
	 c4635780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
	 c4635800: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc
	 c4635880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
	==================================================================

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: <stable@vger.kernel.org>
2015-01-21 11:56:53 -08:00
Ingo Molnar
f49028292c Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

  - Documentation updates.

  - Miscellaneous fixes.

  - Preemptible-RCU fixes, including fixing an old bug in the
    interaction of RCU priority boosting and CPU hotplug.

  - SRCU updates.

  - RCU CPU stall-warning updates.

  - RCU torture-test updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-21 06:12:21 +01:00
Rafal Krypa
138a868f00 smack: Add missing logging in bidirectional UDS connect check
During UDS connection check, both sides are checked for write access to
the other side. But only the first check is performed with audit support.
The second one didn't produce any audit logs. This simple patch fixes that.

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2015-01-20 16:35:31 -08:00
Casey Schaufler
69f287ae6f Smack: secmark support for netfilter
Smack uses CIPSO to label internet packets and thus provide
for access control on delivery of packets. The netfilter facility
was not used to allow for Smack to work properly without netfilter
configuration. Smack does not need netfilter, however there are
cases where it would be handy.

As a side effect, the labeling of local IPv4 packets can be optimized
and the handling of local IPv6 packets is just all out better.

The best part is that the netfilter tools use "contexts" that
are just strings, and they work just as well for Smack as they
do for SELinux.

All of the conditional compilation for IPv6 was implemented
by Rafal Krypa <r.krypa@samsung.com>

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-01-20 16:34:25 -08:00
Casey Schaufler
5e7270a6dd Smack: Rework file hooks
This is one of those cases where you look at code you did
years ago and wonder what you might have been thinking.
There are a number of LSM hooks that work off of file pointers,
and most of them really want the security data from the inode.
Some, however, really want the security context that the process
had when the file was opened. The difference went undetected in
Smack until it started getting used in a real system with real
testing. At that point it was clear that something was amiss.

This patch corrects the misuse of the f_security value in several
of the hooks. The behavior will not usually be any different, as
the process had to be able to open the file in the first place, and
the old check almost always succeeded, as will the new, but for
different reasons.

Thanks to the Samsung Tizen development team that identified this.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2015-01-20 16:32:17 -08:00
Christoph Hellwig
b4caecd480 fs: introduce f_op->mmap_capabilities for nommu mmap support
Since "BDI: Provide backing device capability information [try #3]" the
backing_dev_info structure also provides flags for the kind of mmap
operation available in a nommu environment, which is entirely unrelated
to it's original purpose.

Introduce a new nommu-only file operation to provide this information to
the nommu mmap code instead.  Splitting this from the backing_dev_info
structure allows to remove lots of backing_dev_info instance that aren't
otherwise needed, and entirely gets rid of the concept of providing a
backing_dev_info for a character device.  It also removes the need for
the mtd_inodefs filesystem.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-20 14:02:58 -07:00
Zbigniew Jasinski
96be7b5424 smack: Fix a bidirectional UDS connect check typo
The 54e70ec5eb commit introduced a
bidirectional check that should have checked for mutual WRITE access
between two labels. Due to a typo subject's OUT label is checked with
object's OUT. Should be OUT to IN.

Signed-off-by: Zbigniew Jasinski <z.jasinski@samsung.com>
2015-01-19 10:00:05 -08:00
Łukasz Stelmach
1d8c2326a4 smack: introduce a special case for tmpfs in smack_d_instantiate()
Files created with __shmem_file_stup() appear to have somewhat fake
dentries which make them look like root directories and not get
the label the current process or ("*") star meant for tmpfs files.

Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
2015-01-19 09:58:02 -08:00
Lukasz Pawelczyk
68390ccf8b smack: fix logic in smack_inode_init_security function
In principle if this function was called with "value" == NULL and "len"
not NULL it could return different results for the "len" compared to a
case where "name" was not NULL. This is a hypothetical case that does
not exist in the kernel, but it's a logic bug nonetheless.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
2015-01-19 09:18:11 -08:00
Lukasz Pawelczyk
1a28979b32 smack: miscellaneous small fixes in function comments
Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
2015-01-19 09:16:55 -08:00
Christoph Jaeger
6341e62b21 kconfig: use bool instead of boolean for type definition attributes
Support for keyword 'boolean' will be dropped later on.

No functional change.

Reference: http://lkml.kernel.org/r/cover.1418003065.git.cj@linux.com
Signed-off-by: Christoph Jaeger <cj@linux.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2015-01-07 13:08:04 +01:00
Pranith Kumar
83fe27ea53 rcu: Make SRCU optional by using CONFIG_SRCU
SRCU is not necessary to be compiled by default in all cases. For tinification
efforts not compiling SRCU unless necessary is desirable.

The current patch tries to make compiling SRCU optional by introducing a new
Kconfig option CONFIG_SRCU which is selected when any of the components making
use of SRCU are selected.

If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.

   text    data     bss     dec     hex filename
   2007       0       0    2007     7d7 kernel/rcu/srcu.o

Size of arch/powerpc/boot/zImage changes from

   text    data     bss     dec     hex filename
 831552   64180   23944  919676   e087c arch/powerpc/boot/zImage : before
 829504   64180   23952  917636   e0084 arch/powerpc/boot/zImage : after

so the savings are about ~2000 bytes.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]
2015-01-06 11:04:29 -08:00
Sasha Levin
a3a8784454 KEYS: close race between key lookup and freeing
When a key is being garbage collected, it's key->user would get put before
the ->destroy() callback is called, where the key is removed from it's
respective tracking structures.

This leaves a key hanging in a semi-invalid state which leaves a window open
for a different task to try an access key->user. An example is
find_keyring_by_name() which would dereference key->user for a key that is
in the process of being garbage collected (where key->user was freed but
->destroy() wasn't called yet - so it's still present in the linked list).

This would cause either a panic, or corrupt memory.

Fixes CVE-2014-9529.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2015-01-05 15:58:01 +00:00
Dan Carpenter
5057975ae3 KEYS: remove a bogus NULL check
We already checked if "desc" was NULL at the beginning of the function
and we've dereferenced it so this causes a static checker warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-12-16 18:05:20 +11:00
James Morris
d0bffab043 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into for-linus 2014-12-16 12:49:10 +11:00
Linus Torvalds
67e2c38838 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "In terms of changes, there's general maintenance to the Smack,
  SELinux, and integrity code.

  The IMA code adds a new kconfig option, IMA_APPRAISE_SIGNED_INIT,
  which allows IMA appraisal to require signatures.  Support for reading
  keys from rootfs before init is call is also added"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
  selinux: Remove security_ops extern
  security: smack: fix out-of-bounds access in smk_parse_smack()
  VFS: refactor vfs_read()
  ima: require signature based appraisal
  integrity: provide a hook to load keys when rootfs is ready
  ima: load x509 certificate from the kernel
  integrity: provide a function to load x509 certificate from the kernel
  integrity: define a new function integrity_read_file()
  Security: smack: replace kzalloc with kmem_cache for inode_smack
  Smack: Lock mode for the floor and hat labels
  ima: added support for new kernel cmdline parameter ima_template_fmt
  ima: allocate field pointers array on demand in template_desc_init_fields()
  ima: don't allocate a copy of template_fmt in template_desc_init_fields()
  ima: display template format in meas. list if template name length is zero
  ima: added error messages to template-related functions
  ima: use atomic bit operations to protect policy update interface
  ima: ignore empty and with whitespaces policy lines
  ima: no need to allocate entry for comment
  ima: report policy load status
  ima: use path names cache
  ...
2014-12-14 20:36:37 -08:00
Linus Torvalds
cbfe0de303 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS changes from Al Viro:
 "First pile out of several (there _definitely_ will be more).  Stuff in
  this one:

   - unification of d_splice_alias()/d_materialize_unique()

   - iov_iter rewrite

   - killing a bunch of ->f_path.dentry users (and f_dentry macro).

     Getting that completed will make life much simpler for
     unionmount/overlayfs, since then we'll be able to limit the places
     sensitive to file _dentry_ to reasonably few.  Which allows to have
     file_inode(file) pointing to inode in a covered layer, with dentry
     pointing to (negative) dentry in union one.

     Still not complete, but much closer now.

   - crapectomy in lustre (dead code removal, mostly)

   - "let's make seq_printf return nothing" preparations

   - assorted cleanups and fixes

  There _definitely_ will be more piles"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  copy_from_iter_nocache()
  new helper: iov_iter_kvec()
  csum_and_copy_..._iter()
  iov_iter.c: handle ITER_KVEC directly
  iov_iter.c: convert copy_to_iter() to iterate_and_advance
  iov_iter.c: convert copy_from_iter() to iterate_and_advance
  iov_iter.c: get rid of bvec_copy_page_{to,from}_iter()
  iov_iter.c: convert iov_iter_zero() to iterate_and_advance
  iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds
  iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds
  iov_iter.c: convert iov_iter_npages() to iterate_all_kinds
  iov_iter.c: iterate_and_advance
  iov_iter.c: macros for iterating over iov_iter
  kill f_dentry macro
  dcache: fix kmemcheck warning in switch_names
  new helper: audit_file()
  nfsd_vfs_write(): use file_inode()
  ncpfs: use file_inode()
  kill f_dentry uses
  lockd: get rid of ->f_path.dentry->d_sb
  ...
2014-12-10 16:10:49 -08:00
Al Viro
ba00410b81 Merge branch 'iov_iter' into for-next 2014-12-08 20:39:29 -05:00
Michael Ellerman
63a0eb7891 ima: Fix build failure on powerpc when TCG_IBMVTPM dependencies are not met
On powerpc we can end up with IMA=y and PPC_PSERIES=n which leads to:

 warning: (IMA) selects TCG_IBMVTPM which has unmet direct dependencies (TCG_TPM && PPC_PSERIES)
  tpm_ibmvtpm.c:(.text+0x14f3e8): undefined reference to `.plpar_hcall_norets'

I'm not sure why IMA needs to select those user-visible symbols, but if
it must then the simplest fix is to just express the proper dependencies
on the select.

Tested-by: Hon Ching (Vicky) Lo <lo1@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-12-06 21:50:37 -05:00
Takashi Iwai
b26bdde5bb KEYS: Fix stale key registration at error path
When loading encrypted-keys module, if the last check of
aes_get_sizes() in init_encrypted() fails, the driver just returns an
error without unregistering its key type.  This results in the stale
entry in the list.  In addition to memory leaks, this leads to a kernel
crash when registering a new key type later.

This patch fixes the problem by swapping the calls of aes_get_sizes()
and register_key_type(), and releasing resources properly at the error
paths.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=908163
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-12-06 21:50:36 -05:00
James Morris
b2d1965dce Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2014-12-05 08:49:14 +11:00
David Howells
0b0a84154e KEYS: request_key() should reget expired keys rather than give EKEYEXPIRED
Since the keyring facility can be viewed as a cache (at least in some
applications), the local expiration time on the key should probably be viewed
as a 'needs updating after this time' property rather than an absolute 'anyone
now wanting to use this object is out of luck' property.

Since request_key() is the main interface for the usage of keys, this should
update or replace an expired key rather than issuing EKEYEXPIRED if the local
expiration has been reached (ie. it should refresh the cache).

For absolute conditions where refreshing the cache probably doesn't help, the
key can be negatively instantiated using KEYCTL_REJECT_KEY with EKEYEXPIRED
given as the error to issue.  This will still cause request_key() to return
EKEYEXPIRED as that was explicitly set.

In the future, if the key type has an update op available, we might want to
upcall with the expired key and allow the upcall to update it.  We would pass
a different operation name (the first column in /etc/request-key.conf) to the
request-key program.

request_key() returning EKEYEXPIRED is causing an NFS problem which Chuck
Lever describes thusly:

	After about 10 minutes, my NFSv4 functional tests fail because the
	ownership of the test files goes to "-2". Looking at /proc/keys
	shows that the id_resolv keys that map to my test user ID have
	expired. The ownership problem persists until the expired keys are
	purged from the keyring, and fresh keys are obtained.

	I bisected the problem to 3.13 commit b2a4df200d ("KEYS: Expand
	the capacity of a keyring"). This commit inadvertantly changes the
	API contract of the internal function keyring_search_aux().

	The root cause appears to be that b2a4df200d made "no state check"
	the default behavior. "No state check" means the keyring search
	iterator function skips checking the key's expiry timeout, and
	returns expired keys.  request_key_and_link() depends on getting
	an -EAGAIN result code to know when to perform an upcall to refresh
	an expired key.

This patch can be tested directly by:

	keyctl request2 user debug:fred a @s
	keyctl timeout %user:debug:fred 3
	sleep 4
	keyctl request2 user debug:fred a @s

Without the patch, the last command gives error EKEYEXPIRED, but with the
command it gives a new key.

Reported-by: Carl Hetherington <cth@carlh.net>
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2014-12-01 22:52:53 +00:00
David Howells
054f6180d8 KEYS: Simplify KEYRING_SEARCH_{NO,DO}_STATE_CHECK flags
Simplify KEYRING_SEARCH_{NO,DO}_STATE_CHECK flags to be two variations of the
same flag.  They are effectively mutually exclusive and one or the other
should be provided, but not both.

Keyring cycle detection and key possession determination are the only things
that set NO_STATE_CHECK, except that neither flag really does anything there
because neither purpose makes use of the keyring_search_iterator() function,
but rather provides their own.

For cycle detection we definitely want to check inside of expired keyrings,
just so that we don't create a cycle we can't get rid of.  Revoked keyrings
are cleared at revocation time and can't then be reused, so shouldn't be a
problem either way.

For possession determination, we *might* want to validate each keyring before
searching it: do you possess a key that's hidden behind an expired or just
plain inaccessible keyring?  Currently, the answer is yes.  Note that you
cannot, however, possess a key behind a revoked keyring because they are
cleared on revocation.

keyring_search() sets DO_STATE_CHECK, which is correct.

request_key_and_link() currently doesn't specify whether to check the key
state or not - but it should set DO_STATE_CHECK.

key_get_instantiation_authkey() also currently doesn't specify whether to
check the key state or not - but it probably should also set DO_STATE_CHECK.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2014-12-01 22:52:50 +00:00
David Howells
aa9d443789 KEYS: Fix the size of the key description passed to/from userspace
When a key description argument is imported into the kernel from userspace, as
happens in add_key(), request_key(), KEYCTL_JOIN_SESSION_KEYRING,
KEYCTL_SEARCH, the description is copied into a buffer up to PAGE_SIZE in size.
PAGE_SIZE, however, is a variable quantity, depending on the arch.  Fix this at
4096 instead (ie. 4095 plus a NUL termination) and define a constant
(KEY_MAX_DESC_SIZE) to this end.

When reading the description back with KEYCTL_DESCRIBE, a PAGE_SIZE internal
buffer is allocated into which the information and description will be
rendered.  This means that the description will get truncated if an extremely
long description it has to be crammed into the buffer with the stringified
information.  There is no particular need to copy the description into the
buffer, so just copy it directly to userspace in a separate operation.

Reported-by: Christian Kastner <debian@kvr.at>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Christian Kastner <debian@kvr.at>
2014-12-01 22:52:45 +00:00
Yao Dongdong
00fec2a10b selinux: Remove security_ops extern
security_ops is not used in this file.

Signed-off-by: Yao Dongdong <yaodongdong@huawei.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-12-01 16:42:50 -05:00
James Morris
ac14ae25b6 Merge branch 'smack-for-3.19' of git://git.gitorious.org/smack-next/kernel into next 2014-11-27 00:35:32 +11:00
Andrey Ryabinin
5c1b66240b security: smack: fix out-of-bounds access in smk_parse_smack()
Setting smack label on file (e.g. 'attr -S -s SMACK64 -V "test" test')
triggered following spew on the kernel with KASan applied:
    ==================================================================
    BUG: AddressSanitizer: out of bounds access in strncpy+0x28/0x60 at addr ffff8800059ad064
    =============================================================================
    BUG kmalloc-8 (Not tainted): kasan error
    -----------------------------------------------------------------------------

    Disabling lock debugging due to kernel taint
    INFO: Slab 0xffffea0000166b40 objects=128 used=7 fp=0xffff8800059ad080 flags=0x4000000000000080
    INFO: Object 0xffff8800059ad060 @offset=96 fp=0xffff8800059ad080

    Bytes b4 ffff8800059ad050: a0 df 9a 05 00 88 ff ff 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
    Object ffff8800059ad060: 74 65 73 74 6b 6b 6b a5                          testkkk.
    Redzone ffff8800059ad068: cc cc cc cc cc cc cc cc                          ........
    Padding ffff8800059ad078: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
    CPU: 0 PID: 528 Comm: attr Tainted: G    B          3.18.0-rc1-mm1+ #5
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
     0000000000000000 ffff8800059ad064 ffffffff81534cf2 ffff880005a5bc40
     ffffffff8112fe1a 0000000100800006 0000000f059ad060 ffff880006000f90
     0000000000000296 ffffea0000166b40 ffffffff8107ca97 ffff880005891060
    Call Trace:
    ? dump_stack (lib/dump_stack.c:52)
    ? kasan_report_error (mm/kasan/report.c:102 mm/kasan/report.c:178)
    ? preempt_count_sub (kernel/sched/core.c:2651)
    ? __asan_load1 (mm/kasan/kasan.h:50 mm/kasan/kasan.c:248 mm/kasan/kasan.c:358)
    ? strncpy (lib/string.c:121)
    ? strncpy (lib/string.c:121)
    ? smk_parse_smack (security/smack/smack_access.c:457)
    ? setxattr (fs/xattr.c:343)
    ? smk_import_entry (security/smack/smack_access.c:514)
    ? smack_inode_setxattr (security/smack/smack_lsm.c:1093 (discriminator 1))
    ? security_inode_setxattr (security/security.c:602)
    ? vfs_setxattr (fs/xattr.c:134)
    ? setxattr (fs/xattr.c:343)
    ? setxattr (fs/xattr.c:360)
    ? get_parent_ip (kernel/sched/core.c:2606)
    ? preempt_count_sub (kernel/sched/core.c:2651)
    ? __percpu_counter_add (arch/x86/include/asm/preempt.h:98 lib/percpu_counter.c:90)
    ? get_parent_ip (kernel/sched/core.c:2606)
    ? preempt_count_sub (kernel/sched/core.c:2651)
    ? __mnt_want_write (arch/x86/include/asm/preempt.h:98 fs/namespace.c:359)
    ? path_setxattr (fs/xattr.c:380)
    ? SyS_lsetxattr (fs/xattr.c:397)
    ? system_call_fastpath (arch/x86/kernel/entry_64.S:423)
    Read of size 1 by task attr:
    Memory state around the buggy address:
     ffff8800059ace80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ffff8800059acf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ffff8800059acf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    >ffff8800059ad000: 00 fc fc fc 00 fc fc fc 05 fc fc fc 04 fc fc fc
                                                           ^
     ffff8800059ad080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff8800059ad100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff8800059ad180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================

strncpy() copies one byte more than the source string has.
Fix this by passing the correct length to strncpy().

Now we can remove initialization of the last byte in 'smack' string
because kzalloc() already did this for us.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
2014-11-21 13:14:22 -08:00
Al Viro
b583043e99 kill f_dentry uses
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-19 13:01:25 -05:00
Al Viro
a455589f18 assorted conversions to %p[dD]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-19 13:01:20 -05:00
James Morris
a6aacbde40 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2014-11-19 21:36:07 +11:00
James Morris
b10778a00d Merge commit 'v3.17' into next 2014-11-19 21:32:12 +11:00
Dmitry Kasatkin
6fb5032ebb VFS: refactor vfs_read()
integrity_kernel_read() duplicates the file read operations code
in vfs_read(). This patch refactors vfs_read() code creating a
helper function __vfs_read(). It is used by both vfs_read() and
integrity_kernel_read().

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-11-17 23:14:22 -05:00
Dmitry Kasatkin
c57782c13e ima: require signature based appraisal
This patch provides CONFIG_IMA_APPRAISE_SIGNED_INIT kernel configuration
option to force IMA appraisal using signatures. This is useful, when EVM
key is not initialized yet and we want securely initialize integrity or
any other functionality.

It forces embedded policy to require signature. Signed initialization
script can initialize EVM key, update the IMA policy and change further
requirement of everything to be signed.

Changes in v3:
* kernel parameter fixed to configuration option in the patch description

Changes in v2:
* policy change of this patch separated from the key loading patch

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-11-17 23:12:01 -05:00
Dmitry Kasatkin
c9cd2ce2bc integrity: provide a hook to load keys when rootfs is ready
Keys can only be loaded once the rootfs is mounted. Initcalls
are not suitable for that. This patch defines a special hook
to load the x509 public keys onto the IMA keyring, before
attempting to access any file. The keys are required for
verifying the file's signature. The hook is called after the
root filesystem is mounted and before the kernel calls 'init'.

Changes in v3:
* added more explanation to the patch description (Mimi)

Changes in v2:
* Hook renamed as 'integrity_load_keys()' to handle both IMA and EVM
  keys by integrity subsystem.
* Hook patch moved after defining loading functions

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-11-17 23:12:01 -05:00
Dmitry Kasatkin
fd5f4e9054 ima: load x509 certificate from the kernel
Define configuration option to load X509 certificate into the
IMA trusted kernel keyring. It implements ima_load_x509() hook
to load X509 certificate into the .ima trusted kernel keyring
from the root filesystem.

Changes in v3:
* use ima_policy_flag in ima_get_action()
  ima_load_x509 temporarily clears ima_policy_flag to disable
  appraisal to load key. Use it to skip appraisal rules.
* Key directory path changed to /etc/keys (Mimi)
* Expand IMA_LOAD_X509 Kconfig help

Changes in v2:
* added '__init'
* use ima_policy_flag to disable appraisal to load keys

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-11-17 23:12:00 -05:00
Dmitry Kasatkin
65d543b233 integrity: provide a function to load x509 certificate from the kernel
Provide the function to load x509 certificates from the kernel into the
integrity kernel keyring.

Changes in v2:
* configuration option removed
* function declared as '__init'

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-11-17 23:11:59 -05:00
Dmitry Kasatkin
e3c4abbfa9 integrity: define a new function integrity_read_file()
This patch defines a new function called integrity_read_file()
to read file from the kernel into a buffer. Subsequent patches
will read a file containing the public keys and load them onto
the IMA keyring.

This patch moves and renames ima_kernel_read(), the non-security
checking version of kernel_read(), to integrity_kernel_read().

Changes in v3:
* Patch descriptions improved (Mimi)
* Add missing cast (kbuild test robot)

Changes in v2:
* configuration option removed
* function declared as '__init'

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-11-17 23:09:18 -05:00
James Morris
09c6268927 Merge branch 'stable-3.18' of git://git.infradead.org/users/pcmoore/selinux into for-linus 2014-11-13 21:49:53 +11:00
Richard Guy Briggs
d950f84c1c selinux: convert WARN_ONCE() to printk() in selinux_nlmsg_perm()
Convert WARN_ONCE() to printk() in selinux_nlmsg_perm().

After conversion from audit_log() in commit e173fb26, WARN_ONCE() was
deemed too alarmist, so switch it to printk().

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: Changed to printk(WARNING) so we catch all of the different
 invalid netlink messages.  In Richard's defense, he brought this
 point up earlier, but I didn't understand his point at the time.]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-11-12 16:14:02 -05:00
Al Viro
946e51f2bf move d_rcu from overlapping d_child to overlapping d_alias
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-03 15:20:29 -05:00
Rohit
1a5b472bde Security: smack: replace kzalloc with kmem_cache for inode_smack
The patch use kmem_cache to allocate/free inode_smack since they are
alloced in high volumes making it a perfect case for kmem_cache.

As per analysis, 24 bytes of memory is wasted per allocation due
to internal fragmentation. With kmem_cache, this can be avoided.

Accounting of memory allocation is below :
 total       slack            net      count-alloc/free        caller
Before (with kzalloc)
1919872      719952          1919872      29998/0          new_inode_smack+0x14

After (with kmem_cache)
1201680          0           1201680      30042/0          new_inode_smack+0x18

>From above data, we found that 719952 bytes(~700 KB) of memory is
saved on allocation of 29998 smack inodes.

Signed-off-by: Rohit <rohit.kr@samsung.com>
2014-10-31 14:29:32 -07:00
James Morris
6c880ad51b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into for-linus 2014-10-29 15:03:54 +11:00
Casey Schaufler
6c892df268 Smack: Lock mode for the floor and hat labels
The lock access mode allows setting a read lock on a file
for with the process has only read access. The floor label is
defined to make it easy to have the basic system installed such
that everyone can read it. Once there's a desire to read lock
(rationally or otherwise) a floor file a rule needs to get set.
This happens all the time, so make the floor label a little bit
more special and allow everyone lock access, too. By implication,
give processes with the hat label (hat can read everything)
lock access as well. This reduces clutter in the Smack rule set.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-10-28 08:22:40 -07:00
Dmitry Kasatkin
3b1deef6b1 evm: check xattr value length and type in evm_inode_setxattr()
evm_inode_setxattr() can be called with no value. The function does not
check the length so that following command can be used to produce the
kernel oops: setfattr -n security.evm FOO. This patch fixes it.

Changes in v3:
* there is no reason to return different error codes for EVM_XATTR_HMAC
  and non EVM_XATTR_HMAC. Remove unnecessary test then.

Changes in v2:
* testing for validity of xattr type

[ 1106.396921] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1106.398192] IP: [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.399244] PGD 29048067 PUD 290d7067 PMD 0
[ 1106.399953] Oops: 0000 [#1] SMP
[ 1106.400020] Modules linked in: bridge stp llc evdev serio_raw i2c_piix4 button fuse
[ 1106.400020] CPU: 0 PID: 3635 Comm: setxattr Not tainted 3.16.0-kds+ #2936
[ 1106.400020] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1106.400020] task: ffff8800291a0000 ti: ffff88002917c000 task.ti: ffff88002917c000
[ 1106.400020] RIP: 0010:[<ffffffff812af7b8>]  [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.400020] RSP: 0018:ffff88002917fd50  EFLAGS: 00010246
[ 1106.400020] RAX: 0000000000000000 RBX: ffff88002917fdf8 RCX: 0000000000000000
[ 1106.400020] RDX: 0000000000000000 RSI: ffffffff818136d3 RDI: ffff88002917fdf8
[ 1106.400020] RBP: ffff88002917fd68 R08: 0000000000000000 R09: 00000000003ec1df
[ 1106.400020] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800438a0a00
[ 1106.400020] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1106.400020] FS:  00007f7dfa7d7740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
[ 1106.400020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1106.400020] CR2: 0000000000000000 CR3: 000000003763e000 CR4: 00000000000006f0
[ 1106.400020] Stack:
[ 1106.400020]  ffff8800438a0a00 ffff88002917fdf8 0000000000000000 ffff88002917fd98
[ 1106.400020]  ffffffff812a1030 ffff8800438a0a00 ffff88002917fdf8 0000000000000000
[ 1106.400020]  0000000000000000 ffff88002917fde0 ffffffff8116d08a ffff88002917fdc8
[ 1106.400020] Call Trace:
[ 1106.400020]  [<ffffffff812a1030>] security_inode_setxattr+0x5d/0x6a
[ 1106.400020]  [<ffffffff8116d08a>] vfs_setxattr+0x6b/0x9f
[ 1106.400020]  [<ffffffff8116d1e0>] setxattr+0x122/0x16c
[ 1106.400020]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[ 1106.400020]  [<ffffffff8114d011>] ? __sb_start_write+0x10f/0x143
[ 1106.400020]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[ 1106.400020]  [<ffffffff811687c0>] ? __mnt_want_write+0x48/0x4f
[ 1106.400020]  [<ffffffff8116d3e6>] SyS_setxattr+0x6e/0xb0
[ 1106.400020]  [<ffffffff81529da9>] system_call_fastpath+0x16/0x1b
[ 1106.400020] Code: c3 0f 1f 44 00 00 55 48 89 e5 41 55 49 89 d5 41 54 49 89 fc 53 48 89 f3 48 c7 c6 d3 36 81 81 48 89 df e8 18 22 04 00 85 c0 75 07 <41> 80 7d 00 02 74 0d 48 89 de 4c 89 e7 e8 5a fe ff ff eb 03 83
[ 1106.400020] RIP  [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.400020]  RSP <ffff88002917fd50>
[ 1106.400020] CR2: 0000000000000000
[ 1106.428061] ---[ end trace ae08331628ba3050 ]---

Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-28 10:06:31 -04:00
Dmitry Kasatkin
a48fda9de9 ima: check xattr value length and type in the ima_inode_setxattr()
ima_inode_setxattr() can be called with no value. Function does not
check the length so that following command can be used to produce
kernel oops: setfattr -n security.ima FOO. This patch fixes it.

Changes in v3:
* for stable reverted "allow setting hash only in fix or log mode"
  It will be a separate patch.

Changes in v2:
* testing validity of xattr type
* allow setting hash only in fix or log mode (Mimi)

[  261.562522] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  261.564109] IP: [<ffffffff812af272>] ima_inode_setxattr+0x3e/0x5a
[  261.564109] PGD 3112f067 PUD 42965067 PMD 0
[  261.564109] Oops: 0000 [#1] SMP
[  261.564109] Modules linked in: bridge stp llc evdev serio_raw i2c_piix4 button fuse
[  261.564109] CPU: 0 PID: 3299 Comm: setxattr Not tainted 3.16.0-kds+ #2924
[  261.564109] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  261.564109] task: ffff8800428c2430 ti: ffff880042be0000 task.ti: ffff880042be0000
[  261.564109] RIP: 0010:[<ffffffff812af272>]  [<ffffffff812af272>] ima_inode_setxattr+0x3e/0x5a
[  261.564109] RSP: 0018:ffff880042be3d50  EFLAGS: 00010246
[  261.564109] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000015
[  261.564109] RDX: 0000001500000000 RSI: 0000000000000000 RDI: ffff8800375cc600
[  261.564109] RBP: ffff880042be3d68 R08: 0000000000000000 R09: 00000000004d6256
[  261.564109] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88002149ba00
[  261.564109] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  261.564109] FS:  00007f6c1e219740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
[  261.564109] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  261.564109] CR2: 0000000000000000 CR3: 000000003b35a000 CR4: 00000000000006f0
[  261.564109] Stack:
[  261.564109]  ffff88002149ba00 ffff880042be3df8 0000000000000000 ffff880042be3d98
[  261.564109]  ffffffff812a101b ffff88002149ba00 ffff880042be3df8 0000000000000000
[  261.564109]  0000000000000000 ffff880042be3de0 ffffffff8116d08a ffff880042be3dc8
[  261.564109] Call Trace:
[  261.564109]  [<ffffffff812a101b>] security_inode_setxattr+0x48/0x6a
[  261.564109]  [<ffffffff8116d08a>] vfs_setxattr+0x6b/0x9f
[  261.564109]  [<ffffffff8116d1e0>] setxattr+0x122/0x16c
[  261.564109]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[  261.564109]  [<ffffffff8114d011>] ? __sb_start_write+0x10f/0x143
[  261.564109]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[  261.564109]  [<ffffffff811687c0>] ? __mnt_want_write+0x48/0x4f
[  261.564109]  [<ffffffff8116d3e6>] SyS_setxattr+0x6e/0xb0
[  261.564109]  [<ffffffff81529da9>] system_call_fastpath+0x16/0x1b
[  261.564109] Code: 48 89 f7 48 c7 c6 58 36 81 81 53 31 db e8 73 27 04 00 85 c0 75 28 bf 15 00 00 00 e8 8a a5 d9 ff 84 c0 75 05 83 cb ff eb 15 31 f6 <41> 80 7d 00 03 49 8b 7c 24 68 40 0f 94 c6 e8 e1 f9 ff ff 89 d8
[  261.564109] RIP  [<ffffffff812af272>] ima_inode_setxattr+0x3e/0x5a
[  261.564109]  RSP <ffff880042be3d50>
[  261.564109] CR2: 0000000000000000
[  261.599998] ---[ end trace 39a89a3fc267e652 ]---

Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-28 10:03:49 -04:00
James Morris
9b32011acd Merge branch 'stable-3.18' of git://git.infradead.org/users/pcmoore/selinux into for-linus2 2014-10-16 21:04:18 +11:00
Stephen Smalley
923190d32d selinux: fix inode security list corruption
sb_finish_set_opts() can race with inode_free_security()
when initializing inode security structures for inodes
created prior to initial policy load or by the filesystem
during ->mount().   This appears to have always been
a possible race, but commit 3dc91d4 ("SELinux:  Fix possible
NULL pointer dereference in selinux_inode_permission()")
made it more evident by immediately reusing the unioned
list/rcu element  of the inode security structure for call_rcu()
upon an inode_free_security().  But the underlying issue
was already present before that commit as a possible use-after-free
of isec.

Shivnandan Kumar reported the list corruption and proposed
a patch to split the list and rcu elements out of the union
as separate fields of the inode_security_struct so that setting
the rcu element would not affect the list element.  However,
this would merely hide the issue and not truly fix the code.

This patch instead moves up the deletion of the list entry
prior to dropping the sbsec->isec_lock initially.  Then,
if the inode is dropped subsequently, there will be no further
references to the isec.

Reported-by: Shivnandan Kumar <shivnandan.k@samsung.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-10-15 10:37:02 -04:00
Behan Webster
357aabed62 security, crypto: LLVMLinux: Remove VLAIS from ima_crypto.c
Replaced the use of a Variable Length Array In Struct (VLAIS) with a C99
compliant equivalent. This patch allocates the appropriate amount of memory
using a char array using the SHASH_DESC_ON_STACK macro.

The new code can be compiled with both gcc and clang.

Signed-off-by: Behan Webster <behanw@converseincode.com>
Reviewed-by: Mark Charlebois <charlebm@gmail.com>
Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Cc: tglx@linutronix.de
2014-10-14 10:51:24 +02:00
Roberto Sassu
c2426d2ad5 ima: added support for new kernel cmdline parameter ima_template_fmt
This patch allows users to provide a custom template format through the
new kernel command line parameter 'ima_template_fmt'. If the supplied
format is not valid, IMA uses the default template descriptor.

Changelog:
 - v3:
   - added check for 'fields' and 'num_fields' in
     template_desc_init_fields() (suggested by Mimi Zohar)

 - v2:
   - using template_desc_init_fields() to validate a format string
     (Roberto Sassu)
   - updated documentation by stating that only the chosen template
     descriptor is initialized (Roberto Sassu)

 - v1:
   - simplified code of ima_template_fmt_setup()
     (Roberto Sassu, suggested by Mimi Zohar)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-13 08:39:02 -04:00
Roberto Sassu
1bd7face74 ima: allocate field pointers array on demand in template_desc_init_fields()
The allocation of a field pointers array is moved at the end of
template_desc_init_fields() and done only if the value of the 'fields'
and 'num_fields' parameters is not NULL. For just validating a template
format string, retrieved template field pointers are placed in a temporary
array.

Changelog:
 - v3:
   - do not check in this patch if 'fields' and 'num_fields' are NULL
     (suggested by Mimi Zohar)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-13 08:39:02 -04:00
Roberto Sassu
9f3166b8ca ima: don't allocate a copy of template_fmt in template_desc_init_fields()
This patch removes the allocation of a copy of 'template_fmt', needed for
iterating over all fields in the passed template format string. The removal
was possible by replacing strcspn(), which modifies the passed string,
with strchrnul(). The currently processed template field is copied in
a temporary variable.

The purpose of this change is use template_desc_init_fields() in two ways:
for just validating a template format string (the function should work
if called by a setup function, when memory cannot be allocated), and for
actually initializing a template descriptor. The implementation of this
feature will be complete with the next patch.

Changelog:
 - v3:
   - added 'goto out' in template_desc_init_fields() to free allocated
     memory if a template field length is not valid (suggested by
     Mimi Zohar)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-13 08:39:01 -04:00
Roberto Sassu
7dbdb4206b ima: display template format in meas. list if template name length is zero
With the introduction of the 'ima_template_fmt' kernel cmdline parameter,
a user can define a new template descriptor with custom format. However,
in this case, userspace tools will be unable to parse the measurements
list because the new template is unknown. For this reason, this patch
modifies the current IMA behavior to display in the list the template
format instead of the name (only if the length of the latter is zero)
so that a tool can extract needed information if it can handle listed
fields.

This patch also correctly displays the error log message in
ima_init_template() if the selected template cannot be initialized.

Changelog:
 - v3:
   - check the first byte of 'e->template_desc->name' instead of using
     strlen() in ima_fs.c (suggested by Mimi Zohar)

 - v2:
   - print the template format in ima_init_template(), if the selected
     template is custom (Roberto Sassu)

 - v1:
   - fixed patch description (Roberto Sassu, suggested by Mimi Zohar)
   - set 'template_name' variable in ima_fs.c only once
     (Roberto Sassu, suggested by Mimi Zohar)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-13 08:39:01 -04:00
Roberto Sassu
71fed2eee0 ima: added error messages to template-related functions
This patch adds some error messages to inform users about the following
events: template descriptor not found, invalid template descriptor,
template field not found and template initialization failed.

Changelog:
 - v2:
   - display an error message if the format string contains too many
     fields (Roberto Sassu)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-13 08:39:01 -04:00
Linus Torvalds
5e40d331bd Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris.

Mostly ima, selinux, smack and key handling updates.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits)
  integrity: do zero padding of the key id
  KEYS: output last portion of fingerprint in /proc/keys
  KEYS: strip 'id:' from ca_keyid
  KEYS: use swapped SKID for performing partial matching
  KEYS: Restore partial ID matching functionality for asymmetric keys
  X.509: If available, use the raw subjKeyId to form the key description
  KEYS: handle error code encoded in pointer
  selinux: normalize audit log formatting
  selinux: cleanup error reporting in selinux_nlmsg_perm()
  KEYS: Check hex2bin()'s return when generating an asymmetric key ID
  ima: detect violations for mmaped files
  ima: fix race condition on ima_rdwr_violation_check and process_measurement
  ima: added ima_policy_flag variable
  ima: return an error code from ima_add_boot_aggregate()
  ima: provide 'ima_appraise=log' kernel option
  ima: move keyring initialization to ima_init()
  PKCS#7: Handle PKCS#7 messages that contain no X.509 certs
  PKCS#7: Better handling of unsupported crypto
  KEYS: Overhaul key identification when searching for asymmetric keys
  KEYS: Implement binary asymmetric key ID handling
  ...
2014-10-12 10:13:55 -04:00
Dmitry Kasatkin
0716abbb58 ima: use atomic bit operations to protect policy update interface
The current implementation uses an atomic counter to provide exclusive
access to the sysfs 'policy' entry to update the IMA policy. While it is
highly unlikely, the usage of a counter might potentially allow another
process to overflow the counter, open the interface and insert additional
rules into the policy being loaded.

This patch replaces using an atomic counter with atomic bit operations
which is more reliable and a widely used method to provide exclusive access.

As bit operation keep the interface locked after successful update, it makes
it unnecessary to verify if the default policy was set or not during parsing
and interface closing. This patch also removes that code.

Changes in v3:
* move audit log message to ima_relead_policy() to report successful and
  unsuccessful result
* unnecessary comment removed

Changes in v2:
* keep interface locked after successful policy load as in original design
* remove sysfs entry as in original design

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-11 23:33:02 -04:00
Dmitry Kasatkin
7178784f0a ima: ignore empty and with whitespaces policy lines
Empty policy lines cause parsing failures which is, especially
for new users, hard to spot. This patch prevents it.

Changes in v2:
* strip leading blanks and tabs in rules to prevent parsing failures

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-11 23:29:19 -04:00
Dmitry Kasatkin
272a6e90ff ima: no need to allocate entry for comment
If a rule is a comment, there is no need to allocate an entry.
Move the checking for comments before allocating the entry.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-11 23:28:07 -04:00
Dmitry Kasatkin
78bb5d0b4f ima: report policy load status
Audit messages are rate limited, often causing the policy update
info to not be visible.  Report policy loading status also using
pr_info.

Changes in v2:
* reporting moved to ima_release_policy to notice parsing errors
* reporting both completed and failed status

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-11 23:25:25 -04:00
Linus Torvalds
ef4a48c513 File locking related changes for v3.18 (pile #1)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNZK4AAoJEAAOaEEZVoIVI08P/iM7eaIVRnqaqtWw/JBzxiba
 EMDlJYUBSlv6lYk9s8RJT4bMmcmGAKSYzVAHSoPahzNcqTDdFLeDTLGxJ8uKBbjf
 d1qRRdH1yZHGUzCvJq3mEendjfXn435Y3YburUxjLfmzrzW7EbMvndiQsS5dhAm9
 PEZ+wrKF/zFL7LuXa1YznYrbqOD/GRsJAXGEWc3kNwfS9avephVG/RI3GtpI2PJj
 RY1mf8P7+WOlrShYoEuUo5aqs01MnU70LbqGHzY8/QKH+Cb0SOkCHZPZyClpiA+G
 MMJ+o2XWcif3BZYz+dobwz/FpNZ0Bar102xvm2E8fqByr/T20JFjzooTKsQ+PtCk
 DetQptrU2gtyZDKtInJUQSDPrs4cvA13TW+OEB1tT8rKBnmyEbY3/TxBpBTB9E6j
 eb/V3iuWnywR3iE+yyvx24Qe7Pov6deM31s46+Vj+GQDuWmAUJXemhfzPtZiYpMT
 exMXTyDS3j+W+kKqHblfU5f+Bh1eYGpG2m43wJVMLXKV7NwDf8nVV+Wea962ga+w
 BAM3ia4JRVgRWJBPsnre3lvGT5kKPyfTZsoG+kOfRxiorus2OABoK+SIZBZ+c65V
 Xh8VH5p3qyCUBOynXlHJWFqYWe2wH0LfbPrwe9dQwTwON51WF082EMG5zxTG0Ymf
 J2z9Shz68zu0ok8cuSlo
 =Hhee
 -----END PGP SIGNATURE-----

Merge tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux

Pull file locking related changes from Jeff Layton:
 "This release is a little more busy for file locking changes than the
  last:

   - a set of patches from Kinglong Mee to fix the lockowner handling in
     knfsd
   - a pile of cleanups to the internal file lease API.  This should get
     us a bit closer to allowing for setlease methods that can block.

  There are some dependencies between mine and Bruce's trees this cycle,
  and I based my tree on top of the requisite patches in Bruce's tree"

* tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux: (26 commits)
  locks: fix fcntl_setlease/getlease return when !CONFIG_FILE_LOCKING
  locks: flock_make_lock should return a struct file_lock (or PTR_ERR)
  locks: set fl_owner for leases to filp instead of current->files
  locks: give lm_break a return value
  locks: __break_lease cleanup in preparation of allowing direct removal of leases
  locks: remove i_have_this_lease check from __break_lease
  locks: move freeing of leases outside of i_lock
  locks: move i_lock acquisition into generic_*_lease handlers
  locks: define a lm_setup handler for leases
  locks: plumb a "priv" pointer into the setlease routines
  nfsd: don't keep a pointer to the lease in nfs4_file
  locks: clean up vfs_setlease kerneldoc comments
  locks: generic_delete_lease doesn't need a file_lock at all
  nfsd: fix potential lease memory leak in nfs4_setlease
  locks: close potential race in lease_get_mtime
  security: make security_file_set_fowner, f_setown and __f_setown void return
  locks: consolidate "nolease" routines
  locks: remove lock_may_read and lock_may_write
  lockd: rip out deferred lock handling from testlock codepath
  NFSD: Get reference of lockowner when coping file_lock
  ...
2014-10-11 13:21:34 -04:00
Linus Torvalds
28596c9722 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull "trivial tree" updates from Jiri Kosina:
 "Usual pile from trivial tree everyone is so eagerly waiting for"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Remove MN10300_PROC_MN2WS0038
  mei: fix comments
  treewide: Fix typos in Kconfig
  kprobes: update jprobe_example.c for do_fork() change
  Documentation: change "&" to "and" in Documentation/applying-patches.txt
  Documentation: remove obsolete pcmcia-cs from Changes
  Documentation: update links in Changes
  Documentation: Docbook: Fix generated DocBook/kernel-api.xml
  score: Remove GENERIC_HAS_IOMAP
  gpio: fix 'CONFIG_GPIO_IRQCHIP' comments
  tty: doc: Fix grammar in serial/tty
  dma-debug: modify check_for_stack output
  treewide: fix errors in printk
  genirq: fix reference in devm_request_threaded_irq comment
  treewide: fix synchronize_rcu() in comments
  checkstack.pl: port to AArch64
  doc: queue-sysfs: minor fixes
  init/do_mounts: better syntax description
  MIPS: fix comment spelling
  powerpc/simpleboot: fix comment
  ...
2014-10-07 21:16:26 -04:00
Linus Torvalds
bdf428feb2 Nothing major: support for compressing modules, and auto-tainting params.
Cheers,
 Rusty.
 PS.  My virtio-next tree is empty: DaveM took the patches I had.  There might
      be a virtio-rng starvation fix, but so far it's a bit voodoo so I will
      get to that in the next two days or it will wait.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUGFrvAAoJENkgDmzRrbjxOJYQALaZbTumrtX3Mo/FAtzn8d5N
 8gxcqk1Mhz4lR1vPWy/YN/H2f23qb/saqLxPar8Wgou3h7N8EqSdwDqJSuvEqhG0
 iEXUsNLC7BOsDkLYhdjTfZoW/lsVU/EH4bkZMSxAZI9V64phXhDYfPb5SQgJTECr
 Ue6IK4ijW6zdWLstGfg/ixrIeGDUSnyiThF9O2mYVaB1D0QkLDIAZxbjZJgfFfut
 PwO33/sEV4pceTpkmxFKl/OiS+obi/VbDixjSCcO+jaBd1pVxH9fhhKREStOhN4z
 88z5ADR71RH6so9TQTwIIcgb2Hon5d+3RVMB6CxuvKs9NmHSXDiQyZvG9J/jiSdm
 KrPKSiVwGGwJSwxXTm8CDaz6Oj0ibDXBIzv/vYI22sR7u8PmRQFvL3O1VrW+KDnE
 yoG75S9DHzSQ1183xFFFTt4FBRm/4XKyVs+F6YqYkchLigrUfQMCGb1cmZyE5y7K
 bgNyonu0m/ItoQmekoDgYqvSjwdguaJ35XCW55GrKJ84JDHBaw3SpPdEfjAS8FsH
 aT5o2oernvwRG6gsX9858RvB/uo1UKwHv1waDfV4cqNjMm5Ko+Yr6OIdQvBQiq07
 cFkVmkrMtEyX19QyIGW3QSbFL1lr3X5cC5glzEeKY941yZbTluSsNuMlMPT1+IMx
 NOUbh0aG8B8ZaMZPFNLi
 =QzCn
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module update from Rusty Russell:
 "Nothing major: support for compressing modules, and auto-tainting
  params.

  PS. My virtio-next tree is empty: DaveM took the patches I had.  There
      might be a virtio-rng starvation fix, but so far it's a bit voodoo
      so I will get to that in the next two days or it will wait"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  moduleparam: Resolve missing-field-initializer warning
  kbuild: handle module compression while running 'make modules_install'.
  modinst: wrap long lines in order to enhance cmd_modules_install
  modsign: lookup lines ending in .ko in .mod files
  modpost: simplify file name generation of *.mod.c files
  modpost: reduce visibility of symbols and constify r/o arrays
  param: check for tainting before calling set op.
  drm/i915: taint the kernel if unsafe module parameters are set
  module: add module_param_unsafe and module_param_named_unsafe
  module: make it possible to have unsafe, tainting module params
  module: rename KERNEL_PARAM_FL_NOARG to avoid confusion
2014-10-07 20:17:38 -04:00
Dmitry Kasatkin
456f5fd3f6 ima: use path names cache
__getname() uses slab allocation which is faster than kmalloc.
Make use of it.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-07 14:32:54 -04:00
Dmitry Kasatkin
c2baec7ffa evm: skip replacing EVM signature with HMAC on read-only filesystem
If filesystem is mounted read-only or file is immutable, updating
xattr will fail. This is a usual case during early boot until
filesystem is remount read-write. This patch verifies conditions
to skip unnecessary attempt to calculate HMAC and set xattr.

Changes in v2:
* indention changed according to Lindent (requested by Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-07 14:32:53 -04:00
Dmitry Kasatkin
d16a8585d3 integrity: add missing '__init' keyword for integrity_init_keyring()
integrity_init_keyring() is used only from kernel '__init'
functions. Add it there as well.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-10-07 14:32:53 -04:00
Dmitry Kasatkin
0f34a0060a ima: check ima_policy_flag in the ima_file_free() hook
This patch completes the switching to the 'ima_policy_flag' variable
in the checks at the beginning of IMA functions, starting with the
commit a756024e.

Checking 'iint_initialized' is completely unnecessary, because
S_IMA flag is unset if iint was not allocated. At the same time
the integrity cache is allocated with SLAB_PANIC and the kernel will
panic if the allocation fails during kernel initialization. So on
a running system iint_initialized is always true and can be removed.

Changes in v3:
* not limiting test to IMA_APPRAISE (spotted by Roberto Sassu)

Changes in v2:
* 'iint_initialized' removal patch merged to this patch (requested
   by Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Acked-by: Roberto Sassu <roberto.sassu@polito.it>
2014-10-07 14:32:52 -04:00
Dmitry Kasatkin
594081ee71 integrity: do zero padding of the key id
Latest KEYS code return error if hexadecimal string length id odd.
Fix it.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2014-10-06 17:33:27 +01:00
James Morris
c867d07e3c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2014-10-02 19:47:23 +10:00
James Morris
858f61c429 Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2014-10-01 00:45:26 +10:00
Richard Guy Briggs
4093a84439 selinux: normalize audit log formatting
Restructure to keyword=value pairs without spaces.  Drop superfluous words in
text.  Make invalid_context a keyword.  Change result= keyword to seresult=.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[Minor rewrite to the patch subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-22 17:02:10 -04:00
Richard Guy Briggs
e173fb2646 selinux: cleanup error reporting in selinux_nlmsg_perm()
Convert audit_log() call to WARN_ONCE().

Rename "type=" to nlmsg_type=" to avoid confusion with the audit record
type.

Added "protocol=" to help track down which protocol (NETLINK_AUDIT?) was used
within the netlink protocol family.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[Rewrote the patch subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-22 15:50:08 -04:00
James Morris
35e1efd25a Merge tag 'keys-next-20140922' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2014-09-22 22:54:56 +10:00
Roberto Sassu
1b68bdf9cd ima: detect violations for mmaped files
This patch fixes the detection of the 'open_writers' violation for mmaped
files.

before) an 'open_writers' violation is detected if the policy contains
        a rule with the criteria: func=FILE_CHECK mask=MAY_READ

after) an 'open_writers' violation is detected if the current event
       matches one of the policy rules.

With the old behaviour, the 'open_writers' violation is not detected
in the following case:

policy:
measure func=FILE_MMAP mask=MAY_EXEC

steps:
1) open a shared library for writing
2) execute a binary that links that shared library
3) during the binary execution, modify the shared library and save
   the change

result:
the 'open_writers' violation measurement is not present in the IMA list.

Only binaries executed are protected from writes. For libraries mapped
in memory there is the flag MAP_DENYWRITE for this purpose, but according
to the output of 'man mmap', the mmap flag is ignored.

Since ima_rdwr_violation_check() is now called by process_measurement()
the information about if the inode must be measured is already provided
by ima_get_action(). Thus the unnecessary function ima_must_measure()
has been removed.

Changes in v3 (Dmitry Kasatkin):
- Violation for MMAP_CHECK function are verified since this patch
- Changed patch description a bit

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-18 10:04:12 -04:00
Roberto Sassu
f7a859ff73 ima: fix race condition on ima_rdwr_violation_check and process_measurement
This patch fixes a race condition between two functions that try to access
the same inode. Since the i_mutex lock is held and released separately
in the two functions, there may be the possibility that a violation is
not correctly detected.

Suppose there are two processes, A (reader) and B (writer), if the
following sequence happens:

A: ima_rdwr_violation_check()
B: ima_rdwr_violation_check()
B: process_measurement()
B: starts writing the inode
A: process_measurement()

the ToMToU violation (a reader may be accessing a content different from
that measured, due to a concurrent modification by a writer) will not be
detected. To avoid this issue, the violation check and the measurement
must be done atomically.

This patch fixes the problem by moving the violation check inside
process_measurement() when the i_mutex lock is held. Differently from
the old code, the violation check is executed also for the MMAP_CHECK
hook (other than for FILE_CHECK). This allows to detect ToMToU violations
that are possible because shared libraries can be opened for writing
while they are in use (according to the output of 'man mmap', the mmap()
flag MAP_DENYWRITE is ignored).

Changes in v5 (Roberto Sassu):
* get iint if action is not zero
* exit process_measurement() after the violation check if action is zero
* reverse order process_measurement() exit cleanup (Mimi)

Changes in v4 (Dmitry Kasatkin):
* iint allocation is done before calling ima_rdrw_violation_check()
  (Suggested-by Mimi)
* do not check for violations if the policy does not contain 'measure'
  rules (done by Roberto Sassu)

Changes in v3 (Dmitry Kasatkin):
* no violation checking for MMAP_CHECK function in this patch
* remove use of filename from violation
* removes checking if ima is enabled from ima_rdrw_violation_check
* slight style change

Suggested-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-18 10:03:55 -04:00
James Morris
6f98e89288 Merge branch 'smack-for-3.18' of git://git.gitorious.org/smack-next/kernel into next 2014-09-18 23:52:46 +10:00
Roberto Sassu
a756024efe ima: added ima_policy_flag variable
This patch introduces the new variable 'ima_policy_flag', whose bits
are set depending on the action of the current policy rules. Only the
flags IMA_MEASURE, IMA_APPRAISE and IMA_AUDIT are set.

The new variable will be used to improve performance by skipping the
unnecessary execution of IMA code if the policy does not contain rules
with the above actions.

Changes in v6 (Roberto Sassu)
* do not check 'ima_initialized' before calling ima_update_policy_flag()
  in ima_update_policy() (suggested by Dmitry)
* calling ima_update_policy_flag() moved to init_ima to co-locate with
  ima_initialized (Dmitry)
* add/revise comments (Mimi)

Changes in v5 (Roberto Sassu)
* reset IMA_APPRAISE flag in 'ima_policy_flag' if 'ima_appraise' is set
  to zero (reported by Dmitry)
* update 'ima_policy_flag' only if IMA initialization is successful
  (suggested by Mimi and Dmitry)
* check 'ima_policy_flag' instead of 'ima_initialized'
  (suggested by Mimi and Dmitry)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:39:36 -04:00
Roberto Sassu
be39ffc2fe ima: return an error code from ima_add_boot_aggregate()
This patch modifies ima_add_boot_aggregate() to return an error code.
This way we can determine if all the initialization procedures have
been executed successfully.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:15:42 -04:00
Dmitry Kasatkin
2faa6ef3b2 ima: provide 'ima_appraise=log' kernel option
The kernel boot parameter "ima_appraise" currently defines 'off',
'enforce' and 'fix' modes.  When designing a policy and labeling
the system, access to files are either blocked in the default
'enforce' mode or automatically fixed in the 'fix' mode.  It is
beneficial to be able to run the system in a logging only mode,
without fixing it, in order to properly analyze the system. This
patch adds a 'log' mode to run the system in a permissive mode and
log the appraisal results.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:14:23 -04:00
Dmitry Kasatkin
31b70f6632 ima: move keyring initialization to ima_init()
ima_init() is used as a single place for all initializations.
Experimental keyring patches used the 'late_initcall' which was
co-located with the late_initcall(init_ima). When the late_initcall
for the keyring initialization was abandoned, initialization moved
to init_ima, though it would be more logical to move it to ima_init,
where the rest of the initialization is done. This patch moves the
keyring initialization to ima_init() as a preparatory step for
loading the keys which will be added to ima_init() in following
patches.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:10:59 -04:00
David Howells
0c903ab64f KEYS: Make the key matching functions return bool
Make the key matching functions pointed to by key_match_data::cmp return bool
rather than int.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:08 +01:00
David Howells
c06cfb08b8 KEYS: Remove key_type::match in favour of overriding default by match_preparse
A previous patch added a ->match_preparse() method to the key type.  This is
allowed to override the function called by the iteration algorithm.
Therefore, we can just set a default that simply checks for an exact match of
the key description with the original criterion data and allow match_preparse
to override it as needed.

The key_type::match op is then redundant and can be removed, as can the
user_match() function.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:06 +01:00
David Howells
614d8c3901 KEYS: Remove key_type::def_lookup_type
Remove key_type::def_lookup_type as it's no longer used.  The information now
defaults to KEYRING_SEARCH_LOOKUP_DIRECT but may be overridden by
type->match_preparse().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:04 +01:00
David Howells
462919591a KEYS: Preparse match data
Preparse the match data.  This provides several advantages:

 (1) The preparser can reject invalid criteria up front.

 (2) The preparser can convert the criteria to binary data if necessary (the
     asymmetric key type really wants to do binary comparison of the key IDs).

 (3) The preparser can set the type of search to be performed.  This means
     that it's not then a one-off setting in the key type.

 (4) The preparser can set an appropriate comparator function.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:02 +01:00
David Howells
1c9c115ccc Keyrings fixes for next
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVBhgQROxKuMESys7AQJv8w/+O61++U9OT5rL/X2pam3h4+tQmM7+2rU6
 rml9ZADkqKJVU1h3tT/WaFcmwR6J56buIWDZ7M7fPPRyvQPhfeBEHcakey+4iCd3
 SyFmP9ElkRvdOyoXRvImNIawsmnyme4CWSIQihSteyzyuV76odOI5HiJxMj6x8RV
 T4bjTsOMYugleWmuJzYPSniAs0my7Gxii7GK4WnLd+Jhxs8xJxiB3DNT4Kecxjuc
 MTldJIx32+Kj2JKp4ygbHM/WOI7jbdpseCedxbL4DNepgdXvVvV42pwPSCc2ydO/
 2b9kF+YnhsGbOv3Qd6GPPE5x0cMLKcmGf3DvJQcfCFSLjqpginGr0V/ylkVSf79S
 mQJLc2R7Z0ST7eFCyvClAeM9ZI6y+APWqo2wvVRNMww3NQhHyj43P2HPXiQEAryg
 UO6ZQehGHgxcx8uN/nB1gi4+S7+sZS6SzwvrTu04h5lYfCRrKSZGWDGpG/n7qtwo
 EKcWiAjupYbJtlDryL5Nz1xxkD6rpLKWbVfkO5FbJ8yAamz5XDl7FiQyJTu32LGj
 jbVma9sEIkJ5IB4mCAdF2tHi4o7QvQQbwWAPLu0B4hxp4JRslV5XhVVAH3CrvIFK
 l8A5nit0NwHms7d8+VLAyieknUForFHlmRheYXCSMvJDMfFToZrCNmTNHxZtyuuj
 voCrpkQYxGU=
 =9AZW
 -----END PGP SIGNATURE-----

Merge tag 'keys-next-fixes-20140916' into keys-next

Merge in keyrings fixes for next:

 (1) Insert some missing 'static' annotations.

Signed-off-by: David Howells <dhowells@redhat.com>
2014-09-16 17:32:55 +01:00
David Howells
68c45c7fea Keyrings fixes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVBhlXROxKuMESys7AQIQghAAmxHRD9kD7AJrlcL187gTXl5dRyxQNhX0
 myT9A1/01jHJh6mPPKS5jt5ooenwfcbvBkdUARLNLq6j7urIzFk+UtSmfcN1OMXw
 3y15nuJKbSCZRzPMR94Z0Ik23YED9dmc4ubxW7E+psoWWIZUvFt4GaGw7ei77O39
 8Pbt/n7nIx2+s8aiXte2Om/zqzyMEmcLd3Mzlqxe4GX1jo/ThNQZ348EL+ECxJEl
 3OKe7i7oRfa48+SybmhW5Bx/2f/aQMcIQX04+akOFMIC505rUg6CTzphcrTyV5/s
 6FdQexRquf8/Ei/6DMAYPhnumRfWJ5x9txiNSEY4i11AIjo6Bt65vaLPuNniYRNI
 b6Wn8SSE8Ucrq5RrmNlSmoJCs7r1NE+JdOaEPO0MDVkOouaja8daISmveV2AfZfF
 bITOQgEw3QRpyL2FYdwa39/NXCONBILfL5HvNyXEfPEHBhI8igTgEyXYRwNHV9jT
 dsVFTc9ZIrjksLt3CDh4Z8xdyZbyojYdRCfH/wna9aAkZpwwGfSYCcR3dE6SK26y
 IjkqEVJoCFHEnvLUQkAQrc/2qWX2D1qHrcjVLwzwbM5G66YIPeLcJlZK2FbiY0Ay
 Yc/kUJY0hU5W+TfFb1hhjO5G2DTTw8Ou6MGcxSTE3HwzqICjDhE7BwFZdVikHRP0
 xMtjfJnMwuM=
 =v+dv
 -----END PGP SIGNATURE-----

Merge tag 'keys-fixes-20140916' into keys-next

Merge in keyrings fixes, at least some of which later patches depend on:

 (1) Reinstate the production of EPERM for key types beginning with '.' in
     requests from userspace.

 (2) Tidy up the cleanup of PKCS#7 message signed information blocks and fix a
     bug this made more obvious.

Signed-off-by: David Howells <dhowells@redhat.coM>
2014-09-16 17:32:16 +01:00
David Howells
54e2c2c1a9 KEYS: Reinstate EPERM for a key type name beginning with a '.'
Reinstate the generation of EPERM for a key type name beginning with a '.' in
a userspace call.  Types whose name begins with a '.' are internal only.

The test was removed by:

	commit a4e3b8d79a
	Author: Mimi Zohar <zohar@linux.vnet.ibm.com>
	Date:   Thu May 22 14:02:23 2014 -0400
	Subject: KEYS: special dot prefixed keyring name bug fix

I think we want to keep the restriction on type name so that userspace can't
add keys of a special internal type.

Note that removal of the test causes several of the tests in the keyutils
testsuite to fail.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-16 17:29:03 +01:00
David Howells
8da79b6439 KEYS: Fix missing statics
Fix missing statics (found by checker).

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:07:07 +01:00
Paul Moore
cbe0d6e879 selinux: make the netif cache namespace aware
While SELinux largely ignores namespaces, for good reason, there are
some places where it needs to at least be aware of namespaces in order
to function correctly.  Network namespaces are one example.  Basic
awareness of network namespaces are necessary in order to match a
network interface's index number to an actual network device.

This patch corrects a problem with network interfaces added to a
non-init namespace, and can be reproduced with the following commands:

 [NOTE: the NetLabel configuration is here only to active the dynamic
        networking controls ]

 # netlabelctl unlbl add default address:0.0.0.0/0 \
   label:system_u:object_r:unlabeled_t:s0
 # netlabelctl unlbl add default address:::/0 \
   label:system_u:object_r:unlabeled_t:s0
 # netlabelctl cipsov4 add pass doi:100 tags:1
 # netlabelctl map add domain:lspp_test_netlabel_t \
   protocol:cipsov4,100

 # ip link add type veth
 # ip netns add myns
 # ip link set veth1 netns myns
 # ip a add dev veth0 10.250.13.100/24
 # ip netns exec myns ip a add dev veth1 10.250.13.101/24
 # ip l set veth0 up
 # ip netns exec myns ip l set veth1 up

 # ping -c 1 10.250.13.101
 # ip netns exec myns ping -c 1 10.250.13.100

Reported-by: Jiri Jaburek <jjaburek@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-10 17:09:57 -04:00
Jeff Layton
e0b93eddfe security: make security_file_set_fowner, f_setown and __f_setown void return
security_file_set_fowner always returns 0, so make it f_setown and
__f_setown void return functions and fix up the error handling in the
callers.

Cc: linux-security-module@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-09-09 16:01:36 -04:00
Dmitry Kasatkin
a2d61ed525 integrity: make integrity files as 'integrity' module
The kernel print macros use the KBUILD_MODNAME, which is initialized
to the module name. The current integrity/Makefile makes every file
as its own module, so pr_xxx messages are prefixed with the file name
instead of the module.  Similar to the evm/Makefile and ima/Makefile,
this patch fixes the integrity/Makefile to use the single name
'integrity'.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:58 -04:00
Dmitry Kasatkin
7ef84e65ec integrity: base integrity subsystem kconfig options on integrity
The integrity subsystem has lots of options and takes more than
half of the security menu.  This patch consolidates the options
under "integrity", which are hidden if not enabled.  This change
does not affect existing configurations.  Re-configuration is not
needed.

Changes v4:
- no need to change "integrity subsystem" to menuconfig as
options are hidden, when not enabled. (Mimi)
- add INTEGRITY Kconfig help description

Changes v3:
- dependency to INTEGRITY removed when behind 'if INTEGRITY'

Changes v2:
- previous patch moved integrity out of the 'security' menu.
  This version keeps integrity as a security option (Mimi).

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:56 -04:00
Dmitry Kasatkin
1ae8f41c23 integrity: move asymmetric keys config option
For better visual appearance it is better to co-locate
asymmetric key options together with signature support.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:55 -04:00
Dmitry Kasatkin
b4148db517 ima: initialize only required template
IMA uses only one template. This patch initializes only required
template to avoid unnecessary memory allocations.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Reviewed-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:54 -04:00
Dmitry Kasatkin
17f4bad3ab ima: remove usage of filename parameter
In all cases except ima_bprm_check() the filename was not defined
and ima_d_path() was used to find the full path.  Unfortunately,
the bprm filename is a relative pathname (eg. ./<dir>/filename).

ima_bprm_check() selects between bprm->interp and bprm->filename.
The following dump demonstrates the differences between using
filename and interp.

bprm->filename
 filename: ./foo.sh, pathname: /root/bin/foo.sh
 filename: ./foo.sh, pathname: /bin/dash

bprm->interp
 filename: ./foo.sh, pathname: /root/bin/foo.sh
 filename: /bin/sh, pathname: /bin/dash

In both cases the pathnames are currently the same.  This patch
removes usage of filename and interp in favor of d_absolute_path.

Changes v3:
- 11 extra bytes for "deleted" not needed (Mimi)
- purpose "replace relative bprm filename with full pathname" (Mimi)

Changes v2:
- use d_absolute_path() instead of d_path to work in chroot environments.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:52 -04:00
Dmitry Kasatkin
86f2bc0249 ima: remove unnecessary appraisal test
ima_get_action() sets the "action" flags based on policy.
Before collecting, measuring, appraising, or auditing the
file, the "action" flag is updated based on the cached
iint->flags.

This patch removes the subsequent unnecessary appraisal
test in ima_appraise_measurement().

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:51 -04:00
Dmitry Kasatkin
e4a9c51965 ima: add missing '__init' keywords
Add missing keywords to the function definition to cleanup
to discard initialization code.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Reviewed-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:50 -04:00
Dmitry Kasatkin
3a8a2eadc4 ima: remove unnecessary extra variable
'function' variable value can be changed instead of
allocating extra '_func' variable.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:48 -04:00
Dmitry Kasatkin
f68c05f4d2 ima: simplify conditional statement to improve performance
Precede bit testing before string comparison makes code
faster. Also refactor statement as a single line pointer
assignment. Logic is following: we set 'xattr_ptr' to read
xattr value when we will do appraisal or in any case when
measurement template is other than 'ima'.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:47 -04:00
Dmitry Kasatkin
65d98f3be2 integrity: remove declaration of non-existing functions
Commit f381c27 "integrity: move ima inode integrity data management"
(re)moved few functions but left their declarations in header files.
This patch removes them and also removes duplicated declaration of
integrity_iint_find().

Commit c7de7ad "ima: remove unused cleanup functions".  This patch
removes these definitions as well.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:46 -04:00
Dmitry Kasatkin
d9a2e5d788 integrity: prevent flooding with 'Request for unknown key'
If file has IMA signature, IMA in enforce mode, but key is missing
then file access is blocked and single error message is printed.

If IMA appraisal is enabled in fix mode, then system runs as usual
but might produce tons of 'Request for unknown key' messages.

This patch switches 'pr_warn' to 'pr_err_ratelimited'.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:44 -04:00
Dmitry Kasatkin
3034a14682 ima: pass 'opened' flag to identify newly created files
Empty files and missing xattrs do not guarantee that a file was
just created.  This patch passes FILE_CREATED flag to IMA to
reliably identify new files.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.14+
2014-09-09 10:28:43 -04:00
Dmitry Kasatkin
3dcbad52cf evm: properly handle INTEGRITY_NOXATTRS EVM status
Unless an LSM labels a file during d_instantiate(), newly created
files are not labeled with an initial security.evm xattr, until
the file closes.  EVM, before allowing a protected, security xattr
to be written, verifies the existing 'security.evm' value is good.
For newly created files without a security.evm label, this
verification prevents writing any protected, security xattrs,
until the file closes.

Following is the example when this happens:
fd = open("foo", O_CREAT | O_WRONLY, 0644);
setxattr("foo", "security.SMACK64", value, sizeof(value), 0);
close(fd);

While INTEGRITY_NOXATTRS status is handled in other places, such
as evm_inode_setattr(), it does not handle it in all cases in
evm_protect_xattr().  By limiting the use of INTEGRITY_NOXATTRS to
newly created files, we can now allow setting "protected" xattrs.

Changelog:
- limit the use of INTEGRITY_NOXATTRS to IMA identified new files

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.14+
2014-09-09 10:26:10 -04:00
Masanari Iida
da3dae54e4 Documentation: Docbook: Fix generated DocBook/kernel-api.xml
This patch fix spelling typo found in DocBook/kernel-api.xml.
It is because the file is generated from the source comments,
I have to fix the comments in source codes.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-09-09 10:34:56 +02:00
Jiri Pirko
25db6bea1f selinux: register nf hooks with single nf_register_hooks call
Push ipv4 and ipv6 nf hooks into single array and register/unregister
them via single call.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-08 20:42:47 -04:00
Dmitry Kasatkin
b151d6b00b ima: provide flag to identify new empty files
On ima_file_free(), newly created empty files are not labeled with
an initial security.ima value, because the iversion did not change.
Commit dff6efc "fs: fix iversion handling" introduced a change in
iversion behavior.  To verify this change use the shell command:

  $ (exec >foo)
  $ getfattr -h -e hex -d -m security foo

This patch defines the IMA_NEW_FILE flag.  The flag is initially
set, when IMA detects that a new file is created, and subsequently
checked on the ima_file_free() hook to set the initial security.ima
value.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.14+
2014-09-08 17:38:57 -04:00
Dmitry Kasatkin
1f1009791b evm: prevent passing integrity check if xattr read fails
This patch fixes a bug, where evm_verify_hmac() returns INTEGRITY_PASS
if inode->i_op->getxattr() returns an error in evm_find_protected_xattrs.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
2014-09-08 17:36:10 -04:00
Paul Moore
a7a91a1928 selinux: fix a problem with IPv6 traffic denials in selinux_ip_postroute()
A previous commit c0828e5048 ("selinux:
process labeled IPsec TCP SYN-ACK packets properly in
selinux_ip_postroute()") mistakenly left out a 'break' from a switch
statement which caused problems with IPv6 traffic.

Thanks to Florian Westphal for reporting and debugging the issue.

Reported-by: Florian Westphal <fwestpha@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-03 10:51:59 -04:00
Steve Dickson
738c5d190f KEYS: Increase root_maxkeys and root_maxbytes sizes
Now that NFS client uses the kernel key ring facility to store the NFSv4
id/gid mappings, the defaults for root_maxkeys and root_maxbytes need to be
substantially increased.

These values have been soak tested:

	https://bugzilla.redhat.com/show_bug.cgi?id=1033708#c73

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-09-03 10:27:12 +10:00
Dmitry Kasatkin
e7d021e283 evm: fix checkpatch warnings
This patch fixes checkpatch 'return' warnings introduced with commit
9819cf2 "checkpatch: warn on unnecessary void function return statements".

Use scripts/checkpatch.pl --file security/integrity/evm/evm_main.c
to produce the warnings.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-02 17:03:37 -04:00
Dmitry Kasatkin
27cd1fc3ae ima: fix fallback to use new_sync_read()
3.16 commit aad4f8bb42
'switch simple generic_file_aio_read() users to ->read_iter()'
replaced ->aio_read with ->read_iter in most of the file systems
and introduced new_sync_read() as a replacement for do_sync_read().

Most of file systems set '->read' and ima_kernel_read is not affected.
When ->read is not set, this patch adopts fallback call changes from the
vfs_read.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.16+
2014-09-02 17:03:36 -04:00
Dmitry Kasatkin
23c19e2ca7 ima: prevent buffer overflow in ima_alloc_tfm()
This patch fixes the case where the file's signature/hash xattr contains
an invalid hash algorithm.  Although we can not verify the xattr, we still
need to measure the file.  Use the default IMA hash algorithm.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-02 17:03:36 -04:00
Mimi Zohar
9a8d289fbc ima: fix ima_alloc_atfm()
The patch 3bcced39ea: "ima: use ahash API for file hash
calculation" from Feb 26, 2014, leads to the following static checker
warning:

security/integrity/ima/ima_crypto.c:204 ima_alloc_atfm()
         error: buffer overflow 'hash_algo_name' 17 <= 17

Unlike shash tfm memory, which is allocated on initialization, the
ahash tfm memory allocation is deferred until needed.

This patch fixes the case where ima_ahash_tfm has not yet been
allocated and the file's signature/hash xattr contains an invalid hash
algorithm.  Although we can not verify the xattr, we still need to
measure the file.  Use the default IMA hash algorithm.

Changelog:
- set valid algo before testing tfm - based on Dmitry's comment

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
2014-09-02 17:03:35 -04:00
Lukasz Pawelczyk
21c7eae21a Make Smack operate on smack_known struct where it still used char*
Smack used to use a mix of smack_known struct and char* throughout its
APIs and implementation. This patch unifies the behaviour and makes it
store and operate exclusively on smack_known struct pointers when managing
labels.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>

Conflicts:
	security/smack/smack_access.c
	security/smack/smack_lsm.c
2014-08-29 10:10:55 -07:00
Lukasz Pawelczyk
d01757904d Fix a bidirectional UDS connect check typo
The 54e70ec5eb commit introduced a
bidirectional check that should have checked for mutual WRITE access
between two labels. Due to a typo the second check was incorrect.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
2014-08-29 10:10:47 -07:00
Lukasz Pawelczyk
e95ef49b7f Small fixes in comments describing function parameters
Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
2014-08-29 10:10:36 -07:00
Casey Schaufler
d166c8024d Smack: Bring-up access mode
People keep asking me for permissive mode, and I keep saying "no".

Permissive mode is wrong for more reasons than I can enumerate,
but the compelling one is that it's once on, never off.

Nonetheless, there is an argument to be made for running a
process with lots of permissions, logging which are required,
and then locking the process down. There wasn't a way to do
that with Smack, but this provides it.

The notion is that you start out by giving the process an
appropriate Smack label, such as "ATBirds". You create rules
with a wide range of access and the "b" mode. On Tizen it
might be:

	ATBirds	System	rwxalb
	ATBirds	User	rwxalb
	ATBirds	_	rwxalb
	User	ATBirds	wb
	System	ATBirds	wb

Accesses that fail will generate audit records. Accesses
that succeed because of rules marked with a "b" generate
log messages identifying the rule, the program and as much
object information as is convenient.

When the system is properly configured and the programs
brought in line with the labeling scheme the "b" mode can
be removed from the rules. When the system is ready for
production the facility can be configured out.

This provides the developer the convenience of permissive
mode without creating a system that looks like it is
enforcing a policy while it is not.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-28 13:11:56 -07:00
Stephen Smalley
7b0d0b40cd selinux: Permit bounded transitions under NO_NEW_PRIVS or NOSUID.
If the callee SID is bounded by the caller SID, then allowing
the transition to occur poses no risk of privilege escalation and we can
therefore safely allow the transition to occur.  Add this exemption
for both the case where a transition was explicitly requested by the
application and the case where an automatic transition is defined in
policy.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-08-28 11:37:12 -04:00
Jani Nikula
6a4c264313 module: rename KERNEL_PARAM_FL_NOARG to avoid confusion
Make it clear this is about kernel_param_ops, not kernel_param (which
will soon have a flags field of its own). No functional changes.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Jon Mason <jon.mason@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-08-27 21:54:07 +09:30
Tetsuo Handa
8fe7a268b1 tomoyo: Fix pathname calculation breakage.
Commit 7177a9c4b5 ("fs: call rename2 if exists") changed
"struct inode_operations"->rename == NULL if
"struct inode_operations"->rename2 != NULL .

TOMOYO needs to check for both ->rename and ->rename2 , or
a system on (e.g.) ext4 filesystem won't boot.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-08-26 21:52:09 -05:00
Marcin Niesluchowski
d83d2c2646 Smack: Fix setting label on successful file open
While opening with CAP_MAC_OVERRIDE file label is not set.
Other calls may access it after CAP_MAC_OVERRIDE is dropped from process.

Signed-off-by: Marcin Niesluchowski <m.niesluchow@samsung.com>
2014-08-25 14:27:28 -07:00
Linus Torvalds
96784de59f Merge branch 'stable-3.17' of git://git.infradead.org/users/pcmoore/selinux
Pull SElinux fixes from Paul Moore:
 "Two small patches to fix a couple of build warnings in SELinux and
  NetLabel.  The patches are obvious enough that I don't think any
  additional explanation is necessary, but it basically boils down to
  the usual: I was stupid, and these patches fix some of the stupid.

  Both patches were posted earlier this week to the SELinux list, and
  that is where they sat as I didn't think there were noteworthy enough
  to go upstream at this point in time, but DaveM would rather see them
  upstream now so who am I to argue.  As the patches are both very
  small"

* 'stable-3.17' of git://git.infradead.org/users/pcmoore/selinux:
  selinux: remove unused variabled in the netport, netnode, and netif caches
  netlabel: fix the netlbl_catmap_setlong() dummy function
2014-08-09 15:09:52 -07:00
Konstantin Khlebnikov
da1b63566c Smack: remove unneeded NULL-termination from securtity label
Values of extended attributes are stored as binary blobs. NULL-termination
of them isn't required. It just wastes disk space and confuses command-line
tools like getfattr because they have to print that zero byte at the end.

This patch removes terminating zero byte from initial security label in
smack_inode_init_security and cuts it out in function smack_inode_getsecurity
which is used by syscall getxattr. This change seems completely safe, because
function smk_parse_smack ignores everything after first zero byte.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
2014-08-08 14:51:19 -07:00
Konstantin Khlebnikov
b862e561ba Smack: handle zero-length security labels without panic
Zero-length security labels are invalid but kernel should handle them.

This patch fixes kernel panic after setting zero-length security labels:
# attr -S -s "SMACK64" -V "" file

And after writing zero-length string into smackfs files syslog and onlycp:
# python -c 'import os; os.write(1, "")' > /smack/syslog

The problem is caused by brain-damaged logic in function smk_parse_smack()
which takes pointer to buffer and its length but if length below or equal zero
it thinks that the buffer is zero-terminated. Unfortunately callers of this
function are widely used and proper fix requires serious refactoring.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
2014-08-08 14:51:07 -07:00
Konstantin Khlebnikov
fd5c9d230d Smack: fix behavior of smack_inode_listsecurity
Security operation ->inode_listsecurity is used for generating list of
available extended attributes for syscall listxattr. Currently it's used
only in nfs4 or if filesystem doesn't provide i_op->listxattr.

The list is the set of NULL-terminated names, one after the other.
This method must include zero byte at the and into result.

Also this function must return length even if string does not fit into
output buffer or it is NULL, see similar method in selinux and man listxattr.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
2014-08-08 14:50:19 -07:00
Paul Moore
942ba36465 selinux: remove unused variabled in the netport, netnode, and netif caches
This patch removes the unused return code variable in the netport,
netnode, and netif initialization functions.

Reported-by: fengguang.wu@intel.com
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-08-07 20:55:30 -04:00
Linus Torvalds
bb2cbf5e93 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "In this release:

   - PKCS#7 parser for the key management subsystem from David Howells
   - appoint Kees Cook as seccomp maintainer
   - bugfixes and general maintenance across the subsystem"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits)
  X.509: Need to export x509_request_asymmetric_key()
  netlabel: shorter names for the NetLabel catmap funcs/structs
  netlabel: fix the catmap walking functions
  netlabel: fix the horribly broken catmap functions
  netlabel: fix a problem when setting bits below the previously lowest bit
  PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1
  tpm: simplify code by using %*phN specifier
  tpm: Provide a generic means to override the chip returned timeouts
  tpm: missing tpm_chip_put in tpm_get_random()
  tpm: Properly clean sysfs entries in error path
  tpm: Add missing tpm_do_selftest to ST33 I2C driver
  PKCS#7: Use x509_request_asymmetric_key()
  Revert "selinux: fix the default socket labeling in sock_graft()"
  X.509: x509_request_asymmetric_keys() doesn't need string length arguments
  PKCS#7: fix sparse non static symbol warning
  KEYS: revert encrypted key change
  ima: add support for measuring and appraising firmware
  firmware_class: perform new LSM checks
  security: introduce kernel_fw_from_file hook
  PKCS#7: Missing inclusion of linux/err.h
  ...
2014-08-06 08:06:39 -07:00
Linus Torvalds
e7fda6c4c3 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and time updates from Thomas Gleixner:
 "A rather large update of timers, timekeeping & co

   - Core timekeeping code is year-2038 safe now for 32bit machines.
     Now we just need to fix all in kernel users and the gazillion of
     user space interfaces which rely on timespec/timeval :)

   - Better cache layout for the timekeeping internal data structures.

   - Proper nanosecond based interfaces for in kernel users.

   - Tree wide cleanup of code which wants nanoseconds but does hoops
     and loops to convert back and forth from timespecs.  Some of it
     definitely belongs into the ugly code museum.

   - Consolidation of the timekeeping interface zoo.

   - A fast NMI safe accessor to clock monotonic for tracing.  This is a
     long standing request to support correlated user/kernel space
     traces.  With proper NTP frequency correction it's also suitable
     for correlation of traces accross separate machines.

   - Checkpoint/restart support for timerfd.

   - A few NOHZ[_FULL] improvements in the [hr]timer code.

   - Code move from kernel to kernel/time of all time* related code.

   - New clocksource/event drivers from the ARM universe.  I'm really
     impressed that despite an architected timer in the newer chips SoC
     manufacturers insist on inventing new and differently broken SoC
     specific timers.

[ Ed. "Impressed"? I don't think that word means what you think it means ]

   - Another round of code move from arch to drivers.  Looks like most
     of the legacy mess in ARM regarding timers is sorted out except for
     a few obnoxious strongholds.

   - The usual updates and fixlets all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
  timekeeping: Fixup typo in update_vsyscall_old definition
  clocksource: document some basic timekeeping concepts
  timekeeping: Use cached ntp_tick_length when accumulating error
  timekeeping: Rework frequency adjustments to work better w/ nohz
  timekeeping: Minor fixup for timespec64->timespec assignment
  ftrace: Provide trace clocks monotonic
  timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
  seqcount: Add raw_write_seqcount_latch()
  seqcount: Provide raw_read_seqcount()
  timekeeping: Use tk_read_base as argument for timekeeping_get_ns()
  timekeeping: Create struct tk_read_base and use it in struct timekeeper
  timekeeping: Restructure the timekeeper some more
  clocksource: Get rid of cycle_last
  clocksource: Move cycle_last validation to core code
  clocksource: Make delta calculation a function
  wireless: ath9k: Get rid of timespec conversions
  drm: vmwgfx: Use nsec based interfaces
  drm: i915: Use nsec based interfaces
  timekeeping: Provide ktime_get_raw()
  hangcheck-timer: Use ktime_get_ns()
  ...
2014-08-05 17:46:42 -07:00
Paul Moore
aa9e0de81b Linux 3.16
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJT3rbVAAoJEHm+PkMAQRiGBc0H/0PcAqZ66KqBrjCaC7tlR9ZJ
 Oyv4usrPpVmJaCaYiNwc4KnkJXDfc/foEtZq32vYSb4d8xaOLta3DrT8YJTS7B7T
 Afdg8FbVdSjBD0S8It35XidmZlOaVrgGJGpDIRBRrqDwPPgbWpTeUR73bfkwoA/R
 ziW+78s0mquo9hN9Bdu3apr7XxVmzeIUx6lJxKPCoXNEGTsSC7ibCzZRzZDMpag/
 D1JrQbE0XevgEu5fWrJkcqKceUzi3I1wuKZvBIJm2aX5XDsKpYNfQL6ViJDW56dK
 LhrB8vex8gkQYSCVPyUKx4BjkdPourSICSKq+h0SwhOCpHVHPmG8XM3J4/U4a7U=
 =yoNZ
 -----END PGP SIGNATURE-----

Merge tag 'v3.16' into next

Linux 3.16
2014-08-05 15:44:22 -04:00
Linus Torvalds
98959948a7 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - Move the nohz kick code out of the scheduler tick to a dedicated IPI,
   from Frederic Weisbecker.

  This necessiated quite some background infrastructure rework,
  including:

   * Clean up some irq-work internals
   * Implement remote irq-work
   * Implement nohz kick on top of remote irq-work
   * Move full dynticks timer enqueue notification to new kick
   * Move multi-task notification to new kick
   * Remove unecessary barriers on multi-task notification

 - Remove proliferation of wait_on_bit() action functions and allow
   wait_on_bit_action() functions to support a timeout.  (Neil Brown)

 - Another round of sched/numa improvements, cleanups and fixes.  (Rik
   van Riel)

 - Implement fast idling of CPUs when the system is partially loaded,
   for better scalability.  (Tim Chen)

 - Restructure and fix the CPU hotplug handling code that may leave
   cfs_rq and rt_rq's throttled when tasks are migrated away from a dead
   cpu.  (Kirill Tkhai)

 - Robustify the sched topology setup code.  (Peterz Zijlstra)

 - Improve sched_feat() handling wrt.  static_keys (Jason Baron)

 - Misc fixes.

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  sched/fair: Fix 'make xmldocs' warning caused by missing description
  sched: Use macro for magic number of -1 for setparam
  sched: Robustify topology setup
  sched: Fix sched_setparam() policy == -1 logic
  sched: Allow wait_on_bit_action() functions to support a timeout
  sched: Remove proliferation of wait_on_bit() action functions
  sched/numa: Revert "Use effective_load() to balance NUMA loads"
  sched: Fix static_key race with sched_feat()
  sched: Remove extra static_key*() function indirection
  sched/rt: Fix replenish_dl_entity() comments to match the current upstream code
  sched: Transform resched_task() into resched_curr()
  sched/deadline: Kill task_struct->pi_top_task
  sched: Rework check_for_tasks()
  sched/rt: Enqueue just unthrottled rt_rq back on the stack in __disable_runtime()
  sched/fair: Disable runtime_enabled on dying rq
  sched/numa: Change scan period code to match intent
  sched/numa: Rework best node setting in task_numa_migrate()
  sched/numa: Examine a task move when examining a task swap
  sched/numa: Simplify task_numa_compare()
  sched/numa: Use effective_load() to balance NUMA loads
  ...
2014-08-04 16:23:30 -07:00
James Morris
103ae675b1 Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2014-08-02 22:58:02 +10:00
Paul Moore
4fbe63d1c7 netlabel: shorter names for the NetLabel catmap funcs/structs
Historically the NetLabel LSM secattr catmap functions and data
structures have had very long names which makes a mess of the NetLabel
code and anyone who uses NetLabel.  This patch renames the catmap
functions and structures from "*_secattr_catmap_*" to just "*_catmap_*"
which improves things greatly.

There are no substantial code or logic changes in this patch.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:37 -04:00
Paul Moore
4b8feff251 netlabel: fix the horribly broken catmap functions
The NetLabel secattr catmap functions, and the SELinux import/export
glue routines, were broken in many horrible ways and the SELinux glue
code fiddled with the NetLabel catmap structures in ways that we
probably shouldn't allow.  At some point this "worked", but that was
likely due to a bit of dumb luck and sub-par testing (both inflicted
by yours truly).  This patch corrects these problems by basically
gutting the code in favor of something less obtuse and restoring the
NetLabel abstractions in the SELinux catmap glue code.

Everything is working now, and if it decides to break itself in the
future this code will be much easier to debug than the code it
replaces.

One noteworthy side effect of the changes is that it is no longer
necessary to allocate a NetLabel catmap before calling one of the
NetLabel APIs to set a bit in the catmap.  NetLabel will automatically
allocate the catmap nodes when needed, resulting in less allocations
when the lowest bit is greater than 255 and less code in the LSMs.

Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:17 -04:00
Paul Moore
41c3bd2039 netlabel: fix a problem when setting bits below the previously lowest bit
The NetLabel category (catmap) functions have a problem in that they
assume categories will be set in an increasing manner, e.g. the next
category set will always be larger than the last.  Unfortunately, this
is not a valid assumption and could result in problems when attempting
to set categories less than the startbit in the lowest catmap node.
In some cases kernel panics and other nasties can result.

This patch corrects the problem by checking for this and allocating a
new catmap node instance and placing it at the front of the list.

Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:03 -04:00
James Morris
167225b775 Merge branch 'stable-3.16' of git://git.infradead.org/users/pcmoore/selinux into next 2014-07-30 01:31:46 +10:00
Paul Moore
2873ead7e4 Revert "selinux: fix the default socket labeling in sock_graft()"
This reverts commit 4da6daf4d3.

Unfortunately, the commit in question caused problems with Bluetooth
devices, specifically it caused them to get caught in the newly
created BUG_ON() check.  The AF_ALG problem still exists, but will be
addressed in a future patch.

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-07-28 10:46:07 -04:00
Mimi Zohar
b64cc5fb85 KEYS: revert encrypted key change
Commit fc7c70e "KEYS: struct key_preparsed_payload should have two
payload pointers" erroneously modified encrypted-keys.  This patch
reverts the change to that file.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-28 12:36:17 +01:00
Mimi Zohar
5a9196d715 ima: add support for measuring and appraising firmware
The "security: introduce kernel_fw_from_file hook" patch defined a
new security hook to evaluate any loaded firmware that wasn't built
into the kernel.

This patch defines ima_fw_from_file(), which is called from the new
security hook, to measure and/or appraise the loaded firmware's
integrity.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2014-07-25 11:47:46 -07:00
Kees Cook
13752fe2d7 security: introduce kernel_fw_from_file hook
In order to validate the contents of firmware being loaded, there must be
a hook to evaluate any loaded firmware that wasn't built into the kernel
itself. Without this, there is a risk that a root user could load malicious
firmware designed to mount an attack against kernel memory (e.g. via DMA).

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
2014-07-25 11:47:45 -07:00
Eric Paris
7d8b6c6375 CAPABILITIES: remove undefined caps from all processes
This is effectively a revert of 7b9a7ec565
plus fixing it a different way...

We found, when trying to run an application from an application which
had dropped privs that the kernel does security checks on undefined
capability bits.  This was ESPECIALLY difficult to debug as those
undefined bits are hidden from /proc/$PID/status.

Consider a root application which drops all capabilities from ALL 4
capability sets.  We assume, since the application is going to set
eff/perm/inh from an array that it will clear not only the defined caps
less than CAP_LAST_CAP, but also the higher 28ish bits which are
undefined future capabilities.

The BSET gets cleared differently.  Instead it is cleared one bit at a
time.  The problem here is that in security/commoncap.c::cap_task_prctl()
we actually check the validity of a capability being read.  So any task
which attempts to 'read all things set in bset' followed by 'unset all
things set in bset' will not even attempt to unset the undefined bits
higher than CAP_LAST_CAP.

So the 'parent' will look something like:
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	ffffffc000000000

All of this 'should' be fine.  Given that these are undefined bits that
aren't supposed to have anything to do with permissions.  But they do...

So lets now consider a task which cleared the eff/perm/inh completely
and cleared all of the valid caps in the bset (but not the invalid caps
it couldn't read out of the kernel).  We know that this is exactly what
the libcap-ng library does and what the go capabilities library does.
They both leave you in that above situation if you try to clear all of
you capapabilities from all 4 sets.  If that root task calls execve()
the child task will pick up all caps not blocked by the bset.  The bset
however does not block bits higher than CAP_LAST_CAP.  So now the child
task has bits in eff which are not in the parent.  These are
'meaningless' undefined bits, but still bits which the parent doesn't
have.

The problem is now in cred_cap_issubset() (or any operation which does a
subset test) as the child, while a subset for valid cap bits, is not a
subset for invalid cap bits!  So now we set durring commit creds that
the child is not dumpable.  Given it is 'more priv' than its parent.  It
also means the parent cannot ptrace the child and other stupidity.

The solution here:
1) stop hiding capability bits in status
	This makes debugging easier!

2) stop giving any task undefined capability bits.  it's simple, it you
don't put those invalid bits in CAP_FULL_SET you won't get them in init
and you won't get them in any other task either.
	This fixes the cap_issubset() tests and resulting fallout (which
	made the init task in a docker container untraceable among other
	things)

3) mask out undefined bits when sys_capset() is called as it might use
~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
	This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.

4) mask out undefined bit when we read a file capability off of disk as
again likely all bits are set in the xattr for forward/backward
compatibility.
	This lets 'setcap all+pe /bin/bash; /bin/bash' run

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Dan Walsh <dwalsh@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-07-24 21:53:47 +10:00
James Morris
4ca332e11d Merge tag 'keys-next-20140722' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2014-07-24 21:36:19 +10:00
Tetsuo Handa
6d6f332842 commoncap: don't alloc the credential unless needed in cap_task_prctl
In function cap_task_prctl(), we would allocate a credential
unconditionally and then check if we support the requested function.
If not we would release this credential with abort_creds() by using
RCU method. But on some archs such as powerpc, the sys_prctl is heavily
used to get/set the floating point exception mode. So the unnecessary
allocating/releasing of credential not only introduce runtime overhead
but also do cause OOM due to the RCU implementation.

This patch removes abort_creds() from cap_task_prctl() by calling
prepare_creds() only when we need to modify it.

Reported-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-07-24 21:12:30 +10:00
David Howells
633706a2ee Merge branch 'keys-fixes' into keys-next
Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-22 21:55:45 +01:00
David Howells
64724cfc6e Merge remote-tracking branch 'integrity/next-with-keys' into keys-next
Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-22 21:54:43 +01:00
David Howells
f1dcde91a3 KEYS: request_key_auth: Provide key preparsing
Provide key preparsing for the request_key_auth key type so that we can make
preparsing mandatory.  This does nothing as this type can only be set up
internally to the kernel.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22 21:46:55 +01:00
David Howells
5d19e20b53 KEYS: keyring: Provide key preparsing
Provide key preparsing in the keyring so that we can make preparsing
mandatory.  For keyrings, however, only an empty payload is permitted.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22 21:46:51 +01:00
David Howells
002edaf76f KEYS: big_key: Use key preparsing
Make use of key preparsing in the big key type so that quota size determination
can take place prior to keyring locking when a key is being added.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
2014-07-22 21:46:47 +01:00
David Howells
f9167789df KEYS: user: Use key preparsing
Make use of key preparsing in user-defined and logon keys so that quota size
determination can take place prior to keyring locking when a key is being
added.

Also the idmapper key types need to change to match as they use the
user-defined key type routines.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22 21:46:17 +01:00
David Howells
4d8c0250b8 KEYS: Call ->free_preparse() even after ->preparse() returns an error
Call the ->free_preparse() key type op even after ->preparse() returns an
error as it does cleaning up type stuff.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22 21:46:12 +01:00
David Howells
7dfa0ca6a9 KEYS: Allow expiry time to be set when preparsing a key
Allow a key type's preparsing routine to set the expiry time for a key.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22 21:46:08 +01:00
David Howells
fc7c70e0b6 KEYS: struct key_preparsed_payload should have two payload pointers
struct key_preparsed_payload should have two payload pointers to correspond
with those in struct key.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22 21:46:02 +01:00
James Morris
fd33c43677 Merge tag 'seccomp-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into next 2014-07-19 17:40:49 +10:00
James Morris
2ccf4661f3 Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2014-07-19 17:39:19 +10:00
Kees Cook
1d4457f999 sched: move no_new_privs into new atomic flags
Since seccomp transitions between threads requires updates to the
no_new_privs flag to be atomic, the flag must be part of an atomic flag
set. This moves the nnp flag into a separate task field, and introduces
accessors.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18 12:13:38 -07:00
David Howells
6a09d17bb6 KEYS: Provide a generic instantiation function
Provide a generic instantiation function for key types that use the preparse
hook.  This makes it easier to prereserve key quota before keyrings get locked
to retain the new key.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-18 18:56:34 +01:00
David Howells
0c7774abb4 KEYS: Allow special keys (eg. DNS results) to be invalidated by CAP_SYS_ADMIN
Special kernel keys, such as those used to hold DNS results for AFS, CIFS and
NFS and those used to hold idmapper results for NFS, used to be
'invalidateable' with key_revoke().  However, since the default permissions for
keys were reduced:

	Commit: 96b5c8fea6
	KEYS: Reduce initial permissions on keys

it has become impossible to do this.

Add a key flag (KEY_FLAG_ROOT_CAN_INVAL) that will permit a key to be
invalidated by root.  This should not be used for system keyrings as the
garbage collector will try and remove any invalidate key.  For system keyrings,
KEY_FLAG_ROOT_CAN_CLEAR can be used instead.

After this, from userspace, keyctl_invalidate() and "keyctl invalidate" can be
used by any possessor of CAP_SYS_ADMIN (typically root) to invalidate DNS and
idmapper keys.  Invalidated keys are immediately garbage collected and will be
immediately rerequested if needed again.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Steve Dickson <steved@redhat.com>
2014-07-17 20:45:08 +01:00
Mimi Zohar
7d2ce2320e ima: define '.ima' as a builtin 'trusted' keyring
Require all keys added to the IMA keyring be signed by an
existing trusted key on the system trusted keyring.

Changelog v6:
- remove ifdef CONFIG_IMA_TRUSTED_KEYRING in C code - Dmitry
- update Kconfig dependency and help
- select KEYS_DEBUG_PROC_KEYS - Dmitry

Changelog v5:
- Move integrity_init_keyring() to init_ima() - Dmitry
- reset keyring[id] on failure - Dmitry

Changelog v1:
- don't link IMA trusted keyring to user keyring

Changelog:
- define stub integrity_init_keyring() function (reported-by Fengguang Wu)
- differentiate between regular and trusted keyring names.
- replace printk with pr_info (D. Kasatkin)
- only make the IMA keyring a trusted keyring (reported-by D. Kastatkin)
- define stub integrity_init_keyring() definition based on
  CONFIG_INTEGRITY_SIGNATURE, not CONFIG_INTEGRITY_ASYMMETRIC_KEYS.
  (reported-by Jim Davis)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Acked-by: David Howells <dhowells@redhat.com>
2014-07-17 09:35:17 -04:00
Mimi Zohar
a4e3b8d79a KEYS: special dot prefixed keyring name bug fix
Dot prefixed keyring names are supposed to be reserved for the
kernel, but add_key() calls key_get_type_from_user(), which
incorrectly verifies the 'type' field, not the 'description' field.
This patch verifies the 'description' field isn't dot prefixed,
when creating a new keyring, and removes the dot prefix test in
key_get_type_from_user().

Changelog v6:
- whitespace and other cleanup

Changelog v5:
- Only prevent userspace from creating a dot prefixed keyring, not
  regular keys  - Dmitry

Reported-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
2014-07-17 09:35:14 -04:00
Dmitry Kasatkin
32c2e6752f ima: provide double buffering for hash calculation
The asynchronous hash API allows initiating a hash calculation and
then performing other tasks, while waiting for the hash calculation
to complete.

This patch introduces usage of double buffering for simultaneous
hashing and reading of the next chunk of data from storage.

Changes in v3:
- better comments

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:11 -04:00
Dmitry Kasatkin
6edf7a8926 ima: introduce multi-page collect buffers
Use of multiple-page collect buffers reduces:
1) the number of block IO requests
2) the number of asynchronous hash update requests

Second is important for HW accelerated hashing, because significant
amount of time is spent for preparation of hash update operation,
which includes configuring acceleration HW, DMA engine, etc...
Thus, HW accelerators are more efficient when working on large
chunks of data.

This patch introduces usage of multi-page collect buffers. Buffer size
can be specified using 'ahash_bufsize' module parameter. Default buffer
size is 4096 bytes.

Changes in v3:
- kernel parameter replaced with module parameter

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:11 -04:00
Dmitry Kasatkin
3bcced39ea ima: use ahash API for file hash calculation
Async hash API allows the use of HW acceleration for hash calculation.
It may give significant performance gain and/or reduce power consumption,
which might be very beneficial for battery powered devices.

This patch introduces hash calculation using ahash API. ahash performance
depends on the data size and the particular HW. Depending on the specific
system, shash performance may be better.

This patch defines 'ahash_minsize' module parameter, which is used to
define the minimal file size to use with ahash.  If this minimum file size
is not set or the file is smaller than defined by the parameter, shash will
be used.

Changes in v3:
- kernel parameter replaced with module parameter
- pr_crit replaced with pr_crit_ratelimited
- more comment changes - Mimi

Changes in v2:
- ima_ahash_size became as ima_ahash
- ahash pre-allocation moved out from __init code to be able to use
  ahash crypto modules. Ahash allocated once on the first use.
- hash calculation falls back to shash if ahash allocation/calculation fails
- complex initialization separated from variable declaration
- improved comments

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:10 -04:00
Richard Guy Briggs
7e9001f663 audit: fix dangling keywords in integrity ima message output
Replace spaces in op keyword labels in log output since userspace audit tools
can't parse orphaned keywords.

Reported-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:10 -04:00
Dmitry Kasatkin
209b43ca64 ima: delay template descriptor lookup until use
process_measurement() always calls ima_template_desc_current(),
including when an IMA policy has not been defined.

This patch delays template descriptor lookup until action is
determined.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:09 -04:00
Dmitry Kasatkin
2c50b96482 ima: remove unnecessary i_mutex locking from ima_rdwr_violation_check()
Before 2.6.39 inode->i_readcount was maintained by IMA. It was not atomic
and protected using spinlock. For 2.6.39, i_readcount was converted to
atomic and maintaining was moved VFS layer. Spinlock for some unclear
reason was replaced by i_mutex.

After analyzing the code, we came to conclusion that i_mutex locking is
unnecessary, especially when an IMA policy has not been defined.

This patch removes i_mutex locking from ima_rdwr_violation_check().

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:09 -04:00
Thomas Gleixner
afdb094380 Linux 3.16-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTwvRvAAoJEHm+PkMAQRiG8CoIAJucWkj+MJFFoDXjR9hfI8U7
 /WeQLJP0GpWGMXd2KznX9epCuw5rsuaPAxCy1HFDNOa7OtNYacWrsIhByxOIDLwL
 YjDB9+fpMMPFWsr+LPJa8Ombh/TveCS77w6Pt5VMZFwvIKujiNK/C3MdxjReH5Gr
 iTGm8x7nEs2D6L2+5sQVlhXot/97phxIlBSP6wPXEiaztNZ9/JZi905Xpgq+WU16
 ZOA8MlJj1TQD4xcWyUcsQ5REwIOdQ6xxPF00wv/12RFela+Puy4JLAilnV6Mc12U
 fwYOZKbUNBS8rjfDDdyX3sljV1L5iFFqKkW3WFdnv/z8ZaZSo5NupWuavDnifKw=
 =6Q8o
 -----END PGP SIGNATURE-----

Merge tag 'v3.16-rc5' into timers/core

Reason: Bring in upstream modifications, so the pending changes which
depend on them can be queued.
2014-07-16 21:57:38 +02:00
James Morris
b6b8a371f5 Merge branch 'stable-3.16' of git://git.infradead.org/users/pcmoore/selinux into next 2014-07-17 03:05:51 +10:00
NeilBrown
743162013d sched: Remove proliferation of wait_on_bit() action functions
The current "wait_on_bit" interface requires an 'action'
function to be provided which does the actual waiting.
There are over 20 such functions, many of them identical.
Most cases can be satisfied by one of just two functions, one
which uses io_schedule() and one which just uses schedule().

So:
 Rename wait_on_bit and        wait_on_bit_lock to
        wait_on_bit_action and wait_on_bit_lock_action
 to make it explicit that they need an action function.

 Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
 which are *not* given an action function but implicitly use
 a standard one.
 The decision to error-out if a signal is pending is now made
 based on the 'mode' argument rather than being encoded in the action
 function.

 All instances of the old wait_on_bit and wait_on_bit_lock which
 can use the new version have been changed accordingly and their
 action functions have been discarded.
 wait_on_bit{_lock} does not return any specific error code in the
 event of a signal so the caller must check for non-zero and
 interpolate their own error code as appropriate.

The wait_on_bit() call in __fscache_wait_on_invalidate() was
ambiguous as it specified TASK_UNINTERRUPTIBLE but used
fscache_wait_bit_interruptible as an action function.
David Howells confirms this should be uniformly
"uninterruptible"

The main remaining user of wait_on_bit{,_lock}_action is NFS
which needs to use a freezer-aware schedule() call.

A comment in fs/gfs2/glock.c notes that having multiple 'action'
functions is useful as they display differently in the 'wchan'
field of 'ps'. (and /proc/$PID/wchan).
As the new bit_wait{,_io} functions are tagged "__sched", they
will not show up at all, but something higher in the stack.  So
the distinction will still be visible, only with different
function names (gds2_glock_wait versus gfs2_glock_dq_wait in the
gfs2/glock.c case).

Since first version of this patch (against 3.15) two new action
functions appeared, on in NFS and one in CIFS.  CIFS also now
uses an action function that makes the same freezer aware
schedule call as NFS.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 15:10:39 +02:00
Tejun Heo
5577964e64 cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes
Currently, cgroup_subsys->base_cftypes is used for both the unified
default hierarchy and legacy ones and subsystems can mark each file
with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear
only on one of them.  This is quite hairy and error-prone.  Also, we
may end up exposing interface files to the default hierarchy without
thinking it through.

cgroup_subsys will grow two separate cftype arrays and apply each only
on the hierarchies of the matching type.  This will allow organizing
cftypes in a lot clearer way and encourage subsystems to scrutinize
the interface which is being exposed in the new default hierarchy.

In preparation, this patch renames cgroup_subsys->base_cftypes to
cgroup_subsys->legacy_cftypes.  This patch is pure rename.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2014-07-15 11:05:09 -04:00
Paul Moore
4da6daf4d3 selinux: fix the default socket labeling in sock_graft()
The sock_graft() hook has special handling for AF_INET, AF_INET, and
AF_UNIX sockets as those address families have special hooks which
label the sock before it is attached its associated socket.
Unfortunately, the sock_graft() hook was missing a default approach
to labeling sockets which meant that any other address family which
made use of connections or the accept() syscall would find the
returned socket to be in an "unlabeled" state.  This was recently
demonstrated by the kcrypto/AF_ALG subsystem and the newly released
cryptsetup package (cryptsetup v1.6.5 and later).

This patch preserves the special handling in selinux_sock_graft(),
but adds a default behavior - setting the sock's label equal to the
associated socket - which resolves the problem with AF_ALG and
presumably any other address family which makes use of accept().

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
2014-07-10 10:17:48 -04:00
Paul Moore
615e51fdda selinux: reduce the number of calls to synchronize_net() when flushing caches
When flushing the AVC, such as during a policy load, the various
network caches are also flushed, with each making a call to
synchronize_net() which has shown to be expensive in some cases.
This patch consolidates the network cache flushes into a single AVC
callback which only calls synchronize_net() once for each AVC cache
flush.

Reported-by: Jaejyn Shin <flagon22bass@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-26 14:33:56 -04:00
Waiman Long
f31e799459 selinux: no recursive read_lock of policy_rwlock in security_genfs_sid()
With the introduction of fair queued rwlock, recursive read_lock()
may hang the offending process if there is a write_lock() somewhere
in between.

With recursive read_lock checking enabled, the following error was
reported:

=============================================
[ INFO: possible recursive locking detected ]
3.16.0-rc1 #2 Tainted: G            E
---------------------------------------------
load_policy/708 is trying to acquire lock:
 (policy_rwlock){.+.+..}, at: [<ffffffff8125b32a>]
security_genfs_sid+0x3a/0x170

but task is already holding lock:
 (policy_rwlock){.+.+..}, at: [<ffffffff8125b48c>]
security_fs_use+0x2c/0x110

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(policy_rwlock);
  lock(policy_rwlock);

This patch fixes the occurrence of recursive read_lock() of
policy_rwlock by adding a helper function __security_genfs_sid()
which requires caller to take the lock before calling it. The
security_fs_use() was then modified to call the new helper function.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-23 16:52:55 -04:00
Namhyung Kim
6e51f9cbfa selinux: fix a possible memory leak in cond_read_node()
The cond_read_node() should free the given node on error path as it's
not linked to p->cond_list yet.  This is done via cond_node_destroy()
but it's not called when next_entry() fails before the expr loop.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-19 14:56:59 -04:00
Namhyung Kim
f004afe60d selinux: simple cleanup for cond_read_node()
The node->cur_state and len can be read in a single call of next_entry().
And setting len before reading is a dead write so can be eliminated.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
(Minor tweak to the length parameter in the call to next_entry())
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-19 14:53:15 -04:00
Gideon Israel Dsouza
4bb9398300 security: Used macros from compiler.h instead of __attribute__((...))
To increase compiler portability there is <linux/compiler.h> which
provides convenience macros for various gcc constructs.  Eg: __packed
for __attribute__((packed)).

This patch is part of a large task I've taken to clean the gcc
specific attributes and use the the macros instead.

Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-18 16:59:34 -04:00
Namhyung Kim
4b6f405f72 selinux: introduce str_read() helper
There're some code duplication for reading a string value during
policydb_read().  Add str_read() helper to fix it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-18 15:55:58 -04:00
Himangi Saraogi
5c7001b84b SELinux: use ARRAY_SIZE
ARRAY_SIZE is more concise to use when the size of an array is divided
by the size of its type or the size of its first element.

The Coccinelle semantic patch that makes this change is as follows:

// <smpl>
@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(E[...]))
+ ARRAY_SIZE(E)
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-17 17:36:02 -04:00
Paul Moore
170b5910d9 Linux 3.15
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTlKlSAAoJEHm+PkMAQRiGF5YH/Ak8QPJLI0mRvSDI9PuaVVfv
 XPnXrOIigcj/X0CL4WbzqWW2zQFAK6YtDDoNlzuRaRveFVx0TNp4YAI/YHCD4IVZ
 FzRXBEWpeGvTh5Ev0DYDyHMXyMqz5fJeIM3rCmOCUOlHJQxneIHGKIXwsMKpAPkX
 BxVdfEqt/Yq/CB5RFbvAyfFlXlnl8A5QDELEkaCk+1s3iXXST5qUKl8XwdgCvfKF
 Qq3yau12+Lmw89Klh74U9Hj5s/LXz5L8Gg66XM30hTsaK5/0AQ7oGx3kJvO3Uurs
 CO3vrt7EXJjeq2YLken4Wa7UpPoHOK33KY2E/ByhkJn4xTFwJmo68miLH0Edy9U=
 =4IWq
 -----END PGP SIGNATURE-----

Merge tag 'v3.15' into next

Linux 3.15
2014-06-17 17:30:23 -04:00
Linus Torvalds
aa569fa0ea Merge branch 'serge-next-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security
Pull more security layer updates from Serge Hallyn:
 "A few more commits had previously failed to make it through
  security-next into linux-next but this week made it into linux-next.
  At least commit "ima: introduce ima_kernel_read()" was deemed critical
  by Mimi to make this merge window.

  This is a temporary tree just for this request.  Mimi has pointed me
  to some previous threads about keeping maintainer trees at the
  previous release, which I'll certainly do for anything long-term,
  after talking with James"

* 'serge-next-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security:
  ima: introduce ima_kernel_read()
  evm: prohibit userspace writing 'security.evm' HMAC value
  ima: check inode integrity cache in violation check
  ima: prevent unnecessary policy checking
  evm: provide option to protect additional SMACK xattrs
  evm: replace HMAC version with attribute mask
  ima: prevent new digsig xattr from being replaced
2014-06-13 07:39:39 -07:00
Dmitry Kasatkin
0430e49b6e ima: introduce ima_kernel_read()
Commit 8aac62706 "move exit_task_namespaces() outside of exit_notify"
introduced the kernel opps since the kernel v3.10, which happens when
Apparmor and IMA-appraisal are enabled at the same time.

----------------------------------------------------------------------
[  106.750167] BUG: unable to handle kernel NULL pointer dereference at
0000000000000018
[  106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[  106.750241] PGD 0
[  106.750254] Oops: 0000 [#1] SMP
[  106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm
bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp
kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul
ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel
snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi
snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw
snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp
parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci
pps_core
[  106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15
[  106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08
09/19/2012
[  106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti:
ffff880400fca000
[  106.750704] RIP: 0010:[<ffffffff811ec7da>]  [<ffffffff811ec7da>]
our_mnt+0x1a/0x30
[  106.750725] RSP: 0018:ffff880400fcba60  EFLAGS: 00010286
[  106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX:
ffff8800d51523e7
[  106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI:
ffff880402d20020
[  106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09:
0000000000000001
[  106.750817] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff8800d5152300
[  106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15:
ffff8800d51523e7
[  106.750871] FS:  0000000000000000(0000) GS:ffff88040d200000(0000)
knlGS:0000000000000000
[  106.750910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4:
00000000001407e0
[  106.750962] Stack:
[  106.750981]  ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18
0000000000000000
[  106.751037]  ffff8800de804920 ffffffff8101b9b9 0001800000000000
0000000000000100
[  106.751093]  0000010000000000 0000000000000002 000000000000000e
ffff8803eb8df500
[  106.751149] Call Trace:
[  106.751172]  [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430
[  106.751199]  [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10
[  106.751225]  [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170
[  106.751250]  [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80
[  106.751276]  [<ffffffff8134aa73>] aa_file_perm+0x33/0x40
[  106.751301]  [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0
[  106.751327]  [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20
[  106.751355]  [<ffffffff8130c853>] security_file_permission+0x23/0xa0
[  106.751382]  [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0
[  106.751407]  [<ffffffff811c789d>] vfs_read+0x6d/0x170
[  106.751432]  [<ffffffff811cda31>] kernel_read+0x41/0x60
[  106.751457]  [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280
[  106.751483]  [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280
[  106.751509]  [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160
[  106.751536]  [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10
[  106.751562]  [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0
[  106.751587]  [<ffffffff81352824>] ima_update_xattr+0x34/0x60
[  106.751612]  [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0
[  106.751637]  [<ffffffff811c9635>] __fput+0xd5/0x300
[  106.751662]  [<ffffffff811c98ae>] ____fput+0xe/0x10
[  106.751687]  [<ffffffff81086774>] task_work_run+0xc4/0xe0
[  106.751712]  [<ffffffff81066fad>] do_exit+0x2bd/0xa90
[  106.751738]  [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b
[  106.751763]  [<ffffffff8106780c>] do_group_exit+0x4c/0xc0
[  106.751788]  [<ffffffff81067894>] SyS_exit_group+0x14/0x20
[  106.751814]  [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f
[  106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3
0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89
e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00
[  106.752185] RIP  [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[  106.752214]  RSP <ffff880400fcba60>
[  106.752236] CR2: 0000000000000018
[  106.752258] ---[ end trace 3c520748b4732721 ]---
----------------------------------------------------------------------

The reason for the oops is that IMA-appraisal uses "kernel_read()" when
file is closed. kernel_read() honors LSM security hook which calls
Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty'
commit changed the order of cleanup code so that nsproxy->mnt_ns was
not already available for Apparmor.

Discussion about the issue with Al Viro and Eric W. Biederman suggested
that kernel_read() is too high-level for IMA. Another issue, except
security checking, that was identified is mandatory locking. kernel_read
honors it as well and it might prevent IMA from calculating necessary hash.
It was suggested to use simplified version of the function without security
and locking checks.

This patch introduces special version ima_kernel_read(), which skips security
and mandatory locking checking. It prevents the kernel oops to happen.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
2014-06-12 17:58:08 -04:00
Mimi Zohar
2fb1c9a4f2 evm: prohibit userspace writing 'security.evm' HMAC value
Calculating the 'security.evm' HMAC value requires access to the
EVM encrypted key.  Only the kernel should have access to it.  This
patch prevents userspace tools(eg. setfattr, cp --preserve=xattr)
from setting/modifying the 'security.evm' HMAC value directly.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
2014-06-12 17:58:07 -04:00
Dmitry Kasatkin
14503eb994 ima: check inode integrity cache in violation check
When IMA did not support ima-appraisal, existance of the S_IMA flag
clearly indicated that the file was measured. With IMA appraisal S_IMA
flag indicates that file was measured and/or appraised. Because of
this, when measurement is not enabled by the policy, violations are
still reported.

To differentiate between measurement and appraisal policies this
patch checks the inode integrity cache flags.  The IMA_MEASURED
flag indicates whether the file was actually measured, while the
IMA_MEASURE flag indicates whether the file should be measured.
Unfortunately, the IMA_MEASURED flag is reset to indicate the file
needs to be re-measured.  Thus, this patch checks the IMA_MEASURE
flag.

This patch limits the false positive violation reports, but does
not fix it entirely.  The IMA_MEASURE/IMA_MEASURED flags are
indications that, at some point in time, the file opened for read
was in policy, but might not be in policy now (eg. different uid).
Other changes would be needed to further limit false positive
violation reports.

Changelog:
- expanded patch description based on conversation with Roberto (Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:07 -04:00
Dmitry Kasatkin
b882fae2d3 ima: prevent unnecessary policy checking
ima_rdwr_violation_check is called for every file openning.
The function checks the policy even when violation condition
is not met. It causes unnecessary policy checking.

This patch does policy checking only if violation condition is met.

Changelog:
- check writecount is greater than zero (Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:06 -04:00
Dmitry Kasatkin
3e38df56e6 evm: provide option to protect additional SMACK xattrs
Newer versions of SMACK introduced following security xattrs:
SMACK64EXEC, SMACK64TRANSMUTE and SMACK64MMAP.

To protect these xattrs, this patch includes them in the HMAC
calculation.  However, for backwards compatibility with existing
labeled filesystems, including these xattrs needs to be
configurable.

Changelog:
- Add SMACK dependency on new option (Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:06 -04:00
Dmitry Kasatkin
d3b3367948 evm: replace HMAC version with attribute mask
Using HMAC version limits the posibility to arbitrarily add new
attributes such as SMACK64EXEC to the hmac calculation.

This patch replaces hmac version with attribute mask.
Desired attributes can be enabled with configuration parameter.
It allows to build kernels which works with previously labeled
filesystems.

Currently supported attribute is 'fsuuid' which is equivalent of
the former version 2.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:06 -04:00
Mimi Zohar
060bdebfb0 ima: prevent new digsig xattr from being replaced
Even though a new xattr will only be appraised on the next access,
set the DIGSIG flag to prevent a signature from being replaced with
a hash on file close.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:05 -04:00
Linus Torvalds
f9da455b93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
2014-06-12 14:27:40 -07:00
Thomas Gleixner
77f4fa089c tomoyo: Use sensible time interface
There is no point in calling gettimeofday if only the seconds part of
the timespec is used. Use get_seconds() instead. It's not only the
proper interface it's also faster.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: linux-security-module@vger.kernel.org
Link: http://lkml.kernel.org/r/20140611234607.775273584@linutronix.de
2014-06-12 16:18:45 +02:00
Linus Torvalds
fad0701eaa Merge branch 'serge-next-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security
Pull security layer updates from Serge Hallyn:
 "This is a merge of James Morris' security-next tree from 3.14 to
  yesterday's master, plus four patches from Paul Moore which are in
  linux-next, plus one patch from Mimi"

* 'serge-next-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security:
  ima: audit log files opened with O_DIRECT flag
  selinux: conditionally reschedule in hashtab_insert while loading selinux policy
  selinux: conditionally reschedule in mls_convert_context while loading selinux policy
  selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES
  selinux:  Report permissive mode in avc: denied messages.
  Warning in scanf string typing
  Smack: Label cgroup files for systemd
  Smack: Verify read access on file open - v3
  security: Convert use of typedef ctl_table to struct ctl_table
  Smack: bidirectional UDS connect check
  Smack: Correctly remove SMACK64TRANSMUTE attribute
  SMACK: Fix handling value==NULL in post setxattr
  bugfix patch for SMACK
  Smack: adds smackfs/ptrace interface
  Smack: unify all ptrace accesses in the smack
  Smack: fix the subject/object order in smack_ptrace_traceme()
  Minor improvement of 'smack_sb_kern_mount'
  smack: fix key permission verification
  KEYS: Move the flags representing required permission to linux/key.h
2014-06-10 10:05:36 -07:00
Linus Torvalds
14208b0ec5 Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "A lot of activities on cgroup side.  Heavy restructuring including
  locking simplification took place to improve the code base and enable
  implementation of the unified hierarchy, which currently exists behind
  a __DEVEL__ mount option.  The core support is mostly complete but
  individual controllers need further work.  To explain the design and
  rationales of the the unified hierarchy

        Documentation/cgroups/unified-hierarchy.txt

  is added.

  Another notable change is css (cgroup_subsys_state - what each
  controller uses to identify and interact with a cgroup) iteration
  update.  This is part of continuing updates on css object lifetime and
  visibility.  cgroup started with reference count draining on removal
  way back and is now reaching a point where csses behave and are
  iterated like normal refcnted objects albeit with some complexities to
  allow distinguishing the state where they're being deleted.  The css
  iteration update isn't taken advantage of yet but is planned to be
  used to simplify memcg significantly"

* 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (77 commits)
  cgroup: disallow disabled controllers on the default hierarchy
  cgroup: don't destroy the default root
  cgroup: disallow debug controller on the default hierarchy
  cgroup: clean up MAINTAINERS entries
  cgroup: implement css_tryget()
  device_cgroup: use css_has_online_children() instead of has_children()
  cgroup: convert cgroup_has_live_children() into css_has_online_children()
  cgroup: use CSS_ONLINE instead of CGRP_DEAD
  cgroup: iterate cgroup_subsys_states directly
  cgroup: introduce CSS_RELEASED and reduce css iteration fallback window
  cgroup: move cgroup->serial_nr into cgroup_subsys_state
  cgroup: link all cgroup_subsys_states in their sibling lists
  cgroup: move cgroup->sibling and ->children into cgroup_subsys_state
  cgroup: remove cgroup->parent
  device_cgroup: remove direct access to cgroup->children
  memcg: update memcg_has_children() to use css_next_child()
  memcg: remove tasks/children test from mem_cgroup_force_empty()
  cgroup: remove css_parent()
  cgroup: skip refcnting on normal root csses and cgrp_dfl_root self css
  cgroup: use cgroup->self.refcnt for cgroup refcnting
  ...
2014-06-09 15:03:33 -07:00
Mimi Zohar
f9b2a735bd ima: audit log files opened with O_DIRECT flag
Files are measured or appraised based on the IMA policy.  When a
file, in policy, is opened with the O_DIRECT flag, a deadlock
occurs.

The first attempt at resolving this lockdep temporarily removed the
O_DIRECT flag and restored it, after calculating the hash.  The
second attempt introduced the O_DIRECT_HAVELOCK flag. Based on this
flag, do_blockdev_direct_IO() would skip taking the i_mutex a second
time.  The third attempt, by Dmitry Kasatkin, resolves the i_mutex
locking issue, by re-introducing the IMA mutex, but uncovered
another problem.  Reading a file with O_DIRECT flag set, writes
directly to userspace pages.  A second patch allocates a user-space
like memory.  This works for all IMA hooks, except ima_file_free(),
which is called on __fput() to recalculate the file hash.

Until this last issue is addressed, do not 'collect' the
measurement for measuring, appraising, or auditing files opened
with the O_DIRECT flag set.  Based on policy, permit or deny file
access.  This patch defines a new IMA policy rule option named
'permit_directio'.  Policy rules could be defined, based on LSM
or other criteria, to permit specific applications to open files
with the O_DIRECT flag set.

Changelog v1:
- permit or deny file access based IMA policy rules

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Cc: <stable@vger.kernel.org>
2014-06-03 14:21:50 -05:00
Dave Jones
ed1c96429a selinux: conditionally reschedule in hashtab_insert while loading selinux policy
After silencing the sleeping warning in mls_convert_context() I started
seeing similar traces from hashtab_insert. Do a cond_resched there too.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03 14:21:50 -05:00
Dave Jones
9a591f39a9 selinux: conditionally reschedule in mls_convert_context while loading selinux policy
On a slow machine (with debugging enabled), upgrading selinux policy may take
a considerable amount of time. Long enough that the softlockup detector
gets triggered.

The backtrace looks like this..

 > BUG: soft lockup - CPU#2 stuck for 23s! [load_policy:19045]
 > Call Trace:
 >  [<ffffffff81221ddf>] symcmp+0xf/0x20
 >  [<ffffffff81221c27>] hashtab_search+0x47/0x80
 >  [<ffffffff8122e96c>] mls_convert_context+0xdc/0x1c0
 >  [<ffffffff812294e8>] convert_context+0x378/0x460
 >  [<ffffffff81229170>] ? security_context_to_sid_core+0x240/0x240
 >  [<ffffffff812221b5>] sidtab_map+0x45/0x80
 >  [<ffffffff8122bb9f>] security_load_policy+0x3ff/0x580
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8109c82d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
 >  [<ffffffff81279a2e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 >  [<ffffffff810d28a8>] ? rcu_irq_exit+0x68/0xb0
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8121e947>] sel_write_load+0xa7/0x770
 >  [<ffffffff81139633>] ? vfs_write+0x1c3/0x200
 >  [<ffffffff81210e8e>] ? security_file_permission+0x1e/0xa0
 >  [<ffffffff8113952b>] vfs_write+0xbb/0x200
 >  [<ffffffff811581c7>] ? fget_light+0x397/0x4b0
 >  [<ffffffff81139c27>] SyS_write+0x47/0xa0
 >  [<ffffffff8153bde4>] tracesys+0xdd/0xe2

Stephen Smalley suggested:

 > Maybe put a cond_resched() within the ebitmap_for_each_positive_bit()
 > loop in mls_convert_context()?

That seems to do the trick. Tested by downgrading and re-upgrading selinux-policy-targeted.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03 14:21:49 -05:00
Paul Moore
5b589d44fa selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES
We presently prevent processes from using setexecon() to set the
security label of exec()'d processes when NO_NEW_PRIVS is enabled by
returning an error; however, we silently ignore setexeccon() when
exec()'ing from a nosuid mounted filesystem.  This patch makes things
a bit more consistent by returning an error in the setexeccon()/nosuid
case.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2014-06-03 14:21:48 -05:00
Stephen Smalley
ca7786a2f9 selinux: Report permissive mode in avc: denied messages.
We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03 14:21:48 -05:00
David S. Miller
54e5c4def0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_alb.c
	drivers/net/ethernet/altera/altera_msgdma.c
	drivers/net/ethernet/altera/altera_sgdma.c
	net/ipv6/xfrm6_output.c

Several cases of overlapping changes.

The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.

In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.

Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 00:32:30 -04:00
James Morris
2fd4e6698f Merge branch 'smack-for-3.16' of git://git.gitorious.org/smack-next/kernel into next 2014-05-20 14:50:09 +10:00
Tejun Heo
7a3bb24f7c device_cgroup: use css_has_online_children() instead of has_children()
devcgroup_update_access() wants to know whether there are child
cgroups which are online and visible to userland and has_children()
may return false positive.  Replace it with css_has_online_children().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-05-16 13:22:52 -04:00
Tejun Heo
5877019d97 device_cgroup: remove direct access to cgroup->children
Currently, devcg::has_children() directly tests cgroup->children for
list emptiness.  The field is not a published field and scheduled to
go away.  In addition, the test isn't strictly correct as devcg should
only care about children which are visible to userland.

This patch converts has_children() to use css_next_child() instead.
The subtle incorrectness is noted and will be dealt with later.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-05-16 13:22:48 -04:00
Tejun Heo
5c9d535b89 cgroup: remove css_parent()
cgroup in general is moving towards using cgroup_subsys_state as the
fundamental structural component and css_parent() was introduced to
convert from using cgroup->parent to css->parent.  It was quite some
time ago and we're moving forward with making css more prominent.

This patch drops the trivial wrapper css_parent() and let the users
dereference css->parent.  While at it, explicitly mark fields of css
which are public and immutable.

v2: New usage from device_cgroup.c converted.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Johannes Weiner <hannes@cmpxchg.org>
2014-05-16 13:22:48 -04:00
Dave Jones
47dd0b76ac selinux: conditionally reschedule in hashtab_insert while loading selinux policy
After silencing the sleeping warning in mls_convert_context() I started
seeing similar traces from hashtab_insert. Do a cond_resched there too.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-05-15 17:07:55 -04:00
Dave Jones
612c353178 selinux: conditionally reschedule in mls_convert_context while loading selinux policy
On a slow machine (with debugging enabled), upgrading selinux policy may take
a considerable amount of time. Long enough that the softlockup detector
gets triggered.

The backtrace looks like this..

 > BUG: soft lockup - CPU#2 stuck for 23s! [load_policy:19045]
 > Call Trace:
 >  [<ffffffff81221ddf>] symcmp+0xf/0x20
 >  [<ffffffff81221c27>] hashtab_search+0x47/0x80
 >  [<ffffffff8122e96c>] mls_convert_context+0xdc/0x1c0
 >  [<ffffffff812294e8>] convert_context+0x378/0x460
 >  [<ffffffff81229170>] ? security_context_to_sid_core+0x240/0x240
 >  [<ffffffff812221b5>] sidtab_map+0x45/0x80
 >  [<ffffffff8122bb9f>] security_load_policy+0x3ff/0x580
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8109c82d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
 >  [<ffffffff81279a2e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 >  [<ffffffff810d28a8>] ? rcu_irq_exit+0x68/0xb0
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8121e947>] sel_write_load+0xa7/0x770
 >  [<ffffffff81139633>] ? vfs_write+0x1c3/0x200
 >  [<ffffffff81210e8e>] ? security_file_permission+0x1e/0xa0
 >  [<ffffffff8113952b>] vfs_write+0xbb/0x200
 >  [<ffffffff811581c7>] ? fget_light+0x397/0x4b0
 >  [<ffffffff81139c27>] SyS_write+0x47/0xa0
 >  [<ffffffff8153bde4>] tracesys+0xdd/0xe2

Stephen Smalley suggested:

 > Maybe put a cond_resched() within the ebitmap_for_each_positive_bit()
 > loop in mls_convert_context()?

That seems to do the trick. Tested by downgrading and re-upgrading selinux-policy-targeted.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-05-15 17:06:14 -04:00
Paul Moore
4f189988a0 selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES
We presently prevent processes from using setexecon() to set the
security label of exec()'d processes when NO_NEW_PRIVS is enabled by
returning an error; however, we silently ignore setexeccon() when
exec()'ing from a nosuid mounted filesystem.  This patch makes things
a bit more consistent by returning an error in the setexeccon()/nosuid
case.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2014-05-15 11:16:06 -04:00
Tejun Heo
451af504df cgroup: replace cftype->write_string() with cftype->write()
Convert all cftype->write_string() users to the new cftype->write()
which maps directly to kernfs write operation and has full access to
kernfs and cgroup contexts.  The conversions are mostly mechanical.

* @css and @cft are accessed using of_css() and of_cft() accessors
  respectively instead of being specified as arguments.

* Should return @nbytes on success instead of 0.

* @buf is not trimmed automatically.  Trim if necessary.  Note that
  blkcg and netprio don't need this as the parsers already handle
  whitespaces.

cftype->write_string() has no user left after the conversions and
removed.

While at it, remove unnecessary local variable @p in
cgroup_subtree_control_write() and stale comment about
CGROUP_LOCAL_BUFFER_SIZE in cgroup_freezer.c.

This patch doesn't introduce any visible behavior changes.

v2: netprio was missing from conversion.  Converted.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
2014-05-13 12:16:21 -04:00
Linus Torvalds
26a41cd1ee Merge branch 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "During recent restructuring, device_cgroup unified config input check
  and enforcement logic; unfortunately, it turned out to share too much.
  Aristeu's patches fix the breakage and marked for -stable backport.

  The other two patches are fallouts from kernfs conversion.  The blkcg
  change is temporary and will go away once kernfs internal locking gets
  simplified (patches pending)"

* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  blkcg: use trylock on blkcg_pol_mutex in blkcg_reset_stats()
  device_cgroup: check if exception removal is allowed
  device_cgroup: fix the comment format for recently added functions
  device_cgroup: rework device access check and exception checking
  cgroup: fix the retry path of cgroup_mount()
2014-05-13 11:22:57 +09:00
David S. Miller
5f013c9bc7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/altera/altera_sgdma.c
	net/netlink/af_netlink.c
	net/sched/cls_api.c
	net/sched/sch_api.c

The netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces.  These were simple transformations from
netlink_capable to netlink_ns_capable.

The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12 13:19:14 -04:00
Linus Torvalds
8169d3005e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "dcache fixes + kvfree() (uninlined, exported by mm/util.c) + posix_acl
  bugfix from hch"

The dcache fixes are for a subtle LRU list corruption bug reported by
Miklos Szeredi, where people inside IBM saw list corruptions with the
LTP/host01 test.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nick kvfree() from apparmor
  posix_acl: handle NULL ACL in posix_acl_equiv_mode
  dcache: don't need rcu in shrink_dentry_list()
  more graceful recovery in umount_collect()
  don't remove from shrink list in select_collect()
  dentry_kill(): don't try to remove from shrink list
  expand the call of dentry_lru_del() in dentry_kill()
  new helper: dentry_free()
  fold try_prune_one_dentry()
  fold d_kill() and d_free()
  fix races between __d_instantiate() and checks of dentry flags
2014-05-06 12:22:20 -07:00
Toralf Förster
ec554fa75e Warning in scanf string typing
This fixes a warning about the mismatch of types between
the declared unsigned and integer.

Signed-off-by: Toralf Förster <toralf.foerster@gmx.de>
2014-05-06 11:32:53 -07:00
Al Viro
39f1f78d53 nick kvfree() from apparmor
too many places open-code it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-06 14:02:53 -04:00
Aristeu Rozanski
d2c2b11cfa device_cgroup: check if exception removal is allowed
[PATCH v3 1/2] device_cgroup: check if exception removal is allowed

When the device cgroup hierarchy was introduced in
	bd2953ebbb - devcg: propagate local changes down the hierarchy

a specific case was overlooked. Consider the hierarchy bellow:

	A	default policy: ALLOW, exceptions will deny access
	 \
	  B	default policy: ALLOW, exceptions will deny access

There's no need to verify when an new exception is added to B because
in this case exceptions will deny access to further devices, which is
always fine. Hierarchy in device cgroup only makes sure B won't have
more access than A.

But when an exception is removed (by writing devices.allow), it isn't
checked if the user is in fact removing an inherited exception from A,
thus giving more access to B.

Example:

	# echo 'a' >A/devices.allow
	# echo 'c 1:3 rw' >A/devices.deny
	# echo $$ >A/B/tasks
	# echo >/dev/null
	-bash: /dev/null: Operation not permitted
	# echo 'c 1:3 w' >A/B/devices.allow
	# echo >/dev/null
	#

This shouldn't be allowed and this patch fixes it by making sure to never allow
exceptions in this case to be removed if the exception is partially or fully
present on the parent.

v3: missing '*' in function description
v2: improved log message and formatting fixes

Cc: cgroups@vger.kernel.org
Cc: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-05-05 11:20:12 -04:00
Aristeu Rozanski
f5f3cf6f7e device_cgroup: fix the comment format for recently added functions
Moving more extensive explanations to the end of the comment.

Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-05-04 15:21:09 -04:00
Stephen Smalley
626b9740fa selinux: Report permissive mode in avc: denied messages.
We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-05-01 14:56:14 -04:00
Casey Schaufler
36ea735b52 Smack: Label cgroup files for systemd
The cgroup filesystem isn't ready for an LSM to
properly use extented attributes. This patch makes
files created in the cgroup filesystem usable by
a system running Smack and systemd.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-04-30 10:49:33 -07:00
Casey Schaufler
a6834c0b91 Smack: Verify read access on file open - v3
Smack believes that many of the operatons that can
be performed on an open file descriptor are read operations.
The fstat and lseek system calls are examples.
An implication of this is that files shouldn't be open
if the task doesn't have read access even if it has
write access and the file is being opened write only.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-04-23 08:52:39 -07:00
Richard Guy Briggs
3a101b8de0 audit: add netlink audit protocol bind to check capabilities on multicast join
Register a netlink per-protocol bind fuction for audit to check userspace
process capabilities before allowing a multicast group connection.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:42:27 -04:00
Jeff Layton
0d3f7a2dd2 locks: rename file-private locks to "open file description locks"
File-private locks have been merged into Linux for v3.15, and *now*
people are commenting that the name and macro definitions for the new
file-private locks suck.

...and I can't even disagree. The names and command macros do suck.

We're going to have to live with these for a long time, so it's
important that we be happy with the names before we're stuck with them.
The consensus on the lists so far is that they should be rechristened as
"open file description locks".

The name isn't a big deal for the kernel, but the command macros are not
visually distinct enough from the traditional POSIX lock macros. The
glibc and documentation folks are recommending that we change them to
look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
programmer will typo one of the commands wrong, and also makes it easier
to spot this difference when reading code.

This patch makes the following changes that I think are necessary before
v3.15 ships:

1) rename the command macros to their new names. These end up in the uapi
   headers and so are part of the external-facing API. It turns out that
   glibc doesn't actually use the fcntl.h uapi header, but it's hard to
   be sure that something else won't. Changing it now is safest.

2) make the the /proc/locks output display these as type "OFDLCK"

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Carlos O'Donell <carlos@redhat.com>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Frank Filz <ffilzlnx@mindspring.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
2014-04-22 08:23:58 -04:00
Aristeu Rozanski
79d719749d device_cgroup: rework device access check and exception checking
Whenever a device file is opened and checked against current device
cgroup rules, it uses the same function (may_access()) as when a new
exception rule is added by writing devices.{allow,deny}. And in both
cases, the algorithm is the same, doesn't matter the behavior.

First problem is having device access to be considered the same as rule
checking. Consider the following structure:

	A	(default behavior: allow, exceptions disallow access)
	 \
	  B	(default behavior: allow, exceptions disallow access)

A new exception is added to B by writing devices.deny:

	c 12:34 rw

When checking if that exception is allowed in may_access():

	if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW) {
		if (behavior == DEVCG_DEFAULT_ALLOW) {
			/* the exception will deny access to certain devices */
			return true;

Which is ok, since B is not getting more privileges than A, it doesn't
matter and the rule is accepted

Now, consider it's a device file open check and the process belongs to
cgroup B. The access will be generated as:

	behavior: allow
	exception: c 12:34 rw

The very same chunk of code will allow it, even if there's an explicit
exception telling to do otherwise.

A simple test case:

	# mkdir new_group
	# cd new_group
	# echo $$ >tasks
	# echo "c 1:3 w" >devices.deny
	# echo >/dev/null
	# echo $?
	0

This is a serious bug and was introduced on

	c39a2a3018 devcg: prepare may_access() for hierarchy support

To solve this problem, the device file open function was split from the
new exception check.

Second problem is how exceptions are processed by may_access(). The
first part of the said function tries to match fully with an existing
exception:

	list_for_each_entry_rcu(ex, &dev_cgroup->exceptions, list) {
		if ((refex->type & DEV_BLOCK) && !(ex->type & DEV_BLOCK))
			continue;
		if ((refex->type & DEV_CHAR) && !(ex->type & DEV_CHAR))
			continue;
		if (ex->major != ~0 && ex->major != refex->major)
			continue;
		if (ex->minor != ~0 && ex->minor != refex->minor)
			continue;
		if (refex->access & (~ex->access))
			continue;
		match = true;
		break;
	}

That means the new exception should be contained into an existing one to
be considered a match:

	New exception		Existing	match?	notes
	b 12:34 rwm		b 12:34 rwm	yes
	b 12:34 r		b *:34 rw	yes
	b 12:34 rw		b 12:34 w	no	extra "r"
	b *:34 rw		b 12:34 rw	no	too broad "*"
	b *:34 rw		b *:34 rwm	yes

Which is fine in some cases. Consider:

	A	(default behavior: deny, exceptions allow access)
	 \
	  B	(default behavior: deny, exceptions allow access)

In this case the full match makes sense, the new exception cannot add
more access than the parent allows

But this doesn't always work, consider:

	A	(default behavior: allow, exceptions disallow access)
	 \
	  B	(default behavior: deny, exceptions allow access)

In this case, a new exception in B shouldn't match any of the exceptions
in A, after all you can't allow something that was forbidden by A. But
consider this scenario:

	New exception	Existing in A	match?	outcome
	b 12:34 rw	b 12:34 r	no	exception is accepted

Because the new exception has "w" as extra, it doesn't match, so it'll
be added to B's exception list.

The same problem can happen during a file access check. Consider a
cgroup with allow as default behavior:

	Access		Exception	match?
	b 12:34 rw	b 12:34 r	no

In this case, the access didn't match any of the exceptions in the
cgroup, which is required since exceptions will disallow access.

To solve this problem, two new functions were created to match an
exception either fully or partially. In the example above, a partial
check will be performed and it'll produce a match since at least
"b 12:34 r" from "b 12:34 rw" access matches.

Cc: cgroups@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-04-21 18:21:42 -04:00
Joe Perches
fab71a90ed security: Convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-04-15 13:39:58 +10:00
James Morris
b13cebe707 Merge tag 'keys-20140314' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2014-04-14 11:42:49 +10:00
James Morris
ecd740c6f2 Merge commit 'v3.14' into next 2014-04-14 11:23:14 +10:00
Linus Torvalds
5166701b36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencie between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe->buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
2014-04-12 14:49:50 -07:00
Linus Torvalds
0b747172dc Merge git://git.infradead.org/users/eparis/audit
Pull audit updates from Eric Paris.

* git://git.infradead.org/users/eparis/audit: (28 commits)
  AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
  audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
  audit: do not cast audit_rule_data pointers pointlesly
  AUDIT: Allow login in non-init namespaces
  audit: define audit_is_compat in kernel internal header
  kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
  sched: declare pid_alive as inline
  audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
  syscall_get_arch: remove useless function arguments
  audit: remove stray newline from audit_log_execve_info() audit_panic() call
  audit: remove stray newlines from audit_log_lost messages
  audit: include subject in login records
  audit: remove superfluous new- prefix in AUDIT_LOGIN messages
  audit: allow user processes to log from another PID namespace
  audit: anchor all pid references in the initial pid namespace
  audit: convert PPIDs to the inital PID namespace.
  pid: get pid_t ppid of task in init_pid_ns
  audit: rename the misleading audit_get_context() to audit_take_context()
  audit: Add generic compat syscall support
  audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
  ...
2014-04-12 12:38:53 -07:00
Casey Schaufler
54e70ec5eb Smack: bidirectional UDS connect check
Smack IPC policy requires that the sender have write access
to the receiver. UDS streams don't do per-packet checks. The
only check is done at connect time. The existing code checks
if the connecting process can write to the other, but not the
other way around. This change adds a check that the other end
can write to the connecting process.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schuafler <casey@schaufler-ca.com>
2014-04-11 14:35:28 -07:00
Casey Schaufler
f59bdfba3e Smack: Correctly remove SMACK64TRANSMUTE attribute
Sam Henderson points out that removing the SMACK64TRANSMUTE
attribute from a directory does not result in the directory
transmuting. This is because the inode flag indicating that
the directory is transmuting isn't cleared. The fix is a tad
less than trivial because smk_task and smk_mmap should have
been broken out, too.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-04-11 14:35:19 -07:00
José Bollo
9598f4c9e7 SMACK: Fix handling value==NULL in post setxattr
The function `smack_inode_post_setxattr` is called each
time that a setxattr is done, for any value of name.
The kernel allow to put value==NULL when size==0
to set an empty attribute value. The systematic
call to smk_import_entry was causing the dereference
of a NULL pointer hence a KERNEL PANIC!

The problem can be produced easily by issuing the
command `setfattr -n user.data file` under bash prompt
when SMACK is active.

Moving the call to smk_import_entry as proposed by this
patch is correcting the behaviour because the function
smack_inode_post_setxattr is called for the SMACK's
attributes only if the function smack_inode_setxattr validated
the value and its size (what will not be the case when size==0).

It also has a benefical effect to not fill the smack hash
with garbage values coming from any extended attribute
write.

Change-Id: Iaf0039c2be9bccb6cee11c24a3b44d209101fe47
Signed-off-by: José Bollo <jose.bollo@open.eurogiciel.org>
2014-04-11 14:35:05 -07:00
Pankaj Kumar
5e9ab593c2 bugfix patch for SMACK
1. In order to remove any SMACK extended attribute from a file, a user
should have CAP_MAC_ADMIN capability. But user without having this
capability is able to remove SMACK64MMAP security attribute.

2. While validating size and value of smack extended attribute in
smack_inode_setsecurity hook, wrong error code is returned.

Signed-off-by: Pankaj Kumar <pamkaj.k2@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
2014-04-11 14:34:52 -07:00
Lukasz Pawelczyk
6686781852 Smack: adds smackfs/ptrace interface
This allows to limit ptrace beyond the regular smack access rules.
It adds a smackfs/ptrace interface that allows smack to be configured
to require equal smack labels for PTRACE_MODE_ATTACH access.
See the changes in Documentation/security/Smack.txt below for details.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@partner.samsung.com>
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2014-04-11 14:34:35 -07:00
Lukasz Pawelczyk
5663884caa Smack: unify all ptrace accesses in the smack
The decision whether we can trace a process is made in the following
functions:
	smack_ptrace_traceme()
	smack_ptrace_access_check()
	smack_bprm_set_creds() (in case the proces is traced)

This patch unifies all those decisions by introducing one function that
checks whether ptrace is allowed: smk_ptrace_rule_check().

This makes possible to actually trace with TRACEME where first the
TRACEME itself must be allowed and then exec() on a traced process.

Additional bugs fixed:
- The decision is made according to the mode parameter that is now correctly
  translated from PTRACE_MODE_* to MAY_* instead of being treated 1:1.
  PTRACE_MODE_READ requires MAY_READ.
  PTRACE_MODE_ATTACH requires MAY_READWRITE.
- Add a smack audit log in case of exec() refused by bprm_set_creds().
- Honor the PTRACE_MODE_NOAUDIT flag and don't put smack audit info
  in case this flag is set.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@partner.samsung.com>
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2014-04-11 14:34:26 -07:00
Lukasz Pawelczyk
959e6c7f1e Smack: fix the subject/object order in smack_ptrace_traceme()
The order of subject/object is currently reversed in
smack_ptrace_traceme(). It is currently checked if the tracee has a
capability to trace tracer and according to this rule a decision is made
whether the tracer will be allowed to trace tracee.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@partner.samsung.com>
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2014-04-11 14:34:17 -07:00
José Bollo
55dfc5da1a Minor improvement of 'smack_sb_kern_mount'
Fix a possible memory access fault when transmute is true and isp is NULL.

Signed-off-by: José Bollo <jose.bollo@open.eurogiciel.org>
2014-04-11 14:33:59 -07:00
Linus Torvalds
f7789dc0d4 Merge branch 'locks-3.15' of git://git.samba.org/jlayton/linux
Pull file locking updates from Jeff Layton:
 "Highlights:

   - maintainership change for fs/locks.c.  Willy's not interested in
     maintaining it these days, and is OK with Bruce and I taking it.
   - fix for open vs setlease race that Al ID'ed
   - cleanup and consolidation of file locking code
   - eliminate unneeded BUG() call
   - merge of file-private lock implementation"

* 'locks-3.15' of git://git.samba.org/jlayton/linux:
  locks: make locks_mandatory_area check for file-private locks
  locks: fix locks_mandatory_locked to respect file-private locks
  locks: require that flock->l_pid be set to 0 for file-private locks
  locks: add new fcntl cmd values for handling file private locks
  locks: skip deadlock detection on FL_FILE_PVT locks
  locks: pass the cmd value to fcntl_getlk/getlk64
  locks: report l_pid as -1 for FL_FILE_PVT locks
  locks: make /proc/locks show IS_FILE_PVT locks as type "FLPVT"
  locks: rename locks_remove_flock to locks_remove_file
  locks: consolidate checks for compatible filp->f_mode values in setlk handlers
  locks: fix posix lock range overflow handling
  locks: eliminate BUG() call when there's an unexpected lock on file close
  locks: add __acquires and __releases annotations to locks_start and locks_stop
  locks: remove "inline" qualifier from fl_link manipulation functions
  locks: clean up comment typo
  locks: close potential race between setlease and open
  MAINTAINERS: update entry for fs/locks.c
2014-04-04 14:21:20 -07:00
Linus Torvalds
7df934526c Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull renameat2 system call from Miklos Szeredi:
 "This adds a new syscall, renameat2(), which is the same as renameat()
  but with a flags argument.

  The purpose of extending rename is to add cross-rename, a symmetric
  variant of rename, which exchanges the two files.  This allows
  interesting things, which were not possible before, for example
  atomically replacing a directory tree with a symlink, etc...  This
  also allows overlayfs and friends to operate on whiteouts atomically.

  Andy Lutomirski also suggested a "noreplace" flag, which disables the
  overwriting behavior of rename.

  These two flags, RENAME_EXCHANGE and RENAME_NOREPLACE are only
  implemented for ext4 as an example and for testing"

* 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ext4: add cross rename support
  ext4: rename: split out helper functions
  ext4: rename: move EMLINK check up
  ext4: rename: create ext4_renament structure for local vars
  vfs: add cross-rename
  vfs: lock_two_nondirectories: allow directory args
  security: add flags to rename hooks
  vfs: add RENAME_NOREPLACE flag
  vfs: add renameat2 syscall
  vfs: rename: use common code for dir and non-dir
  vfs: rename: move d_move() up
  vfs: add d_is_dir()
2014-04-04 14:03:05 -07:00
Linus Torvalds
32d01dc7be Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "A lot updates for cgroup:

   - The biggest one is cgroup's conversion to kernfs.  cgroup took
     after the long abandoned vfs-entangled sysfs implementation and
     made it even more convoluted over time.  cgroup's internal objects
     were fused with vfs objects which also brought in vfs locking and
     object lifetime rules.  Naturally, there are places where vfs rules
     don't fit and nasty hacks, such as credential switching or lock
     dance interleaving inode mutex and cgroup_mutex with object serial
     number comparison thrown in to decide whether the operation is
     actually necessary, needed to be employed.

     After conversion to kernfs, internal object lifetime and locking
     rules are mostly isolated from vfs interactions allowing shedding
     of several nasty hacks and overall simplification.  This will also
     allow implmentation of operations which may affect multiple cgroups
     which weren't possible before as it would have required nesting
     i_mutexes.

   - Various simplifications including dropping of module support,
     easier cgroup name/path handling, simplified cgroup file type
     handling and task_cg_lists optimization.

   - Prepatory changes for the planned unified hierarchy, which is still
     a patchset away from being actually operational.  The dummy
     hierarchy is updated to serve as the default unified hierarchy.
     Controllers which aren't claimed by other hierarchies are
     associated with it, which BTW was what the dummy hierarchy was for
     anyway.

   - Various fixes from Li and others.  This pull request includes some
     patches to add missing slab.h to various subsystems.  This was
     triggered xattr.h include removal from cgroup.h.  cgroup.h
     indirectly got included a lot of files which brought in xattr.h
     which brought in slab.h.

  There are several merge commits - one to pull in kernfs updates
  necessary for converting cgroup (already in upstream through
  driver-core), others for interfering changes in the fixes branch"

* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
  cgroup: remove useless argument from cgroup_exit()
  cgroup: fix spurious lockdep warning in cgroup_exit()
  cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
  cgroup: break kernfs active_ref protection in cgroup directory operations
  cgroup: fix cgroup_taskset walking order
  cgroup: implement CFTYPE_ONLY_ON_DFL
  cgroup: make cgrp_dfl_root mountable
  cgroup: drop const from @buffer of cftype->write_string()
  cgroup: rename cgroup_dummy_root and related names
  cgroup: move ->subsys_mask from cgroupfs_root to cgroup
  cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
  cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
  cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
  cgroup: reorganize cgroup bootstrapping
  cgroup: relocate setting of CGRP_DEAD
  cpuset: use rcu_read_lock() to protect task_cs()
  cgroup_freezer: document freezer_fork() subtleties
  cgroup: update cgroup_transfer_tasks() to either succeed or fail
  cgroup: drop task_lock() protection around task->cgroups
  cgroup: update how a newly forked task gets associated with css_set
  ...
2014-04-03 13:05:42 -07:00
Linus Torvalds
bea803183e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Apart from reordering the SELinux mmap code to ensure DAC is called
  before MAC, these are minor maintenance updates"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
  selinux: correctly label /proc inodes in use before the policy is loaded
  selinux: put the mmap() DAC controls before the MAC controls
  selinux: fix the output of ./scripts/get_maintainer.pl for SELinux
  evm: enable key retention service automatically
  ima: skip memory allocation for empty files
  evm: EVM does not use MD5
  ima: return d_name.name if d_path fails
  integrity: fix checkpatch errors
  ima: fix erroneous removal of security.ima xattr
  security: integrity: Use a more current logging style
  MAINTAINERS: email updates and other misc. changes
  ima: reduce memory usage when a template containing the n field is used
  ima: restore the original behavior for sending data with ima template
  Integrity: Pass commname via get_task_comm()
  fs: move i_readcount
  ima: use static const char array definitions
  security: have cap_dentry_init_security return error
  ima: new helper: file_inode(file)
  kernel: Mark function as static in kernel/seccomp.c
  capability: Use current logging styles
  ...
2014-04-03 09:26:18 -07:00
Linus Torvalds
cd6362befe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Here is my initial pull request for the networking subsystem during
  this merge window:

   1) Support for ESN in AH (RFC 4302) from Fan Du.

   2) Add full kernel doc for ethtool command structures, from Ben
      Hutchings.

   3) Add BCM7xxx PHY driver, from Florian Fainelli.

   4) Export computed TCP rate information in netlink socket dumps, from
      Eric Dumazet.

   5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
      Dichtel.

   6) Convert many drivers to pci_enable_msix_range(), from Alexander
      Gordeev.

   7) Record SKB timestamps more efficiently, from Eric Dumazet.

   8) Switch to microsecond resolution for TCP round trip times, also
      from Eric Dumazet.

   9) Clean up and fix 6lowpan fragmentation handling by making use of
      the existing inet_frag api for it's implementation.

  10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.

  11) Auto size SKB lengths when composing netlink messages based upon
      past message sizes used, from Eric Dumazet.

  12) qdisc dumps can take a long time, add a cond_resched(), From Eric
      Dumazet.

  13) Sanitize netpoll core and drivers wrt.  SKB handling semantics.
      Get rid of never-used-in-tree netpoll RX handling.  From Eric W
      Biederman.

  14) Support inter-address-family and namespace changing in VTI tunnel
      driver(s).  From Steffen Klassert.

  15) Add Altera TSE driver, from Vince Bridgers.

  16) Optimizing csum_replace2() so that it doesn't adjust the checksum
      by checksumming the entire header, from Eric Dumazet.

  17) Expand BPF internal implementation for faster interpreting, more
      direct translations into JIT'd code, and much cleaner uses of BPF
      filtering in non-socket ocntexts.  From Daniel Borkmann and Alexei
      Starovoitov"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
  netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  net: Add a test to see if a skb is freeable in irq context
  qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
  net: ptp: move PTP classifier in its own file
  net: sxgbe: make "core_ops" static
  net: sxgbe: fix logical vs bitwise operation
  net: sxgbe: sxgbe_mdio_register() frees the bus
  Call efx_set_channels() before efx->type->dimension_resources()
  xen-netback: disable rogue vif in kthread context
  net/mlx4: Set proper build dependancy with vxlan
  be2net: fix build dependency on VxLAN
  mac802154: make csma/cca parameters per-wpan
  mac802154: allow only one WPAN to be up at any given time
  net: filter: minor: fix kdoc in __sk_run_filter
  netlink: don't compare the nul-termination in nla_strcmp
  can: c_can: Avoid led toggling for every packet.
  can: c_can: Simplify TX interrupt cleanup
  can: c_can: Store dlc private
  can: c_can: Reduce register access
  can: c_can: Make the code readable
  ...
2014-04-02 20:53:45 -07:00
Al Viro
627bf81ac6 get rid of pointless checks for NULL ->i_op
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01 23:19:16 -04:00
Miklos Szeredi
da1ce0670c vfs: add cross-rename
If flags contain RENAME_EXCHANGE then exchange source and destination files.
There's no restriction on the type of the files; e.g. a directory can be
exchanged with a symlink.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: J. Bruce Fields <bfields@redhat.com>
2014-04-01 17:08:43 +02:00
Miklos Szeredi
0b3974eb04 security: add flags to rename hooks
Add flags to security_path_rename() and security_inode_rename() hooks.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: J. Bruce Fields <bfields@redhat.com>
2014-04-01 17:08:43 +02:00
Linus Torvalds
190f918660 Merge branch 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 compat wrapper rework from Heiko Carstens:
 "S390 compat system call wrapper simplification work.

  The intention of this work is to get rid of all hand written assembly
  compat system call wrappers on s390, which perform proper sign or zero
  extension, or pointer conversion of compat system call parameters.
  Instead all of this should be done with C code eg by using Al's
  COMPAT_SYSCALL_DEFINEx() macro.

  Therefore all common code and s390 specific compat system calls have
  been converted to the COMPAT_SYSCALL_DEFINEx() macro.

  In order to generate correct code all compat system calls may only
  have eg compat_ulong_t parameters, but no unsigned long parameters.
  Those patches which change parameter types from unsigned long to
  compat_ulong_t parameters are separate in this series, but shouldn't
  cause any harm.

  The only compat system calls which intentionally have 64 bit
  parameters (preadv64 and pwritev64) in support of the x86/32 ABI
  haven't been changed, but are now only available if an architecture
  defines __ARCH_WANT_COMPAT_SYS_PREADV64/PWRITEV64.

  System calls which do not have a compat variant but still need proper
  zero extension on s390, like eg "long sys_brk(unsigned long brk)" will
  get a proper wrapper function with the new s390 specific
  COMPAT_SYSCALL_WRAPx() macro:

     COMPAT_SYSCALL_WRAP1(brk, unsigned long, brk);

  which generates the following code (simplified):

     asmlinkage long sys_brk(unsigned long brk);
     asmlinkage long compat_sys_brk(long brk)
     {
         return sys_brk((u32)brk);
     }

  Given that the C file which contains all the COMPAT_SYSCALL_WRAP lines
  includes both linux/syscall.h and linux/compat.h, it will generate
  build errors, if the declaration of sys_brk() doesn't match, or if
  there exists a non-matching compat_sys_brk() declaration.

  In addition this will intentionally result in a link error if
  somewhere else a compat_sys_brk() function exists, which probably
  should have been used instead.  Two more BUILD_BUG_ONs make sure the
  size and type of each compat syscall parameter can be handled
  correctly with the s390 specific macros.

  I converted the compat system calls step by step to verify the
  generated code is correct and matches the previous code.  In fact it
  did not always match, however that was always a bug in the hand
  written asm code.

  In result we get less code, less bugs, and much more sanity checking"

* 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (44 commits)
  s390/compat: add copyright statement
  compat: include linux/unistd.h within linux/compat.h
  s390/compat: get rid of compat wrapper assembly code
  s390/compat: build error for large compat syscall args
  mm/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  kexec/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  net/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  ipc/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  fs/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  ipc/compat: convert to COMPAT_SYSCALL_DEFINE
  fs/compat: convert to COMPAT_SYSCALL_DEFINE
  security/compat: convert to COMPAT_SYSCALL_DEFINE
  mm/compat: convert to COMPAT_SYSCALL_DEFINE
  net/compat: convert to COMPAT_SYSCALL_DEFINE
  kernel/compat: convert to COMPAT_SYSCALL_DEFINE
  fs/compat: optional preadv64/pwrite64 compat system calls
  ipc/compat_sys_msgrcv: change msgtyp type from long to compat_long_t
  s390/compat: partial parameter conversion within syscall wrappers
  s390/compat: automatic zero, sign and pointer conversion of syscalls
  s390/compat: add sync_file_range and fallocate compat syscalls
  ...
2014-03-31 14:32:17 -07:00
Paul Moore
6d32c85062 Linux 3.14
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTOOOnAAoJEHm+PkMAQRiGsBAH/2PAOL3TbOG6tEedxQrTwsr2
 muRIRTVWawjT8/npbHupxGnAyAVdmdffBHpmCmcftKdKNryT3YZW8/JWoYc+WSlo
 3vTDJHDOYAe6yCBjjhYwcu150THBQdOymOi5mbbclo0XWYG18jd3+abYprRH6SiD
 XqNSzYqoiv91JHBAWKBIpo1cyRDuwoM7+jZ7gX41r2800EL7loY3e08cPDDNU6HA
 CKaLXMwLwYTefE+Wnr+4UUr08NbNBbBUKLUSXVqKKIpd+MtbyhV1SnWzz8VQSkag
 K/uzsnGnE7nrqoepMSx3nXxzOWxUSY2EMbwhEjaKK4xBq9C9pzv3sG/o2/IyopU=
 =Nuom
 -----END PGP SIGNATURE-----

Merge tag 'v3.14' into next

Linux 3.14
2014-03-31 09:49:07 -04:00
Jeff Layton
5d50ffd7c3 locks: add new fcntl cmd values for handling file private locks
Due to some unfortunate history, POSIX locks have very strange and
unhelpful semantics. The thing that usually catches people by surprise
is that they are dropped whenever the process closes any file descriptor
associated with the inode.

This is extremely problematic for people developing file servers that
need to implement byte-range locks. Developers often need a "lock
management" facility to ensure that file descriptors are not closed
until all of the locks associated with the inode are finished.

Additionally, "classic" POSIX locks are owned by the process. Locks
taken between threads within the same process won't conflict with one
another, which renders them useless for synchronization between threads.

This patchset adds a new type of lock that attempts to address these
issues. These locks conflict with classic POSIX read/write locks, but
have semantics that are more like BSD locks with respect to inheritance
and behavior on close.

This is implemented primarily by changing how fl_owner field is set for
these locks. Instead of having them owned by the files_struct of the
process, they are instead owned by the filp on which they were acquired.
Thus, they are inherited across fork() and are only released when the
last reference to a filp is put.

These new semantics prevent them from being merged with classic POSIX
locks, even if they are acquired by the same process. These locks will
also conflict with classic POSIX locks even if they are acquired by
the same process or on the same file descriptor.

The new locks are managed using a new set of cmd values to the fcntl()
syscall. The initial implementation of this converts these values to
"classic" cmd values at a fairly high level, and the details are not
exposed to the underlying filesystem. We may eventually want to push
this handing out to the lower filesystem code but for now I don't
see any need for it.

Also, note that with this implementation the new cmd values are only
available via fcntl64() on 32-bit arches. There's little need to
add support for legacy apps on a new interface like this.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2014-03-31 08:24:43 -04:00
David S. Miller
04f58c8854 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	Documentation/devicetree/bindings/net/micrel-ks8851.txt
	net/core/netpoll.c

The net/core/netpoll.c conflict is a bug fix in 'net' happening
to code which is completely removed in 'net-next'.

In micrel-ks8851.txt we simply have overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-25 20:29:20 -04:00
Richard Guy Briggs
f1dc4867ff audit: anchor all pid references in the initial pid namespace
Store and log all PIDs with reference to the initial PID namespace and
use the access functions task_pid_nr() and task_tgid_nr() for task->pid
and task->tgid.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
(informed by ebiederman's c776b5d2)
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
2014-03-20 10:11:55 -04:00
Paul Moore
f64410ec66 selinux: correctly label /proc inodes in use before the policy is loaded
This patch is based on an earlier patch by Eric Paris, he describes
the problem below:

  "If an inode is accessed before policy load it will get placed on a
   list of inodes to be initialized after policy load.  After policy
   load we call inode_doinit() which calls inode_doinit_with_dentry()
   on all inodes accessed before policy load.  In the case of inodes
   in procfs that means we'll end up at the bottom where it does:

     /* Default to the fs superblock SID. */
     isec->sid = sbsec->sid;

     if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) {
             if (opt_dentry) {
                     isec->sclass = inode_mode_to_security_class(...)
                     rc = selinux_proc_get_sid(opt_dentry,
                                               isec->sclass,
                                               &sid);
                     if (rc)
                             goto out_unlock;
                     isec->sid = sid;
             }
     }

   Since opt_dentry is null, we'll never call selinux_proc_get_sid()
   and will leave the inode labeled with the label on the superblock.
   I believe a fix would be to mimic the behavior of xattrs.  Look
   for an alias of the inode.  If it can't be found, just leave the
   inode uninitialized (and pick it up later) if it can be found, we
   should be able to call selinux_proc_get_sid() ..."

On a system exhibiting this problem, you will notice a lot of files in
/proc with the generic "proc_t" type (at least the ones that were
accessed early in the boot), for example:

   # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
   system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax

However, with this patch in place we see the expected result:

   # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
   system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax

Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
2014-03-19 16:46:18 -04:00
Paul Moore
98883bfd9d selinux: put the mmap() DAC controls before the MAC controls
It turns out that doing the SELinux MAC checks for mmap() before the
DAC checks was causing users and the SELinux policy folks headaches
as users were seeing a lot of SELinux AVC denials for the
memprotect:mmap_zero permission that would have also been denied by
the normal DAC capability checks (CAP_SYS_RAWIO).

Example:

 # cat mmap_test.c
  #include <stdlib.h>
  #include <stdio.h>
  #include <errno.h>
  #include <sys/mman.h>

  int main(int argc, char *argv[])
  {
        int rc;
        void *mem;

        mem = mmap(0x0, 4096,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (mem == MAP_FAILED)
                return errno;
        printf("mem = %p\n", mem);
        munmap(mem, 4096);

        return 0;
  }
 # gcc -g -O0 -o mmap_test mmap_test.c
 # ./mmap_test
 mem = (nil)
 # ausearch -m AVC | grep mmap_zero
 type=AVC msg=audit(...): avc:  denied  { mmap_zero }
   for pid=1025 comm="mmap_test"
   scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
   tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
   tclass=memprotect

This patch corrects things so that when the above example is run by a
user without CAP_SYS_RAWIO the SELinux AVC is no longer generated as
the DAC capability check fails before the SELinux permission check.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2014-03-19 16:46:11 -04:00
Tejun Heo
4d3bb511b5 cgroup: drop const from @buffer of cftype->write_string()
cftype->write_string() just passes on the writeable buffer from kernfs
and there's no reason to add const restriction on the buffer.  The
only thing const achieves is unnecessarily complicating parsing of the
buffer.  Drop const from @buffer.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>                                           
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
2014-03-19 10:23:54 -04:00
David S. Miller
72c2dfdefa Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
1) Fix a sleep in atomic when pfkey_sadb2xfrm_user_sec_ctx()
   is called from pfkey_compile_policy().
   Fix from Nikolay Aleksandrov.

2) security_xfrm_policy_alloc() can be called in process and atomic
   context. Add an argument to let the callers choose the appropriate
   way. Fix from Nikolay Aleksandrov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-18 12:42:33 -04:00
David S. Miller
85dcce7a73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	drivers/net/xen-netback/netback.c

Both the r8152 and netback conflicts were simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:31:55 -04:00
Dmitry Kasatkin
fffea214ab smack: fix key permission verification
For any keyring access type SMACK always used MAY_READWRITE access check.
It prevents reading the key with label "_", which should be allowed for anyone.

This patch changes default access check to MAY_READ and use MAY_READWRITE in only
appropriate cases.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2014-03-14 17:44:49 +00:00
David Howells
f5895943d9 KEYS: Move the flags representing required permission to linux/key.h
Move the flags representing required permission to linux/key.h as the perm
parameter of security_key_permission() is in terms of them - and not the
permissions mask flags used in key->perm.

Whilst we're at it:

 (1) Rename them to be KEY_NEED_xxx rather than KEY_xxx to avoid collisions
     with symbols in uapi/linux/input.h.

 (2) Don't use key_perm_t for a mask of required permissions, but rather limit
     it to the permissions mask attached to the key and arguments related
     directly to that.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
2014-03-14 17:44:49 +00:00
James Morris
33b2533518 Merge branch 'next-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2014-03-12 16:33:48 +11:00
Nikolay Aleksandrov
52a4c6404f selinux: add gfp argument to security_xfrm_policy_alloc and fix callers
security_xfrm_policy_alloc can be called in atomic context so the
allocation should be done with GFP_ATOMIC. Add an argument to let the
callers choose the appropriate way. In order to do so a gfp argument
needs to be added to the method xfrm_policy_alloc_security in struct
security_operations and to the internal function
selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
callers and leave GFP_KERNEL as before for the rest.
The path that needed the gfp argument addition is:
security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)

Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
add it to security_context_to_sid which is used inside and prior to this
patch did only GFP_KERNEL allocation. So add gfp argument to
security_context_to_sid and adjust all of its callers as well.

CC: Paul Moore <paul@paul-moore.com>
CC: Dave Jones <davej@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Fan Du <fan.du@windriver.com>
CC: David S. Miller <davem@davemloft.net>
CC: LSM list <linux-security-module@vger.kernel.org>
CC: SELinux list <selinux@tycho.nsa.gov>

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-10 08:30:02 +01:00
David Howells
979e0d7465 KEYS: Make the keyring cycle detector ignore other keyrings of the same name
This fixes CVE-2014-0102.

The following command sequence produces an oops:

	keyctl new_session
	i=`keyctl newring _ses @s`
	keyctl link @s $i

The problem is that search_nested_keyrings() sees two keyrings that have
matching type and description, so keyring_compare_object() returns true.
s_n_k() then passes the key to the iterator function -
keyring_detect_cycle_iterator() - which *should* check to see whether this is
the keyring of interest, not just one with the same name.

Because assoc_array_find() will return one and only one match, I assumed that
the iterator function would only see an exact match or never be called - but
the iterator isn't only called from assoc_array_find()...

The oops looks something like this:

	kernel BUG at /data/fs/linux-2.6-fscache/security/keys/keyring.c:1003!
	invalid opcode: 0000 [#1] SMP
	...
	RIP: keyring_detect_cycle_iterator+0xe/0x1f
	...
	Call Trace:
	  search_nested_keyrings+0x76/0x2aa
	  __key_link_check_live_key+0x50/0x5f
	  key_link+0x4e/0x85
	  keyctl_keyring_link+0x60/0x81
	  SyS_keyctl+0x65/0xe4
	  tracesys+0xdd/0xe2

The fix is to make keyring_detect_cycle_iterator() check that the key it
has is the key it was actually looking for rather than calling BUG_ON().

A testcase has been included in the keyutils testsuite for this:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=891f3365d07f1996778ade0e3428f01878a1790b

Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-09 18:57:18 -07:00
Dmitry Kasatkin
a3aef94b31 evm: enable key retention service automatically
If keys are not enabled, EVM is not visible in the configuration menu.
It may be difficult to figure out what to do unless you really know.
Other subsystems as NFS, CIFS select keys automatically. This patch does
the same.

This patch also removes '(TRUSTED_KEYS=y || TRUSTED_KEYS=n)' dependency,
which is unnecessary. EVM does not depend on trusted keys, but on
encrypted keys. evm.h provides compile time dependency.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 12:15:49 -05:00
Dmitry Kasatkin
1d91ac6213 ima: skip memory allocation for empty files
Memory allocation is unnecessary for empty files.
This patch calculates the hash without memory allocation.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 12:15:48 -05:00
Dmitry Kasatkin
e0420039b6 evm: EVM does not use MD5
EVM does not use MD5 HMAC. Selection of CRYPTO_MD5 can be safely removed.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 12:15:47 -05:00
Dmitry Kasatkin
61997c4383 ima: return d_name.name if d_path fails
This is a small refactoring so ima_d_path() returns dentry name
if path reconstruction fails. It simplifies callers actions
and removes code duplication.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 12:15:46 -05:00
Dmitry Kasatkin
2bb930abcf integrity: fix checkpatch errors
Between checkpatch changes (eg. sizeof) and inconsistencies between
Lindent and checkpatch, unfixed checkpatch errors make it difficult
to see new errors. This patch fixes them. Some lines with over 80 chars
remained unchanged to improve code readability.

The "extern" keyword is removed from internal evm.h to make it consistent
with internal ima.h.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 12:15:45 -05:00
Dmitry Kasatkin
09b1148ef5 ima: fix erroneous removal of security.ima xattr
ima_inode_post_setattr() calls ima_must_appraise() to check if the
file needs to be appraised. If it does not then it removes security.ima
xattr. With original policy matching code it might happen that even
file needs to be appraised with FILE_CHECK hook, it might not be
for POST_SETATTR hook. 'security.ima' might be erronously removed.

This patch treats POST_SETATTR as special wildcard function and will
cause ima_must_appraise() to be true if any of the hooks rules matches.
security.ima will not be removed if any of the hooks would require
appraisal.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 12:15:44 -05:00
Joe Perches
20ee451f5a security: integrity: Use a more current logging style
Convert printks to pr_<level>.
Add pr_fmt.
Remove embedded prefixes.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 12:15:21 -05:00
Eric Paris
b7d3622a39 Linux 3.13
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS3IyXAAoJEHm+PkMAQRiGplAH/ilCikBrCHyZ2938NHNLm+j1
 yhfYnEJHLNg7T69KEj3p0cNagO3v9RPWM6UYFBQ6uFIYNN1MBKO7U+mCZuMWzeO8
 +tGMV3mn5wx+oYn1RnWCCweQx5AESEl6rYn8udPDKh7LfW5fCLV60jguUjVSQ9IQ
 cvtKlWknbiHyM7t1GoYgzN7jlPrRQvcNQZ+Aogzz7uSnJAgwINglBAHS7WP2tiEM
 HAU2FoE4b3MbfGaid1vypaYQPBbFebx7Bw2WxAuZhkBbRiUBKlgF0/SYhOTvH38a
 Sjpj1EHKfjcuCBt9tht6KP6H56R25vNloGR2+FB+fuQBdujd/SPa9xDflEMcdG4=
 =iXnG
 -----END PGP SIGNATURE-----

Merge tag 'v3.13' into for-3.15

Linux 3.13

Conflicts:
	include/net/xfrm.h

Simple merge where v3.13 removed 'extern' from definitions and the audit
tree did s/u32/unsigned int/ to the same definitions.
2014-03-07 11:41:32 -05:00
Roberto Sassu
e3b64c268b ima: reduce memory usage when a template containing the n field is used
Before this change, to correctly calculate the template digest for the
'ima' template, the event name field (id: 'n') length was set to the fixed
size of 256 bytes.

This patch reduces the length of the event name field to the string
length incremented of one (to make room for the termination character '\0')
and handles the specific case of the digest calculation for the 'ima'
template directly in ima_calc_field_array_hash_tfm().

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 11:32:30 -05:00
Roberto Sassu
c019e307ad ima: restore the original behavior for sending data with ima template
With the new template mechanism introduced in IMA since kernel 3.13,
the format of data sent through the binary_runtime_measurements interface
is slightly changed. Now, for a generic measurement, the format of
template data (after the template name) is:

template_len | field1_len | field1 | ... | fieldN_len | fieldN

In addition, fields containing a string now include the '\0' termination
character.

Instead, the format for the 'ima' template should be:

SHA1 digest | event name length | event name

It must be noted that while in the IMA 3.13 code 'event name length' is
'IMA_EVENT_NAME_LEN_MAX + 1' (256 bytes), so that the template digest
is calculated correctly, and 'event name' contains '\0', in the pre 3.13
code 'event name length' is exactly the string length and 'event name'
does not contain the termination character.

The patch restores the behavior of the IMA code pre 3.13 for the 'ima'
template so that legacy userspace tools obtain a consistent behavior
when receiving data from the binary_runtime_measurements interface
regardless of which kernel version is used.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Cc: <stable@vger.kernel.org> # 3.3.13: 3ce1217 ima: define template fields library
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 11:32:29 -05:00
Tetsuo Handa
73a6b44a00 Integrity: Pass commname via get_task_comm()
When we pass task->comm to audit_log_untrustedstring(), we need to pass it
via get_task_comm() because task->comm can be changed to contain untrusted
string by other threads after audit_log_untrustedstring() confirmed that
task->comm does not contain untrusted string.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-07 11:32:28 -05:00
Mimi Zohar
52a1328484 ima: use static const char array definitions
A const char pointer allocates memory for a pointer as well as for
a string,  This patch replaces a number of the const char pointers
throughout IMA, with a static const char array.

Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
2014-03-07 11:30:36 -05:00
Jeff Layton
d4a141c8e7 security: have cap_dentry_init_security return error
Currently, cap_dentry_init_security returns 0 without actually
initializing the security label. This confuses its only caller
(nfs4_label_init_security) which expects an error in that situation, and
causes it to end up sending out junk onto the wire instead of simply
suppressing the label in the attributes sent.

When CONFIG_SECURITY is disabled, security_dentry_init_security returns
-EOPNOTSUPP. Have cap_dentry_init_security do the same.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-03-07 11:50:01 +11:00
Heiko Carstens
875ec3da4c security/compat: convert to COMPAT_SYSCALL_DEFINE
Convert all compat system call functions where all parameter types
have a size of four or less than four bytes, or are pointer types
to COMPAT_SYSCALL_DEFINE.
The implicit casts within COMPAT_SYSCALL_DEFINE will perform proper
zero and sign extension to 64 bit of all parameters if needed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-03-06 16:30:42 +01:00
David S. Miller
67ddc87f16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
	drivers/net/wireless/mwifiex/pcie.c
	net/ipv6/sit.c

The SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.

The two wireless conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:32:02 -05:00
Paul Moore
eee3094683 selinux: correctly label /proc inodes in use before the policy is loaded
This patch is based on an earlier patch by Eric Paris, he describes
the problem below:

  "If an inode is accessed before policy load it will get placed on a
   list of inodes to be initialized after policy load.  After policy
   load we call inode_doinit() which calls inode_doinit_with_dentry()
   on all inodes accessed before policy load.  In the case of inodes
   in procfs that means we'll end up at the bottom where it does:

     /* Default to the fs superblock SID. */
     isec->sid = sbsec->sid;

     if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) {
             if (opt_dentry) {
                     isec->sclass = inode_mode_to_security_class(...)
                     rc = selinux_proc_get_sid(opt_dentry,
                                               isec->sclass,
                                               &sid);
                     if (rc)
                             goto out_unlock;
                     isec->sid = sid;
             }
     }

   Since opt_dentry is null, we'll never call selinux_proc_get_sid()
   and will leave the inode labeled with the label on the superblock.
   I believe a fix would be to mimic the behavior of xattrs.  Look
   for an alias of the inode.  If it can't be found, just leave the
   inode uninitialized (and pick it up later) if it can be found, we
   should be able to call selinux_proc_get_sid() ..."

On a system exhibiting this problem, you will notice a lot of files in
/proc with the generic "proc_t" type (at least the ones that were
accessed early in the boot), for example:

   # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
   system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax

However, with this patch in place we see the expected result:

   # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
   system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax

Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
2014-03-05 15:54:57 -05:00
Libo Chen
31d4b76189 ima: new helper: file_inode(file)
Replace "file->f_dentry->d_inode" with the new file_inode() helper
function.

Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-03-04 16:53:03 -05:00
Paul Moore
0909c0ae99 selinux: put the mmap() DAC controls before the MAC controls
It turns out that doing the SELinux MAC checks for mmap() before the
DAC checks was causing users and the SELinux policy folks headaches
as users were seeing a lot of SELinux AVC denials for the
memprotect:mmap_zero permission that would have also been denied by
the normal DAC capability checks (CAP_SYS_RAWIO).

Example:

 # cat mmap_test.c
  #include <stdlib.h>
  #include <stdio.h>
  #include <errno.h>
  #include <sys/mman.h>

  int main(int argc, char *argv[])
  {
        int rc;
        void *mem;

        mem = mmap(0x0, 4096,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (mem == MAP_FAILED)
                return errno;
        printf("mem = %p\n", mem);
        munmap(mem, 4096);

        return 0;
  }
 # gcc -g -O0 -o mmap_test mmap_test.c
 # ./mmap_test
 mem = (nil)
 # ausearch -m AVC | grep mmap_zero
 type=AVC msg=audit(...): avc:  denied  { mmap_zero }
   for pid=1025 comm="mmap_test"
   scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
   tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
   tclass=memprotect

This patch corrects things so that when the above example is run by a
user without CAP_SYS_RAWIO the SELinux AVC is no longer generated as
the DAC capability check fails before the SELinux permission check.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2014-02-28 07:23:24 -05:00
James Morris
e4e027ea2d Merge branch 'stable-3.14' of git://git.infradead.org/users/pcmoore/selinux into for-linus 2014-02-24 14:40:16 +11:00
Eric Paris
9085a64229 SELinux: bigendian problems with filename trans rules
When writing policy via /sys/fs/selinux/policy I wrote the type and class
of filename trans rules in CPU endian instead of little endian.  On
x86_64 this works just fine, but it means that on big endian arch's like
ppc64 and s390 userspace reads the policy and converts it from
le32_to_cpu.  So the values are all screwed up.  Write the values in le
format like it should have been to start.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-02-20 12:07:58 -05:00
Sam Ravnborg
e0c2de2b15 security: cleanup Makefiles to use standard syntax for specifying sub-directories
The Makefiles in security/ uses a non-standard way to
specify sub-directories for building.

Fix it up so the normal (and documented) approach is used.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-02-17 11:08:04 +11:00
Fan Du
ca925cf153 flowcache: Make flow cache name space aware
Inserting a entry into flowcache, or flushing flowcache should be based
on per net scope. The reason to do so is flushing operation from fat
netns crammed with flow entries will also making the slim netns with only
a few flow cache entries go away in original implementation.

Since flowcache is tightly coupled with IPsec, so it would be easier to
put flow cache global parameters into xfrm namespace part. And one last
thing needs to do is bumping flow cache genid, and flush flow cache should
also be made in per net style.

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-12 07:02:11 +01:00
James Morris
f743166da7 Merge branch 'stable-3.14' of git://git.infradead.org/users/pcmoore/selinux into for-linus 2014-02-10 11:48:21 +11:00
Tejun Heo
073219e995 cgroup: clean up cgroup_subsys names and initialization
cgroup_subsys is a bit messier than it needs to be.

* The name of a subsys can be different from its internal identifier
  defined in cgroup_subsys.h.  Most subsystems use the matching name
  but three - cpu, memory and perf_event - use different ones.

* cgroup_subsys_id enums are postfixed with _subsys_id and each
  cgroup_subsys is postfixed with _subsys.  cgroup.h is widely
  included throughout various subsystems, it doesn't and shouldn't
  have claim on such generic names which don't have any qualifier
  indicating that they belong to cgroup.

* cgroup_subsys->subsys_id should always equal the matching
  cgroup_subsys_id enum; however, we require each controller to
  initialize it and then BUG if they don't match, which is a bit
  silly.

This patch cleans up cgroup_subsys names and initialization by doing
the followings.

* cgroup_subsys_id enums are now postfixed with _cgrp_id, and each
  cgroup_subsys with _cgrp_subsys.

* With the above, renaming subsys identifiers to match the userland
  visible names doesn't cause any naming conflicts.  All non-matching
  identifiers are renamed to match the official names.

  cpu_cgroup -> cpu
  mem_cgroup -> memory
  perf -> perf_event

* controllers no longer need to initialize ->subsys_id and ->name.
  They're generated in cgroup core and set automatically during boot.

* Redundant cgroup_subsys declarations removed.

* While updating BUG_ON()s in cgroup_init_early(), convert them to
  WARN()s.  BUGging that early during boot is stupid - the kernel
  can't print anything, even through serial console and the trap
  handler doesn't even link stack frame properly for back-tracing.

This patch doesn't introduce any behavior changes.

v2: Rebased on top of fe1217c4f3 ("net: net_cls: move cgroupfs
    classid handling into core").

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Ingo Molnar <mingo@redhat.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
2014-02-08 10:36:58 -05:00
Jingoo Han
29707b206c security: replace strict_strto*() with kstrto*()
The usage of strict_strto*() is not preferred, because
strict_strto*() is obsolete. Thus, kstrto*() should be
used.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-02-06 19:11:04 +11:00
Stephen Smalley
2172fa709a SELinux: Fix kernel BUG on empty security contexts.
Setting an empty security context (length=0) on a file will
lead to incorrectly dereferencing the type and other fields
of the security context structure, yielding a kernel BUG.
As a zero-length security context is never valid, just reject
all such security contexts whether coming from userspace
via setxattr or coming from the filesystem upon a getxattr
request by SELinux.

Setting a security context value (empty or otherwise) unknown to
SELinux in the first place is only possible for a root process
(CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only
if the corresponding SELinux mac_admin permission is also granted
to the domain by policy.  In Fedora policies, this is only allowed for
specific domains such as livecd for setting down security contexts
that are not defined in the build host policy.

Reproducer:
su
setenforce 0
touch foo
setfattr -n security.selinux foo

Caveat:
Relabeling or removing foo after doing the above may not be possible
without booting with SELinux disabled.  Any subsequent access to foo
after doing the above will also trigger the BUG.

BUG output from Matthew Thode:
[  473.893141] ------------[ cut here ]------------
[  473.962110] kernel BUG at security/selinux/ss/services.c:654!
[  473.995314] invalid opcode: 0000 [#6] SMP
[  474.027196] Modules linked in:
[  474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G      D   I
3.13.0-grsec #1
[  474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0
07/29/10
[  474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti:
ffff8805f50cd488
[  474.183707] RIP: 0010:[<ffffffff814681c7>]  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  474.219954] RSP: 0018:ffff8805c0ac3c38  EFLAGS: 00010246
[  474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX:
0000000000000100
[  474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI:
ffff8805e8aaa000
[  474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09:
0000000000000006
[  474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12:
0000000000000006
[  474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15:
0000000000000000
[  474.453816] FS:  00007f2e75220800(0000) GS:ffff88061fc00000(0000)
knlGS:0000000000000000
[  474.489254] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4:
00000000000207f0
[  474.556058] Stack:
[  474.584325]  ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98
ffff8805f1190a40
[  474.618913]  ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990
ffff8805e8aac860
[  474.653955]  ffff8805c0ac3cb8 000700068113833a ffff880606c75060
ffff8805c0ac3d94
[  474.690461] Call Trace:
[  474.723779]  [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a
[  474.778049]  [<ffffffff81468824>] security_compute_av+0xf4/0x20b
[  474.811398]  [<ffffffff8196f419>] avc_compute_av+0x2a/0x179
[  474.843813]  [<ffffffff8145727b>] avc_has_perm+0x45/0xf4
[  474.875694]  [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31
[  474.907370]  [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e
[  474.938726]  [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22
[  474.970036]  [<ffffffff811b057d>] vfs_getattr+0x19/0x2d
[  475.000618]  [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91
[  475.030402]  [<ffffffff811b063b>] vfs_lstat+0x19/0x1b
[  475.061097]  [<ffffffff811b077e>] SyS_newlstat+0x15/0x30
[  475.094595]  [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3
[  475.148405]  [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b
[  475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48
8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7
75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8
[  475.255884] RIP  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  475.296120]  RSP <ffff8805c0ac3c38>
[  475.328734] ---[ end trace f076482e9d754adc ]---

Reported-by:  Matthew Thode <mthode@mthode.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-02-05 12:20:51 -05:00
Paul Moore
6a96e15096 selinux: add SOCK_DIAG_BY_FAMILY to the list of netlink message types
The SELinux AF_NETLINK/NETLINK_SOCK_DIAG socket class was missing the
SOCK_DIAG_BY_FAMILY definition which caused SELINUX_ERR messages when
the ss tool was run.

 # ss
 Netid  State  Recv-Q Send-Q  Local Address:Port   Peer Address:Port
 u_str  ESTAB  0      0                  * 14189             * 14190
 u_str  ESTAB  0      0                  * 14145             * 14144
 u_str  ESTAB  0      0                  * 14151             * 14150
 {...}
 # ausearch -m SELINUX_ERR
 ----
 time->Thu Jan 23 11:11:16 2014
 type=SYSCALL msg=audit(1390493476.445:374):
  arch=c000003e syscall=44 success=yes exit=40
  a0=3 a1=7fff03aa11f0 a2=28 a3=0 items=0 ppid=1852 pid=1895
  auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0
  tty=pts0 ses=1 comm="ss" exe="/usr/sbin/ss"
  subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
 type=SELINUX_ERR msg=audit(1390493476.445:374):
  SELinux:  unrecognized netlink message type=20 for sclass=32

Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-02-05 12:20:48 -05:00
Paul Moore
825e587af2 Linux 3.13
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS3IyXAAoJEHm+PkMAQRiGplAH/ilCikBrCHyZ2938NHNLm+j1
 yhfYnEJHLNg7T69KEj3p0cNagO3v9RPWM6UYFBQ6uFIYNN1MBKO7U+mCZuMWzeO8
 +tGMV3mn5wx+oYn1RnWCCweQx5AESEl6rYn8udPDKh7LfW5fCLV60jguUjVSQ9IQ
 cvtKlWknbiHyM7t1GoYgzN7jlPrRQvcNQZ+Aogzz7uSnJAgwINglBAHS7WP2tiEM
 HAU2FoE4b3MbfGaid1vypaYQPBbFebx7Bw2WxAuZhkBbRiUBKlgF0/SYhOTvH38a
 Sjpj1EHKfjcuCBt9tht6KP6H56R25vNloGR2+FB+fuQBdujd/SPa9xDflEMcdG4=
 =iXnG
 -----END PGP SIGNATURE-----

Merge tag 'v3.13' into stable-3.14

Linux 3.13

Conflicts:
	security/selinux/hooks.c

Trivial merge issue in selinux_inet_conn_request() likely due to me
including patches that I sent to the stable folks in my next tree
resulting in the patch hitting twice (I think).  Thankfully it was an
easy fix this time, but regardless, lesson learned, I will not do that
again.
2014-02-05 10:39:48 -05:00
Colin Cross
530b099dfe security: select correct default LSM_MMAP_MIN_ADDR on arm on arm64
Binaries compiled for arm may run on arm64 if CONFIG_COMPAT is
selected.  Set LSM_MMAP_MIN_ADDR to 32768 if ARM64 && COMPAT to
prevent selinux failures launching 32-bit static executables that
are mapped at 0x8000.

Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-05 14:59:14 +00:00
Linus Torvalds
6dd9158ae8 Merge git://git.infradead.org/users/eparis/audit
Pull audit update from Eric Paris:
 "Again we stayed pretty well contained inside the audit system.
  Venturing out was fixing a couple of function prototypes which were
  inconsistent (didn't hurt anything, but we used the same value as an
  int, uint, u32, and I think even a long in a couple of places).

  We also made a couple of minor changes to when a couple of LSMs called
  the audit system.  We hoped to add aarch64 audit support this go
  round, but it wasn't ready.

  I'm disappearing on vacation on Thursday.  I should have internet
  access, but it'll be spotty.  If anything goes wrong please be sure to
  cc rgb@redhat.com.  He'll make fixing things his top priority"

* git://git.infradead.org/users/eparis/audit: (50 commits)
  audit: whitespace fix in kernel-parameters.txt
  audit: fix location of __net_initdata for audit_net_ops
  audit: remove pr_info for every network namespace
  audit: Modify a set of system calls in audit class definitions
  audit: Convert int limit uses to u32
  audit: Use more current logging style
  audit: Use hex_byte_pack_upper
  audit: correct a type mismatch in audit_syscall_exit()
  audit: reorder AUDIT_TTY_SET arguments
  audit: rework AUDIT_TTY_SET to only grab spin_lock once
  audit: remove needless switch in AUDIT_SET
  audit: use define's for audit version
  audit: documentation of audit= kernel parameter
  audit: wait_for_auditd rework for readability
  audit: update MAINTAINERS
  audit: log task info on feature change
  audit: fix incorrect set of audit_sock
  audit: print error message when fail to create audit socket
  audit: fix dangling keywords in audit_log_set_loginuid() output
  audit: log on errors from filter user rules
  ...
2014-01-23 18:08:10 -08:00
Paul Moore
41be702a54 Linux 3.13
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS3IyXAAoJEHm+PkMAQRiGplAH/ilCikBrCHyZ2938NHNLm+j1
 yhfYnEJHLNg7T69KEj3p0cNagO3v9RPWM6UYFBQ6uFIYNN1MBKO7U+mCZuMWzeO8
 +tGMV3mn5wx+oYn1RnWCCweQx5AESEl6rYn8udPDKh7LfW5fCLV60jguUjVSQ9IQ
 cvtKlWknbiHyM7t1GoYgzN7jlPrRQvcNQZ+Aogzz7uSnJAgwINglBAHS7WP2tiEM
 HAU2FoE4b3MbfGaid1vypaYQPBbFebx7Bw2WxAuZhkBbRiUBKlgF0/SYhOTvH38a
 Sjpj1EHKfjcuCBt9tht6KP6H56R25vNloGR2+FB+fuQBdujd/SPa9xDflEMcdG4=
 =iXnG
 -----END PGP SIGNATURE-----

Merge tag 'v3.13' into next

Linux 3.13

Minor fixup needed in selinux_inet_conn_request()

Conflicts:
	security/selinux/hooks.c
2014-01-23 15:52:06 -05:00
Linus Torvalds
f075e0f699 Merge branch 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "The bulk of changes are cleanups and preparations for the upcoming
  kernfs conversion.

   - cgroup_event mechanism which is and will be used only by memcg is
     moved to memcg.

   - pidlist handling is updated so that it can be served by seq_file.

     Also, the list is not sorted if sane_behavior.  cgroup
     documentation explicitly states that the file is not sorted but it
     has been for quite some time.

   - All cgroup file handling now happens on top of seq_file.  This is
     to prepare for kernfs conversion.  In addition, all operations are
     restructured so that they map 1-1 to kernfs operations.

   - Other cleanups and low-pri fixes"

* 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (40 commits)
  cgroup: trivial style updates
  cgroup: remove stray references to css_id
  doc: cgroups: Fix typo in doc/cgroups
  cgroup: fix fail path in cgroup_load_subsys()
  cgroup: fix missing unlock on error in cgroup_load_subsys()
  cgroup: remove for_each_root_subsys()
  cgroup: implement for_each_css()
  cgroup: factor out cgroup_subsys_state creation into create_css()
  cgroup: combine css handling loops in cgroup_create()
  cgroup: reorder operations in cgroup_create()
  cgroup: make for_each_subsys() useable under cgroup_root_mutex
  cgroup: css iterations and css_from_dir() are safe under cgroup_mutex
  cgroup: unify pidlist and other file handling
  cgroup: replace cftype->read_seq_string() with cftype->seq_show()
  cgroup: attach cgroup_open_file to all cgroup files
  cgroup: generalize cgroup_pidlist_open_file
  cgroup: unify read path so that seq_file is always used
  cgroup: unify cgroup_write_X64() and cgroup_write_string()
  cgroup: remove cftype->read(), ->read_map() and ->write()
  hugetlb_cgroup: convert away from cftype->read()
  ...
2014-01-21 17:51:34 -08:00
Linus Torvalds
fb2e2c8537 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "Changes for this kernel include maintenance updates for Smack, SELinux
  (and several networking fixes), IMA and TPM"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
  SELinux: Fix memory leak upon loading policy
  tpm/tpm-sysfs: active_show() can be static
  tpm: tpm_tis: Fix compile problems with CONFIG_PM_SLEEP/CONFIG_PNP
  tpm: Make tpm-dev allocate a per-file structure
  tpm: Use the ops structure instead of a copy in tpm_vendor_specific
  tpm: Create a tpm_class_ops structure and use it in the drivers
  tpm: Pull all driver sysfs code into tpm-sysfs.c
  tpm: Move sysfs functions from tpm-interface to tpm-sysfs
  tpm: Pull everything related to /dev/tpmX into tpm-dev.c
  char: tpm: nuvoton: remove unused variable
  tpm: MAINTAINERS: Cleanup TPM Maintainers file
  tpm/tpm_i2c_atmel: fix coccinelle warnings
  tpm/tpm_ibmvtpm: fix unreachable code warning (smatch warning)
  tpm/tpm_i2c_stm_st33: Check return code of get_burstcount
  tpm/tpm_ppi: Check return value of acpi_get_name
  tpm/tpm_ppi: Do not compare strcmp(a,b) == -1
  ima: remove unneeded size_limit argument from ima_eventdigest_init_common()
  ima: update IMA-templates.txt documentation
  ima: pass HASH_ALGO__LAST as hash algo in ima_eventdigest_init()
  ima: change the default hash algorithm to SHA1 in ima_eventdigest_ng_init()
  ...
2014-01-21 09:06:02 -08:00
Richard Guy Briggs
4eb0f4abfb smack: call WARN_ONCE() instead of calling audit_log_start()
Remove the call to audit_log() (which call audit_log_start()) and deal with
the errors in the caller, logging only once if the condition is met.  Calling
audit_log_start() in this location makes buffer allocation and locking more
complicated in the calling tree (audit_filter_user()).

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-13 22:32:06 -05:00
Richard Guy Briggs
9ad42a7924 selinux: call WARN_ONCE() instead of calling audit_log_start()
Two of the conditions in selinux_audit_rule_match() should never happen and
the third indicates a race that should be retried.  Remove the calls to
audit_log() (which call audit_log_start()) and deal with the errors in the
caller, logging only once if the condition is met.  Calling audit_log_start()
in this location makes buffer allocation and locking more complicated in the
calling tree (audit_filter_user()).

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-13 22:32:00 -05:00
Steven Rostedt
3dc91d4338 SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()
While running stress tests on adding and deleting ftrace instances I hit
this bug:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  IP: selinux_inode_permission+0x85/0x160
  PGD 63681067 PUD 7ddbe067 PMD 0
  Oops: 0000 [#1] PREEMPT
  CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
  Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
  task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
  RIP: 0010:[<ffffffff812d8bc5>]  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
  RSP: 0018:ffff88007ddb1c48  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
  RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
  RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
  R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
  R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
  FS:  00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
  CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
  Call Trace:
    security_inode_permission+0x1c/0x30
    __inode_permission+0x41/0xa0
    inode_permission+0x18/0x50
    link_path_walk+0x66/0x920
    path_openat+0xa6/0x6c0
    do_filp_open+0x43/0xa0
    do_sys_open+0x146/0x240
    SyS_open+0x1e/0x20
    system_call_fastpath+0x16/0x1b
  Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
  RIP  selinux_inode_permission+0x85/0x160
  CR2: 0000000000000020

Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():

	isec = inode->i_security;

	rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);

Note, the crash came from stressing the deletion and reading of debugfs
files.  I was not able to recreate this via normal files.  But I'm not
sure they are safe.  It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct.  Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here.  (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).

Note, this is a hack, but it fixes the problem at hand.  A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback.  But that is a major job to do, and requires a
lot of work.  For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.

Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-12 16:53:13 +07:00
James Morris
923b49ff69 Merge branch 'master' of git://git.infradead.org/users/pcmoore/selinux into next 2014-01-08 17:22:32 +11:00
Tetsuo Handa
8ed8146028 SELinux: Fix memory leak upon loading policy
Hello.

I got below leak with linux-3.10.0-54.0.1.el7.x86_64 .

[  681.903890] kmemleak: 5538 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

Below is a patch, but I don't know whether we need special handing for undoing
ebitmap_set_bit() call.
----------
>>From fe97527a90fe95e2239dfbaa7558f0ed559c0992 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Mon, 6 Jan 2014 16:30:21 +0900
Subject: [PATCH] SELinux: Fix memory leak upon loading policy

Commit 2463c26d "SELinux: put name based create rules in a hashtable" did not
check return value from hashtab_insert() in filename_trans_read(). It leaks
memory if hashtab_insert() returns error.

  unreferenced object 0xffff88005c9160d0 (size 8):
    comm "systemd", pid 1, jiffies 4294688674 (age 235.265s)
    hex dump (first 8 bytes):
      57 0b 00 00 6b 6b 6b a5                          W...kkk.
    backtrace:
      [<ffffffff816604ae>] kmemleak_alloc+0x4e/0xb0
      [<ffffffff811cba5e>] kmem_cache_alloc_trace+0x12e/0x360
      [<ffffffff812aec5d>] policydb_read+0xd1d/0xf70
      [<ffffffff812b345c>] security_load_policy+0x6c/0x500
      [<ffffffff812a623c>] sel_write_load+0xac/0x750
      [<ffffffff811eb680>] vfs_write+0xc0/0x1f0
      [<ffffffff811ec08c>] SyS_write+0x4c/0xa0
      [<ffffffff81690419>] system_call_fastpath+0x16/0x1b
      [<ffffffffffffffff>] 0xffffffffffffffff

However, we should not return EEXIST error to the caller, or the systemd will
show below message and the boot sequence freezes.

  systemd[1]: Failed to load SELinux policy. Freezing.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-01-07 10:21:44 -05:00
James Morris
d4a82a4a03 Merge branch 'master' of git://git.infradead.org/users/pcmoore/selinux into next
Conflicts:
	security/selinux/hooks.c

Resolved using request struct.

Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-01-07 01:45:59 +11:00
James Morris
38fd2c202a Merge to v3.13-rc7 for prerequisite changes in the Xen code for TPM 2014-01-06 22:23:01 +11:00
Roberto Sassu
dcf4e39286 ima: remove unneeded size_limit argument from ima_eventdigest_init_common()
This patch removes the 'size_limit' argument from
ima_eventdigest_init_common(). Since the 'd' field will never include
the hash algorithm as prefix and the 'd-ng' will always have it, we can
use the hash algorithm to differentiate the two cases in the modified
function (it is equal to HASH_ALGO__LAST in the first case, the opposite
in the second).

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-01-03 07:43:00 -05:00
Roberto Sassu
712a49bd7d ima: pass HASH_ALGO__LAST as hash algo in ima_eventdigest_init()
Replace the '-1' value with HASH_ALGO__LAST in ima_eventdigest_init()
as the called function ima_eventdigest_init_common() expects an unsigned
char.

Fix commit:
  4d7aeee ima: define new template ima-ng and template fields d-ng and n-ng

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-01-03 07:42:59 -05:00
Roberto Sassu
c502c78ba7 ima: change the default hash algorithm to SHA1 in ima_eventdigest_ng_init()
Replace HASH_ALGO__LAST with HASH_ALGO_SHA1 as the initial value of
the hash algorithm so that the prefix 'sha1:' is added to violation
digests.

Fix commit:
  4d7aeee ima: define new template ima-ng and template fields d-ng and n-ng

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Cc: <stable@vger.kernel.org> # 3.13.x
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-01-03 07:42:57 -05:00
Casey Schaufler
4482a44f6a Smack: File receive audit correction
Eric Paris politely points out:

    Inside smack_file_receive() it seems like you are initting the audit
    field with LSM_AUDIT_DATA_TASK.  And then use
    smk_ad_setfield_u_fs_path().

    Seems like LSM_AUDIT_DATA_PATH would make more sense.  (and depending
    on how it's used fix a crash...)

He is correct. This puts things in order.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-12-31 13:35:27 -08:00
Casey Schaufler
24ea1b6efc Smack: Rationalize mount restrictions
The mount restrictions imposed by Smack rely heavily on the
use of the filesystem "floor", which is the label that all
processes writing to the filesystem must have access to. It
turns out that while the "floor" notion is sound, it has yet
to be fully implemented and has never been used.

The sb_mount and sb_umount hooks only make sense if the
filesystem floor is used actively, and it isn't. They can
be reintroduced if a rational restriction comes up. Until
then, they get removed.

The sb_kern_mount hook is required for the option processing.
It is too permissive in the case of unprivileged mounts,
effectively bypassing the CAP_MAC_ADMIN restrictions if
any of the smack options are specified. Unprivileged mounts
are no longer allowed to set Smack filesystem options.
Additionally, the root and default values are set to the
label of the caller, in keeping with the policy that objects
get the label of their creator.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-12-31 13:35:16 -08:00
Casey Schaufler
4afde48be8 Smack: change rule cap check
smk_write_change_rule() is calling capable rather than
the more correct smack_privileged(). This allows for setting
rules in violation of the onlycap facility. This is the
simple repair.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-12-23 15:57:43 -08:00
Casey Schaufler
00f84f3f2e Smack: Make the syslog control configurable
The syslog control requires that the calling proccess
have the floor ("_") Smack label. Tizen does not run any
processes except for kernel helpers with the floor label.
This changes allows the admin to configure a specific
label for syslog. The default value is the star ("*")
label, effectively removing the restriction. The value
can be set using smackfs/syslog for anyone who wants
a more restrictive behavior.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-12-23 15:50:55 -08:00
Oleg Nesterov
c0c1439541 selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock()
selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace,
this looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().

And in theory this is wrong, spin_lock()->preempt_disable()
doesn't necessarily imply rcu_read_lock() we need to access
the ->parent.

Reported-by: Evan McNabb <emcnabb@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-23 17:45:17 -05:00
Chad Hanson
46d01d6322 selinux: fix broken peer recv check
Fix a broken networking check. Return an error if peer recv fails.  If
secmark is active and the packet recv succeeds the peer recv error is
ignored.

Signed-off-by: Chad Hanson <chanson@trustedcs.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-23 17:45:17 -05:00
Casey Schaufler
19760ad03c Smack: Prevent the * and @ labels from being used in SMACK64EXEC
Smack prohibits processes from using the star ("*") and web ("@") labels
because we don't want files with those labels getting created implicitly.
All setting of those labels should be done explicitly. The trouble is that
there is no check for these labels in the processing of SMACK64EXEC. That
is repaired.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-12-19 13:05:24 -08:00
Oleg Nesterov
465954cd64 selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock()
selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace,
this looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().

And in theory this is wrong, spin_lock()->preempt_disable()
doesn't necessarily imply rcu_read_lock() we need to access
the ->parent.

Reported-by: Evan McNabb <emcnabb@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-16 16:00:29 -05:00
Wei Yongjun
a5e333d340 SELinux: remove duplicated include from hooks.c
Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-16 15:58:23 -05:00
Linus Torvalds
b5745c5962 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull SELinux fixes from James Morris.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()
  selinux: look for IPsec labels on both inbound and outbound packets
  selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()
  selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()
  selinux: fix possible memory leak
2013-12-15 11:28:02 -08:00
Linus Torvalds
29b1deb2a4 Revert "selinux: consider filesystem subtype in policies"
This reverts commit 102aefdda4.

Tom London reports that it causes sync() to hang on Fedora rawhide:

  https://bugzilla.redhat.com/show_bug.cgi?id=1033965

and Josh Boyer bisected it down to this commit.  Reverting the commit in
the rawhide kernel fixes the problem.

Eric Paris root-caused it to incorrect subtype matching in that commit
breaking fuse, and has a tentative patch, but by now we're better off
retrying this in 3.14 rather than playing with it any more.

Reported-by: Tom London <selinux@gmail.com>
Bisected-by: Josh Boyer <jwboyer@fedoraproject.org>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Anand Avati <avati@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-15 11:17:45 -08:00
Paul Moore
4d546f8171 selinux: revert 102aefdda4
Revert "selinux: consider filesystem subtype in policies"

This reverts commit 102aefdda4.

Explanation from Eric Paris:

	SELinux policy can specify if it should use a filesystem's
	xattrs or not.  In current policy we have a specification that
	fuse should not use xattrs but fuse.glusterfs should use
	xattrs.  This patch has a bug in which non-glusterfs
	filesystems would match the rule saying fuse.glusterfs should
	use xattrs.  If both fuse and the particular filesystem in
	question are not written to handle xattr calls during the mount
	command, they will deadlock.

	I have fixed the bug to do proper matching, however I believe a
	revert is still the correct solution.  The reason I believe
	that is because the code still does not work.  The s_subtype is
	not set until after the SELinux hook which attempts to match on
	the ".gluster" portion of the rule.  So we cannot match on the
	rule in question.  The code is useless.

Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-13 14:52:25 -05:00
James Morris
d93aca6050 Merge branch 'master' of git://git.infradead.org/users/pcmoore/selinux_fixes into for-linus 2013-12-13 13:27:55 +11:00
Paul Moore
c0828e5048 selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()
Due to difficulty in arriving at the proper security label for
TCP SYN-ACK packets in selinux_ip_postroute(), we need to check packets
while/before they are undergoing XFRM transforms instead of waiting
until afterwards so that we can determine the correct security label.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-12 17:21:31 -05:00
Paul Moore
817eff718d selinux: look for IPsec labels on both inbound and outbound packets
Previously selinux_skb_peerlbl_sid() would only check for labeled
IPsec security labels on inbound packets, this patch enables it to
check both inbound and outbound traffic for labeled IPsec security
labels.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-12 17:21:31 -05:00
Paul Moore
446b802437 selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()
In selinux_ip_postroute() we perform access checks based on the
packet's security label.  For locally generated traffic we get the
packet's security label from the associated socket; this works in all
cases except for TCP SYN-ACK packets.  In the case of SYN-ACK packet's
the correct security label is stored in the connection's request_sock,
not the server's socket.  Unfortunately, at the point in time when
selinux_ip_postroute() is called we can't query the request_sock
directly, we need to recreate the label using the same logic that
originally labeled the associated request_sock.

See the inline comments for more explanation.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-12 17:21:31 -05:00
Paul Moore
4718006827 selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()
In selinux_ip_output() we always label packets based on the parent
socket.  While this approach works in almost all cases, it doesn't
work in the case of TCP SYN-ACK packets when the correct label is not
the label of the parent socket, but rather the label of the larval
socket represented by the request_sock struct.

Unfortunately, since the request_sock isn't queued on the parent
socket until *after* the SYN-ACK packet is sent, we can't lookup the
request_sock to determine the correct label for the packet; at this
point in time the best we can do is simply pass/NF_ACCEPT the packet.
It must be said that simply passing the packet without any explicit
labeling action, while far from ideal, is not terrible as the SYN-ACK
packet will inherit any IP option based labeling from the initial
connection request so the label *should* be correct and all our
access controls remain in place so we shouldn't have to worry about
information leaks.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-12 17:21:31 -05:00
Linus Torvalds
5dec682c7f Keyrings fixes 2013-12-10
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQIVAwUAUqdgihOxKuMESys7AQIishAAjGG3LnEp12fd7oay5//4SEg31ybPqGj7
 Xotk9mblBW7cWPbLqk7xyIhxpIj2zM14XatEV2TfBNZV0Lcrzr1M5s87ZRUX0ui+
 FHbtfn/M3Or8AvX7W1HZgG2Se7M0Ba6Y2RRXNsEdgqk6KMQuHhPEZ2FZ5vCx6BAD
 vXzxWIKVNeqMXR7z6xkyqmpztnVQs+ZgzE9+c96QKyhBdtBy4spDmfgJS90m0pjN
 HhgONpdfgosknj8yu43rWIQvd3UUO5BVntCeic94Fbgh77ZAEi0dD7ifLz/ebcja
 pfJfPXxYzkCfIrSdAzQF8iYnQ+5rRomvGsMcvtq6mBooah/YmEkmKpKJDTp47wyN
 IEUeJaxM2Qp2jcUSjEd7lY9o1AK4sj+90cKeVRUd5kzZP1iolkQHR+dGiAwqbn6w
 MJBn9fCamvhpsZDhl2G/DICprFDvErd9zAlNubivggJmITnPXNWx+K1RDL4fO4qc
 zLHTGHkxYyACM/7oJfwbH/NyZ1yu53OlE3R2h6TA6ISc7nAed7qzGAZyrSV92Quc
 pItGaQ/zNa0sbe+nufCx9FOWu8sA3x7qnazLxhVtlPde9nlxcpGo5cagqcyc4hyp
 /IGK2JoGkgHvehECY8miJlsu7UqmThIhmy6o4T4X6ErX1M9ifR92WGeXkAi/2mh/
 WciVpPvTrqM=
 =/AdA
 -----END PGP SIGNATURE-----

Merge tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull misc keyrings fixes from David Howells:
 "These break down into five sets:

   - A patch to error handling in the big_key type for huge payloads.
     If the payload is larger than the "low limit" and the backing store
     allocation fails, then big_key_instantiate() doesn't clear the
     payload pointers in the key, assuming them to have been previously
     cleared - but only one of them is.

     Unfortunately, the garbage collector still calls big_key_destroy()
     when sees one of the pointers with a weird value in it (and not
     NULL) which it then tries to clean up.

   - Three patches to fix the keyring type:

     * A patch to fix the hash function to correctly divide keyrings off
       from keys in the topology of the tree inside the associative
       array.  This is only a problem if searching through nested
       keyrings - and only if the hash function incorrectly puts the a
       keyring outside of the 0 branch of the root node.

     * A patch to fix keyrings' use of the associative array.  The
       __key_link_begin() function initially passes a NULL key pointer
       to assoc_array_insert() on the basis that it's holding a place in
       the tree whilst it does more allocation and stuff.

       This is only a problem when a node contains 16 keys that match at
       that level and we want to add an also matching 17th.  This should
       easily be manufactured with a keyring full of keyrings (without
       chucking any other sort of key into the mix) - except for (a)
       above which makes it on average adding the 65th keyring.

     * A patch to fix searching down through nested keyrings, where any
       keyring in the set has more than 16 keyrings and none of the
       first keyrings we look through has a match (before the tree
       iteration needs to step to a more distal node).

     Test in keyutils test suite:

        http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=8b4ae963ed92523aea18dfbb8cab3f4979e13bd1

   - A patch to fix the big_key type's use of a shmem file as its
     backing store causing audit messages and LSM check failures.  This
     is done by setting S_PRIVATE on the file to avoid LSM checks on the
     file (access to the shmem file goes through the keyctl() interface
     and so is gated by the LSM that way).

     This isn't normally a problem if a key is used by the context that
     generated it - and it's currently only used by libkrb5.

     Test in keyutils test suite:

        http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=d9a53cbab42c293962f2f78f7190253fc73bd32e

   - A patch to add a generated file to .gitignore.

   - A patch to fix the alignment of the system certificate data such
     that it it works on s390.  As I understand it, on the S390 arch,
     symbols must be 2-byte aligned because loading the address discards
     the least-significant bit"

* tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  KEYS: correct alignment of system_certificate_list content in assembly file
  Ignore generated file kernel/x509_certificate_list
  security: shmem: implement kernel private shmem inodes
  KEYS: Fix searching of nested keyrings
  KEYS: Fix multiple key add into associative array
  KEYS: Fix the keyring hash function
  KEYS: Pre-clear struct key on allocation
2013-12-12 10:15:24 -08:00
Chad Hanson
598cdbcf86 selinux: fix broken peer recv check
Fix a broken networking check. Return an error if peer recv fails.  If
secmark is active and the packet recv succeeds the peer recv error is
ignored.

Signed-off-by: Chad Hanson <chanson@trustedcs.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-11 17:07:56 -05:00
Jarkko Sakkinen
398ce07370 smack: fix: allow either entry be missing on access/access2 check (v2)
This is a regression caused by f7112e6c. When either subject or
object is not found the answer for access should be no. This
patch fixes the situation. '0' is written back instead of failing
with -EINVAL.

v2: cosmetic style fixes

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2013-12-11 10:48:55 -08:00
Paul Moore
5c6c26813a selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()
Due to difficulty in arriving at the proper security label for
TCP SYN-ACK packets in selinux_ip_postroute(), we need to check packets
while/before they are undergoing XFRM transforms instead of waiting
until afterwards so that we can determine the correct security label.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-10 14:50:25 -05:00
Paul Moore
5b67c49324 selinux: look for IPsec labels on both inbound and outbound packets
Previously selinux_skb_peerlbl_sid() would only check for labeled
IPsec security labels on inbound packets, this patch enables it to
check both inbound and outbound traffic for labeled IPsec security
labels.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-09 15:32:33 -05:00
Tejun Heo
2da8ca822d cgroup: replace cftype->read_seq_string() with cftype->seq_show()
In preparation of conversion to kernfs, cgroup file handling is
updated so that it can be easily mapped to kernfs.  This patch
replaces cftype->read_seq_string() with cftype->seq_show() which is
not limited to single_open() operation and will map directcly to
kernfs seq_file interface.

The conversions are mechanical.  As ->seq_show() doesn't have @css and
@cft, the functions which make use of them are converted to use
seq_css() and seq_cft() respectively.  In several occassions, e.f. if
it has seq_string in its name, the function name is updated to fit the
new method better.

This patch does not introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
2013-12-05 12:28:04 -05:00
Geyslan G. Bem
0af901643f selinux: fix possible memory leak
Free 'ctx_str' when necessary.

Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-04 16:10:24 -05:00
Paul Moore
0b1f24e6db selinux: pull address family directly from the request_sock struct
We don't need to inspect the packet to determine if the packet is an
IPv4 packet arriving on an IPv6 socket when we can query the
request_sock directly.

Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-04 16:08:27 -05:00
Paul Moore
050d032b25 selinux: ensure that the cached NetLabel secattr matches the desired SID
In selinux_netlbl_skbuff_setsid() we leverage a cached NetLabel
secattr whenever possible.  However, we never check to ensure that
the desired SID matches the cached NetLabel secattr.  This patch
checks the SID against the secattr before use and only uses the
cached secattr when the SID values match.

Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-04 16:08:17 -05:00
Paul Moore
7f721643db selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()
In selinux_ip_postroute() we perform access checks based on the
packet's security label.  For locally generated traffic we get the
packet's security label from the associated socket; this works in all
cases except for TCP SYN-ACK packets.  In the case of SYN-ACK packet's
the correct security label is stored in the connection's request_sock,
not the server's socket.  Unfortunately, at the point in time when
selinux_ip_postroute() is called we can't query the request_sock
directly, we need to recreate the label using the same logic that
originally labeled the associated request_sock.

See the inline comments for more explanation.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-04 16:07:28 -05:00
Paul Moore
da2ea0d056 selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()
In selinux_ip_output() we always label packets based on the parent
socket.  While this approach works in almost all cases, it doesn't
work in the case of TCP SYN-ACK packets when the correct label is not
the label of the parent socket, but rather the label of the larval
socket represented by the request_sock struct.

Unfortunately, since the request_sock isn't queued on the parent
socket until *after* the SYN-ACK packet is sent, we can't lookup the
request_sock to determine the correct label for the packet; at this
point in time the best we can do is simply pass/NF_ACCEPT the packet.
It must be said that simply passing the packet without any explicit
labeling action, while far from ideal, is not terrible as the SYN-ACK
packet will inherit any IP option based labeling from the initial
connection request so the label *should* be correct and all our
access controls remain in place so we shouldn't have to worry about
information leaks.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-12-04 16:06:47 -05:00
Roberto Sassu
a7ed7c60e1 ima: properly free ima_template_entry structures
The new templates management mechanism records information associated
to an event into an array of 'ima_field_data' structures and makes it
available through the 'template_data' field of the 'ima_template_entry'
structure (the element of the measurements list created by IMA).

Since 'ima_field_data' contains dynamically allocated data (which length
varies depending on the data associated to a selected template field),
it is not enough to just free the memory reserved for a
'ima_template_entry' structure if something goes wrong.

This patch creates the new function ima_free_template_entry() which
walks the array of 'ima_field_data' structures, frees the memory
referenced by the 'data' pointer and finally the space reserved for
the 'ima_template_entry' structure. Further, it replaces existing kfree()
that have a pointer to an 'ima_template_entry' structure as argument
with calls to the new function.

Fixes: a71dc65: ima: switch to new template management mechanism
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2013-12-02 20:46:56 -05:00
Christoph Paasch
09ae634572 ima: Do not free 'entry' before it is initialized
7bc5f447ce (ima: define new function ima_alloc_init_template() to
API) moved the initialization of 'entry' in ima_add_boot_aggregate() a
bit more below, after the if (ima_used_chip).

So, 'entry' is not initialized while being inside this if-block. So, we
should not attempt to free it.

Found by Coverity (CID: 1131971)

Fixes: 7bc5f447ce (ima: define new function ima_alloc_init_template() to API)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2013-12-02 20:46:32 -05:00
Eric Paris
c727709092 security: shmem: implement kernel private shmem inodes
We have a problem where the big_key key storage implementation uses a
shmem backed inode to hold the key contents.  Because of this detail of
implementation LSM checks are being done between processes trying to
read the keys and the tmpfs backed inode.  The LSM checks are already
being handled on the key interface level and should not be enforced at
the inode level (since the inode is an implementation detail, not a
part of the security model)

This patch implements a new function shmem_kernel_file_setup() which
returns the equivalent to shmem_file_setup() only the underlying inode
has S_PRIVATE set.  This means that all LSM checks for the inode in
question are skipped.  It should only be used for kernel internal
operations where the inode is not exposed to userspace without proper
LSM checking.  It is possible that some other users of
shmem_file_setup() should use the new interface, but this has not been
explored.

Reproducing this bug is a little bit difficult.  The steps I used on
Fedora are:

 (1) Turn off selinux enforcing:

	setenforce 0

 (2) Create a huge key

	k=`dd if=/dev/zero bs=8192 count=1 | keyctl padd big_key test-key @s`

 (3) Access the key in another context:

	runcon system_u:system_r:httpd_t:s0-s0:c0.c1023 keyctl print $k >/dev/null

 (4) Examine the audit logs:

	ausearch -m AVC -i --subject httpd_t | audit2allow

If the last command's output includes a line that looks like:

	allow httpd_t user_tmpfs_t:file { open read };

There was an inode check between httpd and the tmpfs filesystem.  With
this patch no such denial will be seen.  (NOTE! you should clear your
audit log if you have tested for this previously)

(Please return you box to enforcing)

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hugh Dickins <hughd@google.com>
cc: linux-mm@kvack.org
2013-12-02 11:24:19 +00:00
David Howells
9c5e45df21 KEYS: Fix searching of nested keyrings
If a keyring contains more than 16 keyrings (the capacity of a single node in
the associative array) then those keyrings are split over multiple nodes
arranged as a tree.

If search_nested_keyrings() is called to search the keyring then it will
attempt to manually walk over just the 0 branch of the associative array tree
where all the keyring links are stored.  This works provided the key is found
before the algorithm steps from one node containing keyrings to a child node
or if there are sufficiently few keyring links that the keyrings are all in
one node.

However, if the algorithm does need to step from a node to a child node, it
doesn't change the node pointer unless a shortcut also gets transited.  This
means that the algorithm will keep scanning the same node over and over again
without terminating and without returning.

To fix this, move the internal-pointer-to-node translation from inside the
shortcut transit handler so that it applies it to node arrival as well.

This can be tested by:

	r=`keyctl newring sandbox @s`
	for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done
	for ((i=0; i<=16; i++)); do keyctl add user a$i a %:ring$i; done
	for ((i=0; i<=16; i++)); do keyctl search $r user a$i; done
	for ((i=17; i<=20; i++)); do keyctl search $r user a$i; done

The searches should all complete successfully (or with an error for 17-20),
but instead one or more of them will hang.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>
2013-12-02 11:24:19 +00:00
David Howells
23fd78d764 KEYS: Fix multiple key add into associative array
If sufficient keys (or keyrings) are added into a keyring such that a node in
the associative array's tree overflows (each node has a capacity N, currently
16) and such that all N+1 keys have the same index key segment for that level
of the tree (the level'th nibble of the index key), then assoc_array_insert()
calls ops->diff_objects() to indicate at which bit position the two index keys
vary.

However, __key_link_begin() passes a NULL object to assoc_array_insert() with
the intention of supplying the correct pointer later before we commit the
change.  This means that keyring_diff_objects() is given a NULL pointer as one
of its arguments which it does not expect.  This results in an oops like the
attached.

With the previous patch to fix the keyring hash function, this can be forced
much more easily by creating a keyring and only adding keyrings to it.  Add any
other sort of key and a different insertion path is taken - all 16+1 objects
must want to cluster in the same node slot.

This can be tested by:

	r=`keyctl newring sandbox @s`
	for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done

This should work fine, but oopses when the 17th keyring is added.

Since ops->diff_objects() is always called with the first pointer pointing to
the object to be inserted (ie. the NULL pointer), we can fix the problem by
changing the to-be-inserted object pointer to point to the index key passed
into assoc_array_insert() instead.

Whilst we're at it, we also switch the arguments so that they are the same as
for ->compare_object().

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
RIP: 0010:[<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
Call Trace:
 [<ffffffff81191f9d>] keyring_diff_objects+0x21/0xd2
 [<ffffffff811f09ef>] assoc_array_insert+0x3b6/0x908
 [<ffffffff811929a7>] __key_link_begin+0x78/0xe5
 [<ffffffff81191a2e>] key_create_or_update+0x17d/0x36a
 [<ffffffff81192e0a>] SyS_add_key+0x123/0x183
 [<ffffffff81400ddb>] tracesys+0xdd/0xe2

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>
2013-12-02 11:24:18 +00:00
David Howells
d54e58b7f0 KEYS: Fix the keyring hash function
The keyring hash function (used by the associative array) is supposed to clear
the bottommost nibble of the index key (where the hash value resides) for
keyrings and make sure it is non-zero for non-keyrings.  This is done to make
keyrings cluster together on one branch of the tree separately to other keys.

Unfortunately, the wrong mask is used, so only the bottom two bits are
examined and cleared and not the whole bottom nibble.  This means that keys
and keyrings can still be successfully searched for under most circumstances
as the hash is consistent in its miscalculation, but if a keyring's
associative array bottom node gets filled up then approx 75% of the keyrings
will not be put into the 0 branch.

The consequence of this is that a key in a keyring linked to by another
keyring, ie.

	keyring A -> keyring B -> key

may not be found if the search starts at keyring A and then descends into
keyring B because search_nested_keyrings() only searches up the 0 branch (as it
"knows" all keyrings must be there and not elsewhere in the tree).

The fix is to use the right mask.

This can be tested with:

	r=`keyctl newring sandbox @s`
	for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done
	for ((i=0; i<=16; i++)); do keyctl add user a$i a %:ring$i; done
	for ((i=0; i<=16; i++)); do keyctl search $r user a$i; done

This creates a sandbox keyring, then creates 17 keyrings therein (labelled
ring0..ring16).  This causes the root node of the sandbox's associative array
to overflow and for the tree to have extra nodes inserted.

Each keyring then is given a user key (labelled aN for ringN) for us to search
for.

We then search for the user keys we added, starting from the sandbox.  If
working correctly, it should return the same ordered list of key IDs as
for...keyctl add... did.  Without this patch, it reports ENOKEY "Required key
not available" for some of the keys.  Just which keys get this depends as the
kernel pointer to the key type forms part of the hash function.

Reported-by: Nalin Dahyabhai <nalin@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>
2013-12-02 11:24:18 +00:00
David Howells
2480f57fb3 KEYS: Pre-clear struct key on allocation
The second word of key->payload does not get initialised in key_alloc(), but
the big_key type is relying on it having been cleared.  The problem comes when
big_key fails to instantiate a large key and doesn't then set the payload.  The
big_key_destroy() op is called from the garbage collector and this assumes that
the dentry pointer stored in the second word will be NULL if instantiation did
not complete.

Therefore just pre-clear the entire struct key on allocation rather than trying
to be clever and only initialising to 0 only those bits that aren't otherwise
initialised.

The lack of initialisation can lead to a bug report like the following if
big_key failed to initialise its file:

	general protection fault: 0000 [#1] SMP
	Modules linked in: ...
	CPU: 0 PID: 51 Comm: kworker/0:1 Not tainted 3.10.0-53.el7.x86_64 #1
	Hardware name: Dell Inc. PowerEdge 1955/0HC513, BIOS 1.4.4 12/09/2008
	Workqueue: events key_garbage_collector
	task: ffff8801294f5680 ti: ffff8801296e2000 task.ti: ffff8801296e2000
	RIP: 0010:[<ffffffff811b4a51>] dput+0x21/0x2d0
	...
	Call Trace:
	 [<ffffffff811a7b06>] path_put+0x16/0x30
	 [<ffffffff81235604>] big_key_destroy+0x44/0x60
	 [<ffffffff8122dc4b>] key_gc_unused_keys.constprop.2+0x5b/0xe0
	 [<ffffffff8122df2f>] key_garbage_collector+0x1df/0x3c0
	 [<ffffffff8107759b>] process_one_work+0x17b/0x460
	 [<ffffffff8107834b>] worker_thread+0x11b/0x400
	 [<ffffffff81078230>] ? rescuer_thread+0x3e0/0x3e0
	 [<ffffffff8107eb00>] kthread+0xc0/0xd0
	 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110
	 [<ffffffff815c4bec>] ret_from_fork+0x7c/0xb0
	 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110

Reported-by: Patrik Kis <pkis@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>
2013-12-02 11:24:18 +00:00
Roberto Sassu
af91706d5d ima: store address of template_fmt_copy in a pointer before calling strsep
This patch stores the address of the 'template_fmt_copy' variable in a new
variable, called 'template_fmt_ptr', so that the latter is passed as an
argument of strsep() instead of the former. This modification is needed
in order to correctly free the memory area referenced by
'template_fmt_copy' (strsep() modifies the pointer of the passed string).

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-11-30 13:09:53 +11:00
Paul Moore
dd0a11815a Linux 3.12
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp
 V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs
 WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY
 UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK
 jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb
 LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8=
 =57MC
 -----END PGP SIGNATURE-----

Merge tag 'v3.12'

Linux 3.12
2013-11-26 17:32:55 -05:00
Geyslan G. Bem
8e645c345a selinux: fix possible memory leak
Free 'ctx_str' when necessary.

Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-11-25 17:00:33 -05:00
Roberto Sassu
dbc335d2dc ima: make a copy of template_fmt in template_desc_init_fields()
This patch makes a copy of the 'template_fmt' function argument so that
the latter will not be modified by strsep(), which does the splitting by
replacing the given separator with '\0'.

 IMA: No TPM chip found, activating TPM-bypass!
 Unable to handle kernel pointer dereference at virtual kernel address 0000000000842000
 Oops: 0004 [#1] SMP
 Modules linked in:
 CPU: 3 PID: 1 Comm: swapper/0 Not tainted 3.12.0-rc2-00098-g3ce1217d6cd5 #17
 task: 000000003ffa0000 ti: 000000003ff84000 task.ti: 000000003ff84000
 Krnl PSW : 0704e00180000000 000000000044bf88 (strsep+0x7c/0xa0)
            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
 Krnl GPRS: 000000000000007c 000000000000007c 000000003ff87d90 0000000000821fd8
            0000000000000000 000000000000007c 0000000000aa37e0 0000000000aa9008
            0000000000000051 0000000000a114d8 0000000100000002 0000000000842bde
            0000000000842bdf 00000000006f97f0 000000000040062c 000000003ff87cf0
 Krnl Code: 000000000044bf7c: a7f4000a           brc     15,44bf90
            000000000044bf80: b90200cc           ltgr    %r12,%r12
           #000000000044bf84: a7840006           brc     8,44bf90
           >000000000044bf88: 9200c000           mvi     0(%r12),0
            000000000044bf8c: 41c0c001           la      %r12,1(%r12)
            000000000044bf90: e3c020000024       stg     %r12,0(%r2)
            000000000044bf96: b904002b           lgr     %r2,%r11
            000000000044bf9a: ebbcf0700004       lmg     %r11,%r12,112(%r15)
 Call Trace:
 ([<00000000004005fe>] ima_init_template+0xa2/0x1bc)
  [<0000000000a7c896>] ima_init+0x7a/0xa8
  [<0000000000a7c938>] init_ima+0x24/0x40
  [<00000000001000e8>] do_one_initcall+0x68/0x128
  [<0000000000a4eb56>] kernel_init_freeable+0x20a/0x2b4
  [<00000000006a1ff4>] kernel_init+0x30/0x178
  [<00000000006b69fe>] kernel_thread_starter+0x6/0xc
  [<00000000006b69f8>] kernel_thread_starter+0x0/0xc
 Last Breaking-Event-Address:
  [<000000000044bf42>] strsep+0x36/0xa0

Fixes commit: adf53a7 ima: new templates management mechanism

Changelog v1:
- make template_fmt 'const char *' (reported-by James Morris)
- fix kstrdup memory leak (reported-by James Morris)

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-11-25 15:05:33 -05:00
Roberto Sassu
3e8e5503a3 ima: do not send field length to userspace for digest of ima template
This patch defines a new value for the 'ima_show_type' enumerator
(IMA_SHOW_BINARY_NO_FIELD_LEN) to prevent that the field length
is transmitted through the 'binary_runtime_measurements' interface
for the digest field of the 'ima' template.

Fixes commit: 3ce1217 ima: define template fields library and new helpers

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-11-25 07:31:14 -05:00
Roberto Sassu
b6f8f16f41 ima: do not include field length in template digest calc for ima template
To maintain compatibility with userspace tools, the field length must not
be included in the template digest calculation for the 'ima' template.

Fixes commit: a71dc65 ima: switch to new template management mechanism

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-11-25 07:26:28 -05:00
Linus Torvalds
34ef7bd382 Revert "ima: define '_ima' as a builtin 'trusted' keyring"
This reverts commit 217091dd7a, which
caused the following build error:

  security/integrity/digsig.c:70:5: error: redefinition of ‘integrity_init_keyring’
  security/integrity/integrity.h:149:12: note: previous definition of ‘integrity_init_keyring’ w
  security/integrity/integrity.h:149:12: warning: ‘integrity_init_keyring’ defined but not used

reported by Krzysztof Kolasa. Mimi says:

 "I made the classic mistake of requesting this patch to be upstreamed
  at the last second, rather than waiting until the next open window.

  At this point, the best course would probably be to revert the two
  commits and fix them for the next open window"

Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-23 16:36:35 -08:00
Eric Paris
fc582aef7d Linux 3.12
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp
 V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs
 WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY
 UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK
 jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb
 LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8=
 =57MC
 -----END PGP SIGNATURE-----

Merge tag 'v3.12'

Linux 3.12

Conflicts:
	fs/exec.c
2013-11-22 18:57:54 -05:00
Linus Torvalds
78dc53c422 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "In this patchset, we finally get an SELinux update, with Paul Moore
  taking over as maintainer of that code.

  Also a significant update for the Keys subsystem, as well as
  maintenance updates to Smack, IMA, TPM, and Apparmor"

and since I wanted to know more about the updates to key handling,
here's the explanation from David Howells on that:

 "Okay.  There are a number of separate bits.  I'll go over the big bits
  and the odd important other bit, most of the smaller bits are just
  fixes and cleanups.  If you want the small bits accounting for, I can
  do that too.

   (1) Keyring capacity expansion.

        KEYS: Consolidate the concept of an 'index key' for key access
        KEYS: Introduce a search context structure
        KEYS: Search for auth-key by name rather than target key ID
        Add a generic associative array implementation.
        KEYS: Expand the capacity of a keyring

     Several of the patches are providing an expansion of the capacity of a
     keyring.  Currently, the maximum size of a keyring payload is one page.
     Subtract a small header and then divide up into pointers, that only gives
     you ~500 pointers on an x86_64 box.  However, since the NFS idmapper uses
     a keyring to store ID mapping data, that has proven to be insufficient to
     the cause.

     Whatever data structure I use to handle the keyring payload, it can only
     store pointers to keys, not the keys themselves because several keyrings
     may point to a single key.  This precludes inserting, say, and rb_node
     struct into the key struct for this purpose.

     I could make an rbtree of records such that each record has an rb_node
     and a key pointer, but that would use four words of space per key stored
     in the keyring.  It would, however, be able to use much existing code.

     I selected instead a non-rebalancing radix-tree type approach as that
     could have a better space-used/key-pointer ratio.  I could have used the
     radix tree implementation that we already have and insert keys into it by
     their serial numbers, but that means any sort of search must iterate over
     the whole radix tree.  Further, its nodes are a bit on the capacious side
     for what I want - especially given that key serial numbers are randomly
     allocated, thus leaving a lot of empty space in the tree.

     So what I have is an associative array that internally is a radix-tree
     with 16 pointers per node where the index key is constructed from the key
     type pointer and the key description.  This means that an exact lookup by
     type+description is very fast as this tells us how to navigate directly to
     the target key.

     I made the data structure general in lib/assoc_array.c as far as it is
     concerned, its index key is just a sequence of bits that leads to a
     pointer.  It's possible that someone else will be able to make use of it
     also.  FS-Cache might, for example.

   (2) Mark keys as 'trusted' and keyrings as 'trusted only'.

        KEYS: verify a certificate is signed by a 'trusted' key
        KEYS: Make the system 'trusted' keyring viewable by userspace
        KEYS: Add a 'trusted' flag and a 'trusted only' flag
        KEYS: Separate the kernel signature checking keyring from module signing

     These patches allow keys carrying asymmetric public keys to be marked as
     being 'trusted' and allow keyrings to be marked as only permitting the
     addition or linkage of trusted keys.

     Keys loaded from hardware during kernel boot or compiled into the kernel
     during build are marked as being trusted automatically.  New keys can be
     loaded at runtime with add_key().  They are checked against the system
     keyring contents and if their signatures can be validated with keys that
     are already marked trusted, then they are marked trusted also and can
     thus be added into the master keyring.

     Patches from Mimi Zohar make this usable with the IMA keyrings also.

   (3) Remove the date checks on the key used to validate a module signature.

        X.509: Remove certificate date checks

     It's not reasonable to reject a signature just because the key that it was
     generated with is no longer valid datewise - especially if the kernel
     hasn't yet managed to set the system clock when the first module is
     loaded - so just remove those checks.

   (4) Make it simpler to deal with additional X.509 being loaded into the kernel.

        KEYS: Load *.x509 files into kernel keyring
        KEYS: Have make canonicalise the paths of the X.509 certs better to deduplicate

     The builder of the kernel now just places files with the extension ".x509"
     into the kernel source or build trees and they're concatenated by the
     kernel build and stuffed into the appropriate section.

   (5) Add support for userspace kerberos to use keyrings.

        KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches
        KEYS: Implement a big key type that can save to tmpfs

     Fedora went to, by default, storing kerberos tickets and tokens in tmpfs.
     We looked at storing it in keyrings instead as that confers certain
     advantages such as tickets being automatically deleted after a certain
     amount of time and the ability for the kernel to get at these tokens more
     easily.

     To make this work, two things were needed:

     (a) A way for the tickets to persist beyond the lifetime of all a user's
         sessions so that cron-driven processes can still use them.

         The problem is that a user's session keyrings are deleted when the
         session that spawned them logs out and the user's user keyring is
         deleted when the UID is deleted (typically when the last log out
         happens), so neither of these places is suitable.

         I've added a system keyring into which a 'persistent' keyring is
         created for each UID on request.  Each time a user requests their
         persistent keyring, the expiry time on it is set anew.  If the user
         doesn't ask for it for, say, three days, the keyring is automatically
         expired and garbage collected using the existing gc.  All the kerberos
         tokens it held are then also gc'd.

     (b) A key type that can hold really big tickets (up to 1MB in size).

         The problem is that Active Directory can return huge tickets with lots
         of auxiliary data attached.  We don't, however, want to eat up huge
         tracts of unswappable kernel space for this, so if the ticket is
         greater than a certain size, we create a swappable shmem file and dump
         the contents in there and just live with the fact we then have an
         inode and a dentry overhead.  If the ticket is smaller than that, we
         slap it in a kmalloc()'d buffer"

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (121 commits)
  KEYS: Fix keyring content gc scanner
  KEYS: Fix error handling in big_key instantiation
  KEYS: Fix UID check in keyctl_get_persistent()
  KEYS: The RSA public key algorithm needs to select MPILIB
  ima: define '_ima' as a builtin 'trusted' keyring
  ima: extend the measurement list to include the file signature
  kernel/system_certificate.S: use real contents instead of macro GLOBAL()
  KEYS: fix error return code in big_key_instantiate()
  KEYS: Fix keyring quota misaccounting on key replacement and unlink
  KEYS: Fix a race between negating a key and reading the error set
  KEYS: Make BIG_KEYS boolean
  apparmor: remove the "task" arg from may_change_ptraced_domain()
  apparmor: remove parent task info from audit logging
  apparmor: remove tsk field from the apparmor_audit_struct
  apparmor: fix capability to not use the current task, during reporting
  Smack: Ptrace access check mode
  ima: provide hash algo info in the xattr
  ima: enable support for larger default filedata hash algorithms
  ima: define kernel parameter 'ima_template=' to change configured default
  ima: add Kconfig default measurement list template
  ...
2013-11-21 19:46:00 -08:00
Linus Torvalds
3eaded86ac Merge git://git.infradead.org/users/eparis/audit
Pull audit updates from Eric Paris:
 "Nothing amazing.  Formatting, small bug fixes, couple of fixes where
  we didn't get records due to some old VFS changes, and a change to how
  we collect execve info..."

Fixed conflict in fs/exec.c as per Eric and linux-next.

* git://git.infradead.org/users/eparis/audit: (28 commits)
  audit: fix type of sessionid in audit_set_loginuid()
  audit: call audit_bprm() only once to add AUDIT_EXECVE information
  audit: move audit_aux_data_execve contents into audit_context union
  audit: remove unused envc member of audit_aux_data_execve
  audit: Kill the unused struct audit_aux_data_capset
  audit: do not reject all AUDIT_INODE filter types
  audit: suppress stock memalloc failure warnings since already managed
  audit: log the audit_names record type
  audit: add child record before the create to handle case where create fails
  audit: use given values in tty_audit enable api
  audit: use nlmsg_len() to get message payload length
  audit: use memset instead of trying to initialize field by field
  audit: fix info leak in AUDIT_GET requests
  audit: update AUDIT_INODE filter rule to comparator function
  audit: audit feature to set loginuid immutable
  audit: audit feature to only allow unsetting the loginuid
  audit: allow unsetting the loginuid (with priv)
  audit: remove CONFIG_AUDIT_LOGINUID_IMMUTABLE
  audit: loginuid functions coding style
  selinux: apply selinux checks on new audit message types
  ...
2013-11-21 19:18:14 -08:00
Tim Gardner
b5495b4217 SELinux: security_load_policy: Silence frame-larger-than warning
Dynamically allocate a couple of the larger stack variables in order to
reduce the stack footprint below 1024. gcc-4.8

security/selinux/ss/services.c: In function 'security_load_policy':
security/selinux/ss/services.c:1964:1: warning: the frame size of 1104 bytes is larger than 1024 bytes [-Wframe-larger-than=]
 }

Also silence a couple of checkpatch warnings at the same time.

WARNING: sizeof policydb should be sizeof(policydb)
+	memcpy(oldpolicydb, &policydb, sizeof policydb);

WARNING: sizeof policydb should be sizeof(policydb)
+	memcpy(&policydb, newpolicydb, sizeof policydb);

Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-11-19 17:35:18 -05:00
Richard Haines
a660bec1d8 SELinux: Update policy version to support constraints info
Update the policy version (POLICYDB_VERSION_CONSTRAINT_NAMES) to allow
holding of policy source info for constraints.

Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-11-19 17:34:23 -05:00
David Howells
62fe318256 KEYS: Fix keyring content gc scanner
Key pointers stored in the keyring are marked in bit 1 to indicate if they
point to a keyring.  We need to strip off this bit before using the pointer
when iterating over the keyring for the purpose of looking for links to garbage
collect.

This means that expirable keyrings aren't correctly expiring because the
checker is seeing their key pointer with 2 added to it.

Since the fix for this involves knowing about the internals of the keyring,
key_gc_keyring() is moved to keyring.c and merged into keyring_gc().

This can be tested by:

	echo 2 >/proc/sys/kernel/keys/gc_delay
	keyctl timeout `keyctl add keyring qwerty "" @s` 2
	cat /proc/keys
	sleep 5; cat /proc/keys

which should see a keyring called "qwerty" appear in the session keyring and
then disappear after it expires, and:

	echo 2 >/proc/sys/kernel/keys/gc_delay
	a=`keyctl get_persistent @s`
	b=`keyctl add keyring 0 "" $a`
	keyctl add user a a $b
	keyctl timeout $b 2
	cat /proc/keys
	sleep 5; cat /proc/keys

which should see a keyring called "0" with a key called "a" in it appear in the
user's persistent keyring (which will be attached to the session keyring) and
then both the "0" keyring and the "a" key should disappear when the "0" keyring
expires.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Simo Sorce <simo@redhat.com>
2013-11-14 14:09:53 +00:00
David Howells
97826c821e KEYS: Fix error handling in big_key instantiation
In the big_key_instantiate() function we return 0 if kernel_write() returns us
an error rather than returning an error.  This can potentially lead to
dentry_open() giving a BUG when called from big_key_read() with an unset
tmpfile path.

	------------[ cut here ]------------
	kernel BUG at fs/open.c:798!
	...
	RIP: 0010:[<ffffffff8119bbd1>] dentry_open+0xd1/0xe0
	...
	Call Trace:
	 [<ffffffff812350c5>] big_key_read+0x55/0x100
	 [<ffffffff81231084>] keyctl_read_key+0xb4/0xe0
	 [<ffffffff81231e58>] SyS_keyctl+0xf8/0x1d0
	 [<ffffffff815bb799>] system_call_fastpath+0x16/0x1b


Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>
2013-11-13 16:51:06 +00:00
Linus Torvalds
42a2d923cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) The addition of nftables.  No longer will we need protocol aware
    firewall filtering modules, it can all live in userspace.

    At the core of nftables is a, for lack of a better term, virtual
    machine that executes byte codes to inspect packet or metadata
    (arriving interface index, etc.) and make verdict decisions.

    Besides support for loading packet contents and comparing them, the
    interpreter supports lookups in various datastructures as
    fundamental operations.  For example sets are supports, and
    therefore one could create a set of whitelist IP address entries
    which have ACCEPT verdicts attached to them, and use the appropriate
    byte codes to do such lookups.

    Since the interpreted code is composed in userspace, userspace can
    do things like optimize things before giving it to the kernel.

    Another major improvement is the capability of atomically updating
    portions of the ruleset.  In the existing netfilter implementation,
    one has to update the entire rule set in order to make a change and
    this is very expensive.

    Userspace tools exist to create nftables rules using existing
    netfilter rule sets, but both kernel implementations will need to
    co-exist for quite some time as we transition from the old to the
    new stuff.

    Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
    worked so hard on this.

 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
    to our pseudo-random number generator, mostly used for things like
    UDP port randomization and netfitler, amongst other things.

    In particular the taus88 generater is updated to taus113, and test
    cases are added.

 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
    and Yang Yingliang.

 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
    Sujir.

 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
    Neal Cardwell, and Yuchung Cheng.

 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
    control message data, much like other socket option attributes.
    From Francesco Fusco.

 7) Allow applications to specify a cap on the rate computed
    automatically by the kernel for pacing flows, via a new
    SO_MAX_PACING_RATE socket option.  From Eric Dumazet.

 8) Make the initial autotuned send buffer sizing in TCP more closely
    reflect actual needs, from Eric Dumazet.

 9) Currently early socket demux only happens for TCP sockets, but we
    can do it for connected UDP sockets too.  Implementation from Shawn
    Bohrer.

10) Refactor inet socket demux with the goal of improving hash demux
    performance for listening sockets.  With the main goals being able
    to use RCU lookups on even request sockets, and eliminating the
    listening lock contention.  From Eric Dumazet.

11) The bonding layer has many demuxes in it's fast path, and an RCU
    conversion was started back in 3.11, several changes here extend the
    RCU usage to even more locations.  From Ding Tianhong and Wang
    Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
    Falico.

12) Allow stackability of segmentation offloads to, in particular, allow
    segmentation offloading over tunnels.  From Eric Dumazet.

13) Significantly improve the handling of secret keys we input into the
    various hash functions in the inet hashtables, TCP fast open, as
    well as syncookies.  From Hannes Frederic Sowa.  The key fundamental
    operation is "net_get_random_once()" which uses static keys.

    Hannes even extended this to ipv4/ipv6 fragmentation handling and
    our generic flow dissector.

14) The generic driver layer takes care now to set the driver data to
    NULL on device removal, so it's no longer necessary for drivers to
    explicitly set it to NULL any more.  Many drivers have been cleaned
    up in this way, from Jingoo Han.

15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.

16) Improve CRC32 interfaces and generic SKB checksum iterators so that
    SCTP's checksumming can more cleanly be handled.  Also from Daniel
    Borkmann.

17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
    using the interface MTU value.  This helps avoid PMTU attacks,
    particularly on DNS servers.  From Hannes Frederic Sowa.

18) Use generic XPS for transmit queue steering rather than internal
    (re-)implementation in virtio-net.  From Jason Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
  random32: add test cases for taus113 implementation
  random32: upgrade taus88 generator to taus113 from errata paper
  random32: move rnd_state to linux/random.h
  random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
  random32: add periodic reseeding
  random32: fix off-by-one in seeding requirement
  PHY: Add RTL8201CP phy_driver to realtek
  xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
  macmace: add missing platform_set_drvdata() in mace_probe()
  ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
  ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
  vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
  ixgbe: add warning when max_vfs is out of range.
  igb: Update link modes display in ethtool
  netfilter: push reasm skb through instead of original frag skbs
  ip6_output: fragment outgoing reassembled skb properly
  MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
  net_sched: tbf: support of 64bit rates
  ixgbe: deleting dfwd stations out of order can cause null ptr deref
  ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
  ...
2013-11-13 17:40:34 +09:00
Linus Torvalds
a998646456 Merge branch 'for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
 "Not too much activity this time around.  css_id is finally killed and
  a minor update to device_cgroup"

* 'for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  device_cgroup: remove can_attach
  cgroup: kill css_id
  memcg: stop using css id
  memcg: fail to create cgroup if the cgroup id is too big
  memcg: convert to use cgroup id
  memcg: convert to use cgroup_is_descendant()
2013-11-13 15:21:53 +09:00
Paul Moore
94851b18d4 Linux 3.12
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp
 V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs
 WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY
 UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK
 jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb
 LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8=
 =57MC
 -----END PGP SIGNATURE-----

Merge tag 'v3.12'

Linux 3.12
2013-11-08 13:56:38 -05:00
David Howells
fbf8c53f1a KEYS: Fix UID check in keyctl_get_persistent()
If the UID is specified by userspace when calling the KEYCTL_GET_PERSISTENT
function and the process does not have the CAP_SETUID capability, then the
function will return -EPERM if the current process's uid, suid, euid and fsuid
all match the requested UID.  This is incorrect.

Fix it such that when a non-privileged caller requests a persistent keyring by
a specific UID they can only request their own (ie. the specified UID matches
either then process's UID or the process's EUID).

This can be tested by logging in as the user and doing:

	keyctl get_persistent @p
	keyctl get_persistent @p `id -u`
	keyctl get_persistent @p 0

The first two should successfully print the same key ID.  The third should do
the same if called by UID 0 or indicate Operation Not Permitted otherwise.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Gallagher <sgallagh@redhat.com>
2013-11-06 14:01:51 +00:00
Richard Guy Briggs
a20b62bdf7 audit: suppress stock memalloc failure warnings since already managed
Supress the stock memory allocation failure warnings for audit buffers
since audit alreay takes care of memory allocation failure warnings, including
rate-limiting, in audit_log_start().

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-11-05 11:09:11 -05:00
Eric Paris
b805b198dc selinux: apply selinux checks on new audit message types
We use the read check to get the feature set (like AUDIT_GET) and the
write check to set the features (like AUDIT_SET).

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-11-05 11:07:35 -05:00
Mimi Zohar
217091dd7a ima: define '_ima' as a builtin 'trusted' keyring
Require all keys added to the IMA keyring be signed by an
existing trusted key on the system trusted keyring.

Changelog:
- define stub integrity_init_keyring() function (reported-by Fengguang Wu)
- differentiate between regular and trusted keyring names.
- replace printk with pr_info (D. Kasatkin)

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2013-10-31 20:20:48 -04:00
Mimi Zohar
bcbc9b0cf6 ima: extend the measurement list to include the file signature
This patch defines a new template called 'ima-sig', which includes
the file signature in the template data, in addition to the file's
digest and pathname.

A template is composed of a set of fields.  Associated with each
field is an initialization and display function.  This patch defines
a new template field called 'sig', the initialization function
ima_eventsig_init(), and the display function ima_show_template_sig().

This patch modifies the .field_init() function definition to include
the 'security.ima' extended attribute and length.

Changelog:
- remove unused code (Dmitry Kasatkin)
- avoid calling ima_write_template_field_data() unnecesarily (Roberto Sassu)
- rename DATA_FMT_SIG to DATA_FMT_HEX
- cleanup ima_eventsig_init() based on Roberto's comments

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
2013-10-31 20:19:35 -04:00
James Morris
42a20ba5c9 Merge branch 'keys-devel' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into ra-next 2013-10-31 09:46:36 +11:00
Wei Yongjun
d2b8697024 KEYS: fix error return code in big_key_instantiate()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
2013-10-30 12:54:29 +00:00
David Howells
034faeb9ef KEYS: Fix keyring quota misaccounting on key replacement and unlink
If a key is displaced from a keyring by a matching one, then four more bytes
of quota are allocated to the keyring - despite the fact that the keyring does
not change in size.

Further, when a key is unlinked from a keyring, the four bytes of quota
allocated the link isn't recovered and returned to the user's pool.

The first can be tested by repeating:

	keyctl add big_key a fred @s
	cat /proc/key-users

(Don't put it in a shell loop otherwise the garbage collector won't have time
to clear the displaced keys, thus affecting the result).

This was causing the kerberos keyring to run out of room fairly quickly.

The second can be tested by:

	cat /proc/key-users
	a=`keyctl add user a a @s`
	cat /proc/key-users
	keyctl unlink $a
	sleep 1 # Give RCU a chance to delete the key
	cat /proc/key-users

assuming no system activity that otherwise adds/removes keys, the amount of
key data allocated should go up (say 40/20000 -> 47/20000) and then return to
the original value at the end.

Reported-by: Stephen Gallagher <sgallagh@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2013-10-30 11:15:24 +00:00
David Howells
74792b0001 KEYS: Fix a race between negating a key and reading the error set
key_reject_and_link() marking a key as negative and setting the error with
which it was negated races with keyring searches and other things that read
that error.

The fix is to switch the order in which the assignments are done in
key_reject_and_link() and to use memory barriers.

Kudos to Dave Wysochanski <dwysocha@redhat.com> and Scott Mayhew
<smayhew@redhat.com> for tracking this down.

This may be the cause of:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
IP: [<ffffffff81219011>] wait_for_key_construction+0x31/0x80
PGD c6b2c3067 PUD c59879067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 0
Modules linked in: ...

Pid: 13359, comm: amqzxma0 Not tainted 2.6.32-358.20.1.el6.x86_64 #1 IBM System x3650 M3 -[7945PSJ]-/00J6159
RIP: 0010:[<ffffffff81219011>] wait_for_key_construction+0x31/0x80
RSP: 0018:ffff880c6ab33758  EFLAGS: 00010246
RAX: ffffffff81219080 RBX: 0000000000000000 RCX: 0000000000000002
RDX: ffffffff81219060 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880c6ab33768 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff880adfcbce40
R13: ffffffffa03afb84 R14: ffff880adfcbce40 R15: ffff880adfcbce43
FS:  00007f29b8042700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000070 CR3: 0000000c613dc000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process amqzxma0 (pid: 13359, threadinfo ffff880c6ab32000, task ffff880c610deae0)
Stack:
 ffff880adfcbce40 0000000000000000 ffff880c6ab337b8 ffffffff81219695
<d> 0000000000000000 ffff880a000000d0 ffff880c6ab337a8 000000000000000f
<d> ffffffffa03afb93 000000000000000f ffff88186c7882c0 0000000000000014
Call Trace:
 [<ffffffff81219695>] request_key+0x65/0xa0
 [<ffffffffa03a0885>] nfs_idmap_request_key+0xc5/0x170 [nfs]
 [<ffffffffa03a0eb4>] nfs_idmap_lookup_id+0x34/0x80 [nfs]
 [<ffffffffa03a1255>] nfs_map_group_to_gid+0x75/0xa0 [nfs]
 [<ffffffffa039a9ad>] decode_getfattr_attrs+0xbdd/0xfb0 [nfs]
 [<ffffffff81057310>] ? __dequeue_entity+0x30/0x50
 [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
 [<ffffffffa039ae03>] decode_getfattr+0x83/0xe0 [nfs]
 [<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs]
 [<ffffffffa039b69f>] nfs4_xdr_dec_getattr+0x8f/0xa0 [nfs]
 [<ffffffffa02dada4>] rpcauth_unwrap_resp+0x84/0xb0 [sunrpc]
 [<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs]
 [<ffffffffa02cf923>] call_decode+0x1b3/0x800 [sunrpc]
 [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50
 [<ffffffffa02cf770>] ? call_decode+0x0/0x800 [sunrpc]
 [<ffffffffa02d99a7>] __rpc_execute+0x77/0x350 [sunrpc]
 [<ffffffff81096c67>] ? bit_waitqueue+0x17/0xd0
 [<ffffffffa02d9ce1>] rpc_execute+0x61/0xa0 [sunrpc]
 [<ffffffffa02d03a5>] rpc_run_task+0x75/0x90 [sunrpc]
 [<ffffffffa02d04c2>] rpc_call_sync+0x42/0x70 [sunrpc]
 [<ffffffffa038ff80>] _nfs4_call_sync+0x30/0x40 [nfs]
 [<ffffffffa038836c>] _nfs4_proc_getattr+0xac/0xc0 [nfs]
 [<ffffffff810aac87>] ? futex_wait+0x227/0x380
 [<ffffffffa038b856>] nfs4_proc_getattr+0x56/0x80 [nfs]
 [<ffffffffa0371403>] __nfs_revalidate_inode+0xe3/0x220 [nfs]
 [<ffffffffa037158e>] nfs_revalidate_mapping+0x4e/0x170 [nfs]
 [<ffffffffa036f147>] nfs_file_read+0x77/0x130 [nfs]
 [<ffffffff811811aa>] do_sync_read+0xfa/0x140
 [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13
 [<ffffffff81228ffb>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8121bed6>] ? security_file_permission+0x16/0x20
 [<ffffffff81181a95>] vfs_read+0xb5/0x1a0
 [<ffffffff81181bd1>] sys_read+0x51/0x90
 [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Scott Mayhew <smayhew@redhat.com>
2013-10-30 11:15:24 +00:00
Josh Boyer
2eaf6b5dca KEYS: Make BIG_KEYS boolean
Having the big_keys functionality as a module is very marginally useful.
The userspace code that would use this functionality will get odd error
messages from the keys layer if the module isn't loaded.  The code itself
is fairly small, so just have this as a boolean option and not a tristate.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: David Howells <dhowells@redhat.com>
2013-10-30 11:15:23 +00:00
Oleg Nesterov
51775fe736 apparmor: remove the "task" arg from may_change_ptraced_domain()
Unless task == current ptrace_parent(task) is not safe even under
rcu_read_lock() and most of the current users are not right.

So may_change_ptraced_domain(task) looks wrong as well. However it
is always called with task == current so the code is actually fine.
Remove this argument to make this fact clear.

Note: perhaps we should simply kill ptrace_parent(), it buys almost
nothing. And it is obviously racy, perhaps this should be fixed.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-10-29 21:34:18 -07:00
John Johansen
4a7fc3018f apparmor: remove parent task info from audit logging
The reporting of the parent task info is a vestage from old versions of
apparmor. The need for this information was removed by unique null-
profiles before apparmor was upstreamed so remove this info from logging.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-10-29 21:34:04 -07:00
John Johansen
61e3fb8aca apparmor: remove tsk field from the apparmor_audit_struct
Now that aa_capabile no longer sets the task field it can be removed
and the lsm_audit version of the field can be used.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-10-29 21:33:52 -07:00
John Johansen
dd0c6e86f6 apparmor: fix capability to not use the current task, during reporting
Mediation is based off of the cred but auditing includes the current
task which may not be related to the actual request.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-10-29 21:33:37 -07:00
James Morris
50b719f811 Merge branch 'smack-for-3.13' of git://git.gitorious.org/smack-next/kernel into ra-next 2013-10-30 14:07:10 +11:00
Casey Schaufler
b5dfd8075b Smack: Ptrace access check mode
When the ptrace security hooks were split the addition of
a mode parameter was not taken advantage of in the Smack
ptrace access check. This changes the access check from
always looking for read and write access to using the
passed mode. This will make use of /proc much happier.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-10-28 10:23:36 -07:00
Dmitry Kasatkin
3ea7a56067 ima: provide hash algo info in the xattr
All files labeled with 'security.ima' hashes, are hashed using the
same hash algorithm.  Changing from one hash algorithm to another,
requires relabeling the filesystem.  This patch defines a new xattr
type, which includes the hash algorithm, permitting different files
to be hashed with different algorithms.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-26 21:32:55 -04:00
Mimi Zohar
e7a2ad7eb6 ima: enable support for larger default filedata hash algorithms
The IMA measurement list contains two hashes - a template data hash
and a filedata hash.  The template data hash is committed to the TPM,
which is limited, by the TPM v1.2 specification, to 20 bytes.  The
filedata hash is defined as 20 bytes as well.

Now that support for variable length measurement list templates was
added, the filedata hash is not limited to 20 bytes.  This patch adds
Kconfig support for defining larger default filedata hash algorithms
and replacing the builtin default with one specified on the kernel
command line.

<uapi/linux/hash_info.h> contains a list of hash algorithms.  The
Kconfig default hash algorithm is a subset of this list, but any hash
algorithm included in the list can be specified at boot, using the
'ima_hash=' kernel command line option.

Changelog v2:
- update Kconfig

Changelog:
- support hashes that are configured
- use generic HASH_ALGO_ definitions
- add Kconfig support
- hash_setup must be called only once (Dmitry)
- removed trailing whitespaces (Roberto Sassu)

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
2013-10-26 21:32:55 -04:00
Roberto Sassu
9b9d4ce592 ima: define kernel parameter 'ima_template=' to change configured default
This patch allows users to specify from the kernel command line the
template descriptor, among those defined, that will be used to generate
and display measurement entries. If an user specifies a wrong template,
IMA reverts to the template descriptor set in the kernel configuration.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-26 21:32:54 -04:00
Mimi Zohar
4286587dcc ima: add Kconfig default measurement list template
This patch adds a Kconfig option to select the default IMA
measurement list template.  The 'ima' template limited the
filedata hash to 20 bytes and the pathname to 255 charaters.
The 'ima-ng' measurement list template permits larger hash
digests and longer pathnames.

Changelog:
- keep 'select CRYPTO_HASH_INFO' in 'config IMA' section (Kconfig)
  (Roberto Sassu);
- removed trailing whitespaces (Roberto Sassu).
- Lindent fixes

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
2013-10-26 21:32:54 -04:00
Roberto Sassu
add1c05dce ima: defer determining the appraisal hash algorithm for 'ima' template
The same hash algorithm should be used for calculating the file
data hash for the IMA measurement list, as for appraising the file
data integrity.  (The appraise hash algorithm is stored in the
'security.ima' extended attribute.)  The exception is when the
reference file data hash digest, stored in the extended attribute,
is larger than the one supported by the template.  In this case,
the file data hash needs to be calculated twice, once for the
measurement list and, again, for appraisal.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-26 21:32:53 -04:00
Mimi Zohar
5278aa52f3 ima: add audit log support for larger hashes
Different files might be signed based on different hash algorithms.
This patch prefixes the audit log measurement hash with the hash
algorithm.

Changelog:
- use generic HASH_ALGO defintions
- use ':' as delimiter between the hash algorithm and the digest
  (Roberto Sassu)
- always include the hash algorithm used when audit-logging a measurement

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Peter Moody <pmoody@google.com>
2013-10-26 21:32:46 -04:00
Roberto Sassu
a71dc65d30 ima: switch to new template management mechanism
This patch performs the switch to the new template mechanism by modifying
the functions ima_alloc_init_template(), ima_measurements_show() and
ima_ascii_measurements_show(). The old function ima_template_show() was
removed as it is no longer needed. Also, if the template descriptor used
to generate a measurement entry is not 'ima', the whole length of field
data stored for an entry is provided before the data itself through the
binary_runtime_measurement interface.

Changelog:
- unnecessary to use strncmp() (Mimi Zohar)
- create new variable 'field' in ima_alloc_init_template() (Roberto Sassu)
- use GFP_NOFS flag in ima_alloc_init_template() (Roberto Sassu)
- new variable 'num_fields' in ima_store_template() (Roberto Sassu,
  proposed by Mimi Zohar)
- rename ima_calc_buffer_hash/template_hash() to ima_calc_field_array_hash(),
  something more generic (Mimi, requested by Dmitry)
- sparse error fix - Fengguang Wu
- fix lindent warnings
- always include the field length in the template data length
- include the template field length variable size in the template data length
- include both the template field data and field length in the template digest
  calculation. Simplifies verifying the template digest. (Mimi)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:06 -04:00
Roberto Sassu
4d7aeee73f ima: define new template ima-ng and template fields d-ng and n-ng
This patch adds support for the new template 'ima-ng', whose format
is defined as 'd-ng|n-ng'.  These new field definitions remove the
size limitations of the original 'ima' template.  Further, the 'd-ng'
field prefixes the inode digest with the hash algorithim, when
displaying the new larger digest sizes.

Change log:
- scripts/Lindent fixes  - Mimi
- "always true comparison" - reported by Fengguang Wu, resolved Dmitry
- initialize hash_algo variable to HASH_ALGO__LAST
- always prefix digest with hash algorithm - Mimi

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:05 -04:00
Roberto Sassu
3ce1217d6c ima: define template fields library and new helpers
This patch defines a library containing two initial template fields,
inode digest (d) and file name (n), the 'ima' template descriptor,
whose format is 'd|n', and two helper functions,
ima_write_template_field_data() and ima_show_template_field_data().

Changelog:
- replace ima_eventname_init() parameter NULL checking with BUG_ON.
  (suggested by Mimi)
- include "new template fields for inode digest (d) and file name (n)"
  definitions to fix a compiler warning.  - Mimi
- unnecessary to prefix static function names with 'ima_'. remove
  prefix to resolve Lindent formatting changes. - Mimi
- abbreviated/removed inline comments - Mimi
- always send the template field length - Mimi

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:05 -04:00
Roberto Sassu
adf53a778a ima: new templates management mechanism
The original 'ima' template is fixed length, containing the filedata hash
and pathname.  The filedata hash is limited to 20 bytes (md5/sha1).  The
pathname is a null terminated string, limited to 255 characters.  To
overcome these limitations and to add additional file metadata, it is
necessary to extend the current version of IMA by defining additional
templates.

The main reason to introduce this feature is that, each time a new
template is defined, the functions that generate and display the
measurement list would include the code for handling a new format and,
thus, would significantly grow over time.

This patch set solves this problem by separating the template management
from the remaining IMA code. The core of this solution is the definition
of two new data structures: a template descriptor, to determine which
information should be included in the measurement list, and a template
field, to generate and display data of a given type.

To define a new template field, developers define the field identifier
and implement two functions, init() and show(), respectively to generate
and display measurement entries.  Initially, this patch set defines the
following template fields (support for additional data types will be
added later):
 - 'd': the digest of the event (i.e. the digest of a measured file),
        calculated with the SHA1 or MD5 hash algorithm;
 - 'n': the name of the event (i.e. the file name), with size up to
        255 bytes;
 - 'd-ng': the digest of the event, calculated with an arbitrary hash
           algorithm (field format: [<hash algo>:]digest, where the digest
           prefix is shown only if the hash algorithm is not SHA1 or MD5);
 - 'n-ng': the name of the event, without size limitations.

Defining a new template descriptor requires specifying the template format,
a string of field identifiers separated by the '|' character.  This patch
set defines the following template descriptors:
 - "ima": its format is 'd|n';
 - "ima-ng" (default): its format is 'd-ng|n-ng'

Further details about the new template architecture can be found in
Documentation/security/IMA-templates.txt.

Changelog:
- don't defer calling ima_init_template() - Mimi
- don't define ima_lookup_template_desc() until used - Mimi
- squashed with documentation patch - Mimi

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:04 -04:00
Roberto Sassu
7bc5f447ce ima: define new function ima_alloc_init_template() to API
Instead of allocating and initializing the template entry from multiple
places (eg. boot aggregate, violation, and regular measurements), this
patch defines a new function called ima_alloc_init_template().  The new
function allocates and initializes the measurement entry with the inode
digest and the filename.

In respect to the current behavior, it truncates the file name passed
in the 'filename' argument if the latter's size is greater than 255 bytes
and the passed file descriptor is NULL.

Changelog:
- initialize 'hash' variable for non TPM case - Mimi
- conform to expectation for 'iint' to be defined as a pointer. - Mimi
- add missing 'file' dependency for recalculating file hash. - Mimi

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:04 -04:00
Roberto Sassu
9803d413f4 ima: pass the filename argument up to ima_add_template_entry()
Pass the filename argument to ima_add_template_entry() in order to
eliminate a dependency on template specific data (third argument of
integrity_audit_msg).

This change is required because, with the new template management
mechanism, the generation of a new measurement entry will be performed
by new specific functions (introduced in next patches) and the current IMA
code will not be aware anymore of how data is stored in the entry payload.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:03 -04:00
Roberto Sassu
7d802a227b ima: pass the file descriptor to ima_add_violation()
Pass the file descriptor instead of the inode to ima_add_violation(),
to make the latter consistent with ima_store_measurement() in
preparation for the new template architecture.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:02 -04:00
Dmitry Kasatkin
09ef54359c ima: ima_calc_boot_agregate must use SHA1
With multiple hash algorithms, ima_hash_tfm is no longer guaranteed to be sha1.
Need to force to use sha1.

Changelog:
- pass ima_digest_data to ima_calc_boot_aggregate() instead of char *
  (Roberto Sassu);
- create an ima_digest_data structure in ima_add_boot_aggregate()
  (Roberto Sassu);
- pass hash->algo to ima_alloc_tfm() (Roberto Sassu, reported by Dmitry).
- "move hash definition in ima_add_boot_aggregate()" commit hunk to here.
- sparse warning fix - Fengguang Wu

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:02 -04:00
Dmitry Kasatkin
ea593993d3 ima: support arbitrary hash algorithms in ima_calc_buffer_hash
ima_calc_buffer_hash will be used with different hash algorithms.
This patch provides support for arbitrary hash algorithms in
ima_calc_buffer_hash.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:01 -04:00
Dmitry Kasatkin
723326b927 ima: provide dedicated hash algo allocation function
This patch provides dedicated hash algo allocation and
deallocation function which can be used by different clients.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:01 -04:00
Mimi Zohar
140d802240 ima: differentiate between template hash and file data hash sizes
The TPM v1.2 limits the template hash size to 20 bytes.  This
patch differentiates between the template hash size, as defined
in the ima_template_entry, and the file data hash size, as
defined in the ima_template_data.  Subsequent patches add support
for different file data hash algorithms.

Change log:
- hash digest definition in ima_store_template() should be TPM_DIGEST_SIZE

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2013-10-25 17:17:00 -04:00
Dmitry Kasatkin
a35c3fb649 ima: use dynamically allocated hash storage
For each inode in the IMA policy, an iint is allocated.  To support
larger hash digests, the iint digest size changed from 20 bytes to
the maximum supported hash digest size.  Instead of allocating the
maximum size, which most likely is not needed, this patch dynamically
allocates the needed hash storage.

Changelog:
- fix krealloc bug

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:17:00 -04:00
Dmitry Kasatkin
b1aaab22e2 ima: pass full xattr with the signature
For possibility to use xattr type for new signature formats,
pass full xattr to the signature verification function.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:16:59 -04:00
Dmitry Kasatkin
d3634d0f42 ima: read and use signature hash algorithm
All files on the filesystem, currently, are hashed using the same hash
algorithm.  In preparation for files from different packages being
signed using different hash algorithms, this patch adds support for
reading the signature hash algorithm from the 'security.ima' extended
attribute and calculates the appropriate file data hash based on it.

Changelog:
- fix scripts Lindent and checkpatch msgs - Mimi
- fix md5 support for older version, which occupied 20 bytes in the
  xattr, not the expected 16 bytes.  Fix the comparison to compare
  only the first 16 bytes.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:16:59 -04:00
Dmitry Kasatkin
c7c8bb237f ima: provide support for arbitrary hash algorithms
In preparation of supporting more hash algorithms with larger hash sizes
needed for signature verification, this patch replaces the 20 byte sized
digest, with a more flexible structure.  The new structure includes the
hash algorithm, digest size, and digest.

Changelog:
- recalculate filedata hash for the measurement list, if the signature
  hash digest size is greater than 20 bytes.
- use generic HASH_ALGO_
- make ima_calc_file_hash static
- scripts lindent and checkpatch fixes

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 17:16:58 -04:00
Mimi Zohar
08de59eb14 Revert "ima: policy for RAMFS"
This reverts commit 4c2c392763.

Everything in the initramfs should be measured and appraised,
but until the initramfs has extended attribute support, at
least measured.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Cc: Stable Kernel <stable@kernel.org>
2013-10-25 13:17:19 -04:00
Dmitry Kasatkin
089bc8e95a ima: fix script messages
Fix checkpatch, lindent, etc, warnings/errors

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-10-25 13:17:19 -04:00
Serge Hallyn
73ba353471 device_cgroup: remove can_attach
It is really only wanting to duplicate a check which is already done by the
cgroup subsystem.

With this patch, user jdoe still cannot move pid 1 into a devices cgroup
he owns, but now he can move his own other tasks into devices cgroups.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Aristeu Rozanski <aris@redhat.com>
2013-10-24 06:56:56 -04:00
David S. Miller
c3fa32b976 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/qmi_wwan.c
	include/net/dst.h

Trivial merge conflicts, both were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 16:49:34 -04:00
James Morris
6f799c97f3 Merge branch 'master' of git://git.infradead.org/users/pcmoore/selinux into ra-next 2013-10-22 22:26:41 +11:00
Casey Schaufler
c0ab6e56dc Smack: Implement lock security mode
Linux file locking does not follow the same rules
as other mechanisms. Even though it is a write operation
a process can set a read lock on files which it has open
only for read access. Two programs with read access to
a file can use read locks to communicate.

This is not acceptable in a Mandatory Access Control
environment. Smack treats setting a read lock as the
write operation that it is. Unfortunately, many programs
assume that setting a read lock is a read operation.
These programs are unhappy in the Smack environment.

This patch introduces a new access mode (lock) to address
this problem. A process with lock access to a file can
set a read lock. A process with write access to a file can
set a read lock or a write lock. This prevents a situation
where processes are granted write access just so they can
set read locks.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-10-18 09:39:33 -07:00
John Johansen
ed2c7da3a4 apparmor: fix bad lock balance when introspecting policy
BugLink: http://bugs.launchpad.net/bugs/1235977

The profile introspection seq file has a locking bug when policy is viewed
from a virtual root (task in a policy namespace), introspection from the
real root is not affected.

The test for root
    while (parent) {
is correct for the real root, but incorrect for tasks in a policy namespace.
This allows the task to walk backup the policy tree past its virtual root
causing it to be unlocked before the virtual root should be in the p_stop
fn.

This results in the following lockdep back trace:
[   78.479744] [ BUG: bad unlock balance detected! ]
[   78.479792] 3.11.0-11-generic #17 Not tainted
[   78.479838] -------------------------------------
[   78.479885] grep/2223 is trying to release lock (&ns->lock) at:
[   78.479952] [<ffffffff817bf3be>] mutex_unlock+0xe/0x10
[   78.480002] but there are no more locks to release!
[   78.480037]
[   78.480037] other info that might help us debug this:
[   78.480037] 1 lock held by grep/2223:
[   78.480037]  #0:  (&p->lock){+.+.+.}, at: [<ffffffff812111bd>] seq_read+0x3d/0x3d0
[   78.480037]
[   78.480037] stack backtrace:
[   78.480037] CPU: 0 PID: 2223 Comm: grep Not tainted 3.11.0-11-generic #17
[   78.480037] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   78.480037]  ffffffff817bf3be ffff880007763d60 ffffffff817b97ef ffff8800189d2190
[   78.480037]  ffff880007763d88 ffffffff810e1c6e ffff88001f044730 ffff8800189d2190
[   78.480037]  ffffffff817bf3be ffff880007763e00 ffffffff810e5bd6 0000000724fe56b7
[   78.480037] Call Trace:
[   78.480037]  [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10
[   78.480037]  [<ffffffff817b97ef>] dump_stack+0x54/0x74
[   78.480037]  [<ffffffff810e1c6e>] print_unlock_imbalance_bug+0xee/0x100
[   78.480037]  [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10
[   78.480037]  [<ffffffff810e5bd6>] lock_release_non_nested+0x226/0x300
[   78.480037]  [<ffffffff817bf2fe>] ? __mutex_unlock_slowpath+0xce/0x180
[   78.480037]  [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10
[   78.480037]  [<ffffffff810e5d5c>] lock_release+0xac/0x310
[   78.480037]  [<ffffffff817bf2b3>] __mutex_unlock_slowpath+0x83/0x180
[   78.480037]  [<ffffffff817bf3be>] mutex_unlock+0xe/0x10
[   78.480037]  [<ffffffff81376c91>] p_stop+0x51/0x90
[   78.480037]  [<ffffffff81211408>] seq_read+0x288/0x3d0
[   78.480037]  [<ffffffff811e9d9e>] vfs_read+0x9e/0x170
[   78.480037]  [<ffffffff811ea8cc>] SyS_read+0x4c/0xa0
[   78.480037]  [<ffffffff817ccc9d>] system_call_fastpath+0x1a/0x1f

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-10-16 11:54:01 +11:00
John Johansen
5cb3e91ebd apparmor: fix memleak of the profile hash
BugLink: http://bugs.launchpad.net/bugs/1235523

This fixes the following kmemleak trace:
unreferenced object 0xffff8801e8c35680 (size 32):
  comm "apparmor_parser", pid 691, jiffies 4294895667 (age 13230.876s)
  hex dump (first 32 bytes):
    e0 d3 4e b5 ac 6d f4 ed 3f cb ee 48 1c fd 40 cf  ..N..m..?..H..@.
    5b cc e9 93 00 00 00 00 00 00 00 00 00 00 00 00  [...............
  backtrace:
    [<ffffffff817a97ee>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811ca9f3>] __kmalloc+0x103/0x290
    [<ffffffff8138acbc>] aa_calc_profile_hash+0x6c/0x150
    [<ffffffff8138074d>] aa_unpack+0x39d/0xd50
    [<ffffffff8137eced>] aa_replace_profiles+0x3d/0xd80
    [<ffffffff81376937>] profile_replace+0x37/0x50
    [<ffffffff811e9f2d>] vfs_write+0xbd/0x1e0
    [<ffffffff811ea96c>] SyS_write+0x4c/0xa0
    [<ffffffff817ccb1d>] system_call_fastpath+0x1a/0x1f
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-10-16 11:53:59 +11:00
Patrick McHardy
795aa6ef6a netfilter: pass hook ops to hookfn
Pass the hook ops to the hookfn to allow for generic hook
functions. This change is required by nf_tables.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 11:29:31 +02:00
Eric Dumazet
c2bb06db59 net: fix build errors if ipv6 is disabled
CONFIG_IPV6=n is still a valid choice ;)

It appears we can remove dead code.

Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-09 13:04:03 -04:00
Eric Dumazet
efe4208f47 ipv6: make lookups simpler and faster
TCP listener refactoring, part 4 :

To speed up inet lookups, we moved IPv4 addresses from inet to struct
sock_common

Now is time to do the same for IPv6, because it permits us to have fast
lookups for all kind of sockets, including upcoming SYN_RECV.

Getting IPv6 addresses in TCP lookups currently requires two extra cache
lines, plus a dereference (and memory stall).

inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6

This patch is way bigger than its IPv4 counter part, because for IPv4,
we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
it's not doable easily.

inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr

And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
at the same offset.

We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
macro.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-09 00:01:25 -04:00
Linus Torvalds
ab35406264 selinux: remove 'flags' parameter from avc_audit()
Now avc_audit() has no more users with that parameter. Remove it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-04 14:13:25 -07:00
Linus Torvalds
cb4fbe5703 selinux: avc_has_perm_flags has no more users
.. so get rid of it.  The only indirect users were all the
avc_has_perm() callers which just expanded to have a zero flags
argument.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-04 14:13:14 -07:00
Linus Torvalds
19e49834d2 selinux: remove 'flags' parameter from inode_has_perm
Every single user passes in '0'.  I think we had non-zero users back in
some stone age when selinux_inode_permission() was implemented in terms
of inode_has_perm(), but that complicated case got split up into a
totally separate code-path so that we could optimize the much simpler
special cases.

See commit 2e33405785 ("SELinux: delay initialization of audit data in
selinux_inode_permission") for example.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-04 12:54:11 -07:00
David S. Miller
4fbef95af4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be.h
	drivers/net/usb/qmi_wwan.c
	drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
	include/net/netfilter/nf_conntrack_synproxy.h
	include/net/secure_seq.h

The conflicts are of two varieties:

1) Conflicts with Joe Perches's 'extern' removal from header file
   function declarations.  Usually it's an argument signature change
   or a function being added/removed.  The resolutions are trivial.

2) Some overlapping changes in qmi_wwan.c and be.h, one commit adds
   a new value, another changes an existing value.  That sort of
   thing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-01 17:06:14 -04:00
Eric W. Biederman
0bbf87d852 net ipv4: Convert ipv4.ip_local_port_range to be per netns v3
- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
  of the callers.
- Move the initialization of sysctl_local_ports into
   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c

v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to todays net-next

v3:
- Compile fixes of strange callers of inet_get_local_port_range.
  This patch now successfully passes an allmodconfig build.
  Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

Originally-by: Samya <samya@twitter.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 21:59:38 -07:00
John Johansen
4cd4fc7703 apparmor: fix suspicious RCU usage warning in policy.c/policy.h
The recent 3.12 pull request for apparmor was missing a couple rcu _protected
access modifiers. Resulting in the follow suspicious RCU usage

 [   29.804534] [ INFO: suspicious RCU usage. ]
 [   29.804539] 3.11.0+ #5 Not tainted
 [   29.804541] -------------------------------
 [   29.804545] security/apparmor/include/policy.h:363 suspicious rcu_dereference_check() usage!
 [   29.804548]
 [   29.804548] other info that might help us debug this:
 [   29.804548]
 [   29.804553]
 [   29.804553] rcu_scheduler_active = 1, debug_locks = 1
 [   29.804558] 2 locks held by apparmor_parser/1268:
 [   29.804560]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
 [   29.804576]  #1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
 [   29.804589]
 [   29.804589] stack backtrace:
 [   29.804595] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
 [   29.804599] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
 [   29.804602]  0000000000000000 ffff8800b95a1d90 ffffffff8144eb9b ffff8800b94db540
 [   29.804611]  ffff8800b95a1dc0 ffffffff81087439 ffff880138cc3a18 ffff880138cc3a18
 [   29.804619]  ffff8800b9464a90 ffff880138cc3a38 ffff8800b95a1df0 ffffffff811f5084
 [   29.804628] Call Trace:
 [   29.804636]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
 [   29.804642]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
 [   29.804649]  [<ffffffff811f5084>] __aa_update_replacedby+0x53/0x7f
 [   29.804655]  [<ffffffff811f5408>] __replace_profile+0x11f/0x1ed
 [   29.804661]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
 [   29.804668]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
 [   29.804674]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
 [   29.804680]  [<ffffffff81121609>] SyS_write+0x44/0x7a
 [   29.804687]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b
 [   29.804691]
 [   29.804694] ===============================
 [   29.804697] [ INFO: suspicious RCU usage. ]
 [   29.804700] 3.11.0+ #5 Not tainted
 [   29.804703] -------------------------------
 [   29.804706] security/apparmor/policy.c:566 suspicious rcu_dereference_check() usage!
 [   29.804709]
 [   29.804709] other info that might help us debug this:
 [   29.804709]
 [   29.804714]
 [   29.804714] rcu_scheduler_active = 1, debug_locks = 1
 [   29.804718] 2 locks held by apparmor_parser/1268:
 [   29.804721]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
 [   29.804733]  #1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
 [   29.804744]
 [   29.804744] stack backtrace:
 [   29.804750] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
 [   29.804753] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
 [   29.804756]  0000000000000000 ffff8800b95a1d80 ffffffff8144eb9b ffff8800b94db540
 [   29.804764]  ffff8800b95a1db0 ffffffff81087439 ffff8800b95b02b0 0000000000000000
 [   29.804772]  ffff8800b9efba08 ffff880138cc3a38 ffff8800b95a1dd0 ffffffff811f4f94
 [   29.804779] Call Trace:
 [   29.804786]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
 [   29.804791]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
 [   29.804798]  [<ffffffff811f4f94>] aa_free_replacedby_kref+0x4d/0x62
 [   29.804804]  [<ffffffff811f4f47>] ? aa_put_namespace+0x17/0x17
 [   29.804810]  [<ffffffff811f4f0b>] kref_put+0x36/0x40
 [   29.804816]  [<ffffffff811f5423>] __replace_profile+0x13a/0x1ed
 [   29.804822]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
 [   29.804829]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
 [   29.804835]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
 [   29.804840]  [<ffffffff81121609>] SyS_write+0x44/0x7a
 [   29.804847]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b

Reported-by: miles.lane@gmail.com
CC: paulmck@linux.vnet.ibm.com
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-09-30 09:54:01 +10:00
Tyler Hicks
71ac7f6255 apparmor: Use shash crypto API interface for profile hashes
Use the shash interface, rather than the hash interface, when hashing
AppArmor profiles. The shash interface does not use scatterlists and it
is a better fit for what AppArmor needs.

This fixes a kernel paging BUG when aa_calc_profile_hash() is passed a
buffer from vmalloc(). The hash interface requires callers to handle
vmalloc() buffers differently than what AppArmor was doing. Due to
vmalloc() memory not being physically contiguous, each individual page
behind the buffer must be assigned to a scatterlist with sg_set_page()
and then the scatterlist passed to crypto_hash_update().

The shash interface does not have that limitation and allows vmalloc()
and kmalloc() buffers to be handled in the same manner.

BugLink: https://launchpad.net/bugs/1216294/
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=62261

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-09-30 09:53:59 +10:00
Paul Moore
42d64e1add selinux: correct locking in selinux_netlbl_socket_connect)
The SELinux/NetLabel glue code has a locking bug that affects systems
with NetLabel enabled, see the kernel error message below.  This patch
corrects this problem by converting the bottom half socket lock to a
more conventional, and correct for this call-path, lock_sock() call.

 ===============================
 [ INFO: suspicious RCU usage. ]
 3.11.0-rc3+ #19 Not tainted
 -------------------------------
 net/ipv4/cipso_ipv4.c:1928 suspicious rcu_dereference_protected() usage!

 other info that might help us debug this:

 rcu_scheduler_active = 1, debug_locks = 0
 2 locks held by ping/731:
  #0:  (slock-AF_INET/1){+.-...}, at: [...] selinux_netlbl_socket_connect
  #1:  (rcu_read_lock){.+.+..}, at: [<...>] netlbl_conn_setattr

 stack backtrace:
 CPU: 1 PID: 731 Comm: ping Not tainted 3.11.0-rc3+ #19
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000001 ffff88006f659d28 ffffffff81726b6a ffff88003732c500
  ffff88006f659d58 ffffffff810e4457 ffff88006b845a00 0000000000000000
  000000000000000c ffff880075aa2f50 ffff88006f659d90 ffffffff8169bec7
 Call Trace:
  [<ffffffff81726b6a>] dump_stack+0x54/0x74
  [<ffffffff810e4457>] lockdep_rcu_suspicious+0xe7/0x120
  [<ffffffff8169bec7>] cipso_v4_sock_setattr+0x187/0x1a0
  [<ffffffff8170f317>] netlbl_conn_setattr+0x187/0x190
  [<ffffffff8170f195>] ? netlbl_conn_setattr+0x5/0x190
  [<ffffffff8131ac9e>] selinux_netlbl_socket_connect+0xae/0xc0
  [<ffffffff81303025>] selinux_socket_connect+0x135/0x170
  [<ffffffff8119d127>] ? might_fault+0x57/0xb0
  [<ffffffff812fb146>] security_socket_connect+0x16/0x20
  [<ffffffff815d3ad3>] SYSC_connect+0x73/0x130
  [<ffffffff81739a85>] ? sysret_check+0x22/0x5d
  [<ffffffff810e5e2d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
  [<ffffffff81373d4e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff815d52be>] SyS_connect+0xe/0x10
  [<ffffffff81739a59>] system_call_fastpath+0x16/0x1b

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-09-26 17:00:46 -04:00
Duan Jiong
7d1db4b242 selinux: Use kmemdup instead of kmalloc + memcpy
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2013-09-26 15:52:13 -04:00
Mimi Zohar
c124bde28b KEYS: initialize root uid and session keyrings early
In order to create the integrity keyrings (eg. _evm, _ima), root's
uid and session keyrings need to be initialized early.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-25 17:17:01 +01:00
David Howells
008643b86c KEYS: Add a 'trusted' flag and a 'trusted only' flag
Add KEY_FLAG_TRUSTED to indicate that a key either comes from a trusted source
or had a cryptographic signature chain that led back to a trusted key the
kernel already possessed.

Add KEY_FLAGS_TRUSTED_ONLY to indicate that a keyring will only accept links to
keys marked with KEY_FLAGS_TRUSTED.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2013-09-25 17:17:01 +01:00
David Howells
f36f8c75ae KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches
Add support for per-user_namespace registers of persistent per-UID kerberos
caches held within the kernel.

This allows the kerberos cache to be retained beyond the life of all a user's
processes so that the user's cron jobs can work.

The kerberos cache is envisioned as a keyring/key tree looking something like:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 big_key	- A ccache blob
			\___ tkt12345 big_key	- Another ccache blob

Or possibly:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 keyring	- A ccache
				\___ krbtgt/REDHAT.COM@REDHAT.COM big_key
				\___ http/REDHAT.COM@REDHAT.COM user
				\___ afs/REDHAT.COM@REDHAT.COM user
				\___ nfs/REDHAT.COM@REDHAT.COM user
				\___ krbtgt/KERNEL.ORG@KERNEL.ORG big_key
				\___ http/KERNEL.ORG@KERNEL.ORG big_key

What goes into a particular Kerberos cache is entirely up to userspace.  Kernel
support is limited to giving you the Kerberos cache keyring that you want.

The user asks for their Kerberos cache by:

	krb_cache = keyctl_get_krbcache(uid, dest_keyring);

The uid is -1 or the user's own UID for the user's own cache or the uid of some
other user's cache (requires CAP_SETUID).  This permits rpc.gssd or whatever to
mess with the cache.

The cache returned is a keyring named "_krb.<uid>" that the possessor can read,
search, clear, invalidate, unlink from and add links to.  Active LSMs get a
chance to rule on whether the caller is permitted to make a link.

Each uid's cache keyring is created when it first accessed and is given a
timeout that is extended each time this function is called so that the keyring
goes away after a while.  The timeout is configurable by sysctl but defaults to
three days.

Each user_namespace struct gets a lazily-created keyring that serves as the
register.  The cache keyrings are added to it.  This means that standard key
search and garbage collection facilities are available.

The user_namespace struct's register goes away when it does and anything left
in it is then automatically gc'd.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Simo Sorce <simo@redhat.com>
cc: Serge E. Hallyn <serge.hallyn@ubuntu.com>
cc: Eric W. Biederman <ebiederm@xmission.com>
2013-09-24 10:35:19 +01:00
David Howells
ab3c3587f8 KEYS: Implement a big key type that can save to tmpfs
Implement a big key type that can save its contents to tmpfs and thus
swapspace when memory is tight.  This is useful for Kerberos ticket caches.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Simo Sorce <simo@redhat.com>
2013-09-24 10:35:18 +01:00
David Howells
b2a4df200d KEYS: Expand the capacity of a keyring
Expand the capacity of a keyring to be able to hold a lot more keys by using
the previously added associative array implementation.  Currently the maximum
capacity is:

	(PAGE_SIZE - sizeof(header)) / sizeof(struct key *)

which, on a 64-bit system, is a little more 500.  However, since this is being
used for the NFS uid mapper, we need more than that.  The new implementation
gives us effectively unlimited capacity.

With some alterations, the keyutils testsuite runs successfully to completion
after this patch is applied.  The alterations are because (a) keyrings that
are simply added to no longer appear ordered and (b) some of the errors have
changed a bit.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:18 +01:00
David Howells
e57e8669f2 KEYS: Drop the permissions argument from __keyring_search_one()
Drop the permissions argument from __keyring_search_one() as the only caller
passes 0 here - which causes all checks to be skipped.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:17 +01:00
David Howells
ccc3e6d9c9 KEYS: Define a __key_get() wrapper to use rather than atomic_inc()
Define a __key_get() wrapper to use rather than atomic_inc() on the key usage
count as this makes it easier to hook in refcount error debugging.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:16 +01:00
David Howells
d0a059cac6 KEYS: Search for auth-key by name rather than target key ID
Search for auth-key by name rather than by target key ID as, in a future
patch, we'll by searching directly by index key in preference to iteration
over all keys.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:16 +01:00
David Howells
4bdf0bc300 KEYS: Introduce a search context structure
Search functions pass around a bunch of arguments, each of which gets copied
with each call.  Introduce a search context structure to hold these.

Whilst we're at it, create a search flag that indicates whether the search
should be directly to the description or whether it should iterate through all
keys looking for a non-description match.

This will be useful when keyrings use a generic data struct with generic
routines to manage their content as the search terms can just be passed
through to the iterator callback function.

Also, for future use, the data to be supplied to the match function is
separated from the description pointer in the search context.  This makes it
clear which is being supplied.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:15 +01:00
David Howells
16feef4340 KEYS: Consolidate the concept of an 'index key' for key access
Consolidate the concept of an 'index key' for accessing keys.  The index key
is the search term needed to find a key directly - basically the key type and
the key description.  We can add to that the description length.

This will be useful when turning a keyring into an associative array rather
than just a pointer block.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:15 +01:00
David Howells
7e55ca6dcd KEYS: key_is_dead() should take a const key pointer argument
key_is_dead() should take a const key pointer argument as it doesn't modify
what it points to.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:14 +01:00
David Howells
a5b4bd2874 KEYS: Use bool in make_key_ref() and is_key_possessed()
Make make_key_ref() take a bool possession parameter and make
is_key_possessed() return a bool.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:14 +01:00
David Howells
61ea0c0ba9 KEYS: Skip key state checks when checking for possession
Skip key state checks (invalidation, revocation and expiration) when checking
for possession.  Without this, keys that have been marked invalid, revoked
keys and expired keys are not given a possession attribute - which means the
possessor is not granted any possession permits and cannot do anything with
them unless they also have one a user, group or other permit.

This causes failures in the keyutils test suite's revocation and expiration
tests now that commit 96b5c8fea6 reduced the
initial permissions granted to a key.

The failures are due to accesses to revoked and expired keys being given
EACCES instead of EKEYREVOKED or EKEYEXPIRED.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:13 +01:00
Eric Paris
a3c9e45d18 security: remove erroneous comment about capabilities.o link ordering
Back when we had half ass LSM stacking we had to link capabilities.o
after bigger LSMs so that on initialization the bigger LSM would
register first and the capabilities module would be the one stacked as
the 'seconday'.  Somewhere around 6f0f0fd496 (back in 2008) we
finally removed the last of the kinda module stacking code but this
comment in the makefile still lives today.

Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-09-24 11:26:28 +10:00
Paul Moore
98f700f317 Merge git://git.infradead.org/users/eparis/selinux
Conflicts:
	security/selinux/hooks.c

Pull Eric's existing SELinux tree as there are a number of patches in
there that are not yet upstream.  There was some minor fixup needed to
resolve a conflict in security/selinux/hooks.c:selinux_set_mnt_opts()
between the labeled NFS patches and Eric's security_fs_use()
simplification patch.
2013-09-18 13:52:20 -04:00
Linus Torvalds
c7c4591db6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace changes from Eric Biederman:
 "This is an assorted mishmash of small cleanups, enhancements and bug
  fixes.

  The major theme is user namespace mount restrictions.  nsown_capable
  is killed as it encourages not thinking about details that need to be
  considered.  A very hard to hit pid namespace exiting bug was finally
  tracked and fixed.  A couple of cleanups to the basic namespace
  infrastructure.

  Finally there is an enhancement that makes per user namespace
  capabilities usable as capabilities, and an enhancement that allows
  the per userns root to nice other processes in the user namespace"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns:  Kill nsown_capable it makes the wrong thing easy
  capabilities: allow nice if we are privileged
  pidns: Don't have unshare(CLONE_NEWPID) imply CLONE_THREAD
  userns: Allow PR_CAPBSET_DROP in a user namespace.
  namespaces: Simplify copy_namespaces so it is clear what is going on.
  pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup
  sysfs: Restrict mounting sysfs
  userns: Better restrictions on when proc and sysfs can be mounted
  vfs: Don't copy mount bind mounts of /proc/<pid>/ns/mnt between namespaces
  kernel/nsproxy.c: Improving a snippet of code.
  proc: Restrict mounting the proc filesystem
  vfs: Lock in place mounts from more privileged users
2013-09-07 14:35:32 -07:00
Linus Torvalds
11c7b03d42 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Nothing major for this kernel, just maintenance updates"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
  apparmor: add the ability to report a sha1 hash of loaded policy
  apparmor: export set of capabilities supported by the apparmor module
  apparmor: add the profile introspection file to interface
  apparmor: add an optional profile attachment string for profiles
  apparmor: add interface files for profiles and namespaces
  apparmor: allow setting any profile into the unconfined state
  apparmor: make free_profile available outside of policy.c
  apparmor: rework namespace free path
  apparmor: update how unconfined is handled
  apparmor: change how profile replacement update is done
  apparmor: convert profile lists to RCU based locking
  apparmor: provide base for multiple profiles to be replaced at once
  apparmor: add a features/policy dir to interface
  apparmor: enable users to query whether apparmor is enabled
  apparmor: remove minimum size check for vmalloc()
  Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes
  Smack: network label match fix
  security: smack: add a hash table to quicken smk_find_entry()
  security: smack: fix memleak in smk_write_rules_list()
  xattr: Constify ->name member of "struct xattr".
  ...
2013-09-07 14:34:07 -07:00
Linus Torvalds
cc998ff881 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David Miller:
 "Noteworthy changes this time around:

   1) Multicast rejoin support for team driver, from Jiri Pirko.

   2) Centralize and simplify TCP RTT measurement handling in order to
      reduce the impact of bad RTO seeding from SYN/ACKs.  Also, when
      both timestamps and local RTT measurements are available prefer
      the later because there are broken middleware devices which
      scramble the timestamp.

      From Yuchung Cheng.

   3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
      memory consumed to queue up unsend user data.  From Eric Dumazet.

   4) Add a "physical port ID" abstraction for network devices, from
      Jiri Pirko.

   5) Add a "suppress" operation to influence fib_rules lookups, from
      Stefan Tomanek.

   6) Add a networking development FAQ, from Paul Gortmaker.

   7) Extend the information provided by tcp_probe and add ipv6 support,
      from Daniel Borkmann.

   8) Use RCU locking more extensively in openvswitch data paths, from
      Pravin B Shelar.

   9) Add SCTP support to openvswitch, from Joe Stringer.

  10) Add EF10 chip support to SFC driver, from Ben Hutchings.

  11) Add new SYNPROXY netfilter target, from Patrick McHardy.

  12) Compute a rate approximation for sending in TCP sockets, and use
      this to more intelligently coalesce TSO frames.  Furthermore, add
      a new packet scheduler which takes advantage of this estimate when
      available.  From Eric Dumazet.

  13) Allow AF_PACKET fanouts with random selection, from Daniel
      Borkmann.

  14) Add ipv6 support to vxlan driver, from Cong Wang"

Resolved conflicts as per discussion.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
  openvswitch: Fix alignment of struct sw_flow_key.
  netfilter: Fix build errors with xt_socket.c
  tcp: Add missing braces to do_tcp_setsockopt
  caif: Add missing braces to multiline if in cfctrl_linkup_request
  bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
  vxlan: Fix kernel panic on device delete.
  net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
  net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
  icplus: Use netif_running to determine device state
  ethernet/arc/arc_emac: Fix huge delays in large file copies
  tuntap: orphan frags before trying to set tx timestamp
  tuntap: purge socket error queue on detach
  qlcnic: use standard NAPI weights
  ipv6:introduce function to find route for redirect
  bnx2x: VF RSS support - VF side
  bnx2x: VF RSS support - PF side
  vxlan: Notify drivers for listening UDP port changes
  net: usbnet: update addr_assign_type if appropriate
  driver/net: enic: update enic maintainers and driver
  driver/net: enic: Exposing symbols for Cisco's low latency driver
  ...
2013-09-05 14:54:29 -07:00
Linus Torvalds
3398d252a4 Minor fixes mainly, including a potential use-after-free on remove found by
CONFIG_DEBUG_KOBJECT_RELEASE which may be theoretical.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJSJsL7AAoJENkgDmzRrbjx/N0P+wU81L55uYdJnDVPsW/fzhyJ
 BEZtmhFWtqxHeuFMQpwm5N3/ORwMht4czF4Hf957ePtI2PelHu+kapnFFQ+/KHZs
 T5sjH0mAYf9+CPa8wj4OKVzvd8lH4udbV+E7INVfciDySRX5HKXynJ6pZHvfeu7y
 /2q2PH6kGzuGoTEkDwOOwJ5yyZqs+RW9ZkSMStgCOUE8GmoDXEsH1KwIYE/9buCh
 XTngMo7AhimQaQ9QKypJLjlcnI9X/9ljXqFRKqSFOeMA1Ba+h+7eUqd4FJI6jDJu
 tecMOxX9PezPK6Wdg8V7AFBSzOhDPqoKQBOcaqeLd1wVICi8oQirVzwQNlsoiVNu
 JC+8rDqaeuG3dazROhaAnez7nhHfTjnMsYLVMRUmYtqXetd0qWYSmlmcJRKSJwi4
 okl/Lv5BroQdB9bB8+sc7l34nE3HXZGV3tJcNXf91NNEbDt97xFZ2YYbdtsQcOqj
 igUHcjsZq1gLsdIhnkHjTGLkLPxMTdCi8mtUc9+uzXHvSPJqEUMcn+fEA7RdliUw
 /WvpUX2tj2Al44LfBy6L0D6AVyS2/zIQ9PzH6FHsgVDqrNHomkF20w3btnM3yyPA
 hakV6vPr+kOpfSJYlTSU7yhEJh+LGkfPXeaX4X3tqubKhsZjq8rS7vrfbcqmgwvT
 DbzDKRx5R3URiiFSbb2v
 =15lp
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Minor fixes mainly, including a potential use-after-free on remove
  found by CONFIG_DEBUG_KOBJECT_RELEASE which may be theoretical"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: Fix mod->mkobj.kobj potentially freed too early
  kernel/params.c: use scnprintf() instead of sprintf()
  kernel/module.c: use scnprintf() instead of sprintf()
  module/lsm: Have apparmor module parameters work with no args
  module: Add NOARG flag for ops with param_set_bool_enable_only() set function
  module: Add flag to allow mod params to have no arguments
  modules: add support for soft module dependencies
  scripts/mod/modpost.c: permit '.cranges' secton for sh64 architecture.
  module: fix sprintf format specifier in param_get_byte()
2013-09-04 17:34:29 -07:00
Linus Torvalds
32dad03d16 Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "A lot of activities on the cgroup front.  Most changes aren't visible
  to userland at all at this point and are laying foundation for the
  planned unified hierarchy.

   - The biggest change is decoupling the lifetime management of css
     (cgroup_subsys_state) from that of cgroup's.  Because controllers
     (cpu, memory, block and so on) will need to be dynamically enabled
     and disabled, css which is the association point between a cgroup
     and a controller may come and go dynamically across the lifetime of
     a cgroup.  Till now, css's were created when the associated cgroup
     was created and stayed till the cgroup got destroyed.

     Assumptions around this tight coupling permeated through cgroup
     core and controllers.  These assumptions are gradually removed,
     which consists bulk of patches, and css destruction path is
     completely decoupled from cgroup destruction path.  Note that
     decoupling of creation path is relatively easy on top of these
     changes and the patchset is pending for the next window.

   - cgroup has its own event mechanism cgroup.event_control, which is
     only used by memcg.  It is overly complex trying to achieve high
     flexibility whose benefits seem dubious at best.  Going forward,
     new events will simply generate file modified event and the
     existing mechanism is being made specific to memcg.  This pull
     request contains prepatory patches for such change.

   - Various fixes and cleanups"

Fixed up conflict in kernel/cgroup.c as per Tejun.

* 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits)
  cgroup: fix cgroup_css() invocation in css_from_id()
  cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp()
  cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup
  cgroup: implement CFTYPE_NO_PREFIX
  cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys
  cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax
  cgroup: fix cgroup_write_event_control()
  cgroup: fix subsystem file accesses on the root cgroup
  cgroup: change cgroup_from_id() to css_from_id()
  cgroup: use css_get() in cgroup_create() to check CSS_ROOT
  cpuset: remove an unncessary forward declaration
  cgroup: RCU protect each cgroup_subsys_state release
  cgroup: move subsys file removal to kill_css()
  cgroup: factor out kill_css()
  cgroup: decouple cgroup_subsys_state destruction from cgroup destruction
  cgroup: replace cgroup->css_kill_cnt with ->nr_css
  cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item
  cgroup: move cgroup->subsys[] assignment to online_css()
  cgroup: reorganize css init / exit paths
  cgroup: add __rcu modifier to cgroup->subsys[]
  ...
2013-09-03 18:25:03 -07:00
Serge Hallyn
f54fb863c6 capabilities: allow nice if we are privileged
We allow task A to change B's nice level if it has a supserset of
B's privileges, or of it has CAP_SYS_NICE.  Also allow it if A has
CAP_SYS_NICE with respect to B - meaning it is root in the same
namespace, or it created B's namespace.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2013-08-30 23:44:09 -07:00
Eric W. Biederman
160da84dbb userns: Allow PR_CAPBSET_DROP in a user namespace.
As the capabilites and capability bounding set are per user namespace
properties it is safe to allow changing them with just CAP_SETPCAP
permission in the user namespace.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-08-30 17:30:39 -07:00
Eric Paris
0b4bdb3573 Revert "SELinux: do not handle seclabel as a special flag"
This reverts commit 308ab70c46.

It breaks my FC6 test box.  /dev/pts is not mounted.  dmesg says

SELinux: mount invalid.  Same superblock, different security settings
for (dev devpts, type devpts)

Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-08-28 14:45:21 -04:00
Anand Avati
102aefdda4 selinux: consider filesystem subtype in policies
Not considering sub filesystem has the following limitation. Support
for SELinux in FUSE is dependent on the particular userspace
filesystem, which is identified by the subtype. For e.g, GlusterFS,
a FUSE based filesystem supports SELinux (by mounting and processing
FUSE requests in different threads, avoiding the mount time
deadlock), whereas other FUSE based filesystems (identified by a
different subtype) have the mount time deadlock.

By considering the subtype of the filesytem in the SELinux policies,
allows us to specify a filesystem subtype, in the following way:

fs_use_xattr fuse.glusterfs gen_context(system_u:object_r:fs_t,s0);

This way not all FUSE filesystems are put in the same bucket and
subjected to the limitations of the other subtypes.

Signed-off-by: Anand Avati <avati@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-08-28 14:44:52 -04:00
James Morris
7320336146 Merge branch 'smack-for-3.12' of git://git.gitorious.org/smack-next/kernel into ra-next 2013-08-23 02:50:12 +10:00
Steven Rostedt
5265fc6219 module/lsm: Have apparmor module parameters work with no args
The apparmor module parameters for param_ops_aabool and
param_ops_aalockpolicy are both based off of the param_ops_bool,
and can handle a NULL value passed in as val. Have it enable the
new KERNEL_PARAM_FL_NOARGS flag to allow the parameters to be set
without having to state "=y" or "=1".

Cc: John Johansen <john.johansen@canonical.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-08-20 15:37:44 +09:30
David S. Miller
2ff1cf12c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-08-16 15:37:26 -07:00
John Johansen
f8eb8a1324 apparmor: add the ability to report a sha1 hash of loaded policy
Provide userspace the ability to introspect a sha1 hash value for each
profile currently loaded.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:08 -07:00
John Johansen
84f1f78742 apparmor: export set of capabilities supported by the apparmor module
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:07 -07:00
John Johansen
29b3822f1e apparmor: add the profile introspection file to interface
Add the dynamic namespace relative profiles file to the interace, to allow
introspection of loaded profiles and their modes.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2013-08-14 11:42:07 -07:00
John Johansen
556d0be74b apparmor: add an optional profile attachment string for profiles
Add the ability to take in and report a human readable profile attachment
string for profiles so that attachment specifications can be easily
inspected.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:07 -07:00
John Johansen
0d259f043f apparmor: add interface files for profiles and namespaces
Add basic interface files to access namespace and profile information.
The interface files are created when a profile is loaded and removed
when the profile or namespace is removed.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:07 -07:00
John Johansen
038165070a apparmor: allow setting any profile into the unconfined state
Allow emulating the default profile behavior from boot, by allowing
loading of a profile in the unconfined state into a new NS.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:07 -07:00
John Johansen
8651e1d657 apparmor: make free_profile available outside of policy.c
Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
742058b0f3 apparmor: rework namespace free path
namespaces now completely use the unconfined profile to track the
refcount and rcu freeing cycle. So rework the code to simplify (track
everything through the profile path right up to the end), and move the
rcu_head from policy base to profile as the namespace no longer needs
it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
fa2ac468db apparmor: update how unconfined is handled
ns->unconfined is being used read side without locking, nor rcu but is
being updated when a namespace is removed. This works for the root ns
which is never removed but has a race window and can cause failures when
children namespaces are removed.

Also ns and ns->unconfined have a circular refcounting dependency that
is problematic and must be broken. Currently this is done incorrectly
when the namespace is destroyed.

Fix this by forward referencing unconfined via the replacedby infrastructure
instead of directly updating the ns->unconfined pointer.

Remove the circular refcount dependency by making the ns and its unconfined
profile share the same refcount.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
77b071b340 apparmor: change how profile replacement update is done
remove the use of replaced by chaining and move to profile invalidation
and lookup to handle task replacement.

Replacement chaining can result in large chains of profiles being pinned
in memory when one profile in the chain is use. With implicit labeling
this will be even more of a problem, so move to a direct lookup method.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
01e2b670aa apparmor: convert profile lists to RCU based locking
Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
dd51c84857 apparmor: provide base for multiple profiles to be replaced at once
previously profiles had to be loaded one at a time, which could result
in cases where a replacement of a set would partially succeed, and then fail
resulting in inconsistent policy.

Allow multiple profiles to replaced "atomically" so that the replacement
either succeeds or fails for the entire set of profiles.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
9d910a3bc0 apparmor: add a features/policy dir to interface
Add a policy directory to features to contain features that can affect
policy compilation but do not affect mediation. Eg of such features would
be types of dfa compression supported, etc.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2013-08-14 11:42:05 -07:00
John Johansen
c611616cd3 apparmor: enable users to query whether apparmor is enabled
Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:05 -07:00
Tetsuo Handa
dfe4ac28be apparmor: remove minimum size check for vmalloc()
This is a follow-up to commit b5b3ee6c "apparmor: no need to delay vfree()".

Since vmalloc() will do "size = PAGE_ALIGN(size);",
we don't need to check for "size >= sizeof(struct work_struct)".

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:05 -07:00
Rafal Krypa
10289b0f73 Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes
Smack interface for loading rules has always parsed only single rule from
data written to it. This requires user program to call one write() per
each rule it wants to load.
This change makes it possible to write multiple rules, separated by new
line character. Smack will load at most PAGE_SIZE-1 characters and properly
return number of processed bytes. In case when user buffer is larger, it
will be additionally truncated. All characters after last \n will not get
parsed to avoid partial rule near input buffer boundary.

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2013-08-12 11:51:40 -07:00
Tejun Heo
bd8815a6d8 cgroup: make css_for_each_descendant() and friends include the origin css in the iteration
Previously, all css descendant iterators didn't include the origin
(root of subtree) css in the iteration.  The reasons were maintaining
consistency with css_for_each_child() and that at the time of
introduction more use cases needed skipping the origin anyway;
however, given that css_is_descendant() considers self to be a
descendant, omitting the origin css has become more confusing and
looking at the accumulated use cases rather clearly indicates that
including origin would result in simpler code overall.

While this is a change which can easily lead to subtle bugs, cgroup
API including the iterators has recently gone through major
restructuring and no out-of-tree changes will be applicable without
adjustments making this a relatively acceptable opportunity for this
type of change.

The conversions are mostly straight-forward.  If the iteration block
had explicit origin handling before or after, it's moved inside the
iteration.  If not, if (pos == origin) continue; is added.  Some
conversions add extra reference get/put around origin handling by
consolidating origin handling and the rest.  While the extra ref
operations aren't strictly necessary, this shouldn't cause any
noticeable difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
2013-08-08 20:11:27 -04:00
Tejun Heo
492eb21b98 cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup
cgroup is currently in the process of transitioning to using css
(cgroup_subsys_state) as the primary handle instead of cgroup in
subsystem API.  For hierarchy iterators, this is beneficial because

* In most cases, css is the only thing subsystems care about anyway.

* On the planned unified hierarchy, iterations for different
  subsystems will need to skip over different subtrees of the
  hierarchy depending on which subsystems are enabled on each cgroup.
  Passing around css makes it unnecessary to explicitly specify the
  subsystem in question as css is intersection between cgroup and
  subsystem

* For the planned unified hierarchy, css's would need to be created
  and destroyed dynamically independent from cgroup hierarchy.  Having
  cgroup core manage css iteration makes enforcing deref rules a lot
  easier.

Most subsystem conversions are straight-forward.  Noteworthy changes
are

* blkio: cgroup_to_blkcg() is no longer used.  Removed.

* freezer: cgroup_freezer() is no longer used.  Removed.

* devices: cgroup_to_devcgroup() is no longer used.  Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Jens Axboe <axboe@kernel.dk>
2013-08-08 20:11:25 -04:00
Tejun Heo
182446d087 cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
cgroup is currently in the process of transitioning to using struct
cgroup_subsys_state * as the primary handle instead of struct cgroup.
Please see the previous commit which converts the subsystem methods
for rationale.

This patch converts all cftype file operations to take @css instead of
@cgroup.  cftypes for the cgroup core files don't have their subsytem
pointer set.  These will automatically use the dummy_css added by the
previous patch and can be converted the same way.

Most subsystem conversions are straight forwards but there are some
interesting ones.

* freezer: update_if_frozen() is also converted to take @css instead
  of @cgroup for consistency.  This will make the code look simpler
  too once iterators are converted to use css.

* memory/vmpressure: mem_cgroup_from_css() needs to be exported to
  vmpressure while mem_cgroup_from_cont() can be made static.
  Updated accordingly.

* cpu: cgroup_tg() doesn't have any user left.  Removed.

* cpuacct: cgroup_ca() doesn't have any user left.  Removed.

* hugetlb: hugetlb_cgroup_form_cgroup() doesn't have any user left.
  Removed.

* net_cls: cgrp_cls_state() doesn't have any user left.  Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
2013-08-08 20:11:24 -04:00
Tejun Heo
eb95419b02 cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
cgroup is currently in the process of transitioning to using struct
cgroup_subsys_state * as the primary handle instead of struct cgroup *
in subsystem implementations for the following reasons.

* With unified hierarchy, subsystems will be dynamically bound and
  unbound from cgroups and thus css's (cgroup_subsys_state) may be
  created and destroyed dynamically over the lifetime of a cgroup,
  which is different from the current state where all css's are
  allocated and destroyed together with the associated cgroup.  This
  in turn means that cgroup_css() should be synchronized and may
  return NULL, making it more cumbersome to use.

* Differing levels of per-subsystem granularity in the unified
  hierarchy means that the task and descendant iterators should behave
  differently depending on the specific subsystem the iteration is
  being performed for.

* In majority of the cases, subsystems only care about its part in the
  cgroup hierarchy - ie. the hierarchy of css's.  Subsystem methods
  often obtain the matching css pointer from the cgroup and don't
  bother with the cgroup pointer itself.  Passing around css fits
  much better.

This patch converts all cgroup_subsys methods to take @css instead of
@cgroup.  The conversions are mostly straight-forward.  A few
noteworthy changes are

* ->css_alloc() now takes css of the parent cgroup rather than the
  pointer to the new cgroup as the css for the new cgroup doesn't
  exist yet.  Knowing the parent css is enough for all the existing
  subsystems.

* In kernel/cgroup.c::offline_css(), unnecessary open coded css
  dereference is replaced with local variable access.

This patch shouldn't cause any behavior differences.

v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced
    with local variable @css as suggested by Li Zefan.

    Rebased on top of new for-3.12 which includes for-3.11-fixes so
    that ->css_free() invocation added by da0a12caff ("cgroup: fix a
    leak when percpu_ref_init() fails") is converted too.  Suggested
    by Li Zefan.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
2013-08-08 20:11:23 -04:00
Tejun Heo
6387698699 cgroup: add css_parent()
Currently, controllers have to explicitly follow the cgroup hierarchy
to find the parent of a given css.  cgroup is moving towards using
cgroup_subsys_state as the main controller interface construct, so
let's provide a way to climb the hierarchy using just csses.

This patch implements css_parent() which, given a css, returns its
parent.  The function is guarnateed to valid non-NULL parent css as
long as the target css is not at the top of the hierarchy.

freezer, cpuset, cpu, cpuacct, hugetlb, memory, net_cls and devices
are converted to use css_parent() instead of accessing cgroup->parent
directly.

* __parent_ca() is dropped from cpuacct and its usage is replaced with
  parent_ca().  The only difference between the two was NULL test on
  cgroup->parent which is now embedded in css_parent() making the
  distinction moot.  Note that eventually a css->parent field will be
  added to css and the NULL check in css_parent() will go away.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-08-08 20:11:23 -04:00
Tejun Heo
a7c6d554aa cgroup: add/update accessors which obtain subsys specific data from css
css (cgroup_subsys_state) is usually embedded in a subsys specific
data structure.  Subsystems either use container_of() directly to cast
from css to such data structure or has an accessor function wrapping
such cast.  As cgroup as whole is moving towards using css as the main
interface handle, add and update such accessors to ease dealing with
css's.

All accessors explicitly handle NULL input and return NULL in those
cases.  While this looks like an extra branch in the code, as all
controllers specific data structures have css as the first field, the
casting doesn't involve any offsetting and the compiler can trivially
optimize out the branch.

* blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such
  accessor.  Added.

* memory, hugetlb and devices already had one but didn't explicitly
  handle NULL input.  Updated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-08-08 20:11:23 -04:00
Tejun Heo
8af01f56a0 cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/
The names of the two struct cgroup_subsys_state accessors -
cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
The former clashes with the type name and the latter doesn't even
indicate it's somehow related to cgroup.

We're about to revamp large portion of cgroup API, so, let's rename
them so that they're less awkward.  Most per-controller usages of the
accessors are localized in accessor wrappers and given the amount of
scheduled changes, this isn't gonna add any noticeable headache.

Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
to task_css().  This patch is pure rename.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2013-08-08 20:11:22 -04:00
Casey Schaufler
6ea062475a Smack: IPv6 casting error fix for 3.11
The original implementation of the Smack IPv6 port based
local controls works most of the time using a sockaddr as
a temporary variable, but not always as it overflows in
some circumstances. The correct data is a sockaddr_in6.
A struct sockaddr isn't as large as a struct sockaddr_in6.
There would need to be casting one way or the other. This
patch gets it the right way.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-08-06 20:53:54 +10:00
Casey Schaufler
677264e8fb Smack: network label match fix
The Smack code that matches incoming CIPSO tags with Smack labels
reaches through the NetLabel interfaces and compares the network
data with the CIPSO header associated with a Smack label. This was
done in a ill advised attempt to optimize performance. It works
so long as the categories fit in a single capset, but this isn't
always the case.

This patch changes the Smack code to use the appropriate NetLabel
interfaces to compare the incoming CIPSO header with the CIPSO
header associated with a label. It will always match the CIPSO
headers correctly.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-08-01 20:04:02 -07:00
Tomasz Stanislawski
4d7cf4a1f4 security: smack: add a hash table to quicken smk_find_entry()
Accepted for the smack-next tree after changing the number of
slots from 128 to 16.

This patch adds a hash table to quicken searching of a smack label by its name.

Basically, the patch improves performance of SMACK initialization.  Parsing of
rules involves translation from a string to a smack_known (aka label) entity
which is done in smk_find_entry().

The current implementation of the function iterates over a global list of
smack_known resulting in O(N) complexity for smk_find_entry().  The total
complexity of SMACK initialization becomes O(rules * labels).  Therefore it
scales quadratically with a complexity of a system.

Applying the patch reduced the complexity of smk_find_entry() to O(1) as long
as number of label is in hundreds. If the number of labels is increased please
update SMACK_HASH_SLOTS constant defined in security/smack/smack.h. Introducing
the configuration of this constant with Kconfig or cmdline might be a good
idea.

The size of the hash table was adjusted experimentally.  The rule set used by
TIZEN contains circa 17K rules for 500 labels.  The table above contains
results of SMACK initialization using 'time smackctl apply' bash command.
The 'Ref' is a kernel without this patch applied. The consecutive values
refers to value of SMACK_HASH_SLOTS.  Every measurement was repeated three
times to reduce noise.

     |  Ref  |   1   |   2   |   4   |   8   |   16  |   32  |   64  |  128  |  256  |  512
--------------------------------------------------------------------------------------------
Run1 | 1.156 | 1.096 | 0.883 | 0.764 | 0.692 | 0.667 | 0.649 | 0.633 | 0.634 | 0.629 | 0.620
Run2 | 1.156 | 1.111 | 0.885 | 0.764 | 0.694 | 0.661 | 0.649 | 0.651 | 0.634 | 0.638 | 0.623
Run3 | 1.160 | 1.107 | 0.886 | 0.764 | 0.694 | 0.671 | 0.661 | 0.638 | 0.631 | 0.624 | 0.638
AVG  | 1.157 | 1.105 | 0.885 | 0.764 | 0.693 | 0.666 | 0.653 | 0.641 | 0.633 | 0.630 | 0.627

Surprisingly, a single hlist is slightly faster than a double-linked list.
The speed-up saturates near 64 slots.  Therefore I chose value 128 to provide
some margin if more labels were used.
It looks that IO becomes a new bottleneck.

Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
2013-08-01 16:55:20 -07:00
Tomasz Stanislawski
470043ba99 security: smack: fix memleak in smk_write_rules_list()
The smack_parsed_rule structure is allocated.  If a rule is successfully
installed then the last reference to the object is lost.  This patch fixes this
leak. Moreover smack_parsed_rule is allocated on stack because it no longer
needed ofter smk_write_rules_list() is finished.

Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
2013-08-01 12:57:24 -07:00
fan.du
ca4c3fc24e net: split rt_genid for ipv4 and ipv6
Current net name space has only one genid for both IPv4 and IPv6, it has below
drawbacks:

- Add/delete an IPv4 address will invalidate all IPv6 routing table entries.
- Insert/remove XFRM policy will also invalidate both IPv4/IPv6 routing table
  entries even when the policy is only applied for one address family.

Thus, this patch attempt to split one genid for two to cater for IPv4 and IPv6
separately in a fine granularity.

Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-31 14:56:36 -07:00
Chris PeBenito
2be4d74f2f Add SELinux policy capability for always checking packet and peer classes.
Currently the packet class in SELinux is not checked if there are no
SECMARK rules in the security or mangle netfilter tables.  Some systems
prefer that packets are always checked, for example, to protect the system
should the netfilter rules fail to load or if the nefilter rules
were maliciously flushed.

Add the always_check_network policy capability which, when enabled, treats
SECMARK as enabled, even if there are no netfilter SECMARK rules and
treats peer labeling as enabled, even if there is no Netlabel or
labeled IPSEC configuration.

Includes definition of "redhat1" SELinux policy capability, which
exists in the SELinux userpace library, to keep ordering correct.

The SELinux userpace portion of this was merged last year, but this kernel
change fell on the floor.

Signed-off-by: Chris PeBenito <cpebenito@tresys.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:38 -04:00
Paul Moore
b04eea8864 selinux: fix problems in netnode when BUG() is compiled out
When the BUG() macro is disabled at compile time it can cause some
problems in the SELinux netnode code: invalid return codes and
uninitialized variables.  This patch fixes this by making sure we take
some corrective action after the BUG() macro.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:27 -04:00
Eric Paris
b43e725d8d SELinux: use a helper function to determine seclabel
Use a helper to determine if a superblock should have the seclabel flag
rather than doing it in the function.  I'm going to use this in the
security server as well.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:24 -04:00
Eric Paris
a64c54cf08 SELinux: pass a superblock to security_fs_use
Rather than passing pointers to memory locations, strings, and other
stuff just give up on the separation and give security_fs_use the
superblock.  It just makes the code easier to read (even if not easier to
reuse on some other OS)

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:21 -04:00
Eric Paris
308ab70c46 SELinux: do not handle seclabel as a special flag
Instead of having special code around the 'non-mount' seclabel mount option
just handle it like the mount options.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:12 -04:00
Eric Paris
f936c6e502 SELinux: change sbsec->behavior to short
We only have 6 options, so char is good enough, but use a short as that
packs nicely.  This shrinks the superblock_security_struct just a little
bit.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:09 -04:00
Eric Paris
cfca0303da SELinux: renumber the superblock options
Just to make it clear that we have mount time options and flags,
separate them.  Since I decided to move the non-mount options above
above 0x10, we need a short instead of a char.  (x86 padding says
this takes up no additional space as we have a 3byte whole in the
structure)

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:06 -04:00
Eric Paris
eadcabc697 SELinux: do all flags twiddling in one place
Currently we set the initialize and seclabel flag in one place.  Do some
unrelated printk then we unset the seclabel flag.  Eww.  Instead do the flag
twiddling in one place in the code not seperated by unrelated printk.  Also
don't set and unset the seclabel flag.  Only set it if we need to.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:03 -04:00
Eric Paris
12f348b9dc SELinux: rename SE_SBLABELSUPP to SBLABEL_MNT
Just a flag rename as we prepare to make it not so special.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:03:01 -04:00
Eric Paris
af8e50cc7d SELinux: use define for number of bits in the mnt flags mask
We had this random hard coded value of '8' in the code (I put it there)
for the number of bits to check for mount options.  This is stupid.  Instead
use the #define we already have which tells us the number of mount
options.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:58 -04:00
Eric Paris
d355987f47 SELinux: make it harder to get the number of mnt opts wrong
Instead of just hard coding a value, use the enum to out benefit.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:53 -04:00
Eric Paris
40d3d0b85f SELinux: remove crazy contortions around proc
We check if the fsname is proc and if so set the proc superblock security
struct flag.  We then check if the flag is set and use the string 'proc'
for the fsname instead of just using the fsname.  What's the point?  It's
always proc...  Get rid of the useless conditional.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:50 -04:00
Eric Paris
b138004ea0 SELinux: fix selinuxfs policy file on big endian systems
The /sys/fs/selinux/policy file is not valid on big endian systems like
ppc64 or s390.  Let's see why:

static int hashtab_cnt(void *key, void *data, void *ptr)
{
	int *cnt = ptr;
	*cnt = *cnt + 1;

	return 0;
}

static int range_write(struct policydb *p, void *fp)
{
	size_t nel;
[...]
	/* count the number of entries in the hashtab */
	nel = 0;
	rc = hashtab_map(p->range_tr, hashtab_cnt, &nel);
	if (rc)
		return rc;
	buf[0] = cpu_to_le32(nel);
	rc = put_entry(buf, sizeof(u32), 1, fp);

So size_t is 64 bits.  But then we pass a pointer to it as we do to
hashtab_cnt.  hashtab_cnt thinks it is a 32 bit int and only deals with
the first 4 bytes.  On x86_64 which is little endian, those first 4
bytes and the least significant, so this works out fine.  On ppc64/s390
those first 4 bytes of memory are the high order bits.  So at the end of
the call to hashtab_map nel has a HUGE number.  But the least
significant 32 bits are all 0's.

We then pass that 64 bit number to cpu_to_le32() which happily truncates
it to a 32 bit number and does endian swapping.  But the low 32 bits are
all 0's.  So no matter how many entries are in the hashtab, big endian
systems always say there are 0 entries because I screwed up the
counting.

The fix is easy.  Use a 32 bit int, as the hashtab_cnt expects, for nel.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-07-25 13:02:44 -04:00
Stephen Smalley
5c73fceb8c SELinux: Enable setting security contexts on rootfs inodes.
rootfs (ramfs) can support setting of security contexts
by userspace due to the vfs fallback behavior of calling
the security module to set the in-core inode state
for security.* attributes when the filesystem does not
provide an xattr handler.  No xattr handler required
as the inodes are pinned in memory and have no backing
store.

This is useful in allowing early userspace to label individual
files within a rootfs while still providing a policy-defined
default via genfs.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:37 -04:00
Waiman Long
a767f680e3 SELinux: Increase ebitmap_node size for 64-bit configuration
Currently, the ebitmap_node structure has a fixed size of 32 bytes. On
a 32-bit system, the overhead is 8 bytes, leaving 24 bytes for being
used as bitmaps. The overhead ratio is 1/4.

On a 64-bit system, the overhead is 16 bytes. Therefore, only 16 bytes
are left for bitmap purpose and the overhead ratio is 1/2. With a
3.8.2 kernel, a boot-up operation will cause the ebitmap_get_bit()
function to be called about 9 million times. The average number of
ebitmap_node traversal is about 3.7.

This patch increases the size of the ebitmap_node structure to 64
bytes for 64-bit system to keep the overhead ratio at 1/4. This may
also improve performance a little bit by making node to node traversal
less frequent (< 2) as more bits are available in each node.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:31 -04:00
Waiman Long
fee7114298 SELinux: Reduce overhead of mls_level_isvalid() function call
While running the high_systime workload of the AIM7 benchmark on
a 2-socket 12-core Westmere x86-64 machine running 3.10-rc4 kernel
(with HT on), it was found that a pretty sizable amount of time was
spent in the SELinux code. Below was the perf trace of the "perf
record -a -s" of a test run at 1500 users:

  5.04%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
  1.96%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
  1.95%            ls  [kernel.kallsyms]     [k] find_next_bit

The ebitmap_get_bit() was the hottest function in the perf-report
output.  Both the ebitmap_get_bit() and find_next_bit() functions
were, in fact, called by mls_level_isvalid(). As a result, the
mls_level_isvalid() call consumed 8.95% of the total CPU time of
all the 24 virtual CPUs which is quite a lot. The majority of the
mls_level_isvalid() function invocations come from the socket creation
system call.

Looking at the mls_level_isvalid() function, it is checking to see
if all the bits set in one of the ebitmap structure are also set in
another one as well as the highest set bit is no bigger than the one
specified by the given policydb data structure. It is doing it in
a bit-by-bit manner. So if the ebitmap structure has many bits set,
the iteration loop will be done many times.

The current code can be rewritten to use a similar algorithm as the
ebitmap_contains() function with an additional check for the
highest set bit. The ebitmap_contains() function was extended to
cover an optional additional check for the highest set bit, and the
mls_level_isvalid() function was modified to call ebitmap_contains().

With that change, the perf trace showed that the used CPU time drop
down to just 0.08% (ebitmap_contains + mls_level_isvalid) of the
total which is about 100X less than before.

  0.07%            ls  [kernel.kallsyms]     [k] ebitmap_contains
  0.05%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
  0.01%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
  0.01%            ls  [kernel.kallsyms]     [k] find_next_bit

The remaining ebitmap_get_bit() and find_next_bit() functions calls
are made by other kernel routines as the new mls_level_isvalid()
function will not call them anymore.

This patch also improves the high_systime AIM7 benchmark result,
though the improvement is not as impressive as is suggested by the
reduction in CPU time spent in the ebitmap functions. The table below
shows the performance change on the 2-socket x86-64 system (with HT
on) mentioned above.

+--------------+---------------+----------------+-----------------+
|   Workload   | mean % change | mean % change  | mean % change   |
|              | 10-100 users  | 200-1000 users | 1100-2000 users |
+--------------+---------------+----------------+-----------------+
| high_systime |     +0.1%     |     +0.9%      |     +2.6%       |
+--------------+---------------+----------------+-----------------+

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:18 -04:00
Paul Moore
bed4d7efb3 selinux: remove the BUG_ON() from selinux_skb_xfrm_sid()
Remove the BUG_ON() from selinux_skb_xfrm_sid() and propogate the
error code up to the caller.  Also check the return values in the
only caller function, selinux_skb_peerlbl_sid().

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:13 -04:00
Paul Moore
d1b17b09f3 selinux: cleanup the XFRM header
Remove the unused get_sock_isec() function and do some formatting
fixes.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:08 -04:00
Paul Moore
e219369580 selinux: cleanup selinux_xfrm_decode_session()
Some basic simplification.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:02:03 -04:00
Paul Moore
4baabeec2a selinux: cleanup some comment and whitespace issues in the XFRM code
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:01:58 -04:00
Paul Moore
eef9b41622 selinux: cleanup selinux_xfrm_sock_rcv_skb() and selinux_xfrm_postroute_last()
Some basic simplification and comment reformatting.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:01:52 -04:00
Paul Moore
96484348ad selinux: cleanup selinux_xfrm_policy_lookup() and selinux_xfrm_state_pol_flow_match()
Do some basic simplification and comment reformatting.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:01:46 -04:00
Paul Moore
ccf17cc4b8 selinux: cleanup and consolidate the XFRM alloc/clone/delete/free code
The SELinux labeled IPsec code state management functions have been
long neglected and could use some cleanup and consolidation.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:01:40 -04:00
Paul Moore
2e5aa86609 lsm: split the xfrm_state_alloc_security() hook implementation
The xfrm_state_alloc_security() LSM hook implementation is really a
multiplexed hook with two different behaviors depending on the
arguments passed to it by the caller.  This patch splits the LSM hook
implementation into two new hook implementations, which match the
LSM hooks in the rest of the kernel:

 * xfrm_state_alloc
 * xfrm_state_alloc_acquire

Also included in this patch are the necessary changes to the SELinux
code; no other LSMs are affected.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25 13:01:25 -04:00
Tetsuo Handa
9548906b2b xattr: Constify ->name member of "struct xattr".
Since everybody sets kstrdup()ed constant string to "struct xattr"->name but
nobody modifies "struct xattr"->name , we can omit kstrdup() and its failure
checking by constifying ->name member of "struct xattr".

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Joel Becker <jlbec@evilplan.org> [ocfs2]
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Tested-by: Paul Moore <paul@paul-moore.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-07-25 19:30:03 +10:00
Linus Torvalds
0ff08ba5d0 Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from Bruce Fields:
 "Changes this time include:

   - 4.1 enabled on the server by default: the last 4.1-specific issues
     I know of are fixed, so we're not going to find the rest of the
     bugs without more exposure.
   - Experimental support for NFSv4.2 MAC Labeling (to allow running
     selinux over NFS), from Dave Quigley.
   - Fixes for some delicate cache/upcall races that could cause rare
     server hangs; thanks to Neil Brown and Bodo Stroesser for extreme
     debugging persistence.
   - Fixes for some bugs found at the recent NFS bakeathon, mostly v4
     and v4.1-specific, but also a generic bug handling fragmented rpc
     calls"

* 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: support minorversion 1 by default
  nfsd4: allow destroy_session over destroyed session
  svcrpc: fix failures to handle -1 uid's
  sunrpc: Don't schedule an upcall on a replaced cache entry.
  net/sunrpc: xpt_auth_cache should be ignored when expired.
  sunrpc/cache: ensure items removed from cache do not have pending upcalls.
  sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
  sunrpc/cache: remove races with queuing an upcall.
  nfsd4: return delegation immediately if lease fails
  nfsd4: do not throw away 4.1 lock state on last unlock
  nfsd4: delegation-based open reclaims should bypass permissions
  svcrpc: don't error out on small tcp fragment
  svcrpc: fix handling of too-short rpc's
  nfsd4: minor read_buf cleanup
  nfsd4: fix decoding of compounds across page boundaries
  nfsd4: clean up nfs4_open_delegation
  NFSD: Don't give out read delegations on creates
  nfsd4: allow client to send no cb_sec flavors
  nfsd4: fail attempts to request gss on the backchannel
  nfsd4: implement minimal SP4_MACH_CRED
  ...
2013-07-11 10:17:13 -07:00
Linus Torvalds
496322bc91 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "This is a re-do of the net-next pull request for the current merge
  window.  The only difference from the one I made the other day is that
  this has Eliezer's interface renames and the timeout handling changes
  made based upon your feedback, as well as a few bug fixes that have
  trickeled in.

  Highlights:

   1) Low latency device polling, eliminating the cost of interrupt
      handling and context switches.  Allows direct polling of a network
      device from socket operations, such as recvmsg() and poll().

      Currently ixgbe, mlx4, and bnx2x support this feature.

      Full high level description, performance numbers, and design in
      commit 0a4db187a9 ("Merge branch 'll_poll'")

      From Eliezer Tamir.

   2) With the routing cache removed, ip_check_mc_rcu() gets exercised
      more than ever before in the case where we have lots of multicast
      addresses.  Use a hash table instead of a simple linked list, from
      Eric Dumazet.

   3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
      Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
      Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

   4) Support reporting the TUN device persist flag to userspace, from
      Pavel Emelyanov.

   5) Allow controlling network device VF link state using netlink, from
      Rony Efraim.

   6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

   7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
      Daniel Borkmann and Eric Dumazet.

   8) Allow controlling of TCP quickack behavior on a per-route basis,
      from Cong Wang.

   9) Several bug fixes and improvements to vxlan from Stephen
      Hemminger, Pravin B Shelar, and Mike Rapoport.  In particular,
      support receiving on multiple UDP ports.

  10) Major cleanups, particular in the area of debugging and cookie
      lifetime handline, to the SCTP protocol code.  From Daniel
      Borkmann.

  11) Allow packets to cross network namespaces when traversing tunnel
      devices.  From Nicolas Dichtel.

  12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
      manner akin to how we monitor real network traffic via ptype_all.
      From Daniel Borkmann.

  13) Several bug fixes and improvements for the new alx device driver,
      from Johannes Berg.

  14) Fix scalability issues in the netem packet scheduler's time queue,
      by using an rbtree.  From Eric Dumazet.

  15) Several bug fixes in TCP loss recovery handling, from Yuchung
      Cheng.

  16) Add support for GSO segmentation of MPLS packets, from Simon
      Horman.

  17) Make network notifiers have a real data type for the opaque
      pointer that's passed into them.  Use this to properly handle
      network device flag changes in arp_netdev_event().  From Jiri
      Pirko and Timo Teräs.

  18) Convert several drivers over to module_pci_driver(), from Peter
      Huewe.

  19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
      O(1) calculation instead.  From Eric Dumazet.

  20) Support setting of explicit tunnel peer addresses in ipv6, just
      like ipv4.  From Nicolas Dichtel.

  21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

  22) Prevent a single high rate flow from overruning an individual cpu
      during RX packet processing via selective flow shedding.  From
      Willem de Bruijn.

  23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
      Dumazet.

  24) Don't just drop GSO packets which are above the TBF scheduler's
      burst limit, chop them up so they are in-bounds instead.  Also
      from Eric Dumazet.

  25) VLAN offloads are missed when configured on top of a bridge, fix
      from Vlad Yasevich.

  26) Support IPV6 in ping sockets.  From Lorenzo Colitti.

  27) Receive flow steering targets should be updated at poll() time
      too, from David Majnemer.

  28) Fix several corner case regressions in PMTU/redirect handling due
      to the routing cache removal, from Timo Teräs.

  29) We have to be mindful of ipv4 mapped ipv6 sockets in
      upd_v6_push_pending_frames().  From Hannes Frederic Sowa.

  30) Fix L2TP sequence number handling bugs, from James Chapman."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
  drivers/net: caif: fix wrong rtnl_is_locked() usage
  drivers/net: enic: release rtnl_lock on error-path
  vhost-net: fix use-after-free in vhost_net_flush
  net: mv643xx_eth: do not use port number as platform device id
  net: sctp: confirm route during forward progress
  virtio_net: fix race in RX VQ processing
  virtio: support unlocked queue poll
  net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
  Documentation: Fix references to defunct linux-net@vger.kernel.org
  net/fs: change busy poll time accounting
  net: rename low latency sockets functions to busy poll
  bridge: fix some kernel warning in multicast timer
  sfc: Fix memory leak when discarding scattered packets
  sit: fix tunnel update via netlink
  dt:net:stmmac: Add dt specific phy reset callback support.
  dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
  dt:net:stmmac: Allocate platform data only if its NULL.
  net:stmmac: fix memleak in the open method
  ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
  net: ipv6: fix wrong ping_v6_sendmsg return value
  ...
2013-07-09 18:24:39 -07:00
Linus Torvalds
be0c5d8c0b NFS client updates for Linux 3.11
Feature highlights include:
 - Add basic client support for NFSv4.2
 - Add basic client support for Labeled NFS (selinux for NFSv4.2)
 - Fix the use of credentials in NFSv4.1 stateful operations, and
   add support for NFSv4.1 state protection.
 
 Bugfix highlights:
 - Fix another NFSv4 open state recovery race
 - Fix an NFSv4.1 back channel session regression
 - Various rpc_pipefs races
 - Fix another issue with NFSv3 auth negotiation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR2vsSAAoJEGcL54qWCgDyWBIP/AqlpBBAblxbNQ1Bl/0m1Pdb
 iKH961qgM4U1BzK0svGtHTZqkovpm4o/VbkbKBT5mQ4g6SbbsJ/AsS1plCyfnIZi
 bdnKNJyj6zg0NsAkJ3vKWqd4BTaP+icdSfEIlRKQxAPESewN7b5B3OWgY4KdYmnk
 q5BP25anC1ryxVycSY67ux8S2IKXVSRZeCZv+RO21rvZ2G0bV5y7t8Om28ztxEnU
 RKrHgQHgaaktR7i8QVO0sbiWq3iqLa3GPkUvFLwWGr8PQJtTkYY0QwYSrsV3N4rY
 hYpMRUZFHpZ8UG5YvBT6xyOy/XaGwMGKSfZjB9/YG4QVju+tTy50U1JbTil5PEWY
 GHWYF68aurIeUkXrhSv8AVnOnhir0mISx5ou/SV7p0QoAZ92V6kq+LkPrW520qlc
 z8ILh3j28pN3ZUCIEArcaZhYCt48uO2hwBi5TqevQyyGRsXFGbN1moD5jvHkllft
 Fi0XGuCBdvhrzFRZcsEl+PDq7fT8lXUK2BHe8oR5jz9PhUp+jpEl9m/eg3RsjJjN
 DuxsHye2U4chScdnRtLBQvpFtdINvWX/Gy8Bi7kdE5tsQySvOa+rdwuBc7h88PHC
 +4xI2iX3z4O1+GpsAe/T9+pjW689jEilS+eVDRVEGl6yHGn9q8PYOayjPjwbJHxS
 R2mLTRhKu1DKguTzO13f
 =wGjn
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Feature highlights include:
   - Add basic client support for NFSv4.2
   - Add basic client support for Labeled NFS (selinux for NFSv4.2)
   - Fix the use of credentials in NFSv4.1 stateful operations, and add
     support for NFSv4.1 state protection.

  Bugfix highlights:
   - Fix another NFSv4 open state recovery race
   - Fix an NFSv4.1 back channel session regression
   - Various rpc_pipefs races
   - Fix another issue with NFSv3 auth negotiation

  Please note that Labeled NFS does require some additional support from
  the security subsystem.  The relevant changesets have all been
  reviewed and acked by James Morris."

* tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
  NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
  NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
  nfs: have NFSv3 try server-specified auth flavors in turn
  nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
  nfs: move server_authlist into nfs_try_mount_request
  nfs: refactor "need_mount" code out of nfs_try_mount
  SUNRPC: PipeFS MOUNT notification optimization for dying clients
  SUNRPC: split client creation routine into setup and registration
  SUNRPC: fix races on PipeFS UMOUNT notifications
  SUNRPC: fix races on PipeFS MOUNT notifications
  NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
  NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
  NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
  NFS: Improve legacy idmapping fallback
  NFSv4.1 end back channel session draining
  NFS: Apply v4.1 capabilities to v4.2
  NFSv4.1: Clean up layout segment comparison helper names
  NFSv4.1: layout segment comparison helpers should take 'const' parameters
  NFSv4: Move the DNS resolver into the NFSv4 module
  rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
  ...
2013-07-09 12:09:43 -07:00
Linus Torvalds
f39d420f67 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "In this update, Smack learns to love IPv6 and to mount a filesystem
  with a transmutable hierarchy (i.e.  security labels are inherited
  from parent directory upon creation rather than creating process).

  The rest of the changes are maintenance"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (37 commits)
  tpm/tpm_i2c_infineon: Remove unused header file
  tpm: tpm_i2c_infinion: Don't modify i2c_client->driver
  evm: audit integrity metadata failures
  integrity: move integrity_audit_msg()
  evm: calculate HMAC after initializing posix acl on tmpfs
  maintainers:  add Dmitry Kasatkin
  Smack: Fix the bug smackcipso can't set CIPSO correctly
  Smack: Fix possible NULL pointer dereference at smk_netlbl_mls()
  Smack: Add smkfstransmute mount option
  Smack: Improve access check performance
  Smack: Local IPv6 port based controls
  tpm: fix regression caused by section type conflict of tpm_dev_release() in ppc builds
  maintainers: Remove Kent from maintainers
  tpm: move TPM_DIGEST_SIZE defintion
  tpm_tis: missing platform_driver_unregister() on error in init_tis()
  security: clarify cap_inode_getsecctx description
  apparmor: no need to delay vfree()
  apparmor: fix fully qualified name parsing
  apparmor: fix setprocattr arg processing for onexec
  apparmor: localize getting the security context to a few macros
  ...
2013-07-03 14:04:58 -07:00
Linus Torvalds
790eac5640 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull second set of VFS changes from Al Viro:
 "Assorted f_pos race fixes, making do_splice_direct() safe to call with
  i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
  ->d_hash/->d_compare calling conventions changes from Linus, misc
  stuff all over the place."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  Document ->tmpfile()
  ext4: ->tmpfile() support
  vfs: export lseek_execute() to modules
  lseek_execute() doesn't need an inode passed to it
  block_dev: switch to fixed_size_llseek()
  cpqphp_sysfs: switch to fixed_size_llseek()
  tile-srom: switch to fixed_size_llseek()
  proc_powerpc: switch to fixed_size_llseek()
  ubi/cdev: switch to fixed_size_llseek()
  pci/proc: switch to fixed_size_llseek()
  isapnp: switch to fixed_size_llseek()
  lpfc: switch to fixed_size_llseek()
  locks: give the blocked_hash its own spinlock
  locks: add a new "lm_owner_key" lock operation
  locks: turn the blocked_list into a hashtable
  locks: convert fl_link to a hlist_node
  locks: avoid taking global lock if possible when waking up blocked waiters
  locks: protect most of the file_lock handling with i_lock
  locks: encapsulate the fl_link list handling
  locks: make "added" in __posix_lock_file a bool
  ...
2013-07-03 09:10:19 -07:00
Linus Torvalds
b028161fbb Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
 "This pull request contains the following changes.

   - cgroup_subsys_state (css) reference counting has been converted to
     percpu-ref.  css is what each resource controller embeds into its
     own control structure and perform reference count against.  It may
     be used in hot paths of various subsystems and is similar to module
     refcnt in that aspect.  For example, block-cgroup's css refcnting
     was showing up a lot in Mikulaus's device-mapper scalability work
     and this should alleviate it.

   - cgroup subtree iterator has been updated so that RCU read lock can
     be released after grabbing reference.  This allows simplifying its
     users which requires blocking which used to build iteration list
     under RCU read lock and then traverse it outside.  This pull
     request contains simplification of cgroup core and device-cgroup.
     A separate pull request will update cpuset.

   - Fixes for various bugs including corner race conditions and RCU
     usage bugs.

   - A lot of cleanups and some prepartory work for the planned unified
     hierarchy support."

* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (48 commits)
  cgroup: CGRP_ROOT_SUBSYS_BOUND should also be ignored when mounting an existing hierarchy
  cgroup: CGRP_ROOT_SUBSYS_BOUND should be ignored when comparing mount options
  cgroup: fix deadlock on cgroup_mutex via drop_parsed_module_refcounts()
  cgroup: always use RCU accessors for protected accesses
  cgroup: fix RCU accesses around task->cgroups
  cgroup: fix RCU accesses to task->cgroups
  cgroup: grab cgroup_mutex in drop_parsed_module_refcounts()
  cgroup: fix cgroupfs_root early destruction path
  cgroup: reserve ID 0 for dummy_root and 1 for unified hierarchy
  cgroup: implement for_each_[builtin_]subsys()
  cgroup: move init_css_set initialization inside cgroup_mutex
  cgroup: s/for_each_subsys()/for_each_root_subsys()/
  cgroup: clean up find_css_set() and friends
  cgroup: remove cgroup->actual_subsys_mask
  cgroup: prefix global variables with "cgroup_"
  cgroup: convert CFTYPE_* flags to enums
  cgroup: rename cont to cgrp
  cgroup: clean up cgroup_serial_nr_cursor
  cgroup: convert cgroup_cft_commit() to use cgroup_for_each_descendant_pre()
  cgroup: make serial_nr_cursor available throughout cgroup.c
  ...
2013-07-02 19:54:47 -07:00
David Howells
13f8e9810b SELinux: Institute file_path_has_perm()
Create a file_path_has_perm() function that is like path_has_perm() but
instead takes a file struct that is the source of both the path and the
inode (rather than getting the inode from the dentry in the path).  This
is then used where appropriate.

This will be useful for situations like unionmount where it will be
possible to have an apparently-negative dentry (eg. a fallthrough) that is
open with the file struct pointing to an inode on the lower fs.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:14 +04:00
David Howells
c77cecee52 Replace a bunch of file->dentry->d_inode refs with file_inode()
Replace a bunch of file->dentry->d_inode refs with file_inode().

In __fput(), use file->f_inode instead so as not to be affected by any tricks
that file_inode() might grow.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:13 +04:00
Mimi Zohar
9b97b6cdd4 evm: audit integrity metadata failures
Before modifying an EVM protected extended attribute or any other
metadata included in the HMAC calculation, the existing 'security.evm'
is verified.  This patch adds calls to integrity_audit_msg() to audit
integrity metadata failures.

Reported-by: Sven Vermeulen <sven.vermeulen@siphos.be>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-06-20 07:47:50 -04:00
Mimi Zohar
d726d8d719 integrity: move integrity_audit_msg()
This patch moves the integrity_audit_msg() function and defintion to
security/integrity/, the parent directory, renames the 'ima_audit'
boot command line option to 'integrity_audit', and fixes the Kconfig
help text to reflect the actual code.

Changelog:
- Fixed ifdef inclusion of integrity_audit_msg() (Fengguang Wu)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-06-20 07:47:49 -04:00
David Quigley
c9bccef6b9 NFS: Extend NFS xattr handlers to accept the security namespace
The existing NFSv4 xattr handlers do not accept xattr calls to the security
namespace. This patch extends these handlers to accept xattrs from the security
namespace in addition to the default NFSv4 ACL namespace.

Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:17 -04:00
David Quigley
aa9c266962 NFS: Client implementation of Labeled-NFS
This patch implements the client transport and handling support for labeled
NFS. The patch adds two functions to encode and decode the security label
recommended attribute which makes use of the LSM hooks added earlier. It also
adds code to grab the label from the file attribute structures and encode the
label to be sent back to the server.

Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:16 -04:00
David Quigley
eb9ae68650 SELinux: Add new labeling type native labels
There currently doesn't exist a labeling type that is adequate for use with
labeled NFS. Since NFS doesn't really support xattrs we can't use the use xattr
labeling behavior. For this we developed a new labeling type. The native
labeling type is used solely by NFS to ensure NFS inodes are labeled at runtime
by the NFS code instead of relying on the SELinux security server on the client
end.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:12 -04:00
David Quigley
649f6e7718 LSM: Add flags field to security_sb_set_mnt_opts for in kernel mount data.
There is no way to differentiate if a text mount option is passed from user
space or the kernel. A flags field is being added to the
security_sb_set_mnt_opts hook to allow for in kernel security flags to be sent
to the LSM for processing in addition to the text options received from mount.
This patch also updated existing code to fix compilation errors.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:12 -04:00
David Quigley
746df9b59c Security: Add Hook to test if the particular xattr is part of a MAC model.
The interface to request security labels from user space is the xattr
interface. When requesting the security label from an NFS server it is
important to make sure the requested xattr actually is a MAC label. This allows
us to make sure that we get the desired semantics from the attribute instead of
something else such as capabilities or a time based LSM.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:11 -04:00
David Quigley
d47be3dfec Security: Add hook to calculate context based on a negative dentry.
There is a time where we need to calculate a context without the
inode having been created yet. To do this we take the negative dentry and
calculate a context based on the process and the parent directory contexts.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:19:41 -04:00
David S. Miller
6bc19fb82d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Merge 'net' bug fixes into 'net-next' as we have patches
that will build on top of them.

This merge commit includes a change from Emil Goode
(emilgoode@gmail.com) that fixes a warning that would
have been introduced by this merge.  Specifically it
fixes the pingv6_ops method ipv6_chk_addr() to add a
"const" to the "struct net_device *dev" argument and
likewise update the dummy_ipv6_chk_addr() declaration.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-05 16:37:30 -07:00
Passion,Zhao
0fcfee61d6 Smack: Fix the bug smackcipso can't set CIPSO correctly
Bug report: https://tizendev.org/bugs/browse/TDIS-3891

The reason is userspace libsmack only use "smackfs/cipso2" long-label interface,
but the code's logical is still for orginal fixed length label. Now update
smack_cipso_apply() to support flexible label (<=256 including tailing '\0')

There is also a bug in kernel/security/smack/smackfs.c:
When smk_set_cipso() parsing the CIPSO setting from userspace, the offset of
CIPSO level should be "strlen(label)+1" instead of "strlen(label)"

Signed-off-by: Passion,Zhao <passion.zhao@intel.com>
2013-06-03 10:56:02 -07:00
Paul Moore
e4e8536f65 selinux: fix the labeled xfrm/IPsec reference count handling
The SELinux labeled IPsec code was improperly handling its reference
counting, dropping a reference on a delete operation instead of on a
free/release operation.

Reported-by: Ondrej Moris <omoris@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:30:07 -07:00
Jiri Pirko
351638e7de net: pass info struct via netdevice notifier
So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

v2->v3: fix typo on simeth
	shortened dev_getter
	shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-28 13:11:01 -07:00
Tetsuo Handa
8cd77a0bd4 Smack: Fix possible NULL pointer dereference at smk_netlbl_mls()
netlbl_secattr_catmap_alloc(GFP_ATOMIC) can return NULL.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2013-05-28 10:15:35 -07:00
Casey Schaufler
e830b39412 Smack: Add smkfstransmute mount option
Suppliment the smkfsroot mount option with another, smkfstransmute,
that does the same thing but also marks the root inode as
transmutting. This allows a freshly created filesystem to
be mounted with a transmutting heirarchy.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-05-28 10:08:44 -07:00
Casey Schaufler
2f823ff8be Smack: Improve access check performance
Each Smack label that the kernel has seen is added to a
list of labels. The list of access rules for a given subject
label hangs off of the label list entry for the label.
This patch changes the structures that contain subject
labels to point at the label list entry rather that the
label itself. Doing so removes a label list lookup in
smk_access() that was accounting for the largest single
chunk of Smack overhead.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-05-28 10:08:32 -07:00
Casey Schaufler
c673944347 Smack: Local IPv6 port based controls
Smack does not provide access controls on IPv6 communications.
This patch introduces a mechanism for maintaining Smack lables
for local IPv6 communications. It is based on labeling local ports.
The behavior should be compatible with any future "real" IPv6
support as it provides no interfaces for users to manipulate
the labeling. Remote IPv6 connections use the ambient label
the same way that unlabeled IPv4 packets are treated.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2013-05-28 10:08:26 -07:00
Tejun Heo
d591fb5661 device_cgroup: simplify cgroup tree walk in propagate_exception()
During a config change, propagate_exception() needs to traverse the
subtree to update config on the subtree.  Because such config updates
need to allocate memory, it couldn't directly use
cgroup_for_each_descendant_pre() which required the whole iteration to
be contained in a single RCU read critical section.  To work around
the limitation, propagate_exception() built a linked list of
descendant cgroups while read-locking RCU and then walked the list
afterwards, which is safe as the whole iteration is protected by
devcgroup_mutex.  This works but is cumbersome.

With the recent updates, cgroup iterators now allow dropping RCU read
lock while iteration is in progress making this workaround no longer
necessary.  This patch replaces dev_cgroup->propagate_pending list and
get_online_devcg() with direct cgroup_for_each_descendant_pre() walk.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Aristeu Rozanski <aris@redhat.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
2013-05-24 10:55:38 +09:00
J. Bruce Fields
0d422afb89 security: cap_inode_getsecctx returning garbage
We shouldn't be returning success from this function without also
filling in the return values ctx and ctxlen.

Note currently this doesn't appear to cause bugs since the only
inode_getsecctx caller I can find is fs/sysfs/inode.c, which only calls
this if security_inode_setsecurity succeeds.  Assuming
security_inode_setsecurity is set to cap_inode_setsecurity whenever
inode_getsecctx is set to cap_inode_getsecctx, this function can never
actually called.

So I noticed this only because the server labeled NFS patches add a real
caller.

Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-13 10:11:46 -04:00
Al Viro
b5b3ee6c9c apparmor: no need to delay vfree()
vfree() can be called from interrupt contexts now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-05-12 21:31:02 +10:00
James Morris
bd71164abc Merge tag 'aa-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor into ra-next 2013-05-12 21:28:38 +10:00
Kent Overstreet
a27bb332c0 aio: don't include aio.h in sched.h
Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 20:16:25 -07:00
Linus Torvalds
20b4fb4852 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS updates from Al Viro,

Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).

7kloc removed.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
  don't bother with deferred freeing of fdtables
  proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
  proc: Make the PROC_I() and PDE() macros internal to procfs
  proc: Supply a function to remove a proc entry by PDE
  take cgroup_open() and cpuset_open() to fs/proc/base.c
  ppc: Clean up scanlog
  ppc: Clean up rtas_flash driver somewhat
  hostap: proc: Use remove_proc_subtree()
  drm: proc: Use remove_proc_subtree()
  drm: proc: Use minor->index to label things, not PDE->name
  drm: Constify drm_proc_list[]
  zoran: Don't print proc_dir_entry data in debug
  reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
  proc: Supply an accessor for getting the data from a PDE's parent
  airo: Use remove_proc_subtree()
  rtl8192u: Don't need to save device proc dir PDE
  rtl8187se: Use a dir under /proc/net/r8180/
  proc: Add proc_mkdir_data()
  proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
  proc: Move PDE_NET() to fs/proc/proc_net.c
  ...
2013-05-01 17:51:54 -07:00
Linus Torvalds
73287a43cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights (1721 non-merge commits, this has to be a record of some
  sort):

   1) Add 'random' mode to team driver, from Jiri Pirko and Eric
      Dumazet.

   2) Make it so that any driver that supports configuration of multiple
      MAC addresses can provide the forwarding database add and del
      calls by providing a default implementation and hooking that up if
      the driver doesn't have an explicit set of handlers.  From Vlad
      Yasevich.

   3) Support GSO segmentation over tunnels and other encapsulating
      devices such as VXLAN, from Pravin B Shelar.

   4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

   5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
      Dukkipati.

   6) In the PHY layer, allow supporting wake-on-lan in situations where
      the PHY registers have to be written for it to be configured.

      Use it to support wake-on-lan in mv643xx_eth.

      From Michael Stapelberg.

   7) Significantly improve firewire IPV6 support, from YOSHIFUJI
      Hideaki.

   8) Allow multiple packets to be sent in a single transmission using
      network coding in batman-adv, from Martin Hundebøll.

   9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

  10) Generalize the VXLAN forwarding tables so that there is more
      flexibility in configurating various aspects of the endpoints.
      From David Stevens.

  11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
      from Dmitry Kravkov.

  12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
      Neira Ayuso.

  13) Start adding networking selftests.

  14) In situations of overload on the same AF_PACKET fanout socket, or
      per-cpu packet receive queue, minimize drop by distributing the
      load to other cpus/fanouts.  From Willem de Bruijn and Eric
      Dumazet.

  15) Add support for new payload offset BPF instruction, from Daniel
      Borkmann.

  16) Convert several drivers over to mdoule_platform_driver(), from
      Sachin Kamat.

  17) Provide a minimal BPF JIT image disassembler userspace tool, from
      Daniel Borkmann.

  18) Rewrite F-RTO implementation in TCP to match the final
      specification of it in RFC4138 and RFC5682.  From Yuchung Cheng.

  19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
      you like netlink, so I implemented netlink dumping of netlink
      sockets.") From Andrey Vagin.

  20) Remove ugly passing of rtnetlink attributes into rtnl_doit
      functions, from Thomas Graf.

  21) Allow userspace to be able to see if a configuration change occurs
      in the middle of an address or device list dump, from Nicolas
      Dichtel.

  22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
      Frederic Sowa.

  23) Increase accuracy of packet length used by packet scheduler, from
      Jason Wang.

  24) Beginning set of changes to make ipv4/ipv6 fragment handling more
      scalable and less susceptible to overload and locking contention,
      from Jesper Dangaard Brouer.

  25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
      instead.  From Hong Zhiguo.

  26) Optimize route usage in IPVS by avoiding reference counting where
      possible, from Julian Anastasov.

  27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

  28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
      Eitzenberger.

  29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
      nfnetlink_log, and nfnetlink_queue.  From Gao feng.

  30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

  31) Support several new r8169 chips, from Hayes Wang.

  32) Support tokenized interface identifiers in ipv6, from Daniel
      Borkmann.

  33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

  34) Add 802.1ad vlan offload support, from Patrick McHardy.

  35) Support mmap() based netlink communication, also from Patrick
      McHardy.

  36) Support HW timestamping in mlx4 driver, from Amir Vadai.

  37) Rationalize AF_PACKET packet timestamping when transmitting, from
      Willem de Bruijn and Daniel Borkmann.

  38) Bring parity to what's provided by /proc/net/packet socket dumping
      and the info provided by netlink socket dumping of AF_PACKET
      sockets.  From Nicolas Dichtel.

  39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
      Poirier"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
  filter: fix va_list build error
  af_unix: fix a fatal race with bit fields
  bnx2x: Prevent memory leak when cnic is absent
  bnx2x: correct reading of speed capabilities
  net: sctp: attribute printl with __printf for gcc fmt checks
  netlink: kconfig: move mmap i/o into netlink kconfig
  netpoll: convert mutex into a semaphore
  netlink: Fix skb ref counting.
  net_sched: act_ipt forward compat with xtables
  mlx4_en: fix a build error on 32bit arches
  Revert "bnx2x: allow nvram test to run when device is down"
  bridge: avoid OOPS if root port not found
  drivers: net: cpsw: fix kernel warn on cpsw irq enable
  sh_eth: use random MAC address if no valid one supplied
  3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
  tg3: fix to append hardware time stamping flags
  unix/stream: fix peeking with an offset larger than data in queue
  unix/dgram: fix peeking with an offset larger than data in queue
  unix/dgram: peek beyond 0-sized skbs
  openvswitch: Remove unneeded ovs_netdev_get_ifindex()
  ...
2013-05-01 14:08:52 -07:00
Linus Torvalds
5f56886521 Merge branch 'akpm' (incoming from Andrew)
Merge third batch of fixes from Andrew Morton:
 "Most of the rest.  I still have two large patchsets against AIO and
  IPC, but they're a bit stuck behind other trees and I'm about to
  vanish for six days.

   - random fixlets
   - inotify
   - more of the MM queue
   - show_stack() cleanups
   - DMI update
   - kthread/workqueue things
   - compat cleanups
   - epoll udpates
   - binfmt updates
   - nilfs2
   - hfs
   - hfsplus
   - ptrace
   - kmod
   - coredump
   - kexec
   - rbtree
   - pids
   - pidns
   - pps
   - semaphore tweaks
   - some w1 patches
   - relay updates
   - core Kconfig changes
   - sysrq tweaks"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits)
  Documentation/sysrq: fix inconstistent help message of sysrq key
  ethernet/emac/sysrq: fix inconstistent help message of sysrq key
  sparc/sysrq: fix inconstistent help message of sysrq key
  powerpc/xmon/sysrq: fix inconstistent help message of sysrq key
  ARM/etm/sysrq: fix inconstistent help message of sysrq key
  power/sysrq: fix inconstistent help message of sysrq key
  kgdb/sysrq: fix inconstistent help message of sysrq key
  lib/decompress.c: fix initconst
  notifier-error-inject: fix module names in Kconfig
  kernel/sys.c: make prctl(PR_SET_MM) generally available
  UAPI: remove empty Kbuild files
  menuconfig: print more info for symbol without prompts
  init/Kconfig: re-order CONFIG_EXPERT options to fix menuconfig display
  kconfig menu: move Virtualization drivers near other virtualization options
  Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
  relay: use macro PAGE_ALIGN instead of FIX_SIZE
  kernel/relay.c: move FIX_SIZE macro into relay.c
  kernel/relay.c: remove unused function argument actor
  drivers/w1/slaves/w1_ds2760.c: fix the error handling in w1_ds2760_add_slave()
  drivers/w1/slaves/w1_ds2781.c: fix the error handling in w1_ds2781_add_slave()
  ...
2013-04-30 17:37:43 -07:00
Lucas De Marchi
93997f6ddb KEYS: split call to call_usermodehelper_fns()
Use call_usermodehelper_setup() + call_usermodehelper_exec() instead of
calling call_usermodehelper_fns().  In case there's an OOM in this last
function the cleanup function may not be called - in this case we would
miss a call to key_put().

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:06 -07:00
Linus Torvalds
2e1deaad1e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem update from James Morris:
 "Just some minor updates across the subsystem"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  ima: eliminate passing d_name.name to process_measurement()
  TPM: Retry SaveState command in suspend path
  tpm/tpm_i2c_infineon: Add small comment about return value of __i2c_transfer
  tpm/tpm_i2c_infineon.c: Add OF attributes type and name to the of_device_id table entries
  tpm_i2c_stm_st33: Remove duplicate inclusion of header files
  tpm: Add support for new Infineon I2C TPM (SLB 9645 TT 1.2 I2C)
  char/tpm: Convert struct i2c_msg initialization to C99 format
  drivers/char/tpm/tpm_ppi: use strlcpy instead of strncpy
  tpm/tpm_i2c_stm_st33: formatting and white space changes
  Smack: include magic.h in smackfs.c
  selinux: make security_sb_clone_mnt_opts return an error on context mismatch
  seccomp: allow BPF_XOR based ALU instructions.
  Fix NULL pointer dereference in smack_inode_unlink() and smack_inode_rmdir()
  Smack: add support for modification of existing rules
  smack: SMACK_MAGIC to include/uapi/linux/magic.h
  Smack: add missing support for transmute bit in smack_str_from_perm()
  Smack: prevent revoke-subject from failing when unseen label is written to it
  tomoyo: use DEFINE_SRCU() to define tomoyo_ss
  tomoyo: use DEFINE_SRCU() to define tomoyo_ss
2013-04-30 16:27:51 -07:00
Linus Torvalds
191a712090 Merge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:

 - Fixes and a lot of cleanups.  Locking cleanup is finally complete.
   cgroup_mutex is no longer exposed to individual controlelrs which
   used to cause nasty deadlock issues.  Li fixed and cleaned up quite a
   bit including long standing ones like racy cgroup_path().

 - device cgroup now supports proper hierarchy thanks to Aristeu.

 - perf_event cgroup now supports proper hierarchy.

 - A new mount option "__DEVEL__sane_behavior" is added.  As indicated
   by the name, this option is to be used for development only at this
   point and generates a warning message when used.  Unfortunately,
   cgroup interface currently has too many brekages and inconsistencies
   to implement a consistent and unified hierarchy on top.  The new flag
   is used to collect the behavior changes which are necessary to
   implement consistent unified hierarchy.  It's likely that this flag
   won't be used verbatim when it becomes ready but will be enabled
   implicitly along with unified hierarchy.

   The option currently disables some of broken behaviors in cgroup core
   and also .use_hierarchy switch in memcg (will be routed through -mm),
   which can be used to make very unusual hierarchy where nesting is
   partially honored.  It will also be used to implement hierarchy
   support for blk-throttle which would be impossible otherwise without
   introducing a full separate set of control knobs.

   This is essentially versioning of interface which isn't very nice but
   at this point I can't see any other options which would allow keeping
   the interface the same while moving towards hierarchy behavior which
   is at least somewhat sane.  The planned unified hierarchy is likely
   to require some level of adaptation from userland anyway, so I think
   it'd be best to take the chance and update the interface such that
   it's supportable in the long term.

   Maintaining the existing interface does complicate cgroup core but
   shouldn't put too much strain on individual controllers and I think
   it'd be manageable for the foreseeable future.  Maybe we'll be able
   to drop it in a decade.

Fix up conflicts (including a semantic one adding a new #include to ppc
that was uncovered by header the file changes) as per Tejun.

* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (45 commits)
  cpuset: fix compile warning when CONFIG_SMP=n
  cpuset: fix cpu hotplug vs rebuild_sched_domains() race
  cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn()
  cgroup: restore the call to eventfd->poll()
  cgroup: fix use-after-free when umounting cgroupfs
  cgroup: fix broken file xattrs
  devcg: remove parent_cgroup.
  memcg: force use_hierarchy if sane_behavior
  cgroup: remove cgrp->top_cgroup
  cgroup: introduce sane_behavior mount option
  move cgroupfs_root to include/linux/cgroup.h
  cgroup: convert cgroupfs_root flag bits to masks and add CGRP_ prefix
  cgroup: make cgroup_path() not print double slashes
  Revert "cgroup: remove bind() method from cgroup_subsys."
  perf: make perf_event cgroup hierarchical
  cgroup: implement cgroup_is_descendant()
  cgroup: make sure parent won't be destroyed before its children
  cgroup: remove bind() method from cgroup_subsys.
  devcg: remove broken_hierarchy tag
  cgroup: remove cgroup_lock_is_held()
  ...
2013-04-29 19:14:20 -07:00
Al Viro
e53cfda5d2 tomoyo_close_control: don't bother with return value
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-29 15:41:42 -04:00
John Johansen
2654bfbc2b apparmor: fix fully qualified name parsing
currently apparmor name parsing is only correctly handling
:<NS>:<profile>

but
:<NS>://<profile>

is also a valid form and what is exported to userspace.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-04-28 00:39:37 -07:00
John Johansen
3eea57c26e apparmor: fix setprocattr arg processing for onexec
the exec file isn't processing its command arg. It should only set be
responding to a command of exec.

Also cleanup setprocattr some more while we are at it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-04-28 00:39:36 -07:00
John Johansen
214beacaa7 apparmor: localize getting the security context to a few macros
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-04-28 00:39:35 -07:00
John Johansen
53fe8b9961 apparmor: fix sparse warnings
Fix a couple of warning reported by sparse

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-04-28 00:39:35 -07:00
John Johansen
41d1b3e868 apparmor: Fix smatch warning in aa_remove_profiles
smatch reports
  error: potential NULL dereference 'ns'.

this can not actually occur because it relies on aa_split_fqname setting
both ns_name and name as null but ns_name will actually always have a
value in this case.

so remove the unnecessary if (ns_name) conditional that is resulting
in the false positive further down.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-04-28 00:39:34 -07:00
John Johansen
b492d50bf5 apparmor: fix the audit type table
The audit type table is missing a comma so that KILLED comes out as
KILLEDAUTO.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:37:41 -07:00
John Johansen
ed686308c6 apparmor: reserve and mask off the top 8 bits of the base field
The top 8 bits of the base field have never been used, in fact can't
be used, by the current 'dfa16' format.  However they will be used in the
future as flags, so mask them off when using base as an index value.

Note: the use of the top 8 bits, without masking is trapped by the verify
      checks that base entries are within the size bounds.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2013-04-28 00:37:32 -07:00
John Johansen
4da05cc08d apparmor: move the free_profile fn ahead of aa_alloc_profile
Move the free_profile fn ahead of aa_alloc_profile so it can be used
in aa_alloc_profile without a forward declaration.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2013-04-28 00:37:24 -07:00
John Johansen
a4987857d2 apparmor: remove sid from profiles
The sid is not going to be a direct property of a profile anymore, instead
it will be directly related to the label, and the profile will pickup
a label back reference.

For null-profiles replace the use of sid with a per namespace unique
id.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2013-04-28 00:37:13 -07:00
John Johansen
180a6f5965 apparmor: move perm defines into policy_unpack
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:37:04 -07:00
John Johansen
8e4ff109d0 apparmor: misc cleanup of match
tidying up comments, includes and defines

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2013-04-28 00:36:55 -07:00
John Johansen
cf47aede3b apparmor: relax the restrictions on setting rlimits
Instead of limiting the setting of the processes limits to current,
relax this to tasks confined by the same profile, as the apparmor
controls for rlimits are at a profile level granularity.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:36:46 -07:00
John Johansen
4b7c331fc2 apparmor: remove "permipc" command
The "permipc" command is unused and unfinished, remove it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2013-04-28 00:36:32 -07:00
John Johansen
7a2871b566 apparmor: use common fn to clear task_context for domain transitions
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:36:20 -07:00
John Johansen
0ca554b9fc apparmor: add kvzalloc to handle zeroing for kvmalloc
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:36:09 -07:00
John Johansen
3cfcc19e0b apparmor: add utility function to get an arbitrary tasks profile.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:53 -07:00
John Johansen
e573cc30bb apparmor: fix error code to failure message mapping for name lookup
-ESTALE used to be incorrectly used to indicate a disconnected path, when
name lookup failed.  This was fixed in commit e1b0e444 to correctly return
-EACCESS, but the error to failure message mapping was not correctly updated
to reflect this change.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:40 -07:00
John Johansen
50c5ecd5d8 apparmor: refactor profile mode macros
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:29 -07:00
John Johansen
04266236b1 apparmor: Remove -W1 warnings
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:18 -07:00
John Johansen
17322cc3f9 apparmor: fix auditing of domain transition failures due to incomplete policy
When policy specifies a transition to a profile that is not currently
loaded, it result in exec being denied.  However the failure is not being
audited correctly because the audit code is treating this as an allowed
permission and thus not reporting it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:04 -07:00
David S. Miller
6e0895c2ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be_main.c
	drivers/net/ethernet/intel/igb/igb_main.c
	drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
	include/net/scm.h
	net/batman-adv/routing.c
	net/ipv4/tcp_input.c

The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
cleanup in net-next to now pass cred structs around.

The be2net driver had a bug fix in 'net' that overlapped with the VLAN
interface changes by Patrick McHardy in net-next.

An IGB conflict existed because in 'net' the build_skb() support was
reverted, and in 'net-next' there was a comment style fix within that
code.

Several batman-adv conflicts were resolved by making sure that all
calls to batadv_is_my_mac() are changed to have a new bat_priv first
argument.

Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
rewrite in 'net-next', mostly overlapping changes.

Thanks to Stephen Rothwell and Antonio Quartulli for help with several
of these merge resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-22 20:32:51 -04:00
Rami Rosen
e57d5cf2f8 devcg: remove parent_cgroup.
In devcgroup_css_alloc(), there is no longer need for parent_cgroup.
bd2953ebbb("devcg: propagate local changes down the hierarchy") made
the variable parent_cgroup redundant. This patch removes parent_cgroup
from devcgroup_css_alloc().

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-04-18 11:34:35 -07:00
Mimi Zohar
df2c2afba4 ima: eliminate passing d_name.name to process_measurement()
Passing a pointer to the dentry name, as a parameter to
process_measurement(), causes a race condition with rename() and
is unnecessary, as the dentry name is already accessible via the
file parameter.

In the normal case, we use the full pathname as provided by
brpm->filename, bprm->interp, or ima_d_path().  Only on ima_d_path()
failure, do we fallback to using the d_name.name, which points
either to external memory or d_iname.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-04-17 17:20:57 -07:00
Eric Dumazet
ca10b9e9a8 selinux: add a skb_owned_by() hook
Commit 90ba9b1986 (tcp: tcp_make_synack() can use alloc_skb())
broke certain SELinux/NetLabel configurations by no longer correctly
assigning the sock to the outgoing SYNACK packet.

Cost of atomic operations on the LISTEN socket is quite big,
and we would like it to happen only if really needed.

This patch introduces a new security_ops->skb_owned_by() method,
that is a void operation unless selinux is active.

Reported-by: Miroslav Vadkerti <mvadkert@redhat.com>
Diagnosed-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-security-module@vger.kernel.org
Acked-by: James Morris <james.l.morris@oracle.com>
Tested-by: Paul Moore <pmoore@redhat.com>
Acked-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-09 13:23:11 -04:00
Tejun Heo
8adf12b0ff devcg: remove broken_hierarchy tag
bd2953ebbb ("devcg: propagate local changes down the hierarchy")
implemented proper hierarchy support.  Remove the broken tag.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
2013-04-08 08:31:59 -07:00
Casey Schaufler
958d2c2f4a Smack: include magic.h in smackfs.c
As reported for linux-next: Tree for Apr 2 (smack)
Add the required include for smackfs.c

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-04-03 13:13:51 +11:00
Jeff Layton
094f7b69ea selinux: make security_sb_clone_mnt_opts return an error on context mismatch
I had the following problem reported a while back. If you mount the
same filesystem twice using NFSv4 with different contexts, then the
second context= option is ignored. For instance:

    # mount server:/export /mnt/test1
    # mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
    # ls -dZ /mnt/test1
    drwxrwxrwt. root root system_u:object_r:nfs_t:s0       /mnt/test1
    # ls -dZ /mnt/test2
    drwxrwxrwt. root root system_u:object_r:nfs_t:s0       /mnt/test2

When we call into SELinux to set the context of a "cloned" superblock,
it will currently just bail out when it notices that we're reusing an
existing superblock. Since the existing superblock is already set up and
presumably in use, we can't go overwriting its context with the one from
the "original" sb. Because of this, the second context= option in this
case cannot take effect.

This patch fixes this by turning security_sb_clone_mnt_opts into an int
return operation. When it finds that the "new" superblock that it has
been handed is already set up, it checks to see whether the contexts on
the old superblock match it. If it does, then it will just return
success, otherwise it'll return -EBUSY and emit a printk to tell the
admin why the second mount failed.

Note that this patch may cause casualties. The NFSv4 code relies on
being able to walk down to an export from the pseudoroot. If you mount
filesystems that are nested within one another with different contexts,
then this patch will make those mounts fail in new and "exciting" ways.

For instance, suppose that /export is a separate filesystem on the
server:

    # mount server:/ /mnt/test1
    # mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
    mount.nfs: an incorrect mount option was specified

...with the printk in the ring buffer. Because we *might* eventually
walk down to /mnt/test1/export, the mount is denied due to this patch.
The second mount needs the pseudoroot superblock, but that's already
present with the wrong context.

OTOH, if we mount these in the reverse order, then both mounts work,
because the pseudoroot superblock created when mounting /export is
discarded once that mount is done. If we then however try to walk into
that directory, the automount fails for the similar reasons:

    # cd /mnt/test1/scratch/
    -bash: cd: /mnt/test1/scratch: Device or resource busy

The story I've gotten from the SELinux folks that I've talked to is that
this is desirable behavior. In SELinux-land, mounting the same data
under different contexts is wrong -- there can be only one.

Cc: Steve Dickson <steved@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-04-02 11:30:13 +11:00
David S. Miller
a210576cf8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/mac80211/sta_info.c
	net/wireless/core.h

Two minor conflicts in wireless.  Overlapping additions of extern
declarations in net/wireless/core.h and a bug fix overlapping with
the addition of a boolean parameter to __ieee80211_key_free().

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-01 13:36:50 -04:00
Linus Torvalds
2c3de1c2d7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns fixes from Eric W Biederman:
 "The bulk of the changes are fixing the worst consequences of the user
  namespace design oversight in not considering what happens when one
  namespace starts off as a clone of another namespace, as happens with
  the mount namespace.

  The rest of the changes are just plain bug fixes.

  Many thanks to Andy Lutomirski for pointing out many of these issues."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns: Restrict when proc and sysfs can be mounted
  ipc: Restrict mounting the mqueue filesystem
  vfs: Carefully propogate mounts across user namespaces
  vfs: Add a mount flag to lock read only bind mounts
  userns:  Don't allow creation if the user is chrooted
  yama:  Better permission check for ptraceme
  pid: Handle the exit of a multi-threaded init.
  scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.
2013-03-28 13:43:46 -07:00
Hong zhi guo
77954983ad selinux: replace obsolete NLMSG_* with type safe nlmsg_*
Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-28 14:25:49 -04:00
Eric W. Biederman
eddc0a3abf yama: Better permission check for ptraceme
Change the permission check for yama_ptrace_ptracee to the standard
ptrace permission check, testing if the traceer has CAP_SYS_PTRACE
in the tracees user namespace.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-26 13:17:58 -07:00
Aristeu Rozanski
bd2953ebbb devcg: propagate local changes down the hierarchy
This patch makes exception changes to propagate down in hierarchy respecting
when possible local exceptions.

New exceptions allowing additional access to devices won't be propagated, but
it'll be possible to add an exception to access all of part of the newly
allowed device(s).

New exceptions disallowing access to devices will be propagated down and the
local group's exceptions will be revalidated for the new situation.
Example:
      A
     / \
        B

    group        behavior          exceptions
    A            allow             "b 8:* rwm", "c 116:1 rw"
    B            deny              "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

If a new exception is added to group A:
	# echo "c 116:* r" > A/devices.deny
it'll propagate down and after revalidating B's local exceptions, the exception
"c 116:2 rwm" will be removed.

In case parent's exceptions change and local exceptions are not allowed anymore,
they'll be deleted.

v7:
- do not allow behavior change when the cgroup has children
- update documentation

v6: fixed issues pointed by Serge Hallyn
- only copy parent's exceptions while propagating behavior if the local
  behavior is different
- while propagating exceptions, do not clear and copy parent's: it'd be against
  the premise we don't propagate access to more devices

v5: fixed issues pointed by Serge Hallyn
- updated documentation
- not propagating when an exception is written to devices.allow
- when propagating a new behavior, clean the local exceptions list if they're
  for a different behavior

v4: fixed issues pointed by Tejun Heo
- separated function to walk the tree and collect valid propagation targets

v3: fixed issues pointed by Tejun Heo
- update documentation
- move css_online/css_offline changes to a new patch
- use cgroup_for_each_descendant_pre() instead of own descendant walk
- move exception_copy rework to a separared patch
- move exception_clean rework to a separated patch

v2: fixed issues pointed by Tejun Heo
- instead of keeping the local settings that won't apply anymore, remove them

Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:21 -07:00
Aristeu Rozanski
1909554c97 devcg: use css_online and css_offline
Allocate resources and change behavior only when online. This is needed in
order to determine if a node is suitable for hierarchy propagation or if it's
being removed.

Locking:
Both functions take devcgroup_mutex to make changes to device_cgroup structure.
Hierarchy propagation will also take devcgroup_mutex before walking the
tree while walking the tree itself is protected by rcu lock.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:17 -07:00
Aristeu Rozanski
c39a2a3018 devcg: prepare may_access() for hierarchy support
Currently may_access() is only able to verify if an exception is valid for the
current cgroup, which has the same behavior. With hierarchy, it'll be also used
to verify if a cgroup local exception is valid towards its cgroup parent, which
might have different behavior.

v2:
- updated patch description
- rebased on top of a new patch to expand the may_access() logic to make it
  more clear
- fixed argument description order in may_access()

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:13 -07:00
Aristeu Rozanski
26898fdff3 devcg: expand may_access() logic
In order to make the next patch more clear, expand may_access() logic.

v2: may_access() returns bool now

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:09 -07:00
Igor Zhbanov
cdb56b6088 Fix NULL pointer dereference in smack_inode_unlink() and smack_inode_rmdir()
This patch fixes kernel Oops because of wrong common_audit_data type
in smack_inode_unlink() and smack_inode_rmdir().

When SMACK security module is enabled and SMACK logging is on (/smack/logging
is not zero) and you try to delete the file which
1) you cannot delete due to SMACK rules and logging of failures is on
or
2) you can delete and logging of success is on,

you will see following:

	Unable to handle kernel NULL pointer dereference at virtual address 000002d7

	[<...>] (strlen+0x0/0x28)
	[<...>] (audit_log_untrustedstring+0x14/0x28)
	[<...>] (common_lsm_audit+0x108/0x6ac)
	[<...>] (smack_log+0xc4/0xe4)
	[<...>] (smk_curacc+0x80/0x10c)
	[<...>] (smack_inode_unlink+0x74/0x80)
	[<...>] (security_inode_unlink+0x2c/0x30)
	[<...>] (vfs_unlink+0x7c/0x100)
	[<...>] (do_unlinkat+0x144/0x16c)

The function smack_inode_unlink() (and smack_inode_rmdir()) need
to log two structures of different types. First of all it does:

	smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_DENTRY);
	smk_ad_setfield_u_fs_path_dentry(&ad, dentry);

This will set common audit data type to LSM_AUDIT_DATA_DENTRY
and store dentry for auditing (by function smk_curacc(), which in turn calls
dump_common_audit_data(), which is actually uses provided data and logs it).

	/*
	 * You need write access to the thing you're unlinking
	 */
	rc = smk_curacc(smk_of_inode(ip), MAY_WRITE, &ad);
	if (rc == 0) {
		/*
		 * You also need write access to the containing directory
		 */

Then this function wants to log anoter data:

		smk_ad_setfield_u_fs_path_dentry(&ad, NULL);
		smk_ad_setfield_u_fs_inode(&ad, dir);

The function sets inode field, but don't change common_audit_data type.

		rc = smk_curacc(smk_of_inode(dir), MAY_WRITE, &ad);
	}

So the dump_common_audit() function incorrectly interprets inode structure
as dentry, and Oops will happen.

This patch reinitializes common_audit_data structures with correct type.
Also I removed unneeded
	smk_ad_setfield_u_fs_path_dentry(&ad, NULL);
initialization, because both dentry and inode pointers are stored
in the same union.

Signed-off-by: Igor Zhbanov <i.zhbanov@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2013-03-19 14:38:17 -07:00
Rafal Krypa
e05b6f982a Smack: add support for modification of existing rules
Rule modifications are enabled via /smack/change-rule. Format is as follows:
"Subject Object rwaxt rwaxt"

First two strings are subject and object labels up to 255 characters.
Third string contains permissions to enable.
Fourth string contains permissions to disable.

All unmentioned permissions will be left unchanged.
If no rule previously existed, it will be created.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2013-03-19 14:16:42 -07:00
Jarkko Sakkinen
cee7e44334 smack: SMACK_MAGIC to include/uapi/linux/magic.h
SMACK_MAGIC moved to a proper place for easy user space access
(i.e. libsmack).

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@iki.fi>
2013-03-19 14:16:15 -07:00
Rafal Krypa
a87d79ad7c Smack: add missing support for transmute bit in smack_str_from_perm()
This fixes audit logs for granting or denial of permissions to show
information about transmute bit.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2013-03-19 14:15:41 -07:00
Rafal Krypa
d15d9fad16 Smack: prevent revoke-subject from failing when unseen label is written to it
Special file /smack/revoke-subject will silently accept labels that are not
present on the subject label list. Nothing has to be done for such labels,
as there are no rules for them to revoke.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2013-03-19 14:15:21 -07:00
Dan Carpenter
4502403dcf selinux: use GFP_ATOMIC under spin_lock
The call tree here is:

sk_clone_lock()              <- takes bh_lock_sock(newsk);
xfrm_sk_clone_policy()
__xfrm_sk_clone_policy()
clone_policy()               <- uses GFP_ATOMIC for allocations
security_xfrm_policy_clone()
security_ops->xfrm_policy_clone_security()
selinux_xfrm_policy_clone()

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-19 00:33:09 +11:00
Lai Jiangshan
921f3ac4c3 tomoyo: use DEFINE_SRCU() to define tomoyo_ss
DEFINE_STATIC_SRCU() defines srcu struct and do init at build time.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-18 23:51:31 +11:00
Mathieu Desnoyers
8aec0f5d41 Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys
Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to
compat_process_vm_rw() shows that the compatibility code requires an
explicit "access_ok()" check before calling
compat_rw_copy_check_uvector(). The same difference seems to appear when
we compare fs/read_write.c:do_readv_writev() to
fs/compat.c:compat_do_readv_writev().

This subtle difference between the compat and non-compat requirements
should probably be debated, as it seems to be error-prone. In fact,
there are two others sites that use this function in the Linux kernel,
and they both seem to get it wrong:

Now shifting our attention to fs/aio.c, we see that aio_setup_iocb()
also ends up calling compat_rw_copy_check_uvector() through
aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to
be missing. Same situation for
security/keys/compat.c:compat_keyctl_instantiate_key_iov().

I propose that we add the access_ok() check directly into
compat_rw_copy_check_uvector(), so callers don't have to worry about it,
and it therefore makes the compat call code similar to its non-compat
counterpart. Place the access_ok() check in the same location where
copy_from_user() can trigger a -EFAULT error in the non-compat code, so
the ABI behaviors are alike on both compat and non-compat.

While we are here, fix compat_do_readv_writev() so it checks for
compat_rw_copy_check_uvector() negative return values.

And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error
handling.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-12 11:05:45 -07:00
David Howells
0da9dfdd2c keys: fix race with concurrent install_user_keyrings()
This fixes CVE-2013-1792.

There is a race in install_user_keyrings() that can cause a NULL pointer
dereference when called concurrently for the same user if the uid and
uid-session keyrings are not yet created.  It might be possible for an
unprivileged user to trigger this by calling keyctl() from userspace in
parallel immediately after logging in.

Assume that we have two threads both executing lookup_user_key(), both
looking for KEY_SPEC_USER_SESSION_KEYRING.

	THREAD A			THREAD B
	===============================	===============================
					==>call install_user_keyrings();
	if (!cred->user->session_keyring)
	==>call install_user_keyrings()
					...
					user->uid_keyring = uid_keyring;
	if (user->uid_keyring)
		return 0;
	<==
	key = cred->user->session_keyring [== NULL]
					user->session_keyring = session_keyring;
	atomic_inc(&key->usage); [oops]

At the point thread A dereferences cred->user->session_keyring, thread B
hasn't updated user->session_keyring yet, but thread A assumes it is
populated because install_user_keyrings() returned ok.

The race window is really small but can be exploited if, for example,
thread B is interrupted or preempted after initializing uid_keyring, but
before doing setting session_keyring.

This couldn't be reproduced on a stock kernel.  However, after placing
systemtap probe on 'user->session_keyring = session_keyring;' that
introduced some delay, the kernel could be crashed reliably.

Fix this by checking both pointers before deciding whether to return.
Alternatively, the test could be done away with entirely as it is checked
inside the mutex - but since the mutex is global, that may not be the best
way.

Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-12 16:44:31 +11:00
Eric W. Biederman
ba0e3427b0 userns: Stop oopsing in key_change_session_keyring
Dave Jones <davej@redhat.com> writes:
> Just hit this on Linus' current tree.
>
> [   89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
> [   89.623111] IP: [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0
> [   89.624901] Oops: 0000 [#1] PREEMPT SMP
> [   89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii
> [   89.637846] CPU 2
> [   89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H
> [   89.639850] RIP: 0010:[<ffffffff810784b0>]  [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.641161] RSP: 0018:ffff880115657eb8  EFLAGS: 00010207
> [   89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000
> [   89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600
> [   89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000
> [   89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600
> [   89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000
> [   89.647431] FS:  00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000
> [   89.648660] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0
> [   89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [   89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490)
> [   89.654128] Stack:
> [   89.654433]  0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78
> [   89.655769]  ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000
> [   89.657073]  ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58
> [   89.658399] Call Trace:
> [   89.658822]  [<ffffffff812c7d9b>] key_change_session_keyring+0xfb/0x140
> [   89.659845]  [<ffffffff8106c665>] task_work_run+0xa5/0xd0
> [   89.660698]  [<ffffffff81002911>] do_notify_resume+0x71/0xb0
> [   89.661581]  [<ffffffff816c9a4a>] int_signal+0x12/0x17
> [   89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b
> [   89.667778] RIP  [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.668733]  RSP <ffff880115657eb8>
> [   89.669301] CR2: 00000000000000c8
>
> My fastest trinity induced oops yet!
>
>
> Appears to be..
>
>                 if ((set_ns == subset_ns->parent)  &&
>      850:       48 8b 8a c8 00 00 00    mov    0xc8(%rdx),%rcx
>
> from the inlined cred_cap_issubset

By historical accident we have been reading trying to set new->user_ns
from new->user_ns.  Which is totally silly as new->user_ns is NULL (as
is every other field in new except session_keyring at that point).

The intent is clearly to copy all of the fields from old to new so copy
old->user_ns into  into new->user_ns.

Cc: stable@vger.kernel.org
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-03 19:35:38 -08:00
Linus Torvalds
56a79b7b02 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull  more VFS bits from Al Viro:
 "Unfortunately, it looks like xattr series will have to wait until the
  next cycle ;-/

  This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
  etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
  more file_inode() work"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify path_get/path_put and fs_struct.c stuff
  fix nommu breakage in shmem.c
  cache the value of file_inode() in struct file
  9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
  9p: make sure ->lookup() adds fid to the right dentry
  9p: untangle ->lookup() a bit
  9p: double iput() in ->lookup() if d_materialise_unique() fails
  9p: v9fs_fid_add() can't fail now
  v9fs: get rid of v9fs_dentry
  9p: turn fid->dlist into hlist
  9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
  more file_inode() open-coded instances
  selinux: opened file can't have NULL or negative ->f_path.dentry

(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)
2013-03-03 13:23:03 -08:00
Sasha Levin
b67bfe0d42 hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:24 -08:00
Al Viro
45e09bd51b selinux: opened file can't have NULL or negative ->f_path.dentry
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-27 13:22:14 -05:00
Linus Torvalds
d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Al Viro
182be68478 kill f_vfsmnt
very few users left...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26 02:46:10 -05:00
Mimi Zohar
446d64e3e1 block: fix part_pack_uuid() build error
Commit "85865c1 ima: add policy support for file system uuid"
introduced a CONFIG_BLOCK dependency.  This patch defines a
wrapper called blk_part_pack_uuid(), which returns -EINVAL,
when CONFIG_BLOCK is not defined.

security/integrity/ima/ima_policy.c:538:4: error: implicit declaration
of function 'part_pack_uuid' [-Werror=implicit-function-declaration]

Changelog v2:
- Reference commit number in patch description
Changelog v1:
- rename ima_part_pack_uuid() to blk_part_pack_uuid()
- resolve scripts/checkpatch.pl warnings
Changelog v0:
- fix UUID scripts/Lindent msgs

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-02-26 03:10:52 +11:00
Mimi Zohar
a2c2c3a71c ima: "remove enforce checking duplication" merge fix
Commit "750943a ima: remove enforce checking duplication" combined
the 'in IMA policy' and 'enforcing file integrity' checks.  For
the non-file, kernel module verification, a specific check for
'enforcing file integrity' was not added.  This patch adds the
check.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-02-26 02:46:38 +11:00
Al Viro
496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
Jerry Snitselaar
53eb8c82d5 device_cgroup: don't grab mutex in rcu callback
Commit 103a197c0c ("security/device_cgroup: lock assert fails in
dev_exception_clean()") grabs devcgroup_mutex to fix assert failure, but
a mutex can't be grabbed in rcu callback.  Since there shouldn't be any
other references when css_free is called, mutex isn't needed for list
cleanup in devcgroup_css_free().

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:15 -08:00
Linus Torvalds
33673dcb37 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "This is basically a maintenance update for the TPM driver and EVM/IMA"

Fix up conflicts in lib/digsig.c and security/integrity/ima/ima_main.c

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (45 commits)
  tpm/ibmvtpm: build only when IBM pseries is configured
  ima: digital signature verification using asymmetric keys
  ima: rename hash calculation functions
  ima: use new crypto_shash API instead of old crypto_hash
  ima: add policy support for file system uuid
  evm: add file system uuid to EVM hmac
  tpm_tis: check pnp_acpi_device return code
  char/tpm/tpm_i2c_stm_st33: drop temporary variable for return value
  char/tpm/tpm_i2c_stm_st33: remove dead assignment in tpm_st33_i2c_probe
  char/tpm/tpm_i2c_stm_st33: Remove __devexit attribute
  char/tpm/tpm_i2c_stm_st33: Don't use memcpy for one byte assignment
  tpm_i2c_stm_st33: removed unused variables/code
  TPM: Wait for TPM_ACCESS tpmRegValidSts to go high at startup
  tpm: Fix cancellation of TPM commands (interrupt mode)
  tpm: Fix cancellation of TPM commands (polling mode)
  tpm: Store TPM vendor ID
  TPM: Work around buggy TPMs that block during continue self test
  tpm_i2c_stm_st33: fix oops when i2c client is unavailable
  char/tpm: Use struct dev_pm_ops for power management
  TPM: STMicroelectronics ST33 I2C BUILD STUFF
  ...
2013-02-21 08:18:12 -08:00
David Howells
fe9453a1dc KEYS: Revert one application of "Fix unreachable code" patch
A patch to fix some unreachable code in search_my_process_keyrings() got
applied twice by two different routes upstream as commits e67eab39be
and b010520ab3 (both "fix unreachable code").

Unfortunately, the second application removed something it shouldn't
have and this wasn't detected by GIT.  This is due to the patch not
having sufficient lines of context to distinguish the two places of
application.

The effect of this is relatively minor: inside the kernel, the keyring
search routines may search multiple keyrings and then prioritise the
errors if no keys or negative keys are found in any of them.  With the
extra deletion, the presence of a negative key in the thread keyring
(causing ENOKEY) is incorrectly overridden by an error searching the
process keyring.

So revert the second application of the patch.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 07:56:25 -08:00
Dmitry Kasatkin
e0751257a6 ima: digital signature verification using asymmetric keys
Asymmetric keys were introduced in linux-3.7 to verify the signature on
signed kernel modules. The asymmetric keys infrastructure abstracts the
signature verification from the crypto details. This patch adds IMA/EVM
signature verification using asymmetric keys. Support for additional
signature verification methods can now be delegated to the asymmetric
key infrastructure.

Although the module signature header and the IMA/EVM signature header
could use the same format, to minimize the signature length and save
space in the extended attribute, this patch defines a new IMA/EVM
header format.  The main difference is that the key identifier is a
sha1[12 - 19] hash of the key modulus and exponent, similar to the
current implementation.  The only purpose of the key identifier is to
identify the corresponding key in the kernel keyring.  ima-evm-utils
was updated to support the new signature format.

While asymmetric signature verification functionality supports many
different hash algorithms, the hash used in this patch is calculated
during the IMA collection phase, based on the configured algorithm.
The default algorithm is sha1, but for backwards compatibility md5
is supported.  Due to this current limitation, signatures should be
generated using a sha1 hash algorithm.

Changes in this patch:
- Functionality has been moved to separate source file in order to get rid of
  in source #ifdefs.
- keyid is derived according to the RFC 3280. It does not require to assign
  IMA/EVM specific "description" when loading X509 certificate. Kernel
  asymmetric key subsystem automatically generate the description. Also
  loading a certificate does not require using of ima-evm-utils and can be
  done using keyctl only.
- keyid size is reduced to 32 bits to save xattr space.  Key search is done
  using partial match functionality of asymmetric_key_match().
- Kconfig option title was changed

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-02-06 21:22:18 -05:00
Dmitry Kasatkin
50af554466 ima: rename hash calculation functions
Rename hash calculation functions to reflect meaning
and change argument order in conventional way.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-02-06 10:41:13 -05:00
Dmitry Kasatkin
76bb28f612 ima: use new crypto_shash API instead of old crypto_hash
Old crypto hash API internally uses shash API.
Using shash API directly is more efficient.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-02-06 10:41:12 -05:00
Dmitry Kasatkin
85865c1fa1 ima: add policy support for file system uuid
The IMA policy permits specifying rules to enable or disable
measurement/appraisal/audit based on the file system magic number.
If, for example, the policy contains an ext4 measurement rule,
the rule is enabled for all ext4 partitions.

Sometimes it might be necessary to enable measurement/appraisal/audit
only for one partition and disable it for another partition of the
same type.  With the existing IMA policy syntax, this can not be done.

This patch provides support for IMA policy rules to specify the file
system by its UUID (eg. fsuuid=397449cd-687d-4145-8698-7fed4a3e0363).

For partitions not being appraised, it might be a good idea to mount
file systems with the 'noexec' option to prevent executing non-verified
binaries.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-02-06 10:40:29 -05:00
Dmitry Kasatkin
74de668424 evm: add file system uuid to EVM hmac
EVM uses the same key for all file systems to calculate the HMAC,
making it possible to paste inodes from one file system on to another
one, without EVM being able to detect it.  To prevent such an attack,
it is necessary to make the EVM HMAC file system specific.

This patch uses the file system UUID, a file system unique identifier,
to bind the EVM HMAC to the file system. The value inode->i_sb->s_uuid
is used for the HMAC hash calculation, instead of using it for deriving
the file system specific key.  Initializing the key for every inode HMAC
calculation is a bit more expensive operation than adding the uuid to
the HMAC hash.

Changing the HMAC calculation method or adding additional info to the
calculation, requires existing EVM labeled file systems to be relabeled.
This patch adds a Kconfig HMAC version option for backwards compatability.

Changelog v1:
- squash "hmac version setting"
Changelog v0:
- add missing Kconfig depends (Mimi)

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-02-06 10:40:28 -05:00
Linus Torvalds
22f8379815 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:
 "Much more accumulated than I would have liked due to an unexpected
  bout with a nasty flu:

   1) AH and ESP input don't set ECN field correctly because the
      transport head of the SKB isn't set correctly, fix from Li
      RongQing.

   2) If netfilter conntrack zones are disabled, we can return an
      uninitialized variable instead of the proper error code.  Fix from
      Borislav Petkov.

   3) Fix double SKB free in ath9k driver beacon handling, from Felix
      Feitkau.

   4) Remove bogus assumption about netns cleanup ordering in
      nf_conntrack, from Pablo Neira Ayuso.

   5) Remove a bogus BUG_ON in the new TCP fastopen code, from Eric
      Dumazet.  It uses spin_is_locked() in it's test and is therefore
      unsuitable for UP.

   6) Fix SELINUX labelling regressions added by the tuntap multiqueue
      changes, from Paul Moore.

   7) Fix CRC errors with jumbo frame receive in tg3 driver, from Nithin
      Nayak Sujir.

   8) CXGB4 driver sets interrupt coalescing parameters only on first
      queue, rather than all of them.  Fix from Thadeu Lima de Souza
      Cascardo.

   9) Fix regression in the dispatch of read/write registers in dm9601
      driver, from Tushar Behera.

  10) ipv6_append_data miscalculates header length, from Romain KUNTZ.

  11) Fix PMTU handling regressions on ipv4 routes, from Steffen
      Klassert, Timo Teräs, and Julian Anastasov.

  12) In 3c574_cs driver, add necessary parenthesis to "x << y & z"
      expression.  From Nickolai Zeldovich.

  13) macvlan_get_size() causes underallocation netlink message space,
      fix from Eric Dumazet.

  14) Avoid division by zero in xfrm_replay_advance_bmp(), from Nickolai
      Zeldovich.  Amusingly the zero check was already there, we were
      just performing it after the modulus :-)

  15) Some more splice bug fixes from Eric Dumazet, which fix things
      mostly eminating from how we now more aggressively use high-order
      pages in SKBs.

  16) Fix size calculation bug when freeing hash tables in the IPSEC
      xfrm code, from Michal Kubecek.

  17) Fix PMTU event propagation into socket cached routes, from Steffen
      Klassert.

  18) Fix off by one in TX buffer release in netxen driver, from Eric
      Dumazet.

  19) Fix rediculous memory allocation requirements introduced by the
      tuntap multiqueue changes, from Jason Wang.

  20) Remove bogus AMD platform workaround in r8169 driver that causes
      major problems in normal operation, from Timo Teräs.

  21) virtio-net set affinity and select queue don't handle
      discontiguous cpu numbers properly, fix from Wanlong Gao.

  22) Fix a route refcounting issue in loopback driver, from Eric
      Dumazet.  There's a similar fix coming that we might add to the
      macvlan driver as well.

  23) Fix SKB leaks in batman-adv's distributed arp table code, from
      Matthias Schiffer.

  24) r8169 driver gives descriptor ownership back the hardware before
      we're done reading the VLAN tag out of it, fix from Francois
      Romieu.

  25) Checksums not calculated properly in GRE tunnel driver fix from
      Pravin B Shelar.

26) Fix SCTP memory leak on namespace exit."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits)
  dm9601: support dm9620 variant
  SCTP: Free the per-net sysctl table on net exit. v2
  net: phy: icplus: fix broken INTR pin settings
  net: phy: icplus: Use the RGMII interface mode to configure clock delays
  IP_GRE: Fix kernel panic in IP_GRE with GRE csum.
  sctp: set association state to established in dupcook_a handler
  ip6mr: limit IPv6 MRT_TABLE identifiers
  r8169: fix vlan tag read ordering.
  net: cdc_ncm: use IAD provided by the USB core
  batman-adv: filter ARP packets with invalid MAC addresses in DAT
  batman-adv: check for more types of invalid IP addresses in DAT
  batman-adv: fix skb leak in batadv_dat_snoop_incoming_arp_reply()
  net: loopback: fix a dst refcounting issue
  virtio-net: reset virtqueue affinity when doing cpu hotplug
  virtio-net: split out clean affinity function
  virtio-net: fix the set affinity bug when CPU IDs are not consecutive
  can: pch_can: fix invalid error codes
  can: ti_hecc: fix invalid error codes
  can: c_can: fix invalid error codes
  r8169: remove the obsolete and incorrect AMD workaround
  ...
2013-01-28 11:41:37 -08:00
Mimi Zohar
5a73fcfa88 ima: differentiate appraise status only for hook specific rules
Different hooks can require different methods for appraising a
file's integrity.  As a result, an integrity appraisal status is
cached on a per hook basis.

Only a hook specific rule, requires the inode to be re-appraised.
This patch eliminates unnecessary appraisals.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2013-01-22 16:10:39 -05:00
Mimi Zohar
d79d72e024 ima: per hook cache integrity appraisal status
With the new IMA policy 'appraise_type=' option, different hooks
can require different methods for appraising a file's integrity.

For example, the existing 'ima_appraise_tcb' policy defines a
generic rule, requiring all root files to be appraised, without
specfying the appraisal method.  A more specific rule could require
all kernel modules, for example, to be signed.

appraise fowner=0 func=MODULE_CHECK appraise_type=imasig
appraise fowner=0

As a result, the integrity appraisal results for the same inode, but
for different hooks, could differ.  This patch caches the integrity
appraisal results on a per hook basis.

Changelog v2:
- Rename ima_cache_status() to ima_set_cache_status()
- Rename and move get_appraise_status() to ima_get_cache_status()
Changelog v0:
- include IMA_APPRAISE/APPRAISED_SUBMASK in IMA_DO/DONE_MASK (Dmitry)
- Support independent MODULE_CHECK appraise status.
- fixed IMA_XXXX_APPRAISE/APPRAISED flags

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2013-01-22 16:10:36 -05:00
Mimi Zohar
f578c08ec9 ima: increase iint flag size
In preparation for hook specific appraise status results, increase
the iint flags size.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2013-01-22 16:10:34 -05:00
Dmitry Kasatkin
0e5a247cb3 ima: added policy support for 'security.ima' type
The 'security.ima' extended attribute may contain either the file data's
hash or a digital signature.  This patch adds support for requiring a
specific extended attribute type.  It extends the IMA policy with a new
keyword 'appraise_type=imasig'.  (Default is hash.)

Changelog v2:
- Fixed Documentation/ABI/testing/ima_policy option syntax
Changelog v1:
- Differentiate between 'required' vs. 'actual' extended attribute

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-22 16:10:31 -05:00
Jerry Snitselaar
103a197c0c security/device_cgroup: lock assert fails in dev_exception_clean()
devcgroup_css_free() calls dev_exception_clean() without the devcgroup_mutex being locked.

Shutting down a kvm virt was giving me the following trace:

[36280.732764] ------------[ cut here ]------------
[36280.732778] WARNING: at /home/snits/dev/linux/security/device_cgroup.c:172 dev_exception_clean+0xa9/0xc0()
[36280.732782] Hardware name: Studio XPS 8100
[36280.732785] Modules linked in: xt_REDIRECT fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc nf_conntrack_ipv4 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter it87 hwmon_vid xt_state nf_conntrack ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq coretemp snd_seq_device crc32c_intel snd_pcm snd_page_alloc snd_timer snd broadcom tg3 serio_raw i7core_edac edac_core ptp pps_core lpc_ich pcspkr mfd_core soundcore microcode i2c_i801 nfsd auth_rpcgss nfs_acl lockd vhost_net sunrpc tun macvtap macvlan kvm_intel kvm uinput binfmt_misc autofs4 usb_storage firewire_ohci firewire_core crc_itu_t radeon drm_kms_helper ttm
[36280.732921] Pid: 933, comm: libvirtd Tainted: G        W    3.8.0-rc3-00307-g4c217de #1
[36280.732922] Call Trace:
[36280.732927]  [<ffffffff81044303>] warn_slowpath_common+0x93/0xc0
[36280.732930]  [<ffffffff8104434a>] warn_slowpath_null+0x1a/0x20
[36280.732932]  [<ffffffff812deaf9>] dev_exception_clean+0xa9/0xc0
[36280.732934]  [<ffffffff812deb2a>] devcgroup_css_free+0x1a/0x30
[36280.732938]  [<ffffffff810ccd76>] cgroup_diput+0x76/0x210
[36280.732941]  [<ffffffff8119eac0>] d_delete+0x120/0x180
[36280.732943]  [<ffffffff81195cff>] vfs_rmdir+0xef/0x130
[36280.732945]  [<ffffffff81195e47>] do_rmdir+0x107/0x1c0
[36280.732949]  [<ffffffff8132d17e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[36280.732951]  [<ffffffff81198646>] sys_rmdir+0x16/0x20
[36280.732954]  [<ffffffff8173bd82>] system_call_fastpath+0x16/0x1b
[36280.732956] ---[ end trace ca39dced899a7d9f ]---

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>
Cc: stable@kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-01-22 00:27:55 +11:00
Dmitry Kasatkin
a67adb9974 evm: checking if removexattr is not a NULL
The following lines of code produce a kernel oops.

fd = socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0);
fchmod(fd, 0666);

[  139.922364] BUG: unable to handle kernel NULL pointer dereference at   (null)
[  139.924982] IP: [<  (null)>]   (null)
[  139.924982] *pde = 00000000
[  139.924982] Oops: 0000 [#5] SMP
[  139.924982] Modules linked in: fuse dm_crypt dm_mod i2c_piix4 serio_raw evdev binfmt_misc button
[  139.924982] Pid: 3070, comm: acpid Tainted: G      D      3.8.0-rc2-kds+ #465 Bochs Bochs
[  139.924982] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
[  139.924982] EIP is at 0x0
[  139.924982] EAX: cf5ef000 EBX: cf5ef000 ECX: c143d600 EDX: c15225f2
[  139.924982] ESI: cf4d2a1c EDI: cf4d2a1c EBP: cc02df10 ESP: cc02dee4
[  139.924982]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  139.924982] CR0: 80050033 CR2: 00000000 CR3: 0c059000 CR4: 000006d0
[  139.924982] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  139.924982] DR6: ffff0ff0 DR7: 00000400
[  139.924982] Process acpid (pid: 3070, ti=cc02c000 task=d7705340 task.ti=cc02c000)
[  139.924982] Stack:
[  139.924982]  c1203c88 00000000 cc02def4 cf4d2a1c ae21eefa 471b60d5 1083c1ba c26a5940
[  139.924982]  e891fb5e 00000041 00000004 cc02df1c c1203964 00000000 cc02df4c c10e20c3
[  139.924982]  00000002 00000000 00000000 22222222 c1ff2222 cf5ef000 00000000 d76efb08
[  139.924982] Call Trace:
[  139.924982]  [<c1203c88>] ? evm_update_evmxattr+0x5b/0x62
[  139.924982]  [<c1203964>] evm_inode_post_setattr+0x22/0x26
[  139.924982]  [<c10e20c3>] notify_change+0x25f/0x281
[  139.924982]  [<c10cbf56>] chmod_common+0x59/0x76
[  139.924982]  [<c10e27a1>] ? put_unused_fd+0x33/0x33
[  139.924982]  [<c10cca09>] sys_fchmod+0x39/0x5c
[  139.924982]  [<c13f4f30>] syscall_call+0x7/0xb
[  139.924982] Code:  Bad EIP value.

This happens because sockets do not define the removexattr operation.
Before removing the xattr, verify the removexattr function pointer is
not NULL.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-01-22 00:27:50 +11:00
Dmitry Kasatkin
a175b8bb29 ima: forbid write access to files with digital signatures
This patch forbids write access to files with digital signatures, as they
are considered immutable.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-16 17:50:05 -05:00
Dmitry Kasatkin
ea1046d4c5 ima: move full pathname resolution to separate function
Define a new function ima_d_path(), which returns the full pathname.
This function will be used further, for example, by the directory
verification code.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-16 17:50:03 -05:00
Dmitry Kasatkin
ee86633174 integrity: reduce storage size for ima_status and evm_status
This patch reduces size of the iint structure by 8 bytes.
It saves about 15% of iint cache memory.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-16 17:50:01 -05:00
Mimi Zohar
16cac49f72 ima: rename FILE_MMAP to MMAP_CHECK
Rename FILE_MMAP hook to MMAP_CHECK to be consistent with the other
hook names.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2013-01-16 17:49:59 -05:00
Dmitry Kasatkin
b51524635b ima: remove security.ima hexdump
Hexdump is not really helping. Audit messages prints error messages.
Remove it.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-16 17:49:57 -05:00
Dmitry Kasatkin
750943a307 ima: remove enforce checking duplication
Based on the IMA appraisal policy, files are appraised.  For those
files appraised, the IMA hooks return the integrity appraisal result,
assuming IMA-appraisal is in enforcing mode.  This patch combines
both of these criteria (in policy and enforcing file integrity),
removing the checking duplication.

Changelog v1:
- Update hook comments

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-16 17:49:44 -05:00
Dmitry Kasatkin
def3e8b9ee ima: set appraise status in fix mode only when xattr is fixed
When a file system is mounted read-only, setting the xattr value in
fix mode fails with an error code -EROFS.  The xattr should be fixed
after the file system is remounted read-write.  This patch verifies
that the set xattr succeeds, before setting the appraise status value
to INTEGRITY_PASS.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-16 15:47:07 -05:00
Dmitry Kasatkin
e90805656d evm: remove unused cleanup functions
EVM cannot be built as a kernel module. Remove the unncessary __exit
functions.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2013-01-16 15:47:05 -05:00
Mimi Zohar
7163a99384 ima: re-initialize IMA policy LSM info
Although the IMA policy does not change, the LSM policy can be
reloaded, leaving the IMA LSM based rules referring to the old,
stale LSM policy.  This patch updates the IMA LSM based rules
to reflect the reloaded LSM policy.

Reported-by: Sven Vermeulen <sven.vermeulen@siphos.be>
tested-by: Sven Vermeulen <sven.vermeulen@siphos.be>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
2013-01-16 15:47:03 -05:00
Paul Moore
5dbbaf2de8 tun: fix LSM/SELinux labeling of tun/tap devices
This patch corrects some problems with LSM/SELinux that were introduced
with the multiqueue patchset.  The problem stems from the fact that the
multiqueue work changed the relationship between the tun device and its
associated socket; before the socket persisted for the life of the
device, however after the multiqueue changes the socket only persisted
for the life of the userspace connection (fd open).  For non-persistent
devices this is not an issue, but for persistent devices this can cause
the tun device to lose its SELinux label.

We correct this problem by adding an opaque LSM security blob to the
tun device struct which allows us to have the LSM security state, e.g.
SELinux labeling information, persist for the lifetime of the tun
device.  In the process we tweak the LSM hooks to work with this new
approach to TUN device/socket labeling and introduce a new LSM hook,
security_tun_dev_attach_queue(), to approve requests to attach to a
TUN queue via TUNSETQUEUE.

The SELinux code has been adjusted to match the new LSM hooks, the
other LSMs do not make use of the LSM TUN controls.  This patch makes
use of the recently added "tun_socket:attach_queue" permission to
restrict access to the TUNSETQUEUE operation.  On older SELinux
policies which do not define the "tun_socket:attach_queue" permission
the access control decision for TUNSETQUEUE will be handled according
to the SELinux policy's unknown permission setting.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Tested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14 18:16:59 -05:00
Paul Moore
6f96c142f7 selinux: add the "attach_queue" permission to the "tun_socket" class
Add a new permission to align with the new TUN multiqueue support,
"tun_socket:attach_queue".

The corresponding SELinux reference policy patch is show below:

 diff --git a/policy/flask/access_vectors b/policy/flask/access_vectors
 index 28802c5..a0664a1 100644
 --- a/policy/flask/access_vectors
 +++ b/policy/flask/access_vectors
 @@ -827,6 +827,9 @@ class kernel_service

  class tun_socket
  inherits socket
 +{
 +       attach_queue
 +}

  class x_pointer
  inherits x_device

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Tested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-14 18:16:59 -05:00
Mimi Zohar
a7f2a366f6 ima: fallback to MODULE_SIG_ENFORCE for existing kernel module syscall
The new kernel module syscall appraises kernel modules based
on policy.   If the IMA policy requires kernel module checking,
fallback to module signature enforcing for the existing syscall.
Without CONFIG_MODULE_SIG_FORCE enabled, the kernel module's
integrity is unknown, return -EACCES.

Changelog v1:
- Fix ima_module_check() return result (Tetsuo Handa)

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2012-12-24 09:35:48 -05:00
Alan Cox
e67eab39be keys: fix unreachable code
We set ret to NULL then test it. Remove the bogus test

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 17:40:21 -08:00
Linus Torvalds
9eb127cc04 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Really fix tuntap SKB use after free bug, from Eric Dumazet.

 2) Adjust SKB data pointer to point past the transport header before
    calling icmpv6_notify() so that the headers are in the state which
    that function expects.  From Duan Jiong.

 3) Fix ambiguities in the new tuntap multi-queue APIs.  From Jason
    Wang.

 4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.

 5) Don't destroy mutex after freeing up device private in mac802154,
    fix also from Konstantin Khlebnikov.

 6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.

 7) SCTP HMAC kconfig rework, from Neil Horman.

 8) Fix SCTP jprobes function signature, otherwise things explode, from
    Daniel Borkmann.

 9) Fix typo in ipv6-offload Makefile variable reference, from Simon
    Arlott.

10) Don't fail USBNET open just because remote wakeup isn't supported,
    from Oliver Neukum.

11) be2net driver bug fixes from Sathya Perla.

12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
    Woodhouse.

13) Fix MTU changing regression in 8139cp driver, from John Greene.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
  solos-pci: ensure all TX packets are aligned to 4 bytes
  solos-pci: add firmware upgrade support for new models
  solos-pci: remove superfluous debug output
  solos-pci: add GPIO support for newer versions on Geos board
  8139cp: Prevent dev_close/cp_interrupt race on MTU change
  net: qmi_wwan: add ZTE MF880
  drivers/net: Use of_match_ptr() macro in smsc911x.c
  drivers/net: Use of_match_ptr() macro in smc91x.c
  ipv6: addrconf.c: remove unnecessary "if"
  bridge: Correctly encode addresses when dumping mdb entries
  bridge: Do not unregister all PF_BRIDGE rtnl operations
  use generic usbnet_manage_power()
  usbnet: generic manage_power()
  usbnet: handle PM failure gracefully
  ksz884x: fix receive polling race condition
  qlcnic: update driver version
  qlcnic: fix unused variable warnings
  net: fec: forbid FEC_PTP on SoCs that do not support
  be2net: fix wrong frag_idx reported by RX CQ
  be2net: fix be_close() to ensure all events are ack'ed
  ...
2012-12-19 20:29:15 -08:00
Linus Torvalds
7a684c452e Nothing all that exciting; a new module-from-fd syscall for those who want
to verify the source of the module (ChromeOS) and/or use standard IMA on it
 or other security hooks.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ0VKlAAoJENkgDmzRrbjxjuEQALVHpD1cSmryOzVwkNn7rVGP
 PV3KVbUs+qzUCm2c3AafIIlSBm2LOUl+cR3uNC7di8aHarRF3VHkK2OQ4Fx97ECd
 KKBqAyY3R0q1mAKujb/MWwiK0YgosEDIOzGGn2yQhNFsxKqnMB02P4j82IO7+g+w
 Cc3XuDyWHoH2I+ySgz0Q8NHAqufD/DMZUKud7jw2Lsv6PuICJ1Oqgl/Gd/muxort
 4a5tV3tjhRGywHS/8b2fbDUXkybC5NKK0FN+gyoaROmJ/THeHEQDGXZT9bc2vmVx
 HvRy/5k8dzQ6LAJ2mLnPvy0pmv0u7NYMvjxTxxUlUkFMkYuVticikQfwSYDbDPt4
 mbsLxchpgi8z4x8HltEERffCX5tldo/5hz1uemqhqIsMRIrRFnlHkSIgkGjVHf2u
 LXQBLT8uTm6C0VyNQPrI/hUZzIax7WtKbPSoK9lmExNbKqloEFh/mVXvfQxei2kp
 wnUZcnmPIqSvw7b4CWu7HibMYu2VvGBgm3YIfJRi4AQme1mzFYLpZoxF5Pj+Ykbt
 T//Hb1EsNQTTFCg7MZhnJSAw/EVUvNDUoullORClyqw6+xxjVKqWpPJgYDRfWOlJ
 Xa+s7DNrL+Oo1WWR8l5ruoQszbR8szIyeyPKKxRUcQj2zsqghoWuzKAx2saSEw3W
 pNkoJU+dGC7kG/yVAS8N
 =uoJj
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module update from Rusty Russell:
 "Nothing all that exciting; a new module-from-fd syscall for those who
  want to verify the source of the module (ChromeOS) and/or use standard
  IMA on it or other security hooks."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MODSIGN: Fix kbuild output when using default extra_certificates
  MODSIGN: Avoid using .incbin in C source
  modules: don't hand 0 to vmalloc.
  module: Remove a extra null character at the top of module->strtab.
  ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
  ASN.1: Define indefinite length marker constant
  moduleparam: use __UNIQUE_ID()
  __UNIQUE_ID()
  MODSIGN: Add modules_sign make target
  powerpc: add finit_module syscall.
  ima: support new kernel module syscall
  add finit_module syscall to asm-generic
  ARM: add finit_module syscall to ARM
  security: introduce kernel_module_from_file hook
  module: add flags arg to sys_finit_module()
  module: add syscall to load module from fd
2012-12-19 07:55:08 -08:00
Linus Torvalds
a2faf2fc53 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull (again) user namespace infrastructure changes from Eric Biederman:
 "Those bugs, those darn embarrasing bugs just want don't want to get
  fixed.

  Linus I just updated my mirror of your kernel.org tree and it appears
  you successfully pulled everything except the last 4 commits that fix
  those embarrasing bugs.

  When you get a chance can you please repull my branch"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns: Fix typo in description of the limitation of userns_install
  userns: Add a more complete capability subset test to commit_creds
  userns: Require CAP_SYS_ADMIN for most uses of setns.
  Fix cap_capable to only allow owners in the parent user namespace to have caps.
2012-12-18 10:55:28 -08:00
Linus Torvalds
6a2b60b17b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace changes from Eric Biederman:
 "While small this set of changes is very significant with respect to
  containers in general and user namespaces in particular.  The user
  space interface is now complete.

  This set of changes adds support for unprivileged users to create user
  namespaces and as a user namespace root to create other namespaces.
  The tyranny of supporting suid root preventing unprivileged users from
  using cool new kernel features is broken.

  This set of changes completes the work on setns, adding support for
  the pid, user, mount namespaces.

  This set of changes includes a bunch of basic pid namespace
  cleanups/simplifications.  Of particular significance is the rework of
  the pid namespace cleanup so it no longer requires sending out
  tendrils into all kinds of unexpected cleanup paths for operation.  At
  least one case of broken error handling is fixed by this cleanup.

  The files under /proc/<pid>/ns/ have been converted from regular files
  to magic symlinks which prevents incorrect caching by the VFS,
  ensuring the files always refer to the namespace the process is
  currently using and ensuring that the ptrace_mayaccess permission
  checks are always applied.

  The files under /proc/<pid>/ns/ have been given stable inode numbers
  so it is now possible to see if different processes share the same
  namespaces.

  Through the David Miller's net tree are changes to relax many of the
  permission checks in the networking stack to allowing the user
  namespace root to usefully use the networking stack.  Similar changes
  for the mount namespace and the pid namespace are coming through my
  tree.

  Two small changes to add user namespace support were commited here adn
  in David Miller's -net tree so that I could complete the work on the
  /proc/<pid>/ns/ files in this tree.

  Work remains to make it safe to build user namespaces and 9p, afs,
  ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the
  Kconfig guard remains in place preventing that user namespaces from
  being built when any of those filesystems are enabled.

  Future design work remains to allow root users outside of the initial
  user namespace to mount more than just /proc and /sys."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits)
  proc: Usable inode numbers for the namespace file descriptors.
  proc: Fix the namespace inode permission checks.
  proc: Generalize proc inode allocation
  userns: Allow unprivilged mounts of proc and sysfs
  userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file
  procfs: Print task uids and gids in the userns that opened the proc file
  userns: Implement unshare of the user namespace
  userns: Implent proc namespace operations
  userns: Kill task_user_ns
  userns: Make create_new_namespaces take a user_ns parameter
  userns: Allow unprivileged use of setns.
  userns: Allow unprivileged users to create new namespaces
  userns: Allow setting a userns mapping to your current uid.
  userns: Allow chown and setgid preservation
  userns: Allow unprivileged users to create user namespaces.
  userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped
  userns: fix return value on mntns_install() failure
  vfs: Allow unprivileged manipulation of the mount namespace.
  vfs: Only support slave subtrees across different user namespaces
  vfs: Add a user namespace reference from struct mnt_namespace
  ...
2012-12-17 15:44:47 -08:00
Linus Torvalds
2a74dbb9a8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "A quiet cycle for the security subsystem with just a few maintenance
  updates."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  Smack: create a sysfs mount point for smackfs
  Smack: use select not depends in Kconfig
  Yama: remove locking from delete path
  Yama: add RCU to drop read locking
  drivers/char/tpm: remove tasklet and cleanup
  KEYS: Use keyring_alloc() to create special keyrings
  KEYS: Reduce initial permissions on keys
  KEYS: Make the session and process keyrings per-thread
  seccomp: Make syscall skipping and nr changes more consistent
  key: Fix resource leak
  keys: Fix unreachable code
  KEYS: Add payload preparsing opportunity prior to key instantiate or update
2012-12-16 15:40:50 -08:00
Amerigo Wang
9dd9ff9953 bridge: update selinux perm table for RTM_NEWMDB and RTM_DELMDB
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-15 17:14:38 -08:00
Eric W. Biederman
520d9eabce Fix cap_capable to only allow owners in the parent user namespace to have caps.
Andy Lutomirski pointed out that the current behavior of allowing the
owner of a user namespace to have all caps when that owner is not in a
parent user namespace is wrong.  Add a test to ensure the owner of a user
namespace is in the parent of the user namespace to fix this bug.

Thankfully this bug did not apply to the initial user namespace, keeping
the mischief that can be caused by this bug quite small.

This is bug was introduced in v3.5 by commit 783291e690
"Simplify the user_namespace by making userns->creator a kuid."
But did not matter until the permisions required to create
a user namespace were relaxed allowing a user namespace to be created
inside of a user namespace.

The bug made it possible for the owner of a user namespace to be
present in a child user namespace.  Since the owner of a user nameapce
is granted all capabilities it became possible for users in a
grandchild user namespace to have all privilges over their parent user
namspace.

Reorder the checks in cap_capable.  This should make the common case
faster and make it clear that nothing magic happens in the initial
user namespace.  The reordering is safe because cred->user_ns
can only be in targ_ns or targ_ns->parent but not both.

Add a comment a the top of the loop to make the logic of
the code clear.

Add a distinct variable ns that changes as we walk up
the user namespace hierarchy to make it clear which variable
is changing.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-12-14 13:50:32 -08:00
Casey Schaufler
e930723741 Smack: create a sysfs mount point for smackfs
There are a number of "conventions" for where to put LSM filesystems.
Smack adheres to none of them. Create a mount point at /sys/fs/smackfs
for mounting smackfs so that Smack can be conventional.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-12-14 10:57:23 -08:00
Casey Schaufler
111fe8bd65 Smack: use select not depends in Kconfig
The components NETLABEL and SECURITY_NETWORK are required by
Smack. Using "depends" in Kconfig hides the Smack option
if the user hasn't figured out that they need to be enabled
while using make menuconfig. Using select is a better choice.
Because select is not recursive depends on NET and SECURITY
are added. The reflects similar usage in TOMOYO and AppArmor.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-12-14 10:57:10 -08:00
Mimi Zohar
fdf90729e5 ima: support new kernel module syscall
With the addition of the new kernel module syscall, which defines two
arguments - a file descriptor to the kernel module and a pointer to a NULL
terminated string of module arguments - it is now possible to measure and
appraise kernel modules like any other file on the file system.

This patch adds support to measure and appraise kernel modules in an
extensible and consistent manner.

To support filesystems without extended attribute support, additional
patches could pass the signature as the first parameter.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-14 13:05:26 +10:30
Kees Cook
2e72d51b4a security: introduce kernel_module_from_file hook
Now that kernel module origins can be reasoned about, provide a hook to
the LSMs to make policy decisions about the module file. This will let
Chrome OS enforce that loadable kernel modules can only come from its
read-only hash-verified root filesystem. Other LSMs can, for example,
read extended attributes for signatures, etc.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-14 13:05:24 +10:30
Linus Torvalds
a2013a13e6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial branch from Jiri Kosina:
 "Usual stuff -- comment/printk typo fixes, documentation updates, dead
  code elimination."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  HOWTO: fix double words typo
  x86 mtrr: fix comment typo in mtrr_bp_init
  propagate name change to comments in kernel source
  doc: Update the name of profiling based on sysfs
  treewide: Fix typos in various drivers
  treewide: Fix typos in various Kconfig
  wireless: mwifiex: Fix typo in wireless/mwifiex driver
  messages: i2o: Fix typo in messages/i2o
  scripts/kernel-doc: check that non-void fcts describe their return value
  Kernel-doc: Convention: Use a "Return" section to describe return values
  radeon: Fix typo and copy/paste error in comments
  doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
  various: Fix spelling of "asynchronous" in comments.
  Fix misspellings of "whether" in comments.
  eisa: Fix spelling of "asynchronous".
  various: Fix spelling of "registered" in comments.
  doc: fix quite a few typos within Documentation
  target: iscsi: fix comment typos in target/iscsi drivers
  treewide: fix typo of "suport" in various comments and Kconfig
  treewide: fix typo of "suppport" in various comments
  ...
2012-12-13 12:00:02 -08:00
Linus Torvalds
6be35c700f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David Miller:

1) Allow to dump, monitor, and change the bridge multicast database
   using netlink.  From Cong Wang.

2) RFC 5961 TCP blind data injection attack mitigation, from Eric
   Dumazet.

3) Networking user namespace support from Eric W. Biederman.

4) tuntap/virtio-net multiqueue support by Jason Wang.

5) Support for checksum offload of encapsulated packets (basically,
   tunneled traffic can still be checksummed by HW).  From Joseph
   Gasparakis.

6) Allow BPF filter access to VLAN tags, from Eric Dumazet and
   Daniel Borkmann.

7) Bridge port parameters over netlink and BPDU blocking support
   from Stephen Hemminger.

8) Improve data access patterns during inet socket demux by rearranging
   socket layout, from Eric Dumazet.

9) TIPC protocol updates and cleanups from Ying Xue, Paul Gortmaker, and
   Jon Maloy.

10) Update TCP socket hash sizing to be more in line with current day
    realities.  The existing heurstics were choosen a decade ago.
    From Eric Dumazet.

11) Fix races, queue bloat, and excessive wakeups in ATM and
    associated drivers, from Krzysztof Mazur and David Woodhouse.

12) Support DOVE (Distributed Overlay Virtual Ethernet) extensions
    in VXLAN driver, from David Stevens.

13) Add "oops_only" mode to netconsole, from Amerigo Wang.

14) Support set and query of VEB/VEPA bridge mode via PF_BRIDGE, also
    allow DCB netlink to work on namespaces other than the initial
    namespace.  From John Fastabend.

15) Support PTP in the Tigon3 driver, from Matt Carlson.

16) tun/vhost zero copy fixes and improvements, plus turn it on
    by default, from Michael S. Tsirkin.

17) Support per-association statistics in SCTP, from Michele
    Baldessari.

And many, many, driver updates, cleanups, and improvements.  Too
numerous to mention individually.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
  net/mlx4_en: Add support for destination MAC in steering rules
  net/mlx4_en: Use generic etherdevice.h functions.
  net: ethtool: Add destination MAC address to flow steering API
  bridge: add support of adding and deleting mdb entries
  bridge: notify mdb changes via netlink
  ndisc: Unexport ndisc_{build,send}_skb().
  uapi: add missing netconf.h to export list
  pkt_sched: avoid requeues if possible
  solos-pci: fix double-free of TX skb in DMA mode
  bnx2: Fix accidental reversions.
  bna: Driver Version Updated to 3.1.2.1
  bna: Firmware update
  bna: Add RX State
  bna: Rx Page Based Allocation
  bna: TX Intr Coalescing Fix
  bna: Tx and Rx Optimizations
  bna: Code Cleanup and Enhancements
  ath9k: check pdata variable before dereferencing it
  ath5k: RX timestamp is reported at end of frame
  ath9k_htc: RX timestamp is reported at end of frame
  ...
2012-12-12 18:07:07 -08:00
Linus Torvalds
d206e09036 Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
 "A lot of activities on cgroup side.  The big changes are focused on
  making cgroup hierarchy handling saner.

   - cgroup_rmdir() had peculiar semantics - it allowed cgroup
     destruction to be vetoed by individual controllers and tried to
     drain refcnt synchronously.  The vetoing never worked properly and
     caused good deal of contortions in cgroup.  memcg was the last
     reamining user.  Michal Hocko removed the usage and cgroup_rmdir()
     path has been simplified significantly.  This was done in a
     separate branch so that the memcg people can base further memcg
     changes on top.

   - The above allowed cleaning up cgroup lifecycle management and
     implementation of generic cgroup iterators which are used to
     improve hierarchy support.

   - cgroup_freezer updated to allow migration in and out of a frozen
     cgroup and handle hierarchy.  If a cgroup is frozen, all descendant
     cgroups are frozen.

   - netcls_cgroup and netprio_cgroup updated to handle hierarchy
     properly.

   - Various fixes and cleanups.

   - Two merge commits.  One to pull in memcg and rmdir cleanups (needed
     to build iterators).  The other pulled in cgroup/for-3.7-fixes for
     device_cgroup fixes so that further device_cgroup patches can be
     stacked on top."

Fixed up a trivial conflict in mm/memcontrol.c as per Tejun (due to
commit bea8c150a7 ("memcg: fix hotplugged memory zone oops") in master
touching code close to commit 2ef37d3fe4 ("memcg: Simplify
mem_cgroup_force_empty_list error handling") in for-3.8)

* 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (65 commits)
  cgroup: update Documentation/cgroups/00-INDEX
  cgroup_rm_file: don't delete the uncreated files
  cgroup: remove subsystem files when remounting cgroup
  cgroup: use cgroup_addrm_files() in cgroup_clear_directory()
  cgroup: warn about broken hierarchies only after css_online
  cgroup: list_del_init() on removed events
  cgroup: fix lockdep warning for event_control
  cgroup: move list add after list head initilization
  netprio_cgroup: allow nesting and inherit config on cgroup creation
  netprio_cgroup: implement netprio[_set]_prio() helpers
  netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx
  netprio_cgroup: reimplement priomap expansion
  netprio_cgroup: shorten variable names in extend_netdev_table()
  netprio_cgroup: simplify write_priomap()
  netcls_cgroup: move config inheritance to ->css_online() and remove .broken_hierarchy marking
  cgroup: remove obsolete guarantee from cgroup_task_migrate.
  cgroup: add cgroup->id
  cgroup, cpuset: remove cgroup_subsys->post_clone()
  cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/
  cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()
  ...
2012-12-12 08:18:24 -08:00
Cong Wang
6e73d71d84 rtnetlink: add missing message types to selinux perm table
Rebased on the latest net-next tree.

RTM_NEWNETCONF and RTM_GETNETCONF are missing in this table.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-10 14:09:01 -05:00
Cong Wang
ee07c6e7a6 bridge: export multicast database via netlink
V5: fix two bugs pointed out by Thomas
    remove seq check for now, mark it as TODO

V4: remove some useless #include
    some coding style fix

V3: drop debugging printk's
    update selinux perm table as well

V2: drop patch 1/2, export ifindex directly
    Redesign netlink attributes
    Improve netlink seq check
    Handle IPv6 addr as well

This patch exports bridge multicast database via netlink
message type RTM_GETMDB. Similar to fdb, but currently bridge-specific.
We may need to support modify multicast database too (RTM_{ADD,DEL}MDB).

(Thanks to Thomas for patient reviews)

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-07 14:32:52 -05:00
Dave Jones
88a693b5c1 selinux: fix sel_netnode_insert() suspicious rcu dereference
===============================
[ INFO: suspicious RCU usage. ]
3.5.0-rc1+ #63 Not tainted
-------------------------------
security/selinux/netnode.c:178 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by trinity-child1/8750:
 #0:  (sel_netnode_lock){+.....}, at: [<ffffffff812d8f8a>] sel_netnode_sid+0x16a/0x3e0

stack backtrace:
Pid: 8750, comm: trinity-child1 Not tainted 3.5.0-rc1+ #63
Call Trace:
 [<ffffffff810cec2d>] lockdep_rcu_suspicious+0xfd/0x130
 [<ffffffff812d91d1>] sel_netnode_sid+0x3b1/0x3e0
 [<ffffffff812d8e20>] ? sel_netnode_find+0x1a0/0x1a0
 [<ffffffff812d24a6>] selinux_socket_bind+0xf6/0x2c0
 [<ffffffff810cd1dd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810cdb55>] ? lock_release_holdtime.part.9+0x15/0x1a0
 [<ffffffff81093841>] ? lock_hrtimer_base+0x31/0x60
 [<ffffffff812c9536>] security_socket_bind+0x16/0x20
 [<ffffffff815550ca>] sys_bind+0x7a/0x100
 [<ffffffff816c03d5>] ? sysret_check+0x22/0x5d
 [<ffffffff810d392d>] ? trace_hardirqs_on_caller+0x10d/0x1a0
 [<ffffffff8133b09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff816c03a9>] system_call_fastpath+0x16/0x1b

This patch below does what Paul McKenney suggested in the previous thread.

Signed-off-by: Dave Jones <davej@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-11-21 21:55:32 +11:00
Kees Cook
235e752789 Yama: remove locking from delete path
Instead of locking the list during a delete, mark entries as invalid
and trigger a workqueue to clean them up. This lets us easily handle
task_free from interrupt context.

Signed-off-by: Kees Cook <keescook@chromium.org>
2012-11-20 10:32:08 -08:00
Kees Cook
93b69d437e Yama: add RCU to drop read locking
Stop using spinlocks in the read path. Add RCU list to handle the readers.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: John Johansen <john.johansen@canonical.com>
2012-11-20 10:32:07 -08:00
Eric W. Biederman
4c44aaafa8 userns: Kill task_user_ns
The task_user_ns function hides the fact that it is getting the user
namespace from struct cred on the task.  struct cred may go away as
soon as the rcu lock is released.  This leads to a race where we
can dereference a stale user namespace pointer.

To make it obvious a struct cred is involved kill task_user_ns.

To kill the race modify the users of task_user_ns to only
reference the user namespace while the rcu lock is held.

Cc: Kees Cook <keescook@chromium.org>
Cc: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-11-20 04:17:44 -08:00
Tejun Heo
92fb97487a cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()
Rename cgroup_subsys css lifetime related callbacks to better describe
what their roles are.  Also, update documentation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2012-11-19 08:13:38 -08:00
Tejun Heo
4b1c7840b7 device_cgroup: add lockdep asserts
device_cgroup uses RCU safe ->exceptions list which is write-protected
by devcgroup_mutex and has had some issues using locking correctly.
Add lockdep asserts to utility functions so that future errors can be
easily detected.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
2012-11-06 12:28:04 -08:00
Tejun Heo
201e72acb2 device_cgroup: fix RCU usage
dev_cgroup->exceptions is protected with devcgroup_mutex for writes
and RCU for reads; however, RCU usage isn't correct.

* dev_exception_clean() doesn't use RCU variant of list_del() and
  kfree().  The function can race with may_access() and may_access()
  may end up dereferencing already freed memory.  Use list_del_rcu()
  and kfree_rcu() instead.

* may_access() may be called only with RCU read locked but doesn't use
  RCU safe traversal over ->exceptions.  Use list_for_each_entry_rcu().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: stable@vger.kernel.org
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
2012-11-06 12:25:51 -08:00
Aristeu Rozanski
64e1047713 device_cgroup: fix unchecked cgroup parent usage
In 4cef7299b4 ("device_cgroup: add proper checking when changing
default behavior") the cgroup parent usage is unchecked.  root will not
have a parent and trying to use device.{allow,deny} will cause problems.
For some reason my stressing scripts didn't test the root directory so I
didn't catch it on my regular tests.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2012-11-06 07:25:20 -08:00
Jiri Kosina
3bd7bf1f0f Merge branch 'master' into for-next
Sync up with Linus' tree to be able to apply Cesar's patch
against newer version of the code.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-10-28 19:29:19 +01:00
Aristeu Rozanski
4cef7299b4 device_cgroup: add proper checking when changing default behavior
Before changing a group's default behavior to ALLOW, we must check if
its parent's behavior is also ALLOW.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-25 14:37:52 -07:00
Aristeu Rozanski
26fd8405dd device_cgroup: stop using simple_strtoul()
Convert the code to use kstrtou32() instead of simple_strtoul() which is
deprecated.  The real size of the variables are u32, so use kstrtou32
instead of kstrtoul

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-25 14:37:52 -07:00
Aristeu Rozanski
5b7aa7d5bb device_cgroup: rename deny_all to behavior
This was done in a v2 patch but v1 ended up being committed.  The
variable name is less confusing and stores the default behavior when no
matching exception exists.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-25 14:37:52 -07:00
Jiri Slaby
8c9506d169 cgroup: fix invalid rcu dereference
Commit ad676077a2 ("device_cgroup: convert device_cgroup internally to
policy + exceptions") removed rcu locks which are needed in
task_devcgroup called in this chain:

  devcgroup_inode_mknod OR __devcgroup_inode_permission ->
    __devcgroup_inode_permission ->
      task_devcgroup ->
        task_subsys_state ->
          task_subsys_state_check.

Change the code so that task_devcgroup is safely called with rcu read
lock held.

  ===============================
  [ INFO: suspicious RCU usage. ]
  3.6.0-rc5-next-20120913+ #42 Not tainted
  -------------------------------
  include/linux/cgroup.h:553 suspicious rcu_dereference_check() usage!

  other info that might help us debug this:

  rcu_scheduler_active = 1, debug_locks = 0
  2 locks held by kdevtmpfs/23:
   #0:  (sb_writers){.+.+.+}, at: [<ffffffff8116873f>]
  mnt_want_write+0x1f/0x50
   #1:  (&sb->s_type->i_mutex_key#3/1){+.+.+.}, at: [<ffffffff811558af>]
  kern_path_create+0x7f/0x170

  stack backtrace:
  Pid: 23, comm: kdevtmpfs Not tainted 3.6.0-rc5-next-20120913+ #42
  Call Trace:
    lockdep_rcu_suspicious+0xfd/0x130
    devcgroup_inode_mknod+0x19d/0x240
    vfs_mknod+0x71/0xf0
    handle_create.isra.2+0x72/0x200
    devtmpfsd+0x114/0x140
    ? handle_create.isra.2+0x200/0x200
    kthread+0xd6/0xe0
    kernel_thread_helper+0x4/0x10

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-25 14:37:52 -07:00
Alan Cox
b010520ab3 keys: Fix unreachable code
We set ret to NULL then test it. Remove the bogus test

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-10-25 18:00:27 +02:00
John Johansen
2e680dd61e apparmor: fix IRQ stack overflow during free_profile
BugLink: http://bugs.launchpad.net/bugs/1056078

Profile replacement can cause long chains of profiles to build up when
the profile being replaced is pinned. When the pinned profile is finally
freed, it puts the reference to its replacement, which may in turn nest
another call to free_profile on the stack. Because this may happen for
each profile in the replacedby chain this can result in a recusion that
causes the stack to overflow.

Break this nesting by directly walking the chain of replacedby profiles
(ie. use iteration instead of recursion to free the list). This results
in at most 2 levels of free_profile being called, while freeing a
replacedby chain.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-10-25 02:12:50 +11:00
John Johansen
43c422eda9 apparmor: fix apparmor OOPS in audit_log_untrustedstring+0x1c/0x40
The capability defines have moved causing the auto generated names
of capabilities that apparmor uses in logging to be incorrect.

Fix the autogenerated table source to uapi/linux/capability.h

Reported-by: YanHong <clouds.yan@gmail.com>
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Analyzed-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-17 16:29:46 -07:00
Al Viro
45525b26a4 fix a leak in replace_fd() users
replace_fd() began with "eats a reference, tries to insert into
descriptor table" semantics; at some point I'd switched it to
much saner current behaviour ("try to insert into descriptor
table, grabbing a new reference if inserted; caller should do
fput() in any case"), but forgot to update the callers.
Mea culpa...

[Spotted by Pavel Roskin, who has really weird system with pipe-fed
coredumps as part of what he considers a normal boot ;-)]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-16 13:36:50 -04:00
Linus Torvalds
d25282d1c9 Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module signing support from Rusty Russell:
 "module signing is the highlight, but it's an all-over David Howells frenzy..."

Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
  X.509: Fix indefinite length element skip error handling
  X.509: Convert some printk calls to pr_devel
  asymmetric keys: fix printk format warning
  MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
  MODSIGN: Make mrproper should remove generated files.
  MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
  MODSIGN: Use the same digest for the autogen key sig as for the module sig
  MODSIGN: Sign modules during the build process
  MODSIGN: Provide a script for generating a key ID from an X.509 cert
  MODSIGN: Implement module signature checking
  MODSIGN: Provide module signing public keys to the kernel
  MODSIGN: Automatically generate module signing keys if missing
  MODSIGN: Provide Kconfig options
  MODSIGN: Provide gitignore and make clean rules for extra files
  MODSIGN: Add FIPS policy
  module: signature checking hook
  X.509: Add a crypto key parser for binary (DER) X.509 certificates
  MPILIB: Provide a function to read raw data into an MPI
  X.509: Add an ASN.1 decoder
  X.509: Add simple ASN.1 grammar compiler
  ...
2012-10-14 13:39:34 -07:00
Al Viro
808d4e3cfd consitify do_mount() arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-11 20:02:04 -04:00
Linus Torvalds
9e2d8656f5 Merge branch 'akpm' (Andrew's patch-bomb)
Merge patches from Andrew Morton:
 "A few misc things and very nearly all of the MM tree.  A tremendous
  amount of stuff (again), including a significant rbtree library
  rework."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (160 commits)
  sparc64: Support transparent huge pages.
  mm: thp: Use more portable PMD clearing sequenece in zap_huge_pmd().
  mm: Add and use update_mmu_cache_pmd() in transparent huge page code.
  sparc64: Document PGD and PMD layout.
  sparc64: Eliminate PTE table memory wastage.
  sparc64: Halve the size of PTE tables
  sparc64: Only support 4MB huge pages and 8KB base pages.
  memory-hotplug: suppress "Trying to free nonexistent resource <XXXXXXXXXXXXXXXX-YYYYYYYYYYYYYYYY>" warning
  mm: memcg: clean up mm_match_cgroup() signature
  mm: document PageHuge somewhat
  mm: use %pK for /proc/vmallocinfo
  mm, thp: fix mlock statistics
  mm, thp: fix mapped pages avoiding unevictable list on mlock
  memory-hotplug: update memory block's state and notify userspace
  memory-hotplug: preparation to notify memory block's state at memory hot remove
  mm: avoid section mismatch warning for memblock_type_name
  make GFP_NOTRACK definition unconditional
  cma: decrease cc.nr_migratepages after reclaiming pagelist
  CMA: migrate mlocked pages
  kpageflags: fix wrong KPF_THP on non-huge compound pages
  ...
2012-10-09 16:23:15 +09:00
Konstantin Khlebnikov
314e51b985 mm: kill vma flag VM_RESERVED and mm->reserved_vm counter
A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
currently it lost original meaning but still has some effects:

 | effect                 | alternative flags
-+------------------------+---------------------------------------------
1| account as reserved_vm | VM_IO
2| skip in core dump      | VM_IO, VM_DONTDUMP
3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP

This patch removes reserved_vm counter from mm_struct.  Seems like nobody
cares about it, it does not exported into userspace directly, it only
reduces total_vm showed in proc.

Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.

remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.

[akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:19 +09:00
Konstantin Khlebnikov
2dd8ad81e3 mm: use mm->exe_file instead of first VM_EXECUTABLE vma->vm_file
Some security modules and oprofile still uses VM_EXECUTABLE for retrieving
a task's executable file.  After this patch they will use mm->exe_file
directly.  mm->exe_file is protected with mm->mmap_sem, so locking stays
the same.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>			[arch/tile]
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	[tomoyo]
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:18 +09:00
Linus Torvalds
50e0d10232 This has three changes for asm-generic that did not really fit into any
other branch as normal asm-generic changes do. One is a fix for a
 build warning, the other two are more interesting:
 
 * A patch from Mark Brown to allow using the common clock infrastructure
 on all architectures, so we can use the clock API in architecture
 independent device drivers.
 
 * The UAPI split patches from David Howells for the asm-generic files.
 There are other architecture specific series that are going through
 the arch maintainer tree and that depend on this one.
 
 There may be a few small merge conflicts between Mark's patch and
 the following arch header file split patches. In each case the solution
 will be to keep the new "generic-y += clkdev.h" line, even if it
 ends up being the only line in the Kbuild file.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIVAwUAUHLuO2CrR//JCVInAQLsKxAAoa+oSP3KGuQbLHq2wvUxAdXWDFcZgKo+
 qMRejSJPI0sreJ9GJHpUjHtJ7W2gujeo9upmUIJzoWY9vrmjkhCDkaWliaQI8SmY
 CKB9zI2xCB9iFzHtWxocfnJzU7NvzjJm+jnIYrqkaO9HGMxL99tsv9TsBYXK/08j
 QmlGP5fHdGU3zZxVt5r1GL8/nfX4zn3/YEll9nJ7vqXZltIBbaksxmgPoa0QkkH8
 LMeMAlgRR2DHWt58gXHyGB7Afx3QEnZBDaQpYxA446P+2gtvIhFYOnpuX14pZb7t
 m4IM0vOO6WzARQR6DJlRHfYJevojgGHu4Y8wkEzuWE+Hr2BqmiVct7UKqGJdqTY5
 7+I7wwaJmdd3zE61LxRS9UOjJDwMh1gmsNU4+42RArQ5eLcikNR5zfYzDRLCTmnk
 qKZvbiaxgme2YvWazxbBT6EqmIVU6lfHHIoMLr8U0j40Cl0GCmN7EBbe7/r2Jhjs
 6VnCOJ6vb4RCOJGGAcLRMQu7xEtqcCe0Zht839wl13QXewxS3QRgwg6Bjy/fwA9r
 jij5gf+R25J/fQW7yZv4LwcMowRE1xvpu0ebwkK3LLR8jcon71scd6f3PW/bUUpj
 j4tgFuJbXzOxQ4LFgBzvdVgx3wDzsQhqb/6p2l6ROdcw7xXFDdFZ4zq3h0A25wXZ
 J6WDO387tpg=
 =Aaki
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "This has three changes for asm-generic that did not really fit into
  any other branch as normal asm-generic changes do.  One is a fix for a
  build warning, the other two are more interesting:

   * A patch from Mark Brown to allow using the common clock
     infrastructure on all architectures, so we can use the clock API in
     architecture independent device drivers.

   * The UAPI split patches from David Howells for the asm-generic
     files.  There are other architecture specific series that are going
     through the arch maintainer tree and that depend on this one.

  There may be a few small merge conflicts between Mark's patch and the
  following arch header file split patches.  In each case the solution
  will be to keep the new "generic-y += clkdev.h" line, even if it ends
  up being the only line in the Kbuild file."

* tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  UAPI: (Scripted) Disintegrate include/asm-generic
  asm-generic: Add default clkdev.h
  asm-generic: xor: mark static functions as __maybe_unused
2012-10-09 15:58:38 +09:00
David Howells
cf7f601c06 KEYS: Add payload preparsing opportunity prior to key instantiate or update
Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called.  This is done with the
provision of two new key type operations:

	int (*preparse)(struct key_preparsed_payload *prep);
	void (*free_preparse)(struct key_preparsed_payload *prep);

If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases).  The second operation is called to clean up if the first
was called.

preparse() is given the opportunity to fill in the following structure:

	struct key_preparsed_payload {
		char		*description;
		void		*type_data[2];
		void		*payload;
		const void	*data;
		size_t		datalen;
		size_t		quotalen;
	};

Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.

The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.

The preparser may also propose a description for the key by attaching it as a
string to the description field.  This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function.  This cannot work with request_key() as that required the description
to tell the upcall about the key to be created.

This, for example permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.

The instantiate() and update() operations are then modified to look like this:

	int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
	int (*update)(struct key *key, struct key_preparsed_payload *prep);

and the new payload data is passed in *prep, whether or not it was preparsed.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:49:48 +10:30
Linus Torvalds
638c87a916 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull IMA bugfix (security subsystem) from James Morris.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  ima: fix bug in argument order
2012-10-07 21:07:21 +09:00
Aristeu Rozanski
db9aeca97a device_cgroup: rename whitelist to exception list
This patch replaces the "whitelist" usage in the code and comments and replace
them by exception list related information.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:05:14 +09:00
Aristeu Rozanski
ad676077a2 device_cgroup: convert device_cgroup internally to policy + exceptions
The original model of device_cgroup is having a whitelist where all the
allowed devices are listed. The problem with this approach is that is
impossible to have the case of allowing everything but few devices.

The reason for that lies in the way the whitelist is handled internally:
since there's only a whitelist, the "all devices" entry would have to be
removed and replaced by the entire list of possible devices but the ones
that are being denied.  Since dev_t is 32 bits long, representing the allowed
devices as a bitfield is not memory efficient.

This patch replaces the "whitelist" by a "exceptions" list and the default
policy is kept as "deny_all" variable in dev_cgroup structure.

The current interface determines that whenever "a" is written to devices.allow
or devices.deny, the entry masking all devices will be added or removed,
respectively. This behavior is kept and it's what will determine the default
policy:

	# cat devices.list
	a *:* rwm
	# echo a >devices.deny
	# cat devices.list
	# echo a >devices.allow
	# cat devices.list
	a *:* rwm

The interface is also preserved. For example, if one wants to block only access
to /dev/null:
	# ls -l /dev/null
	crw-rw-rw- 1 root root 1, 3 Jul 24 16:17 /dev/null
	# echo a >devices.allow
	# echo "c 1:3 rwm" >devices.deny
	# cat /dev/null
	cat: /dev/null: Operation not permitted
	# echo >/dev/null
	bash: /dev/null: Operation not permitted
	mknod /tmp/null c 1 3
	mknod: `/tmp/null': Operation not permitted
	# echo "c 1:3 r" >devices.allow
	# cat /dev/null
	# echo >/dev/null
	bash: /dev/null: Operation not permitted
	mknod /tmp/null c 1 3
	mknod: `/tmp/null': Operation not permitted
	# echo "c 1:3 rw" >devices.allow
	# echo >/dev/null
	# cat /dev/null
	# mknod /tmp/null c 1 3
	mknod: `/tmp/null': Operation not permitted
	# echo "c 1:3 rwm" >devices.allow
	# echo >/dev/null
	# cat /dev/null
	# mknod /tmp/null c 1 3
	#

Note that I didn't rename the functions/variables in this patch, but in the
next one to make reviewing easier.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:05:14 +09:00
Aristeu Rozanski
868539a3b6 device_cgroup: introduce dev_whitelist_clean()
This function cleans all the items in a whitelist and will be used by the next
patches.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:05:14 +09:00
Aristeu Rozanski
66b8ef6775 device_cgroup: add "deny_all" in dev_cgroup structure
deny_all will determine if the default policy is to deny all device access
unless for the ones in the exception list.

This variable will be used in the next patches to convert device_cgroup
internally into a default policy + rules.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:05:13 +09:00
Dmitry Kasatkin
d26e193622 ima: fix bug in argument order
mask argument goes first, then func, like ima_must_measure
and ima_get_action. ima_inode_post_setattr() assumes that.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-10-05 22:32:16 +10:00
David Howells
8a1ab3155c UAPI: (Scripted) Disintegrate include/asm-generic
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
2012-10-04 18:20:15 +01:00
Linus Torvalds
88265322c1 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - Integrity: add local fs integrity verification to detect offline
     attacks
   - Integrity: add digital signature verification
   - Simple stacking of Yama with other LSMs (per LSS discussions)
   - IBM vTPM support on ppc64
   - Add new driver for Infineon I2C TIS TPM
   - Smack: add rule revocation for subject labels"

Fixed conflicts with the user namespace support in kernel/auditsc.c and
security/integrity/ima/ima_policy.c.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
  Documentation: Update git repository URL for Smack userland tools
  ima: change flags container data type
  Smack: setprocattr memory leak fix
  Smack: implement revoking all rules for a subject label
  Smack: remove task_wait() hook.
  ima: audit log hashes
  ima: generic IMA action flag handling
  ima: rename ima_must_appraise_or_measure
  audit: export audit_log_task_info
  tpm: fix tpm_acpi sparse warning on different address spaces
  samples/seccomp: fix 31 bit build on s390
  ima: digital signature verification support
  ima: add support for different security.ima data types
  ima: add ima_inode_setxattr/removexattr function and calls
  ima: add inode_post_setattr call
  ima: replace iint spinblock with rwlock/read_lock
  ima: allocating iint improvements
  ima: add appraise action keywords and default rules
  ima: integrity appraisal extension
  vfs: move ima_file_free before releasing the file
  ...
2012-10-02 21:38:48 -07:00
Linus Torvalds
aab174f0df Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs update from Al Viro:

 - big one - consolidation of descriptor-related logics; almost all of
   that is moved to fs/file.c

   (BTW, I'm seriously tempted to rename the result to fd.c.  As it is,
   we have a situation when file_table.c is about handling of struct
   file and file.c is about handling of descriptor tables; the reasons
   are historical - file_table.c used to be about a static array of
   struct file we used to have way back).

   A lot of stray ends got cleaned up and converted to saner primitives,
   disgusting mess in android/binder.c is still disgusting, but at least
   doesn't poke so much in descriptor table guts anymore.  A bunch of
   relatively minor races got fixed in process, plus an ext4 struct file
   leak.

 - related thing - fget_light() partially unuglified; see fdget() in
   there (and yes, it generates the code as good as we used to have).

 - also related - bits of Cyrill's procfs stuff that got entangled into
   that work; _not_ all of it, just the initial move to fs/proc/fd.c and
   switch of fdinfo to seq_file.

 - Alex's fs/coredump.c spiltoff - the same story, had been easier to
   take that commit than mess with conflicts.  The rest is a separate
   pile, this was just a mechanical code movement.

 - a few misc patches all over the place.  Not all for this cycle,
   there'll be more (and quite a few currently sit in akpm's tree)."

Fix up trivial conflicts in the android binder driver, and some fairly
simple conflicts due to two different changes to the sock_alloc_file()
interface ("take descriptor handling from sock_alloc_file() to callers"
vs "net: Providing protocol type via system.sockprotoname xattr of
/proc/PID/fd entries" adding a dentry name to the socket)

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
  MAX_LFS_FILESIZE should be a loff_t
  compat: fs: Generic compat_sys_sendfile implementation
  fs: push rcu_barrier() from deactivate_locked_super() to filesystems
  btrfs: reada_extent doesn't need kref for refcount
  coredump: move core dump functionality into its own file
  coredump: prevent double-free on an error path in core dumper
  usb/gadget: fix misannotations
  fcntl: fix misannotations
  ceph: don't abuse d_delete() on failure exits
  hypfs: ->d_parent is never NULL or negative
  vfs: delete surplus inode NULL check
  switch simple cases of fget_light to fdget
  new helpers: fdget()/fdput()
  switch o2hb_region_dev_write() to fget_light()
  proc_map_files_readdir(): don't bother with grabbing files
  make get_file() return its argument
  vhost_set_vring(): turn pollstart/pollstop into bool
  switch prctl_set_mm_exe_file() to fget_light()
  switch xfs_find_handle() to fget_light()
  switch xfs_swapext() to fget_light()
  ...
2012-10-02 20:25:04 -07:00
Linus Torvalds
aecdc33e11 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David Miller:

 1) GRE now works over ipv6, from Dmitry Kozlov.

 2) Make SCTP more network namespace aware, from Eric Biederman.

 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko.

 4) Make openvswitch network namespace aware, from Pravin B Shelar.

 5) IPV6 NAT implementation, from Patrick McHardy.

 6) Server side support for TCP Fast Open, from Jerry Chu and others.

 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel
    Borkmann.

 8) Increate the loopback default MTU to 64K, from Eric Dumazet.

 9) Use a per-task rather than per-socket page fragment allocator for
    outgoing networking traffic.  This benefits processes that have very
    many mostly idle sockets, which is quite common.

    From Eric Dumazet.

10) Use up to 32K for page fragment allocations, with fallbacks to
    smaller sizes when higher order page allocations fail.  Benefits are
    a) less segments for driver to process b) less calls to page
    allocator c) less waste of space.

    From Eric Dumazet.

11) Allow GRO to be used on GRE tunnels, from Eric Dumazet.

12) VXLAN device driver, one way to handle VLAN issues such as the
    limitation of 4096 VLAN IDs yet still have some level of isolation.
    From Stephen Hemminger.

13) As usual there is a large boatload of driver changes, with the scale
    perhaps tilted towards the wireless side this time around.

Fix up various fairly trivial conflicts, mostly caused by the user
namespace changes.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits)
  hyperv: Add buffer for extended info after the RNDIS response message.
  hyperv: Report actual status in receive completion packet
  hyperv: Remove extra allocated space for recv_pkt_list elements
  hyperv: Fix page buffer handling in rndis_filter_send_request()
  hyperv: Fix the missing return value in rndis_filter_set_packet_filter()
  hyperv: Fix the max_xfer_size in RNDIS initialization
  vxlan: put UDP socket in correct namespace
  vxlan: Depend on CONFIG_INET
  sfc: Fix the reported priorities of different filter types
  sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP
  sfc: Fix loopback self-test with separate_tx_channels=1
  sfc: Fix MCDI structure field lookup
  sfc: Add parentheses around use of bitfield macro arguments
  sfc: Fix null function pointer in efx_sriov_channel_type
  vxlan: virtual extensible lan
  igmp: export symbol ip_mc_leave_group
  netlink: add attributes to fdb interface
  tg3: unconditionally select HWMON support when tg3 is enabled.
  Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT"
  gre: fix sparse warning
  ...
2012-10-02 13:38:27 -07:00
David Howells
4442d7704c Merge branch 'modsign-keys-devel' into security-next-keys
Signed-off-by: David Howells <dhowells@redhat.com>
2012-10-02 19:30:19 +01:00
David Howells
f8aa23a55f KEYS: Use keyring_alloc() to create special keyrings
Use keyring_alloc() to create special keyrings now that it has a permissions
parameter rather than using key_alloc() + key_instantiate_and_link().

Also document and export keyring_alloc() so that modules can use it too.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-10-02 19:24:56 +01:00
David Howells
96b5c8fea6 KEYS: Reduce initial permissions on keys
Reduce the initial permissions on new keys to grant the possessor everything,
view permission only to the user (so the keys can be seen in /proc/keys) and
nothing else.

This gives the creator a chance to adjust the permissions mask before other
processes can access the new key or create a link to it.

To aid with this, keyring_alloc() now takes a permission argument rather than
setting the permissions itself.

The following permissions are now set:

 (1) The user and user-session keyrings grant the user that owns them full
     permissions and grant a possessor everything bar SETATTR.

 (2) The process and thread keyrings grant the possessor full permissions but
     only grant the user VIEW.  This permits the user to see them in
     /proc/keys, but not to do anything with them.

 (3) Anonymous session keyrings grant the possessor full permissions, but only
     grant the user VIEW and READ.  This means that the user can see them in
     /proc/keys and can list them, but nothing else.  Possibly READ shouldn't
     be provided either.

 (4) Named session keyrings grant everything an anonymous session keyring does,
     plus they grant the user LINK permission.  The whole point of named
     session keyrings is that others can also subscribe to them.  Possibly this
     should be a separate permission to LINK.

 (5) The temporary session keyring created by call_sbin_request_key() gets the
     same permissions as an anonymous session keyring.

 (6) Keys created by add_key() get VIEW, SEARCH, LINK and SETATTR for the
     possessor, plus READ and/or WRITE if the key type supports them.  The used
     only gets VIEW now.

 (7) Keys created by request_key() now get the same as those created by
     add_key().

Reported-by: Lennart Poettering <lennart@poettering.net>
Reported-by: Stef Walter <stefw@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-10-02 19:24:56 +01:00
David Howells
3a50597de8 KEYS: Make the session and process keyrings per-thread
Make the session keyring per-thread rather than per-process, but still
inherited from the parent thread to solve a problem with PAM and gdm.

The problem is that join_session_keyring() will reject attempts to change the
session keyring of a multithreaded program but gdm is now multithreaded before
it gets to the point of starting PAM and running pam_keyinit to create the
session keyring.  See:

	https://bugs.freedesktop.org/show_bug.cgi?id=49211

The reason that join_session_keyring() will only change the session keyring
under a single-threaded environment is that it's hard to alter the other
thread's credentials to effect the change in a multi-threaded program.  The
problems are such as:

 (1) How to prevent two threads both running join_session_keyring() from
     racing.

 (2) Another thread's credentials may not be modified directly by this process.

 (3) The number of threads is uncertain whilst we're not holding the
     appropriate spinlock, making preallocation slightly tricky.

 (4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get
     another thread to replace its keyring, but that means preallocating for
     each thread.

A reasonable way around this is to make the session keyring per-thread rather
than per-process and just document that if you want a common session keyring,
you must get it before you spawn any threads - which is the current situation
anyway.

Whilst we're at it, we can the process keyring behave in the same way.  This
means we can clean up some of the ickyness in the creds code.

Basically, after this patch, the session, process and thread keyrings are about
inheritance rules only and not about sharing changes of keyring.

Reported-by: Mantas M. <grawity@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ray Strode <rstrode@redhat.com>
2012-10-02 19:24:29 +01:00
Linus Torvalds
437589a74b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace changes from Eric Biederman:
 "This is a mostly modest set of changes to enable basic user namespace
  support.  This allows the code to code to compile with user namespaces
  enabled and removes the assumption there is only the initial user
  namespace.  Everything is converted except for the most complex of the
  filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
  nfs, ocfs2 and xfs as those patches need a bit more review.

  The strategy is to push kuid_t and kgid_t values are far down into
  subsystems and filesystems as reasonable.  Leaving the make_kuid and
  from_kuid operations to happen at the edge of userspace, as the values
  come off the disk, and as the values come in from the network.
  Letting compile type incompatible compile errors (present when user
  namespaces are enabled) guide me to find the issues.

  The most tricky areas have been the places where we had an implicit
  union of uid and gid values and were storing them in an unsigned int.
  Those places were converted into explicit unions.  I made certain to
  handle those places with simple trivial patches.

  Out of that work I discovered we have generic interfaces for storing
  quota by projid.  I had never heard of the project identifiers before.
  Adding full user namespace support for project identifiers accounts
  for most of the code size growth in my git tree.

  Ultimately there will be work to relax privlige checks from
  "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
  root in a user names to do those things that today we only forbid to
  non-root users because it will confuse suid root applications.

  While I was pushing kuid_t and kgid_t changes deep into the audit code
  I made a few other cleanups.  I capitalized on the fact we process
  netlink messages in the context of the message sender.  I removed
  usage of NETLINK_CRED, and started directly using current->tty.

  Some of these patches have also made it into maintainer trees, with no
  problems from identical code from different trees showing up in
  linux-next.

  After reading through all of this code I feel like I might be able to
  win a game of kernel trivial pursuit."

Fix up some fairly trivial conflicts in netfilter uid/git logging code.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
  userns: Convert the ufs filesystem to use kuid/kgid where appropriate
  userns: Convert the udf filesystem to use kuid/kgid where appropriate
  userns: Convert ubifs to use kuid/kgid
  userns: Convert squashfs to use kuid/kgid where appropriate
  userns: Convert reiserfs to use kuid and kgid where appropriate
  userns: Convert jfs to use kuid/kgid where appropriate
  userns: Convert jffs2 to use kuid and kgid where appropriate
  userns: Convert hpfs to use kuid and kgid where appropriate
  userns: Convert btrfs to use kuid/kgid where appropriate
  userns: Convert bfs to use kuid/kgid where appropriate
  userns: Convert affs to use kuid/kgid wherwe appropriate
  userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
  userns: On ia64 deal with current_uid and current_gid being kuid and kgid
  userns: On ppc convert current_uid from a kuid before printing.
  userns: Convert s390 getting uid and gid system calls to use kuid and kgid
  userns: Convert s390 hypfs to use kuid and kgid where appropriate
  userns: Convert binder ipc to use kuids
  userns: Teach security_path_chown to take kuids and kgids
  userns: Add user namespace support to IMA
  userns: Convert EVM to deal with kuids and kgids in it's hmac computation
  ...
2012-10-02 11:11:09 -07:00
Linus Torvalds
68d47a137c Merge branch 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup hierarchy update from Tejun Heo:
 "Currently, different cgroup subsystems handle nested cgroups
  completely differently.  There's no consistency among subsystems and
  the behaviors often are outright broken.

  People at least seem to agree that the broken hierarhcy behaviors need
  to be weeded out if any progress is gonna be made on this front and
  that the fallouts from deprecating the broken behaviors should be
  acceptable especially given that the current behaviors don't make much
  sense when nested.

  This patch makes cgroup emit warning messages if cgroups for
  subsystems with broken hierarchy behavior are nested to prepare for
  fixing them in the future.  This was put in a separate branch because
  more related changes were expected (didn't make it this round) and the
  memory cgroup wanted to pull in this and make changes on top."

* 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them
2012-10-02 10:52:28 -07:00
Linus Torvalds
033d9959ed Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue changes from Tejun Heo:
 "This is workqueue updates for v3.7-rc1.  A lot of activities this
  round including considerable API and behavior cleanups.

   * delayed_work combines a timer and a work item.  The handling of the
     timer part has always been a bit clunky leading to confusing
     cancelation API with weird corner-case behaviors.  delayed_work is
     updated to use new IRQ safe timer and cancelation now works as
     expected.

   * Another deficiency of delayed_work was lack of the counterpart of
     mod_timer() which led to cancel+queue combinations or open-coded
     timer+work usages.  mod_delayed_work[_on]() are added.

     These two delayed_work changes make delayed_work provide interface
     and behave like timer which is executed with process context.

   * A work item could be executed concurrently on multiple CPUs, which
     is rather unintuitive and made flush_work() behavior confusing and
     half-broken under certain circumstances.  This problem doesn't
     exist for non-reentrant workqueues.  While non-reentrancy check
     isn't free, the overhead is incurred only when a work item bounces
     across different CPUs and even in simulated pathological scenario
     the overhead isn't too high.

     All workqueues are made non-reentrant.  This removes the
     distinction between flush_[delayed_]work() and
     flush_[delayed_]_work_sync().  The former is now as strong as the
     latter and the specified work item is guaranteed to have finished
     execution of any previous queueing on return.

   * In addition to the various bug fixes, Lai redid and simplified CPU
     hotplug handling significantly.

   * Joonsoo introduced system_highpri_wq and used it during CPU
     hotplug.

  There are two merge commits - one to pull in IRQ safe timer from
  tip/timers/core and the other to pull in CPU hotplug fixes from
  wq/for-3.6-fixes as Lai's hotplug restructuring depended on them."

Fixed a number of trivial conflicts, but the more interesting conflicts
were silent ones where the deprecated interfaces had been used by new
code in the merge window, and thus didn't cause any real data conflicts.

Tejun pointed out a few of them, I fixed a couple more.

* 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits)
  workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending()
  workqueue: use cwq_set_max_active() helper for workqueue_set_max_active()
  workqueue: introduce cwq_set_max_active() helper for thaw_workqueues()
  workqueue: remove @delayed from cwq_dec_nr_in_flight()
  workqueue: fix possible stall on try_to_grab_pending() of a delayed work item
  workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback()
  workqueue: use __cpuinit instead of __devinit for cpu callbacks
  workqueue: rename manager_mutex to assoc_mutex
  workqueue: WORKER_REBIND is no longer necessary for idle rebinding
  workqueue: WORKER_REBIND is no longer necessary for busy rebinding
  workqueue: reimplement idle worker rebinding
  workqueue: deprecate __cancel_delayed_work()
  workqueue: reimplement cancel_delayed_work() using try_to_grab_pending()
  workqueue: use mod_delayed_work() instead of __cancel + queue
  workqueue: use irqsafe timer for delayed_work
  workqueue: clean up delayed_work initializers and add missing one
  workqueue: make deferrable delayed_work initializer names consistent
  workqueue: cosmetic whitespace updates for macro definitions
  workqueue: deprecate system_nrt[_freezable]_wq
  workqueue: deprecate flush[_delayed]_work_sync()
  ...
2012-10-02 09:54:49 -07:00
Linus Torvalds
94095a1fff Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core kernel fixes from Ingo Molnar:
 "This is a complex task_work series from Oleg that fixes the bug that
  this VFS commit tried to fix:

    d35abdb288 hold task_lock around checks in keyctl

  but solves the problem without the lockup regression that d35abdb288
  introduced in v3.6.

  This series came late in v3.6 and I did not feel confident about it so
  late in the cycle.  Might be worth backporting to -stable if it proves
  itself upstream."

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  task_work: Simplify the usage in ptrace_notify() and get_signal_to_deliver()
  task_work: Revert "hold task_lock around checks in keyctl"
  task_work: task_work_add() should not succeed after exit_task_work()
  task_work: Make task_work_add() lockless
2012-10-01 10:25:54 -07:00
Linus Torvalds
99dbb1632f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull the trivial tree from Jiri Kosina:
 "Tiny usual fixes all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  doc: fix old config name of kprobetrace
  fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc
  btrfs: fix the commment for the action flags in delayed-ref.h
  btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID
  vfs: fix kerneldoc for generic_fh_to_parent()
  treewide: fix comment/printk/variable typos
  ipr: fix small coding style issues
  doc: fix broken utf8 encoding
  nfs: comment fix
  platform/x86: fix asus_laptop.wled_type module parameter
  mfd: printk/comment fixes
  doc: getdelays.c: remember to close() socket on error in create_nl_socket()
  doc: aliasing-test: close fd on write error
  mmc: fix comment typos
  dma: fix comments
  spi: fix comment/printk typos in spi
  Coccinelle: fix typo in memdup_user.cocci
  tmiofb: missing NULL pointer checks
  tools: perf: Fix typo in tools/perf
  tools/testing: fix comment / output typos
  ...
2012-10-01 09:06:36 -07:00
David S. Miller
6a06e5e1bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/team/team.c
	drivers/net/usb/qmi_wwan.c
	net/batman-adv/bat_iv_ogm.c
	net/ipv4/fib_frontend.c
	net/ipv4/route.c
	net/l2tp/l2tp_netlink.c

The team, fib_frontend, route, and l2tp_netlink conflicts were simply
overlapping changes.

qmi_wwan and bat_iv_ogm were of the "use HEAD" variety.

With help from Antonio Quartulli.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-28 14:40:49 -04:00
Alan Cox
a84a921978 key: Fix resource leak
On an error iov may still have been reallocated and need freeing

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-09-28 12:20:02 +01:00
Alan Cox
631527703d keys: Fix unreachable code
We set ret to NULL then test it. Remove the bogus test

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-09-28 12:20:02 +01:00
James Morris
bf53083445 Linux 3.6-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s
 LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI
 u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT
 xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU
 i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF
 GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk=
 =0v2l
 -----END PGP SIGNATURE-----

Merge tag 'v3.6-rc7' into next

Linux 3.6-rc7

Requested by David Howells so he can merge his key susbsystem work into
my tree with requisite -linus changesets.
2012-09-28 13:37:32 +10:00
Al Viro
cb0942b812 make get_file() return its argument
simplifies a bunch of callers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 21:10:25 -04:00
Al Viro
c3c073f808 new helper: iterate_fd()
iterates through the opened files in given descriptor table,
calling a supplied function; we stop once non-zero is returned.
Callback gets struct file *, descriptor number and const void *
argument passed to iterator.  It is called with files->file_lock
held, so it is not allowed to block.

tty_io, netprio_cgroup and selinux flush_unauthorized_files()
converted to its use.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 21:09:59 -04:00
Al Viro
ee97cd872d switch flush_unauthorized_files() to replace_fd()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 21:09:58 -04:00
Eric W. Biederman
d2b31ca644 userns: Teach security_path_chown to take kuids and kgids
Don't make the security modules deal with raw user space uid and
gids instead pass in a kuid_t and a kgid_t so that security modules
only have to deal with internal kernel uids and gids.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: James Morris <james.l.morris@oracle.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-21 03:13:25 -07:00
Eric W. Biederman
8b94eea4bf userns: Add user namespace support to IMA
Use kuid's in the IMA rules.

When reporting the current uid in audit logs use from_kuid
to get a usable value.

Cc: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-21 03:13:24 -07:00
Eric W. Biederman
cf9c93526f userns: Convert EVM to deal with kuids and kgids in it's hmac computation
Cc: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-21 03:13:24 -07:00
Eric W. Biederman
581abc09c2 userns: Convert selinux to use kuid and kgid where appropriate
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-09-21 03:13:22 -07:00
Eric W. Biederman
609fcd1b3a userns: Convert tomoyo to use kuid and kgid where appropriate
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-21 03:13:22 -07:00
Eric W. Biederman
2db8145293 userns: Convert apparmor to use kuid and kgid where appropriate
Cc: John Johansen <john.johansen@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-21 03:13:21 -07:00
Dmitry Kasatkin
0a72ba7aff ima: change flags container data type
IMA audit hashes patches introduced new IMA flags and required
space went beyond 8 bits. Currently the only flag is IMA_DIGSIG.
This patch use 16 bit short instead of 8 bit char.
Without this fix IMA signature will be replaced with hash, which
should not happen.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-09-19 08:55:20 -04:00
Nicolas Dichtel
ee8372dd19 xfrm: invalidate dst on policy insertion/deletion
When a policy is inserted or deleted, all dst should be recalculated.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-18 15:57:03 -04:00
Casey Schaufler
46a2f3b9e9 Smack: setprocattr memory leak fix
The data structure allocations being done in prepare_creds
are duplicated in smack_setprocattr. This results in the
structure allocated in prepare_creds being orphaned and
never freed. The duplicate code is removed from
smack_setprocattr.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-09-18 09:51:06 -07:00
Rafal Krypa
449543b043 Smack: implement revoking all rules for a subject label
Add /smack/revoke-subject special file. Writing a SMACK label to this file will
set the access to '-' for all access rules with that subject label.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2012-09-18 09:50:52 -07:00
Casey Schaufler
c00bedb368 Smack: remove task_wait() hook.
On 12/20/2011 11:20 PM, Jarkko Sakkinen wrote:
> Allow SIGCHLD to be passed to child process without
> explicit policy. This will help to keep the access
> control policy simple and easily maintainable with
> complex applications that require use of multiple
> security contexts. It will also help to keep them
> as isolated as possible.
>
> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>

I have a slightly different version that applies to the
current smack-next tree.

Allow SIGCHLD to be passed to child process without
explicit policy. This will help to keep the access
control policy simple and easily maintainable with
complex applications that require use of multiple
security contexts. It will also help to keep them
as isolated as possible.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>

 security/smack/smack_lsm.c |   37 ++++++++-----------------------------
 1 files changed, 8 insertions(+), 29 deletions(-)
2012-09-18 09:50:37 -07:00
Tejun Heo
8c7f6edbda cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them
Currently, cgroup hierarchy support is a mess.  cpu related subsystems
behave correctly - configuration, accounting and control on a parent
properly cover its children.  blkio and freezer completely ignore
hierarchy and treat all cgroups as if they're directly under the root
cgroup.  Others show yet different behaviors.

These differing interpretations of cgroup hierarchy make using cgroup
confusing and it impossible to co-mount controllers into the same
hierarchy and obtain sane behavior.

Eventually, we want full hierarchy support from all subsystems and
probably a unified hierarchy.  Users using separate hierarchies
expecting completely different behaviors depending on the mounted
subsystem is deterimental to making any progress on this front.

This patch adds cgroup_subsys.broken_hierarchy and sets it to %true
for controllers which are lacking in hierarchy support.  The goal of
this patch is two-fold.

* Move users away from using hierarchy on currently non-hierarchical
  subsystems, so that implementing proper hierarchy support on those
  doesn't surprise them.

* Keep track of which controllers are broken how and nudge the
  subsystems to implement proper hierarchy support.

For now, start with a single warning message.  We can whine louder
later on.

v2: Fixed a typo spotted by Michal. Warning message updated.

v3: Updated memcg part so that it doesn't generate warning in the
    cases where .use_hierarchy=false doesn't make the behavior
    different from root.use_hierarchy=true.  Fixed a typo spotted by
    Glauber.

v4: Check ->broken_hierarchy after cgroup creation is complete so that
    ->create() can affect the result per Michal.  Dropped unnecessary
    memcg root handling per Michal.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2012-09-14 12:01:16 -07:00
Eric W. Biederman
9a56c2db49 userns: Convert security/keys to the new userns infrastructure
- Replace key_user ->user_ns equality checks with kuid_has_mapping checks.
- Use from_kuid to generate key descriptions
- Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t
- Avoid potential problems with file descriptor passing by displaying
  keys in the user namespace of the opener of key status proc files.

Cc: linux-security-module@vger.kernel.org
Cc: keyrings@linux-nfs.org
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-13 18:28:02 -07:00
Peter Moody
e7c568e0fd ima: audit log hashes
This adds an 'audit' policy action which audit logs file measurements.

Changelog v6:
 - use new action flag handling (Dmitry Kasatkin).
 - removed whitespace (Mimi)

Changelog v5:
 - use audit_log_untrustedstring.

Changelog v4:
 - cleanup digest -> hash conversion.
 - use filename rather than d_path in ima_audit_measurement.

Changelog v3:
 - Use newly exported audit_log_task_info for logging pid/ppid/uid/etc.
 - Update the ima_policy ABI documentation.

Changelog v2:
 - Use 'audit' action rather than 'measure_and_audit' to permit
 auditing in the absence of measuring..

Changelog v1:
 - Initial posting.

Signed-off-by: Peter Moody <pmoody@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-09-13 14:48:44 -04:00
Dmitry Kasatkin
45e2472e67 ima: generic IMA action flag handling
Make the IMA action flag handling generic in order to support
additional new actions, without requiring changes to the base
implementation.  New actions, like audit logging, will only
need to modify the define statements.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-09-13 14:23:57 -04:00
Oleg Nesterov
b3f68f16db task_work: Revert "hold task_lock around checks in keyctl"
This reverts commit d35abdb288.

task_lock() was added to ensure exit_mm() and thus exit_task_work() is
not possible before task_work_add().

This is wrong, task_lock() must not be nested with write_lock(tasklist).
And this is no longer needed, task_work_add() now fails if it is called
after exit_task_work().

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20120826191214.GA4231@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-13 16:47:36 +02:00
David Howells
d4f65b5d24 KEYS: Add payload preparsing opportunity prior to key instantiate or update
Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called.  This is done with the
provision of two new key type operations:

	int (*preparse)(struct key_preparsed_payload *prep);
	void (*free_preparse)(struct key_preparsed_payload *prep);

If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases).  The second operation is called to clean up if the first
was called.

preparse() is given the opportunity to fill in the following structure:

	struct key_preparsed_payload {
		char		*description;
		void		*type_data[2];
		void		*payload;
		const void	*data;
		size_t		datalen;
		size_t		quotalen;
	};

Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.

The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.

The preparser may also propose a description for the key by attaching it as a
string to the description field.  This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function.  This cannot work with request_key() as that required the description
to tell the upcall about the key to be created.

This, for example permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.

The instantiate() and update() operations are then modified to look like this:

	int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
	int (*update)(struct key *key, struct key_preparsed_payload *prep);

and the new payload data is passed in *prep, whether or not it was preparsed.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-09-13 13:06:29 +01:00
Dmitry Kasatkin
d9d300cdb6 ima: rename ima_must_appraise_or_measure
When AUDIT action support is added to the IMA,
ima_must_appraise_or_measure() does not reflect the real meaning anymore.
Rename it to ima_get_action().

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-09-12 07:28:05 -04:00
Pablo Neira Ayuso
9f00d9776b netlink: hide struct module parameter in netlink_kernel_create
This patch defines netlink_kernel_create as a wrapper function of
__netlink_kernel_create to hide the struct module *me parameter
(which seems to be THIS_MODULE in all existing netlink subsystems).

Suggested by David S. Miller.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-08 18:46:30 -04:00
Pablo Neira Ayuso
9785e10aed netlink: kill netlink_set_nonroot
Replace netlink_set_nonroot by one new field `flags' in
struct netlink_kernel_cfg that is passed to netlink_kernel_create.

This patch also renames NL_NONROOT_* to NL_CFG_F_NONROOT_* since
now the flags field in nl_table is generic (so we can add more
flags if needed in the future).

Also adjust all callers in the net-next tree to use these flags
instead of netlink_set_nonroot.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-08 18:45:27 -04:00
Dmitry Kasatkin
8606404fa5 ima: digital signature verification support
This patch adds support for digital signature based integrity appraisal.
With this patch, 'security.ima' contains either the file data hash or
a digital signature of the file data hash. The file data hash provides
the security attribute of file integrity. In addition to file integrity,
a digital signature provides the security attribute of authenticity.

Unlike EVM, when the file metadata changes, the digital signature is
replaced with an HMAC, modification of the file data does not cause the
'security.ima' digital signature to be replaced with a hash. As a
result, after any modification, subsequent file integrity appraisals
would fail.

Although digitally signed files can be modified, but by not updating
'security.ima' to reflect these modifications, in essence digitally
signed files could be considered 'immutable'.

IMA uses a different keyring than EVM. While the EVM keyring should not
be updated after initialization and locked, the IMA keyring should allow
updating or adding new keys when upgrading or installing packages.

Changelog v4:
- Change IMA_DIGSIG to hex equivalent
Changelog v3:
- Permit files without any 'security.ima' xattr to be labeled properly.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-09-07 14:57:48 -04:00
Mimi Zohar
5a44b41207 ima: add support for different security.ima data types
IMA-appraisal currently verifies the integrity of a file based on a
known 'good' measurement value.  This patch reserves the first byte
of 'security.ima' as a place holder for the type of method used for
verifying file data integrity.

Changelog v1:
- Use the newly defined 'struct evm_ima_xattr_data'

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-09-07 14:57:47 -04:00
Mimi Zohar
42c63330f2 ima: add ima_inode_setxattr/removexattr function and calls
Based on xattr_permission comments, the restriction to modify 'security'
xattr is left up to the underlying fs or lsm. Ensure that not just anyone
can modify or remove 'security.ima'.

Changelog v1:
- Unless IMA-APPRAISE is configured, use stub ima_inode_removexattr()/setxattr()
  functions.  (Moved ima_inode_removexattr()/setxattr() to ima_appraise.c)

Changelog:
  - take i_mutex to fix locking (Dmitry Kasatkin)
  - ima_reset_appraise_flags should only be called when modifying or
    removing the 'security.ima' xattr. Requires CAP_SYS_ADMIN privilege.
    (Incorporated fix from Roberto Sassu)
  - Even if allowed to update security.ima, reset the appraisal flags,
    forcing re-appraisal.
  - Replace CAP_MAC_ADMIN with CAP_SYS_ADMIN
  - static inline ima_inode_setxattr()/ima_inode_removexattr() stubs
  - ima_protect_xattr should be static

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2012-09-07 14:57:47 -04:00
Dmitry Kasatkin
a10bf26b2f ima: replace iint spinblock with rwlock/read_lock
For performance, replace the iint spinlock with rwlock/read_lock.

Eric Paris questioned this change, from spinlocks to rwlocks, saying
"rwlocks have been shown to actually be slower on multi processor
systems in a number of cases due to the cache line bouncing required."

Based on performance measurements compiling the kernel on a cold
boot with multiple jobs with/without this patch, Dmitry Kasatkin
and I found that rwlocks performed better than spinlocks, but very
insignificantly.  For example with total compilation time around 6
minutes, with rwlocks time was 1 - 3 seconds shorter... but always
like that.

Changelog v2:
- new patch taken from the 'allocating iint improvements' patch

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2012-09-07 14:57:46 -04:00
Dmitry Kasatkin
bf2276d10c ima: allocating iint improvements
With IMA-appraisal's removal of the iint mutex and taking the i_mutex
instead, allocating the iint becomes a lot simplier, as we don't need
to be concerned with two processes racing to allocate the iint. This
patch cleans up and improves performance for allocating the iint.

- removed redundant double i_mutex locking
- combined iint allocation with tree search

Changelog v2:
- removed the rwlock/read_lock changes from this patch

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2012-09-07 14:57:45 -04:00
Mimi Zohar
07f6a79415 ima: add appraise action keywords and default rules
Unlike the IMA measurement policy, the appraise policy can not be dependent
on runtime process information, such as the task uid, as the 'security.ima'
xattr is written on file close and must be updated each time the file changes,
regardless of the current task uid.

This patch extends the policy language with 'fowner', defines an appraise
policy, which appraises all files owned by root, and defines 'ima_appraise_tcb',
a new boot command line option, to enable the appraise policy.

Changelog v3:
- separate the measure from the appraise rules in order to support measuring
  without appraising and appraising without measuring.
- change appraisal default for filesystems without xattr support to fail
- update default appraise policy for cgroups

Changelog v1:
- don't appraise RAMFS (Dmitry Kasatkin)
- merged rest of "ima: ima_must_appraise_or_measure API change" commit
  (Dmtiry Kasatkin)

  ima_must_appraise_or_measure() called ima_match_policy twice, which
  searched the policy for a matching rule.  Once for a matching measurement
  rule and subsequently for an appraisal rule. Searching the policy twice
  is unnecessary overhead, which could be noticeable with a large policy.

  The new version of ima_must_appraise_or_measure() does everything in a
  single iteration using a new version of ima_match_policy().  It returns
  IMA_MEASURE, IMA_APPRAISE mask.

  With the use of action mask only one efficient matching function
  is enough.  Removed other specific versions of matching functions.

Changelog:
- change 'owner' to 'fowner' to conform to the new LSM conditions posted by
  Roberto Sassu.
- fix calls to ima_log_string()

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2012-09-07 14:57:45 -04:00
Mimi Zohar
2fe5d6def1 ima: integrity appraisal extension
IMA currently maintains an integrity measurement list used to assert the
integrity of the running system to a third party.  The IMA-appraisal
extension adds local integrity validation and enforcement of the
measurement against a "good" value stored as an extended attribute
'security.ima'.  The initial methods for validating 'security.ima' are
hashed based, which provides file data integrity, and digital signature
based, which in addition to providing file data integrity, provides
authenticity.

This patch creates and maintains the 'security.ima' xattr, containing
the file data hash measurement.  Protection of the xattr is provided by
EVM, if enabled and configured.

Based on policy, IMA calls evm_verifyxattr() to verify a file's metadata
integrity and, assuming success, compares the file's current hash value
with the one stored as an extended attribute in 'security.ima'.

Changelov v4:
- changed iint cache flags to hex values

Changelog v3:
- change appraisal default for filesystems without xattr support to fail

Changelog v2:
- fix audit msg 'res' value
- removed unused 'ima_appraise=' values

Changelog v1:
- removed unused iint mutex (Dmitry Kasatkin)
- setattr hook must not reset appraised (Dmitry Kasatkin)
- evm_verifyxattr() now differentiates between no 'security.evm' xattr
  (INTEGRITY_NOLABEL) and no EVM 'protected' xattrs included in the
  'security.evm' (INTEGRITY_NOXATTRS).
- replace hash_status with ima_status (Dmitry Kasatkin)
- re-initialize slab element ima_status on free (Dmitry Kasatkin)
- include 'security.ima' in EVM if CONFIG_IMA_APPRAISE, not CONFIG_IMA
- merged half "ima: ima_must_appraise_or_measure API change" (Dmitry Kasatkin)
- removed unnecessary error variable in process_measurement() (Dmitry Kasatkin)
- use ima_inode_post_setattr() stub function, if IMA_APPRAISE not configured
  (moved ima_inode_post_setattr() to ima_appraise.c)
- make sure ima_collect_measurement() can read file

Changelog:
- add 'iint' to evm_verifyxattr() call (Dimitry Kasatkin)
- fix the race condition between chmod, which takes the i_mutex and then
  iint->mutex, and ima_file_free() and process_measurement(), which take
  the locks in the reverse order, by eliminating iint->mutex. (Dmitry Kasatkin)
- cleanup of ima_appraise_measurement() (Dmitry Kasatkin)
- changes as a result of the iint not allocated for all regular files, but
  only for those measured/appraised.
- don't try to appraise new/empty files
- expanded ima_appraisal description in ima/Kconfig
- IMA appraise definitions required even if IMA_APPRAISE not enabled
- add return value to ima_must_appraise() stub
- unconditionally set status = INTEGRITY_PASS *after* testing status,
  not before.  (Found by Joe Perches)

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2012-09-07 14:57:44 -04:00
Kees Cook
2e4930eb7c Yama: handle 32-bit userspace prctl
When running a 64-bit kernel and receiving prctls from a 32-bit
userspace, the "-1" used as an unsigned long will end up being
misdetected. The kernel is looking for 0xffffffffffffffff instead of
0xffffffff. Since prctl lacks a distinct compat interface, Yama needs
to handle this translation itself. As such, support either value as
meaning PR_SET_PTRACER_ANY, to avoid breaking the ABI for 64-bit.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-09-08 01:06:14 +10:00
Kees Cook
c6993e4ac0 security: allow Yama to be unconditionally stacked
Unconditionally call Yama when CONFIG_SECURITY_YAMA_STACKED is selected,
no matter what LSM module is primary.

Ubuntu and Chrome OS already carry patches to do this, and Fedora
has voiced interest in doing this as well. Instead of having multiple
distributions (or LSM authors) carrying these patches, just allow Yama
to be called unconditionally when selected by the new CONFIG.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-09-05 14:12:31 -07:00
Paul Bolle
ec2e1ed2d7 AppArmor: remove af_names.h from .gitignore
Commit 4fdef2183e ("AppArmor: Cleanup make
file to remove cruft and make it easier to read") removed all traces of
af_names.h from the tree. Remove its entry in AppArmor's .gitignore file
too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-09-01 08:35:34 -07:00
Kent Yoder
20328b56cd ima: enable the IBM vTPM as the default TPM in the PPC64 case
Enable tpm_ibmvtpm driver by default when IMA is enabled on PPC64

Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
2012-08-22 16:23:23 -05:00
Kent Yoder
41ab999c80 tpm: Move tpm_get_random api into the TPM device driver
Move the tpm_get_random api from the trusted keys code into the TPM
device driver itself so that other callers can make use of it. Also,
change the api slightly so that the number of bytes read is returned in
the call, since the TPM command can potentially return fewer bytes than
requested.

Acked-by: David Safford <safford@linux.vnet.ibm.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
2012-08-22 11:11:33 -05:00
Tejun Heo
3b07e9ca26 workqueue: deprecate system_nrt[_freezable]_wq
system_nrt[_freezable]_wq are now spurious.  Mark them deprecated and
convert all users to system[_freezable]_wq.

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
Please use system[_freezable]_wq instead.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-By: Lai Jiangshan <laijs@cn.fujitsu.com>

Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Airlie <airlied@linux.ie>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
2012-08-20 14:51:24 -07:00
Kees Cook
7612bfeecc Yama: access task_struct->comm directly
The core ptrace access checking routine holds a task lock, and when
reporting a failure, Yama takes a separate task lock. To avoid a
potential deadlock with two ptracers taking the opposite locks, do not
use get_task_comm() and just use ->comm directly since accuracy is not
important for the report.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
CC: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-08-17 20:40:38 +10:00
Kees Cook
9d8dad742a Yama: higher restrictions should block PTRACE_TRACEME
The higher ptrace restriction levels should be blocking even
PTRACE_TRACEME requests. The comments in the LSM documentation are
misleading about when the checks happen (the parent does not go through
security_ptrace_access_check() on a PTRACE_TRACEME call).

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org # 3.5.x and later
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-08-10 19:58:07 +10:00
Mel Gorman
6290c2c439 selinux: tag avc cache alloc as non-critical
Failing to allocate a cache entry will only harm performance not
correctness.  Do not consume valuable reserve pages for something like
that.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-31 18:42:47 -07:00
Linus Torvalds
27c1ee3f92 Merge branch 'akpm' (Andrew's patch-bomb)
Merge Andrew's first set of patches:
 "Non-MM patches:

   - lots of misc bits

   - tree-wide have_clk() cleanups

   - quite a lot of printk tweaks.  I draw your attention to "printk:
     convert the format for KERN_<LEVEL> to a 2 byte pattern" which
     looks a bit scary.  But afaict it's solid.

   - backlight updates

   - lib/ feature work (notably the addition and use of memweight())

   - checkpatch updates

   - rtc updates

   - nilfs updates

   - fatfs updates (partial, still waiting for acks)

   - kdump, proc, fork, IPC, sysctl, taskstats, pps, etc

   - new fault-injection feature work"

* Merge emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
  drivers/misc/lkdtm.c: fix missing allocation failure check
  lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table()
  fault-injection: add tool to run command with failslab or fail_page_alloc
  fault-injection: add selftests for cpu and memory hotplug
  powerpc: pSeries reconfig notifier error injection module
  memory: memory notifier error injection module
  PM: PM notifier error injection module
  cpu: rewrite cpu-notifier-error-inject module
  fault-injection: notifier error injection
  c/r: fcntl: add F_GETOWNER_UIDS option
  resource: make sure requested range is included in the root range
  include/linux/aio.h: cpp->C conversions
  fs: cachefiles: add support for large files in filesystem caching
  pps: return PTR_ERR on error in device_create
  taskstats: check nla_reserve() return
  sysctl: suppress kmemleak messages
  ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION
  ipc: compat: use signed size_t types for msgsnd and msgrcv
  ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC
  ipc: add COMPAT_SHMLBA support
  ...
2012-07-30 17:25:34 -07:00
Cyrill Gorcunov
1d151c337d c/r: fcntl: add F_GETOWNER_UIDS option
When we restore file descriptors we would like them to look exactly as
they were at dumping time.

With help of fcntl it's almost possible, the missing snippet is file
owners UIDs.

To be able to read their values the F_GETOWNER_UIDS is introduced.

This option is valid iif CONFIG_CHECKPOINT_RESTORE is turned on, otherwise
returning -EINVAL.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-30 17:25:21 -07:00
Al Viro
e3fea3f70f selinux: fix selinux_inode_setxattr oops
OK, what we have so far is e.g.
	setxattr(path, name, whatever, 0, XATTR_REPLACE)
with name being good enough to get through xattr_permission().
Then we reach security_inode_setxattr() with the desired value and size.
Aha.  name should begin with "security.selinux", or we won't get that
far in selinux_inode_setxattr().  Suppose we got there and have enough
permissions to relabel that sucker.  We call security_context_to_sid()
with value == NULL, size == 0.  OK, we want ss_initialized to be non-zero.
I.e. after everything had been set up and running.  No problem...

We do 1-byte kmalloc(), zero-length memcpy() (which doesn't oops, even
thought the source is NULL) and put a NUL there.  I.e. form an empty
string.  string_to_context_struct() is called and looks for the first
':' in there.  Not found, -EINVAL we get.  OK, security_context_to_sid_core()
has rc == -EINVAL, force == 0, so it silently returns -EINVAL.
All it takes now is not having CAP_MAC_ADMIN and we are fucked.

All right, it might be a different bug (modulo strange code quoted in the
report), but it's real.  Easily fixed, AFAICS:

Deal with size == 0, value == NULL case in selinux_inode_setxattr()

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Dave Jones <davej@redhat.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-30 15:36:50 +10:00
Alan Cox
3b9fc37280 smack: off by one error
Consider the input case of a rule that consists entirely of non space
symbols followed by a \0. Say 64 + \0

In this case strlen(data) = 64
kzalloc of subject and object are 64 byte objects
sscanfdata, "%s %s %s", subject, ...)

will put 65 bytes into subject.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-30 15:04:17 +10:00
Josh Boyer
8ded2bbc18 posix_types.h: Cleanup stale __NFDBITS and related definitions
Recently, glibc made a change to suppress sign-conversion warnings in
FD_SET (glibc commit ceb9e56b3d1).  This uncovered an issue with the
kernel's definition of __NFDBITS if applications #include
<linux/types.h> after including <sys/select.h>.  A build failure would
be seen when passing the -Werror=sign-compare and -D_FORTIFY_SOURCE=2
flags to gcc.

It was suggested that the kernel should either match the glibc
definition of __NFDBITS or remove that entirely.  The current in-kernel
uses of __NFDBITS can be replaced with BITS_PER_LONG, and there are no
uses of the related __FDELT and __FDMASK defines.  Given that, we'll
continue the cleanup that was started with commit 8b3d1cda4f
("posix_types: Remove fd_set macros") and drop the remaining unused
macros.

Additionally, linux/time.h has similar macros defined that expand to
nothing so we'll remove those at the same time.

Reported-by: Jeff Law <law@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
[ .. and fix up whitespace as per akpm ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-26 13:36:43 -07:00
Linus Torvalds
3c4cfadef6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David S Miller:

 1) Remove the ipv4 routing cache.  Now lookups go directly into the FIB
    trie and use prebuilt routes cached there.

    No more garbage collection, no more rDOS attacks on the routing
    cache.  Instead we now get predictable and consistent performance,
    no matter what the pattern of traffic we service.

    This has been almost 2 years in the making.  Special thanks to
    Julian Anastasov, Eric Dumazet, Steffen Klassert, and others who
    have helped along the way.

    I'm sure that with a change of this magnitude there will be some
    kind of fallout, but such things ought the be simple to fix at this
    point.  Luckily I'm not European so I'll be around all of August to
    fix things :-)

    The major stages of this work here are each fronted by a forced
    merge commit whose commit message contains a top-level description
    of the motivations and implementation issues.

 2) Pre-demux of established ipv4 TCP sockets, saves a route demux on
    input.

 3) TCP SYN/ACK performance tweaks from Eric Dumazet.

 4) Add namespace support for netfilter L4 conntrack helpers, from Gao
    Feng.

 5) Add config mechanism for Energy Efficient Ethernet to ethtool, from
    Yuval Mintz.

 6) Remove quadratic behavior from /proc/net/unix, from Eric Dumazet.

 7) Support for connection tracker helpers in userspace, from Pablo
    Neira Ayuso.

 8) Allow userspace driven TX load balancing functions in TEAM driver,
    from Jiri Pirko.

 9) Kill off NLMSG_PUT and RTA_PUT macros, more gross stuff with
    embedded gotos.

10) TCP Small Queues, essentially minimize the amount of TCP data queued
    up in the packet scheduler layer.  Whereas the existing BQL (Byte
    Queue Limits) limits the pkt_sched --> netdevice queuing levels,
    this controls the TCP --> pkt_sched queueing levels.

    From Eric Dumazet.

11) Reduce the number of get_page/put_page ops done on SKB fragments,
    from Alexander Duyck.

12) Implement protection against blind resets in TCP (RFC 5961), from
    Eric Dumazet.

13) Support the client side of TCP Fast Open, basically the ability to
    send data in the SYN exchange, from Yuchung Cheng.

    Basically, the sender queues up data with a sendmsg() call using
    MSG_FASTOPEN, then they do the connect() which emits the queued up
    fastopen data.

14) Avoid all the problems we get into in TCP when timers or PMTU events
    hit a locked socket.  The TCP Small Queues changes added a
    tcp_release_cb() that allows us to queue work up to the
    release_sock() caller, and that's what we use here too.  From Eric
    Dumazet.

15) Zero copy on TX support for TUN driver, from Michael S. Tsirkin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1870 commits)
  genetlink: define lockdep_genl_is_held() when CONFIG_LOCKDEP
  r8169: revert "add byte queue limit support".
  ipv4: Change rt->rt_iif encoding.
  net: Make skb->skb_iif always track skb->dev
  ipv4: Prepare for change of rt->rt_iif encoding.
  ipv4: Remove all RTCF_DIRECTSRC handliing.
  ipv4: Really ignore ICMP address requests/replies.
  decnet: Don't set RTCF_DIRECTSRC.
  net/ipv4/ip_vti.c: Fix __rcu warnings detected by sparse.
  ipv4: Remove redundant assignment
  rds: set correct msg_namelen
  openvswitch: potential NULL deref in sample()
  tcp: dont drop MTU reduction indications
  bnx2x: Add new 57840 device IDs
  tcp: avoid oops in tcp_metrics and reset tcpm_stamp
  niu: Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return value
  niu: Fix to check for dma mapping errors.
  net: Fix references to out-of-scope variables in put_cmsg_compat()
  net: ethernet: davinci_emac: add pm_runtime support
  net: ethernet: davinci_emac: Remove unnecessary #include
  ...
2012-07-24 10:01:50 -07:00
Linus Torvalds
e05644e17e Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Nothing groundbreaking for this kernel, just cleanups and fixes, and a
  couple of Smack enhancements."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
  Smack: Maintainer Record
  Smack: don't show empty rules when /smack/load or /smack/load2 is read
  Smack: user access check bounds
  Smack: onlycap limits on CAP_MAC_ADMIN
  Smack: fix smack_new_inode bogosities
  ima: audit is compiled only when enabled
  ima: ima_initialized is set only if successful
  ima: add policy for pseudo fs
  ima: remove unused cleanup functions
  ima: free securityfs violations file
  ima: use full pathnames in measurement list
  security: Fix nommu build.
  samples: seccomp: add .gitignore for untracked executables
  tpm: check the chip reference before using it
  TPM: fix memleak when register hardware fails
  TPM: chip disabled state erronously being reported as error
  MAINTAINERS: TPM maintainers' contacts update
  Merge branches 'next-queue' and 'next' into next
  Remove unused code from MPI library
  Revert "crypto: GnuPG based MPI lib - additional sources (part 4)"
  ...
2012-07-23 18:49:06 -07:00
Linus Torvalds
a66d2c8f7e Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull the big VFS changes from Al Viro:
 "This one is *big* and changes quite a few things around VFS.  What's in there:

   - the first of two really major architecture changes - death to open
     intents.

     The former is finally there; it was very long in making, but with
     Miklos getting through really hard and messy final push in
     fs/namei.c, we finally have it.  Unlike his variant, this one
     doesn't introduce struct opendata; what we have instead is
     ->atomic_open() taking preallocated struct file * and passing
     everything via its fields.

     Instead of returning struct file *, it returns -E...  on error, 0
     on success and 1 in "deal with it yourself" case (e.g.  symlink
     found on server, etc.).

     See comments before fs/namei.c:atomic_open().  That made a lot of
     goodies finally possible and quite a few are in that pile:
     ->lookup(), ->d_revalidate() and ->create() do not get struct
     nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
     flags instead, ->create() gets "do we want it exclusive" flag.

     With the introduction of new helper (kern_path_locked()) we are rid
     of all struct nameidata instances outside of fs/namei.c; it's still
     visible in namei.h, but not for long.  Come the next cycle,
     declaration will move either to fs/internal.h or to fs/namei.c
     itself.  [me, miklos, hch]

   - The second major change: behaviour of final fput().  Now we have
     __fput() done without any locks held by caller *and* not from deep
     in call stack.

     That obviously lifts a lot of constraints on the locking in there.
     Moreover, it's legal now to call fput() from atomic contexts (which
     has immediately simplified life for aio.c).  We also don't need
     anti-recursion logics in __scm_destroy() anymore.

     There is a price, though - the damn thing has become partially
     asynchronous.  For fput() from normal process we are guaranteed
     that pending __fput() will be done before the caller returns to
     userland, exits or gets stopped for ptrace.

     For kernel threads and atomic contexts it's done via
     schedule_work(), so theoretically we might need a way to make sure
     it's finished; so far only one such place had been found, but there
     might be more.

     There's flush_delayed_fput() (do all pending __fput()) and there's
     __fput_sync() (fput() analog doing __fput() immediately).  I hope
     we won't need them often; see warnings in fs/file_table.c for
     details.  [me, based on task_work series from Oleg merged last
     cycle]

   - sync series from Jan

   - large part of "death to sync_supers()" work from Artem; the only
     bits missing here are exofs and ext4 ones.  As far as I understand,
     those are going via the exofs and ext4 trees resp.; once they are
     in, we can put ->write_super() to the rest, along with the thread
     calling it.

   - preparatory bits from unionmount series (from dhowells).

   - assorted cleanups and fixes all over the place, as usual.

  This is not the last pile for this cycle; there's at least jlayton's
  ESTALE work and fsfreeze series (the latter - in dire need of fixes,
  so I'm not sure it'll make the cut this cycle).  I'll probably throw
  symlink/hardlink restrictions stuff from Kees into the next pile, too.
  Plus there's a lot of misc patches I hadn't thrown into that one -
  it's large enough as it is..."

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
  ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
  btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
  switch dentry_open() to struct path, make it grab references itself
  spufs: shift dget/mntget towards dentry_open()
  zoran: don't bother with struct file * in zoran_map
  ecryptfs: don't reinvent the wheels, please - use struct completion
  don't expose I_NEW inodes via dentry->d_inode
  tidy up namei.c a bit
  unobfuscate follow_up() a bit
  ext3: pass custom EOF to generic_file_llseek_size()
  ext4: use core vfs llseek code for dir seeks
  vfs: allow custom EOF in generic_file_llseek code
  vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
  vfs: Remove unnecessary flushing of block devices
  vfs: Make sys_sync writeout also block device inodes
  vfs: Create function for iterating over block devices
  vfs: Reorder operations during sys_sync
  quota: Move quota syncing to ->sync_fs method
  quota: Split dquot_quota_sync() to writeback and cache flushing part
  vfs: Move noop_backing_dev_info check from sync into writeback
  ...
2012-07-23 12:27:27 -07:00
Al Viro
765927b2d5 switch dentry_open() to struct path, make it grab references itself
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23 00:01:29 +04:00
Al Viro
d35abdb288 hold task_lock around checks in keyctl
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:58:01 +04:00
Al Viro
67d1214551 merge task_work and rcu_head, get rid of separate allocation for keyring case
task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:57:56 +04:00
Al Viro
41f9d29f09 trimming task_work: kill ->data
get rid of the only user of ->data; this is _not_ the final variant - in the
end we'll have task_work and rcu_head identical and just use cred->rcu,
at which point the separate allocation will be gone completely.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:57:54 +04:00
David S. Miller
abaa72d7fd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
2012-07-19 11:17:30 -07:00
Linus Torvalds
e2f3b78557 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull SELinux regression fixes from James Morris.

Andrew Morton has a box that hit that open perms problem.

I also renamed the "epollwakeup" selinux name for the new capability to
be "block_suspend", to match the rename done by commit d9914cf661
("PM: Rename CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND").

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  SELinux: do not check open perms if they are not known to policy
  SELinux: include definition of new capabilities
2012-07-18 13:42:44 -07:00
Eric Paris
3d2195c332 SELinux: do not check open perms if they are not known to policy
When I introduced open perms policy didn't understand them and I
implemented them as a policycap.  When I added the checking of open perm
to truncate I forgot to conditionalize it on the userspace defined
policy capability.  Running an old policy with a new kernel will not
check open on open(2) but will check it on truncate.  Conditionalize the
truncate check the same as the open check.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: stable@vger.kernel.org # 3.4.x
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-16 11:41:47 +10:00
Eric Paris
64919e6091 SELinux: include definition of new capabilities
The kernel has added CAP_WAKE_ALARM and CAP_EPOLLWAKEUP.  We need to
define these in SELinux so they can be mediated by policy.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-16 11:40:31 +10:00
Rafal Krypa
65ee7f45cf Smack: don't show empty rules when /smack/load or /smack/load2 is read
This patch removes empty rules (i.e. with access set to '-') from the
rule list presented to user space.

Smack by design never removes labels nor rules from its lists. Access
for a rule may be set to '-' to effectively disable it. Such rules would
show up in the listing generated when /smack/load or /smack/load2 is
read. This may cause clutter if many rules were disabled.

As a rule with access set to '-' is equivalent to no rule at all, they
may be safely hidden from the listing.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:24 -07:00
Casey Schaufler
3518721a89 Smack: user access check bounds
Some of the bounds checking used on the /smack/access
interface was lost when support for long labels was
added. No kernel access checks are affected, however
this is a case where /smack/access could be used
incorrectly and fail to detect the error. This patch
reintroduces the original checks.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:24 -07:00
Casey Schaufler
1880eff77e Smack: onlycap limits on CAP_MAC_ADMIN
Smack is integrated with the POSIX capabilities scheme,
using the capabilities CAP_MAC_OVERRIDE and CAP_MAC_ADMIN to
determine if a process is allowed to ignore Smack checks or
change Smack related data respectively. Smack provides an
additional restriction that if an onlycap value is set
by writing to /smack/onlycap only tasks with that Smack
label are allowed to use CAP_MAC_OVERRIDE.

This change adds CAP_MAC_ADMIN as a capability that is affected
by the onlycap mechanism.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:23 -07:00
Casey Schaufler
eb982cb4cf Smack: fix smack_new_inode bogosities
In January of 2012 Al Viro pointed out three bits of code that
he titled "new_inode_smack bogosities". This patch repairs these
errors.

1. smack_sb_kern_mount() included a NULL check that is impossible.
   The check and NULL case are removed.
2. smack_kb_kern_mount() included pointless locking. The locking is
   removed. Since this is the only place that lock was used the lock
   is removed from the superblock_smack structure.
3. smk_fill_super() incorrectly and unnecessarily set the Smack label
   for the smackfs root inode. The assignment has been removed.

Targeted for git://gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:23 -07:00
Dmitry Kasatkin
417c6c8ee2 ima: audit is compiled only when enabled
IMA auditing code was compiled even when CONFIG_AUDIT was not enabled.
This patch compiles auditing code only when possible and enabled.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-05 16:43:59 -04:00
Dmitry Kasatkin
7ff2267af5 ima: ima_initialized is set only if successful
Set ima_initialized only if initialization was successful.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-05 16:43:57 -04:00
Dmitry Kasatkin
8445d64dd7 ima: add policy for pseudo fs
Exclude DEVPTS and BINFMT filesystems from the measurement policy.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-05 16:42:33 -04:00
David S. Miller
c90a9bb907 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-07-05 03:44:25 -07:00
Paul Mundt
75331a597c security: Fix nommu build.
The security + nommu configuration presently blows up with an undefined
reference to BDI_CAP_EXEC_MAP:

security/security.c: In function 'mmap_prot':
security/security.c:687:36: error: dereferencing pointer to incomplete type
security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function)
security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in

include backing-dev.h directly to fix it up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-03 21:41:03 +10:00
Dmitry Kasatkin
c7de7adc18 ima: remove unused cleanup functions
IMA cannot be used as module and does not need __exit functions.
Removed them.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-02 16:43:30 -04:00
Dmitry Kasatkin
0ea4f8ae41 ima: free securityfs violations file
On ima_fs_init() error, free securityfs violations file.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2012-07-02 16:43:30 -04:00
Mimi Zohar
08e1b76ae3 ima: use full pathnames in measurement list
The IMA measurement list contains filename hints, which can be
ambigious without the full pathname.  This patch replaces the
filename hint with the full pathname, simplifying for userspace
the correlating of file hash measurements with files.

Change log v1:
- Revert to short filenames, when full pathname is longer than IMA
  measurement buffer size. (Based on Dmitry's review)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-02 16:43:29 -04:00
Paul Mundt
659b5e7652 security: Fix nommu build.
The security + nommu configuration presently blows up with an undefined
reference to BDI_CAP_EXEC_MAP:

security/security.c: In function 'mmap_prot':
security/security.c:687:36: error: dereferencing pointer to incomplete type
security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function)
security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in

include backing-dev.h directly to fix it up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-02 23:56:04 +10:00
Pablo Neira Ayuso
a31f2d17b3 netlink: add netlink_kernel_cfg parameter to netlink_kernel_create
This patch adds the following structure:

struct netlink_kernel_cfg {
        unsigned int    groups;
        void            (*input)(struct sk_buff *skb);
        struct mutex    *cb_mutex;
};

That can be passed to netlink_kernel_create to set optional configurations
for netlink kernel sockets.

I've populated this structure by looking for NULL and zero parameters at the
existing code. The remaining parameters that always need to be set are still
left in the original interface.

That includes optional parameters for the netlink socket creation. This allows
easy extensibility of this interface in the future.

This patch also adapts all callers to use this new interface.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-29 16:46:02 -07:00
David S. Miller
01f534d0ae selinux: netlink: Move away from NLMSG_PUT().
And use nlmsg_data() while we're here too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-26 21:54:06 -07:00
James Morris
66dd07b88a Merge commit 'v3.5-rc2' into next 2012-06-10 22:52:10 +10:00
Alban Crequy
2597a8344c netfilter: selinux: switch hook PFs to nfproto
This patch is a cleanup. Use NFPROTO_* for consistency with other
netfilter code.

Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Reviewed-by: Vincent Sanders <vincent.sanders@collabora.co.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-07 14:58:43 +02:00
Linus Torvalds
1193755ac6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs changes from Al Viro.
 "A lot of misc stuff.  The obvious groups:
   * Miklos' atomic_open series; kills the damn abuse of
     ->d_revalidate() by NFS, which was the major stumbling block for
     all work in that area.
   * ripping security_file_mmap() and dealing with deadlocks in the
     area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
     general.
   * ->encode_fh() switched to saner API; insane fake dentry in
     mm/cleancache.c gone.
   * assorted annotations in fs (endianness, __user)
   * parts of Artem's ->s_dirty work (jff2 and reiserfs parts)
   * ->update_time() work from Josef.
   * other bits and pieces all over the place.

  Normally it would've been in two or three pull requests, but
  signal.git stuff had eaten a lot of time during this cycle ;-/"

Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
'truncate_range' inode method was removed by the VM changes, the VFS
update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
to sparse fix added twice, with other changes nearby).

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
  nfs: don't open in ->d_revalidate
  vfs: retry last component if opening stale dentry
  vfs: nameidata_to_filp(): don't throw away file on error
  vfs: nameidata_to_filp(): inline __dentry_open()
  vfs: do_dentry_open(): don't put filp
  vfs: split __dentry_open()
  vfs: do_last() common post lookup
  vfs: do_last(): add audit_inode before open
  vfs: do_last(): only return EISDIR for O_CREAT
  vfs: do_last(): check LOOKUP_DIRECTORY
  vfs: do_last(): make ENOENT exit RCU safe
  vfs: make follow_link check RCU safe
  vfs: do_last(): use inode variable
  vfs: do_last(): inline walk_component()
  vfs: do_last(): make exit RCU safe
  vfs: split do_lookup()
  Btrfs: move over to use ->update_time
  fs: introduce inode operation ->update_time
  reiserfs: get rid of resierfs_sync_super
  reiserfs: mark the superblock as dirty a bit later
  ...
2012-06-01 10:34:35 -07:00
Al Viro
98de59bfe4 take calculation of final prot in security_mmap_file() into a helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 10:37:17 -04:00
Al Viro
8b3ec6814c take security_mmap_file() outside of ->mmap_sem
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 10:37:01 -04:00
Linus Torvalds
fb21affa49 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull second pile of signal handling patches from Al Viro:
 "This one is just task_work_add() series + remaining prereqs for it.

  There probably will be another pull request from that tree this
  cycle - at least for helpers, to get them out of the way for per-arch
  fixes remaining in the tree."

Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  keys: kill task_struct->replacement_session_keyring
  keys: kill the dummy key_replace_session_keyring()
  keys: change keyctl_session_to_parent() to use task_work_add()
  genirq: reimplement exit_irq_thread() hook via task_work_add()
  task_work_add: generic process-context callbacks
  avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
  parisc: need to check NOTIFY_RESUME when exiting from syscall
  move key_repace_session_keyring() into tracehook_notify_resume()
  TIF_NOTIFY_RESUME is defined on all targets now
2012-05-31 18:47:30 -07:00
Christopher Yeoh
ac34ebb3a6 aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()
A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.

Rather than having an additional check_access parameter to these
functions, the first paramater type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY which means check that the contents of the iovec
are valid, but do not check the memory that they point to.  This is used
by process_vm_readv/writev where we need to validate that a iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:32 -07:00
Boaz Harrosh
81ab6e7b26 kmod: convert two call sites to call_usermodehelper_fns()
Both kernel/sys.c && security/keys/request_key.c where inlining the exact
same code as call_usermodehelper_fns(); So simply convert these sites to
directly use call_usermodehelper_fns().

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:28 -07:00
Andrew Morton
4f1c28d241 security/keys/keyctl.c: suppress memory allocation failure warning
This allocation may be large.  The code is probing to see if it will
succeed and if not, it falls back to vmalloc().  We should suppress any
page-allocation failure messages when the fallback happens.

Reported-by: Dave Jones <davej@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:26 -07:00
Al Viro
e5467859f7 split ->file_mmap() into ->mmap_addr()/->mmap_file()
... i.e. file-dependent and address-dependent checks.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-31 13:11:54 -04:00
Al Viro
d007794a18 split cap_mmap_addr() out of cap_file_mmap()
... switch callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-31 13:10:54 -04:00
Al Viro
cc1dad7183 selinuxfs snprintf() misuses
a) %d does _not_ produce a page worth of output
b) snprintf() doesn't return negatives - it used to in old glibc, but
that's the kernel...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-29 23:28:33 -04:00
David Howells
423b978802 KEYS: Fix some sparse warnings
Fix some sparse warnings in the keyrings code:

 (1) compat_keyctl_instantiate_key_iov() should be static.

 (2) There were a couple of places where a pointer was being compared against
     integer 0 rather than NULL.

 (3) keyctl_instantiate_key_common() should not take a __user-labelled iovec
     pointer as the caller must have copied the iovec to kernel space.

 (4) __key_link_begin() takes and __key_link_end() releases
     keyring_serialise_link_sem under some circumstances and so this should be
     declared.

     Note that adding __acquires() and __releases() for this doesn't help cure
     the warnings messages - something only commenting out both helps.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-25 20:51:42 +10:00
Oleg Nesterov
413cd3d9ab keys: change keyctl_session_to_parent() to use task_work_add()
Change keyctl_session_to_parent() to use task_work_add() and move
key_replace_session_keyring() logic into task_work->func().

Note that we do task_work_cancel() before task_work_add() to ensure that
only one work can be pending at any time.  This is important, we must not
allow user-space to abuse the parent's ->task_works list.

The callback, replace_session_keyring(), checks PF_EXITING.  I guess this
is not really needed but looks better.

As a side effect, this fixes the (unlikely) race.  The callers of
key_replace_session_keyring() and keyctl_session_to_parent() lack the
necessary barriers, the parent can miss the request.

Now we can remove task_struct->replacement_session_keyring and related
code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Smith <dsmith@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-23 22:11:23 -04:00
Al Viro
1227dd773d TIF_NOTIFY_RESUME is defined on all targets now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-23 22:09:19 -04:00
Linus Torvalds
644473e9c6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace enhancements from Eric Biederman:
 "This is a course correction for the user namespace, so that we can
  reach an inexpensive, maintainable, and reasonably complete
  implementation.

  Highlights:
   - Config guards make it impossible to enable the user namespace and
     code that has not been converted to be user namespace safe.

   - Use of the new kuid_t type ensures the if you somehow get past the
     config guards the kernel will encounter type errors if you enable
     user namespaces and attempt to compile in code whose permission
     checks have not been updated to be user namespace safe.

   - All uids from child user namespaces are mapped into the initial
     user namespace before they are processed.  Removing the need to add
     an additional check to see if the user namespace of the compared
     uids remains the same.

   - With the user namespaces compiled out the performance is as good or
     better than it is today.

   - For most operations absolutely nothing changes performance or
     operationally with the user namespace enabled.

   - The worst case performance I could come up with was timing 1
     billion cache cold stat operations with the user namespace code
     enabled.  This went from 156s to 164s on my laptop (or 156ns to
     164ns per stat operation).

   - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
     Most uid/gid setting system calls treat these value specially
     anyway so attempting to use -1 as a uid would likely cause
     entertaining failures in userspace.

   - If setuid is called with a uid that can not be mapped setuid fails.
     I have looked at sendmail, login, ssh and every other program I
     could think of that would call setuid and they all check for and
     handle the case where setuid fails.

   - If stat or a similar system call is called from a context in which
     we can not map a uid we lie and return overflowuid.  The LFS
     experience suggests not lying and returning an error code might be
     better, but the historical precedent with uids is different and I
     can not think of anything that would break by lying about a uid we
     can't map.

   - Capabilities are localized to the current user namespace making it
     safe to give the initial user in a user namespace all capabilities.

  My git tree covers all of the modifications needed to convert the core
  kernel and enough changes to make a system bootable to runlevel 1."

Fix up trivial conflicts due to nearby independent changes in fs/stat.c

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
  userns:  Silence silly gcc warning.
  cred: use correct cred accessor with regards to rcu read lock
  userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
  userns: Convert cgroup permission checks to use uid_eq
  userns: Convert tmpfs to use kuid and kgid where appropriate
  userns: Convert sysfs to use kgid/kuid where appropriate
  userns: Convert sysctl permission checks to use kuid and kgids.
  userns: Convert proc to use kuid/kgid where appropriate
  userns: Convert ext4 to user kuid/kgid where appropriate
  userns: Convert ext3 to use kuid/kgid where appropriate
  userns: Convert ext2 to use kuid/kgid where appropriate.
  userns: Convert devpts to use kuid/kgid where appropriate
  userns: Convert binary formats to use kuid/kgid where appropriate
  userns: Add negative depends on entries to avoid building code that is userns unsafe
  userns: signal remove unnecessary map_cred_ns
  userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
  userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
  userns: Convert stat to return values mapped from kuids and kgids
  userns: Convert user specfied uids and gids in chown into kuids and kgid
  userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
  ...
2012-05-23 17:42:39 -07:00
Linus Torvalds
88d6ae8dc3 Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "cgroup file type addition / removal is updated so that file types are
  added and removed instead of individual files so that dynamic file
  type addition / removal can be implemented by cgroup and used by
  controllers.  blkio controller changes which will come through block
  tree are dependent on this.  Other changes include res_counter cleanup
  and disallowing kthread / PF_THREAD_BOUND threads to be attached to
  non-root cgroups.

  There's a reported bug with the file type addition / removal handling
  which can lead to oops on cgroup umount.  The issue is being looked
  into.  It shouldn't cause problems for most setups and isn't a
  security concern."

Fix up trivial conflict in Documentation/feature-removal-schedule.txt

* 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
  res_counter: Account max_usage when calling res_counter_charge_nofail()
  res_counter: Merge res_counter_charge and res_counter_charge_nofail
  cgroups: disallow attaching kthreadd or PF_THREAD_BOUND threads
  cgroup: remove cgroup_subsys->populate()
  cgroup: get rid of populate for memcg
  cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg
  cgroup: make css->refcnt clearing on cgroup removal optional
  cgroup: use negative bias on css->refcnt to block css_tryget()
  cgroup: implement cgroup_rm_cftypes()
  cgroup: introduce struct cfent
  cgroup: relocate __d_cgrp() and __d_cft()
  cgroup: remove cgroup_add_file[s]()
  cgroup: convert memcg controller to the new cftype interface
  memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP
  cgroup: convert all non-memcg controllers to the new cftype interface
  cgroup: relocate cftype and cgroup_subsys definitions in controllers
  cgroup: merge cft_release_agent cftype array into the base files array
  cgroup: implement cgroup_add_cftypes() and friends
  cgroup: build list of all cgroups under a given cgroupfs_root
  cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir()
  ...
2012-05-22 17:40:19 -07:00
Linus Torvalds
cb60e3e65c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "New notable features:
   - The seccomp work from Will Drewry
   - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski
   - Longer security labels for Smack from Casey Schaufler
   - Additional ptrace restriction modes for Yama by Kees Cook"

Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits)
  apparmor: fix long path failure due to disconnected path
  apparmor: fix profile lookup for unconfined
  ima: fix filename hint to reflect script interpreter name
  KEYS: Don't check for NULL key pointer in key_validate()
  Smack: allow for significantly longer Smack labels v4
  gfp flags for security_inode_alloc()?
  Smack: recursive tramsmute
  Yama: replace capable() with ns_capable()
  TOMOYO: Accept manager programs which do not start with / .
  KEYS: Add invalidation support
  KEYS: Do LRU discard in full keyrings
  KEYS: Permit in-place link replacement in keyring list
  KEYS: Perform RCU synchronisation on keys prior to key destruction
  KEYS: Announce key type (un)registration
  KEYS: Reorganise keys Makefile
  KEYS: Move the key config into security/keys/Kconfig
  KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat
  Yama: remove an unused variable
  samples/seccomp: fix dependencies on arch macros
  Yama: add additional ptrace scopes
  ...
2012-05-21 20:27:36 -07:00
James Morris
ff2bb047c4 Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into next
Per pull request, for 3.5.
2012-05-22 11:21:06 +10:00
John Johansen
cffee16e8b apparmor: fix long path failure due to disconnected path
BugLink: http://bugs.launchpad.net/bugs/955892

All failures from __d_path where being treated as disconnected paths,
however __d_path can also fail when the generated pathname is too long.

The initial ENAMETOOLONG error was being lost, and ENAMETOOLONG was only
returned if the subsequent dentry_path call resulted in that error.  Other
wise if the path was split across a mount point such that the dentry_path
fit within the buffer when the __d_path did not the failure was treated
as a disconnected path.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2012-05-18 11:09:52 -07:00
John Johansen
bf83208e0b apparmor: fix profile lookup for unconfined
BugLink: http://bugs.launchpad.net/bugs/978038

also affects apparmor portion of
BugLink: http://bugs.launchpad.net/bugs/987371

The unconfined profile is not stored in the regular profile list, but
change_profile and exec transitions may want access to it when setting
up specialized transitions like switch to the unconfined profile of a
new policy namespace.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2012-05-18 11:09:28 -07:00
Mimi Zohar
fbbb456347 ima: fix filename hint to reflect script interpreter name
When IMA was first upstreamed, the bprm filename and interp were
always the same.  Currently, the bprm->filename and bprm->interp
are the same, except for when only bprm->interp contains the
interpreter name.  So instead of using the bprm->filename as
the IMA filename hint in the measurement list, we could replace
it with bprm->interp, but this feels too fragil.

The following patch is not much better, but at least there is some
indication that sometimes we're passing the filename and other times
the interpreter name.

Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-16 10:36:41 +10:00
James Morris
12fa8a2732 Merge branch 'for-1205' of http://git.gitorious.org/smack-next/kernel into next
Pull request from Casey.
2012-05-16 01:11:29 +10:00
David Howells
b404aef72f KEYS: Don't check for NULL key pointer in key_validate()
Don't bother checking for NULL key pointer in key_validate() as all of the
places that call it will crash anyway if the relevant key pointer is NULL by
the time they call key_validate().  Therefore, the checking must be done prior
to calling here.

Whilst we're at it, simplify the key_validate() function a bit and mark its
argument const.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-16 00:54:33 +10:00
Casey Schaufler
f7112e6c9a Smack: allow for significantly longer Smack labels v4
V4 updated to current linux-security#next
Targeted for git://gitorious.org/smack-next/kernel.git

Modern application runtime environments like to use
naming schemes that are structured and generated without
human intervention. Even though the Smack limit of 23
characters for a label name is perfectly rational for
human use there have been complaints that the limit is
a problem in environments where names are composed from
a set or sources, including vendor, author, distribution
channel and application name. Names like

	softwarehouse-pgwodehouse-coolappstore-mellowmuskrats

are becoming harder to avoid. This patch introduces long
label support in Smack. Labels are now limited to 255
characters instead of the old 23.

The primary reason for limiting the labels to 23 characters
was so they could be directly contained in CIPSO category sets.
This is still done were possible, but for labels that are too
large a mapping is required. This is perfectly safe for communication
that stays "on the box" and doesn't require much coordination
between boxes beyond what would have been required to keep label
names consistent.

The bulk of this patch is in smackfs, adding and updating
administrative interfaces. Because existing APIs can't be
changed new ones that do much the same things as old ones
have been introduced.

The Smack specific CIPSO data representation has been removed
and replaced with the data format used by netlabel. The CIPSO
header is now computed when a label is imported rather than
on use. This results in improved IP performance. The smack
label is now allocated separately from the containing structure,
allowing for larger strings.

Four new /smack interfaces have been introduced as four
of the old interfaces strictly required labels be specified
in fixed length arrays.

The access interface is supplemented with the check interface:
	access  "Subject                 Object                  rwxat"
	access2 "Subject Object rwaxt"

The load interface is supplemented with the rules interface:
	load   "Subject                 Object                  rwxat"
	load2  "Subject Object rwaxt"

The load-self interface is supplemented with the self-rules interface:
	load-self   "Subject                 Object                  rwxat"
	load-self2  "Subject Object rwaxt"

The cipso interface is supplemented with the wire interface:
	cipso  "Subject                  lvl cnt  c1  c2 ..."
	cipso2 "Subject lvl cnt  c1  c2 ..."

The old interfaces are maintained for compatibility.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-05-14 22:48:38 -07:00
Tetsuo Handa
ceffec5541 gfp flags for security_inode_alloc()?
Dave Chinner wrote:
> Yes, because you have no idea what the calling context is except
> for the fact that is from somewhere inside filesystem code and the
> filesystem could be holding locks. Therefore, GFP_NOFS is really the
> only really safe way to allocate memory here.

I see. Thank you.

I'm not sure, but can call trace happen where somewhere inside network
filesystem or stackable filesystem code with locks held invokes operations that
involves GFP_KENREL memory allocation outside that filesystem?
----------
[PATCH] SMACK: Fix incorrect GFP_KERNEL usage.

new_inode_smack() which can be called from smack_inode_alloc_security() needs
to use GFP_NOFS like SELinux's inode_alloc_security() does, for
security_inode_alloc() is called from inode_init_always() and
inode_init_always() is called from xfs_inode_alloc() which is using GFP_NOFS.

smack_inode_init_security() needs to use GFP_NOFS like
selinux_inode_init_security() does, for initxattrs() callback function (e.g.
btrfs_initxattrs()) which is called from security_inode_init_security() is
using GFP_NOFS.

smack_audit_rule_match() needs to use GFP_ATOMIC, for
security_audit_rule_match() can be called from audit_filter_user_rules() and
audit_filter_user_rules() is called from audit_filter_user() with RCU read lock
held.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <cschaufler@cschaufler-intel.(none)>
2012-05-14 22:47:44 -07:00
Casey Schaufler
2267b13a7c Smack: recursive tramsmute
The transmuting directory feature of Smack requires that
the transmuting attribute be explicitly set in all cases.
It seems the users of this facility would expect that the
transmuting attribute be inherited by subdirectories that
are created in a transmuting directory. This does not seem
to add any additional complexity to the understanding of
how the system works.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-05-14 22:45:17 -07:00
Kees Cook
2cc8a71641 Yama: replace capable() with ns_capable()
When checking capabilities, the question we want to be asking is "does
current() have the capability in the child's namespace?"

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-15 10:27:57 +10:00
Tetsuo Handa
77b513dda9 TOMOYO: Accept manager programs which do not start with / .
The pathname of /usr/sbin/tomoyo-editpolicy seen from Ubuntu 12.04 Live CD is
squashfs:/usr/sbin/tomoyo-editpolicy rather than /usr/sbin/tomoyo-editpolicy .
Therefore, we need to accept manager programs which do not start with / .

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-15 10:24:29 +10:00
David Howells
fd75815f72 KEYS: Add invalidation support
Add support for invalidating a key - which renders it immediately invisible to
further searches and causes the garbage collector to immediately wake up,
remove it from keyrings and then destroy it when it's no longer referenced.

It's better not to do this with keyctl_revoke() as that marks the key to start
returning -EKEYREVOKED to searches when what is actually desired is to have the
key refetched.

To invalidate a key the caller must be granted SEARCH permission by the key.
This may be too strict.  It may be better to also permit invalidation if the
caller has any of READ, WRITE or SETATTR permission.

The primary use for this is to evict keys that are cached in special keyrings,
such as the DNS resolver or an ID mapper.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-05-11 10:56:56 +01:00
David Howells
31d5a79d7f KEYS: Do LRU discard in full keyrings
Do an LRU discard in keyrings that are full rather than returning ENFILE.  To
perform this, a time_t is added to the key struct and updated by the creation
of a link to a key and by a key being found as the result of a search.  At the
completion of a successful search, the keyrings in the path between the root of
the search and the first found link to it also have their last-used times
updated.

Note that discarding a link to a key from a keyring does not necessarily
destroy the key as there may be references held by other places.

An alternate discard method that might suffice is to perform FIFO discard from
the keyring, using the spare 2-byte hole in the keylist header as the index of
the next link to be discarded.

This is useful when using a keyring as a cache for DNS results or foreign
filesystem IDs.


This can be tested by the following.  As root do:

	echo 1000 >/proc/sys/kernel/keys/root_maxkeys

	kr=`keyctl newring foo @s`
	for ((i=0; i<2000; i++)); do keyctl add user a$i a $kr; done

Without this patch ENFILE should be reported when the keyring fills up.  With
this patch, the keyring discards keys in an LRU fashion.  Note that the stored
LRU time has a granularity of 1s.

After doing this, /proc/key-users can be observed and should show that most of
the 2000 keys have been discarded:

	[root@andromeda ~]# cat /proc/key-users
	    0:   517 516/516 513/1000 5249/20000

The "513/1000" here is the number of quota-accounted keys present for this user
out of the maximum permitted.

In /proc/keys, the keyring shows the number of keys it has and the number of
slots it has allocated:

	[root@andromeda ~]# grep foo /proc/keys
	200c64c4 I--Q--     1 perm 3b3f0000     0     0 keyring   foo: 509/509

The maximum is (PAGE_SIZE - header) / key pointer size.  That's typically 509
on a 64-bit system and 1020 on a 32-bit system.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-05-11 10:56:56 +01:00
David Howells
233e4735f2 KEYS: Permit in-place link replacement in keyring list
Make use of the previous patch that makes the garbage collector perform RCU
synchronisation before destroying defunct keys.  Key pointers can now be
replaced in-place without creating a new keyring payload and replacing the
whole thing as the discarded keys will not be destroyed until all currently
held RCU read locks are released.

If the keyring payload space needs to be expanded or contracted, then a
replacement will still need allocating, and the original will still have to be
freed by RCU.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-05-11 10:56:56 +01:00
David Howells
65d87fe68a KEYS: Perform RCU synchronisation on keys prior to key destruction
Make the keys garbage collector invoke synchronize_rcu() prior to destroying
keys with a zero usage count.  This means that a key can be examined under the
RCU read lock in the safe knowledge that it won't get deallocated until after
the lock is released - even if its usage count becomes zero whilst we're
looking at it.

This is useful in keyring search vs key link.  Consider a keyring containing a
link to a key.  That link can be replaced in-place in the keyring without
requiring an RCU copy-and-replace on the keyring contents without breaking a
search underway on that keyring when the displaced key is released, provided
the key is actually destroyed only after the RCU read lock held by the search
algorithm is released.

This permits __key_link() to replace a key without having to reallocate the key
payload.  A key gets replaced if a new key being linked into a keyring has the
same type and description.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
2012-05-11 10:56:56 +01:00
David Howells
1eb1bcf5bf KEYS: Announce key type (un)registration
Announce the (un)registration of a key type in the core key code rather than
in the callers.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
2012-05-11 10:56:56 +01:00