Commit Graph

1992 Commits

Author SHA1 Message Date
Mimi Zohar
bb7f9e5c38 evm: prohibit userspace writing 'security.evm' HMAC value
commit 2fb1c9a4f2 upstream.

Calculating the 'security.evm' HMAC value requires access to the
EVM encrypted key.  Only the kernel should have access to it.  This
patch prevents userspace tools(eg. setfattr, cp --preserve=xattr)
from setting/modifying the 'security.evm' HMAC value directly.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-26 15:12:37 -04:00
Dmitry Kasatkin
8b46db3abb ima: introduce ima_kernel_read()
commit 0430e49b6e upstream.

Commit 8aac62706 "move exit_task_namespaces() outside of exit_notify"
introduced the kernel opps since the kernel v3.10, which happens when
Apparmor and IMA-appraisal are enabled at the same time.

----------------------------------------------------------------------
[  106.750167] BUG: unable to handle kernel NULL pointer dereference at
0000000000000018
[  106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[  106.750241] PGD 0
[  106.750254] Oops: 0000 [#1] SMP
[  106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm
bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp
kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul
ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel
snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi
snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw
snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp
parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci
pps_core
[  106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15
[  106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08
09/19/2012
[  106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti:
ffff880400fca000
[  106.750704] RIP: 0010:[<ffffffff811ec7da>]  [<ffffffff811ec7da>]
our_mnt+0x1a/0x30
[  106.750725] RSP: 0018:ffff880400fcba60  EFLAGS: 00010286
[  106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX:
ffff8800d51523e7
[  106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI:
ffff880402d20020
[  106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09:
0000000000000001
[  106.750817] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff8800d5152300
[  106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15:
ffff8800d51523e7
[  106.750871] FS:  0000000000000000(0000) GS:ffff88040d200000(0000)
knlGS:0000000000000000
[  106.750910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4:
00000000001407e0
[  106.750962] Stack:
[  106.750981]  ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18
0000000000000000
[  106.751037]  ffff8800de804920 ffffffff8101b9b9 0001800000000000
0000000000000100
[  106.751093]  0000010000000000 0000000000000002 000000000000000e
ffff8803eb8df500
[  106.751149] Call Trace:
[  106.751172]  [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430
[  106.751199]  [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10
[  106.751225]  [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170
[  106.751250]  [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80
[  106.751276]  [<ffffffff8134aa73>] aa_file_perm+0x33/0x40
[  106.751301]  [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0
[  106.751327]  [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20
[  106.751355]  [<ffffffff8130c853>] security_file_permission+0x23/0xa0
[  106.751382]  [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0
[  106.751407]  [<ffffffff811c789d>] vfs_read+0x6d/0x170
[  106.751432]  [<ffffffff811cda31>] kernel_read+0x41/0x60
[  106.751457]  [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280
[  106.751483]  [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280
[  106.751509]  [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160
[  106.751536]  [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10
[  106.751562]  [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0
[  106.751587]  [<ffffffff81352824>] ima_update_xattr+0x34/0x60
[  106.751612]  [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0
[  106.751637]  [<ffffffff811c9635>] __fput+0xd5/0x300
[  106.751662]  [<ffffffff811c98ae>] ____fput+0xe/0x10
[  106.751687]  [<ffffffff81086774>] task_work_run+0xc4/0xe0
[  106.751712]  [<ffffffff81066fad>] do_exit+0x2bd/0xa90
[  106.751738]  [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b
[  106.751763]  [<ffffffff8106780c>] do_group_exit+0x4c/0xc0
[  106.751788]  [<ffffffff81067894>] SyS_exit_group+0x14/0x20
[  106.751814]  [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f
[  106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3
0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89
e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00
[  106.752185] RIP  [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[  106.752214]  RSP <ffff880400fcba60>
[  106.752236] CR2: 0000000000000018
[  106.752258] ---[ end trace 3c520748b4732721 ]---
----------------------------------------------------------------------

The reason for the oops is that IMA-appraisal uses "kernel_read()" when
file is closed. kernel_read() honors LSM security hook which calls
Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty'
commit changed the order of cleanup code so that nsproxy->mnt_ns was
not already available for Apparmor.

Discussion about the issue with Al Viro and Eric W. Biederman suggested
that kernel_read() is too high-level for IMA. Another issue, except
security checking, that was identified is mandatory locking. kernel_read
honors it as well and it might prevent IMA from calculating necessary hash.
It was suggested to use simplified version of the function without security
and locking checks.

This patch introduces special version ima_kernel_read(), which skips security
and mandatory locking checking. It prevents the kernel oops to happen.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-26 15:12:37 -04:00
Paul Moore
913a13b6da selinux: correctly label /proc inodes in use before the policy is loaded
commit f64410ec66 upstream.

This patch is based on an earlier patch by Eric Paris, he describes
the problem below:

  "If an inode is accessed before policy load it will get placed on a
   list of inodes to be initialized after policy load.  After policy
   load we call inode_doinit() which calls inode_doinit_with_dentry()
   on all inodes accessed before policy load.  In the case of inodes
   in procfs that means we'll end up at the bottom where it does:

     /* Default to the fs superblock SID. */
     isec->sid = sbsec->sid;

     if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) {
             if (opt_dentry) {
                     isec->sclass = inode_mode_to_security_class(...)
                     rc = selinux_proc_get_sid(opt_dentry,
                                               isec->sclass,
                                               &sid);
                     if (rc)
                             goto out_unlock;
                     isec->sid = sid;
             }
     }

   Since opt_dentry is null, we'll never call selinux_proc_get_sid()
   and will leave the inode labeled with the label on the superblock.
   I believe a fix would be to mimic the behavior of xattrs.  Look
   for an alias of the inode.  If it can't be found, just leave the
   inode uninitialized (and pick it up later) if it can be found, we
   should be able to call selinux_proc_get_sid() ..."

On a system exhibiting this problem, you will notice a lot of files in
/proc with the generic "proc_t" type (at least the ones that were
accessed early in the boot), for example:

   # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
   system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax

However, with this patch in place we see the expected result:

   # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
   system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax

Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14 06:42:14 -07:00
Eric Paris
3e66969eab SELinux: bigendian problems with filename trans rules
commit 9085a64229 upstream.

When writing policy via /sys/fs/selinux/policy I wrote the type and class
of filename trans rules in CPU endian instead of little endian.  On
x86_64 this works just fine, but it means that on big endian arch's like
ppc64 and s390 userspace reads the policy and converts it from
le32_to_cpu.  So the values are all screwed up.  Write the values in le
format like it should have been to start.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-06 21:30:12 -08:00
Stephen Smalley
95664c9622 SELinux: Fix kernel BUG on empty security contexts.
commit 2172fa709a upstream.

Setting an empty security context (length=0) on a file will
lead to incorrectly dereferencing the type and other fields
of the security context structure, yielding a kernel BUG.
As a zero-length security context is never valid, just reject
all such security contexts whether coming from userspace
via setxattr or coming from the filesystem upon a getxattr
request by SELinux.

Setting a security context value (empty or otherwise) unknown to
SELinux in the first place is only possible for a root process
(CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only
if the corresponding SELinux mac_admin permission is also granted
to the domain by policy.  In Fedora policies, this is only allowed for
specific domains such as livecd for setting down security contexts
that are not defined in the build host policy.

Reproducer:
su
setenforce 0
touch foo
setfattr -n security.selinux foo

Caveat:
Relabeling or removing foo after doing the above may not be possible
without booting with SELinux disabled.  Any subsequent access to foo
after doing the above will also trigger the BUG.

BUG output from Matthew Thode:
[  473.893141] ------------[ cut here ]------------
[  473.962110] kernel BUG at security/selinux/ss/services.c:654!
[  473.995314] invalid opcode: 0000 [#6] SMP
[  474.027196] Modules linked in:
[  474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G      D   I
3.13.0-grsec #1
[  474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0
07/29/10
[  474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti:
ffff8805f50cd488
[  474.183707] RIP: 0010:[<ffffffff814681c7>]  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  474.219954] RSP: 0018:ffff8805c0ac3c38  EFLAGS: 00010246
[  474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX:
0000000000000100
[  474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI:
ffff8805e8aaa000
[  474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09:
0000000000000006
[  474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12:
0000000000000006
[  474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15:
0000000000000000
[  474.453816] FS:  00007f2e75220800(0000) GS:ffff88061fc00000(0000)
knlGS:0000000000000000
[  474.489254] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4:
00000000000207f0
[  474.556058] Stack:
[  474.584325]  ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98
ffff8805f1190a40
[  474.618913]  ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990
ffff8805e8aac860
[  474.653955]  ffff8805c0ac3cb8 000700068113833a ffff880606c75060
ffff8805c0ac3d94
[  474.690461] Call Trace:
[  474.723779]  [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a
[  474.778049]  [<ffffffff81468824>] security_compute_av+0xf4/0x20b
[  474.811398]  [<ffffffff8196f419>] avc_compute_av+0x2a/0x179
[  474.843813]  [<ffffffff8145727b>] avc_has_perm+0x45/0xf4
[  474.875694]  [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31
[  474.907370]  [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e
[  474.938726]  [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22
[  474.970036]  [<ffffffff811b057d>] vfs_getattr+0x19/0x2d
[  475.000618]  [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91
[  475.030402]  [<ffffffff811b063b>] vfs_lstat+0x19/0x1b
[  475.061097]  [<ffffffff811b077e>] SyS_newlstat+0x15/0x30
[  475.094595]  [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3
[  475.148405]  [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b
[  475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48
8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7
75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8
[  475.255884] RIP  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  475.296120]  RSP <ffff8805c0ac3c38>
[  475.328734] ---[ end trace f076482e9d754adc ]---

Reported-by:  Matthew Thode <mthode@mthode.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-20 11:06:10 -08:00
Tetsuo Handa
f6333f55fc SELinux: Fix memory leak upon loading policy
commit 8ed8146028 upstream.

Hello.

I got below leak with linux-3.10.0-54.0.1.el7.x86_64 .

[  681.903890] kmemleak: 5538 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

Below is a patch, but I don't know whether we need special handing for undoing
ebitmap_set_bit() call.
----------
>>From fe97527a90fe95e2239dfbaa7558f0ed559c0992 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Mon, 6 Jan 2014 16:30:21 +0900
Subject: SELinux: Fix memory leak upon loading policy

Commit 2463c26d "SELinux: put name based create rules in a hashtable" did not
check return value from hashtab_insert() in filename_trans_read(). It leaks
memory if hashtab_insert() returns error.

  unreferenced object 0xffff88005c9160d0 (size 8):
    comm "systemd", pid 1, jiffies 4294688674 (age 235.265s)
    hex dump (first 8 bytes):
      57 0b 00 00 6b 6b 6b a5                          W...kkk.
    backtrace:
      [<ffffffff816604ae>] kmemleak_alloc+0x4e/0xb0
      [<ffffffff811cba5e>] kmem_cache_alloc_trace+0x12e/0x360
      [<ffffffff812aec5d>] policydb_read+0xd1d/0xf70
      [<ffffffff812b345c>] security_load_policy+0x6c/0x500
      [<ffffffff812a623c>] sel_write_load+0xac/0x750
      [<ffffffff811eb680>] vfs_write+0xc0/0x1f0
      [<ffffffff811ec08c>] SyS_write+0x4c/0xa0
      [<ffffffff81690419>] system_call_fastpath+0x16/0x1b
      [<ffffffffffffffff>] 0xffffffffffffffff

However, we should not return EEXIST error to the caller, or the systemd will
show below message and the boot sequence freezes.

  systemd[1]: Failed to load SELinux policy. Freezing.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-13 13:47:59 -08:00
Steven Rostedt
057f2f7daf SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()
commit 3dc91d4338 upstream.

While running stress tests on adding and deleting ftrace instances I hit
this bug:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  IP: selinux_inode_permission+0x85/0x160
  PGD 63681067 PUD 7ddbe067 PMD 0
  Oops: 0000 [#1] PREEMPT
  CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
  Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
  task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
  RIP: 0010:[<ffffffff812d8bc5>]  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
  RSP: 0018:ffff88007ddb1c48  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
  RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
  RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
  R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
  R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
  FS:  00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
  CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
  Call Trace:
    security_inode_permission+0x1c/0x30
    __inode_permission+0x41/0xa0
    inode_permission+0x18/0x50
    link_path_walk+0x66/0x920
    path_openat+0xa6/0x6c0
    do_filp_open+0x43/0xa0
    do_sys_open+0x146/0x240
    SyS_open+0x1e/0x20
    system_call_fastpath+0x16/0x1b
  Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
  RIP  selinux_inode_permission+0x85/0x160
  CR2: 0000000000000020

Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():

	isec = inode->i_security;

	rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);

Note, the crash came from stressing the deletion and reading of debugfs
files.  I was not able to recreate this via normal files.  But I'm not
sure they are safe.  It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct.  Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here.  (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).

Note, this is a hack, but it fixes the problem at hand.  A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback.  But that is a major job to do, and requires a
lot of work.  For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.

Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-25 08:27:11 -08:00
Paul Moore
0fdb9385a5 selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()
commit c0828e5048 upstream.

Due to difficulty in arriving at the proper security label for
TCP SYN-ACK packets in selinux_ip_postroute(), we need to check packets
while/before they are undergoing XFRM transforms instead of waiting
until afterwards so that we can determine the correct security label.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-09 12:24:24 -08:00
Paul Moore
070357081f selinux: look for IPsec labels on both inbound and outbound packets
commit 817eff718d upstream.

Previously selinux_skb_peerlbl_sid() would only check for labeled
IPsec security labels on inbound packets, this patch enables it to
check both inbound and outbound traffic for labeled IPsec security
labels.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-09 12:24:24 -08:00
Oleg Nesterov
2b25925431 selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock()
commit c0c1439541 upstream.

selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace,
this looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().

And in theory this is wrong, spin_lock()->preempt_disable()
doesn't necessarily imply rcu_read_lock() we need to access
the ->parent.

Reported-by: Evan McNabb <emcnabb@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-09 12:24:20 -08:00
Chad Hanson
faecbbe421 selinux: fix broken peer recv check
commit 46d01d6322 upstream.

Fix a broken networking check. Return an error if peer recv fails.  If
secmark is active and the packet recv succeeds the peer recv error is
ignored.

Signed-off-by: Chad Hanson <chanson@trustedcs.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-09 12:24:20 -08:00
Paul Moore
348d786715 selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()
commit 446b802437 upstream.

In selinux_ip_postroute() we perform access checks based on the
packet's security label.  For locally generated traffic we get the
packet's security label from the associated socket; this works in all
cases except for TCP SYN-ACK packets.  In the case of SYN-ACK packet's
the correct security label is stored in the connection's request_sock,
not the server's socket.  Unfortunately, at the point in time when
selinux_ip_postroute() is called we can't query the request_sock
directly, we need to recreate the label using the same logic that
originally labeled the associated request_sock.

See the inline comments for more explanation.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-20 07:45:09 -08:00
Paul Moore
216c4a776a selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()
commit 4718006827 upstream.

In selinux_ip_output() we always label packets based on the parent
socket.  While this approach works in almost all cases, it doesn't
work in the case of TCP SYN-ACK packets when the correct label is not
the label of the parent socket, but rather the label of the larval
socket represented by the request_sock struct.

Unfortunately, since the request_sock isn't queued on the parent
socket until *after* the SYN-ACK packet is sent, we can't lookup the
request_sock to determine the correct label for the packet; at this
point in time the best we can do is simply pass/NF_ACCEPT the packet.
It must be said that simply passing the packet without any explicit
labeling action, while far from ideal, is not terrible as the SYN-ACK
packet will inherit any IP option based labeling from the initial
connection request so the label *should* be correct and all our
access controls remain in place so we shouldn't have to worry about
information leaks.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-20 07:45:09 -08:00
Paul Moore
6be060cda9 selinux: correct locking in selinux_netlbl_socket_connect)
commit 42d64e1add upstream.

The SELinux/NetLabel glue code has a locking bug that affects systems
with NetLabel enabled, see the kernel error message below.  This patch
corrects this problem by converting the bottom half socket lock to a
more conventional, and correct for this call-path, lock_sock() call.

 ===============================
 [ INFO: suspicious RCU usage. ]
 3.11.0-rc3+ #19 Not tainted
 -------------------------------
 net/ipv4/cipso_ipv4.c:1928 suspicious rcu_dereference_protected() usage!

 other info that might help us debug this:

 rcu_scheduler_active = 1, debug_locks = 0
 2 locks held by ping/731:
  #0:  (slock-AF_INET/1){+.-...}, at: [...] selinux_netlbl_socket_connect
  #1:  (rcu_read_lock){.+.+..}, at: [<...>] netlbl_conn_setattr

 stack backtrace:
 CPU: 1 PID: 731 Comm: ping Not tainted 3.11.0-rc3+ #19
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000001 ffff88006f659d28 ffffffff81726b6a ffff88003732c500
  ffff88006f659d58 ffffffff810e4457 ffff88006b845a00 0000000000000000
  000000000000000c ffff880075aa2f50 ffff88006f659d90 ffffffff8169bec7
 Call Trace:
  [<ffffffff81726b6a>] dump_stack+0x54/0x74
  [<ffffffff810e4457>] lockdep_rcu_suspicious+0xe7/0x120
  [<ffffffff8169bec7>] cipso_v4_sock_setattr+0x187/0x1a0
  [<ffffffff8170f317>] netlbl_conn_setattr+0x187/0x190
  [<ffffffff8170f195>] ? netlbl_conn_setattr+0x5/0x190
  [<ffffffff8131ac9e>] selinux_netlbl_socket_connect+0xae/0xc0
  [<ffffffff81303025>] selinux_socket_connect+0x135/0x170
  [<ffffffff8119d127>] ? might_fault+0x57/0xb0
  [<ffffffff812fb146>] security_socket_connect+0x16/0x20
  [<ffffffff815d3ad3>] SYSC_connect+0x73/0x130
  [<ffffffff81739a85>] ? sysret_check+0x22/0x5d
  [<ffffffff810e5e2d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
  [<ffffffff81373d4e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff815d52be>] SyS_connect+0xe/0x10
  [<ffffffff81739a59>] system_call_fastpath+0x16/0x1b

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-04 10:57:04 -08:00
Mimi Zohar
8bcee197d6 Revert "ima: policy for RAMFS"
commit 08de59eb14 upstream.

This reverts commit 4c2c392763.

Everything in the initramfs should be measured and appraised,
but until the initramfs has extended attribute support, at
least measured.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 11:11:44 -08:00
Paul Moore
e4e8536f65 selinux: fix the labeled xfrm/IPsec reference count handling
The SELinux labeled IPsec code was improperly handling its reference
counting, dropping a reference on a delete operation instead of on a
free/release operation.

Reported-by: Ondrej Moris <omoris@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:30:07 -07:00
Kent Overstreet
a27bb332c0 aio: don't include aio.h in sched.h
Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 20:16:25 -07:00
Linus Torvalds
20b4fb4852 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS updates from Al Viro,

Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).

7kloc removed.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
  don't bother with deferred freeing of fdtables
  proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
  proc: Make the PROC_I() and PDE() macros internal to procfs
  proc: Supply a function to remove a proc entry by PDE
  take cgroup_open() and cpuset_open() to fs/proc/base.c
  ppc: Clean up scanlog
  ppc: Clean up rtas_flash driver somewhat
  hostap: proc: Use remove_proc_subtree()
  drm: proc: Use remove_proc_subtree()
  drm: proc: Use minor->index to label things, not PDE->name
  drm: Constify drm_proc_list[]
  zoran: Don't print proc_dir_entry data in debug
  reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
  proc: Supply an accessor for getting the data from a PDE's parent
  airo: Use remove_proc_subtree()
  rtl8192u: Don't need to save device proc dir PDE
  rtl8187se: Use a dir under /proc/net/r8180/
  proc: Add proc_mkdir_data()
  proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
  proc: Move PDE_NET() to fs/proc/proc_net.c
  ...
2013-05-01 17:51:54 -07:00
Linus Torvalds
73287a43cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights (1721 non-merge commits, this has to be a record of some
  sort):

   1) Add 'random' mode to team driver, from Jiri Pirko and Eric
      Dumazet.

   2) Make it so that any driver that supports configuration of multiple
      MAC addresses can provide the forwarding database add and del
      calls by providing a default implementation and hooking that up if
      the driver doesn't have an explicit set of handlers.  From Vlad
      Yasevich.

   3) Support GSO segmentation over tunnels and other encapsulating
      devices such as VXLAN, from Pravin B Shelar.

   4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

   5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
      Dukkipati.

   6) In the PHY layer, allow supporting wake-on-lan in situations where
      the PHY registers have to be written for it to be configured.

      Use it to support wake-on-lan in mv643xx_eth.

      From Michael Stapelberg.

   7) Significantly improve firewire IPV6 support, from YOSHIFUJI
      Hideaki.

   8) Allow multiple packets to be sent in a single transmission using
      network coding in batman-adv, from Martin Hundebøll.

   9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

  10) Generalize the VXLAN forwarding tables so that there is more
      flexibility in configurating various aspects of the endpoints.
      From David Stevens.

  11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
      from Dmitry Kravkov.

  12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
      Neira Ayuso.

  13) Start adding networking selftests.

  14) In situations of overload on the same AF_PACKET fanout socket, or
      per-cpu packet receive queue, minimize drop by distributing the
      load to other cpus/fanouts.  From Willem de Bruijn and Eric
      Dumazet.

  15) Add support for new payload offset BPF instruction, from Daniel
      Borkmann.

  16) Convert several drivers over to mdoule_platform_driver(), from
      Sachin Kamat.

  17) Provide a minimal BPF JIT image disassembler userspace tool, from
      Daniel Borkmann.

  18) Rewrite F-RTO implementation in TCP to match the final
      specification of it in RFC4138 and RFC5682.  From Yuchung Cheng.

  19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
      you like netlink, so I implemented netlink dumping of netlink
      sockets.") From Andrey Vagin.

  20) Remove ugly passing of rtnetlink attributes into rtnl_doit
      functions, from Thomas Graf.

  21) Allow userspace to be able to see if a configuration change occurs
      in the middle of an address or device list dump, from Nicolas
      Dichtel.

  22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
      Frederic Sowa.

  23) Increase accuracy of packet length used by packet scheduler, from
      Jason Wang.

  24) Beginning set of changes to make ipv4/ipv6 fragment handling more
      scalable and less susceptible to overload and locking contention,
      from Jesper Dangaard Brouer.

  25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
      instead.  From Hong Zhiguo.

  26) Optimize route usage in IPVS by avoiding reference counting where
      possible, from Julian Anastasov.

  27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

  28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
      Eitzenberger.

  29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
      nfnetlink_log, and nfnetlink_queue.  From Gao feng.

  30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

  31) Support several new r8169 chips, from Hayes Wang.

  32) Support tokenized interface identifiers in ipv6, from Daniel
      Borkmann.

  33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

  34) Add 802.1ad vlan offload support, from Patrick McHardy.

  35) Support mmap() based netlink communication, also from Patrick
      McHardy.

  36) Support HW timestamping in mlx4 driver, from Amir Vadai.

  37) Rationalize AF_PACKET packet timestamping when transmitting, from
      Willem de Bruijn and Daniel Borkmann.

  38) Bring parity to what's provided by /proc/net/packet socket dumping
      and the info provided by netlink socket dumping of AF_PACKET
      sockets.  From Nicolas Dichtel.

  39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
      Poirier"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
  filter: fix va_list build error
  af_unix: fix a fatal race with bit fields
  bnx2x: Prevent memory leak when cnic is absent
  bnx2x: correct reading of speed capabilities
  net: sctp: attribute printl with __printf for gcc fmt checks
  netlink: kconfig: move mmap i/o into netlink kconfig
  netpoll: convert mutex into a semaphore
  netlink: Fix skb ref counting.
  net_sched: act_ipt forward compat with xtables
  mlx4_en: fix a build error on 32bit arches
  Revert "bnx2x: allow nvram test to run when device is down"
  bridge: avoid OOPS if root port not found
  drivers: net: cpsw: fix kernel warn on cpsw irq enable
  sh_eth: use random MAC address if no valid one supplied
  3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
  tg3: fix to append hardware time stamping flags
  unix/stream: fix peeking with an offset larger than data in queue
  unix/dgram: fix peeking with an offset larger than data in queue
  unix/dgram: peek beyond 0-sized skbs
  openvswitch: Remove unneeded ovs_netdev_get_ifindex()
  ...
2013-05-01 14:08:52 -07:00
Linus Torvalds
5f56886521 Merge branch 'akpm' (incoming from Andrew)
Merge third batch of fixes from Andrew Morton:
 "Most of the rest.  I still have two large patchsets against AIO and
  IPC, but they're a bit stuck behind other trees and I'm about to
  vanish for six days.

   - random fixlets
   - inotify
   - more of the MM queue
   - show_stack() cleanups
   - DMI update
   - kthread/workqueue things
   - compat cleanups
   - epoll udpates
   - binfmt updates
   - nilfs2
   - hfs
   - hfsplus
   - ptrace
   - kmod
   - coredump
   - kexec
   - rbtree
   - pids
   - pidns
   - pps
   - semaphore tweaks
   - some w1 patches
   - relay updates
   - core Kconfig changes
   - sysrq tweaks"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits)
  Documentation/sysrq: fix inconstistent help message of sysrq key
  ethernet/emac/sysrq: fix inconstistent help message of sysrq key
  sparc/sysrq: fix inconstistent help message of sysrq key
  powerpc/xmon/sysrq: fix inconstistent help message of sysrq key
  ARM/etm/sysrq: fix inconstistent help message of sysrq key
  power/sysrq: fix inconstistent help message of sysrq key
  kgdb/sysrq: fix inconstistent help message of sysrq key
  lib/decompress.c: fix initconst
  notifier-error-inject: fix module names in Kconfig
  kernel/sys.c: make prctl(PR_SET_MM) generally available
  UAPI: remove empty Kbuild files
  menuconfig: print more info for symbol without prompts
  init/Kconfig: re-order CONFIG_EXPERT options to fix menuconfig display
  kconfig menu: move Virtualization drivers near other virtualization options
  Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
  relay: use macro PAGE_ALIGN instead of FIX_SIZE
  kernel/relay.c: move FIX_SIZE macro into relay.c
  kernel/relay.c: remove unused function argument actor
  drivers/w1/slaves/w1_ds2760.c: fix the error handling in w1_ds2760_add_slave()
  drivers/w1/slaves/w1_ds2781.c: fix the error handling in w1_ds2781_add_slave()
  ...
2013-04-30 17:37:43 -07:00
Lucas De Marchi
93997f6ddb KEYS: split call to call_usermodehelper_fns()
Use call_usermodehelper_setup() + call_usermodehelper_exec() instead of
calling call_usermodehelper_fns().  In case there's an OOM in this last
function the cleanup function may not be called - in this case we would
miss a call to key_put().

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:06 -07:00
Linus Torvalds
2e1deaad1e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem update from James Morris:
 "Just some minor updates across the subsystem"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  ima: eliminate passing d_name.name to process_measurement()
  TPM: Retry SaveState command in suspend path
  tpm/tpm_i2c_infineon: Add small comment about return value of __i2c_transfer
  tpm/tpm_i2c_infineon.c: Add OF attributes type and name to the of_device_id table entries
  tpm_i2c_stm_st33: Remove duplicate inclusion of header files
  tpm: Add support for new Infineon I2C TPM (SLB 9645 TT 1.2 I2C)
  char/tpm: Convert struct i2c_msg initialization to C99 format
  drivers/char/tpm/tpm_ppi: use strlcpy instead of strncpy
  tpm/tpm_i2c_stm_st33: formatting and white space changes
  Smack: include magic.h in smackfs.c
  selinux: make security_sb_clone_mnt_opts return an error on context mismatch
  seccomp: allow BPF_XOR based ALU instructions.
  Fix NULL pointer dereference in smack_inode_unlink() and smack_inode_rmdir()
  Smack: add support for modification of existing rules
  smack: SMACK_MAGIC to include/uapi/linux/magic.h
  Smack: add missing support for transmute bit in smack_str_from_perm()
  Smack: prevent revoke-subject from failing when unseen label is written to it
  tomoyo: use DEFINE_SRCU() to define tomoyo_ss
  tomoyo: use DEFINE_SRCU() to define tomoyo_ss
2013-04-30 16:27:51 -07:00
Linus Torvalds
191a712090 Merge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:

 - Fixes and a lot of cleanups.  Locking cleanup is finally complete.
   cgroup_mutex is no longer exposed to individual controlelrs which
   used to cause nasty deadlock issues.  Li fixed and cleaned up quite a
   bit including long standing ones like racy cgroup_path().

 - device cgroup now supports proper hierarchy thanks to Aristeu.

 - perf_event cgroup now supports proper hierarchy.

 - A new mount option "__DEVEL__sane_behavior" is added.  As indicated
   by the name, this option is to be used for development only at this
   point and generates a warning message when used.  Unfortunately,
   cgroup interface currently has too many brekages and inconsistencies
   to implement a consistent and unified hierarchy on top.  The new flag
   is used to collect the behavior changes which are necessary to
   implement consistent unified hierarchy.  It's likely that this flag
   won't be used verbatim when it becomes ready but will be enabled
   implicitly along with unified hierarchy.

   The option currently disables some of broken behaviors in cgroup core
   and also .use_hierarchy switch in memcg (will be routed through -mm),
   which can be used to make very unusual hierarchy where nesting is
   partially honored.  It will also be used to implement hierarchy
   support for blk-throttle which would be impossible otherwise without
   introducing a full separate set of control knobs.

   This is essentially versioning of interface which isn't very nice but
   at this point I can't see any other options which would allow keeping
   the interface the same while moving towards hierarchy behavior which
   is at least somewhat sane.  The planned unified hierarchy is likely
   to require some level of adaptation from userland anyway, so I think
   it'd be best to take the chance and update the interface such that
   it's supportable in the long term.

   Maintaining the existing interface does complicate cgroup core but
   shouldn't put too much strain on individual controllers and I think
   it'd be manageable for the foreseeable future.  Maybe we'll be able
   to drop it in a decade.

Fix up conflicts (including a semantic one adding a new #include to ppc
that was uncovered by header the file changes) as per Tejun.

* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (45 commits)
  cpuset: fix compile warning when CONFIG_SMP=n
  cpuset: fix cpu hotplug vs rebuild_sched_domains() race
  cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn()
  cgroup: restore the call to eventfd->poll()
  cgroup: fix use-after-free when umounting cgroupfs
  cgroup: fix broken file xattrs
  devcg: remove parent_cgroup.
  memcg: force use_hierarchy if sane_behavior
  cgroup: remove cgrp->top_cgroup
  cgroup: introduce sane_behavior mount option
  move cgroupfs_root to include/linux/cgroup.h
  cgroup: convert cgroupfs_root flag bits to masks and add CGRP_ prefix
  cgroup: make cgroup_path() not print double slashes
  Revert "cgroup: remove bind() method from cgroup_subsys."
  perf: make perf_event cgroup hierarchical
  cgroup: implement cgroup_is_descendant()
  cgroup: make sure parent won't be destroyed before its children
  cgroup: remove bind() method from cgroup_subsys.
  devcg: remove broken_hierarchy tag
  cgroup: remove cgroup_lock_is_held()
  ...
2013-04-29 19:14:20 -07:00
Al Viro
e53cfda5d2 tomoyo_close_control: don't bother with return value
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-29 15:41:42 -04:00
David S. Miller
6e0895c2ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be_main.c
	drivers/net/ethernet/intel/igb/igb_main.c
	drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
	include/net/scm.h
	net/batman-adv/routing.c
	net/ipv4/tcp_input.c

The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
cleanup in net-next to now pass cred structs around.

The be2net driver had a bug fix in 'net' that overlapped with the VLAN
interface changes by Patrick McHardy in net-next.

An IGB conflict existed because in 'net' the build_skb() support was
reverted, and in 'net-next' there was a comment style fix within that
code.

Several batman-adv conflicts were resolved by making sure that all
calls to batadv_is_my_mac() are changed to have a new bat_priv first
argument.

Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
rewrite in 'net-next', mostly overlapping changes.

Thanks to Stephen Rothwell and Antonio Quartulli for help with several
of these merge resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-22 20:32:51 -04:00
Rami Rosen
e57d5cf2f8 devcg: remove parent_cgroup.
In devcgroup_css_alloc(), there is no longer need for parent_cgroup.
bd2953ebbb("devcg: propagate local changes down the hierarchy") made
the variable parent_cgroup redundant. This patch removes parent_cgroup
from devcgroup_css_alloc().

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-04-18 11:34:35 -07:00
Mimi Zohar
df2c2afba4 ima: eliminate passing d_name.name to process_measurement()
Passing a pointer to the dentry name, as a parameter to
process_measurement(), causes a race condition with rename() and
is unnecessary, as the dentry name is already accessible via the
file parameter.

In the normal case, we use the full pathname as provided by
brpm->filename, bprm->interp, or ima_d_path().  Only on ima_d_path()
failure, do we fallback to using the d_name.name, which points
either to external memory or d_iname.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-04-17 17:20:57 -07:00
Eric Dumazet
ca10b9e9a8 selinux: add a skb_owned_by() hook
Commit 90ba9b1986 (tcp: tcp_make_synack() can use alloc_skb())
broke certain SELinux/NetLabel configurations by no longer correctly
assigning the sock to the outgoing SYNACK packet.

Cost of atomic operations on the LISTEN socket is quite big,
and we would like it to happen only if really needed.

This patch introduces a new security_ops->skb_owned_by() method,
that is a void operation unless selinux is active.

Reported-by: Miroslav Vadkerti <mvadkert@redhat.com>
Diagnosed-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-security-module@vger.kernel.org
Acked-by: James Morris <james.l.morris@oracle.com>
Tested-by: Paul Moore <pmoore@redhat.com>
Acked-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-09 13:23:11 -04:00
Tejun Heo
8adf12b0ff devcg: remove broken_hierarchy tag
bd2953ebbb ("devcg: propagate local changes down the hierarchy")
implemented proper hierarchy support.  Remove the broken tag.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
2013-04-08 08:31:59 -07:00
Casey Schaufler
958d2c2f4a Smack: include magic.h in smackfs.c
As reported for linux-next: Tree for Apr 2 (smack)
Add the required include for smackfs.c

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-04-03 13:13:51 +11:00
Jeff Layton
094f7b69ea selinux: make security_sb_clone_mnt_opts return an error on context mismatch
I had the following problem reported a while back. If you mount the
same filesystem twice using NFSv4 with different contexts, then the
second context= option is ignored. For instance:

    # mount server:/export /mnt/test1
    # mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
    # ls -dZ /mnt/test1
    drwxrwxrwt. root root system_u:object_r:nfs_t:s0       /mnt/test1
    # ls -dZ /mnt/test2
    drwxrwxrwt. root root system_u:object_r:nfs_t:s0       /mnt/test2

When we call into SELinux to set the context of a "cloned" superblock,
it will currently just bail out when it notices that we're reusing an
existing superblock. Since the existing superblock is already set up and
presumably in use, we can't go overwriting its context with the one from
the "original" sb. Because of this, the second context= option in this
case cannot take effect.

This patch fixes this by turning security_sb_clone_mnt_opts into an int
return operation. When it finds that the "new" superblock that it has
been handed is already set up, it checks to see whether the contexts on
the old superblock match it. If it does, then it will just return
success, otherwise it'll return -EBUSY and emit a printk to tell the
admin why the second mount failed.

Note that this patch may cause casualties. The NFSv4 code relies on
being able to walk down to an export from the pseudoroot. If you mount
filesystems that are nested within one another with different contexts,
then this patch will make those mounts fail in new and "exciting" ways.

For instance, suppose that /export is a separate filesystem on the
server:

    # mount server:/ /mnt/test1
    # mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
    mount.nfs: an incorrect mount option was specified

...with the printk in the ring buffer. Because we *might* eventually
walk down to /mnt/test1/export, the mount is denied due to this patch.
The second mount needs the pseudoroot superblock, but that's already
present with the wrong context.

OTOH, if we mount these in the reverse order, then both mounts work,
because the pseudoroot superblock created when mounting /export is
discarded once that mount is done. If we then however try to walk into
that directory, the automount fails for the similar reasons:

    # cd /mnt/test1/scratch/
    -bash: cd: /mnt/test1/scratch: Device or resource busy

The story I've gotten from the SELinux folks that I've talked to is that
this is desirable behavior. In SELinux-land, mounting the same data
under different contexts is wrong -- there can be only one.

Cc: Steve Dickson <steved@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-04-02 11:30:13 +11:00
David S. Miller
a210576cf8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/mac80211/sta_info.c
	net/wireless/core.h

Two minor conflicts in wireless.  Overlapping additions of extern
declarations in net/wireless/core.h and a bug fix overlapping with
the addition of a boolean parameter to __ieee80211_key_free().

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-01 13:36:50 -04:00
Linus Torvalds
2c3de1c2d7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns fixes from Eric W Biederman:
 "The bulk of the changes are fixing the worst consequences of the user
  namespace design oversight in not considering what happens when one
  namespace starts off as a clone of another namespace, as happens with
  the mount namespace.

  The rest of the changes are just plain bug fixes.

  Many thanks to Andy Lutomirski for pointing out many of these issues."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns: Restrict when proc and sysfs can be mounted
  ipc: Restrict mounting the mqueue filesystem
  vfs: Carefully propogate mounts across user namespaces
  vfs: Add a mount flag to lock read only bind mounts
  userns:  Don't allow creation if the user is chrooted
  yama:  Better permission check for ptraceme
  pid: Handle the exit of a multi-threaded init.
  scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.
2013-03-28 13:43:46 -07:00
Hong zhi guo
77954983ad selinux: replace obsolete NLMSG_* with type safe nlmsg_*
Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-28 14:25:49 -04:00
Eric W. Biederman
eddc0a3abf yama: Better permission check for ptraceme
Change the permission check for yama_ptrace_ptracee to the standard
ptrace permission check, testing if the traceer has CAP_SYS_PTRACE
in the tracees user namespace.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-26 13:17:58 -07:00
Aristeu Rozanski
bd2953ebbb devcg: propagate local changes down the hierarchy
This patch makes exception changes to propagate down in hierarchy respecting
when possible local exceptions.

New exceptions allowing additional access to devices won't be propagated, but
it'll be possible to add an exception to access all of part of the newly
allowed device(s).

New exceptions disallowing access to devices will be propagated down and the
local group's exceptions will be revalidated for the new situation.
Example:
      A
     / \
        B

    group        behavior          exceptions
    A            allow             "b 8:* rwm", "c 116:1 rw"
    B            deny              "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

If a new exception is added to group A:
	# echo "c 116:* r" > A/devices.deny
it'll propagate down and after revalidating B's local exceptions, the exception
"c 116:2 rwm" will be removed.

In case parent's exceptions change and local exceptions are not allowed anymore,
they'll be deleted.

v7:
- do not allow behavior change when the cgroup has children
- update documentation

v6: fixed issues pointed by Serge Hallyn
- only copy parent's exceptions while propagating behavior if the local
  behavior is different
- while propagating exceptions, do not clear and copy parent's: it'd be against
  the premise we don't propagate access to more devices

v5: fixed issues pointed by Serge Hallyn
- updated documentation
- not propagating when an exception is written to devices.allow
- when propagating a new behavior, clean the local exceptions list if they're
  for a different behavior

v4: fixed issues pointed by Tejun Heo
- separated function to walk the tree and collect valid propagation targets

v3: fixed issues pointed by Tejun Heo
- update documentation
- move css_online/css_offline changes to a new patch
- use cgroup_for_each_descendant_pre() instead of own descendant walk
- move exception_copy rework to a separared patch
- move exception_clean rework to a separated patch

v2: fixed issues pointed by Tejun Heo
- instead of keeping the local settings that won't apply anymore, remove them

Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:21 -07:00
Aristeu Rozanski
1909554c97 devcg: use css_online and css_offline
Allocate resources and change behavior only when online. This is needed in
order to determine if a node is suitable for hierarchy propagation or if it's
being removed.

Locking:
Both functions take devcgroup_mutex to make changes to device_cgroup structure.
Hierarchy propagation will also take devcgroup_mutex before walking the
tree while walking the tree itself is protected by rcu lock.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:17 -07:00
Aristeu Rozanski
c39a2a3018 devcg: prepare may_access() for hierarchy support
Currently may_access() is only able to verify if an exception is valid for the
current cgroup, which has the same behavior. With hierarchy, it'll be also used
to verify if a cgroup local exception is valid towards its cgroup parent, which
might have different behavior.

v2:
- updated patch description
- rebased on top of a new patch to expand the may_access() logic to make it
  more clear
- fixed argument description order in may_access()

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:13 -07:00
Aristeu Rozanski
26898fdff3 devcg: expand may_access() logic
In order to make the next patch more clear, expand may_access() logic.

v2: may_access() returns bool now

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-03-20 07:50:09 -07:00
Igor Zhbanov
cdb56b6088 Fix NULL pointer dereference in smack_inode_unlink() and smack_inode_rmdir()
This patch fixes kernel Oops because of wrong common_audit_data type
in smack_inode_unlink() and smack_inode_rmdir().

When SMACK security module is enabled and SMACK logging is on (/smack/logging
is not zero) and you try to delete the file which
1) you cannot delete due to SMACK rules and logging of failures is on
or
2) you can delete and logging of success is on,

you will see following:

	Unable to handle kernel NULL pointer dereference at virtual address 000002d7

	[<...>] (strlen+0x0/0x28)
	[<...>] (audit_log_untrustedstring+0x14/0x28)
	[<...>] (common_lsm_audit+0x108/0x6ac)
	[<...>] (smack_log+0xc4/0xe4)
	[<...>] (smk_curacc+0x80/0x10c)
	[<...>] (smack_inode_unlink+0x74/0x80)
	[<...>] (security_inode_unlink+0x2c/0x30)
	[<...>] (vfs_unlink+0x7c/0x100)
	[<...>] (do_unlinkat+0x144/0x16c)

The function smack_inode_unlink() (and smack_inode_rmdir()) need
to log two structures of different types. First of all it does:

	smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_DENTRY);
	smk_ad_setfield_u_fs_path_dentry(&ad, dentry);

This will set common audit data type to LSM_AUDIT_DATA_DENTRY
and store dentry for auditing (by function smk_curacc(), which in turn calls
dump_common_audit_data(), which is actually uses provided data and logs it).

	/*
	 * You need write access to the thing you're unlinking
	 */
	rc = smk_curacc(smk_of_inode(ip), MAY_WRITE, &ad);
	if (rc == 0) {
		/*
		 * You also need write access to the containing directory
		 */

Then this function wants to log anoter data:

		smk_ad_setfield_u_fs_path_dentry(&ad, NULL);
		smk_ad_setfield_u_fs_inode(&ad, dir);

The function sets inode field, but don't change common_audit_data type.

		rc = smk_curacc(smk_of_inode(dir), MAY_WRITE, &ad);
	}

So the dump_common_audit() function incorrectly interprets inode structure
as dentry, and Oops will happen.

This patch reinitializes common_audit_data structures with correct type.
Also I removed unneeded
	smk_ad_setfield_u_fs_path_dentry(&ad, NULL);
initialization, because both dentry and inode pointers are stored
in the same union.

Signed-off-by: Igor Zhbanov <i.zhbanov@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2013-03-19 14:38:17 -07:00
Rafal Krypa
e05b6f982a Smack: add support for modification of existing rules
Rule modifications are enabled via /smack/change-rule. Format is as follows:
"Subject Object rwaxt rwaxt"

First two strings are subject and object labels up to 255 characters.
Third string contains permissions to enable.
Fourth string contains permissions to disable.

All unmentioned permissions will be left unchanged.
If no rule previously existed, it will be created.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2013-03-19 14:16:42 -07:00
Jarkko Sakkinen
cee7e44334 smack: SMACK_MAGIC to include/uapi/linux/magic.h
SMACK_MAGIC moved to a proper place for easy user space access
(i.e. libsmack).

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@iki.fi>
2013-03-19 14:16:15 -07:00
Rafal Krypa
a87d79ad7c Smack: add missing support for transmute bit in smack_str_from_perm()
This fixes audit logs for granting or denial of permissions to show
information about transmute bit.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2013-03-19 14:15:41 -07:00
Rafal Krypa
d15d9fad16 Smack: prevent revoke-subject from failing when unseen label is written to it
Special file /smack/revoke-subject will silently accept labels that are not
present on the subject label list. Nothing has to be done for such labels,
as there are no rules for them to revoke.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
2013-03-19 14:15:21 -07:00
Dan Carpenter
4502403dcf selinux: use GFP_ATOMIC under spin_lock
The call tree here is:

sk_clone_lock()              <- takes bh_lock_sock(newsk);
xfrm_sk_clone_policy()
__xfrm_sk_clone_policy()
clone_policy()               <- uses GFP_ATOMIC for allocations
security_xfrm_policy_clone()
security_ops->xfrm_policy_clone_security()
selinux_xfrm_policy_clone()

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-19 00:33:09 +11:00
Lai Jiangshan
921f3ac4c3 tomoyo: use DEFINE_SRCU() to define tomoyo_ss
DEFINE_STATIC_SRCU() defines srcu struct and do init at build time.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-18 23:51:31 +11:00
Mathieu Desnoyers
8aec0f5d41 Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys
Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to
compat_process_vm_rw() shows that the compatibility code requires an
explicit "access_ok()" check before calling
compat_rw_copy_check_uvector(). The same difference seems to appear when
we compare fs/read_write.c:do_readv_writev() to
fs/compat.c:compat_do_readv_writev().

This subtle difference between the compat and non-compat requirements
should probably be debated, as it seems to be error-prone. In fact,
there are two others sites that use this function in the Linux kernel,
and they both seem to get it wrong:

Now shifting our attention to fs/aio.c, we see that aio_setup_iocb()
also ends up calling compat_rw_copy_check_uvector() through
aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to
be missing. Same situation for
security/keys/compat.c:compat_keyctl_instantiate_key_iov().

I propose that we add the access_ok() check directly into
compat_rw_copy_check_uvector(), so callers don't have to worry about it,
and it therefore makes the compat call code similar to its non-compat
counterpart. Place the access_ok() check in the same location where
copy_from_user() can trigger a -EFAULT error in the non-compat code, so
the ABI behaviors are alike on both compat and non-compat.

While we are here, fix compat_do_readv_writev() so it checks for
compat_rw_copy_check_uvector() negative return values.

And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error
handling.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-12 11:05:45 -07:00
David Howells
0da9dfdd2c keys: fix race with concurrent install_user_keyrings()
This fixes CVE-2013-1792.

There is a race in install_user_keyrings() that can cause a NULL pointer
dereference when called concurrently for the same user if the uid and
uid-session keyrings are not yet created.  It might be possible for an
unprivileged user to trigger this by calling keyctl() from userspace in
parallel immediately after logging in.

Assume that we have two threads both executing lookup_user_key(), both
looking for KEY_SPEC_USER_SESSION_KEYRING.

	THREAD A			THREAD B
	===============================	===============================
					==>call install_user_keyrings();
	if (!cred->user->session_keyring)
	==>call install_user_keyrings()
					...
					user->uid_keyring = uid_keyring;
	if (user->uid_keyring)
		return 0;
	<==
	key = cred->user->session_keyring [== NULL]
					user->session_keyring = session_keyring;
	atomic_inc(&key->usage); [oops]

At the point thread A dereferences cred->user->session_keyring, thread B
hasn't updated user->session_keyring yet, but thread A assumes it is
populated because install_user_keyrings() returned ok.

The race window is really small but can be exploited if, for example,
thread B is interrupted or preempted after initializing uid_keyring, but
before doing setting session_keyring.

This couldn't be reproduced on a stock kernel.  However, after placing
systemtap probe on 'user->session_keyring = session_keyring;' that
introduced some delay, the kernel could be crashed reliably.

Fix this by checking both pointers before deciding whether to return.
Alternatively, the test could be done away with entirely as it is checked
inside the mutex - but since the mutex is global, that may not be the best
way.

Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-12 16:44:31 +11:00
Eric W. Biederman
ba0e3427b0 userns: Stop oopsing in key_change_session_keyring
Dave Jones <davej@redhat.com> writes:
> Just hit this on Linus' current tree.
>
> [   89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
> [   89.623111] IP: [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0
> [   89.624901] Oops: 0000 [#1] PREEMPT SMP
> [   89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii
> [   89.637846] CPU 2
> [   89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H
> [   89.639850] RIP: 0010:[<ffffffff810784b0>]  [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.641161] RSP: 0018:ffff880115657eb8  EFLAGS: 00010207
> [   89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000
> [   89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600
> [   89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000
> [   89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600
> [   89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000
> [   89.647431] FS:  00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000
> [   89.648660] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0
> [   89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [   89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490)
> [   89.654128] Stack:
> [   89.654433]  0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78
> [   89.655769]  ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000
> [   89.657073]  ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58
> [   89.658399] Call Trace:
> [   89.658822]  [<ffffffff812c7d9b>] key_change_session_keyring+0xfb/0x140
> [   89.659845]  [<ffffffff8106c665>] task_work_run+0xa5/0xd0
> [   89.660698]  [<ffffffff81002911>] do_notify_resume+0x71/0xb0
> [   89.661581]  [<ffffffff816c9a4a>] int_signal+0x12/0x17
> [   89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b
> [   89.667778] RIP  [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.668733]  RSP <ffff880115657eb8>
> [   89.669301] CR2: 00000000000000c8
>
> My fastest trinity induced oops yet!
>
>
> Appears to be..
>
>                 if ((set_ns == subset_ns->parent)  &&
>      850:       48 8b 8a c8 00 00 00    mov    0xc8(%rdx),%rcx
>
> from the inlined cred_cap_issubset

By historical accident we have been reading trying to set new->user_ns
from new->user_ns.  Which is totally silly as new->user_ns is NULL (as
is every other field in new except session_keyring at that point).

The intent is clearly to copy all of the fields from old to new so copy
old->user_ns into  into new->user_ns.

Cc: stable@vger.kernel.org
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-03 19:35:38 -08:00
Linus Torvalds
56a79b7b02 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull  more VFS bits from Al Viro:
 "Unfortunately, it looks like xattr series will have to wait until the
  next cycle ;-/

  This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
  etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
  more file_inode() work"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify path_get/path_put and fs_struct.c stuff
  fix nommu breakage in shmem.c
  cache the value of file_inode() in struct file
  9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
  9p: make sure ->lookup() adds fid to the right dentry
  9p: untangle ->lookup() a bit
  9p: double iput() in ->lookup() if d_materialise_unique() fails
  9p: v9fs_fid_add() can't fail now
  v9fs: get rid of v9fs_dentry
  9p: turn fid->dlist into hlist
  9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
  more file_inode() open-coded instances
  selinux: opened file can't have NULL or negative ->f_path.dentry

(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)
2013-03-03 13:23:03 -08:00