31889 Commits

Author SHA1 Message Date
Richard Genoud
4791df938f UBIFS: correct mount message
commit beadadfa54 upstream.

When mounting an UBIFS R/W volume, we have the message:
UBIFS: mounted UBI device 0, volume 1, name "rootfs"(null)
With this patch, we'll have:
UBIFS: mounted UBI device 0, volume 1, name "rootfs"
Which is, I think, what was intended.

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:36 -07:00
Steve French
e885d8b8a8 Handle big endianness in NTLM (ntlmv2) authentication
commit fdf96a907c upstream.

This is RH bug 970891
Uppercasing of username during calculation of ntlmv2 hash fails
because UniStrupr function does not handle big endian wchars.

Also fix a comment in the same code to reflect its correct usage.

[To make it easier for stable (rather than require 2nd patch) fixed
this patch of Shirish's to remove endian warning generated
by sparse -- steve f.]

Reported-by: steve <sanpatr1@in.ibm.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:36 -07:00
Theodore Ts'o
7557d8d5a4 ext4: don't allow ext4_free_blocks() to fail due to ENOMEM
commit e7676a704e upstream.

The filesystem should not be marked inconsistent if ext4_free_blocks()
is not able to allocate memory.  Unfortunately some callers (most
notably ext4_truncate) don't have a way to reflect an error back up to
the VFS.  And even if we did, most userspace applications won't deal
with most system calls returning ENOMEM anyway.

Reported-by: Nagachandra P <nagachandra@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:33 -07:00
Theodore Ts'o
bbb1d9216c ext4: don't show usrquota/grpquota twice in /proc/mounts
commit ad065dd016 upstream.

We now print mount options in a generic fashion in
ext4_show_options(), so we shouldn't be explicitly printing the
{usr,grp}quota options in ext4_show_quota_options().

Without this patch, /proc/mounts can look like this:

 /dev/vdb /vdb ext4 rw,relatime,quota,usrquota,data=ordered,usrquota 0 0
                                      ^^^^^^^^              ^^^^^^^^

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:33 -07:00
Theodore Ts'o
d860925657 ext4: fix ext4_get_group_number()
commit 960fd856fd upstream.

The function ext4_get_group_number() was introduced as an optimization
in commit bd86298e60.  Unfortunately, this commit incorrectly
calculate the group number for file systems with a 1k block size (when
s_first_data_block is 1 instead of zero).  This could cause the
following kernel BUG:

[  568.877799] ------------[ cut here ]------------
[  568.877833] kernel BUG at fs/ext4/mballoc.c:3728!
[  568.877840] Oops: Exception in kernel mode, sig: 5 [#1]
[  568.877845] SMP NR_CPUS=32 NUMA pSeries
[  568.877852] Modules linked in: binfmt_misc
[  568.877861] CPU: 1 PID: 3516 Comm: fs_mark Not tainted 3.10.0-03216-g7c6809f-dirty #1
[  568.877867] task: c0000001fb0b8000 ti: c0000001fa954000 task.ti: c0000001fa954000
[  568.877873] NIP: c0000000002f42a4 LR: c0000000002f4274 CTR: c000000000317ef8
[  568.877879] REGS: c0000001fa956ed0 TRAP: 0700   Not tainted  (3.10.0-03216-g7c6809f-dirty)
[  568.877884] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 24000428  XER: 00000000
[  568.877902] SOFTE: 1
[  568.877905] CFAR: c0000000002b5464
[  568.877908]
GPR00: 0000000000000001 c0000001fa957150 c000000000c6a408 c0000001fb588000
GPR04: 0000000000003fff c0000001fa9571c0 c0000001fa9571c4 000138098c50625f
GPR08: 1301200000000000 0000000000000002 0000000000000001 0000000000000000
GPR12: 0000000024000422 c00000000f33a300 0000000000008000 c0000001fa9577f0
GPR16: c0000001fb7d0100 c000000000c29190 c0000000007f46e8 c000000000a14672
GPR20: 0000000000000001 0000000000000008 ffffffffffffffff 0000000000000000
GPR24: 0000000000000100 c0000001fa957278 c0000001fdb2bc78 c0000001fa957288
GPR28: 0000000000100100 c0000001fa957288 c0000001fb588000 c0000001fdb2bd10
[  568.877993] NIP [c0000000002f42a4] .ext4_mb_release_group_pa+0xec/0x1c0
[  568.877999] LR [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0
[  568.878004] Call Trace:
[  568.878008] [c0000001fa957150] [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 (unreliable)
[  568.878017] [c0000001fa957200] [c0000000002fb070] .ext4_mb_discard_lg_preallocations+0x394/0x444
[  568.878025] [c0000001fa957340] [c0000000002fb45c] .ext4_mb_release_context+0x33c/0x734
[  568.878032] [c0000001fa957440] [c0000000002fbcf8] .ext4_mb_new_blocks+0x4a4/0x5f4
[  568.878039] [c0000001fa957510] [c0000000002ef56c] .ext4_ext_map_blocks+0xc28/0x1178
[  568.878047] [c0000001fa957640] [c0000000002c1a94] .ext4_map_blocks+0x2c8/0x490
[  568.878054] [c0000001fa957730] [c0000000002c536c] .ext4_writepages+0x738/0xc60
[  568.878062] [c0000001fa957950] [c000000000168a78] .do_writepages+0x5c/0x80
[  568.878069] [c0000001fa9579d0] [c00000000015d1c4] .__filemap_fdatawrite_range+0x88/0xb0
[  568.878078] [c0000001fa957aa0] [c00000000015d23c] .filemap_write_and_wait_range+0x50/0xfc
[  568.878085] [c0000001fa957b30] [c0000000002b8edc] .ext4_sync_file+0x220/0x3c4
[  568.878092] [c0000001fa957be0] [c0000000001f849c] .vfs_fsync_range+0x64/0x80
[  568.878098] [c0000001fa957c70] [c0000000001f84f0] .vfs_fsync+0x38/0x4c
[  568.878105] [c0000001fa957d00] [c0000000001f87f4] .do_fsync+0x54/0x90
[  568.878111] [c0000001fa957db0] [c0000000001f8894] .SyS_fsync+0x28/0x3c
[  568.878120] [c0000001fa957e30] [c000000000009c88] syscall_exit+0x0/0x7c
[  568.878125] Instruction dump:
[  568.878130] 60000000 813d0034 81610070 38000000 7f8b4800 419e001c 813f007c 7d2bfe70
[  568.878144] 7d604a78 7c005850 54000ffe 7c0007b4 <0b000000> e8a10076 e87f0090 7fa4eb78
[  568.878160] ---[ end trace 594d911d9654770b ]---

In addition fix the STD_GROUP optimization so that it works for
bigalloc file systems as well.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Li Zhong <lizhongfs@gmail.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:33 -07:00
Jan Kara
bb39c83ce5 ext4: fix overflow when counting used blocks on 32-bit architectures
commit 8af8eecc13 upstream.

The arithmetics adding delalloc blocks to the number of used blocks in
ext4_getattr() can easily overflow on 32-bit archs as we first multiply
number of blocks by blocksize and then divide back by 512. Make the
arithmetics more clever and also use proper type (unsigned long long
instead of unsigned long).

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:32 -07:00
Jan Kara
b9ea84239c ext4: fix data offset overflow in ext4_xattr_fiemap() on 32-bit archs
commit a60697f411 upstream.

On 32-bit architectures with 32-bit sector_t computation of data offset
in ext4_xattr_fiemap() can overflow resulting in reporting bogus data
location. Fix the problem by typing block number to proper type before
shifting.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:32 -07:00
Jan Kara
80747f06d9 ext4: fix overflows in SEEK_HOLE, SEEK_DATA implementations
commit e7293fd146 upstream.

ext4_lblk_t is just u32 so multiplying it by blocksize can easily
overflow for files larger than 4 GB. Fix that by properly typing the
block offsets before shifting.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:32 -07:00
Jan Kara
e47438e6af ext4: fix data offset overflow on 32-bit archs in ext4_inline_data_fiemap()
commit eaf3793728 upstream.

On 32-bit archs when sector_t is defined as 32-bit the logic computing
data offset in ext4_inline_data_fiemap(). Fix that by properly typing
the shifted value.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:32 -07:00
Josef Bacik
7a44bf654d Btrfs: only do the tree_mod_log_free_eb if this is our last ref
commit 7fb7d76f96 upstream.

There is another bug in the tree mod log stuff in that we're calling
tree_mod_log_free_eb every single time a block is cow'ed.  The problem with this
is that if this block is shared by multiple snapshots we will call this multiple
times per block, so if we go to rewind the mod log for this block we'll BUG_ON()
in __tree_mod_log_rewind because we try to rewind a free twice.  We only want to
call tree_mod_log_free_eb if we are actually freeing the block.  With this patch
I no longer hit the panic in __tree_mod_log_rewind.  Thanks,

Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:32 -07:00
Josef Bacik
c701343cd0 Btrfs: hold the tree mod lock in __tree_mod_log_rewind
commit f1ca7e98a6 upstream.

We need to hold the tree mod log lock in __tree_mod_log_rewind since we walk
forward in the tree mod entries, otherwise we'll end up with random entries and
trip the BUG_ON() at the front of __tree_mod_log_rewind.  This fixes the panics
people were seeing when running

find /whatever -type f -exec btrfs fi defrag {} \;

Thansk,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:31 -07:00
Josef Bacik
8049d11b3a Btrfs: fix estale with btrfs send
commit 139f807a1e upstream.

This fixes bugzilla 57491.  If we take a snapshot of a fs with a unlink ongoing
and then try to send that root we will run into problems.  When comparing with a
parent root we will search the parents and the send roots commit_root, which if
we've just created the snapshot will include the file that needs to be evicted
by the orphan cleanup.  So when we find a changed extent we will try and copy
that info into the send stream, but when we lookup the inode we use the normal
root, which no longer has the inode because the orphan cleanup deleted it.  The
best solution I have for this is to check our otransid with the generation of
the commit root and if they match just commit the transaction again, that way we
get the changes from the orphan cleanup.  With this patch the reproducer I made
for this bugzilla no longer returns ESTALE when trying to do the send.  Thanks,

Reported-by: Chris Wilson <jakdaw@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:31 -07:00
Junxiao Bi
1926bf8ae4 ocfs2: xattr: fix inlined xattr reflink
commit ef962df057 upstream.

Inlined xattr shared free space of inode block with inlined data or data
extent record, so the size of the later two should be adjusted when
inlined xattr is enabled.  See ocfs2_xattr_ibody_init().  But this isn't
done well when reflink.  For inode with inlined data, its max inlined
data size is adjusted in ocfs2_duplicate_inline_data(), no problem.  But
for inode with data extent record, its record count isn't adjusted.  Fix
it, or data extent record and inlined xattr may overwrite each other,
then cause data corruption or xattr failure.

One panic caused by this bug in our test environment is the following:

  kernel BUG at fs/ocfs2/xattr.c:1435!
  invalid opcode: 0000 [#1] SMP
  Pid: 10871, comm: multi_reflink_t Not tainted 2.6.39-300.17.1.el5uek #1
  RIP: ocfs2_xa_offset_pointer+0x17/0x20 [ocfs2]
  RSP: e02b:ffff88007a587948  EFLAGS: 00010283
  RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00000000000051e4
  RDX: ffff880057092060 RSI: 0000000000000f80 RDI: ffff88007a587a68
  RBP: ffff88007a587948 R08: 00000000000062f4 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000010
  R13: ffff88007a587a68 R14: 0000000000000001 R15: ffff88007a587c68
  FS:  00007fccff7f06e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
  CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00000000015cf000 CR3: 000000007aa76000 CR4: 0000000000000660
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process multi_reflink_t
  Call Trace:
    ocfs2_xa_reuse_entry+0x60/0x280 [ocfs2]
    ocfs2_xa_prepare_entry+0x17e/0x2a0 [ocfs2]
    ocfs2_xa_set+0xcc/0x250 [ocfs2]
    ocfs2_xattr_ibody_set+0x98/0x230 [ocfs2]
    __ocfs2_xattr_set_handle+0x4f/0x700 [ocfs2]
    ocfs2_xattr_set+0x6c6/0x890 [ocfs2]
    ocfs2_xattr_user_set+0x46/0x50 [ocfs2]
    generic_setxattr+0x70/0x90
    __vfs_setxattr_noperm+0x80/0x1a0
    vfs_setxattr+0xa9/0xb0
    setxattr+0xc3/0x120
    sys_fsetxattr+0xa8/0xd0
    system_call_fastpath+0x16/0x1b

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:29 -07:00
Theodore Ts'o
0529b225e1 ext4: check error return from ext4_write_inline_data_end()
commit 42c832debb upstream.

The function ext4_write_inline_data_end() can return an error.  So we
need to assign it to a signed integer variable to check for an error
return (since copied is an unsigned int).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:23 -07:00
Al Viro
5196bcf984 ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
commit 64cb927371 upstream.

Both ext3 and ext4 htree_dirblock_to_tree() is just filling the
in-core rbtree for use by call_filldir().  All updates of ->f_pos are
done by the latter; bumping it here (on error) is obviously wrong - we
might very well have it nowhere near the block we'd found an error in.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:23 -07:00
Maarten ter Huurne
b6a81140ef ext4: fix corruption when online resizing a fs with 1K block size
commit 6ca792edc1 upstream.

Subtracting the number of the first data block places the superblock
backups one block too early, corrupting the file system. When the block
size is larger than 1K, the first data block is 0, so the subtraction
has no effect and no corruption occurs.

Signed-off-by: Maarten ter Huurne <maarten@treewalker.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:23 -07:00
Theodore Ts'o
15f26a4c48 jbd2: fix theoretical race in jbd2__journal_restart
commit 39c04153fd upstream.

Once we decrement transaction->t_updates, if this is the last handle
holding the transaction from closing, and once we release the
t_handle_lock spinlock, it's possible for the transaction to commit
and be released.  In practice with normal kernels, this probably won't
happen, since the commit happens in a separate kernel thread and it's
unlikely this could all happen within the space of a few CPU cycles.

On the other hand, with a real-time kernel, this could potentially
happen, so save the tid found in transaction->t_tid before we release
t_handle_lock.  It would require an insane configuration, such as one
where the jbd2 thread was set to a very high real-time priority,
perhaps because a high priority real-time thread is trying to read or
write to a file system.  But some people who use real-time kernels
have been known to do insane things, including controlling
laser-wielding industrial robots.  :-)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:22 -07:00
Theodore Ts'o
5f81e31388 jbd2: move superblock checksum calculation to jbd2_write_superblock()
commit fe52d17cdd upstream.

Some of the functions which modify the jbd2 superblock were not
updating the checksum before calling jbd2_write_superblock().  Move
the call to jbd2_superblock_csum_set() to jbd2_write_superblock(), so
that the checksum is calculated consistently.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:22 -07:00
Pavel Shilovsky
059a1671f2 CIFS: Fix a deadlock when a file is reopened
commit 689c3db4d5 upstream.

If we request reading or writing on a file that needs to be
reopened, it causes the deadlock: we are already holding rw
semaphore for reading and then we try to acquire it for writing
in cifs_relock_file. Fix this by acquiring the semaphore for
reading in cifs_relock_file due to we don't make any changes in
locks and don't need a write access.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:21 -07:00
Steve French
cfe24e4e36 CIFS use sensible file nlink values if unprovided
commit 6658b9f70e upstream.

Certain servers may not set the NumberOfLinks field in query file/path
info responses. In such a case, cifs_inode_needs_reval() assumes that
all regular files are hardlinks and triggers revalidation, leading to
excessive and unnecessary network traffic.

This change hardcodes cf_nlink (and subsequently i_nlink) when not
returned by the server, similar to what already occurs in cifs_mkdir().

Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21 18:21:21 -07:00
J. Bruce Fields
c334daca73 nfsd4: fix decoding of compounds across page boundaries
commit 247500820e upstream.

A freebsd NFSv4.0 client was getting rare IO errors expanding a tarball.
A network trace showed the server returning BAD_XDR on the final getattr
of a getattr+write+getattr compound.  The final getattr started on a
page boundary.

I believe the Linux client ignores errors on the post-write getattr, and
that that's why we haven't seen this before.

Reported-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-13 11:42:27 -07:00
Andy Adamson
9c42c27677 NFSv4.1 end back channel session draining
commit 62f288a02f upstream.

We need to ensure that we clear NFS4_SLOT_TBL_DRAINING on the back
channel when we're done recovering the session.

Regression introduced by commit 774d5f14e (NFSv4.1 Fix a pNFS session
draining deadlock)

Signed-off-by: Andy Adamson <andros@netapp.com>
[Trond: Changed order to start back-channel first. Minor code cleanup]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-13 11:42:27 -07:00
Mikulas Patocka
70621db094 hpfs: better test for errors
commit 3ebacb0504 upstream.

The test if bitmap access is out of bound could errorneously pass if the
device size is divisible by 16384 sectors and we are asking for one bitmap
after the end.

Check for invalid size in the superblock. Invalid size could cause integer
overflows in the rest of the code.

Signed-off-by: Mikulas Patocka <mpatocka@artax.karlin.mff.cuni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-13 11:42:26 -07:00
majianpeng
bd4d4d8725 ceph: fix sleeping function called from invalid context.
commit a1dc193733 upstream.

[ 1121.231883] BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[ 1121.231935] in_atomic(): 1, irqs_disabled(): 0, pid: 9831, name: mv
[ 1121.231971] 1 lock held by mv/9831:
[ 1121.231973]  #0:  (&(&ci->i_ceph_lock)->rlock){+.+...},at:[<ffffffffa02bbd38>] ceph_getxattr+0x58/0x1d0 [ceph]
[ 1121.231998] CPU: 3 PID: 9831 Comm: mv Not tainted 3.10.0-rc6+ #215
[ 1121.232000] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./To be filled by O.E.M., BIOS 080015  11/09/2011
[ 1121.232027]  ffff88006d355a80 ffff880092f69ce0 ffffffff8168348c ffff880092f69cf8
[ 1121.232045]  ffffffff81070435 ffff88006d355a20 ffff880092f69d20 ffffffff816899ba
[ 1121.232052]  0000000300000004 ffff8800b76911d0 ffff88006d355a20 ffff880092f69d68
[ 1121.232056] Call Trace:
[ 1121.232062]  [<ffffffff8168348c>] dump_stack+0x19/0x1b
[ 1121.232067]  [<ffffffff81070435>] __might_sleep+0xe5/0x110
[ 1121.232071]  [<ffffffff816899ba>] down_read+0x2a/0x98
[ 1121.232080]  [<ffffffffa02baf70>] ceph_vxattrcb_layout+0x60/0xf0 [ceph]
[ 1121.232088]  [<ffffffffa02bbd7f>] ceph_getxattr+0x9f/0x1d0 [ceph]
[ 1121.232093]  [<ffffffff81188d28>] vfs_getxattr+0xa8/0xd0
[ 1121.232097]  [<ffffffff8118900b>] getxattr+0xab/0x1c0
[ 1121.232100]  [<ffffffff811704f2>] ? final_putname+0x22/0x50
[ 1121.232104]  [<ffffffff81155f80>] ? kmem_cache_free+0xb0/0x260
[ 1121.232107]  [<ffffffff811704f2>] ? final_putname+0x22/0x50
[ 1121.232110]  [<ffffffff8109e63d>] ? trace_hardirqs_on+0xd/0x10
[ 1121.232114]  [<ffffffff816957a7>] ? sysret_check+0x1b/0x56
[ 1121.232120]  [<ffffffff81189c9c>] SyS_fgetxattr+0x6c/0xc0
[ 1121.232125]  [<ffffffff81695782>] system_call_fastpath+0x16/0x1b
[ 1121.232129] BUG: scheduling while atomic: mv/9831/0x10000002
[ 1121.232154] 1 lock held by mv/9831:
[ 1121.232156]  #0:  (&(&ci->i_ceph_lock)->rlock){+.+...}, at:
[<ffffffffa02bbd38>] ceph_getxattr+0x58/0x1d0 [ceph]

I think move the ci->i_ceph_lock down is safe because we can't free
ceph_inode_info at there.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-13 11:42:26 -07:00
Linus Torvalds
63edbce160 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ubifs fixes from Al Viro:
 "A couple of ubifs readdir/lseek race fixes.  Stable fodder, really
  nasty..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  UBIFS: fix a horrid bug
  UBIFS: prepare to fix a horrid bug
2013-06-29 10:30:31 -07:00
Linus Torvalds
82d0b80ad6 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
 "One more fix for a recently discovered bug"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Disable monitoring on setuid processes for regular users
2013-06-29 10:26:50 -07:00
Artem Bityutskiy
605c912bb8 UBIFS: fix a horrid bug
Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no
mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are
in the middle of 'ubifs_readdir()'.

This means that 'file->private_data' can be freed while 'ubifs_readdir()' uses
it, and this is a very bad bug: not only 'ubifs_readdir()' can return garbage,
but this may corrupt memory and lead to all kinds of problems like crashes an
security holes.

This patch fixes the problem by using the 'file->f_version' field, which
'->llseek()' always unconditionally sets to zero. We set it to 1 in
'ubifs_readdir()' and whenever we detect that it became 0, we know there was a
seek and it is time to clear the state saved in 'file->private_data'.

I tested this patch by writing a user-space program which runds readdir and
seek in parallell. I could easily crash the kernel without these patches, but
could not crash it with these patches.

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:45:37 +04:00
Artem Bityutskiy
33f1a63ae8 UBIFS: prepare to fix a horrid bug
Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no
mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are
in the middle of 'ubifs_readdir()'.

First of all, this means that 'file->private_data' can be freed while
'ubifs_readdir()' uses it.  But this particular patch does not fix the problem.
This patch is only a preparation, and the fix will follow next.

In this patch we make 'ubifs_readdir()' stop using 'file->f_pos' directly,
because 'file->f_pos' can be changed by '->llseek()' at any point. This may
lead 'ubifs_readdir()' to returning inconsistent data: directory entry names
may correspond to incorrect file positions.

So here we introduce a local variable 'pos', read 'file->f_pose' once at very
the beginning, and then stick to 'pos'. The result of this is that when
'ubifs_dir_llseek()' changes 'file->f_pos' while we are in the middle of
'ubifs_readdir()', the latter "wins".

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:45:37 +04:00
Stephane Eranian
2976b10f05 perf: Disable monitoring on setuid processes for regular users
There was a a bug in setup_new_exec(), whereby
the test to disabled perf monitoring was not
correct because the new credentials for the
process were not yet committed and therefore
the get_dumpable() test was never firing.

The patch fixes the problem by moving the
perf_event test until after the credentials
are committed.

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26 11:40:18 +02:00
Linus Torvalds
5dbc746960 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse bugfix from Miklos Szeredi:
 "This fixes a race between fallocate() and truncate()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: hold i_mutex in fuse_file_fallocate()
2013-06-25 09:06:04 -10:00
Randy Dunlap
acdb37c361 fs: fix new splice.c kernel-doc warning
Fix new kernel-doc warning in fs/splice.c:

  Warning(fs/splice.c:1298): No description found for parameter 'opos'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-23 16:19:56 -10:00
Al Viro
7995bd2871 splice: don't pass the address of ->f_pos to methods
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-20 19:02:45 +04:00
Maxim Patlasov
14c14414d1 fuse: hold i_mutex in fuse_file_fallocate()
Changing size of a file on server and local update (fuse_write_update_size)
should be always protected by inode->i_mutex. Otherwise a race like this is
possible:

1. Process 'A' calls fallocate(2) to extend file (~FALLOC_FL_KEEP_SIZE).
fuse_file_fallocate() sends FUSE_FALLOCATE request to the server.
2. Process 'B' calls ftruncate(2) shrinking the file. fuse_do_setattr()
sends shrinking FUSE_SETATTR request to the server and updates local i_size
by i_size_write(inode, outarg.attr.size).
3. Process 'A' resumes execution of fuse_file_fallocate() and calls
fuse_write_update_size(inode, offset + length). But 'offset + length' was
obsoleted by ftruncate from previous step.

Changed in v2 (thanks Brian and Anand for suggestions):
 - made relation between mutex_lock() and fuse_set_nowrite(inode) more
   explicit and clear.
 - updated patch description to use ftruncate(2) in example

Signed-off-by: Maxim V. Patlasov <MPatlasov@parallels.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-06-18 01:39:03 +02:00
Linus Torvalds
d0ff934881 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS fixes from Al Viro:
 "Several fixes + obvious cleanup (you've missed a couple of open-coded
  can_lookup() back then)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  snd_pcm_link(): fix a leak...
  use can_lookup() instead of direct checks of ->i_op->lookup
  move exit_task_namespaces() outside of exit_notify()
  fput: task_work_add() can fail if the caller has passed exit_task_work()
  ncpfs: fix rmdir returns Device or resource busy
2013-06-14 19:18:56 -10:00
Linus Torvalds
d58c6ff0b7 Merge tag 'for-linus-v3.10-rc6' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Ben Myers:
 - Remove noisy warnings about experimental support which spams the logs
 - Add padding to align directory and attr structures correctly
 - Set block number on child buffer on a root btree split
 - Disable verifiers during log recovery for non-CRC filesystems

* tag 'for-linus-v3.10-rc6' of git://oss.sgi.com/xfs/xfs:
  xfs: don't shutdown log recovery on validation errors
  xfs: ensure btree root split sets blkno correctly
  xfs: fix implicit padding in directory and attr CRC formats
  xfs: don't emit v5 superblock warnings on write
2013-06-14 19:16:31 -10:00
Al Viro
0525290119 use can_lookup() instead of direct checks of ->i_op->lookup
a couple of places got missed back when Linus has introduced that one...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-15 05:41:45 +04:00
Oleg Nesterov
e7b2c40692 fput: task_work_add() can fail if the caller has passed exit_task_work()
fput() assumes that it can't be called after exit_task_work() but
this is not true, for example free_ipc_ns()->shm_destroy() can do
this. In this case fput() silently leaks the file.

Change it to fallback to delayed_fput_work if task_work_add() fails.
The patch looks complicated but it is not, it changes the code from

	if (PF_KTHREAD) {
		schedule_work(...);
		return;
	}
	task_work_add(...)

to
	if (!PF_KTHREAD) {
		if (!task_work_add(...))
			return;
		/* fallback */
	}
	schedule_work(...);

As for shm_destroy() in particular, we could make another fix but I
think this change makes sense anyway. There could be another similar
user, it is not safe to assume that task_work_add() can't fail.

Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-15 05:39:08 +04:00
Dave Chinner
d302cf1d31 xfs: don't shutdown log recovery on validation errors
Unfortunately, we cannot guarantee that items logged multiple times
and replayed by log recovery do not take objects back in time. When
they are taken back in time, the go into an intermediate state which
is corrupt, and hence verification that occurs on this intermediate
state causes log recovery to abort with a corruption shutdown.

Instead of causing a shutdown and unmountable filesystem, don't
verify post-recovery items before they are written to disk. This is
less than optimal, but there is no way to detect this issue for
non-CRC filesystems If log recovery successfully completes, this
will be undone and the object will be consistent by subsequent
transactions that are replayed, so in most cases we don't need to
take drastic action.

For CRC enabled filesystems, leave the verifiers in place - we need
to call them to recalculate the CRCs on the objects anyway. This
recovery problem can be solved for such filesystems - we have a LSN
stamped in all metadata at writeback time that we can to determine
whether the item should be replayed or not. This is a separate piece
of work, so is not addressed by this patch.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 9222a9cf86)
2013-06-14 15:59:45 -05:00
Dave Chinner
088c9f67c3 xfs: ensure btree root split sets blkno correctly
For CRC enabled filesystems, the BMBT is rooted in an inode, so it
passes through a different code path on root splits than the
freespace and inode btrees. This is much less traversed by xfstests
than the other trees. When testing on a 1k block size filesystem,
I've been seeing ASSERT failures in generic/234 like:

XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_private.b.allocated == 0, file: fs/xfs/xfs_btree.c, line: 317

which are generally preceded by a lblock check failure. I noticed
this in the bmbt stats:

$ pminfo -f xfs.btree.block_map

xfs.btree.block_map.lookup
    value 39135

xfs.btree.block_map.compare
    value 268432

xfs.btree.block_map.insrec
    value 15786

xfs.btree.block_map.delrec
    value 13884

xfs.btree.block_map.newroot
    value 2

xfs.btree.block_map.killroot
    value 0
.....

Very little coverage of root splits and merges. Indeed, on a 4k
filesystem, block_map.newroot and block_map.killroot are both zero.
i.e. the code is not exercised at all, and it's the only generic
btree infrastructure operation that is not exercised by a default run
of xfstests.

Turns out that on a 1k filesystem, generic/234 accounts for one of
those two root splits, and that is somewhat of a smoking gun. In
fact, it's the same problem we saw in the directory/attr code where
headers are memcpy()d from one block to another without updating the
self describing metadata.

Simple fix - when copying the header out of the root block, make
sure the block number is updated correctly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit ade1335afe)
2013-06-14 15:59:31 -05:00
Dave Chinner
5170711df7 xfs: fix implicit padding in directory and attr CRC formats
Michael L. Semon has been testing CRC patches on a 32 bit system and
been seeing assert failures in the directory code from xfs/080.
Thanks to Michael's heroic efforts with printk debugging, we found
that the problem was that the last free space being left in the
directory structure was too small to fit a unused tag structure and
it was being corrupted and attempting to log a region out of bounds.
Hence the assert failure looked something like:

.....
#5 calling xfs_dir2_data_log_unused() 36 32
#1 4092 4095 4096
#2 8182 8183 4096
XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568

Where #1 showed the first region of the dup being logged (i.e. the
last 4 bytes of a directory buffer) and #2 shows the corrupt values
being calculated from the length of the dup entry which overflowed
the size of the buffer.

It turns out that the problem was not in the logging code, nor in
the freespace handling code. It is an initial condition bug that
only shows up on 32 bit systems. When a new buffer is initialised,
where's the freespace that is set up:

[  172.316249] calling xfs_dir2_leaf_addname() from xfs_dir_createname()
[  172.316346] #9 calling xfs_dir2_data_log_unused()
[  172.316351] #1 calling xfs_trans_log_buf() 60 63 4096
[  172.316353] #2 calling xfs_trans_log_buf() 4094 4095 4096

Note the offset of the first region being logged? It's 60 bytes into
the buffer. Once I saw that, I pretty much knew that the bug was
going to be caused by this.

Essentially, all direct entries are rounded to 8 bytes in length,
and all entries start with an 8 byte alignment. This means that we
can decode inplace as variables are naturally aligned. With the
directory data supposedly starting on a 8 byte boundary, and all
entries padded to 8 bytes, the minimum freespace in a directory
block is supposed to be 8 bytes, which is large enough to fit a
unused data entry structure (6 bytes in size). The fact we only have
4 bytes of free space indicates a directory data block alignment
problem.

And what do you know - there's an implicit hole in the directory
data block header for the CRC format, which means the header is 60
byte on 32 bit intel systems and 64 bytes on 64 bit systems. Needs
padding. And while looking at the structures, I found the same
problem in the attr leaf header. Fix them both.

Note that this only affects 32 bit systems with CRCs enabled.
Everything else is just fine. Note that CRC enabled filesystems created
before this fix on such systems will not be readable with this fix
applied.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Debugged-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 8a1fd2950e)
2013-06-14 15:59:16 -05:00
Dave Chinner
47ad2fcba9 xfs: don't emit v5 superblock warnings on write
We write the superblock every 30s or so which results in the
verifier being called. Right now that results in this output
every 30s:

XFS (vda): Version 5 superblock detected. This kernel has EXPERIMENTAL support enabled!
Use of these features in this kernel is at your own risk!

And spamming the logs.

We don't need to check for whether we support v5 superblocks or
whether there are feature bits we don't support set as these are
only relevant when we first mount the filesytem. i.e. on superblock
read. Hence for the write verification we can just skip all the
checks (and hence verbose output) altogether.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 34510185ab)
2013-06-14 15:58:47 -05:00
Linus Torvalds
a2648ebb7e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "This is an assortment of crash fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: stop all workers before cleaning up roots
  Btrfs: fix use-after-free bug during umount
  Btrfs: init relocate extent_io_tree with a mapping
  btrfs: Drop inode if inode root is NULL
  Btrfs: don't delete fs_roots until after we cleanup the transaction
2013-06-13 22:34:14 -07:00
Linus Torvalds
a568fa1c91 Merge branch 'akpm' (updates from Andrew Morton)
Merge misc fixes from Andrew Morton:
 "Bunch of fixes and one little addition to math64.h"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits)
  include/linux/math64.h: add div64_ul()
  mm: memcontrol: fix lockless reclaim hierarchy iterator
  frontswap: fix incorrect zeroing and allocation size for frontswap_map
  kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()
  mm: migration: add migrate_entry_wait_huge()
  ocfs2: add missing lockres put in dlm_mig_lockres_handler
  mm/page_alloc.c: fix watermark check in __zone_watermark_ok()
  drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info()
  aio: fix io_destroy() regression by using call_rcu()
  rtc-at91rm9200: use shadow IMR on at91sam9x5
  rtc-at91rm9200: add shadow interrupt mask
  rtc-at91rm9200: refactor interrupt-register handling
  rtc-at91rm9200: add configuration support
  rtc-at91rm9200: add match-table compile guard
  fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory
  swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion
  drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree
  cciss: fix broken mutex usage in ioctl
  audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE
  drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel
  ...
2013-06-12 16:29:53 -07:00
Xue jiufei
27749f2ff0 ocfs2: add missing lockres put in dlm_mig_lockres_handler
dlm_mig_lockres_handler() is missing a dlm_lockres_put() on an error path.

Signed-off-by: joyce <xuejiufei@huawei.com>
Reviewed-by: shencanquan <shencanquan@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-12 16:29:46 -07:00
Kent Overstreet
4fcc712f5c aio: fix io_destroy() regression by using call_rcu()
There was a regression introduced by 36f5588905 ("aio: refcounting
cleanup"), reported by Jens Axboe - the refcounting cleanup switched to
using RCU in the shutdown path, but the synchronize_rcu() was done in
the context of the io_destroy() syscall greatly increasing the time it
could block.

This patch switches it to call_rcu() and makes shutdown asynchronous
(more asynchronous than it was originally; before the refcount changes
io_destroy() would still wait on pending kiocbs).

Note that there's a global quota on the max outstanding kiocbs, and that
quota must be manipulated synchronously; otherwise io_setup() could
return -EAGAIN when there isn't quota available, and userspace won't
have any way of waiting until shutdown of the old kioctxs has finished
(besides busy looping).

So we release our quota before kioctx shutdown has finished, which
should be fine since the quota never corresponded to anything real
anyways.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Tested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-12 16:29:46 -07:00
Goldwyn Rodrigues
e099127169 fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory
While removing a non-empty directory, the kernel dumps a message:

  (rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39

Suppress the error message from being printed in the dmesg so users
don't panic.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-12 16:29:45 -07:00
Xiaowei.Hu
7869e59067 ocfs2: ocfs2_prep_new_orphaned_file() should return ret
If an error occurs, for example an EIO in __ocfs2_prepare_orphan_dir,
ocfs2_prep_new_orphaned_file will release the inode_ac, then when the
caller of ocfs2_prep_new_orphaned_file gets a 0 return, it will refer to
a NULL ocfs2_alloc_context struct in the following functions.  A kernel
panic happens.

Signed-off-by: "Xiaowei.Hu" <xiaowei.hu@oracle.com>
Reviewed-by: shencanquan <shencanquan@huawei.com>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-12 16:29:44 -07:00
Kees Cook
637241a900 kmsg: honor dmesg_restrict sysctl on /dev/kmsg
The dmesg_restrict sysctl currently covers the syslog method for access
dmesg, however /dev/kmsg isn't covered by the same protections.  Most
people haven't noticed because util-linux dmesg(1) defaults to using the
syslog method for access in older versions.  With util-linux dmesg(1)
defaults to reading directly from /dev/kmsg.

To fix /dev/kmsg, let's compare the existing interfaces and what they
allow:

 - /proc/kmsg allows:
  - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
    single-reader interface (SYSLOG_ACTION_READ).
  - everything, after an open.

 - syslog syscall allows:
  - anything, if CAP_SYSLOG.
  - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
    dmesg_restrict==0.
  - nothing else (EPERM).

The use-cases were:
 - dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
 - sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
   destructive SYSLOG_ACTION_READs.

AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't
clear the ring buffer.

Based on the comments in devkmsg_llseek, it sounds like actions besides
reading aren't going to be supported by /dev/kmsg (i.e.
SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
syslog syscall actions.

To this end, move the check as Josh had done, but also rename the
constants to reflect their new uses (SYSLOG_FROM_CALL becomes
SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
allows destructive actions after a capabilities-constrained
SYSLOG_ACTION_OPEN check.

 - /dev/kmsg allows:
  - open if CAP_SYSLOG or dmesg_restrict==0
  - reading/polling, after open

Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192

[akpm@linux-foundation.org: use pr_warn_once()]
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Josh Boyer <jwboyer@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-12 16:29:44 -07:00
Linus Torvalds
8d7a8fe2ce Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fixes from Sage Weil:
 "There is a pair of fixes for double-frees in the recent bundle for
  3.10, a couple of fixes for long-standing bugs (sleep while atomic and
  an endianness fix), and a locking fix that can be triggered when osds
  are going down"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: fix cleanup in rbd_add()
  rbd: don't destroy ceph_opts in rbd_add()
  ceph: ceph_pagelist_append might sleep while atomic
  ceph: add cpu_to_le32() calls when encoding a reconnect capability
  libceph: must hold mutex for reset_changed_osds()
2013-06-12 08:28:19 -07:00
Mikulas Patocka
bbd465df73 hpfs: fix warnings when the filesystem fills up
This patch fixes warnings due to missing lock on write error path.

  WARNING: at fs/hpfs/hpfs_fn.h:353 hpfs_truncate+0x75/0x80 [hpfs]()
  Hardware name: empty
  Pid: 26563, comm: dd Tainted: P           O 3.9.4 #12
  Call Trace:
    hpfs_truncate+0x75/0x80 [hpfs]
    hpfs_write_begin+0x84/0x90 [hpfs]
    _hpfs_bmap+0x10/0x10 [hpfs]
    generic_file_buffered_write+0x121/0x2c0
    __generic_file_aio_write+0x1c7/0x3f0
    generic_file_aio_write+0x7c/0x100
    do_sync_write+0x98/0xd0
    hpfs_file_write+0xd/0x50 [hpfs]
    vfs_write+0xa2/0x160
    sys_write+0x51/0xa0
    page_fault+0x22/0x30
    system_call_fastpath+0x1a/0x1f

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@kernel.org  # 2.6.39+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-08 17:39:40 -07:00