linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-11 13:27:06 +09:00

Author	SHA1	Message	Date
Konstantin Khlebnikov	faf067b39a	epoll: fix use-after-free in eventpoll_release_file commit `ebe06187bf` upstream. This fixes use-after-free of epi->fllink.next inside list loop macro. This loop actually releases elements in the body. The list is rcu-protected but here we cannot hold rcu_read_lock because we need to lock mutex inside. The obvious solution is to use list_for_each_entry_safe(). RCU-ness isn't essential because nobody can change this list under us, it's final fput for this file. The bug was introduced by `ae10b2b4eb` ("epoll: optimize EPOLL_CTL_DEL using rcu") Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Reported-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:04 -07:00
Eric Sandeen	28084dbaaf	btrfs: fix use of uninit "ret" in end_extent_writepage() commit `3e2426bd0e` upstream. If this condition in end_extent_writepage() is false: if (tree->ops && tree->ops->writepage_end_io_hook) we will then test an uninitialized "ret" at: ret = ret < 0 ? ret : -EIO; The test for ret is for the case where ->writepage_end_io_hook failed, and we'd choose that ret as the error; but if there is no ->writepage_end_io_hook, nothing sets ret. Initializing ret to 0 should be sufficient; if writepage_end_io_hook wasn't set, (!uptodate) means non-zero err was passed in, so we choose -EIO in that case. Signed-of-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Liu Bo	eed31172a3	Btrfs: fix scrub_print_warning to handle skinny metadata extents commit `6eda71d0c0` upstream. The skinny extents are intepreted incorrectly in scrub_print_warning(), and end up hitting the BUG() in btrfs_extent_inline_ref_size. Reported-by: Konstantinos Skarlatos <k.skarlatos@gmail.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Liu Bo	6d2a63f3d1	Btrfs: use right type to get real comparison commit `cd857dd6bc` upstream. We want to make sure the point is still within the extent item, not to verify the memory it's pointing to. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Josef Bacik	efb57277b9	Btrfs: don't check nodes for extent items commit `8a56457f5f` upstream. The backref code was looking at nodes as well as leaves when we tried to populate extent item entries. This is not good, and although we go away with it for the most part because we'd skip where disk_bytenr != random_memory, sometimes random_memory would match and suddenly boom. This fixes that problem. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Rickard Strandqvist	e517155065	fs: btrfs: volumes.c: Fix for possible null pointer dereference commit `8321cf2596` upstream. There is otherwise a risk of a possible null pointer dereference. Was largely found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Jeff Mahoney	80693a87ae	btrfs: allocate raid type kobjects dynamically commit `c1895442be` upstream. We are currently allocating space_info objects in an array when we allocate space_info. When a user does something like: # btrfs balance start -mconvert=raid1 -dconvert=raid1 /mnt # btrfs balance start -mconvert=single -dconvert=single /mnt -f # btrfs balance start -mconvert=raid1 -dconvert=raid1 / We can end up with memory corruption since the kobject hasn't been reinitialized properly and the name pointer was left set. The rationale behind allocating them statically was to avoid creating a separate kobject container that just contained the raid type. It used the index in the array to determine the index. Ultimately, though, this wastes more memory than it saves in all but the most complex scenarios and introduces kobject lifetime questions. This patch allocates the kobjects dynamically instead. Note that we also remove the kobject_get/put of the parent kobject since kobject_add and kobject_del do that internally. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reported-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Filipe Manana	69a109b2ce	Btrfs: send, use the right limits for xattr names and values commit `7e3ae33efa` upstream. We were limiting the sum of the xattr name and value lengths to PATH_MAX, which is not correct, specially on filesystems created with btrfs-progs v3.12 or higher, where the default leaf size is max(16384, PAGE_SIZE), or systems with page sizes larger than 4096 bytes. Xattrs have their own specific maximum name and value lengths, which depend on the leaf size, therefore use these limits to be able to send xattrs with sizes larger than PATH_MAX. A test case for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Filipe Manana	1a7fc2e61a	Btrfs: send, don't error in the presence of subvols/snapshots commit `1af56070e3` upstream. If we are doing an incremental send and the base snapshot has a directory with name X that doesn't exist anymore in the second snapshot and a new subvolume/snapshot exists in the second snapshot that has the same name as the directory (name X), the incremental send would fail with -ENOENT error. This is because it attempts to lookup for an inode with a number matching the objectid of a root, which doesn't exist. Steps to reproduce: mkfs.btrfs -f /dev/sdd mount /dev/sdd /mnt mkdir /mnt/testdir btrfs subvolume snapshot -r /mnt /mnt/mysnap1 rmdir /mnt/testdir btrfs subvolume create /mnt/testdir btrfs subvolume snapshot -r /mnt /mnt/mysnap2 btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/send.data A test case for xfstests follows. Reported-by: Robert White <rwhite@pobox.com> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Wang Shilong	1b524a36f1	Btrfs: set right total device count for seeding support commit `298658414a` upstream. Seeding device support allows us to create a new filesystem based on existed filesystem. However newly created filesystem's @total_devices should include seed devices. This patch fix the following problem: # mkfs.btrfs -f /dev/sdb # btrfstune -S 1 /dev/sdb # mount /dev/sdb /mnt # btrfs device add -f /dev/sdc /mnt --->fs_devices->total_devices = 1 # umount /mnt # mount /dev/sdc /mnt --->fs_devices->total_devices = 2 This is because we record right @total_devices in superblock, but @fs_devices->total_devices is reset to be 0 in btrfs_prepare_sprout(). Fix this problem by not resetting @fs_devices->total_devices. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:03 -07:00
Liu Bo	8d7293457e	Btrfs: mark mapping with error flag to report errors to userspace commit `5dca6eea91` upstream. According to commit `865ffef379` (fs: fix fsync() error reporting), it's not stable to just check error pages because pages can be truncated or invalidated, we should also mark mapping with error flag so that a later fsync can catch the error. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Liu Bo	c6b4616c41	Btrfs: fix NULL pointer crash of deleting a seed device commit `29cc83f69c` upstream. Same as normal devices, seed devices should be initialized with fs_info->dev_root as well, otherwise we'll get a NULL pointer crash. Cc: Chris Murphy <lists@colorremedies.com> Reported-by: Chris Murphy <lists@colorremedies.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Wang Shilong	add81375ec	Btrfs: make sure there are not any read requests before stopping workers commit `de348ee022` upstream. In close_ctree(), after we have stopped all workers,there maybe still some read requests(for example readahead) to submit and this maybe trigger an oops that user reported before: kernel BUG at fs/btrfs/async-thread.c:619! By hacking codes, i can reproduce this problem with one cpu available. We fix this potential problem by invalidating all btree inode pages before stopping all workers. Thanks to Miao for pointing out this problem. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Filipe Manana	7df26ca1d8	Btrfs: send, account for orphan directories when building path strings commit `c992ec94f2` upstream. If we have directories with a pending move/rename operation, we must take into account any orphan directories that got created before executing the pending move/rename. Those orphan directories are directories with an inode number higher then the current send progress and that don't exist in the parent snapshot, they are created before current progress reaches their inode number, with a generated name of the form oN-M-I and at the root of the filesystem tree, and later when progress matches their inode number, moved/renamed to their final location. Reproducer: $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/b/c/d $ mkdir /mnt/a/b/e $ mv /mnt/a/b/c /mnt/a/b/e/CC $ mkdir /mnt/a/b/e/CC/d/f $ mkdir /mnt/a/g $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mkdir /mnt/a/g/h $ mv /mnt/a/b/e /mnt/a/g/h/EE $ mv /mnt/a/g/h/EE/CC/d /mnt/a/g/h/EE/DD $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send The second receive command failed with the following error: ERROR: rename a/b/e/CC/d -> o264-7-0/EE/DD failed. No such file or directory A test case for xfstests follows soon. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Miao Xie	580b02787e	Btrfs: output warning instead of error when loading free space cache failed commit `32d6b47fe6` upstream. If we fail to load a free space cache, we can rebuild it from the extent tree, so it is not a serious error, we should not output a error message that would make the users uncomfortable. This patch uses warning message instead of it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Qu Wenruo	2e9e31430a	btrfs: Add ctime/mtime update for btrfs device add/remove. commit `5a1972bd9f` upstream. Btrfs will send uevent to udev inform the device change, but ctime/mtime for the block device inode is not udpated, which cause libblkid used by btrfs-progs unable to detect device change and use old cache, causing 'btrfs dev scan; btrfs dev rmove; btrfs dev scan' give an error message. Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Cc: Karel Zak <kzak@redhat.com> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Filipe Manana	2501c4a066	Btrfs: read inode size after acquiring the mutex when punching a hole commit `a1a50f60a6` upstream. In a previous change, commit `12870f1c9b`, I accidentally moved the roundup of inode->i_size to outside of the critical section delimited by the inode mutex, which is not atomic and not correct since the size can be changed by other task before we acquire the mutex. Therefore fix it. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Chris Mason	bddf0faccf	Btrfs: fix double free in find_lock_delalloc_range commit `7d78874273` upstream. We need to NULL the cached_state after freeing it, otherwise we might free it again if find_delalloc_range doesn't find anything. Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Filipe Manana	938bdab678	Btrfs: fix leaf corruption caused by ENOSPC while hole punching commit `fc19c5e736` upstream. While running a stress test with multiple threads writing to the same btrfs file system, I ended up with a situation where a leaf was corrupted in that it had 2 file extent item keys that had the same exact key. I was able to detect this quickly thanks to the following patch which triggers an assertion as soon as a leaf is marked dirty if there are duplicated keys or out of order keys: Btrfs: check if items are ordered when a leaf is marked dirty (https://patchwork.kernel.org/patch/3955431/) Basically while running the test, I got the following in dmesg: [28877.415877] WARNING: CPU: 2 PID: 10706 at fs/btrfs/file.c:553 btrfs_drop_extent_cache+0x435/0x440 [btrfs]() (...) [28877.415917] Call Trace: [28877.415922] [<ffffffff816f1189>] dump_stack+0x4e/0x68 [28877.415926] [<ffffffff8104a32c>] warn_slowpath_common+0x8c/0xc0 [28877.415929] [<ffffffff8104a37a>] warn_slowpath_null+0x1a/0x20 [28877.415944] [<ffffffffa03775a5>] btrfs_drop_extent_cache+0x435/0x440 [btrfs] [28877.415949] [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0 [28877.415962] [<ffffffffa03777d9>] fill_holes+0x229/0x3e0 [btrfs] [28877.415972] [<ffffffffa0345865>] ? block_rsv_add_bytes+0x55/0x80 [btrfs] [28877.415984] [<ffffffffa03792cb>] btrfs_fallocate+0xb6b/0xc20 [btrfs] (...) [29854.132560] BTRFS critical (device sdc): corrupt leaf, bad key order: block=955232256,root=1, slot=24 [29854.132565] BTRFS info (device sdc): leaf 955232256 total ptrs 40 free space 778 (...) [29854.132637] item 23 key (3486 108 667648) itemoff 2694 itemsize 53 [29854.132638] extent data disk bytenr 14574411776 nr 286720 [29854.132639] extent data offset 0 nr 286720 ram 286720 [29854.132640] item 24 key (3486 108 954368) itemoff 2641 itemsize 53 [29854.132641] extent data disk bytenr 0 nr 0 [29854.132643] extent data offset 0 nr 0 ram 0 [29854.132644] item 25 key (3486 108 954368) itemoff 2588 itemsize 53 [29854.132645] extent data disk bytenr 8699670528 nr 77824 [29854.132646] extent data offset 0 nr 77824 ram 77824 [29854.132647] item 26 key (3486 108 1146880) itemoff 2535 itemsize 53 [29854.132648] extent data disk bytenr 8699670528 nr 77824 [29854.132649] extent data offset 0 nr 77824 ram 77824 (...) [29854.132707] kernel BUG at fs/btrfs/ctree.h:3901! (...) [29854.132771] Call Trace: [29854.132779] [<ffffffffa0342b5c>] setup_items_for_insert+0x2dc/0x400 [btrfs] [29854.132791] [<ffffffffa0378537>] __btrfs_drop_extents+0xba7/0xdd0 [btrfs] [29854.132794] [<ffffffff8109c0d6>] ? trace_hardirqs_on_caller+0x16/0x1d0 [29854.132797] [<ffffffff8109c29d>] ? trace_hardirqs_on+0xd/0x10 [29854.132800] [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0 [29854.132810] [<ffffffffa036783b>] insert_reserved_file_extent.constprop.66+0xab/0x310 [btrfs] [29854.132820] [<ffffffffa036a6c6>] __btrfs_prealloc_file_range+0x116/0x340 [btrfs] [29854.132830] [<ffffffffa0374d53>] btrfs_prealloc_file_range+0x23/0x30 [btrfs] (...) So this is caused by getting an -ENOSPC error while punching a file hole, more specifically, we get -ENOSPC error from __btrfs_drop_extents in the while loop of file.c:btrfs_punch_hole() when it's unable to modify the btree to delete one or more file extent items due to lack of enough free space. When this happens, in btrfs_punch_hole(), we attempt to reclaim free space by switching our transaction block reservation object to root->fs_info->trans_block_rsv, end our transaction and start a new transaction basically - and, we keep increasing our current offset (cur_offset) as long as it's smaller than the end of the target range (lockend) - this makes use leave the loop with cur_offset == drop_end which in turn makes us call fill_holes() for inserting a file extent item that represents a 0 bytes range hole (and this insertion succeeds, as in the meanwhile more space became available). This 0 bytes file hole extent item is a problem because any subsequent caller of __btrfs_drop_extents (regular file writes, or fallocate calls for e.g.), with a start file offset that is equal to the offset of the hole, will not remove this extent item due to the following conditional in the while loop of __btrfs_drop_extents: if (extent_end <= search_start) { path->slots[0]++; goto next_slot; } This later makes the call to setup_items_for_insert() (at the very end of __btrfs_drop_extents), insert a new file extent item with the same offset as the 0 bytes file hole extent item that follows it. Needless is to say that this causes chaos, either when reading the leaf from disk (btree_readpage_end_io_hook), where we perform leaf sanity checks or in subsequent operations that manipulate file extent items, as in the fallocate call as shown by the dmesg trace above. Without my other patch to perform the leaf sanity checks once a leaf is marked as dirty (if the integrity checker is enabled), it would have been much harder to debug this issue. This change might fix a few similar issues reported by users in the mailing list regarding assertion failures in btrfs_set_item_key_safe calls performed by __btrfs_drop_extents, such as the following report: http://comments.gmane.org/gmane.comp.file-systems.btrfs/32938 Asking fill_holes() to create a 0 bytes wide file hole item also produced the first warning in the trace above, as we passed a range to btrfs_drop_extent_cache that has an end smaller (by -1) than its start. On 3.14 kernels this issue manifests itself through leaf corruption, as we get duplicated file extent item keys in a leaf when calling setup_items_for_insert(), but on older kernels, setup_items_for_insert() isn't called by __btrfs_drop_extents(), instead we have callers of __btrfs_drop_extents(), namely the functions inode.c:insert_inline_extent() and inode.c:insert_reserved_file_extent(), calling btrfs_insert_empty_item() to insert the new file extent item, which would fail with error -EEXIST, instead of inserting a duplicated key - which is still a serious issue as it would make all similar file extent item replace operations keep failing if they target the same file range. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Pavel Shilovsky	0d81fcdd14	CIFS: Fix memory leaks in SMB2_open commit `663a962151` upstream. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Benjamin LaHaise	0ea9bb58b2	aio: fix kernel memory disclosure in io_getevents() introduced in v3.10 commit `edfbbf388f` upstream. A kernel memory disclosure was introduced in aio_read_events_ring() in v3.10 by commit `a31ad380be`. The changes made to aio_read_events_ring() failed to correctly limit the index into ctx->ring_pages[], allowing an attacked to cause the subsequent kmap() of an arbitrary page with a copy_to_user() to copy the contents into userspace. This vulnerability has been assigned CVE-2014-0206. Thanks to Mateusz and Petr for disclosing this issue. This patch applies to v3.12+. A separate backport is needed for 3.10/3.11. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Benjamin LaHaise	b34e0e1319	aio: fix aio request leak when events are reaped by userspace commit `f8567a3845` upstream. The aio cleanups and optimizations by kmo that were merged into the 3.10 tree added a regression for userspace event reaping. Specifically, the reference counts are not decremented if the event is reaped in userspace, leading to the application being unable to submit further aio requests. This patch applies to 3.12+. A separate backport is required for 3.10/3.11. This issue was uncovered as part of CVE-2014-0206. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:02 -07:00
Jaegeuk Kim	fecb45f708	f2fs: submit bio at the reclaim path commit `2aea39eca6` upstream. If f2fs_write_data_page is called through the reclaim path, we should submit the bio right away. This patch resolves the following issue that Marc Dietrich reported. "It took me a while to bisect a problem which causes my ARM (tegra2) netbook to frequently stall for 5-10 seconds when I enable EXA acceleration (opentegra experimental ddx)." And this patch fixes that. Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:14:01 -07:00
Maurizio Lombardi	feb62c88c9	ext4: fix wrong assert in ext4_mb_normalize_request() commit `b5b6077855` upstream. The variable "size" is expressed as number of blocks and not as number of clusters, this could trigger a kernel panic when using ext4 with the size of a cluster different from the size of a block. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:13:56 -07:00
Namjae Jeon	59bc864486	ext4: fix ZERO_RANGE test failure in data journalling commit `e1ee60fd89` upstream. xfstests generic/091 is failing when mounting ext4 with data=journal. I think that this regression is same problem that occurred prior to collapse range issue. So ZERO RANGE also need to call ext4_force_commit as collapse range. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:13:56 -07:00
Jan Kara	f70b5876eb	ext4: fix zeroing of page during writeback commit `eeece469de` upstream. Tail of a page straddling inode size must be zeroed when being written out due to POSIX requirement that modifications of mmaped page beyond inode size must not be written to the file. ext4_bio_write_page() did this only for blocks fully beyond inode size but didn't properly zero blocks partially beyond inode size. Fix this. The problem has been uncovered by mmap_11-4 test in openposix test suite (part of LTP). Reported-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com> Fixes: `5a0dc7365c` Fixes: `bd2d0210cf` CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:13:56 -07:00
Namjae Jeon	ba6eb3c3ac	ext4: fix data integrity sync in ordered mode commit `1c8349a171` upstream. When we perform a data integrity sync we tag all the dirty pages with PAGECACHE_TAG_TOWRITE at start of ext4_da_writepages. Later we check for this tag in write_cache_pages_da and creates a struct mpage_da_data containing contiguously indexed pages tagged with this tag and sync these pages with a call to mpage_da_map_and_submit. This process is done in while loop until all the PAGECACHE_TAG_TOWRITE pages are synced. We also do journal start and stop in each iteration. journal_stop could initiate journal commit which would call ext4_writepage which in turn will call ext4_bio_write_page even for delayed OR unwritten buffers. When ext4_bio_write_page is called for such buffers, even though it does not sync them but it clears the PAGECACHE_TAG_TOWRITE of the corresponding page and hence these pages are also not synced by the currently running data integrity sync. We will end up with dirty pages although sync is completed. This could cause a potential data loss when the sync call is followed by a truncate_pagecache call, which is exactly the case in collapse_range. (It will cause generic/127 failure in xfstests) To avoid this issue, we can use set_page_writeback_keepwrite instead of set_page_writeback, which doesn't clear TOWRITE tag. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-30 20:13:56 -07:00
Al Viro	4e358517b2	lock_parent: don't step on stale ->d_parent of all-but-freed one commit `c2338f2dc7` upstream. Dentry that had been through (or into) __dentry_kill() might be seen by shrink_dentry_list(); that's normal, it'll be taken off the shrink list and freed if __dentry_kill() has already finished. The problem is, its ->d_parent might be pointing to already freed dentry, so lock_parent() needs to be careful. We need to check that dentry hasn't already gone into __dentry_kill() and grab rcu_read_lock() before dropping ->d_lock - the latter makes sure that whatever we see in ->d_parent after dropping ->d_lock it won't be freed until we drop rcu_read_lock(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-16 13:44:09 -07:00
Andy Lutomirski	d3c8656bc2	fs,userns: Change inode_capable to capable_wrt_inode_uidgid commit `23adbe12ef` upstream. The kernel has no concept of capabilities with respect to inodes; inodes exist independently of namespaces. For example, inode_capable(inode, CAP_LINUX_IMMUTABLE) would be nonsense. This patch changes inode_capable to check for uid and gid mappings and renames it to capable_wrt_inode_uidgid, which should make it more obvious what it does. Fixes CVE-2014-4014. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-06-16 13:44:09 -07:00
Linus Torvalds	c593e89787	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fix from Chris Mason: "I had this in my 3.16 merge window queue, but it is small and obvious enough for 3.15. I cherry-picked and retested against current rc8" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: send, fix corrupted path strings for long paths	2014-06-07 15:12:18 -07:00
Naoya Horiguchi	d4c54919ed	mm: add !pte_present() check on existing hugetlb_entry callbacks The age table walker doesn't check non-present hugetlb entry in common path, so hugetlb_entry() callbacks must check it. The reason for this behavior is that some callers want to handle it in its own way. [ I think that reason is bogus, btw - it should just do what the regular code does, which is to call the "pte_hole()" function for such hugetlb entries - Linus] However, some callers don't check it now, which causes unpredictable result, for example when we have a race between migrating hugepage and reading /proc/pid/numa_maps. This patch fixes it by adding !pte_present checks on buggy callbacks. This bug exists for years and got visible by introducing hugepage migration. ChangeLog v2: - fix if condition (check !pte_present() instead of pte_present()) Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Backported to 3.15. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 13:21:16 -07:00
Filipe Manana	01a9a8a9e2	Btrfs: send, fix corrupted path strings for long paths If a path has more than 230 characters, we allocate a new buffer to use for the path, but we were forgotting to copy the contents of the previous buffer into the new one, which has random content from the kmalloc call. Test: mkfs.btrfs -f /dev/sdd mount /dev/sdd /mnt TEST_PATH="/mnt/fdmanana/.config/google-chrome-mysetup/Default/Pepper_Data/Shockwave_Flash/WritableRoot/#SharedObjects/JSHJ4ZKN/s.wsj.net/[[IMPORT]]/players.edgesuite.net/flash/plugins/osmf/advanced-streaming-plugin/v2.7/osmf1.6/Ak#" mkdir -p $TEST_PATH echo "hello world" > $TEST_PATH/amaiAdvancedStreamingPlugin.txt btrfs subvolume snapshot -r /mnt /mnt/mysnap1 btrfs send /mnt/mysnap1 -f /tmp/1.snap A test for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Cc: Marc Merlin <marc@merlins.org> Tested-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Chris Mason <clm@fb.com>	2014-06-06 12:00:46 -07:00
Jianyu Zhan	c9482a5bdc	kernfs: move the last knowledge of sysfs out from kernfs There is still one residue of sysfs remaining: the sb_magic SYSFS_MAGIC. However this should be kernfs user specific, so this patch moves it out. Kerrnfs user should specify their magic number while mouting. Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-03 08:11:18 -07:00
Linus Torvalds	9f12600fe4	dcache: add missing lockdep annotation lock_parent() very much on purpose does nested locking of dentries, and is careful to maintain the right order (lock parent first). But because it didn't annotate the nested locking order, lockdep thought it might be a deadlock on d_lock, and complained. Add the proper annotation for the inner locking of the child dentry to make lockdep happy. Introduced by commit `046b961b45` ("shrink_dentry_list(): take parent's ->d_lock earlier"). Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-05-31 09:13:21 -07:00
Al Viro	8cbf74da43	dentry_kill() doesn't need the second argument now it's 1 in the only remaining caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-30 11:10:33 -04:00
Al Viro	b2b80195d8	dealing with the rest of shrink_dentry_list() livelock We have the same problem with ->d_lock order in the inner loop, where we are dropping references to ancestors. Same solution, basically - instead of using dentry_kill() we use lock_parent() (introduced in the previous commit) to get that lock in a safe way, recheck ->d_count (in case if lock_parent() has ended up dropping and retaking ->d_lock and somebody managed to grab a reference during that window), trylock the inode->i_lock and use __dentry_kill() to do the rest. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-30 11:10:33 -04:00
Al Viro	046b961b45	shrink_dentry_list(): take parent's ->d_lock earlier The cause of livelocks there is that we are taking ->d_lock on dentry and its parent in the wrong order, forcing us to use trylock on the parent's one. d_walk() takes them in the right order, and unfortunately it's not hard to create a situation when shrink_dentry_list() can't make progress since trylock keeps failing, and shrink_dcache_parent() or check_submounts_and_drop() keeps calling d_walk() disrupting the very shrink_dentry_list() it's waiting for. Solution is straightforward - if that trylock fails, let's unlock the dentry itself and take locks in the right order. We need to stabilize ->d_parent without holding ->d_lock, but that's doable using RCU. And we'd better do that in the very beginning of the loop in shrink_dentry_list(), since the checks on refcount, etc. would need to be redone anyway. That deals with a half of the problem - killing dentries on the shrink list itself. Another one (dropping their parents) is in the next commit. locking parent is interesting - it would be easy to do rcu_read_lock(), lock whatever we think is a parent, lock dentry itself and check if the parent is still the right one. Except that we need to check that before locking the dentry, or we are risking taking ->d_lock out of order. Fortunately, once the D1 is locked, we can check if D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent can start or stop pointing to D1 only under D1->d_lock, so taking D1->d_lock is enough. In other words, the right solution is rcu_read_lock/lock what looks like parent right now/check if it's still our parent/rcu_read_unlock/lock the child. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-30 11:03:21 -04:00
Al Viro	ff2fde9929	expand dentry_kill(dentry, 0) in shrink_dentry_list() Result will be massaged to saner shape in the next commits. It is ugly, no questions - the point of that one is to be a provably equivalent transformation (and it might be worth splitting a bit more). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-29 08:50:08 -04:00
Al Viro	e55fd01154	split dentry_kill() ... into trylocks and everything else. The latter (actual killing) is __dentry_kill(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-29 08:46:08 -04:00
Al Viro	64fd72e0a4	lift the "already marked killed" case into shrink_dentry_list() It can happen only when dentry_kill() is called with unlock_on_failure equal to 0 - other callers had dentry pinned until the moment they've got ->d_lock and DCACHE_DENTRY_KILLED is set only after lockref_mark_dead(). IOW, only one of three call sites of dentry_kill() might end up reaching that code. Just move it there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-28 09:48:44 -04:00
Miklos Szeredi	b6dd6f4738	vfs: fix vmplice_to_user() Commit `6130f5315e` "switch vmsplice_to_user() to copy_page_to_iter()" in v3.15-rc1 broke vmsplice(2). This patch fixes two bugs: - count is not initialized to a proper value, which resulted in no data being copied - if rw_copy_check_uvector() returns negative then the iov might be leaked. Tested OK. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-28 01:54:52 -04:00
Linus Torvalds	db1003f231	Merge branch 'afs' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes and cleanups from David Howells: "Here are some patches to the AFS filesystem: 1) Fix problems in the clean-up parts of the cache manager service handler. 2) Split afs_end_call() introduced in (1) and replace some identical code elsewhere with a call to the first half of the split function. 3) Fix an error introduced in the workqueue PREPARE_WORK() elimination commits. 4) Clean up argument passing to functions called from the workqueue as there's now an insulating layer between them and the workqueue. This is possible from (3)" * 'afs' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: AFS: Pass an afs_call* to call->async_workfn() instead of a work_struct* AFS: Fix kafs module unloading AFS: Part of afs_end_call() is identical to code elsewhere, so split it AFS: Fix cache manager service handlers	2014-05-25 12:40:36 -07:00
Linus Torvalds	80a1de29a5	Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linux Pull two nfsd bugfixes from Bruce Fields: "Just two bugfixes, one for a merge-window-introduced ACL regression, the other for a longer-standing v4 state bug" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: nfsd4: warn on finding lockowner without stateid's nfsd4: remove lockowner when removing lock stateid nfsd4: fix corruption on setting an ACL.	2014-05-25 10:08:48 -07:00
Joseph Qi	66db6cfd49	ocfs2: fix double kmem_cache_destroy in dlm_init In dlm_init, if create dlm_lockname_cache failed in dlm_init_master_caches, it will destroy dlm_lockres_cache which created before twice. And this will cause system die when loading modules. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-05-23 09:37:30 -07:00
David Howells	656f88ddf1	AFS: Pass an afs_call* to call->async_workfn() instead of a work_struct* call->async_workfn() can take an afs_call* arg rather than a work_struct* as the functions assigned there are now called from afs_async_workfn() which has to call container_of() anyway. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu> Reviewed-by: Tejun Heo <tj@kernel.org>	2014-05-23 13:05:22 +01:00
Nathaniel Wesley Filardo	150a6b4789	AFS: Fix kafs module unloading At present, it is not possible to successfully unload the kafs module if there are outstanding async outgoing calls (those made with afs_make_call()). This appears to be due to the changes introduced by: commit `059499453a` Author: Tejun Heo <tj@kernel.org> Date: Fri Mar 7 10:24:50 2014 -0500 Subject: afs: don't use PREPARE_WORK which didn't go far enough. The problem is due to: (1) The aforementioned commit introduced a separate handler function pointer in the call, call->async_workfn, in addition to the original workqueue item, call->async_work, for asynchronous operations because workqueues subsystem cannot handle the workqueue item pointer being changed whilst the item is queued or being processed. (2) afs_async_workfn() was introduced in that commit to be the callback for call->async_work. Its sole purpose is to run whatever call->async_workfn points to. (3) call->async_workfn is only used from afs_async_workfn(), which is only set on async_work by afs_collect_incoming_call() - ie. for incoming calls. (4) call->async_workfn is not set by afs_make_call() when outgoing calls are made, and call->async_work is set afs_process_async_call() - and not afs_async_workfn(). (5) afs_process_async_call() now changes call->async_workfn rather than call->async_work to point to afs_delete_async_call() to clean up, but this is only effective for incoming calls because call->async_work does not point to afs_async_workfn() for outgoing calls. (6) Because, for incoming calls, call->async_work remains pointing to afs_process_async_call() this results in an infinite loop. Instead, make the workqueue uniformly vector through call->async_workfn, via afs_async_workfn() and simply initialise call->async_workfn to point to afs_process_async_call() in afs_make_call(). Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org>	2014-05-23 13:05:22 +01:00
Nathaniel Wesley Filardo	6cf12869f5	AFS: Part of afs_end_call() is identical to code elsewhere, so split it Split afs_end_call() into two pieces, one of which is identical to code in afs_process_async_call(). Replace the latter with a call to the first part of afs_end_call(). Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu> Signed-off-by: David Howells <dhowells@redhat.com>	2014-05-23 13:05:15 +01:00
Linus Torvalds	11da37b263	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull two btrfs fixes from Chris Mason: "This has two fixes that we've been testing for 3.16, but since both are safe and fix real bugs, it makes sense to send for 3.15 instead" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: send, fix incorrect ref access when using extrefs Btrfs: fix EIO on reading file after ioctl clone works on it	2014-05-22 05:40:13 +09:00
Linus Torvalds	5e9d9fc4ed	Merge tag 'xfs-for-linus-3.15-rc6' of git://oss.sgi.com/xfs/xfs Pull xfs fixes from Dave Chinner: "Code inspection of the XFS error number sign translations found a bunch of issues, including returning incorrectly signed errors for some data integrity operations. These leak to userspace and result in applications not getting the errors correctly reported. Hence they need fixing sooner rather than later. A couple of the bugs are in data integrity operations, a couple more are in the new COLLAPSE_RANGE code. One of these came in through a recent ext4 merge and so I had to update the base tree to 3.15-rc5 before fixing the issues" * tag 'xfs-for-linus-3.15-rc6' of git://oss.sgi.com/xfs/xfs: xfs: list_lru_init returns a negative error xfs: negate xfs_icsb_init_counters error value xfs: negate mount workqueue init error value xfs: fix wrong err sign on xfs_set_acl() xfs: fix wrong errno from xfs_initxattrs xfs: correct error sign on COLLAPSE_RANGE errors xfs: xfs_commit_metadata returns wrong errno xfs: fix incorrect error sign in xfs_file_aio_read xfs: xfs_dir_fsync() returns positive errno	2014-05-22 05:36:07 +09:00
J. Bruce Fields	27b11428b7	nfsd4: warn on finding lockowner without stateid's The current code assumes a one-to-one lockowner<->lock stateid correspondance. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-21 11:11:21 -04:00

1 2 3 4 5 ...

36043 Commits