linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-09 12:17:12 +09:00

Author	SHA1	Message	Date
FUJITA Tomonori	588809a6ee	fix SBA IOMMU to handle allocation failure properly commit `e2a465675d` upstream. It's possible that SBA IOMMU might fail to find I/O space under heavy I/Os. SBA IOMMU panics on allocation failure but it shouldn't; drivers can handle the failure. The majority of other IOMMU drivers don't panic on allocation failure. This patch fixes SBA IOMMU path to handle allocation failure properly. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Leonardo Chiquitto <lchiquitto@novell.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:15 -07:00
Benjamin Herrenschmidt	f40bf5f2fc	mutex: Don't spin when the owner CPU is offline or other weird cases commit `4b40221048` upstream. Due to recent load-balancer changes that delay the task migration to the next wakeup, the adaptive mutex spinning ends up in a live lock when the owner's CPU gets offlined because the cpu_online() check lives before the owner running check. This patch changes mutex_spin_on_owner() to return 0 (don't spin) in any case where we aren't sure about the owner struct validity or CPU number, and if the said CPU is offline. There is no point going back & re-evaluate spinning in corner cases like that, let's just go to sleep. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1271212509.13059.135.camel@pasglop> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:14 -07:00
Hidetoshi Seto	19eb722b76	sched, cputime: Introduce thread_group_times() commit `0cf55e1ec0` upstream. This is a real fix for problem of utime/stime values decreasing described in the thread: http://lkml.org/lkml/2009/11/3/522 Now cputime is accounted in the following way: - {u,s}time in task_struct are increased every time when the thread is interrupted by a tick (timer interrupt). - When a thread exits, its {u,s}time are added to signal->{u,s}time, after adjusted by task_times(). - When all threads in a thread_group exits, accumulated {u,s}time (and also c{u,s}time) in signal struct are added to c{u,s}time in signal struct of the group's parent. So {u,s}time in task struct are "raw" tick count, while {u,s}time and c{u,s}time in signal struct are "adjusted" values. And accounted values are used by: - task_times(), to get cputime of a thread: This function returns adjusted values that originates from raw {u,s}time and scaled by sum_exec_runtime that accounted by CFS. - thread_group_cputime(), to get cputime of a thread group: This function returns sum of all {u,s}time of living threads in the group, plus {u,s}time in the signal struct that is sum of adjusted cputimes of all exited threads belonged to the group. The problem is the return value of thread_group_cputime(), because it is mixed sum of "raw" value and "adjusted" value: group's {u,s}time = foreach(thread){{u,s}time} + exited({u,s}time) This misbehavior can break {u,s}time monotonicity. Assume that if there is a thread that have raw values greater than adjusted values (e.g. interrupted by 1000Hz ticks 50 times but only runs 45ms) and if it exits, cputime will decrease (e.g. -5ms). To fix this, we could do: group's {u,s}time = foreach(t){task_times(t)} + exited({u,s}time) But task_times() contains hard divisions, so applying it for every thread should be avoided. This patch fixes the above problem in the following way: - Modify thread's exit (= __exit_signal()) not to use task_times(). It means {u,s}time in signal struct accumulates raw values instead of adjusted values. As the result it makes thread_group_cputime() to return pure sum of "raw" values. - Introduce a new function thread_group_times(task, utime, *stime) that converts "raw" values of thread_group_cputime() to "adjusted" values, in same calculation procedure as task_times(). - Modify group's exit (= wait_task_zombie()) to use this introduced thread_group_times(). It make c{u,s}time in signal struct to have adjusted values like before this patch. - Replace some thread_group_cputime() by thread_group_times(). This replacements are only applied where conveys the "adjusted" cputime to users, and where already uses task_times() near by it. (i.e. sys_times(), getrusage(), and /proc/<PID>/stat.) This patch have a positive side effect: - Before this patch, if a group contains many short-life threads (e.g. runs 0.9ms and not interrupted by ticks), the group's cputime could be invisible since thread's cputime was accumulated after adjusted: imagine adjustment function as adj(ticks, runtime), {adj(0, 0.9) + adj(0, 0.9) + ....} = {0 + 0 + ....} = 0. After this patch it will not happen because the adjustment is applied after accumulated. v2: - remove if()s, put new variables into signal_struct. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Spencer Candland <spencer@bluehost.com> Cc: Americo Wang <xiyou.wangcong@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> LKML-Reference: <4B162517.8040909@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:14 -07:00
Hidetoshi Seto	2b2513f387	sched: Fix granularity of task_u/stime() commit `761b1d26df` upstream. Originally task_s/utime() were designed to return clock_t but later changed to return cputime_t by following commit: commit `efe567fc82` Author: Christian Borntraeger <borntraeger@de.ibm.com> Date: Thu Aug 23 15:18:02 2007 +0200 It only changed the type of return value, but not the implementation. As the result the granularity of task_s/utime() is still that of clock_t, not that of cputime_t. So using task_s/utime() in __exit_signal() makes values accumulated to the signal struct to be rounded and coarse grained. This patch removes casts to clock_t in task_u/stime(), to keep granularity of cputime_t over the calculation. v2: Use div_u64() to avoid error "undefined reference to `__udivdi3`" on some 32bit systems. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: xiyou.wangcong@gmail.com Cc: Spencer Candland <spencer@bluehost.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> LKML-Reference: <4AFB9029.9000208@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:14 -07:00
Lin Ming	8aa3149405	timekeeping: Fix clock_gettime vsyscall time warp commit `0696b711e4` upstream. Since commit `0a544198` "timekeeping: Move NTP adjusted clock multiplier to struct timekeeper" the clock multiplier of vsyscall is updated with the unmodified clock multiplier of the clock source and not with the NTP adjusted multiplier of the timekeeper. This causes user space observerable time warps: new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf Add a new argument "mult" to update_vsyscall() and hand in the timekeeping internal NTP adjusted multiplier. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Tony Luck <tony.luck@intel.com> LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kurt Garloff <garloff@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:13 -07:00
Martin Schwidefsky	e66bb88311	nohz: Reuse ktime in sub-functions of tick_check_idle. commit `eed3b9cf3f` upstream. On a system with NOHZ=y tick_check_idle calls tick_nohz_stop_idle and tick_nohz_update_jiffies. Given the right conditions (ts->idle_active and/or ts->tick_stopped) both function get a time stamp with ktime_get. The same time stamp can be reused if both function require one. On s390 this change has the additional benefit that gcc inlines the tick_nohz_stop_idle function into tick_check_idle. The number of instructions to execute tick_check_idle drops from 225 to 144 (without the ktime_get optimization it is 367 vs 215 instructions). before: 0) \| tick_check_idle() { 0) \| tick_nohz_stop_idle() { 0) \| ktime_get() { 0) \| read_tod_clock() { 0) 0.601 us \| } 0) 1.765 us \| } 0) 3.047 us \| } 0) \| ktime_get() { 0) \| read_tod_clock() { 0) 0.570 us \| } 0) 1.727 us \| } 0) \| tick_do_update_jiffies64() { 0) 0.609 us \| } 0) 8.055 us \| } after: 0) \| tick_check_idle() { 0) \| ktime_get() { 0) \| read_tod_clock() { 0) 0.617 us \| } 0) 1.773 us \| } 0) \| tick_do_update_jiffies64() { 0) 0.593 us \| } 0) 4.477 us \| } Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: john stultz <johnstul@us.ibm.com> LKML-Reference: <20090929122533.206589318@de.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Jolly <jjolly@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:13 -07:00
Martin Schwidefsky	7bfa0a73b2	nohz: Introduce arch_needs_cpu commit `3c5d92a0cf` upstream. Allow the architecture to request a normal jiffy tick when the system goes idle and tick_nohz_stop_sched_tick is called . On s390 the hook is used to prevent the system going fully idle if there has been an interrupt other than a clock comparator interrupt since the last wakeup. On s390 the HiperSockets response time for 1 connection ping-pong goes down from 42 to 34 microseconds. The CPU cost decreases by 27%. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> LKML-Reference: <20090929122533.402715150@de.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Jolly <jjolly@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:13 -07:00
Josef Bacik	55f32f8ac6	Btrfs: kfree correct pointer during mount option parsing commit `da495ecc0f` upstream. We kstrdup the options string, but then strsep screws with the pointer, so when we kfree() it, we're not giving it the right pointer. Tested-by: Andy Lutomirski <luto@mit.edu> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:12 -07:00
Shaohua Li	f827e15f9f	Btrfs: btrfs_mark_extent_written uses the wrong slot commit `3f6fae9559` upstream. My test do: fallocate a big file and do write. The file is 512M, but after file write is done btrfs-debug-tree shows: item 6 key (257 EXTENT_DATA 0) itemoff 3516 itemsize 53 extent data disk byte 1103101952 nr 536870912 extent data offset 0 nr 399634432 ram 536870912 extent compression 0 Looks like a regression introducted by `6c7d54ac87`, where we set wrong slot. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:12 -07:00
Aneesh Kumar K.V	b5763c0f3b	Btrfs: apply updated fallocate i_size fix commit `23b5c50945` upstream. This version of the i_size fix for fallocate makes sure we only update the i_size when the current fallocate is really operating outside of i_size. Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:12 -07:00
Josef Bacik	8fde9c08a0	Btrfs: do not try and lookup the file extent when finishing ordered io commit `efd049fb26` upstream. When running the following fio job [torrent] filename=torrent-test rw=randwrite size=4g filesize=4g bs=4k ioengine=sync you would see long stalls where no work was being done. That is because we were doing all this extra work to read in the file extent outside of the transaction, however in the random io case this ends up hurting us because the file extents are not there to begin with. So axe this logic, since we end up reading in the file extent when we go to update it anyway. This took the fio job from 11 mb/s with several ~10 second stalls to 24 mb/s to a couple of 1-2 second stalls. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:11 -07:00
Yan, Zheng	fa3c9782cd	Btrfs: Fix oopsen when dropping empty tree. commit `7a7965f83e` upstream. When dropping a empty tree, walk_down_tree() skips checking extent information for the tree root. This will triggers a BUG_ON in walk_up_proc(). Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:11 -07:00
Miao Xie	4c324840d6	Btrfs: remove BUG_ON() due to mounting bad filesystem commit `d7ce5843bb` upstream. Mounting a bad filesystem caused a BUG_ON(). The following is steps to reproduce it. # mkfs.btrfs /dev/sda2 # mount /dev/sda2 /mnt # mkfs.btrfs /dev/sda1 /dev/sda2 (the program says that /dev/sda2 was mounted, and then exits. ) # umount /mnt # mount /dev/sda1 /mnt At the third step, mkfs.btrfs exited in the way of make filesystem. So the initialization of the filesystem didn't finish. So the filesystem was bad, and it caused BUG_ON() when mounting it. But BUG_ON() should be called by the wrong code, not user's operation, so I think it is a bug of btrfs. This patch fixes it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:11 -07:00
Roel Kluin	caf32f2c7d	Btrfs: make error return negative in btrfs_sync_file() commit `014e4ac4f7` upstream. It appears the error return should be negative Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:10 -07:00
Yan, Zheng	85d0fbf784	Btrfs: fix race between allocate and release extent buffer. commit `f044ba7835` upstream. Increase extent buffer's reference count while holding the lock. Otherwise it can race with try_release_extent_buffer. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:10 -07:00
Josef Bacik	95ec1a915f	Btrfs: check total number of devices when removing missing commit `035fe03a7a` upstream. If you have a disk failure in RAID1 and then add a new disk to the array, and then try to remove the missing volume, it will fail. The reason is the sanity check only looks at the total number of rw devices, which is just 2 because we have 2 good disks and 1 bad one. Instead check the total number of devices in the array to make sure we can actually remove the device. Tested this with a failed disk setup and with this test we can now run btrfs-vol -r missing /mount/point and it works fine. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:09 -07:00
Josef Bacik	b618d2a8d9	Btrfs: check return value of open_bdev_exclusive properly commit `7f59203abe` upstream. Hit this problem while testing RAID1 failure stuff. open_bdev_exclusive returns ERR_PTR(), not NULL. So change the return value properly. This is important if you accidently specify a device that doesn't exist when trying to add a new device to an array, you will panic the box dereferencing bdev. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:08 -07:00
Josef Bacik	020009fcab	Btrfs: do not mark the chunk as readonly if in degraded mode commit `f48b90756b` upstream. If a RAID setup has chunks that span multiple disks, and one of those disks has failed, btrfs_chunk_readonly will return 1 since one of the disks in that chunk's stripes is dead and therefore not writeable. So instead if we are in degraded mode, return 0 so we can go ahead and allocate stuff. Without this patch all of the block groups in a RAID1 setup will end up read-only, which will mean we can't add new disks to the array since we won't be able to make allocations. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:08 -07:00
Josef Bacik	8766aeadf9	Btrfs: run orphan cleanup on default fs root commit `e3acc2a685` upstream. This patch revert's commit `6c090a11e1` Since it introduces this problem where we can run orphan cleanup on a volume that can have orphan entries re-added. Instead of my original fix, Yan Zheng pointed out that we can just revert my original fix and then run the orphan cleanup in open_ctree after we look up the fs_root. I have tested this with all the tests that gave me problems and this patch fixes both problems. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:07 -07:00
Yang Hongyang	aeda1ce857	Btrfs: fix a memory leak in btrfs_init_acl commit `f858153c36` upstream. In btrfs_init_acl() cloned acl is not released Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:06 -07:00
Aneesh Kumar K.V	ccfc4449f8	Btrfs: Use correct values when updating inode i_size on fallocate commit `d1ea6a6145` upstream. commit f2bc9dd07e3424c4ec5f3949961fe053d47bc825 Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Date: Wed Jan 20 12:57:53 2010 +0530 Btrfs: Use correct values when updating inode i_size on fallocate Even though we allocate more, we should be updating inode i_size as per the arguments passed Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:05 -07:00
Josef Bacik	fad9f7ba5e	Btrfs: fix possible panic on unmount commit `11dfe35a01` upstream. We can race with the unmount of an fs and the stopping of a kthread where we will free the block group before we're done using it. The reason for this is because we do not hold a reference on the block group while its caching, since the allocator drops its reference once it exits or moves on to the next block group. This patch fixes the problem by taking a reference to the block group before we start caching and dropping it when we're done to make sure all accesses to the block group are safe. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:05 -07:00
Chris Mason	f7733c6a95	Btrfs: deal with NULL acl sent to btrfs_set_acl commit `a9cc71a60c` upstream. It is legal for btrfs_set_acl to be sent a NULL acl. This makes sure we don't dereference it. A similar patch was sent by Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:04 -07:00
Josef Bacik	3d31143c56	Btrfs: fix regression in orphan cleanup commit `6c090a11e1` upstream. Currently orphan cleanup only ever gets triggered if we cross subvolumes during a lookup, which means that if we just mount a plain jane fs that has orphans in it, they will never get cleaned up. This results in panic's like these http://www.kerneloops.org/oops.php?number=1109085 where adding an orphan entry results in -EEXIST being returned and we panic. In order to fix this, we check to see on lookup if our root has had the orphan cleanup done, and if not go ahead and do it. This is easily reproduceable by running this testcase #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <string.h> #include <unistd.h> #include <stdio.h> int main(int argc, char *argv) { char data[4096]; char newdata[4096]; int fd1, fd2; memset(data, 'a', 4096); memset(newdata, 'b', 4096); while (1) { int i; fd1 = creat("file1", 0666); if (fd1 < 0) break; for (i = 0; i < 512; i++) write(fd1, data, 4096); fsync(fd1); close(fd1); fd2 = creat("file2", 0666); if (fd2 < 0) break; ftruncate(fd2, 4096 512); for (i = 0; i < 512; i++) write(fd2, newdata, 4096); close(fd2); i = rename("file2", "file1"); unlink("file1"); } return 0; } and then pulling the power on the box, and then trying to run that test again when the box comes back up. I've tested this locally and it fixes the problem. Thanks to Tomas Carnecky for helping me track this down initially. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:04 -07:00
Yan, Zheng	f0862ee9b0	Btrfs: Fix race in btrfs_mark_extent_written commit `6c7d54ac87` upstream. Fix bug reported by Johannes Hirte. The reason of that bug is btrfs_del_items is called after btrfs_duplicate_item and btrfs_del_items triggers tree balance. The fix is check that case and call btrfs_search_slot when needed. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:03 -07:00
Jiri Slaby	ded3c4fede	Btrfs, fix memory leaks in error paths commit `2423fdfb96` upstream. Stanse found 2 memory leaks in relocate_block_group and __btrfs_map_block. cluster and multi are not freed/assigned on all paths. Fix that. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: linux-btrfs@vger.kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:02 -07:00
Yan, Zheng	e00818493b	Btrfs: align offsets for btrfs_ordered_update_i_size commit `a038fab0cb` upstream. Some callers of btrfs_ordered_update_i_size can now pass in a NULL for the ordered extent to update against. This makes sure we properly align the offset they pass in when deciding how much to bump the on disk i_size. Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:02 -07:00
Jan Engelhardt	77033193e0	btrfs: fix missing last-entry in readdir(3) commit `406266ab9a` upstream. parent 49313cdac7b34c9f7ecbb1780cfc648b1c082cd7 (v2.6.32-1-g49313cd) commit ff48c08e1c05c67e8348ab6f8a24de8034e0e34d Author: Jan Engelhardt <jengelh@medozas.de> Date: Wed Dec 9 22:57:36 2009 +0100 Btrfs: fix missing last-entry in readdir(3) When one does a 32-bit readdir(3), the last entry of a directory is missing. This is however not due to passing a large value to filldir, but it seems to have to do with glibc doing telldir or something quirky. In any case, this patch fixes it in practice. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:01 -07:00
Chris Mason	88a9a7cf4c	Btrfs: make sure fallocate properly starts a transaction commit `3a1abec9f6` upstream. The recent patch to make fallocate enospc friendly would send down a NULL trans handle to the allocator. This moves the transaction start to properly fix things. Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:00 -07:00
Josef Bacik	e4b2abc432	Btrfs: make metadata chunks smaller commit `83d3c9696f` upstream. This patch makes us a bit less zealous about making sure we have enough free metadata space by pearing down the size of new metadata chunks to 256mb instead of 1gb. Also, we used to try an allocate metadata chunks when allocating data, but that sort of thing is done elsewhere now so we can just remove it. With my -ENOSPC test I used to have 3gb reserved for metadata out of 75gb, now I have 1.7gb. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:00 -07:00
Matthew Wilcox	0ab44aac74	Btrfs: Show discard option in /proc/mounts commit `20a5239a5d` upstream. Christoph's patch `e244a0aeb6` doesn't display the discard option in /proc/mounts, leading to some confusion for me. Here's the missing bit. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:20:00 -07:00
TARUISI Hiroaki	8db426bbfe	Btrfs: deny sys_link across subvolumes. commit `4a8be425a8` upstream. I rebased Christian Parpart's patch to deny hard link across subvolumes. Original patch modifies also btrfs_rename, but I excluded it because we can move across subvolumes now and it make no problem. ----------------- Hard link across subvolumes should not allowed in Btrfs. btrfs_link checks root of 'to' directory is same as root of 'from' file. If not same, btrfs_link returns -EPERM. Signed-off-by: TARUISI Hiroaki <taruishi.hiroak@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:59 -07:00
Sage Weil	f09a3eb5f7	Btrfs: fail mount on bad mount options commit `a7a3f7cadd` upstream. We shouldn't silently ignore unrecognized options. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:58 -07:00
Yan, Zheng	755885f6c0	Btrfs: don't add extent 0 to the free space cache v2 commit `06b2331f83` upstream. If block group 0 is completely free, btrfs_read_block_groups will add extent [0, BTRFS_SUPER_INFO_OFFSET) to the free space cache. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:58 -07:00
Yan, Zheng	feeeea7814	Btrfs: Fix per root used space accounting commit `86b9f2eca5` upstream. The bytes_used field in root item was originally planned to trace the amount of used data and tree blocks. But it never worked right since we can't trace freeing of data accurately. This patch changes it to only trace the amount of tree blocks. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:57 -07:00
Yan, Zheng	c87afd35a2	Btrfs: Fix btrfs_drop_extent_cache for skip pinned case commit `55ef689900` upstream. The check for skip pinned case is wrong, it may breaks the while loop too soon. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:57 -07:00
Yan, Zheng	a5b2df73b8	Btrfs: Add delayed iput commit `24bbcf0442` upstream. iput() can trigger new transactions if we are dropping the final reference, so calling it in btrfs_commit_transaction may end up deadlock. This patch adds delayed iput to avoid the issue. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:57 -07:00
Yan, Zheng	6e423a0b56	Btrfs: Pass transaction handle to security and ACL initialization functions commit `f34f57a3ab` upstream. Pass transaction handle down to security and ACL initialization functions, so we can avoid starting nested transactions Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:56 -07:00
Yan, Zheng	6a5b99b6b6	Btrfs: Make truncate(2) more ENOSPC friendly commit `8082510e71` upstream. truncating and deleting regular files are unbound operations, so it's not good to do them in a single transaction. This patch makes btrfs_truncate and btrfs_delete_inode start a new transaction after all items in a tree leaf are deleted. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:56 -07:00
Yan, Zheng	16fdd7c37a	Btrfs: Make fallocate(2) more ENOSPC friendly commit `5a303d5d4b` upstream. fallocate(2) may allocate large number of file extents, so it's not good to do it in a single transaction. This patch make fallocate(2) start a new transaction for each file extents it allocates. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:55 -07:00
Yan, Zheng	9202f1111a	Btrfs: Avoid orphan inodes cleanup during committing transaction commit `2e4bfab970` upstream. btrfs_lookup_dentry may trigger orphan cleanup, so it's not good to call it while committing a transaction. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:55 -07:00
Yan, Zheng	c4ba0bd9db	Btrfs: Avoid orphan inodes cleanup while replaying log commit `c71bf099ab` upstream. We do log replay in a single transaction, so it's not good to do unbound operations. This patch cleans up orphan inodes cleanup after replaying the log. It also avoids doing other unbound operations such as truncating a file during replaying log. These unbound operations are postponed to the orphan inode cleanup stage. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:54 -07:00
Yan, Zheng	eb99c071ac	Btrfs: Fix disk_i_size update corner case commit `c216775458` upstream. There are some cases file extents are inserted without involving ordered struct. In these cases, we update disk_i_size directly, without checking pending ordered extent and DELALLOC bit. This patch extends btrfs_ordered_update_i_size() to handle these cases. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:54 -07:00
Yan, Zheng	c8e789da04	Btrfs: Rewrite btrfs_drop_extents commit `920bbbfb05` upstream. Rewrite btrfs_drop_extents by using btrfs_duplicate_item, so we can avoid calling lock_extent within transaction. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:53 -07:00
Yan, Zheng	8bf364c4eb	Btrfs: Add btrfs_duplicate_item commit `ad48fd7546` upstream. btrfs_duplicate_item duplicates item with new key, guaranteeing the source item and the new items are in the same tree leaf and contiguous. It allows us to split file extent in place, without using lock_extent to prevent bookend extent race. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:52 -07:00
Yan, Zheng	4a77a16b9f	Btrfs: Avoid superfluous tree-log writeout commit `8cef4e160d` upstream. We allow two log transactions at a time, but use same flag to mark dirty tree-log btree blocks. So we may flush dirty blocks belonging to newer log transaction when committing a log transaction. This patch fixes the issue by using two flags to mark dirty tree-log btree blocks. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:52 -07:00
Dave Müller	f403905d43	drm/i915: Use RSEN instead of HTPLG for tfp410 monitor detection. commit `f458823b86` upstream. Presence detection of a digital monitor seems not to be reliable using the HTPLG bit. Dave Müller <dave.mueller@gmx.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:52 -07:00
Eric Sandeen	c3d0780019	ext4: fix freeze deadlock under IO commit `437f88cc03` upstream. Commit `6b0310fbf0` caused a regression resulting in deadlocks when freezing a filesystem which had active IO; the vfs_check_frozen level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing through. Duh. Changing the test to FREEZE_TRANS should let the normal freeze syncing get through the fs, but still block any transactions from starting once the fs is completely frozen. I tested this by running fsstress in the background while periodically snapshotting the fs and running fsck on the result. I ran into occasional deadlocks, but different ones. I think this is a fine fix for the problem at hand, and the other deadlocky things will need more investigation. Reported-by: Phillip Susi <psusi@cfl.rr.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:51 -07:00
Ian Campbell	7d9deb060f	xen: Do not suspend IPI IRQs. commit `4877c73728` upstream. In general the semantics of IPIs are that they are are expected to continue functioning after dpm_suspend_noirq(). Specifically I have seen a deadlock between the callfunc IPI and the stop machine used by xen's do_suspend() routine. If one CPU has already called dpm_suspend_noirq() then there is a window where it can be sent a callfunc IPI before all the other CPUs have entered stop_cpu(). If this happens then the first CPU ends up spinning in stop_cpu() waiting for the other to rendezvous in state STOPMACHINE_PREPARE while the other is spinning in csd_lock_wait(). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: xen-devel@lists.xensource.com LKML-Reference: <1280398595-29708-4-git-send-email-ian.campbell@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:51 -07:00
Ian Campbell	35822af390	irq: Add new IRQ flag IRQF_NO_SUSPEND commit `685fd0b4ea` upstream. A small number of users of IRQF_TIMER are using it for the implied no suspend behaviour on interrupts which are not timer interrupts. Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to __IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: xen-devel@lists.xensource.com Cc: linux-input@vger.kernel.org Cc: linuxppc-dev@ozlabs.org Cc: devicetree-discuss@lists.ozlabs.org LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-13 13:19:50 -07:00

1 2 3 4 5 ...

170792 Commits