linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-04 04:03:04 +09:00

Author	SHA1	Message	Date
Sheng Yang	926fd42e59	KVM: VMX: Disable unrestricted guest when EPT disabled commit `046d87103a` upstream Otherwise would cause VMEntry failure when using ept=0 on unrestricted guest supported processors. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:21 -07:00
Eduardo Habkost	e075e2caeb	KVM: SVM: Reset cr0 properly on vcpu reset commit `18fa000ae4` upstream svm_vcpu_reset() was not properly resetting the contents of the guest-visible cr0 register, causing the following issue: https://bugzilla.redhat.com/show_bug.cgi?id=525699 Without resetting cr0 properly, the vcpu was running the SIPI bootstrap routine with paging enabled, making the vcpu get a pagefault exception while trying to run it. Instead of setting vmcb->save.cr0 directly, the new code just resets kvm->arch.cr0 and calls kvm_set_cr0(). The bits that were set/cleared on vmcb->save.cr0 (PG, WP, !CD, !NW) will be set properly by svm_set_cr0(). kvm_set_cr0() is used instead of calling svm_set_cr0() directly to make sure kvm_mmu_reset_context() is called to reset the mmu to nonpaging mode. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Eduardo Habkost	5b4ce46f45	KVM: VMX: Use macros instead of hex value on cr0 initialization commit `fa40052ca0` upstream This should have no effect, it is just to make the code clearer. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Jan Kiszka	86143fe846	KVM: VMX: Update instruction length on intercepted BP commit `c573cd2293` upstream We intercept #BP while in guest debugging mode. As VM exits due to intercepted exceptions do not necessarily come with valid idt_vectoring, we have to update event_exit_inst_len explicitly in such cases. At least in the absence of migration, this ensures that re-injections of #BP will find and use the correct instruction length. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Gleb Natapov	78ce64a384	KVM: Fix segment descriptor loading commit `c697518a86` upstream Add proper error and permission checking. This patch also change task switching code to load segment selectors before segment descriptors, like SDM requires, otherwise permission checking during segment descriptor loading will be incorrect. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Gleb Natapov	ad66f35eaa	KVM: x86 emulator: Fix popf emulation commit `d4c6a1549c` upstream POPF behaves differently depending on current CPU mode. Emulate correct logic to prevent guest from changing flags that it can't change otherwise. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Gleb Natapov	f6eb212e60	KVM: x86 emulator: Check IOPL level during io instruction emulation commit `f850e2e603` upstream Make emulator check that vcpu is allowed to execute IN, INS, OUT, OUTS, CLI, STI. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Gleb Natapov	2f44154668	KVM: x86 emulator: fix memory access during x86 emulation commit `1871c6020d` upstream Currently when x86 emulator needs to access memory, page walk is done with broadest permission possible, so if emulated instruction was executed by userspace process it can still access kernel memory. Fix that by providing correct memory access to page walker during emulation. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Gleb Natapov	10a505e60e	KVM: x86 emulator: Add Virtual-8086 mode of emulation commit `a004475567` upstream For some instructions CPU behaves differently for real-mode and virtual 8086. Let emulator know which mode cpu is in, so it will not poke into vcpu state directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:20 -07:00
Evan McClain	5774cdfd3d	backlight: mbp_nvidia_bl - add five more MacBook variants commit `36bc5ee6a8` upstream. This adds the MacBook 1,1 2,1 3,1 4,1 and 4,2 to the DMI tables. Signed-off-by: Evan McClain <evan.mcclain@gatech.edu> Signed-off-by: Richard Purdie <rpurdie@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Jiri Slaby	6abbc7bac8	resource: move kernel function inside __KERNEL__ commit `96d07d2117` upstream From: Jiri Slaby <jslaby@suse.cz> resource: move kernel function inside __KERNEL__ It is an internal function. Move it inside __KERNEL__ ifdef, along with task_struct declaration. Then we get: #--- /usr/include/linux/resource.h 2009-09-14 15:09:29.000000000 +0200 #+++ usr/include/linux/resource.h 2010-01-04 11:30:54.000000000 +0100 #@@ -3,8 +3,6 @@ # ##include <linux/time.h> # #-struct task_struct; #- #/* #* Resource control/accounting header file for linux #/ #@@ -70,6 +68,5 @@ #/ ##include <asm/resource.h> # #-int getrusage(struct task_struct p, int who, struct rusage ru); # ##endif # #*********** include/linux/Kbuild is untouched, since unifdef is run even on headers-y nowadays. backport to 2.6.32 by maximilian attems <max@stro.at> Patch commented out by gregkh due to quilt complaining. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Andreas Herrmann	787814b206	x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space commit `9d260ebc09` upstream. Use NodeId MSR to get NodeId and number of nodes per processor. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20091216144355.GB28798@alberich.amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Daniel T Chen	97d0f3f2a0	ALSA: hda: Fix 0 dB offset for Lenovo Thinkpad models using AD1981 commit `b8e80cf386` upstream. BugLink: https://launchpad.net/bugs/551606 The OR's hardware distorts at PCM 100% because it does not correspond to 0 dB. Fix this in patch_ad1981() for all models using the Thinkpad quirk. Reported-by: Jane Silber Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Dan Carpenter	db27624dc7	ALSA: mixart: range checking proc file commit `b0cc58a25d` upstream. The original code doesn't take into consideration that the value of MIXART_BA0_SIZE - pos can be less than zero which would lead to a large unsigned value for "count". Also I moved the check that read size is a multiple of 4 bytes below the code that adjusts "count". Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Wu Fengguang	a54573d774	readahead: fix NULL filp dereference commit `70655c06bd` upstream. btrfs relocate_file_extent_cluster() calls us with NULL filp: [ 4005.426805] BUG: unable to handle kernel NULL pointer dereference at 00000021 [ 4005.426818] IP: [<c109a130>] page_cache_sync_readahead+0x18/0x3e Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Yan Zheng <yanzheng@21cn.com> Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Tested-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Anton Blanchard	957c93832c	raw: fsync method is now required commit `55ab3a1ff8` upstream. Commit `148f948ba8` (vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode) broke the raw driver. We now call through generic_file_aio_write -> generic_write_sync -> vfs_fsync_range. vfs_fsync_range has: if (!fop \|\| !fop->fsync) { ret = -EINVAL; goto out; } But drivers/char/raw.c doesn't set an fsync method. We have two options: fix it or remove the raw driver completely. I'm happy to do either, the fact this has been broken for so long suggests it is rarely used. The patch below adds an fsync method to the raw driver. My knowledge of the block layer is pretty sketchy so this could do with a once over. If we instead decide to remove the raw driver, this patch might still be useful as a backport to 2.6.33 and 2.6.32. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Tested-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Jiri Kosina	6e313203f4	HID: fix oops in gyration_event() commit `d8e4ebf8b6` upstream. Fix oops caused by dereferencing field->hidinput in cases where the device hasn't been claimed by hid-input. Reported-by: Andreas Demmer <mail@andreas-demmer.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:19 -07:00
Alan Cox	4e44a79800	pata_ali: Fix regression with old devices commit `d6250a03fa` upstream. Making the new stuff work broke some of the old chipsets. We need to go back to the old set up values for these it seems. Unfortunately even with documentation this is basically a mix of cargoculting and guesswork. Chased down to the exact line by Gianluca. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:18 -07:00
Éric Piel	6807699cad	lis3: fix show rate for 8 bits chips commit `4b5d95b380` upstream. Originally the driver was only targeted to 12bits sensors. When support for 8bits sensors was added, some slight difference in the registers were overlooked. This should fix it, both for initialization, and for displaying the rate. Reported-by: Kalhan Trisal <kalhan.trisal@intel.com> Reported-by: Christoph Plattner <christoph.plattner@gmx.at> Tested-by: Christoph Plattner <christoph.plattner@gmx.at> Tested-by: Samu Onkalo <samu.p.onkalo@nokia.com> Signed-off-by: Éric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:18 -07:00
Oleg Nesterov	e2278e63b1	tty: release_one_tty() forgets to put pids commit `6da8d866d0` upstream. release_one_tty(tty) can be called when tty still has a reference to pgrp/session. In this case we leak the pid. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Catalin Marinas <catalin.marinas@arm.com> Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:18 -07:00
Thomas Gleixner	fae08fb3f9	genirq: Force MSI irq handlers to run with interrupts disabled commit `753649dbc4` upstream. Network folks reported that directing all MSI-X vectors of their multi queue NICs to a single core can cause interrupt stack overflows when enough interrupts fire at the same time. This is caused by the fact that we run interrupt handlers by default with interrupts enabled unless the driver reuqests the interrupt with the IRQF_DISABLED set. The NIC handlers do not set this flag, so simultaneous interrupts can nest unlimited and cause the stack overflow. The only safe counter measure is to run the interrupt handlers with interrupts disabled. We can't switch to this mode in general right now, but it is safe to do so for MSI interrupts. Force IRQF_DISABLED for MSI interrupt handlers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@osdl.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: David Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:18 -07:00
Seth Heasley	e3abceb779	WATCHDOG: iTCO_wdt: TCO Watchdog patch for additional Intel Cougar Point DeviceIDs commit `4c7d849204` upstream. This patch adds the Intel Cougar Point PCH LPC Controller DeviceIDs for iTCO Watchdog. Signed-off-by: Seth Heasley <seth.heasley@intel.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:18 -07:00
Thomas Mingarelli	628f1a0323	WATCHDOG: hpwdt - fix lower timeout limit commit `8ba42bd88c` upstream. [Novell Bug 581103] HP Watchdog driver has arbitrary (wrong) timeout limits. Fix the lower timeout limit to a more appropriate value. Signed-off-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:18 -07:00
Wey-Yi Guy	a850f39d86	mac80211: tear down all agg queues when restart/reconfig hw commit `74e2bd1fa3` upstream. When there is a need to restart/reconfig hw, tear down all the aggregation queues and let the mac80211 and driver get in-sync to have the opportunity to re-establish the aggregation queues again. Need to wait until driver re-establish all the station information before tear down the aggregation queues, driver(at least iwlwifi driver) will reject the stop aggregation queue request if station is not ready. But also need to make sure the aggregation queues are tear down before waking up the queues, so mac80211 will not sending frames with aggregation bit set. Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:18 -07:00
Johannes Berg	6a45694284	mac80211: move netdev queue enabling to correct spot commit `7236fe29fd` upstream. "mac80211: fix skb buffering issue" still left a race between enabling the hardware queues and the virtual interface queues. In hindsight it's totally obvious that enabling the netdev queues for a hardware queue when the hardware queue is enabled is wrong, because it could well possible that we can fill the hw queue with packets we already have pending. Thus, we must only enable the netdev queues once all the pending packets have been processed and sent off to the device. In testing, I haven't been able to trigger this race condition, but it's clearly there, possibly only when aggregation is being enabled. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Valentin Longchamp	c7f02946f8	setup correct int pipe type in ar9170_usb_exec_cmd commit `2d20c72c02` upstream. An int urb is constructed but we fill it in with a bulk pipe type. Commit `f661c6f8c6` implemented a pipe type check when CONFIG_USB_DEBUG is enabled. The check failed for all the ar9170 usb transfers and the driver could not configure the wifi dongle. This went unnoticed until now because most people don't have CONFIG_USB_DEBUG enabled. Signed-off-by: Valentin Longchamp <valentin.longchamp@epfl.ch> Acked-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Dan Carpenter	a2266949a5	iwlwifi: range checking issue commit `8e1a53c615` upstream. IWL_RATE_COUNT is 13 and IWL_RATE_COUNT_LEGACY is 12. IWL_RATE_COUNT_LEGACY is the right one here because iwl3945_rates doesn't support 60M and also that's how "rates" is defined in iwlcore_init_geos() from drivers/net/wireless/iwlwifi/iwl-core.c. rates = kzalloc((sizeof(struct ieee80211_rate) * IWL_RATE_COUNT_LEGACY), GFP_KERNEL); Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Stanislaw Gruszka	79ae3ed145	iwlwifi: fix nfreed-- During backporting of `a120e912eb` ("iwlwifi: sanity check before counting number of tfds can be free") we forget one hunk, what make lot of messages "free more than tfds_in_queue" show up in dmesg. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Tested-by: Adel Gadllah <adel.gadllah@gmail.com> (picked from https://patchwork.kernel.org/patch/86722/) Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Wey-Yi Guy	e69f8f5303	iwlwifi: counting number of tfds can be free for 4965 commit `be6b38bcb1` upstream. Forget one hunk in 4965 during "iwlwifi: error checking for number of tfds in queue" patch. Reported-by: Shanyu Zhao <shanyu.zhao@intel.com> Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Matt Helsley	a0223a1cdb	Freezer: Fix buggy resume test for tasks frozen with cgroup freezer commit `5a7aadfe2f` upstream. When the cgroup freezer is used to freeze tasks we do not want to thaw those tasks during resume. Currently we test the cgroup freezer state of the resuming tasks to see if the cgroup is FROZEN. If so then we don't thaw the task. However, the FREEZING state also indicates that the task should remain frozen. This also avoids a problem pointed out by Oren Ladaan: the freezer state transition from FREEZING to FROZEN is updated lazily when userspace reads or writes the freezer.state file in the cgroup filesystem. This means that resume will thaw tasks in cgroups which should be in the FROZEN state if there is no read/write of the freezer.state file to trigger this transition before suspend. NOTE: Another "simple" solution would be to always update the cgroup freezer state during resume. However it's a bad choice for several reasons: Updating the cgroup freezer state is somewhat expensive because it requires walking all the tasks in the cgroup and checking if they are each frozen. Worse, this could easily make resume run in N^2 time where N is the number of tasks in the cgroup. Finally, updating the freezer state from this code path requires trickier locking because of the way locks must be ordered. Instead of updating the freezer state we rely on the fact that lazy updates only manage the transition from FREEZING to FROZEN. We know that a cgroup with the FREEZING state may actually be FROZEN so test for that state too. This makes sense in the resume path even for partially-frozen cgroups -- those that really are FREEZING but not FROZEN. Reported-by: Oren Ladaan <orenl@cs.columbia.edu> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Mike Christie	d3ce482f41	libiscsi: Fix recovery slowdown regression commit `4ae0a6c15e` upstream. We could be failing/stopping a connection due to libiscsi starting recovery/cleanup, but the xmit path or scsi eh thread path could be dropping the connection at the same time. As a result the session->state gets set to failed instead of in recovery. We end up not blocking the session and so the replacement timeout never gets started and we only end up failing the IO when scsi_softirq_done sees that the cmd has been running for (cmd->allowed + 1) * rq->timeout secs. We used to fail the IO right away so users are seeing a long delay when using dm-multipath. This problem was added in 2.6.28. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Andrew Stubbs	bbe04b45ff	sh: Fix FDPIC binary loader commit `d5ab780305` upstream. Ensure that the aux table is properly initialized, even when optional features are missing. Without this, the FDPIC loader did not work. Signed-off-by: Andrew Stubbs <ams@codesourcery.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:17 -07:00
Matt Fleming	69fd993218	sh: Enable the mmu in start_secondary() commit `4bea3418c7` upstream. For the boot, enable_mmu() is called from setup_arch() but we don't call setup_arch() for any of the other cpus. So turn on the non-boot cpu's mmu inside of start_secondary(). I noticed this bug on an SMP board when trying to map I/O memory (smsc911x registers) into the kernel address space. Since the Address Translation bit in MMUCR wasn't set, accessing the virtual address where the smsc911x registers were supposedly mapped actually performed a physical address access. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:16 -07:00
Christoph Hellwig	94aa7e9130	xfs: fix locking for inode cache radix tree tag updates commit `f1f724e4b5` upstream The radix-tree code requires it's users to serialize tag updates against other updates to the tree. While XFS protects tag updates against each other it does not serialize them against updates of the tree contents, which can lead to tag corruption. Fix the inode cache to always take pag_ici_lock in exclusive mode when updating radix tree tags. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Patrick Schreurs <patrick@news-service.com> Tested-by: Patrick Schreurs <patrick@news-service.com> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:16 -07:00
Dave Chinner	e52af5078f	xfs: Non-blocking inode locking in IO completion commit `77d7a0c2ee` upstream The introduction of barriers to loop devices has created a new IO order completion dependency that XFS does not handle. The loop device implements barriers using fsync and so turns a log IO in the XFS filesystem on the loop device into a data IO in the backing filesystem. That is, the completion of log IOs in the loop filesystem are now dependent on completion of data IO in the backing filesystem. This can cause deadlocks when a flush daemon issues a log force with an inode locked because the IO completion of IO on the inode is blocked by the inode lock. This in turn prevents further data IO completion from occuring on all XFS filesystems on that CPU (due to the shared nature of the completion queues). This then prevents the log IO from completing because the log is waiting for data IO completion as well. The fix for this new completion order dependency issue is to make the IO completion inode locking non-blocking. If the inode lock can't be grabbed, simply requeue the IO completion back to the work queue so that it can be processed later. This prevents the completion queue from being blocked and allows data IO completion on other inodes to proceed, hence avoiding completion order dependent deadlocks. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:16 -07:00
Christoph Hellwig	76962b5ed7	xfs: remove invalid barrier optimization from xfs_fsync commit `e8b217e753` upstream Date: Tue, 2 Feb 2010 10:16:26 +1100 We always need to flush the disk write cache and can't skip it just because the no inode attributes have changed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:16 -07:00
Dave Chinner	83b546b837	xfs: don't hold onto reserved blocks on remount, ro commit `cbe132a8bd` upstream If we hold onto reserved blocks when doing a remount,ro we end up writing the blocks used count to disk that includes the reserved blocks. Reserved blocks are not actually used, so this results in the values in the superblock being incorrect. Hence if we run xfs_check or xfs_repair -n while the filesystem is mounted remount,ro we end up with an inconsistent filesystem being reported. Also, running xfs_copy on the remount,ro filesystem will result in an inconsistent image being generated. To fix this, unreserve the blocks when doing the remount,ro, and reserved them again on remount,rw. This way a remount,ro filesystem will appear consistent on disk to all utilities. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:16 -07:00
Christoph Hellwig	1b8e6dfb49	xfs: quota limit statvfs available blocks commit `9b00f30762` upstream A "df" run on an NFS client of an exported XFS file system reports the wrong information for "available" blocks. When a block quota is enforced, the amount reported as free is limited by the quota, but the amount reported available is not (and should be). Reported-by: Guk-Bong, Kwon <gbkwon@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:16 -07:00
Dave Chinner	059f754c44	xfs: xfs_swap_extents needs to handle dynamic fork offsets commit `e09f98606d` upstream When swapping extents, we can corrupt inodes by swapping data forks that are in incompatible formats. This is caused by the two indoes having different fork offsets due to the presence of an attribute fork on an attr2 filesystem. xfs_fsr tries to be smart about setting the fork offset, but the trick it plays only works on attr1 (old fixed format attribute fork) filesystems. Changing the way xfs_fsr sets up the attribute fork will prevent this situation from ever occurring, so in the kernel code we can get by with a preventative fix - check that the data fork in the defragmented inode is in a format valid for the inode it is being swapped into. This will lead to files that will silently and potentially repeatedly fail defragmentation, so issue a warning to the log when this particular failure occurs to let us know that xfs_fsr needs updating/fixing. To help identify how to improve xfs_fsr to avoid this issue, add trace points for the inodes being swapped so that we can determine why the swap was rejected and to confirm that the code is making the right decisions and modifications when swapping forks. A further complication is even when the swap is allowed to proceed when the fork offset is different between the two inodes then value for the maximum number of extents the data fork can hold can be wrong. Make sure these are also set correctly after the swap occurs. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:16 -07:00
Dave Chinner	bcb56a6645	xfs: fix stale inode flush avoidance commit `4b6a46882c` upstream When reclaiming stale inodes, we need to guarantee that inodes are unpinned before returning with a "clean" status. If we don't we can reclaim inodes that are pinned, leading to use after free in the transaction subsystem as transactions complete. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Dave Chinner	9e1e9675fb	xfs: reclaim all inodes by background tree walks commit `57817c6822` upstream We cannot do direct inode reclaim without taking the flush lock to ensure that we do not reclaim an inode under IO. We check the inode is clean before doing direct reclaim, but this is not good enough because the inode flush code marks the inode clean once it has copied the in-core dirty state to the backing buffer. It is the flush lock that determines whether the inode is still under IO, even though it is marked clean, and the inode is still required at IO completion so we can't reclaim it even though it is clean in core. Hence the requirement that we need to take the flush lock even on clean inodes because this guarantees that the inode writeback IO has completed and it is safe to reclaim the inode. With delayed write inode flushing, we could end up waiting a long time on the flush lock even for a clean inode. The background reclaim already handles this efficiently, so avoid all the problems by killing the direct reclaim path altogether. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Dave Chinner	22a482c621	xfs: Avoid inodes in reclaim when flushing from inode cache commit `018027be90` upstream The reclaim code will handle flushing of dirty inodes before reclaim occurs, so avoid them when determining whether an inode is a candidate for flushing to disk when walking the radix trees. This is based on a test patch from Christoph Hellwig. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Dave Chinner	96ce91ba51	xfs: reclaim inodes under a write lock commit `c8e20be020` upstream Make the inode tree reclaim walk exclusive to avoid races with concurrent sync walkers and lookups. This is a version of a patch posted by Christoph Hellwig that avoids all the code duplication. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Dave Chinner	74a1282e07	xfs: Ensure we force all busy extents in range to disk commit `fd45e47841` upstream When we search for and find a busy extent during allocation we force the log out to ensure the extent free transaction is on disk before the allocation transaction. The current implementation has a subtle bug in it--it does not handle multiple overlapping ranges. That is, if we free lots of little extents into a single contiguous extent, then allocate the contiguous extent, the busy search code stops searching at the first extent it finds that overlaps the allocated range. It then uses the commit LSN of the transaction to force the log out to. Unfortunately, the other busy ranges might have more recent commit LSNs than the first busy extent that is found, and this results in xfs_alloc_search_busy() returning before all the extent free transactions are on disk for the range being allocated. This can lead to potential metadata corruption or stale data exposure after a crash because log replay won't replay all the extent free transactions that cover the allocation range. Modified-by: Alex Elder <aelder@sgi.com> (Dropped the "found" argument from the xfs_alloc_busysearch trace event.) Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Dave Chinner	d4203bda89	xfs: Don't flush stale inodes commit `44e08c45cc` upstream Because inodes remain in cache much longer than inode buffers do under memory pressure, we can get the situation where we have stale, dirty inodes being reclaimed but the backing storage has been freed. Hence we should never, ever flush XFS_ISTALE inodes to disk as there is no guarantee that the backing buffer is in cache and still marked stale when the flush occurs. Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Christoph Hellwig	8ae95bb907	xfs: fix timestamp handling in xfs_setattr commit `d6d59bada3` upstream We currently have some rather odd code in xfs_setattr for updating the a/c/mtime timestamps: - first we do a non-transaction update if all three are updated together - second we implicitly update the ctime for various changes instead of relying on the ATTR_CTIME flag - third we set the timestamps to the current time instead of the arguments in the iattr structure in many cases. This patch makes sure we update it in a consistent way: - always transactional - ctime is only updated if ATTR_CTIME is set or we do a size update, which is a special case - always to the times passed in from the caller instead of the current time The only non-size caller of xfs_setattr that doesn't come from the VFS is updated to set ATTR_CTIME and pass in a valid ctime value. Reported-by: Eric Blake <ebb9@byu.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Christoph Hellwig	55f4564203	xfs: check for not fully initialized inodes in xfs_ireclaim commit `b44b112627` upstream Add an assert for inodes not added to the inode cache in xfs_ireclaim, to make sure we're not going to introduce something like the famous nfsd inode cache bug again. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:15 -07:00
Jason Gunthorpe	b3e6e64f48	xfs: Fix error return for fallocate() on XFS commit `44a743f687` upstream Noticed that through glibc fallocate would return 28 rather than -1 and errno = 28 for ENOSPC. The xfs routines uses XFS_ERROR format positive return error codes while the syscalls use negative return codes. Fixup the two cases in xfs_vn_fallocate syscall to convert to negative. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:14 -07:00
Andy Poling	ce49b28e33	xfs: Wrapped journal record corruption on read at recovery commit `fc5bc4c85c` upstream Summary of problem: If a journal record wraps at the physical end of the journal, it has to be read in two parts in xlog_do_recovery_pass(): a read at the physical end and a read at the physical beginning. If xlog_bread() has to re-align the first read, the second read request does not take that re-alignment into account. If the first read was re-aligned, the second read over-writes the end of the data from the first read, effectively corrupting it. This can happen either when reading the record header or reading the record data. The first sanity check in xlog_recover_process_data() is to check for a valid clientid, so that is the error reported. Summary of fix: If there was a first read at the physical end, XFS_BUF_PTR() returns where the data was requested to begin. Conversely, because it is the result of xlog_align(), offset indicates where the requested data for the first read actually begins - whether or not xlog_bread() has re-aligned it. Using offset as the base for the calculation of where to place the second read data ensures that it will be correctly placed immediately following the data from the first read instead of sometimes over-writing the end of it. The attached patch has resolved the reported problem of occasional inability to recover the journal (reporting "bad clientid"). Signed-off-by: Andy Poling <andy@realbig.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:14 -07:00
Christoph Hellwig	fe5bdcac72	xfs: I/O completion handlers must use NOFS allocations commit `80641dc66a` upstream When completing I/O requests we must not allow the memory allocator to recurse into the filesystem, as we might deadlock on waiting for the I/O completion otherwise. The only thing currently allocating normal GFP_KERNEL memory is the allocation of the transaction structure for the unwritten extent conversion. Add a memflags argument to _xfs_trans_alloc to allow controlling the allocator behaviour. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Thomas Neumann <tneumann@users.sourceforge.net> Tested-by: Thomas Neumann <tneumann@users.sourceforge.net> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-04-26 07:41:14 -07:00

... 4 5 6 7 8 ...

170330 Commits