linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-10 04:48:04 +09:00

Author	SHA1	Message	Date
Mark Brown	40e58abd6b	Merge remote-tracking branch 'lsk/v3.10/topic/arm64-hugepages' into linux-linaro-lsk	2014-05-09 12:05:57 +01:00
Mark Brown	08ff259785	arm64: Fix build for __PAGE_NONE define Simple typo. Signed-off-by: Mark Brown <broonie@linaro.org>	2014-05-09 12:02:48 +01:00
Steve Capper	e0834c5484	arm64: mm: Add double logical invert to pte accessors Page table entries on ARM64 are 64 bits, and some pte functions such as pte_dirty return a bitwise-and of a flag with the pte value. If the flag to be tested resides in the upper 32 bits of the pte, then we run into the danger of the result being dropped if downcast. For example: gather_stats(page, md, pte_dirty(pte), 1); where pte_dirty(pte) is downcast to an int. This patch adds a double logical invert to all the pte_ accessors to ensure predictable downcasting. Signed-off-by: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:48 +01:00
Steve Capper	6d251ba203	arm64: mm: Introduce PTE_WRITE We have the following means for encoding writable or dirty ptes: PTE_DIRTY PTE_RDONLY !pte_dirty && !pte_write 0 1 !pte_dirty && pte_write 0 1 pte_dirty && !pte_write 1 1 pte_dirty && pte_write 1 0 So we can't distinguish between writable clean ptes and read only ptes. This can cause problems with ptes being incorrectly flagged as read only when they are writable but not dirty. This patch introduces a new software bit PTE_WRITE which allows us to correctly identify writable ptes. PTE_RDONLY is now only clear for valid ptes where a page is both writable and dirty. Signed-off-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/include/asm/pgtable.h	2014-05-09 12:02:48 +01:00
Steve Capper	e0934c2624	arm64: mm: Remove PTE_BIT_FUNC macro Expand out the pte manipulation functions. This makes our life easier when using things like tags and cscope. Signed-off-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:48 +01:00
Andreas Sandberg	89a6a78ec2	mm/hugetlb.c: call MMU notifiers when copying a hugetlb page range When copy_hugetlb_page_range() is called to copy a range of hugetlb mappings, the secondary MMUs are not notified if there is a protection downgrade, which breaks COW semantics in KVM. This patch adds the necessary MMU notifier calls. Signed-off-by: Andreas Sandberg <andreas@sandberg.pp.se> Acked-by: Steve Capper <steve.capper@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-05-09 12:02:47 +01:00
Steve Capper	7e5f0d41d3	arm64: mm: Fix PMD_SECT_PROT_NONE definition Modify the value of PMD_SECT_PROT_NONE to match that of PTE_NONE. This should have been in commit `3676f9ef54` (Move PTE_PROT_NONE higher up). Signed-off-by: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+: `3676f9ef54`: arm64: Move PTE_PROT_NONE higher up Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:47 +01:00
Catalin Marinas	a2b07b7b16	arm64: Move PTE_PROT_NONE higher up PTE_PROT_NONE means that a pte is present but does not have any read/write attributes. However, setting the memory type like pgprot_writecombine() is allowed and such bits overlap with PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the vma->vm_pg_prot on PROT_NONE mappings. This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit `59911ca432` (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE together with the other software bits. Signed-off-by: Steve Capper <steve.capper@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+	2014-05-09 12:02:47 +01:00
Steve Capper	a348ad67a1	ARM64: mm: THP support. Bring Transparent HugePage support to ARM. The size of a transparent huge page depends on the normal page size. A transparent huge page is always represented as a pmd. If PAGE_SIZE is 4KB, THPs are 2MB. If PAGE_SIZE is 64KB, THPs are 512MB. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:47 +01:00
Steve Capper	74c50894f3	ARM64: mm: Raise MAX_ORDER for 64KB pages and THP. The buddy allocator has a default MAX_ORDER of 11, which is too low to allocate enough memory for 512MB Transparent HugePages if our base page size is 64KB. This patch introduces MAX_ZONE_ORDER and sets it to 14 when 64KB pages are used in conjuction with THP, otherwise the default value of 11 is used. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:47 +01:00
Steve Capper	597173f474	ARM64: mm: HugeTLB support. Add huge page support to ARM64, different huge page sizes are supported depending on the size of normal pages: PAGE_SIZE is 4KB: 2MB - (pmds) these can be allocated at any time. 1024MB - (puds) usually allocated on bootup with the command line with something like: hugepagesz=1G hugepages=6 PAGE_SIZE is 64KB: 512MB - (pmds) usually allocated on bootup via command line. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:47 +01:00
Steve Capper	7308560ac7	ARM64: mm: Move PTE_PROT_NONE bit. Under ARM64, PTEs can be broadly categorised as follows: - Present and valid: Bit #0 is set. The PTE is valid and memory access to the region may fault. - Present and invalid: Bit #0 is clear and bit #1 is set. Represents present memory with PROT_NONE protection. The PTE is an invalid entry, and the user fault handler will raise a SIGSEGV. - Not present (file or swap): Bits #0 and #1 are clear. Memory represented has been paged out. The PTE is an invalid entry, and the fault handler will try and re-populate the memory where necessary. Huge PTEs are block descriptors that have bit #1 clear. If we wish to represent PROT_NONE huge PTEs we then run into a problem as there is no way to distinguish between regular and huge PTEs if we set bit #1. To resolve this ambiguity this patch moves PTE_PROT_NONE from bit #1 to bit #2 and moves PTE_FILE from bit #2 to bit #3. The number of swap/file bits is reduced by 1 as a consequence, leaving 60 bits for file and swap entries. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:47 +01:00
Steve Capper	5ea0c51491	ARM64: mm: Make PAGE_NONE pages read only and no-execute. If we consider the following code sequence: my_pte = pte_modify(entry, myprot); x = pte_write(my_pte); y = pte_exec(my_pte); If myprot comes from a PROT_NONE page, then x and y will both be true which is undesireable behaviour. This patch sets the no-execute and read-only bits for PAGE_NONE such that the code above will return false for both x and y. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:47 +01:00
Steve Capper	f45009d071	ARM64: mm: Restore memblock limit when map_mem finished. In paging_init the memblock limit is set to restrict any addresses returned by early_alloc to fit within the initial direct kernel mapping in swapper_pg_dir. This allows map_mem to allocate puds, pmds and ptes from the initial direct kernel mapping. The limit stays low after paging_init() though, meaning any bootmem allocations will be from a restricted subset of memory. Gigabyte huge pages, for instance, are normally allocated from bootmem as their order (18) is too large for the default buddy allocator (MAX_ORDER = 11). This patch restores the memblock limit when map_mem has finished, allowing gigabyte huge pages (and other objects) to be allocated from all of bootmem. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 12:02:47 +01:00
Steve Capper	1d85bc6282	mm: thp: Correct the HPAGE_PMD_ORDER check. All Transparent Huge Pages are allocated by the buddy allocator. A compile time check is in place that fails when the order of a transparent huge page is too large to be allocated by the buddy allocator. Unfortunately that compile time check passes when: HPAGE_PMD_ORDER == MAX_ORDER ( which is incorrect as the buddy allocator can only allocate memory of order strictly less than MAX_ORDER. ) This patch updates the compile time check to fail in the above case. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>	2014-05-09 12:02:47 +01:00
Steve Capper	66ea6bbaf0	x86: mm: Remove general hugetlb code from x86. huge_pte_alloc, huge_pte_offset and follow_huge_p[mu]d have already been copied over to mm. This patch removes the x86 copies of these functions and activates the general ones by enabling: CONFIG_ARCH_WANT_GENERAL_HUGETLB Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>	2014-05-09 12:02:46 +01:00
Steve Capper	ba47dd3ca2	mm: hugetlb: Copy general hugetlb code from x86 to mm. The huge_pte_alloc, huge_pte_offset and follow_huge_p[mu]d functions in x86/mm/hugetlbpage.c do not rely on any architecture specific knowledge other than the fact that pmds and puds can be treated as huge ptes. To allow other architectures to use this code (and reduce the need for code duplication), this patch copies these functions into mm, replaces the use of pud_large with pud_huge and provides a config flag to activate them: CONFIG_ARCH_WANT_GENERAL_HUGETLB If CONFIG_ARCH_WANT_HUGE_PMD_SHARE is also active then the huge_pmd_share code will be called by huge_pte_alloc (othewise we call pmd_alloc and skip the sharing code). Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>	2014-05-09 12:02:46 +01:00
Steve Capper	4daa1bbe44	x86: mm: Remove x86 version of huge_pmd_share. The huge_pmd_share code has been copied over to mm/hugetlb.c to make it accessible to other architectures. Remove the x86 copy of the huge_pmd_share code and enable the ARCH_WANT_HUGE_PMD_SHARE config flag. That way we reference the general one. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>	2014-05-09 12:02:46 +01:00
Steve Capper	5f525ac6a4	mm: hugetlb: Copy huge_pmd_share from x86 to mm. Under x86, multiple puds can be made to reference the same bank of huge pmds provided that they represent a full PUD_SIZE of shared huge memory that is aligned to a PUD_SIZE boundary. The code to share pmds does not require any architecture specific knowledge other than the fact that pmds can be indexed, thus can be beneficial to some other architectures. This patch copies the huge pmd sharing (and unsharing) logic from x86/ to mm/ and introduces a new config option to activate it: CONFIG_ARCH_WANTS_HUGE_PMD_SHARE Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>	2014-05-09 12:02:46 +01:00
Mark Brown	8810719ecb	Merge remote-tracking branch 'lsk/v3.10/topic/juno' into linux-linaro-lsk Conflicts: arch/arm64/boot/dts/Makefile	2014-05-08 12:11:25 +01:00
Mark Brown	9b92dfc49a	Merge remote-tracking branch 'lsk/v3.10/topic/misc' into linux-linaro-lsk	2014-05-08 12:11:07 +01:00
Mark Brown	17b540de36	dtc: Use general include directory Since newer DT bindings include references to include/dt-bindings we need to make this available to build DTs using them. Upstream has a number of reworkings which are much more invasive but featureful, just include a minimal fix. Signed-off-by: Mark Brown <broonie@linaro.org>	2014-05-08 12:10:16 +01:00
Liviu Dudau	6c658548ec	arm64: Add Juno platform support [Squashed down several commits from dev repository, reused vexpress config option -- broonie] Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com> Signed-off-by: Mark Brown <broonie@linaro.org>	2014-05-08 11:07:25 +01:00
Mark Brown	4b898aa532	Merge remote-tracking branch 'lsk/v3.10/topic/configs' into linux-linaro-lsk	2014-05-08 10:55:10 +01:00
Anders Roxell	fa1090cdaf	linaro/configs: add fragment preempt-rt Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org>	2014-05-08 10:54:03 +01:00
Mark Brown	2cae6ea4bb	Merge remote-tracking branch 'lsk/v3.10/topic/big.LITTLE' into linux-linaro-lsk	2014-05-07 14:55:58 +01:00
Mark Brown	970a8e6b53	Merge branch 'for-lsk' of git://git.linaro.org/arm/big.LITTLE/mp into lsk-v3.10-big.LITTLE	2014-05-07 14:52:11 +01:00
Chris Redpath	1ade57e54e	sched: hmp: Change small task packing defaults for all platforms All platforms other than TC2 default to enabling packing. Since TC2 shows no performance or energy degradation with this feature enabled make it default enabled the same as everyone else. Likewise, vendors have been including TC2 support in multi-machine kernel builds so they expect the default thresholds to remain the same when the TC2 #ifdef is removed. Signed-off-by: Chris Redpath <chris.redpath@arm.com> Signed-off-by: Jon Medhurst <tixy@linaro.org>	2014-05-07 11:34:00 +01:00
Mark Brown	03b1200275	Merge tag 'v3.10.39' into linux-linaro-lsk This is the 3.10.39 stable release	2014-05-07 09:50:01 +01:00
Greg Kroah-Hartman	5d897eedc5	Linux 3.10.39 v3.10.39	2014-05-06 07:56:24 -07:00
Felipe Balbi	6b32172a1d	usb: musb: avoid NULL pointer dereference commit `eee3f15d5f` upstream. instead of relying on the otg pointer, which can be NULL in certain cases, we can use the gadget and host pointers we already hold inside struct musb. Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:33 -07:00
Aaron Sanders	0bf463e622	USB: pl2303: add ids for Hewlett-Packard HP POS pole displays commit `b16c02fbfb` upstream. Add device ids to pl2303 for the Hewlett-Packard HP POS pole displays: LD960: 03f0:0B39 LCM220: 03f0:3139 LCM960: 03f0:3239 [ Johan: fix indentation and sort PIDs numerically ] Signed-off-by: Aaron Sanders <aaron.sanders@hp.com> Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:33 -07:00
Theodore Ts'o	fcc742c294	ext4: use i_size_read in ext4_unaligned_aio() commit `6e6358fc3c` upstream. We haven't taken i_mutex yet, so we need to use i_size_read(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Jan Kara	f78e2d2db6	ext4: fix jbd2 warning under heavy xattr load commit `ec4cb1aa2b` upstream. When heavily exercising xattr code the assertion that jbd2_journal_dirty_metadata() shouldn't return error was triggered: WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/jbd2/transaction.c:1237 jbd2_journal_dirty_metadata+0x1ba/0x260() CPU: 0 PID: 8877 Comm: ceph-osd Tainted: G W 3.10.0-ceph-00049-g68d04c9 #1 Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011 ffffffff81a1d3c8 ffff880214469928 ffffffff816311b0 ffff880214469968 ffffffff8103fae0 ffff880214469958 ffff880170a9dc30 ffff8802240fbe80 0000000000000000 ffff88020b366000 ffff8802256e7510 ffff880214469978 Call Trace: [<ffffffff816311b0>] dump_stack+0x19/0x1b [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0 [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20 [<ffffffff81267c2a>] jbd2_journal_dirty_metadata+0x1ba/0x260 [<ffffffff81245093>] __ext4_handle_dirty_metadata+0xa3/0x140 [<ffffffff812561f3>] ext4_xattr_release_block+0x103/0x1f0 [<ffffffff81256680>] ext4_xattr_block_set+0x1e0/0x910 [<ffffffff8125795b>] ext4_xattr_set_handle+0x38b/0x4a0 [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff81257b32>] ext4_xattr_set+0xc2/0x140 [<ffffffff81258547>] ext4_xattr_user_set+0x47/0x50 [<ffffffff811935ce>] generic_setxattr+0x6e/0x90 [<ffffffff81193ecb>] __vfs_setxattr_noperm+0x7b/0x1c0 [<ffffffff811940d4>] vfs_setxattr+0xc4/0xd0 [<ffffffff8119421e>] setxattr+0x13e/0x1e0 [<ffffffff811719c7>] ? __sb_start_write+0xe7/0x1b0 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60 [<ffffffff8118c65c>] ? fget_light+0x3c/0x130 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60 [<ffffffff8118f1f8>] ? __mnt_want_write+0x58/0x70 [<ffffffff811946be>] SyS_fsetxattr+0xbe/0x100 [<ffffffff816407c2>] system_call_fastpath+0x16/0x1b The reason for the warning is that buffer_head passed into jbd2_journal_dirty_metadata() didn't have journal_head attached. This is caused by the following race of two ext4_xattr_release_block() calls: CPU1 CPU2 ext4_xattr_release_block() ext4_xattr_release_block() lock_buffer(bh); /* False / if (BHDR(bh)->h_refcount == cpu_to_le32(1)) } else { le32_add_cpu(&BHDR(bh)->h_refcount, -1); unlock_buffer(bh); lock_buffer(bh); / True */ if (BHDR(bh)->h_refcount == cpu_to_le32(1)) get_bh(bh); ext4_free_blocks() ... jbd2_journal_forget() jbd2_journal_unfile_buffer() -> JH is gone error = ext4_handle_dirty_xattr_block(handle, inode, bh); -> triggers the warning We fix the problem by moving ext4_handle_dirty_xattr_block() under the buffer lock. Sadly this cannot be done in nojournal mode as that function can call sync_dirty_buffer() which would deadlock. Luckily in nojournal mode the race is harmless (we only dirty already freed buffer) and thus for nojournal mode we leave the dirtying outside of the buffer lock. Reported-by: Sage Weil <sage@inktank.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
alex chen	5da43e877e	ocfs2: do not put bh when buffer_uptodate failed commit `f7cf4f5bfe` upstream. Do not put bh when buffer_uptodate failed in ocfs2_write_block and ocfs2_write_super_or_backup, because it will put bh in b_end_io. Otherwise it will hit a warning "VFS: brelse: Trying to free free buffer". Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Junxiao Bi	a093bceda6	ocfs2: dlm: fix recovery hung commit `ded2cf7141` upstream. There is a race window in dlm_do_recovery() between dlm_remaster_locks() and dlm_reset_recovery() when the recovery master nearly finish the recovery process for a dead node. After the master sends FINALIZE_RECO message in dlm_remaster_locks(), another node may become the recovery master for another dead node, and then send the BEGIN_RECO message to all the nodes included the old master, in the handler of this message dlm_begin_reco_handler() of old master, dlm->reco.dead_node and dlm->reco.new_master will be set to the second dead node and the new master, then in dlm_reset_recovery(), these two variables will be reset to default value. This will cause new recovery master can not finish the recovery process and hung, at last the whole cluster will hung for recovery. old recovery master: new recovery master: dlm_remaster_locks() become recovery master for another dead node. dlm_send_begin_reco_message() dlm_begin_reco_handler() { if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) { return -EAGAIN; } dlm_set_reco_master(dlm, br->node_idx); dlm_set_reco_dead_node(dlm, br->dead_node); } dlm_reset_recovery() { dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM); dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM); } will hang in dlm_remaster_locks() for request dlm locks info Before send FINALIZE_RECO message, recovery master should set DLM_RECO_STATE_FINALIZE for itself and clear it after the recovery done, this can break the race windows as the BEGIN_RECO messages will not be handled before DLM_RECO_STATE_FINALIZE flag is cleared. A similar race may happen between new recovery master and normal node which is in dlm_finalize_reco_handler(), also fix it. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Junxiao Bi	261985c439	ocfs2: dlm: fix lock migration crash commit `34aa8dac48` upstream. This issue was introduced by commit `800deef3f6` ("ocfs2: use list_for_each_entry where benefical") in 2007 where it replaced list_for_each with list_for_each_entry. The variable "lock" will point to invalid data if "tmpq" list is empty and a panic will be triggered due to this. Sunil advised reverting it back, but the old version was also not right. At the end of the outer for loop, that list_for_each_entry will also set "lock" to an invalid data, then in the next loop, if the "tmpq" list is empty, "lock" will be an stale invalid data and cause the panic. So reverting the list_for_each back and reset "lock" to NULL to fix this issue. Another concern is that this seemes can not happen because the "tmpq" list should not be empty. Let me describe how. old lock resource owner(node 1): migratation target(node 2): image there's lockres with a EX lock from node 2 in granted list, a NR lock from node x with convert_type EX in converting list. dlm_empty_lockres() { dlm_pick_migration_target() { pick node 2 as target as its lock is the first one in granted list. } dlm_migrate_lockres() { dlm_mark_lockres_migrating() { res->state \|= DLM_LOCK_RES_BLOCK_DIRTY; wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res)); //after the above code, we can not dirty lockres any more, // so dlm_thread shuffle list will not run downconvert lock from EX to NR upconvert lock from NR to EX <<< migration may schedule out here, then <<< node 2 send down convert request to convert type from EX to <<< NR, then send up convert request to convert type from NR to <<< EX, at this time, lockres granted list is empty, and two locks <<< in the converting list, node x up convert lock followed by <<< node 2 up convert lock. // will set lockres RES_MIGRATING flag, the following // lock/unlock can not run dlm_lockres_release_ast(dlm, res); } dlm_send_one_lockres() dlm_process_recovery_data() for (i=0; i<mres->num_locks; i++) if (ml->node == dlm->node_num) for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) { list_for_each_entry(lock, tmpq, list) if (lock) break; <<< lock is invalid as grant list is empty. } if (lock->ml.node != ml->node) BUG() >>> crash here } I see the above locks status from a vmcore of our internal bug. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Liu Hua	82c647cbfc	hung_task: check the value of "sysctl_hung_task_timeout_sec" commit `80df284765` upstream. As sysctl_hung_task_timeout_sec is unsigned long, when this value is larger then LONG_MAX/HZ, the function schedule_timeout_interruptible in watchdog will return immediately without sleep and with print : schedule_timeout: wrong timeout value ffffffffffffff83 and then the funtion watchdog will call schedule_timeout_interruptible again and again. The screen will be filled with "schedule_timeout: wrong timeout value ffffffffffffff83" This patch does some check and correction in sysctl, to let the function schedule_timeout_interruptible allways get the valid parameter. Signed-off-by: Liu Hua <sdu.liu@huawei.com> Tested-by: Satoru Takeuchi <satoru.takeuchi@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Mizuma, Masayoshi	733ad2dce4	mm: hugetlb: fix softlockup when a large number of hugepages are freed. commit `55f67141a8` upstream. When I decrease the value of nr_hugepage in procfs a lot, softlockup happens. It is because there is no chance of context switch during this process. On the other hand, when I allocate a large number of hugepages, there is some chance of context switch. Hence softlockup doesn't happen during this process. So it's necessary to add the context switch in the freeing process as same as allocating process to avoid softlockup. When I freed 12 TB hugapages with kernel-2.6.32-358.el6, the freeing process occupied a CPU over 150 seconds and following softlockup message appeared twice or more. $ echo 6000000 > /proc/sys/vm/nr_hugepages $ cat /proc/sys/vm/nr_hugepages 6000000 $ grep ^Huge /proc/meminfo HugePages_Total: 6000000 HugePages_Free: 6000000 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB $ echo 0 > /proc/sys/vm/nr_hugepages BUG: soft lockup - CPU#16 stuck for 67s! [sh:12883] ... Pid: 12883, comm: sh Not tainted 2.6.32-358.el6.x86_64 #1 Call Trace: free_pool_huge_page+0xb8/0xd0 set_max_huge_pages+0x128/0x190 hugetlb_sysctl_handler_common+0x113/0x140 hugetlb_sysctl_handler+0x1e/0x20 proc_sys_call_handler+0x97/0xd0 proc_sys_write+0x14/0x20 vfs_write+0xb8/0x1a0 sys_write+0x51/0x90 __audit_syscall_exit+0x265/0x290 system_call_fastpath+0x16/0x1b I have not confirmed this problem with upstream kernels because I am not able to prepare the machine equipped with 12TB memory now. However I confirmed that the amount of decreasing hugepages was directly proportional to the amount of required time. I measured required times on a smaller machine. It showed 130-145 hugepages decreased in a millisecond. Amount of decreasing Required time Decreasing rate hugepages (msec) (pages/msec) ------------------------------------------------------------ 10,000 pages == 20GB 70 - 74 135-142 30,000 pages == 60GB 208 - 229 131-144 It means decrement of 6TB hugepages will trigger softlockup with the default threshold 20sec, in this decreasing rate. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Vlastimil Babka	77552735ba	mm: try_to_unmap_cluster() should lock_page() before mlocking commit `57e68e9cd6` upstream. A BUG_ON(!PageLocked) was triggered in mlock_vma_page() by Sasha Levin fuzzing with trinity. The call site try_to_unmap_cluster() does not lock the pages other than its check_page parameter (which is already locked). The BUG_ON in mlock_vma_page() is not documented and its purpose is somewhat unclear, but apparently it serializes against page migration, which could otherwise fail to transfer the PG_mlocked flag. This would not be fatal, as the page would be eventually encountered again, but NR_MLOCK accounting would become distorted nevertheless. This patch adds a comment to the BUG_ON in mlock_vma_page() and munlock_vma_page() to that effect. The call site try_to_unmap_cluster() is fixed so that for page != check_page, trylock_page() is attempted (to avoid possible deadlocks as we already have check_page locked) and mlock_vma_page() is performed only upon success. If the page lock cannot be obtained, the page is left without PG_mlocked, which is again not a problem in the whole unevictable memory design. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Bob Liu <bob.liu@oracle.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Matt Fleming	9e9c7bb483	sh: fix format string bug in stack tracer commit `a0c32761e7` upstream. Kees reported the following error: arch/sh/kernel/dumpstack.c: In function 'print_trace_address': arch/sh/kernel/dumpstack.c:118:2: error: format not a string literal and no format arguments [-Werror=format-security] Use the "%s" format so that it's impossible to interpret 'data' as a format string. Signed-off-by: Matt Fleming <matt.fleming@intel.com> Reported-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Felipe Franciosi	981752df71	mtip32xx: Set queue bounce limit commit `1044b1bb92` upstream. We need to set the queue bounce limit during the device initialization to prevent excessive bouncing on 32 bit architectures. Signed-off-by: Felipe Franciosi <felipe@paradoxo.org> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Alan Stern	59f0103d74	USB: unbind all interfaces before rebinding any commit `6aec044cc2` upstream. When a driver doesn't have pre_reset, post_reset, or reset_resume methods, the USB core unbinds that driver when its device undergoes a reset or a reset-resume, and then rebinds it afterward. The existing straightforward implementation can lead to problems, because each interface gets unbound and rebound before the next interface is handled. If a driver claims additional interfaces, the claim may fail because the old binding instance may still own the additional interface when the new instance tries to claim it. This patch fixes the problem by first unbinding all the interfaces that are marked (i.e., their needs_binding flag is set) and then rebinding all of them. The patch also makes the helper functions in driver.c a little more uniform and adjusts some out-of-date comments. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: "Poulain, Loic" <loic.poulain@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Michal Simek	fe7f9f59b7	usb: phy: Add ulpi IDs for SMSC USB3320 and TI TUSB1210 commit `ead5178bf4` upstream. Add new ulpi IDs which are available on Xilinx Zynq boards. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Paul Gortmaker	94804de804	hvc: ensure hvc_init is only ever called once in hvc_console.c commit `f76a1cbed1` upstream. Commit `3e6c6f630a` ("Delay creation of khcvd thread") moved the call of hvc_init from being a device_initcall into hvc_alloc, and used a non-null hvc_driver as indication of whether hvc_init had already been called. The problem with this is that hvc_driver is only assigned a value at the bottom of hvc_init, and so there is a window where multiple hvc_alloc calls can be in progress at the same time and hence try and call hvc_init multiple times. Previously the use of device_init guaranteed that hvc_init was only called once. This manifests itself as sporadic instances of two hvc_init calls racing each other, and with the loser of the race getting -EBUSY from tty_register_driver() and hence that virtual console fails: Couldn't register hvc console driver virtio-ports vport0p1: error -16 allocating hvc for port Here we add an atomic_t to guarantee we'll never run hvc_init twice. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: `3e6c6f630a` ("Delay creation of khcvd thread") Reported-by: Jim Somerville <Jim.Somerville@windriver.com> Tested-by: Jim Somerville <Jim.Somerville@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:32 -07:00
Huang Rui	aede1b94f8	usb: dwc3: fix wrong bit mask in dwc3_event_devt commit `06f9b6e596` upstream. Around DWC USB3 2.30a release another bit has been added to the Device-Specific Event (DEVT) Event Information (EvtInfo) bitfield. Because of that, what used to be 8 bits long, has become 9 bits long. Per dwc3 2.30a+ spec in the Device-Specific Event (DEVT), the field of Event Information Bits(EvtInfo) uses [24:16] bits, and it has 9 bits not 8 bits. And the following reserved field uses [31:25] bits not [31:24] bits, and it has 7 bits. So in dwc3_event_devt, the bit mask should be: event_info [24:16] 9 bits reserved31_25 [31:25] 7 bits This patch makes sure that newer core releases will work fine with Linux and that we will decode the event information properly on new core releases. [ balbi@ti.com : improve commit log a bit ] Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:31 -07:00
Wolfram Sang	2fdb8dd78f	media: media: gspca: sn9c20x: add ID for Genius Look 1320 V2 commit `61f0319193` upstream. Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:31 -07:00
Florian Vaussard	fe529a27f1	media: omap3isp: preview: Fix the crop margins commit `8b57b9669a` upstream. Commit `3fdfedaaa` "[media] omap3isp: preview: Lower the crop margins" accidentally changed the previewer's cropping, causing the previewer to miss four pixels on each line, thus corrupting the final image. Restored the removed setting. Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:31 -07:00
Hans Verkuil	6671cbfc06	media: saa7134: fix WARN_ON during resume commit `30d652823d` upstream. Do not attempt to reload the tuner modules when resuming after a suspend. This triggers a WARN_ON in kernel/kmod.c:148 __request_module. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=69581. This has always been wrong, but it was never noticed until the WARN_ON was added in 3.9. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:31 -07:00
Antti Palosaari	57b03c6ecd	media: em28xx: fix PCTV 290e LNA oops commit `3ec40dcfb4` upstream. Pointer to device state has been moved to different location during some change. PCTV 290e LNA function still uses old pointer, carried over FE priv, and it crash. Reported-by: Janne Kujanpää <jikuja@iki.fi> Signed-off-by: Antti Palosaari <crope@iki.fi> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-06 07:55:31 -07:00

1 2 3 4 5 ...

380757 Commits