linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 02:21:52 +09:00

Author	SHA1	Message	Date
Petr Vorel	cd2399b49d	ima: Fallback to the builtin hash algorithm [ Upstream commit `ab60368ab6` ] IMA requires having it's hash algorithm be compiled-in due to it's early use. The default IMA algorithm is protected by Kconfig to be compiled-in. The ima_hash kernel parameter allows to choose the hash algorithm. When the specified algorithm is not available or available as a module, IMA initialization fails, which leads to a kernel panic (mknodat syscall calls ima_post_path_mknod()). Therefore as fallback we force IMA to use the default builtin Kconfig hash algorithm. Fixed crash: $ grep CONFIG_CRYPTO_MD4 .config CONFIG_CRYPTO_MD4=m [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.12.14-2.3-default root=UUID=74ae8202-9ca7-4e39-813b-22287ec52f7a video=1024x768-16 plymouth.ignore-serial-consoles console=ttyS0 console=tty resume=/dev/disk/by-path/pci-0000:00:07.0-part3 splash=silent showopts ima_hash=md4 ... [ 1.545190] ima: Can not allocate md4 (reason: -2) ... [ 2.610120] BUG: unable to handle kernel NULL pointer dereference at (null) [ 2.611903] IP: ima_match_policy+0x23/0x390 [ 2.612967] PGD 0 P4D 0 [ 2.613080] Oops: 0000 [#1] SMP [ 2.613080] Modules linked in: autofs4 [ 2.613080] Supported: Yes [ 2.613080] CPU: 0 PID: 1 Comm: systemd Not tainted 4.12.14-2.3-default #1 [ 2.613080] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 [ 2.613080] task: ffff88003e2d0040 task.stack: ffffc90000190000 [ 2.613080] RIP: 0010:ima_match_policy+0x23/0x390 [ 2.613080] RSP: 0018:ffffc90000193e88 EFLAGS: 00010296 [ 2.613080] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000004 [ 2.613080] RDX: 0000000000000010 RSI: 0000000000000001 RDI: ffff880037071728 [ 2.613080] RBP: 0000000000008000 R08: 0000000000000000 R09: 0000000000000000 [ 2.613080] R10: 0000000000000008 R11: 61c8864680b583eb R12: 00005580ff10086f [ 2.613080] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000008000 [ 2.613080] FS: 00007f5c1da08940(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 [ 2.613080] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.613080] CR2: 0000000000000000 CR3: 0000000037002000 CR4: 00000000003406f0 [ 2.613080] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2.613080] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2.613080] Call Trace: [ 2.613080] ? shmem_mknod+0xbf/0xd0 [ 2.613080] ima_post_path_mknod+0x1c/0x40 [ 2.613080] SyS_mknod+0x210/0x220 [ 2.613080] entry_SYSCALL_64_fastpath+0x1a/0xa5 [ 2.613080] RIP: 0033:0x7f5c1bfde570 [ 2.613080] RSP: 002b:00007ffde1c90dc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000085 [ 2.613080] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c1bfde570 [ 2.613080] RDX: 0000000000000000 RSI: 0000000000008000 RDI: 00005580ff10086f [ 2.613080] RBP: 00007ffde1c91040 R08: 00005580ff10086f R09: 0000000000000000 [ 2.613080] R10: 0000000000104000 R11: 0000000000000246 R12: 00005580ffb99660 [ 2.613080] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000002 [ 2.613080] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56 44 8d 14 09 41 55 41 54 55 53 44 89 d3 09 cb 48 83 ec 38 48 8b 05 c5 03 29 01 <4c> 8b 20 4c 39 e0 0f 84 d7 01 00 00 4c 89 44 24 08 89 54 24 20 [ 2.613080] RIP: ima_match_policy+0x23/0x390 RSP: ffffc90000193e88 [ 2.613080] CR2: 0000000000000000 [ 2.613080] ---[ end trace 9a9f0a8a73079f6a ]--- [ 2.673052] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [ 2.673052] [ 2.675337] Kernel Offset: disabled [ 2.676405] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 Signed-off-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:29 +02:00
Jiandi An	bc72e4fcc1	ima: Fix Kconfig to select TPM 2.0 CRB interface [ Upstream commit `fac37c628f` ] TPM_CRB driver provides TPM CRB 2.0 support. If it is built as a module, the TPM chip is registered after IMA init. tpm_pcr_read() in IMA fails and displays the following message even though eventually there is a TPM chip on the system. ima: No TPM chip found, activating TPM-bypass! (rc=-19) Fix IMA Kconfig to select TPM_CRB so TPM_CRB driver is built in the kernel and initializes before IMA. Signed-off-by: Jiandi An <anjiandi@codeaurora.org> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:29 +02:00
Arjun Vynipadath	d7b13824c3	cxgb4: Setup FW queues before registering netdev [ Upstream commit `843bd7db79` ] When NetworkManager is enabled, there are chances that interface up is called even before probe completes. This means we have not yet allocated the FW sge queues, hence rest of ingress queue allocation wont be proper. Fix this by calling setup_fw_sge_queues() before register_netdev(). Fixes: `0fbc81b3ad` ('chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's') Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:29 +02:00
Sebastian Gottschall	aa5a781f59	ath9k: fix crash in spectral scan [ Upstream commit `221b6ec69e` ] Fixes crash seen on arm smp systems (gateworks ventana imx6): Unable to handle kernel NULL pointer dereference at virtual address 00000014 pgd = 80004000 [00000014] *pgd=00000000 Internal error: Oops - BUG: 17 [#1] PREEMPT SMP ARM Modules linked in: ip6table_filter nf_conntrack_ipv6 ip6_tables nf_log_ipv6 nf_defrag_ipv6 shortcut_fe ipcomp6 xfrm_ipcomp xfrm6_tunnel xfrm6_mode_tunnel xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet ip6_tunnel tunnel6 mip6 ah6 esp6 xfrm_algo sit ip_tunnel tunnel4 ipv6 ath10k_pci ath10k_core ath9k ath mac80211 cfg80211 compat ath_pci ath_hal(P) caamalg authencesn authenc caamrng caamhash caam_jr caam cdc_ncm usbnet usbcore sky2 imx2_wdt CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: P 4.9.85 #19 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) task: bf064980 task.stack: bf07c000 PC is at relay_buf_full+0xc/0x30 LR is at _674+0x740/0xf10 [ath9k] pc : [<8018bce0>] lr : [<7f1aa604>] psr: 80000013 sp : bf07dbf0 ip : bf07dc00 fp : bf07dbfc r10: 0000003f r9 : bf130e00 r8 : 809044b0 r7 : 00000000 r6 : be67a9f0 r5 : 00000000 r4 : 809043e4 r3 : c0864c24 r2 : 00000000 r1 : 00000004 r0 : 00000000 Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 4e6a004a DAC: 00000055 Process ksoftirqd/0 (pid: 3, stack limit = 0xbf07c210) Stack: (0xbf07dbf0 to 0xbf07e000) dbe0: bf07dd04 bf07dc00 7f1aa604 8018bce0 dc00: 00004014 be59e010 bf07dc34 bf07dc18 7f1a7084 7f19c07c be59c010 be6470a0 dc20: 0000096c be648954 bf07dc6c bf07dc38 7f1c286c bf07dd90 bf07dc5c bf07dc48 dc40: 8029ea4c 0000003c 00000001 be59c010 00000094 00000000 00000000 00000000 dc60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 dc80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 dca0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 dcc0: 00000000 00000000 00000000 00000000 00000000 00000000 8010ef24 00000030 dce0: be94f5e8 be6485a0 bddf0200 be59c010 be6465a0 be6415a0 bf07ddf4 bf07dd08 dd00: 7f1cf800 7f1aa55c 1fc38c4c 00000000 bf07dd58 cccccccd 66666667 be640bc0 dd20: bf07dd54 be6415a0 1fc38c4c 00000000 00000000 be59c038 be67a9c0 be59e010 dd40: be67a9f0 be647170 8090c904 be59c010 00000000 00000001 1fc38e84 00000000 dd60: be640bc0 bddf0200 00000200 00000010 0000003f 00000002 20000013 be59c010 dd80: 8092d940 bf7ca2c0 bf07ddb4 bf07dd98 1fc38c4c 2602003f 0100ff1b 80ff1b00 dda0: 00808080 00000000 00000000 80808080 80808080 80808080 80808080 00008080 ddc0: 00000000 00000000 7f1b62b8 00000002 be6470ec be6470f0 00000000 bf07de98 dde0: 8092d940 be6415a0 bf07de94 bf07ddf8 7f1d1ed8 7f1cf1fc 00000000 00000000 de00: bf7cc4c0 00000400 be6470f0 bf07de18 8015165c be59c010 8090453c 8090453c de20: bf07dec4 be6465a0 8014f614 80148884 0000619a 00000001 bf07c000 00000100 de40: bf07de78 00000001 7f327850 00000002 afb50401 bf064980 bf07de9c bf07de68 de60: bf064a00 803cc668 bf064a00 be6470b4 be6470b8 80844180 00000000 bf07de98 de80: 8092d940 bf07c000 bf07dec4 bf07de98 80124d18 7f1d1c44 80124c94 00000000 dea0: 00000006 80902098 80902080 40000006 00000100 bf07c000 bf07df24 bf07dec8 dec0: 8012501c 80124ca0 bf7cc4c0 bf064980 be95e1c0 04208040 80902d00 000061c7 dee0: 0000000a 80600b54 8092d940 808441f8 80902080 bf07dec8 bf03b200 bf07c000 df00: bf03b200 8090fe54 00000000 00000000 00000000 00000000 bf07df34 bf07df28 df20: 80125148 80124f28 bf07df5c bf07df38 8013deb4 8012511c 00000000 bf03b240 df40: bf03b200 8013dc90 00000000 00000000 bf07dfac bf07df60 8013ad40 8013dc9c df60: 70448040 00000001 00000000 bf03b200 00000000 00030003 bf07df78 bf07df78 df80: 00000000 00000000 bf07df88 bf07df88 bf03b240 8013ac48 00000000 00000000 dfa0: 00000000 bf07dfb0 80107760 8013ac54 00000000 00000000 00000000 00000000 dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 8c120004 1190ad04 Backtrace: [<8018bcd4>] (relay_buf_full) from [<7f1aa604>] (_674+0x740/0xf10 [ath9k]) [<7f1aa550>] (_674 [ath9k]) from [<7f1cf800>] (_582+0x14b4/0x3708 [ath9k]) r10:be6415a0 r9:be6465a0 r8:be59c010 r7:bddf0200 r6:be6485a0 r5:be94f5e8 r4:00000030 [<7f1cf1f0>] (_582 [ath9k]) from [<7f1d1ed8>] (_735+0x2a0/0xec4 [ath9k]) r10:be6415a0 r9:8092d940 r8:bf07de98 r7:00000000 r6:be6470f0 r5:be6470ec r4:00000002 [<7f1d1c38>] (_735 [ath9k]) from [<80124d18>] (tasklet_action+0x84/0xf8) r10:bf07c000 r9:8092d940 r8:bf07de98 r7:00000000 r6:80844180 r5:be6470b8 r4:be6470b4 [<80124c94>] (tasklet_action) from [<8012501c>] (__do_softirq+0x100/0x1f4) r10:bf07c000 r9:00000100 r8:40000006 r7:80902080 r6:80902098 r5:00000006 r4:00000000 r3:80124c94 [<80124f1c>] (__do_softirq) from [<80125148>] (run_ksoftirqd+0x38/0x4c) r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:8090fe54 r5:bf03b200 r4:bf07c000 [<80125110>] (run_ksoftirqd) from [<8013deb4>] (smpboot_thread_fn+0x224/0x260) [<8013dc90>] (smpboot_thread_fn) from [<8013ad40>] (kthread+0xf8/0x100) r9:00000000 r8:00000000 r7:8013dc90 r6:bf03b200 r5:bf03b240 r4:00000000 [<8013ac48>] (kthread) from [<80107760>] (ret_from_fork+0x14/0x34) r7:00000000 r6:00000000 r5:8013ac48 r4:bf03b240 Code: e89da800 e1a0c00d e92dd800 e24cb004 (e5901014) ---[ end trace dddf11ac9111b272 ]--- Kernel panic - not syncing: Fatal exception in interrupt CPU1: stopping CPU: 1 PID: 0 Comm: swapper/1 Tainted: P D 4.9.85 #19 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [<8010a708>] (dump_backtrace) from [<8010a99c>] (show_stack+0x18/0x1c) r7:bf093f58 r6:20000193 r5:809168e8 r4:00000000 [<8010a984>] (show_stack) from [<802a09c4>] (dump_stack+0x94/0xa8) [<802a0930>] (dump_stack) from [<8010d184>] (handle_IPI+0xe8/0x180) r7:bf093f58 r6:00000000 r5:00000001 r4:808478c4 [<8010d09c>] (handle_IPI) from [<801013e8>] (gic_handle_irq+0x78/0x7c) r7:f4000100 r6:bf093f58 r5:f400010c r4:8090467c [<80101370>] (gic_handle_irq) from [<8010b378>] (__irq_svc+0x58/0x8c) Exception stack(0xbf093f58 to 0xbf093fa0) 3f40: bf7d62a0 00000000 3f60: 0010a5f4 80113460 bf092000 809043e4 00000002 80904434 bf092008 412fc09a 3f80: 00000000 bf093fb4 bf093fb8 bf093fa8 8010804c 80108050 60000013 ffffffff r9:bf092000 r8:bf092008 r7:bf093f8c r6:ffffffff r5:60000013 r4:80108050 [<80108014>] (arch_cpu_idle) from [<80553c2c>] (default_idle_call+0x30/0x34) [<80553bfc>] (default_idle_call) from [<80158394>] (cpu_startup_entry+0xc4/0xfc) [<801582d0>] (cpu_startup_entry) from [<8010ce40>] (secondary_start_kernel+0x168/0x174) r7:8092d2f8 r4:80913568 [<8010ccd8>] (secondary_start_kernel) from [<10101488>] (0x10101488) r5:00000055 r4:4f07806a Rebooting in 10 seconds.. Reboot failed -- System halted Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:29 +02:00
Jarosław Janik	085ec7d554	nvme-pci: disable APST for Samsung NVMe SSD 960 EVO + ASUS PRIME Z370-A [ Upstream commit `467c77d4cb` ] Yet another "incompatible" Samsung NVMe SSD 960 EVO and Asus motherboard combination. 960 EVO device disappears from PCIe bus within few minutes after boot-up when APST is in use and never gets back. Forcing NVME_QUIRK_NO_APST is the only way to make this drive work with this particular motherboard. NVME_QUIRK_NO_DEEPEST_PS doesn't work, upgrading motherboard's BIOS didn't help either. Since this is a desktop motherboard, the only drawback of not using APST is increased device temperature. Signed-off-by: Jarosław Janik <jaroslaw.janik@gmail.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:29 +02:00
Karthikeyan Periyasamy	7e5487b399	ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk) [ Upstream commit `8b2d93dd22` ] When attempt to run worker (ath10k_sta_rc_update_wk) after the station object (ieee80211_sta) delete will trigger the kernel panic. This problem arise in AP + Mesh configuration, Where the current node AP VAP and neighbor node mesh VAP MAC address are same. When the current mesh node try to establish the mesh link with neighbor node, driver peer creation for the neighbor mesh node fails due to duplication MAC address. Already the AP VAP created with same MAC address. It is caused by the following scenario steps. Steps: 1. In above condition, ath10k driver sta_state callback (ath10k_sta_state) fails to do the state change for a station from IEEE80211_STA_NOTEXIST to IEEE80211_STA_NONE due to peer creation fails. Sta_state callback is called from ieee80211_add_station() to handle the new station (neighbor mesh node) request from the wpa_supplicant. 2. Concurrently ath10k receive the sta_rc_update callback notification from the mesh_neighbour_update() to handle the beacon frames of the above neighbor mesh node. since its atomic callback, ath10k driver queue the work (ath10k_sta_rc_update_wk) to handle rc update. 3. Due to driver sta_state callback fails (step 1), mac80211 free the station object. 4. When the worker (ath10k_sta_rc_update_wk) scheduled to run, it will access the station object which is already deleted. so it will trigger kernel panic. Added the peer exist check in sta_rc_update callback before queue the work. Kernel Panic log: Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c0204000 [00000000] *pgd=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 1 PID: 1833 Comm: kworker/u4:2 Not tainted 3.14.77 #1 task: dcef0000 ti: d72b6000 task.ti: d72b6000 PC is at pwq_activate_delayed_work+0x10/0x40 LR is at pwq_activate_delayed_work+0xc/0x40 pc : [<c023f988>] lr : [<c023f984>] psr: 40000193 sp : d72b7f18 ip : 0000007a fp : d72b6000 r10: 00000000 r9 : dd404414 r8 : d8c31998 r7 : d72b6038 r6 : 00000004 r5 : d4907ec8 r4 : dcee1300 r3 : ffffffe0 r2 : 00000000 r1 : 00000001 r0 : 00000000 Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5787d Table: 595bc06a DAC: 00000015 ... Process kworker/u4:2 (pid: 1833, stack limit = 0xd72b6238) Stack: (0xd72b7f18 to 0xd72b8000) 7f00: 00000001 dcee1300 7f20: 00000001 c02410dc d8c31980 dd404400 dd404400 c0242790 d8c31980 00000089 7f40: 00000000 d93e1340 00000000 d8c31980 c0242568 00000000 00000000 00000000 7f60: 00000000 c02474dc 00000000 00000000 000000f8 d8c31980 00000000 00000000 7f80: d72b7f80 d72b7f80 00000000 00000000 d72b7f90 d72b7f90 d72b7fac d93e1340 7fa0: c0247404 00000000 00000000 c0208d20 00000000 00000000 00000000 00000000 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 [<c023f988>] (pwq_activate_delayed_work) from [<c02410dc>] (pwq_dec_nr_in_flight+0x58/0xc4) [<c02410dc>] (pwq_dec_nr_in_flight) from [<c0242790>] (worker_thread+0x228/0x360) [<c0242790>] (worker_thread) from [<c02474dc>] (kthread+0xd8/0xec) [<c02474dc>] (kthread) from [<c0208d20>] (ret_from_fork+0x14/0x34) Code: e92d4038 e1a05000 ebffffbc[69210.619376] SMP: failed to stop secondary CPUs Rebooting in 3 seconds.. Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:29 +02:00
Alexey Khoroshilov	5db7e1bb6a	watchdog: davinci_wdt: fix error handling in davinci_wdt_probe() [ Upstream commit `d66e53649c` ] clk_disable_unprepare() was added to one error path, but there is another one. The patch makes sure clk is disabled at the both of them. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:29 +02:00
Leon Romanovsky	fc7bcbb940	net/mlx5: Protect from command bit overflow [ Upstream commit `957f6ba8ad` ] The system with CONFIG_UBSAN enabled on produces the following error during driver initialization. The reason to it that max_reg_cmds can be larger enough to cause to "1 << max_reg_cmds" overflow the unsigned long. ================================================================================ UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1805:42 signed integer overflow: -2147483648 - 1 cannot be represented in type 'int' CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-00032-g06cda2358d9b-dirty #724 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xe9/0x18f ? dma_virt_alloc+0x81/0x81 ubsan_epilogue+0xe/0x4e handle_overflow+0x187/0x20c mlx5_cmd_init+0x73a/0x12b0 mlx5_load_one+0x1c3d/0x1d30 init_one+0xd02/0xf10 pci_device_probe+0x26c/0x3b0 driver_probe_device+0x622/0xb40 __driver_attach+0x175/0x1b0 bus_for_each_dev+0xef/0x190 bus_add_driver+0x2db/0x490 driver_register+0x16b/0x1e0 __pci_register_driver+0x177/0x1b0 init+0x6d/0x92 do_one_initcall+0x15b/0x270 kernel_init_freeable+0x2d8/0x3d0 kernel_init+0x14/0x190 ret_from_fork+0x24/0x30 ================================================================================ Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Michael Ellerman	d018d551e7	selftests: Print the test we're running to /dev/kmsg [ Upstream commit `88893cf787` ] Some tests cause the kernel to print things to the kernel log buffer (ie. printk), in particular oops and warnings etc. However when running all the tests in succession it's not always obvious which test(s) caused the kernel to print something. We can narrow it down by printing which test directory we're running in to /dev/kmsg, if it's writable. Example output: [ 170.149149] kselftest: Running tests in powerpc [ 305.300132] kworker/dying (71) used greatest stack depth: 7776 bytes left [ 808.915456] kselftest: Running tests in pstore Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Frank Asseg	faace30e6e	tools/thermal: tmon: fix for segfault [ Upstream commit `6c59f64b7e` ] Fixes a segfault occurring when e.g. <TAB> is pressed multiple times in the ncurses tmon application. The segfault is caused by incrementing cur_thermal_record in the main function without checking if it's value reached NR_THERMAL_RECORD immediately. Since the boundary check only occurred in update_thermal_data a race condition existed, which lead to an attempted read beyond the last element of the trec array. The fix was implemented by moving the cur_thermal_record incrementation to the update_thermal_data function using a temporary variable on which the boundary condition is checked before updating cur_thread_record, so that the variable is never incremented beyond the trec array's boundary. It seems the segfault does not occur on every machine: On a HP EliteBook G4 the segfault happens, while it does not happen on a Thinkpad T540p. Signed-off-by: Frank Asseg <frank.asseg@objecthunter.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Amitkumar Karwar	b652092f8e	rsi: fix kernel panic observed on 64bit machine [ Upstream commit `864db4d508` ] Following kernel panic is observed on 64bit machine while loading the driver. It is fixed if we pass dynamically allocated memory to SDIO for DMA. BUG: unable to handle kernel paging request at ffffeb04000172e0 IP: sg_miter_stop+0x56/0x70 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI Modules linked in: rsi_sdio(OE+) rsi_91x(OE) btrsi(OE) rfcomm bluetooth ecdh_generic mac80211 mmc_block fuse xt_CHECKSUM iptable_mangle drm_kms_helper mmc_core serio_raw drm firewire_ohci tg3 CPU: 0 PID: 4003 Comm: insmod Tainted: G OE 4.16.0-rc1+ #27 Hardware name: Dell Inc. Latitude E5500 /0DW634, BIOS A19 06/13/2013 RIP: 0010:sg_miter_stop+0x56/0x70 RSP: 0018:ffff88007d003e78 EFLAGS: 00010002 RAX: 0000000000000003 RBX: 0000000000000004 RCX: 0000000000000000 RDX: ffffeb04000172c0 RSI: ffff88002f58002c RDI: ffff88007d003e80 RBP: 0000000000000004 R08: ffff88007d003e80 R09: 0000000000000008 R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000004 R13: ffff88002f580028 R14: 0000000000000000 R15: 0000000000000004 FS: 00007f35c29db700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffeb04000172e0 CR3: 000000007038e000 CR4: 00000000000406f0 Call Trace: <IRQ> sg_copy_buffer+0xc6/0xf0 sdhci_tasklet_finish+0x170/0x260 [sdhci] tasklet_action+0xf4/0x100 __do_softirq+0xef/0x26e irq_exit+0xbe/0xd0 do_IRQ+0x4a/0xc0 common_interrupt+0xa2/0xa2 </IRQ> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Michael Ellerman	31dbd9cfcb	powerpc/perf: Fix kernel address leak via sampling registers [ Upstream commit `e1ebd0e5b9` ] Current code in power_pmu_disable() does not clear the sampling registers like Sampling Instruction Address Register (SIAR) and Sampling Data Address Register (SDAR) after disabling the PMU. Since these are userspace readable and could contain kernel addresses, add code to explicitly clear the content of these registers. Also add a "context synchronizing instruction" to enforce no further updates to these registers as suggested by Power ISA v3.0B. From section 9.4, on page 1108: "If an mtspr instruction is executed that changes the value of a Performance Monitor register other than SIAR, SDAR, and SIER, the change is not guaranteed to have taken effect until after a subsequent context synchronizing instruction has been executed (see Chapter 11. "Synchronization Requirements for Context Alterations" on page 1133)." Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Massage change log and add ISA reference] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Madhavan Srinivasan	6a0a9f0ab8	powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer [ Upstream commit `bb19af8160` ] The current Branch History Rolling Buffer (BHRB) code does not check for any privilege levels before updating the data from BHRB. This could leak kernel addresses to userspace even when profiling only with userspace privileges. Add proper checks to prevent it. Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Guenter Roeck	68a38cedff	hwmon: (nct6775) Fix writing pwmX_mode [ Upstream commit `415eb2a1aa` ] pwmX_mode is defined in the ABI as 0=DC mode, 1=pwm mode. The chip register bit is set to 1 for DC mode. This got mixed up, and writing 1 into pwmX_mode resulted in DC mode enabled. Fix it up by using the ABI definition throughout the driver for consistency. Fixes: `77eb5b3703` ("hwmon: (nct6775) Add support for pwm, pwm_mode, ... ") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Helge Deller	dbce9e4116	parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode [ Upstream commit `b845f66f78` ] Carlo Pisani noticed that his C3600 workstation behaved unstable during heavy I/O on the PCI bus with a VIA VT6421 IDE/SATA PCI card. To avoid such instability, this patch switches the LBA PCI bus from Hard Fail mode into Soft Fail mode. In this mode the bus will return -1UL for timed out MMIO transactions, which is exactly how the x86 (and most other architectures) PCI busses behave. This patch is based on a proposal by Grant Grundler and Kyle McMartin 10 years ago: https://www.spinics.net/lists/linux-parisc/msg01027.html Cc: Carlo Pisani <carlojpisani@gmail.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Reviewed-by: Grant Grundler <grantgrundler@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Luca Coelho	f375195434	iwlwifi: mvm: check if mac80211_queue is valid in iwl_mvm_disable_txq [ Upstream commit `9a233bb802` ] Sometimes iwl_mvm_disable_txq() may be called with mac80211_queue == IEEE80211_INVAL_HW_QUEUE, and this would cause us to use BIT(0xFF) which is way too large for the u16 we used to store it in hw_queue_to_mac820211. If this happens the following UBSAN warning will be generated: [ 167.185167] UBSAN: Undefined behaviour in drivers/net/wireless/intel/iwlwifi/mvm/utils.c:838:5 [ 167.185171] shift exponent 255 is too large for 64-bit type 'long unsigned int' Fix that by checking that it is not IEEE80211_INVAL_HW_QUEUE and, while at it, add a warning if the queue number is larger than IEEE80211_MAX_QUEUES. Fixes: `34e10860ae` ("iwlwifi: mvm: remove references to queue_info in new TX path") Reported-by: Paul Menzel <pmenzel+linux-wireless@molgen.mpg.de> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:28 +02:00
Greg Ungerer	6a020bb3c6	m68k: set dma and coherent masks for platform FEC ethernets [ Upstream commit `f61e64310b` ] As of commit `205e1b7f51` ("dma-mapping: warn when there is no coherent_dma_mask") the Freescale FEC driver is issuing the following warning on driver initialization on ColdFire systems: WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 0x40159e20 Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-rc7-dirty #4 Stack from 41833dd8: 41833dd8 40259c53 40025534 40279e26 00000003 00000000 4004e514 41827000 400255de 40244e42 00000204 40159e20 00000009 00000000 00000000 4024531d 40159e20 40244e42 00000204 00000000 00000000 00000000 00000007 00000000 00000000 40279e26 4028d040 40226576 4003ae88 40279e26 418273f6 41833ef8 7fffffff 418273f2 41867028 4003c9a2 4180ac6c 00000004 41833f8c 4013e71c 40279e1c 40279e26 40226c16 4013ced2 40279e26 40279e58 4028d040 00000000 Call Trace: [<40025534>] 0x40025534 [<4004e514>] 0x4004e514 [<400255de>] 0x400255de [<40159e20>] 0x40159e20 [<40159e20>] 0x40159e20 It is not fatal, the driver and the system continue to function normally. As per the warning the coherent_dma_mask is not set on this device. There is nothing special about the DMA memory coherency on this hardware so we can just set the mask to 32bits in the platform data for the FEC ethernet devices. Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Alexander Shishkin	80fceaf3f1	intel_th: Use correct method of finding hub [ Upstream commit `9ad5770871` ] Since commit `8edc514b01` ("intel_th: Make SOURCE devices children of the root device") the hub is not the parent of SOURCE devices any more, so the new helper function should be used for that instead of always using the parent. The intel_th_set_output() path, however, still uses the old logic, leading to the hub driver structure being aliased with something else, like struct pci_driver or struct acpi_driver, and an incorrect call to an address inferred from that, potentially resulting in a crash. Fixes: `8edc514b01` ("intel_th: Make SOURCE devices children of the root device") Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Sebastian Andrzej Siewior	1366b31d18	iommu/amd: Take into account that alloc_dev_data() may return NULL [ Upstream commit `39ffe39545` ] find_dev_data() does not check whether the return value alloc_dev_data() is NULL. This was okay once because the pointer was returned once as-is. Since commit `df3f7a6e8e` ("iommu/amd: Use is_attach_deferred call-back") the pointer may be used within find_dev_data() so a NULL check is required. Cc: Baoquan He <bhe@redhat.com> Fixes: `df3f7a6e8e` ("iommu/amd: Use is_attach_deferred call-back") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Anilkumar Kolli	6bc2bf6023	ath10k: advertize beacon_int_min_gcd [ Upstream commit `8ebee73b57` ] This patch fixes regression caused by `0c317a02ca` ("cfg80211: support virtual interfaces with different beacon intervals"), with this change cfg80211 expects the driver to advertize 'beacon_int_min_gcd' to support different beacon intervals in multivap scenario. This support is added for, QCA988X/QCA99X0/QCA9984/QCA4019. Verifed AP + mesh bring up on QCA9984 with beacon interval 100msec and 1000msec respectively. Frimware: firmware-5.bin_10.4-3.5.3-00053 Fixes: `0c317a02ca` ("cfg80211: support virtual interfaces with different beacon intervals") Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Harry Morris	9c222c497b	ieee802154: ca8210: fix uninitialised data read [ Upstream commit `86674a97f5` ] In ca8210_test_int_user_write() a user can request the transfer of a frame with a length field (command.length) that is longer than the actual buffer provided (len). In this scenario the driver will copy the buffer contents into the uninitialised command[] buffer, then transfer <data.length> bytes over the SPI even though only <len> bytes had been populated, potentially leaking sensitive kernel memory. Also the first 6 bytes of the command buffer must be initialised in case a malformed, short packet is written and the uninitialised bytes are read in ca8210_test_check_upstream. Reported-by: Domen Puncer Kugler <domen.puncer@samsung.com> Signed-off-by: Harry Morris <h.morris@cascoda.com> Tested-by: Harry Morris <h.morris@cascoda.com> Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Michael Ellerman	c3a2a87820	powerpc/mpic: Check if cpu_possible() in mpic_physmask() [ Upstream commit `0834d627fb` ] In mpic_physmask() we loop over all CPUs up to 32, then get the hard SMP processor id of that CPU. Currently that's possibly walking off the end of the paca array, but in a future patch we will change the paca array to be an array of pointers, and in that case we will get a NULL for missing CPUs and oops. eg: Unable to handle kernel paging request for data at address 0x88888888888888b8 Faulting instruction address: 0xc00000000004e380 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP .mpic_set_affinity+0x60/0x1a0 LR .irq_do_set_affinity+0x48/0x100 Fix it by checking the CPU is possible, this also fixes the code if there are gaps in the CPU numbering which probably never happens on mpic systems but who knows. Debugged-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Lenny Szubowicz	fc2de79692	ACPI: acpi_pad: Fix memory leak in power saving threads [ Upstream commit `8b29d29abc` ] Fix once per second (round_robin_time) memory leak of about 1 KB in each acpi_pad kernel idling thread that is activated. Found by testing with kmemleak. Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Aaro Koskinen	d023498fef	drivers: macintosh: rack-meter: really fix bogus memsets [ Upstream commit `e283655b5a` ] We should zero an array using sizeof instead of number of elements. Fixes the following compiler (GCC 7.3.0) warnings: drivers/macintosh/rack-meter.c: In function 'rackmeter_do_pause': drivers/macintosh/rack-meter.c:157:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size] drivers/macintosh/rack-meter.c:158:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size] Fixes: `4f7bef7a9f` ("drivers: macintosh: rack-meter: fix bogus memsets") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:27 +02:00
Dan Carpenter	8effa2182d	xen/acpi: off by one in read_acpi_id() [ Upstream commit `c37a3c9477` ] If acpi_id is == nr_acpi_bits, then we access one element beyond the end of the acpi_psd[] array or we set one bit beyond the end of the bit map when we do __set_bit(acpi_id, acpi_id_present); Fixes: `59a5680291` ("xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
David Howells	637b9b187f	rxrpc: Don't treat call aborts as conn aborts [ Upstream commit `57b0c9d49b` ] If a call-level abort is received for the previous call to complete on a connection channel, then that abort is queued for the connection processor to handle. Unfortunately, the connection processor then assumes without checking that the abort is connection-level (ie. callNumber is 0) and distributes it over all active calls on that connection, thereby incorrectly aborting them. Fix this by discarding aborts aimed at a completed call. Further, discard all packets aimed at a call that's complete if there's currently an active call on a channel, since the DATA packets associated with the new call automatically terminate the old call. Fixes: `18bfeba50d` ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
David Howells	4a9fabcd34	rxrpc: Fix Tx ring annotation after initial Tx failure [ Upstream commit `03877bf6a3` ] rxrpc calls have a ring of packets that are awaiting ACK or retransmission and a parallel ring of annotations that tracks the state of those packets. If the initial transmission of a packet on the underlying UDP socket fails then the packet annotation is marked for resend - but the setting of this mark accidentally erases the last-packet mark also stored in the same annotation slot. If this happens, a call won't switch out of the Tx phase when all the packets have been transmitted. Fix this by retaining the last-packet mark and only altering the packet state. Fixes: `248f219cb8` ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
Qu Wenruo	204bfcda82	btrfs: qgroup: Fix root item corruption when multiple same source snapshots are created with quota enabled [ Upstream commit `4d31778aa2` ] When multiple pending snapshots referring to the same source subvolume are executed, enabled quota will cause root item corruption, where root items are using old bytenr (no backref in extent tree). This can be triggered by fstests btrfs/152. The cause is when source subvolume is still dirty, extra commit (simplied transaction commit) of qgroup_account_snapshot() can skip dirty roots not recorded in current transaction, making root item of source subvolume not updated. Fix it by forcing recording source subvolume in current transaction before qgroup sub-transaction commit. Reported-by: Justin Maggard <jmaggard@netgear.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
Jeff Mahoney	de00d57294	btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers [ Upstream commit `8a5a916d9a` ] While running btrfs/011, I hit the following lockdep splat. This is the important bit: pcpu_alloc+0x1ac/0x5e0 __percpu_counter_init+0x4e/0xb0 btrfs_init_fs_root+0x99/0x1c0 [btrfs] btrfs_get_fs_root.part.54+0x5b/0x150 [btrfs] resolve_indirect_refs+0x130/0x830 [btrfs] find_parent_nodes+0x69e/0xff0 [btrfs] btrfs_find_all_roots_safe+0xa0/0x110 [btrfs] btrfs_find_all_roots+0x50/0x70 [btrfs] btrfs_qgroup_prepare_account_extents+0x53/0x90 [btrfs] btrfs_commit_transaction+0x3ce/0x9b0 [btrfs] The percpu_counter_init call in btrfs_alloc_subvolume_writers uses GFP_KERNEL, which we can't do during transaction commit. This switches it to GFP_NOFS. ======================================================== WARNING: possible irq lock inversion dependency detected 4.12.14-kvmsmall #8 Tainted: G W -------------------------------------------------------- kswapd0/50 just changed the state of lock: (&delayed_node->mutex){+.+.-.}, at: [<ffffffffc06994fa>] __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs] but this lock took another, RECLAIM_FS-unsafe lock in the past: (pcpu_alloc_mutex){+.+.+.} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Chain exists of: &delayed_node->mutex --> &found->groups_sem --> pcpu_alloc_mutex Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pcpu_alloc_mutex); local_irq_disable(); lock(&delayed_node->mutex); lock(&found->groups_sem); <Interrupt> lock(&delayed_node->mutex); * DEADLOCK * 2 locks held by kswapd0/50: #0: (shrinker_rwsem){++++..}, at: [<ffffffff811dc11f>] shrink_slab+0x7f/0x5b0 #1: (&type->s_umount_key#30){+++++.}, at: [<ffffffff8126dec6>] trylock_super+0x16/0x50 the shortest dependencies between 2nd lock and 1st lock: -> (pcpu_alloc_mutex){+.+.+.} ops: 4904 { HARDIRQ-ON-W at: __mutex_lock+0x4e/0x8c0 pcpu_alloc+0x1ac/0x5e0 alloc_kmem_cache_cpus.isra.70+0x25/0xa0 __do_tune_cpucache+0x2c/0x220 do_tune_cpucache+0x26/0xc0 enable_cpucache+0x6d/0xf0 kmem_cache_init_late+0x42/0x75 start_kernel+0x343/0x4cb x86_64_start_kernel+0x127/0x134 secondary_startup_64+0xa5/0xb0 SOFTIRQ-ON-W at: __mutex_lock+0x4e/0x8c0 pcpu_alloc+0x1ac/0x5e0 alloc_kmem_cache_cpus.isra.70+0x25/0xa0 __do_tune_cpucache+0x2c/0x220 do_tune_cpucache+0x26/0xc0 enable_cpucache+0x6d/0xf0 kmem_cache_init_late+0x42/0x75 start_kernel+0x343/0x4cb x86_64_start_kernel+0x127/0x134 secondary_startup_64+0xa5/0xb0 RECLAIM_FS-ON-W at: __kmalloc+0x47/0x310 pcpu_extend_area_map+0x2b/0xc0 pcpu_alloc+0x3ec/0x5e0 alloc_kmem_cache_cpus.isra.70+0x25/0xa0 __do_tune_cpucache+0x2c/0x220 do_tune_cpucache+0x26/0xc0 enable_cpucache+0x6d/0xf0 __kmem_cache_create+0x1bf/0x390 create_cache+0xba/0x1b0 kmem_cache_create+0x1f8/0x2b0 ksm_init+0x6f/0x19d do_one_initcall+0x50/0x1b0 kernel_init_freeable+0x201/0x289 kernel_init+0xa/0x100 ret_from_fork+0x3a/0x50 INITIAL USE at: __mutex_lock+0x4e/0x8c0 pcpu_alloc+0x1ac/0x5e0 alloc_kmem_cache_cpus.isra.70+0x25/0xa0 setup_cpu_cache+0x2f/0x1f0 __kmem_cache_create+0x1bf/0x390 create_boot_cache+0x8b/0xb1 kmem_cache_init+0xa1/0x19e start_kernel+0x270/0x4cb x86_64_start_kernel+0x127/0x134 secondary_startup_64+0xa5/0xb0 } ... key at: [<ffffffff821d8e70>] pcpu_alloc_mutex+0x70/0xa0 ... acquired at: pcpu_alloc+0x1ac/0x5e0 __percpu_counter_init+0x4e/0xb0 btrfs_init_fs_root+0x99/0x1c0 [btrfs] btrfs_get_fs_root.part.54+0x5b/0x150 [btrfs] resolve_indirect_refs+0x130/0x830 [btrfs] find_parent_nodes+0x69e/0xff0 [btrfs] btrfs_find_all_roots_safe+0xa0/0x110 [btrfs] btrfs_find_all_roots+0x50/0x70 [btrfs] btrfs_qgroup_prepare_account_extents+0x53/0x90 [btrfs] btrfs_commit_transaction+0x3ce/0x9b0 [btrfs] transaction_kthread+0x176/0x1b0 [btrfs] kthread+0x102/0x140 ret_from_fork+0x3a/0x50 -> (&fs_info->commit_root_sem){++++..} ops: 1566382 { HARDIRQ-ON-W at: down_write+0x3e/0xa0 cache_block_group+0x287/0x420 [btrfs] find_free_extent+0x106c/0x12d0 [btrfs] btrfs_reserve_extent+0xd8/0x170 [btrfs] cow_file_range.isra.66+0x133/0x470 [btrfs] run_delalloc_range+0x121/0x410 [btrfs] writepage_delalloc.isra.50+0xfe/0x180 [btrfs] __extent_writepage+0x19a/0x360 [btrfs] extent_write_cache_pages.constprop.56+0x249/0x3e0 [btrfs] extent_writepages+0x4d/0x60 [btrfs] do_writepages+0x1a/0x70 __filemap_fdatawrite_range+0xa7/0xe0 btrfs_rename+0x5ee/0xdb0 [btrfs] vfs_rename+0x52a/0x7e0 SyS_rename+0x351/0x3b0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 HARDIRQ-ON-R at: down_read+0x35/0x90 caching_thread+0x57/0x560 [btrfs] normal_work_helper+0x1c0/0x5e0 [btrfs] process_one_work+0x1e0/0x5c0 worker_thread+0x44/0x390 kthread+0x102/0x140 ret_from_fork+0x3a/0x50 SOFTIRQ-ON-W at: down_write+0x3e/0xa0 cache_block_group+0x287/0x420 [btrfs] find_free_extent+0x106c/0x12d0 [btrfs] btrfs_reserve_extent+0xd8/0x170 [btrfs] cow_file_range.isra.66+0x133/0x470 [btrfs] run_delalloc_range+0x121/0x410 [btrfs] writepage_delalloc.isra.50+0xfe/0x180 [btrfs] __extent_writepage+0x19a/0x360 [btrfs] extent_write_cache_pages.constprop.56+0x249/0x3e0 [btrfs] extent_writepages+0x4d/0x60 [btrfs] do_writepages+0x1a/0x70 __filemap_fdatawrite_range+0xa7/0xe0 btrfs_rename+0x5ee/0xdb0 [btrfs] vfs_rename+0x52a/0x7e0 SyS_rename+0x351/0x3b0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 SOFTIRQ-ON-R at: down_read+0x35/0x90 caching_thread+0x57/0x560 [btrfs] normal_work_helper+0x1c0/0x5e0 [btrfs] process_one_work+0x1e0/0x5c0 worker_thread+0x44/0x390 kthread+0x102/0x140 ret_from_fork+0x3a/0x50 INITIAL USE at: down_write+0x3e/0xa0 cache_block_group+0x287/0x420 [btrfs] find_free_extent+0x106c/0x12d0 [btrfs] btrfs_reserve_extent+0xd8/0x170 [btrfs] cow_file_range.isra.66+0x133/0x470 [btrfs] run_delalloc_range+0x121/0x410 [btrfs] writepage_delalloc.isra.50+0xfe/0x180 [btrfs] __extent_writepage+0x19a/0x360 [btrfs] extent_write_cache_pages.constprop.56+0x249/0x3e0 [btrfs] extent_writepages+0x4d/0x60 [btrfs] do_writepages+0x1a/0x70 __filemap_fdatawrite_range+0xa7/0xe0 btrfs_rename+0x5ee/0xdb0 [btrfs] vfs_rename+0x52a/0x7e0 SyS_rename+0x351/0x3b0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 } ... key at: [<ffffffffc0729578>] __key.61970+0x0/0xfffffffffff9aa88 [btrfs] ... acquired at: cache_block_group+0x287/0x420 [btrfs] find_free_extent+0x106c/0x12d0 [btrfs] btrfs_reserve_extent+0xd8/0x170 [btrfs] btrfs_alloc_tree_block+0x12f/0x4c0 [btrfs] btrfs_create_tree+0xbb/0x2a0 [btrfs] btrfs_create_uuid_tree+0x37/0x140 [btrfs] open_ctree+0x23c0/0x2660 [btrfs] btrfs_mount+0xd36/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 btrfs_mount+0x18c/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 do_mount+0x1c1/0xcc0 SyS_mount+0x7e/0xd0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 -> (&found->groups_sem){++++..} ops: 2134587 { HARDIRQ-ON-W at: down_write+0x3e/0xa0 __link_block_group+0x34/0x130 [btrfs] btrfs_read_block_groups+0x33d/0x7b0 [btrfs] open_ctree+0x2054/0x2660 [btrfs] btrfs_mount+0xd36/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 btrfs_mount+0x18c/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 do_mount+0x1c1/0xcc0 SyS_mount+0x7e/0xd0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 HARDIRQ-ON-R at: down_read+0x35/0x90 btrfs_calc_num_tolerated_disk_barrier_failures+0x113/0x1f0 [btrfs] open_ctree+0x207b/0x2660 [btrfs] btrfs_mount+0xd36/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 btrfs_mount+0x18c/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 do_mount+0x1c1/0xcc0 SyS_mount+0x7e/0xd0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 SOFTIRQ-ON-W at: down_write+0x3e/0xa0 __link_block_group+0x34/0x130 [btrfs] btrfs_read_block_groups+0x33d/0x7b0 [btrfs] open_ctree+0x2054/0x2660 [btrfs] btrfs_mount+0xd36/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 btrfs_mount+0x18c/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 do_mount+0x1c1/0xcc0 SyS_mount+0x7e/0xd0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 SOFTIRQ-ON-R at: down_read+0x35/0x90 btrfs_calc_num_tolerated_disk_barrier_failures+0x113/0x1f0 [btrfs] open_ctree+0x207b/0x2660 [btrfs] btrfs_mount+0xd36/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 btrfs_mount+0x18c/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 do_mount+0x1c1/0xcc0 SyS_mount+0x7e/0xd0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 INITIAL USE at: down_write+0x3e/0xa0 __link_block_group+0x34/0x130 [btrfs] btrfs_read_block_groups+0x33d/0x7b0 [btrfs] open_ctree+0x2054/0x2660 [btrfs] btrfs_mount+0xd36/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 btrfs_mount+0x18c/0xf90 [btrfs] mount_fs+0x3a/0x160 vfs_kern_mount+0x66/0x150 do_mount+0x1c1/0xcc0 SyS_mount+0x7e/0xd0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 } ... key at: [<ffffffffc0729488>] __key.59101+0x0/0xfffffffffff9ab78 [btrfs] ... acquired at: find_free_extent+0xcb4/0x12d0 [btrfs] btrfs_reserve_extent+0xd8/0x170 [btrfs] btrfs_alloc_tree_block+0x12f/0x4c0 [btrfs] __btrfs_cow_block+0x110/0x5b0 [btrfs] btrfs_cow_block+0xd7/0x290 [btrfs] btrfs_search_slot+0x1f6/0x960 [btrfs] btrfs_lookup_inode+0x2a/0x90 [btrfs] __btrfs_update_delayed_inode+0x65/0x210 [btrfs] btrfs_commit_inode_delayed_inode+0x121/0x130 [btrfs] btrfs_evict_inode+0x3fe/0x6a0 [btrfs] evict+0xc4/0x190 __dentry_kill+0xbf/0x170 dput+0x2ae/0x2f0 SyS_rename+0x2a6/0x3b0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 -> (&delayed_node->mutex){+.+.-.} ops: 5580204 { HARDIRQ-ON-W at: __mutex_lock+0x4e/0x8c0 btrfs_delayed_update_inode+0x46/0x6e0 [btrfs] btrfs_update_inode+0x83/0x110 [btrfs] btrfs_dirty_inode+0x62/0xe0 [btrfs] touch_atime+0x8c/0xb0 do_generic_file_read+0x818/0xb10 __vfs_read+0xdc/0x150 vfs_read+0x8a/0x130 SyS_read+0x45/0xa0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 SOFTIRQ-ON-W at: __mutex_lock+0x4e/0x8c0 btrfs_delayed_update_inode+0x46/0x6e0 [btrfs] btrfs_update_inode+0x83/0x110 [btrfs] btrfs_dirty_inode+0x62/0xe0 [btrfs] touch_atime+0x8c/0xb0 do_generic_file_read+0x818/0xb10 __vfs_read+0xdc/0x150 vfs_read+0x8a/0x130 SyS_read+0x45/0xa0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 IN-RECLAIM_FS-W at: __mutex_lock+0x4e/0x8c0 __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs] btrfs_evict_inode+0x22c/0x6a0 [btrfs] evict+0xc4/0x190 dispose_list+0x35/0x50 prune_icache_sb+0x42/0x50 super_cache_scan+0x139/0x190 shrink_slab+0x262/0x5b0 shrink_node+0x2eb/0x2f0 kswapd+0x2eb/0x890 kthread+0x102/0x140 ret_from_fork+0x3a/0x50 INITIAL USE at: __mutex_lock+0x4e/0x8c0 btrfs_delayed_update_inode+0x46/0x6e0 [btrfs] btrfs_update_inode+0x83/0x110 [btrfs] btrfs_dirty_inode+0x62/0xe0 [btrfs] touch_atime+0x8c/0xb0 do_generic_file_read+0x818/0xb10 __vfs_read+0xdc/0x150 vfs_read+0x8a/0x130 SyS_read+0x45/0xa0 do_syscall_64+0x79/0x1e0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 } ... key at: [<ffffffffc072d488>] __key.56935+0x0/0xfffffffffff96b78 [btrfs] ... acquired at: __lock_acquire+0x264/0x11c0 lock_acquire+0xbd/0x1e0 __mutex_lock+0x4e/0x8c0 __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs] btrfs_evict_inode+0x22c/0x6a0 [btrfs] evict+0xc4/0x190 dispose_list+0x35/0x50 prune_icache_sb+0x42/0x50 super_cache_scan+0x139/0x190 shrink_slab+0x262/0x5b0 shrink_node+0x2eb/0x2f0 kswapd+0x2eb/0x890 kthread+0x102/0x140 ret_from_fork+0x3a/0x50 stack backtrace: CPU: 1 PID: 50 Comm: kswapd0 Tainted: G W 4.12.14-kvmsmall #8 SLE15 (unreleased) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x78/0xb7 print_irq_inversion_bug.part.38+0x19f/0x1aa check_usage_forwards+0x102/0x120 ? ret_from_fork+0x3a/0x50 ? check_usage_backwards+0x110/0x110 mark_lock+0x16c/0x270 __lock_acquire+0x264/0x11c0 ? pagevec_lookup_entries+0x1a/0x30 ? truncate_inode_pages_range+0x2b3/0x7f0 lock_acquire+0xbd/0x1e0 ? __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs] __mutex_lock+0x4e/0x8c0 ? __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs] ? __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs] ? btrfs_evict_inode+0x1f6/0x6a0 [btrfs] __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs] btrfs_evict_inode+0x22c/0x6a0 [btrfs] evict+0xc4/0x190 dispose_list+0x35/0x50 prune_icache_sb+0x42/0x50 super_cache_scan+0x139/0x190 shrink_slab+0x262/0x5b0 shrink_node+0x2eb/0x2f0 kswapd+0x2eb/0x890 kthread+0x102/0x140 ? mem_cgroup_shrink_node+0x2c0/0x2c0 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x3a/0x50 Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
Filipe Manana	92efba91a7	Btrfs: fix copy_items() return value when logging an inode [ Upstream commit `8434ec46c6` ] When logging an inode, at tree-log.c:copy_items(), if we call btrfs_next_leaf() at the loop which checks for the need to log holes, we need to make sure copy_items() returns the value 1 to its caller and not 0 (on success). This is because the path the caller passed was released and is now different from what is was before, and the caller expects a return value of 0 to mean both success and that the path has not changed, while a return value of 1 means both success and signals the caller that it can not reuse the path, it has to perform another tree search. Even though this is a case that should not be triggered on normal circumstances or very rare at least, its consequences can be very unpredictable (especially when replaying a log tree). Fixes: `16e7549f04` ("Btrfs: incompatible format change to remove hole extents") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
Qu Wenruo	d7255626a0	btrfs: tests/qgroup: Fix wrong tree backref level [ Upstream commit `3c0efdf03b` ] The extent tree of the test fs is like the following: BTRFS info (device (null)): leaf 16327509003777336587 total ptrs 1 free space 3919 item 0 key (4096 168 4096) itemoff 3944 itemsize 51 extent refs 1 gen 1 flags 2 tree block key (68719476736 0 0) level 1 ^^^^^^^ ref#0: tree block backref root 5 And it's using an empty tree for fs tree, so there is no way that its level can be 1. For REAL (created by mkfs) fs tree backref with no skinny metadata, the result should look like: item 3 key (30408704 EXTENT_ITEM 4096) itemoff 3845 itemsize 51 refs 1 gen 4 flags TREE_BLOCK tree block key (256 INODE_ITEM 0) level 0 ^^^^^^^ tree block backref root 5 Fix the level to 0, so it won't break later tree level checker. Fixes: `faa2dbf004` ("Btrfs: add sanity tests for new qgroup accounting code") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
Nicholas Piggin	27a913cc91	powerpc/64s: sreset panic if there is no debugger or crash dump handlers [ Upstream commit `d40b6768e4` ] system_reset_exception does most of its own crash handling now, invoking the debugger or crash dumps if they are registered. If not, then it goes through to die() to print stack traces, and then is supposed to panic (according to comments). However after die() prints oopses, it does its own handling which doesn't allow system_reset_exception to panic (e.g., it may just kill the current process). This patch causes sreset exceptions to return from die after it prints messages but before acting. This also stops die from invoking the debugger on 0x100 crashes. system_reset_exception similarly calls the debugger. It had been thought this was harmless (because if the debugger was disabled, neither call would fire, and if it was enabled the first call would return). However in some cases like xmon 'X' command, the debugger returns 0, which currently causes it to be entered again (first in system_reset_exception, then in die), which is confusing. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
Florian Fainelli	305f25c1ed	net: bgmac: Correctly annotate register space [ Upstream commit `16a1c0646e` ] All the members: base, idm_base and nicpm_base should be annotated with __iomem since they are pointers to register space. This fixes a bunch of sparse reported warnings. Fixes: `f6a95a2495` ("net: ethernet: bgmac: Add platform device support") Fixes: `dd5c5d037f` ("net: ethernet: bgmac: add NS2 support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:26 +02:00
Florian Fainelli	435290f7a4	net: bgmac: Fix endian access in bgmac_dma_tx_ring_free() [ Upstream commit `60d6e6f0b9` ] bgmac_dma_tx_ring_free() assigns the ctl1 word which is a litle endian 32-bit word without using proper accessors, fix this, and because a length cannot be negative, use unsigned int while at it. Fixes: `9cde94506e` ("bgmac: implement scatter/gather support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
David S. Miller	4a6cd791d6	sparc64: Make atomic_xchg() an inline function rather than a macro. [ Upstream commit `d13864b68e` ] This avoids a lot of -Wunused warnings such as: ==================== kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’: ./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value] #define xchg(ptr,x) ((__typeof__((ptr)))__xchg((unsigned long)(x),(ptr),sizeof((ptr)))) ./arch/sparc/include/asm/atomic_64.h:86:30: note: in expansion of macro ‘xchg’ #define atomic_xchg(v, new) (xchg(&((v)->counter), new)) ^~~~ kernel/debug/debug_core.c:508:4: note: in expansion of macro ‘atomic_xchg’ atomic_xchg(&kgdb_active, cpu); ^~~~~~~~~~~ ==================== Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
David Howells	22f1bde5d1	fscache: Fix hanging wait on page discarded by writeback [ Upstream commit `2c98425720` ] If the fscache asynchronous write operation elects to discard a page that's pending storage to the cache because the page would be over the store limit then it needs to wake the page as someone may be waiting on completion of the write. The problem is that the store limit may be updated by a different asynchronous operation - and so may miss the write - and that the store limit may not even get updated until later by the netfs. Fix the kernel hang by making fscache_write_op() mark as written any pages that are over the limit. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
Alexander Graf	6d03ff1669	lan78xx: Connect phy early [ Upstream commit `92571a1aae` ] When using wicked with a lan78xx device attached to the system, we end up with ethtool commands issued on the device before an ifup got issued. That lead to the following crash: Unable to handle kernel NULL pointer dereference at virtual address 0000039c pgd = ffff800035b30000 [0000039c] *pgd=0000000000000000 Internal error: Oops: 96000004 [#1] SMP Modules linked in: [...] Supported: Yes CPU: 3 PID: 638 Comm: wickedd Tainted: G E 4.12.14-0-default #1 Hardware name: raspberrypi rpi/rpi, BIOS 2018.03-rc2 02/21/2018 task: ffff800035e74180 task.stack: ffff800036718000 PC is at phy_ethtool_ksettings_get+0x20/0x98 LR is at lan78xx_get_link_ksettings+0x44/0x60 [lan78xx] pc : [<ffff0000086f7f30>] lr : [<ffff000000dcca84>] pstate: 20000005 sp : ffff80003671bb20 x29: ffff80003671bb20 x28: ffff800035e74180 x27: ffff000008912000 x26: 000000000000001d x25: 0000000000000124 x24: ffff000008f74d00 x23: 0000004000114809 x22: 0000000000000000 x21: ffff80003671bbd0 x20: 0000000000000000 x19: ffff80003671bbd0 x18: 000000000000040d x17: 0000000000000001 x16: 0000000000000000 x15: 0000000000000000 x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000020 x11: 0101010101010101 x10: fefefefefefefeff x9 : 7f7f7f7f7f7f7f7f x8 : fefefeff31677364 x7 : 0000000080808080 x6 : ffff80003671bc9c x5 : ffff80003671b9f8 x4 : ffff80002c296190 x3 : 0000000000000000 x2 : 0000000000000000 x1 : ffff80003671bbd0 x0 : ffff80003671bc00 Process wickedd (pid: 638, stack limit = 0xffff800036718000) Call trace: Exception stack(0xffff80003671b9e0 to 0xffff80003671bb20) b9e0: ffff80003671bc00 ffff80003671bbd0 0000000000000000 0000000000000000 ba00: ffff80002c296190 ffff80003671b9f8 ffff80003671bc9c 0000000080808080 ba20: fefefeff31677364 7f7f7f7f7f7f7f7f fefefefefefefeff 0101010101010101 ba40: 0000000000000020 0000000000000000 ffffffffffffffff 0000000000000000 ba60: 0000000000000000 0000000000000001 000000000000040d ffff80003671bbd0 ba80: 0000000000000000 ffff80003671bbd0 0000000000000000 0000004000114809 baa0: ffff000008f74d00 0000000000000124 000000000000001d ffff000008912000 bac0: ffff800035e74180 ffff80003671bb20 ffff000000dcca84 ffff80003671bb20 bae0: ffff0000086f7f30 0000000020000005 ffff80002c296000 ffff800035223900 bb00: 0000ffffffffffff 0000000000000000 ffff80003671bb20 ffff0000086f7f30 [<ffff0000086f7f30>] phy_ethtool_ksettings_get+0x20/0x98 [<ffff000000dcca84>] lan78xx_get_link_ksettings+0x44/0x60 [lan78xx] [<ffff0000087cbc40>] ethtool_get_settings+0x68/0x210 [<ffff0000087cc0d4>] dev_ethtool+0x214/0x2180 [<ffff0000087e5008>] dev_ioctl+0x400/0x630 [<ffff00000879dd00>] sock_do_ioctl+0x70/0x88 [<ffff00000879f5f8>] sock_ioctl+0x208/0x368 [<ffff0000082cde10>] do_vfs_ioctl+0xb0/0x848 [<ffff0000082ce634>] SyS_ioctl+0x8c/0xa8 Exception stack(0xffff80003671bec0 to 0xffff80003671c000) bec0: 0000000000000009 0000000000008946 0000fffff4e841d0 0000aa0032687465 bee0: 0000aaaafa2319d4 0000fffff4e841d4 0000000032687465 0000000032687465 bf00: 000000000000001d 7f7fff7f7f7f7f7f 72606b622e71ff4c 7f7f7f7f7f7f7f7f bf20: 0101010101010101 0000000000000020 ffffffffffffffff 0000ffff7f510c68 bf40: 0000ffff7f6a9d18 0000ffff7f44ce30 000000000000040d 0000ffff7f6f98f0 bf60: 0000fffff4e842c0 0000000000000001 0000aaaafa2c2e00 0000ffff7f6ab000 bf80: 0000fffff4e842c0 0000ffff7f62a000 0000aaaafa2b9f20 0000aaaafa2c2e00 bfa0: 0000fffff4e84818 0000fffff4e841a0 0000ffff7f5ad0cc 0000fffff4e841a0 bfc0: 0000ffff7f44ce3c 0000000080000000 0000000000000009 000000000000001d bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 The culprit is quite simple: The driver tries to access the phy left and right, but only actually has a working reference to it when the device is up. The fix thus is quite simple too: Get a reference to the phy on probe already and keep it even when the device is going down. With this patch applied, I can successfully run wicked on my system and bring the interface up and down as many times as I want, without getting NULL pointer dereferences in between. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
Sean Christopherson	80b8f3da49	KVM: VMX: raise internal error for exception during invalid protected mode state [ Upstream commit `add5ff7a21` ] Exit to userspace with KVM_INTERNAL_ERROR_EMULATION if we encounter an exception in Protected Mode while emulating guest due to invalid guest state. Unlike Big RM, KVM doesn't support emulating exceptions in PM, i.e. PM exceptions are always injected via the VMCS. Because we will never do VMRESUME due to emulation_required, the exception is never realized and we'll keep emulating the faulting instruction over and over until we receive a signal. Exit to userspace iff there is a pending exception, i.e. don't exit simply on a requested event. The purpose of this check and exit is to aid in debugging a guest that is in all likelihood already doomed. Invalid guest state in PM is extremely limited in normal operation, e.g. it generally only occurs for a few instructions early in BIOS, and any exception at this time is all but guaranteed to be fatal. Non-vectored interrupts, e.g. INIT, SIPI and SMI, can be cleanly handled/emulated, while checking for vectored interrupts, e.g. INTR and NMI, without hitting false positives would add a fair amount of complexity for almost no benefit (getting hit by lightning seems more likely than encountering this specific scenario). Add a WARN_ON_ONCE to vmx_queue_exception() if we try to inject an exception via the VMCS and emulation_required is true. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
Sai Praneeth	fd97bbca67	x86/mm: Fix bogus warning during EFI bootup, use boot_cpu_has() instead of this_cpu_has() in build_cr3_noflush() [ Upstream commit `162ee5a8ab` ] Linus reported the following boot warning: WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/tlbflush.h:134 load_new_mm_cr3+0x114/0x170 [...] Call Trace: switch_mm_irqs_off+0x267/0x590 switch_mm+0xe/0x20 efi_switch_mm+0x3e/0x50 efi_enter_virtual_mode+0x43f/0x4da start_kernel+0x3bf/0x458 secondary_startup_64+0xa5/0xb0 ... after merging: `03781e4089`: x86/efi: Use efi_switch_mm() rather than manually twiddling with %cr3 When the platform supports PCID and if CONFIG_DEBUG_VM=y is enabled, build_cr3_noflush() (called via switch_mm()) does a sanity check to see if X86_FEATURE_PCID is set. Presently, build_cr3_noflush() uses "this_cpu_has(X86_FEATURE_PCID)" to perform the check but this_cpu_has() works only after SMP is initialized (i.e. per cpu cpu_info's should be populated) and this happens to be very late in the boot process (during rest_init()). As efi_runtime_services() are called during (early) kernel boot time and run time, modify build_cr3_noflush() to use boot_cpu_has() all the time. As suggested by Dave Hansen, this should be OK because all CPU's have same capabilities on x86. With this change the warning is fixed. ( Dave also suggested that we put a warning in this_cpu_has() if it's used early in the boot process. This is still work in progress as it affects MCE. ) Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Lee Chun-Yi <jlee@suse.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1522870459-7432-1-git-send-email-sai.praneeth.prakhya@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
Davidlohr Bueso	3aeaeecda0	sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning [ Upstream commit `d29a20645d` ] While running rt-tests' pi_stress program I got the following splat: rq->clock_update_flags < RQCF_ACT_SKIP WARNING: CPU: 27 PID: 0 at kernel/sched/sched.h:960 assert_clock_updated.isra.38.part.39+0x13/0x20 [...] <IRQ> enqueue_top_rt_rq+0xf4/0x150 ? cpufreq_dbs_governor_start+0x170/0x170 sched_rt_rq_enqueue+0x65/0x80 sched_rt_period_timer+0x156/0x360 ? sched_rt_rq_enqueue+0x80/0x80 __hrtimer_run_queues+0xfa/0x260 hrtimer_interrupt+0xcb/0x220 smp_apic_timer_interrupt+0x62/0x120 apic_timer_interrupt+0xf/0x20 </IRQ> [...] do_idle+0x183/0x1e0 cpu_startup_entry+0x5f/0x70 start_secondary+0x192/0x1d0 secondary_startup_64+0xa5/0xb0 We can get rid of it be the "traditional" means of adding an update_rq_clock() call after acquiring the rq->lock in do_sched_rt_period_timer(). The case for the RT task throttling (which this workload also hits) can be ignored in that the skip_update call is actually bogus and quite the contrary (the request bits are removed/reverted). By setting RQCF_UPDATED we really don't care if the skip is happening or not and will therefore make the assert_clock_updated() check happy. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: linux-kernel@vger.kernel.org Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/20180402164954.16255-1-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
Nicholas Piggin	be6a5ad51a	powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep [ Upstream commit `c1b25a17d2` ] POWER8 restores AMOR when waking from deep sleep, but POWER9 does not, because it does not go through the subcore restore. Have POWER9 restore it in core restore. Fixes: `ee97b6b99f` ("powerpc/mm/radix: Setup AMOR in HV mode to allow key 0") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:25 +02:00
Jun Piao	839c27f713	ocfs2/dlm: don't handle migrate lockres if already in shutdown [ Upstream commit `bb34f24c7d` ] We should not handle migrate lockres if we are already in 'DLM_CTXT_IN_SHUTDOWN', as that will cause lockres remains after leaving dlm domain. At last other nodes will get stuck into infinite loop when requsting lock from us. The problem is caused by concurrency umount between nodes. Before receiveing N1's DLM_BEGIN_EXIT_DOMAIN_MSG, N2 has picked up N1 as the migrate target. So N2 will continue sending lockres to N1 even though N1 has left domain. N1 N2 (owner) touch file access the file, and get pr lock begin leave domain and pick up N1 as new owner begin leave domain and migrate all lockres done begin migrate lockres to N1 end leave domain, but the lockres left unexpectedly, because migrate task has passed [piaojun@huawei.com: v3] Link: http://lkml.kernel.org/r/5A9CBD19.5020107@huawei.com Link: http://lkml.kernel.org/r/5A99F028.2090902@huawei.com Signed-off-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Reviewed-by: Changwei Ge <ge.changwei@h3c.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Mikhail Malygin	9ebe297713	IB/rxe: Fix for oops in rxe_register_device on ppc64le arch [ Upstream commit `efc365e729` ] On ppc64le arch rxe_add command causes oops in kernel log: [ 92.495140] Oops: Kernel access of bad area, sig: 11 [#1] [ 92.499710] SMP NR_CPUS=2048 NUMA pSeries [ 92.499792] Modules linked in: ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) nf_conntrack_netlink(E) nfnetlink(E) xfrm_user(E) iptable _nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) xt_addrtype(E) iptable_filter(E) ip_tables(E) xt_conntrack(E) x_tables(E) nf_nat(E) nf_conntrack(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) af_packet(E) rpcrdma(E) ib_isert(E) iscsi_target_mod(E) i b_iser(E) libiscsi(E) ib_srpt(E) target_core_mod(E) ib_srp(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) bochs_drm(E) tt m(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) drm(E) agpgart(E) virtio_rng(E) virtio_console(E) rtc_ generic(E) dm_ec(OEN) ttln_rdma(OEN) rdma_cm(E) configfs(E) iw_cm(E) ib_cm(E) rdma_rxe(E) ip6_udp_tunnel(E) udp_tunnel(E) ib_core(E) ql a2xxx(E) [ 92.499832] scsi_transport_fc(E) nvme_fc(E) nvme_fabrics(E) nvme_core(E) ipmi_watchdog(E) ipmi_ssif(E) ipmi_poweroff(E) ipmi_powernv(EX) ipmi_devintf(E) ipmi_msghandler(E) dummy(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_service_time(E) scsi_transport_iscsi(E) sd_mod(E) sr_mod(E) cdrom(E) hid_generic(E) usbhid(E) virtio_blk(E) virtio_scsi(E) virtio_net(E) ibmvscsi(EX) scsi_transport_srp(E) xhci_pci(E) xhci_hcd(E) usbcore(E) usb_common(E) virtio_pci(E) virtio_ring(E) virtio(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) autofs4(E) [ 92.499834] Supported: No, Unsupported modules are loaded [ 92.499839] CPU: 3 PID: 5576 Comm: sh Tainted: G OE NX 4.4.120-ttln.17-default #1 [ 92.499841] task: c0000000afe8a490 ti: c0000000beba8000 task.ti: c0000000beba8000 [ 92.499842] NIP: c00000000008ba3c LR: c000000000027644 CTR: c00000000008ba10 [ 92.499844] REGS: c0000000bebab750 TRAP: 0300 Tainted: G OE NX (4.4.120-ttln.17-default) [ 92.499850] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28424428 XER: 20000000 [ 92.499871] CFAR: 0000000000002424 DAR: 0000000000000208 DSISR: 40000000 SOFTE: 1 GPR00: c000000000027644 c0000000bebab9d0 c000000000f09700 0000000000000000 GPR04: d0000000043d7192 0000000000000002 000000000000001a fffffffffffffffe GPR08: 000000000000009c c00000000008ba10 d0000000043e5848 d0000000043d3828 GPR12: c00000000008ba10 c000000007a02400 0000000010062e38 0000010020388860 GPR16: 0000000000000000 0000000000000000 00000100203885f0 00000000100f6c98 GPR20: c0000000b3f1fcc0 c0000000b3f1fc48 c0000000b3f1fbd0 c0000000b3f1fb58 GPR24: c0000000b3f1fae0 c0000000b3f1fa68 00000000000005dc c0000000b3f1f9f0 GPR28: d0000000043e5848 c0000000b3f1f900 c0000000b3f1f320 c0000000b3f1f000 [ 92.499881] NIP [c00000000008ba3c] dma_get_required_mask_pSeriesLP+0x2c/0x1a0 [ 92.499885] LR [c000000000027644] dma_get_required_mask+0x44/0xac [ 92.499886] Call Trace: [ 92.499891] [c0000000bebab9d0] [c0000000bebaba30] 0xc0000000bebaba30 (unreliable) [ 92.499894] [c0000000bebaba10] [c000000000027644] dma_get_required_mask+0x44/0xac [ 92.499904] [c0000000bebaba30] [d0000000043cb4b4] rxe_register_device+0xc4/0x430 [rdma_rxe] [ 92.499910] [c0000000bebabab0] [d0000000043c06c8] rxe_add+0x448/0x4e0 [rdma_rxe] [ 92.499915] [c0000000bebabb30] [d0000000043d28dc] rxe_net_add+0x4c/0xf0 [rdma_rxe] [ 92.499921] [c0000000bebabb60] [d0000000043d305c] rxe_param_set_add+0x6c/0x1ac [rdma_rxe] [ 92.499924] [c0000000bebabbf0] [c0000000000e78c0] param_attr_store+0xa0/0x180 [ 92.499927] [c0000000bebabc70] [c0000000000e6448] module_attr_store+0x48/0x70 [ 92.499932] [c0000000bebabc90] [c000000000391f60] sysfs_kf_write+0x70/0xb0 [ 92.499935] [c0000000bebabcb0] [c000000000390f1c] kernfs_fop_write+0x18c/0x1e0 [ 92.499939] [c0000000bebabd00] [c0000000002e22ac] __vfs_write+0x4c/0x1d0 [ 92.499942] [c0000000bebabd90] [c0000000002e2f94] vfs_write+0xc4/0x200 [ 92.499945] [c0000000bebabde0] [c0000000002e488c] SyS_write+0x6c/0x110 [ 92.499948] [c0000000bebabe30] [c000000000009384] system_call+0x38/0xe4 [ 92.499949] Instruction dump: [ 92.499954] 4e800020 3c4c00e8 3842dcf0 7c0802a6 f8010010 60000000 7c0802a6 fba1ffe8 [ 92.499958] fbc1fff0 fbe1fff8 f8010010 f821ffc1 <e9230208> 7c7e1b78 2fa90000 419e0078 [ 92.499962] ---[ end trace bed077e15eb420cf ]--- It fails in dma_get_required_mask, that has ppc-specific implementation, and fail if provided device argument is NULL Signed-off-by: Mikhail Malygin <mikhail@malygin.me> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Nikolay Borisov	370b3353f4	btrfs: Fix possible softlock on single core machines [ Upstream commit `1e1c50a929` ] do_chunk_alloc implements a loop checking whether there is a pending chunk allocation and if so causes the caller do loop. Generally this loop is executed only once, however testing with btrfs/072 on a single core vm machines uncovered an extreme case where the system could loop indefinitely. This is due to a missing cond_resched when loop which doesn't give a chance to the previous chunk allocator finish its job. The fix is to simply add the missing cond_resched. Fixes: `6d74119f1a` ("Btrfs: avoid taking the chunk_mutex in do_chunk_alloc") Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Liu Bo	acfd8e8865	Btrfs: fix NULL pointer dereference in log_dir_items [ Upstream commit `80c0b4210a` ] 0, 1 and <0 can be returned by btrfs_next_leaf(), and when <0 is returned, path->nodes[0] could be NULL, log_dir_items lacks such a check for <0 and we may run into a null pointer dereference panic. Fixes: `e02119d5a7` ("Btrfs: Add a write ahead tree log to optimize synchronous operations") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Liu Bo	afef64b108	Btrfs: bail out on error during replay_dir_deletes [ Upstream commit `b98def7ca6` ] If errors were returned by btrfs_next_leaf(), replay_dir_deletes needs to bail out, otherwise @ret would be forced to be 0 after 'break;' and the caller won't be aware of it. Fixes: `e02119d5a7` ("Btrfs: Add a write ahead tree log to optimize synchronous operations") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Yang Shi	5ade3c9618	mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one() [ Upstream commit `f0849ac0b8` ] For PTE-mapped THP, the compound THP has not been split to normal 4K pages yet, the whole THP is considered referenced if any one of sub page is referenced. When walking PTE-mapped THP by pvmw, all relevant PTEs will be checked to retrieve referenced bit. But, the current code just returns the result of the last PTE. If the last PTE has not referenced, the referenced flag will be cleared. Just set referenced when ptep{pmdp}_clear_young_notify() returns true. Link: http://lkml.kernel.org/r/1518212451-87134-1-git-send-email-yang.shi@linux.alibaba.com Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Reported-by: Gang Deng <gavin.dg@linux.alibaba.com> Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Huang Ying	8d700626fb	mm: fix races between address_space dereference and free in page_evicatable [ Upstream commit `e92bb4dd96` ] When page_mapping() is called and the mapping is dereferenced in page_evicatable() through shrink_active_list(), it is possible for the inode to be truncated and the embedded address space to be freed at the same time. This may lead to the following race. CPU1 CPU2 truncate(inode) shrink_active_list() ... page_evictable(page) truncate_inode_page(mapping, page); delete_from_page_cache(page) spin_lock_irqsave(&mapping->tree_lock, flags); __delete_from_page_cache(page, NULL) page_cache_tree_delete(..) ... mapping = page_mapping(page); page->mapping = NULL; ... spin_unlock_irqrestore(&mapping->tree_lock, flags); page_cache_free_page(mapping, page) put_page(page) if (put_page_testzero(page)) -> false - inode now has no pages and can be freed including embedded address_space mapping_unevictable(mapping) test_bit(AS_UNEVICTABLE, &mapping->flags); - we've dereferenced mapping which is potentially already free. Similar race exists between swap cache freeing and page_evicatable() too. The address_space in inode and swap cache will be freed after a RCU grace period. So the races are fixed via enclosing the page_mapping() and address_space usage in rcu_read_lock/unlock(). Some comments are added in code to make it clear what is protected by the RCU read lock. Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Claudio Imbrenda	763111d9f3	mm/ksm: fix interaction with THP [ Upstream commit `77da2ba064` ] This patch fixes a corner case for KSM. When two pages belong or belonged to the same transparent hugepage, and they should be merged, KSM fails to split the page, and therefore no merging happens. This bug can be reproduced by: * making sure ksm is running (in case disabling ksmtuned) * enabling transparent hugepages * allocating a THP-aligned 1-THP-sized buffer e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21) * filling it with the same values e.g. memset(p, 42, 1<<21) * performing madvise to make it mergeable e.g. madvise(p, 1<<21, MADV_MERGEABLE) * waiting for KSM to perform a few scans The expected outcome is that the all the pages get merged (1 shared and the rest sharing); the actual outcome is that no pages get merged (1 unshared and the rest volatile) The reason of this behaviour is that we increase the reference count once for both pages we want to merge, but if they belong to the same hugepage (or compound page), the reference counter used in both cases is the one of the head of the compound page. This means that split_huge_page will find a value of the reference counter too high and will fail. This patch solves this problem by testing if the two pages to merge belong to the same hugepage when attempting to merge them. If so, the hugepage is split safely. This means that the hugepage is not split if not necessary. Link: http://lkml.kernel.org/r/1521548069-24758-1-git-send-email-imbrenda@linux.vnet.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Co-authored-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00
Thomas Falcon	378a1e49f9	ibmvnic: Zero used TX descriptor counter on reset [ Upstream commit `41f714672f` ] The counter that tracks used TX descriptors pending completion needs to be zeroed as part of a device reset. This change fixes a bug causing transmit queues to be stopped unnecessarily and in some cases a transmit queue stall and timeout reset. If the counter is not reset, the remaining descriptors will not be "removed", effectively reducing queue capacity. If the queue is over half full, it will cause the queue to stall if stopped. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:52:24 +02:00

1 2 3 4 5 ...

712908 Commits