linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 10:31:46 +09:00

Author	SHA1	Message	Date
Filipe Manana	9eec74b484	Btrfs: send, fix race with transaction commits that create snapshots commit `be6821f82c` upstream. If we create a snapshot of a snapshot currently being used by a send operation, we can end up with send failing unexpectedly (returning -ENOENT error to user space for example). The following diagram shows how this happens. CPU 1 CPU2 CPU3 btrfs_ioctl_send() (...) create_snapshot() -> creates snapshot of a root used by the send task btrfs_commit_transaction() create_pending_snapshot() __get_inode_info() btrfs_search_slot() btrfs_search_slot_get_root() down_read commit_root_sem get reference on eb of the commit root -> eb with bytenr == X up_read commit_root_sem btrfs_cow_block(root node) btrfs_free_tree_block() -> creates delayed ref to free the extent btrfs_run_delayed_refs() -> runs the delayed ref, adds extent to fs_info->pinned_extents btrfs_finish_extent_commit() unpin_extent_range() -> marks extent as free in the free space cache transaction commit finishes btrfs_start_transaction() (...) btrfs_cow_block() btrfs_alloc_tree_block() btrfs_reserve_extent() -> allocates extent at bytenr == X btrfs_init_new_buffer(bytenr X) btrfs_find_create_tree_block() alloc_extent_buffer(bytenr X) find_extent_buffer(bytenr X) -> returns existing eb, which the send task got (...) -> modifies content of the eb with bytenr == X -> uses an eb that now belongs to some other tree and no more matches the commit root of the snapshot, resuts will be unpredictable The consequences of this race can be various, and can lead to searches in the commit root performed by the send task failing unexpectedly (unable to find inode items, returning -ENOENT to user space, for example) or not failing because an inode item with the same number was added to the tree that reused the metadata extent, in which case send can behave incorrectly in the worst case or just fail later for some reason. Fix this by performing a copy of the commit root's extent buffer when doing a search in the context of a send operation. CC: stable@vger.kernel.org # 4.4.x: `1fc28d8e2e`: Btrfs: move get root out of btrfs_search_slot to a helper CC: stable@vger.kernel.org # 4.4.x: `f9ddfd0592`: Btrfs: remove unused check of skip_locking CC: stable@vger.kernel.org # 4.4.x Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Josef Bacik	6911b074a0	btrfs: run delayed items before dropping the snapshot commit `0568e82dbe` upstream. With my delayed refs patches in place we started seeing a large amount of aborts in __btrfs_free_extent: BTRFS error (device sdb1): unable to find ref byte nr 91947008 parent 0 root 35964 owner 1 offset 0 Call Trace: ? btrfs_merge_delayed_refs+0xaf/0x340 __btrfs_run_delayed_refs+0x6ea/0xfc0 ? btrfs_set_path_blocking+0x31/0x60 btrfs_run_delayed_refs+0xeb/0x180 btrfs_commit_transaction+0x179/0x7f0 ? btrfs_check_space_for_delayed_refs+0x30/0x50 ? should_end_transaction.isra.19+0xe/0x40 btrfs_drop_snapshot+0x41c/0x7c0 btrfs_clean_one_deleted_snapshot+0xb5/0xd0 cleaner_kthread+0xf6/0x120 kthread+0xf8/0x130 ? btree_invalidatepage+0x90/0x90 ? kthread_bind+0x10/0x10 ret_from_fork+0x35/0x40 This was because btrfs_drop_snapshot depends on the root not being modified while it's dropping the snapshot. It will unlock the root node (and really every node) as it walks down the tree, only to re-lock it when it needs to do something. This is a problem because if we modify the tree we could cow a block in our path, which frees our reference to that block. Then once we get back to that shared block we'll free our reference to it again, and get ENOENT when trying to lookup our extent reference to that block in __btrfs_free_extent. This is ultimately happening because we have delayed items left to be processed for our deleted snapshot _after_ all of the inodes are closed for the snapshot. We only run the delayed inode item if we're deleting the inode, and even then we do not run the delayed insertions or delayed removals. These can be run at any point after our final inode does its last iput, which is what triggers the snapshot deletion. We can end up with the snapshot deletion happening and then have the delayed items run on that file system, resulting in the above problem. This problem has existed forever, however my patches made it much easier to hit as I wake up the cleaner much more often to deal with delayed iputs, which made us more likely to start the snapshot dropping work before the transaction commits, which is when the delayed items would generally be run. Before, generally speaking, we would run the delayed items, commit the transaction, and wakeup the cleaner thread to start deleting snapshots, which means we were less likely to hit this problem. You could still hit it if you had multiple snapshots to be deleted and ended up with lots of delayed items, but it was definitely harder. Fix for now by simply running all the delayed items before starting to drop the snapshot. We could make this smarter in the future by making the delayed items per-root, and then simply drop any delayed items for roots that we are going to delete. But for now just a quick and easy solution is the safest. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Filipe Manana	10b04210aa	Btrfs: fix fsync of files with multiple hard links in new directories commit `41bd606769` upstream. The log tree has a long standing problem that when a file is fsync'ed we only check for new ancestors, created in the current transaction, by following only the hard link for which the fsync was issued. We follow the ancestors using the VFS' dget_parent() API. This means that if we create a new link for a file in a directory that is new (or in an any other new ancestor directory) and then fsync the file using an old hard link, we end up not logging the new ancestor, and on log replay that new hard link and ancestor do not exist. In some cases, involving renames, the file will not exist at all. Example: mkfs.btrfs -f /dev/sdb mount /dev/sdb /mnt mkdir /mnt/A touch /mnt/foo ln /mnt/foo /mnt/A/bar xfs_io -c fsync /mnt/foo <power failure> In this example after log replay only the hard link named 'foo' exists and directory A does not exist, which is unexpected. In other major linux filesystems, such as ext4, xfs and f2fs for example, both hard links exist and so does directory A after mounting again the filesystem. Checking if any new ancestors are new and need to be logged was added in 2009 by commit `12fcfd22fe` ("Btrfs: tree logging unlink/rename fixes"), however only for the ancestors of the hard link (dentry) for which the fsync was issued, instead of checking for all ancestors for all of the inode's hard links. So fix this by tracking the id of the last transaction where a hard link was created for an inode and then on fsync fallback to a full transaction commit when an inode has more than one hard link and at least one new hard link was created in the current transaction. This is the simplest solution since this is not a common use case (adding frequently hard links for which there's an ancestor created in the current transaction and then fsync the file). In case it ever becomes a common use case, a solution that consists of iterating the fs/subvol btree for each hard link and check if any ancestor is new, could be implemented. This solves many unexpected scenarios reported by Jayashree Mohan and Vijay Chidambaram, and for which there is a new test case for fstests under review. Fixes: `12fcfd22fe` ("Btrfs: tree logging unlink/rename fixes") CC: stable@vger.kernel.org # 4.4+ Reported-by: Vijay Chidambaram <vvijay03@gmail.com> Reported-by: Jayashree Mohan <jayashree2912@gmail.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Lu Fengqi	7708a83090	btrfs: skip file_extent generation check for free_space_inode in run_delalloc_nocow commit `27a7ff554e` upstream. The test case btrfs/001 with inode_cache mount option will encounter the following warning: WARNING: CPU: 1 PID: 23700 at fs/btrfs/inode.c:956 cow_file_range.isra.19+0x32b/0x430 [btrfs] CPU: 1 PID: 23700 Comm: btrfs Kdump: loaded Tainted: G W O 4.20.0-rc4-custom+ #30 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:cow_file_range.isra.19+0x32b/0x430 [btrfs] Call Trace: ? free_extent_buffer+0x46/0x90 [btrfs] run_delalloc_nocow+0x455/0x900 [btrfs] btrfs_run_delalloc_range+0x1a7/0x360 [btrfs] writepage_delalloc+0xf9/0x150 [btrfs] __extent_writepage+0x125/0x3e0 [btrfs] extent_write_cache_pages+0x1b6/0x3e0 [btrfs] ? __wake_up_common_lock+0x63/0xc0 extent_writepages+0x50/0x80 [btrfs] do_writepages+0x41/0xd0 ? __filemap_fdatawrite_range+0x9e/0xf0 __filemap_fdatawrite_range+0xbe/0xf0 btrfs_fdatawrite_range+0x1b/0x50 [btrfs] __btrfs_write_out_cache+0x42c/0x480 [btrfs] btrfs_write_out_ino_cache+0x84/0xd0 [btrfs] btrfs_save_ino_cache+0x551/0x660 [btrfs] commit_fs_roots+0xc5/0x190 [btrfs] btrfs_commit_transaction+0x2bf/0x8d0 [btrfs] btrfs_mksubvol+0x48d/0x4d0 [btrfs] btrfs_ioctl_snap_create_transid+0x170/0x180 [btrfs] btrfs_ioctl_snap_create_v2+0x124/0x180 [btrfs] btrfs_ioctl+0x123f/0x3030 [btrfs] The file extent generation of the free space inode is equal to the last snapshot of the file root, so the inode will be passed to cow_file_rage. But the inode was created and its extents were preallocated in btrfs_save_ino_cache, there are no cow copies on disk. The preallocated extent is not yet in the extent tree, and btrfs_cross_ref_exist will ignore the -ENOENT returned by check_committed_ref, so we can directly write the inode to the disk. Fixes: `78d4295b1e` ("btrfs: lift some btrfs_cross_ref_exist checks in nocow path") CC: stable@vger.kernel.org # 4.18+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Anand Jain	c1f90eb019	btrfs: dev-replace: go back to suspend state if another EXCL_OP is running commit `05c49e6bc1` upstream. In a secnario where balance and replace co-exists as below, - start balance - pause balance - start replace - reboot and when system restarts, balance resumes first. Then the replace is attempted to restart but will fail as the EXCL_OP lock is already held by the balance. If so place the replace state back to BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state. Fixes: `010a47bde9` ("btrfs: add proper safety check before resuming dev-replace") CC: stable@vger.kernel.org # 4.18+ Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Anand Jain	28867a52e4	btrfs: dev-replace: go back to suspended state if target device is missing commit `0d228ece59` upstream. At the time of forced unmount we place the running replace to BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state, so when the system comes back and expect the target device is missing. Then let the replace state continue to be in BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state instead of BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED as there isn't any matching scrub running as part of replace. Fixes: `e93c89c1aa` ("Btrfs: add new sources for device replace code") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Macpaul Lin	326ca6bd0f	cdc-acm: fix abnormal DATA RX issue for Mediatek Preloader. commit `eafb27fa52` upstream. Mediatek Preloader is a proprietary embedded boot loader for loading Little Kernel and Linux into device DRAM. This boot loader also handle firmware update. Mediatek Preloader will be enumerated as a virtual COM port when the device is connected to Windows or Linux OS via CDC-ACM class driver. When the USB enumeration has been done, Mediatek Preloader will send out handshake command "READY" to PC actively instead of waiting command from the download tool. Since Linux 4.12, the commit "tty: reset termios state on device registration" (`93857edd98`) causes Mediatek Preloader receiving some abnoraml command like "READYXX" as it sent. This will be recognized as an incorrect response. The behavior change also causes the download handshake fail. This change only affects subsequent connects if the reconnected device happens to get the same minor number. By disabling the ECHO termios flag could avoid this problem. However, it cannot be done by user space configuration when download tool open /dev/ttyACM0. This is because the device running Mediatek Preloader will send handshake command "READY" immediately once the CDC-ACM driver is ready. This patch wants to fix above problem by introducing "DISABLE_ECHO" property in driver_info. When Mediatek Preloader is connected, the CDC-ACM driver could disable ECHO flag in termios to avoid the problem. Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com> Cc: stable@vger.kernel.org Reviewed-by: Johan Hovold <johan@kernel.org> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Tejun Heo	8a2fbdd5b0	cgroup: fix CSS_TASK_ITER_PROCS commit `e9d81a1bc2` upstream. CSS_TASK_ITER_PROCS implements process-only iteration by making css_task_iter_advance() skip tasks which aren't threadgroup leaders; however, when an iteration is started css_task_iter_start() calls the inner helper function css_task_iter_advance_css_set() instead of css_task_iter_advance(). As the helper doesn't have the skip logic, when the first task to visit is a non-leader thread, it doesn't get skipped correctly as shown in the following example. # ps -L 2030 PID LWP TTY STAT TIME COMMAND 2030 2030 pts/0 Sl+ 0:00 ./test-thread 2030 2031 pts/0 Sl+ 0:00 ./test-thread # mkdir -p /sys/fs/cgroup/x/a/b # echo threaded > /sys/fs/cgroup/x/a/cgroup.type # echo threaded > /sys/fs/cgroup/x/a/b/cgroup.type # echo 2030 > /sys/fs/cgroup/x/a/cgroup.procs # cat /sys/fs/cgroup/x/a/cgroup.threads 2030 2031 # cat /sys/fs/cgroup/x/cgroup.procs 2030 # echo 2030 > /sys/fs/cgroup/x/a/b/cgroup.threads # cat /sys/fs/cgroup/x/cgroup.procs 2031 2030 The last read of cgroup.procs is incorrectly showing non-leader 2031 in cgroup.procs output. This can be fixed by updating css_task_iter_advance() to handle the first advance and css_task_iters_tart() to call css_task_iter_advance() instead of the inner helper. After the fix, the same commands result in the following (correct) result: # ps -L 2062 PID LWP TTY STAT TIME COMMAND 2062 2062 pts/0 Sl+ 0:00 ./test-thread 2062 2063 pts/0 Sl+ 0:00 ./test-thread # mkdir -p /sys/fs/cgroup/x/a/b # echo threaded > /sys/fs/cgroup/x/a/cgroup.type # echo threaded > /sys/fs/cgroup/x/a/b/cgroup.type # echo 2062 > /sys/fs/cgroup/x/a/cgroup.procs # cat /sys/fs/cgroup/x/a/cgroup.threads 2062 2063 # cat /sys/fs/cgroup/x/cgroup.procs 2062 # echo 2062 > /sys/fs/cgroup/x/a/b/cgroup.threads # cat /sys/fs/cgroup/x/cgroup.procs 2062 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Fixes: `8cfd8147df` ("cgroup: implement cgroup v2 thread support") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Dmitry Eremin-Solenikov	99dcd45f27	crypto: cfb - fix decryption commit `fa4600734b` upstream. crypto_cfb_decrypt_segment() incorrectly XOR'ed generated keystream with IV, rather than with data stream, resulting in incorrect decryption. Test vectors will be added in the next patch. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:45 +01:00
Dmitry Eremin-Solenikov	d8e4b24ffb	crypto: testmgr - add AES-CFB tests commit `7da6667077` upstream. Add AES128/192/256-CFB testvectors from NIST SP800-38A. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Atul Gupta	cc43a8afa8	crypto: chcr - small packet Tx stalls the queue commit `c35828ea90` upstream. Immediate packets sent to hardware should include the work request length in calculating the flits. WR occupy one flit and if not accounted result in invalid request which stalls the HW queue. Cc: stable@vger.kernel.org Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Wenwen Wang	0fa6bead41	crypto: cavium/nitrox - fix a DMA pool free failure commit `7172122be6` upstream. In crypto_alloc_context(), a DMA pool is allocated through dma_pool_alloc() to hold the crypto context. The meta data of the DMA pool, including the pool used for the allocation 'ndev->ctx_pool' and the base address of the DMA pool used by the device 'dma', are then stored to the beginning of the pool. These meta data are eventually used in crypto_free_context() to free the DMA pool through dma_pool_free(). However, given that the DMA pool can also be accessed by the device, a malicious device can modify these meta data, especially when the device is controlled to deploy an attack. This can cause an unexpected DMA pool free failure. To avoid the above issue, this patch introduces a new structure crypto_ctx_hdr and a new field chdr in the structure nitrox_crypto_ctx hold the meta data information of the DMA pool after the allocation. Note that the original structure ctx_hdr is not changed to ensure the compatibility. Cc: <stable@vger.kernel.org> Signed-off-by: Wenwen Wang <wang6495@umn.edu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Jernej Skrabec	d095e1ba41	clk: sunxi-ng: Use u64 for calculation of NM rate commit `65b6657672` upstream. Allwinner H6 SoC has multiplier N range between 1 and 254. Since parent rate is 24MHz, intermediate result when calculating final rate easily overflows 32 bit variable. Because of that, introduce function for calculating clock rate which uses 64 bit variable for intermediate result. Fixes: `6174a1e24b` ("clk: sunxi-ng: Add N-M-factor clock support") Fixes: `ee28648cb2` ("clk: sunxi-ng: Remove the use of rational computations") CC: <stable@vger.kernel.org> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Johan Jonker	36ef9d14fd	clk: rockchip: fix typo in rk3188 spdif_frac parent commit `8b19faf6fa` upstream. Fix typo in common_clk_branches. Make spdif_pre parent of spdif_frac. Fixes: `6674642089` ("clk: rockchip: include downstream muxes into fractional dividers") Cc: stable@vger.kernel.org Signed-off-by: Johan Jonker <jbx9999@hotmail.com> Acked-by: Elaine Zhang <zhangqing@rock-chips.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Lukas Wunner	9e9c669859	spi: bcm2835: Avoid finishing transfer prematurely in IRQ mode commit `56c1723426` upstream. The IRQ handler bcm2835_spi_interrupt() first reads as much as possible from the RX FIFO, then writes as much as possible to the TX FIFO. Afterwards it decides whether the transfer is finished by checking if the TX FIFO is empty. If very few bytes were written to the TX FIFO, they may already have been transmitted by the time the FIFO's emptiness is checked. As a result, the transfer will be declared finished and the chip will be reset without reading the corresponding received bytes from the RX FIFO. The odds of this happening increase with a high clock frequency (such that the TX FIFO drains quickly) and either passing "threadirqs" on the command line or enabling CONFIG_PREEMPT_RT_BASE (such that the IRQ handler may be preempted between filling the TX FIFO and checking its emptiness). Fix by instead checking whether rx_len has reached zero, which means that the transfer has been received in full. This is also more efficient as it avoids one bus read access per interrupt. Note that bcm2835_spi_transfer_one_poll() likewise uses rx_len to determine whether the transfer has finished. Signed-off-by: Lukas Wunner <lukas@wunner.de> Fixes: `e34ff011c7` ("spi: bcm2835: move to the transfer_one driver model") Cc: stable@vger.kernel.org # v4.1+ Cc: Mathias Duckeck <m.duckeck@kunbus.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Lukas Wunner	cc8b83ff6a	spi: bcm2835: Fix book-keeping of DMA termination commit `dbc944115e` upstream. If submission of a DMA TX transfer succeeds but submission of the corresponding RX transfer does not, the BCM2835 SPI driver terminates the TX transfer but neglects to reset the dma_pending flag to false. Thus, if the next transfer uses interrupt mode (because it is shorter than BCM2835_SPI_DMA_MIN_LENGTH) and runs into a timeout, dmaengine_terminate_all() will be called both for TX (once more) and for RX (which was never started in the first place). Fix it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Fixes: `3ecd37edaa` ("spi: bcm2835: enable dma modes for transfers meeting certain conditions") Cc: stable@vger.kernel.org # v4.2+ Cc: Mathias Duckeck <m.duckeck@kunbus.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Lukas Wunner	63f97d305a	spi: bcm2835: Fix race on DMA termination commit `e82b0b3828` upstream. If a DMA transfer finishes orderly right when spi_transfer_one_message() determines that it has timed out, the callbacks bcm2835_spi_dma_done() and bcm2835_spi_handle_err() race to call dmaengine_terminate_all(), potentially leading to double termination. Prevent by atomically changing the dma_pending flag before calling dmaengine_terminate_all(). Signed-off-by: Lukas Wunner <lukas@wunner.de> Fixes: `3ecd37edaa` ("spi: bcm2835: enable dma modes for transfers meeting certain conditions") Cc: stable@vger.kernel.org # v4.2+ Cc: Mathias Duckeck <m.duckeck@kunbus.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Theodore Ts'o	0cb4f65590	ext4: check for shutdown and r/o file system in ext4_write_inode() commit `18f2c4fceb` upstream. If the file system has been shut down or is read-only, then ext4_write_inode() needs to bail out early. Also use jbd2_complete_transaction() instead of ext4_force_commit() so we only force a commit if it is needed. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:44 +01:00
Theodore Ts'o	bf2fd1f970	ext4: force inode writes when nfsd calls commit_metadata() commit `fde872682e` upstream. Some time back, nfsd switched from calling vfs_fsync() to using a new commit_metadata() hook in export_operations(). If the file system did not provide a commit_metadata() hook, it fell back to using sync_inode_metadata(). Unfortunately doesn't work on all file systems. In particular, it doesn't work on ext4 due to how the inode gets journalled --- the VFS writeback code will not always call ext4_write_inode(). So we need to provide our own ext4_nfs_commit_metdata() method which calls ext4_write_inode() directly. Google-Bug-Id: 121195940 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Theodore Ts'o	263663888d	ext4: avoid declaring fs inconsistent due to invalid file handles commit `8a363970d1` upstream. If we receive a file handle, either from NFS or open_by_handle_at(2), and it points at an inode which has not been initialized, and the file system has metadata checksums enabled, we shouldn't try to get the inode, discover the checksum is invalid, and then declare the file system as being inconsistent. This can be reproduced by creating a test file system via "mke2fs -t ext4 -O metadata_csum /tmp/foo.img 8M", mounting it, cd'ing into that directory, and then running the following program. #define _GNU_SOURCE #include <fcntl.h> struct handle { struct file_handle fh; unsigned char fid[MAX_HANDLE_SZ]; }; int main(int argc, char **argv) { struct handle h = {{8, 1 }, { 12, }}; open_by_handle_at(AT_FDCWD, &h.fh, O_RDONLY); return 0; } Google-Bug-Id: 120690101 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Theodore Ts'o	6633fcb231	ext4: include terminating u32 in size of xattr entries when expanding inodes commit `a805622a75` upstream. In ext4_expand_extra_isize_ea(), we calculate the total size of the xattr header, plus the xattr entries so we know how much of the beginning part of the xattrs to move when expanding the inode extra size. We need to include the terminating u32 at the end of the xattr entries, or else if there is uninitialized, non-zero bytes after the xattr entries and before the xattr values, the list of xattr entries won't be properly terminated. Reported-by: Steve Graham <stgraham2000@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
ruippan (潘睿)	11bb168bae	ext4: fix EXT4_IOC_GROUP_ADD ioctl commit `e647e29196` upstream. Commit `e2b911c535` ("ext4: clean up feature test macros with predicate functions") broke the EXT4_IOC_GROUP_ADD ioctl. This was not noticed since only very old versions of resize2fs (before e2fsprogs 1.42) use this ioctl. However, using a new kernel with an enterprise Linux userspace will cause attempts to use online resize to fail with "No reserved GDT blocks". Fixes: `e2b911c535` ("ext4: clean up feature test macros with predicate...") Cc: stable@kernel.org # v4.4 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: ruippan (潘睿) <ruippan@tencent.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Maurizio Lombardi	0d078853b8	ext4: missing unlock/put_page() in ext4_try_to_write_inline_data() commit `132d00becb` upstream. In case of error, ext4_try_to_write_inline_data() should unlock and release the page it holds. Fixes: `f19d5870cb` ("ext4: add normal write support for inline data") Cc: stable@kernel.org # 3.8 Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Pan Bian	0a1c177dd9	ext4: fix possible use after free in ext4_quota_enable commit `61157b24e6` upstream. The function frees qf_inode via iput but then pass qf_inode to lockdep_set_quota_inode on the failure path. This may result in a use-after-free bug. The patch frees df_inode only when it is never used. Fixes: `daf647d2dd` ("ext4: add lockdep annotations for i_data_sem") Cc: stable@kernel.org # 4.6 Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Theodore Ts'o	b878c8a7f0	ext4: add ext4_sb_bread() to disambiguate ENOMEM cases commit `fb265c9cb4` upstream. Today, when sb_bread() returns NULL, this can either be because of an I/O error or because the system failed to allocate the buffer. Since it's an old interface, changing would require changing many call sites. So instead we create our own ext4_sb_bread(), which also allows us to set the REQ_META flag. Also fixed a problem in the xattr code where a NULL return in a function could also mean that the xattr was not found, which could lead to the wrong error getting returned to userspace. Fixes: `ac27a0ec11` ("ext4: initial copy of files from ext3") Cc: stable@kernel.org # 2.6.19 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Greg Kurz	6665481e1c	ocxl: Fix endiannes bug in read_afu_name() commit `2f07229f02` upstream. The AFU Descriptor Template in the PCI config space has a Name Space field which is a 24 Byte ASCII character string of descriptive name space for the AFU. The OCXL driver read the string four characters at a time with pci_read_config_dword(). This optimization is valid on a little-endian system since this is PCI, but a big-endian system ends up with each subset of four characters in reverse order. This could be fixed by switching to read characters one by one. Another option is to swap the bytes if we're big-endian. Go for the latter with le32_to_cpu(). Cc: stable@vger.kernel.org # v4.16 Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Greg Kurz	3fbf78b252	ocxl: Fix endiannes bug in ocxl_link_update_pe() commit `e1e71e2017` upstream. All fields in the PE are big-endian. Use cpu_to_be32() like everywhere else something is written to the PE. Otherwise a wrong TID will be used by the NPU. If this TID happens to point to an existing thread sharing the same mm, it could be woken up by error. This is highly improbable though. The likely outcome of this is the NPU not finding the target thread and forcing the AFU into sending an interrupt, which userspace is supposed to handle anyway. Fixes: `e948e06fc6` ("ocxl: Expose the thread_id needed for wait on POWER9") Cc: stable@vger.kernel.org # v4.18 Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Arnaldo Carvalho de Melo	65e4e67de3	perf env: Also consider env->arch == NULL as local operation commit `804234f271` upstream. We'll set a new machine field based on env->arch, which for live mode, like with 'perf top' means we need to use uname() to figure the name of the arch, fix perf_env__arch() to consider both (env == NULL) and (env->arch == NULL) as local operation. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: stable@vger.kernel.org # 4.19 Link: https://lkml.kernel.org/n/tip-vcz4ufzdon7cwy8dm2ua53xk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:43 +01:00
Ben Hutchings	d124dd5c6a	perf pmu: Suppress potential format-truncation warning commit `11a64a05dc` upstream. Depending on which functions are inlined in util/pmu.c, the snprintf() calls in perf_pmu__parse_{scale,unit,per_pkg,snapshot}() might trigger a warning: util/pmu.c: In function 'pmu_aliases': util/pmu.c:178:31: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=] snprintf(path, PATH_MAX, "%s/%s.unit", dir, name); ^~ I found this when trying to build perf from Linux 3.16 with gcc 8. However I can reproduce the problem in mainline if I force __perf_pmu__new_alias() to be inlined. Suppress this by using scnprintf() as has been done elsewhere in perf. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20181111184524.fux4taownc6ndbx6@decadent.org.uk Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Adrian Hunter	307dbd3836	perf script: Use fallbacks for branch stacks commit `692d0e6332` upstream. Branch stacks do not necessarily have the same cpumode as the 'ip'. Use the fallback functions in those cases. This patch depends on patch "perf tools: Add fallback functions for cases where cpumode is insufficient". Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Adrian Hunter	39dad822b7	perf tools: Use fallback for sample_addr_correlates_sym() cases commit `225f99e0c8` upstream. thread__resolve() is used in the sample_addr_correlates_sym() cases where 'addr' is a destination of a branch which does not necessarily have the same cpumode as the 'ip'. Use the fallback function in that case. This patch depends on patch "perf tools: Add fallback functions for cases where cpumode is insufficient". Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Adrian Hunter	0ada27a744	perf thread: Add fallback functions for cases where cpumode is insufficient commit `8e80ad9983` upstream. For branch stacks or branch samples, the sample cpumode might not be correct because it applies only to the sample 'ip' and not necessary to 'addr' or branch stack addresses. Add fallback functions that can be used to deal with those cases Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Adrian Hunter	62977a9ba8	perf machine: Record if a arch has a single user/kernel address space commit `ec1891afae` upstream. Some architectures have a single address space for kernel and user addresses, which makes it possible to determine if an address is in kernel space or user space. Some don't, e.g.: sparc. Cache that info in perf_env so that, for instance, code needing to fallback failed symbol lookups at the kernel space in single address space arches can lookup at userspace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Alexey Brodkin	bf75d9382b	clocksource/drivers/arc_timer: Utilize generic sched_clock commit `bf287607c8` upstream. It turned out we used to use default implementation of sched_clock() from kernel/sched/clock.c which was as precise as 1/HZ, i.e. by default we had 10 msec granularity of time measurement. Now given ARC built-in timers are clocked with the same frequency as CPU cores we may get much higher precision of time tracking. Thus we switch to generic sched_clock which really reads ARC hardware counters. This is especially helpful for measuring short events. That's what we used to have: ------------------------------>8------------------------ $ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello': 10.000000 task-clock (msec) # 2.832 CPUs utilized 1 context-switches # 0.100 K/sec 1 cpu-migrations # 0.100 K/sec 63 page-faults # 0.006 M/sec 3049480 cycles # 0.305 GHz 1091259 instructions # 0.36 insn per cycle 256828 branches # 25.683 M/sec 27026 branch-misses # 10.52% of all branches 0.003530687 seconds time elapsed 0.000000000 seconds user 0.010000000 seconds sys ------------------------------>8------------------------ And now we'll see: ------------------------------>8------------------------ $ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello': 3.004322 task-clock (msec) # 0.865 CPUs utilized 1 context-switches # 0.333 K/sec 1 cpu-migrations # 0.333 K/sec 63 page-faults # 0.021 M/sec 2986734 cycles # 0.994 GHz 1087466 instructions # 0.36 insn per cycle 255209 branches # 84.947 M/sec 26002 branch-misses # 10.19% of all branches 0.003474829 seconds time elapsed 0.003519000 seconds user 0.000000000 seconds sys ------------------------------>8------------------------ Note how much more meaningful is the second output - time spent for execution pretty much matches number of cycles spent (we're runnign @ 1GHz here). Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Eugeniy Paltsev	ca3a6fd272	DRM: UDL: get rid of useless vblank initialization commit `32e932e37e` upstream. UDL doesn't support vblank functionality so we don't need to initialize vblank here (we are able to send page flip completion events even without vblank initialization) Moreover current drm_vblank_init call with num_crtcs > 0 causes sending DRM_EVENT_FLIP_COMPLETE event with zero timestamp every time. This breaks userspace apps (for example weston) which relies on timestamp value. Cc: stable@vger.kernel.org Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180928144126.21598-1-Eugeniy.Paltsev@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Eric Anholt	29ac2218a9	drm/v3d: Skip debugfs dumping GCA on platforms without GCA. commit `2f20fa8d12` upstream. Fixes an oops reading this debugfs entry on BCM7278. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20180928232126.4332-4-eric@anholt.net Fixes: `57692c94dc` ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+") Cc: <stable@vger.kernel.org> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Miquel Raynal	6c56e89e4e	platform-msi: Free descriptors in platform_msi_domain_free() commit `81b1e6e6a8` upstream. Since the addition of platform MSI support, there were two helpers supposed to allocate/free IRQs for a device: platform_msi_domain_alloc_irqs() platform_msi_domain_free_irqs() In these helpers, IRQ descriptors are allocated in the "alloc" routine while they are freed in the "free" one. Later, two other helpers have been added to handle IRQ domains on top of MSI domains: platform_msi_domain_alloc() platform_msi_domain_free() Seen from the outside, the logic is pretty close with the former helpers and people used it with the same logic as before: a platform_msi_domain_alloc() call should be balanced with a platform_msi_domain_free() call. While this is probably what was intended to do, the platform_msi_domain_free() does not remove/free the IRQ descriptor(s) created/inserted in platform_msi_domain_alloc(). One effect of such situation is that removing a module that requested an IRQ will let one orphaned IRQ descriptor (with an allocated MSI entry) in the device descriptors list. Next time the module will be inserted back, one will observe that the allocation will happen twice in the MSI domain, one time for the remaining descriptor, one time for the new one. It also has the side effect to quickly overshoot the maximum number of allocated MSI and then prevent any module requesting an interrupt in the same domain to be inserted anymore. This situation has been met with loops of insertion/removal of the mvpp2.ko module (requesting 15 MSIs each time). Fixes: `552c494a76` ("platform-msi: Allow creation of a MSI-based stacked irq domain") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:42 +01:00
Sean Christopherson	c9dae887cf	KVM: nVMX: Free the VMREAD/VMWRITE bitmaps if alloc_kvm_area() fails commit `1b3ab5ad1b` upstream. Fixes: `34a1cd60d1` ("kvm: x86: vmx: move some vmx setting from vmx_init() to hardware_setup()") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Marc Zyngier	07cbcfc33f	arm64: KVM: Make VHE Stage-2 TLB invalidation operations non-interruptible commit `c987876a80` upstream. Contrary to the non-VHE version of the TLB invalidation helpers, the VHE code has interrupts enabled, meaning that we can take an interrupt in the middle of such a sequence, and start running something else with HCR_EL2.TGE cleared. That's really not a good idea. Take the heavy-handed option and disable interrupts in __tlb_switch_to_guest_vhe, restoring them in __tlb_switch_to_host_vhe. The latter also gain an ISB in order to make sure that TGE really has taken effect. Cc: stable@vger.kernel.org Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Sean Christopherson	edcf33b155	KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup commit `e814349950` upstream. ____kvm_handle_fault_on_reboot() provides a generic exception fixup handler that is used to cleanly handle faults on VMX/SVM instructions during reboot (or at least try to). If there isn't a reboot in progress, ____kvm_handle_fault_on_reboot() treats any exception as fatal to KVM and invokes kvm_spurious_fault(), which in turn generates a BUG() to get a stack trace and die. When it was originally added by commit `4ecac3fd6d` ("KVM: Handle virtualization instruction #UD faults during reboot"), the "call" to kvm_spurious_fault() was handcoded as PUSH+JMP, where the PUSH'd value is the RIP of the faulting instructing. The PUSH+JMP trickery is necessary because the exception fixup handler code lies outside of its associated function, e.g. right after the function. An actual CALL from the .fixup code would show a slightly bogus stack trace, e.g. an extra "random" function would be inserted into the trace, as the return RIP on the stack would point to no known function (and the unwinder will likely try to guess who owns the RIP). Unfortunately, the JMP was replaced with a CALL when the macro was reworked to not spin indefinitely during reboot (commit `b7c4145ba2` "KVM: Don't spin on virt instruction faults during reboot"). This causes the aforementioned behavior where a bogus function is inserted into the stack trace, e.g. my builds like to blame free_kvm_area(). Revert the CALL back to a JMP. The changelog for commit `b7c4145ba2` ("KVM: Don't spin on virt instruction faults during reboot") contains nothing that indicates the switch to CALL was deliberate. This is backed up by the fact that the PUSH <insn RIP> was left intact. Note that an alternative to the PUSH+JMP magic would be to JMP back to the "real" code and CALL from there, but that would require adding a JMP in the non-faulting path to avoid calling kvm_spurious_fault() and would add no value, i.e. the stack trace would be the same. Using CALL: ------------[ cut here ]------------ kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! invalid opcode: 0000 [#1] SMP CPU: 4 PID: 1057 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffc900004bbcc8 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff888273fd8000 R08: 00000000000003e8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000371fb0 R13: 0000000000000000 R14: 000000026d763cf4 R15: ffff888273fd8000 FS: 00007f3d69691700(0000) GS:ffff888277800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f89bc56fe0 CR3: 0000000271a5a001 CR4: 0000000000362ee0 Call Trace: free_kvm_area+0x1044/0x43ea [kvm_intel] ? vmx_vcpu_run+0x156/0x630 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? __set_task_blocked+0x38/0x90 ? __set_current_blocked+0x50/0x60 ? __fpu__restore_sig+0x97/0x490 ? do_vfs_ioctl+0xa1/0x620 ? __x64_sys_futex+0x89/0x180 ? ksys_ioctl+0x66/0x70 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x4f/0x100 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc ---[ end trace 9775b14b123b1713 ]--- Using JMP: ------------[ cut here ]------------ kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! invalid opcode: 0000 [#1] SMP CPU: 6 PID: 1067 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffc90000497cd0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88827058bd40 R08: 00000000000003e8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000369fb0 R13: 0000000000000000 R14: 00000003c8fc6642 R15: ffff88827058bd40 FS: 00007f3d7219e700(0000) GS:ffff888277900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3d64001000 CR3: 0000000271c6b004 CR4: 0000000000362ee0 Call Trace: vmx_vcpu_run+0x156/0x630 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? __set_task_blocked+0x38/0x90 ? __set_current_blocked+0x50/0x60 ? __fpu__restore_sig+0x97/0x490 ? do_vfs_ioctl+0xa1/0x620 ? __x64_sys_futex+0x89/0x180 ? ksys_ioctl+0x66/0x70 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x4f/0x100 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc ---[ end trace f9daedb85ab3ddba ]--- Fixes: `b7c4145ba2` ("KVM: Don't spin on virt instruction faults during reboot") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Dan Williams	4910271928	x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init() commit `ba6f508d0e` upstream. Commit: `f77084d963` "x86/mm/pat: Disable preemption around __flush_tlb_all()" addressed a case where __flush_tlb_all() is called without preemption being disabled. It also left a warning to catch other cases where preemption is not disabled. That warning triggers for the memory hotplug path which is also used for persistent memory enabling: WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460 RIP: 0010:__flush_tlb_all+0x1b/0x3a [..] Call Trace: phys_pud_init+0x29c/0x2bb kernel_physical_mapping_init+0xfc/0x219 init_memory_mapping+0x1a5/0x3b0 arch_add_memory+0x2c/0x50 devm_memremap_pages+0x3aa/0x610 pmem_attach_disk+0x585/0x700 [nd_pmem] Andy wondered why a path that can sleep was using __flush_tlb_all() [1] and Dave confirmed the expectation for TLB flush is for modifying / invalidating existing PTE entries, but not initial population [2]. Drop the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the expectation that this path is only ever populating empty entries for the linear map. Note, at linear map teardown time there is a call to the all-cpu flush_tlb_all() to invalidate the removed mappings. [1]: https://lkml.kernel.org/r/9DFD717D-857D-493D-A606-B635D72BAC21@amacapital.net [2]: https://lkml.kernel.org/r/749919a4-cdb1-48a3-adb4-adb81a5fa0b5@intel.com [ mingo: Minor readability edits. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave.hansen@intel.com Fixes: `f77084d963` ("x86/mm/pat: Disable preemption around __flush_tlb_all()") Link: http://lkml.kernel.org/r/154395944713.32119.15611079023837132638.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Michal Hocko	86ba6f66c9	x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off commit `5b5e4d623e` upstream. Swap storage is restricted to max_swapfile_size (~16TB on x86_64) whenever the system is deemed affected by L1TF vulnerability. Even though the limit is quite high for most deployments it seems to be too restrictive for deployments which are willing to live with the mitigation disabled. We have a customer to deploy 8x 6,4TB PCIe/NVMe SSD swap devices which is clearly out of the limit. Drop the swap restriction when l1tf=off is specified. It also doesn't make much sense to warn about too much memory for the l1tf mitigation when it is forcefully disabled by the administrator. [ tglx: Folded the documentation delta change ] Fixes: `377eeaa8e1` ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2") Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: <linux-mm@kvack.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181113184910.26697-1-mhocko@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Patrick Dreyer	aeb5e53416	Input: elan_i2c - add ACPI ID for touchpad in ASUS Aspire F5-573G commit `7db54c89f0` upstream. This adds ELAN0501 to the ACPI table to support Elan touchpad found in ASUS Aspire F5-573G. Signed-off-by: Patrick Dreyer <Patrick.Dreyer@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Sanjeev Chugh	f168056530	Input: atmel_mxt_ts - don't try to free unallocated kernel memory commit `1e3c336ad8` upstream. If the user attempts to update Atmel device with an invalid configuration cfg file, error handling code is trying to free cfg file memory which is not allocated yet hence results into kernel crash. This patch fixes the order of memory free operations. Signed-off-by: Sanjeev Chugh <sanjeev_chugh@mentor.com> Fixes: `a4891f1058` ("Input: atmel_mxt_ts - zero terminate config firmware file") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Sebastian Ott	d648a9bdac	s390/pci: fix sleeping in atomic during hotplug commit `98dfd32620` upstream. When triggered by pci hotplug (PEC 0x306) clp_get_state is called with spinlocks held resulting in the following warning: zpci: n/a: Event 0x306 reconfigured PCI function 0x0 BUG: sleeping function called from invalid context at mm/page_alloc.c:4324 in_atomic(): 1, irqs_disabled(): 0, pid: 98, name: kmcheck 2 locks held by kmcheck/98: Change the allocation to use GFP_ATOMIC. Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Hans de Goede	47148001ae	ASoC: intel: cht_bsw_max98090_ti: Add pmc_plt_clk_0 quirk for Chromebook Gnawty commit `94ea56cff5` upstream. The Gnawty model Chromebook uses pmc_plt_clk_0 instead of pmc_plt_clk_3 for the mclk, just like the Clapper and Swanky models. This commit adds a DMI based quirk for this. This fixing audio no longer working on these devices after commit `648e921888` ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL") that commit fixes us unnecessary keeping unused clocks on, but in case of the Gnawty that was breaking audio support since we were not using the right clock in the cht_bsw_max98090_ti machine driver. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201787 Cc: stable@vger.kernel.org Fixes: `648e921888` ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL") Reported-and-tested-by: Jaime Pérez <19.jaime.91@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Hans de Goede	c4b6173e54	ASoC: intel: cht_bsw_max98090_ti: Add pmc_plt_clk_0 quirk for Chromebook Clapper commit `984bfb398a` upstream. The Clapper model Chromebook uses pmc_plt_clk_0 instead of pmc_plt_clk_3 for the mclk, just like the Swanky model. This commit adds a DMI based quirk for this. This fixing audio no longer working on these devices after commit `648e921888` ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL") that commit fixes us unnecessary keeping unused clocks on, but in case of the Clapper that was breaking audio support since we were not using the right clock in the cht_bsw_max98090_ti machine driver. Cc: stable@vger.kernel.org Fixes: `648e921888` ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:41 +01:00
Colin Ian King	6cd208cb93	staging: wilc1000: fix missing read_write setting when reading data commit `c58eef061d` upstream. Currently the cmd.read_write setting is not initialized so it contains garbage from the stack. Fix this by setting it to 0 to indicate a read is required. Detected by CoverityScan, CID#1357925 ("Uninitialized scalar variable") Fixes: `c5c77ba18e` ("staging: wilc1000: Add SDIO/SPI 802.11 driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: stable <stable@vger.kernel.org> Acked-by: Ajay Singh <ajay.kathat@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:40 +01:00
Malcolm Priestley	80562cf3b1	media: dvb-usb-v2: Fix incorrect use of transfer_flags URB_FREE_BUFFER commit `255095fa7f` upstream. commit `1a0c10ed7b` ("media: dvb-usb-v2: stop using coherent memory for URBs") incorrectly adds URB_FREE_BUFFER after every urb transfer. It cannot use this flag because it reconfigures the URBs accordingly to suit connected devices. In doing a call to usb_free_urb is made and invertedly frees the buffers. The stream buffer should remain constant while driver is up. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> CC: stable@vger.kernel.org # v4.18+ Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:40 +01:00
Heikki Krogerus	f295bc9b8c	usb: roles: Add a description for the class to Kconfig commit `c3788cd996` upstream. That makes the USB role switch support option visible and selectable for the user. The class driver is also moved to drivers/usb/roles/ directory. This will fix an issue that we have with the Intel USB role switch driver on systems that don't have USB Type-C connectors: Intel USB role switch driver depends on the USB role switch class as it should, but since there was no way for the user to enable the USB role switch class, there was also no way to select that driver. USB Type-C drivers select the USB role switch class which makes the Intel USB role switch driver available and therefore hides the problem. So in practice Intel USB role switch driver was depending on USB Type-C drivers. Fixes: `f6fb9ec02b` ("usb: roles: Add Intel xHCI USB role switch driver") Cc: <stable@vger.kernel.org> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-09 17:38:40 +01:00

1 2 3 4 5 ...

785323 Commits