linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-06 19:08:57 +09:00

Author	SHA1	Message	Date
Greg Kroah-Hartman	9cf2ceaffd	Merge 5.10.5 into android12-5.10 Changes in 5.10.5 net/sched: sch_taprio: reset child qdiscs before freeing them mptcp: fix security context on server socket ethtool: fix error paths in ethnl_set_channels() ethtool: fix string set id check md/raid10: initialize r10_bio->read_slot before use. drm/amd/display: Add get_dig_frontend implementation for DCEx io_uring: close a small race gap for files cancel jffs2: Allow setting rp_size to zero during remounting jffs2: Fix NULL pointer dereference in rp_size fs option parsing spi: dw-bt1: Fix undefined devm_mux_control_get symbol opp: fix memory leak in _allocate_opp_table opp: Call the missing clk_put() on error scsi: block: Fix a race in the runtime power management code mm/hugetlb: fix deadlock in hugetlb_cow error path mm: memmap defer init doesn't work as expected lib/zlib: fix inflating zlib streams on s390 io_uring: don't assume mm is constant across submits io_uring: use bottom half safe lock for fixed file data io_uring: add a helper for setting a ref node io_uring: fix io_sqe_files_unregister() hangs uapi: move constants from <linux/kernel.h> to <linux/const.h> tools headers UAPI: Sync linux/const.h with the kernel headers cgroup: Fix memory leak when parsing multiple source parameters zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c scsi: cxgb4i: Fix TLS dependency Bluetooth: hci_h5: close serdev device and free hu in h5_close fbcon: Disable accelerated scrolling reiserfs: add check for an invalid ih_entry_count misc: vmw_vmci: fix kernel info-leak by initializing dbells in vmci_ctx_get_chkpt_doorbells() media: gp8psk: initialize stats at power control logic f2fs: fix shift-out-of-bounds in sanity_check_raw_super() ALSA: seq: Use bool for snd_seq_queue internal flags ALSA: rawmidi: Access runtime->avail always in spinlock bfs: don't use WARNING: string when it's just info. 
ext4: check for invalid block size early when mounting a file system fcntl: Fix potential deadlock in send_sig{io, urg}() io_uring: check kthread stopped flag when sq thread is unparked rtc: sun6i: Fix memleak in sun6i_rtc_clk_init module: set MODULE_STATE_GOING state when a module fails to load quota: Don't overflow quota file offsets rtc: pl031: fix resource leak in pl031_probe powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe() i3c master: fix missing destroy_workqueue() on error in i3c_master_register NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode f2fs: avoid race condition for shrinker count f2fs: fix race of pending_pages in decompression module: delay kobject uevent until after module init call powerpc/64: irq replay remove decrementer overflow check fs/namespace.c: WARN if mnt_count has become negative watchdog: rti-wdt: fix reference leak in rti_wdt_probe um: random: Register random as hwrng-core device um: ubd: Submit all data segments atomically NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails drm/amd/display: updated wm table for Renoir tick/sched: Remove bogus boot "safety" check s390: always clear kernel stack backchain before calling functions io_uring: remove racy overflow list fast checks ALSA: pcm: Clear the full allocated memory at hw_params dm verity: skip verity work if I/O error when system is shutting down ext4: avoid s_mb_prefetch to be zero in individual scenarios device-dax: Fix range release Linux 5.10.5 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I2b481bfac06bafdef2cf3cc1ac2c2a4ddf9913dc	2021-01-10 12:19:03 +01:00
Suren Baghdasaryan	275bdf7976	ANDROID: GKI: disable CONFIG_MEMCG CONFIG_MEMCG introduces overhead both in terms of memory usage as well as in the minor page fault path and after moving to PSI it is currently unused on non-Android Go devices. Disable it in GKI to avoid the overhead. Bug: 169443770 Bug: 172296409 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I717c2a1bde6264285b86d583ae1a1007c36be223	2021-01-08 18:30:50 +00:00
Chris Goldsworthy	91ce4829f6	ANDROID: mm, oom: Avoid killing tasks with negative ADJ scores Only kill a task with a negative ADJ score if there are no tasks with non-negative ADJ scores. Otherwise, kill the task with the most badness points whose ADJ score is also positive, if such a suitable task exists. Bug: 173837271 Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Change-Id: I70fe48a3eeb853085bb1acfb422f88cd36d1f14d	2021-01-08 17:03:42 +00:00
Park Bumgyu	f9ebdfbf70	ANDROID: add flags to android_rvh_enqueue_task/dequeue_task parameter "flags" is added to the vendor hook parameter so that the module can know the event type of task enqueue/dequeue. Bug: 176917922 Signed-off-by: Park Bumgyu <bumgyu.park@samsung.com> Change-Id: I7cc60908e301d75393bdf84861878a94de80d683	2021-01-08 16:46:20 +00:00
Shaleen Agrawal	372cb88a76	ANDROID: Sched: Add export symbol resched_curr Add export symbol resched_curr to enable scheduler value add. Bug: 176077958 Change-Id: I9c26b4d8738d6fd7d1067cb164a30b0228c5a301 Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>	2021-01-08 02:01:45 +00:00
Shaleen Agrawal	1feedbd763	ANDROID: Sched: Add hooks for scheduler Add vendor hooks to facilitate various scheduler value adds. Bug: 176077958 Change-Id: I5d488ae78ce05f81e6c73b69c56128b065647fec Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>	2021-01-08 02:01:31 +00:00
Will Deacon	0b24bdb73c	UPSTREAM: arm64: sdei: Push IS_ENABLED() checks down to callee functions Handling all combinations of the VMAP_STACK and SHADOW_CALL_STACK options in sdei_arch_get_entry_point() makes the code difficult to read, particularly when considering the error and cleanup paths. Move the checking of these options into the callee functions, so that they return early if the relevant option is not enabled. Bug: 169781940 Change-Id: I3daf8a409d3544fa4e76a28c2b2ae9efb82001ba (cherry picked from commit `eec3bf6861`) Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2021-01-07 17:56:54 -08:00
Sami Tolvanen	1868c4c8cb	UPSTREAM: arm64: scs: use vmapped IRQ and SDEI shadow stacks Use scs_alloc() to allocate also IRQ and SDEI shadow stacks instead of using statically allocated stacks. Bug: 169781940 Change-Id: If3f38d603a7c1e8ebcf1e8655b70fa6bfde7c48d (cherry picked from commit `ac20ffbb02`) Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130233442.2562064-3-samitolvanen@google.com [will: Move CONFIG_SHADOW_CALL_STACK check into init_irq_scs()] Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 17:56:54 -08:00
Sami Tolvanen	27047fb22e	UPSTREAM: scs: switch to vmapped shadow stacks The kernel currently uses kmem_cache to allocate shadow call stacks, which means an overflow may not be immediately detected and can potentially result in another task's shadow stack being overwritten. This change switches SCS to use virtually mapped shadow stacks for tasks, which increases shadow stack size to a full page and provides more robust overflow detection, similarly to VMAP_STACK. Bug: 169781940 Change-Id: I92c8f5706c11e4bf45b071e4f302a65502faa1e1 (cherry picked from commit `a2abe7cbd8`) Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130233442.2562064-2-samitolvanen@google.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 17:56:54 -08:00
Satya Durga Srinivasu Prabhala	598670e2d0	ANDROID: sched: add trace hook to enable EAS for SMP systems At present, EAS gets disabled on ASYM Capacity systems if all BIG or Little CPUs get hot-plugged. Instead of disabling EAS by default, add a trace hook and let the vendor decide whether EAS should be disabled. Bug: 176964092 Change-Id: I583272cc89d44f3e3a4b1c43e3f75d731092ebf6 Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>	2021-01-07 22:34:23 +00:00
Pavankumar Kondeti	2a715fd012	ANDROID: sched/tracing: Print task status in sched_migrate_task A task can migrate either while it is waking or while it is running via load balancer. Print the task status i.e running or not in sched_migrate_task. This helps in counting the different types of migrations without relying on other trace events. Bug: 176709810 Change-Id: Ib473f9ccdc78003bb1f5d2dc24354f2db7a684f5 Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>	2021-01-07 22:31:31 +00:00
Pavankumar Kondeti	5ed9ed0164	UPSTREAM: PM / EM: Micro optimization in em_cpu_energy When the sum of the utilization of CPUs in a power domain is zero, return the energy as 0 without doing any computations. Acked-by: Quentin Perret <qperret@google.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit `9cc7e96aa8`) Bug: 173981595 Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Change-Id: I9b1a83d210c30a8a86da26f94ac0c2f855d2ed10	2021-01-07 22:30:38 +00:00
Ram Muthiah	f2684370d3	ANDROID: GKI: Disable symbol stripping Temporary workaround to enable arm64 gki devices to boot. Virtual devices failed to boot with 5.10 ARM64 GKI because symbol stripping has removed tracepoint symbols pertaining to xdp which are included in the symbol allowlist. klog excerpt for this error. init: Loading module /lib/modules/virtio_net.ko with args "" virtio_net: disagrees about version of symbol __traceiter_xdp_exception virtio_net: Unknown symbol __traceiter_xdp_exception (err -22) virtio_net: disagrees about version of symbol __tracepoint_xdp_exception virtio_net: Unknown symbol __tracepoint_xdp_exception (err -22) init: Failed to insmod '/lib/modules/virtio_net.ko' with args '' init: LoadWithAliases was unable to load virtio_net init: Failed to load kernel modules Bug: 176831960 Test: Treehugger Signed-off-by: Ram Muthiah <rammuthiah@google.com> Change-Id: If5b6fd12ce1c783966ff4ed0a8bc141d077c71a3	2021-01-07 19:39:38 +00:00
Daeho Jeong	ade6dc441f	ANDROID: GKI: bfq: enable bfq i/o group scheduling To enable bfq i/o group scheduling for separating i/o groups to foreground and background i/o groups, we need to set CONFIG_IOSCHED_BFQ and CONFIG_BFQ_GROUP_IOSCHED to "y". Bug: 171739280 Bug: 172520400 Signed-off-by: Daeho Jeong <daehojeong@google.com> Change-Id: If9b5664ecfc8f78d9792d7ee5d3ea5a88a50b9d7	2021-01-07 19:34:46 +00:00
Jimmy Shiu	4d1055d3d8	ANDROID: sched: cpufreq_schedutil: add sugov tracepoints Add vendor hook tracepoints to track when cpu util gets updated and when freq is chosen. Bug: 174488007 Signed-off-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rohit Gupta <rohgup@codeaurora.org> Signed-off-by: Jonathan Avila <avilaj@codeaurora.org> Signed-off-by: Jimmy Shiu <jimmyshiu@google.com> Change-Id: Ibb22fd0337a2539820a05b1e6b54b09aeaebd040 Signed-off-by: Will McVicker <willmcvicker@google.com>	2021-01-06 17:13:29 -08:00
Liam Mark	fd0328e37d	UPSTREAM: mm/page_owner: record timestamp and pid Collect the time for each allocation recorded in page owner so that allocation "surges" can be measured. Record the pid for each allocation recorded in page owner so that the source of allocation "surges" can be better identified. The above is very useful when doing memory analysis. On a crash for example, we can get this information from kdump (or ramdump) and parse it to figure out memory allocation problems. Please note that on x86_64 this increases the size of struct page_owner from 16 bytes to 32. Vlastimil: it's not a functionality intended for production, so unless somebody says they need to enable page_owner for debugging and this increase prevents them from fitting into available memory, let's not complicate things with making this optional. [lmark@codeaurora.org: v3] Link: https://lkml.kernel.org/r/20201210160357.27779-1-georgi.djakov@linaro.org Link: https://lkml.kernel.org/r/20201209125153.10533-1-georgi.djakov@linaro.org Signed-off-by: Liam Mark <lmark@codeaurora.org> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit `9cc7e96aa8`) Bug: 175129313 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I5e246ea009c7e9e34c1cc608bcd3196fc0e623b4	2021-01-06 23:25:47 +00:00
Ram Muthiah	e7579f4acd	ANDROID: renamed virtual device symbol list Formerly cuttlefish and goldfish had separate symbol lists. The defconfigs and symbol lists were unified recently. However, the symbol lists should conform to this naming convention. Generated with BUILD_CONFIG=common/build.config.gki.aarch64 build/build.sh; BUILD_CONFIG= \ common-modules/virtual-device/build.config.virtual_device.aarch64 \ build/build.sh; build/abi/extract_symbols out/android12-5.10/dist/ \ --whitelist common/android/abi_gki_aarch64_virtual_device Test: Treehugger Bug: 176831960 Signed-off-by: Ram Muthiah <rammuthiah@google.com> Change-Id: I21755fbd3e9ab6319fdf4fcd06e501d722fb7242	2021-01-06 21:26:57 +00:00
Charan Teja Reddy	a7af91adc7	ANDROID: mm: oom_kill: reap memory of a task that receives SIGKILL Free the pages in parallel for a task that receives SIGKILL from the ULMK process, using the oom_reaper. Freeing the pages this way helps return them to the buddy system well in advance. Add the boot param reap_mem_when_killed_by=, which names the process whose kill signal causes the target's memory to be reaped by the oom reaper. For example, with reap_mem_when_killed_by=lmkd, every process that receives the kill signal from lmkd is added to the oom reaper. Leaving this param unset disables the feature. Change-Id: I21adb95de5e380a80d7eb0b87d9b5b553f52e28a Bug: 171763461 Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>	2021-01-06 21:11:02 +00:00
Jaegeuk Kim	51ca215606	Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.10.y' into android12-5.10 * aosp/upstream-f2fs-stable-linux-5.10.y: fs-verity: move structs needed for file signing to UAPI header fs-verity: rename "file measurement" to "file digest" fs-verity: rename fsverity_signed_digest to fsverity_formatted_digest fs-verity: remove filenames from file comments fscrypt: allow deleting files with unsupported encryption policy fscrypt: unexport fscrypt_get_encryption_info() fscrypt: move fscrypt_require_key() to fscrypt_private.h fscrypt: move body of fscrypt_prepare_setattr() out-of-line fscrypt: introduce fscrypt_prepare_readdir() ext4: don't call fscrypt_get_encryption_info() from dx_show_leaf() ubifs: remove ubifs_dir_open() f2fs: remove f2fs_dir_open() ext4: remove ext4_dir_open() fscrypt: simplify master key locking fscrypt: remove unnecessary calls to fscrypt_require_key() ubifs: prevent creating duplicate encrypted filenames f2fs: prevent creating duplicate encrypted filenames ext4: prevent creating duplicate encrypted filenames fscrypt: add fscrypt_is_nokey_name() fscrypt: remove kernel-internal constants from UAPI header Conflicts: fs/crypto/hooks.c Bug: 174873661 Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> Change-Id: Id56d42fc959242524628752223e9d773a2c8681c	2021-01-06 09:39:43 -08:00
Yan Yan	5979a56598	ANDROID: GKI: Enable XFRM_MIGRATE To be able to update addresses of an IPsec SA, as required by supporting MOBIKE Bug: 169169084 Signed-off-by: Yan Yan <evitayan@google.com> Change-Id: I5aa3f3556d615e4f0695bb78cd3cad9e83851df5	2021-01-06 16:25:22 +00:00
Vijayanand Jitta	b76264c26c	ANDROID: mm: Export get_page_owner Export get_page_owner symbol for loadable vendor modules. Bug: 176277889 Change-Id: Iea0a8022e542d1223caf4a742a888647828ca7cc Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>	2021-01-06 16:16:52 +00:00
Vijayanand Jitta	0a7166ae71	ANDROID: mm: Export lookup_page_ext Export lookup_page_ext symbol for loadable vendor modules. Bug: 176277892 Change-Id: If7de83bf48c2867460ec88e61e0f709958dc5e16 Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>	2021-01-06 16:16:41 +00:00
Vijayanand Jitta	df2e575fcc	ANDROID: mm: Export get_slabinfo Export get_slabinfo symbol for loadable vendor modules. Bug: 176277895 Change-Id: I01870a370da9bf5db842ff14801d94ef79350560 Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>	2021-01-06 16:16:27 +00:00
Greg Kroah-Hartman	f5247949c0	Linux 5.10.5 Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210104155708.800470590@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:56 +01:00
Dan Williams	12d377b93e	device-dax: Fix range release [ Upstream commit `6268d7da4d` ] There are multiple locations that open-code the release of the last range in a device-dax instance. Consolidate this into a new dev_dax_trim_range() helper. This also addresses a kmemleak report: # cat /sys/kernel/debug/kmemleak [..] unreferenced object 0xffff976bd46f6240 (size 64): comm "ndctl", pid 23556, jiffies 4299514316 (age 5406.733s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00 .......... .7... ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00 ....8........... backtrace: [<00000000064003cf>] __kmalloc_track_caller+0x136/0x379 [<00000000d85e3c52>] krealloc+0x67/0x92 [<00000000d7d3ba8a>] __alloc_dev_dax_range+0x73/0x25c [<0000000027d58626>] devm_create_dev_dax+0x27d/0x416 [<00000000434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core] [<0000000083726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem] [<00000000b5f2319c>] nvdimm_bus_probe+0x9d/0x340 [libnvdimm] [<00000000c055e544>] really_probe+0x230/0x48d [<000000006cabd38e>] driver_probe_device+0x122/0x13b [<0000000029c7b95a>] device_driver_attach+0x5b/0x60 [<0000000053e5659b>] bind_store+0xb7/0xc3 [<00000000d3bdaadc>] drv_attr_store+0x27/0x31 [<00000000949069c5>] sysfs_kf_write+0x4a/0x57 [<000000004a8b5adf>] kernfs_fop_write+0x150/0x1e5 [<00000000bded60f0>] __vfs_write+0x1b/0x34 [<00000000b92900f0>] vfs_write+0xd8/0x1d1 Reported-by: Jane Chu <jane.chu@oracle.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/160834570161.1791850.14911670304441510419.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Chunguang Xu	aceb8ae8e3	ext4: avoid s_mb_prefetch to be zero in individual scenarios [ Upstream commit `82ef1370b0` ] Commit `cfd7323772` ("ext4: add prefetching for block allocation bitmaps") introduced block bitmap prefetch, and expects to read block bitmaps of flex_bg through an IO. However, it seems to ignore the value range of s_log_groups_per_flex. In the scenario where the value of s_log_groups_per_flex is greater than 27, s_mb_prefetch or s_mb_prefetch_limit will overflow, causing a divide-by-zero exception. In addition, the logic of calculating nr is also flawed, because the size of flexbg is fixed during a single mount, but s_mb_prefetch can be modified, which causes nr to fail to meet the value condition of [1, flexbg_size]. To solve this problem, we need to set the upper limit of s_mb_prefetch. Since we expect to load block bitmaps of a flex_bg through an IO, we can consider determining a reasonable upper limit among the IO limit parameters. After consideration, we chose BLK_MAX_SEGMENT_SIZE. This is a good choice for solving the divide-by-zero problem while avoiding performance degradation. [ Some minor code simplifications to make the changes easy to follow -- TYT ] Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Chunguang Xu <brookxu@tencent.com> Reviewed-by: Samuel Liao <samuelliao@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1607051143-24508-1-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Hyeongseok Kim	aff18aa806	dm verity: skip verity work if I/O error when system is shutting down [ Upstream commit `252bd12563` ] If emergency system shutdown is called, like by thermal shutdown, a dm device could be alive when the block device couldn't process I/O requests anymore. In this state, the handling of I/O errors by new dm I/O requests or by those already in-flight can lead to a verity corruption state, which is a misjudgment. So, skip verity work in response to I/O error when system is shutting down. Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Takashi Iwai	610d2fa0ec	ALSA: pcm: Clear the full allocated memory at hw_params [ Upstream commit `618de0f4ef` ] The PCM hw_params core function tries to clear the PCM buffer before actually using it, to avoid leaking information from previous usages or from usage before a new allocation. It performs the memset() with runtime->dma_bytes, but this might still leave some remaining bytes untouched; namely, the PCM buffer size is aligned to the page size for mmap, hence runtime->dma_bytes doesn't necessarily cover all PCM buffer pages, and the remaining bytes are exposed via mmap. This patch changes the memory clearance to cover all the buffer pages if the stream is supposed to be mmap-ready (which guarantees that the buffer size is aligned to the page size). Reviewed-by: Lars-Peter Clausen <lars@metafoo.de> Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Pavel Begunkov	c7b04d27c9	io_uring: remove racy overflow list fast checks [ Upstream commit `9cd2be519d` ] list_empty_careful() is not racy only if some conditions are met, i.e. no re-adds after del_init. io_cqring_overflow_flush() does list_move(), so it's actually racy. Remove those checks, we have ->cq_check_overflow for the fast path. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Heiko Carstens	13f9eec229	s390: always clear kernel stack backchain before calling functions [ Upstream commit `9365965db0` ] Clear the kernel stack backchain before potentially calling the lockdep trace_hardirqs_off/on functions. Without this walking the kernel backchain, e.g. during a panic, might stop too early. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Thomas Gleixner	330c1ee7d5	tick/sched: Remove bogus boot "safety" check [ Upstream commit `ba8ea8e7dd` ] can_stop_idle_tick() checks whether the do_timer() duty has been taken over by a CPU on boot. That's silly because the boot CPU always takes over with the initial clockevent device. But even if no CPU would have installed a clockevent and taken over the duty then the question whether the tick on the current CPU can be stopped or not is moot. In that case the current CPU would have no clockevent either, so there would be nothing to keep ticking. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Jake Wang	9b22bc0f16	drm/amd/display: updated wm table for Renoir [ Upstream commit `410066d24c` ] [Why] For certain timings, Renoir may underflow due to sr exit latency being too slow. [How] Updated wm table for renoir. Signed-off-by: Jake Wang <haonan.wang2@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Jeff Layton	86be0f2a0e	ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails [ Upstream commit `68cbb8056a` ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Trond Myklebust	8bcfa178f9	NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow [ Upstream commit `503b934a75` ] Expanding the READ_PLUS extents can cause the read buffer to overflow. If it does, then don't error, but just exit early. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Gabriel Krisman Bertazi	ef3b9ad967	um: ubd: Submit all data segments atomically [ Upstream commit `fc6b6a872d` ] Internally, UBD treats each physical IO segment as a separate command to be submitted in the execution pipe. If the pipe returns a transient error after a few segments have already been written, UBD will tell the block layer to requeue the request, but there is no way to reclaim the segments already submitted. When a new attempt to dispatch the request is done, those segments already submitted will get duplicated, causing the WARN_ON below in the best case, and potentially data corruption. In my system, running a UML instance with 2GB of RAM and a 50M UBD disk, I can reproduce the WARN_ON by simply running mkfs.vfat against the disk on a freshly booted system. There are a few ways to work around this, like reducing the pressure on the pipe by reducing the queue depth, which almost eliminates the occurrence of the problem, increasing the pipe buffer size on the host system, or by limiting the request to one physical segment, which causes the block layer to submit way more requests to resolve a single operation. Instead, this patch modifies the format of a UBD command, such that all segments are sent through a single element in the communication pipe, turning the command submission atomic from the point of view of the block layer. The new format has a variable size, depending on the number of elements, and looks like this: +------------+-----------+-----------+------------ | cmd_header | segment 0 | segment 1 | segment ... +------------+-----------+-----------+------------ With this format, we push a pointer to cmd_header in the submission pipe. This has the advantage of reducing the memory footprint of executing a single request, since it allows us to merge some fields in the header. 
It is possible to reduce even further each segment memory footprint, by merging bitmap_words and cow_offset, for instance, but this is not the focus of this patch and is left as future work. One issue with the patch is that for a big number of segments, we now perform one big memory allocation instead of multiple small ones, but I wasn't able to trigger any real issues or -ENOMEM because of this change, that wouldn't be reproduced otherwise. This was tested using fio with the verify-crc32 option, and by running an ext4 filesystem over this UBD device. The original WARN_ON was: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141 refcount_t: underflow; use-after-free. Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346 Stack: 6084eed0 6063dc77 00000009 6084ef60 00000000 604b8d9f 6084eee0 6063dcbc 6084ef40 6006ab8d e013d780 1c00000000 Call Trace: [<600a0c1c>] ? printk+0x0/0x94 [<6004a888>] show_stack+0x13b/0x155 [<6063dc77>] ? dump_stack_print_info+0xdf/0xe8 [<604b8d9f>] ? refcount_warn_saturate+0x13f/0x141 [<6063dcbc>] dump_stack+0x2a/0x2c [<6006ab8d>] __warn+0x107/0x134 [<6008da6c>] ? wake_up_process+0x17/0x19 [<60487628>] ? blk_queue_max_discard_sectors+0x0/0xd [<6006b05f>] warn_slowpath_fmt+0xd1/0xdf [<6006af8e>] ? warn_slowpath_fmt+0x0/0xdf [<600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15 [<600619ae>] ? os_nsecs+0x1d/0x2b [<604b8d9f>] refcount_warn_saturate+0x13f/0x141 [<6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37 [<6048c8de>] blk_mq_free_request+0xf1/0x10d [<6048ca06>] __blk_mq_end_request+0x10c/0x114 [<6005ac0f>] ubd_intr+0xb5/0x169 [<600a1a37>] __handle_irq_event_percpu+0x6b/0x17e [<600a1b70>] handle_irq_event_percpu+0x26/0x69 [<600a1bd9>] handle_irq_event+0x26/0x34 [<600a1bb3>] ? handle_irq_event+0x0/0x34 [<600a5186>] ? 
unmask_irq+0x0/0x37 [<600a57e6>] handle_edge_irq+0xbc/0xd6 [<600a131a>] generic_handle_irq+0x21/0x29 [<60048f6e>] do_IRQ+0x39/0x54 [...] ---[ end trace c6e7444e55386c0f ]--- Cc: Christopher Obbard <chris.obbard@collabora.com> Reported-by: Martyn Welch <martyn@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Tested-by: Christopher Obbard <chris.obbard@collabora.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Christopher Obbard	a8b49c4bdf	um: random: Register random as hwrng-core device [ Upstream commit `72d3e093af` ] The UML random driver creates a dummy device under the guest, /dev/hw_random. When this file is read from the guest, the driver reads from the host machine's /dev/random, in turn reading from the host kernel's entropy pool. This entropy pool could have been filled by a hardware random number generator or just the host kernel's internal software entropy generator. Currently the driver does not fill the guest's kernel entropy pool; this requires a userspace tool running inside the guest (like rng-tools) to read from the dummy device provided by this driver, which then would fill the guest's internal entropy pool. This all seems quite pointless when we are already reading from an entropy pool, so this patch aims to register the device as a hwrng device using the hwrng-core framework. This not only improves and cleans up the driver, but also fills the guest's entropy pool without having to resort to using extra userspace tools in the guest. This is typically a nuisance when booting a guest: the random pool takes a long time (~200s) to build up enough entropy since the dummy hwrng is not used to fill the guest's pool. This port was originally attempted by Alexander Neville "dark" (in CC, discussion in Link), but the conversation there stalled since the handling of -EAGAIN errors was removed and they were no longer handled by the driver. This patch attempts to use the existing method of error handling but utilises the new hwrng core. 
The issue can be noticed when booting a UML guest: [ 2.560000] random: fast init done [ 214.000000] random: crng init done With the patch applied, filling the pool becomes a lot quicker: [ 2.560000] random: fast init done [ 12.000000] random: crng init done Cc: Alexander Neville <dark@volatile.bz> Link: https://lore.kernel.org/lkml/20190828204609.02a7ff70@TheDarkness/ Link: https://lore.kernel.org/lkml/20190829135001.6a5ff940@TheDarkness.local/ Cc: Sjoerd Simons <sjoerd.simons@collabora.co.uk> Signed-off-by: Christopher Obbard <chris.obbard@collabora.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Zhang Qilong	0aa2eecf85	watchdog: rti-wdt: fix reference leak in rti_wdt_probe [ Upstream commit `8711071e97` ] pm_runtime_get_sync() increments the PM usage counter even if it fails. Forgetting to call pm_runtime_put_noidle() results in a reference leak in rti_wdt_probe, so we should fix it. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201030154909.100023-1-zhangqilong3@huawei.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Eric Biggers	eae1fb3bc5	fs/namespace.c: WARN if mnt_count has become negative [ Upstream commit `edf7ddbf1c` ] Missing calls to mntget() (or equivalently, too many calls to mntput()) are hard to detect because mntput() delays freeing mounts using task_work_add(), then again using call_rcu(). As a result, mnt_count can often be decremented to -1 without getting a KASAN use-after-free report. Such cases are still bugs though, and they point to real use-after-frees being possible. For an example of this, see the bug fixed by commit `1b0b9cc8d3` ("vfs: fsmount: add missing mntget()"), discussed at https://lkml.kernel.org/linux-fsdevel/20190605135401.GB30925@xxxxxxxxxxxxxxxxxxxxxxxxx/T/#u. This bug should have been trivial to find. But actually, it wasn't found until syzkaller happened to use fchdir() to manipulate the reference count just right for the bug to be noticeable. Address this by making mntput_no_expire() issue a WARN if mnt_count has become negative. Suggested-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
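The idea behind the mnt_count change above, warning as soon as a reference count goes negative instead of waiting for a use-after-free report, can be sketched in userspace C. `struct fake_mount` and `fake_mntput()` are hypothetical names for illustration only; the real check lives in mntput_no_expire() and uses the kernel's WARN machinery, which this sketch replaces with a message on stderr and a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a reference-counted mount. */
struct fake_mount {
    int count;   /* reference count; must never drop below zero */
    int warned;  /* set once an unbalanced put has been detected */
};

/* Drops one reference. Returns 1 if this was the last reference and the
 * object may be freed, 0 otherwise. A negative count means there were more
 * puts than gets, so it is reported instead of being silently ignored. */
static int fake_mntput(struct fake_mount *m)
{
    assert(m != NULL);

    m->count--;
    if (m->count > 0)
        return 0;

    if (m->count < 0) {
        fprintf(stderr, "WARN: mount count became negative (%d)\n", m->count);
        m->warned = 1;
        return 0;
    }

    return 1;
}
```

A balanced get/put sequence never sets `warned`; one extra `fake_mntput()` after the count reached zero does, which is the kind of early signal the commit adds.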
Nicholas Piggin	b1e155ccc8	powerpc/64: irq replay remove decrementer overflow check [ Upstream commit `59d512e437` ] This is a way to catch some cases of decrementer overflow, when the decrementer has underflowed an odd number of times, while MSR[EE] was disabled. With a typical small decrementer, a timer that fires when MSR[EE] is disabled will be "lost" if MSR[EE] remains disabled for between 4.3 and 8.6 seconds after the timer expires. In any case, the decrementer interrupt would be taken at 8.6 seconds and the timer would be found at that point. So this check is for catching extreme latency events, and it prevents those latencies from being a further few seconds long. It's not obvious this is a good tradeoff. This is already a watchdog magnitude event and that situation is not improved significantly with this check. For large decrementers, it's useless. Therefore remove this check, which avoids a mftb when enabling hard disabled interrupts (e.g., when enabling after coming from hardware interrupt handlers). Perhaps more importantly, it also removes the clunky MSR[EE] vs PACA_IRQ_HARD_DIS incoherency in soft-interrupt replay which simplifies the code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201107014336.2337337-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Jessica Yu	8b5b2b7683	module: delay kobject uevent until after module init call [ Upstream commit `38dc717e97` ] Apparently there has been a longstanding race between udev/systemd and the module loader. Currently, the module loader sends a uevent right after sysfs initialization, but before the module calls its init function. However, some udev rules expect that the module has initialized already upon receiving the uevent. This race has been triggered recently (see link in references) in some systemd mount unit files. For instance, the configfs module creates the /sys/kernel/config mount point in its init function, however the module loader issues the uevent before this happens. sys-kernel-config.mount expects to be able to mount /sys/kernel/config upon receipt of the module loading uevent, but if the configfs module has not called its init function yet, then this directory will not exist and the mount unit fails. A similar situation exists for sys-fs-fuse-connections.mount, as the fuse sysfs mount point is created during the fuse module's init function. If udev is faster than module initialization then the mount unit would fail in a similar fashion. To fix this race, delay the module KOBJ_ADD uevent until after the module has finished calling its init routine. References: https://github.com/systemd/systemd/issues/17586 Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-By: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Daeho Jeong	db6129f6ad	f2fs: fix race of pending_pages in decompression [ Upstream commit `6422a71ef4` ] I found out f2fs_free_dic() is invoked in a wrong timing, but f2fs_verify_bio() still needed the dic info and it triggered the below kernel panic. It has been caused by the race condition of pending_pages value between decompression and verity logic, when the same compression cluster had been split in different bios. By split bios, f2fs_verify_bio() ended up with decreasing pending_pages value before it is reset to nr_cpages by f2fs_decompress_pages() and caused the kernel panic. [ 4416.564763] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 ... [ 4416.896016] Workqueue: fsverity_read_queue f2fs_verity_work [ 4416.908515] pc : fsverity_verify_page+0x20/0x78 [ 4416.913721] lr : f2fs_verify_bio+0x11c/0x29c [ 4416.913722] sp : ffffffc019533cd0 [ 4416.913723] x29: ffffffc019533cd0 x28: 0000000000000402 [ 4416.913724] x27: 0000000000000001 x26: 0000000000000100 [ 4416.913726] x25: 0000000000000001 x24: 0000000000000004 [ 4416.913727] x23: 0000000000001000 x22: 0000000000000000 [ 4416.913728] x21: 0000000000000000 x20: ffffffff2076f9c0 [ 4416.913729] x19: ffffffff2076f9c0 x18: ffffff8a32380c30 [ 4416.913731] x17: ffffffc01f966d97 x16: 0000000000000298 [ 4416.913732] x15: 0000000000000000 x14: 0000000000000000 [ 4416.913733] x13: f074faec89ffffff x12: 0000000000000000 [ 4416.913734] x11: 0000000000001000 x10: 0000000000001000 [ 4416.929176] x9 : ffffffff20d1f5c7 x8 : 0000000000000000 [ 4416.929178] x7 : 626d7464ff286b6b x6 : ffffffc019533ade [ 4416.929179] x5 : 000000008049000e x4 : ffffffff2793e9e0 [ 4416.929180] x3 : 000000008049000e x2 : ffffff89ecfa74d0 [ 4416.929181] x1 : 0000000000000c40 x0 : ffffffff2076f9c0 [ 4416.929184] Call trace: [ 4416.929187] fsverity_verify_page+0x20/0x78 [ 4416.929189] f2fs_verify_bio+0x11c/0x29c [ 4416.929192] f2fs_verity_work+0x58/0x84 [ 4417.050667] process_one_work+0x270/0x47c [ 4417.055354] 
worker_thread+0x27c/0x4d8 [ 4417.059784] kthread+0x13c/0x320 [ 4417.063693] ret_from_fork+0x10/0x18 Chao pointed out that this can happen through the race condition below.
Thread A                    f2fs_post_read_wq                          fsverity_wq
- f2fs_read_multi_pages()
 - f2fs_alloc_dic
 - dic->pending_pages = 2
 - submit_bio()
 - submit_bio()
                            - f2fs_post_read_work() handle first bio
                             - f2fs_decompress_work()
                              - __read_end_io()
                               - f2fs_decompress_pages()
                                - dic->pending_pages--
                             - enqueue f2fs_verity_work()
                                                                       - f2fs_verity_work() handle first bio
                                                                        - f2fs_verify_bio()
                                                                         - dic->pending_pages--
                            - f2fs_post_read_work() handle second bio
                             - f2fs_decompress_work()
                             - enqueue f2fs_verity_work()
                                                                        - f2fs_verify_pages()
                                                                        - f2fs_free_dic()
                                                                       - f2fs_verity_work() handle second bio
                                                                        - f2fs_verify_bio()
                                                                         - use-after-free on dic
Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Jaegeuk Kim	ee3f8aefd0	f2fs: avoid race condition for shrinker count [ Upstream commit `a95ba66ac1` ] Light reported that the shrinker sometimes sees nat_cnt < dirty_nat_cnt, resulting in wrong shrinker work. Avoid returning an insanely overflowed value by adding a single tracking value. Reported-by: Light Hsieh <Light.Hsieh@mediatek.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Trond Myklebust	3c0f0f5f58	NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode [ Upstream commit `b6d49ecd10` ] When returning the layout in nfs4_evict_inode(), we need to ensure that the layout is actually done being freed before we can proceed to free the inode itself. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Qinglang Miao	06ac2ca098	i3c master: fix missing destroy_workqueue() on error in i3c_master_register [ Upstream commit `59165d16c6` ] Add the missing destroy_workqueue() before return from i3c_master_register in the error handling case. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/linux-i3c/20201028091543.136167-1-miaoqinglang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Qinglang Miao	498d90690f	powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe() [ Upstream commit `ffa1797040` ] I noticed that iounmap() of msgr_block_addr before return from mpic_msgr_probe() in the error handling case is missing. So use devm_ioremap() instead of just ioremap() when remapping the message register block, so the mapping will be automatically released on probe failure. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Zheng Liang	acc3c8cc27	rtc: pl031: fix resource leak in pl031_probe [ Upstream commit `1eab0fea25` ] When devm_rtc_allocate_device() fails in pl031_probe(), the memory regions should be released along with the device. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20201112093139.32566-1-zhengliang6@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Jan Kara	26058c397b	quota: Don't overflow quota file offsets [ Upstream commit `10f04d40a9` ] The on-disk quota format supports quota files with up to 2^32 blocks. Be careful when computing quota file offsets in the quota files from block numbers as they can overflow 32-bit types. Since quota files larger than 4GB would require ~26 million quota users, this is mostly a theoretical concern now but better be careful, fuzzers would find the problem sooner or later anyway... Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
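The overflow described in the quota commit above is easy to reproduce with plain integer arithmetic. The sketch below is a minimal illustration, not the quota code itself: the function names are hypothetical, and the 1 KiB block size (a shift of 10 bits) is an assumption chosen so that 2^22 blocks correspond to exactly 4 GiB.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 1 KiB quota blocks, i.e. a shift of 10 bits. */
#define BLOCK_BITS 10u

static_assert(BLOCK_BITS < 32, "shift must be narrower than the 32-bit type");

/* Overflow-prone form: the shift is evaluated in 32 bits, so the result
 * wraps modulo 2^32 before it is widened to the return type. */
static uint64_t offset_32bit(uint32_t blk)
{
    uint32_t off = blk << BLOCK_BITS;

    return off;
}

/* Safe form: widen the block number first, then shift in 64 bits. */
static uint64_t offset_64bit(uint32_t blk)
{
    return (uint64_t)blk << BLOCK_BITS;
}
```

For small block numbers both forms agree. For block 2^22 the 32-bit form wraps to offset 0 while the 64-bit form yields 2^32, the 4 GiB boundary mentioned in the commit message.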
Miroslav Benes	bb2ab902f6	module: set MODULE_STATE_GOING state when a module fails to load [ Upstream commit `5e8ed280da` ] If a module fails to load due to an error in prepare_coming_module(), the following error handling in load_module() runs with MODULE_STATE_COMING in module's state. Fix it by correctly setting MODULE_STATE_GOING under "bug_cleanup" label. Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Dinghao Liu	0ad9a6e613	rtc: sun6i: Fix memleak in sun6i_rtc_clk_init [ Upstream commit `28d211919e` ] When clk_hw_register_fixed_rate_with_accuracy() fails, clk_data should be freed. It's the same for the subsequent two error paths, but we should also unregister the already registered clocks in them. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201020061226.6572-1-dinghao.liu@zju.edu.cn Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Xiaoguang Wang	b5a2f093b6	io_uring: check kthread stopped flag when sq thread is unparked commit `65b2b21348` upstream. syzbot reports the following issue: INFO: task syz-executor.2:12399 can't die for more than 143 seconds. task:syz-executor.2 state:D stack:28744 pid:12399 ppid: 8504 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:3773 [inline] __schedule+0x893/0x2170 kernel/sched/core.c:4522 schedule+0xcf/0x270 kernel/sched/core.c:4600 schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1847 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x163/0x260 kernel/sched/completion.c:138 kthread_stop+0x17a/0x720 kernel/kthread.c:596 io_put_sq_data fs/io_uring.c:7193 [inline] io_sq_thread_stop+0x452/0x570 fs/io_uring.c:7290 io_finish_async fs/io_uring.c:7297 [inline] io_sq_offload_create fs/io_uring.c:8015 [inline] io_uring_create fs/io_uring.c:9433 [inline] io_uring_setup+0x19b7/0x3730 fs/io_uring.c:9507 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45deb9 Code: Unable to access opcode bytes at RIP 0x45de8f. RSP: 002b:00007f174e51ac78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9 RAX: ffffffffffffffda RBX: 0000000000008640 RCX: 000000000045deb9 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 00000000000050e5 RBP: 000000000118bf58 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c R13: 00007ffed9ca723f R14: 00007f174e51b9c0 R15: 000000000118bf2c INFO: task syz-executor.2:12399 blocked for more than 143 seconds. Not tainted 5.10.0-rc3-next-20201110-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. We don't have a reproducer yet, but there seems to be a race in the current code:
=> io_put_sq_data
      ctx_list is empty now.        |
==> kthread_park(sqd->thread);      |
                                    | T1: sq thread is parked now.
==> kthread_stop(sqd->thread);      |
    KTHREAD_SHOULD_STOP is set now. |
===> kthread_unpark(k);             |
                                    | T2: sq thread is now unparked, runs again.
                                    |
                                    | T3: sq thread is now preempted out.
                                    |
===> wake_up_process(k);            |
                                    |
                                    | T4: Since sqd ctx_list is empty, needs_sched will be true,
                                    | then sq thread sets task state to TASK_INTERRUPTIBLE,
                                    | and schedules; now sq thread will never be woken up.
===> wait_for_completion            |
I have artificially used mdelay() to simulate the above race and got the same stack as in this syzbot report, but to be honest, I'm not sure this race is what triggers the syzbot report. To fix this possible race, check whether the sq thread has been stopped when it is unparked. Reported-by: syzbot+03beeb595f074db9cfd1@syzkaller.appspotmail.com Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:53 +01:00


971765 Commits