linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-06 19:08:57 +09:00

Author	SHA1	Message	Date
Mark Rutland	95bc0ba6be	atomics: Fix atomic64_{read_acquire,set_release} fallbacks [ Upstream commit `dc1b4df09a` ] Arnd reports that on 32-bit architectures, the fallbacks for atomic64_read_acquire() and atomic64_set_release() are broken as they use smp_load_acquire() and smp_store_release() respectively, which do not work on types larger than the native word size. Since those contain compiletime_assert_atomic_type(), any attempt to use those fallbacks will result in a build-time error. e.g. with the following added to arch/arm/kernel/setup.c: \| void test_atomic64(atomic64_t v) \| { \| atomic64_set_release(v, 5); \| atomic64_read_acquire(v); \| } The compiler will complain as follows: \| In file included from <command-line>: \| In function 'arch_atomic64_set_release', \| inlined from 'test_atomic64' at ./include/linux/atomic/atomic-instrumented.h:669:2: \| ././include/linux/compiler_types.h:346:38: error: call to '__compiletime_assert_9' declared with attribute error: Need native word sized stores/loads for atomicity. \| 346 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| \| ^ \| ././include/linux/compiler_types.h:327:4: note: in definition of macro '__compiletime_assert' \| 327 \| prefix ## suffix(); \ \| \| ^~~~~~ \| ././include/linux/compiler_types.h:346:2: note: in expansion of macro '_compiletime_assert' \| 346 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| \| ^~~~~~~~~~~~~~~~~~~ \| ././include/linux/compiler_types.h:349:2: note: in expansion of macro 'compiletime_assert' \| 349 \| compiletime_assert(__native_word(t), \ \| \| ^~~~~~~~~~~~~~~~~~ \| ./include/asm-generic/barrier.h:133:2: note: in expansion of macro 'compiletime_assert_atomic_type' \| 133 \| compiletime_assert_atomic_type(p); \ \| \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \| ./include/asm-generic/barrier.h:164:55: note: in expansion of macro '__smp_store_release' \| 164 \| #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0) \| \| ^~~~~~~~~~~~~~~~~~~ \| ./include/linux/atomic/atomic-arch-fallback.h:1270:2: note: in expansion of macro 'smp_store_release' \| 1270 \| smp_store_release(&(v)->counter, i); \| \| ^~~~~~~~~~~~~~~~~ \| make[2]: * [scripts/Makefile.build:288: arch/arm/kernel/setup.o] Error 1 \| make[1]: * [scripts/Makefile.build:550: arch/arm/kernel] Error 2 \| make: *** [Makefile:1831: arch/arm] Error 2 Fix this by only using smp_load_acquire() and smp_store_release() for native atomic types, and otherwise falling back to the regular barriers necessary for acquire/release semantics, as we do in the more generic acquire and release fallbacks. Since the fallback templates are used to generate the atomic64_() and atomic_() operations, the __native_word() check is added to both. For the atomic_*() operations, which are always 32-bit, the __native_word() check is redundant but not harmful, as it is always true. For the example above this works as expected on 32-bit, e.g. for arm multi_v7_defconfig: \| <test_atomic64>: \| push {r4, r5} \| dmb ish \| pldw [r0] \| mov r2, #5 \| mov r3, #0 \| ldrexd r4, [r0] \| strexd r4, r2, [r0] \| teq r4, #0 \| bne 484 <test_atomic64+0x14> \| ldrexd r2, [r0] \| dmb ish \| pop {r4, r5} \| bx lr ... and also on 64-bit, e.g. for arm64 defconfig: \| <test_atomic64>: \| bti c \| paciasp \| mov x1, #0x5 \| stlr x1, [x0] \| ldar x0, [x0] \| autiasp \| ret Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20220207101943.439825-1-mark.rutland@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:57 +02:00
Minghao Chi	75fe5dcb16	spi: tegra20: Use of_device_get_match_data() [ Upstream commit `c9839acfcb` ] Use of_device_get_match_data() to simplify the code. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Link: https://lore.kernel.org/r/20220315023138.2118293-1-chi.minghao@zte.com.cn Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:57 +02:00
Chris Leech	ffe0c49167	nvme-tcp: lockdep: annotate in-kernel sockets [ Upstream commit `841aee4d75` ] Put NVMe/TCP sockets in their own class to avoid some lockdep warnings. Sockets created by nvme-tcp are not exposed to user-space, and will not trigger certain code paths that the general socket API exposes. Lockdep complains about a circular dependency between the socket and filesystem locks, because setsockopt can trigger a page fault with a socket lock held, but nvme-tcp sends requests on the socket while file system locks are held. ====================================================== WARNING: possible circular locking dependency detected 5.15.0-rc3 #1 Not tainted ------------------------------------------------------ fio/1496 is trying to acquire lock: (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80 but task is already holding lock: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs] which lock already depends on the new lock. other info that might help us debug this: chain exists of: sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&xfs_dir_ilock_class/5); lock(sb_internal); lock(&xfs_dir_ilock_class/5); lock(sk_lock-AF_INET); * DEADLOCK * 6 locks held by fio/1496: #0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20 #1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20 #2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs] #3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs] #4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0 #5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp] This annotation lets lockdep analyze nvme-tcp controlled sockets independently of what the user-space sockets API does. Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/ Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
John David Anglin	b3ea76bda7	parisc: Fix handling off probe non-access faults [ Upstream commit `e00b0a2ab8` ] Currently, the parisc kernel does not fully support non-access TLB fault handling for probe instructions. In the fast path, we set the target register to zero if it is not a shadowed register. The slow path is not implemented, so we call do_page_fault. The architecture indicates that non-access faults should not cause a page fault from disk. This change adds to code to provide non-access fault support for probe instructions. It also modifies the handling of faults on userspace so that if the address lies in a valid VMA and the access type matches that for the VMA, the probe target register is set to one. Otherwise, the target register is set to zero. This was done to make probe instructions more useful for userspace. Probe instructions are not very useful if they set the target register to zero whenever a page is not present in memory. Nominally, the purpose of the probe instruction is determine whether read or write access to a given address is allowed. This fixes a problem in function pointer comparison noticed in the glibc testsuite (stdio-common/tst-vfprintf-user-type). The same problem is likely in glibc (_dl_lookup_address). V2 adds flush and lpa instruction support to handle_nadtlb_fault. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Dmitry Baryshkov	c29642ba72	PM: core: keep irq flags in device_pm_check_callbacks() [ Upstream commit `524bb1da78` ] The function device_pm_check_callbacks() can be called under the spin lock (in the reported case it happens from genpd_add_device() -> dev_pm_domain_set(), when the genpd uses spinlocks rather than mutexes. However this function uncoditionally uses spin_lock_irq() / spin_unlock_irq(), thus not preserving the CPU flags. Use the irqsave/irqrestore instead. The backtrace for the reference: [ 2.752010] ------------[ cut here ]------------ [ 2.756769] raw_local_irq_restore() called with IRQs enabled [ 2.762596] WARNING: CPU: 4 PID: 1 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x34/0x50 [ 2.772338] Modules linked in: [ 2.775487] CPU: 4 PID: 1 Comm: swapper/0 Tainted: G S 5.17.0-rc6-00384-ge330d0d82eff-dirty #684 [ 2.781384] Freeing initrd memory: 46024K [ 2.785839] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 2.785841] pc : warn_bogus_irq_restore+0x34/0x50 [ 2.785844] lr : warn_bogus_irq_restore+0x34/0x50 [ 2.785846] sp : ffff80000805b7d0 [ 2.785847] x29: ffff80000805b7d0 x28: 0000000000000000 x27: 0000000000000002 [ 2.785850] x26: ffffd40e80930b18 x25: ffff7ee2329192b8 x24: ffff7edfc9f60800 [ 2.785853] x23: ffffd40e80930b18 x22: ffffd40e80930d30 x21: ffff7edfc0dffa00 [ 2.785856] x20: ffff7edfc09e3768 x19: 0000000000000000 x18: ffffffffffffffff [ 2.845775] x17: 6572206f74206465 x16: 6c696166203a3030 x15: ffff80008805b4f7 [ 2.853108] x14: 0000000000000000 x13: ffffd40e809550b0 x12: 00000000000003d8 [ 2.860441] x11: 0000000000000148 x10: ffffd40e809550b0 x9 : ffffd40e809550b0 [ 2.867774] x8 : 00000000ffffefff x7 : ffffd40e809ad0b0 x6 : ffffd40e809ad0b0 [ 2.875107] x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000 [ 2.882440] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff7edfc03a8000 [ 2.889774] Call trace: [ 2.892290] warn_bogus_irq_restore+0x34/0x50 [ 2.896770] _raw_spin_unlock_irqrestore+0x94/0xa0 [ 2.901690] genpd_unlock_spin+0x20/0x30 [ 2.905724] genpd_add_device+0x100/0x2d0 [ 2.909850] __genpd_dev_pm_attach+0xa8/0x23c [ 2.914329] genpd_dev_pm_attach_by_id+0xc4/0x190 [ 2.919167] genpd_dev_pm_attach_by_name+0x3c/0xd0 [ 2.924086] dev_pm_domain_attach_by_name+0x24/0x30 [ 2.929102] psci_dt_attach_cpu+0x24/0x90 [ 2.933230] psci_cpuidle_probe+0x2d4/0x46c [ 2.937534] platform_probe+0x68/0xe0 [ 2.941304] really_probe.part.0+0x9c/0x2fc [ 2.945605] __driver_probe_device+0x98/0x144 [ 2.950085] driver_probe_device+0x44/0x15c [ 2.954385] __device_attach_driver+0xb8/0x120 [ 2.958950] bus_for_each_drv+0x78/0xd0 [ 2.962896] __device_attach+0xd8/0x180 [ 2.966843] device_initial_probe+0x14/0x20 [ 2.971144] bus_probe_device+0x9c/0xa4 [ 2.975092] device_add+0x380/0x88c [ 2.978679] platform_device_add+0x114/0x234 [ 2.983067] platform_device_register_full+0x100/0x190 [ 2.988344] psci_idle_init+0x6c/0xb0 [ 2.992113] do_one_initcall+0x74/0x3a0 [ 2.996060] kernel_init_freeable+0x2fc/0x384 [ 3.000543] kernel_init+0x28/0x130 [ 3.004132] ret_from_fork+0x10/0x20 [ 3.007817] irq event stamp: 319826 [ 3.011404] hardirqs last enabled at (319825): [<ffffd40e7eda0268>] __up_console_sem+0x78/0x84 [ 3.020332] hardirqs last disabled at (319826): [<ffffd40e7fd6d9d8>] el1_dbg+0x24/0x8c [ 3.028458] softirqs last enabled at (318312): [<ffffd40e7ec90410>] _stext+0x410/0x588 [ 3.036678] softirqs last disabled at (318299): [<ffffd40e7ed1bf68>] __irq_exit_rcu+0x158/0x174 [ 3.045607] ---[ end trace 0000000000000000 ]--- Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Darren Hart	c02f2d420a	ACPI/APEI: Limit printable size of BERT table data [ Upstream commit `3f8dec1162` ] Platforms with large BERT table data can trigger soft lockup errors while attempting to print the entire BERT table data to the console at boot: watchdog: BUG: soft lockup - CPU#160 stuck for 23s! [swapper/0:1] Observed on Ampere Altra systems with a single BERT record of ~250KB. The original bert driver appears to have assumed relatively small table data. Since it is impractical to reassemble large table data from interwoven console messages, and the table data is available in /sys/firmware/acpi/tables/data/BERT limit the size for tables printed to the console to 1024 (for no reason other than it seemed like a good place to kick off the discussion, would appreciate feedback from existing users in terms of what size would maintain their current usage model). Alternatively, we could make printing a CONFIG option, use the bert_disable boot arg (or something similar), or use a debug log level. However, all those solutions require extra steps or change the existing behavior for small table data. Limiting the size preserves existing behavior on existing platforms with small table data, and eliminates the soft lockups for platforms with large table data, while still making it available. Signed-off-by: Darren Hart <darren@os.amperecomputing.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Paolo Valente	65d8a73745	Revert "Revert "block, bfq: honor already-setup queue merges"" [ Upstream commit `15729ff814` ] A crash [1] happened to be triggered in conjunction with commit `2d52c58b9c` ("block, bfq: honor already-setup queue merges"). The latter was then reverted by commit `ebc69e897e` ("Revert "block, bfq: honor already-setup queue merges""). Yet, the reverted commit was not the one introducing the bug. In fact, it actually triggered a UAF introduced by a different commit, and now fixed by commit `d29bd41428` ("block, bfq: reset last_bfqq_created on group change"). So, there is no point in keeping commit `2d52c58b9c` ("block, bfq: honor already-setup queue merges") out. This commit restores it. [1] https://bugzilla.kernel.org/show_bug.cgi?id=214503 Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20211125181510.15004-1-paolo.valente@linaro.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Paul Menzel	5b8d69c8c1	lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3 [ Upstream commit `633174a704` ] Buidling raid6test on Ubuntu 21.10 (ppc64le) with GNU Make 4.3 shows the errors below: $ cd lib/raid6/test/ $ make <stdin>:1:1: error: stray ‘\’ in program <stdin>:1:2: error: stray ‘#’ in program <stdin>:1:11: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ \ before ‘<’ token [...] The errors come from the HAS_ALTIVEC test, which fails, and the POWER optimized versions are not built. That’s also reason nobody noticed on the other architectures. GNU Make 4.3 does not remove the backslash anymore. From the 4.3 release announcment: > * WARNING: Backward-incompatibility! > Number signs (#) appearing inside a macro reference or function invocation > no longer introduce comments and should not be escaped with backslashes: > thus a call such as: > foo := $(shell echo '#') > is legal. Previously the number sign needed to be escaped, for example: > foo := $(shell echo '\#') > Now this latter will resolve to "\#". If you want to write makefiles > portable to both versions, assign the number sign to a variable: > H := \# > foo := $(shell echo '$H') > This was claimed to be fixed in 3.81, but wasn't, for some reason. > To detect this change search for 'nocomment' in the .FEATURES variable. So, do the same as commit `9564a8cf42` ("Kbuild: fix # escaping in .cmd files for future Make") and commit `929bef4677` ("bpf: Use $(pound) instead of \# in Makefiles") and define and use a $(pound) variable. Reference for the change in make: https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57 Cc: Matt Brown <matthew.brown.dev@gmail.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Rafael J. Wysocki	33ccf4f817	ACPICA: Avoid walking the ACPI Namespace if it is not there [ Upstream commit `0c9992315e` ] ACPICA commit b1c3656ef4950098e530be68d4b589584f06cddc Prevent acpi_ns_walk_namespace() from crashing when called with start_node equal to ACPI_ROOT_OBJECT if the Namespace has not been instantiated yet and acpi_gbl_root_node is NULL. For instance, this can happen if the kernel is run with "acpi=off" in the command line. Link: `b1c3656ef4` Link: https://lore.kernel.org/linux-acpi/CAJZ5v0hJWW_vZ3wwajE7xT38aWjY7cZyvqMJpXHzUL98-SiCVQ@mail.gmail.com/ Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Zhang Wensheng	080665e2c3	bfq: fix use-after-free in bfq_dispatch_request [ Upstream commit `ab552fcb17` ] KASAN reports a use-after-free report when doing normal scsi-mq test [69832.239032] ================================================================== [69832.241810] BUG: KASAN: use-after-free in bfq_dispatch_request+0x1045/0x44b0 [69832.243267] Read of size 8 at addr ffff88802622ba88 by task kworker/3:1H/155 [69832.244656] [69832.245007] CPU: 3 PID: 155 Comm: kworker/3:1H Not tainted 5.10.0-10295-g576c6382529e #8 [69832.246626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [69832.249069] Workqueue: kblockd blk_mq_run_work_fn [69832.250022] Call Trace: [69832.250541] dump_stack+0x9b/0xce [69832.251232] ? bfq_dispatch_request+0x1045/0x44b0 [69832.252243] print_address_description.constprop.6+0x3e/0x60 [69832.253381] ? __cpuidle_text_end+0x5/0x5 [69832.254211] ? vprintk_func+0x6b/0x120 [69832.254994] ? bfq_dispatch_request+0x1045/0x44b0 [69832.255952] ? bfq_dispatch_request+0x1045/0x44b0 [69832.256914] kasan_report.cold.9+0x22/0x3a [69832.257753] ? bfq_dispatch_request+0x1045/0x44b0 [69832.258755] check_memory_region+0x1c1/0x1e0 [69832.260248] bfq_dispatch_request+0x1045/0x44b0 [69832.261181] ? bfq_bfqq_expire+0x2440/0x2440 [69832.262032] ? blk_mq_delay_run_hw_queues+0xf9/0x170 [69832.263022] __blk_mq_do_dispatch_sched+0x52f/0x830 [69832.264011] ? blk_mq_sched_request_inserted+0x100/0x100 [69832.265101] __blk_mq_sched_dispatch_requests+0x398/0x4f0 [69832.266206] ? blk_mq_do_dispatch_ctx+0x570/0x570 [69832.267147] ? __switch_to+0x5f4/0xee0 [69832.267898] blk_mq_sched_dispatch_requests+0xdf/0x140 [69832.268946] __blk_mq_run_hw_queue+0xc0/0x270 [69832.269840] blk_mq_run_work_fn+0x51/0x60 [69832.278170] process_one_work+0x6d4/0xfe0 [69832.278984] worker_thread+0x91/0xc80 [69832.279726] ? __kthread_parkme+0xb0/0x110 [69832.280554] ? process_one_work+0xfe0/0xfe0 [69832.281414] kthread+0x32d/0x3f0 [69832.282082] ? kthread_park+0x170/0x170 [69832.282849] ret_from_fork+0x1f/0x30 [69832.283573] [69832.283886] Allocated by task 7725: [69832.284599] kasan_save_stack+0x19/0x40 [69832.285385] __kasan_kmalloc.constprop.2+0xc1/0xd0 [69832.286350] kmem_cache_alloc_node+0x13f/0x460 [69832.287237] bfq_get_queue+0x3d4/0x1140 [69832.287993] bfq_get_bfqq_handle_split+0x103/0x510 [69832.289015] bfq_init_rq+0x337/0x2d50 [69832.289749] bfq_insert_requests+0x304/0x4e10 [69832.290634] blk_mq_sched_insert_requests+0x13e/0x390 [69832.291629] blk_mq_flush_plug_list+0x4b4/0x760 [69832.292538] blk_flush_plug_list+0x2c5/0x480 [69832.293392] io_schedule_prepare+0xb2/0xd0 [69832.294209] io_schedule_timeout+0x13/0x80 [69832.295014] wait_for_common_io.constprop.1+0x13c/0x270 [69832.296137] submit_bio_wait+0x103/0x1a0 [69832.296932] blkdev_issue_discard+0xe6/0x160 [69832.297794] blk_ioctl_discard+0x219/0x290 [69832.298614] blkdev_common_ioctl+0x50a/0x1750 [69832.304715] blkdev_ioctl+0x470/0x600 [69832.305474] block_ioctl+0xde/0x120 [69832.306232] vfs_ioctl+0x6c/0xc0 [69832.306877] __se_sys_ioctl+0x90/0xa0 [69832.307629] do_syscall_64+0x2d/0x40 [69832.308362] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [69832.309382] [69832.309701] Freed by task 155: [69832.310328] kasan_save_stack+0x19/0x40 [69832.311121] kasan_set_track+0x1c/0x30 [69832.311868] kasan_set_free_info+0x1b/0x30 [69832.312699] __kasan_slab_free+0x111/0x160 [69832.313524] kmem_cache_free+0x94/0x460 [69832.314367] bfq_put_queue+0x582/0x940 [69832.315112] __bfq_bfqd_reset_in_service+0x166/0x1d0 [69832.317275] bfq_bfqq_expire+0xb27/0x2440 [69832.318084] bfq_dispatch_request+0x697/0x44b0 [69832.318991] __blk_mq_do_dispatch_sched+0x52f/0x830 [69832.319984] __blk_mq_sched_dispatch_requests+0x398/0x4f0 [69832.321087] blk_mq_sched_dispatch_requests+0xdf/0x140 [69832.322225] __blk_mq_run_hw_queue+0xc0/0x270 [69832.323114] blk_mq_run_work_fn+0x51/0x60 [69832.323942] process_one_work+0x6d4/0xfe0 [69832.324772] worker_thread+0x91/0xc80 [69832.325518] kthread+0x32d/0x3f0 [69832.326205] ret_from_fork+0x1f/0x30 [69832.326932] [69832.338297] The buggy address belongs to the object at ffff88802622b968 [69832.338297] which belongs to the cache bfq_queue of size 512 [69832.340766] The buggy address is located 288 bytes inside of [69832.340766] 512-byte region [ffff88802622b968, ffff88802622bb68) [69832.343091] The buggy address belongs to the page: [69832.344097] page:ffffea0000988a00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802622a528 pfn:0x26228 [69832.346214] head:ffffea0000988a00 order:2 compound_mapcount:0 compound_pincount:0 [69832.347719] flags: 0x1fffff80010200(slab\|head) [69832.348625] raw: 001fffff80010200 ffffea0000dbac08 ffff888017a57650 ffff8880179fe840 [69832.354972] raw: ffff88802622a528 0000000000120008 00000001ffffffff 0000000000000000 [69832.356547] page dumped because: kasan: bad access detected [69832.357652] [69832.357970] Memory state around the buggy address: [69832.358926] ffff88802622b980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [69832.360358] ffff88802622ba00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [69832.361810] >ffff88802622ba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [69832.363273] ^ [69832.363975] ffff88802622bb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc [69832.375960] ffff88802622bb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [69832.377405] ================================================================== In bfq_dispatch_requestfunction, it may have function call: bfq_dispatch_request __bfq_dispatch_request bfq_select_queue bfq_bfqq_expire __bfq_bfqd_reset_in_service bfq_put_queue kmem_cache_free In this function call, in_serv_queue has beed expired and meet the conditions to free. In the function bfq_dispatch_request, the address of in_serv_queue pointing to has been released. For getting the value of idle_timer_disabled, it will get flags value from the address which in_serv_queue pointing to, then the problem of use-after-free happens; Fix the problem by check in_serv_queue == bfqd->in_service_queue, to get the value of idle_timer_disabled if in_serve_queue is equel to bfqd->in_service_queue. If the space of in_serv_queue pointing has been released, this judge will aviod use-after-free problem. And if in_serv_queue may be expired or finished, the idle_timer_disabled will be false which would not give effects to bfq_update_dispatch_stats. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Link: https://lore.kernel.org/r/20220303070334.3020168-1-zhangwensheng5@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Akira Kawata	e0943c456b	fs/binfmt_elf: Fix AT_PHDR for unusual ELF files [ Upstream commit `0da1d50027` ] BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=197921 As pointed out in the discussion of buglink, we cannot calculate AT_PHDR as the sum of load_addr and exec->e_phoff. : The AT_PHDR of ELF auxiliary vectors should point to the memory address : of program header. But binfmt_elf.c calculates this address as follows: : : NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff); : : which is wrong since e_phoff is the file offset of program header and : load_addr is the memory base address from PT_LOAD entry. : : The ld.so uses AT_PHDR as the memory address of program header. In normal : case, since the e_phoff is usually 64 and in the first PT_LOAD region, it : is the correct program header address. : : But if the address of program header isn't equal to the first PT_LOAD : address + e_phoff (e.g. Put the program header in other non-consecutive : PT_LOAD region), ld.so will try to read program header from wrong address : then crash or use incorrect program header. This is because exec->e_phoff is the offset of PHDRs in the file and the address of PHDRs in the memory may differ from it. This patch fixes the bug by calculating the address of program headers from PT_LOADs directly. Signed-off-by: Akira Kawata <akirakawata1@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220127124014.338760-2-akirakawata1@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Souptick Joarder (HPE)	757322b5ab	irqchip/nvic: Release nvic_base upon failure [ Upstream commit `e414c25e33` ] smatch warning was reported as below -> smatch warnings: drivers/irqchip/irq-nvic.c:131 nvic_of_init() warn: 'nvic_base' not released on lines: 97. Release nvic_base upon failure. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Souptick Joarder (HPE) <jrdr.linux@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220218163303.33344-1-jrdr.linux@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:56 +02:00
Marc Zyngier	dabfc878ef	irqchip/qcom-pdc: Fix broken locking [ Upstream commit `a6aca2f460` ] pdc_enable_intr() serves as a primitive to qcom_pdc_gic_{en,dis}able, and has a raw spinlock for mutual exclusion, which is uses with interruptible primitives. This means that this critical section can itself be interrupted. Should the interrupt also be a PDC interrupt, and the endpoint driver perform an irq_disable() on that interrupt, we end-up in a deadlock. Fix this by using the irqsave/irqrestore variants of the locking primitives. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Maulik Shah <quic_mkshah@quicinc.com> Link: https://lore.kernel.org/r/20220224101226.88373-5-maz@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Casey Schaufler	05ba7d0c63	Fix incorrect type in assignment of ipv6 port for audit [ Upstream commit `a5cd1ab7ab` ] Remove inappropriate use of ntohs() and assign the port value directly. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Chaitanya Kulkarni	860d36424d	loop: use sysfs_emit() in the sysfs xxx show() [ Upstream commit `b27824d31f` ] sprintf does not know the PAGE_SIZE maximum of the temporary buffer used for outputting sysfs content and it's possible to overrun the PAGE_SIZE buffer length. Use a generic sysfs_emit function that knows the size of the temporary buffer and ensures that no overrun is done for offset attribute in loop_attr_[offset\|sizelimit\|autoclear\|partscan\|dio]_show() callbacks. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20220215213310.7264-2-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Richard Haines	55d192691b	selinux: allow FIOCLEX and FIONCLEX with policy capability [ Upstream commit `65881e1db4` ] These ioctls are equivalent to fcntl(fd, F_SETFD, flags), which SELinux always allows too. Furthermore, a failed FIOCLEX could result in a file descriptor being leaked to a process that should not have access to it. As this patch removes access controls, a policy capability needs to be enabled in policy to always allow these ioctls. Based-on-patch-by: Demi Marie Obenour <demiobenour@gmail.com> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Fangrui Song	e48c260b0b	arm64: module: remove (NOLOAD) from linker script [ Upstream commit `4013e26670` ] On ELF, (NOLOAD) sets the section type to SHT_NOBITS[1]. It is conceptually inappropriate for .plt and .text.* sections which are always SHT_PROGBITS. In GNU ld, if PLT entries are needed, .plt will be SHT_PROGBITS anyway and (NOLOAD) will be essentially ignored. In ld.lld, since https://reviews.llvm.org/D118840 ("[ELF] Support (TYPE=<value>) to customize the output section type"), ld.lld will report a `section type mismatch` error. Just remove (NOLOAD) to fix the error. [1] https://lld.llvm.org/ELF/linker_script.html As of today, "The section should be marked as not loadable" on https://sourceware.org/binutils/docs/ld/Output-Section-Type.html is outdated for ELF. Tested-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Fangrui Song <maskray@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220218081209.354383-1-maskray@google.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Christian Göttsche	15bb7a467b	selinux: use correct type for context length [ Upstream commit `b97df7c098` ] security_sid_to_context() expects a pointer to an u32 as the address where to store the length of the computed context. Reported by sparse: security/selinux/xfrm.c:359:39: warning: incorrect type in arg 4 (different signedness) security/selinux/xfrm.c:359:39: expected unsigned int [usertype] scontext_len security/selinux/xfrm.c:359:39: got int Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: wrapped commit description] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Yu Kuai	8f34dea99c	block, bfq: don't move oom_bfqq [ Upstream commit `8410f70977` ] Our test report a UAF: [ 2073.019181] ================================================================== [ 2073.019188] BUG: KASAN: use-after-free in __bfq_put_async_bfqq+0xa0/0x168 [ 2073.019191] Write of size 8 at addr ffff8000ccf64128 by task rmmod/72584 [ 2073.019192] [ 2073.019196] CPU: 0 PID: 72584 Comm: rmmod Kdump: loaded Not tainted 4.19.90-yk #5 [ 2073.019198] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 2073.019200] Call trace: [ 2073.019203] dump_backtrace+0x0/0x310 [ 2073.019206] show_stack+0x28/0x38 [ 2073.019210] dump_stack+0xec/0x15c [ 2073.019216] print_address_description+0x68/0x2d0 [ 2073.019220] kasan_report+0x238/0x2f0 [ 2073.019224] __asan_store8+0x88/0xb0 [ 2073.019229] __bfq_put_async_bfqq+0xa0/0x168 [ 2073.019233] bfq_put_async_queues+0xbc/0x208 [ 2073.019236] bfq_pd_offline+0x178/0x238 [ 2073.019240] blkcg_deactivate_policy+0x1f0/0x420 [ 2073.019244] bfq_exit_queue+0x128/0x178 [ 2073.019249] blk_mq_exit_sched+0x12c/0x160 [ 2073.019252] elevator_exit+0xc8/0xd0 [ 2073.019256] blk_exit_queue+0x50/0x88 [ 2073.019259] blk_cleanup_queue+0x228/0x3d8 [ 2073.019267] null_del_dev+0xfc/0x1e0 [null_blk] [ 2073.019274] null_exit+0x90/0x114 [null_blk] [ 2073.019278] __arm64_sys_delete_module+0x358/0x5a0 [ 2073.019282] el0_svc_common+0xc8/0x320 [ 2073.019287] el0_svc_handler+0xf8/0x160 [ 2073.019290] el0_svc+0x10/0x218 [ 2073.019291] [ 2073.019294] Allocated by task 14163: [ 2073.019301] kasan_kmalloc+0xe0/0x190 [ 2073.019305] kmem_cache_alloc_node_trace+0x1cc/0x418 [ 2073.019308] bfq_pd_alloc+0x54/0x118 [ 2073.019313] blkcg_activate_policy+0x250/0x460 [ 2073.019317] bfq_create_group_hierarchy+0x38/0x110 [ 2073.019321] bfq_init_queue+0x6d0/0x948 [ 2073.019325] blk_mq_init_sched+0x1d8/0x390 [ 2073.019330] elevator_switch_mq+0x88/0x170 [ 2073.019334] elevator_switch+0x140/0x270 [ 2073.019338] elv_iosched_store+0x1a4/0x2a0 [ 2073.019342] queue_attr_store+0x90/0xe0 [ 2073.019348] sysfs_kf_write+0xa8/0xe8 [ 2073.019351] kernfs_fop_write+0x1f8/0x378 [ 2073.019359] __vfs_write+0xe0/0x360 [ 2073.019363] vfs_write+0xf0/0x270 [ 2073.019367] ksys_write+0xdc/0x1b8 [ 2073.019371] __arm64_sys_write+0x50/0x60 [ 2073.019375] el0_svc_common+0xc8/0x320 [ 2073.019380] el0_svc_handler+0xf8/0x160 [ 2073.019383] el0_svc+0x10/0x218 [ 2073.019385] [ 2073.019387] Freed by task 72584: [ 2073.019391] __kasan_slab_free+0x120/0x228 [ 2073.019394] kasan_slab_free+0x10/0x18 [ 2073.019397] kfree+0x94/0x368 [ 2073.019400] bfqg_put+0x64/0xb0 [ 2073.019404] bfqg_and_blkg_put+0x90/0xb0 [ 2073.019408] bfq_put_queue+0x220/0x228 [ 2073.019413] __bfq_put_async_bfqq+0x98/0x168 [ 2073.019416] bfq_put_async_queues+0xbc/0x208 [ 2073.019420] bfq_pd_offline+0x178/0x238 [ 2073.019424] blkcg_deactivate_policy+0x1f0/0x420 [ 2073.019429] bfq_exit_queue+0x128/0x178 [ 2073.019433] blk_mq_exit_sched+0x12c/0x160 [ 2073.019437] elevator_exit+0xc8/0xd0 [ 2073.019440] blk_exit_queue+0x50/0x88 [ 2073.019443] blk_cleanup_queue+0x228/0x3d8 [ 2073.019451] null_del_dev+0xfc/0x1e0 [null_blk] [ 2073.019459] null_exit+0x90/0x114 [null_blk] [ 2073.019462] __arm64_sys_delete_module+0x358/0x5a0 [ 2073.019467] el0_svc_common+0xc8/0x320 [ 2073.019471] el0_svc_handler+0xf8/0x160 [ 2073.019474] el0_svc+0x10/0x218 [ 2073.019475] [ 2073.019479] The buggy address belongs to the object at ffff8000ccf63f00 which belongs to the cache kmalloc-1024 of size 1024 [ 2073.019484] The buggy address is located 552 bytes inside of 1024-byte region [ffff8000ccf63f00, ffff8000ccf64300) [ 2073.019486] The buggy address belongs to the page: [ 2073.019492] page:ffff7e000333d800 count:1 mapcount:0 mapping:ffff8000c0003a00 index:0x0 compound_mapcount: 0 [ 2073.020123] flags: 0x7ffff0000008100(slab\|head) [ 2073.020403] raw: 07ffff0000008100 ffff7e0003334c08 ffff7e00001f5a08 ffff8000c0003a00 [ 2073.020409] raw: 0000000000000000 00000000001c001c 00000001ffffffff 0000000000000000 [ 2073.020411] page dumped because: kasan: bad access detected [ 2073.020412] [ 2073.020414] Memory state around the buggy address: [ 2073.020420] ffff8000ccf64000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2073.020424] ffff8000ccf64080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2073.020428] >ffff8000ccf64100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2073.020430] ^ [ 2073.020434] ffff8000ccf64180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2073.020438] ffff8000ccf64200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2073.020439] ================================================================== The same problem exist in mainline as well. This is because oom_bfqq is moved to a non-root group, thus root_group is freed earlier. Thus fix the problem by don't move oom_bfqq. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220129015924.3958918-4-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Kai Ye	69d41c77aa	crypto: hisilicon/sec - not need to enable sm4 extra mode at HW V3 [ Upstream commit `f8a2652826` ] It is not need to enable sm4 extra mode in at HW V3. Here is fix it. Signed-off-by: Kai Ye <yekai13@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Herbert Xu	f84b163300	crypto: xts - Add softdep on ecb [ Upstream commit `dfe085d8dc` ] The xts module needs ecb to be present as it's meant to work on top of ecb. This patch adds a softdep so ecb can be included automatically into the initramfs. Reported-by: rftc <rftc@gmx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Yahu Gao	e11293de5c	block/bfq_wf2q: correct weight to ioprio [ Upstream commit `bcd2be7632` ] The return value is ioprio * BFQ_WEIGHT_CONVERSION_COEFF or 0. What we want is ioprio or 0. Correct this by changing the calculation. Signed-off-by: Yahu Gao <gaoyahu19@gmail.com> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220107065859.25689-1-gaoyahu19@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Paul E. McKenney	e34806c6c2	rcu: Mark writes to the rcu_segcblist structure's ->flags field [ Upstream commit `c099290310` ] KCSAN reports data races between the rcu_segcblist_clear_flags() and rcu_segcblist_set_flags() functions, though misreporting the latter as a call to rcu_segcblist_is_enabled() from call_rcu(). This commit converts the updates of this field to WRITE_ONCE(), relying on the resulting unmarked reads to continue to detect buggy concurrent writes to this field. Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:55 +02:00
Marc Zyngier	99780fcb54	pinctrl: npcm: Fix broken references to chip->parent_device [ Upstream commit `f7e53e2255` ] The npcm driver has a bunch of references to the irq_chip parent_device field, but never sets it. Fix it by fishing that reference from somewhere else, but it is obvious that these debug statements were never used. Also remove an unused field in a local data structure. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Bartosz Golaszewski <brgl@bgdev.pl> Link: https://lore.kernel.org/r/20220201120310.878267-11-maz@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Kees Cook	999ee26653	gcc-plugins/stackleak: Exactly match strings instead of prefixes [ Upstream commit `27e9faf415` ] Since STRING_CST may not be NUL terminated, strncmp() was used for check for equality. However, this may lead to mismatches for longer section names where the start matches the tested-for string. Test for exact equality by checking for the presences of NUL termination. Cc: Alexander Popov <alex.popov@linux.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Kai Ye	ca97dfbda5	crypto: hisilicon/qm - cleanup warning in qm_vf_read_qos [ Upstream commit `05b3bade29` ] The kernel test rebot report this warning: Uninitialized variable: ret. The code flow may return value of ret directly. This value is an uninitialized variable, here is fix it. Signed-off-by: Kai Ye <yekai13@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Dave Stevenson	4941c21090	regulator: rpi-panel: Handle I2C errors/timing to the Atmel [ Upstream commit `5665eee7a3` ] The Atmel is doing some things in the I2C ISR, during which period it will not respond to further commands. This is particularly true of the POWERON command. Increase delays appropriately, and retry should I2C errors be reported. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com> Link: https://lore.kernel.org/r/20220124220129.158891-3-detlev.casanova@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Casey Schaufler	f3f93a1aaa	LSM: general protection fault in legacy_parse_param [ Upstream commit `ecff30575b` ] The usual LSM hook "bail on fail" scheme doesn't work for cases where a security module may return an error code indicating that it does not recognize an input. In this particular case Smack sees a mount option that it recognizes, and returns 0. A call to a BPF hook follows, which returns -ENOPARAM, which confuses the caller because Smack has processed its data. The SELinux hook incorrectly returns 1 on success. There was a time when this was correct, however the current expectation is that it return 0 on success. This is repaired. Reported-by: syzbot+d1e3b1d92d25abf97943@syzkaller.appspotmail.com Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Linus Torvalds	c331c9d1d2	fs: fix fd table size alignment properly [ Upstream commit `d888c83fce` ] Jason Donenfeld reports that my commit `1c24a18639` ("fs: fd tables have to be multiples of BITS_PER_LONG") doesn't work, and the reason is an embarrassing brown-paper-bag bug. Yes, we want to align the number of fds to BITS_PER_LONG, and yes, the reason they might not be aligned is because the incoming 'max_fd' argument might not be aligned. But aligining the argument - while simple - will cause a "infinitely big" maxfd (eg NR_OPEN_MAX) to just overflow to zero. Which most definitely isn't what we want either. The obvious fix was always just to do the alignment last, but I had moved it earlier just to make the patch smaller and the code look simpler. Duh. It certainly made _me_ look simple. Fixes: `1c24a18639` ("fs: fd tables have to be multiples of BITS_PER_LONG") Reported-and-tested-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Fedor Pchelkin <aissur0002@gmail.com> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Dan Carpenter	611170142b	lib/test: use after free in register_test_dev_kmod() [ Upstream commit `dc0ce6cc4b` ] The "test_dev" pointer is freed but then returned to the caller. Fixes: `d9c6a72d6f` ("kmod: add test driver to stress test the module loader") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Linus Torvalds	136736abcd	fs: fd tables have to be multiples of BITS_PER_LONG [ Upstream commit `1c24a18639` ] This has always been the rule: fdtables have several bitmaps in them, and as a result they have to be sized properly for bitmaps. We walk those bitmaps in chunks of 'unsigned long' in serveral cases, but even when we don't, we use the regular kernel bitops that are defined to work on arrays of 'unsigned long', not on some byte array. Now, the distinction between arrays of bytes and 'unsigned long' normally only really ends up being noticeable on big-endian systems, but Fedor Pchelkin and Alexey Khoroshilov reported that copy_fd_bitmaps() could be called with an argument that wasn't even a multiple of BITS_PER_BYTE. And then it fails to do the proper copy even on little-endian machines. The bug wasn't in copy_fd_bitmap(), but in sane_fdtable_size(), which didn't actually sanitize the fdtable size sufficiently, and never made sure it had the proper BITS_PER_LONG alignment. That's partly because the alignment historically came not from having to explicitly align things, but simply from previous fdtable sizes, and from count_open_files(), which counts the file descriptors by walking them one 'unsigned long' word at a time and thus naturally ends up doing sizing in the proper 'chunks of unsigned long'. But with the introduction of close_range(), we now have an external source of "this is how many files we want to have", and so sane_fdtable_size() needs to do a better job. This also adds that explicit alignment to alloc_fdtable(), although there it is mainly just for documentation at a source code level. The arithmetic we do there to pick a reasonable fdtable size already aligns the result sufficiently. In fact,clang notices that the added ALIGN() in that function doesn't actually do anything, and does not generate any extra code for it. It turns out that gcc ends up confusing itself by combining a previous constant-sized shift operation with the variable-sized shift operations in roundup_pow_of_two(). And probably due to that doesn't notice that the ALIGN() is a no-op. But that's a (tiny) gcc misfeature that doesn't matter. Having the explicit alignment makes sense, and would actually matter on a 128-bit architecture if we ever go there. This also adds big comments above both functions about how fdtable sizes have to have that BITS_PER_LONG alignment. Fixes: `60997c3d45` ("close_range: add CLOSE_RANGE_UNSHARE") Reported-by: Fedor Pchelkin <aissur0002@gmail.com> Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Link: https://lore.kernel.org/all/20220326114009.1690-1-aissur0002@gmail.com/ Tested-and-acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Xiaomeng Tong	fd3f70b907	net: dsa: bcm_sf2_cfp: fix an incorrect NULL check on list iterator [ Upstream commit `6da69b1da1` ] The bug is here: return rule; The list iterator value 'rule' will always be set and non-NULL by list_for_each_entry(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element is found. To fix the bug, return 'rule' when found, otherwise return NULL. Fixes: `ae7a5aff78` ("net: dsa: bcm_sf2: Keep copy of inserted rules") Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Link: https://lore.kernel.org/r/20220328032431.22538-1-xiam0nd.tong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Trond Myklebust	a738ff8143	NFSv4/pNFS: Fix another issue with a list iterator pointing to the head [ Upstream commit `7c9d845f06` ] In nfs4_callback_devicenotify(), if we don't find a matching entry for the deviceid, we're left with a pointer to 'struct nfs_server' that actually points to the list of super blocks associated with our struct nfs_client. Furthermore, even if we have a valid pointer, nothing pins the super block, and so the struct nfs_server could end up getting freed while we're using it. Since all we want is a pointer to the struct pnfs_layoutdriver_type, let's skip all the iteration over super blocks, and just use APIs to find the layout driver directly. Reported-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Fixes: `1be5683b03` ("pnfs: CB_NOTIFY_DEVICEID") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:54 +02:00
Marcelo Ricardo Leitner	bcbf4e5c3b	net/sched: act_ct: fix ref leak when switching zones [ Upstream commit `bcb74e132a` ] When switching zones or network namespaces without doing a ct clear in between, it is now leaking a reference to the old ct entry. That's because tcf_ct_skb_nfct_cached() returns false and tcf_ct_flow_table_lookup() may simply overwrite it. The fix is to, as the ct entry is not reusable, free it already at tcf_ct_skb_nfct_cached(). Reported-by: Florian Westphal <fw@strlen.de> Fixes: `2f131de361` ("net/sched: act_ct: Fix flow table lookup after ct clear or switching zones") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Florian Westphal	72dd9e61fa	net: prefer nf_ct_put instead of nf_conntrack_put [ Upstream commit `408bdcfce8` ] Its the same as nf_conntrack_put(), but without the need for an indirect call. The downside is a module dependency on nf_conntrack, but all of these already depend on conntrack anyway. Cc: Paul Blakey <paulb@mellanox.com> Cc: dev@openvswitch.org Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Tom Rix	6b663fa23c	octeontx2-af: initialize action variable [ Upstream commit `33b5bc9e70` ] Clang static analysis reports this representative issue rvu_npc.c:898:15: warning: Assigned value is garbage or undefined req.match_id = action.match_id; ^ ~~~~~~~~~~~~~~~ The initial setting of action is conditional on if (is_mcam_entry_enabled(...)) The later check of action.op will sometimes be garbage. So initialize action. Reduce setting of (u64 )&action = 0x00; to (u64 )&action = 0; Fixes: `967db3529e` ("octeontx2-af: add support for multicast/promisc packet replication feature") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Zheng Yongjun	b375ea083f	net: sparx5: switchdev: fix possible NULL pointer dereference [ Upstream commit `0906f3a3df` ] As the possible failure of the allocation, devm_kzalloc() may return NULL pointer. Therefore, it should be better to check the 'db' in order to prevent the dereference of NULL pointer. Fixes: `10615907e9` ("net: sparx5: switchdev: adding frame DMA functionality") Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Duoming Zhou	409570a619	net/x25: Fix null-ptr-deref caused by x25_disconnect [ Upstream commit `7781607938` ] When the link layer is terminating, x25->neighbour will be set to NULL in x25_disconnect(). As a result, it could cause null-ptr-deref bugs in x25_sendmsg(),x25_recvmsg() and x25_connect(). One of the bugs is shown below. (Thread 1) \| (Thread 2) x25_link_terminated() \| x25_recvmsg() x25_kill_by_neigh() \| ... x25_disconnect() \| lock_sock(sk) ... \| ... x25->neighbour = NULL //(1) \| ... \| x25->neighbour->extended //(2) The code sets NULL to x25->neighbour in position (1) and dereferences x25->neighbour in position (2), which could cause null-ptr-deref bug. This patch adds lock_sock() in x25_kill_by_neigh() in order to synchronize with x25_sendmsg(), x25_recvmsg() and x25_connect(). What`s more, the sock held by lock_sock() is not NULL, because it is extracted from x25_list and uses x25_list_lock to synchronize. Fixes: `4becb7ee5b` ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Lin Ma <linma@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Tom Rix	c416e9bb85	qlcnic: dcb: default to returning -EOPNOTSUPP [ Upstream commit `1521db37f0` ] Clang static analysis reports this issue qlcnic_dcb.c:382:10: warning: Assigned value is garbage or undefined mbx_out = *val; ^ ~~~~ val is set in the qlcnic_dcb_query_hw_capability() wrapper. If there is no query_hw_capability op in dcp, success is returned without setting the val. For this and similar wrappers, return -EOPNOTSUPP. Fixes: `14d385b990` ("qlcnic: dcb: Query adapter DCB capabilities.") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Randy Dunlap	e87c47df21	net: sparx5: depends on PTP_1588_CLOCK_OPTIONAL [ Upstream commit `08be6b13db` ] Fix build errors when PTP_1588_CLOCK=m and SPARX5_SWTICH=y. arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ethtool.o: in function `sparx5_get_ts_info': sparx5_ethtool.c:(.text+0x146): undefined reference to `ptp_clock_index' arc-linux-ld: sparx5_ethtool.c:(.text+0x146): undefined reference to `ptp_clock_index' arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ptp.o: in function `sparx5_ptp_init': sparx5_ptp.c:(.text+0xd56): undefined reference to `ptp_clock_register' arc-linux-ld: sparx5_ptp.c:(.text+0xd56): undefined reference to `ptp_clock_register' arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ptp.o: in function `sparx5_ptp_deinit': sparx5_ptp.c:(.text+0xf30): undefined reference to `ptp_clock_unregister' arc-linux-ld: sparx5_ptp.c:(.text+0xf30): undefined reference to `ptp_clock_unregister' arc-linux-ld: sparx5_ptp.c:(.text+0xf38): undefined reference to `ptp_clock_unregister' arc-linux-ld: sparx5_ptp.c:(.text+0xf46): undefined reference to `ptp_clock_unregister' arc-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_ptp.o:sparx5_ptp.c:(.text+0xf46): more undefined references to `ptp_clock_unregister' follow Fixes: `3cfa11bac9` ("net: sparx5: add the basic sparx5 driver") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Steen Hegelund <steen.hegelund@microchip.com> Cc: Bjarni Jonasson <bjarni.jonasson@microchip.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Peng Li	34a5c64951	net: hns3: clean residual vf config after disable sriov [ Upstream commit `671cb8cbb9` ] After disable sriov, VF still has some config and info need to be cleaned, which configured by PF. This patch clean the HW config and SW struct vport->vf_info. Fixes: `fa8d82e853` ("net: hns3: Add support of .sriov_configure in HNS3 driver") Signed-off-by: Peng Li<lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Trond Myklebust	c955782358	NFS: Don't loop forever in nfs_do_recoalesce() [ Upstream commit `d02d81efc7` ] If __nfs_pageio_add_request() fails to add the request, it will return with either desc->pg_error < 0, or mirror->pg_recoalesce will be set, so we are guaranteed either to exit the function altogether, or to loop. However if there is nothing left in mirror->pg_list to coalesce, we must exit, so make sure that we clear mirror->pg_recoalesce every time we loop. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: `70536bf4eb` ("NFS: Clean up reset of the mirror accounting variables") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:53 +02:00
Ido Schimmel	667760fe01	selftests: test_vxlan_under_vrf: Fix broken test case [ Upstream commit `b50d3b46f8` ] The purpose of the last test case is to test VXLAN encapsulation and decapsulation when the underlay lookup takes place in a non-default VRF. This is achieved by enslaving the physical device of the tunnel to a VRF. The binding of the VXLAN UDP socket to the VRF happens when the VXLAN device itself is opened, not when its physical device is opened. This was also mentioned in the cited commit ("tests that moving the underlay from a VRF to another works when down/up the VXLAN interface"), but the test did something else. Fix it by reopening the VXLAN device instead of its physical device. Before: # ./test_vxlan_under_vrf.sh Checking HV connectivity [ OK ] Check VM connectivity through VXLAN (underlay in the default VRF) [ OK ] Check VM connectivity through VXLAN (underlay in a VRF) [FAIL] After: # ./test_vxlan_under_vrf.sh Checking HV connectivity [ OK ] Check VM connectivity through VXLAN (underlay in the default VRF) [ OK ] Check VM connectivity through VXLAN (underlay in a VRF) [ OK ] Fixes: `03f1c26b1c` ("test/net: Add script for VXLAN underlay in a VRF") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220324200514.1638326-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00
Florian Fainelli	2d05a00709	net: phy: broadcom: Fix brcm_fet_config_init() [ Upstream commit `bf8bfc4336` ] A Broadcom AC201 PHY (same entry as 5241) would be flagged by the Broadcom UniMAC MDIO controller as not completing the turn around properly since the PHY expects 65 MDC clock cycles to complete a write cycle, and the MDIO controller was only sending 64 MDC clock cycles as determined by looking at a scope shot. This would make the subsequent read fail with the UniMAC MDIO controller command field having MDIO_READ_FAIL set and we would abort the brcm_fet_config_init() function and thus not probe the PHY at all. After issuing a software reset, wait for at least 1ms which is well above the 1us reset delay advertised by the datasheet and issue a dummy read to let the PHY turn around the line properly. This read specifically ignores -EIO which would be returned by MDIO controllers checking for the line being turned around. If we have a genuine reaad failure, the next read of the interrupt status register would pick it up anyway. Fixes: `d7a2ed9248` ("broadcom: Add AC131 phy support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220324232438.1156812-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00
Jian Shen	2dc73ba932	net: hns3: refine the process when PF set VF VLAN [ Upstream commit `190cd8a72b` ] Currently, when PF set VF VLAN, it sends notify mailbox to VF if VF alive. VF stop its traffic, and send request mailbox to PF, then PF updates VF VLAN. It's a bit complex. If VF is killed before sending request, PF will not set VF VLAN without any log. This patch refines the process, PF can set VF VLAN direclty, and then notify the VF. If VF is resetting at that time, the notify may be dropped, so VF should query it after reset finished. Fixes: `92f11ea177` ("net: hns3: fix set port based VLAN issue for VF") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00
Yufeng Mo	ee7e9a9d73	net: hns3: format the output of the MAC address [ Upstream commit `4f331fda35` ] Printing the whole MAC addresse may bring security risks. Therefore, the MAC address is partially encrypted to improve security. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00
Jian Shen	30f0ff7176	net: hns3: add vlan list lock to protect vlan list [ Upstream commit `1932a624ab` ] When adding port base VLAN, vf VLAN need to remove from HW and modify the vlan state in vf VLAN list as false. If the periodicity task is freeing the same node, it may cause "use after free" error. This patch adds a vlan list lock to protect the vlan list. Fixes: `c6075b1934` ("net: hns3: Record VF vlan tables") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00
Jian Shen	02948e5782	net: hns3: fix port base vlan add fail when concurrent with reset [ Upstream commit `c0f46de30c` ] Currently, Port base vlan is initiated by PF and configured to its VFs, by using command "ip link set <pf name> vf <vf id> vlan <vlan id>". When a global reset was triggered, the hardware vlan table and the soft recorded vlan information will be cleared by PF, and restored them until VFs were ready. There is a short time window between the table had been cleared and before table restored. If configured a new port base vlan tag at this moment, driver will check the soft recorded vlan information, and find there hasn't the old tag in it, which causing a warning print. Due to the port base vlan is managed by PF, so the VFs's port base vlan restoring should be handled by PF when PF was ready. This patch fixes it. Fixes: `039ba863e8` ("net: hns3: optimize the filter table entries handling when resetting") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00
Jian Shen	5e528c0e06	net: hns3: fix bug when PF set the duplicate MAC address for VFs [ Upstream commit `ccb18f0553` ] If the MAC address A is configured to vport A and then vport B. The MAC address of vport A in the hardware becomes invalid. If the address of vport A is changed to MAC address B, the driver needs to delete the MAC address A of vport A. Due to the MAC address A of vport A has become invalid in the hardware entry, so "-ENOENT" is returned. In this case, the "used_umv_size" value recorded in driver is not updated. As a result, the MAC entry status of the software is inconsistent with that of the hardware. Therefore, the driver updates the umv size even if the MAC entry cannot be found. Ensure that the software and hardware status is consistent. Fixes: `ee4bcd3b7a` ("net: hns3: refactor the MAC address configure") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00
Vladimir Oltean	be6937a11b	net: enetc: report software timestamping via SO_TIMESTAMPING [ Upstream commit `feb13dcb18` ] Let user space properly determine that the enetc driver provides software timestamps. Fixes: `4caefbce06` ("enetc: add software timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20220324161210.4122281-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:52 +02:00

1 2 3 4 5 ...

1050221 Commits