linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 11:26:02 +09:00

Author	SHA1	Message	Date
YueHaibing	a16516ef11	net/x25: Fix null-ptr-deref in x25_disconnect commit `8999dc8949` upstream. We should check null before do x25_neigh_put in x25_disconnect, otherwise may cause null-ptr-deref like this: #include <sys/socket.h> #include <linux/x25.h> int main() { int sck_x25; sck_x25 = socket(AF_X25, SOCK_SEQPACKET, 0); close(sck_x25); return 0; } BUG: kernel NULL pointer dereference, address: 00000000000000d8 CPU: 0 PID: 4817 Comm: t2 Not tainted 5.7.0-rc3+ #159 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3- RIP: 0010:x25_disconnect+0x91/0xe0 Call Trace: x25_release+0x18a/0x1b0 __sock_release+0x3d/0xc0 sock_close+0x13/0x20 __fput+0x107/0x270 ____fput+0x9/0x10 task_work_run+0x6d/0xb0 exit_to_usermode_loop+0x102/0x110 do_syscall_64+0x23c/0x260 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Reported-by: syzbot+6db548b615e5aeefdce2@syzkaller.appspotmail.com Fixes: `4becb7ee5b` ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:44:12 +09:00
Xiyu Yang	2153cceab1	net/x25: Fix x25_neigh refcnt leak when x25 disconnect commit `4becb7ee5b` upstream. x25_connect() invokes x25_get_neigh(), which returns a reference of the specified x25_neigh object to "x25->neighbour" with increased refcnt. When x25 connect success and returns, the reference still be hold by "x25->neighbour", so the refcount should be decreased in x25_disconnect() to keep refcount balanced. The reference counting issue happens in x25_disconnect(), which forgets to decrease the refcnt increased by x25_get_neigh() in x25_connect(), causing a refcnt leak. Fix this issue by calling x25_neigh_put() before x25_disconnect() returns. Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:44:11 +09:00
Rolf Eike Beer	7f45f1d42f	install several missing uapi headers Commit `fcc8487d47` ("uapi: export all headers under uapi directories") changed the default to install all headers not marked to be conditional. This takes the list of headers listed in the commit message and manually adds an export for those that are already present in this kernel version. Found during an attempt to build mtd-utils 2.1.2 as it wants hash_info.h, which exists since 3.13 but has not been installed until the above mentioned commit, which ended up in 4.12. Signed-off-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:44:09 +09:00
Nicolas Dichtel	8a651805d3	uapi: includes linux/types.h before exporting files commit `9078b4eea1` upstream. Some files will be exported after a following patch. 0-day tests report the following warning/error: ./usr/include/linux/bcache.h:8: include of <linux/types.h> is preferred over <asm/types.h> ./usr/include/linux/bcache.h:11: found __[us]{8,16,32,64} type without #include <linux/types.h> ./usr/include/linux/qrtr.h:8: found __[us]{8,16,32,64} type without #include <linux/types.h> ./usr/include/linux/cryptouser.h:39: found __[us]{8,16,32,64} type without #include <linux/types.h> ./usr/include/linux/pr.h:14: found __[us]{8,16,32,64} type without #include <linux/types.h> ./usr/include/linux/btrfs_tree.h:337: found __[us]{8,16,32,64} type without #include <linux/types.h> ./usr/include/rdma/bnxt_re-abi.h:45: found __[us]{8,16,32,64} type without #include <linux/types.h> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> reb: left out include/uapi/rdma/bnxt_re-abi.h as it's not in this kernel version Signed-off-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:44:08 +09:00
Rik van Riel	51146e5a79	xfs: fix missed wakeup on l_flush_wait commit `cdea5459ce` upstream. The code in xlog_wait uses the spinlock to make adding the task to the wait queue, and setting the task state to UNINTERRUPTIBLE atomic with respect to the waker. Doing the wakeup after releasing the spinlock opens up the following race condition: Task 1 task 2 add task to wait queue wake up task set task state to UNINTERRUPTIBLE This issue was found through code inspection as a result of kworkers being observed stuck in UNINTERRUPTIBLE state with an empty wait queue. It is rare and largely unreproducable. Simply moving the spin_unlock to after the wake_up_all results in the waker not being able to see a task on the waitqueue before it has set its state to UNINTERRUPTIBLE. This bug dates back to the conversion of this code to generic waitqueue infrastructure from a counting semaphore back in 2008 which didn't place the wakeups consistently w.r.t. to the relevant spin locks. [dchinner: Also fix a similar issue in the shutdown path on xc_commit_wait. Update commit log with more details of the issue.] Fixes: `d748c62367` ("[XFS] Convert l_flushsema to a sv_t") Reported-by: Chris Mason <clm@fb.com> Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Cc: stable@vger.kernel.org # 4.9.x-4.19.x [modified for contextual change near xlog_state_do_callback()] Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com> Reviewed-by: Frank van der Linden <fllinden@amazon.com> Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> Reviewed-by: Anchal Agarwal <anchalag@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:44:07 +09:00
Peilin Ye	d634bc2bd9	rds: Prevent kernel-infoleak in rds_notify_queue_get() commit `bbc8a99e95` upstream. rds_notify_queue_get() is potentially copying uninitialized kernel stack memory to userspace since the compiler may leave a 4-byte hole at the end of `cmsg`. In 2016 we tried to fix this issue by doing `= { 0 };` on `cmsg`, which unfortunately does not always initialize that 4-byte hole. Fix it by using memset() instead. Cc: stable@vger.kernel.org Fixes: `f037590fff` ("rds: fix a leak of kernel memory") Fixes: `bdbe6fbc6a` ("RDS: recv.c") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:44:05 +09:00
Tetsuo Handa	ffe17bae77	fbdev: Detect integer underflow at "struct fbcon_ops"->clear_margins. [ Upstream commit `033724d686` ] syzbot is reporting general protection fault in bitfill_aligned() [1] caused by integer underflow in bit_clear_margins(). The cause of this problem is when and how do_vc_resize() updates vc->vc_{cols,rows}. If vc_do_resize() fails (e.g. kzalloc() fails) when var.xres or var.yres is going to shrink, vc->vc_{cols,rows} will not be updated. This allows bit_clear_margins() to see info->var.xres < (vc->vc_cols * cw) or info->var.yres < (vc->vc_rows * ch). Unexpectedly large rw or bh will try to overrun the __iomem region and causes general protection fault. Also, vc_resize(vc, 0, 0) does not set vc->vc_{cols,rows} = 0 due to new_cols = (cols ? cols : vc->vc_cols); new_rows = (lines ? lines : vc->vc_rows); exception. Since cols and lines are calculated as cols = FBCON_SWAP(ops->rotate, info->var.xres, info->var.yres); rows = FBCON_SWAP(ops->rotate, info->var.yres, info->var.xres); cols /= vc->vc_font.width; rows /= vc->vc_font.height; vc_resize(vc, cols, rows); in fbcon_modechanged(), var.xres < vc->vc_font.width makes cols = 0 and var.yres < vc->vc_font.height makes rows = 0. This means that const int fd = open("/dev/fb0", O_ACCMODE); struct fb_var_screeninfo var = { }; ioctl(fd, FBIOGET_VSCREENINFO, &var); var.xres = var.yres = 1; ioctl(fd, FBIOPUT_VSCREENINFO, &var); easily reproduces integer underflow bug explained above. Of course, callers of vc_resize() are not handling vc_do_resize() failure is bad. But we can't avoid vc_resize(vc, 0, 0) which returns 0. Therefore, as a band-aid workaround, this patch checks integer underflow in "struct fbcon_ops"->clear_margins call, assuming that vc->vc_cols * vc->vc_font.width and vc->vc_rows * vc->vc_font.heigh do not cause integer overflow. [1] https://syzkaller.appspot.com/bug?id=a565882df74fa76f10d3a6fec4be31098dbb37c6 Reported-and-tested-by: syzbot <syzbot+e5fd3e65515b48c02a30@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200715015102.3814-1-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:44:04 +09:00
Joerg Roedel	3e01d30dc5	x86, vmlinux.lds: Page-align end of ..page_aligned sections [ Upstream commit `de2b41be8f` ] On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is page-aligned, but the end of the .bss..page_aligned section is not guaranteed to be page-aligned. As a result, objects from other .bss sections may end up on the same 4k page as the idt_table, and will accidentially get mapped read-only during boot, causing unexpected page-faults when the kernel writes to them. This could be worked around by making the objects in the page aligned sections page sized, but that's wrong. Explicit sections which store only page aligned objects have an implicit guarantee that the object is alone in the page in which it is placed. That works for all objects except the last one. That's inconsistent. Enforcing page sized objects for these sections would wreckage memory sanitizers, because the object becomes artificially larger than it should be and out of bound access becomes legit. Align the end of the .bss..page_aligned and .data..page_aligned section on page-size so all objects places in these sections are guaranteed to have their own page. [ tglx: Amended changelog ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:44:02 +09:00
Sami Tolvanen	bfca7be148	x86/build/lto: Fix truncated .bss with -fdata-sections [ Upstream commit `6a03469a1e` ] With CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y, we compile the kernel with -fdata-sections, which also splits the .bss section. The new section, with a new .bss.* name, which pattern gets missed by the main x86 linker script which only expects the '.bss' name. This results in the discarding of the second part and a too small, truncated .bss section and an unhappy, non-working kernel. Use the common BSS_MAIN macro in the linker script to properly capture and merge all the generated BSS sections. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20190415164956.124067-1-samitolvanen@google.com [ Extended the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:44:01 +09:00
Wang Hai	390eb5a344	9p/trans_fd: Fix concurrency del of req_list in p9_fd_cancelled/p9_read_work [ Upstream commit `74d6a5d566` ] p9_read_work and p9_fd_cancelled may be called concurrently. In some cases, req->req_list may be deleted by both p9_read_work and p9_fd_cancelled. We can fix it by ignoring replies associated with a cancelled request and ignoring cancelled request if message has been received before lock. Link: http://lkml.kernel.org/r/20200612090833.36149-1-wanghai38@huawei.com Fixes: `60ff779c4a` ("9p: client: remove unused code and any reference to "cancelled" function") Cc: <stable@vger.kernel.org> # v3.12+ Reported-by: syzbot+77a25acfa0382e06ab23@syzkaller.appspotmail.com Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:59 +09:00
Dominique Martinet	c23390e5e1	9p/trans_fd: abort p9_read_work if req status changed [ Upstream commit `e4ca13f7d0` ] p9_read_work would try to handle an errored req even if it got put to error state by another thread between the lookup (that worked) and the time it had been fully read. The request itself is safe to use because we hold a ref to it from the lookup (for m->rreq, so it was safe to read into the request data buffer until this point), but the req_list has been deleted at the same time status changed, and client_cb already has been called as well, so we should not do either. Link: http://lkml.kernel.org/r/1539057956-23741-1-git-send-email-asmadeus@codewreck.org Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Reported-by: syzbot+2222c34dc40b515f30dc@syzkaller.appspotmail.com Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:58 +09:00
Sheng Yong	130b8cc7fd	f2fs: check if file namelen exceeds max value [ Upstream commit `720db06863` ] Dentry bitmap is not enough to detect incorrect dentries. So this patch also checks the namelen value of a dentry. Signed-off-by: Gong Chen <gongchen4@huawei.com> Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:56 +09:00
Jaegeuk Kim	021efd48eb	f2fs: check memory boundary by insane namelen [ Upstream commit `4e240d1bab` ] If namelen is corrupted to have very long value, fill_dentries can copy wrong memory area. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:55 +09:00
Steve Cohen	142b281a90	drm: hold gem reference until object is no longer accessed commit `8490d6a7e0` upstream. A use-after-free in drm_gem_open_ioctl can happen if the GEM object handle is closed between the idr lookup and retrieving the size from said object since a local reference is not being held at that point. Hold the local reference while the object can still be accessed to fix this and plug the potential security hole. Signed-off-by: Steve Cohen <cohens@codeaurora.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1595284250-31580-1-git-send-email-cohens@codeaurora.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:43:53 +09:00
Peilin Ye	af2666a3ab	drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl() commit `543e8669ed` upstream. Compiler leaves a 4-byte hole near the end of `dev_info`, causing amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace when `size` is greater than 356. In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which unfortunately does not initialize that 4-byte hole. Fix it by using memset() instead. Cc: stable@vger.kernel.org Fixes: `c193fa91b9` ("drm/amdgpu: information leak in amdgpu_info_ioctl()") Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:43:52 +09:00
Will Deacon	88da4edbb3	ARM: 8986/1: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints commit `eec13b42d4` upstream. Unprivileged memory accesses generated by the so-called "translated" instructions (e.g. LDRT) in kernel mode can cause user watchpoints to fire unexpectedly. In such cases, the hw_breakpoint logic will invoke the user overflow handler which will typically raise a SIGTRAP back to the current task. This is futile when returning back to the kernel because (a) the signal won't have been delivered and (b) userspace can't handle the thing anyway. Avoid invoking the user overflow handler for watchpoints triggered by kernel uaccess routines, and instead single-step over the faulting instruction as we would if no overflow handler had been installed. Cc: <stable@vger.kernel.org> Fixes: `f81ef4a920` ("ARM: 6356/1: hw-breakpoint: add ARM backend for the hw-breakpoint framework") Reported-by: Luis Machado <luis.machado@linaro.org> Tested-by: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:43:50 +09:00
Robert Hancock	4b57c0bc53	PCI/ASPM: Disable ASPM on ASMedia ASM1083/1085 PCIe-to-PCI bridge commit `b361663c5a` upstream. Recently ASPM handling was changed to allow ASPM on PCIe-to-PCI/PCI-X bridges. Unfortunately the ASMedia ASM1083/1085 PCIe to PCI bridge device doesn't seem to function properly with ASPM enabled. On an Asus PRIME H270-PRO motherboard, it causes errors like these: pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID) pcieport 0000:00:1c.0: AER: device [8086:a292] error status/mask=00003000/00002000 pcieport 0000:00:1c.0: AER: [12] Timeout pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0 pcieport 0000:00:1c.0: AER: can't find device of ID00e0 In addition to flooding the kernel log, this also causes the machine to wake up immediately after suspend is initiated. The device advertises ASPM L0s and L1 support in the Link Capabilities register, but the ASMedia web page for ASM1083 [1] claims "No PCIe ASPM support". Windows 10 (build 2004) enables L0s, but it also logs correctable PCIe errors. Add a quirk to disable ASPM for this device. [1] https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114 [bhelgaas: commit log] Fixes: `66ff14e59e` ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208667 Link: https://lore.kernel.org/r/20200722021803.17958-1-hancockrwd@gmail.com Signed-off-by: Robert Hancock <hancockrwd@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:43:49 +09:00
Navid Emamdoost	b0447a7313	ath9k: release allocated buffer if timed out [ Upstream commit `728c1e2a05` ] In ath9k_wmi_cmd, the allocated network buffer needs to be released if timeout happens. Otherwise memory will be leaked. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:47 +09:00
Navid Emamdoost	a7a850bbd2	ath9k_htc: release allocated buffer if timed out [ Upstream commit `853acf7caf` ] In htc_config_pipe_credits, htc_setup_complete, and htc_connect_service if time out happens, the allocated buffer needs to be released. Otherwise there will be memory leak. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:46 +09:00
Navid Emamdoost	410c271b51	media: rc: prevent memory leak in cx23888_ir_probe [ Upstream commit `a7b2df76b4` ] In cx23888_ir_probe if kfifo_alloc fails the allocated memory for state should be released. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:44 +09:00
Navid Emamdoost	800386c5b8	crypto: ccp - Release all allocated memory if sha type is invalid [ Upstream commit `128c664292` ] Release all allocated memory if sha type is invalid: In ccp_run_sha_cmd, if the type of sha is invalid, the allocated hmac_buf should be released. v2: fix the goto. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Acked-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:43:43 +09:00
Wei Yongjun	0833d9c06b	net: phy: mdio-bcm-unimac: fix potential NULL dereference in unimac_mdio_probe() [ Upstream commit `297a6961ff` ] platform_get_resource() may fail and return NULL, so we should better check it's return value to avoid a NULL pointer dereference a bit later in the code. This is detected by Coccinelle semantic patch. @@ expression pdev, res, n, t, e, e1, e2; @@ res = platform_get_resource(pdev, t, n); + if (!res) + return -EINVAL; ... when != res == NULL e = devm_ioremap(e1, res->start, e2); Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:39:54 +09:00
Eric Sandeen	9eb6823eb6	xfs: don't call xfs_da_shrink_inode with NULL bp [ Upstream commit `bb3d48dcf8` ] xfs_attr3_leaf_create may have errored out before instantiating a buffer, for example if the blkno is out of range. In that case there is no work to do to remove it, and in fact xfs_da_shrink_inode will lead to an oops if we try. This also seems to fix a flaw where the original error from xfs_attr3_leaf_create gets overwritten in the cleanup case, and it removes a pointless assignment to bp which isn't used after this. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199969 Reported-by: Xu, Wen <wen.xu@gatech.edu> Tested-by: Xu, Wen <wen.xu@gatech.edu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:39:52 +09:00
Dave Chinner	e2d429df38	xfs: validate cached inodes are free when allocated [ Upstream commit `afca6c5b25` ] A recent fuzzed filesystem image cached random dcache corruption when the reproducer was run. This often showed up as panics in lookup_slow() on a null inode->i_ops pointer when doing pathwalks. BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 .... Call Trace: lookup_slow+0x44/0x60 walk_component+0x3dd/0x9f0 link_path_walk+0x4a7/0x830 path_lookupat+0xc1/0x470 filename_lookup+0x129/0x270 user_path_at_empty+0x36/0x40 path_listxattr+0x98/0x110 SyS_listxattr+0x13/0x20 do_syscall_64+0xf5/0x280 entry_SYSCALL_64_after_hwframe+0x42/0xb7 but had many different failure modes including deadlocks trying to lock the inode that was just allocated or KASAN reports of use-after-free violations. The cause of the problem was a corrupt INOBT on a v4 fs where the root inode was marked as free in the inobt record. Hence when we allocated an inode, it chose the root inode to allocate, found it in the cache and re-initialised it. We recently fixed a similar inode allocation issue caused by inobt record corruption problem in xfs_iget_cache_miss() in commit `ee457001ed` ("xfs: catch inode allocation state mismatch corruption"). This change adds similar checks to the cache-hit path to catch it, and turns the reproducer into a corruption shutdown situation. Reported-by: Wen Xu <wen.xu@gatech.edu> Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix typos in comment] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:39:50 +09:00
Dave Chinner	a4971da960	xfs: catch inode allocation state mismatch corruption [ Upstream commit `ee457001ed` ] We recently came across a V4 filesystem causing memory corruption due to a newly allocated inode being setup twice and being added to the superblock inode list twice. From code inspection, the only way this could happen is if a newly allocated inode was not marked as free on disk (i.e. di_mode wasn't zero). Running the metadump on an upstream debug kernel fails during inode allocation like so: XFS: Assertion failed: ip->i_d.di_nblocks == 0, file: fs/xfs/xfs_inod= e.c, line: 838 ------------[ cut here ]------------ kernel BUG at fs/xfs/xfs_message.c:114! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 11 PID: 3496 Comm: mkdir Not tainted 4.16.0-rc5-dgc #442 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/0= 1/2014 RIP: 0010:assfail+0x28/0x30 RSP: 0018:ffffc9000236fc80 EFLAGS: 00010202 RAX: 00000000ffffffea RBX: 0000000000004000 RCX: 0000000000000000 RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff8227211b RBP: ffffc9000236fce8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000bec R11: f000000000000000 R12: ffffc9000236fd30 R13: ffff8805c76bab80 R14: ffff8805c77ac800 R15: ffff88083fb12e10 FS: 00007fac8cbff040(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000= 000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fffa6783ff8 CR3: 00000005c6e2b003 CR4: 00000000000606e0 Call Trace: xfs_ialloc+0x383/0x570 xfs_dir_ialloc+0x6a/0x2a0 xfs_create+0x412/0x670 xfs_generic_create+0x1f7/0x2c0 ? capable_wrt_inode_uidgid+0x3f/0x50 vfs_mkdir+0xfb/0x1b0 SyS_mkdir+0xcf/0xf0 do_syscall_64+0x73/0x1a0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Extracting the inode number we crashed on from an event trace and looking at it with xfs_db: xfs_db> inode 184452204 xfs_db> p core.magic = 0x494e core.mode = 0100644 core.version = 2 core.format = 2 (extents) core.nlinkv2 = 1 core.onlink = 0 ..... Confirms that it is not a free inode on disk. xfs_repair also trips over this inode: ..... zero length extent (off = 0, fsbno = 0) in ino 184452204 correcting nextents for inode 184452204 bad attribute fork in inode 184452204, would clear attr fork bad nblocks 1 for inode 184452204, would reset to 0 bad anextents 1 for inode 184452204, would reset to 0 imap claims in-use inode 184452204 is free, would correct imap would have cleared inode 184452204 ..... disconnected inode 184452204, would move to lost+found And so we have a situation where the directory structure and the inobt thinks the inode is free, but the inode on disk thinks it is still in use. Where this corruption came from is not possible to diagnose, but we can detect it and prevent the kernel from oopsing on lookup. The reproducer now results in: $ sudo mkdir /mnt/scratch/{0,1,2,3,4,5}{0,1,2,3,4,5} mkdir: cannot create directory =E2=80=98/mnt/scratch/00=E2=80=99: File ex= ists mkdir: cannot create directory =E2=80=98/mnt/scratch/01=E2=80=99: File ex= ists mkdir: cannot create directory =E2=80=98/mnt/scratch/03=E2=80=99: Structu= re needs cleaning mkdir: cannot create directory =E2=80=98/mnt/scratch/04=E2=80=99: Input/o= utput error mkdir: cannot create directory =E2=80=98/mnt/scratch/05=E2=80=99: Input/o= utput error .... And this corruption shutdown: [ 54.843517] XFS (loop0): Corruption detected! Free inode 0xafe846c not= marked free on disk [ 54.845885] XFS (loop0): Internal error xfs_trans_cancel at line 1023 = of file fs/xfs/xfs_trans.c. Caller xfs_create+0x425/0x670 [ 54.848994] CPU: 10 PID: 3541 Comm: mkdir Not tainted 4.16.0-rc5-dgc #= 443 [ 54.850753] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIO= S 1.10.2-1 04/01/2014 [ 54.852859] Call Trace: [ 54.853531] dump_stack+0x85/0xc5 [ 54.854385] xfs_trans_cancel+0x197/0x1c0 [ 54.855421] xfs_create+0x425/0x670 [ 54.856314] xfs_generic_create+0x1f7/0x2c0 [ 54.857390] ? capable_wrt_inode_uidgid+0x3f/0x50 [ 54.858586] vfs_mkdir+0xfb/0x1b0 [ 54.859458] SyS_mkdir+0xcf/0xf0 [ 54.860254] do_syscall_64+0x73/0x1a0 [ 54.861193] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 54.862492] RIP: 0033:0x7fb73bddf547 [ 54.863358] RSP: 002b:00007ffdaa553338 EFLAGS: 00000246 ORIG_RAX: 0000= 000000000053 [ 54.865133] RAX: ffffffffffffffda RBX: 00007ffdaa55449a RCX: 00007fb73= bddf547 [ 54.866766] RDX: 0000000000000001 RSI: 00000000000001ff RDI: 00007ffda= a55449a [ 54.868432] RBP: 00007ffdaa55449a R08: 00000000000001ff R09: 00005623a= 8670dd0 [ 54.870110] R10: 00007fb73be72d5b R11: 0000000000000246 R12: 000000000= 00001ff [ 54.871752] R13: 00007ffdaa5534b0 R14: 0000000000000000 R15: 00007ffda= a553500 [ 54.873429] XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1= 024 of file fs/xfs/xfs_trans.c. Return address = ffffffff814cd050 [ 54.882790] XFS (loop0): Corruption of in-memory data detected. Shutt= ing down filesystem [ 54.884597] XFS (loop0): Please umount the filesystem and rectify the = problem(s) Note that this crash is only possible on v4 filesystemsi or v5 filesystems mounted with the ikeep mount option. For all other V5 filesystems, this problem cannot occur because we don't read inodes we are allocating from disk - we simply overwrite them with the new inode information. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Tested-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:39:49 +09:00
secuflag	b8154d6f54	Revert "block: cgroups, kconfig, build bits for BFQ-v7r11-4.5.0" This reverts commit f6532c1fec5fa35626b08dfcc1d5705af96be3e7.	2023-05-16 08:39:45 +09:00
secuflag	3299b45ec6	Revert "N2/C4: Set BFQ as default IO sched" This reverts commit 2ad1115354a197d89464edb469b043d03037a552.	2023-05-16 08:39:39 +09:00
secuflag	4f42c9548f	N2/C4: Set BFQ as default IO sched	2023-05-16 08:39:15 +09:00
Paolo Valente	48f3ff96f2	block: cgroups, kconfig, build bits for BFQ-v7r11-4.5.0 Update Kconfig.iosched and do the related Makefile changes to include kernel configuration options for BFQ. Also increase the number of policies supported by the blkio controller so that BFQ can add its own. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: Arianna Avanzini <avanzini@google.com> block: introduce the BFQ-v7r11 I/O sched for 4.5.0 The general structure is borrowed from CFQ, as much of the code for handling I/O contexts. Over time, several useful features have been ported from CFQ as well (details in the changelog in README.BFQ). A (bfq_)queue is associated to each task doing I/O on a device, and each time a scheduling decision has to be made a queue is selected and served until it expires. - Slices are given in the service domain: tasks are assigned budgets, measured in number of sectors. Once got the disk, a task must however consume its assigned budget within a configurable maximum time (by default, the maximum possible value of the budgets is automatically computed to comply with this timeout). This allows the desired latency vs "throughput boosting" tradeoff to be set. - Budgets are scheduled according to a variant of WF2Q+, implemented using an augmented rb-tree to take eligibility into account while preserving an O(log N) overall complexity. - A low-latency tunable is provided; if enabled, both interactive and soft real-time applications are guaranteed a very low latency. - Latency guarantees are preserved also in the presence of NCQ. - Also with flash-based devices, a high throughput is achieved while still preserving latency guarantees. - BFQ features Early Queue Merge (EQM), a sort of fusion of the cooperating-queue-merging and the preemption mechanisms present in CFQ. EQM is in fact a unified mechanism that tries to get a sequential read pattern, and hence a high throughput, with any set of processes performing interleaved I/O over a contiguous sequence of sectors. - BFQ supports full hierarchical scheduling, exporting a cgroups interface. Since each node has a full scheduler, each group can be assigned its own weight. - If the cgroups interface is not used, only I/O priorities can be assigned to processes, with ioprio values mapped to weights with the relation weight = IOPRIO_BE_NR - ioprio. - ioprio classes are served in strict priority order, i.e., lower priority queues are not served as long as there are higher priority queues. Among queues in the same class the bandwidth is distributed in proportion to the weight of each queue. A very thin extra bandwidth is however guaranteed to the Idle class, to prevent it from starving. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: Arianna Avanzini <avanzini@google.com> block, bfq: add Early Queue Merge (EQM) to BFQ-v7r11 for 4.5.0 A set of processes may happen to perform interleaved reads, i.e.,requests whose union would give rise to a sequential read pattern. There are two typical cases: in the first case, processes read fixed-size chunks of data at a fixed distance from each other, while in the second case processes may read variable-size chunks at variable distances. The latter case occurs for example with QEMU, which splits the I/O generated by the guest into multiple chunks, and lets these chunks be served by a pool of cooperating processes, iteratively assigning the next chunk of I/O to the first available process. CFQ uses actual queue merging for the first type of rocesses, whereas it uses preemption to get a sequential read pattern out of the read requests performed by the second type of processes. In the end it uses two different mechanisms to achieve the same goal: boosting the throughput with interleaved I/O. This patch introduces Early Queue Merge (EQM), a unified mechanism to get a sequential read pattern with both types of processes. The main idea is checking newly arrived requests against the next request of the active queue both in case of actual request insert and in case of request merge. By doing so, both the types of processes can be handled by just merging their queues. EQM is then simpler and more compact than the pair of mechanisms used in CFQ. Finally, EQM also preserves the typical low-latency properties of BFQ, by properly restoring the weight-raising state of a queue when it gets back to a non-merged state. Signed-off-by: Mauro Andreolini <mauro.andreolini@unimore.it> Signed-off-by: Arianna Avanzini <avanzini@google.com> Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Turn into BFQ-v8r7 for 4.9.0 CHANGELOG from v8r4 to v8r7 BFQ v8r7 BUGFIX: make BFQ compile also without hierarchical support BFQ v8r6 BUGFIX Removed the check that, when the new queue to set in service must be selected, the cached next_in_service entities coincide with the entities chosen by __bfq_lookup_next_entity. This check, issuing a warning on failure, was wrong, because the cached and the newly chosen entity could differ in case of a CLASS_IDLE timeout. EFFICIENCY IMPROVEMENT (this improvement is related to the above BUGFIX) The cached next_in_service entities are now really used to select the next queue to serve when the in-service queue expires. Before this change, the cached values were used only for extra (and in general wrong) consistency checks. This caused additional overhead instead of reducing it. EFFICIENCY IMPROVEMENT The next entity to serve, for each level of the hierarchy, is now updated on every event that may change it, i.e., on every activation or deactivation of any entity. This finer granularity is not strictly needed for corectness, because it is only on queue expirations that BFQ needs to know what are the next entities to serve. Yet this change makes it possible to implement optimizations in which it is necessary to know the next queue to serve before the in-service queue expires. SERVICE-ACCURACY IMPROVEMENT The per-device CLASS_IDLE service timeout has been turned into a much more accurate per-group timeout. CODE-QUALITY IMPROVEMENT The non-trivial parts touched by the above improvements have been partially rewritten, and enriched of comments, so as to improve their transparency and understandability. IMPROVEMENT Ported and improved CFQ commit `41647e7a` Before this improvememtn, BFQ used the same logic for detecting seeky queues for rotational disks and SSDs. This logic is appropriate for the former, as it takes into account only inter-request distance, and the latter is the dominant latency factor on a rotational device. Yet things change with flash-based devices, where serving a large request still yields a high throughput, even the request is far from the previous request served. This commits extends seeky detection to take into accoutn also this fact with flash-based devices. In particular, this commit is an improved port of the original commit `41647e7a` for CFQ. CODE IMPROVEMENT Remove useless parameter from bfq_del_bfqq_busy OPTIMIZATION Optimize the update of next_in_service entity. If the update of the next_in_service candidate entity is triggered by the activation of an entity, then it is not necessary to perform full lookups in the active trees to update next_in_service. In fact, it is enough to check whether the just-activated entity has a higher priority than next_in_service, or, even if it has the same priority as next_in_service, is eligible and has a lower virtual finish time than next_in_service. If this compound condition holds, then the new entity can be set as the new next_in_service. Otherwise no change is needed. This commit implements this optimization. BUGFIX Fix bug causing occasional loss of weight raising. When a bfq_queue, say bfqq, is split after a merging with another bfq_queue, BFQ checks whether it has to restore for bfqq the weight-raising state that bfqq had before being merged. In particular, the weight-raising is restored only if, according to the weight-raising duration decided for bfqq when it started to be weight-raised (before being merged), bfqq would not have already finished its weight-raising period. Yet, by mistake, such a duration was not saved when bfqq is merged. So, if bfqq was freed and reallocated when it was split, then this duration was wrongly set to zero on the split. As a consequence, the weight-raising state of bfqq was wrongly not restored, which caused BFQ to fail in guaranteeing a low latency to bfqq. This commit fixes this bug by saving weight-raising duration when bfqq is merged, and correctly restoring it when bfqq is split. BUGFIX Fix wrong reset of in-service entities In-service entities were reset with an indirect logic, which happened to be even buggy for some cases. This commit fixes this bug in two important steps. First, by replacing this indirect logic with a direct logic, in which all involved entities are immediately reset, with a bubble-up loop, when the in-service queue is reset. Second, by restructuring the code related to this change, so as to become not only correct with respect to this change, but also cleaner and hopefully clearer. CODE IMPROVEMENT Add code to be able to redirect trace log to console. BUGFIX Fixed bug in optimized update of next_in_service entity. There was a case where bfq_update_next_in_service did not update next_in_service, even if it might need to be changed: in case of requeueing or repositioning of the entity that happened to be pointed exactly by next_in_service. This could result in violation of service guarantees, because, after a change of timestamps for such an entity, it might be the case that next_in_service had to point to a different entity. This commit fixes this bug. OPTIMIZATION Stop bubble-up of next_in_service update if possible. BUGFIX Fixed a false-positive warning for uninitialized var BFQ-v8r5 DOCUMENTATION IMPROVEMENT Added documentation of BFQ benefits, inner workings, interface and tunables. BUGFIX: Replaced max wrongly used for modulo numbers. DOCUMENTATION IMPROVEMENT Improved help message in Kconfig.iosched. BUGFIX: Removed wrong conversion in use of bfq_fifo_expire. CODE IMPROVEMENT Added parentheses to complex macros. Signed-off-by: Paolo Valente <paolo.valente@linaro.org>	2023-05-16 08:39:14 +09:00
Greg Kroah-Hartman	4a61b06bdb	Linux 4.9.232 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:12 +09:00
Changbin Du	1d0ee22bfa	perf: Make perf able to build with latest libbfd commit `0ada120c88` upstream. libbfd has changed the bfd_section_* macros to inline functions bfd_section_<field> since 2019-09-18. See below two commits: o http://www.sourceware.org/ml/gdb-cvs/2019-09/msg00064.html o https://www.sourceware.org/ml/gdb-cvs/2019-09/msg00072.html This fix make perf able to build with both old and new libbfd. Signed-off-by: Changbin Du <changbin.du@gmail.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200128152938.31413-1-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:11 +09:00
Jiri Olsa	695a12292d	perf tools: Fix snprint warnings for gcc 8 commit `77f18153c0` upstream. [Add an additional sprintf replacement in tools/perf/builtin-script.c] With gcc 8 we get new set of snprintf() warnings that breaks the compilation, one example: tests/mem.c: In function ‘check’: tests/mem.c:19:48: error: ‘%s’ directive output may be truncated writing \ up to 99 bytes into a region of size 89 [-Werror=format-truncation=] snprintf(failure, sizeof failure, "unexpected %s", out); The gcc docs says: To avoid the warning either use a bigger buffer or handle the function's return value which indicates whether or not its output has been truncated. Given that all these warnings are harmless, because the code either properly fails due to uncomplete file path or we don't care for truncated output at all, I'm changing all those snprintf() calls to scnprintf(), which actually 'checks' for the snprint return value so the gcc stays silent. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Link: http://lkml.kernel.org/r/20180319082902.4518-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:10 +09:00
Arnaldo Carvalho de Melo	7c83710c9a	perf annotate: Use asprintf when formatting objdump command line commit `6810158d52` upstream. We were using a local buffer with an arbitrary size, that would have to get increased to avoid truncation as warned by gcc 8: util/annotate.c: In function 'symbol__disassemble': util/annotate.c:1488:4: error: '%s' directive output may be truncated writing up to 4095 bytes into a region of size between 3966 and 8086 [-Werror=format-truncation=] "%s %s%s --start-address=0x%016" PRIx64 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ util/annotate.c:1498:20: symfs_filename, symfs_filename); ~~~~~~~~~~~~~~ util/annotate.c:1490:50: note: format string is defined here " -l -d %s %s -C \"%s\" 2>/dev/null\|grep -v \"%s:\"\|expand", ^~ In file included from /usr/include/stdio.h:861, from util/color.h:5, from util/sort.h:8, from util/annotate.c:14: /usr/include/bits/stdio2.h:67:10: note: '__builtin___snprintf_chk' output 116 or more bytes (assuming 8331) into a destination of size 8192 return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ __bos (__s), __fmt, __va_arg_pack ()); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ So switch to asprintf, that will make sure enough space is available. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-qagoy2dmbjpc9gdnaj0r3mml@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:09 +09:00
Masami Hiramatsu	b87f172a0b	perf probe: Fix to check blacklist address correctly commit `80526491c2` upstream. Fix to check kprobe blacklist address correctly with relocated address by adjusting debuginfo address. Since the address in the debuginfo is same as objdump, it is different from relocated kernel address with KASLR. Thus, 'perf probe' always misses to catch the blacklisted addresses. Without this patch, 'perf probe' can not detect the blacklist addresses on a KASLR enabled kernel. # perf probe kprobe_dispatcher Failed to write event: Invalid argument Error: Failed to add events. # With this patch, it correctly shows the error message. # perf probe kprobe_dispatcher kprobe_dispatcher is blacklisted function, skip it. Probe point 'kprobe_dispatcher' not found. Error: Failed to add events. # Fixes: `9aaf5a5f47` ("perf probe: Check kprobes blacklist when adding new events") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/158763966411.30755.5882376357738273695.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:08 +09:00
Eric Sandeen	102244e0da	xfs: set format back to extents if xfs_bmap_extents_to_btree commit `2c4306f719` upstream. If xfs_bmap_extents_to_btree fails in a mode where we call xfs_iroot_realloc(-1) to de-allocate the root, set the format back to extents. Otherwise we can assume we can dereference ifp->if_broot based on the XFS_DINODE_FMT_BTREE format, and crash. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199423 Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:07 +09:00
Peng Fan	ba85eb0d5d	regmap: debugfs: check count when read regmap file commit `74edd08a4f` upstream. When executing the following command, we met kernel dump. dmesg -c > /dev/null; cd /sys; for i in `ls /sys/kernel/debug/regmap/* -d`; do echo "Checking regmap in $i"; cat $i/registers; done && grep -ri "0x02d0" *; It is because the count value is too big, and kmalloc fails. So add an upper bound check to allow max size `PAGE_SIZE << (MAX_ORDER - 1)`. Signed-off-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/1584064687-12964-1-git-send-email-peng.fan@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:06 +09:00
Xie He	aab485e9ce	drivers/net/wan/x25_asy: Fix to make it work [ Upstream commit `8fdcabeac3` ] This driver is not working because of problems of its receiving code. This patch fixes it to make it work. When the driver receives an LAPB frame, it should first pass the frame to the LAPB module to process. After processing, the LAPB module passes the data (the packet) back to the driver, the driver should then add a one-byte pseudo header and pass the data to upper layers. The changes to the "x25_asy_bump" function and the "x25_asy_data_indication" function are to correctly implement this procedure. Also, the "x25_asy_unesc" function ignores any frame that is shorter than 3 bytes. However the shortest frames are 2-byte long. So we need to change it to allow 2-byte frames to pass. Cc: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Reviewed-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:04 +09:00
Wei Yongjun	15b4a5f272	ip6_gre: fix null-ptr-deref in ip6gre_init_net() [ Upstream commit `46ef5b89ec` ] KASAN report null-ptr-deref error when register_netdev() failed: KASAN: null-ptr-deref in range [0x00000000000003c0-0x00000000000003c7] CPU: 2 PID: 422 Comm: ip Not tainted 5.8.0-rc4+ #12 Call Trace: ip6gre_init_net+0x4ab/0x580 ? ip6gre_tunnel_uninit+0x3f0/0x3f0 ops_init+0xa8/0x3c0 setup_net+0x2de/0x7e0 ? rcu_read_lock_bh_held+0xb0/0xb0 ? ops_init+0x3c0/0x3c0 ? kasan_unpoison_shadow+0x33/0x40 ? __kasan_kmalloc.constprop.0+0xc2/0xd0 copy_net_ns+0x27d/0x530 create_new_namespaces+0x382/0xa30 unshare_nsproxy_namespaces+0xa1/0x1d0 ksys_unshare+0x39c/0x780 ? walk_process_tree+0x2a0/0x2a0 ? trace_hardirqs_on+0x4a/0x1b0 ? _raw_spin_unlock_irq+0x1f/0x30 ? syscall_trace_enter+0x1a7/0x330 ? do_syscall_64+0x1c/0xa0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x56/0xa0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ip6gre_tunnel_uninit() has set 'ign->fb_tunnel_dev' to NULL, later access to ign->fb_tunnel_dev cause null-ptr-deref. Fix it by saving 'ign->fb_tunnel_dev' to local variable ndev. Fixes: `dafabb6590` ("ip6_gre: fix use-after-free in ip6gre_tunnel_lookup()") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:03 +09:00
Yuchung Cheng	5271b31147	tcp: allow at most one TLP probe per flight [ Upstream commit `76be93fc07` ] Previously TLP may send multiple probes of new data in one flight. This happens when the sender is cwnd limited. After the initial TLP containing new data is sent, the sender receives another ACK that acks partial inflight. It may re-arm another TLP timer to send more, if no further ACK returns before the next TLP timeout (PTO) expires. The sender may send in theory a large amount of TLP until send queue is depleted. This only happens if the sender sees such irregular uncommon ACK pattern. But it is generally undesirable behavior during congestion especially. The original TLP design restrict only one TLP probe per inflight as published in "Reducing Web Latency: the Virtue of Gentle Aggression", SIGCOMM 2013. This patch changes TLP to send at most one probe per inflight. Note that if the sender is app-limited, TLP retransmits old data and did not have this issue. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:02 +09:00
Dan Carpenter	0232de1d3d	AX.25: Prevent integer overflows in connect and sendmsg [ Upstream commit `17ad73e941` ] We recently added some bounds checking in ax25_connect() and ax25_sendmsg() and we so we removed the AX25_MAX_DIGIS checks because they were no longer required. Unfortunately, I believe they are required to prevent integer overflows so I have added them back. Fixes: `8885bb0621` ("AX.25: Prevent out-of-bounds read in ax25_sendmsg()") Fixes: `2f2a7ffad5` ("AX.25: Fix out-of-bounds read in ax25_connect()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:39:01 +09:00
David Howells	4112f8b125	rxrpc: Fix sendmsg() returning EPIPE due to recvmsg() returning ENODATA [ Upstream commit `639f181f0e` ] rxrpc_sendmsg() returns EPIPE if there's an outstanding error, such as if rxrpc_recvmsg() indicating ENODATA if there's nothing for it to read. Change rxrpc_recvmsg() to return EAGAIN instead if there's nothing to read as this particular error doesn't get stored in ->sk_err by the networking core. Also change rxrpc_sendmsg() so that it doesn't fail with delayed receive errors (there's no way for it to report which call, if any, the error was caused by). Fixes: `17926a7932` ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:56 +09:00
Miaohe Lin	db33b969d5	net: udp: Fix wrong clean up for IS_UDPLITE macro [ Upstream commit `b0a422772f` ] We can't use IS_UDPLITE to replace udp_sk->pcflag when UDPLITE_RECV_CC is checked. Fixes: `b2bf1e2659` ("[UDP]: Clean up for IS_UDPLITE macro") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:55 +09:00
Xiongfeng Wang	3b4e16cf3a	net-sysfs: add a newline when printing 'tx_timeout' by sysfs [ Upstream commit `9bb5fbea59` ] When I cat 'tx_timeout' by sysfs, it displays as follows. It's better to add a newline for easy reading. root@syzkaller:~# cat /sys/devices/virtual/net/lo/queues/tx-0/tx_timeout 0root@syzkaller:~# Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:54 +09:00
Subash Abhinov Kasiviswanathan	3b641c71a0	dev: Defer free of skbs in flush_backlog [ Upstream commit `7df5cb75cf` ] IRQs are disabled when freeing skbs in input queue. Use the IRQ safe variant to free skbs here. Fixes: `145dd5f9c8` ("net: flush the softnet backlog in process context") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:53 +09:00
Peilin Ye	07512f34c1	AX.25: Prevent out-of-bounds read in ax25_sendmsg() [ Upstream commit `8885bb0621` ] Checks on `addr_len` and `usax->sax25_ndigis` are insufficient. ax25_sendmsg() can go out of bounds when `usax->sax25_ndigis` equals to 7 or 8. Fix it. It is safe to remove `usax->sax25_ndigis > AX25_MAX_DIGIS`, since `addr_len` is guaranteed to be less than or equal to `sizeof(struct full_sockaddr_ax25)` Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:52 +09:00
Peilin Ye	807138b31c	AX.25: Fix out-of-bounds read in ax25_connect() [ Upstream commit `2f2a7ffad5` ] Checks on `addr_len` and `fsa->fsa_ax25.sax25_ndigis` are insufficient. ax25_connect() can go out of bounds when `fsa->fsa_ax25.sax25_ndigis` equals to 7 or 8. Fix it. This issue has been reported as a KMSAN uninit-value bug, because in such a case, ax25_connect() reaches into the uninitialized portion of the `struct sockaddr_storage` statically allocated in __sys_connect(). It is safe to remove `fsa->fsa_ax25.sax25_ndigis > AX25_MAX_DIGIS` because `addr_len` is guaranteed to be less than or equal to `sizeof(struct full_sockaddr_ax25)`. Reported-by: syzbot+c82752228ed975b0a623@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=55ef9d629f3b3d7d70b69558015b63b48d01af66 Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:51 +09:00
Mark O'Donovan	9f6a01f9d0	ath9k: Fix regression with Atheros 9271 commit `92f53e2fda` upstream. This fix allows ath9k_htc modules to connect to WLAN once again. Fixes: `2bbcaaee1f` ("ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb") Link: https://bugzilla.kernel.org/show_bug.cgi?id=208251 Signed-off-by: Mark O'Donovan <shiftee@posteo.net> Reported-by: Roman Mamedov <rm@romanrm.net> Tested-by: Viktor Jägersküpper <viktor_jaegerskuepper@freenet.de> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200711043324.8079-1-shiftee@posteo.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:50 +09:00
Qiujun Huang	ede8746fcd	ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb commit `2bbcaaee1f` upstream. In ath9k_hif_usb_rx_cb interface number is assumed to be 0. usb_ifnum_to_if(urb->dev, 0) But it isn't always true. The case reported by syzbot: https://lore.kernel.org/linux-usb/000000000000666c9c05a1c05d12@google.com usb 2-1: new high-speed USB device number 2 using dummy_hcd usb 2-1: config 1 has an invalid interface number: 2 but max is 0 usb 2-1: config 1 has no interface number 0 usb 2-1: New USB device found, idVendor=0cf3, idProduct=9271, bcdDevice= 1.08 usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3 general protection fault, probably for non-canonical address 0xdffffc0000000015: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc5-syzkaller #0 Call Trace __usb_hcd_giveback_urb+0x29a/0x550 drivers/usb/core/hcd.c:1650 usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1716 dummy_timer+0x1258/0x32ae drivers/usb/gadget/udc/dummy_hcd.c:1966 call_timer_fn+0x195/0x6f0 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x5f9/0x1500 kernel/time/timer.c:1786 __do_softirq+0x21e/0x950 kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x178/0x1a0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:546 [inline] smp_apic_timer_interrupt+0x141/0x540 arch/x86/kernel/apic/apic.c:1146 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 Reported-and-tested-by: syzbot+40d5d2e8a4680952f042@syzkaller.appspotmail.com Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200404041838.10426-6-hqjagain@gmail.com Cc: Viktor Jägersküpper <viktor_jaegerskuepper@freenet.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:49 +09:00
John David Anglin	beb59f93b1	parisc: Add atomic64_set_release() define to avoid CPU soft lockups commit `be6577af0c` upstream. Stalls are quite frequent with recent kernels. I enabled CONFIG_SOFTLOCKUP_DETECTOR and I caught the following stall: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [cc1:22803] CPU: 0 PID: 22803 Comm: cc1 Not tainted 5.6.17+ #3 Hardware name: 9000/800/rp3440 IAOQ[0]: d_alloc_parallel+0x384/0x688 IAOQ[1]: d_alloc_parallel+0x388/0x688 RP(r2): d_alloc_parallel+0x134/0x688 Backtrace: [<000000004036974c>] __lookup_slow+0xa4/0x200 [<0000000040369fc8>] walk_component+0x288/0x458 [<000000004036a9a0>] path_lookupat+0x88/0x198 [<000000004036e748>] filename_lookup+0xa0/0x168 [<000000004036e95c>] user_path_at_empty+0x64/0x80 [<000000004035d93c>] vfs_statx+0x104/0x158 [<000000004035dfcc>] __do_sys_lstat64+0x44/0x80 [<000000004035e5a0>] sys_lstat64+0x20/0x38 [<0000000040180054>] syscall_exit+0x0/0x14 The code was stuck in this loop in d_alloc_parallel: 4037d414: 0e 00 10 dc ldd 0(r16),ret0 4037d418: c7 fc 5f ed bb,< ret0,1f,4037d414 <d_alloc_parallel+0x384> 4037d41c: 08 00 02 40 nop This is the inner loop of bit_spin_lock which is called by hlist_bl_unlock in d_alloc_parallel: static inline void bit_spin_lock(int bitnum, unsigned long addr) { / * Assuming the lock is uncontended, this never enters * the body of the outer loop. If it is contended, then * within the inner loop a non-atomic test is used to * busywait with less bus contention for a good time to * attempt to acquire the lock bit. */ preempt_disable(); #if defined(CONFIG_SMP) \|\| defined(CONFIG_DEBUG_SPINLOCK) while (unlikely(test_and_set_bit_lock(bitnum, addr))) { preempt_enable(); do { cpu_relax(); } while (test_bit(bitnum, addr)); preempt_disable(); } #endif __acquire(bitlock); } After consideration, I realized that we must be losing bit unlocks. Then, I noticed that we missed defining atomic64_set_release(). Adding this define fixes the stalls in bit operations. Signed-off-by: Dave Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:48 +09:00
Michael J. Ruhl	d4ca9b4931	io-mapping: indicate mapping failure commit `e0b3e0b1a0` upstream. The !ATOMIC_IOMAP version of io_maping_init_wc will always return success, even when the ioremap fails. Since the ATOMIC_IOMAP version returns NULL when the init fails, and callers check for a NULL return on error this is unexpected. During a device probe, where the ioremap failed, a crash can look like this: BUG: unable to handle page fault for address: 0000000000210000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Oops: 0002 [#1] PREEMPT SMP CPU: 0 PID: 177 Comm: RIP: 0010:fill_page_dma [i915] gen8_ppgtt_create [i915] i915_ppgtt_create [i915] intel_gt_init [i915] i915_gem_init [i915] i915_driver_probe [i915] pci_device_probe really_probe driver_probe_device The remap failure occurred much earlier in the probe. If it had been propagated, the driver would have exited with an error. Return NULL on ioremap failure. [akpm@linux-foundation.org: detect ioremap_wc() errors earlier] Fixes: `cafaf14a5d` ("io-mapping: Always create a struct to hold metadata about the io-mapping") Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 08:38:47 +09:00

1 2 3 4 5 ...

658201 Commits