linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 03:40:35 +09:00

Author	SHA1	Message	Date
Johan Hovold	7f93c8c895	USB: iowarrior: fix use-after-free on release commit `80cd5479b5` upstream. The driver was accessing its struct usb_interface from its release() callback without holding a reference. This would lead to a use-after-free whenever debugging was enabled and the device was disconnected while its character device was open. Fixes: `549e83500b` ("USB: iowarrior: Convert local dbg macro to dev_dbg") Cc: stable <stable@vger.kernel.org> # 3.16 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191009104846.5925-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:56 -07:00
Johan Hovold	2fdcf7e19b	USB: iowarrior: fix use-after-free on disconnect commit `edc4746f25` upstream. A recent fix addressing a deadlock on disconnect introduced a new bug by moving the present flag out of the critical section protected by the driver-data mutex. This could lead to a racing release() freeing the driver data before disconnect() is done with it. Due to insufficient locking a related use-after-free could be triggered also before the above mentioned commit. Specifically, the driver needs to hold the driver-data mutex also while checking the opened flag at disconnect(). Fixes: `c468a8aa79` ("usb: iowarrior: fix deadlock on disconnect") Fixes: `946b960d13` ("USB: add driver for iowarrior devices.") Cc: stable <stable@vger.kernel.org> # 2.6.21 Reported-by: syzbot+0761012cebf7bdb38137@syzkaller.appspotmail.com Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191009104846.5925-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:56 -07:00
Johan Hovold	ab162d331c	USB: adutux: fix use-after-free on release commit `123a0f125f` upstream. The driver was accessing its struct usb_device in its release() callback without holding a reference. This would lead to a use-after-free whenever the device was disconnected while the character device was still open. Fixes: `66d4bc30d1` ("USB: adutux: remove custom debug macro") Cc: stable <stable@vger.kernel.org> # 3.12 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191009153848.8664-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:55 -07:00
Johan Hovold	ca9c18c00a	USB: adutux: fix NULL-derefs on disconnect commit `b2fa7baee7` upstream. The driver was using its struct usb_device pointer as an inverted disconnected flag, but was setting it to NULL before making sure all completion handlers had run. This could lead to a NULL-pointer dereference in a number of dev_dbg statements in the completion handlers which relies on said pointer. The pointer was also dereferenced unconditionally in a dev_dbg statement release() something which would lead to a NULL-deref whenever a device was disconnected before the final character-device close if debugging was enabled. Fix this by unconditionally stopping all I/O and preventing resubmissions by poisoning the interrupt URBs at disconnect and using a dedicated disconnected flag. This also makes sure that all I/O has completed by the time the disconnect callback returns. Fixes: `1ef37c6047` ("USB: adutux: remove custom debug macro and module parameter") Fixes: `66d4bc30d1` ("USB: adutux: remove custom debug macro") Cc: stable <stable@vger.kernel.org> # 3.12 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20190925092913.8608-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:54 -07:00
Johan Hovold	316f51d775	USB: adutux: fix use-after-free on disconnect commit `44efc269db` upstream. The driver was clearing its struct usb_device pointer, which it used as an inverted disconnected flag, before deregistering the character device and without serialising against racing release(). This could lead to a use-after-free if a racing release() callback observes the cleared pointer and frees the driver data before disconnect() is finished with it. This could also lead to NULL-pointer dereferences in a racing open(). Fixes: `f08812d5eb` ("USB: FIx locks and urb->status in adutux (updated)") Cc: stable <stable@vger.kernel.org> # 2.6.24 Reported-by: syzbot+0243cb250a51eeefb8cc@syzkaller.appspotmail.com Tested-by: syzbot+0243cb250a51eeefb8cc@syzkaller.appspotmail.com Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20190925092913.8608-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:54 -07:00
Kai-Heng Feng	ea72556633	xhci: Increase STS_SAVE timeout in xhci_suspend() commit `ac34336684` upstream. After commit `f7fac17ca9` ("xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()"), ASMedia xHCI may fail to suspend. Although the algorithms are essentially the same, the old max timeout is (usec + usec * time of doing readl()), and the new max timeout is just usec, which is much less than the old one. Increase the timeout to make ASMedia xHCI able to suspend again. BugLink: https://bugs.launchpad.net/bugs/1844021 Fixes: `f7fac17ca9` ("xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1570190373-30684-8-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:53 -07:00
Bill Kuzeja	cbc5abaa6f	xhci: Prevent deadlock when xhci adapter breaks during init commit `8de66b0e6a` upstream. The system can hit a deadlock if an xhci adapter breaks while initializing. The deadlock is between two threads: thread 1 is tearing down the adapter and is stuck in usb_unlocked_disable_lpm waiting to lock the hcd->handwidth_mutex. Thread 2 is holding this mutex (while still trying to add a usb device), but is stuck in xhci_endpoint_reset waiting for a stop or config command to complete. A reboot is required to resolve. It turns out when calling xhci_queue_stop_endpoint and xhci_queue_configure_endpoint in xhci_endpoint_reset, the return code is not checked for errors. If the timing is right and the adapter dies just before either of these commands get issued, we hang indefinitely waiting for a completion on a command that didn't get issued. This wasn't a problem before the following fix because we didn't send commands in xhci_endpoint_reset: commit `f5249461b5` ("xhci: Clear the host side toggle manually when endpoint is soft reset") With the patch I am submitting, a duration test which breaks adapters during initialization (and which deadlocks with the standard kernel) runs without issue. Fixes: `f5249461b5` ("xhci: Clear the host side toggle manually when endpoint is soft reset") Cc: <stable@vger.kernel.org> # v4.17+ Cc: Torez Smith <torez@redhat.com> Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Signed-off-by: Torez Smith <torez@redhat.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1570190373-30684-7-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:53 -07:00
Rick Tseng	fde058a17c	usb: xhci: wait for CNR controller not ready bit in xhci resume commit `a70bcbc322` upstream. NVIDIA 3.1 xHCI card would lose power when moving power state into D3Cold. Thus we need to wait for CNR bit to clear in xhci resume, just as in xhci init. [Minor changes to comment and commit message -Mathias] Cc: <stable@vger.kernel.org> Signed-off-by: Rick Tseng <rtseng@nvidia.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1570190373-30684-6-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:53 -07:00
Mathias Nyman	13e793da4f	xhci: Fix USB 3.1 capability detection on early xHCI 1.1 spec based hosts commit `47f50d6107` upstream. Early xHCI 1.1 spec did not mention USB 3.1 capable hosts should set sbrn to 0x31, or that the minor revision is a two digit BCD containing minor and sub-minor numbers. This was later clarified in xHCI 1.2. Some USB 3.1 capable hosts therefore have sbrn set to 0x30, or minor revision set to 0x1 instead of 0x10. Detect the USB 3.1 capability correctly for these hosts as well Fixes: `ddd57980a0` ("xhci: detect USB 3.2 capable host controllers correctly") Cc: <stable@vger.kernel.org> # v4.18+ Cc: Loïc Yhuel <loic.yhuel@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1570190373-30684-5-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:52 -07:00
Jan Schmidt	d6bdd4686f	xhci: Check all endpoints for LPM timeout commit `d500c63f80` upstream. If an endpoint is encountered that returns USB3_LPM_DEVICE_INITIATED, keep checking further endpoints, as there might be periodic endpoints later that return USB3_LPM_DISABLED due to shorter service intervals. Without this, the code can set too high a maximum-exit-latency and prevent the use of multiple USB3 cameras that should be able to work. Cc: <stable@vger.kernel.org> Signed-off-by: Jan Schmidt <jan@centricular.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1570190373-30684-4-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:52 -07:00
Mathias Nyman	faa0502a5b	xhci: Prevent device initiated U1/U2 link pm if exit latency is too long commit `cd9d9491e8` upstream. If host/hub initiated link pm is prevented by a driver flag we still must ensure that periodic endpoints have longer service intervals than link pm exit latency before allowing device initiated link pm. Fix this by continue walking and checking endpoint service interval if xhci_get_timeout_no_hub_lpm() returns anything else than USB3_LPM_DISABLED While at it fix the split line error message Tested-by: Jan Schmidt <jan@centricular.com> Cc: <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1570190373-30684-3-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:51 -07:00
Mathias Nyman	077855ba2d	xhci: Fix false warning message about wrong bounce buffer write length commit `c03101ff4f` upstream. The check printing out the "WARN Wrong bounce buffer write length:" uses incorrect values when comparing bytes written from scatterlist to bounce buffer. Actual copied lengths are fine. The used seg->bounce_len will be set to equal new_buf_len a few lines later in the code, but is incorrect when doing the comparison. The patch which added this false warning was backported to 4.8+ kernels so this should be backported as far as well. Cc: <stable@vger.kernel.org> # v4.8+ Fixes: `597c56e372` ("xhci: update bounce buffer with correct sg num") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1570190373-30684-2-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:51 -07:00
Johan Hovold	31604075ce	USB: usb-skeleton: fix NULL-deref on disconnect commit `bed5ef2309` upstream. The driver was using its struct usb_interface pointer as an inverted disconnected flag and was setting it to NULL before making sure all completion handlers had run. This could lead to NULL-pointer dereferences in the dev_err() statements in the completion handlers which relies on said pointer. Fix this by using a dedicated disconnected flag. Note that this is also addresses a NULL-pointer dereference at release() and a struct usb_interface reference leak introduced by a recent runtime PM fix, which depends on and should have been submitted together with this patch. Fixes: `4212cd74ca` ("USB: usb-skeleton.c: remove err() usage") Fixes: `5c290a5e42` ("USB: usb-skeleton: fix runtime PM after driver unbind") Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191009170944.30057-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:50 -07:00
Johan Hovold	dcabc48fe0	USB: usb-skeleton: fix runtime PM after driver unbind commit `5c290a5e42` upstream. Since commit `c2b71462d2` ("USB: core: Fix bug caused by duplicate interface PM usage counter") USB drivers must always balance their runtime PM gets and puts, including when the driver has already been unbound from the interface. Leaving the interface with a positive PM usage counter would prevent a later bound driver from suspending the device. Fixes: `c2b71462d2` ("USB: core: Fix bug caused by duplicate interface PM usage counter") Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191001084908.2003-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:49 -07:00
Johan Hovold	571a140636	USB: yurex: fix NULL-derefs on disconnect commit `aafb00a977` upstream. The driver was using its struct usb_interface pointer as an inverted disconnected flag, but was setting it to NULL without making sure all code paths that used it were done with it. Before commit `ef61eb43ad` ("USB: yurex: Fix protection fault after device removal") this included the interrupt-in completion handler, but there are further accesses in dev_err and dev_dbg statements in yurex_write() and the driver-data destructor (sic!). Fix this by unconditionally stopping also the control URB at disconnect and by using a dedicated disconnected flag. Note that we need to take a reference to the struct usb_interface to avoid a use-after-free in the destructor whenever the device was disconnected while the character device was still open. Fixes: `aadd6472d9` ("USB: yurex.c: remove dbg() usage") Fixes: `45714104b9` ("USB: yurex.c: remove err() usage") Cc: stable <stable@vger.kernel.org> # 3.5: `ef61eb43ad` Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191009153848.8664-6-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:49 -07:00
Alan Stern	a8fe336f20	USB: yurex: Don't retry on unexpected errors commit `32a0721c66` upstream. According to Greg KH, it has been generally agreed that when a USB driver encounters an unknown error (or one it can't handle directly), it should just give up instead of going into a potentially infinite retry loop. The three codes -EPROTO, -EILSEQ, and -ETIME fall into this category. They can be caused by bus errors such as packet loss or corruption, attempting to communicate with a disconnected device, or by malicious firmware. Nowadays the extent of packet loss or corruption is negligible, so it should be safe for a driver to give up whenever one of these errors occurs. Although the yurex driver handles -EILSEQ errors in this way, it doesn't do the same for -EPROTO (as discovered by the syzbot fuzzer) or other unrecognized errors. This patch adjusts the driver so that it doesn't log an error message for -EPROTO or -ETIME, and it doesn't retry after any errors. Reported-and-tested-by: syzbot+b24d736f18a1541ad550@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Tomoki Sekiyama <tomoki.sekiyama@gmail.com> CC: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1909171245410.1590-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:48 -07:00
Bastien Nocera	86575b7f63	USB: rio500: Remove Rio 500 kernel driver commit `015664d152` upstream. The Rio500 kernel driver has not been used by Rio500 owners since 2001 not long after the rio500 project added support for a user-space USB stack through the very first versions of usbdevfs and then libusb. Support for the kernel driver was removed from the upstream utilities in 2008: `943f624ab7` Cc: Cesar Miquel <miquel@df.uba.ar> Signed-off-by: Bastien Nocera <hadess@hadess.net> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/6251c17584d220472ce882a3d9c199c401a51a71.camel@hadess.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:47 -07:00
Icenowy Zheng	95bcc0d980	f2fs: use EINVAL for superblock with invalid magic [ Upstream commit `38fb6d0ea3` ] The kernel mount_block_root() function expects -EACESS or -EINVAL for a unmountable filesystem when trying to mount the root with different filesystem types. However, in 5.3-rc1 the behavior when F2FS code cannot find valid block changed to return -EFSCORRUPTED(-EUCLEAN), and this error code makes mount_block_root() fail when trying to probe F2FS. When the magic number of the superblock mismatches, it has a high probability that it's just not a F2FS. In this case return -EINVAL seems to be a better result, and this return value can make mount_block_root() probing work again. Return -EINVAL when the superblock has magic mismatch, -EFSCORRUPTED in other cases (the magic matches but the superblock cannot be recognized). Fixes: `10f966bbf5` ("f2fs: use generic EFSBADCRC/EFSCORRUPTED") Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-17 13:44:47 -07:00
Will Deacon	7d1688c673	panic: ensure preemption is disabled during panic() commit `20bb759a66` upstream. Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the calling CPU in an infinite loop, but with interrupts and preemption enabled. From this state, userspace can continue to be scheduled, despite the system being "dead" as far as the kernel is concerned. This is easily reproducible on arm64 when booting with "nosmp" on the command line; a couple of shell scripts print out a periodic "Ping" message whilst another triggers a crash by writing to /proc/sysrq-trigger: \| sysrq: Trigger a crash \| Kernel panic - not syncing: sysrq triggered crash \| CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 \| Hardware name: linux,dummy-virt (DT) \| Call trace: \| dump_backtrace+0x0/0x148 \| show_stack+0x14/0x20 \| dump_stack+0xa0/0xc4 \| panic+0x140/0x32c \| sysrq_handle_reboot+0x0/0x20 \| __handle_sysrq+0x124/0x190 \| write_sysrq_trigger+0x64/0x88 \| proc_reg_write+0x60/0xa8 \| __vfs_write+0x18/0x40 \| vfs_write+0xa4/0x1b8 \| ksys_write+0x64/0xf0 \| __arm64_sys_write+0x14/0x20 \| el0_svc_common.constprop.0+0xb0/0x168 \| el0_svc_handler+0x28/0x78 \| el0_svc+0x8/0xc \| Kernel Offset: disabled \| CPU features: 0x0002,24002004 \| Memory Limit: none \| ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- \| Ping 2! \| Ping 1! \| Ping 1! \| Ping 2! The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise local interrupts are disabled in 'smp_send_stop()'. Disable preemption in 'panic()' before re-enabling interrupts. Link: http://lkml.kernel.org/r/20191002123538.22609-1-will@kernel.org Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone Signed-off-by: Will Deacon <will@kernel.org> Reported-by: Xogium <contact@xogium.me> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Feng Tang <feng.tang@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-17 13:44:46 -07:00
Greg Kroah-Hartman	dafd634415	Linux 4.19.79	2019-10-11 18:21:44 +02:00
Johannes Berg	1bd17a737c	nl80211: validate beacon head commit `f88eb7c0d0` upstream. We currently don't validate the beacon head, i.e. the header, fixed part and elements that are to go in front of the TIM element. This means that the variable elements there can be malformed, e.g. have a length exceeding the buffer size, but most downstream code from this assumes that this has already been checked. Add the necessary checks to the netlink policy. Cc: stable@vger.kernel.org Fixes: `ed1b6cc7f8` ("cfg80211/nl80211: add beacon settings") Link: https://lore.kernel.org/r/1569009255-I7ac7fbe9436e9d8733439eab8acbbd35e55c74ef@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:44 +02:00
Jouni Malinen	527ba5d763	cfg80211: Use const more consistently in for_each_element macros commit `7388afe091` upstream. Enforce the first argument to be a correct type of a pointer to struct element and avoid unnecessary typecasts from const to non-const pointers (the change in validate_ie_attr() is needed to make this part work). In addition, avoid signed/unsigned comparison within for_each_element() and mark struct element packed just in case. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:43 +02:00
Johannes Berg	ad180cace8	cfg80211: add and use strongly typed element iteration macros commit `0f3b07f027` upstream. Rather than always iterating elements from frames with pure u8 pointers, add a type "struct element" that encapsulates the id/datalen/data format of them. Then, add the element iteration macros * for_each_element * for_each_element_id * for_each_element_extid which take, as their first 'argument', such a structure and iterate through a given u8 array interpreting it as elements. While at it and since we'll need it, also add * for_each_subelement * for_each_subelement_id * for_each_subelement_extid which instead of taking data/length just take an outer element and use its data/datalen. Also add for_each_element_completed() to determine if any of the loops above completed, i.e. it was able to parse all of the elements successfully and no data remained. Use for_each_element_id() in cfg80211_find_ie_match() as the first user of this. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:42 +02:00
Gao Xiang	3dab5ba6d7	staging: erofs: detect potential multiref due to corrupted images commit `e12a0ce2fa` upstream. As reported by erofs-utils fuzzer, currently, multiref (ondisk deduplication) hasn't been supported for now, we should forbid it properly. Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20190821140152.229648-1-gaoxiang25@huawei.com [ Gao Xiang: Since earlier kernels don't define EFSCORRUPTED, let's use EIO instead. ] Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:41 +02:00
Gao Xiang	8b4341f9b8	staging: erofs: add two missing erofs_workgroup_put for corrupted images commit `138e1a0990` upstream. As reported by erofs-utils fuzzer, these error handling path will be entered to handle corrupted images. Lack of erofs_workgroup_puts will cause unmounting unsuccessfully. Fix these return values to EFSCORRUPTED as well. Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20190819103426.87579-4-gaoxiang25@huawei.com [ Gao Xiang: Older kernel versions don't have length validity check and EFSCORRUPTED, thus backport pageofs check for now. ] Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:41 +02:00
Gao Xiang	596bbc4e0e	staging: erofs: some compressed cluster should be submitted for corrupted images commit `ee45197c80` upstream. As reported by erofs_utils fuzzer, a logical page can belong to at most 2 compressed clusters, if one compressed cluster is corrupted, but the other has been ready in submitting chain. The chain needs to submit anyway in order to keep the page working properly (page unlocked with PG_error set, PG_uptodate not set). Let's fix it now. Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20190819103426.87579-2-gaoxiang25@huawei.com [ Gao Xiang: Manually backport to v4.19.y stable. ] Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:40 +02:00
Gao Xiang	e7c4441038	staging: erofs: fix an error handling in erofs_readdir() commit `acb383f1dc` upstream. Richard observed a forever loop of erofs_read_raw_page() [1] which can be generated by forcely setting ->u.i_blkaddr to 0xdeadbeef (as my understanding block layer can handle access beyond end of device correctly). After digging into that, it seems the problem is highly related with directories and then I found the root cause is an improper error handling in erofs_readdir(). Let's fix it now. [1] https://lore.kernel.org/r/1163995781.68824.1566084358245.JavaMail.zimbra@nod.at/ Reported-by: Richard Weinberger <richard@nod.at> Fixes: `3aa8ec716e` ("staging: erofs: add directory operations") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190818125457.25906-1-hsiangkao@aol.com [ Gao Xiang: Since earlier kernels don't define EFSCORRUPTED, let's use original error code instead. ] Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:40 +02:00
Andrew Murray	1b94c1e80c	coresight: etm4x: Use explicit barriers on enable/disable commit `1004ce4c25` upstream. Synchronization is recommended before disabling the trace registers to prevent any start or stop points being speculative at the point of disabling the unit (section 7.3.77 of ARM IHI 0064D). Synchronization is also recommended after programming the trace registers to ensure all updates are committed prior to normal code resuming (section 4.3.7 of ARM IHI 0064D). Let's ensure these syncronization points are present in the code and clearly commented. Note that we could rely on the barriers in CS_LOCK and coresight_disclaim_device_unlocked or the context switch to user space - however coresight may be of use in the kernel. On armv8 the mb macro is defined as dsb(sy) - Given that the etm4x is only used on armv8 let's directly use dsb(sy) instead of mb(). This removes some ambiguity and makes it easier to correlate the code with the TRM. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> [Fixed capital letter for "use" in title] Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20190829202842.580-11-mathieu.poirier@linaro.org Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:39 +02:00
Eric Sandeen	effad578c2	vfs: Fix EOVERFLOW testing in put_compat_statfs64 commit `cc3a7bfe62` upstream. Today, put_compat_statfs64() disallows nearly any field value over 2^32 if f_bsize is only 32 bits, but that makes no sense. compat_statfs64 is there for the explicit purpose of providing 64-bit fields for f_files, f_ffree, etc. And f_bsize is always only 32 bits. As a result, 32-bit userspace gets -EOVERFLOW for i.e. large file counts even with -D_FILE_OFFSET_BITS=64 set. In reality, only f_bsize and f_frsize can legitimately overflow (fields like f_type and f_namelen should never be large), so test only those fields. This bug was discussed at length some time ago, and this is the proposal Al suggested at https://lkml.org/lkml/2018/8/6/640. It seemed to get dropped amid the discussion of other related changes, but this part seems obviously correct on its own, so I've picked it up and sent it, for expediency. Fixes: `64d2ab32ef` ("vfs: fix put_compat_statfs64() does not handle errors") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:39 +02:00
Josh Poimboeuf	d976344d27	arm64/speculation: Support 'mitigations=' cmdline option commit `a111b7c0f2` upstream. Configure arm64 runtime CPU speculation bug mitigations in accordance with the 'mitigations=' cmdline option. This affects Meltdown, Spectre v2, and Speculative Store Bypass. The default behavior is unchanged. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> [will: reorder checks so KASLR implies KPTI and SSBS is affected by cmdline] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:38 +02:00
Marc Zyngier	af33d74628	arm64: Use firmware to detect CPUs that are not affected by Spectre-v2 commit `517953c2c4` upstream. The SMCCC ARCH_WORKAROUND_1 service can indicate that although the firmware knows about the Spectre-v2 mitigation, this particular CPU is not vulnerable, and it is thus not necessary to call the firmware on this CPU. Let's use this information to our benefit. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:38 +02:00
Marc Zyngier	17d1acc4c6	arm64: Force SSBS on context switch [ Upstream commit `cbdf8a189a` ] On a CPU that doesn't support SSBS, PSTATE[12] is RES0. In a system where only some of the CPUs implement SSBS, we end-up losing track of the SSBS bit across task migration. To address this issue, let's force the SSBS bit on context switch. Fixes: `8f04e8e6e2` ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> [will: inverted logic and added comments] Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:37 +02:00
Will Deacon	fe22ea561c	arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB [ Upstream commit `eb337cdfcd` ] SSBS provides a relatively cheap mitigation for SSB, but it is still a mitigation and its presence does not indicate that the CPU is unaffected by the vulnerability. Tweak the mitigation logic so that we report the correct string in sysfs. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:37 +02:00
Jeremy Linton	dada3a4abb	arm64: add sysfs vulnerability show for speculative store bypass [ Upstream commit `526e065dbc` ] Return status based on ssbd_state and __ssb_safe. If the mitigation is disabled, or the firmware isn't responding then return the expected machine state based on a whitelist of known good cores. Given a heterogeneous machine, the overall machine vulnerability defaults to safe but is reset to unsafe when we miss the whitelist and the firmware doesn't explicitly tell us the core is safe. In order to make that work we delay transitioning to vulnerable until we know the firmware isn't responding to avoid a case where we miss the whitelist, but the firmware goes ahead and reports the core is not vulnerable. If all the cores in the machine have SSBS, then __ssb_safe will remain true. Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:36 +02:00
Jeremy Linton	f41df38898	arm64: add sysfs vulnerability show for spectre-v2 [ Upstream commit `d2532e27b5` ] Track whether all the cores in the machine are vulnerable to Spectre-v2, and whether all the vulnerable cores have been mitigated. We then expose this information to userspace via sysfs. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:35 +02:00
Jeremy Linton	9d1bb39cdd	arm64: Always enable spectre-v2 vulnerability detection [ Upstream commit `8c1e3d2bb4` ] Ensure we are always able to detect whether or not the CPU is affected by Spectre-v2, so that we can later advertise this to userspace. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:35 +02:00
Marc Zyngier	b1a33cfd80	arm64: Advertise mitigation of Spectre-v2, or lack thereof [ Upstream commit `73f3816609` ] We currently have a list of CPUs affected by Spectre-v2, for which we check that the firmware implements ARCH_WORKAROUND_1. It turns out that not all firmwares do implement the required mitigation, and that we fail to let the user know about it. Instead, let's slightly revamp our checks, and rely on a whitelist of cores that are known to be non-vulnerable, and let the user know the status of the mitigation in the kernel log. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:34 +02:00
Jeremy Linton	59a6dc262c	arm64: Provide a command line to disable spectre_v2 mitigation [ Upstream commit `e5ce5e7267` ] There are various reasons, such as benchmarking, to disable spectrev2 mitigation on a machine. Provide a command-line option to do so. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:34 +02:00
Jeremy Linton	c131623b1e	arm64: Always enable ssb vulnerability detection [ Upstream commit `d42281b6e4` ] Ensure we are always able to detect whether or not the CPU is affected by SSB, so that we can later advertise this to userspace. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> [will: Use IS_ENABLED instead of #ifdef] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:33 +02:00
Mian Yousaf Kaukab	47a11f2eaf	arm64: enable generic CPU vulnerabilites support [ Upstream commit `61ae1321f0` ] Enable CPU vulnerabilty show functions for spectre_v1, spectre_v2, meltdown and store-bypass. Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:33 +02:00
Jeremy Linton	512158d0c6	arm64: add sysfs vulnerability show for meltdown [ Upstream commit `1b3ccf4be0` ] We implement page table isolation as a mitigation for meltdown. Report this to userspace via sysfs. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:32 +02:00
Mian Yousaf Kaukab	047aac35fd	arm64: Add sysfs vulnerability show for spectre-v1 [ Upstream commit `3891ebccac` ] spectre-v1 has been mitigated and the mitigation is always active. Report this to userspace via sysfs Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:32 +02:00
Mark Rutland	edfc026626	arm64: fix SSBS sanitization [ Upstream commit `f54dada827` ] In valid_user_regs() we treat SSBS as a RES0 bit, and consequently it is unexpectedly cleared when we restore a sigframe or fiddle with GPRs via ptrace. This patch fixes valid_user_regs() to account for this, updating the function to refer to the latest ARM ARM (ARM DDI 0487D.a). For AArch32 tasks, SSBS appears in bit 23 of SPSR_EL1, matching its position in the AArch32-native PSR format, and we don't need to translate it as we have to for DIT. There are no other bit assignments that we need to account for today. As the recent documentation describes the DIT bit, we can drop our comment regarding DIT. While removing SSBS from the RES0 masks, existing inconsistent whitespace is corrected. Fixes: `d71be2b6c0` ("arm64: cpufeature: Detect SSBS and advertise to userspace") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:31 +02:00
Will Deacon	09c22781dd	arm64: docs: Document SSBS HWCAP [ Upstream commit `ee91176120` ] We advertise the MRS/MSR instructions for toggling SSBS at EL0 using an HWCAP, so document it along with the others. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:31 +02:00
Will Deacon	a59d42ac50	KVM: arm64: Set SCTLR_EL2.DSSBS if SSBD is forcefully disabled and !vhe [ Upstream commit `7c36447ae5` ] When running without VHE, it is necessary to set SCTLR_EL2.DSSBS if SSBD has been forcefully disabled on the kernel command-line. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:30 +02:00
Will Deacon	1eaff33e24	arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3 [ Upstream commit `8f04e8e6e2` ] On CPUs with support for PSTATE.SSBS, the kernel can toggle the SSBD state without needing to call into firmware. This patch hooks into the existing SSBD infrastructure so that SSBS is used on CPUs that support it, but it's all made horribly complicated by the very real possibility of big/little systems that don't uniformly provide the new capability. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:30 +02:00
Vincent Chen	d286a37471	riscv: Avoid interrupts being erroneously enabled in handle_exception() [ Upstream commit `c82dd6d078` ] When the handle_exception function addresses an exception, the interrupts will be unconditionally enabled after finishing the context save. However, It may erroneously enable the interrupts if the interrupts are disabled before entering the handle_exception. For example, one of the WARN_ON() condition is satisfied in the scheduling where the interrupt is disabled and rq.lock is locked. The WARN_ON will trigger a break exception and the handle_exception function will enable the interrupts before entering do_trap_break function. During the procedure, if a timer interrupt is pending, it will be taken when interrupts are enabled. In this case, it may cause a deadlock problem if the rq.lock is locked again in the timer ISR. Hence, the handle_exception() can only enable interrupts when the state of sstatus.SPIE is 1. This patch is tested on HiFive Unleashed board. Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Palmer Dabbelt <palmer@sifive.com> [paul.walmsley@sifive.com: updated to apply] Fixes: `bcae803a21` ("RISC-V: Enable IRQ during exception handling") Cc: David Abdurachmanov <david.abdurachmanov@sifive.com> Cc: stable@vger.kernel.org Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:29 +02:00
Srikar Dronamraju	5b67a4721d	perf stat: Reset previous counts on repeat with interval [ Upstream commit `b63fd11cce` ] When using 'perf stat' with repeat and interval option, it shows wrong values for events. The wrong values will be shown for the first interval on the second and subsequent repetitions. Without the fix: # perf stat -r 3 -I 2000 -e faults -e sched:sched_switch -a sleep 5 2.000282489 53 faults 2.000282489 513 sched:sched_switch 4.005478208 3,721 faults 4.005478208 2,666 sched:sched_switch 5.025470933 395 faults 5.025470933 1,307 sched:sched_switch 2.009602825 1,84,46,74,40,73,70,95,47,520 faults <------ 2.009602825 1,84,46,74,40,73,70,95,49,568 sched:sched_switch <------ 4.019612206 4,730 faults 4.019612206 2,746 sched:sched_switch 5.039615484 3,953 faults 5.039615484 1,496 sched:sched_switch 2.000274620 1,84,46,74,40,73,70,95,47,520 faults <------ 2.000274620 1,84,46,74,40,73,70,95,47,520 sched:sched_switch <------ 4.000480342 4,282 faults 4.000480342 2,303 sched:sched_switch 5.000916811 1,322 faults 5.000916811 1,064 sched:sched_switch # prev_raw_counts is allocated when using intervals. This is used when calculating the difference in the counts of events when using interval. The current counts are stored in prev_raw_counts to calculate the differences in the next iteration. On the first interval of the second and subsequent repetitions, prev_raw_counts would be the values stored in the last interval of the previous repetitions, while the current counts will only be for the first interval of the current repetition. Hence there is a possibility of events showing up as big number. Fix this by resetting prev_raw_counts whenever perf stat repeats the command. With the fix: # perf stat -r 3 -I 2000 -e faults -e sched:sched_switch -a sleep 5 2.019349347 2,597 faults 2.019349347 2,753 sched:sched_switch 4.019577372 3,098 faults 4.019577372 2,532 sched:sched_switch 5.019415481 1,879 faults 5.019415481 1,356 sched:sched_switch 2.000178813 8,468 faults 2.000178813 2,254 sched:sched_switch 4.000404621 7,440 faults 4.000404621 1,266 sched:sched_switch 5.040196079 2,458 faults 5.040196079 556 sched:sched_switch 2.000191939 6,870 faults 2.000191939 1,170 sched:sched_switch 4.000414103 541 faults 4.000414103 902 sched:sched_switch 5.000809863 450 faults 5.000809863 364 sched:sched_switch # Committer notes: This was broken since the cset introducing the --interval feature, i.e. --repeat + --interval wasn't tested at that point, add the Fixes tag so that automatic scripts can pick this up. Fixes: `13370a9b5b` ("perf stat: Add interval printing") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: stable@vger.kernel.org # v3.9+ Link: http://lore.kernel.org/lkml/20190904094738.9558-2-srikar@linux.vnet.ibm.com [ Fixed up conflicts with libperf, i.e. some perf_{evsel,evlist} lost the 'perf' prefix ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:29 +02:00
Jiri Olsa	15c57bf9dc	perf tools: Fix segfault in cpu_cache_level__read() [ Upstream commit `0216234c2e` ] We release wrong pointer on error path in cpu_cache_level__read function, leading to segfault: (gdb) r record ls Starting program: /root/perf/tools/perf/perf record ls ... [ perf record: Woken up 1 times to write data ] double free or corruption (out) Thread 1 "perf" received signal SIGABRT, Aborted. 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 (gdb) bt #0 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 #1 0x00007ffff7443bac in abort () from /lib64/power9/libc.so.6 #2 0x00007ffff74af8bc in __libc_message () from /lib64/power9/libc.so.6 #3 0x00007ffff74b92b8 in malloc_printerr () from /lib64/power9/libc.so.6 #4 0x00007ffff74bb874 in _int_free () from /lib64/power9/libc.so.6 #5 0x0000000010271260 in __zfree (ptr=0x7fffffffa0b0) at ../../lib/zalloc.. #6 0x0000000010139340 in cpu_cache_level__read (cache=0x7fffffffa090, cac.. #7 0x0000000010143c90 in build_caches (cntp=0x7fffffffa118, size=<optimiz.. ... Releasing the proper pointer. Fixes: `720e98b5fa` ("perf tools: Add perf data cache feature") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org: # v4.6+ Link: http://lore.kernel.org/lkml/20190912105235.10689-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:28 +02:00
Balasubramani Vivekanandan	e5331c37c0	tick: broadcast-hrtimer: Fix a race in bc_set_next [ Upstream commit `b9023b91dd` ] When a cpu requests broadcasting, before starting the tick broadcast hrtimer, bc_set_next() checks if the timer callback (bc_handler) is active using hrtimer_try_to_cancel(). But hrtimer_try_to_cancel() does not provide the required synchronization when the callback is active on other core. The callback could have already executed tick_handle_oneshot_broadcast() and could have also returned. But still there is a small time window where the hrtimer_try_to_cancel() returns -1. In that case bc_set_next() returns without doing anything, but the next_event of the tick broadcast clock device is already set to a timeout value. In the race condition diagram below, CPU #1 is running the timer callback and CPU #2 is entering idle state and so calls bc_set_next(). In the worst case, the next_event will contain an expiry time, but the hrtimer will not be started which happens when the racing callback returns HRTIMER_NORESTART. The hrtimer might never recover if all further requests from the CPUs to subscribe to tick broadcast have timeout greater than the next_event of tick broadcast clock device. This leads to cascading of failures and finally noticed as rcu stall warnings Here is a depiction of the race condition CPU #1 (Running timer callback) CPU #2 (Enter idle and subscribe to tick broadcast) --------------------- --------------------- __run_hrtimer() tick_broadcast_enter() bc_handler() __tick_broadcast_oneshot_control() tick_handle_oneshot_broadcast() raw_spin_lock(&tick_broadcast_lock); dev->next_event = KTIME_MAX; //wait for tick_broadcast_lock //next_event for tick broadcast clock set to KTIME_MAX since no other cores subscribed to tick broadcasting raw_spin_unlock(&tick_broadcast_lock); if (dev->next_event == KTIME_MAX) return HRTIMER_NORESTART // callback function exits without restarting the hrtimer //tick_broadcast_lock acquired raw_spin_lock(&tick_broadcast_lock); tick_broadcast_set_event() clockevents_program_event() dev->next_event = expires; bc_set_next() hrtimer_try_to_cancel() //returns -1 since the timer callback is active. Exits without restarting the timer cpu_base->running = NULL; The comment that hrtimer cannot be armed from within the callback is wrong. It is fine to start the hrtimer from within the callback. Also it is safe to start the hrtimer from the enter/exit idle code while the broadcast handler is active. The enter/exit idle code and the broadcast handler are synchronized using tick_broadcast_lock. So there is no need for the existing try to cancel logic. All this can be removed which will eliminate the race condition as well. Fixes: `5d1638acb9` ("tick: Introduce hrtimer based broadcast") Originally-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balasubramani Vivekanandan <balasubramani_vivekanandan@mentor.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190926135101.12102-2-balasubramani_vivekanandan@mentor.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:28 +02:00

1 2 3 4 5 ...

791583 Commits