linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
David Disseldorp	0ad08996da	scsi: target: avoid per-loop XCOPY buffer allocations The main target_xcopy_do_work() loop unnecessarily allocates an I/O buffer with each synchronous READ / WRITE pair. This commit significantly reduces allocations by reusing the XCOPY I/O buffer when possible. Link: https://lore.kernel.org/r/20200327141954.955-4-ddiss@suse.de Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:59 -04:00
David Disseldorp	267fc83f88	scsi: target: drop xcopy DISK BLOCK LENGTH debug The DISK BLOCK LENGTH field is carried with XCOPY target descriptors on the wire, but is currently unmarshalled during 0x02 segment descriptor passing. The unmarshalled value is currently unused, so drop it. Link: https://lore.kernel.org/r/20200327141954.955-3-ddiss@suse.de Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:59 -04:00
David Disseldorp	95b1b51e77	scsi: target: use #define for xcopy descriptor len Link: https://lore.kernel.org/r/20200327141954.955-2-ddiss@suse.de Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:59 -04:00
Stanley Chu	8033824bbf	scsi: ufs-mediatek: add error recovery for suspend and resume Once fail happens during suspend and resume flow if the desired low power link state is H8, link recovery is required for MediaTek UFS controller. For resume flow, since power and clocks are already enabled before invoking vendor's resume callback, simply using ufshcd_link_recovery() inside callback is fine. For suspend flow, the device power enters low power mode or is disabled before suspend callback, thus ufshcd_link_recovery() can not be directly used in vendor callback. One solution is to set the link to off state and then ufshcd_host_reset_and_restore() will be executed by ufshcd_suspend(). Link: https://lore.kernel.org/r/20200327095329.10083-3-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:58 -04:00
Stanley Chu	087c5efafa	scsi: ufs: export ufshcd_link_recovery Export ufshcd_link_recovery to allow vendors to recover failed link in vendor's callbacks. Link: https://lore.kernel.org/r/20200327095329.10083-2-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:58 -04:00
Subhash Jadavani	394b949f2d	scsi: ufs: Clean up ufshcd_scale_clks() and clock scaling error out path This change introduces a func ufshcd_set_clk_freq() to explicitly set clock frequency so that it can be used in reset_and_restore path and in ufshcd_scale_clks(). This change also cleans up the clock scaling error out path. [mkp: commit desc] Link: https://lore.kernel.org/r/1585214742-5466-2-git-send-email-cang@codeaurora.org Fixes: `a3cd5ec55f` ("scsi: ufs: add load based scaling of UFS gear") Reviewed-by: Bean Huo <beanhuo@micron.com> Acked-by: Avri Altman <Avri.Altman@wdc.com> Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:58 -04:00
James Smart	d75e119e60	scsi: lpfc: Update lpfc version to 12.8.0.0 Update lpfc version to 12.8.0.0 Link: https://lore.kernel.org/r/20200322181304.37655-13-jsmart2021@gmail.com Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:58 -04:00
James Smart	0e75461a68	scsi: lpfc: Remove prototype FIPS/DSS options from SLI-3 During code review, identified dss feature that was a prototype only and was never productized in SLI3. They shouldn't be there and prevents reuse of the command areas. Remove any code in the driver to deal with dss, including code to deal with fips, which is associated with the dss feature. Link: https://lore.kernel.org/r/20200322181304.37655-12-jsmart2021@gmail.com Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:58 -04:00
James Smart	2fcbc569b9	scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI Currently driver ktime stats, measuring code paths, is NVME-specific. Convert the stats routines such that the code paths are generic, providing status for NVME and SCSI. Added ktime stat calls in SCSI queuecommand and cmpl routines. Link: https://lore.kernel.org/r/20200322181304.37655-11-jsmart2021@gmail.com Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:58 -04:00
James Smart	840eda9602	scsi: lpfc: Fix erroneous cpu limit of 128 on I/O statistics The cpu io statistics were capped by a hard define limit of 128. This effectively was a max number of CPUs, not an actual CPU count, nor actual CPU numbers which can be even larger than both of those values. This made stats off/misleading and on large CPU count systems, wrong. Fix the stats so that all CPUs can have a stats struct. Fix the looping such that it loops by hdwq, finds CPUs that used the hdwq, and sum the stats, then display. Link: https://lore.kernel.org/r/20200322181304.37655-9-jsmart2021@gmail.com Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-03-29 18:10:48 -04:00
Steve French	f460c50274	cifs: update internal module version number To 2.26 Signed-off-by: Steve French <stfrench@microsoft.com>	2020-03-29 16:59:31 -05:00
Alex Dewar	4a7c46247f	um: Remove some unnecessary NULL checks in vector_user.c kfree() already checks for null pointers, so additional checking is unnecessary. Signed-off-by: Alex Dewar <alex.dewar@gmx.co.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:56:47 +02:00
Sjoerd Simons	237ce2e681	um: vector: Avoid NULL ptr deference if transport is unset When the transport option of a vec isn't set strncmp ends up being called on a NULL pointer. Better not do that. Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:54:51 +02:00
Brendan Higgins	3363179385	um: Make CONFIG_STATIC_LINK actually static Currently, CONFIG_STATIC_LINK can be enabled with options which cannot be statically linked, namely UML_NET_VECTOR, UML_NET_VDE, and UML_NET_PCAP; this is because glibc tries to load NSS which does not support being statically linked. So make CONFIG_STATIC_LINK depend on !UML_NET_VECTOR && !UML_NET_VDE && !UML_NET_PCAP. Link: https://lore.kernel.org/lkml/f658f317-be54-ed75-8296-c373c2dcc697@cambridgegreys.com/#t Signed-off-by: Brendan Higgins <brendanhiggins@google.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:51:24 +02:00
Long Li	3946d0d04b	cifs: Allocate encryption header through kmalloc When encryption is used, smb2_transform_hdr is defined on the stack and is passed to the transport. This doesn't work with RDMA as the buffer needs to be DMA'ed. Fix it by using kmalloc. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-03-29 16:42:54 -05:00
Long Li	4ebb8795a7	cifs: smbd: Check and extend sender credits in interrupt context When a RDMA packet is received and server is extending send credits, we should check and unblock senders immediately in IRQ context. Doing it in a worker queue causes unnecessary delay and doesn't save much CPU on the receive path. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-03-29 16:42:36 -05:00
Long Li	f7950cb05d	cifs: smbd: Calculate the correct maximum packet size for segmented SMBDirect send/receive The packet size needs to take account of SMB2 header size and possible encryption header size. This is only done when signing is used and it is for RDMA send/receive, not read/write. Also remove the dead SMBD code in smb2_negotiate_r(w)size. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-03-29 16:41:49 -05:00
Johannes Berg	5bef0a153b	um: Implement cpu_relax() as ndelay(1) for time-travel In time-travel mode, cpu_relax() currently does actual CPU relax, but that doesn't affect the simulation. Ideally, we wouldn't run anything that uses it in simulation, but if we actually have virtio devices combined with the same simulation it's possible. Implement cpu_relax() as ndelay(1) in this case, using time_travel_ndelay(1) directly to catch errors if this is used erroneously in builds that don't set CONFIG_UML_TIME_TRAVEL_SUPPORT. While at it, convert it to an __always_inline and also add that to rep_nop() like the original does now. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:29:56 +02:00
Johannes Berg	0bc8fb4dda	um: Implement ndelay/udelay in time-travel mode In external or inf-cpu time-travel mode, ndelay/udelay currently just waste CPU time since the simulation time doesn't advance. Implement them properly in this case. Note that the "if (time_travel_mode == ...)" parts compile out if CONFIG_UML_TIME_TRAVEL_SUPPORT isn't set, time_travel_mode is defined to TT_MODE_OFF in that case. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:29:52 +02:00
Johannes Berg	88ce642492	um: Implement time-travel=ext This implements synchronized time-travel mode which - using a special application on a unix socket - lets multiple machines take part in a time-travelling simulation together. The protocol for the unix domain socket is defined in the new file include/uapi/linux/um_timetravel.h. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:29:08 +02:00
Johannes Berg	dd9ada5627	um: virtio: Implement VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS Implement in-band notifications that are necessary for running vhost-user devices under externally synchronized time-travel mode (which is in a follow-up patch). This feature makes what usually should be eventfd notifications in-band messages. We'll prefer this feature, under the assumption that only a few (simulation) devices will ever support it, since it's not very efficient. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:29:00 +02:00
Johannes Berg	4b786e24ca	um: time-travel: Rewrite as an event scheduler Instead of tracking all the various timer configurations, modify the time-travel mode to have an event scheduler and use a timer event on the scheduler to handle the different timer configurations. This doesn't change the function right now, but it prepares the code for having different kinds of events in the future (i.e. interrupts coming from other devices that are part of co-simulation.) While at it, also move time_travel_sleep() to time.c to reduce the externally visible API surface. Also, we really should mark time-travel as incompatible with SMP, even if UML doesn't support SMP yet. Finally, I noticed a bug while developing this - if we move time forward due to consuming time while reading the clock, we might move across the next event and that would cause us to go backward in time when we then handle that event. Fix that by invoking the whole event machine in this case, but in order to simplify this, make reading the clock only cost something when interrupts are not disabled. Otherwise, we'd have to hook into the interrupt delivery machinery etc. and that's somewhat intrusive. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:28:51 +02:00
Johannes Berg	f185063bff	um: Move timer-internal.h to non-shared This file isn't really shared, it's only used on the kernel side, not on the user side. Remove the include from the user-side and move the file to a better place. While at it, rename it to time-internal.h, it's not really just timers but all kinds of things related to timekeeping. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:28:43 +02:00
Andy Shevchenko	b58c4e9619	hostfs: Use kasprintf() instead of fixed buffer formatting Improve readability and maintainability by replacing a hardcoded string allocation and formatting by the use of the kasprintf() helper. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:23:00 +02:00
Alan Maguire	35f3401317	um: falloc.h needs to be directly included for older libc When building UML with glibc 2.17 installed, compilation of arch/um/os-Linux/file.c fails due to failure to find FALLOC_FL_PUNCH_HOLE and FALLOC_FL_KEEP_SIZE definitions. It appears that /usr/include/bits/fcntl-linux.h (indirectly included by /usr/include/fcntl.h) does not include falloc.h with an older glibc, whereas a more up-to-date version does. Adding the direct include to file.c resolves the issue and does not cause problems for more recent glibc. Fixes: `50109b5a03` ("um: Add support for DISCARD in the UBD Driver") Cc: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:21:42 +02:00
Gabriel Krisman Bertazi	e355b2f55e	um: ubd: Retry buffer read on any kind of error Should bulk_req_safe_read return an error, we want to retry the read, otherwise, even though no IO will be done, os_write_file might still end up writing garbage to the pipe. Cc: Martyn Welch <martyn.welch@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:21:37 +02:00
Gabriel Krisman Bertazi	6e682d53fc	um: ubd: Prevent buffer overrun on command completion On the hypervisor side, when completing commands and the pipe is full, we retry writing only the entries that failed, by offsetting io_req_buffer, but we don't reduce the number of bytes written, which can cause a buffer overrun of io_req_buffer, and write garbage to the pipe. Cc: Martyn Welch <martyn.welch@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:21:33 +02:00
David Gow	598f563036	um: Fix overlapping ELF segments when statically linked When statically linked, the .text section in UML kernels is not page aligned, causing it to share a page with the executable headers. As .text and the executable headers have different permissions, this causes the kernel to wish to map the same page twice (once as headers with r-- permissions, once as .text with r-x permissions), causing a segfault, and a nasty message printed to the host kernel's dmesg: "Uhuuh, elf segment at 0000000060000000 requested but the memory is mapped already" By aligning the .text to a page boundary (as in the dynamically linked version in dyn.lds.S), there is no such overlap, and the kernel runs correctly. Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:21:29 +02:00
Leon Romanovsky	73343392aa	um: Delete never executed timer The "#ifdef undef" construction effectively disabled the timer. It causes to the fact that this timer did nothing, so delete it. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:21:26 +02:00
Leon Romanovsky	c2ed957c3b	um: Don't overwrite ethtool driver version In-tree drivers don't need to manage internal version because they are aligned to the global Linux kernel version, which is reported by default with "ethtool -i". Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:20:25 +02:00
Wen Yang	ba758cfce0	um: Fix len of file in create_pid_file sizeof gives us the size of the pointer variable, not of the area it points to. So the number of bytes copied by umid_file_name() is 8. We should pass in the correct length of the file buffer. Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:20:07 +02:00
Andy Shevchenko	7d7c056828	um: Don't use console_drivers directly console_drivers is kind of (semi-)private variable to the console code. Direct use of it make us stuck with it being exported here and there. Reduce use of console_drivers by replacing it with for_each_console(). Cc: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:17:10 +02:00
Krzysztof Kozlowski	b495dfed70	um: Cleanup CONFIG_IOSCHED_CFQ CONFIG_IOSCHED_CFQ is gone since commit `f382fb0bce` ("block: remove legacy IO schedulers"). The IOSCHED_BFQ seems to replace IOSCHED_CFQ so select it in configs previously choosing the latter. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:15:22 +02:00
Thomas Gleixner	8a13b02a01	Merge tag 'irqchip-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Second batch of the GICv4.1 support saga - Level triggered interrupt support for the stm32 controller - Versatile-fpga chained interrupt fixes - DT support for cascaded VIC interrupt controller - RPi irqchip initialization fixes - Multi-instance support for the Xilinx interrupt controller - Multi-instance support for the PLIC interrupt controller - CPU hotplug support for the PLIC interrupt controller - Ingenic X1000 TCU support - Small fixes all over the shop (GICv3, GICv4, Xilinx, Atmel, sa1111) - Cleanups (setup_irq removal, zero-length array removal)	2020-03-29 22:20:48 +02:00
Leonard Crestez	a29de86521	rtc: imx-sc: Align imx sc msg structs to 4 The imx SC api strongly assumes that messages are composed out of 4-bytes words but some of our message structs have odd sizeofs. This produces many oopses with CONFIG_KASAN=y. Fix by marking with __aligned(4). Fixes: `a3094fc1a1` ("rtc: imx-sc: add rtc alarm support") Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Link: https://lore.kernel.org/r/13404bac8360852d86c61fad5ae5f0c91ffc4cb6.1582216144.git.leonard.crestez@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>	2020-03-29 22:08:35 +02:00
Biwen Li	9c328c9dd8	rtc: fsl-ftm-alarm: report alarm to core Report interrupt state to the RTC core. Signed-off-by: Biwen Li <biwen.li@nxp.com> Link: https://lore.kernel.org/r/20200327084457.45161-1-biwen.li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>	2020-03-29 22:01:49 +02:00
afzal mohammed	ba947241f1	unicore32: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. setup_irq() was required in older kernels as the memory allocator was not available during early boot. Hence replace setup_irq() by request_irq(). Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/82667ae23520611b2a9d8db77e1d8aeb982f08e5.1585320721.git.afzal.mohd.ma@gmail.com	2020-03-29 21:03:43 +02:00
afzal mohammed	5497fce735	sh: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. setup_irq() was required in older kernels as the memory allocator was not available during early boot. Hence replace setup_irq() by request_irq(). Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/b060312689820559121ee0a6456bbc1202fb7ee5.1585320721.git.afzal.mohd.ma@gmail.com	2020-03-29 21:03:43 +02:00
afzal mohammed	45b26ddee6	hexagon: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. setup_irq() was required in older kernels as the memory allocator was not available during early boot. Hence replace setup_irq() by request_irq(). Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/e84ac60de8f747d49ce082659e51595f708c29d4.1585320721.git.afzal.mohd.ma@gmail.com	2020-03-29 21:03:42 +02:00
afzal mohammed	e13b99f300	c6x: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. setup_irq() was required in older kernels as the memory allocator was not available during early boot. Hence replace setup_irq() by request_irq(). Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/56e991e920ce5806771fab892574cba89a3d413f.1585320721.git.afzal.mohd.ma@gmail.com	2020-03-29 21:03:42 +02:00
afzal mohammed	82c849eb36	alpha: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. setup_irq() was required in older kernels as the memory allocator was not available during early boot. Hence replace setup_irq() by request_irq(). Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Matt Turner <mattst88@gmail.com> Link: https://lkml.kernel.org/r/51f8ae7da9f47a23596388141933efa2bdef317b.1585320721.git.afzal.mohd.ma@gmail.com	2020-03-29 21:03:41 +02:00
Linus Torvalds	570203ec83	Merge branch 'akpm' (patches from Andrew) Merge vm fixes from Andrew Morton: "5 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/sparse: fix kernel crash with pfn_section_valid check mm: fork: fix kernel_stack memcg stats for various stack implementations hugetlb_cgroup: fix illegal access to memory drivers/base/memory.c: indicate all memory blocks as removable mm/swapfile.c: move inode_lock out of claim_swapfile	2020-03-29 10:40:31 -07:00
Linus Torvalds	ab93e984db	Merge tag 'timers-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for the Hyper-V clocksource driver to make sched clock actually return nanoseconds and not the virtual clock value which increments at 10e7 HZ (100ns)" * tag 'timers-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/hyper-v: Make sched clock return nanoseconds correctly	2020-03-29 10:36:29 -07:00
Linus Torvalds	01af08bd24	Merge tag 'irq-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "A single bugfix to prevent reference leaks in irq affinity notifiers" * tag 'irq-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Fix reference leaks on irq affinity notifiers	2020-03-29 10:07:00 -07:00
Aneesh Kumar K.V	b943f045a9	mm/sparse: fix kernel crash with pfn_section_valid check Fix the crash like this: BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc000000000c3447c Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1 ... NIP [c000000000c3447c] vmemmap_populated+0x98/0xc0 LR [c000000000088354] vmemmap_free+0x144/0x320 Call Trace: section_deactivate+0x220/0x240 __remove_pages+0x118/0x170 arch_remove_memory+0x3c/0x150 memunmap_pages+0x1cc/0x2f0 devm_action_release+0x30/0x50 release_nodes+0x2f8/0x3e0 device_release_driver_internal+0x168/0x270 unbind_store+0x130/0x170 drv_attr_store+0x44/0x60 sysfs_kf_write+0x68/0x80 kernfs_fop_write+0x100/0x290 __vfs_write+0x3c/0x70 vfs_write+0xcc/0x240 ksys_write+0x7c/0x140 system_call+0x5c/0x68 The crash is due to NULL dereference at test_bit(idx, ms->usage->subsection_map); due to ms->usage = NULL in pfn_section_valid() With commit `d41e2f3bd5` ("mm/hotplug: fix hot remove failure in SPARSEMEM\|!VMEMMAP case") section_mem_map is set to NULL after depopulate_section_mem(). This was done so that pfn_page() can work correctly with kernel config that disables SPARSEMEM_VMEMMAP. With that config pfn_to_page does __section_mem_map_addr(__sec) + __pfn; where static inline struct page __section_mem_map_addr(struct mem_section section) { unsigned long map = section->section_mem_map; map &= SECTION_MAP_MASK; return (struct page )map; } Now with SPASEMEM_VMEMAP enabled, mem_section->usage->subsection_map is used to check the pfn validity (pfn_valid()). Since section_deactivate release mem_section->usage if a section is fully deactivated, pfn_valid() check after a subsection_deactivate cause a kernel crash. static inline int pfn_valid(unsigned long pfn) { ... return early_section(ms) \|\| pfn_section_valid(ms, pfn); } where static inline int pfn_section_valid(struct mem_section ms, unsigned long pfn) { int idx = subsection_map_index(pfn); return test_bit(idx, ms->usage->subsection_map); } Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is freed. For architectures like ppc64 where large pages are used for vmmemap mapping (16MB), a specific vmemmap mapping can cover multiple sections. Hence before a vmemmap mapping page can be freed, the kernel needs to make sure there are no valid sections within that mapping. Clearing the section valid bit before depopulate_section_memap enables this. [aneesh.kumar@linux.ibm.com: add comment] Link: http://lkml.kernel.org/r/20200326133235.343616-1-aneesh.kumar@linux.ibm.comLink: http://lkml.kernel.org/r/20200325031914.107660-1-aneesh.kumar@linux.ibm.com Fixes: `d41e2f3bd5` ("mm/hotplug: fix hot remove failure in SPARSEMEM\|!VMEMMAP case") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-29 09:47:06 -07:00
Roman Gushchin	8380ce4790	mm: fork: fix kernel_stack memcg stats for various stack implementations Depending on CONFIG_VMAP_STACK and the THREAD_SIZE / PAGE_SIZE ratio the space for task stacks can be allocated using __vmalloc_node_range(), alloc_pages_node() and kmem_cache_alloc_node(). In the first and the second cases page->mem_cgroup pointer is set, but in the third it's not: memcg membership of a slab page should be determined using the memcg_from_slab_page() function, which looks at page->slab_cache->memcg_params.memcg . In this case, using mod_memcg_page_state() (as in account_kernel_stack()) is incorrect: page->mem_cgroup pointer is NULL even for pages charged to a non-root memory cgroup. It can lead to kernel_stack per-memcg counters permanently showing 0 on some architectures (depending on the configuration). In order to fix it, let's introduce a mod_memcg_obj_state() helper, which takes a pointer to a kernel object as a first argument, uses mem_cgroup_from_obj() to get a RCU-protected memcg pointer and calls mod_memcg_state(). It allows to handle all possible configurations (CONFIG_VMAP_STACK and various THREAD_SIZE/PAGE_SIZE values) without spilling any memcg/kmem specifics into fork.c . Note: This is a special version of the patch created for stable backports. It contains code from the following two patches: - mm: memcg/slab: introduce mem_cgroup_from_obj() - mm: fork: fix kernel_stack memcg stats for various stack implementations [guro@fb.com: introduce mem_cgroup_from_obj()] Link: http://lkml.kernel.org/r/20200324004221.GA36662@carbon.dhcp.thefacebook.com Fixes: `4d96ba3530` ("mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages") Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200303233550.251375-1-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-29 09:47:05 -07:00
Mina Almasry	726b7bbeaf	hugetlb_cgroup: fix illegal access to memory This appears to be a mistake in commit `faced7e080` ("mm: hugetlb controller for cgroups v2"). Essentially that commit does a hugetlb_cgroup_from_counter assuming that page_counter_try_charge has initialized counter. But if that has failed then it seems will not initialize counter, so hugetlb_cgroup_from_counter(counter) ends up pointing to random memory, causing kasan to complain. The solution is to simply use 'h_cg', instead of hugetlb_cgroup_from_counter(counter), since that is a reference to the hugetlb_cgroup anyway. After this change kasan ceases to complain. Fixes: `faced7e080` ("mm: hugetlb controller for cgroups v2") Reported-by: syzbot+cac0c4e204952cf449b1@syzkaller.appspotmail.com Signed-off-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Giuseppe Scrivano <gscrivan@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: David Rientjes <rientjes@google.com> Link: http://lkml.kernel.org/r/20200313223920.124230-1-almasrymina@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-29 09:47:05 -07:00
David Hildenbrand	53cdc1cb29	drivers/base/memory.c: indicate all memory blocks as removable We see multiple issues with the implementation/interface to compute whether a memory block can be offlined (exposed via /sys/devices/system/memory/memoryX/removable) and would like to simplify it (remove the implementation). 1. It runs basically lockless. While this might be good for performance, we see possible races with memory offlining that will require at least some sort of locking to fix. 2. Nowadays, more false positives are possible. No arch-specific checks are performed that validate if memory offlining will not be denied right away (and such check will require locking). For example, arm64 won't allow to offline any memory block that was added during boot - which will imply a very high error rate. Other archs have other constraints. 3. The interface is inherently racy. E.g., if a memory block is detected to be removable (and was not a false positive at that time), there is still no guarantee that offlining will actually succeed. So any caller already has to deal with false positives. 4. It is unclear which performance benefit this interface actually provides. The introducing commit `5c755e9fd8` ("memory-hotplug: add sysfs removable attribute for hotplug memory remove") mentioned "A user-level agent must be able to identify which sections of memory are likely to be removable before attempting the potentially expensive operation." However, no actual performance comparison was included. Known users: - lsmem: Will group memory blocks based on the "removable" property. [1] - chmem: Indirect user. It has a RANGE mode where one can specify removable ranges identified via lsmem to be offlined. However, it also has a "SIZE" mode, which allows a sysadmin to skip the manual "identify removable blocks" step. [2] - powerpc-utils: Uses the "removable" attribute to skip some memory blocks right away when trying to find some to offline+remove. However, with ballooning enabled, it already skips this information completely (because it once resulted in many false negatives). Therefore, the implementation can deal with false positives properly already. [3] According to Nathan Fontenot, DLPAR on powerpc is nowadays no longer driven from userspace via the drmgr command (powerpc-utils). Nowadays it's managed in the kernel - including onlining/offlining of memory blocks - triggered by drmgr writing to /sys/kernel/dlpar. So the affected legacy userspace handling is only active on old kernels. Only very old versions of drmgr on a new kernel (unlikely) might execute slower - totally acceptable. With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not break any user space tool. We implement a very bad heuristic now. Without CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report "not removable" as before. Original discussion can be found in [4] ("[PATCH RFC v1] mm: is_mem_section_removable() overhaul"). Other users of is_mem_section_removable() will be removed next, so that we can remove is_mem_section_removable() completely. [1] http://man7.org/linux/man-pages/man1/lsmem.1.html [2] http://man7.org/linux/man-pages/man8/chmem.8.html [3] https://github.com/ibm-power-utilities/powerpc-utils [4] https://lkml.kernel.org/r/20200117105759.27905-1-david@redhat.com Also, this patch probably fixes a crash reported by Steve. http://lkml.kernel.org/r/CAPcyv4jpdaNvJ67SkjyUJLBnBnXXQv686BiVW042g03FUmWLXw@mail.gmail.com Reported-by: "Scargall, Steve" <steve.scargall@intel.com> Suggested-by: Michal Hocko <mhocko@kernel.org> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Nathan Fontenot <ndfont@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Karel Zak <kzak@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200128093542.6908-1-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-29 09:47:05 -07:00
Naohiro Aota	d795a90e2b	mm/swapfile.c: move inode_lock out of claim_swapfile claim_swapfile() currently keeps the inode locked when it is successful, or the file is already swapfile (with -EBUSY). And, on the other error cases, it does not lock the inode. This inconsistency of the lock state and return value is quite confusing and actually causing a bad unlock balance as below in the "bad_swap" section of __do_sys_swapon(). This commit fixes this issue by moving the inode_lock() and IS_SWAPFILE check out of claim_swapfile(). The inode is unlocked in "bad_swap_unlock_inode" section, so that the inode is ensured to be unlocked at "bad_swap". Thus, error handling codes after the locking now jumps to "bad_swap_unlock_inode" instead of "bad_swap". ===================================== WARNING: bad unlock balance detected! 5.5.0-rc7+ #176 Not tainted ------------------------------------- swapon/4294 is trying to release lock (&sb->s_type->i_mutex_key) at: __do_sys_swapon+0x94b/0x3550 but there are no more locks to release! other info that might help us debug this: no locks held by swapon/4294. stack backtrace: CPU: 5 PID: 4294 Comm: swapon Not tainted 5.5.0-rc7-BTRFS-ZNS+ #176 Hardware name: ASUS All Series/H87-PRO, BIOS 2102 07/29/2014 Call Trace: dump_stack+0xa1/0xea print_unlock_imbalance_bug.cold+0x114/0x123 lock_release+0x562/0xed0 up_write+0x2d/0x490 __do_sys_swapon+0x94b/0x3550 __x64_sys_swapon+0x54/0x80 do_syscall_64+0xa4/0x4b0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f15da0a0dc7 Fixes: `1638045c36` ("mm: set S_SWAPFILE on blockdev swap devices") Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Qais Youef <qais.yousef@arm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200206090132.154869-1-naohiro.aota@wdc.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-29 09:47:05 -07:00
Chaitanya Kulkarni	654a3667df	block: return NULL in blk_alloc_queue() on error This patch fixes follwoing warning: block/blk-core.c: In function ‘blk_alloc_queue’: block/blk-core.c:558:10: warning: returning ‘int’ from a function with return type ‘struct request_queue *’ makes pointer from integer without a cast [-Wint-conversion] return -EINVAL; Fixes: `3d745ea5b0` ("block: simplify queue allocation") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-03-29 10:08:26 -06:00

... 36 37 38 39 40 ...

915498 Commits