linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
Yoshihiro Shimoda	709ca46f1d	phy: renesas: rcar-gen2: Fix memory leak at error paths [ Upstream commit `d4a36e8292` ] This patch fixes a memory leak at error paths of the probe function. Inside a for_each_child_of_node loop, if the function returns early, the driver should call of_node_put() before returning. Reported-by: Julia Lawall <julia.lawall@lip6.fr> Fixes: `1233f59f74` ("phy: Renesas R-Car Gen2 PHY driver") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:27:00 +02:00
David Riley	725c7b7811	drm/virtio: Add memory barriers for capset cache. [ Upstream commit `9ff3a5c88e` ] After data is copied to the cache entry, atomic_set is used to indicate that the data in the entry is valid, but without appropriate memory barriers. Similarly, the read side was missing the corresponding memory barriers. Signed-off-by: David Riley <davidriley@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20190610211810.253227-5-davidriley@chromium.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:27:00 +02:00
Nicholas Kazlauskas	11b4e9f369	drm/amd/display: Always allocate initial connector state state [ Upstream commit `f04bee34d6` ] [Why] Unlike our regular connectors, MST connectors don't start off with an initial connector state. This causes a NULL pointer dereference to occur when attaching the bpc property since it tries to modify the connector state. We need an initial connector state on the connector to avoid the crash. [How] Use our reset helper to allocate an initial state and reset the values to their defaults. We were already doing this before, just not for MST connectors. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:27:00 +02:00
Rautkoski Kimmo EXT	1a2425b597	serial: 8250: Fix TX interrupt handling condition [ Upstream commit `db1b5bc047` ] Interrupt handler checked THRE bit (transmitter holding register empty) in LSR to detect if TX fifo is empty. In case when there is only receive interrupts the TX handling got called because THRE bit in LSR is set when there is no transmission (FIFO empty). TX handling caused TX stop, which in RS-485 half-duplex mode actually resets receiver FIFO. This is not desired during reception because of possible data loss. The fix is to check if THRI is set in IER in addition of the TX fifo status. THRI in IER is set when TX is started and cleared when TX is stopped. This ensures that TX handling is only called when there is really transmission on going and an interrupt for THRE and not when there are only RX interrupts. Signed-off-by: Kimmo Rautkoski <ext-kimmo.rautkoski@vaisala.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:27:00 +02:00
Jorge Ramirez-Ortiz	a0e7d6b7fa	tty: serial: msm_serial: avoid system lockup condition [ Upstream commit `ba3684f99f` ] The function msm_wait_for_xmitr can be taken with interrupts disabled. In order to avoid a potential system lockup - demonstrated under stress testing conditions on SoC QCS404/5 - make sure we wait for a bounded amount of time. Tested on SoC QCS404. Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:59 +02:00
Kefeng Wang	e40f5a873f	tty/serial: digicolor: Fix digicolor-usart already registered warning [ Upstream commit `c7ad9ba061` ] When modprobe/rmmod/modprobe module, if platform_driver_register() fails, the kernel complained, proc_dir_entry 'driver/digicolor-usart' already registered WARNING: CPU: 1 PID: 5636 at fs/proc/generic.c:360 proc_register+0x19d/0x270 Fix this by adding uart_unregister_driver() when platform_driver_register() fails. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:59 +02:00
Wang Hai	5c0e54839d	memstick: Fix error cleanup path of memstick_init [ Upstream commit `65f1a0d39c` ] If bus_register fails. On its error handling path, it has cleaned up what it has done. There is no need to call bus_unregister again. Otherwise, if bus_unregister is called, issues such as null-ptr-deref will arise. Syzkaller report this: kobject_add_internal failed for memstick (error: -12 parent: bus) BUG: KASAN: null-ptr-deref in sysfs_remove_file_ns+0x1b/0x40 fs/sysfs/file.c:467 Read of size 8 at addr 0000000000000078 by task syz-executor.0/4460 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xa9/0x10e lib/dump_stack.c:113 __kasan_report+0x171/0x18d mm/kasan/report.c:321 kasan_report+0xe/0x20 mm/kasan/common.c:614 sysfs_remove_file_ns+0x1b/0x40 fs/sysfs/file.c:467 sysfs_remove_file include/linux/sysfs.h:519 [inline] bus_remove_file+0x6c/0x90 drivers/base/bus.c:145 remove_probe_files drivers/base/bus.c:599 [inline] bus_unregister+0x6e/0x100 drivers/base/bus.c:916 ? 0xffffffffc1590000 memstick_init+0x7a/0x1000 [memstick] do_one_initcall+0xb9/0x3b5 init/main.c:914 do_init_module+0xe0/0x330 kernel/module.c:3468 load_module+0x38eb/0x4270 kernel/module.c:3819 __do_sys_finit_module+0x162/0x190 kernel/module.c:3909 do_syscall_64+0x72/0x2a0 arch/x86/entry/common.c:298 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: `baf8532a14` ("memstick: initial commit for Sony MemoryStick support") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai26@huawei.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:59 +02:00
Daniel Vetter	0a50a27238	drm/crc-debugfs: Also sprinkle irqrestore over early exits [ Upstream commit `d99004d720` ] I. was. blind. Caught with vkms, which has some really slow crc computation function. Fixes: `1882018a70` ("drm/crc-debugfs: User irqsafe spinlock in drm_crtc_add_crc_entry") Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190606211544.5389-1-daniel.vetter@ffwll.ch Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:59 +02:00
Daniel Vetter	26a6645454	drm/crc-debugfs: User irqsafe spinlock in drm_crtc_add_crc_entry [ Upstream commit `1882018a70` ] We can be called from any context, we need to be prepared. Noticed this while hacking on vkms, which calls this function from a normal worker. Which really upsets lockdep. Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190605194556.16744-1-daniel.vetter@ffwll.ch Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:59 +02:00
Thierry Reding	4d14323a2e	gpu: host1x: Increase maximum DMA segment size [ Upstream commit `1e390478cf` ] Recent versions of the DMA API debug code have started to warn about violations of the maximum DMA segment size. This is because the segment size defaults to 64 KiB, which can easily be exceeded in large buffer allocations such as used in DRM/KMS for framebuffers. Technically the Tegra SMMU and ARM SMMU don't have a maximum segment size (they map individual pages irrespective of whether they are contiguous or not), so the choice of 4 MiB is a bit arbitrary here. The maximum segment size is a 32-bit unsigned integer, though, so we can't set it to the correct maximum size, which would be the size of the aperture. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:59 +02:00
Jyri Sarha	f9bfd6bd82	drm/bridge: sii902x: pixel clock unit is 10kHz instead of 1kHz [ Upstream commit `8dbfc5b650` ] The pixel clock unit in the first two registers (0x00 and 0x01) of sii9022 is 10kHz, not 1kHz as in struct drm_display_mode. Division by 10 fixes the issue. Signed-off-by: Jyri Sarha <jsarha@ti.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1a2a8eae0b9d6333e7a5841026bf7fd65c9ccd09.1558964241.git.jsarha@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:59 +02:00
Tomi Valkeinen	7af9abd7d6	drm/bridge: tc358767: read display_props in get_modes() [ Upstream commit `3231573065` ] We need to know the link bandwidth to filter out modes we cannot support, so we need to have read the display props before doing the filtering. To ensure we have up to date display props, call tc_get_display_props() in the beginning of tc_connector_get_modes(). Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528082747.3631-22-tomi.valkeinen@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:58 +02:00
Alex Williamson	49c7230d8f	PCI: Return error if cannot probe VF [ Upstream commit `76002d8b48` ] Commit `0e7df22401` ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding") allows the user to specify that drivers for VFs of a PF should not be probed, but it actually causes pci_device_probe() to return success back to the driver core in this case. Therefore by all sysfs appearances the device is bound to a driver, the driver link from the device exists as does the device link back from the driver, yet the driver's probe function is never called on the device. We also fail to do any sort of cleanup when we're prohibited from probing the device, the IRQ setup remains in place and we even hold a device reference. Instead, abort with errno before any setup or references are taken when pci_device_can_probe() prevents us from trying to probe the device. Link: https://lore.kernel.org/lkml/155672991496.20698.4279330795743262888.stgit@gimli.home Fixes: `0e7df22401` ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:58 +02:00
Gen Zhang	2a18d76592	drm/edid: Fix a missing-check bug in drm_load_edid_firmware() [ Upstream commit `9f1f1a2dab` ] In drm_load_edid_firmware(), fwstr is allocated by kstrdup(). And fwstr is dereferenced in the following codes. However, memory allocation functions such as kstrdup() may fail and returns NULL. Dereferencing this null pointer may cause the kernel go wrong. Thus we should check this kstrdup() operation. Further, if kstrdup() returns NULL, we should return ERR_PTR(-ENOMEM) to the caller site. Signed-off-by: Gen Zhang <blackgod016574@gmail.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524023222.GA5302@zhanggen-UX430UQ Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:58 +02:00
Oak Zeng	210dfe6309	drm/amdkfd: Fix sdma queue map issue [ Upstream commit `065e4bdfa1` ] Previous codes assumes there are two sdma engines. This is not true e.g., Raven only has 1 SDMA engine. Fix the issue by using sdma engine number info in device_info. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:58 +02:00
Oak Zeng	db64bc1394	drm/amdkfd: Fix a potential memory leak [ Upstream commit `e73390d181` ] Free mqd_mem_obj if GTT buffer allocation for MQD+control stack fails. Signed-off-by: Oak Zeng <ozeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:58 +02:00
Paul Hsieh	6b1d2871fe	drm/amd/display: Disable ABM before destroy ABM struct [ Upstream commit `1090d58d48` ] [Why] When disable driver, OS will set backlight optimization then do stop device. But this flag will cause driver to enable ABM when driver disabled. [How] Send ABM disable command before destroy ABM construct Signed-off-by: Paul Hsieh <paul.hsieh@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:58 +02:00
Tiecheng Zhou	c242a531bb	drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE [ Upstream commit `fe2b5323d2` ] it requires to initialize HDP_NONSURFACE_BASE, so as to avoid using the value left by a previous VM under sriov scenario. v2: it should not hurt baremetal, generalize it for both sriov and baremetal Signed-off-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:58 +02:00
Nicholas Kazlauskas	147137f86b	drm/amd/display: Fill prescale_params->scale for RGB565 [ Upstream commit `1352c779cb` ] [Why] An assertion is thrown when using SURFACE_PIXEL_FORMAT_GRPH_RGB565 formats on DCE since the prescale_params->scale wasn't being filled. Found by a dmesg-fail when running the igt@kms_plane@pixel-format-pipe-a-planes test on Baffin. [How] Fill in the scale parameter. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:57 +02:00
Christophe Leroy	08b0bcc807	tty: serial: cpm_uart - fix init when SMC is relocated [ Upstream commit `06aaa3d066` ] SMC relocation can also be activated earlier by the bootloader, so the driver's behaviour cannot rely on selected kernel config. When the SMC is relocated, CPM_CR_INIT_TRX cannot be used. But the only thing CPM_CR_INIT_TRX does is to clear the rstate and tstate registers, so this can be done manually, even when SMC is not relocated. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Fixes: `9ab9212014` ("cpm_uart: fix non-console port startup bug") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:57 +02:00
Wen Yang	c901780d92	pinctrl: rockchip: fix leaked of_node references [ Upstream commit `3c89c70634` ] The call to of_parse_phandle returns a node pointer with refcount incremented thus it must be explicitly decremented after the last usage. Detected by coccinelle with the following warnings: ./drivers/pinctrl/pinctrl-rockchip.c:3221:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 3196, but without a corresponding object release within this function. ./drivers/pinctrl/pinctrl-rockchip.c:3223:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 3196, but without a corresponding object release within this function. Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Heiko Stuebner <heiko@sntech.de> Cc: linux-gpio@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:57 +02:00
Serge Semin	a9dfb6e436	tty: max310x: Fix invalid baudrate divisors calculator [ Upstream commit `35240ba26a` ] Current calculator doesn't do it' job quite correct. First of all the max310x baud-rates generator supports the divisor being less than 16. In this case the x2/x4 modes can be used to double or quadruple the reference frequency. But the current baud-rate setter function just filters all these modes out by the first condition and setups these modes only if there is a clocks-baud division remainder. The former doesn't seem right at all, since enabling the x2/x4 modes causes the line noise tolerance reduction and should be only used as a last resort to enable a requested too high baud-rate. Finally the fraction is supposed to be calculated from D = Fref/(cbaud) formulae, but not from D % 16, which causes the precision loss. So to speak the current baud-rate calculator code works well only if the baud perfectly fits to the uart reference input frequency. Lets fix the calculator by implementing the algo fully compliant with the fractional baud-rate generator described in the datasheet: D = Fref / (cbaud), where c={16,8,4} is the x1/x2/x4 rate mode respectively, Fref - reference input frequency. The divisor fraction is calculated from the same formulae, but making sure it is found with a resolution of 0.0625 (four bits). Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:57 +02:00
Thinh Nguyen	b0084c1b50	usb: core: hub: Disable hub-initiated U1/U2 [ Upstream commit `5617592927` ] If the device rejects the control transfer to enable device-initiated U1/U2 entry, then the device will not initiate U1/U2 transition. To improve the performance, the downstream port should not initate transition to U1/U2 to avoid the delay from the device link command response (no packet can be transmitted while waiting for a response from the device). If the device has some quirks and does not implement U1/U2, it may reject all the link state change requests, and the downstream port may resend and flood the bus with more requests. This will affect the device performance even further. This patch disables the hub-initated U1/U2 if the device-initiated U1/U2 entry fails. Reference: USB 3.2 spec 7.2.4.2.3 Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:57 +02:00
Quentin Deslandes	19755a124f	staging: vt6656: use meaningful error code during buffer allocation [ Upstream commit `d8c2869300` ] Check on called function's returned value for error and return 0 on success or a negative errno value on error instead of a boolean value. Signed-off-by: Quentin Deslandes <quentin.deslandes@itdev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:57 +02:00
Fabien Dessenne	b59f7650a5	iio: adc: stm32-dfsdm: missing error case during probe [ Upstream commit `d2fc015696` ] During probe, check the devm_ioremap_resource() error value. Also return the devm_clk_get() error value instead of -EINVAL. Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com> Acked-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:56 +02:00
Fabien Dessenne	302e4cdca1	iio: adc: stm32-dfsdm: manage the get_irq error case [ Upstream commit `3e53ef91f8` ] During probe, check the "get_irq" error value. Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com> Acked-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:56 +02:00
Peter Ujfalusi	586946ce83	drm/panel: simple: Fix panel_simple_dsi_probe [ Upstream commit `7ad9db66fa` ] In case mipi_dsi_attach() fails remove the registered panel to avoid added panel without corresponding device. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190226081153.31334-1-peter.ujfalusi@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:56 +02:00
Sunil Muthuswamy	49fb03de36	hvsock: fix epollout hang from race condition [ Upstream commit `cb359b6041` ] Currently, hvsock can enter into a state where epoll_wait on EPOLLOUT will not return even when the hvsock socket is writable, under some race condition. This can happen under the following sequence: - fd = socket(hvsocket) - fd_out = dup(fd) - fd_in = dup(fd) - start a writer thread that writes data to fd_out with a combination of epoll_wait(fd_out, EPOLLOUT) and - start a reader thread that reads data from fd_in with a combination of epoll_wait(fd_in, EPOLLIN) - On the host, there are two threads that are reading/writing data to the hvsocket stack: hvs_stream_has_space hvs_notify_poll_out vsock_poll sock_poll ep_poll Race condition: check for epollout from ep_poll(): assume no writable space in the socket hvs_stream_has_space() returns 0 check for epollin from ep_poll(): assume socket has some free space < HVS_PKT_LEN(HVS_SEND_BUF_SIZE) hvs_stream_has_space() will clear the channel pending send size host will not notify the guest because the pending send size has been cleared and so the hvsocket will never mark the socket writable Now, the EPOLLOUT will never return even if the socket write buffer is empty. The fix is to set the pending size to the default size and never change it. This way the host will always notify the guest whenever the writable space is bigger than the pending size. The host is already optimized to only notify the guest when the pending size threshold boundary is crossed and not everytime. This change also reduces the cpu usage somewhat since hv_stream_has_space() is in the hotpath of send: vsock_stream_sendmsg()->hv_stream_has_space() Earlier hv_stream_has_space was setting/clearing the pending size on every call. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-31 07:26:56 +02:00
Greg Kroah-Hartman	64f4694072	Linux 4.19.62	2019-07-28 08:29:30 +02:00
Vlad Buslov	60e9babfda	net: sched: verify that q!=NULL before setting q->flags commit `503d81d428` upstream. In function int tc_new_tfilter() q pointer can be NULL when adding filter on a shared block. With recent change that resets TCQ_F_CAN_BYPASS after filter creation, following NULL pointer dereference happens in case parent block is shared: [ 212.925060] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 212.925445] #PF: supervisor write access in kernel mode [ 212.925709] #PF: error_code(0x0002) - not-present page [ 212.925965] PGD 8000000827923067 P4D 8000000827923067 PUD 827924067 PMD 0 [ 212.926302] Oops: 0002 [#1] SMP KASAN PTI [ 212.926539] CPU: 18 PID: 2617 Comm: tc Tainted: G B 5.2.0+ #512 [ 212.926938] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 212.927364] RIP: 0010:tc_new_tfilter+0x698/0xd40 [ 212.927633] Code: 74 0d 48 85 c0 74 08 48 89 ef e8 03 aa 62 00 48 8b 84 24 a0 00 00 00 48 8d 78 10 48 89 44 24 18 e8 4d 0c 6b ff 48 8b 44 24 18 <83> 60 10 fb 48 85 ed 0f 85 3d fe ff ff e9 4f fe ff ff e8 81 26 f8 [ 212.928607] RSP: 0018:ffff88884fd5f5d8 EFLAGS: 00010296 [ 212.928905] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dffffc0000000000 [ 212.929201] RDX: 0000000000000007 RSI: 0000000000000004 RDI: 0000000000000297 [ 212.929402] RBP: ffff88886bedd600 R08: ffffffffb91d4b51 R09: fffffbfff7616e4d [ 212.929609] R10: fffffbfff7616e4c R11: ffffffffbb0b7263 R12: ffff88886bc61040 [ 212.929803] R13: ffff88884fd5f950 R14: ffffc900039c5000 R15: ffff88835e927680 [ 212.929999] FS: 00007fe7c50b6480(0000) GS:ffff88886f980000(0000) knlGS:0000000000000000 [ 212.930235] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 212.930394] CR2: 0000000000000010 CR3: 000000085bd04002 CR4: 00000000001606e0 [ 212.930588] Call Trace: [ 212.930682] ? tc_del_tfilter+0xa40/0xa40 [ 212.930811] ? __lock_acquire+0x5b5/0x2460 [ 212.930948] ? find_held_lock+0x85/0xa0 [ 212.931081] ? tc_del_tfilter+0xa40/0xa40 [ 212.931201] rtnetlink_rcv_msg+0x4ab/0x5f0 [ 212.931332] ? rtnl_dellink+0x490/0x490 [ 212.931454] ? lockdep_hardirqs_on+0x260/0x260 [ 212.931589] ? netlink_deliver_tap+0xab/0x5a0 [ 212.931717] ? match_held_lock+0x1b/0x240 [ 212.931844] netlink_rcv_skb+0xd0/0x200 [ 212.931958] ? rtnl_dellink+0x490/0x490 [ 212.932079] ? netlink_ack+0x440/0x440 [ 212.932205] ? netlink_deliver_tap+0x161/0x5a0 [ 212.932335] ? lock_downgrade+0x360/0x360 [ 212.932457] ? lock_acquire+0xe5/0x210 [ 212.932579] netlink_unicast+0x296/0x350 [ 212.932705] ? netlink_attachskb+0x390/0x390 [ 212.932834] ? _copy_from_iter_full+0xe0/0x3a0 [ 212.932976] netlink_sendmsg+0x394/0x600 [ 212.937998] ? netlink_unicast+0x350/0x350 [ 212.943033] ? move_addr_to_kernel.part.0+0x90/0x90 [ 212.948115] ? netlink_unicast+0x350/0x350 [ 212.953185] sock_sendmsg+0x96/0xa0 [ 212.958099] ___sys_sendmsg+0x482/0x520 [ 212.962881] ? match_held_lock+0x1b/0x240 [ 212.967618] ? copy_msghdr_from_user+0x250/0x250 [ 212.972337] ? lock_downgrade+0x360/0x360 [ 212.976973] ? rwlock_bug.part.0+0x60/0x60 [ 212.981548] ? __mod_node_page_state+0x1f/0xa0 [ 212.986060] ? match_held_lock+0x1b/0x240 [ 212.990567] ? find_held_lock+0x85/0xa0 [ 212.994989] ? do_user_addr_fault+0x349/0x5b0 [ 212.999387] ? lock_downgrade+0x360/0x360 [ 213.003713] ? find_held_lock+0x85/0xa0 [ 213.007972] ? __fget_light+0xa1/0xf0 [ 213.012143] ? sockfd_lookup_light+0x91/0xb0 [ 213.016165] __sys_sendmsg+0xba/0x130 [ 213.020040] ? __sys_sendmsg_sock+0xb0/0xb0 [ 213.023870] ? handle_mm_fault+0x337/0x470 [ 213.027592] ? page_fault+0x8/0x30 [ 213.031316] ? lockdep_hardirqs_off+0xbe/0x100 [ 213.034999] ? mark_held_locks+0x24/0x90 [ 213.038671] ? do_syscall_64+0x1e/0xe0 [ 213.042297] do_syscall_64+0x74/0xe0 [ 213.045828] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 213.049354] RIP: 0033:0x7fe7c527c7b8 [ 213.052792] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54 [ 213.060269] RSP: 002b:00007ffc3f7908a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 213.064144] RAX: ffffffffffffffda RBX: 000000005d34716f RCX: 00007fe7c527c7b8 [ 213.068094] RDX: 0000000000000000 RSI: 00007ffc3f790910 RDI: 0000000000000003 [ 213.072109] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fe7c5340cc0 [ 213.076113] R10: 0000000000404ec2 R11: 0000000000000246 R12: 0000000000000080 [ 213.080146] R13: 0000000000480640 R14: 0000000000000080 R15: 0000000000000000 [ 213.084147] Modules linked in: act_gact cls_flower sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc intel_rapl_msr intel_rapl_common sb_edac rdma_ucm rdma_cm x86_pkg_temp_thermal iw_cm intel_powerclamp ib_cm coretemp kvm_intel kvm irqbypass mlx5_ib ib_uverbs ib_core crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mlx5_core intel_cstate intel_uncore iTCO_wdt igb iTCO_vendor_support mlxfw mei_me ptp ses intel_rapl_perf mei pcspkr ipmi_ssif i2c_i801 joydev enclosure pps_core lpc_ich ioatdma wmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper drm mpt3sas raid_class scsi_transport_sas [ 213.112326] CR2: 0000000000000010 [ 213.117429] ---[ end trace adb58eb0a4ee6283 ]--- Verify that q pointer is not NULL before setting the 'flags' field. Fixes: `3f05e6886a` ("net_sched: unset TCQ_F_CAN_BYPASS when adding filters") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:30 +02:00
Kuo-Hsin Yang	c1d98b766e	mm: vmscan: scan anonymous pages on file refaults commit `2c012a4ad1` upstream. When file refaults are detected and there are many inactive file pages, the system never reclaim anonymous pages, the file pages are dropped aggressively when there are still a lot of cold anonymous pages and system thrashes. This issue impacts the performance of applications with large executable, e.g. chrome. With this patch, when file refault is detected, inactive_list_is_low() always returns true for file pages in get_scan_count() to enable scanning anonymous pages. The problem can be reproduced by the following test program. ---8<--- void fallocate_file(const char filename, off_t size) { struct stat st; int fd; if (!stat(filename, &st) && st.st_size >= size) return; fd = open(filename, O_WRONLY \| O_CREAT, 0600); if (fd < 0) { perror("create file"); exit(1); } if (posix_fallocate(fd, 0, size)) { perror("fallocate"); exit(1); } close(fd); } long alloc_anon(long size) { long start = malloc(size); memset(start, 1, size); return start; } long access_file(const char filename, long size, long rounds) { int fd, i; volatile char start1, end1, start2; const int page_size = getpagesize(); long sum = 0; fd = open(filename, O_RDONLY); if (fd == -1) { perror("open"); exit(1); } / * Some applications, e.g. chrome, use a lot of executable file * pages, map some of the pages with PROT_EXEC flag to simulate * the behavior. 
/ start1 = mmap(NULL, size / 2, PROT_READ \| PROT_EXEC, MAP_SHARED, fd, 0); if (start1 == MAP_FAILED) { perror("mmap"); exit(1); } end1 = start1 + size / 2; start2 = mmap(NULL, size / 2, PROT_READ, MAP_SHARED, fd, size / 2); if (start2 == MAP_FAILED) { perror("mmap"); exit(1); } for (i = 0; i < rounds; ++i) { struct timeval before, after; volatile char ptr1 = start1, ptr2 = start2; gettimeofday(&before, NULL); for (; ptr1 < end1; ptr1 += page_size, ptr2 += page_size) sum += ptr1 + ptr2; gettimeofday(&after, NULL); printf("File access time, round %d: %f (sec) ", i, (after.tv_sec - before.tv_sec) + (after.tv_usec - before.tv_usec) / 1000000.0); } return sum; } int main(int argc, char argv[]) { const long MB = 1024 * 1024; long anon_mb, file_mb, file_rounds; const char filename[] = "large"; long ret1; long ret2; if (argc != 4) { printf("usage: thrash ANON_MB FILE_MB FILE_ROUNDS "); exit(0); } anon_mb = atoi(argv[1]); file_mb = atoi(argv[2]); file_rounds = atoi(argv[3]); fallocate_file(filename, file_mb MB); printf("Allocate %ld MB anonymous pages ", anon_mb); ret1 = alloc_anon(anon_mb * MB); printf("Access %ld MB file pages ", file_mb); ret2 = access_file(filename, file_mb * MB, file_rounds); printf("Print result to prevent optimization: %ld ", *ret1 + ret2); return 0; } ---8<--- Running the test program on 2GB RAM VM with kernel 5.2.0-rc5, the program fills ram with 2048 MB memory, access a 200 MB file for 10 times. Without this patch, the file cache is dropped aggresively and every access to the file is from disk. 
$ ./thrash 2048 200 10 Allocate 2048 MB anonymous pages Access 200 MB file pages File access time, round 0: 2.489316 (sec) File access time, round 1: 2.581277 (sec) File access time, round 2: 2.487624 (sec) File access time, round 3: 2.449100 (sec) File access time, round 4: 2.420423 (sec) File access time, round 5: 2.343411 (sec) File access time, round 6: 2.454833 (sec) File access time, round 7: 2.483398 (sec) File access time, round 8: 2.572701 (sec) File access time, round 9: 2.493014 (sec) With this patch, these file pages can be cached. $ ./thrash 2048 200 10 Allocate 2048 MB anonymous pages Access 200 MB file pages File access time, round 0: 2.475189 (sec) File access time, round 1: 2.440777 (sec) File access time, round 2: 2.411671 (sec) File access time, round 3: 1.955267 (sec) File access time, round 4: 0.029924 (sec) File access time, round 5: 0.000808 (sec) File access time, round 6: 0.000771 (sec) File access time, round 7: 0.000746 (sec) File access time, round 8: 0.000738 (sec) File access time, round 9: 0.000747 (sec) Checked the swap out stats during the test [1], 19006 pages swapped out with this patch, 3418 pages swapped out without this patch. There are more swap out, but I think it's within reasonable range when file backed data set doesn't fit into the memory. 
$ ./thrash 2000 100 2100 5 1 # ANON_MB FILE_EXEC FILE_NOEXEC ROUNDS PROCESSES Allocate 2000 MB anonymous pages active_anon: 1613644, inactive_anon: 348656, active_file: 892, inactive_file: 1384 (kB) pswpout: 7972443, pgpgin: 478615246 Access 100 MB executable file pages Access 2100 MB regular file pages File access time, round 0: 12.165, (sec) active_anon: 1433788, inactive_anon: 478116, active_file: 17896, inactive_file: 24328 (kB) File access time, round 1: 11.493, (sec) active_anon: 1430576, inactive_anon: 477144, active_file: 25440, inactive_file: 26172 (kB) File access time, round 2: 11.455, (sec) active_anon: 1427436, inactive_anon: 476060, active_file: 21112, inactive_file: 28808 (kB) File access time, round 3: 11.454, (sec) active_anon: 1420444, inactive_anon: 473632, active_file: 23216, inactive_file: 35036 (kB) File access time, round 4: 11.479, (sec) active_anon: 1413964, inactive_anon: 471460, active_file: 31728, inactive_file: 32224 (kB) pswpout: 7991449 (+ 19006), pgpgin: 489924366 (+ 11309120) With 4 processes accessing non-overlapping parts of a large file, 30316 pages were swapped out with this patch, 5152 pages without it. The swap-out count is small compared to pgpgin.
[1]: https://github.com/vovo/testing/blob/master/mem_thrash.c Link: http://lkml.kernel.org/r/20190701081038.GA83398@google.com Fixes: `e986850598` ("mm,vmscan: only evict file pages when we have plenty") Fixes: `7c5bd705d8` ("mm: memcg: only evict file pages when we have plenty") Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Sonny Rao <sonnyrao@chromium.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Rik van Riel <riel@redhat.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> [4.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [backported to 4.14.y, 4.19.y, 5.1.y: adjust context] Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:30 +02:00
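The swap-out figures quoted in the commit above are differences between counter snapshots taken before and after the run. As a minimal sketch of that arithmetic, assuming snapshots in the "name value" per-line layout that /proc/vmstat uses, the delta could be computed as follows. The helper names `vmstat_lookup` and `counter_delta` are hypothetical and are not taken from the test program referenced in [1].

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: look up a counter by name in text laid out like
 * /proc/vmstat (one "name value" pair per line).
 * Returns the value, or -1 if the counter is not present.
 */
static long long vmstat_lookup(const char *text, const char *name)
{
    size_t name_len;
    const char *line = text;

    assert(text != NULL && name != NULL);
    name_len = strlen(name);

    while (*line) {
        long long value;

        /* Require a space after the name so "pswpout" does not match "pswpoutX". */
        if (strncmp(line, name, name_len) == 0 && line[name_len] == ' ' &&
            sscanf(line + name_len, "%lld", &value) == 1)
            return value;

        line = strchr(line, '\n');
        if (!line)
            break;
        line++; /* step past the newline to the next entry */
    }
    return -1;
}

/* Hypothetical helper: after-minus-before for one counter, or -1 if missing. */
static long long counter_delta(const char *before, const char *after,
                               const char *name)
{
    long long b = vmstat_lookup(before, name);
    long long a = vmstat_lookup(after, name);

    return (a < 0 || b < 0) ? -1 : a - b;
}
```

Fed the pswpout values from the run above (7972443 before, 7991449 after), `counter_delta` gives the reported increase of 19006.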
Jan Kiszka	7560e33369	KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested commit `cf64527bb3` upstream. Letting this pend may cause nested_get_vmcs12_pages to run against an invalid state, corrupting the effective vmcs of L1. This was triggerable in QEMU after a guest corruption in L2, followed by a L1 reset. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Cc: stable@vger.kernel.org Fixes: `7f7f1ba33c` ("KVM: x86: do not load vmcs12 pages while still in SMM") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:30 +02:00
Paolo Bonzini	967bc679c5	KVM: nVMX: do not use dangling shadow VMCS after guest reset commit `88dddc11a8` upstream. If a KVM guest is reset while running a nested guest, free_nested will disable the shadow VMCS execution control in the vmcs01. However, on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync the VMCS12 to the shadow VMCS which has since been freed. This causes a vmptrld of a NULL pointer on my machine, but Jan reports the host to hang altogether. Let's see how much this trivial patch fixes. Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Liran Alon <liran.alon@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:30 +02:00
Theodore Ts'o	3a17ca864b	ext4: allow directory holes commit `4e19d6b65f` upstream. The largedir feature was intended to allow ext4 directories to have unmapped directory blocks (e.g., directory holes). And so the released e2fsprogs no longer enforces this for largedir file systems; however, the corresponding change to the kernel-side code was not made. This commit fixes this oversight. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:30 +02:00
Ross Zwisler	caa4e08253	ext4: use jbd2_inode dirty range scoping commit `73131fbb00` upstream. Use the newly introduced jbd2_inode dirty range scoping to prevent us from waiting forever when trying to complete a journal transaction. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:29 +02:00
Ross Zwisler	af3812b65c	jbd2: introduce jbd2_inode dirty range scoping commit `6ba0e7dc64` upstream. Currently both journal_submit_inode_data_buffers() and journal_finish_inode_data_buffers() operate on the entire address space of each of the inodes associated with a given journal entry. The consequence of this is that if we have an inode where we are constantly appending dirty pages we can end up waiting for an indefinite amount of time in journal_finish_inode_data_buffers() while we wait for all the pages under writeback to be written out. The easiest way to cause this type of workload is to just dd from /dev/zero to a file until it fills the entire filesystem. This can cause journal_finish_inode_data_buffers() to wait for the duration of the entire dd operation. We can improve this situation by scoping each of the inode dirty ranges associated with a given transaction. We do this via the jbd2_inode structure so that the scoping is contained within jbd2 and so that it follows the lifetime and locking rules for that structure. This allows us to limit the writeback & wait in journal_submit_inode_data_buffers() and journal_finish_inode_data_buffers() respectively to the dirty range for a given struct jbd2_inode, keeping us from waiting forever if the inode in question is still being appended to. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:29 +02:00
Ross Zwisler	4becd6c11e	mm: add filemap_fdatawait_range_keep_errors() commit `aa0bfcd939` upstream. In the spirit of filemap_fdatawait_range() and filemap_fdatawait_keep_errors(), introduce filemap_fdatawait_range_keep_errors() which both takes a range upon which to wait and does not clear errors from the address space. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:29 +02:00
Theodore Ts'o	c9ea4620a3	ext4: enforce the immutable flag on open files commit `02b016ca7f` upstream. According to the chattr man page, "a file with the 'i' attribute cannot be modified..." Historically, this was only enforced when the file was opened, per the rest of the description, "... and the file can not be opened in write mode". There is general agreement that we should standardize all file systems to prevent modifications even for files that were opened at the time the immutable flag is set. Eventually, a change to enforce this at the VFS layer should be landing in mainline. Until then, enforce this at the ext4 level to prevent xfstests generic/553 from failing. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:29 +02:00
Darrick J. Wong	29171e8234	ext4: don't allow any modifications to an immutable file commit `2e53840362` upstream. Don't allow any modifications to a file that's marked immutable, which means that we have to flush all the writable pages to make them readonly and we have to check the setattr/setflags parameters more closely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:28 +02:00
Peter Zijlstra	4a5cc64d8a	perf/core: Fix race between close() and fork() commit `1cf8dfe8a6` upstream. Syzkaller reported the following Use-after-Free bug: close() clone() copy_process() perf_event_init_task() perf_event_init_context() mutex_lock(parent_ctx->mutex) inherit_task_group() inherit_group() inherit_event() mutex_lock(event->child_mutex) // expose event on child list list_add_tail() mutex_unlock(event->child_mutex) mutex_unlock(parent_ctx->mutex) ... goto bad_fork_* bad_fork_cleanup_perf: perf_event_free_task() perf_release() perf_event_release_kernel() list_for_each_entry() mutex_lock(ctx->mutex) mutex_lock(event->child_mutex) // event is from the failing inherit // on the other CPU perf_remove_from_context() list_move() mutex_unlock(event->child_mutex) mutex_unlock(ctx->mutex) mutex_lock(ctx->mutex) list_for_each_entry_safe() // event already stolen mutex_unlock(ctx->mutex) delayed_free_task() free_task() list_for_each_entry_safe() list_del() free_event() _free_event() // and so event->hw.target // is the already freed failed clone() if (event->hw.target) put_task_struct(event->hw.target) // WHOOPSIE, already quite dead Which puts the lie to the comment on perf_event_free_task(): 'unexposed, unused context' not so much. Which is a 'fun' confluence of fail; copy_process() doing an unconditional free_task() and not respecting refcounts, and perf having creative locking. In particular: `82d94856fa` ("perf/core: Fix lock inversion between perf,trace,cpuhp") seems to have overlooked this 'fun' parade. Solve it by using the fact that detached events still have a reference count on their (previous) context. With this perf_event_free_task() can detect when events have escaped and wait for their destruction.
Debugged-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reported-by: syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: `82d94856fa` ("perf/core: Fix lock inversion between perf,trace,cpuhp") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:28 +02:00
Alexander Shishkin	75100ec5f0	perf/core: Fix exclusive events' grouping commit `8a58ddae23` upstream. So far, we tried to disallow grouping exclusive events for the fear of complications they would cause with moving between contexts. Specifically, moving a software group to a hardware context would violate the exclusivity rules if both groups contain matching exclusive events. This attempt was, however, unsuccessful: the check that we have in the perf_event_open() syscall is both wrong (looks at wrong PMU) and insufficient (group leader may still be exclusive), as can be illustrated by running: $ perf record -e '{intel_pt//,cycles}' uname $ perf record -e '{cycles,intel_pt//}' uname ultimately successfully. Furthermore, we are completely free to trigger the exclusivity violation by: perf -e '{cycles,intel_pt//}' -e '{intel_pt//,instructions}' even though the helpful perf record will not allow that, the ABI will. The warning later in the perf_event_open() path will also not trigger, because it's also wrong. Fix all this by validating the original group before moving, getting rid of broken safeguards and placing a useful one to perf_install_in_context(). Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: mathieu.poirier@linaro.org Cc: will.deacon@arm.com Fixes: `bed5b25ad9` ("perf: Add a pmu capability for "exclusive" events") Link: https://lkml.kernel.org/r/20190701110755.24646-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:28 +02:00
Paul Cercueil	0e6ef18431	MIPS: lb60: Fix pin mappings commit `1323c3b72a` upstream. The pin mappings introduced in commit `636f8ba67f` ("MIPS: JZ4740: Qi LB60: Add pinctrl configuration for several drivers") are completely wrong. The pinctrl driver name is incorrect, and the function and group fields are swapped. Fixes: `636f8ba67f` ("MIPS: JZ4740: Qi LB60: Add pinctrl configuration for several drivers") Cc: <stable@vger.kernel.org> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: od@zcrc.me Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:28 +02:00
Keerthy	dd5994ab1f	gpio: davinci: silence error prints in case of EPROBE_DEFER commit `541e4095f3` upstream. Silence error prints in case of EPROBE_DEFER. This avoids multiple/duplicate defer prints during boot. Cc: <stable@vger.kernel.org> Signed-off-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:28 +02:00
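The pattern behind the commit above is to treat a deferred probe as an expected, retryable outcome and log only real failures. Here is a minimal user-space sketch of that check. The function name `report_probe_error` is hypothetical, and `EPROBE_DEFER` is defined locally because it is a kernel-internal error code that user-space errno.h does not provide.

```c
#include <assert.h>
#include <stdio.h>

/* Kernel-internal error code, mirrored here because user-space errno.h lacks it. */
#define EPROBE_DEFER 517

/*
 * Hypothetical helper: log a probe failure unless it is a deferral.
 * `err` follows the kernel convention of 0 on success, negative on failure.
 * Returns 1 if a message was written, 0 if nothing was logged.
 */
static int report_probe_error(FILE *log, const char *what, int err)
{
    assert(log != NULL && what != NULL);

    /* Success and "try again later" are both silent. */
    if (err == 0 || err == -EPROBE_DEFER)
        return 0;

    fprintf(log, "%s failed: %d\n", what, err);
    return 1;
}
```

A driver that is re-probed several times during boot would then print nothing until it either succeeds or fails for a reason other than deferral.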
Chris Wilson	c947cf3e95	dma-buf: Discard old fence_excl on retrying get_fences_rcu for realloc commit `f5b07b04e5` upstream. If we have to drop the seqcount & rcu lock to perform a krealloc, we have to restart the loop. In doing so, be careful not to lose track of the already acquired exclusive fence. Fixes: `fedf54132d` ("dma-buf: Restart reservation_object_get_fences_rcu() after writes") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: stable@vger.kernel.org #v4.10 Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190604125323.21396-1-chris@chris-wilson.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:28 +02:00
Jérôme Glisse	95ee55cab1	dma-buf: balance refcount inbalance commit `5e383a9798` upstream. The debugfs code takes references on fences without dropping them. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: Stéphane Marchesin <marcheu@chromium.org> Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20181206161840.6578-1-jglisse@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:28 +02:00
Nikolay Aleksandrov	b72fb8dec1	net: bridge: stp: don't cache eth dest pointer before skb pull [ Upstream commit `2446a68ae6` ] Don't cache eth dest pointer before calling pskb_may_pull. Fixes: `cf0f02d04a` ("[BRIDGE]: use llc for receiving STP packets") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:27 +02:00
Nikolay Aleksandrov	78701843ec	net: bridge: don't cache ether dest pointer on input [ Upstream commit `3d26eb8ad1` ] We would cache ether dst pointer on input in br_handle_frame_finish but after the neigh suppress code that could lead to a stale pointer since both ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to always reload it after the suppress code, so there's no point in having it cached; just retrieve it directly. Fixes: `057658cb33` ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports") Fixes: `ed842faeb2` ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:27 +02:00
Nikolay Aleksandrov	41a8df7180	net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query [ Upstream commit `3b26a5d03d` ] We get a pointer to the ipv6 hdr in br_ip6_multicast_query but we may call pskb_may_pull afterwards and end up using a stale pointer. So use the header directly, it's just 1 place where it's needed. Fixes: `08b202b672` ("bridge br_multicast: IPv6 MLD support.") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Tested-by: Martin Weinelt <martin@linuxlounge.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:27 +02:00
Nikolay Aleksandrov	caf4488fc0	net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling [ Upstream commit `e57f61858b` ] We take a pointer to grec prior to calling pskb_may_pull and use it afterwards to get nsrcs so record nsrcs before the pull when handling igmp3 and we get a pointer to nsrcs and call pskb_may_pull when handling mld2 which again could lead to reading 2 bytes out-of-bounds. ================================================================== BUG: KASAN: use-after-free in br_multicast_rcv+0x480c/0x4ad0 [bridge] Read of size 2 at addr ffff8880421302b4 by task ksoftirqd/1/16 CPU: 1 PID: 16 Comm: ksoftirqd/1 Tainted: G OE 5.2.0-rc6+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: dump_stack+0x71/0xab print_address_description+0x6a/0x280 ? br_multicast_rcv+0x480c/0x4ad0 [bridge] __kasan_report+0x152/0x1aa ? br_multicast_rcv+0x480c/0x4ad0 [bridge] ? br_multicast_rcv+0x480c/0x4ad0 [bridge] kasan_report+0xe/0x20 br_multicast_rcv+0x480c/0x4ad0 [bridge] ? br_multicast_disable_port+0x150/0x150 [bridge] ? ktime_get_with_offset+0xb4/0x150 ? __kasan_kmalloc.constprop.6+0xa6/0xf0 ? __netif_receive_skb+0x1b0/0x1b0 ? br_fdb_update+0x10e/0x6e0 [bridge] ? br_handle_frame_finish+0x3c6/0x11d0 [bridge] br_handle_frame_finish+0x3c6/0x11d0 [bridge] ? br_pass_frame_up+0x3a0/0x3a0 [bridge] ? virtnet_probe+0x1c80/0x1c80 [virtio_net] br_handle_frame+0x731/0xd90 [bridge] ? select_idle_sibling+0x25/0x7d0 ? br_handle_frame_finish+0x11d0/0x11d0 [bridge] __netif_receive_skb_core+0xced/0x2d70 ? virtqueue_get_buf_ctx+0x230/0x1130 [virtio_ring] ? do_xdp_generic+0x20/0x20 ? virtqueue_napi_complete+0x39/0x70 [virtio_net] ? virtnet_poll+0x94d/0xc78 [virtio_net] ? receive_buf+0x5120/0x5120 [virtio_net] ? __netif_receive_skb_one_core+0x97/0x1d0 __netif_receive_skb_one_core+0x97/0x1d0 ? __netif_receive_skb_core+0x2d70/0x2d70 ? _raw_write_trylock+0x100/0x100 ? __queue_work+0x41e/0xbe0 process_backlog+0x19c/0x650 ? 
_raw_read_lock_irq+0x40/0x40 net_rx_action+0x71e/0xbc0 ? __switch_to_asm+0x40/0x70 ? napi_complete_done+0x360/0x360 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __schedule+0x85e/0x14d0 __do_softirq+0x1db/0x5f9 ? takeover_tasklets+0x5f0/0x5f0 run_ksoftirqd+0x26/0x40 smpboot_thread_fn+0x443/0x680 ? sort_range+0x20/0x20 ? schedule+0x94/0x210 ? __kthread_parkme+0x78/0xf0 ? sort_range+0x20/0x20 kthread+0x2ae/0x3a0 ? kthread_create_worker_on_cpu+0xc0/0xc0 ret_from_fork+0x35/0x40 The buggy address belongs to the page: page:ffffea0001084c00 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 flags: 0xffffc000000000() raw: 00ffffc000000000 ffffea0000cfca08 ffffea0001098608 0000000000000000 raw: 0000000000000000 0000000000000003 00000000ffffff7f 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888042130180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888042130200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > ffff888042130280: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff888042130300: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888042130380: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint Fixes: `bc8c20acae` ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave") Reported-by: Martin Weinelt <martin@linuxlounge.net> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Tested-by: Martin Weinelt <martin@linuxlounge.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:27 +02:00
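The bridge fixes above share one root cause: a pointer into packet data is taken, a call that may reallocate the buffer follows, and the old pointer is then dereferenced. The following user-space sketch shows the safe ordering under simplified assumptions. `struct pkt`, `pkt_may_pull`, `pkt_read_u16` and `pkt_get_nsrcs` are hypothetical stand-ins, with `realloc` playing the role of the buffer move that pskb_may_pull can cause; none of this is the bridge code itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer whose storage may move. */
struct pkt {
    uint8_t *data;
    size_t len;
};

/*
 * Hypothetical analogue of a "may pull" call: ensure at least `need` bytes
 * are addressable, growing (and possibly moving) the buffer if required.
 * Returns 1 on success, 0 on allocation failure.
 */
static int pkt_may_pull(struct pkt *p, size_t need)
{
    uint8_t *grown;

    assert(p != NULL);
    if (need <= p->len)
        return 1;

    grown = realloc(p->data, need);
    if (!grown)
        return 0;

    memset(grown + p->len, 0, need - p->len);
    p->data = grown; /* pointers derived from the old p->data are now stale */
    p->len = need;
    return 1;
}

/* Read a big-endian 16-bit field, always re-deriving the address from p->data. */
static uint16_t pkt_read_u16(const struct pkt *p, size_t off)
{
    assert(p != NULL && off + 2 <= p->len);
    return (uint16_t)((p->data[off] << 8) | p->data[off + 1]);
}

/*
 * Safe ordering: copy the field by value before the pull and use only the
 * copy afterwards. Returns the field value, or -1 on failure.
 */
static int pkt_get_nsrcs(struct pkt *p, size_t off, size_t need_after)
{
    uint16_t nsrcs;

    assert(p != NULL);
    if (off + 2 > p->len)
        return -1;

    nsrcs = pkt_read_u16(p, off);      /* value captured before the pull */
    if (!pkt_may_pull(p, need_after))  /* buffer may move here */
        return -1;

    return nsrcs;
}
```

The alternative fix mentioned in the neighbouring commits, re-reading the header through the buffer after the pull instead of caching a pointer, corresponds to calling `pkt_read_u16` again once `pkt_may_pull` has returned.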
Xin Long	bc9a2f36a7	sctp: not bind the socket in sctp_connect [ Upstream commit `9b6c08878e` ] Now when sctp_connect() is called with a wrong sa_family, it binds to a port but doesn't set bp->port, then sctp_get_af_specific will return NULL and sctp_connect() returns -EINVAL. Then if sctp_bind() is called to bind to another port, the last port it has bound will leak because bp->port is NULL by then. sctp_connect() doesn't need to bind ports, as later __sctp_connect will do it if bp->port is NULL. So remove it from sctp_connect(). While at it, remove the unnecessary sockaddr.sa_family len check as it's already done in sctp_inet_connect. Fixes: `644fbdeacf` ("sctp: fix the issue that flags are ignored when using kernel_connect") Reported-by: syzbot+079bf326b38072f849d9@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-28 08:29:27 +02:00


790066 Commits