linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 11:50:43 +09:00

Author	SHA1	Message	Date
Chris Lew	1cbc40a07c	rpmsg: glink: Put an extra reference during cleanup commit `b646293e27` upstream. In a remote processor crash scenario, there is no guarantee the remote processor sent close requests before it went into a bad state. Remove the reference that is normally handled by the close command in the so channel resources can be released. Fixes: `b4f8e52b89` ("rpmsg: Introduce Qualcomm RPM glink driver") Cc: stable@vger.kernel.org Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Chris Lew <clew@codeaurora.org> Reported-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:38 +01:00
Arun Kumar Neelakantam	d375fb033a	rpmsg: glink: Fix use after free in open_ack TIMEOUT case commit `ac74ea0186` upstream. Extra channel reference put when remote sending OPEN_ACK after timeout causes use-after-free while handling next remote CLOSE command. Remove extra reference put in timeout case to avoid use-after-free. Fixes: `b4f8e52b89` ("rpmsg: Introduce Qualcomm RPM glink driver") Cc: stable@vger.kernel.org Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:37 +01:00
Arun Kumar Neelakantam	bee84d7d8b	rpmsg: glink: Fix reuse intents memory leak issue commit `b85f6b6014` upstream. Memory allocated for re-usable intents are not freed during channel cleanup which causes memory leak in system. Check and free all re-usable memory to avoid memory leak. Fixes: `933b45da5d` ("rpmsg: glink: Add support for TX intents") Cc: stable@vger.kernel.org Acked-By: Chris Lew <clew@codeaurora.org> Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org> Reported-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:37 +01:00
Chris Lew	06e60a45a4	rpmsg: glink: Set tail pointer to 0 at end of FIFO commit `4623e8bf1d` upstream. When wrapping around the FIFO, the remote expects the tail pointer to be reset to 0 on the edge case where the tail equals the FIFO length. Fixes: `caf989c350` ("rpmsg: glink: Introduce glink smem based transport") Cc: stable@vger.kernel.org Signed-off-by: Chris Lew <clew@codeaurora.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:36 +01:00
Max Filippov	bae1e47136	xtensa: fix syscall_set_return_value commit `c2d9aa3b6e` upstream. syscall return value is in the register a2, not a0. Cc: stable@vger.kernel.org # v5.0+ Fixes: `9f24f3c106` ("xtensa: implement tracehook functions and enable HAVE_ARCH_TRACEHOOK") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:35 +01:00
Max Filippov	147128e775	xtensa: fix TLB sanity checker commit `36de10c478` upstream. Virtual and translated addresses retrieved by the xtensa TLB sanity checker must be consistent, i.e. correspond to the same state of the checked TLB entry. KASAN shadow memory is mapped dynamically using auto-refill TLB entries and thus may change TLB state between the virtual and translated address retrieval, resulting in false TLB insanity report. Move read_xtlb_translation close to read_xtlb_virtual to make sure that read values are consistent. Cc: stable@vger.kernel.org Fixes: `a99e07ee5e` ("xtensa: check TLB sanity on return to userspace") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:34 +01:00
Bob Peterson	0007f536dc	gfs2: fix glock reference problem in gfs2_trans_remove_revoke commit `fe5e7ba11f` upstream. Commit `9287c6452d` fixed a situation in which gfs2 could use a glock after it had been freed. To do that, it temporarily added a new glock reference by calling gfs2_glock_hold in function gfs2_add_revoke. However, if the bd element was removed by gfs2_trans_remove_revoke, it failed to drop the additional reference. This patch adds logic to gfs2_trans_remove_revoke to properly drop the additional glock reference. Fixes: `9287c6452d` ("gfs2: Fix occasional glock use-after-free") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:34 +01:00
Andreas Gruenbacher	e697fd14db	gfs2: Multi-block allocations in gfs2_page_mkwrite commit `f53056c430` upstream. In gfs2_page_mkwrite's gfs2_allocate_page_backing helper, try to allocate as many blocks at once as we need. Pass in the size of the requested allocation. Fixes: `35af80aef9` ("gfs2: don't use buffer_heads in gfs2_allocate_page_backing") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:33 +01:00
Max Filippov	1948e76afc	xtensa: use MEMBLOCK_ALLOC_ANYWHERE for KASAN shadow map commit `e64681b487` upstream. KASAN shadow map doesn't need to be accessible through the linear kernel mapping, allocate its pages with MEMBLOCK_ALLOC_ANYWHERE so that high memory can be used. This frees up to ~100MB of low memory on xtensa configurations with KASAN and high memory. Cc: stable@vger.kernel.org # v5.1+ Fixes: `f240ec09bb` ("memblock: replace memblock_alloc_base(ANYWHERE) with memblock_phys_alloc") Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:33 +01:00
Andreas Gruenbacher	06ad673b6c	block: fix "check bi_size overflow before merge" commit `cc90bc6842` upstream. This partially reverts commit `e3a5d8e386`. Commit `e3a5d8e386` ("check bi_size overflow before merge") adds a bio_full check to __bio_try_merge_page. This will cause __bio_try_merge_page to fail when the last bi_io_vec has been reached. Instead, what we want here is only the bi_size overflow check. Fixes: `e3a5d8e386` ("block: check bi_size overflow before merge") Cc: stable@vger.kernel.org # v5.4+ Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:32 +01:00
Leonard Crestez	f092fa8da2	PM / QoS: Redefine FREQ_QOS_MAX_DEFAULT_VALUE to S32_MAX commit `c6a3aea935` upstream. QOS requests for DEFAULT_VALUE are supposed to be ignored but this is not the case for FREQ_QOS_MAX. Adding one request for MAX_DEFAULT_VALUE and one for a real value will cause freq_qos_read_value to unexpectedly return MAX_DEFAULT_VALUE (-1). This happens because freq_qos max value is aggregated with PM_QOS_MIN but FREQ_QOS_MAX_DEFAULT_VALUE is (-1) so it's smaller than other values. Fix this by redefining FREQ_QOS_MAX_DEFAULT_VALUE to S32_MAX. Looking at current users for freq_qos it seems that none of them create requests for FREQ_QOS_MAX_DEFAULT_VALUE. Fixes: `77751a466e` ("PM: QoS: Introduce frequency QoS") Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Reported-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:31 +01:00
George Cherian	69396e4b31	PCI: Apply Cavium ACS quirk to ThunderX2 and ThunderX3 commit `f338bb9f01` upstream. Enhance the ACS quirk for Cavium Processors. Add the root port vendor IDs for ThunderX2 and ThunderX3 series of processors. [bhelgaas: add Fixes: and stable tag] Fixes: `f2ddaf8dfd` ("PCI: Apply Cavium ThunderX ACS quirk to more Root Ports") Link: https://lore.kernel.org/r/20191111024243.GA11408@dc5-eodlnx05.marvell.com Signed-off-by: George Cherian <george.cherian@marvell.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Robert Richter <rrichter@marvell.com> Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:30 +01:00
Yoshihiro Shimoda	2a67fc32eb	PCI: rcar: Fix missing MACCTLR register setting in initialization sequence commit `7c7e53e1c9` upstream. The R-Car Gen2/3 manual - available at: https://www.renesas.com/eu/en/products/microcontrollers-microprocessors/rz/rzg/rzg1m.html#documents "RZ/G Series User's Manual: Hardware" section strictly enforces the MACCTLR inizialization value - 39.3.1 - "Initial Setting of PCI Express": "Be sure to write the initial value (= H'80FF 0000) to MACCTLR before enabling PCIETCTLR.CFINIT". To avoid unexpected behavior and to match the SW initialization sequence guidelines, this patch programs the MACCTLR with the correct value. Note that the MACCTLR.SPCHG bit in the MACCTLR register description reports that "Only writing 1 is valid and writing 0 is invalid" but this "invalid" has to be interpreted as a write-ignore aka "ignored", not "prohibited". Reported-by: Eugeniu Rosca <erosca@de.adit-jv.com> Fixes: `c25da47788` ("PCI: rcar: Add Renesas R-Car PCIe driver") Fixes: `be20bbcb0a` ("PCI: rcar: Add the initialization of PCIe link in resume_noirq()") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:30 +01:00
Subbaraya Sundeep	286a524948	PCI: Do not use bus number zero from EA capability commit `73884a7082` upstream. As per PCIe r5.0, sec 7.8.5.2, fixed bus numbers of a bridge must be zero when no function that uses EA is located behind it. Hence, if EA supplies bus numbers of zero, assign bus numbers normally. A secondary bus can never have a bus number of zero, so setting a bridge's Secondary Bus Number to zero makes downstream devices unreachable. [bhelgaas: retain bool return value so "zero is invalid" logic is local] Fixes: `2dbce59011` ("PCI: Assign bus numbers present in EA capability for bridges") Link: https://lore.kernel.org/r/1572850664-9861-1-git-send-email-sundeep.lkml@gmail.com Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:29 +01:00
Jian-Hong Pan	a4d3d16fcb	PCI/MSI: Fix incorrect MSI-X masking on resume commit `e045fa29e8` upstream. When a driver enables MSI-X, msix_program_entries() reads the MSI-X Vector Control register for each vector and saves it in desc->masked. Each register is 32 bits and bit 0 is the actual Mask bit. When we restored these registers during resume, we previously set the Mask bit if any bit in desc->masked was set instead of when the Mask bit itself was set: pci_restore_state pci_restore_msi_state __pci_restore_msix_state for_each_pci_msi_entry msix_mask_irq(entry, entry->masked) <-- entire u32 word __pci_msix_desc_mask_irq(desc, flag) mask_bits = desc->masked & ~PCI_MSIX_ENTRY_CTRL_MASKBIT if (flag) <-- testing entire u32, not just bit 0 mask_bits \|= PCI_MSIX_ENTRY_CTRL_MASKBIT writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL) This means that after resume, MSI-X vectors were masked when they shouldn't be, which leads to timeouts like this: nvme nvme0: I/O 978 QID 3 timeout, completion polled On resume, set the Mask bit only when the saved Mask bit from suspend was set. This should remove the need for `19ea025e1d` ("nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T"). [bhelgaas: commit log, move fix to __pci_msix_desc_mask_irq()] Link: https://bugzilla.kernel.org/show_bug.cgi?id=204887 Link: https://lore.kernel.org/r/20191008034238.2503-1-jian-hong@endlessm.com Fixes: `f2440d9acb` ("PCI MSI: Refactor interrupt masking code") Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:28 +01:00
Steffen Liebergeld	1c6a922cf8	PCI: Fix Intel ACS quirk UPDCR register address commit `d8558ac8c9` upstream. According to documentation [0] the correct offset for the Upstream Peer Decode Configuration Register (UPDCR) is 0x1014. It was previously defined as 0x1114. `d99321b63b` ("PCI: Enable quirks for PCIe ACS on Intel PCH root ports") intended to enforce isolation between PCI devices allowing them to be put into separate IOMMU groups. Due to the wrong register offset the intended isolation was not fully enforced. This is fixed with this patch. Please note that I did not test this patch because I have no hardware that implements this register. [0] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/4th-gen-core-family-mobile-i-o-datasheet.pdf (page 325) Fixes: `d99321b63b` ("PCI: Enable quirks for PCIe ACS on Intel PCH root ports") Link: https://lore.kernel.org/r/7a3505df-79ba-8a28-464c-88b83eefffa6@kernkonzept.com Signed-off-by: Steffen Liebergeld <steffen.liebergeld@kernkonzept.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Acked-by: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:27 +01:00
Lukas Wunner	9bd9d12339	PCI: pciehp: Avoid returning prematurely from sysfs requests commit `157c1062fc` upstream. A sysfs request to enable or disable a PCIe hotplug slot should not return before it has been carried out. That is sought to be achieved by waiting until the controller's "pending_events" have been cleared. However the IRQ thread pciehp_ist() clears the "pending_events" before it acts on them. If pciehp_sysfs_enable_slot() / _disable_slot() happen to check the "pending_events" after they have been cleared but while pciehp_ist() is still running, the functions may return prematurely with an incorrect return value. Fix by introducing an "ist_running" flag which must be false before a sysfs request is allowed to return. Fixes: `32a8cef274` ("PCI: pciehp: Enable/disable exclusively from IRQ thread") Link: https://lore.kernel.org/linux-pci/1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com Link: https://lore.kernel.org/r/4174210466e27eb7e2243dd1d801d5f75baaffd8.1565345211.git.lukas@wunner.de Reported-and-tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:27 +01:00
Dexuan Cui	01acd9e82f	PCI/PM: Always return devices to D0 when thawing commit `f2c33ccacb` upstream. pci_pm_thaw_noirq() is supposed to return the device to D0 and restore its configuration registers, but previously it only did that for devices whose drivers implemented the new power management ops. Hibernation, e.g., via "echo disk > /sys/power/state", involves freezing devices, creating a hibernation image, thawing devices, writing the image, and powering off. The fact that thawing did not return devices with legacy power management to D0 caused errors, e.g., in this path: pci_pm_thaw_noirq if (pci_has_legacy_pm_support(pci_dev)) # true for Mellanox VF driver return pci_legacy_resume_early(dev) # ... legacy PM skips the rest pci_set_power_state(pci_dev, PCI_D0) pci_restore_state(pci_dev) pci_pm_thaw if (pci_has_legacy_pm_support(pci_dev)) pci_legacy_resume drv->resume mlx4_resume ... pci_enable_msix_range ... if (dev->current_state != PCI_D0) # <--- return -EINVAL; which caused these warnings: mlx4_core a6d1:00:02.0: INTx is not supported in multi-function mode, aborting PM: dpm_run_callback(): pci_pm_thaw+0x0/0xd7 returns -95 PM: Device a6d1:00:02.0 failed to thaw: error -95 Return devices to D0 and restore config registers for all devices, not just those whose drivers support new power management. [bhelgaas: also call pci_restore_state() before pci_legacy_resume_early(), update comment, add stable tag, commit log] Link: https://lore.kernel.org/r/KU1P153MB016637CAEAD346F0AA8E3801BFAD0@KU1P153MB0166.APCP153.PROD.OUTLOOK.COM Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:26 +01:00
Logan Gunthorpe	d83f65da65	PCI/switchtec: Read all 64 bits of part_event_bitmap commit `6acdf7e19b` upstream. The part_event_bitmap register is 64 bits wide, so read it with ioread64() instead of the 32-bit ioread32(). Fixes: `52eabba5bc` ("switchtec: Add IOCTLs to the Switchtec driver") Link: https://lore.kernel.org/r/20190910195833.3891-1-logang@deltatee.com Reported-by: Doug Meyer <dmeyer@gigaio.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.12+ Cc: Kelvin Cao <Kelvin.Cao@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:25 +01:00
Ulf Hansson	1a35dfb2a1	mmc: core: Re-work HW reset for SDIO cards commit `2ac55d5e5e` upstream. It have turned out that it's not a good idea to unconditionally do a power cycle and then to re-initialize the SDIO card, as currently done through mmc_hw_reset() -> mmc_sdio_hw_reset(). This because there may be multiple SDIO func drivers probed, who also shares the same SDIO card. To address these scenarios, one may be tempted to use a notification mechanism, as to allow the core to inform each of the probed func drivers, about an ongoing HW reset. However, supporting such an operation from the func driver point of view, may not be entirely trivial. Therefore, let's use a more simplistic approach to solve the problem, by instead forcing the card to be removed and re-detected, via scheduling a rescan-work. In this way, we can rely on existing infrastructure, as the func driver's ->remove() and ->probe() callbacks, becomes invoked to deal with the cleanup and the re-initialization. This solution may be considered as rather heavy, especially if a func driver doesn't share its card with other func drivers. To address this, let's keep the current immediate HW reset option as well, but run it only when there is one func driver probed for the card. Finally, to allow the caller of mmc_hw_reset(), to understand if the reset is being asynchronously managed from a scheduled work, it returns 1 (propagated from mmc_sdio_hw_reset()). If the HW reset is executed successfully and synchronously it returns 0, which maintains the existing behaviour. Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Douglas Anderson <dianders@chromium.org> Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:24 +01:00
Ulf Hansson	a0b50e5c4f	mmc: core: Drop check for mmc_card_is_removable() in mmc_rescan() commit `99b4ddd8b7` upstream. Upfront in mmc_rescan() we use the host->rescan_entered flag, to allow scanning only once for non-removable cards. Therefore, it's also not possible that we can have a corresponding card bus attached (host->bus_ops is NULL), when we are scanning non-removable cards. For this reason, let' drop the check for mmc_card_is_removable() as it's redundant. Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Douglas Anderson <dianders@chromium.org> Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:23 +01:00
Chaotian Jing	89c6e88294	mmc: block: Add CMD13 polling for MMC IOCTLS with R1B response commit `a0d4c7eb71` upstream. MMC IOCTLS with R1B responses may cause the card to enter the busy state, which means it's not ready to receive a new request. To prevent new requests from being sent to the card, use a CMD13 polling loop to verify that the card returns to the transfer state, before completing the request. Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:23 +01:00
Chaotian Jing	0cc2b0e6e5	mmc: block: Make card_busy_detect() a bit more generic commit `3869468e0c` upstream. To prepare for more users of card_busy_detect(), let's drop the struct request * as an in-parameter and convert to log the error message via dev_err() instead of pr_err(). Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:22 +01:00
Fredrik Noring	9b39b507a1	USB: Fix incorrect DMA allocations for local memory pool drivers commit `f8c63edfd7` upstream. Fix commit `7b81cb6bdd` ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") where local memory USB drivers erroneously allocate DMA memory instead of pool memory, causing OHCI Unrecoverable Error, disabled HC died; cleaning up The order between hcd_uses_dma() and hcd->localmem_pool is now arranged as in hcd_buffer_alloc() and hcd_buffer_free(), with the test for hcd->localmem_pool placed first. As an alternative, one might consider adjusting hcd_uses_dma() with static inline bool hcd_uses_dma(struct usb_hcd *hcd) { - return IS_ENABLED(CONFIG_HAS_DMA) && (hcd->driver->flags & HCD_DMA); + return IS_ENABLED(CONFIG_HAS_DMA) && + (hcd->driver->flags & HCD_DMA) && + (hcd->localmem_pool == NULL); } One can also consider unsetting HCD_DMA for local memory pool drivers. Fixes: `7b81cb6bdd` ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities") Cc: stable <stable@vger.kernel.org> Signed-off-by: Fredrik Noring <noring@nocrew.org> Link: https://lore.kernel.org/r/20191210172905.GA52526@sx9 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 11:04:21 +01:00
Greg Kroah-Hartman	9a08897100	Linux 5.4.5	2019-12-18 16:09:17 +01:00
Heiner Kallweit	68159412b2	r8169: add missing RX enabling for WoL on RTL8125 [ Upstream commit `00222d1394` ] RTL8125 also requires to enable RX for WoL. v2: add missing Fixes tag Fixes: `f1bce4ad2f` ("r8169: add support for RTL8125") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:17 +01:00
Vladimir Oltean	157560f95d	net: mscc: ocelot: unregister the PTP clock on deinit [ Upstream commit `9385973fe8` ] Currently a switch driver deinit frees the regmaps, but the PTP clock is still out there, available to user space via /dev/ptpN. Any PTP operation is a ticking time bomb, since it will attempt to use the freed regmaps and thus trigger kernel panics: [ 4.291746] fsl_enetc 0000:00:00.2 eth1: error -22 setting up slave phy [ 4.291871] mscc_felix 0000:00:00.5: Failed to register DSA switch: -22 [ 4.308666] mscc_felix: probe of 0000:00:00.5 failed with error -22 [ 6.358270] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088 [ 6.367090] Mem abort info: [ 6.369888] ESR = 0x96000046 [ 6.369891] EC = 0x25: DABT (current EL), IL = 32 bits [ 6.369892] SET = 0, FnV = 0 [ 6.369894] EA = 0, S1PTW = 0 [ 6.369895] Data abort info: [ 6.369897] ISV = 0, ISS = 0x00000046 [ 6.369899] CM = 0, WnR = 1 [ 6.369902] user pgtable: 4k pages, 48-bit VAs, pgdp=00000020d58c7000 [ 6.369904] [0000000000000088] pgd=00000020d5912003, pud=00000020d5915003, pmd=0000000000000000 [ 6.369914] Internal error: Oops: 96000046 [#1] PREEMPT SMP [ 6.420443] Modules linked in: [ 6.423506] CPU: 1 PID: 262 Comm: phc_ctl Not tainted 5.4.0-03625-gb7b2a5dadd7f #204 [ 6.431273] Hardware name: LS1028A RDB Board (DT) [ 6.435989] pstate: 40000085 (nZcv daIf -PAN -UAO) [ 6.440802] pc : css_release+0x24/0x58 [ 6.444561] lr : regmap_read+0x40/0x78 [ 6.448316] sp : ffff800010513cc0 [ 6.451636] x29: ffff800010513cc0 x28: ffff002055873040 [ 6.456963] x27: 0000000000000000 x26: 0000000000000000 [ 6.462289] x25: 0000000000000000 x24: 0000000000000000 [ 6.467617] x23: 0000000000000000 x22: 0000000000000080 [ 6.472944] x21: ffff800010513d44 x20: 0000000000000080 [ 6.478270] x19: 0000000000000000 x18: 0000000000000000 [ 6.483596] x17: 0000000000000000 x16: 0000000000000000 [ 6.488921] x15: 0000000000000000 x14: 0000000000000000 [ 6.494247] x13: 0000000000000000 x12: 0000000000000000 [ 6.499573] x11: 0000000000000000 x10: 0000000000000000 [ 6.504899] x9 : 0000000000000000 x8 : 0000000000000000 [ 6.510225] x7 : 0000000000000000 x6 : ffff800010513cf0 [ 6.515550] x5 : 0000000000000000 x4 : 0000000fffffffe0 [ 6.520876] x3 : 0000000000000088 x2 : ffff800010513d44 [ 6.526202] x1 : ffffcada668ea000 x0 : ffffcada64d8b0c0 [ 6.531528] Call trace: [ 6.533977] css_release+0x24/0x58 [ 6.537385] regmap_read+0x40/0x78 [ 6.540795] __ocelot_read_ix+0x6c/0xa0 [ 6.544641] ocelot_ptp_gettime64+0x4c/0x110 [ 6.548921] ptp_clock_gettime+0x4c/0x58 [ 6.552853] pc_clock_gettime+0x5c/0xa8 [ 6.556699] __arm64_sys_clock_gettime+0x68/0xc8 [ 6.561331] el0_svc_common.constprop.2+0x7c/0x178 [ 6.566133] el0_svc_handler+0x34/0xa0 [ 6.569891] el0_sync_handler+0x114/0x1d0 [ 6.573908] el0_sync+0x140/0x180 [ 6.577232] Code: d503201f b00119a1 91022263 b27b7be4 (f9004663) [ 6.583349] ---[ end trace d196b9b14cdae2da ]--- [ 6.587977] Kernel panic - not syncing: Fatal exception [ 6.593216] SMP: stopping secondary CPUs [ 6.597151] Kernel Offset: 0x4ada54400000 from 0xffff800010000000 [ 6.603261] PHYS_OFFSET: 0xffffd0a7c0000000 [ 6.607454] CPU features: 0x10002,21806008 [ 6.611558] Memory Limit: none And now that ocelot->ptp_clock is checked at exit, prevent a potential error where ptp_clock_register returned a pointer-encoded error, which we are keeping in the ocelot private data structure. So now, ocelot->ptp_clock is now either NULL or a valid pointer. Fixes: `4e3b0468e6` ("net: mscc: PTP Hardware Clock (PHC) support") Cc: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:14 +01:00
Shannon Nelson	dd561233e0	ionic: keep users rss hash across lif reset [ Upstream commit `ffac2027e1` ] If the user has specified their own RSS hash key, don't lose it across queue resets such as DOWN/UP, MTU change, and number of channels change. This is fixed by moving the key initialization to a little earlier in the lif creation. Also, let's clean up the RSS config a little better on the way down by setting it all to 0. Fixes: `aa3198819b` ("ionic: Add RSS support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:12 +01:00
Jonathan Lemon	9bd01a33c7	xdp: obtain the mem_id mutex before trying to remove an entry. [ Upstream commit `86c76c0989` ] A lockdep splat was observed when trying to remove an xdp memory model from the table since the mutex was obtained when trying to remove the entry, but not before the table walk started: Fix the splat by obtaining the lock before starting the table walk. Fixes: `c3f812cea0` ("page_pool: do not release pool until inflight == 0.") Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:10 +01:00
Jonathan Lemon	05f646cb21	page_pool: do not release pool until inflight == 0. [ Upstream commit `c3f812cea0` ] The page pool keeps track of the number of pages in flight, and it isn't safe to remove the pool until all pages are returned. Disallow removing the pool until all pages are back, so the pool is always available for page producers. Make the page pool responsible for its own delayed destruction instead of relying on XDP, so the page pool can be used without the xdp memory model. When all pages are returned, free the pool and notify xdp if the pool is registered with the xdp memory system. Have the callback perform a table walk since some drivers (cpsw) may share the pool among multiple xdp_rxq_info. Note that the increment of pages_state_release_cnt may result in inflight == 0, resulting in the pool being released. Fixes: `d956a048cd` ("xdp: force mem allocator removal and periodic warning") Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:07 +01:00
Aya Levin	6b2377de13	net/mlx5e: ethtool, Fix analysis of speed setting [ Upstream commit `3d7cadae51` ] When setting speed to 100G via ethtool (AN is set to off), only 25G4 is configured while the user, who has an advanced HW which supports extended PTYS, expects also 50G2 to be configured. With this patch, when extended PTYS mode is available, configure PTYS via extended fields. Fixes: `4b95840a6c` ("net/mlx5e: Fix matching of speed to PRM link modes") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:06 +01:00
Aya Levin	dd54484500	net/mlx5e: Fix translation of link mode into speed [ Upstream commit `6d485e5e55` ] Add a missing value in translation of PTYS ext_eth_proto_oper to its corresponding speed. When ext_eth_proto_oper bit 10 is set, ethtool shows unknown speed. With this fix, ethtool shows speed is 100G as expected. Fixes: `a08b4ed137` ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:04 +01:00
Roi Dayan	65523f0fe7	net/mlx5e: Fix freeing flow with kfree() and not kvfree() [ Upstream commit `a23dae79fb` ] Flows are allocated with kzalloc() so free with kfree(). Fixes: `04de7dda73` ("net/mlx5e: Infrastructure for duplicated offloading of TC flows") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:03 +01:00
Eran Ben Elisha	2e4e7670cb	net/mlx5e: Fix SFF 8472 eeprom length [ Upstream commit `c431f85978` ] SFF 8472 eeprom length is 512 bytes. Fix module info return value to support 512 bytes read. Fixes: `ace329f4ab` ("net/mlx5e: ethtool, Remove unsupported SFP EEPROM high pages query") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:09:00 +01:00
Aaron Conole	4e57c23391	act_ct: support asymmetric conntrack [ Upstream commit `95219afbb9` ] The act_ct TC module shares a common conntrack and NAT infrastructure exposed via netfilter. It's possible that a packet needs both SNAT and DNAT manipulation, due to e.g. tuple collision. Netfilter can support this because it runs through the NAT table twice - once on ingress and again after egress. The act_ct action doesn't have such capability. Like netfilter hook infrastructure, we should run through NAT twice to keep the symmetry. Fixes: `b57dc7c13e` ("net/sched: Introduce action ct") Signed-off-by: Aaron Conole <aconole@redhat.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:59 +01:00
Eran Ben Elisha	411fdb9752	net/mlx5e: Fix TXQ indices to be sequential [ Upstream commit `c55d8b108c` ] Cited patch changed (channel index, tc) => (TXQ index) mapping to be a static one, in order to keep indices consistent when changing number of channels or TCs. For 32 channels (OOB) and 8 TCs, real num of TXQs is 256. When reducing the amount of channels to 8, the real num of TXQs will be changed to 64. This indices method is buggy: - Channel #0, TC 3, the TXQ index is 96. - Index 8 is not valid, as there is no such TXQ from driver perspective (As it represents channel #8, TC 0, which is not valid with the above configuration). As part of driver's select queue, it calls netdev_pick_tx which returns an index in the range of real number of TXQs. Depends on the return value, with the examples above, driver could have returned index larger than the real number of tx queues, or crash the kernel as it tries to read invalid address of SQ which was not allocated. Fix that by allocating sequential TXQ indices, and hold a new mapping between (channel index, tc) => (real TXQ index). This mapping will be updated as part of priv channels activation, and is used in mlx5e_select_queue to find the selected queue index. The existing indices mapping (channel_tc2txq) is no longer needed, as it is used only for statistics structures and can be calculated on run time. Delete its definintion and updates. Fixes: `8bfaf07f78` ("net/mlx5e: Present SW stats when state is not opened") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:57 +01:00
Martin Varghese	cd477d06d2	net: Fixed updating of ethertype in skb_mpls_push() [ Upstream commit `d04ac224b1` ] The skb_mpls_push was not updating ethertype of an ethernet packet if the packet was originally received from a non ARPHRD_ETHER device. In the below OVS data path flow, since the device corresponding to port 7 is an l3 device (ARPHRD_NONE) the skb_mpls_push function does not update the ethertype of the packet even though the previous push_eth action had added an ethernet header to the packet. recirc_id(0),in_port(7),eth_type(0x0800),ipv4(tos=0/0xfc,ttl=64,frag=no), actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00), push_mpls(label=13,tc=0,ttl=64,bos=1,eth_type=0x8847),4 Fixes: `8822e270d6` ("net: core: move push MPLS functionality from OvS to core helper") Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:56 +01:00
Taehee Yoo	10fec3e566	hsr: fix a NULL pointer dereference in hsr_dev_xmit() [ Upstream commit `df95467b6d` ] hsr_dev_xmit() calls hsr_port_get_hsr() to find master node and that would return NULL if master node is not existing in the list. But hsr_dev_xmit() doesn't check return pointer so a NULL dereference could occur. Test commands: ip netns add nst ip link add veth0 type veth peer name veth1 ip link add veth2 type veth peer name veth3 ip link set veth1 netns nst ip link set veth3 netns nst ip link set veth0 up ip link set veth2 up ip link add hsr0 type hsr slave1 veth0 slave2 veth2 ip a a 192.168.100.1/24 dev hsr0 ip link set hsr0 up ip netns exec nst ip link set veth1 up ip netns exec nst ip link set veth3 up ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3 ip netns exec nst ip a a 192.168.100.2/24 dev hsr1 ip netns exec nst ip link set hsr1 up hping3 192.168.100.2 -2 --flood & modprobe -rv hsr Splat looks like: [ 217.351122][ T1635] kasan: CONFIG_KASAN_INLINE enabled [ 217.352969][ T1635] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 217.354297][ T1635] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 217.355507][ T1635] CPU: 1 PID: 1635 Comm: hping3 Not tainted 5.4.0+ #192 [ 217.356472][ T1635] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 217.357804][ T1635] RIP: 0010:hsr_dev_xmit+0x34/0x90 [hsr] [ 217.373010][ T1635] Code: 48 8d be 00 0c 00 00 be 04 00 00 00 48 83 ec 08 e8 21 be ff ff 48 8d 78 10 48 ba 00 b [ 217.376919][ T1635] RSP: 0018:ffff8880cd8af058 EFLAGS: 00010202 [ 217.377571][ T1635] RAX: 0000000000000000 RBX: ffff8880acde6840 RCX: 0000000000000002 [ 217.379465][ T1635] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: 0000000000000010 [ 217.380274][ T1635] RBP: ffff8880acde6840 R08: ffffed101b440d5d R09: 0000000000000001 [ 217.381078][ T1635] R10: 0000000000000001 R11: ffffed101b440d5c R12: ffff8880bffcc000 [ 217.382023][ T1635] R13: ffff8880bffcc088 R14: 0000000000000000 R15: ffff8880ca675c00 [ 217.383094][ T1635] FS: 00007f060d9d1740(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000 [ 217.384289][ T1635] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 217.385009][ T1635] CR2: 00007faf15381dd0 CR3: 00000000d523c001 CR4: 00000000000606e0 [ 217.385940][ T1635] Call Trace: [ 217.386544][ T1635] dev_hard_start_xmit+0x160/0x740 [ 217.387114][ T1635] __dev_queue_xmit+0x1961/0x2e10 [ 217.388118][ T1635] ? check_object+0xaf/0x260 [ 217.391466][ T1635] ? __alloc_skb+0xb9/0x500 [ 217.392017][ T1635] ? init_object+0x6b/0x80 [ 217.392629][ T1635] ? netdev_core_pick_tx+0x2e0/0x2e0 [ 217.393175][ T1635] ? __alloc_skb+0xb9/0x500 [ 217.393727][ T1635] ? rcu_read_lock_sched_held+0x90/0xc0 [ 217.394331][ T1635] ? rcu_read_lock_bh_held+0xa0/0xa0 [ 217.395013][ T1635] ? kasan_unpoison_shadow+0x30/0x40 [ 217.395668][ T1635] ? __kasan_kmalloc.constprop.4+0xa0/0xd0 [ 217.396280][ T1635] ? __kmalloc_node_track_caller+0x3a8/0x3f0 [ 217.399007][ T1635] ? __kasan_kmalloc.constprop.4+0xa0/0xd0 [ 217.400093][ T1635] ? __kmalloc_reserve.isra.46+0x2e/0xb0 [ 217.401118][ T1635] ? memset+0x1f/0x40 [ 217.402529][ T1635] ? __alloc_skb+0x317/0x500 [ 217.404915][ T1635] ? arp_xmit+0xca/0x2c0 [ ... ] Fixes: `311633b604` ("hsr: switch ->dellink() to ->ndo_uninit()") Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:54 +01:00
Martin Varghese	2cbaf5fb57	Fixed updating of ethertype in function skb_mpls_pop [ Upstream commit `040b5cfbce` ] The skb_mpls_pop was not updating ethertype of an ethernet packet if the packet was originally received from a non ARPHRD_ETHER device. In the below OVS data path flow, since the device corresponding to port 7 is an l3 device (ARPHRD_NONE) the skb_mpls_pop function does not update the ethertype of the packet even though the previous push_eth action had added an ethernet header to the packet. recirc_id(0),in_port(7),eth_type(0x8847), mpls(label=12/0xfffff,tc=0/0,ttl=0/0x0,bos=1/1), actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00), pop_mpls(eth_type=0x800),4 Fixes: `ed246cee09` ("net: core: move pop MPLS functionality from OvS to core helper") Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:53 +01:00
Cong Wang	23fbdd5d1e	gre: refetch erspan header from skb->data after pskb_may_pull() [ Upstream commit `0e4940928c` ] After pskb_may_pull() we should always refetch the header pointers from the skb->data in case it got reallocated. In gre_parse_header(), the erspan header is still fetched from the 'options' pointer which is fetched before pskb_may_pull(). Found this during code review of a KMSAN bug report. Fixes: `cb73ee40b1` ("net: ip_gre: use erspan key field for tunnel lookup") Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: William Tu <u9012063@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:51 +01:00
Yoshiki Komachi	71bc12b1fb	cls_flower: Fix the behavior using port ranges with hw-offload [ Upstream commit `8ffb055bea` ] The recent commit `5c72299fba` ("net: sched: cls_flower: Classify packets using port ranges") had added filtering based on port ranges to tc flower. However the commit missed necessary changes in hw-offload code, so the feature gave rise to generating incorrect offloaded flow keys in NIC. One more detailed example is below: $ tc qdisc add dev eth0 ingress $ tc filter add dev eth0 ingress protocol ip flower ip_proto tcp \ dst_port 100-200 action drop With the setup above, an exact match filter with dst_port == 0 will be installed in NIC by hw-offload. IOW, the NIC will have a rule which is equivalent to the following one. $ tc qdisc add dev eth0 ingress $ tc filter add dev eth0 ingress protocol ip flower ip_proto tcp \ dst_port 0 action drop The behavior was caused by the flow dissector which extracts packet data into the flow key in the tc flower. More specifically, regardless of exact match or specified port ranges, fl_init_dissector() set the FLOW_DISSECTOR_KEY_PORTS flag in struct flow_dissector to extract port numbers from skb in skb_flow_dissect() called by fl_classify(). Note that device drivers received the same struct flow_dissector object as used in skb_flow_dissect(). Thus, offloaded drivers could not identify which of these is used because the FLOW_DISSECTOR_KEY_PORTS flag was set to struct flow_dissector in either case. This patch adds the new FLOW_DISSECTOR_KEY_PORTS_RANGE flag and the new tp_range field in struct fl_flow_key to recognize which filters are applied to offloaded drivers. At this point, when filters based on port ranges passed to drivers, drivers return the EOPNOTSUPP error because they do not support the feature (the newly created FLOW_DISSECTOR_KEY_PORTS_RANGE flag). Fixes: `5c72299fba` ("net: sched: cls_flower: Classify packets using port ranges") Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:49 +01:00
John Hurley	554d2e14c5	net: sched: allow indirect blocks to bind to clsact in TC [ Upstream commit `25a443f74b` ] When a device is bound to a clsact qdisc, bind events are triggered to registered drivers for both ingress and egress. However, if a driver registers to such a device using the indirect block routines then it is assumed that it is only interested in ingress offload and so only replays ingress bind/unbind messages. The NFP driver supports the offload of some egress filters when registering to a block with qdisc of type clsact. However, on unregister, if the block is still active, it will not receive an unbind egress notification which can prevent proper cleanup of other registered callbacks. Modify the indirect block callback command in TC to send messages of ingress and/or egress bind depending on the qdisc in use. NFP currently supports egress offload for TC flower offload so the changes are only added to TC. Fixes: `4d12ba4278` ("nfp: flower: allow offloading of matches on 'internal' ports") Signed-off-by: John Hurley <john.hurley@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:49 +01:00
John Hurley	1b511a9d2c	net: core: rename indirect block ingress cb function [ Upstream commit `dbad340889` ] With indirect blocks, a driver can register for callbacks from a device that is does not 'own', for example, a tunnel device. When registering to or unregistering from a new device, a callback is triggered to generate a bind/unbind event. This, in turn, allows the driver to receive any existing rules or to properly clean up installed rules. When first added, it was assumed that all indirect block registrations would be for ingress offloads. However, the NFP driver can, in some instances, support clsact qdisc binds for egress offload. Change the name of the indirect block callback command in flow_offload to remove the 'ingress' identifier from it. While this does not change functionality, a follow up patch will implement a more more generic callback than just those currently just supporting ingress offload. Fixes: `4d12ba4278` ("nfp: flower: allow offloading of matches on 'internal' ports") Signed-off-by: John Hurley <john.hurley@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:47 +01:00
Guillaume Nault	ee0dc0c3f3	tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE() [ Upstream commit `721c8dafad` ] Syncookies borrow the ->rx_opt.ts_recent_stamp field to store the timestamp of the last synflood. Protect them with READ_ONCE() and WRITE_ONCE() since reads and writes aren't serialised. Use of .rx_opt.ts_recent_stamp for storing the synflood timestamp was introduced by `a0f82f64e2` ("syncookies: remove last_synq_overflow from struct tcp_sock"). But unprotected accesses were already there when timestamp was stored in .last_synq_overflow. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:45 +01:00
Guillaume Nault	e70ee16481	tcp: tighten acceptance of ACKs not matching a child socket [ Upstream commit `cb44a08f86` ] When no synflood occurs, the synflood timestamp isn't updated. Therefore it can be so old that time_after32() can consider it to be in the future. That's a problem for tcp_synq_no_recent_overflow() as it may report that a recent overflow occurred while, in fact, it's just that jiffies has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31. Spurious detection of recent overflows lead to extra syncookie verification in cookie_v[46]_check(). At that point, the verification should fail and the packet dropped. But we should have dropped the packet earlier as we didn't even send a syncookie. Let's refine tcp_synq_no_recent_overflow() to report a recent overflow only if jiffies is within the [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This way, no spurious recent overflow is reported when jiffies wraps and 'last_overflow' becomes in the future from the point of view of time_after32(). However, if jiffies wraps and enters the [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with 'last_overflow' being a stale synflood timestamp), then tcp_synq_no_recent_overflow() still erroneously reports an overflow. In such cases, we have to rely on syncookie verification to drop the packet. We unfortunately have no way to differentiate between a fresh and a stale syncookie timestamp. In practice, using last_overflow as lower bound is problematic. If the synflood timestamp is concurrently updated between the time we read jiffies and the moment we store the timestamp in 'last_overflow', then 'now' becomes smaller than 'last_overflow' and tcp_synq_no_recent_overflow() returns true, potentially dropping a valid syncookie. Reading jiffies after loading the timestamp could fix the problem, but that'd require a memory barrier. Let's just accommodate for potential timestamp growth instead and extend the interval using 'last_overflow - HZ' as lower bound. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:43 +01:00
Guillaume Nault	9afe690185	tcp: fix rejected syncookies due to stale timestamps [ Upstream commit `04d26e7b15` ] If no synflood happens for a long enough period of time, then the synflood timestamp isn't refreshed and jiffies can advance so much that time_after32() can't accurately compare them any more. Therefore, we can end up in a situation where time_after32(now, last_overflow + HZ) returns false, just because these two values are too far apart. In that case, the synflood timestamp isn't updated as it should be, which can trick tcp_synq_no_recent_overflow() into rejecting valid syncookies. For example, let's consider the following scenario on a system with HZ=1000: * The synflood timestamp is 0, either because that's the timestamp of the last synflood or, more commonly, because we're working with a freshly created socket. * We receive a new SYN, which triggers synflood protection. Let's say that this happens when jiffies == 2147484649 (that is, 'synflood timestamp' + HZ + 2^31 + 1). * Then tcp_synq_overflow() doesn't update the synflood timestamp, because time_after32(2147484649, 1000) returns false. With: - 2147484649: the value of jiffies, aka. 'now'. - 1000: the value of 'last_overflow' + HZ. * A bit later, we receive the ACK completing the 3WHS. But cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow() says that we're not under synflood. That's because time_after32(2147484649, 120000) returns false. With: - 2147484649: the value of jiffies, aka. 'now'. - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID. Of course, in reality jiffies would have increased a bit, but this condition will last for the next 119 seconds, which is far enough to accommodate for jiffie's growth. Fix this by updating the overflow timestamp whenever jiffies isn't within the [last_overflow, last_overflow + HZ] range. That shouldn't have any performance impact since the update still happens at most once per second. Now we're guaranteed to have fresh timestamps while under synflood, so tcp_synq_no_recent_overflow() can safely use it with time_after32() in such situations. Stale timestamps can still make tcp_synq_no_recent_overflow() return the wrong verdict when not under synflood. This will be handled in the next patch. For 64 bits architectures, the problem was introduced with the conversion of ->tw_ts_recent_stamp to 32 bits integer by commit `cca9bab1b7` ("tcp: use monotonic timestamps for PAWS"). The problem has always been there on 32 bits architectures. Fixes: `cca9bab1b7` ("tcp: use monotonic timestamps for PAWS") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:43 +01:00
Sabrina Dubroca	48d58ae9e8	net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup [ Upstream commit `6c8991f415` ] ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: `5f81bd2e5d` ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:42 +01:00
Sabrina Dubroca	8cadbd146a	net: ipv6: add net argument to ip6_dst_lookup_flow [ Upstream commit `c4e85f73af` ] This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow, as some modules currently pass a net argument without a socket to ip6_dst_lookup. This is equivalent to commit `343d60aada` ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument"). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:40 +01:00
Huy Nguyen	9617d69d66	net/mlx5e: Query global pause state before setting prio2buffer [ Upstream commit `73e6551699` ] When the user changes prio2buffer mapping while global pause is enabled, mlx5 driver incorrectly sets all active buffers (buffer that has at least one priority mapped) to lossy. Solution: If global pause is enabled, set all the active buffers to lossless in prio2buffer command. Also, add error message when buffer size is not enough to meet xoff threshold. Fixes: `0696d60853` ("net/mlx5e: Receive buffer configuration") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:38 +01:00
Taehee Yoo	0703996ff4	tipc: fix ordering of tipc module init and exit routine [ Upstream commit `9cf1cd8ee3` ] In order to set/get/dump, the tipc uses the generic netlink infrastructure. So, when tipc module is inserted, init function calls genl_register_family(). After genl_register_family(), set/get/dump commands are immediately allowed and these callbacks internally use the net_generic. net_generic is allocated by register_pernet_device() but this is called after genl_register_family() in the __init function. So, these callbacks would use un-initialized net_generic. Test commands: #SHELL1 while : do modprobe tipc modprobe -rv tipc done #SHELL2 while : do tipc link list done Splat looks like: [ 59.616322][ T2788] kasan: CONFIG_KASAN_INLINE enabled [ 59.617234][ T2788] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 59.618398][ T2788] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 59.619389][ T2788] CPU: 3 PID: 2788 Comm: tipc Not tainted 5.4.0+ #194 [ 59.620231][ T2788] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 59.621428][ T2788] RIP: 0010:tipc_bcast_get_broadcast_mode+0x131/0x310 [tipc] [ 59.622379][ T2788] Code: c7 c6 ef 8b 38 c0 65 ff 0d 84 83 c9 3f e8 d7 a5 f2 e3 48 8d bb 38 11 00 00 48 b8 00 00 00 00 [ 59.622550][ T2780] NET: Registered protocol family 30 [ 59.624627][ T2788] RSP: 0018:ffff88804b09f578 EFLAGS: 00010202 [ 59.624630][ T2788] RAX: dffffc0000000000 RBX: 0000000000000011 RCX: 000000008bc66907 [ 59.624631][ T2788] RDX: 0000000000000229 RSI: 000000004b3cf4cc RDI: 0000000000001149 [ 59.624633][ T2788] RBP: ffff88804b09f588 R08: 0000000000000003 R09: fffffbfff4fb3df1 [ 59.624635][ T2788] R10: fffffbfff50318f8 R11: ffff888066cadc18 R12: ffffffffa6cc2f40 [ 59.624637][ T2788] R13: 1ffff11009613eba R14: ffff8880662e9328 R15: ffff8880662e9328 [ 59.624639][ T2788] FS: 00007f57d8f7b740(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000 [ 59.624645][ T2788] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 59.625875][ T2780] tipc: Started in single node mode [ 59.626128][ T2788] CR2: 00007f57d887a8c0 CR3: 000000004b140002 CR4: 00000000000606e0 [ 59.633991][ T2788] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 59.635195][ T2788] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 59.636478][ T2788] Call Trace: [ 59.637025][ T2788] tipc_nl_add_bc_link+0x179/0x1470 [tipc] [ 59.638219][ T2788] ? lock_downgrade+0x6e0/0x6e0 [ 59.638923][ T2788] ? __tipc_nl_add_link+0xf90/0xf90 [tipc] [ 59.639533][ T2788] ? tipc_nl_node_dump_link+0x318/0xa50 [tipc] [ 59.640160][ T2788] ? mutex_lock_io_nested+0x1380/0x1380 [ 59.640746][ T2788] tipc_nl_node_dump_link+0x4fd/0xa50 [tipc] [ 59.641356][ T2788] ? tipc_nl_node_reset_link_stats+0x340/0x340 [tipc] [ 59.642088][ T2788] ? __skb_ext_del+0x270/0x270 [ 59.642594][ T2788] genl_lock_dumpit+0x85/0xb0 [ 59.643050][ T2788] netlink_dump+0x49c/0xed0 [ 59.643529][ T2788] ? __netlink_sendskb+0xc0/0xc0 [ 59.644044][ T2788] ? __netlink_dump_start+0x190/0x800 [ 59.644617][ T2788] ? __mutex_unlock_slowpath+0xd0/0x670 [ 59.645177][ T2788] __netlink_dump_start+0x5a0/0x800 [ 59.645692][ T2788] genl_rcv_msg+0xa75/0xe90 [ 59.646144][ T2788] ? __lock_acquire+0xdfe/0x3de0 [ 59.646692][ T2788] ? genl_family_rcv_msg_attrs_parse+0x320/0x320 [ 59.647340][ T2788] ? genl_lock_dumpit+0xb0/0xb0 [ 59.647821][ T2788] ? genl_unlock+0x20/0x20 [ 59.648290][ T2788] ? genl_parallel_done+0xe0/0xe0 [ 59.648787][ T2788] ? find_held_lock+0x39/0x1d0 [ 59.649276][ T2788] ? genl_rcv+0x15/0x40 [ 59.649722][ T2788] ? lock_contended+0xcd0/0xcd0 [ 59.650296][ T2788] netlink_rcv_skb+0x121/0x350 [ 59.650828][ T2788] ? genl_family_rcv_msg_attrs_parse+0x320/0x320 [ 59.651491][ T2788] ? netlink_ack+0x940/0x940 [ 59.651953][ T2788] ? lock_acquire+0x164/0x3b0 [ 59.652449][ T2788] genl_rcv+0x24/0x40 [ 59.652841][ T2788] netlink_unicast+0x421/0x600 [ ... ] Fixes: `7e43690578` ("tipc: fix a slab object leak") Fixes: `a62fbccecd` ("tipc: make subscriber server support net namespace") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-18 16:08:36 +01:00

1 2 3 4 5 ...

873861 Commits