linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-01 10:42:58 +09:00

Author	SHA1	Message	Date
Krzysztof Kozlowski	16ce487b86	regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled commit `b16bef60a9` upstream. The driver and its bindings, before commit `04f9f068a6` ("regulator: s5m8767: Modify parsing method of the voltage table of buck2/3/4") were requiring to provide at least one safe/default voltage for DVS registers if DVS GPIO is not being enabled. IOW, if s5m8767,pmic-buck2-uses-gpio-dvs is missing, the s5m8767,pmic-buck2-dvs-voltage should still be present and contain one voltage. This requirement was coming from driver behavior matching this condition (none of DVS GPIO is enabled): it was always initializing the DVS selector pins to 0 and keeping the DVS enable setting at reset value (enabled). Therefore if none of DVS GPIO is enabled in devicetree, driver was configuring the first DVS voltage for buck[234]. Mentioned commit `04f9f068a6` ("regulator: s5m8767: Modify parsing method of the voltage table of buck2/3/4") broke it because DVS voltage won't be parsed from devicetree if DVS GPIO is not enabled. After the change, driver will configure bucks to use the register reset value as voltage which might have unpleasant effects. Fix this by relaxing the bindings constrain: if DVS GPIO is not enabled in devicetree (therefore DVS voltage is also not parsed), explicitly disable it. Cc: <stable@vger.kernel.org> Fixes: `04f9f068a6` ("regulator: s5m8767: Modify parsing method of the voltage table of buck2/3/4") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Acked-by: Rob Herring <robh@kernel.org> Message-Id: <20211008113723.134648-2-krzysztof.kozlowski@canonical.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 12:13:12 +09:00
Miquel Raynal	80f183bad3	dt-bindings: mtd: gpmc: Fix the ECC bytes vs. OOB bytes equation [ Upstream commit `778cb8e39f` ] "PAGESIZE / 512" is the number of ECC chunks. "ECC_BYTES" is the number of bytes needed to store a single ECC code. "2" is the space reserved by the bad block marker. "2 + (PAGESIZE / 512) * ECC_BYTES" should of course be lower or equal than the total number of OOB bytes, otherwise it won't fit. Fix the equation by substituting s/>=/<=/. Suggested-by: Ryan J. Barnett <ryan.barnett@collins.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20210610143945.3504781-1-miquel.raynal@bootlin.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 11:46:56 +09:00
Jeff Layton	7754a1e167	locks: print a warning when mount fails due to lack of "mand" support [ Upstream commit `df2474a22c` ] Since `9e8925b67a` ("locks: Allow disabling mandatory locking at compile time"), attempts to mount filesystems with "-o mand" will fail. Unfortunately, there is no other indiciation of the reason for the failure. Change how the function is defined for better readability. When CONFIG_MANDATORY_FILE_LOCKING is disabled, printk a warning when someone attempts to mount with -o mand. Also, add a blurb to the mandatory-locking.txt file to explain about the "mand" option, and the behavior one should expect when it is disabled. Reported-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 11:42:37 +09:00
Finn Behrens	128e2d33b7	tweewide: Fix most Shebang lines commit `c25ce589dc` upstream. Change every shebang which does not need an argument to use /usr/bin/env. This is needed as not every distro has everything under /usr/bin, sometimes not even bash. Signed-off-by: Finn Behrens <me@kloenk.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> [nicolas@fjasle.eu: update contexts for v4.9, adapt for old scripts] Signed-off-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 11:09:32 +09:00
Joe Perches	90304a7c8e	sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output commit `2efc459d06` upstream. Output defects can exist in sysfs content using sprintf and snprintf. sprintf does not know the PAGE_SIZE maximum of the temporary buffer used for outputting sysfs content and it's possible to overrun the PAGE_SIZE buffer length. Add a generic sysfs_emit function that knows that the size of the temporary buffer and ensures that no overrun is done. Add a generic sysfs_emit_at function that can be used in multiple call situations that also ensures that no overrun is done. Validate the output buffer argument to be page aligned. Validate the offset len argument to be within the PAGE_SIZE buf. Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/884235202216d464d61ee975f7465332c86f76b2.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 10:40:14 +09:00
Krzysztof Kozlowski	e1fba672a5	dt-bindings: net: correct interrupt flags in examples [ Upstream commit `4d521943f7` ] GPIO_ACTIVE_x flags are not correct in the context of interrupt flags. These are simple defines so they could be used in DTS but they will not have the same meaning: 1. GPIO_ACTIVE_HIGH = 0 = IRQ_TYPE_NONE 2. GPIO_ACTIVE_LOW = 1 = IRQ_TYPE_EDGE_RISING Correct the interrupt flags, assuming the author of the code wanted same logical behavior behind the name "ACTIVE_xxx", this is: ACTIVE_LOW => IRQ_TYPE_LEVEL_LOW ACTIVE_HIGH => IRQ_TYPE_LEVEL_HIGH Fixes: `a1a8b4594f` ("NFC: pn544: i2c: Add DTS Documentation") Fixes: `6be88670fc` ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver") Fixes: `e3b3292215` ("dt-bindings: can: tcan4x5x: Update binding to use interrupt property") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for tcan4x5x.txt Link: https://lore.kernel.org/r/20201026153620.89268-1-krzk@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 09:50:54 +09:00
Nicholas Piggin	32f6b5cd52	powerpc/64s: flush L1D after user accesses commit `9a32a7e78b` upstream. IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked. However these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used for example to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack. This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern. This patch flushes the L1 cache after user accesses. This is part of the fix for CVE-2020-4788. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 09:48:29 +09:00
Nicholas Piggin	9ac3d709ff	powerpc/64s: flush L1D on kernel entry commit `f79643787e` upstream. IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked. However these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used for example to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack. This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern. This patch flushes the L1 cache on kernel entry. This is part of the fix for CVE-2020-4788. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 09:48:23 +09:00
Juergen Gross	ca6dc89487	xen/events: defer eoi in case of excessive number of events commit `e99502f762` upstream. In case rogue guests are sending events at high frequency it might happen that xen_evtchn_do_upcall() won't stop processing events in dom0. As this is done in irq handling a crash might be the result. In order to avoid that, delay further inter-domain events after some time in xen_evtchn_do_upcall() by forcing eoi processing into a worker on the same cpu, thus inhibiting new events coming in. The time after which eoi processing is to be delayed is configurable via a new module parameter "event_loop_timeout" which specifies the maximum event loop time in jiffies (default: 2, the value was chosen after some tests showing that a value of 2 was the lowest with an only slight drop of dom0 network throughput while multiple guests performed an event storm). How long eoi processing will be delayed can be specified via another parameter "event_eoi_delay" (again in jiffies, default 10, again the value was chosen after testing with different delay values). This is part of XSA-332. Cc: stable@vger.kernel.org Reported-by: Julien Grall <julien@xen.org> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Wei Liu <wl@xen.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 09:48:08 +09:00
Sabrina Dubroca	20c72a6f3a	UPSTREAM: Documentation: ip-sysctl.txt: document addr_gen_mode addr_gen_mode was introduced in without documentation, add it now. Fixes: `d35a00b8e3` ("net/ipv6: allow sysctl to change link-local address generation mode") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit `f168db5e25`) Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I89df4f69976e19ff5664d1099d092fd335a4fba7	2023-05-16 09:46:42 +09:00
Eric Dumazet	15898d9379	icmp: randomize the global rate limiter [ Upstream commit `b38e7819ca` ] Keyu Man reported that the ICMP rate limiter could be used by attackers to get useful signal. Details will be provided in an upcoming academic publication. Our solution is to add some noise, so that the attackers no longer can get help from the predictable token bucket limiter. Fixes: `4cdf507d54` ("icmp: add a global rate limitation") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Keyu Man <kman001@ucr.edu> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 09:37:26 +09:00
Jiri Slaby	e9a0967050	ata: make qc_prep return ata_completion_errors commit `95364f3670` upstream. In case a driver wants to return an error from qc_prep, return enum ata_completion_errors. sata_mv is one of those drivers -- see the next patch. Other drivers return the newly defined AC_ERR_OK. [v2] use enum ata_completion_errors and AC_ERR_OK. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-ide@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-16 09:17:09 +09:00
Krzysztof Kozlowski	3d54e9cf67	dt-bindings: sound: wm8994: Correct required supplies based on actual implementaion [ Upstream commit `8c149b7d75` ] The required supplies in bindings were actually not matching implementation making the bindings incorrect and misleading. The Linux kernel driver requires all supplies to be present. Also for wlf,wm8994 uses just DBVDD-supply instead of DBVDDn-supply (n: <1,3>). Reported-by: Jonathan Bakker <xc-racer2@live.ca> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20200501133534.6706-1-krzk@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 09:16:13 +09:00
Max Staudt	965482de68	affs: fix basic permission bits to actually work [ Upstream commit `d3a84a8d0d` ] The basic permission bits (protection bits in AmigaOS) have been broken in Linux' AFFS - it would only set bits, but never delete them. Also, contrary to the documentation, the Archived bit was not handled. Let's fix this for good, and set the bits such that Linux and classic AmigaOS can coexist in the most peaceful manner. Also, update the documentation to represent the current state of things. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Max Staudt <max@enpas.org> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 09:05:59 +09:00
Tomasz Duszynski	320fc8d2df	iio: improve IIO_CONCENTRATION channel type description [ Upstream commit `df16c33a40` ] IIO_CONCENTRATION together with INFO_RAW specifier is used for reporting raw concentrations of pollutants. Raw value should be meaningless before being properly scaled. Because of that description shouldn't mention raw value unit whatsoever. Fix this by rephrasing existing description so it follows conventions used throughout IIO ABI docs. Fixes: `8ff6b3bc94` ("iio: chemical: Add IIO_CONCENTRATION channel type") Signed-off-by: Tomasz Duszynski <tomasz.duszynski@octakon.com> Acked-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-16 08:46:16 +09:00
secuflag	b8154d6f54	Revert "block: cgroups, kconfig, build bits for BFQ-v7r11-4.5.0" This reverts commit f6532c1fec5fa35626b08dfcc1d5705af96be3e7.	2023-05-16 08:39:45 +09:00
Paolo Valente	48f3ff96f2	block: cgroups, kconfig, build bits for BFQ-v7r11-4.5.0 Update Kconfig.iosched and do the related Makefile changes to include kernel configuration options for BFQ. Also increase the number of policies supported by the blkio controller so that BFQ can add its own. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: Arianna Avanzini <avanzini@google.com> block: introduce the BFQ-v7r11 I/O sched for 4.5.0 The general structure is borrowed from CFQ, as much of the code for handling I/O contexts. Over time, several useful features have been ported from CFQ as well (details in the changelog in README.BFQ). A (bfq_)queue is associated to each task doing I/O on a device, and each time a scheduling decision has to be made a queue is selected and served until it expires. - Slices are given in the service domain: tasks are assigned budgets, measured in number of sectors. Once got the disk, a task must however consume its assigned budget within a configurable maximum time (by default, the maximum possible value of the budgets is automatically computed to comply with this timeout). This allows the desired latency vs "throughput boosting" tradeoff to be set. - Budgets are scheduled according to a variant of WF2Q+, implemented using an augmented rb-tree to take eligibility into account while preserving an O(log N) overall complexity. - A low-latency tunable is provided; if enabled, both interactive and soft real-time applications are guaranteed a very low latency. - Latency guarantees are preserved also in the presence of NCQ. - Also with flash-based devices, a high throughput is achieved while still preserving latency guarantees. - BFQ features Early Queue Merge (EQM), a sort of fusion of the cooperating-queue-merging and the preemption mechanisms present in CFQ. EQM is in fact a unified mechanism that tries to get a sequential read pattern, and hence a high throughput, with any set of processes performing interleaved I/O over a contiguous sequence of sectors. - BFQ supports full hierarchical scheduling, exporting a cgroups interface. Since each node has a full scheduler, each group can be assigned its own weight. - If the cgroups interface is not used, only I/O priorities can be assigned to processes, with ioprio values mapped to weights with the relation weight = IOPRIO_BE_NR - ioprio. - ioprio classes are served in strict priority order, i.e., lower priority queues are not served as long as there are higher priority queues. Among queues in the same class the bandwidth is distributed in proportion to the weight of each queue. A very thin extra bandwidth is however guaranteed to the Idle class, to prevent it from starving. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: Arianna Avanzini <avanzini@google.com> block, bfq: add Early Queue Merge (EQM) to BFQ-v7r11 for 4.5.0 A set of processes may happen to perform interleaved reads, i.e.,requests whose union would give rise to a sequential read pattern. There are two typical cases: in the first case, processes read fixed-size chunks of data at a fixed distance from each other, while in the second case processes may read variable-size chunks at variable distances. The latter case occurs for example with QEMU, which splits the I/O generated by the guest into multiple chunks, and lets these chunks be served by a pool of cooperating processes, iteratively assigning the next chunk of I/O to the first available process. CFQ uses actual queue merging for the first type of rocesses, whereas it uses preemption to get a sequential read pattern out of the read requests performed by the second type of processes. In the end it uses two different mechanisms to achieve the same goal: boosting the throughput with interleaved I/O. This patch introduces Early Queue Merge (EQM), a unified mechanism to get a sequential read pattern with both types of processes. The main idea is checking newly arrived requests against the next request of the active queue both in case of actual request insert and in case of request merge. By doing so, both the types of processes can be handled by just merging their queues. EQM is then simpler and more compact than the pair of mechanisms used in CFQ. Finally, EQM also preserves the typical low-latency properties of BFQ, by properly restoring the weight-raising state of a queue when it gets back to a non-merged state. Signed-off-by: Mauro Andreolini <mauro.andreolini@unimore.it> Signed-off-by: Arianna Avanzini <avanzini@google.com> Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Turn into BFQ-v8r7 for 4.9.0 CHANGELOG from v8r4 to v8r7 BFQ v8r7 BUGFIX: make BFQ compile also without hierarchical support BFQ v8r6 BUGFIX Removed the check that, when the new queue to set in service must be selected, the cached next_in_service entities coincide with the entities chosen by __bfq_lookup_next_entity. This check, issuing a warning on failure, was wrong, because the cached and the newly chosen entity could differ in case of a CLASS_IDLE timeout. EFFICIENCY IMPROVEMENT (this improvement is related to the above BUGFIX) The cached next_in_service entities are now really used to select the next queue to serve when the in-service queue expires. Before this change, the cached values were used only for extra (and in general wrong) consistency checks. This caused additional overhead instead of reducing it. EFFICIENCY IMPROVEMENT The next entity to serve, for each level of the hierarchy, is now updated on every event that may change it, i.e., on every activation or deactivation of any entity. This finer granularity is not strictly needed for corectness, because it is only on queue expirations that BFQ needs to know what are the next entities to serve. Yet this change makes it possible to implement optimizations in which it is necessary to know the next queue to serve before the in-service queue expires. SERVICE-ACCURACY IMPROVEMENT The per-device CLASS_IDLE service timeout has been turned into a much more accurate per-group timeout. CODE-QUALITY IMPROVEMENT The non-trivial parts touched by the above improvements have been partially rewritten, and enriched of comments, so as to improve their transparency and understandability. IMPROVEMENT Ported and improved CFQ commit `41647e7a` Before this improvememtn, BFQ used the same logic for detecting seeky queues for rotational disks and SSDs. This logic is appropriate for the former, as it takes into account only inter-request distance, and the latter is the dominant latency factor on a rotational device. Yet things change with flash-based devices, where serving a large request still yields a high throughput, even the request is far from the previous request served. This commits extends seeky detection to take into accoutn also this fact with flash-based devices. In particular, this commit is an improved port of the original commit `41647e7a` for CFQ. CODE IMPROVEMENT Remove useless parameter from bfq_del_bfqq_busy OPTIMIZATION Optimize the update of next_in_service entity. If the update of the next_in_service candidate entity is triggered by the activation of an entity, then it is not necessary to perform full lookups in the active trees to update next_in_service. In fact, it is enough to check whether the just-activated entity has a higher priority than next_in_service, or, even if it has the same priority as next_in_service, is eligible and has a lower virtual finish time than next_in_service. If this compound condition holds, then the new entity can be set as the new next_in_service. Otherwise no change is needed. This commit implements this optimization. BUGFIX Fix bug causing occasional loss of weight raising. When a bfq_queue, say bfqq, is split after a merging with another bfq_queue, BFQ checks whether it has to restore for bfqq the weight-raising state that bfqq had before being merged. In particular, the weight-raising is restored only if, according to the weight-raising duration decided for bfqq when it started to be weight-raised (before being merged), bfqq would not have already finished its weight-raising period. Yet, by mistake, such a duration was not saved when bfqq is merged. So, if bfqq was freed and reallocated when it was split, then this duration was wrongly set to zero on the split. As a consequence, the weight-raising state of bfqq was wrongly not restored, which caused BFQ to fail in guaranteeing a low latency to bfqq. This commit fixes this bug by saving weight-raising duration when bfqq is merged, and correctly restoring it when bfqq is split. BUGFIX Fix wrong reset of in-service entities In-service entities were reset with an indirect logic, which happened to be even buggy for some cases. This commit fixes this bug in two important steps. First, by replacing this indirect logic with a direct logic, in which all involved entities are immediately reset, with a bubble-up loop, when the in-service queue is reset. Second, by restructuring the code related to this change, so as to become not only correct with respect to this change, but also cleaner and hopefully clearer. CODE IMPROVEMENT Add code to be able to redirect trace log to console. BUGFIX Fixed bug in optimized update of next_in_service entity. There was a case where bfq_update_next_in_service did not update next_in_service, even if it might need to be changed: in case of requeueing or repositioning of the entity that happened to be pointed exactly by next_in_service. This could result in violation of service guarantees, because, after a change of timestamps for such an entity, it might be the case that next_in_service had to point to a different entity. This commit fixes this bug. OPTIMIZATION Stop bubble-up of next_in_service update if possible. BUGFIX Fixed a false-positive warning for uninitialized var BFQ-v8r5 DOCUMENTATION IMPROVEMENT Added documentation of BFQ benefits, inner workings, interface and tunables. BUGFIX: Replaced max wrongly used for modulo numbers. DOCUMENTATION IMPROVEMENT Improved help message in Kconfig.iosched. BUGFIX: Removed wrong conversion in use of bfq_fifo_expire. CODE IMPROVEMENT Added parentheses to complex macros. Signed-off-by: Paolo Valente <paolo.valente@linaro.org>	2023-05-16 08:39:14 +09:00
Jitao Shi	3392a40510	dt-bindings: display: mediatek: control dpi pins mode to avoid leakage [ Upstream commit `b0ff9b5907` ] Add property "pinctrl-names" to swap pin mode between gpio and dpi mode. Set the dpi pins to gpio mode and output-low to avoid leakage current when dpi disabled. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jitao Shi <jitao.shi@mediatek.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 17:33:29 +09:00
Jon Doron	8359038087	x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit [ Upstream commit `f7d31e6536` ] The problem the patch is trying to address is the fact that 'struct kvm_hyperv_exit' has different layout on when compiling in 32 and 64 bit modes. In 64-bit mode the default alignment boundary is 64 bits thus forcing extra gaps after 'type' and 'msr' but in 32-bit mode the boundary is at 32 bits thus no extra gaps. This is an issue as even when the kernel is 64 bit, the userspace using the interface can be both 32 and 64 bit but the same 32 bit userspace has to work with 32 bit kernel. The issue is fixed by forcing the 64 bit layout, this leads to ABI change for 32 bit builds and while we are obviously breaking '32 bit userspace with 32 bit kernel' case, we're fixing the '32 bit userspace with 64 bit kernel' one. As the interface has no (known) users and 32 bit KVM is rather baroque nowadays, this seems like a reasonable decision. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Jon Doron <arilou@gmail.com> Message-Id: <20200424113746.3473563-2-arilou@gmail.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 17:33:24 +09:00
Josh Poimboeuf	d7fac3745e	x86/speculation: Add Ivy Bridge to affected list commit `3798cc4d10` upstream Make the docs match the code. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 17:31:40 +09:00
Mark Gross	f06c53b78c	x86/speculation: Add SRBDS vulnerability and mitigation documentation commit `7222a1b5b8` upstream Add documentation for the SRBDS vulnerability and its mitigation. [ bp: Massage. jpoimboe: sysfs table strings. ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 17:31:39 +09:00
Mark Gross	863d1d7219	x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation commit `7e5b3c267d` upstream SRBDS is an MDS-like speculative side channel that can leak bits from the random number generator (RNG) across cores and threads. New microcode serializes the processor access during the execution of RDRAND and RDSEED. This ensures that the shared buffer is overwritten before it is released for reuse. While it is present on all affected CPU models, the microcode mitigation is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the cases where TSX is not supported or has been disabled with TSX_CTRL. The mitigation is activated by default on affected processors and it increases latency for RDRAND and RDSEED instructions. Among other effects this will reduce throughput from /dev/urandom. * Enable administrator to configure the mitigation off when desired using either mitigations=off or srbds=off. * Export vulnerability status via sysfs * Rename file-scoped macros to apply for non-whitelist table initializations. [ bp: Massage, - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g, - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in, - flip check in cpu_set_bug_bits() to save an indentation level, - reflow comments. jpoimboe: s/Mitigated/Mitigation/ in user-visible strings tglx: Dropped the fused off magic for now ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 17:31:37 +09:00
Asbjørn Sloth Tønnesen	214618a9d4	net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_* commit `47c3e7783b` upstream. PPPOL2TP_MSG_* and L2TP_MSG_* are duplicates, and are being used interchangeably in the kernel, so let's standardize on L2TP_MSG_* internally, and keep PPPOL2TP_MSG_* defined in UAPI for compatibility. Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 17:28:54 +09:00
Madalin Bucur	396909ea4a	dt-bindings: net: FMan erratum A050385 [ Upstream commit `26d5bb9e4c` ] FMAN DMA read or writes under heavy traffic load may cause FMAN internal resource leak; thus stopping further packet processing. The FMAN internal queue can overflow when FMAN splits single read or write transactions into multiple smaller transactions such that more than 17 AXI transactions are in flight from FMAN to interconnect. When the FMAN internal queue overflows, it can stall further packet processing. The issue can occur with any one of the following three conditions: 1. FMAN AXI transaction crosses 4K address boundary (Errata A010022) 2. FMAN DMA address for an AXI transaction is not 16 byte aligned, i.e. the last 4 bits of an address are non-zero 3. Scatter Gather (SG) frames have more than one SG buffer in the SG list and any one of the buffers, except the last buffer in the SG list has data size that is not a multiple of 16 bytes, i.e., other than 16, 32, 48, 64, etc. With any one of the above three conditions present, there is likelihood of stalled FMAN packet processing, especially under stress with multiple ports injecting line-rate traffic. To avoid situations that stall FMAN packet processing, all of the above three conditions must be avoided; therefore, configure the system with the following rules: 1. Frame buffers must not span a 4KB address boundary, unless the frame start address is 256 byte aligned 2. All FMAN DMA start addresses (for example, BMAN buffer address, FD[address] + FD[offset]) are 16B aligned 3. SG table and buffer addresses are 16B aligned and the size of SG buffers are multiple of 16 bytes, except for the last SG buffer that can be of any size. Additional workaround notes: - Address alignment of 64 bytes is recommended for maximally efficient system bus transactions (although 16 byte alignment is sufficient to avoid the stall condition) - To support frame sizes that are larger than 4K bytes, there are two options: 1. Large single buffer frames that span a 4KB page boundary can be converted into SG frames to avoid transaction splits at the 4KB boundary, 2. Align the large single buffer to 256B address boundaries, ensure that the frame address plus offset is 256B aligned. - If software generated SG frames have buffers that are unaligned and with random non-multiple of 16 byte lengths, before transmitting such frames via FMAN, frames will need to be copied into a new single buffer or multiple buffer SG frame that is compliant with the three rules listed above. Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 17:11:35 +09:00
Jean Delvare	84e401093e	ACPI: watchdog: Allow disabling WDAT at boot [ Upstream commit `3f9e12e0df` ] In case the WDAT interface is broken, give the user an option to ignore it to let a native driver bind to the watchdog device instead. Signed-off-by: Jean Delvare <jdelvare@suse.de> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 17:10:17 +09:00
Al Viro	6ee59f4bd9	cifs_atomic_open(): fix double-put on late allocation failure commit `d9a9f4849f` upstream. several iterations of ->atomic_open() calling conventions ago, we used to need fput() if ->atomic_open() failed at some point after successful finish_open(). Now (since 2016) it's not needed - struct file carries enough state to make fput() work regardless of the point in struct file lifecycle and discarding it on failure exits in open() got unified. Unfortunately, I'd missed the fact that we had an instance of ->atomic_open() (cifs one) that used to need that fput(), as well as the stale comment in finish_open() demanding such late failure handling. Trivially fixed... Fixes: `fe9ec8291f` "do_last(): take fput() on error after opening to out:" Cc: stable@kernel.org # v4.7+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 17:09:26 +09:00
Jeremy Linton	bc0842a5b1	Documentation: Document arm64 kpti control commit `de19055564` upstream. For a while Arm64 has been capable of force enabling or disabling the kpti mitigations. Lets make sure the documentation reflects that. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> [florian: patch the correct file] Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 16:48:07 +09:00
Alexander Usyskin	7b2e54a46b	mei: fix modalias documentation commit `7366830921` upstream. mei client bus added the client protocol version to the device alias, but ABI documentation was not updated. Fixes: `b26864cad1` (mei: bus: add client protocol version to the device alias) Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20191008005735.12707-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 16:34:21 +09:00
Baruch Siach	6f24d161e7	rtc: dt-binding: abx80x: fix resistance scale [ Upstream commit `73852e5682` ] The abracon,tc-resistor property value is in kOhm. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 16:09:21 +09:00
Peter Hutterer	0578341be5	HID: doc: fix wrong data structure reference for UHID_OUTPUT [ Upstream commit `46b14eef59` ] Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 16:06:45 +09:00
Waiman Long	8dcb6348d4	x86/speculation: Fix incorrect MDS/TAA mitigation status commit `64870ed1b1` upstream. For MDS vulnerable processors with TSX support, enabling either MDS or TAA mitigations will enable the use of VERW to flush internal processor buffers at the right code path. IOW, they are either both mitigated or both not. However, if the command line options are inconsistent, the vulnerabilites sysfs files may not report the mitigation status correctly. For example, with only the "mds=off" option: vulnerabilities/mds:Vulnerable; SMT vulnerable vulnerabilities/tsx_async_abort:Mitigation: Clear CPU buffers; SMT vulnerable The mds vulnerabilities file has wrong status in this case. Similarly, the taa vulnerability file will be wrong with mds mitigation on, but taa off. Change taa_select_mitigation() to sync up the two mitigation status and have them turned off if both "mds=off" and "tsx_async_abort=off" are present. Update documentation to emphasize the fact that both "mds=off" and "tsx_async_abort=off" have to be specified together for processors that are affected by both TAA and MDS to be effective. [ bp: Massage and add kernel-parameters.txt change too. ] Fixes: `1b42f01741` ("x86/speculation/taa: Add mitigation for TSX Async Abort") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-doc@vger.kernel.org Cc: Mark Gross <mgross@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191115161445.30809-2-longman@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:19:23 +09:00
Gomez Iglesias, Antonio	07ffb9aa13	Documentation: Add ITLB_MULTIHIT documentation commit `7f00cc8d4a` upstream. Add the initial ITLB_MULTIHIT documentation. [ tglx: Add it to the index so it gets actually built. ] Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com> Signed-off-by: Nelson D'Souza <nelson.dsouza@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bwh: Backported to 4.9: adjust filenames] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:06:53 +09:00
Junaid Shahid	a74d6720b3	kvm: x86: mmu: Recovery of shattered NX large pages commit `1aa9b9572b` upstream. The page table pages corresponding to broken down large pages are zapped in FIFO order, so that the large page can potentially be recovered, if it is not longer being used for execution. This removes the performance penalty for walking deeper EPT page tables. By default, one large page will last about one hour once the guest reaches a steady state. Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bwh: Backported to 4.9: - Update another error path in kvm_create_vm() to use out_err_no_mmu_notifier - Adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:06:51 +09:00
Paolo Bonzini	2f811b78a3	kvm: mmu: ITLB_MULTIHIT mitigation commit `b8e8c8303f` upstream. With some Intel processors, putting the same virtual address in the TLB as both a 4 KiB and 2 MiB page can confuse the instruction fetch unit and cause the processor to issue a machine check resulting in a CPU lockup. Unfortunately when EPT page tables use huge pages, it is possible for a malicious guest to cause this situation. Add a knob to mark huge pages as non-executable. When the nx_huge_pages parameter is enabled (and we are using EPT), all huge pages are marked as NX. If the guest attempts to execute in one of those pages, the page is broken down into 4K pages, which are then marked executable. This is not an issue for shadow paging (except nested EPT), because then the host is in control of TLB flushes and the problematic situation cannot happen. With nested EPT, again the nested guest can cause problems shadow and direct EPT is treated in the same way. [ tglx: Fixup default to auto and massage wording a bit ] Originally-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bwh: Backported to 4.9: - Use kvm_mmu_invalidate_zap_all_pages() instead of kvm_mmu_zap_all_fast() - Don't provide mode for nx_largepages_splitted as all stats are read-only - Adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:06:49 +09:00
Vineela Tummalapalli	6007f9c891	x86/bugs: Add ITLB_MULTIHIT bug infrastructure commit `db4d30fbb7` upstream. Some processors may incur a machine check error possibly resulting in an unrecoverable CPU lockup when an instruction fetch encounters a TLB multi-hit in the instruction TLB. This can occur when the page size is changed along with either the physical address or cache type. The relevant erratum can be found here: https://bugzilla.kernel.org/show_bug.cgi?id=205195 There are other processors affected for which the erratum does not fully disclose the impact. This issue affects both bare-metal x86 page tables and EPT. It can be mitigated by either eliminating the use of large pages or by using careful TLB invalidations when changing the page size in the page tables. Just like Spectre, Meltdown, L1TF and MDS, a new bit has been allocated in MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will be set on CPUs which are mitigated against this issue. Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com> Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bwh: Backported to 4.9: - No support for X86_VENDOR_HYGON, ATOM_AIRMONT_NP - Adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:06:46 +09:00
Junaid Shahid	0ab7a72ed2	kvm: Convert kvm_lock to a mutex commit `0d9ce162cf` upstream. It doesn't seem as if there is any particular need for kvm_lock to be a spinlock, so convert the lock to a mutex so that sleepable functions (in particular cond_resched()) can be called while holding it. Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [bwh: Backported to 4.9: - Drop changes in kvm_hyperv_tsc_notifier(), vm_stat_clear(), vcpu_stat_clear(), kvm_uevent_notify_change() - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:06:34 +09:00
Pawan Gupta	65c4ee1182	x86/speculation/taa: Add documentation for TSX Async Abort commit `a7a248c593` upstream. Add the documenation for TSX Async Abort. Include the description of the issue, how to check the mitigation state, control the mitigation, guidance for system administrators. [ bp: Add proper SPDX tags, touch ups by Josh and me. ] Co-developed-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Mark Gross <mgross@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> [bwh: Backported to 4.9: adjust filenames, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:06:05 +09:00
Pawan Gupta	2eb5529b5f	x86/tsx: Add "auto" option to the tsx= cmdline parameter commit `7531a3596e` upstream. Platforms which are not affected by X86_BUG_TAA may want the TSX feature enabled. Add "auto" option to the TSX cmdline parameter. When tsx=auto disable TSX when X86_BUG_TAA is present, otherwise enable TSX. More details on X86_BUG_TAA can be found here: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html [ bp: Extend the arg buffer to accommodate "auto\0". ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> [bwh: Backported to 4.9: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:06:03 +09:00
Pawan Gupta	fc4206adb0	x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default commit `95c5824f75` upstream. Add a kernel cmdline parameter "tsx" to control the Transactional Synchronization Extensions (TSX) feature. On CPUs that support TSX control, use "tsx=on\|off" to enable or disable TSX. Not specifying this option is equivalent to "tsx=off". This is because on certain processors TSX may be used as a part of a speculative side channel attack. Carve out the TSX controlling functionality into a separate compilation unit because TSX is a CPU feature while the TSX async abort control machinery will go to cpu/bugs.c. [ bp: - Massage, shorten and clear the arg buffer. - Clarifications of the tsx= possible options - Josh. - Expand on TSX_CTRL availability - Pawan. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> [bwh: Backported to 4.9: adjust filenames, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 15:05:57 +09:00
Bastien Nocera	dc2b3fd6ef	USB: rio500: Remove Rio 500 kernel driver commit `015664d152` upstream. The Rio500 kernel driver has not been used by Rio500 owners since 2001 not long after the rio500 project added support for a user-space USB stack through the very first versions of usbdevfs and then libusb. Support for the kernel driver was removed from the upstream utilities in 2008: `943f624ab7` Cc: Cesar Miquel <miquel@df.uba.ar> Signed-off-by: Bastien Nocera <hadess@hadess.net> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/6251c17584d220472ce882a3d9c199c401a51a71.camel@hadess.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 14:48:20 +09:00
Tom Lendacky	743a08a67f	x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h [ Upstream commit `c49a0a8013` ] There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly. RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND support using CPUID, including the kernel, will believe that RDRAND is not supported. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue or that the system will not be placed in suspend, the "rdrand=force" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit. Additionally, update the suspend and resume path to save and restore the MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in place after resuming from suspend. Note, that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chen Yu <yu.c.chen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org> Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "x86@kernel.org" <x86@kernel.org> Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com [sl: adjust context in docs] Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 14:21:25 +09:00
Jason A. Donenfeld	2edb579f2b	siphash: implement HalfSipHash1-3 for hash tables commit `1ae2324f73` upstream. HalfSipHash, or hsiphash, is a shortened version of SipHash, which generates 32-bit outputs using a weaker 64-bit key. It has much lower security margins, and shouldn't be used for anything too sensitive, but it could be used as a hashtable key function replacement, if the output is never exposed, and if the security requirement is not too high. The goal is to make this something that performance-critical jhash users would be willing to use. On 64-bit machines, HalfSipHash1-3 is slower than SipHash1-3, so we alias SipHash1-3 to HalfSipHash1-3 on those systems. 64-bit x86_64: [ 0.509409] test_siphash: SipHash2-4 cycles: 4049181 [ 0.510650] test_siphash: SipHash1-3 cycles: 2512884 [ 0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920 [ 0.512904] test_siphash: JenkinsHash cycles: 978267 So, we map hsiphash() -> SipHash1-3 32-bit x86: [ 0.509868] test_siphash: SipHash2-4 cycles: 14812892 [ 0.513601] test_siphash: SipHash1-3 cycles: 9510710 [ 0.515263] test_siphash: HalfSipHash1-3 cycles: 3856157 [ 0.515952] test_siphash: JenkinsHash cycles: 1148567 So, we map hsiphash() -> HalfSipHash1-3 hsiphash() is roughly 3 times slower than jhash(), but comes with a considerable security improvement. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.9 to avoid regression for WireGuard with only half the siphash API present] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 14:18:14 +09:00
Jason A. Donenfeld	29418ea093	siphash: add cryptographically secure PRF commit `2c956a6077` upstream. SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then preceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a modicum of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. While SipHash is extremely fast for a cryptographically secure function, it is likely a bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 or SHA1 for creating secure sequence numbers, syn cookies, port numbers, or fast random numbers. SipHash is a faster and more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 and SHA1 with SipHash for these uses is obvious and straight-forward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch-up. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.9 as dependency of commits `df453700e8` "inet: switch IP ID generator to siphash" and `3c79107631` "netfilter: ctnetlink: don't use conntrack/expect object addresses as id"] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 14:18:13 +09:00
Daniel Borkmann	7bb4d804f2	bpf: add bpf_jit_limit knob to restrict unpriv allocations commit `ede95a63b5` upstream. Rick reported that the BPF JIT could potentially fill the entire module space with BPF programs from unprivileged users which would prevent later attempts to load normal kernel modules or privileged BPF programs, for example. If JIT was enabled but unsuccessful to generate the image, then before commit `290af86629` ("bpf: introduce BPF_JIT_ALWAYS_ON config") we would always fall back to the BPF interpreter. Nowadays in the case where the CONFIG_BPF_JIT_ALWAYS_ON could be set, then the load will abort with a failure since the BPF interpreter was compiled out. Add a global limit and enforce it for unprivileged users such that in case of BPF interpreter compiled out we fail once the limit has been reached or we fall back to BPF interpreter earlier w/o using module mem if latter was compiled in. In a next step, fair share among unprivileged users can be resolved in particular for the case where we would fail hard once limit is reached. Fixes: `290af86629` ("bpf: introduce BPF_JIT_ALWAYS_ON config") Fixes: `0a14842f5a` ("net: filter: Just In Time compiler for x86-64") Co-Developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: LKML <linux-kernel@vger.kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 14:18:02 +09:00
Josh Poimboeuf	84e3585001	x86/speculation: Enable Spectre v1 swapgs mitigations commit `a205982598` upstream. The previous commit added macro calls in the entry code which mitigate the Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are enabled. Enable those features where applicable. The mitigations may be disabled with "nospectre_v1" or "mitigations=off". There are different features which can affect the risk of attack: - When FSGSBASE is enabled, unprivileged users are able to place any value in GS, using the wrgsbase instruction. This means they can write a GS value which points to any value in kernel space, which can be useful with the following gadget in an interrupt/exception/NMI handler: if (coming from user space) swapgs mov %gs:<percpu_offset>, %reg1 // dependent load or store based on the value of %reg // for example: mov %(reg1), %reg2 If an interrupt is coming from user space, and the entry code speculatively skips the swapgs (due to user branch mistraining), it may speculatively execute the GS-based load and a subsequent dependent load or store, exposing the kernel data to an L1 side channel leak. Note that, on Intel, a similar attack exists in the above gadget when coming from kernel space, if the swapgs gets speculatively executed to switch back to the user GS. On AMD, this variant isn't possible because swapgs is serializing with respect to future GS-based accesses. NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case doesn't exist quite yet. - When FSGSBASE is disabled, the issue is mitigated somewhat because unprivileged users must use prctl(ARCH_SET_GS) to set GS, which restricts GS values to user space addresses only. That means the gadget would need an additional step, since the target kernel address needs to be read from user space first. Something like: if (coming from user space) swapgs mov %gs:<percpu_offset>, %reg1 mov (%reg1), %reg2 // dependent load or store based on the value of %reg2 // for example: mov %(reg2), %reg3 It's difficult to audit for this gadget in all the handlers, so while there are no known instances of it, it's entirely possible that it exists somewhere (or could be introduced in the future). Without tooling to analyze all such code paths, consider it vulnerable. Effects of SMAP on the !FSGSBASE case: - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not susceptible to Meltdown), the kernel is prevented from speculatively reading user space memory, even L1 cached values. This effectively disables the !FSGSBASE attack vector. - If SMAP is enabled, but the CPU is susceptible to Meltdown, SMAP still prevents the kernel from speculatively reading user space memory. But it does not prevent the kernel from reading the user value from L1, if it has already been cached. This is probably only a small hurdle for an attacker to overcome. Thanks to Dave Hansen for contributing the speculative_smap() function. Thanks to Andrew Cooper for providing the inside scoop on whether swapgs is serializing on AMD. [ tglx: Fixed the USER fence decision and polished the comment as suggested by Dave Hansen ] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> [bwh: Backported to 4.9: - Check for X86_FEATURE_KAISER instead of X86_FEATURE_PTI - mitigations= parameter is x86-only here - Adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 14:16:29 +09:00
allen yan	0fa0c20532	arm64: dts: marvell: Fix A37xx UART0 register size commit `c737abc193` upstream. Armada-37xx UART0 registers are 0x200 bytes wide. Right next to them are the UART1 registers that should not be declared in this node. Update the example in DT bindings document accordingly. Signed-off-by: allen yan <yanwei@marvell.com> Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 14:08:38 +09:00
Sean Nyekjaer	51301e3f50	dt-bindings: can: mcp251x: add mcp25625 support [ Upstream commit `0df82dcd55` ] Fully compatible with mcp2515, the mcp25625 have integrated transceiver. This patch add the mcp25625 to the device tree bindings documentation. Signed-off-by: Sean Nyekjaer <sean@geanix.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-15 13:58:00 +09:00
Eric Dumazet	5f5387eb3e	tcp: add tcp_min_snd_mss sysctl commit `5f3e2bf008` upstream. Some TCP peers announce a very small MSS option in their SYN and/or SYN/ACK messages. This forces the stack to send packets with a very high network/cpu overhead. Linux has enforced a minimal value of 48. Since this value includes the size of TCP options, and that the options can consume up to 40 bytes, this means that each segment can include only 8 bytes of payload. In some cases, it can be useful to increase the minimal value to a saner value. We still let the default to 48 (TCP_MIN_SND_MSS), for compatibility reasons. Note that TCP_MAXSEG socket option enforces a minimal value of (TCP_MIN_MSS). David Miller increased this minimal value in commit `c39508d6f1` ("tcp: Make TCP_MAXSEG minimum more correct.") from 64 to 88. We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS. CVE-2019-11479 -- tcp mss hardcoded to 48 Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Jonathan Looney <jtl@netflix.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Bruce Curtis <brucec@netflix.com> Cc: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 13:46:55 +09:00
Jonathan Corbet	53720fdee2	docs: Fix conf.py for Sphinx 2.0 commit `3bc8088464` upstream. Our version check in Documentation/conf.py never envisioned a world where Sphinx moved beyond 1.x. Now that the unthinkable has happened, fix our version check to handle higher version numbers correctly. Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 13:40:23 +09:00
Andy Lutomirski	c1fca2ce90	x86/speculation/mds: Improve CPU buffer clear documentation commit `9d8d0294e7` upstream. On x86_64, all returns to usermode go through prepare_exit_to_usermode(), with the sole exception of do_nmi(). This even includes machine checks -- this was added several years ago to support MCE recovery. Update the documentation. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jon Masters <jcm@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: `04dcbdb805` ("x86/speculation/mds: Clear CPU buffers on exit to user") Link: http://lkml.kernel.org/r/999fa9e126ba6a48e9d214d2f18dbde5c62ac55c.1557865329.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-15 12:50:59 +09:00

1 2 3 4 5 ...

27842 Commits