* aosp/android14-6.1:
BACKPORT: genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
UPSTREAM: genirq/cpuhotplug: Retry with cpu_online_mask when migration fails
UPSTREAM: genirq/cpuhotplug: Skip suspended interrupts when restoring affinity
UPSTREAM: scsi: ufs: core: Perform read back after writing UTP_TASK_REQ_LIST_BASE_H
UPSTREAM: mm/userfaultfd: UFFDIO_MOVE implementation should use ptep_get()
UPSTREAM: f2fs: fix NULL pointer dereference in f2fs_submit_page_write()
UPSTREAM: f2fs: zone: fix to wait completion of last bio in zone correctly
UPSTREAM: iommu: Don't reserve 0-length IOVA region
UPSTREAM: iommu: Fix printk arg in of_iommu_get_resv_regions()
UPSTREAM: iommu: Map reserved memory as cacheable if device is coherent
BACKPORT: io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval
UPSTREAM: wifi: cfg80211: fix assoc response warning on failed links
UPSTREAM: usb: typec: tcpm: Add additional checks for contaminant
UPSTREAM: mm: remove duplicated vma->vm_flags check when expanding stack
ANDROID: Update the ABI symbol list
ANDROID: add vendor hook for mapping_shrinkable
FROMLIST: sd: Retry START STOP UNIT commands
FROMLIST: scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set
ANDROID: GKI: Update symbol list for mtk
ANDROID: Update the ABI symbol list
ANDROID: GKI: add symbol list for telechips
UPSTREAM: ufs: core: bypass quick recovery if need force reset
ANDROID: Update the ABI symbol list
UPSTREAM: ring-buffer: Fix a race between readers and resize checks
UPSTREAM: lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat
UPSTREAM: lib/build_OID_registry: don't mention the full path of the script in output
ANDROID: GKI: Add symbol list for exynosauto
UPSTREAM: erofs: fix race in z_erofs_get_gbuf()
ANDROID: fuse: Skip canonical path logic if ENOSYS
ANDROID: fsnotify: Do not notify lower fs of open when ENOSYS
ANDROID: mm: madvise: Avoid counting swap entry references for migration entries
UPSTREAM: scsi: ufs: core: Changing the status to check inflight
ANDROID: Update the ABI symbol list
ANDROID: power: add vendor hook to handle hibernate failures
ANDROID: GKI: remove export of tracing control functions
ANDROID: fix kernelci build-break for !CONFIG_ANDROID_VENDOR_OEM_DATA
ANDROID: GKI: update symbol list file for xiaomi
ANDROID: vendor_hooks: vendor hooks for optimizing the blocking problem caused by rwsem lock contention of reverse mapping during memory recycling.
UPSTREAM: of: reserved_mem: Use proper binary prefix
BACKPORT: gpio: of: support gpio-ranges for multiple gpiochip devices
UPSTREAM: dt-bindings: gpio: brcmstb: add gpio-ranges
ANDROID: Incremental fs: Retry page faults on non-fatal errors
ANDROID: GKI: Update symbol list for BCMSTB
ANDROID: GKI: Update rockchip symbols for dw_hdmi_qp.
UPSTREAM: net: usb: ax88179_178a: improve reset check
UPSTREAM: net: usb: ax88179_178a: fix link status when link is set to down/up
UPSTREAM: usb: gadget: configfs: Prevent OOB read/write in usb_string_copy()
UPSTREAM: usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps
UPSTREAM: usb: typec: tcpm: unregister existing source caps before re-registration
ANDROID: GKI: Update symbol list for mtk
Change-Id: I1b7876def61c1f0a09ec7599aeb64fd2de911619
Signed-off-by: Will McVicker <willmcvicker@google.com>
The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
interrupt affinity reconfiguration via procfs. Instead, the change is
deferred until the next instance of the interrupt being triggered on the
original CPU.
When the interrupt next triggers on the original CPU, the new affinity is
enforced within __irq_move_irq(). A vector is allocated from the new CPU,
but the old vector on the original CPU remains and is not immediately
reclaimed. Instead, apicd->move_in_progress is flagged, and the reclaiming
process is delayed until the next trigger of the interrupt on the new CPU.
Upon the subsequent triggering of the interrupt on the new CPU,
irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
remains online. Subsequently, the timer on the old CPU iterates over its
vector_cleanup list, reclaiming old vectors.
However, a rare scenario arises if the old CPU is outgoing before the
interrupt triggers again on the new CPU.
In that case irq_force_complete_move() is not invoked on the outgoing CPU
to reclaim the old apicd->prev_vector because the interrupt isn't currently
affine to the outgoing CPU, and irq_needs_fixup() returns false. Even
though __vector_schedule_cleanup() is later called on the new CPU, it
doesn't reclaim apicd->prev_vector; instead, it simply resets both
apicd->move_in_progress and apicd->prev_vector to 0.
As a result, the vector remains unreclaimed in vector_matrix, leading to a
CPU vector leak.
To address this issue, move the invocation of irq_force_complete_move()
before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
interrupt is currently or used to be affine to the outgoing CPU.
Additionally, reclaim the vector in __vector_schedule_cleanup() as well,
following a warning message, although theoretically it should never see
apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.
Fixes: f0383c24b4 ("genirq/cpuhotplug: Add support for cleaning up move in progress")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240522220218.162423-1-dongli.zhang@oracle.com
Bug: 359158960
Change-Id: Ib9a564add4cd1c90ec725ab7f0bdc0db505ae29f
(cherry picked from commit a6c11c0a5235fb144a65e0cb2ffd360ddc1f6c32)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
When a CPU goes offline, the interrupts affine to that CPU are
re-configured.
Managed interrupts undergo either migration to other CPUs or shutdown if
all CPUs listed in the affinity are offline. The migration of managed
interrupts is guaranteed on x86 because there are interrupt vectors
reserved.
Regular interrupts are migrated to a still online CPU in the affinity mask
or if there is no online CPU to any online CPU.
This works as long as the still online CPUs in the affinity mask have
interrupt vectors available, but in case that none of those CPUs has a
vector available the migration fails and the device interrupt becomes
stale.
This is not any different from the case where the affinity mask does not
contain any online CPU, but there is no fallback operation for this.
Instead of giving up, retry the migration attempt with the online CPU mask
if the interrupt is not managed, as managed interrupts cannot be affected
by this problem.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240423073413.79625-1-dongli.zhang@oracle.com
Bug: 359158960
Change-Id: I3e8acfe0598b0cb94389115d680d2216833d6a0c
(cherry picked from commit 88d724e2301a69c1ab805cd74fc27aa36ae529e0)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
irq_restore_affinity_of_irq() restarts managed interrupts unconditionally
when the first CPU in the affinity mask comes online. That's correct during
normal hotplug operations, but not when resuming from S3 because the
drivers are not resumed yet and interrupt delivery is not expected by them.
Skip the startup of suspended interrupts and let resume_device_irqs() deal
with restoring them. This ensures that irqs are not delivered to drivers
during the noirq phase of resuming from S3, after non-boot CPUs are brought
back online.
Signed-off-by: David Stevens <stevensd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240424090341.72236-1-stevensd@chromium.org
Bug: 359158960
Change-Id: Ie5610f7dffe05a141e6db1a8b6b067845c910b4a
(cherry picked from commit a60dd06af674d3bb76b40da5d722e4a0ecefe650)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
commit fa0db8e568787c665384430eaf2221b299b85367 upstream.
This reverts commit 28ab9769117ca944cb6eb537af5599aa436287a4.
Sense data can be in either fixed format or descriptor format.
SAT-6 revision 1, "10.4.6 Control mode page", defines the D_SENSE bit:
"The SATL shall support this bit as defined in SPC-5 with the following
exception: if the D_ SENSE bit is set to zero (i.e., fixed format sense
data), then the SATL should return fixed format sense data for ATA
PASS-THROUGH commands."
The libata SATL has always kept D_SENSE set to zero by default. (It is
however possible to change the value using a MODE SELECT SG_IO command.)
Failed ATA PASS-THROUGH commands correctly respected the D_SENSE bit,
however, successful ATA PASS-THROUGH commands incorrectly returned the
sense data in descriptor format (regardless of the D_SENSE bit).
Commit 28ab9769117c ("ata: libata-scsi: Honor the D_SENSE bit for
CK_COND=1 and no error") fixed this bug for successful ATA PASS-THROUGH
commands.
However, after commit 28ab9769117c ("ata: libata-scsi: Honor the D_SENSE
bit for CK_COND=1 and no error"), there were bug reports that hdparm,
hddtemp, and udisks were no longer working as expected.
These applications incorrectly assume the returned sense data is in
descriptor format, without even looking at the RESPONSE CODE field in the
returned sense data (to see which format the returned sense data is in).
Considering that there will be broken versions of these applications around
roughly forever, we are stuck with being bug compatible with older kernels.
Cc: stable@vger.kernel.org # 4.19+
Reported-by: Stephan Eisvogel <eisvogel@seitics.de>
Reported-by: Christian Heusel <christian@heusel.eu>
Closes: https://lore.kernel.org/linux-ide/0bf3f2f0-0fc6-4ba5-a420-c0874ef82d64@heusel.eu/
Fixes: 28ab9769117c ("ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no error")
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240813131900.1285842-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c84bde4f37ba27d50e4c70ecacd33fe4a57030d upstream.
This reverts commit 2052138b7da52ad5ccaf74f736d00f39a1c9198c.
This breaks the TeVii s480 dual DVB-S2 S660. The device has a bulk in
endpoint but no corresponding out endpoint, so the device does not pass
the "has both receive and send bulk endpoint" test.
Seemingly this device does not use dvb_usb_generic_rw() so I have tried
removing the generic_bulk_ctrl_endpoint entry, but this resulted in
different problems.
As we have no explanation yet, revert.
$ dmesg | grep -i -e dvb -e dw21 -e usb\ 4
[ 0.999122] usb 1-1: new high-speed USB device number 2 using ehci-pci
[ 1.023123] usb 4-1: new high-speed USB device number 2 using ehci-pci
[ 1.130247] usb 1-1: New USB device found, idVendor=9022, idProduct=d482,
+bcdDevice= 0.01
[ 1.130257] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 1.152323] usb 4-1: New USB device found, idVendor=9022, idProduct=d481,
+bcdDevice= 0.01
[ 1.152329] usb 4-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 6.701033] dvb-usb: found a 'TeVii S480.2 USB' in cold state, will try to
+load a firmware
[ 6.701178] dvb-usb: downloading firmware from file 'dvb-usb-s660.fw'
[ 6.701179] dw2102: start downloading DW210X firmware
[ 6.703715] dvb-usb: found a 'Microsoft Xbox One Digital TV Tuner' in cold
+state, will try to load a firmware
[ 6.703974] dvb-usb: downloading firmware from file 'dvb-usb-dib0700-1.20.fw'
[ 6.756432] usb 1-1: USB disconnect, device number 2
[ 6.862119] dvb-usb: found a 'TeVii S480.2 USB' in warm state.
[ 6.862194] dvb-usb: TeVii S480.2 USB error while loading driver (-22)
[ 6.862209] dvb-usb: found a 'TeVii S480.1 USB' in cold state, will try to
+load a firmware
[ 6.862244] dvb-usb: downloading firmware from file 'dvb-usb-s660.fw'
[ 6.862245] dw2102: start downloading DW210X firmware
[ 6.914811] usb 4-1: USB disconnect, device number 2
[ 7.014131] dvb-usb: found a 'TeVii S480.1 USB' in warm state.
[ 7.014487] dvb-usb: TeVii S480.1 USB error while loading driver (-22)
[ 7.014538] usbcore: registered new interface driver dw2102
Closes: https://lore.kernel.org/stable/20240801165146.38991f60@mir/
Fixes: 2052138b7da5 ("media: dvb-usb: Fix unexpected infinite loop in dvb_usb_read_remote_control()")
Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7fb0423c201ba12815877a0b5a68a6a1710b23a upstream.
Commit d23b5c577715 ("cgroup: Make operations on the cgroup root_list RCU
safe") adds a new rcu_head to the cgroup_root structure and kvfree_rcu()
for freeing the cgroup_root.
The current implementation of kvfree_rcu(), however, has the limitation
that the offset of the rcu_head structure within the larger data
structure must be less than 4096 or the compilation will fail. See the
macro definition of __is_kvfree_rcu_offset() in include/linux/rcupdate.h
for more information.
By putting rcu_head below the large cgroup structure, any change to the
cgroup structure that makes it larger run the risk of causing build
failure under certain configurations. Commit 77070eeb8821 ("cgroup:
Avoid false cacheline sharing of read mostly rstat_cpu") happens to be
the last straw that breaks it. Fix this problem by moving the rcu_head
structure up before the cgroup structure.
Fixes: d23b5c577715 ("cgroup: Make operations on the cgroup root_list RCU safe")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/lkml/20231207143806.114e0a74@canb.auug.org.au/
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3eb3cd5992f7a0c37edc8d05b4c38c98758d8671 ]
Commit 04d82a6d08 ("binfmt_flat: allow not offsetting data start")
introduced a RISC-V specific variant of the FLAT format which does
not allocate any space for the (obsolete) array of shared library
pointers. However, it did not disable the code which initializes the
array, resulting in the corruption of sizeof(long) bytes before the DATA
segment, generally the end of the TEXT segment.
Introduce MAX_SHARED_LIBS_UPDATE which depends on the state of
CONFIG_BINFMT_FLAT_NO_DATA_START_OFFSET to guard the initialization of
the shared library pointer region so that it will only be initialized
if space is reserved for it.
Fixes: 04d82a6d08 ("binfmt_flat: allow not offsetting data start")
Co-developed-by: Stefan O'Rear <sorear@fastmail.com>
Signed-off-by: Stefan O'Rear <sorear@fastmail.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Link: https://lore.kernel.org/r/20240807195119.it.782-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit d23b5c577715892c87533b13923306acc6243f93 upstream.
At present, when we perform operations on the cgroup root_list, we must
hold the cgroup_mutex, which is a relatively heavyweight lock. In reality,
we can make operations on this list RCU-safe, eliminating the need to hold
the cgroup_mutex during traversal. Modifications to the list only occur in
the cgroup root setup and destroy paths, which should be infrequent in a
production environment. In contrast, traversal may occur frequently.
Therefore, making it RCU-safe would be beneficial.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d67c5649c1541dc93f202eeffc6f49220a4ed71d upstream.
Before this patch, receiving an ADD_ADDR echo on the just connected
MP_JOIN subflow -- initiator side, after the MP_JOIN 3WHS -- was
resulting in an MP_RESET. That's because only ACKs with a DSS or
ADD_ADDRs without the echo bit were allowed.
Not allowing the ADD_ADDR echo after an MP_CAPABLE 3WHS makes sense, as
we are not supposed to send an ADD_ADDR before because it requires to be
in full established mode first. For the MP_JOIN 3WHS, that's different:
the ADD_ADDR can be sent on a previous subflow, and the ADD_ADDR echo
can be received on the recently created one. The other peer will already
be in fully established, so it is allowed to send that.
We can then relax the conditions here to accept the ADD_ADDR echo for
MPJ subflows.
Fixes: 67b12f792d ("mptcp: full fully established support after ADD_ADDR")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-1-c8a9b036493b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ Conflicts in options.c, because the context has changed in commit
b3ea6b272d ("mptcp: consolidate initial ack seq generation"), which
is not in this version. This commit is unrelated to this
modification. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 16fb9808ab2c99979f081987752abcbc5b092eac ]
The final bit of stats that is global is the rpc svc_stat. Move this
into the nfsd_net struct and use that everywhere instead of the global
struct. Remove the unused global struct.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e41ee44cc6a473b1f414031782c3b4283d7f3e5f ]
This is the last global stat, take it out of the nfsd_stats struct and
make it a global part of nfsd, report it the same as always.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b14885411f74b2b0ce0eb2b39d0fffe54e5ca0d ]
We have a global set of counters that we modify for all of the nfsd
operations, but now that we're exposing these stats across all network
namespaces we need to make the stats also be per-network namespace. We
already have some caching stats that are per-network namespace, so move
these definitions into the same counter and then adjust all the helpers
and users of these stats to provide the appropriate nfsd_net struct so
that the stats are maintained for the per-network namespace objects.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
[ cel: adjusted to apply to v6.1.y ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 93483ac5fec62cc1de166051b219d953bb5e4ef4 ]
We are running nfsd servers inside of containers with their own network
namespace, and we want to monitor these services using the stats found
in /proc. However these are not exposed in the proc inside of the
container, so we have to bind mount the host /proc into our containers
to get at this information.
Separate out the stat counters init and the proc registration, and move
the proc registration into the pernet operations entry and exit points
so that these stats can be exposed inside of network namespaces.
This is an intermediate step, this just exposes the global counters in
the network namespace. Subsequent patches will move these counters into
the per-network namespace container.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d98416cc2154053950610bb6880911e3dcbdf8c5 ]
We're going to merge the stats all into per network namespace in
subsequent patches, rename these nn counters to be consistent with the
rest of the stats.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 418b9687dece5bd763c09b5c27a801a7e3387be9 ]
nfsd is the only thing using this helper, and it doesn't use the private
currently. When we switch to per-network namespace stats we will need
the struct net * in order to get to the nfsd_net. Use the net as the
proc private so we can utilize this when we make the switch over.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3f6ef182f144dcc9a4d942f97b6a8ed969f13c95 ]
Now that this isn't used anywhere, remove it.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
[ cel: adjusted to apply to v6.1.y ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f094323867668d50124886ad884b665de7319537 ]
Since only one service actually reports the rpc stats there's not much
of a reason to have a pointer to it in the svc_program struct. Adjust
the svc_create_pooled function to take the sv_stats as an argument and
pass the struct through there as desired instead of getting it from the
svc_program->pg_stats.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
[ cel: adjusted to apply to v6.1.y ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a2214ed588fb3c5b9824a21cff870482510372bb ]
A lot of places are setting a blank svc_stats in ->pg_stats and never
utilizing these stats. Remove all of these extra structs as we're not
reporting these stats anywhere.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ab42f4d9a26f1723dcfd6c93fcf768032b2bb5e7 ]
We check for the existence of ->sv_stats elsewhere except in the core
processing code. It appears that only nfsd actual exports these values
anywhere, everybody else just has a write only copy of sv_stats in their
svc_program. Add a check for ->sv_stats before every adjustment to
allow us to eliminate the stats struct from all the users who don't
report the stats.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
[ cel: adjusted to apply to v6.1.y ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6939ace1f22681fface7841cdbf34d3204cc94b5 ]
fs/nfsd/export.c: In function 'svc_export_parse':
fs/nfsd/export.c:737:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
737 | }
On my systems, svc_export_parse() has a stack frame of over 800
bytes, not 1040, but nonetheless, it could do with some reduction.
When a struct svc_export is on the stack, it's a temporary structure
used as an argument, and not visible as an actual exported FS. No
need to reserve space for export_stats in such cases.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310012359.YEw5IrK6-lkp@intel.com/
Cc: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Stable-dep-of: 4b14885411f7 ("nfsd: make all of the nfsd stats per-network namespace")
[ cel: adjusted to apply to v6.1.y ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5ec39944f874e1ecc09f624a70dfaa8ac3bf9d08 ]
In function ‘export_stats_init’,
inlined from ‘svc_export_alloc’ at fs/nfsd/export.c:866:6:
fs/nfsd/export.c:337:16: warning: ‘nfsd_percpu_counters_init’ accessing 40 bytes in a region of size 0 [-Wstringop-overflow=]
337 | return nfsd_percpu_counters_init(&stats->counter, EXP_STATS_COUNTERS_NUM);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/nfsd/export.c:337:16: note: referencing argument 1 of type ‘struct percpu_counter[0]’
fs/nfsd/stats.h: In function ‘svc_export_alloc’:
fs/nfsd/stats.h:40:5: note: in a call to function ‘nfsd_percpu_counters_init’
40 | int nfsd_percpu_counters_init(struct percpu_counter counters[], int num);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Stable-dep-of: 93483ac5fec6 ("nfsd: expose /proc/net/sunrpc/nfsd in net namespaces")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c135e1269f ]
Avoid holding the bucket lock while freeing cache entries. This
change also caps the number of entries that are freed when the
shrinker calls to reduce the shrinker's impact on the cache's
effectiveness.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
[ cel: adjusted to apply to v6.1.y -- this one might not be necessary ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a9507f6af1 ]
Enable nfsd_prune_bucket() to drop the bucket lock while calling
kfree(). Use the same pattern that Jeff recently introduced in the
NFSD filecache.
A few percpu operations are moved outside the lock since they
temporarily disable local IRQs which is expensive and does not
need to be done while the lock is held.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Stable-dep-of: c135e1269f ("NFSD: Refactor the duplicate reply cache shrinker")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ff0d169329 ]
For readability, rename to match the other helpers.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Stable-dep-of: 4b14885411f7 ("nfsd: make all of the nfsd stats per-network namespace")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 35308e7f0f ]
To reduce contention on the bucket locks, we must avoid calling
kfree() while each bucket lock is held.
Start by refactoring nfsd_reply_cache_free_locked() into a helper
that removes an entry from the bucket (and must therefore run under
the lock) and a second helper that frees the entry (which does not
need to hold the lock).
For readability, rename the helpers nfsd_cacherep_<verb>.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Stable-dep-of: a9507f6af1 ("NFSD: Replace nfsd_prune_bucket()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ed9ab7346e ]
Commit f5f9d4a314 ("nfsd: move reply cache initialization into nfsd
startup") moved the initialization of the reply cache into nfsd startup,
but didn't account for the stats counters, which can be accessed before
nfsd is ever started. The result can be a NULL pointer dereference when
someone accesses /proc/fs/nfsd/reply_cache_stats while nfsd is still
shut down.
This is a regression and a user-triggerable oops in the right situation:
- non-x86_64 arch
- /proc/fs/nfsd is mounted in the namespace
- nfsd is not started in the namespace
- unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
Although this is easy to trigger on some arches (like aarch64), on
x86_64, calling this_cpu_ptr(NULL) evidently returns a pointer to the
fixed_percpu_data. That struct looks just enough like a newly
initialized percpu var to allow nfsd_reply_cache_stats_show to access
it without Oopsing.
Move the initialization of the per-net+per-cpu reply-cache counters
back into nfsd_init_net, while leaving the rest of the reply cache
allocations to be done at nfsd startup time.
Kudos to Eirik who did most of the legwork to track this down.
Cc: stable@vger.kernel.org # v6.3+
Fixes: f5f9d4a314 ("nfsd: move reply cache initialization into nfsd startup")
Reported-and-tested-by: Eirik Fuller <efuller@redhat.com>
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2215429
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Stable-dep-of: 4b14885411f7 ("nfsd: make all of the nfsd stats per-network namespace")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f5f9d4a314 ]
There's no need to start the reply cache before nfsd is up and running,
and doing so means that we register a shrinker for every net namespace
instead of just the ones where nfsd is running.
Move it to the per-net nfsd startup instead.
Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Stable-dep-of: ed9ab7346e ("nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7697a0fe0154468f5df35c23ebd7aa48994c2cdc upstream.
Chromium sandbox apparently wants to deny statx [1] so it could properly
inspect arguments after the sandboxed process later falls back to fstat.
Because there's currently not a "fd-only" version of statx, so that the
sandbox has no way to ensure the path argument is empty without being
able to peek into the sandboxed process's memory. For architectures able
to do newfstatat though, glibc falls back to newfstatat after getting
-ENOSYS for statx, then the respective SIGSYS handler [2] takes care of
inspecting the path argument, transforming allowed newfstatat's into
fstat instead which is allowed and has the same type of return value.
But, as LoongArch is the first architecture to not have fstat nor
newfstatat, the LoongArch glibc does not attempt falling back at all
when it gets -ENOSYS for statx -- and you see the problem there!
Actually, back when the LoongArch port was under review, people were
aware of the same problem with sandboxing clone3 [3], so clone was
eventually kept. Unfortunately it seemed at that time no one had noticed
statx, so besides restoring fstat/newfstatat to LoongArch uapi (and
postponing the problem further), it seems inevitable that we would need
to tackle seccomp deep argument inspection.
However, this is obviously a decision that shouldn't be taken lightly,
so we just restore fstat/newfstatat by defining __ARCH_WANT_NEW_STAT
in unistd.h. This is the simplest solution for now, and so we hope the
community will tackle the long-standing problem of seccomp deep argument
inspection in the future [4][5].
Also add "newstat" to syscall_abis_64 in Makefile.syscalls due to
upstream asm-generic changes.
More infomation please reading this thread [6].
[1] https://chromium-review.googlesource.com/c/chromium/src/+/2823150
[2] https://chromium.googlesource.com/chromium/src/sandbox/+/c085b51940bd/linux/seccomp-bpf-helpers/sigsys_handlers.cc#355
[3] https://lore.kernel.org/linux-arch/20220511211231.GG7074@brightrain.aerifal.cx/
[4] https://lwn.net/Articles/799557/
[5] https://lpc.events/event/4/contributions/560/attachments/397/640/deep-arg-inspection.pdf
[6] https://lore.kernel.org/loongarch/20240226-granit-seilschaft-eccc2433014d@brauner/T/#t
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f50733b45d865f91db90919f8311e2127ce5a0cb upstream.
When opening a file for exec via do_filp_open(), permission checking is
done against the file's metadata at that moment, and on success, a file
pointer is passed back. Much later in the execve() code path, the file
metadata (specifically mode, uid, and gid) is used to determine if/how
to set the uid and gid. However, those values may have changed since the
permissions check, meaning the execution may gain unintended privileges.
For example, if a file could change permissions from executable and not
set-id:
---------x 1 root root 16048 Aug 7 13:16 target
to set-id and non-executable:
---S------ 1 root root 16048 Aug 7 13:16 target
it is possible to gain root privileges when execution should have been
disallowed.
While this race condition is rare in real-world scenarios, it has been
observed (and proven exploitable) when package managers are updating
the setuid bits of installed programs. Such files start with being
world-executable but then are adjusted to be group-exec with a set-uid
bit. For example, "chmod o-x,u+s target" makes "target" executable only
by uid "root" and gid "cdrom", while also becoming setuid-root:
-rwxr-xr-x 1 root cdrom 16048 Aug 7 13:16 target
becomes:
-rwsr-xr-- 1 root cdrom 16048 Aug 7 13:16 target
But racing the chmod means users without group "cdrom" membership can
get the permission to execute "target" just before the chmod, and when
the chmod finishes, the exec reaches brpm_fill_uid(), and performs the
setuid to root, violating the expressed authorization of "only cdrom
group members can setuid to root".
Re-check that we still have execute permissions in case the metadata
has changed. It would be better to keep a copy from the perm-check time,
but until we can do that refactoring, the least-bad option is to do a
full inode_permission() call (under inode lock). It is understood that
this is safe against dead-locks, but hardly optimal.
Reported-by: Marco Vanotti <mvanotti@google.com>
Tested-by: Marco Vanotti <mvanotti@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d2868b5d191c74262f7407972d68d1bf3245d6a upstream.
It should be quite uncommon to set both the subflow and the signal
flags: the initiator of the connection is typically the one creating new
subflows, not the other peer, then no need to announce additional local
addresses, and use it to create subflows.
But some people might be confused about the flags, and set both "just to
be sure at least the right one is set". To verify the previous fix, and
avoid future regressions, this specific case is now validated: the
client announces a new address, and initiates a new subflow from the
same address.
While working on this, another bug has been noticed, where the client
reset the new subflow because an ADD_ADDR echo got received as the 3rd
ACK: this new test also explicitly checks that no RST have been sent by
the client and server.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: 86e39e0448 ("mptcp: keep track of local endpoint still available for each msk")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-7-c8a9b036493b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ No conflicts, but not using 'chk_add_nr 1 1 0 invert': in this
version, 'chk_add_nr' cannot be used with 'invert': d73bb9d3957b
("selftests: mptcp: join: ability to invert ADD_ADDR check") is not in
this version, and backporting it causes a lot of conflicts. That's
fine, checking that there is an additional subflow should be enough. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85df533a787bf07bf4367ce2a02b822ff1fba1a3 upstream.
Up to the 'Fixes' commit, having an endpoint with both the 'signal' and
'subflow' flags, resulted in the creation of a subflow and an address
announcement using the address linked to this endpoint. After this
commit, only the address announcement was done, ignoring the 'subflow'
flag.
That's because the same bitmap is used for the two flags. It is OK to
keep this single bitmap, the already selected local endpoint simply have
to be re-used, but not via select_local_address() not to look at the
just modified bitmap.
Note that it is unusual to set the two flags together: creating a new
subflow using a new local address will implicitly advertise it to the
other peer. So in theory, no need to advertise it explicitly as well.
Maybe there are use-cases -- the subflow might not reach the other peer
that way, we can ask the other peer to try initiating the new subflow
without delay -- or very likely the user is confused, and put both flags
"just to be sure at least the right one is set". Still, if it is
allowed, the kernel should do what has been asked: using this endpoint
to announce the address and to create a new subflow from it.
An alternative is to forbid the use of the two flags together, but
that's probably too late, there are maybe use-cases, and it was working
before. This patch will avoid people complaining subflows are not
created using the endpoint they added with the 'subflow' and 'signal'
flag.
Note that with the current patch, the subflow might not be created in
some corner cases, e.g. if the 'subflows' limit was reached when sending
the ADD_ADDR, but changed later on. It is probably not worth splitting
id_avail_bitmap per target ('signal', 'subflow'), which will add another
large field to the msk "just" to track (again) endpoints. Anyway,
currently when the limits are changed, the kernel doesn't check if new
subflows can be created or removed, because we would need to keep track
of the received ADD_ADDR, and more. It sounds OK to assume that the
limits should be properly configured before establishing new
connections.
Fixes: 86e39e0448 ("mptcp: keep track of local endpoint still available for each msk")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-5-c8a9b036493b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 528cb5f2a1 upstream.
Pass addr parameter to mptcp_pm_alloc_anno_list() instead of entry. We
can reduce the scope, e.g. in mptcp_pm_alloc_anno_list(), we only access
"entry->addr", we can then restrict to the pointer to "addr" then.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: c95eb32ced82 ("mptcp: pm: reduce indentation blocks")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit ef1e9b624d which is
commit 373c612d72 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: I005a81bf79fefbb0a037a77c2d248d4790b9a94e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit b1574c8c0a which is
commit 3f858bbf04dbac934ac279aaee05d49eb9910051 upstream.
It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.
Bug: 161946584
Change-Id: Idee931998e80ee3ce1c11fa19a49daeea6388fbb
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Changes in 6.1.95
wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects
wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()
wifi: cfg80211: fully move wiphy work to unbound workqueue
wifi: cfg80211: Lock wiphy in cfg80211_get_station
wifi: cfg80211: pmsr: use correct nla_get_uX functions
wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64
wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef
wifi: iwlwifi: mvm: check n_ssids before accessing the ssids
wifi: iwlwifi: mvm: don't read past the mfuart notifcation
wifi: mac80211: correctly parse Spatial Reuse Parameter Set element
ax25: Fix refcount imbalance on inbound connections
ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put()
net/ncsi: Simplify Kconfig/dts control flow
net/ncsi: Fix the multi thread manner of NCSI driver
ipv6: ioam: block BH from ioam6_output()
ipv6: sr: block BH in seg6_output_core() and seg6_input_core()
bpf: Set run context for rawtp test_run callback
octeontx2-af: Always allocate PF entries from low prioriy zone
net/smc: avoid overwriting when adjusting sock bufsizes
net: sched: sch_multiq: fix possible OOB write in multiq_tune()
vxlan: Fix regression when dropping packets due to invalid src addresses
tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
net/mlx5: Stop waiting for PCI up if teardown was triggered
net/mlx5: Stop waiting for PCI if pci channel is offline
net/mlx5: Split function_setup() to enable and open functions
net/mlx5: Always stop health timer during driver removal
net/mlx5: Fix tainted pointer delete is case of flow rules creation fail
net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP
ptp: Fix error message on failed pin verification
ice: fix iteration of TLVs in Preserved Fields Area
ice: Introduce new parameters in ice_sched_node
ice: remove null checks before devm_kfree() calls
ice: remove af_xdp_zc_qps bitmap
net: wwan: iosm: Fix tainted pointer delete is case of region creation fail
af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
af_unix: Annodate data-races around sk->sk_state for writers.
af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
af_unix: annotate lockless accesses to sk->sk_err
af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
ipv6: fix possible race in __fib6_drop_pcpu_from()
Bluetooth: qca: fix invalid device address check
btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range()
usb: gadget: f_fs: use io_data->status consistently
usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete
iio: accel: mxc4005: allow module autoloading via OF compatible
iio: accel: mxc4005: Reset chip on probe() and resume()
xtensa: stacktrace: include <asm/ftrace.h> for prototype
xtensa: fix MAKE_PC_FROM_RA second argument
drm/amd/display: drop unnecessary NULL checks in debugfs
drm/amd/display: Fix incorrect DSC instance for MST
arm64: dts: qcom: sm8150: align TLMM pin configuration with DT schema
arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration
misc/pvpanic: deduplicate common code
misc/pvpanic-pci: register attributes via pci_driver
serial: sc16is7xx: replace hardcoded divisor value with BIT() macro
serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler
mmc: davinci: Don't strip remove function when driver is builtin
firmware: qcom_scm: disable clocks if qcom_scm_bw_enable() fails
HID: i2c-hid: elan: Add ili9882t timing
HID: i2c-hid: elan: fix reset suspend current leakage
i2c: add fwnode APIs
i2c: acpi: Unbind mux adapters before delete
mm, vmalloc: fix high order __GFP_NOFAIL allocations
mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
selftests/mm: conform test to TAP format output
selftests/mm: log a consistent test name for check_compaction
selftests/mm: compaction_test: fix bogus test success on Aarch64
wifi: ath10k: fix QCOM_RPROC_COMMON dependency
btrfs: remove unnecessary prototype declarations at disk-io.c
btrfs: make btrfs_destroy_delayed_refs() return void
btrfs: fix leak of qgroup extent records after transaction abort
nilfs2: return the mapped address from nilfs_get_page()
nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
io_uring: check for non-NULL file pointer in io_file_can_poll()
USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected
usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps
usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state
mei: me: release irq in mei_me_pci_resume error path
tty: n_tty: Fix buffer offsets when lookahead is used
landlock: Fix d_parent walk
jfs: xattr: fix buffer overflow for invalid xattr
xhci: Set correct transferred length for cancelled bulk transfers
xhci: Apply reset resume quirk to Etron EJ188 xHCI host
xhci: Handle TD clearing for multiple streams case
xhci: Apply broken streams quirk to Etron EJ188 xHCI host
thunderbolt: debugfs: Fix margin debugfs node creation condition
scsi: mpi3mr: Fix ATA NCQ priority support
scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
scsi: sd: Use READ(16) when reading block zero on large capacity disks
gve: Clear napi->skb before dev_kfree_skb_any()
powerpc/uaccess: Fix build errors seen with GCC 13/14
Input: try trimming too long modalias strings
cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
cachefiles: remove requests from xarray during flushing requests
cachefiles: introduce object ondemand state
cachefiles: extract ondemand info field from cachefiles_object
cachefiles: resend an open request if the read request's object is closed
cachefiles: add spin_lock for cachefiles_ondemand_info
cachefiles: add restore command to recover inflight ondemand read requests
cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
cachefiles: never get a new anonymous fd if ondemand_id is valid
cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
cachefiles: flush all requests after setting CACHEFILES_DEAD
selftests/ftrace: Fix to check required event file
clk: sifive: Do not register clkdevs for PRCI clocks
NFSv4.1 enforce rootpath check in fs_location query
SUNRPC: return proper error from gss_wrap_req_priv
NFS: add barriers when testing for NFS_FSDATA_BLOCKED
platform/x86: dell-smbios: Fix wrong token data in sysfs
gpio: tqmx86: fix typo in Kconfig label
gpio: tqmx86: remove unneeded call to platform_set_drvdata()
gpio: tqmx86: introduce shadow register for GPIO output value
gpio: tqmx86: Convert to immutable irq_chip
gpio: tqmx86: store IRQ trigger type and unmask status separately
gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type
HID: core: remove unnecessary WARN_ON() in implement()
iommu/amd: Fix sysfs leak in iommu init
HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
drm/vmwgfx: Port the framebuffer code to drm fb helpers
drm/vmwgfx: Refactor drm connector probing for display modes
drm/vmwgfx: Filter modes which exceed graphics memory
drm/vmwgfx: 3D disabled should not effect STDU memory limits
drm/vmwgfx: Remove STDU logic from generic mode_valid function
net: sfp: Always call `sfp_sm_mod_remove()` on remove
net: hns3: fix kernel crash problem in concurrent scenario
net: hns3: add cond_resched() to hns3 ring buffer init process
liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet
drm/komeda: check for error-valued pointer
drm/bridge/panel: Fix runtime warning on panel bridge release
tcp: fix race in tcp_v6_syn_recv_sock()
geneve: Fix incorrect inner network header offset when innerprotoinherit is set
net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets
Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP
gve: ignore nonrelevant GSO type bits when processing TSO headers
net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
nvmet-passthru: propagate status from id override functions
net/ipv6: Fix the RT cache flush via sysctl using a previous delay
net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
net: bridge: mst: fix suspicious rcu usage in br_mst_set_state
ionic: fix use after netif_napi_del()
af_unix: Read with MSG_PEEK loops if the first unread byte is OOB
bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe()
x86/boot: Don't add the EFI stub to targets, again
iio: adc: ad9467: fix scan type sign
iio: dac: ad5592r: fix temperature channel scaling value
iio: imu: inv_icm42600: delete unneeded update watermark call
drivers: core: synchronize really_probe() and dev_uevent()
drm/exynos/vidi: fix memory leak in .get_modes()
drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found
mptcp: ensure snd_una is properly initialized on connect
mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()
x86/amd_nb: Check for invalid SMN reads
perf/core: Fix missing wakeup when waiting for context reference
riscv: fix overlap of allocated page and PTR_ERR
tracing/selftests: Fix kprobe event name test for .isra. functions
null_blk: Print correct max open zones limit in null_init_zoned_dev()
sock_map: avoid race between sock_map_close and sk_psock_put
vmci: prevent speculation leaks by sanitizing event in event_deliver()
spmi: hisi-spmi-controller: Do not override device identifier
knfsd: LOOKUP can return an illegal error value
fs/proc: fix softlockup in __read_vmcore
ocfs2: use coarse time for new created files
ocfs2: fix races between hole punching and AIO+DIO
PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id
dmaengine: axi-dmac: fix possible race in remove()
remoteproc: k3-r5: Wait for core0 power-up before powering up core1
remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs
riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
drm/i915/gt: Disarm breadcrumbs if engines are already idle
drm/i915/dpt: Make DPT object unshrinkable
intel_th: pci: Add Granite Rapids support
intel_th: pci: Add Granite Rapids SOC support
intel_th: pci: Add Sapphire Rapids SOC support
intel_th: pci: Add Meteor Lake-S support
intel_th: pci: Add Lunar Lake support
btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info
btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info
btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info
btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info
btrfs: zoned: fix use-after-free due to race with dev replace
nilfs2: fix potential kernel bug due to lack of writeback flag waiting
tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device()
mm/huge_memory: don't unpoison huge_zero_folio
mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level
mptcp: pm: update add_addr counters after connect
Revert "fork: defer linking file vma until vma is fully initialized"
remoteproc: k3-r5: Jump to error handling labels in start/stop errors
cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode
Bluetooth: qca: fix wcn3991 device address check
Bluetooth: qca: generalise device address check
greybus: Fix use-after-free bug in gb_interface_release due to race condition.
serial: 8250_dw: fall back to poll if there's no interrupt
serial: core: Add UPIO_UNKNOWN constant for unknown port type
usb-storage: alauda: Check whether the media is initialized
misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe()
i2c: at91: Fix the functionality flags of the slave-only interface
i2c: designware: Fix the functionality flags of the slave-only interface
zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
Linux 6.1.95
Change-Id: I73161b2d10f7fd687ca753f1780ccdf53eeccb0e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Currently, the UTP_TASK_REQ_LIST_BASE_L/UTP_TASK_REQ_LIST_BASE_H regs are
written to and then completed with an mb().
mb() ensures that the write completes, but completion doesn't mean that it
isn't stored in a buffer somewhere. The recommendation for ensuring these
bits have taken effect on the device is to perform a read back to force it
to make it all the way to the device. This is documented in device-io.rst
and a talk by Will Deacon on this can be seen over here:
https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678
Let's do that to ensure the bits hit the device. Because the mb()'s purpose
wasn't to add extra ordering (on top of the ordering guaranteed by
writel()/readl()), it can safely be removed.
Bug: 254441685
Fixes: 88441a8d35 ("scsi: ufs: core: Add hibernation callbacks")
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-7-181252004586@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 408e28086f1c7a6423efc79926a43d7001902fae)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Iadf1f45aebea54393365a5387d35440aa5df3e00
Commit c33c794828 ("mm: ptep_get() conversion") converted all (non-arch)
call sites to use ptep_get() instead of doing a direct dereference of the
pte. Full rationale can be found in that commit's log.
Since then, UFFDIO_MOVE has been implemented which does 7 direct pte
dereferences. Let's fix those up to use ptep_get().
I've asserted in the past that there is no reliable automated mechanism to
catch these; I'm relying on a combination of Coccinelle (which throws up a
lot of false positives) and some compiler magic to force a compiler error
on dereference. But given the frequency with which new issues are coming
up, I'll add it to my todo list to try to find an automated solution.
Bug: 254441685
Link: https://lkml.kernel.org/r/20240123141755.3836179-1-ryan.roberts@arm.com
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 56ae10cf628b02279980d17439c6241a643959c2)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I0573208f1c7ad5085e0f7e82d484199074a5f6ef