Commit Graph

1151291 Commits

Author SHA1 Message Date
Ming Lei
aeeb4e4e49 genirq/affinity: Don't pass irq_affinity_desc array to irq_build_affinity_masks
[ Upstream commit e7bdd7f0cb ]

Prepare for abstracting irq_build_affinity_masks() into a public function
for assigning all CPUs evenly into several groups.

Don't pass irq_affinity_desc array to irq_build_affinity_masks, instead
return a cpumask array by storing each assigned group into one element of
the array.

This allows to provide a generic interface for grouping all CPUs evenly
from a NUMA and CPU locality viewpoint, and the cost is one extra allocation
in irq_build_affinity_masks(), which should be fine since it is done via
GFP_KERNEL and irq_build_affinity_masks() is a slow path anyway.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-4-ming.lei@redhat.com
Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:33 +01:00
Ming Lei
9e84d7bb15 genirq/affinity: Pass affinity managed mask array to irq_build_affinity_masks
[ Upstream commit 1f962d91a1 ]

Pass affinity managed mask array to irq_build_affinity_masks() so that the
index of the first affinity managed vector is always zero.

This allows to simplify the implementation a bit.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-3-ming.lei@redhat.com
Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:33 +01:00
Ming Lei
a1dcd17947 genirq/affinity: Remove the 'firstvec' parameter from irq_build_affinity_masks
[ Upstream commit cdf07f0ea4 ]

The 'firstvec' parameter is always same with the parameter of
'startvec', so use 'startvec' directly inside irq_build_affinity_masks().

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-2-ming.lei@redhat.com
Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:33 +01:00
Takashi Iwai
f4fe76467e ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7
[ Upstream commit 634e5e1e06f5cdd614a1bc429ecb243a51cc009d ]

Lenovo Yoga Pro 7 14APH8 (PCI SSID 17aa:3882) seems requiring the
similar workaround like Yoga 9 model for the bass speaker.

Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/CAGGk=CRRQ1L9p771HsXTN_ebZP41Qj+3gw35Gezurn+nokRewg@mail.gmail.com
Link: https://lore.kernel.org/r/20231207182035.30248-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
Sudeep Holla
aee609302d firmware: arm_scmi: Fix frequency truncation by promoting multiplier type
[ Upstream commit 8e3c98d9187e09274fc000a7d1a77b070a42d259 ]

Fix the possible frequency truncation for all values equal to or greater
4GHz on 64bit machines by updating the multiplier 'mult_factor' to
'unsigned long' type. It is also possible that the multiplier itself can
be greater than or equal to 2^32. So we need to also fix the equation
computing the value of the multiplier.

Fixes: a9e3fbfaa0 ("firmware: arm_scmi: add initial support for performance protocol")
Reported-by: Sibi Sankar <quic_sibis@quicinc.com>
Closes: https://lore.kernel.org/all/20231129065748.19871-3-quic_sibis@quicinc.com/
Cc: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20231130204343.503076-1-sudeep.holla@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
John Fastabend
90d1f74c3c bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
[ Upstream commit 8866730aed5100f06d3d965c22f1c61f74942541 ]

AF_UNIX stream sockets are a paired socket. So sending on one of the pairs
will lookup the paired socket as part of the send operation. It is possible
however to put just one of the pairs in a BPF map. This currently increments
the refcnt on the sock in the sockmap to ensure it is not free'd by the
stack before sockmap cleans up its state and stops any skbs being sent/recv'd
to that socket.

But we missed a case. If the peer socket is closed it will be free'd by the
stack. However, the paired socket can still be referenced from BPF sockmap
side because we hold a reference there. Then if we are sending traffic through
BPF sockmap to that socket it will try to dereference the free'd pair in its
send logic creating a use after free. And following splat:

   [59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
   [59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
   [...]
   [59.905468] Call Trace:
   [59.905787]  <TASK>
   [59.906066]  dump_stack_lvl+0x130/0x1d0
   [59.908877]  print_report+0x16f/0x740
   [59.910629]  kasan_report+0x118/0x160
   [59.912576]  sk_wake_async+0x31/0x1b0
   [59.913554]  sock_def_readable+0x156/0x2a0
   [59.914060]  unix_stream_sendmsg+0x3f9/0x12a0
   [59.916398]  sock_sendmsg+0x20e/0x250
   [59.916854]  skb_send_sock+0x236/0xac0
   [59.920527]  sk_psock_backlog+0x287/0xaa0

To fix let BPF sockmap hold a refcnt on both the socket in the sockmap and its
paired socket. It wasn't obvious how to contain the fix to bpf_unix logic. The
primarily problem with keeping this logic in bpf_unix was: In the sock close()
we could handle the deref by having a close handler. But, when we are destroying
the psock through a map delete operation we wouldn't have gotten any signal
thorugh the proto struct other than it being replaced. If we do the deref from
the proto replace its too early because we need to deref the sk_pair after the
backlog worker has been stopped.

Given all this it seems best to just cache it at the end of the psock and eat 8B
for the af_unix and vsock users. Notice dgram sockets are OK because they handle
locking already.

Fixes: 94531cfcbe ("af_unix: Add unix_stream_proto for sockmap")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20231129012557.95371-2-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
Jakub Kicinski
5ff1682fec ethtool: don't propagate EOPNOTSUPP from dumps
[ Upstream commit cbeb989e41f4094f54bec2cecce993f26f547bea ]

The default dump handler needs to clear ret before returning.
Otherwise if the last interface returns an inconsequential
error this error will propagate to user space.

This may confuse user space (ethtool CLI seems to ignore it,
but YNL doesn't). It will also terminate the dump early
for mutli-skb dump, because netlink core treats EOPNOTSUPP
as a real error.

Fixes: 728480f124 ("ethtool: default handlers for GET requests")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231126225806.2143528-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
Ioana Ciornei
e570b15087 dpaa2-eth: recycle the RX buffer only after all processing done
[ Upstream commit beb1930f966d1517921488bd5d64147f58f79abf ]

The blamed commit added support for Rx copybreak. This meant that for
certain frame sizes, a new skb was allocated and the initial data buffer
was recycled. Instead of waiting to recycle the Rx buffer only after all
processing was done on it (like accessing the parse results or timestamp
information), the code path just went ahead and re-used the buffer right
away.

This sometimes lead to corrupted HW and SW annotation areas.
Fix this by delaying the moment when the buffer is recycled.

Fixes: 50f826999a ("dpaa2-eth: add rx copybreak support")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
Ioana Ciornei
5b8938fc7d net: dpaa2-eth: rearrange variable in dpaa2_eth_get_ethtool_stats
[ Upstream commit 3313206827 ]

Rearrange the variables in the dpaa2_eth_get_ethtool_stats() function so
that we adhere to the reverse Christmas tree rule.
Also, in the next patch we are adding more variables and I didn't know
where to place them with the current ordering.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: beb1930f966d ("dpaa2-eth: recycle the RX buffer only after all processing done")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
Paulo Alcantara
e88275ce7e smb: client: fix missing mode bits for SMB symlinks
[ Upstream commit ef22bb800d967616c7638d204bc1b425beac7f5f ]

When instantiating inodes for SMB symlinks, add the mode bits from
@cifs_sb->ctx->file_mode as we already do for the other special files.

Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
Christoph Hellwig
bf223fd4d9 block: update the stable_writes flag in bdev_add
[ Upstream commit 1898efcdbed32bb1c67269c985a50bab0dbc9493 ]

Propagate the per-queue stable_write flags into each bdev inode in bdev_add.
This makes sure devices that require stable writes have it set for I/O
on the block device node as well.

Note that this doesn't cover the case of a flag changing on a live device
yet.  We should handle that as well, but I plan to cover it as part of a
more general rework of how changing runtime paramters on block devices
works.

Fixes: 1cb039f3dc ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag")
Reported-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231025141020.192413-3-hch@lst.de
Tested-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
Christoph Hellwig
a8e4300ae5 filemap: add a per-mapping stable writes flag
[ Upstream commit 762321dab9a72760bf9aec48362f932717c9424d ]

folio_wait_stable waits for writeback to finish before modifying the
contents of a folio again, e.g. to support check summing of the data
in the block integrity code.

Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
on the super_block, which means it is uniform for the entire file system.
This is wrong for the block device pseudofs which is shared by all
block devices, or file systems that can use multiple devices like XFS
witht the RT subvolume or btrfs (although btrfs currently reimplements
folio_wait_stable anyway).

Add a per-address_space AS_STABLE_WRITES flag to control the behavior
in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
to initialize AS_STABLE_WRITES to the existing default which covers
most cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231025141020.192413-2-hch@lst.de
Tested-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:32 +01:00
David Howells
d0eafc7631 mm, netfs, fscache: stop read optimisation when folio removed from pagecache
[ Upstream commit b4fa966f03 ]

Fscache has an optimisation by which reads from the cache are skipped
until we know that (a) there's data there to be read and (b) that data
isn't entirely covered by pages resident in the netfs pagecache.  This is
done with two flags manipulated by fscache_note_page_release():

	if (...
	    test_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie->flags) &&
	    test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags))
		clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);

where the NO_DATA_TO_READ flag causes cachefiles_prepare_read() to
indicate that netfslib should download from the server or clear the page
instead.

The fscache_note_page_release() function is intended to be called from
->releasepage() - but that only gets called if PG_private or PG_private_2
is set - and currently the former is at the discretion of the network
filesystem and the latter is only set whilst a page is being written to
the cache, so sometimes we miss clearing the optimisation.

Fix this by following Willy's suggestion[1] and adding an address_space
flag, AS_RELEASE_ALWAYS, that causes filemap_release_folio() to always call
->release_folio() if it's set, even if PG_private or PG_private_2 aren't
set.

Note that this would require folio_test_private() and page_has_private() to
become more complicated.  To avoid that, in the places[*] where these are
used to conditionalise calls to filemap_release_folio() and
try_to_release_page(), the tests are removed the those functions just
jumped to unconditionally and the test is performed there.

[*] There are some exceptions in vmscan.c where the check guards more than
just a call to the releaser.  I've added a function, folio_needs_release()
to wrap all the checks for that.

AS_RELEASE_ALWAYS should be set if a non-NULL cookie is obtained from
fscache and cleared in ->evict_inode() before truncate_inode_pages_final()
is called.

Additionally, the FSCACHE_COOKIE_NO_DATA_TO_READ flag needs to be cleared
and the optimisation cancelled if a cachefiles object already contains data
when we open it.

[dwysocha@redhat.com: call folio_mapping() inside folio_needs_release()]
  Link: 902c990e31
Link: https://lkml.kernel.org/r/20230628104852.3391651-3-dhowells@redhat.com
Fixes: 1f67e6d0b1 ("fscache: Provide a function to note the release of a page")
Fixes: 047487c947 ("cachefiles: Implement the I/O routines")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Tested-by: SeongJae Park <sj@kernel.org>
Cc: Daire Byrne <daire.byrne@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steve French <sfrench@samba.org>
Cc: Shyam Prasad N <nspmangalore@gmail.com>
Cc: Rohith Surabattula <rohiths.msft@gmail.com>
Cc: Dave Wysochanski <dwysocha@redhat.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
David Howells
bceff380f3 mm: merge folio_has_private()/filemap_release_folio() call pairs
[ Upstream commit 0201ebf274 ]

Patch series "mm, netfs, fscache: Stop read optimisation when folio
removed from pagecache", v7.

This fixes an optimisation in fscache whereby we don't read from the cache
for a particular file until we know that there's data there that we don't
have in the pagecache.  The problem is that I'm no longer using PG_fscache
(aka PG_private_2) to indicate that the page is cached and so I don't get
a notification when a cached page is dropped from the pagecache.

The first patch merges some folio_has_private() and
filemap_release_folio() pairs and introduces a helper,
folio_needs_release(), to indicate if a release is required.

The second patch is the actual fix.  Following Willy's suggestions[1], it
adds an AS_RELEASE_ALWAYS flag to an address_space that will make
filemap_release_folio() always call ->release_folio(), even if
PG_private/PG_private_2 aren't set.  folio_needs_release() is altered to
add a check for this.

This patch (of 2):

Make filemap_release_folio() check folio_has_private().  Then, in most
cases, where a call to folio_has_private() is immediately followed by a
call to filemap_release_folio(), we can get rid of the test in the pair.

There are a couple of sites in mm/vscan.c that this can't so easily be
done.  In shrink_folio_list(), there are actually three cases (something
different is done for incompletely invalidated buffers), but
filemap_release_folio() elides two of them.

In shrink_active_list(), we don't have have the folio lock yet, so the
check allows us to avoid locking the page unnecessarily.

A wrapper function to check if a folio needs release is provided for those
places that still need to do it in the mm/ directory.  This will acquire
additional parts to the condition in a future patch.

After this, the only remaining caller of folio_has_private() outside of
mm/ is a check in fuse.

Link: https://lkml.kernel.org/r/20230628104852.3391651-1-dhowells@redhat.com
Link: https://lkml.kernel.org/r/20230628104852.3391651-2-dhowells@redhat.com
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steve French <sfrench@samba.org>
Cc: Shyam Prasad N <nspmangalore@gmail.com>
Cc: Rohith Surabattula <rohiths.msft@gmail.com>
Cc: Dave Wysochanski <dwysocha@redhat.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Xiubo Li <xiubli@redhat.com>
Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Vishal Moola (Oracle)
8b6b3ecf0c memory-failure: convert truncate_error_page() to use folio
[ Upstream commit ac5efa7820 ]

Replace try_to_release_page() with filemap_release_folio().  This change
is in preparation for the removal of the try_to_release_page() wrapper.

Link: https://lkml.kernel.org/r/20221118073055.55694-4-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Vishal Moola (Oracle)
a6f440f3b9 khugepage: replace try_to_release_page() with filemap_release_folio()
[ Upstream commit 64ab3195ea ]

Replace some calls with their folio equivalents.  This change removes 4
calls to compound_head() and is in preparation for the removal of the
try_to_release_page() wrapper.

Link: https://lkml.kernel.org/r/20221118073055.55694-3-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Vishal Moola (Oracle)
4c78612e5f ext4: convert move_extent_per_page() to use folios
[ Upstream commit 6dd8fe86fa ]

Patch series "Removing the try_to_release_page() wrapper", v3.

This patchset replaces the remaining calls of try_to_release_page() with
the folio equivalent: filemap_release_folio().  This allows us to remove
the wrapper.

This patch (of 4):

Convert move_extent_per_page() to use folios.  This change removes 5 calls
to compound_head() and is in preparation for the removal of the
try_to_release_page() wrapper.

Link: https://lkml.kernel.org/r/20221118073055.55694-1-vishal.moola@gmail.com
Link: https://lkml.kernel.org/r/20221118073055.55694-2-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Bryan O'Donoghue
b92a8f591c media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3
[ Upstream commit e655d1ae9703286cef7fda8675cad62f649dc183 ]

VC_MODE = 0 implies a two bit VC address.
VC_MODE = 1 is required for VCs with a larger address than two bits.

Fixes: eebe6d00e9 ("media: camss: Add support for CSID hardware version Titan 170")
Cc: stable@vger.kernel.org
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Milen Mitkov
710f70555d media: camss: sm8250: Virtual channels for CSID
[ Upstream commit 3c4ed72a16 ]

CSID hardware on SM8250 can demux up to 4 simultaneous streams
based on virtual channel (vc) or datatype (dt).
The CSID subdevice entity now has 4 source ports that can be
enabled/disabled and thus can control which virtual channels
are enabled. Datatype demuxing not tested.

In order to keep a valid internal state of the subdevice,
implicit format propagation from the sink to the source pads
has been preserved. However, the format on each source pad
can be different and in that case it must be configured explicitly.

CSID's s_stream is called when any stream is started or stopped.
It will call configure_streams() that will rewrite IRQ settings to HW.
When multiple streams are running simultaneously there is an issue
when writing IRQ settings for one stream while another is still
running, thus avoid re-writing settings if they were not changed
in link setup, or by fully powering off the CSID hardware.

Signed-off-by: Milen Mitkov <quic_mmitkov@quicinc.com>
Reviewed-by: Robert Foss <robert.foss@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Acked-by: Robert Foss <robert.foss@linaro.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Stable-dep-of: e655d1ae9703 ("media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Geliang Tang
c96a4f9360 selftests: mptcp: set FAILING_LINKS in run_tests
[ Upstream commit be7e9786c9 ]

Set FAILING_LINKS as an env var with a limited scope only when calling
run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-3-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 7cefbe5e1dac ("selftests: mptcp: fix fastclose with csum failure")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Paolo Abeni
4b85e920af selftests: mptcp: fix fastclose with csum failure
[ Upstream commit 7cefbe5e1dacc7236caa77e9d072423f21422fe2 ]

Running the mp_join selftest manually with the following command line:

  ./mptcp_join.sh -z -C

leads to some failures:

  002 fastclose server test
  # ...
  rtx                                 [fail] got 1 MP_RST[s] TX expected 0
  # ...
  rstrx                               [fail] got 1 MP_RST[s] RX expected 0

The problem is really in the wrong expectations for the RST checks
implied by the csum validation. Note that the same check is repeated
explicitly in the same test-case, with the correct expectation and
pass successfully.

Address the issue explicitly setting the correct expectation for
the failing checks.

Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-5-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:31 +01:00
Jaegeuk Kim
336d1ee07e f2fs: set the default compress_level on ioctl
[ Upstream commit f5f3bd903a5d3e3b2ba89f11e0e29db25e60c048 ]

Otherwise, we'll get a broken inode.

 # touch $FILE
 # f2fs_io setflags compression $FILE
 # f2fs_io set_coption 2 8 $FILE

[  112.227612] F2FS-fs (dm-51): sanity_check_compress_inode: inode (ino=8d3fe) has unsupported compress level: 0, run fsck to fix

Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Jaegeuk Kim
1ff3f5ef28 f2fs: assign default compression level
[ Upstream commit 00e120b5e4 ]

Let's avoid any confusion from assigning compress_level=0 for LZ4HC and ZSTD.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: f5f3bd903a5d ("f2fs: set the default compress_level on ioctl")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Yangtao Li
55d3f41e55 f2fs: convert to use bitmap API
[ Upstream commit 447286ebad ]

Let's use BIT() and GENMASK() instead of open it.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: f5f3bd903a5d ("f2fs: set the default compress_level on ioctl")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Chao Yu
84a8d913fb f2fs: clean up i_compress_flag and i_compress_level usage
[ Upstream commit b90e5086df ]

.i_compress_level was introduced by commit 3fde13f817 ("f2fs: compress:
support compress level"), but never be used.

This patch updates as below:
- load high 8-bits of on-disk .i_compress_flag to in-memory .i_compress_level
- load low 8-bits of on-disk .i_compress_flag to in-memory .i_compress_flag
- change type of in-memory .i_compress_flag from unsigned short to unsigned
char.

w/ above changes, we can avoid unneeded bit shift whenever during
.init_compress_ctx(), and shrink size of struct f2fs_inode_info.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: f5f3bd903a5d ("f2fs: set the default compress_level on ioctl")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Thomas Richter
2c14f49916 s390/cpumf: support user space events for counting
[ Upstream commit 91d5364dc6 ]

CPU Measurement counting facility events PROBLEM_STATE_CPU_CYCLES(32)
and PROBLEM_STATE_INSTRUCTIONS(33) are valid events. However the device
driver returns error -EOPNOTSUPP when these event are to be installed.

Fix this and allow installation of events PROBLEM_STATE_CPU_CYCLES,
PROBLEM_STATE_CPU_CYCLES:u, PROBLEM_STATE_INSTRUCTIONS and
PROBLEM_STATE_INSTRUCTIONS:u.
Kernel space counting only is still not supported by s390.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Stable-dep-of: 09cda0a40051 ("s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Heiko Carstens
a1a1e5ce88 s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()
[ Upstream commit 09cda0a400519b1541591c506e54c9c48e3101bf ]

If the cmma no-dat feature is available all pages that are not used for
dynamic address translation are marked as "no-dat" with the ESSA
instruction. This information is visible to the hypervisor, so that the
hypervisor can optimize purging of guest TLB entries. This also means that
pages which are used for dynamic address translation must not be marked as
"no-dat", since the hypervisor may then incorrectly not purge guest TLB
entries.

Region and segment tables allocated via vmem_crst_alloc() are incorrectly
marked as "no-dat", as soon as slab_is_available() returns true.

Such tables are allocated e.g. when kernel page tables are split, memory is
hotplugged, or a DCSS segment is loaded.

Fix this by adding the missing arch_set_page_dat() call.

Cc: <stable@vger.kernel.org>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Rahul Rameshbabu
31051f722d net/mlx5: Increase size of irq name buffer
[ Upstream commit 3338bebfc26a1e2cebbba82a1cf12c0159608e73 ]

Without increased buffer size, will trigger -Wformat-truncation with W=1
for the snprintf operation writing to the buffer.

    drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c: In function 'mlx5_irq_alloc':
    drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:296:7: error: '@pci:' directive output may be truncated writing 5 bytes into a region of size between 1 and 32 [-Werror=format-truncation=]
      296 |    "%s@pci:%s", name, pci_name(dev->pdev));
          |       ^~~~~
    drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:295:2: note: 'snprintf' output 6 or more bytes (assuming 37) into a destination of size 32
      295 |  snprintf(irq->name, MLX5_MAX_IRQ_NAME,
          |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      296 |    "%s@pci:%s", name, pci_name(dev->pdev));
          |    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: ada9f5d007 ("IB/mlx5: Fix eq names to display nicely in /proc/interrupts")
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d4ab2e97dcfbcd748ae71761a9d8e5e41cc732c
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20231114215846.5902-13-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Christoph Hellwig
b5c8e0ff76 blk-mq: make sure active queue usage is held for bio_integrity_prep()
[ Upstream commit b0077e269f6c152e807fdac90b58caf012cdbaab ]

blk_integrity_unregister() can come if queue usage counter isn't held
for one bio with integrity prepared, so this request may be completed with
calling profile->complete_fn, then kernel panic.

Another constraint is that bio_integrity_prep() needs to be called
before bio merge.

Fix the issue by:

- call bio_integrity_prep() with one queue usage counter grabbed reliably

- call bio_integrity_prep() before bio merge

Fixes: 900e080752 ("block: move queue enter logic into blk_mq_submit_bio()")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/r/20231113035231.2708053-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:30 +01:00
Andrii Nakryiko
803fb6109f bpf: fix precision backtracking instruction iteration
[ Upstream commit 4bb7ea946a370707315ab774432963ce47291946 ]

Fix an edge case in __mark_chain_precision() which prematurely stops
backtracking instructions in a state if it happens that state's first
and last instruction indexes are the same. This situations doesn't
necessarily mean that there were no instructions simulated in a state,
but rather that we starting from the instruction, jumped around a bit,
and then ended up at the same instruction before checkpointing or
marking precision.

To distinguish between these two possible situations, we need to consult
jump history. If it's empty or contain a single record "bridging" parent
state and first instruction of processed state, then we indeed
backtracked all instructions in this state. But if history is not empty,
we are definitely not done yet.

Move this logic inside get_prev_insn_idx() to contain it more nicely.
Use -ENOENT return code to denote "we are out of instructions"
situation.

This bug was exposed by verifier_loop1.c's bounded_recursion subtest, once
the next fix in this patch set is applied.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Fixes: b5dc0163d8 ("bpf: precise scalar_value tracking")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110002638.4168352-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Andrii Nakryiko
b08acd5c46 bpf: handle ldimm64 properly in check_cfg()
[ Upstream commit 3feb263bb516ee7e1da0acd22b15afbb9a7daa19 ]

ldimm64 instructions are 16-byte long, and so have to be handled
appropriately in check_cfg(), just like the rest of BPF verifier does.

This has implications in three places:
  - when determining next instruction for non-jump instructions;
  - when determining next instruction for callback address ldimm64
    instructions (in visit_func_call_insn());
  - when checking for unreachable instructions, where second half of
    ldimm64 is expected to be unreachable;

We take this also as an opportunity to report jump into the middle of
ldimm64. And adjust few test_verifier tests accordingly.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Fixes: 475fb78fbf ("bpf: verifier (add branch/goto checks)")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110002638.4168352-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Yonghong Song
2c795ce090 bpf: Support new 32bit offset jmp instruction
[ Upstream commit 4cd58e9af8 ]

Add interpreter/jit/verifier support for 32bit offset jmp instruction.
If a conditional jmp instruction needs more than 16bit offset,
it can be simulated with a conditional jmp + a 32bit jmp insn.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011231.3716103-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Andrii Nakryiko
b1c780ed3c bpf: clean up visit_insn()'s instruction processing
[ Upstream commit 653ae3a874 ]

Instead of referencing processed instruction repeatedly as insns[t]
throughout entire visit_insn() function, take a local insn pointer and
work with it in a cleaner way.

It makes enhancing this function further a bit easier as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Andrii Nakryiko
97bb6dab01 bpf: Remove unused insn_cnt argument from visit_[func_call_]insn()
[ Upstream commit dcb2288b1f ]

Number of total instructions in BPF program (including subprogs) can and
is accessed from env->prog->len. visit_func_call_insn() doesn't do any
checks against insn_cnt anymore, relying on push_insn() to do this check
internally. So remove unnecessary insn_cnt input argument from
visit_func_call_insn() and visit_insn() functions.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221207195534.2866030-1-andrii@kernel.org
Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Andrii Nakryiko
8266c47d04 bpf: remove unnecessary prune and jump points
[ Upstream commit 618945fbed ]

Don't mark some instructions as jump points when there are actually no
jumps and instructions are just processed sequentially. Such case is
handled naturally by precision backtracking logic without the need to
update jump history. See get_prev_insn_idx(). It goes back linearly by
one instruction, unless current top of jmp_history is pointing to
current instruction. In such case we use `st->jmp_history[cnt - 1].prev_idx`
to find instruction from which we jumped to the current instruction
non-linearly.

Also remove both jump and prune point marking for instruction right
after unconditional jumps, as program flow can get to the instruction
right after unconditional jump instruction only if there is a jump to
that instruction from somewhere else in the program. In such case we'll
mark such instruction as prune/jump point because it's a destination of
a jump.

This change has no changes in terms of number of instructions or states
processes across Cilium and selftests programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20221206233345.438540-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Andrii Nakryiko
743f3548d3 bpf: decouple prune and jump points
[ Upstream commit bffdeaa8a5 ]

BPF verifier marks some instructions as prune points. Currently these
prune points serve two purposes.

It's a point where verifier tries to find previously verified state and
check current state's equivalence to short circuit verification for
current code path.

But also currently it's a point where jump history, used for precision
backtracking, is updated. This is done so that non-linear flow of
execution could be properly backtracked.

Such coupling is coincidental and unnecessary. Some prune points are not
part of some non-linear jump path, so don't need update of jump history.
On the other hand, not all instructions which have to be recorded in
jump history necessarily are good prune points.

This patch splits prune and jump points into independent flags.
Currently all prune points are marked as jump points to minimize amount
of changes in this patch, but next patch will perform some optimization
of prune vs jmp point placement.

No functional changes are intended.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221206233345.438540-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Dan Carpenter
eb4f2e1788 fbdev: imsttfb: fix double free in probe()
[ Upstream commit e08c30efda21ef4c0ec084a3a9581c220b442ba9 ]

The init_imstt() function calls framebuffer_release() on error and then
the probe() function calls it again.  It should only be done in probe.

Fixes: 518ecb6a20 ("fbdev: imsttfb: Fix error path of imsttfb_probe()")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
Helge Deller
f2a79f3651 fbdev: imsttfb: Release framebuffer and dealloc cmap on error path
[ Upstream commit 5cf9a090a3 ]

Add missing cleanups in error path.

Signed-off-by: Helge Deller <deller@gmx.de>
Stable-dep-of: e08c30efda21 ("fbdev: imsttfb: fix double free in probe()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:29 +01:00
David Heidelberg
51a1b94302 arm64: dts: qcom: sdm845: Fix PSCI power domain names
[ Upstream commit a5f01673d3946e424091e6b8ff274716f9c21454 ]

The original commit hasn't been updated according to
refactoring done in sdm845.dtsi.

Fixes: a1ade6cac5 ("arm64: dts: qcom: sdm845: Switch PSCI cpu idle states from PC to OSI")
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Link: https://lore.kernel.org/r/20230912071205.11502-1-david@ixit.cz
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Krzysztof Kozlowski
5db8b93cbe arm64: dts: qcom: sdm845: align RPMh regulator nodes with bindings
[ Upstream commit 86dd19bbde ]

Device node names should be generic and bindings expect certain pattern
for RPMh regulator nodes.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230127114347.235963-6-krzysztof.kozlowski@linaro.org
Stable-dep-of: a5f01673d394 ("arm64: dts: qcom: sdm845: Fix PSCI power domain names")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Rotem Saado
343bb27e31 wifi: iwlwifi: yoyo: swap cdb and jacket bits values
[ Upstream commit 65008777b9dcd2002414ddb2c2158293a6e2fd6f ]

The bits are wrong, the jacket bit should be 5 and cdb bit 4.
Fix it.

Fixes: 1f171f4f14 ("iwlwifi: Add support for getting rf id with blank otp")
Signed-off-by: Rotem Saado <rotem.saado@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20231004123422.356d8dacda2f.I349ab888b43a11baa2453a1d6978a6a703e422f0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Eric Dumazet
158b71f3a9 udp: annotate data-races around udp->encap_type
[ Upstream commit 70a36f571362a8de8b8c02d21ae524fc776287f2 ]

syzbot/KCSAN complained about UDP_ENCAP_L2TPINUDP setsockopt() racing.

Add READ_ONCE()/WRITE_ONCE() to document races on this lockless field.

syzbot report was:
BUG: KCSAN: data-race in udp_lib_setsockopt / udp_lib_setsockopt

read-write to 0xffff8881083603fa of 1 bytes by task 16557 on cpu 0:
udp_lib_setsockopt+0x682/0x6c0
udp_setsockopt+0x73/0xa0 net/ipv4/udp.c:2779
sock_common_setsockopt+0x61/0x70 net/core/sock.c:3697
__sys_setsockopt+0x1c9/0x230 net/socket.c:2263
__do_sys_setsockopt net/socket.c:2274 [inline]
__se_sys_setsockopt net/socket.c:2271 [inline]
__x64_sys_setsockopt+0x66/0x80 net/socket.c:2271
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read-write to 0xffff8881083603fa of 1 bytes by task 16554 on cpu 1:
udp_lib_setsockopt+0x682/0x6c0
udp_setsockopt+0x73/0xa0 net/ipv4/udp.c:2779
sock_common_setsockopt+0x61/0x70 net/core/sock.c:3697
__sys_setsockopt+0x1c9/0x230 net/socket.c:2263
__do_sys_setsockopt net/socket.c:2274 [inline]
__se_sys_setsockopt net/socket.c:2271 [inline]
__x64_sys_setsockopt+0x66/0x80 net/socket.c:2271
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x01 -> 0x05

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 16554 Comm: syz-executor.5 Not tainted 6.5.0-rc7-syzkaller-00004-gf7757129e3de #0

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Eric Dumazet
8d929b6c11 udp: lockless UDP_ENCAP_L2TPINUDP / UDP_GRO
[ Upstream commit ac9a7f4ce5dda1472e8f44096f33066c6ec1a3b4 ]

Move udp->encap_enabled to udp->udp_flags.

Add udp_test_and_set_bit() helper to allow lockless
udp_tunnel_encap_enable() implementation.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 70a36f571362 ("udp: annotate data-races around udp->encap_type")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Eric Dumazet
b680a907d1 udp: move udp->accept_udp_{l4|fraglist} to udp->udp_flags
[ Upstream commit f5f52f0884a595ff99ab1a608643fe4025fca2d5 ]

These are read locklessly, move them to udp_flags to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 70a36f571362 ("udp: annotate data-races around udp->encap_type")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Eric Dumazet
753886c0b9 udp: move udp->gro_enabled to udp->udp_flags
[ Upstream commit e1dc0615c6b08ef36414f08c011965b8fb56198b ]

syzbot reported that udp->gro_enabled can be read locklessly.
Use one atomic bit from udp->udp_flags.

Fixes: e20cf8d3f1 ("udp: implement GRO for plain UDP sockets.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Eric Dumazet
a01cff15cc udp: move udp->no_check6_rx to udp->udp_flags
[ Upstream commit bcbc1b1de884647aa0318bf74eb7f293d72a1e40 ]

syzbot reported that udp->no_check6_rx can be read locklessly.
Use one atomic bit from udp->udp_flags.

Fixes: 1c19448c9b ("net: Make enabling of zero UDP6 csums more restrictive")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:28 +01:00
Eric Dumazet
50e41aa9ea udp: move udp->no_check6_tx to udp->udp_flags
[ Upstream commit a0002127cd746fcaa182ad3386ef6931c37f3bda ]

syzbot reported that udp->no_check6_tx can be read locklessly.
Use one atomic bit from udp->udp_flags

Fixes: 1c19448c9b ("net: Make enabling of zero UDP6 csums more restrictive")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:27 +01:00
Eric Dumazet
e2a4392b61 udp: introduce udp->udp_flags
[ Upstream commit 81b36803ac139827538ac5ce4028e750a3c53f53 ]

According to syzbot, it is time to use proper atomic flags
for various UDP flags.

Add udp_flags field, and convert udp->corkflag to first
bit in it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:27 +01:00
David Howells
2489502fb1 ipv4, ipv6: Use splice_eof() to flush
[ Upstream commit 1d7e4538a5 ]

Allow splice to undo the effects of MSG_MORE after prematurely ending a
splice/sendfile due to getting an EOF condition (->splice_read() returned
0) after splice had called sendmsg() with MSG_MORE set when the user didn't
set MSG_MORE.

For UDP, a pending packet will not be emitted if the socket is closed
before it is flushed; with this change, it be flushed by ->splice_eof().

For TCP, it's not clear that MSG_MORE is actually effective.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Kuniyuki Iwashima <kuniyu@amazon.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: David Ahern <dsahern@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:27 +01:00
David Howells
4713b7c756 splice, net: Add a splice_eof op to file-ops and socket-ops
[ Upstream commit 2bfc668509 ]

Add an optional method, ->splice_eof(), to allow splice to indicate the
premature termination of a splice to struct file_operations and struct
proto_ops.

This is called if sendfile() or splice() encounters all of the following
conditions inside splice_direct_to_actor():

 (1) the user did not set SPLICE_F_MORE (splice only), and

 (2) an EOF condition occurred (->splice_read() returned 0), and

 (3) we haven't read enough to fulfill the request (ie. len > 0 still), and

 (4) we have already spliced at least one byte.

A further patch will modify the behaviour of SPLICE_F_MORE to always be
passed to the actor if either the user set it or we haven't yet read
sufficient data to fulfill the request.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: David Hildenbrand <david@redhat.com>
cc: Christian Brauner <brauner@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: linux-mm@kvack.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-10 17:10:27 +01:00