commit 1684878952 upstream.
There has been a lot of confusion around which platform profiles are
supported on various platforms and it would be useful to have a debug
method to be able to override the profile mode that is selected.
I don't expect this to be used in anything other than debugging in
conjunction with Lenovo engineers - but it does give a way to get a
system working whilst we wait for either FW fixes, or a driver fix
to land upstream, if something is wonky in the mode detection logic
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20230505132523.214338-2-mpearson-lenovo@squebb.ca
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c0cd3e25a upstream.
I had incorrectly thought that PSC profiles were not usable on Intel
platforms so had blocked them in the driver initialistion. This broke
platform profiles on the T490.
After discussion with the FW team PSC does work on Intel platforms and
should be allowed.
Note - it's possible this may impact other platforms where it is advertised
but special driver support that only Windows has is needed. But if it does
then they will need fixing via quirks. Please report any issues to me so I
can get them addressed - but I haven't found any problems in testing...yet
Fixes: bce6243f76 ("platform/x86: thinkpad_acpi: do not use PSC mode on Intel platforms")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2177962
Cc: stable@vger.kernel.org
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20230505132523.214338-1-mpearson-lenovo@squebb.ca
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d66cde50c3 upstream.
Change type of pcchunk->Length from u32 to u64 to match
smb2_copychunk_range arguments type. Fixes the problem where performing
server-side copy with CIFS_IOC_COPYCHUNK_FILE ioctl resulted in incomplete
copy of large files while returning -EINVAL.
Fixes: 9bf0c9cd43 ("CIFS: Fix SMB2/SMB3 Copy offload support (refcopy) for large files")
Cc: <stable@vger.kernel.org>
Signed-off-by: Pawel Witek <pawel.ireneusz.witek@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02ca9e6fb5 upstream.
When both of the superblock zones are full, we need to check which
superblock is newer. The calculation of last superblock position is wrong
as it does not consider zone_capacity and uses the length.
Fixes: 9658b72ef3 ("btrfs: zoned: locate superblock position using zone capacity")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f84353c7c2 upstream.
For data block groups, we zone finish a zone (or, just deactivate it) when
seeing the last IO in btrfs_finish_ordered_io(). That is only called for
IOs using ZONE_APPEND, but we use a regular WRITE command for data
relocation IOs. Detect it and call btrfs_zone_finish_endio() properly.
Fixes: be1a1d7a5d ("btrfs: zoned: finish fully written block group")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0004ff15ea upstream.
When loading a free space cache from disk, at __load_free_space_cache(),
if we fail to insert a bitmap entry, we still increment the number of
total bitmaps in the btrfs_free_space_ctl structure, which is incorrect
since we failed to add the bitmap entry. On error we then empty the
cache by calling __btrfs_remove_free_space_cache(), which will result
in getting the total bitmaps counter set to 1.
A failure to load a free space cache is not critical, so if a failure
happens we just rebuild the cache by scanning the extent tree, which
happens at block-group.c:caching_thread(). Yet the failure will result
in having the total bitmaps of the btrfs_free_space_ctl always bigger
by 1 then the number of bitmap entries we have. So fix this by having
the total bitmaps counter be incremented only if we successfully added
the bitmap entry.
Fixes: a67509c300 ("Btrfs: add a io_ctl struct and helpers for dealing with the space cache")
Reviewed-by: Anand Jain <anand.jain@oracle.com>
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c87f318e6f upstream.
Check nodesize to sectorsize in alignment check in print_extent_item.
The comment states that and this is correct, similar check is done
elsewhere in the functions.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: ea57788eb7 ("btrfs: require only sector size alignment for parent eb bytenr")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d6a4fc857 upstream.
Previously clear_cache mount option would simply disable free-space-tree
feature temporarily then re-enable it to rebuild the whole free space
tree.
But this is problematic for block-group-tree feature, as we have an
artificial dependency on free-space-tree feature.
If we go the existing method, after clearing the free-space-tree
feature, we would flip the filesystem to read-only mode, as we detect a
super block write with block-group-tree but no free-space-tree feature.
This patch would change the behavior by properly rebuilding the free
space tree without disabling this feature, thus allowing clear_cache
mount option to work with block group tree.
Now we can mount a filesystem with block-group-tree feature and
clear_mount option:
$ mkfs.btrfs -O block-group-tree /dev/test/scratch1 -f
$ sudo mount /dev/test/scratch1 /mnt/btrfs -o clear_cache
$ sudo dmesg -t | head -n 5
BTRFS info (device dm-1): force clearing of disk cache
BTRFS info (device dm-1): using free space tree
BTRFS info (device dm-1): auto enabling async discard
BTRFS info (device dm-1): rebuilding free space tree
BTRFS info (device dm-1): checking UUID tree
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c83b56d1dd upstream.
btrfs_redirty_list_add zeroes the buffer data and sets the
EXTENT_BUFFER_NO_CHECK to make sure writeback is fine with a bogus
header. But it does that after already marking the buffer dirty, which
means that writeback could already be looking at the buffer.
Switch the order of operations around so that the buffer is only marked
dirty when we're ready to write it.
Fixes: d3575156f6 ("btrfs: zoned: redirty released extent buffers")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d246331b78 upstream.
Boris noticed in his simple quotas testing that he was getting a leak
with Sweet Tea's change to subvol create that stopped doing a
transaction commit. This was just a side effect of that change.
In the delayed inode code we have an optimization that will free extra
reservations if we think we can pack a dir item into an already modified
leaf. Previously this wouldn't be triggered in the subvolume create
case because we'd commit the transaction, it was still possible but
much harder to trigger. It could actually be triggered if we did a
mkdir && subvol create with qgroups enabled.
This occurs because in btrfs_insert_delayed_dir_index(), which gets
called when we're adding the dir item, we do the following:
btrfs_block_rsv_release(fs_info, trans->block_rsv, bytes, NULL);
if we're able to skip reserving space.
The problem here is that trans->block_rsv points at the temporary block
rsv for the subvolume create, which has qgroup reservations in the block
rsv.
This is a problem because btrfs_block_rsv_release() will do the
following:
if (block_rsv->qgroup_rsv_reserved >= block_rsv->qgroup_rsv_size) {
qgroup_to_release = block_rsv->qgroup_rsv_reserved -
block_rsv->qgroup_rsv_size;
block_rsv->qgroup_rsv_reserved = block_rsv->qgroup_rsv_size;
}
The temporary block rsv just has ->qgroup_rsv_reserved set,
->qgroup_rsv_size == 0. The optimization in
btrfs_insert_delayed_dir_index() sets ->qgroup_rsv_reserved = 0. Then
later on when we call btrfs_subvolume_release_metadata() which has
btrfs_block_rsv_release(fs_info, rsv, (u64)-1, &qgroup_to_release);
btrfs_qgroup_convert_reserved_meta(root, qgroup_to_release);
qgroup_to_release is set to 0, and we do not convert the reserved
metadata space.
The problem here is that the block rsv code has been unconditionally
messing with ->qgroup_rsv_reserved, because the main place this is used
is delalloc, and any time we call btrfs_block_rsv_release() we do it
with qgroup_to_release set, and thus do the proper accounting.
The subvolume code is the only other code that uses the qgroup
reservation stuff, but it's intermingled with the above optimization,
and thus was getting its reservation freed out from underneath it and
thus leaking the reserved space.
The solution is to simply not mess with the qgroup reservations if we
don't have qgroup_to_release set. This works with the existing code as
anything that messes with the delalloc reservations always have
qgroup_to_release set. This fixes the leak that Boris was observing.
Reviewed-by: Qu Wenruo <wqu@suse.com>
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7db9e5c6b upstream.
We have observed a btrfs filesystem corruption on workloads using
no-holes and encoded writes via send stream v2. The symptom is that a
file appears to be truncated to the end of its last aligned extent, even
though the final unaligned extent and even the file extent and otherwise
correctly updated inode item have been written.
So if we were writing out a 1MiB+X file via 8 128K extents and one
extent of length X, i_size would be set to 1MiB, but the ninth extent,
nbyte, etc. would all appear correct otherwise.
The source of the race is a narrow (one line of code) window in which a
no-holes fs has read in an updated i_size, but has not yet set a shared
disk_i_size variable to write. Therefore, if two ordered extents run in
parallel (par for the course for receive workloads), the following
sequence can play out: (following "threads" a bit loosely, since there
are callbacks involved for endio but extra threads aren't needed to
cause the issue)
ENC-WR1 (second to last) ENC-WR2 (last)
------- -------
btrfs_do_encoded_write
set i_size = 1M
submit bio B1 ending at 1M
endio B1
btrfs_inode_safe_disk_i_size_write
local i_size = 1M
falls off a cliff for some reason
btrfs_do_encoded_write
set i_size = 1M+X
submit bio B2 ending at 1M+X
endio B2
btrfs_inode_safe_disk_i_size_write
local i_size = 1M+X
disk_i_size = 1M+X
disk_i_size = 1M
btrfs_delayed_update_inode
btrfs_delayed_update_inode
And the delayed inode ends up filled with nbytes=1M+X and isize=1M, and
writes respect i_size and present a corrupted file missing its last
extents.
Fix this by holding the inode lock in the no-holes case so that a thread
can't sneak in a write to disk_i_size that gets overwritten with an out
of date i_size.
Fixes: 41a2ee75aa ("btrfs: introduce per-inode file extent tree")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64b5d5b285 upstream.
[BUG]
With block-group-tree feature enabled, mounting it with clear_cache
would cause the following transaction abort at mount or remount:
BTRFS info (device dm-4): force clearing of disk cache
BTRFS info (device dm-4): using free space tree
BTRFS info (device dm-4): auto enabling async discard
BTRFS info (device dm-4): clearing free space tree
BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1)
BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes
BTRFS error (device dm-4): super block corruption detected before writing it to disk
BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected)
BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.
[CAUSE]
For block-group-tree feature, we have an artificial dependency on
free-space-tree.
This means if we detect block-group-tree without v2 cache, we consider
it a corruption and cause the problem.
For clear_cache mount option, it would temporary disable v2 cache, then
re-enable it.
But unfortunately for that temporary v2 cache disabled status, we refuse
to write a superblock with bg tree only flag, thus leads to the above
transaction abortion.
[FIX]
For now, just reject clear_cache and v1 cache mount option for block
group tree. So now we got a graceful rejection other than a transaction
abort:
BTRFS info (device dm-4): force clearing of disk cache
BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature
BTRFS error (device dm-4): open_ctree failed
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 631003e233 upstream.
find_next_bit and find_next_zero_bit take @size as the second parameter and
@offset as the third parameter. They are specified opposite in
btrfs_ensure_empty_zones(). Thanks to the later loop, it never failed to
detect the empty zones. Fix them and (maybe) return the result a bit
faster.
Note: the naming is a bit confusing, size has two meanings here, bitmap
and our range size.
Fixes: 1cd6121f2a ("btrfs: zoned: implement zoned chunk allocator")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f932d4ef0 upstream.
A call to btrfs_prev_leaf() may end up returning a path that points to the
same item (key) again. This happens if while btrfs_prev_leaf(), after we
release the path, a concurrent insertion happens, which moves items off
from a sibling into the front of the previous leaf, and an item with the
computed previous key does not exists.
For example, suppose we have the two following leaves:
Leaf A
-------------------------------------------------------------
| ... key (300 96 10) key (300 96 15) key (300 96 16) |
-------------------------------------------------------------
slot 20 slot 21 slot 22
Leaf B
-------------------------------------------------------------
| key (300 96 20) key (300 96 21) key (300 96 22) ... |
-------------------------------------------------------------
slot 0 slot 1 slot 2
If we call btrfs_prev_leaf(), from btrfs_previous_item() for example, with
a path pointing to leaf B and slot 0 and the following happens:
1) At btrfs_prev_leaf() we compute the previous key to search as:
(300 96 19), which is a key that does not exists in the tree;
2) Then we call btrfs_release_path() at btrfs_prev_leaf();
3) Some other task inserts a key at leaf A, that sorts before the key at
slot 20, for example it has an objectid of 299. In order to make room
for the new key, the key at slot 22 is moved to the front of leaf B.
This happens at push_leaf_right(), called from split_leaf().
After this leaf B now looks like:
--------------------------------------------------------------------------------
| key (300 96 16) key (300 96 20) key (300 96 21) key (300 96 22) ... |
--------------------------------------------------------------------------------
slot 0 slot 1 slot 2 slot 3
4) At btrfs_prev_leaf() we call btrfs_search_slot() for the computed
previous key: (300 96 19). Since the key does not exists,
btrfs_search_slot() returns 1 and with a path pointing to leaf B
and slot 1, the item with key (300 96 20);
5) This makes btrfs_prev_leaf() return a path that points to slot 1 of
leaf B, the same key as before it was called, since the key at slot 0
of leaf B (300 96 16) is less than the computed previous key, which is
(300 96 19);
6) As a consequence btrfs_previous_item() returns a path that points again
to the item with key (300 96 20).
For some users of btrfs_prev_leaf() or btrfs_previous_item() this may not
be functional a problem, despite not making sense to return a new path
pointing again to the same item/key. However for a caller such as
tree-log.c:log_dir_items(), this has a bad consequence, as it can result
in not logging some dir index deletions in case the directory is being
logged without holding the inode's VFS lock (logging triggered while
logging a child inode for example) - for the example scenario above, in
case the dir index keys 17, 18 and 19 were deleted in the current
transaction.
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a48d60467 upstream.
SYM_FUNC_START_LOCAL_NOALIGN() adds an endbr leading to this layout
(leaving only the last 2 bytes of the address):
3bff <zen_untrain_ret>:
3bff: f3 0f 1e fa endbr64
3c03: f6 test $0xcc,%bl
3c04 <__x86_return_thunk>:
3c04: c3 ret
3c05: cc int3
3c06: 0f ae e8 lfence
However, "the RET at __x86_return_thunk must be on a 64 byte boundary,
for alignment within the BTB."
Use SYM_START instead.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ecc68ee216 ]
It seems that perf stat -b <prog id> doesn't produce any results:
$ perf stat -e cycles -b 4 -I 10000 -vvv
Control descriptor is not initialized
cycles: 0 0 0
time counts unit events
10.007641640 <not supported> cycles
Looks like this happens because fentry/fexit progs are getting loaded, but the
corresponding perf event is not enabled and not added into the events bpf map.
I think there is some mixing up between two type of bpf support, one for bperf
and one for bpf_profiler. Both are identified via evsel__is_bpf, based on which
perf events are enabled, but for the latter (bpf_profiler) a perf event is
required. Using evsel__is_bperf to check only bperf produces expected results:
$ perf stat -e cycles -b 4 -I 10000 -vvv
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
size 136
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
[...perf_event_attr for other CPUs...]
------------------------------------------------------------
cycles: 309426 169009 169009
time counts unit events
10.010091271 309426 cycles
The final numbers correspond (at least in the level of magnitude) to the
same metric obtained via bpftool.
Fixes: 112cb56164 ("perf stat: Introduce config stat.bpf-counter-events")
Reviewed-by: Song Liu <song@kernel.org>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Song Liu <song@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230412182316.11628-1-9erthalion6@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4140aafcff ]
CRYPTO_TFM_REQ_MAY_BACKLOG tells the crypto driver that it should
internally backlog requests until the crypto hw's queue becomes
full. At that point, crypto_engine backlogs the request and returns
-EBUSY. Calling driver such as dm-crypt then waits until the
complete() function is called with a status of -EINPROGRESS before
sending a new request.
The problem lies in the call to complete() with a value of -EINPROGRESS
that is made when a backlog item is present on the queue. The call is
done before the successful execution of the crypto request. In the case
that do_one_request() returns < 0 and the retry support is available,
the request is put back in the queue. This leads upper drivers to send
a new request even if the queue is still full.
The problem can be reproduced by doing a large dd into a crypto
dm-crypt device. This is pretty easy to see when using
Freescale CAAM crypto driver and SWIOTLB dma. Since the actual amount
of requests that can be hold in the queue is unlimited we get IOs error
and dma allocation.
The fix is to call complete with a value of -EINPROGRESS only if
the request is not enqueued back in crypto_queue. This is done
by calling complete() later in the code. In order to delay the decision,
crypto_queue is modified to correctly set the backlog pointer
when a request is enqueued back.
Fixes: 6a89f492f8 ("crypto: engine - support for parallel requests based on retry mechanism")
Co-developed-by: Sylvain Ouellet <souellet@genetec.com>
Signed-off-by: Sylvain Ouellet <souellet@genetec.com>
Signed-off-by: Olivier Bacon <obacon@genetec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c35e03eaec ]
The crypto completion function currently takes a pointer to a
struct crypto_async_request object. However, in reality the API
does not allow the use of any part of the object apart from the
data field. For example, ahash/shash will create a fake object
on the stack to pass along a different data field.
This leads to potential bugs where the user may try to dereference
or otherwise use the crypto_async_request object.
This patch adds some temporary scaffolding so that the completion
function can take a void * instead. Once affected users have been
converted this can be removed.
The helper crypto_request_complete will remain even after the
conversion is complete. It should be used instead of calling
the completion function directly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: 4140aafcff ("crypto: engine - fix crypto_queue backlog handling")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8fd91151eb ]
SS_ENCRYPTION is (0 << 7 = 0), so the test can never be true.
Use a direct comparison to SS_ENCRYPTION instead.
The same king of test is already done the same way in sun8i_ss_run_task().
Fixes: 359e893e8a ("crypto: sun8i-ss - rework handling of IV")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 57f14b5ae1 ]
An audit showed just this one problem with zfree(), fix it.
Fixes: 9fbc61f832 ("perf pmu: Add support for PMU capabilities")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 07d85ba9d0 ]
Hundreds of "read LOST count failed" error messages may be displayed,
when the below command is launched.
perf record -e '{cpu/mem-loads-aux/,cpu/event=0xcd,umask=0x1/}:S' -a
According to the commit 89e3106fa2 ("libperf: Handle read format
in perf_evsel__read()"), the PERF_FORMAT_GROUP is only available for
the leader. However, the record__read_lost_samples() goes through every
entry of an evlist, which includes both leader and member. The member
event errors out and triggers the error message. Since there may be
hundreds of CPUs on a server, the message will be printed hundreds of
times, which is very annoying.
The message itself is correct, but the pr_err is a overkill. Other error
messages in the record__read_lost_samples() are all pr_debug. To make
the output format consistent, change the pr_err("read LOST count
failed\n"); to pr_debug("read LOST count failed\n");.
User can still get the message via -v option.
Fixes: e3a23261ad ("perf record: Read and inject LOST_SAMPLES events")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230301150413.27011-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 299efdc238 ]
We should check whether the current SFI (Stream Filter Instance) table
is full before creating a new SFI entry. However, the previous logic
checks the handle by mistake and might lead to unpredictable behavior.
Fixes: 888ae5a395 ("net: enetc: add tc flower psfp offload driver")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8bb510439 ]
For multi-queue and large ring-size use case, the following error
occurred when free_unused_bufs:
rcu: INFO: rcu_sched self-detected stall on CPU.
Fixes: 986a4f4d45 ("virtio_net: multiqueue support")
Signed-off-by: Wenliang Wang <wangwenliang.1995@bytedance.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 120a56b01b ]
On mt753x_cpu_port_enable() there's code that enables flooding for the CPU
port only. Since mt753x_cpu_port_enable() runs twice when both CPU ports
are enabled, port 6 becomes the only port to forward the frames to. But
port 5 is the active port, so no frames received from the user ports will
be forwarded to port 5 which breaks network connectivity.
Every bit of the BC_FFP, UNM_FFP, and UNU_FFP bits represents a port. Fix
this issue by setting the bit that corresponds to the CPU port without
overwriting the other bits.
Clear the bits beforehand only for the MT7531 switch. According to the
documents MT7621 Giga Switch Programming Guide v0.3 and MT7531 Reference
Manual for Development Board v1.0, after reset, the BC_FFP, UNM_FFP, and
UNU_FFP bits are set to 1 for MT7531, 0 for MT7530.
The commit 5e5502e012 ("net: dsa: mt7530: fix roaming from DSA user
ports") silently changed the method to set the bits on the MT7530_MFC.
Instead of clearing the relevant bits before mt7530_cpu_port_enable()
which runs under a for loop, the commit started doing it on
mt7530_cpu_port_enable().
Back then, this didn't really matter as only a single CPU port could be
used since the CPU port number was hardcoded. The driver was later changed
with commit 1f9a6abecf ("net: dsa: mt7530: get cpu-port via dp->cpu_dp
instead of constant") to retrieve the CPU port via dp->cpu_dp. With that,
this silent change became an issue for when using multiple CPU ports.
Fixes: 5e5502e012 ("net: dsa: mt7530: fix roaming from DSA user ports")
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f54cc9772 ]
MT7988 shares a significant part of the setup function with MT7531.
Split-off those parts into a shared function which is going to be used
also by mt7988_setup.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 120a56b01b ("net: dsa: mt7530: fix network connectivity with multiple CPU ports")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 37c218d802 ]
The multi-chip module MT7530 switch with a 40 MHz oscillator on the
MT7621AT, MT7621DAT, and MT7621ST SoCs forwards corrupt frames using
trgmii.
This is caused by the assumption that MT7621 SoCs have got 150 MHz PLL,
hence using the ncpo1 value, 0x0780.
My testing shows this value works on Unielec U7621-06, Bartel's testing
shows it won't work on Hi-Link HLK-MT7621A and Netgear WAC104. All devices
tested have got 40 MHz oscillators.
Using the value for 125 MHz PLL, 0x0640, works on all boards at hand. The
definitions for 125 MHz PLL exist on the Banana Pi BPI-R2 BSP source code
whilst 150 MHz PLL don't.
Forwarding frames using trgmii on the MCM MT7530 switch with a 25 MHz
oscillator on the said MT7621 SoCs works fine because the ncpo1 value
defined for it is for 125 MHz PLL.
Change the 150 MHz PLL comment to 125 MHz PLL, and use the 125 MHz PLL
ncpo1 values for both oscillator frequencies.
Link: 81d24bbce7/u-boot-mt/drivers/net/rt2880_eth.c (L2195)
Fixes: 7ef6f6f8d2 ("net: dsa: mt7530: Add MT7621 TRGMII mode support")
Tested-by: Bartel Eerdekens <bartel.eerdekens@constell8.be>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c148dc8e2f ]
Fix a potential race in gmap_make_secure() and remove the last user of
follow_page() without FOLL_GET.
The old code is locking something it doesn't have a reference to, and
as explained by Jason and David in this discussion:
https://lore.kernel.org/linux-mm/Y9J4P%2FRNvY1Ztn0Q@nvidia.com/
it can lead to all kind of bad things, including the page getting
unmapped (MADV_DONTNEED), freed, reallocated as a larger folio and the
unlock_page() would target the wrong bit.
There is also another race with the FOLL_WRITE, which could race
between the follow_page() and the get_locked_pte().
The main point is to remove the last use of follow_page() without
FOLL_GET or FOLL_PIN, removing the races can be considered a nice
bonus.
Link: https://lore.kernel.org/linux-mm/Y9J4P%2FRNvY1Ztn0Q@nvidia.com/
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: 214d9bbcd3 ("s390/mm: provide memory management functions for protected KVM guests")
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230428092753.27913-2-imbrenda@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d727e1856 ]
Smatch complains that:
snd_usb_caiaq_input_init() warn: missing error code 'ret'
This patch adds a new case to handle the situation where the
device does not support any input methods in the
`snd_usb_caiaq_input_init` function. It returns an `-EINVAL` error code
to indicate that no input methods are supported on the device.
Fixes: 523f1dce37 ("[ALSA] Add Native Instrument usb audio device support")
Signed-off-by: Ruliang Lin <u202112092@hust.edu.cn>
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>
Acked-by: Daniel Mack <daniel@zonque.org>
Link: https://lore.kernel.org/r/20230504065054.3309-1-u202112092@hust.edu.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2397e3d8d2 ]
mgr->ctx_handles should be protected by mgr->lock.
v2: improve commit message
v3: add a Fixes tag
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Fixes: 52c6a62c64 ("drm/amdgpu: add interface for editing a foreign process's priority v3")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a54903ff6 ]
Add a check for NULL on the alloc return. If devlink_alloc() fails and
we try to use devlink_priv() on the NULL return, the kernel gets very
unhappy and panics. With this fix, the driver load will still fail,
but at least it won't panic the kernel.
Fixes: df69ba4321 ("ionic: Add basic framework for IONIC Network device driver")
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ad685dbfe ]
It is not possible to set the number of lanes when setting link modes
using the legacy IOCTL ethtool interface. Since 'struct
ethtool_link_ksettings' is not initialized in this path, drivers receive
an uninitialized number of lanes in 'struct
ethtool_link_ksettings::lanes'.
When this information is later queried from drivers, it results in the
ethtool code making decisions based on uninitialized memory, leading to
the following KMSAN splat [1]. In practice, this most likely only
happens with the tun driver that simply returns whatever it got in the
set operation.
As far as I can tell, this uninitialized memory is not leaked to user
space thanks to the 'ethtool_ops->cap_link_lanes_supported' check in
linkmodes_prepare_data().
Fix by initializing the structure in the IOCTL path. Did not find any
more call sites that pass an uninitialized structure when calling
'ethtool_ops::set_link_ksettings()'.
[1]
BUG: KMSAN: uninit-value in ethnl_update_linkmodes net/ethtool/linkmodes.c:273 [inline]
BUG: KMSAN: uninit-value in ethnl_set_linkmodes+0x190b/0x19d0 net/ethtool/linkmodes.c:333
ethnl_update_linkmodes net/ethtool/linkmodes.c:273 [inline]
ethnl_set_linkmodes+0x190b/0x19d0 net/ethtool/linkmodes.c:333
ethnl_default_set_doit+0x88d/0xde0 net/ethtool/netlink.c:640
genl_family_rcv_msg_doit net/netlink/genetlink.c:968 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline]
genl_rcv_msg+0x141a/0x14c0 net/netlink/genetlink.c:1065
netlink_rcv_skb+0x3f8/0x750 net/netlink/af_netlink.c:2577
genl_rcv+0x40/0x60 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
netlink_unicast+0xf41/0x1270 net/netlink/af_netlink.c:1365
netlink_sendmsg+0x127d/0x1430 net/netlink/af_netlink.c:1942
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
____sys_sendmsg+0xa24/0xe40 net/socket.c:2501
___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2555
__sys_sendmsg net/socket.c:2584 [inline]
__do_sys_sendmsg net/socket.c:2593 [inline]
__se_sys_sendmsg net/socket.c:2591 [inline]
__x64_sys_sendmsg+0x36b/0x540 net/socket.c:2591
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Uninit was stored to memory at:
tun_get_link_ksettings+0x37/0x60 drivers/net/tun.c:3544
__ethtool_get_link_ksettings+0x17b/0x260 net/ethtool/ioctl.c:441
ethnl_set_linkmodes+0xee/0x19d0 net/ethtool/linkmodes.c:327
ethnl_default_set_doit+0x88d/0xde0 net/ethtool/netlink.c:640
genl_family_rcv_msg_doit net/netlink/genetlink.c:968 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline]
genl_rcv_msg+0x141a/0x14c0 net/netlink/genetlink.c:1065
netlink_rcv_skb+0x3f8/0x750 net/netlink/af_netlink.c:2577
genl_rcv+0x40/0x60 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
netlink_unicast+0xf41/0x1270 net/netlink/af_netlink.c:1365
netlink_sendmsg+0x127d/0x1430 net/netlink/af_netlink.c:1942
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
____sys_sendmsg+0xa24/0xe40 net/socket.c:2501
___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2555
__sys_sendmsg net/socket.c:2584 [inline]
__do_sys_sendmsg net/socket.c:2593 [inline]
__se_sys_sendmsg net/socket.c:2591 [inline]
__x64_sys_sendmsg+0x36b/0x540 net/socket.c:2591
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Uninit was stored to memory at:
tun_set_link_ksettings+0x37/0x60 drivers/net/tun.c:3553
ethtool_set_link_ksettings+0x600/0x690 net/ethtool/ioctl.c:609
__dev_ethtool net/ethtool/ioctl.c:3024 [inline]
dev_ethtool+0x1db9/0x2a70 net/ethtool/ioctl.c:3078
dev_ioctl+0xb07/0x1270 net/core/dev_ioctl.c:524
sock_do_ioctl+0x295/0x540 net/socket.c:1213
sock_ioctl+0x729/0xd90 net/socket.c:1316
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl+0x222/0x400 fs/ioctl.c:856
__x64_sys_ioctl+0x96/0xe0 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Local variable link_ksettings created at:
ethtool_set_link_ksettings+0x54/0x690 net/ethtool/ioctl.c:577
__dev_ethtool net/ethtool/ioctl.c:3024 [inline]
dev_ethtool+0x1db9/0x2a70 net/ethtool/ioctl.c:3078
Fixes: 012ce4dd31 ("ethtool: Extend link modes settings uAPI with lanes")
Reported-and-tested-by: syzbot+ef6edd9f1baaa54d6235@syzkaller.appspotmail.com
Link: https://lore.kernel.org/netdev/0000000000004bb41105fa70f361@google.com/
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>