commit dbcc7d57bf upstream.
While resolving backreferences, as part of a logical ino ioctl call or
fiemap, we can end up hitting a BUG_ON() when replaying tree mod log
operations of a root, triggering a stack trace like the following:
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.c:1210!
invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 19054 Comm: crawl_335 Tainted: G W 5.11.0-2d11c0084b02-misc-next+ #89
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__tree_mod_log_rewind+0x3b1/0x3c0
Code: 05 48 8d 74 10 (...)
RSP: 0018:ffffc90001eb70b8 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff88812344e400 RCX: ffffffffb28933b6
RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88812344e42c
RBP: ffffc90001eb7108 R08: 1ffff11020b60a20 R09: ffffed1020b60a20
R10: ffff888105b050f9 R11: ffffed1020b60a1f R12: 00000000000000ee
R13: ffff8880195520c0 R14: ffff8881bc958500 R15: ffff88812344e42c
FS: 00007fd1955e8700(0000) GS:ffff8881f5600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efdb7928718 CR3: 000000010103a006 CR4: 0000000000170ee0
Call Trace:
btrfs_search_old_slot+0x265/0x10d0
? lock_acquired+0xbb/0x600
? btrfs_search_slot+0x1090/0x1090
? free_extent_buffer.part.61+0xd7/0x140
? free_extent_buffer+0x13/0x20
resolve_indirect_refs+0x3e9/0xfc0
? lock_downgrade+0x3d0/0x3d0
? __kasan_check_read+0x11/0x20
? add_prelim_ref.part.11+0x150/0x150
? lock_downgrade+0x3d0/0x3d0
? __kasan_check_read+0x11/0x20
? lock_acquired+0xbb/0x600
? __kasan_check_write+0x14/0x20
? do_raw_spin_unlock+0xa8/0x140
? rb_insert_color+0x30/0x360
? prelim_ref_insert+0x12d/0x430
find_parent_nodes+0x5c3/0x1830
? resolve_indirect_refs+0xfc0/0xfc0
? lock_release+0xc8/0x620
? fs_reclaim_acquire+0x67/0xf0
? lock_acquire+0xc7/0x510
? lock_downgrade+0x3d0/0x3d0
? lockdep_hardirqs_on_prepare+0x160/0x210
? lock_release+0xc8/0x620
? fs_reclaim_acquire+0x67/0xf0
? lock_acquire+0xc7/0x510
? poison_range+0x38/0x40
? unpoison_range+0x14/0x40
? trace_hardirqs_on+0x55/0x120
btrfs_find_all_roots_safe+0x142/0x1e0
? find_parent_nodes+0x1830/0x1830
? btrfs_inode_flags_to_xflags+0x50/0x50
iterate_extent_inodes+0x20e/0x580
? tree_backref_for_extent+0x230/0x230
? lock_downgrade+0x3d0/0x3d0
? read_extent_buffer+0xdd/0x110
? lock_downgrade+0x3d0/0x3d0
? __kasan_check_read+0x11/0x20
? lock_acquired+0xbb/0x600
? __kasan_check_write+0x14/0x20
? _raw_spin_unlock+0x22/0x30
? __kasan_check_write+0x14/0x20
iterate_inodes_from_logical+0x129/0x170
? iterate_inodes_from_logical+0x129/0x170
? btrfs_inode_flags_to_xflags+0x50/0x50
? iterate_extent_inodes+0x580/0x580
? __vmalloc_node+0x92/0xb0
? init_data_container+0x34/0xb0
? init_data_container+0x34/0xb0
? kvmalloc_node+0x60/0x80
btrfs_ioctl_logical_to_ino+0x158/0x230
btrfs_ioctl+0x205e/0x4040
? __might_sleep+0x71/0xe0
? btrfs_ioctl_get_supported_features+0x30/0x30
? getrusage+0x4b6/0x9c0
? __kasan_check_read+0x11/0x20
? lock_release+0xc8/0x620
? __might_fault+0x64/0xd0
? lock_acquire+0xc7/0x510
? lock_downgrade+0x3d0/0x3d0
? lockdep_hardirqs_on_prepare+0x210/0x210
? lockdep_hardirqs_on_prepare+0x210/0x210
? __kasan_check_read+0x11/0x20
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? lock_downgrade+0x3d0/0x3d0
? lockdep_hardirqs_on_prepare+0x210/0x210
? __kasan_check_read+0x11/0x20
? lock_release+0xc8/0x620
? __task_pid_nr_ns+0xd3/0x250
? lock_acquire+0xc7/0x510
? __fget_files+0x160/0x230
? __fget_light+0xf2/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fd1976e2427
Code: 00 00 90 48 8b 05 (...)
RSP: 002b:00007fd1955e5cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fd1955e5f40 RCX: 00007fd1976e2427
RDX: 00007fd1955e5f48 RSI: 00000000c038943b RDI: 0000000000000004
RBP: 0000000001000000 R08: 0000000000000000 R09: 00007fd1955e6120
R10: 0000557835366b00 R11: 0000000000000246 R12: 0000000000000004
R13: 00007fd1955e5f48 R14: 00007fd1955e5f40 R15: 00007fd1955e5ef8
Modules linked in:
---[ end trace ec8931a1c36e57be ]---
(gdb) l *(__tree_mod_log_rewind+0x3b1)
0xffffffff81893521 is in __tree_mod_log_rewind (fs/btrfs/ctree.c:1210).
1205 * the modification. as we're going backwards, we do the
1206 * opposite of each operation here.
1207 */
1208 switch (tm->op) {
1209 case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
1210 BUG_ON(tm->slot < n);
1211 fallthrough;
1212 case MOD_LOG_KEY_REMOVE_WHILE_MOVING:
1213 case MOD_LOG_KEY_REMOVE:
1214 btrfs_set_node_key(eb, &tm->key, tm->slot);
Here's what happens to hit that BUG_ON():
1) We have one tree mod log user (through fiemap or the logical ino ioctl),
with a sequence number of 1, so we have fs_info->tree_mod_seq == 1;
2) Another task is at ctree.c:balance_level() and we have eb X currently as
the root of the tree, and we promote its single child, eb Y, as the new
root.
Then, at ctree.c:balance_level(), we call:
tree_mod_log_insert_root(eb X, eb Y, 1);
3) At tree_mod_log_insert_root() we create tree mod log elements for each
slot of eb X, of operation type MOD_LOG_KEY_REMOVE_WHILE_FREEING each
with a ->logical pointing to ebX->start. These are placed in an array
named tm_list.
Lets assume there are N elements (N pointers in eb X);
4) Then, still at tree_mod_log_insert_root(), we create a tree mod log
element of operation type MOD_LOG_ROOT_REPLACE, ->logical set to
ebY->start, ->old_root.logical set to ebX->start, ->old_root.level set
to the level of eb X and ->generation set to the generation of eb X;
5) Then tree_mod_log_insert_root() calls tree_mod_log_free_eb() with
tm_list as argument. After that, tree_mod_log_free_eb() calls
__tree_mod_log_insert() for each member of tm_list in reverse order,
from highest slot in eb X, slot N - 1, to slot 0 of eb X;
6) __tree_mod_log_insert() sets the sequence number of each given tree mod
log operation - it increments fs_info->tree_mod_seq and sets
fs_info->tree_mod_seq as the sequence number of the given tree mod log
operation.
This means that for the tm_list created at tree_mod_log_insert_root(),
the element corresponding to slot 0 of eb X has the highest sequence
number (1 + N), and the element corresponding to the last slot has the
lowest sequence number (2);
7) Then, after inserting tm_list's elements into the tree mod log rbtree,
the MOD_LOG_ROOT_REPLACE element is inserted, which gets the highest
sequence number, which is N + 2;
8) Back to ctree.c:balance_level(), we free eb X by calling
btrfs_free_tree_block() on it. Because eb X was created in the current
transaction, has no other references and writeback did not happen for
it, we add it back to the free space cache/tree;
9) Later some other task T allocates the metadata extent from eb X, since
it is marked as free space in the space cache/tree, and uses it as a
node for some other btree;
10) The tree mod log user task calls btrfs_search_old_slot(), which calls
get_old_root(), and finally that calls __tree_mod_log_oldest_root()
with time_seq == 1 and eb_root == eb Y;
11) First iteration of the while loop finds the tree mod log element with
sequence number N + 2, for the logical address of eb Y and of type
MOD_LOG_ROOT_REPLACE;
12) Because the operation type is MOD_LOG_ROOT_REPLACE, we don't break out
of the loop, and set root_logical to point to tm->old_root.logical
which corresponds to the logical address of eb X;
13) On the next iteration of the while loop, the call to
tree_mod_log_search_oldest() returns the smallest tree mod log element
for the logical address of eb X, which has a sequence number of 2, an
operation type of MOD_LOG_KEY_REMOVE_WHILE_FREEING and corresponds to
the old slot N - 1 of eb X (eb X had N items in it before being freed);
14) We then break out of the while loop and return the tree mod log operation
of type MOD_LOG_ROOT_REPLACE (eb Y), and not the one for slot N - 1 of
eb X, to get_old_root();
15) At get_old_root(), we process the MOD_LOG_ROOT_REPLACE operation
and set "logical" to the logical address of eb X, which was the old
root. We then call tree_mod_log_search() passing it the logical
address of eb X and time_seq == 1;
16) Then before calling tree_mod_log_search(), task T adds a key to eb X,
which results in adding a tree mod log operation of type
MOD_LOG_KEY_ADD to the tree mod log - this is done at
ctree.c:insert_ptr() - but after adding the tree mod log operation
and before updating the number of items in eb X from 0 to 1...
17) The task at get_old_root() calls tree_mod_log_search() and gets the
tree mod log operation of type MOD_LOG_KEY_ADD just added by task T.
Then it enters the following if branch:
if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) {
(...)
} (...)
Calls read_tree_block() for eb X, which gets a reference on eb X but
does not lock it - task T has it locked.
Then it clones eb X while it has nritems set to 0 in its header, before
task T sets nritems to 1 in eb X's header. From hereupon we use the
clone of eb X which no other task has access to;
18) Then we call __tree_mod_log_rewind(), passing it the MOD_LOG_KEY_ADD
mod log operation we just got from tree_mod_log_search() in the
previous step and the cloned version of eb X;
19) At __tree_mod_log_rewind(), we set the local variable "n" to the number
of items set in eb X's clone, which is 0. Then we enter the while loop,
and in its first iteration we process the MOD_LOG_KEY_ADD operation,
which just decrements "n" from 0 to (u32)-1, since "n" is declared with
a type of u32. At the end of this iteration we call rb_next() to find the
next tree mod log operation for eb X, that gives us the mod log operation
of type MOD_LOG_KEY_REMOVE_WHILE_FREEING, for slot 0, with a sequence
number of N + 1 (steps 3 to 6);
20) Then we go back to the top of the while loop and trigger the following
BUG_ON():
(...)
switch (tm->op) {
case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
BUG_ON(tm->slot < n);
fallthrough;
(...)
Because "n" has a value of (u32)-1 (4294967295) and tm->slot is 0.
Fix this by taking a read lock on the extent buffer before cloning it at
ctree.c:get_old_root(). This should be done regardless of the extent
buffer having been freed and reused, as a concurrent task might be
modifying it (while holding a write lock on it).
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Link: https://lore.kernel.org/linux-btrfs/20210227155037.GN28049@hungrycats.org/
Fixes: 834328a849 ("Btrfs: tree mod log's old roots could still be part of the tree")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6704a3abf4 upstream.
On hardware which supports timestamping all packets, the timestamps are
recorded in the packet buffer, and the driver no longer uses or reads
the registers. This makes the logic for checking and clearing Rx
timestamp hangs meaningless.
If we run the ixgbe_ptp_rx_hang() function in this case, then the driver
will continuously spam the log output with "Clearing Rx timestamp hang".
These messages are spurious, and confusing to end users.
The original code in commit a9763f3cb5 ("ixgbe: Update PTP to support
X550EM_x devices", 2015-12-03) did have a flag PTP_RX_TIMESTAMP_IN_REGISTER
which was intended to be used to avoid the Rx timestamp hang check,
however it did not actually check the flag before calling the function.
Do so now in order to stop the checks and prevent the spurious log
messages.
Fixes: a9763f3cb5 ("ixgbe: Update PTP to support X550EM_x devices", 2015-12-03)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
commit 622a2ef538 upstream.
The ixgbe driver has logic to handle only one Tx timestamp at a time,
using a state bit lock to avoid multiple requests at once.
It may be possible, if incredibly unlikely, that a Tx timestamp event is
requested but never completes. Since we use an interrupt scheme to
determine when the Tx timestamp occurred we would never clear the state
bit in this case.
Add an ixgbe_ptp_tx_hang() function similar to the already existing
ixgbe_ptp_rx_hang() function. This function runs in the watchdog routine
and makes sure we eventually recover from this case instead of
permanently disabling Tx timestamps.
Note: there is no currently known way to cause this without hacking the
driver code to force it.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9b3827ee6 upstream.
Add support for being able to set the learning attribute on port, and
make sure that the standalone ports start up with learning disabled.
We can remove the code in bcm_sf2 that configured the ports learning
attribute because we want the standalone ports to have learning disabled
by default and port 7 cannot be bridged, so its learning attribute will
not change past its initial configuration.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce9f24cccd upstream.
Currently, system zones just track ranges of block, that are "important"
fs metadata (bitmaps, group descriptors, journal blocks, etc.). This
however complicates how extent tree (or indirect blocks) can be checked
for inodes that actually track such metadata - currently the journal
inode but arguably we should be treating quota files or resize inode
similarly. We cannot run __ext4_ext_check() on such metadata inodes when
loading their extents as that would immediately trigger the validity
checks and so we just hack around that and special-case the journal
inode. This however leads to a situation that a journal inode which has
extent tree of depth at least one can have invalid extent tree that gets
unnoticed until ext4_cache_extents() crashes.
To overcome this limitation, track inode number each system zone belongs
to (0 is used for zones not belonging to any inode). We can then verify
inode number matches the expected one when verifying extent tree and
thus avoid the false errors. With this there's no need to to
special-case journal inode during extent tree checking anymore so remove
it.
Fixes: 0a944e8a6c ("ext4: don't perform block validity checks on the journal inode")
Reported-by: Wolfgang Frisch <wolfgang.frisch@suse.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20200728130437.7804-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf9a379d09 upstream.
Currently, add_system_zone() just silently merges two added system zones
that overlap. However the overlap should not happen and it generally
suggests that some unrelated metadata overlap which indicates the fs is
corrupted. We should have caught such problems earlier (e.g. in
ext4_check_descriptors()) but add this check as another line of defense.
In later patch we also use this for stricter checking of journal inode
extent tree.
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20200728130437.7804-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e77d96b8e upstream.
When creating a new event channel with 2-level events the affinity
needs to be reset initially in order to avoid using an old affinity
from earlier usage of the event channel port. So when tearing an event
channel down reset all affinity bits.
The same applies to the affinity when onlining a vcpu: all old
affinity settings for this vcpu must be reset. As percpu events get
initialized before the percpu event channel hook is called,
resetting of the affinities happens after offlining a vcpu (this is
working, as initial percpu memory is zeroed out).
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Link: https://lore.kernel.org/r/20210306161833.4552-2-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 262b003d05 upstream.
When registering a memslot, we check the size and location of that
memslot against the IPA size to ensure that we can provide guest
access to the whole of the memory.
Unfortunately, this check rejects memslot that end-up at the exact
limit of the addressing capability for a given IPA size. For example,
it refuses the creation of a 2GB memslot at 0x8000000 with a 32bit
IPA space.
Fix it by relaxing the check to accept a memslot reaching the
limit of the IPA space.
Fixes: c3058d5da2 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org # 4.4, 4.9
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20210311100016.3830038-3-maz@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62456189f3 upstream.
max6658 may report unrealistically high temperature during
the driver initialization, for which, its overtemp alarm pin
also gets asserted. For certain devices implementing overtemp
protection based on that pin, it may further trigger a reset to
the device. By reproducing the problem, the wrong reading is
found to be coincident with changing the conversion rate.
To mitigate this issue, set the stop bit before changing the
conversion rate and unset it thereafter. After such change, the
wrong reading is not reproduced. Apply this change only to the
max6657 kind for now, controlled by flag LM90_PAUSE_ON_CONFIG.
Signed-off-by: Boyang Yu <byu@arista.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7850f4d84 upstream.
There is a deadlock in bm_register_write:
First, in the begining of the function, a lock is taken on the binfmt_misc
root inode with inode_lock(d_inode(root)).
Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call
open_exec on the user-provided interpreter.
open_exec will call a path lookup, and if the path lookup process includes
the root of binfmt_misc, it will try to take a shared lock on its inode
again, but it is already locked, and the code will get stuck in a deadlock
To reproduce the bug:
$ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register
backtrace of where the lock occurs (#5):
0 schedule () at ./arch/x86/include/asm/current.h:15
1 0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=<optimized out>, state=state@entry=2) at kernel/locking/rwsem.c:992
2 0xffffffff81b5150a in __down_read_common (state=2, sem=<optimized out>) at kernel/locking/rwsem.c:1213
3 __down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1222
4 down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1355
5 0xffffffff811ee22a in inode_lock_shared (inode=<optimized out>) at ./include/linux/fs.h:783
6 open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177
7 path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366
8 0xffffffff811efe1c in do_filp_open (dfd=<optimized out>, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396
9 0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=<optimized out>, flags@entry=0) at fs/exec.c:913
10 0xffffffff811e4a92 in open_exec (name=<optimized out>) at fs/exec.c:948
11 0xffffffff8124aa84 in bm_register_write (file=<optimized out>, buffer=<optimized out>, count=19, ppos=<optimized out>) at fs/binfmt_misc.c:682
12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF
", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603
13 0xffffffff811defda in ksys_write (fd=<optimized out>, buf=0xa758d0 ":iiiii:E::ii::i:CF
", count=19) at fs/read_write.c:658
14 0xffffffff81b49813 in do_syscall_64 (nr=<optimized out>, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46
15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
To solve the issue, the open_exec call is moved to before the write
lock is taken by bm_register_write
Link: https://lkml.kernel.org/r/20210228224414.95962-1-liorribak@gmail.com
Fixes: 948b701a60 ("binfmt_misc: add persistent opened binary handler for containers")
Signed-off-by: Lior Ribak <liorribak@gmail.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8525023121 upstream.
They used to need odd calling conventions due to old exception handling
mechanism, the last remnants of which had disappeared back in 2002.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4758ce82e6 upstream.
There are direct branches between {str*cpy,str*cat} and stx*cpy.
Ensure the branches are within range by merging these objects.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3eec029183 upstream.
This enables the Kbuild standard log style as follows:
AS arch/alpha/lib/__divlu.o
AS arch/alpha/lib/__divqu.o
AS arch/alpha/lib/__remlu.o
AS arch/alpha/lib/__remqu.o
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 14fbbc8297 ]
Commit b0841eefd9 ("configfs: provide exclusion between IO and removals")
uses ->frag_dead to mark the fragment state, thus no bothering with extra
refcount on config_item when opening a file. The configfs_get_config_item
was removed in __configfs_open_file, but not with config_item_put. So the
refcount on config_item will lost its balance, causing use-after-free
issues in some occasions like this:
Test:
1. Mount configfs on /config with read-only items:
drwxrwx--- 289 root root 0 2021-04-01 11:55 /config
drwxr-xr-x 2 root root 0 2021-04-01 11:54 /config/a
--w--w--w- 1 root root 4096 2021-04-01 11:53 /config/a/1.txt
......
2. Then run:
for file in /config
do
echo $file
grep -R 'key' $file
done
3. __configfs_open_file will be called in parallel, the first one
got called will do:
if (file->f_mode & FMODE_READ) {
if (!(inode->i_mode & S_IRUGO))
goto out_put_module;
config_item_put(buffer->item);
kref_put()
package_details_release()
kfree()
the other one will run into use-after-free issues like this:
BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0
Read of size 8 at addr fffffff155f02480 by task grep/13096
CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G W 4.14.116-kasan #1
TGID: 13096 Comm: grep
Call trace:
dump_stack+0x118/0x160
kasan_report+0x22c/0x294
__asan_load8+0x80/0x88
__configfs_open_file+0x1bc/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48
Allocated by task 2138:
kasan_kmalloc+0xe0/0x1ac
kmem_cache_alloc_trace+0x334/0x394
packages_make_item+0x4c/0x180
configfs_mkdir+0x358/0x740
vfs_mkdir2+0x1bc/0x2e8
SyS_mkdirat+0x154/0x23c
el0_svc_naked+0x34/0x38
Freed by task 13096:
kasan_slab_free+0xb8/0x194
kfree+0x13c/0x910
package_details_release+0x524/0x56c
kref_put+0xc4/0x104
config_item_put+0x24/0x34
__configfs_open_file+0x35c/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48
el0_svc_naked+0x34/0x38
To fix this issue, remove the config_item_put in
__configfs_open_file to balance the refcount of config_item.
Fixes: b0841eefd9 ("configfs: provide exclusion between IO and removals")
Signed-off-by: Daiyue Zhang <zhangdaiyue1@huawei.com>
Signed-off-by: Yi Chen <chenyi77@huawei.com>
Signed-off-by: Ge Qiu <qiuge@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 53cb245454 ]
An xattr 'get' handler is expected to return the length of the value on
success, yet _nfs4_get_security_label() (and consequently also
nfs4_xattr_get_nfs4_label(), which is used as an xattr handler) returns
just 0 on success.
Fix this by returning label.len instead, which contains the length of
the result.
Fixes: aa9c266962 ("NFS: Client implementation of Labeled-NFS")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75be7fb7f9 ]
According to the RZ/A1H Group, RZ/A1M Group User's Manual: Hardware,
Rev. 4.00, the TRSCER register has bit 9 reserved, hence we can't use
the driver's default TRSCER mask. Add the explicit initializer for
sh_eth_cpu_data::trscer_err_mask for R7S72100.
Fixes: db893473d3 ("sh_eth: Add support for r7s72100")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 148e34fd33 upstream.
The analog input subdevice supports Comedi asynchronous commands that
use Comedi's 16-bit sample format. However, the call to
`comedi_buf_write_samples()` is passing the address of a 32-bit integer
parameter. On bigendian machines, this will copy 2 bytes from the wrong
end of the 32-bit value. Fix it by changing the type of the parameter
holding the sample value to `unsigned short`.
[Note: the bug was introduced in commit edf4537bcb ("staging: comedi:
pcl818: use comedi_buf_write_samples()") but the patch applies better to
commit d615416de6 ("staging: comedi: pcl818: introduce
pcl818_ai_write_sample()").]
Fixes: d615416de6 ("staging: comedi: pcl818: introduce pcl818_ai_write_sample()")
Cc: <stable@vger.kernel.org> # 4.0+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-10-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a084303a64 upstream.
The analog input subdevice supports Comedi asynchronous commands that
use Comedi's 16-bit sample format. However, the call to
`comedi_buf_write_samples()` is passing the address of a 32-bit integer
variable. On bigendian machines, this will copy 2 bytes from the wrong
end of the 32-bit value. Fix it by changing the type of the variable
holding the sample value to `unsigned short`.
Fixes: 1f44c034de ("staging: comedi: pcl711: use comedi_buf_write_samples()")
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-9-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b39dfcced3 upstream.
The analog input subdevice supports Comedi asynchronous commands that
use Comedi's 16-bit sample format. However, the calls to
`comedi_buf_write_samples()` are passing the address of a 32-bit integer
variable. On bigendian machines, this will copy 2 bytes from the wrong
end of the 32-bit value. Fix it by changing the type of the variable
holding the sample value to `unsigned short`.
Fixes: de88924f67 ("staging: comedi: me4000: use comedi_buf_write_samples()")
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-8-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54999c0d94 upstream.
The analog input subdevice supports Comedi asynchronous commands that
use Comedi's 16-bit sample format. However, the call to
`comedi_buf_write_samples()` is passing the address of a 32-bit integer
variable. On bigendian machines, this will copy 2 bytes from the wrong
end of the 32-bit value. Fix it by changing the type of the variable
holding the sample value to `unsigned short`.
[Note: the bug was introduced in commit 1700529b24 ("staging: comedi:
dmm32at: use comedi_buf_write_samples()") but the patch applies better
to the later (but in the same kernel release) commit 0c0eadadcb
("staging: comedi: dmm32at: introduce dmm32_ai_get_sample()").]
Fixes: 0c0eadadcb ("staging: comedi: dmm32at: introduce dmm32_ai_get_sample()")
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-7-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 459b1e8c8f upstream.
The analog input subdevice supports Comedi asynchronous commands that
use Comedi's 16-bit sample format. However, the call to
`comedi_buf_write_samples()` is passing the address of a 32-bit integer
variable. On bigendian machines, this will copy 2 bytes from the wrong
end of the 32-bit value. Fix it by changing the type of the variable
holding the sample value to `unsigned short`.
Fixes: ad9eb43c93 ("staging: comedi: das800: use comedi_buf_write_samples()")
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-6-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c0f20b787 upstream.
The analog input subdevice supports Comedi asynchronous commands that
use Comedi's 16-bit sample format. However, the call to
`comedi_buf_write_samples()` is passing the address of a 32-bit integer
variable. On bigendian machines, this will copy 2 bytes from the wrong
end of the 32-bit value. Fix it by changing the type of the variable
holding the sample value to `unsigned short`.
Fixes: d1d24cb65e ("staging: comedi: das6402: read analog input samples in interrupt handler")
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-5-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b2e78630f7 upstream.
The analog input subdevice supports Comedi asynchronous commands that
use Comedi's 16-bit sample format. However, the calls to
`comedi_buf_write_samples()` are passing the address of a 32-bit integer
variable. On bigendian machines, this will copy 2 bytes from the wrong
end of the 32-bit value. Fix it by changing the type of the variables
holding the sample value to `unsigned short`. The type of the `val`
parameter of `pci1710_ai_read_sample()` is changed to `unsigned short *`
accordingly. The type of the `val` variable in `pci1710_ai_insn_read()`
is also changed to `unsigned short` since its address is passed to
`pci1710_ai_read_sample()`.
Fixes: a9c3a015c1 ("staging: comedi: adv_pci1710: use comedi_buf_write_samples()")
Cc: <stable@vger.kernel.org> # 4.0+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-4-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac0bbf55ed upstream.
The digital input subdevice supports Comedi asynchronous commands that
read interrupt status information. This uses 16-bit Comedi samples (of
which only the bottom 8 bits contain status information). However, the
interrupt handler is calling `comedi_buf_write_samples()` with the
address of a 32-bit variable `unsigned int status`. On a bigendian
machine, this will copy 2 bytes from the wrong end of the variable. Fix
it by changing the type of the variable to `unsigned short`.
Fixes: a8c66b684e ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions")
Cc: <stable@vger.kernel.org> #4.0+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-3-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 25317f428a upstream.
The Change-Of-State (COS) subdevice supports Comedi asynchronous
commands to read 16-bit change-of-state values. However, the interrupt
handler is calling `comedi_buf_write_samples()` with the address of a
32-bit integer `&s->state`. On bigendian architectures, it will copy 2
bytes from the wrong end of the 32-bit integer. Fix it by transferring
the value via a 16-bit integer.
Fixes: 6bb45f2b0c ("staging: comedi: addi_apci_1032: use comedi_buf_write_samples()")
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210223143055.257402-2-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 718ad9693e upstream.
attach_store() is invoked when user requests import (attach) a device
from usbip host.
Attach and detach are governed by local state and shared state
- Shared state (usbip device status) - Device status is used to manage
the attach and detach operations on import-able devices.
- Local state (tcp_socket, rx and tx thread task_struct ptrs)
A valid tcp_socket controls rx and tx thread operations while the
device is in exported state.
- Device has to be in the right state to be attached and detached.
Attach sequence includes validating the socket and creating receive (rx)
and transmit (tx) threads to talk to the host to get access to the
imported device. rx and tx threads depends on local and shared state to
be correct and in sync.
Detach sequence shuts the socket down and stops the rx and tx threads.
Detach sequence relies on local and shared states to be in sync.
There are races in updating the local and shared status in the current
attach sequence resulting in crashes. These stem from starting rx and
tx threads before local and global state is updated correctly to be in
sync.
1. Doesn't handle kthread_create() error and saves invalid ptr in local
state that drives rx and tx threads.
2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads
before updating usbip_device status to VDEV_ST_NOTASSIGNED. This opens
up a race condition between the threads, port connect, and detach
handling.
Fix the above problems:
- Stop using kthread_get_run() macro to create/start threads.
- Create threads and get task struct reference.
- Add kthread_create() failure handling and bail out.
- Hold vhci and usbip_device locks to update local and shared states after
creating rx and tx threads.
- Update usbip_device status to VDEV_ST_NOTASSIGNED.
- Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
- Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
and status) is complete.
Credit goes to syzbot and Tetsuo Handa for finding and root-causing the
kthread_get_run() improper error handling problem and others. This is
hard problem to find and debug since the races aren't seen in a normal
case. Fuzzing forces the race window to be small enough for the
kthread_get_run() error path bug and starting threads before updating the
local and shared state bug in the attach sequence.
- Update usbip_device tcp_rx and tcp_tx pointers holding vhci and
usbip_device locks.
Tested with syzbot reproducer:
- https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000
Fixes: 9720b4bc76 ("staging/usbip: convert to kthread")
Cc: stable@vger.kernel.org
Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/bb434bd5d7a64fbec38b5ecfb838a6baef6eb12b.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9380afd6df upstream.
usbip_sockfd_store() is invoked when user requests attach (import)
detach (unimport) usb device from usbip host. vhci_hcd sends import
request and usbip_sockfd_store() exports the device if it is free
for export.
Export and unexport are governed by local state and shared state
- Shared state (usbip device status, sockfd) - sockfd and Device
status are used to determine if stub should be brought up or shut
down.
- Local state (tcp_socket, rx and tx thread task_struct ptrs)
A valid tcp_socket controls rx and tx thread operations while the
device is in exported state.
- While the device is exported, device status is marked used and socket,
sockfd, and thread pointers are valid.
Export sequence (stub-up) includes validating the socket and creating
receive (rx) and transmit (tx) threads to talk to the client to provide
access to the exported device. rx and tx threads depends on local and
shared state to be correct and in sync.
Unexport (stub-down) sequence shuts the socket down and stops the rx and
tx threads. Stub-down sequence relies on local and shared states to be
in sync.
There are races in updating the local and shared status in the current
stub-up sequence resulting in crashes. These stem from starting rx and
tx threads before local and global state is updated correctly to be in
sync.
1. Doesn't handle kthread_create() error and saves invalid ptr in local
state that drives rx and tx threads.
2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads
before updating usbip_device status to SDEV_ST_USED. This opens up a
race condition between the threads and usbip_sockfd_store() stub up
and down handling.
Fix the above problems:
- Stop using kthread_get_run() macro to create/start threads.
- Create threads and get task struct reference.
- Add kthread_create() failure handling and bail out.
- Hold usbip_device lock to update local and shared states after
creating rx and tx threads.
- Update usbip_device status to SDEV_ST_USED.
- Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
- Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
and status) is complete.
Credit goes to syzbot and Tetsuo Handa for finding and root-causing the
kthread_get_run() improper error handling problem and others. This is a
hard problem to find and debug since the races aren't seen in a normal
case. Fuzzing forces the race window to be small enough for the
kthread_get_run() error path bug and starting threads before updating the
local and shared state bug in the stub-up sequence.
Tested with syzbot reproducer:
- https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000
Fixes: 9720b4bc76 ("staging/usbip: convert to kthread")
Cc: stable@vger.kernel.org
Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/268a0668144d5ff36ec7d87fdfa90faf583b7ccc.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5563b3b642 upstream.
Add PID for CH340 that's found on cheap programmers.
The driver works flawlessly as soon as the new PID (0x9986) is added to it.
These look like ANU232MI but ship with a ch341 inside. They have no special
identifiers (mine only has the string "DB9D20130716" printed on the PCB and
nothing identifiable on the packaging. The merchant i bought it from
doesn't sell these anymore).
the lsusb -v output is:
Bus 001 Device 009: ID 9986:7523
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 255 Vendor Specific Class
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x9986
idProduct 0x7523
bcdDevice 2.54
iManufacturer 0
iProduct 0
iSerial 0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 0x0027
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0x80
(Bus Powered)
MaxPower 96mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 3
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 1
bInterfaceProtocol 2
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x82 EP 2 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0020 1x 32 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x02 EP 2 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0020 1x 32 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0008 1x 8 bytes
bInterval 1
Signed-off-by: Niv Sardi <xaiki@evilgiggle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>