Commit Graph

1185741 Commits

Author SHA1 Message Date
Tao Su
8176080d59 block: Skip destroyed blkg when restart in blkg_destroy_all()
Kernel hang in blkg_destroy_all() when total blkg greater than
BLKG_DESTROY_BATCH_SIZE, because of not removing destroyed blkg in
blkg_list. So the size of blkg_list is same after destroying a
batch of blkg, and the infinite 'restart' occurs.

Since blkg should stay on the queue list until blkg_free_workfn(),
skip destroyed blkg when restart a new round, which will solve this
kernel hang issue and satisfy the previous will to restart.

Reported-by: Xiangfei Ma <xiangfeix.ma@intel.com>
Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Fixes: f1c006f1c6 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20230428045149.1310073-1-tao1.su@linux.intel.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-28 11:23:58 -06:00
Vinay Belgaumkar
c73bd1706c drm/i915/selftest: Update the SLPC selftest
Use the new efficient frequency toggling interface. Also
create a helper function to restore the frequencies after
the test is done.

v2: Restore max freq first and then min.

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230426003942.1924347-2-vinay.belgaumkar@intel.com
2023-04-28 09:59:54 -07:00
Vinay Belgaumkar
55f9720dbf drm/i915/guc/slpc: Provide sysfs for efficient freq
SLPC enables use of efficient freq at init by default. It is
possible for GuC to request frequencies that are higher than
the 'software' max if user has set it lower than the efficient
level.

Scenarios/tests that require strict fixing of freq below the efficient
level will need to disable it through this interface.

v2: Keep just one interface to toggle sysfs. With this, user will
be completely responsible for toggling efficient frequency if need
be. There will be no implicit disabling when user sets min < RP1 (Ashutosh)

v3: Remove unused label, review comments (Ashutosh)

v4: Toggle efficient freq usage in SLPC selftest and checkpatch fixes

v5: Review comments (Andi) and add a separate patch for selftest updates

Fixes: 95ccf312a1 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency")
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230426003942.1924347-1-vinay.belgaumkar@intel.com
2023-04-28 09:59:51 -07:00
Eric Van Hensbergen
21e26d5e54 fs/9p: Fix bit operation logic error
This re-introduces a fix that somehow got dropped during rebase of the
current series in for-next.  When writeback is enabled, opens
are forced to support both read and write operations but with the
logic error other flags may be dropped unintentionaly.

Reported-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2023-04-28 16:59:26 +00:00
Theodore Ts'o
d4fab7b28e ext4: clean up error handling in __ext4_fill_super()
There were two ways to return an error code; one was via setting the
'err' variable, and the second, if err was zero, was via the 'ret'
variable.  This was both confusing and fragile, and when code was
factored out of __ext4_fill_super(), some of the error codes returned
by the original code was replaced by -EINVAL, and in one case, the
error code was placed by 0, triggering a kernel null pointer
dereference.

Clean this up by removing the 'ret' variable, leaving only one way to
set the error code to be returned, and restore the errno codes that
were returned via the the mount system call as they were before we
started refactoring __ext4_fill_super().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
2023-04-28 12:56:40 -04:00
Theodore Ts'o
3b50d5018e ext4: reflect error codes from ext4_multi_mount_protect() to its callers
This will allow more fine-grained errno codes to be returned by the
mount system call.

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-28 12:56:40 -04:00
Theodore Ts'o
d5e72c4e32 ext4: fix lost error code reporting in __ext4_fill_super()
When code was factored out of __ext4_fill_super() into
ext4_percpu_param_init() the error return was discarded.  This meant
that it was possible for __ext4_fill_super() to return zero,
indicating success, without the struct super getting completely filled
in, leading to a potential NULL pointer dereference.

Reported-by: syzbot+bbf0f9a213c94f283a5c@syzkaller.appspotmail.com
Fixes: 1f79467c8a ("ext4: factor out ext4_percpu_param_init() ...")
Link: https://syzkaller.appspot.com/bug?id=6dac47d5e58af770c0055f680369586ec32e144c
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
2023-04-28 12:56:40 -04:00
Nathan Chancellor
856dd6c598 ext4: fix unused iterator variable warnings
When CONFIG_QUOTA is disabled, there are warnings around unused iterator
variables:

  fs/ext4/super.c: In function 'ext4_put_super':
  fs/ext4/super.c:1262:13: error: unused variable 'i' [-Werror=unused-variable]
   1262 |         int i, err;
        |             ^
  fs/ext4/super.c: In function '__ext4_fill_super':
  fs/ext4/super.c:5200:22: error: unused variable 'i' [-Werror=unused-variable]
   5200 |         unsigned int i;
        |                      ^
  cc1: all warnings being treated as errors

The kernel has updated to GNU11, allowing the variables to be declared
within the for loop.  Do so to clear up the warnings.

Fixes: dcbf87589d ("ext4: factor out ext4_flex_groups_free()")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/20230420-ext4-unused-variables-super-c-v1-1-138b6db6c21c@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-28 12:56:40 -04:00
Ye Bin
835659598c ext4: fix use-after-free read in ext4_find_extent for bigalloc + inline
Syzbot found the following issue:
loop0: detected capacity change from 0 to 2048
EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none.
==================================================================
BUG: KASAN: use-after-free in ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline]
BUG: KASAN: use-after-free in ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931
Read of size 4 at addr ffff888073644750 by task syz-executor420/5067

CPU: 0 PID: 5067 Comm: syz-executor420 Not tainted 6.2.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106
 print_address_description+0x74/0x340 mm/kasan/report.c:306
 print_report+0x107/0x1f0 mm/kasan/report.c:417
 kasan_report+0xcd/0x100 mm/kasan/report.c:517
 ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline]
 ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931
 ext4_clu_mapped+0x117/0x970 fs/ext4/extents.c:5809
 ext4_insert_delayed_block fs/ext4/inode.c:1696 [inline]
 ext4_da_map_blocks fs/ext4/inode.c:1806 [inline]
 ext4_da_get_block_prep+0x9e8/0x13c0 fs/ext4/inode.c:1870
 ext4_block_write_begin+0x6a8/0x2290 fs/ext4/inode.c:1098
 ext4_da_write_begin+0x539/0x760 fs/ext4/inode.c:3082
 generic_perform_write+0x2e4/0x5e0 mm/filemap.c:3772
 ext4_buffered_write_iter+0x122/0x3a0 fs/ext4/file.c:285
 ext4_file_write_iter+0x1d0/0x18f0
 call_write_iter include/linux/fs.h:2186 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x7dc/0xc50 fs/read_write.c:584
 ksys_write+0x177/0x2a0 fs/read_write.c:637
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f4b7a9737b9
RSP: 002b:00007ffc5cac3668 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4b7a9737b9
RDX: 00000000175d9003 RSI: 0000000020000200 RDI: 0000000000000004
RBP: 00007f4b7a933050 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000079f R11: 0000000000000246 R12: 00007f4b7a9330e0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

Above issue is happens when enable bigalloc and inline data feature. As
commit 131294c35e fixed delayed allocation bug in ext4_clu_mapped for
bigalloc + inline. But it only resolved issue when has inline data, if
inline data has been converted to extent(ext4_da_convert_inline_data_to_extent)
before writepages, there is no EXT4_STATE_MAY_INLINE_DATA flag. However
i_data is still store inline data in this scene. Then will trigger UAF
when find extent.
To resolve above issue, there is need to add judge "ext4_has_inline_data(inode)"
in ext4_clu_mapped().

Fixes: 131294c35e ("ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline")
Reported-by: syzbot+bf4bb7731ef73b83a3b4@syzkaller.appspotmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Tested-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://lore.kernel.org/r/20230406111627.1916759-1-tudor.ambarus@linaro.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-28 12:56:35 -04:00
Linus Torvalds
22b8cc3e78 Merge tag 'x86_mm_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 LAM (Linear Address Masking) support from Dave Hansen:
 "Add support for the new Linear Address Masking CPU feature.

  This is similar to ARM's Top Byte Ignore and allows userspace to store
  metadata in some bits of pointers without masking it out before use"

* tag 'x86_mm_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/iommu/sva: Do not allow to set FORCE_TAGGED_SVA bit from outside
  x86/mm/iommu/sva: Fix error code for LAM enabling failure due to SVA
  selftests/x86/lam: Add test cases for LAM vs thread creation
  selftests/x86/lam: Add ARCH_FORCE_TAGGED_SVA test cases for linear-address masking
  selftests/x86/lam: Add inherit test cases for linear-address masking
  selftests/x86/lam: Add io_uring test cases for linear-address masking
  selftests/x86/lam: Add mmap and SYSCALL test cases for linear-address masking
  selftests/x86/lam: Add malloc and tag-bits test cases for linear-address masking
  x86/mm/iommu/sva: Make LAM and SVA mutually exclusive
  iommu/sva: Replace pasid_valid() helper with mm_valid_pasid()
  mm: Expose untagging mask in /proc/$PID/status
  x86/mm: Provide arch_prctl() interface for LAM
  x86/mm: Reduce untagged_addr() overhead for systems without LAM
  x86/uaccess: Provide untagged_addr() and remove tags before address check
  mm: Introduce untagged_addr_remote()
  x86/mm: Handle LAM on context switch
  x86: CPUID and CR3/CR4 flags for Linear Address Masking
  x86: Allow atomic MM_CONTEXT flags setting
  x86/mm: Rework address range check in get_user() and put_user()
2023-04-28 09:43:49 -07:00
Maxim Korotkov
3e46c89c74 writeback: fix call of incorrect macro
the variable 'history' is of type u16, it may be an error
 that the hweight32 macro was used for it
 I guess macro hweight16 should be used

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2a81490811 ("writeback: implement foreign cgroup inode detection")
Signed-off-by: Maxim Korotkov <korotkov.maxim.s@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230119104443.3002-1-korotkov.maxim.s@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-28 10:41:32 -06:00
Jens Axboe
f40c153afe Merge tag 'md-next-2023-04-28' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.4/block
Pull MD fixes from Song:

"1. Improve raid5 sequential IO performance on spinning disks, which fixes
    a regression since v6.0, by Jan Kara.
 2. Fix bitmap offset types, which fixes an issue introduced in this merge
    window, by Jonathan Derrick."

* tag 'md-next-2023-04-28' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md: Fix bitmap offset type in sb writer
  md/raid5: Improve performance for sequential IO
2023-04-28 10:36:27 -06:00
Linus Torvalds
7b664cc38e Merge tag 'x86_tdx_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 tdx update from Dave Hansen:
 "The original tdx hypercall assembly code took two flags in %RSI to
  tweak its behavior at runtime. PeterZ recently axed one flag in commit
  e80a48bade ("x86/tdx: Remove TDX_HCALL_ISSUE_STI").

  Kill the other flag too and tweak the 'output' mode with an assembly
  macro instead. This results in elimination of one push/pop pair and
  overall easier to read assembly.

   - Do conditional __tdx_hypercall() 'output' processing via an
     assembly macro argument rather than a runtime register"

* tag 'x86_tdx_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tdx: Drop flags from __tdx_hypercall()
2023-04-28 09:36:09 -07:00
Linus Torvalds
e54debe657 Merge tag 'x86_fpu_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu updates from Dave Hansen:
 "There's no _actual_ kernel functionality here.

  This expands the documentation around AMX support including some code
  examples. The example code also exposed the fact that hardware
  architecture constants as part of the ABI, but there's no easy place
  that they get defined for apps. Adding them to a uabi header will
  eventually make life easier for consumers of the ABI.

  Summary:

   - Improve AMX documentation along with example code

   - Explicitly make some hardware constants part of the uabi"

* tag 'x86_fpu_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Explain the state component permission for guests
  Documentation/x86: Add the AMX enabling example
  x86/arch_prctl: Add AMX feature numbers as ABI constants
  Documentation/x86: Explain the purpose for dynamic features
2023-04-28 09:32:34 -07:00
Linus Torvalds
4980c176a7 Merge tag 'x86_cache_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resctrl update from Dave Hansen:
 "Reduce redundant counter reads with resctrl refactoring"

* tag 'x86_cache_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Avoid redundant counter read in __mon_event_count()
2023-04-28 09:30:51 -07:00
Linus Torvalds
682f7bbad2 Merge tag 'x86_cleanups_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:

 - Unify duplicated __pa() and __va() definitions

 - Simplify sysctl tables registration

 - Remove unused symbols

 - Correct function name in comment

* tag 'x86_cleanups_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Centralize __pa()/__va() definitions
  x86: Simplify one-level sysctl registration for itmt_kern_table
  x86: Simplify one-level sysctl registration for abi_table2
  x86/platform/intel-mid: Remove unused definitions from intel-mid.h
  x86/uaccess: Remove memcpy_page_flushcache()
  x86/entry: Change stale function name in comment to error_return()
2023-04-28 09:22:30 -07:00
Jonathan Derrick
b1211978ec md: Fix bitmap offset type in sb writer
Bitmap offset is allowed to be negative, indicating that bitmap precedes
metadata. Change the type back from sector_t to loff_t to satisfy
conditionals and calculations.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/linux-raid/CAPhsuW6HuaUJ5WcyPajVgUfkQFYp2D_cy1g6qxN4CU_gP2=z7g@mail.gmail.com/
Fixes: 10172f200b ("md: Fix types in sb writer")
Signed-off-by: Jonathan Derrick <jonathan.derrick@linux.dev>
Suggested-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230425011438.71046-1-jonathan.derrick@linux.dev
2023-04-28 09:21:06 -07:00
Jan Kara
fc05e06e60 md/raid5: Improve performance for sequential IO
Commit 7e55c60acf ("md/raid5: Pivot raid5_make_request()") changed the
order in which requests for underlying disks are created. Since for
large sequential IO adding of requests frequently races with md_raid5
thread submitting bios to underlying disks, this results in a change in
IO pattern because intermediate states of new order of request creation
result in more smaller discontiguous requests. For RAID5 on top of three
rotational disks our performance testing revealed this results in
regression in write throughput:

iozone -a -s 131072000 -y 4 -q 8 -i 0 -i 1 -R

before 7e55c60acf:
              KB  reclen   write rewrite    read    reread
       131072000       4  493670  525964   524575   513384
       131072000       8  540467  532880   512028   513703

after 7e55c60acf:
              KB  reclen   write rewrite    read    reread
       131072000       4  421785  456184   531278   509248
       131072000       8  459283  456354   528449   543834

To reduce the amount of discontiguous requests we can start generating
requests with the stripe with the lowest chunk offset as that has the
best chance of being adjacent to IO queued previously. This improves the
performance to:
              KB  reclen   write rewrite    read    reread
       131072000       4  497682  506317   518043   514559
       131072000       8  514048  501886   506453   504319

restoring big part of the regression.

Fixes: 7e55c60acf ("md/raid5: Pivot raid5_make_request()")
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230417171537.17899-1-jack@suse.cz
2023-04-28 09:19:02 -07:00
Randy Dunlap
6d2ed65318 lsm: move hook comments docs to security/security.c
Fix one kernel-doc warning, but invesigating that led to other
kernel-doc movement (lsm_hooks.h to security.c) that needs to be
fixed also.

include/linux/lsm_hooks.h:1: warning: no structured comments found

Fixes: e261301c85 ("lsm: move the remaining LSM hook comments to security/security.c")
Fixes: 1cd2aca64a ("lsm: move the io_uring hook comments to security/security.c")
Fixes: 452b670c72 ("lsm: move the perf hook comments to security/security.c")
Fixes: 55e853201a ("lsm: move the bpf hook comments to security/security.c")
Fixes: b14faf9c94 ("lsm: move the audit hook comments to security/security.c")
Fixes: 1427ddbe5c ("lsm: move the binder hook comments to security/security.c")
Fixes: 43fad28218 ("lsm: move the sysv hook comments to security/security.c")
Fixes: ecc419a445 ("lsm: move the key hook comments to security/security.c")
Fixes: 742b99456e ("lsm: move the xfrm hook comments to security/security.c")
Fixes: ac318aed54 ("lsm: move the Infiniband hook comments to security/security.c")
Fixes: 4a49f592e9 ("lsm: move the SCTP hook comments to security/security.c")
Fixes: 6b6bbe8c02 ("lsm: move the socket hook comments to security/security.c")
Fixes: 2c2442fd46 ("lsm: move the AF_UNIX hook comments to security/security.c")
Fixes: 2bcf51bf2f ("lsm: move the netlink hook comments to security/security.c")
Fixes: 130c53bfee ("lsm: move the task hook comments to security/security.c")
Fixes: a0fd6480de ("lsm: move the file hook comments to security/security.c")
Fixes: 9348944b77 ("lsm: move the kernfs hook comments to security/security.c")
Fixes: 916e32584d ("lsm: move the inode hook comments to security/security.c")
Fixes: 08526a902c ("lsm: move the filesystem hook comments to security/security.c")
Fixes: 36819f1855 ("lsm: move the fs_context hook comments to security/security.c")
Fixes: 1661372c91 ("lsm: move the program execution hook comments to security/security.c")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: KP Singh <kpsingh@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-04-28 11:58:34 -04:00
Naohiro Aota
631003e233 btrfs: zoned: fix wrong use of bitops API in btrfs_ensure_empty_zones
find_next_bit and find_next_zero_bit take @size as the second parameter and
@offset as the third parameter. They are specified opposite in
btrfs_ensure_empty_zones(). Thanks to the later loop, it never failed to
detect the empty zones. Fix them and (maybe) return the result a bit
faster.

Note: the naming is a bit confusing, size has two meanings here, bitmap
and our range size.

Fixes: 1cd6121f2a ("btrfs: zoned: implement zoned chunk allocator")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-28 17:17:25 +02:00
Zhihao Cheng
1dedde6903 ext4: fix i_disksize exceeding i_size problem in paritally written case
It is possible for i_disksize can exceed i_size, triggering a warning.

generic_perform_write
 copied = iov_iter_copy_from_user_atomic(len) // copied < len
 ext4_da_write_end
 | ext4_update_i_disksize
 |  new_i_size = pos + copied;
 |  WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize) // update i_disksize
 | generic_write_end
 |  copied = block_write_end(copied, len) // copied = 0
 |   if (unlikely(copied < len))
 |    if (!PageUptodate(page))
 |     copied = 0;
 |  if (pos + copied > inode->i_size) // return false
 if (unlikely(copied == 0))
  goto again;
 if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
  status = -EFAULT;
  break;
 }

We get i_disksize greater than i_size here, which could trigger WARNING
check 'i_size_read(inode) < EXT4_I(inode)->i_disksize' while doing dio:

ext4_dio_write_iter
 iomap_dio_rw
  __iomap_dio_rw // return err, length is not aligned to 512
 ext4_handle_inode_extension
  WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize) // Oops

 WARNING: CPU: 2 PID: 2609 at fs/ext4/file.c:319
 CPU: 2 PID: 2609 Comm: aa Not tainted 6.3.0-rc2
 RIP: 0010:ext4_file_write_iter+0xbc7
 Call Trace:
  vfs_write+0x3b1
  ksys_write+0x77
  do_syscall_64+0x39

Fix it by updating 'copied' value before updating i_disksize just like
ext4_write_inline_data_end() does.

A reproducer can be found in the buganizer link below.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217209
Fixes: 64769240bd ("ext4: Add delayed allocation support in data=writeback mode")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230321013721.89818-1-chengzhihao1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-28 11:07:19 -04:00
Qu Wenruo
64b5d5b285 btrfs: properly reject clear_cache and v1 cache for block-group-tree
[BUG]
With block-group-tree feature enabled, mounting it with clear_cache
would cause the following transaction abort at mount or remount:

  BTRFS info (device dm-4): force clearing of disk cache
  BTRFS info (device dm-4): using free space tree
  BTRFS info (device dm-4): auto enabling async discard
  BTRFS info (device dm-4): clearing free space tree
  BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1)
  BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
  BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes
  BTRFS error (device dm-4): super block corruption detected before writing it to disk
  BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected)
  BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.

[CAUSE]
For block-group-tree feature, we have an artificial dependency on
free-space-tree.

This means if we detect block-group-tree without v2 cache, we consider
it a corruption and cause the problem.

For clear_cache mount option, it would temporary disable v2 cache, then
re-enable it.

But unfortunately for that temporary v2 cache disabled status, we refuse
to write a superblock with bg tree only flag, thus leads to the above
transaction abortion.

[FIX]
For now, just reject clear_cache and v1 cache mount option for block
group tree.  So now we got a graceful rejection other than a transaction
abort:

  BTRFS info (device dm-4): force clearing of disk cache
  BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature
  BTRFS error (device dm-4): open_ctree failed

CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-28 16:36:45 +02:00
Filipe Manana
a2cea677db btrfs: print extent buffers when sibling keys check fails
When trying to move keys from one node/leaf to another sibling node/leaf,
if the sibling keys check fails we just print an error message with the
last key of the left sibling and the first key of the right sibling.
However it's also useful to print all the keys of each sibling, as it
may provide some clues to what went wrong, which code path may be
inserting keys in an incorrect order. So just do that, print the siblings
with btrfs_print_tree(), as it works for both leaves and nodes.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-28 16:36:39 +02:00
Filipe Manana
9ae5afd02a btrfs: abort transaction when sibling keys check fails for leaves
If the sibling keys check fails before we move keys from one sibling
leaf to another, we are not aborting the transaction - we leave that to
some higher level caller of btrfs_search_slot() (or anything else that
uses it to insert items into a b+tree).

This means that the transaction abort will provide a stack trace that
omits the b+tree modification call chain. So change this to immediately
abort the transaction and therefore get a more useful stack trace that
shows us the call chain in the bt+tree modification code.

It's also important to immediately abort the transaction just in case
some higher level caller is not doing it, as this indicates a very
serious corruption and we should stop the possibility of doing further
damage.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-28 16:36:37 +02:00
Filipe Manana
611ccc58e1 btrfs: fix leak of source device allocation state after device replace
When a device replace finishes, the source device is freed by calling
btrfs_free_device() at btrfs_rm_dev_replace_free_srcdev(), but the
allocation state, tracked in the device's alloc_state io tree, is never
freed.

This is a regression recently introduced by commit f0bb5474cf ("btrfs:
remove redundant release of btrfs_device::alloc_state"), which removed a
call to extent_io_tree_release() from btrfs_free_device(), with the
rationale that btrfs_close_one_device() already releases the allocation
state from a device and btrfs_close_one_device() is always called before
a device is freed with btrfs_free_device(). However that is not true for
the device replace case, as btrfs_free_device() is called without any
previous call to btrfs_close_one_device().

The issue is trivial to reproduce, for example, by running test btrfs/027
from fstests:

  $ ./check btrfs/027
  $ rmmod btrfs
  $ dmesg
  (...)
  [84519.395485] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg started
  [84519.466224] BTRFS info (device sdc): dev_replace from <missing disk> (devid 2) to /dev/sdg finished
  [84519.552251] BTRFS info (device sdc): scrub: started on devid 1
  [84519.552277] BTRFS info (device sdc): scrub: started on devid 2
  [84519.552332] BTRFS info (device sdc): scrub: started on devid 3
  [84519.552705] BTRFS info (device sdc): scrub: started on devid 4
  [84519.604261] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0
  [84519.609374] BTRFS info (device sdc): scrub: finished on devid 3 with status: 0
  [84519.610818] BTRFS info (device sdc): scrub: finished on devid 1 with status: 0
  [84519.610927] BTRFS info (device sdc): scrub: finished on devid 2 with status: 0
  [84559.503795] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1
  [84559.506764] BTRFS: state leak: start 1048576 end 1347420159 state 1 in tree 1 refs 1
  [84559.510294] BTRFS: state leak: start 1048576 end 1351614463 state 1 in tree 1 refs 1

So fix this by adding back the call to extent_io_tree_release() at
btrfs_free_device().

Fixes: f0bb5474cf ("btrfs: remove redundant release of btrfs_device::alloc_state")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-28 16:36:31 +02:00
xiaoshoukui
ac868bc9d1 btrfs: fix assertion of exclop condition when starting balance
Balance as exclusive state is compatible with paused balance and device
add, which makes some things more complicated. The assertion of valid
states when starting from paused balance needs to take into account two
more states, the combinations can be hit when there are several threads
racing to start balance and device add. This won't typically happen when
the commands are started from command line.

Scenario 1: With exclusive_operation state == BTRFS_EXCLOP_NONE.

Concurrently adding multiple devices to the same mount point and
btrfs_exclop_finish executed finishes before assertion in
btrfs_exclop_balance, exclusive_operation will changed to
BTRFS_EXCLOP_NONE state which lead to assertion failed:

  fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE ||
  fs_info->exclusive_operation == BTRFS_EXCLOP_DEV_ADD,
  in fs/btrfs/ioctl.c:456
  Call Trace:
   <TASK>
   btrfs_exclop_balance+0x13c/0x310
   ? memdup_user+0xab/0xc0
   ? PTR_ERR+0x17/0x20
   btrfs_ioctl_add_dev+0x2ee/0x320
   btrfs_ioctl+0x9d5/0x10d0
   ? btrfs_ioctl_encoded_write+0xb80/0xb80
   __x64_sys_ioctl+0x197/0x210
   do_syscall_64+0x3c/0xb0
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

Scenario 2: With exclusive_operation state == BTRFS_EXCLOP_BALANCE_PAUSED.

Concurrently adding multiple devices to the same mount point and
btrfs_exclop_balance executed finish before the latter thread execute
assertion in btrfs_exclop_balance, exclusive_operation will changed to
BTRFS_EXCLOP_BALANCE_PAUSED state which lead to assertion failed:

  fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE ||
  fs_info->exclusive_operation == BTRFS_EXCLOP_DEV_ADD ||
  fs_info->exclusive_operation == BTRFS_EXCLOP_NONE,
  fs/btrfs/ioctl.c:458
  Call Trace:
   <TASK>
   btrfs_exclop_balance+0x240/0x410
   ? memdup_user+0xab/0xc0
   ? PTR_ERR+0x17/0x20
   btrfs_ioctl_add_dev+0x2ee/0x320
   btrfs_ioctl+0x9d5/0x10d0
   ? btrfs_ioctl_encoded_write+0xb80/0xb80
   __x64_sys_ioctl+0x197/0x210
   do_syscall_64+0x3c/0xb0
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

An example of the failed assertion is below, which shows that the
paused balance is also needed to be checked.

  root@syzkaller:/home/xsk# ./repro
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  [  416.611428][ T7970] BTRFS info (device loop0): fs_info exclusive_operation: 0
  Failed to add device /dev/vda, errno 14
  [  416.613973][ T7971] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.615456][ T7972] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.617528][ T7973] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.618359][ T7974] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.622589][ T7975] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.624034][ T7976] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.626420][ T7977] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.627643][ T7978] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.629006][ T7979] BTRFS info (device loop0): fs_info exclusive_operation: 3
  [  416.630298][ T7980] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  [  416.632787][ T7981] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.634282][ T7982] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.636202][ T7983] BTRFS info (device loop0): fs_info exclusive_operation: 3
  [  416.637012][ T7984] BTRFS info (device loop0): fs_info exclusive_operation: 1
  Failed to add device /dev/vda, errno 14
  [  416.637759][ T7984] assertion failed: fs_info->exclusive_operation ==
  BTRFS_EXCLOP_BALANCE || fs_info->exclusive_operation ==
  BTRFS_EXCLOP_DEV_ADD || fs_info->exclusive_operation ==
  BTRFS_EXCLOP_NONE, in fs/btrfs/ioctl.c:458
  [  416.639845][ T7984] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  [  416.640485][ T7984] CPU: 0 PID: 7984 Comm: repro Not tainted 6.2.0 #7
  [  416.641172][ T7984] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  [  416.642090][ T7984] RIP: 0010:btrfs_assertfail+0x2c/0x2e
  [  416.644423][ T7984] RSP: 0018:ffffc90003ea7e28 EFLAGS: 00010282
  [  416.645018][ T7984] RAX: 00000000000000cc RBX: 0000000000000000 RCX: 0000000000000000
  [  416.645763][ T7984] RDX: ffff88801d030000 RSI: ffffffff81637e7c RDI: fffff520007d4fb7
  [  416.646554][ T7984] RBP: ffffffff8a533de0 R08: 00000000000000cc R09: 0000000000000000
  [  416.647299][ T7984] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff8a533da0
  [  416.648041][ T7984] R13: 00000000000001ca R14: 000000005000940a R15: 0000000000000000
  [  416.648785][ T7984] FS:  00007fa2985d4640(0000) GS:ffff88802cc00000(0000) knlGS:0000000000000000
  [  416.649616][ T7984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  416.650238][ T7984] CR2: 0000000000000000 CR3: 0000000018e5e000 CR4: 0000000000750ef0
  [  416.650980][ T7984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [  416.651725][ T7984] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [  416.652502][ T7984] PKRU: 55555554
  [  416.652888][ T7984] Call Trace:
  [  416.653241][ T7984]  <TASK>
  [  416.653527][ T7984]  btrfs_exclop_balance+0x240/0x410
  [  416.654036][ T7984]  ? memdup_user+0xab/0xc0
  [  416.654465][ T7984]  ? PTR_ERR+0x17/0x20
  [  416.654874][ T7984]  btrfs_ioctl_add_dev+0x2ee/0x320
  [  416.655380][ T7984]  btrfs_ioctl+0x9d5/0x10d0
  [  416.655822][ T7984]  ? btrfs_ioctl_encoded_write+0xb80/0xb80
  [  416.656400][ T7984]  __x64_sys_ioctl+0x197/0x210
  [  416.656874][ T7984]  do_syscall_64+0x3c/0xb0
  [  416.657346][ T7984]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
  [  416.657922][ T7984] RIP: 0033:0x4546af
  [  416.660170][ T7984] RSP: 002b:00007fa2985d4150 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [  416.660972][ T7984] RAX: ffffffffffffffda RBX: 00007fa2985d4640 RCX: 00000000004546af
  [  416.661714][ T7984] RDX: 0000000000000000 RSI: 000000005000940a RDI: 0000000000000003
  [  416.662449][ T7984] RBP: 00007fa2985d41d0 R08: 0000000000000000 R09: 00007ffee37a4c4f
  [  416.663195][ T7984] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fa2985d4640
  [  416.663951][ T7984] R13: 0000000000000009 R14: 000000000041b320 R15: 00007fa297dd4000
  [  416.664703][ T7984]  </TASK>
  [  416.665040][ T7984] Modules linked in:
  [  416.665590][ T7984] ---[ end trace 0000000000000000 ]---
  [  416.666176][ T7984] RIP: 0010:btrfs_assertfail+0x2c/0x2e
  [  416.668775][ T7984] RSP: 0018:ffffc90003ea7e28 EFLAGS: 00010282
  [  416.669425][ T7984] RAX: 00000000000000cc RBX: 0000000000000000 RCX: 0000000000000000
  [  416.670235][ T7984] RDX: ffff88801d030000 RSI: ffffffff81637e7c RDI: fffff520007d4fb7
  [  416.671050][ T7984] RBP: ffffffff8a533de0 R08: 00000000000000cc R09: 0000000000000000
  [  416.671867][ T7984] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff8a533da0
  [  416.672685][ T7984] R13: 00000000000001ca R14: 000000005000940a R15: 0000000000000000
  [  416.673501][ T7984] FS:  00007fa2985d4640(0000) GS:ffff88802cc00000(0000) knlGS:0000000000000000
  [  416.674425][ T7984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  416.675114][ T7984] CR2: 0000000000000000 CR3: 0000000018e5e000 CR4: 0000000000750ef0
  [  416.675933][ T7984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [  416.676760][ T7984] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Link: https://lore.kernel.org/linux-btrfs/20230324031611.98986-1-xiaoshoukui@gmail.com/
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: xiaoshoukui <xiaoshoukui@ruijie.com.cn>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-28 16:36:27 +02:00
Filipe Manana
6f932d4ef0 btrfs: fix btrfs_prev_leaf() to not return the same key twice
A call to btrfs_prev_leaf() may end up returning a path that points to the
same item (key) again. This happens if while btrfs_prev_leaf(), after we
release the path, a concurrent insertion happens, which moves items off
from a sibling into the front of the previous leaf, and an item with the
computed previous key does not exists.

For example, suppose we have the two following leaves:

  Leaf A

  -------------------------------------------------------------
  | ...   key (300 96 10)   key (300 96 15)   key (300 96 16) |
  -------------------------------------------------------------
              slot 20             slot 21             slot 22

  Leaf B

  -------------------------------------------------------------
  | key (300 96 20)   key (300 96 21)   key (300 96 22)   ... |
  -------------------------------------------------------------
      slot 0             slot 1             slot 2

If we call btrfs_prev_leaf(), from btrfs_previous_item() for example, with
a path pointing to leaf B and slot 0 and the following happens:

1) At btrfs_prev_leaf() we compute the previous key to search as:
   (300 96 19), which is a key that does not exists in the tree;

2) Then we call btrfs_release_path() at btrfs_prev_leaf();

3) Some other task inserts a key at leaf A, that sorts before the key at
   slot 20, for example it has an objectid of 299. In order to make room
   for the new key, the key at slot 22 is moved to the front of leaf B.
   This happens at push_leaf_right(), called from split_leaf().

   After this leaf B now looks like:

  --------------------------------------------------------------------------------
  | key (300 96 16)    key (300 96 20)   key (300 96 21)   key (300 96 22)   ... |
  --------------------------------------------------------------------------------
       slot 0              slot 1             slot 2             slot 3

4) At btrfs_prev_leaf() we call btrfs_search_slot() for the computed
   previous key: (300 96 19). Since the key does not exists,
   btrfs_search_slot() returns 1 and with a path pointing to leaf B
   and slot 1, the item with key (300 96 20);

5) This makes btrfs_prev_leaf() return a path that points to slot 1 of
   leaf B, the same key as before it was called, since the key at slot 0
   of leaf B (300 96 16) is less than the computed previous key, which is
   (300 96 19);

6) As a consequence btrfs_previous_item() returns a path that points again
   to the item with key (300 96 20).

For some users of btrfs_prev_leaf() or btrfs_previous_item() this may not
be functional a problem, despite not making sense to return a new path
pointing again to the same item/key. However for a caller such as
tree-log.c:log_dir_items(), this has a bad consequence, as it can result
in not logging some dir index deletions in case the directory is being
logged without holding the inode's VFS lock (logging triggered while
logging a child inode for example) - for the example scenario above, in
case the dir index keys 17, 18 and 19 were deleted in the current
transaction.

CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-28 16:16:30 +02:00
Jarkko Sakkinen
0c8862de05 tpm: Re-enable TPM chip boostrapping non-tpm_tis TPM drivers
TPM chip bootstrapping was removed from tpm_chip_register(), and it
was relocated to tpm_tis_core. This breaks all drivers which are not
based on tpm_tis because the chip will not get properly initialized.

Take the corrective steps:
1. Rename tpm_chip_startup() as tpm_chip_bootstrap() and make it one-shot.
2. Call tpm_chip_bootstrap() in tpm_chip_register(), which reverts the
   things  as tehy used to be.

Cc: Lino Sanfilippo <l.sanfilippo@kunbus.com>
Fixes: 548eb516ec ("tpm, tpm_tis: startup chip before testing for interrupts")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Link: https://lore.kernel.org/all/ZEjqhwHWBnxcaRV5@xpf.sh.intel.com/
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-04-28 13:06:36 +00:00
Ville Syrjälä
99e7e3b600 drm/uapi: Document CTM matrix better
Document in which order the CTM matrix elements are stored.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230411222931.15127-2-ville.syrjala@linux.intel.com
Reviewed-by: Xaver Hugl <xaver.hugl@gmail.com>
Acked-by: Simon Ser <contact@emersion.fr>
2023-04-28 14:21:09 +03:00
Jonathan Cavitt
eee44c2903 drm/i915: Use correct huge page manager for MTL
MTL currently uses gen8_ppgtt_insert_huge when managing huge pages.
This is because MTL reports as not supporting 64K pages, or more
accurately, the system that reports whether a platform has 64K pages
reports false for MTL.  This is only half correct, as the 64K page support
reporting system only cares about 64K page support for LMEM, which MTL
doesn't have.

MTL should be using xehpsdv_ppgtt_insert_huge.  However, simply changing
over to using that manager doesn't resolve the issue because MTL is
expecting the virtual address space for the page table to be flushed after
initialization, so we must also add a flush statement there.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230425-hugepage-migrate-v8-2-7868d54eaa27@intel.com
2023-04-28 13:20:05 +02:00
Jonathan Cavitt
d59ce1ff8e drm/i915: Migrate platform-dependent mock hugepage selftests to live
Convert the igt_mock_ppgtt_huge_fill and igt_mock_ppgtt_64K mock selftests
into live selftests as their requirements have recently become
platform-dependent. Additionally, apply necessary platform dependency
checks to these tests.

v8:
- handle properly 64K and 2M pages
v9:
- do not expect 64K pages if 2M are present
- fix hex printing
- obey commit message line limit

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Co-developed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230425-hugepage-migrate-v8-1-7868d54eaa27@intel.com
2023-04-28 13:20:04 +02:00
Olivier Bacon
4140aafcff crypto: engine - fix crypto_queue backlog handling
CRYPTO_TFM_REQ_MAY_BACKLOG tells the crypto driver that it should
internally backlog requests until the crypto hw's queue becomes
full. At that point, crypto_engine backlogs the request and returns
-EBUSY. Calling driver such as dm-crypt then waits until the
complete() function is called with a status of -EINPROGRESS before
sending a new request.

The problem lies in the call to complete() with a value of -EINPROGRESS
that is made when a backlog item is present on the queue. The call is
done before the successful execution of the crypto request. In the case
that do_one_request() returns < 0 and the retry support is available,
the request is put back in the queue. This leads upper drivers to send
a new request even if the queue is still full.

The problem can be reproduced by doing a large dd into a crypto
dm-crypt device. This is pretty easy to see when using
Freescale CAAM crypto driver and SWIOTLB dma. Since the actual amount
of requests that can be hold in the queue is unlimited we get IOs error
and dma allocation.

The fix is to call complete with a value of -EINPROGRESS only if
the request is not enqueued back in crypto_queue. This is done
by calling complete() later in the code. In order to delay the decision,
crypto_queue is modified to correctly set the backlog pointer
when a request is enqueued back.

Fixes: 6a89f492f8 ("crypto: engine - support for parallel requests based on retry mechanism")
Co-developed-by: Sylvain Ouellet <souellet@genetec.com>
Signed-off-by: Sylvain Ouellet <souellet@genetec.com>
Signed-off-by: Olivier Bacon <obacon@genetec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-04-28 17:50:43 +08:00
Christophe JAILLET
8fd91151eb crypto: sun8i-ss - Fix a test in sun8i_ss_setup_ivs()
SS_ENCRYPTION is (0 << 7 = 0), so the test can never be true.
Use a direct comparison to SS_ENCRYPTION instead.

The same king of test is already done the same way in sun8i_ss_run_task().

Fixes: 359e893e8a ("crypto: sun8i-ss - rework handling of IV")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-04-28 17:50:43 +08:00
Oswald Buddenhagen
9d2f38638a ALSA: emu10k1: use more existing defines instead of open-coded numbers
Using the *_MASK defines for "maximal value" is debatable. I got the
idea from FreeBSD, and it sorta makes sense to me.

Some hunks look a bit incomplete, because code that is going to be
subsequently removed is not touched here.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Link: https://lore.kernel.org/r/20230428080732.1697695-1-oswald.buddenhagen@gmx.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-04-28 11:22:51 +02:00
Angelo Dureghello
6686317855 net: dsa: mv88e6xxx: add mv88e6321 rsvd2cpu
Add rsvd2cpu capability for mv88e6321 model, to allow proper bpdu
processing.

Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
Fixes: 51c901a775 ("net: dsa: mv88e6xxx: distinguish Global 2 Rsvd2CPU")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:57:32 +01:00
Antoine Tenart
dc6456e938 net: ipv6: fix skb hash for some RST packets
The skb hash comes from sk->sk_txhash when using TCP, except for some
IPv6 RST packets. This is because in tcp_v6_send_reset when not in
TIME_WAIT the hash is taken from sk->sk_hash, while it should come from
sk->sk_txhash as those two hashes are not computed the same way.

Packetdrill script to test the above,

   0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
  +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > (flowlabel 0x1) S 0:0(0) <...>

  // Wrong ack seq, trigger a rst.
  +0 < S. 0:0(0) ack 0 win 4000

  // Check the flowlabel matches prior one from SYN.
  +0 > (flowlabel 0x1) R 0:0(0) <...>

Fixes: 9258b8b1be ("ipv6: tcp: send consistent autoflowlabel in RST packets")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:53:43 +01:00
Andrea Mayer
46ef24c60f selftests: srv6: make srv6_end_dt46_l3vpn_test more robust
On some distributions, the rp_filter is automatically set (=1) by
default on a netdev basis (also on VRFs).
In an SRv6 End.DT46 behavior, decapsulated IPv4 packets are routed using
the table associated with the VRF bound to that tunnel. During lookup
operations, the rp_filter can lead to packet loss when activated on the
VRF.
Therefore, we chose to make this selftest more robust by explicitly
disabling the rp_filter during tests (as it is automatically set by some
Linux distributions).

Fixes: 03a0b567a0 ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:51:40 +01:00
wuych
042334a8d4 atlantic:hw_atl2:hw_atl2_utils_fw: Remove unnecessary (void*) conversions
Pointer variables of void * type do not require type cast.

Signed-off-by: wuych <yunchuan@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:49:21 +01:00
Cong Wang
c88f8d5cd9 sit: update dev->needed_headroom in ipip6_tunnel_bind_dev()
When a tunnel device is bound with the underlying device, its
dev->needed_headroom needs to be updated properly. IPv4 tunnels
already do the same in ip_tunnel_bind_dev(). Otherwise we may
not have enough header room for skb, especially after commit
b17f709a24 ("gue: TX support for using remote checksum offload option").

Fixes: 32b8a8e59c ("sit: add IPv4 over IPv4 support")
Reported-by: Palash Oswal <oswalpalash@gmail.com>
Link: https://lore.kernel.org/netdev/CAGyP=7fDcSPKu6nttbGwt7RXzE3uyYxLjCSE97J64pRxJP8jPA@mail.gmail.com/
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:48:14 +01:00
Vlad Buslov
da94a7781f net/sched: cls_api: remove block_cb from driver_list before freeing
Error handler of tcf_block_bind() frees the whole bo->cb_list on error.
However, by that time the flow_block_cb instances are already in the driver
list because driver ndo_setup_tc() callback is called before that up the
call chain in tcf_block_offload_cmd(). This leaves dangling pointers to
freed objects in the list and causes use-after-free[0]. Fix it by also
removing flow_block_cb instances from driver_list before deallocating them.

[0]:
[  279.868433] ==================================================================
[  279.869964] BUG: KASAN: slab-use-after-free in flow_block_cb_setup_simple+0x631/0x7c0
[  279.871527] Read of size 8 at addr ffff888147e2bf20 by task tc/2963

[  279.873151] CPU: 6 PID: 2963 Comm: tc Not tainted 6.3.0-rc6+ #4
[  279.874273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[  279.876295] Call Trace:
[  279.876882]  <TASK>
[  279.877413]  dump_stack_lvl+0x33/0x50
[  279.878198]  print_report+0xc2/0x610
[  279.878987]  ? flow_block_cb_setup_simple+0x631/0x7c0
[  279.879994]  kasan_report+0xae/0xe0
[  279.880750]  ? flow_block_cb_setup_simple+0x631/0x7c0
[  279.881744]  ? mlx5e_tc_reoffload_flows_work+0x240/0x240 [mlx5_core]
[  279.883047]  flow_block_cb_setup_simple+0x631/0x7c0
[  279.884027]  tcf_block_offload_cmd.isra.0+0x189/0x2d0
[  279.885037]  ? tcf_block_setup+0x6b0/0x6b0
[  279.885901]  ? mutex_lock+0x7d/0xd0
[  279.886669]  ? __mutex_unlock_slowpath.constprop.0+0x2d0/0x2d0
[  279.887844]  ? ingress_init+0x1c0/0x1c0 [sch_ingress]
[  279.888846]  tcf_block_get_ext+0x61c/0x1200
[  279.889711]  ingress_init+0x112/0x1c0 [sch_ingress]
[  279.890682]  ? clsact_init+0x2b0/0x2b0 [sch_ingress]
[  279.891701]  qdisc_create+0x401/0xea0
[  279.892485]  ? qdisc_tree_reduce_backlog+0x470/0x470
[  279.893473]  tc_modify_qdisc+0x6f7/0x16d0
[  279.894344]  ? tc_get_qdisc+0xac0/0xac0
[  279.895213]  ? mutex_lock+0x7d/0xd0
[  279.896005]  ? __mutex_lock_slowpath+0x10/0x10
[  279.896910]  rtnetlink_rcv_msg+0x5fe/0x9d0
[  279.897770]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
[  279.898672]  ? __sys_sendmsg+0xb5/0x140
[  279.899494]  ? do_syscall_64+0x3d/0x90
[  279.900302]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  279.901337]  ? kasan_save_stack+0x2e/0x40
[  279.902177]  ? kasan_save_stack+0x1e/0x40
[  279.903058]  ? kasan_set_track+0x21/0x30
[  279.903913]  ? kasan_save_free_info+0x2a/0x40
[  279.904836]  ? ____kasan_slab_free+0x11a/0x1b0
[  279.905741]  ? kmem_cache_free+0x179/0x400
[  279.906599]  netlink_rcv_skb+0x12c/0x360
[  279.907450]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
[  279.908360]  ? netlink_ack+0x1550/0x1550
[  279.909192]  ? rhashtable_walk_peek+0x170/0x170
[  279.910135]  ? kmem_cache_alloc_node+0x1af/0x390
[  279.911086]  ? _copy_from_iter+0x3d6/0xc70
[  279.912031]  netlink_unicast+0x553/0x790
[  279.912864]  ? netlink_attachskb+0x6a0/0x6a0
[  279.913763]  ? netlink_recvmsg+0x416/0xb50
[  279.914627]  netlink_sendmsg+0x7a1/0xcb0
[  279.915473]  ? netlink_unicast+0x790/0x790
[  279.916334]  ? iovec_from_user.part.0+0x4d/0x220
[  279.917293]  ? netlink_unicast+0x790/0x790
[  279.918159]  sock_sendmsg+0xc5/0x190
[  279.918938]  ____sys_sendmsg+0x535/0x6b0
[  279.919813]  ? import_iovec+0x7/0x10
[  279.920601]  ? kernel_sendmsg+0x30/0x30
[  279.921423]  ? __copy_msghdr+0x3c0/0x3c0
[  279.922254]  ? import_iovec+0x7/0x10
[  279.923041]  ___sys_sendmsg+0xeb/0x170
[  279.923854]  ? copy_msghdr_from_user+0x110/0x110
[  279.924797]  ? ___sys_recvmsg+0xd9/0x130
[  279.925630]  ? __perf_event_task_sched_in+0x183/0x470
[  279.926656]  ? ___sys_sendmsg+0x170/0x170
[  279.927529]  ? ctx_sched_in+0x530/0x530
[  279.928369]  ? update_curr+0x283/0x4f0
[  279.929185]  ? perf_event_update_userpage+0x570/0x570
[  279.930201]  ? __fget_light+0x57/0x520
[  279.931023]  ? __switch_to+0x53d/0xe70
[  279.931846]  ? sockfd_lookup_light+0x1a/0x140
[  279.932761]  __sys_sendmsg+0xb5/0x140
[  279.933560]  ? __sys_sendmsg_sock+0x20/0x20
[  279.934436]  ? fpregs_assert_state_consistent+0x1d/0xa0
[  279.935490]  do_syscall_64+0x3d/0x90
[  279.936300]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  279.937311] RIP: 0033:0x7f21c814f887
[  279.938085] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[  279.941448] RSP: 002b:00007fff11efd478 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  279.942964] RAX: ffffffffffffffda RBX: 0000000064401979 RCX: 00007f21c814f887
[  279.944337] RDX: 0000000000000000 RSI: 00007fff11efd4e0 RDI: 0000000000000003
[  279.945660] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[  279.947003] R10: 00007f21c8008708 R11: 0000000000000246 R12: 0000000000000001
[  279.948345] R13: 0000000000409980 R14: 000000000047e538 R15: 0000000000485400
[  279.949690]  </TASK>

[  279.950706] Allocated by task 2960:
[  279.951471]  kasan_save_stack+0x1e/0x40
[  279.952338]  kasan_set_track+0x21/0x30
[  279.953165]  __kasan_kmalloc+0x77/0x90
[  279.954006]  flow_block_cb_setup_simple+0x3dd/0x7c0
[  279.955001]  tcf_block_offload_cmd.isra.0+0x189/0x2d0
[  279.956020]  tcf_block_get_ext+0x61c/0x1200
[  279.956881]  ingress_init+0x112/0x1c0 [sch_ingress]
[  279.957873]  qdisc_create+0x401/0xea0
[  279.958656]  tc_modify_qdisc+0x6f7/0x16d0
[  279.959506]  rtnetlink_rcv_msg+0x5fe/0x9d0
[  279.960392]  netlink_rcv_skb+0x12c/0x360
[  279.961216]  netlink_unicast+0x553/0x790
[  279.962044]  netlink_sendmsg+0x7a1/0xcb0
[  279.962906]  sock_sendmsg+0xc5/0x190
[  279.963702]  ____sys_sendmsg+0x535/0x6b0
[  279.964534]  ___sys_sendmsg+0xeb/0x170
[  279.965343]  __sys_sendmsg+0xb5/0x140
[  279.966132]  do_syscall_64+0x3d/0x90
[  279.966908]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

[  279.968407] Freed by task 2960:
[  279.969114]  kasan_save_stack+0x1e/0x40
[  279.969929]  kasan_set_track+0x21/0x30
[  279.970729]  kasan_save_free_info+0x2a/0x40
[  279.971603]  ____kasan_slab_free+0x11a/0x1b0
[  279.972483]  __kmem_cache_free+0x14d/0x280
[  279.973337]  tcf_block_setup+0x29d/0x6b0
[  279.974173]  tcf_block_offload_cmd.isra.0+0x226/0x2d0
[  279.975186]  tcf_block_get_ext+0x61c/0x1200
[  279.976080]  ingress_init+0x112/0x1c0 [sch_ingress]
[  279.977065]  qdisc_create+0x401/0xea0
[  279.977857]  tc_modify_qdisc+0x6f7/0x16d0
[  279.978695]  rtnetlink_rcv_msg+0x5fe/0x9d0
[  279.979562]  netlink_rcv_skb+0x12c/0x360
[  279.980388]  netlink_unicast+0x553/0x790
[  279.981214]  netlink_sendmsg+0x7a1/0xcb0
[  279.982043]  sock_sendmsg+0xc5/0x190
[  279.982827]  ____sys_sendmsg+0x535/0x6b0
[  279.983703]  ___sys_sendmsg+0xeb/0x170
[  279.984510]  __sys_sendmsg+0xb5/0x140
[  279.985298]  do_syscall_64+0x3d/0x90
[  279.986076]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

[  279.987532] The buggy address belongs to the object at ffff888147e2bf00
                which belongs to the cache kmalloc-192 of size 192
[  279.989747] The buggy address is located 32 bytes inside of
                freed 192-byte region [ffff888147e2bf00, ffff888147e2bfc0)

[  279.992367] The buggy address belongs to the physical page:
[  279.993430] page:00000000550f405c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x147e2a
[  279.995182] head:00000000550f405c order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[  279.996713] anon flags: 0x200000000010200(slab|head|node=0|zone=2)
[  279.997878] raw: 0200000000010200 ffff888100042a00 0000000000000000 dead000000000001
[  279.999384] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[  280.000894] page dumped because: kasan: bad access detected

[  280.002386] Memory state around the buggy address:
[  280.003338]  ffff888147e2be00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  280.004781]  ffff888147e2be80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  280.006224] >ffff888147e2bf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  280.007700]                                ^
[  280.008592]  ffff888147e2bf80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  280.010035]  ffff888147e2c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  280.011564] ==================================================================

Fixes: 59094b1e50 ("net: sched: use flow block API")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:44:19 +01:00
Christophe JAILLET
e0807c4302 mISDN: Use list_count_nodes()
count_list_member() really looks the same as list_count_nodes(), so use the
latter instead of hand writing it.

The first one return an int and the other a size_t, but that should be
fine. It is really unlikely that we get so many parties in a conference.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:43:04 +01:00
Eric Dumazet
7e692df393 tcp: fix skb_copy_ubufs() vs BIG TCP
David Ahern reported crashes in skb_copy_ubufs() caused by TCP tx zerocopy
using hugepages, and skb length bigger than ~68 KB.

skb_copy_ubufs() assumed it could copy all payload using up to
MAX_SKB_FRAGS order-0 pages.

This assumption broke when BIG TCP was able to put up to 512 KB per skb.

We did not hit this bug at Google because we use CONFIG_MAX_SKB_FRAGS=45
and limit gso_max_size to 180000.

A solution is to use higher order pages if needed.

v2: add missing __GFP_COMP, or we leak memory.

Fixes: 7c4e983c4f ("net: allow gso_max_size to exceed 65536")
Reported-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/netdev/c70000f6-baa4-4a05-46d0-4b3e0dc1ccc8@gmail.com/T/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Coco Li <lixiaoyan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:40:38 +01:00
Cosmo Chou
6f75cd166a net/ncsi: clear Tx enable mode when handling a Config required AEN
ncsi_channel_is_tx() determines whether a given channel should be
used for Tx or not. However, when reconfiguring the channel by
handling a Configuration Required AEN, there is a misjudgment that
the channel Tx has already been enabled, which results in the Enable
Channel Network Tx command not being sent.

Clear the channel Tx enable flag before reconfiguring the channel to
avoid the misjudgment.

Fixes: 8d951a75d0 ("net/ncsi: Configure multi-package, multi-channel modes with failover")
Signed-off-by: Cosmo Chou <chou.cosmo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-28 09:35:33 +01:00
Jeremy Kerr
8c6c78ee3b i3c: ast2600: fix register setting for 545 ohm pullups
The 2k register setting is zero, OR-ing it in doesn't parallel the 2k
and 750 ohm pullups. We need a separate value for the 545 ohm setting.

Reported-by: Lukwinski Zbigniew <zbigniew.lukwinski@linux.intel.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20230428001849.1775559-1-jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:52:23 +02:00
Jeremy Kerr
f2539c2079 i3c: ast2600: enable IBI support
The ast2600 i3c hardware is capable of IBIs, but we need a workaround
for a hardware issue with the I3C state machine handling IBI payloads
of specific lengths when PEC is not enabled. To avoid this, we need to
unconditionally enable PECs, at the consquence of losing a byte of data
when the device does not send a PEC.

Enable IBIs on the ast2600 platform, including an implementation of the
PEC workaround, which prints a warning when triggered.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/ba923b96d6d129024c975e8a0472c5b2fcb3af32.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:49:50 +02:00
Jeremy Kerr
f3a3553a51 i3c: dw: Add a platform facility for IBI PEC workarounds
On the AST2600 i3c controller, we'll need to apply a workaround for a
hardware issue with IBI payloads.

Introduce a platform hook to allow dw i3c platform implementations to
modify the DAT entry in IBI enable/disable to allow this workaround in a
future change.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/d5d76a8d2336d2a71886537f42e71d51db184df6.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr
e389b1d72a i3c: dw: Add support for in-band interrupts
This change adds support for receiving and dequeueing i3c IBIs.

By setting struct dw_i3c_master->ibi_capable before probe, a platform
implementation can select the IBI-enabled version of the i3c_master_ops,
enabling the global IBI infrastrcture for that controller.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/79daeefd7ccb7c935d0c159149df21a6c9a73ffa.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr
e2d43101f6 i3c: dw: Turn DAT array entry into a struct
In an upcoming change, we will want to store additional data about the
devices we have in the data address table.

Change the type of the DAT entries into a struct, which currently just
has the address data.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/9dc0d9e2857e851a0cf04819df48e5d31921f83e.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr
79f42b31c2 i3c: dw: Create a generic fifo read function
In a future change we'll want to read from the IBI FIFO too, so turn
dw_i3c_read_rx_fifo() into a generic read with the FIFO register as a
parameter.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/827204789583dd86addffb47ecaeab9d67cf95d5.1680161823.git.jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:20:07 +02:00
Jeremy Kerr
7dc2e0a875 i3c: Allow OF-alias-based persistent bus numbering
Parse the /aliases node to assign any fixed bus numbers, as is done with
the i2c subsystem. Numbering for non-aliased busses will start after the
highest fixed bus number.

This allows an alias node such as:

    aliases {
        i3c0 = &bus_a,
	i3c4 = &bus_b,
    };

to set the numbering for a set of i3c controllers:

    /* fixed-numbered bus, assigned "i3c-0" */
    bus_a: i3c-master {
    };

    /* another fixed-numbered bus, assigned "i3c-4" */
    bus_b: i3c-master {
    };

    /* dynamic-numbered bus, likely assigned "i3c-5" */
    bus_c: i3c-master {
    };

If no i3c device aliases are present, the numbering will stay as-is,
starting from 0.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20230405094149.1513209-1-jk@codeconstruct.com.au
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2023-04-28 08:19:01 +02:00